Large files - compare only first [X] bytes

  • Large files - compare only first [X] bytes

    I regularly compare directories with about 200 .mpg files of 1 to 2 GB each. The directories are on the local network, so a full binary compare would take a few hours. Comparing file size is a pretty reliable test for files this big, but subtle changes to headers in the file can leave the file size intact.

    What I would like is an option to do a binary compare of only the first X bytes of the files. Ideally, this would be an extension-based comparison rule, but I'd be just as happy if it were a session-specific setting, etc.

    Or perhaps there's a way I can hack this in as "unimportant text"? But I'm guessing it would still have to read the whole file to decide which text is ignorable?

    I looked at the plug-ins at, but I didn't see anything promising. Are there other repositories for plug-ins?

    FWIW, I'm not so concerned about the File Viewer...The directory compare is all that matters for this project.

    Any help you can offer will be most appreciated! I can't tell you how much BC makes our work day easier and faster!


    P.S. Bonus points if I could *also* have it compare the final [N] bytes of the file, just to be sure. But only if that didn't force BC to read the entire file over the network to do it.

  • #2
    Re: Large files - compare only first [X] bytes


    Thanks for the suggestion.

    A rules-based comparison is quite a bit slower than a binary comparison, and it will always read the entire contents of a file.

    A binary comparison only compares file contents until it finds a difference.
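As an illustration of the early exit described above, here is a minimal Python sketch. It is not BC's actual implementation; the function name `first_difference` and the 1 MiB chunk size are assumptions made for the example.

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the offset of the first differing byte, or None if identical.

    Hypothetical helper: reads both files chunk by chunk and stops at the
    first mismatch, so nothing past the differing chunk is read.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Pin down the exact byte inside the mismatching chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No byte differs, so one file ended before the other.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no mismatch.
                return None
            offset += len(a)
```

Identical files are still read in full, which is why this does not help much when most of the 200 files match.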

    Right now a size-only comparison or a binary comparison is your best bet for comparing files like these.

    One other option for remote file comparison is to use FTP with an FTP server that supports server-side CRC generation. With the right FTP server, only the CRC values are transferred over the network. To see if your FTP server supports this, set CRC comparison criteria in BC, and look for the XCRC command in BC's log. If the FTP server doesn't support the XCRC command, BC has to transfer the entire file, then calculate the CRC value, so it won't be very quick.
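To show why a CRC computed on the client side saves nothing, here is a small Python sketch of a streaming CRC. It assumes the standard CRC-32 provided by `zlib`; whether that matches the value a particular server reports for XCRC is not confirmed here, and `file_crc32` is a hypothetical name.

```python
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Compute a CRC-32 over the whole file, one chunk at a time."""
    crc = 0
    with open(path, "rb") as f:
        # Every byte has to pass through the checksum, so a remote file
        # is transferred in full before the result is known.
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

When the server computes the value itself, that read happens on the server's own disk and only the short result crosses the network.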

    I'll add binary comparison of only file headers to our wish list.
    Chris K Scooter Software


    • #3
      Re: Large files - compare only first [X] bytes

      I vote for this as well, but with a more complete implementation:
      Check the first (X) bytes
      Check some (Y) random bytes in the middle of the file
      Check the last (Z) bytes
      The 2nd and 3rd checks do not require reading the entire file, only moving the file pointer.
      Of course, a server-side CRC check would be ideal, but not always available.
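The three checks proposed in this post could be sketched roughly as follows in Python. This is an illustration only, not a BC feature: the name `sampled_equal`, the default window sizes, and the choice of one contiguous middle block at a seeded random offset are all assumptions.

```python
import os
import random


def sampled_equal(path_a, path_b, head=65536, middle=65536, tail=65536, seed=0):
    """Compare only the head, one random middle block, and the tail.

    Returns False on a size mismatch or any differing sampled byte.
    True means only that the sampled windows match, not the whole file.
    """
    size = os.path.getsize(path_a)
    if size != os.path.getsize(path_b):
        return False

    # Each window is (offset, length).
    windows = [(0, min(head, size))]
    if tail and size > 0:
        windows.append((max(0, size - tail), min(tail, size)))
    if middle and size > head + middle + tail:
        # Pick one block that lies strictly between the head and the tail.
        rng = random.Random(seed)
        start = rng.randrange(head, size - tail - middle + 1)
        windows.append((start, middle))

    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        for offset, length in windows:
            # seek() only moves the file pointer; the skipped bytes are not read.
            fa.seek(offset)
            fb.seek(offset)
            if fa.read(length) != fb.read(length):
                return False
    return True
```

With the defaults, each pair of files costs about 192 KB of reads per side instead of 1 to 2 GB. A fixed seed keeps results repeatable between runs; passing a different seed each run samples a different middle block.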