
Option to disregard duplicate files?


  • Option to disregard duplicate files?

    Beyond Compare is extremely useful when comparing two folder trees using "ignore folder structure". This allows validating whether any file in one tree -- regardless of location -- is orphaned from the other tree, even though the files may be in different locations in each tree. This one feature saves an immense amount of work.

    Unfortunately, if any files within a tree are duplicates, this throws off the comparison. The first copy fulfills the match, so it is not shown as an orphan. The next copy has nothing left to match, so it appears as an orphan. Nothing in the display indicates why it appears, so each one must be checked manually.

    E.g., you want to verify that all the files on a portable drive are present on another drive. You don't care where -- you just want to make sure you have them, so the portable drive can be reformatted for other use.

    There are de-duplication utilities you could run on the portable drive before running BC, but this has its own issues. If BC could optionally re-compare the set of orphans against the existing "equals" set, and then hide any that match, that would be great.
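Outside of BC, the "do I have every file somewhere on the other drive" check can be sketched in a few lines of Python. This is only a minimal sketch, not a Beyond Compare feature: it matches files by SHA-256 content hash instead of by filename, and the names `file_digest`, `content_index`, and `true_orphans` are made up for this illustration.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def content_index(root):
    """Map content digest -> list of paths under root with that content."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            index.setdefault(file_digest(path), []).append(path)
    return index


def true_orphans(source_root, target_root):
    """List files under source_root whose content appears nowhere under target_root.

    Duplicates inside source_root are harmless: every copy is reported
    only if no file with the same content exists in target_root.
    """
    target_digests = set(content_index(target_root))
    missing = []
    for digest, paths in content_index(source_root).items():
        if digest not in target_digests:
            missing.extend(paths)
    return sorted(missing)
```

Calling `true_orphans(portable_drive, other_drive)` with two folder paths returns only the files that are really missing, regardless of location, name, or how many copies exist on either side. Hashing every file is slow on large drives, so this is a cross-check and not a replacement for the folder compare.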

    Is there any way to do this now, or is this a possible future feature?

  • #2
    There is not a method to detect duplicates now; you could hide Orphans, but this would hide any files that did not match on file name (exist only on one side), not just duplicates.

    Duplicate handling is on our wishlist.
    Aaron P Scooter Software


    • #3
      I just wanted to reiterate that I am running into this situation frequently. My current workaround is to run a standalone de-duplicating tool (there are many, each with pros and cons), delete the duplicates, then run BC between that de-duplicated folder and the other folder. This exact issue was raised in this thread:

      In my work as an archivist for a documentary film group, I often need to compare two folders with "ignore folder structure" to verify we have all files from one hard drive located somewhere within another hard drive having a different folder structure. Duplicates are OK as long as we have the files. I don't need a generalized de-duplicating tool or a duplicate file finder -- I only need BC to disregard duplicates during a compare using "ignore folder structure".


      • #4

        BC would treat duplicates as Orphans. Only the first match would match, at which point any additional duplicates on one side would be Orphans (unless both sides have multiple duplicates, at which point they would also align). It would then be a matter of reviewing the Orphans for duplicates or additions/deletes.
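The pairing behavior described above can be modeled with a simple count per filename. This is a simplified model for illustration only, not BC's actual alignment code; `orphan_counts` is a hypothetical helper that takes the bare filenames found on each side once folder structure is ignored.

```python
from collections import Counter


def orphan_counts(left_names, right_names):
    """Pair files one-to-one by name and count what is left over per side.

    Each name on one side consumes at most one same-named file on the
    other side; any surplus copies remain as orphans.
    """
    left, right = Counter(left_names), Counter(right_names)
    # Counter subtraction drops zero and negative counts, leaving only surplus.
    return dict(left - right), dict(right - left)
```

With two copies of `a.txt` on the left and one on the right, one copy pairs up and the other is left over as a left-side orphan. With two copies on both sides, both pairs align and nothing is left over.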

        Adding duplicate detection is on our Customer Wishlist, but is not a small project and one we haven't been able to tackle yet.
        Aaron P Scooter Software


        • #5

          Have y'all gotten any further on this request? It's been a while. This is one of the main reasons I need software like this. I frequently wind up with Windows duplicating files in common user profile folders such as Documents and Pictures. This happens when people seed their folders from old computers and roaming profiles then drop in another copy, with an index in parentheses at the end of the base filename.

          original file: filename.txt
          duplicate one: filename (1).txt
          duplicate two: filename (2).txt

          Obviously I don't care about these duplicates and would even love having a way to easily strip them out.

          THERE IS A MAJOR CAVEAT HOWEVER to the obvious solution of using the filename filter "* (?).*": sometimes there is no instance of the base filename or the file was intentionally saved by a user with that filename format, and we need the first copy of that file with the parens in the name if so.
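One way to respect that caveat is to treat a "name (n).ext" file as redundant only when the base file exists in the same folder and has identical content. The following is a rough sketch under that assumption; `redundant_copies` and `COPY_SUFFIX` are names invented here, and nothing in it is part of BC.

```python
import filecmp
import re
from pathlib import Path

# Matches a stem ending in " (<digits>)", e.g. "filename (1)".
COPY_SUFFIX = re.compile(r"^(?P<stem>.+) \((?P<n>\d+)\)$")


def redundant_copies(folder):
    """Return "name (n).ext" files whose base "name.ext" exists with equal content.

    Files with a parenthesized index but no base file, or with content
    that differs from the base file, are kept out of the result.
    """
    redundant = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        m = COPY_SUFFIX.match(path.stem)
        if not m:
            continue
        original = path.with_name(m.group("stem") + path.suffix)
        # shallow=False forces a byte-by-byte comparison of the two files.
        if original.is_file() and filecmp.cmp(original, path, shallow=False):
            redundant.append(path)
    return redundant
```

The sketch only looks at one folder at a time and does not handle the case where the base file is missing but several indexed copies are identical to each other; those would all be kept and would still need a manual look.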


          Ingram Barclay


          • #6

            It's still something on our wishlist. However, it's a very large project (some other entire programs only perform this single task), and not one we've been able to tackle yet.
            Aaron P Scooter Software