Compare as Equal two files that have a character difference in the names

  • BillDewey
    New User
    • Dec 2018
    • 2

    I need to compare large file structures (more than 80,000 files) that are moving from a file share into a document management system where certain characters are not allowed. The process we use to copy these files replaces each disallowed character with "--". What I need is to have:
    File_Name.txt in the source folder be seen the same as File--Name.txt in the target.

    I am trying to use the Alignment Filter->Misc tab and am having difficulty figuring out the correct syntax. I need to ignore the "_" and the "--" completely and do my comparison without those characters at all, as if both names were "FileName.txt".

    Ideas on how to accomplish this? My regex skills are quite rusty. I believe I need to set up the correct pattern for the Align Left side and then use something like $1 on the other side. Does anyone have an example of something like this?

    Thanks in advance.
  • Chris
    Team Scooter
    • Oct 2007
    • 5538

    #2
    Alignment overrides in Beyond Compare Pro should be able to do this.

    Example filenames:
    test_one.txt test--one.txt
    test_two.txt test--two.txt
    test_file_three.txt test--file--three.txt
    test_file_number_four.txt test--file--number--four.txt

    To align the differences in the Folder Compare, click the Rules toolbar button (referee icon).
    Go to the Misc tab.
    In Alignment Overrides, click +.
    Align left file (or folder): (.*)_(.*)
    with right file (or folder): $1--$2
    Check Regular expression.

    The above handles a name with a single _ on the left and a single -- on the right.

    Repeat with the following alignment overrides (written as left pattern=right replacement) for names with two and three _ to -- replacements:
    (.*)_(.*)_(.*)=$1--$2--$3
    (.*)_(.*)_(.*)_(.*)=$1--$2--$3--$4
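
The three overrides above can be tried out away from the tool with a rough Python sketch. This is only an illustration, not how Beyond Compare works internally: it assumes each pattern must match the whole left name, that any one override is enough to align a pair, and it writes $1 as \1 because that is Python's replacement syntax. The names OVERRIDES, candidate_right_names and aligns are made up for this example.

```python
import re

# Hypothetical table mirroring the three overrides from the post:
# (left-side regex, right-side template). Python writes $1 as \1.
OVERRIDES = [
    (r"(.*)_(.*)", r"\1--\2"),
    (r"(.*)_(.*)_(.*)", r"\1--\2--\3"),
    (r"(.*)_(.*)_(.*)_(.*)", r"\1--\2--\3--\4"),
]


def candidate_right_names(left_name):
    """Return every right-side name that some override would produce."""
    names = set()
    for pattern, template in OVERRIDES:
        # Assumption: the pattern has to cover the entire file name.
        match = re.fullmatch(pattern, left_name)
        if match:
            names.add(match.expand(template))
    return names


def aligns(left_name, right_name):
    """True if the names are identical or an override maps left to right."""
    return left_name == right_name or right_name in candidate_right_names(left_name)


if __name__ == "__main__":
    pairs = [
        ("test_one.txt", "test--one.txt"),
        ("test_two.txt", "test--two.txt"),
        ("test_file_three.txt", "test--file--three.txt"),
        ("test_file_number_four.txt", "test--file--number--four.txt"),
    ]
    for left, right in pairs:
        print(left, right, aligns(left, right))
```

Running it against the four example pairs is a quick way to check a pattern list before typing it into the Rules dialog.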

    Regular expression explanation:
    () - group an expression and store it in the variable $1 through $9 for use in a replacement
    . - match on any one character
    * - match on 0 or more of the preceding character
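
To see why one override is needed per underscore count, it helps to look at what the groups actually capture. The short Python sketch below uses Python's re module as a stand-in (an assumption; Beyond Compare's regex engine may differ in details), and the helper name captured_groups is invented for this example. Because .* is greedy, the first group grabs as much as it can, so a two-group pattern only splits the name at the last underscore.

```python
import re


def captured_groups(pattern, name):
    """Return the captured groups if the pattern matches the whole name, else None."""
    match = re.fullmatch(pattern, name)
    return match.groups() if match else None


if __name__ == "__main__":
    name = "test_file_three.txt"
    # Two groups: the greedy first group keeps the earlier underscore.
    print(captured_groups(r"(.*)_(.*)", name))
    # Three groups: both underscores are consumed as separators.
    print(captured_groups(r"(.*)_(.*)_(.*)", name))
```

With only the single-underscore override, the name above would be rewritten to test_file--three.txt, which is why the two-underscore override is needed to reach test--file--three.txt.
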
    Chris K Scooter Software

    • BillDewey
      New User
      • Dec 2018
      • 2

      #3
      I was closer than I thought yesterday; at least I was on the right track. Your reply is perfect and works like an absolute charm. The only other wrinkle so far was a bunch of files with a " " (space) at the beginning of the name. That was a simple one to add. The hardest part was building out the entire set of "left-side" characters, the ones our document management system does not allow, and then allowing for up to 5 occurrences. What is truly amazing to me is how fast your product is in dealing with what is now a set of 6 regex filters, plus a name and size filter.

      Thank you for your very thorough, and very timely, response.

      We will now become official users. This product will save us a lot of time and trouble.
