  1. #21
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,616

    Default

    Hello Arjay,

    1) Determine the length of the shorter filename on the right:
    When I copy and paste "a@b New Text Document New Text Document New Text Document 0123.eml" into BC3, I find that I can get to the 67th position *including* the extension. Since ".eml" is explicitly defined in the regular expression, you only want to count the characters in the main part of the filename, up to the "3". If the "a" is position 1, place the cursor before the "3" in the Text Compare to find that position number (this assumes there is no preceding whitespace).

    2) In the Folder Compare, Session Settings dialog, Misc tab, define the Alignment Override:
    (.{62}).*\.eml
    with
    $1.eml
    X Regular Expression
    Works with this pair of example files. If you replace 65 with 62, does that work for you?
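For anyone who wants to try the pattern outside Beyond Compare, here is a minimal Python sketch of the same substitution. It assumes Python's `re` module treats this pattern the same way Beyond Compare's regex engine does, which may not hold in every detail. The function name `align_name` and the longer sample filename are hypothetical. Python writes the replacement as `\1` where Beyond Compare uses `$1`.

```python
import re

def align_name(left_name: str, keep: int = 62) -> str:
    # Emulates the override "(.{62}).*\.eml" -> "$1.eml" from the post above.
    # Names that do not match (too short, or not .eml) come back unchanged.
    pattern = re.compile(r"^(.{%d}).*\.eml$" % keep)
    return pattern.sub(r"\1.eml", left_name)

# Hypothetical left-side name: the thread's 62-character stem plus extra text.
left = "a@b New Text Document New Text Document New Text Document 0123 extra words.eml"
right = "a@b New Text Document New Text Document New Text Document 0123.eml"

print(align_name(left) == right)
```

The anchors `^` and `$` are an assumption added for the sketch so that the whole filename has to match.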

    If you are still having any trouble, please email us at support@scootersoftware.com with:
    1) Your Support.zip (from the Help menu -> Support; Export)
    2) A link back to this forum post for reference
    3) Two Snapshot files generated from the Tools menu -> Save Snapshot. One for each side of the comparison.

    This will help us quickly compare using your settings on your specific folder structure and see if we can reproduce any of the trouble you have been running into.
    Aaron P Scooter Software

  2. #22
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,616

    Default

    In addition, I work with 2 tabs in Beyond Compare. The Folder Compare tab and Text Compare tabs are both open.

    In the Folder Compare, I
    1) Use the Rename command on the left file. With all the text of the filename highlighted, I press Ctrl+C to copy it to the clipboard.
    2) In the Text Compare tab, I click on the left pane and go to the File menu -> Open Clipboard.
    3) Use the Rename command on the right file and press Ctrl+C to copy the right filename.
    4) In the Text Compare tab, I click on the right pane and go to the File menu -> Open Clipboard.

    This opens a Text Compare session comparing the two file names, quickly showing the differences and where they occur.
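The same check can be sketched in Python for a quick look at where two filenames diverge. This is only an illustration of the idea, not something Beyond Compare provides; `first_difference` is a hypothetical helper and the longer filename is made up.

```python
import os

def first_difference(left_name: str, right_name: str) -> int:
    # 1-based position of the first differing character, or 0 if the names are identical.
    shared = len(os.path.commonprefix([left_name, right_name]))
    if shared == len(left_name) == len(right_name):
        return 0
    return shared + 1

left = "a@b New Text Document New Text Document New Text Document 0123 extra words.eml"
right = "a@b New Text Document New Text Document New Text Document 0123.eml"

print(first_difference(left, right))
```

The returned position minus one is the number of leading characters the two names share, which is the count to put inside `{...}` in the alignment override.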
    Aaron P Scooter Software

  3. #23
    Join Date
    Oct 2010
    Posts
    53

    Thumbs up bingo!

    Quote Originally Posted by Aaron View Post
    Hello Arjay,

    (.{62}).*\.eml
    with
    $1.eml
    X Regular Expression
    Works with this pair of example files. If you replace 65 with 62, does that work for you?
    Yes, thanks, BUT you have to know the length of the filename to do the partial match (62 in this case). If I set it lower, e.g. 50, so that it should match at least the first 50 characters, it doesn't work; it only works with the precise length of the shorter of the two filenames being compared. You cannot say, for example, (.{62,}).*\.eml, meaning match 62 or more.

    Obviously I want to automate the comparison as much as possible, hence the partial match, so having to know each filename length defeats the object; I might as well do a manual Compare To comparison.
    Last edited by arjaydavis; 03-May-2011 at 03:10 PM.

  4. #24
    Join Date
    Oct 2010
    Posts
    53

    Default

    Any thoughts on how I can get it to match "at least" a number of characters, or "1 or more", successfully? The regex reference in the Beyond Compare help defines these, but they don't work in this situation.

  5. #25
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,616

    Default

    Hello,

    The {62,} would be a greedy expression, and would match more than you intend. Since it then matches more of the left file name, past the truncation point, there would be no match on the right side. Writing it as {62,}? makes it non-greedy, but it could then still match more of the left name than you want.

    The goal of the {62} is to make a regular expression that is the length of the truncation and matches the text on the right side, so that it works when it is transplanted onto the right side with $1.eml. In most truncation scenarios, there is a hard character count limit where truncation will occur. If your left file name is shorter than that limit, then it theoretically was not truncated on the right, both file names are of equal length and equal text, and they should align without the assistance of an alignment override.
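The difference between the fixed count and the greedy count can be seen in a small Python sketch. It assumes Python's `re` module is a fair stand-in for Beyond Compare's engine on this point; the longer filename is a made-up example.

```python
import re

left = "a@b New Text Document New Text Document New Text Document 0123 extra words.eml"
right = "a@b New Text Document New Text Document New Text Document 0123.eml"

# Fixed count: the capture stops at exactly 62 characters.
fixed = re.sub(r"^(.{62}).*\.eml$", r"\1.eml", left)

# Greedy count: the capture grows to take everything before ".eml".
greedy = re.sub(r"^(.{62,}).*\.eml$", r"\1.eml", left)

print(fixed == right)   # the fixed count reproduces the truncated name
print(greedy == left)   # the greedy capture leaves the long name unshortened
```

With the greedy form the rewritten left name is still the long name, so it has nothing to align with on the right.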

    Could you go into more detail on your truncation and folders? Do you expect your pair of folders to have files of variable truncation on the right side? If so, what is causing the variable length of the file names?
    Aaron P Scooter Software

  6. #26
    Join Date
    Oct 2010
    Posts
    53

    Post

    Quote Originally Posted by Aaron View Post
    In most truncation scenarios, there is a hard character count limit where truncation will occur.
    This makes sense: the filename would have been truncated to fit within the UDF 1.02 length limit before burning to disc. So all the filenames that were originally longer than that should be the same length, and the fixed value (e.g. 62, 95 as we've used, or whatever) ought to apply.

    But the other problem is that if the truncation results in 2 files with the same name, then the truncation has to append a unique identifier to make the filename unique. So even if we know that the length is always going to be 62, for example, we should also know that the last 1, 2 or maybe 3 characters in the truncated filename will be different from (i.e. not seen in) the original longer filename.

    So our regex needs to be modified to account for the appended unique identifier seen in the truncated filename. I will take a look at what is actually being appended and come back with more info later.

  7. #27
    Join Date
    Oct 2010
    Posts
    53

    Question partial match with unique identifier/index added

    So I accept that the truncation will be fixed every time at a certain value, so...

    A specific example following on from my last comment would be:
    left hand side:
    original non-truncated filenames:
    a@b New Text Document New Text Document New Text Document really quite long part a.eml
    a@b New Text Document New Text Document New Text Document really quite long part b.eml
    a@b New Text Document New Text Document New Text Document really quite long part c.eml

    right hand side:
    truncated files, with an index appended to make each name unique (so that the truncating rename is possible, i.e. doesn't clash with an existing filename when the rename is attempted):
    a@b New Text Document New Text Document New Text Document1.eml
    a@b New Text Document New Text Document New Text Document2.eml
    a@b New Text Document New Text Document New Text Document3.eml

    How would we modify the regex so that the partial match was possible here?
    Last edited by arjaydavis; 04-May-2011 at 08:55 AM.

  8. #28
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,616

    Default

    Hello,

    Unfortunately, this is the scenario I was worried about in my earlier reply to this forum thread. If
    "a@b New Text Document New Text Document New Text Document really quite long part a.eml"
    shortened to
    "a@b New Text Document New Text Document New Text Documenta.eml"
    Then we could work on a regular expression to catch this.

    If it shortens to:
    "a@b New Text Document New Text Document New Text Document1.eml"
    then the "1" is different text that must be explicitly defined.
    The right side would be:
    $1\1.eml
    or
    $1\2.eml

    With multiple possible matches, it is not guaranteed that part a will align with 1; it may align to 2 or 3 since the "a" is not a part of the regular expression on the left. You could alter the regular expression to match:
    (.{57}).*a\.eml
    to
    $1\1.eml

    This assumes the "a" is literal. Is there an "a", or something similar, that could be defined and matched on with the Left regular expression?
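A short Python sketch of why the appended index is a problem, using the three example names from this thread. It assumes Python's `re` module as a stand-in for Beyond Compare's engine. The mapping of "a" to "1" is the literal case described above, not a general rule.

```python
import re

lefts = [
    "a@b New Text Document New Text Document New Text Document really quite long part a.eml",
    "a@b New Text Document New Text Document New Text Document really quite long part b.eml",
    "a@b New Text Document New Text Document New Text Document really quite long part c.eml",
]

# Plain fixed-length override: all three left names collapse to one name,
# so the override alone cannot tell which right-hand file each belongs to.
collapsed = {re.sub(r"^(.{57}).*\.eml$", r"\1.eml", name) for name in lefts}
print(len(collapsed))

# Literal "a" mapped to index "1": only the first name is rewritten.
mapped = [re.sub(r"^(.{57}).*a\.eml$", r"\g<1>1.eml", name) for name in lefts]
print(mapped[0])
```

Each further pairing (b to 2, c to 3) would need its own literal rule, which only works if the letter-to-index mapping is known in advance.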
    Aaron P Scooter Software

  9. #29
    Join Date
    Oct 2010
    Posts
    53

    Cool

    Quote Originally Posted by Aaron View Post
    This assumes the "a" is literal. Is there an "a", or something similar, that could be defined and matched on with the Left regular expression?
    The appended value is variable so we are going to be unlucky with your suggestion as you feared.

    However I had about 60 partially matched files to compare and I counted the fixed truncated length that they were at and came up with the following regex setup, based on the suggestion already made here:

    left hand:
    (.{123}).*\.eml

    right hand:
    $1.eml

    This matched the files where the name on one side was a pure partial match of the name on the other.

    The remaining files not matched were those with a unique value appended on the truncated version as discussed in my last comment.

    I can test these for being identical (and therefore purge them) using MindGems Fast Duplicate File Finder, which I also have a license for.

    I put both folders into the program: the first is the folder whose contents I want to keep intact, and the second is the folder in which I want to purge the duplicates, leaving only the files that are not present in the first folder, which I will want to merge in.

    I make sure I keep the first folder intact by right-clicking on it and, in the pop-up, selecting "disable auto-scan for this folder", so that the program doesn't remove files from that folder but only from the one I want to purge.
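The idea behind this step, matching by content rather than by name, can be sketched in Python. This is not how Fast Duplicate File Finder works internally (that is not known here); it is only an illustration with hypothetical function names, and it reports duplicates without deleting anything.

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    # SHA-256 of the file contents, read in chunks to keep memory use low.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def duplicates_in_purge(keep_dir: Path, purge_dir: Path) -> list[Path]:
    # Files under purge_dir whose contents also exist somewhere under keep_dir.
    # Filenames are ignored, so truncated or re-indexed names do not matter.
    keep_hashes = {file_digest(p) for p in keep_dir.rglob("*") if p.is_file()}
    return sorted(
        p for p in purge_dir.rglob("*")
        if p.is_file() and file_digest(p) in keep_hashes
    )
```

Calling `duplicates_in_purge(Path("keep"), Path("purge"))` with two real folder paths would list the candidates for removal; anything not listed is a file to merge in.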

    Another preventative measure is to turn off Unicode when burning the DVD so that single bytes are used for filenames, which doubles the available length and reduces or eliminates truncation. This is fine if the filenames are ASCII only, and it still complies with the UDF standard, as I believe (from reading the ImgBurn forums) there is a bit in the standard that indicates whether Unicode is in use.

    Another measure is to write a script that uniquely truncates the source files before burning. This can get complicated if the files are referenced from other files, since the references would be broken. For me this may not be an issue, because my files tend to be standalone .eml records of important or memorable emails I want to keep.
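A minimal sketch of such a pre-burn truncation script in Python. The limit of 58 characters, the numeric-suffix scheme, and the name `unique_truncate` are all assumptions for illustration; the real limit depends on the burning software and the UDF settings. It only computes the new names and does not rename anything on disk.

```python
def unique_truncate(names, limit=58, ext=".eml"):
    # Returns {original name: truncated name}, with stems cut to `limit`
    # characters and a numeric tag replacing the tail when names collide.
    used = set()
    result = {}
    for name in names:
        has_ext = name.endswith(ext)
        stem = name[:-len(ext)] if has_ext else name
        suffix = ext if has_ext else ""
        candidate = stem[:limit] + suffix
        index = 1
        while candidate.lower() in used:  # compare case-insensitively
            tag = str(index)
            candidate = stem[:limit - len(tag)] + tag + suffix
            index += 1
        used.add(candidate.lower())
        result[name] = candidate
    return result

names = [
    "a@b New Text Document New Text Document New Text Document really quite long part a.eml",
    "a@b New Text Document New Text Document New Text Document really quite long part b.eml",
    "a@b New Text Document New Text Document New Text Document really quite long part c.eml",
]
for original, short in unique_truncate(names).items():
    print(short, "<-", original)
```

Keeping the returned mapping means the original-to-truncated pairs are known after burning, so a later comparison does not have to guess them.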



    The best solution would be for Beyond Compare to have a more advanced file selection 'engine', perhaps in a future BC4 release, that can operate on partial matches, with the option for the user to click a button to cycle through the candidates when a single match cannot be determined. This could be coupled with a selection precedence system in which file size can take precedence over filename when finding a match (I've discussed this as a wish in another thread).

    Therefore it's good that the Beyond Compare featureset hasn't plateaued and that there are opportunities for more releases - and revenue streams for you.

    In the meantime, there is no silver bullet for my need, I would think, but there are several partial solutions.

    I've outlined them here too:
    http://superuser.com/questions/27840...tools-software

  10. #30
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,616

    Default

    Hello,

    Thanks for the detailed summary and suggestions. We do have a content alignment method on our Customer Wishlist, and I've added your current workflow as a use-case example.
    Aaron P Scooter Software
