Compare files with partially matching file names?

  • MattVsMatt
    Visitor
    • Oct 2011
    • 6

    #61
    Hi, I'm new here, hope this is an appropriate place to post this question - please advise if there is a better place!
    I'm trying to use an alignment override to match the last N characters of filenames on the left with the same set of N characters occurring anywhere in the names of files on the right - e.g.:
    match
    "05 Little Child.mp3"
    to
    "The Beatles - With The Beatles - 05 - Little Child.mp3" (on the right).
    I've tried with regular expressions such as
    (........)$ = .*\1.*
    and
    (........)$ = .*$0.*
    but everything I've tried succeeds only in aligning files whose names are completely identical.
    What am I missing here?
    - Matt


    • Aaron
      Team Scooter
      • Oct 2007
      • 15997

      #62
      The Alignment Overrides only support Regular Expressions on the Left side, so you'll want to Swap sides so the longer file name is on the Left. Then you can use a Regular Expression similar to:
      (.*)(\d\d) - (.*)\.mp3
      to align with the right of:
      $2 $3.mp3

      The definition captures the front of the left file name in $1, then discards it when building the name to match on the right.
      Aaron P Scooter Software
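To make the left-regex / right-template idea in the post above concrete, here is a minimal Python sketch of the logic as described in this thread. It is not Beyond Compare's implementation: the function name `expected_right_name` is hypothetical, and the sketch assumes the left pattern has to match the whole left file name.

```python
import re

def expected_right_name(left_pattern: str, right_template: str, left_name: str):
    """Return the right-side name the override would look for, or None.

    Assumption: the left pattern must match the whole left file name.
    """
    m = re.fullmatch(left_pattern, left_name)
    if m is None:
        return None
    # Substitute each $N in the template with the text of capture group N.
    return re.sub(r"\$(\d+)", lambda ref: m.group(int(ref.group(1))), right_template)

left = "The Beatles - With The Beatles - 05 - Little Child.mp3"
name = expected_right_name(r"(.*)(\d\d) - (.*)\.mp3", "$2 $3.mp3", left)
print(name)  # 05 Little Child.mp3
```

Group 1 swallows the artist and album prefix and is simply never referenced on the right, which is how the prefix gets dropped.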


      • MattVsMatt
        Visitor
        • Oct 2011
        • 6

        #63
        Originally posted by Aaron
        The Alignment Overrides only support Regular Expressions on the Left side, so you'll want to Swap sides so the longer file name is on the Left. Then you can use a Regular Expression similar to:
        (.*)(\d\d) - (.*)\.mp3
        to align with the right of:
        $2 $3.mp3

        The definition captures the front of the left file name in $1, then discards it when building the name to match on the right.
        Aaron,

        Thanks for the tip, but it hasn't solved my problem. I tried your suggested alignment override spec and it seems to work as intended, but it's too specific. In my file set, many of the files on the side with longer filenames don't conform to the format of the example I gave yesterday. So what I'm actually trying to do right now is align files whose names match in the last 8 characters, regardless of whatever else may be in the name. On the left I'm now attempting with:

        (.*)(........)\.mp3

        and on the right with:

        $2.mp3

        What I'm getting with this and pretty much everything else I've tried is that, for example, files with the name "03 Track 03.mp3" on both sides get aligned, but a file on the left with the name "01 Track 01.mp3" does not get matched to a file on the right with the name "01. Track 01.mp3". These filenames are identical in the last 8 characters, so evidently I'm doing this wrong, or else the alignment override facility doesn't completely work as intended.

        In fact, almost all of the files on both sides have names with more than 8 characters. I don't really see how it's possible to not use a regular expression on the right as well. Does there not need to be something before the "$2" to match the right filename characters before the last 8? And in general, wouldn't it be incredibly limiting if regular expressions can be applied to only the left filename? BC doesn't complain if I use:

        (.*)$2.mp3

        on the right, but in fact it makes no difference in the outcome. If it's true that regular expressions can't be used on the right, it'd be great if the UI made that more clear and / or if BC would reject attempts to do so.

        Do you have any suggestions or ideas? It seems to me that something like what I'm trying to do shouldn't be so difficult, unless for some reason it's just not possible.

        Ultimately I'd like to be able to align files if the last 8 characters of the left file name appear as an identical string anywhere in the right filename – or (better than regexp matching in this case) if files could be aligned according to size.

        Matt
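For reference, the goal stated in this post (the last 8 characters of the left name appearing anywhere in the right name) can be sketched outside Beyond Compare in a few lines of Python. This is only an illustration of the matching rule, not a Beyond Compare feature; `pair_by_suffix` is a hypothetical helper, and leaving ambiguous cases unpaired is an assumption of this sketch.

```python
from pathlib import PurePath

def pair_by_suffix(left_names, right_names, n=8):
    """Pair each left name with the single right name containing its last n stem characters."""
    pairs = {}
    for left in left_names:
        key = PurePath(left).stem[-n:]  # last n characters before the extension
        candidates = [r for r in right_names if key in PurePath(r).stem]
        # Only accept an unambiguous match; otherwise leave the file unpaired.
        pairs[left] = candidates[0] if len(candidates) == 1 else None
    return pairs

pairs = pair_by_suffix(
    ["05 Little Child.mp3", "01 Track 01.mp3"],
    ["The Beatles - With The Beatles - 05 - Little Child.mp3", "01. Track 01.mp3"],
)
```

With generic names such as "Track 01" spread across several albums, many keys would be ambiguous, so a real script would need a tie-breaking rule.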


        • Aaron
          Team Scooter
          • Oct 2007
          • 15997

          #64
          Hello,

          The issue is that the Right (match to) side requires the full name to be defined and does not support masking. BC4 doesn't complain, but it treats "(.*)" as literal characters, so it looks for a right-side file actually named "(.*)Track 01.mp3". No file has that name, so the alignment finds nothing to match.

          The "." in the right-side file name makes the right mask more literal and harder to write: because the left name has no such character, it has to be spelled out explicitly on the right. Without the period, a broader, simpler mask would work.
          Aaron P Scooter Software


          • MattVsMatt
            Visitor
            • Oct 2011
            • 6

            #65
            Hi Aaron, I'm still confused. I think what you just told me is that the tactic I'm trying to use (matching only the last 8 characters of the file names) is not actually possible, because there's no way to specify only a partial name on the right side. Is that right, or can you suggest a way to achieve that?

            I'm not deeply initiated into regular expressions – definitely no guru – so when stuff like this doesn't work I'm always quick to suspect that there's something I haven't understood correctly – maybe you can shed some light here:

            If the name must be fully defined on the right, then I don't understand why my spec (in particular the "$2.mp3") is able to match anything at all. It seems to me that "$2" should correspond to the group "(........)" on the left (in "(.*)(........)\.mp3"). But that group should almost never match the complete file name on either left or right sides. Can you clarify for me why it is nevertheless able to find matches (but only when the names are identical on both sides)?

            Would you say that I'm misunderstanding some detail of regular expression formulation and that my spec is doing something different from what I intend it to do? If so – what is it doing?
            Last edited by MattVsMatt; 25-Oct-2017, 01:31 PM.


            • Aaron
              Team Scooter
              • Oct 2007
              • 15997

              #66
              Short answer: yes, that's right. You won't be able to define a RegEx that matches this.

              There's nothing technically wrong with your RegEx handling, it's just a matter of the limitations of our program or the logic of the definition. For example, $2.mp3 would align if the right file was exactly: "Track 01.mp3". Since the right file has extra characters in the front, then it's not a match so it's not aligning. Our "right" side does not support masking the prefix characters; it only supports $1, $2, etc, so any different characters that don't match need to be explicitly defined.
              Aaron P Scooter Software
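The behaviour described in the post above can be reproduced with a small Python sketch. It models the rule as stated in this thread (expand $N on the right, then require an exact name match); the helper names are hypothetical, and full-name matching on the left is an assumption rather than documented Beyond Compare behaviour.

```python
import re

def expand_right(left_pattern, right_template, left_name):
    """Build the literal right-side name from the left match, or return None."""
    m = re.fullmatch(left_pattern, left_name)  # assumption: whole-name match
    if m is None:
        return None
    return re.sub(r"\$(\d+)", lambda ref: m.group(int(ref.group(1))), right_template)

def aligns(left_pattern, right_template, left_name, right_names):
    """Return the aligned right name only if the expanded name exists exactly."""
    target = expand_right(left_pattern, right_template, left_name)
    return target if target in right_names else None

pattern, template = r"(.*)(........)\.mp3", "$2.mp3"
target = expand_right(pattern, template, "01 Track 01.mp3")
print(target)  # Track 01.mp3
print(aligns(pattern, template, "01 Track 01.mp3", ["01. Track 01.mp3"]))  # None
```

The expanded name is "Track 01.mp3", so a right file called "01. Track 01.mp3" is never a candidate: nothing in the template can stand in for its "01. " prefix.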


              • MattVsMatt
                Visitor
                • Oct 2011
                • 6

                #67
                Thanks! That's clear enough. Although – at your convenience – I would still appreciate your insight as to how my spec is able to match identical filenames on left and right even though they are longer than 8 characters.

                Moving away from that: is BC able to align files based solely on file size? For my current application, that would actually be a better way to go.

                - Matt


                • Aaron
                  Team Scooter
                  • Oct 2007
                  • 15997

                  #68
                  Hello,

                  File names are the only criterion we currently support aligning on, so we won't be able to match on file size.

                  File names longer than 8 characters shouldn't be a barrier, but the "." on the right that isn't on the left needs to be included in the right name. Let's remove the Regular Expression and insert the literal characters:
                  01 Track 01.mp3 <=> 01. Track 01.mp3
                  which could also be:
                  (01) (Track 01).mp3 <=> $1. $2.mp3

                  Note the ". " between $1 and $2. This mask can be shifted a little to also include the " "
                  (01)( Track 01).mp3 <=> $1.$2.mp3
                  Because the "." doesn't exist in the left name, nothing on the left can capture it, so it has to be written literally in the right-side definition: the right file name has that character as part of its track number, while the left has only a space. Once this is working for one file, you can begin replacing the interiors of (01) and (Track 01) with regular expressions to match a larger variety of file names.
                  Aaron P Scooter Software
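As an illustration of that last step, here is one possible generalization of `(01)( Track 01).mp3 <=> $1.$2.mp3`, checked in Python. The pattern `(\d\d)( .*)\.mp3` is an assumption chosen for this sketch and does not come from the thread; the helper `expand_right` is hypothetical and models $N substitution followed by an exact name comparison.

```python
import re

def expand_right(left_pattern, right_template, left_name):
    """Build the literal right-side name from the left match, or return None."""
    m = re.fullmatch(left_pattern, left_name)  # assumption: whole-name match
    if m is None:
        return None
    return re.sub(r"\$(\d+)", lambda ref: m.group(int(ref.group(1))), right_template)

# Two-digit track number, then a space and the rest of the title.
pattern = r"(\d\d)( .*)\.mp3"
# The "." after $1 is literal: it supplies the character missing on the left.
template = "$1.$2.mp3"

results = {name: expand_right(pattern, template, name)
           for name in ["01 Track 01.mp3", "03 Track 03.mp3"]}
print(results["01 Track 01.mp3"])  # 01. Track 01.mp3
```

This only helps when every right-side name differs from its left counterpart in the same fixed way (here, an added "." after the track number).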


                  • MattVsMatt
                    Visitor
                    • Oct 2011
                    • 6

                    #69
                    That all makes sense. I very much appreciate your patient help! Now I understand that I'm trying to get BC to do something it's not currently able to do - and just as important: I understand why!

                    Best,
                    Matt
