Question comparing only first term of line

  • Question comparing only first term of line

    When I do a text compare, I want to check only the first term of each line across the whole page (whether it is missing or renamed); all text that follows to its right should be ignored. How can I achieve this? Scratching my head here...
    [Attached image: lQ1xOU8.png]

    Edit: This should go in the Beyond Compare 4 (not 3) forum, sorry.
    Last edited by olle; 13-Oct-2021, 06:34 AM.

  • #2
    Is it possible to use Table Compare?
    Then you should be able to choose column importance.
    regards
    Rodolfo Giovanninetti



    • #3
      Hello,

      Rodolfo's idea is sound; if your files are delimited or fixed width data, the Table Compare's columns could work.

      Otherwise, you can define a grammar to match on the initial 'column', then in the Session Settings, Importance tab, you would check that custom element name as Important, and uncheck Everything Else (unchecked items are Unimportant).
      http://www.scootersoftware.com/suppo..._unimportantv3

      It looks like your initial grammar might look something like this?
      ^\s+IDP_[^\s]+

      ^ beginning of line
      \s+ one or more whitespace characters
      IDP_ literal
      [^\s]+ one or more Not Whitespace characters
      Aaron P Scooter Software
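
To see what the suggested pattern captures, here is a minimal sketch that uses Python's `re` module as a stand-in for Beyond Compare's regular expression engine (an assumption; the two engines may differ in details). The sample lines are hypothetical and not taken from the poster's files.

```python
import re

# Pattern from the post above. Python's `re` is only a stand-in
# for Beyond Compare's own regex engine (assumption).
pattern = re.compile(r"^\s+IDP_[^\s]+")

# Hypothetical sample lines, not taken from the poster's files.
indented = '\tIDP_EXAMPLE "Some text" "More text";'
flush_left = 'IDP_EXAMPLE "Some text" "More text";'

match = pattern.match(indented)
# The matched span includes the leading tab, because \s+ is part of the pattern.
first_term = match.group(0) if match else None

# \s+ demands at least one whitespace character, so a line that starts
# directly with IDP_ is not matched at all.
no_match = pattern.match(flush_left)

print(repr(first_term), no_match)
```

Two things stand out in this sketch: the leading whitespace becomes part of the matched element, and a line without any indentation is not matched.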



      • #4
        Originally posted by RodolfoGiovanninetti View Post
        Is it possible to use Table Compare?
        Then you should be able to choose column importance.
        regards
        Rodolfo Giovanninetti
        Hi Rodolfo,
        just tried it, but it didn't work (it's not a table, complete line is in one field).
        Thanks
        oli



        • #5
          Originally posted by Aaron View Post
          Hello,
          It looks like your initial grammar might look something like this?
          ^\s+IDP_[^\s]+

          ^ beginning of line
          \s+ one or more whitespace characters
          IDP_ literal
          [^\s]+ one or more Not Whitespace characters
          Hmmm, difficult stuff.
          The "IDP_" literal doesn't work, because there are over a thousand files to compare and they contain many other terms besides "IDP_". Is there a regular expression that matches the complete leading word in a line, so that only those words are compared (yes, there can be tabs and spaces before and after them)?
          Thanks
          oli

          BTW: There's a nice video about regular expressions here https://www.youtube.com/watch?v=sa-TUpSx1JA
          Last edited by olle; 20-Oct-2021, 10:12 AM.



          • #6
            Hi,

            What do you mean by 'defines the complete leading word'? As in, any word? What would qualify as a match and what should be rejected?

            \w can be used for alphanumeric characters, so something like
            ^\s+\w+_[^\s]+
            replaces the literal IDP with '1 or more alphanumeric' but still requires a _ break in the first word.

            ^\s*[^\s]+
            would start at the beginning of the line, optionally have some whitespace, then mark the first 'word' (Not Whitespace) up until the first whitespace break.
            Aaron P Scooter Software
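
The difference between the two patterns in this post can be checked with a small sketch. It assumes Python's `re` behaves like Beyond Compare's engine for these simple constructs; the indentation of the first sample line and the `TITLE` line are hypothetical.

```python
import re

with_underscore = re.compile(r"^\s+\w+_[^\s]+")  # still requires a "_" in the first word
first_word = re.compile(r"^\s*[^\s]+")           # any first word, whitespace optional

lines = [
    # Term taken from later in the thread; the indentation is assumed.
    '  HOTKEY_SELECT_LIVE "Live Selection" "Live Selection Hotkey";',
    # Same line without leading whitespace.
    'HOTKEY_SELECT_LIVE "Live Selection" "Live Selection Hotkey";',
    # Hypothetical term without an underscore.
    '\tTITLE "Some title";',
]

def matched(pattern, line):
    """Return the text the pattern matches at the start of the line, or None."""
    m = pattern.match(line)
    return m.group(0) if m else None

# One (with_underscore, first_word) pair per sample line.
results = [(matched(with_underscore, l), matched(first_word, l)) for l in lines]

for line, pair in zip(lines, results):
    print(repr(line[:24]), pair)
```

In this sketch the first pattern misses both the unindented line and the term without an underscore, while the second pattern marks a first word on every line. Note that in Python `\w` also matches the underscore itself.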



            • #7
              Originally posted by Aaron View Post
              Hi,

              What do you mean by 'defines the complete leading word'? As in, any word? What would qualify as a match and what should be rejected?

              \w can be used for alphanumeric characters, so something like
              ^\s+\w+_[^\s]+
              replaces the literal IDP with '1 or more alphanumeric' but still requires a _ break in the first word.

              ^\s*[^\s]+
              would start at the beginning of the line, optionally have some whitespace, then mark the first 'word' (Not Whitespace) up until the first whitespace break.
              Hi Aaron,
              thanks, this works partly, but it doesn't detect large differences such as missing or changed lines. By "leading word" I mean the first term without whitespace, e.g. in this line
              HOTKEY_SELECT_LIVE "Live Selection" "Live Selection Hotkey";
              I only want to compare "HOTKEY_SELECT_LIVE"; if it is missing or changed, that would be an important difference.
              I uploaded a zip file here that shows the problem (another problem: the folder names are different, "strings_de-DE" in one case and "strings_en-US" in the other). Maybe you can take a look at it?
              Cheers
              Oliver



              • #8
                Hello,

                My hunch on the List vs. Text behavior is that Regular Expression was not enabled (something I forget to do all the time).

                As for these test files, I found that my suggestion doesn't account for the frequent whitespace prefix differences (leading tab vs. leading spaces), because that whitespace is included in the grammar I defined, so any difference in it counts as a difference. And you only want to compare that prefix text (the first column), right?

                Another pattern I note: the other columns are "delimited" by quotes. How about defining a " to " delimited grammar, mark that element as unimportant (uncheck it in the Text Compare's Session Settings, Importance tab) and check Everything Else. Also check Orphan Lines are Always Important.
                Is this closer to what you are looking for?
                Aaron P Scooter Software
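
The idea of treating the quoted columns as unimportant can be sketched outside Beyond Compare as follows. This is only an illustration of the principle, not how Beyond Compare implements it: the helper `important_text`, the whitespace collapsing, and the German sample text are all assumptions, and escaped quotes inside strings are not handled.

```python
import re

# A "quote to quote" element: an opening quote, anything but a quote, a closing quote.
QUOTED = re.compile(r'"[^"]*"')

def important_text(line):
    """Hypothetical helper: drop quoted strings, then collapse whitespace
    so that a tab prefix and a space prefix compare as equal."""
    without_quotes = QUOTED.sub("", line)
    return " ".join(without_quotes.split())

# Sample lines; the German text and the renamed term are made up.
german = '\tHOTKEY_SELECT_LIVE "Live-Selektion" "Hotkey";'
english = '    HOTKEY_SELECT_LIVE "Live Selection" "Live Selection Hotkey";'
renamed = '    HOTKEY_SELECT_LIFE "Live Selection" "Live Selection Hotkey";'

same_term = important_text(german) == important_text(english)
changed_term = important_text(english) != important_text(renamed)

print(important_text(english), same_term, changed_term)
```

With the quoted text removed, the two translations compare as equal while a renamed first term still shows up as a difference, which is the behavior the poster asked for.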



                • #9
                  Originally posted by Aaron View Post
                  Hello,
                  Another pattern I note: the other columns are "delimited" by quotes. How about defining a " to " delimited grammar, mark that element as unimportant (uncheck it in the Text Compare's Session Settings, Importance tab) and check Everything Else. Also check Orphan Lines are Always Important.
                  Is this closer to what you are looking for?
                  It's not only closer, it does the task exactly, thank you!
                  I had to understand that you can not only define the important grammar, but also exclude the unimportant one (with the same results).

                  BTW, would a grammar that matches all the characters to the left of "_" (underscore) and all characters to its right up to the next whitespace, defined as important, do the trick too? Is this possible?

                  HOTKEY_SELECT_LIVE "Live Selection" "Live Selection Hotkey";



                  • #10
                    That's roughly this element from earlier:
                    ^\s+\w+_[^\s]+

                    Which is
                    ^ start of line
                    \s+ at least one whitespace
                    \w+ at least one alphanumeric character
                    _ literal underscore
                    [^\s]+ Not Whitespace, at least one character

                    So perhaps something like
                    [^\s]+_[^\s]+
                    which would be Not Whitespace, with an underscore in it, with no reference to the start of the line? It assumes _ can't appear anywhere else in the file, which seems less resilient than using the Quotes idea.
                    Aaron P Scooter Software
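
The resilience concern with the unanchored pattern can be illustrated with a short sketch, again assuming Python's `re` as a stand-in for Beyond Compare's engine. The second sample line, with an underscore inside the quoted text, is hypothetical.

```python
import re

# Unanchored pattern from the post above: Not Whitespace, an underscore, Not Whitespace.
unanchored = re.compile(r"[^\s]+_[^\s]+")

line = '\tHOTKEY_SELECT_LIVE "Live Selection" "Live Selection Hotkey";'
clean = unanchored.findall(line)

# Hypothetical line where the quoted text also contains an underscore.
risky = '\tHOTKEY_SELECT_LIVE "see_also" "Live Selection Hotkey";'
extra = unanchored.findall(risky)

print(clean)
print(extra)
```

Because nothing ties the pattern to the start of the line, any later word containing an underscore is marked as well, so translated text with an underscore would be compared as important.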
