Making text with variable prefix unimportant - Grammar vs Replacements

  • Making text with variable prefix unimportant - Grammar vs Replacements

    I am trying to compare two text files that contain lines with a format like the one below:

    (Prefix) (Fixed Text) (Variable timestamp)

    The format of each of these items is very easy to define as a regex.

    I want to mark as unimportant the lines having the same (Prefix) (Fixed Text) yet different timestamps.

    Using grammar, I was able to achieve this, but it requires a new grammar element for each prefix. I created a grammar element for Prefix1, Prefix2, Prefix3... and it works really well. The problem is that I have several different prefixes and I would prefer not to enter them manually.
    Creating a grammar element with a regex for the prefix would result in Prefix1 vs Prefix2 being considered unimportant, which is not what I want.

    If I understand replacements correctly, they let you define rules that replace a given string or pattern in one file and then match the result against the content of the other file. Because the timestamp changes from line to line, there is no way to write a rule that transforms the original line into the final one.

    Another option would be to create an external program that replaces the timestamp with a series of identical characters.

    Is there any way to achieve this without manually entering all the prefixes as grammar or using an external program?
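
    For reference, the external-program option mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the three patterns (PREFIX, FIXED, TIMESTAMP) and the sample text are hypothetical placeholders for the real regexes, and single spaces are assumed between the three parts. The timestamp is masked only when the whole line matches the layout, so other lines pass through unchanged.

```python
import re

# Hypothetical patterns: substitute the real regex for each part of the line.
PREFIX = r"[A-Za-z]+\d+"      # e.g. Prefix1, Prefix2 (assumed shape)
FIXED = r"Job finished at"    # the fixed text (assumed)
TIMESTAMP = r"\d{6}"          # e.g. 030405

# Whole-line layout: prefix, fixed text, then the timestamp.
LINE_RE = re.compile(rf"(?P<head>{PREFIX} {FIXED} )(?P<ts>{TIMESTAMP})")


def mask_timestamp(line: str) -> str:
    """Replace the timestamp with X's, but only on lines matching the full layout."""
    body = line.rstrip("\n")
    match = LINE_RE.fullmatch(body)
    if match is None:
        return line  # not a (Prefix) (Fixed Text) (Timestamp) line: leave untouched
    masked = match.group("head") + "X" * len(match.group("ts"))
    return masked + line[len(body):]  # put back any trailing newline


def mask_lines(lines):
    """Apply the mask to every line, e.g. before saving a copy to compare."""
    return [mask_timestamp(line) for line in lines]
```

    Running both files through mask_lines and comparing the masked copies would leave Prefix1 vs Prefix2 as a real difference while the timestamps become identical on both sides.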

  • #2

    If the Prefixes are Important (shouldn't be ignored if they are different: Prefix 1 vs Prefix 2), then can the grammar element be defined to only include the timestamp? Such as something that matches only timestamps at the end of the line?

    Replacements are defined to ignore a specific change. This would help cover the opposite scenario: if you wanted to ignore Prefix 1 vs Prefix 2 specifically, but not Prefix 1 vs Other Changes.
    Aaron P Scooter Software


    • #3
      How are replacements evaluated?

      Can I write a left->right replacement replacing (Prefix)(Fixed)(Timestamp) to (Prefix)(Fixed)XXXXX
      and a right replacement replacing (Prefix)(Fixed)(Timestamp) to (Prefix)(Fixed)XXXXX
      and have both match?


      • #4
        Replacements are evaluated after the text has already been aligned in the comparison. If a specific replacement matches, the difference is marked as Unimportant. Replacements are designed to help with static changes, such as a variable name change.

        Given the above, the right side Replacement "XXXX" value must be specifically defined and does not support masking, so it would have to be the literal timestamp value "10:11am", etc. Would this indicate that you know the exact text of the timestamps that need to be ignored? You could probably create a simpler rule to ignore them.

        For your data, you logically want to ignore a specific pattern if the prefix exists. However, our regular expression support is for the entire defined phrase, not conditional. If there is a simpler logic that could apply, such as only timestamps that appear at the end of the line, then the Prefix and Fixed would remain Important Text (outside the definition), while only the timestamp is then marked as Unimportant.
        Aaron P Scooter Software


        • #5
          Unfortunately, the format for the timestamp is 6 digits: 030405. The file contains other lines with 6 digits at the end of the line.
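
          To illustrate the problem, here is a small sketch with made-up sample lines (both lines are hypothetical, not taken from my files). A pattern that only looks for six digits at the end of the line cannot tell a timestamp from any other six-digit value:

```python
import re

# "Six digits at the end of the line", with no knowledge of the prefix.
END_OF_LINE_TS = re.compile(r"\d{6}$")

# Hypothetical sample lines.
lines = [
    "Prefix1 Job finished at 030405",  # a real timestamp line
    "Order number 123456",             # not a timestamp, but ends in 6 digits
]

# Both lines match, so both endings would be treated as unimportant.
matched = [line for line in lines if END_OF_LINE_TS.search(line)]
```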


          • #6

            Ok, that would be an issue for a general regular expression. Does the above Replacement work for you? It does require the Destination side be the explicit timestamp, and not a mask.
            Aaron P Scooter Software