Lookaround in regex


  • Lookaround in regex

    I would like to have lookbehind/lookahead in the regex used in grammars, since there are a lot of things that are simply impossible to do without it.

  • #2
    Thanks. Enhancing this support is on our Customer Wishlist. If you have any examples you would like to post, we can add those to our entry on this subject.
    Aaron P Scooter Software


    • #3
      For example, I have a file that looks like this:
      name("strings","and","numbers",1234) = "data"

      Without compromising the string/numeric/punctuation formatting in the middle, I would like the last subscript to be unimportant if the name at the beginning has a specific value. I would also like the data to be unimportant if the last subscript has a specific value.

      Currently I can make a regex that does this, but it cannibalizes the rest of the syntax highlighting since I have to consume all of the text that enabled the condition.
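To illustrate the second request from post #3 (data unimportant when the last subscript has a specific value), here is a minimal sketch in Python, whose `re` module supports lookbehind. It is not Beyond Compare's grammar engine, and the subscript value `1234` and the exact spacing around `=` are assumptions taken from the sample line.

```python
import re

line = 'name("strings","and","numbers",1234) = "data"'

# Fixed-width lookbehind: the data must be preceded by `,1234) = `,
# but that context is not part of the match itself, so the subscripts
# stay available for other rules (strings, numbers, punctuation).
data_if_1234 = re.compile(r'(?<=,1234\) = )"[^"]*"')

m = data_if_1234.search(line)
matched_text = m.group()
matched_span = m.span()

# A line whose last subscript differs is left alone.
other = 'name("strings","and","numbers",5678) = "data"'
no_match = data_if_1234.search(other)

print(matched_text, matched_span, no_match)
```

The first request (last subscript unimportant when the name has a specific value) would need a variable-width lookbehind, because the text between the name and the last subscript has no fixed length. Python's `re` rejects that, so this sketch only covers the fixed-width case.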


      • #4
        I would also like to add my voice to this request.

        I want to use a simple regex such as [a-zA-Z]+(?=\([^)]*\)) to set up a grammar rule for method/function names, without consuming the content within the parentheses. Since lookahead is apparently not supported, I am forced to consume the arguments with my Method grammar rule, which is not preferable.

        Any suggestions for alternate regex would be appreciated.
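For comparison, this is how the pattern from post #4 behaves in an engine that does support lookahead. The sketch uses Python's `re` module rather than Beyond Compare, and the sample line is made up for illustration.

```python
import re

# Hypothetical sample line, not taken from the thread.
text = "result = compute(alpha, 42) + scale(beta)"

# Pattern from post #4: the name must be followed by (...),
# but the parentheses and arguments are not part of the match.
lookahead = re.compile(r"[a-zA-Z]+(?=\([^)]*\))")

# The workaround without lookahead: the arguments are consumed too.
consuming = re.compile(r"[a-zA-Z]+\([^)]*\)")

names = [m.group() for m in lookahead.finditer(text)]
consumed = [m.group() for m in consuming.finditer(text)]

# With lookahead, the match ends right before the opening parenthesis.
first = lookahead.search(text)
rest_starts_with = text[first.end()]

print(names)
print(consumed)
print(rest_starts_with)
```

With the lookahead version, the argument list is still unmatched text, so other rules (numbers, strings) can still apply to it. With the consuming version, the whole call is swallowed by the one rule.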


        • #5
          Yes, lookaround is definitely needed for some cases, which otherwise become a great problem.
          I want to compare floating point numbers and ignore the last digits, that is, achieve something like an epsilon based on the context. For example, when an X is followed by more than 3 digits, a point, and more than 7 digits, I want to ignore the last 3.
          Maybe there's an existing easy solution for this?
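The rule described in post #5 can be sketched outside Beyond Compare as a normalization step that drops the trailing digits before the values are compared. This Python example assumes that "X" is a literal prefix character and that "more than 3" and "more than 7" mean at least 4 and at least 8 digits. The helper name `normalize` is hypothetical.

```python
import re

# X, then 4+ digits, a point, then 8+ decimals in total:
# group 1 keeps everything except the last 3 decimals.
# (?!\d) makes sure the 3 dropped digits really are the last ones.
TRAILING = re.compile(r"(X\d{4,}\.\d{5,})\d{3}(?!\d)")

def normalize(text: str) -> str:
    """Drop the last 3 decimals of every number that fits the rule."""
    return TRAILING.sub(r"\1", text)

a = normalize("value X12345.123456789 end")
b = normalize("value X12345.123456111 end")

# Too few digits before the point, or too few decimals: left unchanged.
short_int = normalize("X123.123456789")
short_frac = normalize("X12345.1234567")

print(a == b, short_int, short_frac)
```

This rewrites the text instead of marking part of a match as unimportant, so it is a preprocessing workaround and not a substitute for lookaround support in the grammar itself.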


          • #6

            Thanks, although there isn't an existing easy solution. Lookbehinds are not currently supported, and Unimportance does not support conditional definitions (if this, then this). You could match abc.1234567 (3 digits, then 7 decimals) as an entire grammar element, but the whole number would be Important or Unimportant, not just the last 3 digits. If you can define a different regular expression without lookahead or lookbehind (such as 3 digits before an End of Line, if your data is structured to always end that way), then it could be defined like that.
            Aaron P Scooter Software


            • #7
              Hi Aaron,
              Thanks for your reply. Unfortunately, the grammar is too complex for a fixed ending to work: a number can have any format, with a variable number of digits before and after the comma, and might occur anywhere within a line. Some numbers have a special prefix string (which in rare cases would need another lookbehind to cover every case), but some don't.

              On 20-Oct-2015, 11:34 PM, you answered that "Enhancing this support is on our Customer Wishlist". I would be glad to post more use cases if this is of any help.


              • #8
                We've got quite a few sample cases, so we probably don't need more, but if you have any unique samples you'd like to include you can post here or email files to support@scootersoftware (with a link back to this forum thread for our reference).
                Aaron P Scooter Software