
regex inconsistency


  • regex inconsistency

    I have a slightly complex regex using a negative lookahead. It works in the text compare area but fails in the File Formats -> Grammar section. The regex looks for the start of a ColdFusion comment and then continues to a closing comment, as long as there is not another start comment between the two. This is to deal with nested comments. See example below:

    Is there something I'm missing in the grammar section? Are there two different implementations of regex in BC3?
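    For readers following along, here is a minimal sketch of the kind of pattern described above. It is not the original poster's regex (that example did not survive in the thread); it assumes the ColdFusion delimiters `<!---` and `--->` and uses Python's `re` module only to illustrate the negative-lookahead technique.

```python
import re

# Assumed ColdFusion comment delimiters (not taken from the original post).
OPEN, CLOSE = "<!---", "--->"

# After the opening delimiter, consume one character at a time, but only
# while the text at that position does not begin another opening delimiter.
# The result is that only a comment with no nested opener inside it matches.
INNERMOST_COMMENT = re.compile(
    re.escape(OPEN) + r"(?:(?!" + re.escape(OPEN) + r").)*?" + re.escape(CLOSE),
    re.DOTALL,
)

sample = "a <!--- outer <!--- inner ---> still outer ---> b"
match = INNERMOST_COMMENT.search(sample)
print(match.group(0))  # prints: <!--- inner --->
```

    A pattern of this shape relies on `(?!...)` and a lazy `*?`, which is why it can behave differently between two regex implementations.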


  • #2
    Yes, there are two implementations. The "Find" dialog uses a fairly recent version of PCRE, so it should support anything you need. The grammar parser is our own, because we need it to parse all of the REs in parallel, and PCRE can't handle that.

    We do plan to replace it with a more powerful parser in the future, but for now these are the only things it supports (assuming our comments aren't out of date).

    	\			Escape
    	^			Assert start of line
    	$			Assert end of line
    	.			Match any character
    	[			Start character set
    	|			Start alternative
    	(			Start subpattern
    	)			End subpattern
    	?			0 or 1 iterations (equal to {0,1})
    	*			0 or more iterations (equal to {0,})
    	+			1 or more iterations (equal to {1,})
    	{			Start iterator
    	}			End iterator
    	Metacharacters In Character Sets
    	\			Escape
    	^			Invert, only if first char
    	-			Range, only if surrounded by chars
    	]			End character set, only if not first char (second if first is ^)
    	Escaped Characters
    	\a			Alarm/bell (0x07)
    	\t			Tab (0x09)
    	\f			Formfeed (0x0C)
    	\e			Esc (0x1B)
    *	\cx			"Control-x", where x is any character
    	\ddd		Character with the oct code ddd, 1 to 3 oct digits
    	\xhh		Character with the hex code hh, 0 to 2 hex digits
    	\x{hhh..}	Character with the hex code hhh.., max of 7FFFFFFF
    	Escaped Character Sets
    	\d			Any decimal digit (equal to [0-9])
    	\D			Any non decimal digit (equal to [^0-9])
    	\s			Any whitespace character (equal to [\t\f ])
    	\S			Any non whitespace character (equal to [^\t\f ])
    	\w			Any "word" character (equal to [_a-zA-Z0-9])
    	\W			Any "non-word" character (equal to [^_a-zA-Z0-9])
    	Iterators (Greedy)
    	{n}			n iterations (equal to {n,n})
    	{n,}		n or more iterations
    	{n,m}		at least n but no more than m iterations
    Zoë P Scooter Software
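    As a rough aid, the list above can be turned into a quick check before pasting a pattern into the grammar settings. The sketch below is only a heuristic: `unlisted_constructs` is a hypothetical helper, not part of Beyond Compare, and it flags just three things that do not appear in the list (`(?` groups such as lookaheads, lazy or possessive quantifiers, and `\b`/`\B`). The list itself may be out of date, so a clean result does not guarantee that the grammar parser accepts the pattern.

```python
def unlisted_constructs(pattern: str) -> list[str]:
    """Flag constructs that are absent from the grammar parser's list.

    Hypothetical helper for illustration; it scans the pattern once,
    skipping escaped characters and the contents of character sets.
    """
    findings: list[str] = []
    i = 0
    in_set = False
    while i < len(pattern):
        ch = pattern[i]
        nxt = pattern[i + 1 : i + 2]  # empty string at the end of the pattern
        if ch == "\\":
            if not in_set and nxt in ("b", "B"):
                findings.append(f"\\{nxt} at offset {i}: word-boundary assertion")
            i += 2  # skip the escaped character
        elif in_set:
            if ch == "]":
                in_set = False
            i += 1
        elif ch == "[":
            in_set = True
            i += 1
            if pattern[i : i + 1] == "^":
                i += 1
            if pattern[i : i + 1] == "]":
                i += 1  # a leading ] is a literal member of the set
        elif ch == "(" and nxt == "?":
            findings.append(f"(? at offset {i}: lookaround or other extended group")
            i += 2
        elif ch in "*+?}" and nxt in ("?", "+"):
            findings.append(f"{ch}{nxt} at offset {i}: lazy or possessive quantifier")
            i += 2
        else:
            i += 1
    return findings


for finding in unlisted_constructs(r"<!---(?:(?!<!---).)*?--->"):
    print(finding)
```

    For the lookahead-style pattern in the loop above, this reports two `(?` groups and one lazy quantifier, while a pattern such as `<!---[^-]*--->` comes back with no findings.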


    • #3
      I'd be more than happy to test out any regex changes you make to BC. I use them a LOT and I'd like to upgrade the ColdFusion language file format.



      • #4
        I would also find this feature very useful. It is a little annoying to have to work around regular expressions that work great in the file find utility but not in the grammar settings.


        • #5
          Any news on RE in the grammar parser?

          Originally posted by Zoë
          We do plan to replace it with a more powerful parser in the future
          Is there any news?
          I'd also love to see improvements there ...
          Thanks, Frank


          • #6

            We've had a few small tweaks, but not a big overhaul yet. It's still on our wishlist.
            Aaron P Scooter Software