regex inconsistency

  • mdinowitz
    New User
    • Apr 2012
    • 2

    regex inconsistency

    I have a slightly complex regex using a negative lookahead which works in the text compare area but fails in the File Formats -> Grammar section. The regex looks for the start of a ColdFusion comment and then continues to a closing comment, as long as there is not another start comment between the two. This is to deal with nested comments. See example below:
    <!---(.(?!<!---))+.--->

    Is there something I'm missing in the grammar section? Are there two different implementations of regex in BC3?

    Thanks
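
As a rough illustration of what the pattern above does in an engine that supports lookahead, here is a minimal sketch using Python's `re` module. Python is only a stand-in here (the Find dialog uses PCRE), and the sample string is made up for the example.

```python
import re

# The pattern from the post, unchanged.
COMMENT_RE = re.compile(r"<!---(.(?!<!---))+.--->")

# Hypothetical sample: an outer comment that is still open around a closed inner one.
sample = "<!--- outer <!--- inner --->"

# The negative lookahead stops a match from running across a second "<!---",
# so the search skips the outer opener and starts at the inner one.
match = COMMENT_RE.search(sample)
innermost = match.group(0) if match else None
print(innermost)
```

Because `+` is greedy, the match extends to the last `--->` that can be reached without crossing another `<!---`, so text with several closers may match more than the inner comment alone.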
  • Zoë
    Team Scooter
    • Oct 2007
    • 2666

    #2
    Yes, there are two implementations. The "Find" dialog uses a fairly recent version of PCRE, so it should support anything you need. The grammar parser is our own, because we need it to parse all of the REs in parallel, and PCRE can't handle that.

    We do plan to replace it with a more powerful parser in the future, but for now these are the only things it supports (assuming our comments aren't out of date).

    Code:
    	Metacharacters
    	--------------
    	\			Escape
    	^			Assert start of line
    	$			Assert end of line
    	.			Match any character
    	[			Start character set
    	|			Start alternative
    	(			Start subpattern
    	)			End subpattern
    	?			0 or 1 iterations (equal to {0,1})
    	*			0 or more iterations (equal to {0,})
    	+			1 or more iterations (equal to {1,})
    	{			Start iterator
    	}			End iterator
    
    	Metacharacters In Character Sets
    	--------------------------------
    	\			Escape
    	^			Invert, only if first char
    	-			Range, only if surrounded by chars
    	]			End character set, only if not first char (second if first is ^)
    
    	Escaped Characters
    	------------------
    	\a			Alarm/bell (0x07)
    	\t			Tab (0x09)
    	\f			Formfeed (0x0C)
    	\e			Esc (0x1B)
    *	\cx			"Control-x", where x is any character
    	\ddd		Character with the oct code ddd, 1 to 3 oct digits
    	\xhh		Character with the hex code hh, 0 to 2 hex digits
    	\x{hhh..}	Character with the hex code hhh.., max of 7FFFFFFF
    
    	Escaped Character Sets
    	----------------------
    	\d			Any decimal digit (equal to [0-9])
    	\D			Any non decimal digit (equal to [^0-9])
    	\s			Any whitespace character (equal to [\t\f ])
    	\S			Any non whitespace character (equal to [^\t\f ])
    	\w			Any "word" character (equal to [_a-zA-Z0-9])
    	\W			Any "non-word" character (equal to [^_a-zA-Z0-9])
    
    	Iterators (Greedy)
    	------------------
    	{n}			n iterations (equal to {n,n})
    	{n,}		n or more iterations
    	{n,m}		at least n but no more than m iterations
    Zoë P Scooter Software
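
One way to spot why a pattern might fail in the grammar section is to scan it for constructs that are missing from the list above. The sketch below is a rough helper under stated assumptions, not part of Beyond Compare: the function name `unlisted_constructs` is hypothetical, it only looks for `(?...)` groups and backslash escapes whose letter is not in the list, and it does not treat character sets specially or detect lazy quantifiers.

```python
from typing import List

# Escape letters taken from the list above (\a \t \f \e \cx \xhh \d \D \s \S \w \W).
LISTED_ESCAPE_LETTERS = set("atfecxdDsSwW")


def unlisted_constructs(pattern: str) -> List[str]:
    """Return constructs in `pattern` that do not appear in the list above."""
    found = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Digits (octal escapes) and punctuation are accepted;
            # letters must be one of the listed escapes.
            if nxt.isalpha() and nxt not in LISTED_ESCAPE_LETTERS:
                found.append("\\" + nxt)
            i += 2  # skip the escaped character
            continue
        if pattern.startswith("(?", i):
            # Covers lookahead such as (?= and (?!, plus other (?...) groups.
            found.append(pattern[i:i + 3])
        i += 1
    return found


print(unlisted_constructs(r"<!---(.(?!<!---))+.--->"))
```

For the pattern from the first post, this flags the `(?!` lookahead, which is consistent with it working in the Find dialog but not in the grammar parser.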


    • mdinowitz
      New User
      • Apr 2012
      • 2

      #3
      I'd be more than happy to test out any regex changes you make to BC. I use them a LOT and I'd like to upgrade the ColdFusion language file format.

      Thanks


      • Andrew Tawil
        Visitor
        • Jun 2013
        • 3

        #4
        I would also find this feature very useful. It is a little annoying to have to work around regular expressions that work great in the file find utility but not in the grammar settings.


        • quarky
          Visitor
          • Oct 2011
          • 6

          #5
          Any news on RE in the grammar parser?

          Originally posted by Zoë
          We do plan to replace it with a more powerful parser in the future
          Is there any news?
          I'd also love to see improvements there...
          Thanks, Frank


          • Aaron
            Team Scooter
            • Oct 2007
            • 16002

            #6
            Hello,

            We've had a few small tweaks, but not a big overhaul yet. It's still on our wishlist.
            Aaron P Scooter Software


            • SarahFlexBox
              New User
              • Mar 2019
              • 2

              #7
              I've tried so many times to create complex regexes for what I need, regexes which work in other testers ... glad to finally understand why they don't work in BC. +1 vote for upgrading this to a standard parser
              Also - this workaround for creating a grammar item for multiline comments was exactly what I needed: https://www.scootersoftware.com/vbul...-line-re-match


              • cyberchicken
                Enthusiast
                • Mar 2015
                • 26

                #8
                Originally posted by Aaron
                Hello,

                We've had a few small tweaks, but not a big overhaul yet. It's still on our wishlist.
                What about the other way around?
                I use RegexBuddy to create and convert RegExps; maybe you could send the specs or even the source of your parser to the man (Jan Goyvaerts)


                • Aaron
                  Team Scooter
                  • Oct 2007
                  • 16002

                  #9
                  Next to each text box, you can click the dropdown arrow to see a list of supported RegEx characters for that text box (note: with two boxes, each might support a different subset). We also have a Regular Expression Reference chapter in the Help file.
                  Aaron P Scooter Software


                  • cyberchicken
                    Enthusiast
                    • Mar 2015
                    • 26

                    #10
                    Yes of course, I thought there could be something else hidden somewhere
                    Thank you!
