RxParser non standard?

Aaron Rosenzweig
Hi,

Is there anything newbies should know about Vassili Bykov’s RxParser? It is the de facto tool for parsing regular expressions in Smalltalk, correct?

Even the “String” class has a utility method that uses it.

Strange for it to be the de-facto standard because… it feels limiting -> see below:

I’m trying to “detect” a URL, as in this example:

allRegexMatches: 
'\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[A-Z0-9+&@#/%=~_|]'

I believe it should work. On my Mac, using “RegExRX” I tested that the expression itself is ok.

In Pharo I get “RegexMatchingError: invalid predicate selector”

When I debug into it a bit more I see:

Character(Object)>>doesNotUnderstand: #'//[-A-Z0-9+&@#/%?=~_|!'

I *assume* it is having trouble with the colons. There is one after the “http” part and another in the first [...]* set, right after the exclamation mark. It’s almost as if it wants to treat the text between them as one of the convenience constructs like [:alpha:], and then chokes because it is not one.

doh! Has anyone run into this before?

…..

*Assuming* that we have to escape every colon when we use the RxParser, I modified the code to look like this:

allRegexMatches: 
'\b(https?|ftp|file)\://([-A-Z0-9+&@#/%?=~_|!,.;]|\:)*[A-Z0-9+&@#/%=~_|]'

I tested this in RegExRX and it is fine. In Pharo I no longer get an error, but I also don’t get any results! Just an empty OrderedCollection.

Does anyone have any suggestions or am I out of luck? Do I need a different regular expression tool for Smalltalk besides RxParser? Hopefully I’m just making a simple mistake. 

Hopefully I’m the one who is dain bramaged. 

Thanks in advance,
Aaron Rosenzweig / Chat 'n Bike
e:  [hidden email]  t:  (301) 956-2319


_______________________________________________
seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside

Re: RxParser non standard?

dirk newbold
This worked for me...

'http://time.com/3073948/border-bill-republicians/'
allRegexMatches: '\b(https?|ftp|file)[:]//[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[a-zA-Z0-9+&@#/%=~_|]'

Just put [] around the : and include lowercase a-z in the character sets.

For your reference

Cincom VW7.9
RxParser class side method #c:_ syntax:__
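
A minimal workspace sketch of the two changes described above, assuming a Pharo or VisualWorks image with Vassili Bykov's Regex package loaded. As far as I understand the package's syntax notes, a colon outside a character set begins a :selector: predicate (for example :isVowel:), which would explain the doesNotUnderstand: reported earlier; matching is also case-sensitive by default, which would explain the empty result when an A-Z-only set meets a lowercase URL. The variable names url and pattern are made up for illustration.

```smalltalk
"Workspace sketch; url and pattern are illustrative names, not part of any API."
| url pattern |
url := 'http://time.com/3073948/border-bill-republicians/'.

"[:] is a one-character set, so the colon is not read as the start of a
 :selector: predicate. a-z is added because the sets match case-sensitively."
pattern := '\b(https?|ftp|file)[:]//[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[a-zA-Z0-9+&@#/%=~_|]'.

"Answers a collection of the matched substrings."
url allRegexMatches: pattern.
```

Escaping the colon as \: also avoided the error in the original attempt, so it should presumably work just as well once a-z is added to the sets.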



On Sun, Aug 3, 2014 at 11:53 PM, Aaron Rosenzweig <[hidden email]> wrote:
[...]