GST's regex syntax..


GST's regex syntax..

Rick Flower-2
Ok.. So I'm chewing text again.. this time with regular expressions..

I'm having problems with repeating patterns that I'm hoping to solve..

If I've got the string "My DOG and CAT ate the BIG blue BIRD" in the
file "foo.txt" and want to find all uppercase 3 & 4 letter words I was
hoping to use the following code snippet :

| fileStream coll |

fileStream := FileStream open: 'foo.txt' mode: FileStream read.
coll := OrderedCollection new.
fileStream linesDo: [:line |
        (line =~ '([A-Z]{3,4})+') ifMatched: [:match | coll add: match].
].
coll inspect.

However the result is just DOG -- the others are missing.. What is wrong
with the repeating aspect of it where there are multiple hits within a
single line?  It works fine for one hit per line but not multiples..
Any ideas?  Thanks!

_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk

Re: GST's regex syntax..

Leszek Kowalczyk

Hi,

> Date: Tue, 23 Oct 2012 12:48:38 -0700
> From: [hidden email]
> To: [hidden email]
> Subject: [Help-smalltalk] GST's regex syntax..
>
> Ok.. So I'm chewing text again.. this time with regular expressions..
>
> I'm having problems with repeating patterns that I'm hoping to solve..
>
> If I've got the string "My DOG and CAT ate the BIG blue BIRD" in the
> file "foo.txt" and want to find all uppercase 3 & 4 letter words I was
> hoping to use the following code snippet :
>
> | fileStream coll |
>
> fileStream := FileStream open: 'foo.txt' mode: FileStream read.
> coll := OrderedCollection new.
> fileStream linesDo: [:line |
> (line =~ '([A-Z]{3,4})+') ifMatched: [:match | coll add: match].
> ].
> coll inspect.
>
> However the result is just DOG -- the others are missing.. What is wrong
> with the repeating aspect of it where there are multiple hits within a
> single line?  It works fine for one hit per line but not multiples..
> Any ideas?  Thanks!

From the manual:

1.155.9 String: regex


=~ pattern
    Answer a RegexResults object for matching the receiver against
    the Regex or String object pattern.

allOccurrencesOfRegex: pattern
    Find all the matches of pattern within the receiver and
    collect them into an OrderedCollection.

allOccurrencesOfRegex: pattern do: aBlock
    Find all the matches of pattern within the receiver and
    pass the RegexResults objects to aBlock.

So =~ only looks for the first occurrence. Try allOccurrencesOfRegex: instead; its result is an OrderedCollection holding all the matches.
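
For illustration, here is a minimal sketch of that approach, not tested against any particular GST release. It assumes that RegexResults answers the matched text via #match, and it uses a literal string in place of a line read from foo.txt. The pattern is simplified to '[A-Z]{3,4}', since the outer group and + are not needed once every occurrence is visited; note that without word-boundary handling it would also match inside longer runs of capitals.

```smalltalk
| line words |

"Stand-in for one line of foo.txt (assumption: same text as in the question)."
line := 'My DOG and CAT ate the BIG blue BIRD'.
words := OrderedCollection new.

"Visit every match on the line, not just the first one.
 Each match is passed in as a RegexResults object; #match is
 assumed to answer the matched substring."
line allOccurrencesOfRegex: '[A-Z]{3,4}' do: [:each |
    words add: each match].

words printNl.
```

With the sample line this should collect DOG, CAT, BIG and BIRD. Inside the original linesDo: block, the same allOccurrencesOfRegex:do: call would take the place of the =~ ... ifMatched: expression.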

Best regards,
Leszek
     

Re: GST's regex syntax..

Rick Flower-2
 

Doh! That would explain it.. Yet another RTFM! I didn't notice that
'minor' detail apparently!!

Thanks much!!