VB-Regex issue...


VB-Regex issue...

LawsonEnglish
Is anyone familiar with the VB-Regex package?
This code should be returning an OrderedCollection of 25 hits (I
thought). Instead it returns one long string containing all 25 hits:
  http://pastebin.com/eGcX6vg2
Source string: http://pastebin.com/AkyQrXGD
I've been playing with this one for several hours. I can't tell if
the strings are too complicated or if I'm just using the wrong syntax,
though the simple example works just fine.

'\w+' asRegex matchesIn: 'Now is the Time' => an OrderedCollection('Now' 'is' 'the' 'Time')


Thanks.


Lawson
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

Re: VB-Regex issue...

Prashanth Hebbar


On Mon, Aug 1, 2011 at 1:32 PM, Lawson English <[hidden email]> wrote:
> anyone familiar with VB-Regex package?
> this code should be returning an ordered collection of 25 hits (I thought). Instead it returns one long string of all 25 hits: http://pastebin.com/eGcX6vg2
> source string: http://pastebin.com/AkyQrXGD
> I've been playing with this one for several hours. I can't tell if the strings are too complicated or if I'm just using the wrong syntax, though the simple example works just fine.
>
> '\w+' asRegex matchesIn: 'Now is the Time' => an OrderedCollection('Now' 'is' 'the' 'Time')


 
Perhaps the problem is the long source string, which contains many quotation marks that are not all escaped. Sean DeNigris describes an interesting trick for preserving the quotation marks inside long strings, especially HTML strings: http://seandenigris.com/blog/?p=647.
 

This code returns the OrderedCollection you expected.

source := htmltext678 contents.
aString := '<a href="billionaires08_(.*)html">(.*).</a></td>'.
matcher := RxMatcher forString: aString.
matcher matchesIn: source.
"Transcript show: (matcher matchesIn: source); cr."


htmltext678 is the TextMorph in which I stored your HTML page; taking its contents preserves all the inline quotes. I used a shorter match string (aString).

One thing I noticed in the code you linked: an OrderedCollection was being created but never used.
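
A note on the pattern itself: one possible reason for getting a single long string instead of separate hits is that (.*) is greedy, so it can run from the first link all the way to the last closing </a></td>. The sketch below is only an illustration under assumptions. The two-link sample string is made up (it is not the page from the pastebin), and the bounded pattern is one hypothetical way to tighten the match, not necessarily the fix that was needed here. On this sample the greedy pattern should yield one match and the bounded pattern two.

```smalltalk
| source greedy bounded |

"Hypothetical two-link sample, not the real page from the pastebin."
source := '<a href="billionaires08_A.html">Alpha</a></td><a href="billionaires08_B.html">Beta</a></td>'.

"Greedy pattern from this thread: (.*) can extend to the last closing tag,
 so both links may come back as a single string."
greedy := '<a href="billionaires08_(.*)html">(.*).</a></td>' asRegex matchesIn: source.

"Bounded variant (an assumption): negated character classes stop at the
 next quote or angle bracket, so each link is matched on its own."
bounded := '<a href="billionaires08_[^"]*html">[^<]*</a></td>' asRegex matchesIn: source.

"Compare how many matches each pattern produced."
Transcript show: greedy size printString; cr.
Transcript show: bounded size printString; cr.
```

Negated character classes are used here rather than a lazy quantifier because they only rely on basic regex syntax.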

Regards,
--
Prashanth Hebbar



Re: VB-Regex issue...

LawsonEnglish
On 8/1/11 3:14 AM, Prashanth Hebbar wrote:
> See this post from Sean for this trick
> http://seandenigris.com/blog/?p=647.
Thanks for the tip. I'll use it from now on, just in case. However, it
turns out that my ancient eyes were missing a few extra symbols in the
html, and once I more carefully set up my regex, I started getting hits.

Lawson