Understanding regular expression matching

Nick
Hi,

Is it possible to make a regular expression which matches only part of a string? For example:

testRegex
| toSearch matcher doesMatch match1 match2 numMatches |
toSearch := 'value: 6474 mm'.

matcher := RxMatcher forString: ' (\d+) '.
doesMatch := matcher matches: toSearch.
numMatches := matcher subexpressionCount.
match1 := (matcher subexpression: 1).
match2 := (matcher subexpression: 2).
self halt.

which results in:
   doesMatch = false
   numMatches = 2
   match1 = nil
   match2 = nil

However, if I replace the regular expression with '.* (\d+) .*' so that the whole string is matched, I see:

   doesMatch = true
   numMatches = 2
   match1 = 'value: 6474 mm'
   match2 = '6474'

I can always pad my sub-matches with '.*', but I wondered if I'm missing a trick to make the Regex work with substring matches.

Thanks

Nick 

Re: Understanding regular expression matching

Schwab,Wilhelm K
Have you looked at messages like #copyWithRegex:matchesTranslatedUsing:? It might help.
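
For anyone trying that suggestion, here is a minimal sketch of how the String convenience messages in the Regex package can be used on the example string. It assumes a standard Pharo image with the Regex package loaded; #allRegexMatches: is not mentioned in this thread and is included only as a related message worth checking in your image.

```smalltalk
"Rewrite every match of the pattern using a block; the rest of the string is copied unchanged."
'value: 6474 mm'
	copyWithRegex: '\d+'
	matchesTranslatedUsing: [ :digits | '<' , digits , '>' ].
"answers 'value: <6474> mm'"

"Collect the matching substrings instead of rewriting them."
'value: 6474 mm' allRegexMatches: '\d+'.
"answers a collection containing '6474'"
```

Note that #copyWithRegex:matchesTranslatedUsing: rewrites the string rather than extracting the number, so it fits best when the goal is substitution.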




________________________________________
From: [hidden email] [[hidden email]] On Behalf Of Nick Ager [[hidden email]]
Sent: Friday, February 11, 2011 1:52 PM
To: pharo-project
Subject: [Pharo-project] Understanding regular expression matching

Re: Understanding regular expression matching

laurent laffont
In reply to this post by Nick
'(\d+)' asRegex
	search: 'value: 6474 mm';
	subexpression: 1.
answers '6474'.
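
Applied to the pattern from the original post, the difference can be sketched as follows. This assumes, as the results reported in this thread suggest, that #matches: succeeds only when the pattern covers the entire string, while #search: looks for the pattern anywhere in it. Only messages already used in this thread appear here.

```smalltalk
| matcher |
matcher := ' (\d+) ' asRegex.

"#matches: requires the whole string to match, so this answers false."
matcher matches: 'value: 6474 mm'.

"#search: looks for the pattern anywhere in the string, so this answers true."
matcher search: 'value: 6474 mm'.

"After a successful search, subexpression 1 is the whole match (including the
surrounding spaces) and subexpression 2 is the first parenthesized group."
matcher subexpression: 1.	"' 6474 '"
matcher subexpression: 2.	"'6474'"
```

This also explains why numMatches was 2 in the original test: the whole match counts as subexpression 1, so the single group in the pattern is subexpression 2.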

See the Regex chapter.

Laurent Laffont - @lolgzs

Pharo Smalltalk Screencasts: http://www.pharocasts.com/
Blog: http://magaloma.blogspot.com/


Re: Understanding regular expression matching

Nick
Hi Laurent,

Perfect, somehow I'd missed #search:.
The link to the Regex chapter is great background.

Thanks again

Nick