matchesRegex: multiline

matchesRegex: multiline

Nick
Hi,

Is there a way to make the regex '.' (dot) match line break characters? For example:

'hello regex' matchesRegex: '.*regex'    "true"

'hello
regex' matchesRegex: '.*regex'    "false"


Thanks

Nick


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Re: matchesRegex: multiline

Lukas Renggli
In my image the second expression returns true as well.

In fact the '.' (dot) matches anything but the null character (see
RxMatcher>>#syntaxAny).
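
To rule out differences in how the line break gets typed or pasted, the string can be built with an explicit line-break character. This is only a minimal workspace sketch: the variable name is made up for illustration, and the result noted in the comment is the one reported in this thread for Pharo, not something guaranteed for other dialects.

```smalltalk
| multiline |
"Build the two-line string with an explicit CR instead of a literal line break."
multiline := 'hello' , (String with: Character cr) , 'regex'.

"Reported in this thread to answer true in Pharo, where dot also accepts CR."
multiline matchesRegex: '.*regex'.
```

Swapping `Character cr` for `Character lf` gives the same check for the other line-break character.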

Lukas

--
Lukas Renggli
www.lukas-renggli.ch

Re: matchesRegex: multiline

Nick
>> Is there a way to allow the regex '.' (dot) to match line break characters:
>> 'hello regex' matchesRegex: '.*regex'    "true"
>> 'hello
>> regex' matchesRegex: '.*regex'    "false"
>
> In my image the second expression returns true as well.
>
> In fact the '.' (dot) matches anything but the null character (see
> RxMatcher>>#syntaxAny).

Thanks. I shouldn't have made assumptions: the expression matches in Pharo but not in GemStone, and I had assumed the two implementations would be identical.

As you say, the culprit is RxMatcher>>#syntaxAny.

In Pharo:

    RxMatcher>>#syntaxAny
        ^RxmPredicate new
            predicate: [:char | char asInteger ~= 0]

In GemStone:

    RxMatcher>>#syntaxAny
        ^RxmPredicate new
            predicate: [:char | (Cr = char or: [Lf = char]) not]
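
The difference is easy to see by evaluating the two predicate blocks on their own in a workspace. This is a rough sketch rather than code from either image: the variable names are made up, and `Character cr` / `Character lf` are used on the assumption that `Cr` and `Lf` in the GemStone method hold those two characters.

```smalltalk
| pharoAny gemstoneAny |
"Blocks copied from the two syntaxAny predicates quoted above.
 Assumption: Cr and Lf stand for Character cr and Character lf."
pharoAny := [:char | char asInteger ~= 0].
gemstoneAny := [:char | (Character cr = char or: [Character lf = char]) not].

pharoAny value: Character cr.           "true: CR is not NUL, so dot accepts it"
gemstoneAny value: Character cr.        "false: CR is explicitly excluded"
gemstoneAny value: Character lf.        "false: LF is explicitly excluded"
pharoAny value: (Character value: 0).   "false: NUL is the only character rejected"
gemstoneAny value: $a.                  "true: ordinary characters pass in both"
```

So with the older predicate, '.*' can never cross a line break, which is why the second expression fails there.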

Nick

Re: matchesRegex: multiline

Lukas Renggli
Looks like your version is very old, because the change-log in
RxParser class>>#b:whatsNewInThisRelease says:

  VERSION 1.1a (May 2001)

  1. Support for keeping track of multiple subexpressions.
  2. Dot (.) matches anything but NUL character, as it should per POSIX spec.
  3. Some bug fixes.

Pharo contains the latest version available:

  VERSION 1.3.1 (September 2008)

In addition, the version in Pharo includes countless bug fixes for
regular-expression problems that caused infinite recursion, repeated
matches, skipped matches, invalid empty matches, misaligned
replacements, and so on. Unfortunately, these changes were never
back-ported into the official VisualWorks version, as I never heard
back from Vassili.

Lukas

Re: matchesRegex: multiline

Nick


On 31 August 2010 10:19, Lukas Renggli <[hidden email]> wrote:
> Looks like your version is very old, because the change-log in
> RxParser class>>#b:whatsNewInThisRelease says:
>
>   VERSION 1.1a (May 2001)
>
>   1. Support for keeping track of multiple subexpressions.
>   2. Dot (.) matches anything but NUL character, as it should per POSIX spec.
>   3. Some bug fixes.


Thanks, Lukas. I've filed a GemStone bug: http://code.google.com/p/glassdb/issues/detail?id=166
