Regex bug?

Regex bug?

Boris Popov, DeepCove Labs (SNN)
[originally sent to vw-dev, my bad]

Shouldn't both of these answer true?

'^ab' asRegex matches: '   ab'
'\<ab' asRegex matches: '   ab'

From the #c:syntax:

\<   an empty string at the beginning of a word
^    matching an empty string at the beginning of a line

Thanks!

-Boris

--
+1.604.689.0322
DeepCove Labs Ltd.
4th floor 595 Howe Street
Vancouver, Canada V6C 2T5

[hidden email]

CONFIDENTIALITY NOTICE

This email is intended only for the persons named in the message
header. Unless otherwise indicated, it contains information that is
private and confidential. If you have received it in error, please
notify the sender and delete the entire message including any
attachments.

Thank you.


Re: Regex bug?

Marc-Philippe Huget
Hello Boris,

Unless Regex for VW has some special behavior, the first expression
cannot be true: it requires the first character of the line to be an
'a' followed by a 'b', and in your example there is white space before
the 'a' and 'b'.

As for the second one, I have no idea.

Regards,
Marc-Philippe

--
 %%%%%%%%%%%%%%%%%%%%%%
 Dr Marc-Philippe Huget, Lecturer
 ESIA-LISTIC
 University of Savoie
 B.P. 806
 74016 Annecy cedex
 
 http://marcphilippe.huget.free.fr
 %%%%%%%%%%%%%%%%%%%%%%



Re: Regex bug?

Ladislav Lenart
Boris Popov wrote:

> [originally sent to vw-dev, my bad]
>
> Shouldn't both of these answer true?
>
> '^ab' asRegex matches: '   ab'
> '\<ab' asRegex matches: '   ab'
>
> From the #c:syntax:
>
> \<   an empty string at the beginning of a word
> ^    matching an empty string at the beginning of a line
>
> Thanks!
>
> -Boris

Both expressions above properly return false, because:
        1) Sending #matches: to the regex means matching the whole input, not just
           a portion of it.
        2) '^' stands for the beginning of a line. It matches a specific position (an
           *empty* string), so the leading white space is left unmatched and
           violates the pattern.
        3) '\<' (resp. '\>') matches an *empty* string at the beginning (resp. end) of
           a word, so the leading white space violates the pattern again.

Note also that the *empty* string stands for ''.

The intended usage of '\<' and '\>' is like this:
        '.*\<ab\>.*' asRegex matches: 'ab' "true"
        '.*\<ab\>.*' asRegex matches: ' ab' "true"
        '.*\<ab\>.*' asRegex matches: ' ab ' "true"
        '.*\<ab\>.*' asRegex matches: 'ab ' "true"
BUT
        '.*\<ab\>.*' asRegex matches: 'aab' "false"
        '.*\<ab\>.*' asRegex matches: 'aabb' "false"
        '.*\<ab\>.*' asRegex matches: 'abb' "false"
So the pattern above matches all strings that contain the *word* 'ab'.
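
If the goal is to find 'ab' somewhere in the input rather than to match the whole input, searching may be a better fit than wrapping the pattern in '.*'. The following is a minimal sketch, assuming the VisualWorks Regex package provides #search: on the matcher answered by #asRegex, as its protocol documentation describes. The result comments for #search: are what the documented semantics imply, not output taken from this thread.

```smalltalk
"#matches: must consume the whole string, so the leading spaces make both fail."
'^ab' asRegex matches: '   ab'.     "false"
'\<ab' asRegex matches: '   ab'.    "false"

"#search: (assumed protocol) looks for a matching substring anywhere in the string."
'\<ab' asRegex search: '   ab'.     "true: 'ab' begins a word after the spaces"
'^ab' asRegex search: '   ab'.      "false: the line begins with a space, not 'a'"
```

Under that assumption, '\<ab' with #search: answers what the original question expected, while '^ab' still fails because the line itself starts with white space.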

Hope this helps,

Ladislav Lenart


RE: Regex bug?

Boris Popov, DeepCove Labs (SNN)
Thanks, this makes a lot more sense now, and the examples really helped.

Cheers!

-Boris


-----Original Message-----
From: Ladislav Lenart [mailto:[hidden email]]
Sent: Tuesday, May 16, 2006 12:45 AM
To: [hidden email]
Subject: Re: Regex bug?
