Fwd: nice bug with regex


Stéphane Ducasse
> l := '1ère partie\nmilieu\nfin'.
> '.*' asRegex matchesIn: l.

-> infinite loop; the image freezes.

Stef
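
A note for anyone reproducing this: in a Smalltalk string literal a backslash is an ordinary character, so the '\n' in the snippet above is two characters rather than a line ending. A minimal sketch of the report with real line endings follows. It assumes the separator was a carriage return (String cr); the original mail does not say which line ending was in the string.

```smalltalk
| l |
"Build the three lines with real carriage returns. String cr is an assumption;
 the report only confirms that the string contained line endings."
l := '1ère partie', String cr, 'milieu', String cr, 'fin'.

"On the old VB-Regex 1.1 this is the call reported to loop forever,
 so save the image before evaluating it."
'.*' asRegex matchesIn: l.
```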




_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Re: Fwd: nice bug with regex

Igor Stasenko
2009/11/4 Stéphane Ducasse <[hidden email]>:
>> l := '1ère partie\nmilieu\nfin'.
>> '.*' asRegex matchesIn: l.
>
> -> infinite loop; the image freezes.
>

In my 10491 image:

'.*' asRegex matches: 'kkkwefrwefw'
-> true

'.*' asRegex matches: 'БУГАГА'
-> true

Does your string above contain line-ending characters (\n)?


--
Best regards,
Igor Stasenko AKA sig.


Re: Fwd: nice bug with regex

Stéphane Ducasse

On Nov 4, 2009, at 2:35 PM, Igor Stasenko wrote:

> Does your string above contain line-ending characters (\n)?

yes


Re: Fwd: nice bug with regex

Lukas Renggli
In reply to this post by Stéphane Ducasse
I committed VB-Regex-lr.34 to the PharoInbox that fixes this issue.

In fact, this is a completely new port of VB-Regex with numerous
bug-fixes and enhancements. The bug you found was fixed in version
1.1c in December 2004. The version 1.1 we used up to now was from
October 1999.

Vassily's detailed change log is below.

Cheers,
Lukas

Name: VB-Regex-lr.34
Author: lr
Time: 4 November 2009, 10:40:27 pm
UUID: 9254d5aa-3e22-4e3b-83aa-2d633ff05570
Ancestors: VB-Regex-StephaneDucasse.33

VERSION 1.2.3 (November 2007)

1. Regexs with ^ or $ applied to copy empty strings caused infinite
loops, e.g. ('' copyWithRegex: '^.*$' matchesReplacedWith: 'foo').
Applied a similar correction to that from version 1.1c, to
#copyStream:to:(replacingMatchesWith:|translatingMatchesUsing:).
2. Extended RxParser testing to run each test for
#copy:translatingMatchesUsing: as well as #search:.
3. Corrected #testSuite test that a dot does not match a null, which
was passing by luck with Smalltalk code in a literal array.
4. Added test to end of test suite for fix 1 above.

VERSION 1.2.2 (November 2006)

There was no way to specify a backslash in a character set. Now [\\]
is accepted.

VERSION 1.2.1 (August 2006)

1. Support for returning all ranges (startIndex to: stopIndex)
matching a regex - #allRangesOfRegexMatches:, #matchingRangesIn:
2. Added hint to usage documentation on how to get more information
about matches when enumerating
3. Syntax description of dot corrected: matches anything but NUL since 1.1a

VERSION 1.2 (May 2006)

Fixed case-insensitive search for character sets.

VERSION 1.1c (December 2004)

Fixed the issue with #matchesOnStream:do: which caused infinite loops
for matches that matched empty strings.

VERSION 1.1b (November 2001)

Changes valueNowOrOnUnwindDo: to ensure:, plus incorporates some earlier fixes.

VERSION 1.1a (May 2001)

1. Support for keeping track of multiple subexpressions.
2. Dot (.) matches anything but NUL character, as it should per POSIX spec.
3. Some bug fixes.
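
The 1.1c entry is the one that concerns this thread. A pattern such as '.*' can succeed without consuming any input, and an enumeration loop that simply searches again from the same stream position then finds the same empty match forever. Below is a minimal workspace sketch of the kind of guard that avoids this. It assumes the matcher answers to searchStream: and subexpression:, and it is not claimed to be the exact code in VB-Regex-lr.34.

```smalltalk
| matcher stream position matches continue |
matcher := '.*' asRegex.
stream := ReadStream on: 'ab', String cr, 'cd'.
matches := OrderedCollection new.
continue := true.

[continue and: [
	"Remember where the search starts so an empty match can be detected."
	position := stream position.
	matcher searchStream: stream]] whileTrue: [
		matches add: (matcher subexpression: 1).
		"An empty match leaves the stream where it was. Step over one
		 character, or stop at the end, so the loop always makes progress."
		position = stream position ifTrue: [
			stream atEnd
				ifTrue: [continue := false]
				ifFalse: [stream next]]].
matches
```

Without the position check, the loop body would run again at the same position each time the match is empty, which is consistent with the freeze reported at the top of the thread.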





--
Lukas Renggli
http://www.lukas-renggli.ch


Re: Fwd: nice bug with regex

Nicolas Cellier
This version does not seem to solve http://bugs.squeak.org/view.php?id=5391

{
'15' matchesRegex: '[1-9]|1[0-9]'.
'15' matchesRegex: '1[0-9]|[1-9]'.
} -> #(false true)
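
For reference, the order of the alternatives should not change which strings are accepted: '15' belongs to the language of both patterns, so a whole-string match should succeed either way. A regression test for the report could be sketched as follows. The method name is hypothetical, and the method is assumed to live in a TestCase subclass.

```smalltalk
testAlternationOrderIsIrrelevant
	"Hypothetical test method for http://bugs.squeak.org/view.php?id=5391.
	 The second assertion is the one that fails according to the report above."
	self assert: ('15' matchesRegex: '1[0-9]|[1-9]').
	self assert: ('15' matchesRegex: '[1-9]|1[0-9]')
```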



Re: Fwd: nice bug with regex

Lukas Renggli
I don't know, I did not write the code. I trust it, because it passes
the test suite of Henry Spencer's regexp.c package.

Lukas



--
Lukas Renggli
http://www.lukas-renggli.ch


Re: Fwd: nice bug with regex

Lukas Renggli

I am not saying that this bug is not worth looking at and fixing. I
am just no expert in regular expressions, and I don't know what the
desired behavior is, or why. I am still looking for a standard that
describes the semantics of regular expressions.

I think everybody agrees that the updated code is an improvement, even
if it does not fix all problems. No?

Lukas


--
Lukas Renggli
http://www.lukas-renggli.ch


Re: Fwd: nice bug with regex

Stéphane Ducasse
In reply to this post by Lukas Renggli
Thanks, I will integrate it.


Re: Fwd: nice bug with regex

Nicolas Cellier
In reply to this post by Lukas Renggli
2009/11/5 Lukas Renggli <[hidden email]>:

> I am not saying that this bug is not worth looking at and fixing. I
> am just no expert in regular expressions, and I don't know what the
> desired behavior is, or why. I am still looking for a standard that
> describes the semantics of regular expressions.
>

man 7 regex
used to be a good start, but it seems quite short on my current
Linux distribution...
The documentation says it is POSIX, so googling shouldn't take long...

> I think everybody agrees that the updated code is an improvement, even
> if it does not fix all problems. No?
>
> Lukas
>

Oh yes, of course, thank you!
It was just an occasion to remind any potential fixer of the bug.


Re: Fwd: nice bug with regex

csrabak
In reply to this post by Lukas Renggli
On 05/11/2009 05:19, Lukas Renggli < [hidden email] > wrote:

> 2009/11/5 Lukas Renggli:
> > I don't know, I did not write the code. I trust it, because it
> > passes the test suite of Henry Spencer's regexp.c package.
> > Lukas

I'm not sure whether this is too trivial and you're looking for something else,
but for standards this link is a start:
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html
If you want a historical perspective, and a European standard in the melting pot:
http://www.opengroup.org/onlinepubs/007908799/xbd/re.html

HTH

--
Cesar Rabak


Re: Fwd: nice bug with regex

Adrian Lienhard
In reply to this post by Stéphane Ducasse
Do we also need to update the package in 1.0? It seems there are quite
a few important fixes.

Stef, did you already create an issue?

Adrian


Re: Fwd: nice bug with regex

Stéphane Ducasse
Yes, I did:
        - Issue 1415: New version VBRegex

