The Inbox: Regex-Tests-Core-pre.6.mcz


The Inbox: Regex-Tests-Core-pre.6.mcz

commits-2
Patrick Rein uploaded a new version of Regex-Tests-Core to project The Inbox:
http://source.squeak.org/inbox/Regex-Tests-Core-pre.6.mcz

==================== Summary ====================

Name: Regex-Tests-Core-pre.6
Author: pre
Time: 19 May 2016, 8:50:59.559548 pm
UUID: e1cbdf20-9ce2-4e4a-b1d8-574186c2e746
Ancestors: Regex-Tests-Core-ul.5

Adjustments to the Regex tests for optional subexpressions in multiple quantifiers. These tests demonstrate the different behavior when nesting subexpressions in quantifiers and the other way round.

=============== Diff against Regex-Tests-Core-ul.5 ===============

Item was added:
+ ----- Method: RxMatcherTest>>testOptionalMultipleQuantifiedSubexpression (in category 'testing') -----
+ testOptionalMultipleQuantifiedSubexpression
+ <timeout: 0.1>
+
+ self runRegex: #('((aa?){2})'
+ '' false nil
+ 'a' false nil
+ 'aa' true (1 'aa')
+ 'baaa' true (2 'aaa'))!

Item was changed:
  ----- Method: RxMatcherTest>>testOptionalNestedIntoMultipleQuantified (in category 'testing') -----
  testOptionalNestedIntoMultipleQuantified
  <timeout: 0.1>
 
  self runRegex: #('(aa?){2}'
  '' false nil
  'a' false nil
  'aa' true (1 'aa')
+ 'baaa' true (2 'a'))!
- 'baaa' true (2 'aaa'))!


Re: The Inbox: Regex-Tests-Core-pre.6.mcz

Patrick R.
I would like to comment further on these changes, as they change the expected behavior of the Regex engine. The new expectation, that a capturing group under a multiple quantifier returns only the last match of that group, is consistent with, for example, the Ruby regex engine:

http://rubular.com/r/kxoreyPolG

The issue is also described here: http://www.regular-expressions.info/captureall.html
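
For illustration, a minimal sketch of that Ruby behavior, assuming Ruby's built-in Regexp engine. The helper name `captures_for` is hypothetical and exists only for this example; the sketch says nothing about what RxMatcher itself does.

```ruby
# Returns the whole match followed by each capture group,
# or nil when the pattern does not match at all.
# `captures_for` is a hypothetical helper used only for this illustration.
def captures_for(pattern, input)
  match = pattern.match(input)
  match && match.to_a
end

# Quantifier applied directly to the group:
# the group keeps only its last iteration.
p captures_for(/(aa?){2}/, 'baaa')    # whole match "aaa", group 1 "a"

# Quantified group wrapped in an outer group:
# the outer group spans both iterations.
p captures_for(/((aa?){2})/, 'baaa')  # whole match "aaa", group 1 "aaa", group 2 "a"
```

The first iteration greedily consumes 'aa', which leaves a single 'a' for the second iteration, and that second iteration is what the inner group reports.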

However, I am not sure what the intended behavior of RxMatcher is. Has this ever worked, and if so, how was it meant to behave?

Bests
Patrick

Re: The Inbox: Regex-Tests-Core-pre.6.mcz

Levente Uzonyi
Hi Patrick,

This test has never passed before. I wrote it to document a bug I had
found in the matcher creation. It used to crash the VM, but I fixed that
part earlier.
Now that you made the matcher work, it's clear that the test case has a
bug as well. The last line should read

  'baaa' true (1 'aaa' 2 'a'))

because the test checks every listed subexpression, and there's no reason
for us not to list them all.
The first subexpression is by default the whole match, which is expected
to be 'aaa'. The second subexpression is the last match of the part
between the parentheses, which is just 'a', as you suggested.
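
The indexing convention described above can be sketched in Ruby as follows. This is only an illustration under assumptions: `indexed_subexpressions` is a hypothetical helper, and it renumbers Ruby's 0-based match data to the 1-based listing used in the test (1 for the whole match, 2 for the first parenthesized group).

```ruby
# Hypothetical helper: lists every subexpression as [index, text] pairs,
# numbered from 1 (1 = whole match, 2 = first parenthesized group, ...).
# Returns nil when the pattern does not match.
def indexed_subexpressions(pattern, input)
  match = pattern.match(input)
  return nil unless match

  # MatchData#to_a yields the whole match first, then each group.
  match.to_a.each_with_index.map { |text, i| [i + 1, text] }
end

p indexed_subexpressions(/(aa?){2}/, 'baaa')  # → [[1, "aaa"], [2, "a"]]
```

Listing both pairs mirrors the corrected last line of the test, where index 1 carries 'aaa' and index 2 carries 'a'.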

Levente
