Patrick Rein uploaded a new version of Regex-Tests-Core to project The Inbox:
http://source.squeak.org/inbox/Regex-Tests-Core-pre.6.mcz

==================== Summary ====================

Name: Regex-Tests-Core-pre.6
Author: pre
Time: 19 May 2016, 8:50:59.559548 pm
UUID: e1cbdf20-9ce2-4e4a-b1d8-574186c2e746
Ancestors: Regex-Tests-Core-ul.5

Adjustments to the Regex tests for optional subexpressions in multiple quantifiers. These tests demonstrate the different behavior when nesting subexpressions in quantifiers and the other way round.

=============== Diff against Regex-Tests-Core-ul.5 ===============

Item was added:
+ ----- Method: RxMatcherTest>>testOptionalMultipleQuantifiedSubexpression (in category 'testing') -----
+ testOptionalMultipleQuantifiedSubexpression
+ 	<timeout: 0.1>
+ 
+ 	self runRegex: #('((aa?){2})'
+ 		'' false nil
+ 		'a' false nil
+ 		'aa' true (1 'aa')
+ 		'baaa' true (2 'aaa'))!

Item was changed:
  ----- Method: RxMatcherTest>>testOptionalNestedIntoMultipleQuantified (in category 'testing') -----
  testOptionalNestedIntoMultipleQuantified
  	<timeout: 0.1>
  
  	self runRegex: #('(aa?){2}'
  		'' false nil
  		'a' false nil
  		'aa' true (1 'aa')
+ 		'baaa' true (2 'a'))!
- 		'baaa' true (2 'aaa'))!
I would like to comment further on these changes, as they change the expected behavior of the Regex engine. The new expectation, that a matching group inside a multiple quantifier returns the last match of the group, is consistent with, for example, the Ruby regex engine:
http://rubular.com/r/kxoreyPolG

The issue is also described here: http://www.regular-expressions.info/captureall.html

However, I am not sure what the intended behavior for RxMatcher is. Has this ever worked, and what was the intended behavior?

Bests
Patrick
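For comparison, the "last match of the group" behavior can be reproduced outside of Squeak. The sketch below uses Python's `re` module as a stand-in engine; it is not RxMatcher, and the helper name `groups_for` is made up for this illustration. Note the index shift: Python numbers the whole match as group 0, whereas the test table above uses index 1 for the whole match.

```python
import re

# Python's re module is only a stand-in here; this is NOT RxMatcher.
# groups_for is a hypothetical helper written for this illustration.

def groups_for(pattern, text):
    """Return [whole match, group 1, group 2, ...], or None if nothing matches."""
    m = re.search(pattern, text)
    if m is None:
        return None
    return [m.group(0), *m.groups()]

# The quantified group keeps only its last iteration ('a'),
# while the whole match covers both iterations ('aaa').
print(groups_for(r'(aa?){2}', 'baaa'))    # ['aaa', 'a']

# Wrapping the quantified group in an outer group captures the full repetition.
print(groups_for(r'((aa?){2})', 'baaa'))  # ['aaa', 'aaa', 'a']

# A single 'a' cannot satisfy two iterations.
print(groups_for(r'(aa?){2}', 'a'))       # None
```

Under these semantics, the inner group of `(aa?){2}` reports 'a' for the input 'baaa', which agrees with the changed expectation in the diff.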
Hi Patrick,
This test has never passed before. I wrote it to document a bug I had found in the matcher creation. It used to crash the VM, but I fixed that part earlier. Now that you made the matcher work, it's clear that the test case has a bug as well. The last line should read

	'baaa' true (1 'aaa' 2 'a'))

because the test checks all listed subexpressions, and there's no reason for us not to list all of them. The first subexpression is by default the whole match, which is expected to be 'aaa'. The second subexpression is the last match of the part between the parentheses, which is just 'a', as you suggested.

Levente
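The corrected row can be checked against the same table layout in another engine. The following is a minimal sketch, again assuming Python's `re` module in place of RxMatcher; `check_row` is a hypothetical helper that only imitates the shape of the `runRegex:` table (input, expected match result, subexpression index and string pairs) and is not the actual test harness.

```python
import re

def check_row(pattern, text, should_match, expected=None):
    """Check one table row (hypothetical helper, not the Squeak harness).

    `expected` maps 1-based subexpression indexes to strings, where
    index 1 is the whole match, as in the test table in this thread.
    """
    m = re.search(pattern, text)
    if (m is not None) != should_match:
        return False
    if m is None or not expected:
        return True
    # Shift by one: table index 1 corresponds to Python's group(0).
    return all(m.group(i - 1) == s for i, s in expected.items())

rows = [
    ('', False, None),
    ('a', False, None),
    ('aa', True, {1: 'aa'}),
    ('baaa', True, {1: 'aaa', 2: 'a'}),  # the corrected last row
]

# All rows hold, including the corrected one.
print(all(check_row(r'(aa?){2}', *row) for row in rows))  # True

# The original expectation (2 'aaa') does not hold under last-match semantics.
print(check_row(r'(aa?){2}', 'baaa', True, {2: 'aaa'}))   # False
```

Listing both indexes in the last row checks the whole match and the last iteration of the group in one place, which is the point of the correction.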