RBPattern matching dynamic array expressions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

RBPattern matching dynamic array expressions

Nicolai Hess-3-2
Background :

initial question:
http://forum.world.st/RewriteTool-matching-parts-of-an-array-tp4910482.html

my fix :
18910 MNU: OrderedCollection>>parent: when trying to rewrite a tree with RBParseTreeRewriter

the error my fix introduced :-) :
19445 Pattern code ArrayNode matching failure

Question : how should a pattern with three variables (first/second/third) look like to match the expression
{ (1@2) . Color red . 3} and assign the (sub)-expressions
1@2
Color red
3
to the three variables first/second/third ?

Looking at the initial question "RegisterRewriteTool matching parts of an array", I was suprised that
for the pattern and the dynamic array expression:
pattern := '{ `@first . `@second . `@third }'
expression := '{ (1@2) . Color red . 3}'
the third variable matches all three expressions and the first two are empty:

RBPatternVariableNode(`@first)->an OrderedCollection()
RBPatternVariableNode(`@second)->an OrderedCollection()
RBPatternVariableNode(`@third)->an OrderedCollection(RBMessageNode((1 @ 2)) RBMessageNode(Color red) RBLiteralValueNode(3))

I fixed this in a way, that all three variables gets assigned one (sub)expression of the array.
But the fix was wrong, now a statement-list-pattern-variable cannot be used anymore to match
the array statements:

pattern := '{ `@.stmts }'.
expression:= RBParser parseExpression:'{ (1@2) . Color red . 3}'.

results in an error during matching but I would expect the result:
RBPatternVariableNode(`@.stmts)->an OrderedCollection(RBMessageNode((1 @ 2)) RBMessageNode(Color red) RBLiteralValueNode(3))

Now I am not sure that the pattern { `@first . `@second . `@third }' is actually a valid
pattern for matching the array (sub)-expressions, the array contents
is a list of three single statemts and @first isn't a statement pattern, but a pattern for an
expression list.

On the other hand, if I try to match *three statements* without the being in a dynamic array:
(Note the missing curly braces)

pattern := '`@first . `@second . `@third '.
expression : parseExpression: ' (1@2) . Color red . 3'.

The result works as I wanted them for the array pattern:

RBPatternVariableNode(`@first)->RBMessageNode(1 @ 2) RBPatternVariableNode(`@second)->RBMessageNode(Color red) RBPatternVariableNode(`@third)->RBLiteralValueNode(3)


So, is the pattern actually right or is the matching wrong?
What should I expect by these patterns and expressions:

pattern := '`@first . `@second . `@third '.
expression : parseExpression: ' (1@2) . Color red . 3'.

pattern := '`.first. `.second. `.third. '.
expression := ' (1@2) . Color red . 3'.

pattern := '{`.first. `.second. `.third.}'.
expression := '{ (1@2) . Color red . 3}'.

pattern := '{`@.all}'.
expression := '{ (1@2) . Color red . 3}'.


pattern := '{`@.all}'.
expression := '{ (1@2) . Color red . 3}'.

pattern := '`.head. `.@tail'.
expression := '(1@2) . Color red . 3'.

pattern := '{`.head. `.@tail}'.
expression := '{(1@2) . Color red . 3}'.

I would expect all statement-pattern without the dynamic array, assign the variables to the
subexpressions, and the same way for the pattern with dynamic arrays.

Btw, I don't know why this does not match
pattern := '`.head `.@tail'.
expression := '(1@2) . Color red . 3'.

isn't the first variable a statement pattern ?
From the rewrite-tool documentation (http://www.refactory.com/tools/refactoring-browser/rewrite-tool):

`       recurse into the thing that is matched looking for more matches
@       treat the item as a list
;       used for messages where the message is treated as a list of messages in a cascade
.       treat the item as a statement in a sequence node
#       matches literals


thanks in advance

nicolai




Reply | Threaded
Open this post in threaded view
|

Re: RBPattern matching dynamic array expressions

stepharong
Hi nicolai

I always thought that I understand and I always find a situation that totally confused me. 
This is why I asked some guys expert in transformation to have a look and they got confused.
This is why camille teruel started to work with Mark on Phorms.
But the effort got stalled. 

Stef

Background :

initial question:
http://forum.world.st/RewriteTool-matching-parts-of-an-array-tp4910482.html

my fix :
18910 MNU: OrderedCollection>>parent: when trying to rewrite a tree with RBParseTreeRewriter

the error my fix introduced :-) :
19445 Pattern code ArrayNode matching failure

Question : how should a pattern with three variables (first/second/third) look like to match the expression
{ (1@2) . Color red . 3} and assign the (sub)-expressions
1@2
Color red
3
to the three variables first/second/third ?

Looking at the initial question "RegisterRewriteTool matching parts of an array", I was suprised that
for the pattern and the dynamic array expression:
pattern := '{ `@first . `@second . `@third }'
expression := '{ (1@2) . Color red . 3}'
the third variable matches all three expressions and the first two are empty:

RBPatternVariableNode(`@first)->an OrderedCollection()
RBPatternVariableNode(`@second)->an OrderedCollection()
RBPatternVariableNode(`@third)->an OrderedCollection(RBMessageNode((1 @ 2)) RBMessageNode(Color red) RBLiteralValueNode(3))

I fixed this in a way, that all three variables gets assigned one (sub)expression of the array.
But the fix was wrong, now a statement-list-pattern-variable cannot be used anymore to match
the array statements:

pattern := '{ `@.stmts }'.
expression:= RBParser parseExpression:'{ (1@2) . Color red . 3}'.

results in an error during matching but I would expect the result:
RBPatternVariableNode(`@.stmts)->an OrderedCollection(RBMessageNode((1 @ 2)) RBMessageNode(Color red) RBLiteralValueNode(3))

Now I am not sure that the pattern { `@first . `@second . `@third }' is actually a valid
pattern for matching the array (sub)-expressions, the array contents
is a list of three single statemts and @first isn't a statement pattern, but a pattern for an
expression list.

On the other hand, if I try to match *three statements* without the being in a dynamic array:
(Note the missing curly braces)

pattern := '`@first . `@second . `@third '.
expression : parseExpression: ' (1@2) . Color red . 3'.

The result works as I wanted them for the array pattern:

RBPatternVariableNode(`@first)->RBMessageNode(1 @ 2) RBPatternVariableNode(`@second)->RBMessageNode(Color red) RBPatternVariableNode(`@third)->RBLiteralValueNode(3)


So, is the pattern actually right or is the matching wrong?
What should I expect by these patterns and expressions:

pattern := '`@first . `@second . `@third '.
expression : parseExpression: ' (1@2) . Color red . 3'.

pattern := '`.first. `.second. `.third. '.
expression := ' (1@2) . Color red . 3'.

pattern := '{`.first. `.second. `.third.}'.
expression := '{ (1@2) . Color red . 3}'.

pattern := '{`@.all}'.
expression := '{ (1@2) . Color red . 3}'.


pattern := '{`@.all}'.
expression := '{ (1@2) . Color red . 3}'.

pattern := '`.head. `.@tail'.
expression := '(1@2) . Color red . 3'.

pattern := '{`.head. `.@tail}'.
expression := '{(1@2) . Color red . 3}'.

I would expect all statement-pattern without the dynamic array, assign the variables to the
subexpressions, and the same way for the pattern with dynamic arrays.

Btw, I don't know why this does not match
pattern := '`.head `.@tail'.
expression := '(1@2) . Color red . 3'.

isn't the first variable a statement pattern ?
From the rewrite-tool documentation (http://www.refactory.com/tools/refactoring-browser/rewrite-tool):

`       recurse into the thing that is matched looking for more matches
@       treat the item as a list
;       used for messages where the message is treated as a list of messages in a cascade
.       treat the item as a statement in a sequence node
#       matches literals


thanks in advance

nicolai







--
Using Opera's mail client: http://www.opera.com/mail/
Reply | Threaded
Open this post in threaded view
|

Re: RBPattern matching dynamic array expressions

Uko2
Hi,

this is good that we are discussing these problems as the pattern syntax is very powerful, but it has some inconsistencies.

Now I’d say that dynamic arrays should work similarly to statements and in '{ `@first . `@second . `@third }’ `@third should not match multiple “statements” as we have a dot syntax for that.

Regarding your last example '`.head `.@tail' should not parse at all (at leas according to what I have seen) but it does. It is parsed into a message node with selector `.@tail and a receiver variable node `.head. It will match well if you put a dot between statements: '`.head . `.@tail’. But yes, I’d like to have a parser failure as dot is used for defining statement nodes…

By the way for testing purposes you can inspect something like:
RBPatternParser parseExpression: '`.head `.@tail’
and see that it is parsed into a message node.

Cheers.
Uko

On 10 Dec 2016, at 10:05, stepharong <[hidden email]> wrote:

Hi nicolai

I always thought that I understand and I always find a situation that totally confused me. 
This is why I asked some guys expert in transformation to have a look and they got confused.
This is why camille teruel started to work with Mark on Phorms.
But the effort got stalled. 

Stef

Background :

initial question:
http://forum.world.st/RewriteTool-matching-parts-of-an-array-tp4910482.html 

my fix :
18910 MNU: OrderedCollection>>parent: when trying to rewrite a tree with RBParseTreeRewriter

the error my fix introduced :-) :
19445 Pattern code ArrayNode matching failure

Question : how should a pattern with three variables (first/second/third) look like to match the expression
{ (1@2) . Color red . 3} and assign the (sub)-expressions 
1@2
Color red
3
to the three variables first/second/third ?

Looking at the initial question "RegisterRewriteTool matching parts of an array", I was suprised that
for the pattern and the dynamic array expression:
pattern := '{ `@first . `@second . `@third }'
expression := '{ (1@2) . Color red . 3}'
the third variable matches all three expressions and the first two are empty:

RBPatternVariableNode(`@first)->an OrderedCollection() 
RBPatternVariableNode(`@second)->an OrderedCollection() 
RBPatternVariableNode(`@third)->an OrderedCollection(RBMessageNode((1 @ 2)) RBMessageNode(Color red) RBLiteralValueNode(3))

I fixed this in a way, that all three variables gets assigned one (sub)expression of the array.
But the fix was wrong, now a statement-list-pattern-variable cannot be used anymore to match
the array statements:

pattern := '{ `@.stmts }'.
expression:= RBParser parseExpression:'{ (1@2) . Color red . 3}'.

results in an error during matching but I would expect the result:
RBPatternVariableNode(`@.stmts)->an OrderedCollection(RBMessageNode((1 @ 2)) RBMessageNode(Color red) RBLiteralValueNode(3)) 

Now I am not sure that the pattern { `@first . `@second . `@third }' is actually a valid
pattern for matching the array (sub)-expressions, the array contents
is a list of three single statemts and @first isn't a statement pattern, but a pattern for an
expression list.

On the other hand, if I try to match *three statements* without the being in a dynamic array:
(Note the missing curly braces)

pattern := '`@first . `@second . `@third '.
expression : parseExpression: ' (1@2) . Color red . 3'.

The result works as I wanted them for the array pattern:

RBPatternVariableNode(`@first)->RBMessageNode(1 @ 2) RBPatternVariableNode(`@second)->RBMessageNode(Color red) RBPatternVariableNode(`@third)->RBLiteralValueNode(3)


So, is the pattern actually right or is the matching wrong?
What should I expect by these patterns and expressions:

pattern := '`@first . `@second . `@third '.
expression : parseExpression: ' (1@2) . Color red . 3'.

pattern := '`.first. `.second. `.third. '.
expression := ' (1@2) . Color red . 3'.

pattern := '{`.first. `.second. `.third.}'.
expression := '{ (1@2) . Color red . 3}'.

pattern := '{`@.all}'.
expression := '{ (1@2) . Color red . 3}'.


pattern := '{`@.all}'.
expression := '{ (1@2) . Color red . 3}'.

pattern := '`.head. `.@tail'.
expression := '(1@2) . Color red . 3'.

pattern := '{`.head. `.@tail}'.
expression := '{(1@2) . Color red . 3}'.

I would expect all statement-pattern without the dynamic array, assign the variables to the
subexpressions, and the same way for the pattern with dynamic arrays.

Btw, I don't know why this does not match
pattern := '`.head `.@tail'.
expression := '(1@2) . Color red . 3'.

isn't the first variable a statement pattern ?
From the rewrite-tool documentation (http://www.refactory.com/tools/refactoring-browser/rewrite-tool):

`       recurse into the thing that is matched looking for more matches
@       treat the item as a list
;       used for messages where the message is treated as a list of messages in a cascade
.       treat the item as a statement in a sequence node
#       matches literals


thanks in advance

nicolai







--
Using Opera's mail client: http://www.opera.com/mail/

Reply | Threaded
Open this post in threaded view
|

Re: RBPattern matching dynamic array expressions

John Brant-2
In reply to this post by Nicolai Hess-3-2
On 12/10/2016 02:37 AM, Nicolai Hess wrote:

> Question : how should a pattern with three variables
> (first/second/third) look like to match the expression
> { (1@2) . Color red . 3} and assign the (sub)-expressions
> 1@2
> Color red
> 3
> to the three variables first/second/third ?

The RB was written before the {} array syntax. It appears that are a few
bugs in the pattern matching for the arrays. The first is that it needs
to reset the isList variable of RBPatternVariableNode when it is part of
a dynamic array. You can fix that by adding this to the end of
RBPatternVariableNode>>parent: method:

        parent isDynamicArray
                ifTrue: [ self isStatement
                                ifFalse: [ isList := false ] ]

I think this will make the matching work like what you are expecting. If
you want to match multiple statements, you need to add a ".", but if you
just have "`@first", it will only match any expression node.

While testing I noticed an error in the RBArrayNode>>match:inContext:.
The method doesn't test the lengths of the nodes before sending the
#with:do: message. If you try to match two arrays of different sizes,
you get an error. This should be changed back to be:

match: aNode inContext: aDictionary
        aNode class = self class
                ifFalse: [ ^ false ].
        ^ self matchList: statements against: aNode statements inContext:
aDictionary


John Brant

Reply | Threaded
Open this post in threaded view
|

Re: RBPattern matching dynamic array expressions

Uko2
Hi John,

thank you for our reply it is really helpful.

Can you please also spend a bit of your time to talk about the “permissiveness" of the pattern syntax. For example the string '`.head `.@tail’ is parsed into a message node with selector `.@tail and a receiver variable node `.head. To my knowledge this is not a valid syntax as the names that have a dot in the beginning should become matchers for statements. Also for example if you write `#@literal the list marker is not doing anything when combines with the literal pound as far as I know, and some people are confused about this, were the any reason why this is not simply rising a syntax error?

Cheers.
Uko

> On 11 Dec 2016, at 00:46, John Brant <[hidden email]> wrote:
>
> On 12/10/2016 02:37 AM, Nicolai Hess wrote:
>
>> Question : how should a pattern with three variables
>> (first/second/third) look like to match the expression
>> { (1@2) . Color red . 3} and assign the (sub)-expressions
>> 1@2
>> Color red
>> 3
>> to the three variables first/second/third ?
>
> The RB was written before the {} array syntax. It appears that are a few bugs in the pattern matching for the arrays. The first is that it needs to reset the isList variable of RBPatternVariableNode when it is part of a dynamic array. You can fix that by adding this to the end of RBPatternVariableNode>>parent: method:
>
> parent isDynamicArray
> ifTrue: [ self isStatement
> ifFalse: [ isList := false ] ]
>
> I think this will make the matching work like what you are expecting. If you want to match multiple statements, you need to add a ".", but if you just have "`@first", it will only match any expression node.
>
> While testing I noticed an error in the RBArrayNode>>match:inContext:. The method doesn't test the lengths of the nodes before sending the #with:do: message. If you try to match two arrays of different sizes, you get an error. This should be changed back to be:
>
> match: aNode inContext: aDictionary
> aNode class = self class
> ifFalse: [ ^ false ].
> ^ self matchList: statements against: aNode statements inContext: aDictionary
>
>
> John Brant
>


Reply | Threaded
Open this post in threaded view
|

Re: RBPattern matching dynamic array expressions

Nicolai Hess-3-2
In reply to this post by John Brant-2


2016-12-11 0:46 GMT+01:00 John Brant <[hidden email]>:
On 12/10/2016 02:37 AM, Nicolai Hess wrote:

Question : how should a pattern with three variables
(first/second/third) look like to match the expression
{ (1@2) . Color red . 3} and assign the (sub)-expressions
1@2
Color red
3
to the three variables first/second/third ?

The RB was written before the {} array syntax. It appears that are a few bugs in the pattern matching for the arrays. The first is that it needs to reset the isList variable of RBPatternVariableNode when it is part of a dynamic array. You can fix that by adding this to the end of RBPatternVariableNode>>parent: method:

        parent isDynamicArray
                ifTrue: [ self isStatement
                                ifFalse: [ isList := false ] ]

Yes, that works
 

I think this will make the matching work like what you are expecting. If you want to match multiple statements, you need to add a ".", but if you just have "`@first", it will only match any expression node.

While testing I noticed an error in the RBArrayNode>>match:inContext:. The method doesn't test the lengths of the nodes before sending the #with:do: message. If you try to match two arrays of different sizes, you get an error. This should be changed back to be:

match: aNode inContext: aDictionary
        aNode class = self class
                ifFalse: [ ^ false ].
        ^ self matchList: statements against: aNode statements inContext: aDictionary

Yes, I used the call that is used in RBMessageNodes #match:in:Context:, but RBMessageNode indirectly checks for the different size of arguments by comparing the selector.
For the RBArrayNode this size comparsion isn't done.
But with your fix, the change in this method isn't needed.

Thank you John.

nicolai
 


John Brant


Reply | Threaded
Open this post in threaded view
|

Re: RBPattern matching dynamic array expressions

John Brant-2
In reply to this post by Uko2

> On Dec 11, 2016, at 1:06 PM, Yuriy Tymchuk <[hidden email]> wrote:
>
> Hi John,
>
> thank you for our reply it is really helpful.
>
> Can you please also spend a bit of your time to talk about the “permissiveness" of the pattern syntax. For example the string '`.head `.@tail’ is parsed into a message node with selector `.@tail and a receiver variable node `.head. To my knowledge this is not a valid syntax as the names that have a dot in the beginning should become matchers for statements. Also for example if you write `#@literal the list marker is not doing anything when combines with the literal pound as far as I know, and some people are confused about this, were the any reason why this is not simply rising a syntax error?
>

The @ in a literal can make sense when you are matching inside of a literal array:
        (RBParser parseRewriteExpression: '#(`#@a 1 `#@b)')
                match: (RBParser parseExpression: '#(2 1 3 4)')
                inContext: Dictionary new

As for the permissiveness, I built the rewriter to be used internally for the refactorings so I wasn’t too concerned about that when it was built. Only after using the rewrites internally and seeing how useful they were, did we decide to surface them to the user. The rewriter was easier to build without all of the validation checks since we could just extend the existing parser to check for pattern tokens in a few places. If you add the validation to the parser, then you’ll need to override several methods to check for valid input.

Another validation approach is to validate the tree after it has been created. I think Niall Ross did this for the VA Smalltalk rewriter about a decade ago.  Using this approach one can also validate the replacement expression only references patterns defined in the search expression. I think this is probably the best way to add validation.

BTW, the scanner actually allows any characters between the initial backquote and the first letter character, so you can have `!@#$%^&*()0987654321~`<>?,./{}|[]\a for a pattern variable name. The code only processes the first modifier characters that it knows about. For the example above, since it doesn’t know how to process the ! modifier character, it stops looking so the pattern above will only match a variable node.

If I was defining the pattern modifiers again using the prefix backquote like is done in the existing RB, I think I would make `name match anything (so you didn’t need to add an @ so often since that is the most common case), and add a | to signify a variable  match (e.g., `|var) — although the bar character may be too easily confused with a lower case L. Lists, @, could become * or + or {2,3} to look like regexs. For example, you could have a pattern like this:
        | `|{3,}temps |
        `.*Statement
This would match sequence nodes with 3 or more temps and 0 or more statements.

In SmaCC, I took the approach to spell out what I wanted. The patterns begin and end with a backquote. For example, if I wanted a list, instead of using a special @ character, I would send the #beList message (e.g., `name{beList}`). These can be combined/cascaded: `name{beList; testBlock: [:nodes | … do something to test the nodes …]}`.

While some can understand and use these pattern rewrites, I don’t know if they will ever be easy enough for most people to use. Instead I think some other interface should be built for these people. For example, I can envision a system where a user provides an sample of something to find. The system would build some potential patterns and find potential matches, and then the user would look at the matches and tell the tool which were valid or not. After a few iterations, hopefully the tool has figured out the pattern the user wanted.


John Brant