Smalltalk › Squeak › Squeak - Dev

Failing Decompiler tests - how to fix?

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

7 messages Options

Frank Shearar

Failing Decompiler tests - how to fix?

One of the failing Decompiler tests is when we check
MethodPragmaTest>>testCompileCharacter:

testCompileCharacter
self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ).
self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ).

As bytecodes, this method looks like this:

37 <70> self
38 <21> pushConstant: 'foo: $a'
39 <22> pushConstant: #foo:
40 <23> pushConstant: #($a)
41 <83 60> send: assertPragma:givesKeyword:arguments:
43 <87> pop
44 <70> self
45 <24> pushConstant: 'foo: $ '
46 <22> pushConstant: #foo:
47 <25> pushConstant: {Character space}
48 <83 60> send: assertPragma:givesKeyword:arguments:
50 <87> pop
51 <78> returnSelf

And when decompiled, we have:

testCompileCharacter
self
assertPragma: 'foo: $a'
givesKeyword: #foo:
arguments: #($a ).
self
assertPragma: 'foo: $ '
givesKeyword: #foo:
arguments: ((Array new: 1) at: 1 put: Character space; yourself)

Notice that the literal #( $ ) has expanded into an Array declaration etc.

When we compile this decompiled string, we end up with the rather different

self
pushConstant: ''foo: $a''
pushConstant: #foo:
pushConstant: #($a)
send: #assertPragma:givesKeyword:arguments: (3 args)
pop
self
pushConstant: ''foo: $ ''
pushConstant: #foo:
pushLit: Array
pushConstant: 1
send: #new: (1 arg)
dup
pushConstant: 1
pushLit: Character
send: #space (0 args)
send: #at:put: (2 args)
pop
send: #yourself (0 args)
send: #assertPragma:givesKeyword:arguments: (3 args)
pop
returnSelf

The decompiler turns that pushConstant: into a whole bunch of stuff
because it interprets #( $ ) as (Array new: 1) at: 1 put: Character
space; yourself.

(The same thing happens for SyntaxMorph>>replaceSel:menuItem: and
SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being
interpreted as Character space, and then decompiled to pushLit:
Character send: #space.)

That's all fine, but what happens is that when you compile the
decompiled method (and then decompile _that_ method), you can't have the
newly-decompiled method match the original decompiled method. The parse
tree are different: where the original has a LiteralNode, the new
version has a CascadeNode.

frank

Levente Uzonyi-2

Re: Failing Decompiler tests - how to fix?

On Tue, 21 Dec 2010, Frank Shearar wrote:

> One of the failing Decompiler tests is when we check
> MethodPragmaTest>>testCompileCharacter:
>
> testCompileCharacter
> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ).
> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ).
>
> As bytecodes, this method looks like this:
>
> 37 <70> self
> 38 <21> pushConstant: 'foo: $a'
> 39 <22> pushConstant: #foo:
> 40 <23> pushConstant: #($a)
> 41 <83 60> send: assertPragma:givesKeyword:arguments:
> 43 <87> pop
> 44 <70> self
> 45 <24> pushConstant: 'foo: $ '
> 46 <22> pushConstant: #foo:
> 47 <25> pushConstant: {Character space}
> 48 <83 60> send: assertPragma:givesKeyword:arguments:
> 50 <87> pop
> 51 <78> returnSelf
>
> And when decompiled, we have:
>
> testCompileCharacter
> self
> assertPragma: 'foo: $a'
> givesKeyword: #foo:
> arguments: #($a ).
> self
> assertPragma: 'foo: $ '
> givesKeyword: #foo:
> arguments: ((Array new: 1) at: 1 put: Character space;
> yourself)
>
> Notice that the literal #( $ ) has expanded into an Array declaration etc.
>
> When we compile this decompiled string, we end up with the rather different
>
> self
> pushConstant: ''foo: $a''
> pushConstant: #foo:
> pushConstant: #($a)
> send: #assertPragma:givesKeyword:arguments: (3 args)
> pop
> self
> pushConstant: ''foo: $ ''
> pushConstant: #foo:
> pushLit: Array
> pushConstant: 1
> send: #new: (1 arg)
> dup
> pushConstant: 1
> pushLit: Character
> send: #space (0 args)
> send: #at:put: (2 args)
> pop
> send: #yourself (0 args)
> send: #assertPragma:givesKeyword:arguments: (3 args)
> pop
> returnSelf
>
> The decompiler turns that pushConstant: into a whole bunch of stuff because
> it interprets #( $ ) as (Array new: 1) at: 1 put: Character space; yourself.
>
> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and
> SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being interpreted
> as Character space, and then decompiled to pushLit: Character send: #space.)
>
> That's all fine, but what happens is that when you compile the decompiled
> method (and then decompile _that_ method), you can't have the
> newly-decompiled method match the original decompiled method. The parse tree
> are different: where the original has a LiteralNode, the new version has a
> CascadeNode.

It's easy to fix it, just change Character >> shouldBePrintedAsLiteral to
^value between: 32 and: 255. Though it may break other stuff. See here:
http://bugs.squeak.org/view.php?id=2254
http://bugs.squeak.org/view.php?id=4322
Btw the 33-255 range is funny, because 127 is a control character.

Levente

>
> frank
>
>

Casey Ransberger-2

Re: Failing Decompiler tests - how to fix?

In reply to this post by Frank Shearar

So is the test BS? Should we modify it to expect the cascade node instead?

On Dec 21, 2010, at 3:07 AM, Frank Shearar <[hidden email]> wrote:

> One of the failing Decompiler tests is when we check MethodPragmaTest>>testCompileCharacter:
>
> testCompileCharacter
> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ).
> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ).
>
> As bytecodes, this method looks like this:
>
> 37 <70> self
> 38 <21> pushConstant: 'foo: $a'
> 39 <22> pushConstant: #foo:
> 40 <23> pushConstant: #($a)
> 41 <83 60> send: assertPragma:givesKeyword:arguments:
> 43 <87> pop
> 44 <70> self
> 45 <24> pushConstant: 'foo: $ '
> 46 <22> pushConstant: #foo:
> 47 <25> pushConstant: {Character space}
> 48 <83 60> send: assertPragma:givesKeyword:arguments:
> 50 <87> pop
> 51 <78> returnSelf
>
> And when decompiled, we have:
>
> testCompileCharacter
> self
> assertPragma: 'foo: $a'
> givesKeyword: #foo:
> arguments: #($a ).
> self
> assertPragma: 'foo: $ '
> givesKeyword: #foo:
> arguments: ((Array new: 1) at: 1 put: Character space; yourself)
>
> Notice that the literal #( $ ) has expanded into an Array declaration etc.
>
> When we compile this decompiled string, we end up with the rather different
>
> self
> pushConstant: ''foo: $a''
> pushConstant: #foo:
> pushConstant: #($a)
> send: #assertPragma:givesKeyword:arguments: (3 args)
> pop
> self
> pushConstant: ''foo: $ ''
> pushConstant: #foo:
> pushLit: Array
> pushConstant: 1
> send: #new: (1 arg)
> dup
> pushConstant: 1
> pushLit: Character
> send: #space (0 args)
> send: #at:put: (2 args)
> pop
> send: #yourself (0 args)
> send: #assertPragma:givesKeyword:arguments: (3 args)
> pop
> returnSelf
>
> The decompiler turns that pushConstant: into a whole bunch of stuff because it interprets #( $ ) as (Array new: 1) at: 1 put: Character space; yourself.
>
> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being interpreted as Character space, and then decompiled to pushLit: Character send: #space.)
>
> That's all fine, but what happens is that when you compile the decompiled method (and then decompile _that_ method), you can't have the newly-decompiled method match the original decompiled method. The parse tree are different: where the original has a LiteralNode, the new version has a CascadeNode.
>
> frank
>

Frank Shearar

Re: Failing Decompiler tests - how to fix?

In reply to this post by Levente Uzonyi-2

On 2010/12/21 13:16, Levente Uzonyi wrote:

> On Tue, 21 Dec 2010, Frank Shearar wrote:
>
>> One of the failing Decompiler tests is when we check
>> MethodPragmaTest>>testCompileCharacter:
>>
>> testCompileCharacter
>> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ).
>> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ).
>>
>> As bytecodes, this method looks like this:
>>
>> 37 <70> self
>> 38 <21> pushConstant: 'foo: $a'
>> 39 <22> pushConstant: #foo:
>> 40 <23> pushConstant: #($a)
>> 41 <83 60> send: assertPragma:givesKeyword:arguments:
>> 43 <87> pop
>> 44 <70> self
>> 45 <24> pushConstant: 'foo: $ '
>> 46 <22> pushConstant: #foo:
>> 47 <25> pushConstant: {Character space}
>> 48 <83 60> send: assertPragma:givesKeyword:arguments:
>> 50 <87> pop
>> 51 <78> returnSelf
>>
>> And when decompiled, we have:
>>
>> testCompileCharacter
>> self
>> assertPragma: 'foo: $a'
>> givesKeyword: #foo:
>> arguments: #($a ).
>> self
>> assertPragma: 'foo: $ '
>> givesKeyword: #foo:
>> arguments: ((Array new: 1) at: 1 put: Character space; yourself)
>>
>> Notice that the literal #( $ ) has expanded into an Array declaration
>> etc.
>>
>> When we compile this decompiled string, we end up with the rather
>> different
>>
>> self
>> pushConstant: ''foo: $a''
>> pushConstant: #foo:
>> pushConstant: #($a)
>> send: #assertPragma:givesKeyword:arguments: (3 args)
>> pop
>> self
>> pushConstant: ''foo: $ ''
>> pushConstant: #foo:
>> pushLit: Array
>> pushConstant: 1
>> send: #new: (1 arg)
>> dup
>> pushConstant: 1
>> pushLit: Character
>> send: #space (0 args)
>> send: #at:put: (2 args)
>> pop
>> send: #yourself (0 args)
>> send: #assertPragma:givesKeyword:arguments: (3 args)
>> pop
>> returnSelf
>>
>> The decompiler turns that pushConstant: into a whole bunch of stuff
>> because it interprets #( $ ) as (Array new: 1) at: 1 put: Character
>> space; yourself.
>>
>> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and
>> SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being
>> interpreted as Character space, and then decompiled to pushLit:
>> Character send: #space.)
>>
>> That's all fine, but what happens is that when you compile the
>> decompiled method (and then decompile _that_ method), you can't have
>> the newly-decompiled method match the original decompiled method. The
>> parse tree are different: where the original has a LiteralNode, the
>> new version has a CascadeNode.
>
> It's easy to fix it, just change Character >> shouldBePrintedAsLiteral to
> ^value between: 32 and: 255. Though it may break other stuff. See here:
> http://bugs.squeak.org/view.php?id=2254
> http://bugs.squeak.org/view.php?id=4322
> Btw the 33-255 range is funny, because 127 is a control character.

It's not (just) the Character literal. It's that #( $ ) (or {Character
space} is a compiled as a LiteralNode.

The expansion's semantically correct, but the decompiled string no
longer matches bytecode-for-bytecode the bytecodes of the original
string, which means that the old decompiled and new decompiled strings
differ by whitespace (because of the way a CascadeNode's written).

frank

Frank Shearar

Re: Failing Decompiler tests - how to fix?

On 2010/12/21 19:10, Frank Shearar wrote:

> On 2010/12/21 13:16, Levente Uzonyi wrote:
>> On Tue, 21 Dec 2010, Frank Shearar wrote:
>>
>>> One of the failing Decompiler tests is when we check
>>> MethodPragmaTest>>testCompileCharacter:
>>>
>>> testCompileCharacter
>>> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ).
>>> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ).
>>>
>>> As bytecodes, this method looks like this:
>>>
>>> 37 <70> self
>>> 38 <21> pushConstant: 'foo: $a'
>>> 39 <22> pushConstant: #foo:
>>> 40 <23> pushConstant: #($a)
>>> 41 <83 60> send: assertPragma:givesKeyword:arguments:
>>> 43 <87> pop
>>> 44 <70> self
>>> 45 <24> pushConstant: 'foo: $ '
>>> 46 <22> pushConstant: #foo:
>>> 47 <25> pushConstant: {Character space}
>>> 48 <83 60> send: assertPragma:givesKeyword:arguments:
>>> 50 <87> pop
>>> 51 <78> returnSelf
>>>
>>> And when decompiled, we have:
>>>
>>> testCompileCharacter
>>> self
>>> assertPragma: 'foo: $a'
>>> givesKeyword: #foo:
>>> arguments: #($a ).
>>> self
>>> assertPragma: 'foo: $ '
>>> givesKeyword: #foo:
>>> arguments: ((Array new: 1) at: 1 put: Character space; yourself)
>>>
>>> Notice that the literal #( $ ) has expanded into an Array declaration
>>> etc.
>>>
>>> When we compile this decompiled string, we end up with the rather
>>> different
>>>
>>> self
>>> pushConstant: ''foo: $a''
>>> pushConstant: #foo:
>>> pushConstant: #($a)
>>> send: #assertPragma:givesKeyword:arguments: (3 args)
>>> pop
>>> self
>>> pushConstant: ''foo: $ ''
>>> pushConstant: #foo:
>>> pushLit: Array
>>> pushConstant: 1
>>> send: #new: (1 arg)
>>> dup
>>> pushConstant: 1
>>> pushLit: Character
>>> send: #space (0 args)
>>> send: #at:put: (2 args)
>>> pop
>>> send: #yourself (0 args)
>>> send: #assertPragma:givesKeyword:arguments: (3 args)
>>> pop
>>> returnSelf
>>>
>>> The decompiler turns that pushConstant: into a whole bunch of stuff
>>> because it interprets #( $ ) as (Array new: 1) at: 1 put: Character
>>> space; yourself.
>>>
>>> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and
>>> SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being
>>> interpreted as Character space, and then decompiled to pushLit:
>>> Character send: #space.)
>>>
>>> That's all fine, but what happens is that when you compile the
>>> decompiled method (and then decompile _that_ method), you can't have
>>> the newly-decompiled method match the original decompiled method. The
>>> parse tree are different: where the original has a LiteralNode, the
>>> new version has a CascadeNode.
>>
>> It's easy to fix it, just change Character >> shouldBePrintedAsLiteral to
>> ^value between: 32 and: 255. Though it may break other stuff. See here:
>> http://bugs.squeak.org/view.php?id=2254
>> http://bugs.squeak.org/view.php?id=4322
>> Btw the 33-255 range is funny, because 127 is a control character.
>
> It's not (just) the Character literal. It's that #( $ ) (or {Character
> space} is a compiled as a LiteralNode.
>
> The expansion's semantically correct, but the decompiled string no
> longer matches bytecode-for-bytecode the bytecodes of the original
> string, which means that the old decompiled and new decompiled strings
> differ by whitespace (because of the way a CascadeNode's written).

Me and my big mouth. An array with all literals is itself a literal.

Changing shouldBePrintedAsLiteral as suggested above makes all the
Decompiler tests pass.

I'll run a full suite and report the results later.

frank

Frank Shearar

Re: Failing Decompiler tests - how to fix?

On 2010/12/21 22:24, Frank Shearar wrote:

> On 2010/12/21 19:10, Frank Shearar wrote:
>> On 2010/12/21 13:16, Levente Uzonyi wrote:
>>> On Tue, 21 Dec 2010, Frank Shearar wrote:
>>>
>>>> One of the failing Decompiler tests is when we check
>>>> MethodPragmaTest>>testCompileCharacter:
>>>>
>>>> testCompileCharacter
>>>> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ).
>>>> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ).
>>>>
>>>> As bytecodes, this method looks like this:
>>>>
>>>> 37 <70> self
>>>> 38 <21> pushConstant: 'foo: $a'
>>>> 39 <22> pushConstant: #foo:
>>>> 40 <23> pushConstant: #($a)
>>>> 41 <83 60> send: assertPragma:givesKeyword:arguments:
>>>> 43 <87> pop
>>>> 44 <70> self
>>>> 45 <24> pushConstant: 'foo: $ '
>>>> 46 <22> pushConstant: #foo:
>>>> 47 <25> pushConstant: {Character space}
>>>> 48 <83 60> send: assertPragma:givesKeyword:arguments:
>>>> 50 <87> pop
>>>> 51 <78> returnSelf
>>>>
>>>> And when decompiled, we have:
>>>>
>>>> testCompileCharacter
>>>> self
>>>> assertPragma: 'foo: $a'
>>>> givesKeyword: #foo:
>>>> arguments: #($a ).
>>>> self
>>>> assertPragma: 'foo: $ '
>>>> givesKeyword: #foo:
>>>> arguments: ((Array new: 1) at: 1 put: Character space; yourself)
>>>>
>>>> Notice that the literal #( $ ) has expanded into an Array declaration
>>>> etc.
>>>>
>>>> When we compile this decompiled string, we end up with the rather
>>>> different
>>>>
>>>> self
>>>> pushConstant: ''foo: $a''
>>>> pushConstant: #foo:
>>>> pushConstant: #($a)
>>>> send: #assertPragma:givesKeyword:arguments: (3 args)
>>>> pop
>>>> self
>>>> pushConstant: ''foo: $ ''
>>>> pushConstant: #foo:
>>>> pushLit: Array
>>>> pushConstant: 1
>>>> send: #new: (1 arg)
>>>> dup
>>>> pushConstant: 1
>>>> pushLit: Character
>>>> send: #space (0 args)
>>>> send: #at:put: (2 args)
>>>> pop
>>>> send: #yourself (0 args)
>>>> send: #assertPragma:givesKeyword:arguments: (3 args)
>>>> pop
>>>> returnSelf
>>>>
>>>> The decompiler turns that pushConstant: into a whole bunch of stuff
>>>> because it interprets #( $ ) as (Array new: 1) at: 1 put: Character
>>>> space; yourself.
>>>>
>>>> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and
>>>> SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being
>>>> interpreted as Character space, and then decompiled to pushLit:
>>>> Character send: #space.)
>>>>
>>>> That's all fine, but what happens is that when you compile the
>>>> decompiled method (and then decompile _that_ method), you can't have
>>>> the newly-decompiled method match the original decompiled method. The
>>>> parse tree are different: where the original has a LiteralNode, the
>>>> new version has a CascadeNode.
>>>
>>> It's easy to fix it, just change Character >>
>>> shouldBePrintedAsLiteral to
>>> ^value between: 32 and: 255. Though it may break other stuff. See here:
>>> http://bugs.squeak.org/view.php?id=2254
>>> http://bugs.squeak.org/view.php?id=4322
>>> Btw the 33-255 range is funny, because 127 is a control character.
>>
>> It's not (just) the Character literal. It's that #( $ ) (or {Character
>> space} is a compiled as a LiteralNode.
>>
>> The expansion's semantically correct, but the decompiled string no
>> longer matches bytecode-for-bytecode the bytecodes of the original
>> string, which means that the old decompiled and new decompiled strings
>> differ by whitespace (because of the way a CascadeNode's written).
>
> Me and my big mouth. An array with all literals is itself a literal.
>
> Changing shouldBePrintedAsLiteral as suggested above makes all the
> Decompiler tests pass.
>
> I'll run a full suite and report the results later.

Predictably, the only test to fail after the above-mentioned change is
CharacterTest>>testStoreString, which explicitly checks that space is
stored as 'Character space'.

frank

Frank Shearar

Re: Failing Decompiler tests - how to fix?

In reply to this post by Frank Shearar

On 2010/12/21 22:24, Frank Shearar wrote:

It only just occurred to me that a better idea is, if Character space
shouldBePrintedAsLiteral not, to replace the three places that use $
instead. We know why those three tests are failing, we've verified that
it is because of the space character, so we're not running any great
risk of breaking something by changing the code under test.

frank