One of the failing Decompiler tests is when we check
MethodPragmaTest>>testCompileCharacter: testCompileCharacter self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ). self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ). As bytecodes, this method looks like this: 37 <70> self 38 <21> pushConstant: 'foo: $a' 39 <22> pushConstant: #foo: 40 <23> pushConstant: #($a) 41 <83 60> send: assertPragma:givesKeyword:arguments: 43 <87> pop 44 <70> self 45 <24> pushConstant: 'foo: $ ' 46 <22> pushConstant: #foo: 47 <25> pushConstant: {Character space} 48 <83 60> send: assertPragma:givesKeyword:arguments: 50 <87> pop 51 <78> returnSelf And when decompiled, we have: testCompileCharacter self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #($a ). self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: ((Array new: 1) at: 1 put: Character space; yourself) Notice that the literal #( $ ) has expanded into an Array declaration etc. When we compile this decompiled string, we end up with the rather different self pushConstant: ''foo: $a'' pushConstant: #foo: pushConstant: #($a) send: #assertPragma:givesKeyword:arguments: (3 args) pop self pushConstant: ''foo: $ '' pushConstant: #foo: pushLit: Array pushConstant: 1 send: #new: (1 arg) dup pushConstant: 1 pushLit: Character send: #space (0 args) send: #at:put: (2 args) pop send: #yourself (0 args) send: #assertPragma:givesKeyword:arguments: (3 args) pop returnSelf The decompiler turns that pushConstant: into a whole bunch of stuff because it interprets #( $ ) as (Array new: 1) at: 1 put: Character space; yourself. (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being interpreted as Character space, and then decompiled to pushLit: Character send: #space.) That's all fine, but what happens is that when you compile the decompiled method (and then decompile _that_ method), you can't have the newly-decompiled method match the original decompiled method. The parse tree are different: where the original has a LiteralNode, the new version has a CascadeNode. frank |
On Tue, 21 Dec 2010, Frank Shearar wrote:
> One of the failing Decompiler tests is when we check > MethodPragmaTest>>testCompileCharacter: > > testCompileCharacter > self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ). > self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ). > > As bytecodes, this method looks like this: > > 37 <70> self > 38 <21> pushConstant: 'foo: $a' > 39 <22> pushConstant: #foo: > 40 <23> pushConstant: #($a) > 41 <83 60> send: assertPragma:givesKeyword:arguments: > 43 <87> pop > 44 <70> self > 45 <24> pushConstant: 'foo: $ ' > 46 <22> pushConstant: #foo: > 47 <25> pushConstant: {Character space} > 48 <83 60> send: assertPragma:givesKeyword:arguments: > 50 <87> pop > 51 <78> returnSelf > > And when decompiled, we have: > > testCompileCharacter > self > assertPragma: 'foo: $a' > givesKeyword: #foo: > arguments: #($a ). > self > assertPragma: 'foo: $ ' > givesKeyword: #foo: > arguments: ((Array new: 1) at: 1 put: Character space; > yourself) > > Notice that the literal #( $ ) has expanded into an Array declaration etc. > > When we compile this decompiled string, we end up with the rather different > > self > pushConstant: ''foo: $a'' > pushConstant: #foo: > pushConstant: #($a) > send: #assertPragma:givesKeyword:arguments: (3 args) > pop > self > pushConstant: ''foo: $ '' > pushConstant: #foo: > pushLit: Array > pushConstant: 1 > send: #new: (1 arg) > dup > pushConstant: 1 > pushLit: Character > send: #space (0 args) > send: #at:put: (2 args) > pop > send: #yourself (0 args) > send: #assertPragma:givesKeyword:arguments: (3 args) > pop > returnSelf > > The decompiler turns that pushConstant: into a whole bunch of stuff because > it interprets #( $ ) as (Array new: 1) at: 1 put: Character space; yourself. > > (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and > SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being interpreted > as Character space, and then decompiled to pushLit: Character send: #space.) > > That's all fine, but what happens is that when you compile the decompiled > method (and then decompile _that_ method), you can't have the > newly-decompiled method match the original decompiled method. The parse tree > are different: where the original has a LiteralNode, the new version has a > CascadeNode. It's easy to fix it, just change Character >> shouldBePrintedAsLiteral to ^value between: 32 and: 255. Though it may break other stuff. See here: http://bugs.squeak.org/view.php?id=2254 http://bugs.squeak.org/view.php?id=4322 Btw the 33-255 range is funny, because 127 is a control character. Levente > > frank > > |
In reply to this post by Frank Shearar
So is the test BS? Should we modify it to expect the cascade node instead?
On Dec 21, 2010, at 3:07 AM, Frank Shearar <[hidden email]> wrote: > One of the failing Decompiler tests is when we check MethodPragmaTest>>testCompileCharacter: > > testCompileCharacter > self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ). > self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ). > > As bytecodes, this method looks like this: > > 37 <70> self > 38 <21> pushConstant: 'foo: $a' > 39 <22> pushConstant: #foo: > 40 <23> pushConstant: #($a) > 41 <83 60> send: assertPragma:givesKeyword:arguments: > 43 <87> pop > 44 <70> self > 45 <24> pushConstant: 'foo: $ ' > 46 <22> pushConstant: #foo: > 47 <25> pushConstant: {Character space} > 48 <83 60> send: assertPragma:givesKeyword:arguments: > 50 <87> pop > 51 <78> returnSelf > > And when decompiled, we have: > > testCompileCharacter > self > assertPragma: 'foo: $a' > givesKeyword: #foo: > arguments: #($a ). > self > assertPragma: 'foo: $ ' > givesKeyword: #foo: > arguments: ((Array new: 1) at: 1 put: Character space; yourself) > > Notice that the literal #( $ ) has expanded into an Array declaration etc. > > When we compile this decompiled string, we end up with the rather different > > self > pushConstant: ''foo: $a'' > pushConstant: #foo: > pushConstant: #($a) > send: #assertPragma:givesKeyword:arguments: (3 args) > pop > self > pushConstant: ''foo: $ '' > pushConstant: #foo: > pushLit: Array > pushConstant: 1 > send: #new: (1 arg) > dup > pushConstant: 1 > pushLit: Character > send: #space (0 args) > send: #at:put: (2 args) > pop > send: #yourself (0 args) > send: #assertPragma:givesKeyword:arguments: (3 args) > pop > returnSelf > > The decompiler turns that pushConstant: into a whole bunch of stuff because it interprets #( $ ) as (Array new: 1) at: 1 put: Character space; yourself. > > (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being interpreted as Character space, and then decompiled to pushLit: Character send: #space.) > > That's all fine, but what happens is that when you compile the decompiled method (and then decompile _that_ method), you can't have the newly-decompiled method match the original decompiled method. The parse tree are different: where the original has a LiteralNode, the new version has a CascadeNode. > > frank > |
In reply to this post by Levente Uzonyi-2
On 2010/12/21 13:16, Levente Uzonyi wrote:
> On Tue, 21 Dec 2010, Frank Shearar wrote: > >> One of the failing Decompiler tests is when we check >> MethodPragmaTest>>testCompileCharacter: >> >> testCompileCharacter >> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ). >> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ). >> >> As bytecodes, this method looks like this: >> >> 37 <70> self >> 38 <21> pushConstant: 'foo: $a' >> 39 <22> pushConstant: #foo: >> 40 <23> pushConstant: #($a) >> 41 <83 60> send: assertPragma:givesKeyword:arguments: >> 43 <87> pop >> 44 <70> self >> 45 <24> pushConstant: 'foo: $ ' >> 46 <22> pushConstant: #foo: >> 47 <25> pushConstant: {Character space} >> 48 <83 60> send: assertPragma:givesKeyword:arguments: >> 50 <87> pop >> 51 <78> returnSelf >> >> And when decompiled, we have: >> >> testCompileCharacter >> self >> assertPragma: 'foo: $a' >> givesKeyword: #foo: >> arguments: #($a ). >> self >> assertPragma: 'foo: $ ' >> givesKeyword: #foo: >> arguments: ((Array new: 1) at: 1 put: Character space; yourself) >> >> Notice that the literal #( $ ) has expanded into an Array declaration >> etc. >> >> When we compile this decompiled string, we end up with the rather >> different >> >> self >> pushConstant: ''foo: $a'' >> pushConstant: #foo: >> pushConstant: #($a) >> send: #assertPragma:givesKeyword:arguments: (3 args) >> pop >> self >> pushConstant: ''foo: $ '' >> pushConstant: #foo: >> pushLit: Array >> pushConstant: 1 >> send: #new: (1 arg) >> dup >> pushConstant: 1 >> pushLit: Character >> send: #space (0 args) >> send: #at:put: (2 args) >> pop >> send: #yourself (0 args) >> send: #assertPragma:givesKeyword:arguments: (3 args) >> pop >> returnSelf >> >> The decompiler turns that pushConstant: into a whole bunch of stuff >> because it interprets #( $ ) as (Array new: 1) at: 1 put: Character >> space; yourself. >> >> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and >> SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being >> interpreted as Character space, and then decompiled to pushLit: >> Character send: #space.) >> >> That's all fine, but what happens is that when you compile the >> decompiled method (and then decompile _that_ method), you can't have >> the newly-decompiled method match the original decompiled method. The >> parse tree are different: where the original has a LiteralNode, the >> new version has a CascadeNode. > > It's easy to fix it, just change Character >> shouldBePrintedAsLiteral to > ^value between: 32 and: 255. Though it may break other stuff. See here: > http://bugs.squeak.org/view.php?id=2254 > http://bugs.squeak.org/view.php?id=4322 > Btw the 33-255 range is funny, because 127 is a control character. It's not (just) the Character literal. It's that #( $ ) (or {Character space} is a compiled as a LiteralNode. The expansion's semantically correct, but the decompiled string no longer matches bytecode-for-bytecode the bytecodes of the original string, which means that the old decompiled and new decompiled strings differ by whitespace (because of the way a CascadeNode's written). frank |
On 2010/12/21 19:10, Frank Shearar wrote:
> On 2010/12/21 13:16, Levente Uzonyi wrote: >> On Tue, 21 Dec 2010, Frank Shearar wrote: >> >>> One of the failing Decompiler tests is when we check >>> MethodPragmaTest>>testCompileCharacter: >>> >>> testCompileCharacter >>> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ). >>> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ). >>> >>> As bytecodes, this method looks like this: >>> >>> 37 <70> self >>> 38 <21> pushConstant: 'foo: $a' >>> 39 <22> pushConstant: #foo: >>> 40 <23> pushConstant: #($a) >>> 41 <83 60> send: assertPragma:givesKeyword:arguments: >>> 43 <87> pop >>> 44 <70> self >>> 45 <24> pushConstant: 'foo: $ ' >>> 46 <22> pushConstant: #foo: >>> 47 <25> pushConstant: {Character space} >>> 48 <83 60> send: assertPragma:givesKeyword:arguments: >>> 50 <87> pop >>> 51 <78> returnSelf >>> >>> And when decompiled, we have: >>> >>> testCompileCharacter >>> self >>> assertPragma: 'foo: $a' >>> givesKeyword: #foo: >>> arguments: #($a ). >>> self >>> assertPragma: 'foo: $ ' >>> givesKeyword: #foo: >>> arguments: ((Array new: 1) at: 1 put: Character space; yourself) >>> >>> Notice that the literal #( $ ) has expanded into an Array declaration >>> etc. >>> >>> When we compile this decompiled string, we end up with the rather >>> different >>> >>> self >>> pushConstant: ''foo: $a'' >>> pushConstant: #foo: >>> pushConstant: #($a) >>> send: #assertPragma:givesKeyword:arguments: (3 args) >>> pop >>> self >>> pushConstant: ''foo: $ '' >>> pushConstant: #foo: >>> pushLit: Array >>> pushConstant: 1 >>> send: #new: (1 arg) >>> dup >>> pushConstant: 1 >>> pushLit: Character >>> send: #space (0 args) >>> send: #at:put: (2 args) >>> pop >>> send: #yourself (0 args) >>> send: #assertPragma:givesKeyword:arguments: (3 args) >>> pop >>> returnSelf >>> >>> The decompiler turns that pushConstant: into a whole bunch of stuff >>> because it interprets #( $ ) as (Array new: 1) at: 1 put: Character >>> space; yourself. >>> >>> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and >>> SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being >>> interpreted as Character space, and then decompiled to pushLit: >>> Character send: #space.) >>> >>> That's all fine, but what happens is that when you compile the >>> decompiled method (and then decompile _that_ method), you can't have >>> the newly-decompiled method match the original decompiled method. The >>> parse tree are different: where the original has a LiteralNode, the >>> new version has a CascadeNode. >> >> It's easy to fix it, just change Character >> shouldBePrintedAsLiteral to >> ^value between: 32 and: 255. Though it may break other stuff. See here: >> http://bugs.squeak.org/view.php?id=2254 >> http://bugs.squeak.org/view.php?id=4322 >> Btw the 33-255 range is funny, because 127 is a control character. > > It's not (just) the Character literal. It's that #( $ ) (or {Character > space} is a compiled as a LiteralNode. > > The expansion's semantically correct, but the decompiled string no > longer matches bytecode-for-bytecode the bytecodes of the original > string, which means that the old decompiled and new decompiled strings > differ by whitespace (because of the way a CascadeNode's written). Me and my big mouth. An array with all literals is itself a literal. Changing shouldBePrintedAsLiteral as suggested above makes all the Decompiler tests pass. I'll run a full suite and report the results later. frank |
On 2010/12/21 22:24, Frank Shearar wrote:
> On 2010/12/21 19:10, Frank Shearar wrote: >> On 2010/12/21 13:16, Levente Uzonyi wrote: >>> On Tue, 21 Dec 2010, Frank Shearar wrote: >>> >>>> One of the failing Decompiler tests is when we check >>>> MethodPragmaTest>>testCompileCharacter: >>>> >>>> testCompileCharacter >>>> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ). >>>> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ). >>>> >>>> As bytecodes, this method looks like this: >>>> >>>> 37 <70> self >>>> 38 <21> pushConstant: 'foo: $a' >>>> 39 <22> pushConstant: #foo: >>>> 40 <23> pushConstant: #($a) >>>> 41 <83 60> send: assertPragma:givesKeyword:arguments: >>>> 43 <87> pop >>>> 44 <70> self >>>> 45 <24> pushConstant: 'foo: $ ' >>>> 46 <22> pushConstant: #foo: >>>> 47 <25> pushConstant: {Character space} >>>> 48 <83 60> send: assertPragma:givesKeyword:arguments: >>>> 50 <87> pop >>>> 51 <78> returnSelf >>>> >>>> And when decompiled, we have: >>>> >>>> testCompileCharacter >>>> self >>>> assertPragma: 'foo: $a' >>>> givesKeyword: #foo: >>>> arguments: #($a ). >>>> self >>>> assertPragma: 'foo: $ ' >>>> givesKeyword: #foo: >>>> arguments: ((Array new: 1) at: 1 put: Character space; yourself) >>>> >>>> Notice that the literal #( $ ) has expanded into an Array declaration >>>> etc. >>>> >>>> When we compile this decompiled string, we end up with the rather >>>> different >>>> >>>> self >>>> pushConstant: ''foo: $a'' >>>> pushConstant: #foo: >>>> pushConstant: #($a) >>>> send: #assertPragma:givesKeyword:arguments: (3 args) >>>> pop >>>> self >>>> pushConstant: ''foo: $ '' >>>> pushConstant: #foo: >>>> pushLit: Array >>>> pushConstant: 1 >>>> send: #new: (1 arg) >>>> dup >>>> pushConstant: 1 >>>> pushLit: Character >>>> send: #space (0 args) >>>> send: #at:put: (2 args) >>>> pop >>>> send: #yourself (0 args) >>>> send: #assertPragma:givesKeyword:arguments: (3 args) >>>> pop >>>> returnSelf >>>> >>>> The decompiler turns that pushConstant: into a whole bunch of stuff >>>> because it interprets #( $ ) as (Array new: 1) at: 1 put: Character >>>> space; yourself. >>>> >>>> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and >>>> SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being >>>> interpreted as Character space, and then decompiled to pushLit: >>>> Character send: #space.) >>>> >>>> That's all fine, but what happens is that when you compile the >>>> decompiled method (and then decompile _that_ method), you can't have >>>> the newly-decompiled method match the original decompiled method. The >>>> parse tree are different: where the original has a LiteralNode, the >>>> new version has a CascadeNode. >>> >>> It's easy to fix it, just change Character >> >>> shouldBePrintedAsLiteral to >>> ^value between: 32 and: 255. Though it may break other stuff. See here: >>> http://bugs.squeak.org/view.php?id=2254 >>> http://bugs.squeak.org/view.php?id=4322 >>> Btw the 33-255 range is funny, because 127 is a control character. >> >> It's not (just) the Character literal. It's that #( $ ) (or {Character >> space} is a compiled as a LiteralNode. >> >> The expansion's semantically correct, but the decompiled string no >> longer matches bytecode-for-bytecode the bytecodes of the original >> string, which means that the old decompiled and new decompiled strings >> differ by whitespace (because of the way a CascadeNode's written). > > Me and my big mouth. An array with all literals is itself a literal. > > Changing shouldBePrintedAsLiteral as suggested above makes all the > Decompiler tests pass. > > I'll run a full suite and report the results later. Predictably, the only test to fail after the above-mentioned change is CharacterTest>>testStoreString, which explicitly checks that space is stored as 'Character space'. frank |
In reply to this post by Frank Shearar
On 2010/12/21 22:24, Frank Shearar wrote:
> On 2010/12/21 19:10, Frank Shearar wrote: >> On 2010/12/21 13:16, Levente Uzonyi wrote: >>> On Tue, 21 Dec 2010, Frank Shearar wrote: >>> >>>> One of the failing Decompiler tests is when we check >>>> MethodPragmaTest>>testCompileCharacter: >>>> >>>> testCompileCharacter >>>> self assertPragma: 'foo: $a' givesKeyword: #foo: arguments: #( $a ). >>>> self assertPragma: 'foo: $ ' givesKeyword: #foo: arguments: #( $ ). >>>> >>>> As bytecodes, this method looks like this: >>>> >>>> 37 <70> self >>>> 38 <21> pushConstant: 'foo: $a' >>>> 39 <22> pushConstant: #foo: >>>> 40 <23> pushConstant: #($a) >>>> 41 <83 60> send: assertPragma:givesKeyword:arguments: >>>> 43 <87> pop >>>> 44 <70> self >>>> 45 <24> pushConstant: 'foo: $ ' >>>> 46 <22> pushConstant: #foo: >>>> 47 <25> pushConstant: {Character space} >>>> 48 <83 60> send: assertPragma:givesKeyword:arguments: >>>> 50 <87> pop >>>> 51 <78> returnSelf >>>> >>>> And when decompiled, we have: >>>> >>>> testCompileCharacter >>>> self >>>> assertPragma: 'foo: $a' >>>> givesKeyword: #foo: >>>> arguments: #($a ). >>>> self >>>> assertPragma: 'foo: $ ' >>>> givesKeyword: #foo: >>>> arguments: ((Array new: 1) at: 1 put: Character space; yourself) >>>> >>>> Notice that the literal #( $ ) has expanded into an Array declaration >>>> etc. >>>> >>>> When we compile this decompiled string, we end up with the rather >>>> different >>>> >>>> self >>>> pushConstant: ''foo: $a'' >>>> pushConstant: #foo: >>>> pushConstant: #($a) >>>> send: #assertPragma:givesKeyword:arguments: (3 args) >>>> pop >>>> self >>>> pushConstant: ''foo: $ '' >>>> pushConstant: #foo: >>>> pushLit: Array >>>> pushConstant: 1 >>>> send: #new: (1 arg) >>>> dup >>>> pushConstant: 1 >>>> pushLit: Character >>>> send: #space (0 args) >>>> send: #at:put: (2 args) >>>> pop >>>> send: #yourself (0 args) >>>> send: #assertPragma:givesKeyword:arguments: (3 args) >>>> pop >>>> returnSelf >>>> >>>> The decompiler turns that pushConstant: into a whole bunch of stuff >>>> because it interprets #( $ ) as (Array new: 1) at: 1 put: Character >>>> space; yourself. >>>> >>>> (The same thing happens for SyntaxMorph>>replaceSel:menuItem: and >>>> SyntaxMorph>>replaceKeyWord:menuItem. It looks like $ is being >>>> interpreted as Character space, and then decompiled to pushLit: >>>> Character send: #space.) >>>> >>>> That's all fine, but what happens is that when you compile the >>>> decompiled method (and then decompile _that_ method), you can't have >>>> the newly-decompiled method match the original decompiled method. The >>>> parse tree are different: where the original has a LiteralNode, the >>>> new version has a CascadeNode. >>> >>> It's easy to fix it, just change Character >> >>> shouldBePrintedAsLiteral to >>> ^value between: 32 and: 255. Though it may break other stuff. See here: >>> http://bugs.squeak.org/view.php?id=2254 >>> http://bugs.squeak.org/view.php?id=4322 >>> Btw the 33-255 range is funny, because 127 is a control character. >> >> It's not (just) the Character literal. It's that #( $ ) (or {Character >> space} is a compiled as a LiteralNode. >> >> The expansion's semantically correct, but the decompiled string no >> longer matches bytecode-for-bytecode the bytecodes of the original >> string, which means that the old decompiled and new decompiled strings >> differ by whitespace (because of the way a CascadeNode's written). > > Me and my big mouth. An array with all literals is itself a literal. > > Changing shouldBePrintedAsLiteral as suggested above makes all the > Decompiler tests pass. > > I'll run a full suite and report the results later. It only just occurred to me that a better idea is, if Character space shouldBePrintedAsLiteral not, to replace the three places that use $ instead. We know why those three tests are failing, we've verified that it is because of the space character, so we're not running any great risk of breaking something by changing the code under test. frank |
Free forum by Nabble | Edit this page |