Smalltalk › Squeak › Squeak - Dev

here's a curio for compiler mavens...

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

14 messages Options

Eliot Miranda-2

here's a curio for compiler mavens...

Hi All,

(whitewash alert) I just had occasion to look at the bytecode generated by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:. Here's the source, with the issue in bold:

goodPrimeAtLeast: lowerLimit

"Answer the next good prime >= lowerlimit.

If lowerLimit is larger than the largest known good prime,

just make it odd."

| primes low mid high prime |

primes := self goodPrimes.

low := 1.

high := primes size.

lowerLimit > (primes at: high) ifTrue: [

^lowerLimit bitOr: 1 ].

[ high - low <= 1 ] whileFalse: [

mid := high + low // 2.

prime := primes at: mid.

prime = lowerLimit ifTrue: [ ^prime ].

prime < lowerLimit

ifTrue: [ low := mid ]

ifFalse: [ high := mid ] ].

(primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].

^primes at: high

The code for this sequence is

58 <15> pushTemp: 5

59 <10> pushTemp: 0

60 <B2> send: <

61 <9B> jumpFalse: 66

62 <13> pushTemp: 3

63 <81 42> storeIntoTemp: 2

65 <92> jumpTo: 69

66 <13> pushTemp: 3

67 <81 44> storeIntoTemp: 4

69 <87> pop

where-as the following would be better:

58 <15> pushTemp: 5

59 <10> pushTemp: 0

60 <B2> send: <

61 <9B> jumpFalse: 66

62 <13> pushTemp: 3

63 <82 42> popIntoTemp: 2

65 <92> jumpTo: 69

66 <13> pushTemp: 3

67 <82 44> popIntoTemp: 4

The reason is that the code generator favours using a single pop for both arms of the if:

MessageNode>>sizeCodeForIf: encoder value: forValue

| thenExpr elseExpr branchSize thenSize elseSize |

thenExpr := arguments at: 1.

elseExpr := arguments at: 2.

(forValue

or: [(thenExpr isJust: NodeNil)

or: [elseExpr isJust: NodeNil]]) not

"(...not ifTrue: avoids using ifFalse: alone during this compile)"

ifTrue: "Two-armed IFs forEffect share a single pop"

[^super sizeCodeForEffect: encoder].

...

MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue

| thenExpr thenSize elseExpr elseSize |

thenSize := sizes at: 1.

elseSize := sizes at: 2.

(forValue not and: [elseSize * thenSize > 0]) ifTrue:

"Two-armed IFs forEffect share a single pop"

[^super emitCodeForEffect: stack encoder: encoder].

It would be nice if this only happened if doing so actually reduced the size of the generated code.

--
best,

Eliot

Igor Stasenko

Re: here's a curio for compiler mavens...

On 13 February 2012 20:58, Eliot Miranda <[hidden email]> wrote:

[ (a) ifTrue:[ b] ifFalse: [c] ] value
should answer b or c (depending on a)
now if you use pop instead of store, it will break.

Of course in while loop, the [ loop-body closure ] value is always
discarded, so yes we can
eliminate extra pop.
But you can do even more:

1. loop head
...
....
61 <9B> jumpFalse: 66
62 <13> pushTemp: 3
63 <82 42> popIntoTemp: 2
** 65 <92> jumpTo: 1
66 <13> pushTemp: 3
67 <82 44> popIntoTemp: 4
69 ... jumpTo: 1

so, we can have 1 less jump when code enters first branch.

> The reason is that the code generator favours using a single pop for both
> arms of the if:
>
> MessageNode>>sizeCodeForIf: encoder value: forValue
> | thenExpr elseExpr branchSize thenSize elseSize |
> thenExpr := arguments at: 1.
> elseExpr := arguments at: 2.
> (forValue
> or: [(thenExpr isJust: NodeNil)
> or: [elseExpr isJust: NodeNil]]) not
> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
> ifTrue: "Two-armed IFs forEffect share a single pop"
> [^super sizeCodeForEffect: encoder].
> ...
>
> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
> | thenExpr thenSize elseExpr elseSize |
> thenSize := sizes at: 1.
> elseSize := sizes at: 2.
> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
> "Two-armed IFs forEffect share a single pop"
> [^super emitCodeForEffect: stack encoder: encoder].
>
> It would be nice if this only happened if doing so actually reduced the size
> of the generated code.
> --
> best,
> Eliot
>
>
>
>

--
Best regards,
Igor Stasenko.

Nicolas Cellier

Re: here's a curio for compiler mavens...

In reply to this post by Eliot Miranda-2

Le 13 février 2012 20:58, Eliot Miranda <[hidden email]> a écrit :

> Hi All,
>
> (whitewash alert) I just had occasion to look at the bytecode generated
> by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:.
> Here's the source, with the issue in bold:
>
> goodPrimeAtLeast: lowerLimit
> "Answer the next good prime >= lowerlimit.
> If lowerLimit is larger than the largest known good prime,
> just make it odd."
> | primes low mid high prime |
> primes := self goodPrimes.
> low := 1.
> high := primes size.
> lowerLimit > (primes at: high) ifTrue: [
> ^lowerLimit bitOr: 1 ].
> [ high - low <= 1 ] whileFalse: [
> mid := high + low // 2.
> prime := primes at: mid.
> prime = lowerLimit ifTrue: [ ^prime ].
> prime < lowerLimit
> ifTrue: [ low := mid ]
> ifFalse: [ high := mid ] ].
> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
> ^primes at: high
>
> The code for this sequence is
> 58 <15> pushTemp: 5
> 59 <10> pushTemp: 0
> 60 <B2> send: <
> 61 <9B> jumpFalse: 66
> 62 <13> pushTemp: 3
> 63 <81 42> storeIntoTemp: 2
> 65 <92> jumpTo: 69
> 66 <13> pushTemp: 3
> 67 <81 44> storeIntoTemp: 4
> 69 <87> pop
>
> where-as the following would be better:
>
> 58 <15> pushTemp: 5
> 59 <10> pushTemp: 0
> 60 <B2> send: <
> 61 <9B> jumpFalse: 66
> 62 <13> pushTemp: 3
> 63 <82 42> popIntoTemp: 2
> 65 <92> jumpTo: 69
> 66 <13> pushTemp: 3
> 67 <82 44> popIntoTemp: 4
>
> The reason is that the code generator favours using a single pop for both
> arms of the if:
>
> MessageNode>>sizeCodeForIf: encoder value: forValue
> | thenExpr elseExpr branchSize thenSize elseSize |
> thenExpr := arguments at: 1.
> elseExpr := arguments at: 2.
> (forValue
> or: [(thenExpr isJust: NodeNil)
> or: [elseExpr isJust: NodeNil]]) not
> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
> ifTrue: "Two-armed IFs forEffect share a single pop"
> [^super sizeCodeForEffect: encoder].
> ...
>
> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
> | thenExpr thenSize elseExpr elseSize |
> thenSize := sizes at: 1.
> elseSize := sizes at: 2.
> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
> "Two-armed IFs forEffect share a single pop"
> [^super emitCodeForEffect: stack encoder: encoder].
>
> It would be nice if this only happened if doing so actually reduced the size
> of the generated code.
> --

Ah yes, for example ifTrue: [self doThis] ifFalse: [self doThat] will
factor a pop.
So, the room for optimisation is the case when both branch finish with
assignment? (storePop)
Will it break the Decompiler?

Nicolas

> best,
> Eliot
>
>
>
>

Nicolas Cellier

Re: here's a curio for compiler mavens...

In reply to this post by Igor Stasenko

Le 13 février 2012 21:27, Igor Stasenko <[hidden email]> a écrit :

> On 13 February 2012 20:58, Eliot Miranda <[hidden email]> wrote:
>> Hi All,
>>
>> (whitewash alert) I just had occasion to look at the bytecode generated
>> by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:.
>> Here's the source, with the issue in bold:
>>
>> goodPrimeAtLeast: lowerLimit
>> "Answer the next good prime >= lowerlimit.
>> If lowerLimit is larger than the largest known good prime,
>> just make it odd."
>> | primes low mid high prime |
>> primes := self goodPrimes.
>> low := 1.
>> high := primes size.
>> lowerLimit > (primes at: high) ifTrue: [
>> ^lowerLimit bitOr: 1 ].
>> [ high - low <= 1 ] whileFalse: [
>> mid := high + low // 2.
>> prime := primes at: mid.
>> prime = lowerLimit ifTrue: [ ^prime ].
>> prime < lowerLimit
>> ifTrue: [ low := mid ]
>> ifFalse: [ high := mid ] ].
>> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>> ^primes at: high
>>
>> The code for this sequence is
>> 58 <15> pushTemp: 5
>> 59 <10> pushTemp: 0
>> 60 <B2> send: <
>> 61 <9B> jumpFalse: 66
>> 62 <13> pushTemp: 3
>> 63 <81 42> storeIntoTemp: 2
>> 65 <92> jumpTo: 69
>> 66 <13> pushTemp: 3
>> 67 <81 44> storeIntoTemp: 4
>> 69 <87> pop
>>
>> where-as the following would be better:
>>
>> 58 <15> pushTemp: 5
>> 59 <10> pushTemp: 0
>> 60 <B2> send: <
>> 61 <9B> jumpFalse: 66
>> 62 <13> pushTemp: 3
>> 63 <82 42> popIntoTemp: 2
>> 65 <92> jumpTo: 69
>> 66 <13> pushTemp: 3
>> 67 <82 44> popIntoTemp: 4
>>
>
> [ (a) ifTrue:[ b] ifFalse: [c] ] value
> should answer b or c (depending on a)
> now if you use pop instead of store, it will break.
>

Yes but we always know whether we want to emit code forValue or just for effect.

> Of course in while loop, the [ loop-body closure ] value is always
> discarded, so yes we can
> eliminate extra pop.
> But you can do even more:
>
> 1. loop head
> ...
> ....
> 61 <9B> jumpFalse: 66
> 62 <13> pushTemp: 3
> 63 <82 42> popIntoTemp: 2
> ** 65 <92> jumpTo: 1
> 66 <13> pushTemp: 3
> 67 <82 44> popIntoTemp: 4
> 69 ... jumpTo: 1
>
> so, we can have 1 less jump when code enters first branch.
>

Very good!
I think however that hacking the Decompiler will be a challenge!

>
>> The reason is that the code generator favours using a single pop for both
>> arms of the if:
>>
>> MessageNode>>sizeCodeForIf: encoder value: forValue
>> | thenExpr elseExpr branchSize thenSize elseSize |
>> thenExpr := arguments at: 1.
>> elseExpr := arguments at: 2.
>> (forValue
>> or: [(thenExpr isJust: NodeNil)
>> or: [elseExpr isJust: NodeNil]]) not
>> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>> ifTrue: "Two-armed IFs forEffect share a single pop"
>> [^super sizeCodeForEffect: encoder].
>> ...
>>
>> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>> | thenExpr thenSize elseExpr elseSize |
>> thenSize := sizes at: 1.
>> elseSize := sizes at: 2.
>> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>> "Two-armed IFs forEffect share a single pop"
>> [^super emitCodeForEffect: stack encoder: encoder].
>>
>> It would be nice if this only happened if doing so actually reduced the size
>> of the generated code.
>> --
>> best,
>> Eliot
>>
>>
>>
>>
>
>
>
> --
> Best regards,
> Igor Stasenko.
>

Igor Stasenko

Re: here's a curio for compiler mavens...

On 13 February 2012 22:38, Nicolas Cellier
<[hidden email]> wrote:

> Le 13 février 2012 21:27, Igor Stasenko <[hidden email]> a écrit :
>> On 13 February 2012 20:58, Eliot Miranda <[hidden email]> wrote:
>>> Hi All,
>>>
>>> (whitewash alert) I just had occasion to look at the bytecode generated
>>> by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:.
>>> Here's the source, with the issue in bold:
>>>
>>> goodPrimeAtLeast: lowerLimit
>>> "Answer the next good prime >= lowerlimit.
>>> If lowerLimit is larger than the largest known good prime,
>>> just make it odd."
>>> | primes low mid high prime |
>>> primes := self goodPrimes.
>>> low := 1.
>>> high := primes size.
>>> lowerLimit > (primes at: high) ifTrue: [
>>> ^lowerLimit bitOr: 1 ].
>>> [ high - low <= 1 ] whileFalse: [
>>> mid := high + low // 2.
>>> prime := primes at: mid.
>>> prime = lowerLimit ifTrue: [ ^prime ].
>>> prime < lowerLimit
>>> ifTrue: [ low := mid ]
>>> ifFalse: [ high := mid ] ].
>>> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>>> ^primes at: high
>>>
>>> The code for this sequence is
>>> 58 <15> pushTemp: 5
>>> 59 <10> pushTemp: 0
>>> 60 <B2> send: <
>>> 61 <9B> jumpFalse: 66
>>> 62 <13> pushTemp: 3
>>> 63 <81 42> storeIntoTemp: 2
>>> 65 <92> jumpTo: 69
>>> 66 <13> pushTemp: 3
>>> 67 <81 44> storeIntoTemp: 4
>>> 69 <87> pop
>>>
>>> where-as the following would be better:
>>>
>>> 58 <15> pushTemp: 5
>>> 59 <10> pushTemp: 0
>>> 60 <B2> send: <
>>> 61 <9B> jumpFalse: 66
>>> 62 <13> pushTemp: 3
>>> 63 <82 42> popIntoTemp: 2
>>> 65 <92> jumpTo: 69
>>> 66 <13> pushTemp: 3
>>> 67 <82 44> popIntoTemp: 4
>>>
>>
>> [ (a) ifTrue:[ b] ifFalse: [c] ] value
>> should answer b or c (depending on a)
>> now if you use pop instead of store, it will break.
>>
>
> Yes but we always know whether we want to emit code forValue or just for effect.
>
>> Of course in while loop, the [ loop-body closure ] value is always
>> discarded, so yes we can
>> eliminate extra pop.
>> But you can do even more:
>>
>> 1. loop head
>> ...
>> ....
>> 61 <9B> jumpFalse: 66
>> 62 <13> pushTemp: 3
>> 63 <82 42> popIntoTemp: 2
>> ** 65 <92> jumpTo: 1
>> 66 <13> pushTemp: 3
>> 67 <82 44> popIntoTemp: 4
>> 69 ... jumpTo: 1
>>
>> so, we can have 1 less jump when code enters first branch.
>>
>
> Very good!
> I think however that hacking the Decompiler will be a challenge!
>

Well, in this case it is easy to hide hints for decompiler directly in
a bytecode:

1. loop head
...
100. jumpTo: 106
101. push ...
102. pop
103. jumpTo: 1
104. hint
105. more hints
106. ... section start for false branch

n+1 ...

----
same actually for loops. instead of writing sophisticated detection
code , we can simply read hints:

1. loop head
...

10. jumpIfFalse: 101 (leaving the loop )

... loop body

99. jumpTo: 1 (loop head)
100. hint for loop
101. ... rest of method's bytecode

As you can see , it doesn't affects the performance, since its
unreachable code, just the size of bytecode.

just don't put me on fire for suggesting that :)

--
Best regards,
Igor Stasenko.

Nicolas Cellier

Re: here's a curio for compiler mavens...

Le 13 février 2012 22:08, Igor Stasenko <[hidden email]> a écrit :

> On 13 February 2012 22:38, Nicolas Cellier
> <[hidden email]> wrote:
>> Le 13 février 2012 21:27, Igor Stasenko <[hidden email]> a écrit :
>>> On 13 February 2012 20:58, Eliot Miranda <[hidden email]> wrote:
>>>> Hi All,
>>>>
>>>> (whitewash alert) I just had occasion to look at the bytecode generated
>>>> by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:.
>>>> Here's the source, with the issue in bold:
>>>>
>>>> goodPrimeAtLeast: lowerLimit
>>>> "Answer the next good prime >= lowerlimit.
>>>> If lowerLimit is larger than the largest known good prime,
>>>> just make it odd."
>>>> | primes low mid high prime |
>>>> primes := self goodPrimes.
>>>> low := 1.
>>>> high := primes size.
>>>> lowerLimit > (primes at: high) ifTrue: [
>>>> ^lowerLimit bitOr: 1 ].
>>>> [ high - low <= 1 ] whileFalse: [
>>>> mid := high + low // 2.
>>>> prime := primes at: mid.
>>>> prime = lowerLimit ifTrue: [ ^prime ].
>>>> prime < lowerLimit
>>>> ifTrue: [ low := mid ]
>>>> ifFalse: [ high := mid ] ].
>>>> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>>>> ^primes at: high
>>>>
>>>> The code for this sequence is
>>>> 58 <15> pushTemp: 5
>>>> 59 <10> pushTemp: 0
>>>> 60 <B2> send: <
>>>> 61 <9B> jumpFalse: 66
>>>> 62 <13> pushTemp: 3
>>>> 63 <81 42> storeIntoTemp: 2
>>>> 65 <92> jumpTo: 69
>>>> 66 <13> pushTemp: 3
>>>> 67 <81 44> storeIntoTemp: 4
>>>> 69 <87> pop
>>>>
>>>> where-as the following would be better:
>>>>
>>>> 58 <15> pushTemp: 5
>>>> 59 <10> pushTemp: 0
>>>> 60 <B2> send: <
>>>> 61 <9B> jumpFalse: 66
>>>> 62 <13> pushTemp: 3
>>>> 63 <82 42> popIntoTemp: 2
>>>> 65 <92> jumpTo: 69
>>>> 66 <13> pushTemp: 3
>>>> 67 <82 44> popIntoTemp: 4
>>>>
>>>
>>> [ (a) ifTrue:[ b] ifFalse: [c] ] value
>>> should answer b or c (depending on a)
>>> now if you use pop instead of store, it will break.
>>>
>>
>> Yes but we always know whether we want to emit code forValue or just for effect.
>>
>>> Of course in while loop, the [ loop-body closure ] value is always
>>> discarded, so yes we can
>>> eliminate extra pop.
>>> But you can do even more:
>>>
>>> 1. loop head
>>> ...
>>> ....
>>> 61 <9B> jumpFalse: 66
>>> 62 <13> pushTemp: 3
>>> 63 <82 42> popIntoTemp: 2
>>> ** 65 <92> jumpTo: 1
>>> 66 <13> pushTemp: 3
>>> 67 <82 44> popIntoTemp: 4
>>> 69 ... jumpTo: 1
>>>
>>> so, we can have 1 less jump when code enters first branch.
>>>
>>
>> Very good!
>> I think however that hacking the Decompiler will be a challenge!
>>
>
> Well, in this case it is easy to hide hints for decompiler directly in
> a bytecode:
>
> 1. loop head
> ...
> 100. jumpTo: 106
> 101. push ...
> 102. pop
> 103. jumpTo: 1
> 104. hint
> 105. more hints
> 106. ... section start for false branch
>
> n+1 ...
>
> ----
> same actually for loops. instead of writing sophisticated detection
> code , we can simply read hints:
>
> 1. loop head
> ...
>
> 10. jumpIfFalse: 101 (leaving the loop )
>
> ... loop body
>
> 99. jumpTo: 1 (loop head)
> 100. hint for loop
> 101. ... rest of method's bytecode
>
> As you can see , it doesn't affects the performance, since its
> unreachable code, just the size of bytecode.
>

Huh, reserving plenty byte codes for nop or intermixing instructions and text,
whatever you choose might lead you directly to the stake ;)

You invented an old new idea: literate bytecode programming

Nicolas

> just don't put me on fire for suggesting that :)
>
>
> --
> Best regards,
> Igor Stasenko.
>

Eliot Miranda-2

Re: here's a curio for compiler mavens...

In reply to this post by Eliot Miranda-2

OK, I fixed this (a pleasant Monday afternoon distraction) (at least I *think* I've fixed it; I still have to check the decompiler). But... in doing so I realised I had to fix CompiledMethod>>#= since I needed to see exactly which methods the compiler change affected (just committed as Kernel-eem.669). In doing this I noticed that recent compiler changes have not been applied by recompiling affected methods in the system. If you look at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to check that the compiler's output remains unchanged. In the 4.3 trunk image with CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that some 19654 methods need recompiling:

(SystemNavigation new allSelect: [:m| (m methodClass sourceMatchesBytecodeAt: m selector) not]) size 19654

Being selective in recompiling, via e.g.

SystemNavigation new allSelect: [:m| (m methodClass sourceMatchesBytecodeAt: m selector) ifFalse: [m methodClass recompile: m selector]. false]

one gets to the desired

(SystemNavigation new allSelect: [:m| (m methodClass sourceMatchesBytecodeAt: m selector) not]) size 0

So the question is should we add an update that does the selective recompile? I think we should, e.g. by committing a version of Compiler that contains the selective recompile as a post-script. Agreed?

On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda <[hidden email]> wrote:

Hi All,

(whitewash alert) I just had occasion to look at the bytecode generated by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:. Here's the source, with the issue in bold:

goodPrimeAtLeast: lowerLimit
"Answer the next good prime >= lowerlimit.
If lowerLimit is larger than the largest known good prime,

just make it odd."

| primes low mid high prime |

primes := self goodPrimes.
low := 1.
high := primes size.

lowerLimit > (primes at: high) ifTrue: [
^lowerLimit bitOr: 1 ].
[ high - low <= 1 ] whileFalse: [

mid := high + low // 2.
prime := primes at: mid.
prime = lowerLimit ifTrue: [ ^prime ].

prime < lowerLimit
ifTrue: [ low := mid ]
ifFalse: [ high := mid ] ].

(primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
^primes at: high

The code for this sequence is
58 <15> pushTemp: 5
59 <10> pushTemp: 0

60 <B2> send: <
61 <9B> jumpFalse: 66
62 <13> pushTemp: 3

63 <81 42> storeIntoTemp: 2
65 <92> jumpTo: 69
66 <13> pushTemp: 3

67 <81 44> storeIntoTemp: 4
69 <87> pop

where-as the following would be better:

58 <15> pushTemp: 5
59 <10> pushTemp: 0

60 <B2> send: <
61 <9B> jumpFalse: 66
62 <13> pushTemp: 3

63 <82 42> popIntoTemp: 2
65 <92> jumpTo: 69
66 <13> pushTemp: 3

67 <82 44> popIntoTemp: 4

The reason is that the code generator favours using a single pop for both arms of the if:

MessageNode>>sizeCodeForIf: encoder value: forValue
| thenExpr elseExpr branchSize thenSize elseSize |
thenExpr := arguments at: 1.

elseExpr := arguments at: 2.
(forValue
or: [(thenExpr isJust: NodeNil)

or: [elseExpr isJust: NodeNil]]) not
"(...not ifTrue: avoids using ifFalse: alone during this compile)"

ifTrue: "Two-armed IFs forEffect share a single pop"
[^super sizeCodeForEffect: encoder].

...

MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
| thenExpr thenSize elseExpr elseSize |

thenSize := sizes at: 1.
elseSize := sizes at: 2.
(forValue not and: [elseSize * thenSize > 0]) ifTrue:

"Two-armed IFs forEffect share a single pop"
[^super emitCodeForEffect: stack encoder: encoder].

It would be nice if this only happened if doing so actually reduced the size of the generated code.
--
best,
Eliot

--
best,

Eliot

Nicolas Cellier

Re: here's a curio for compiler mavens...

Ah, I came up with a relatively explicit solution, but also quite
heavy in term of number of messages required to implement the
optimisation.
I'm curious to see your implementation :)
Also, I'm sure that multiple-jump-elimination proposed by Igor and
other kind of optimization would be a perfect subject for proving
concepts of NewCompiler IR infrastructure.

MessageNode>>doesEffectRequiresAPop: encoder
special > 0
ifTrue:
[^self perform: (MacroTesters at: special) with: encoder].
^super doesEffectRequiresAPop: encoder

MessageNode>>doesEffectForIfRequiresAPop: encoder
"If both branch require a pop instruction, then the pop is factored
after the jump.
In this case, the translation of this node also require a final pop
that can further be eliminated eventually."
| thenExpr elseExpr |
thenExpr := arguments at: 1.
elseExpr := arguments at: 2.
^((thenExpr isJust: NodeNil) or: [elseExpr isJust: NodeNil]) not
and: [(thenExpr doesEvaluatingEffectRequiresAPop: encoder) and:
[elseExpr doesEvaluatingEffectRequiresAPop: encoder]].

BlockNode>>doesEvaluatingEffectRequiresAPop: anEncoder
"Test whether a pop instruction is required to skip the value on the
stack when this block is inlined."
^statements size > 0 and: [statements last doesEffectRequiresAPop: anEncoder]

AssigmentNode>>doesEffectRequiresAPop: anEncoder
"Test whether a pop instruction is required to skip the value on the stack"
^(variable canStorePopInASingleInstruction: anEncoder) not

InstanceVariableNode>>canStorePopInASingleInstruction: anEncoder
^true

etc...

Le 14 février 2012 01:27, Eliot Miranda <[hidden email]> a écrit :

> OK, I fixed this (a pleasant Monday afternoon distraction) (at least I
> *think* I've fixed it; I still have to check the decompiler). But... in
> doing so I realised I had to fix CompiledMethod>>#= since I needed to see
> exactly which methods the compiler change affected (just committed as
> Kernel-eem.669). In doing this I noticed that recent compiler changes have
> not been applied by recompiling affected methods in the system. If you look
> at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to check
> that the compiler's output remains unchanged. In the 4.3 trunk image with
> CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that some 19654
> methods need recompiling:
>
> (SystemNavigation new allSelect: [:m| (m methodClass
> sourceMatchesBytecodeAt: m selector) not]) size 19654
>
> Being selective in recompiling, via e.g.
>
> SystemNavigation new allSelect: [:m| (m methodClass sourceMatchesBytecodeAt:
> m selector) ifFalse: [m methodClass recompile: m selector]. false]
>
> one gets to the desired
>
> (SystemNavigation new allSelect: [:m| (m methodClass
> sourceMatchesBytecodeAt: m selector) not]) size 0
>
> So the question is should we add an update that does the selective
> recompile? I think we should, e.g. by committing a version of Compiler that
> contains the selective recompile as a post-script. Agreed?
>
> On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda <[hidden email]>
> wrote:
>>
>> Hi All,
>>
>> (whitewash alert) I just had occasion to look at the bytecode
>> generated by the standard compiler for HashedCollection
>> class>>#goodPrimeAtLeast:. Here's the source, with the issue in bold:
>>
>> goodPrimeAtLeast: lowerLimit
>> "Answer the next good prime >= lowerlimit.
>> If lowerLimit is larger than the largest known good prime,
>> just make it odd."
>> | primes low mid high prime |
>> primes := self goodPrimes.
>> low := 1.
>> high := primes size.
>> lowerLimit > (primes at: high) ifTrue: [
>> ^lowerLimit bitOr: 1 ].
>> [ high - low <= 1 ] whileFalse: [
>> mid := high + low // 2.
>> prime := primes at: mid.
>> prime = lowerLimit ifTrue: [ ^prime ].
>> prime < lowerLimit
>> ifTrue: [ low := mid ]
>> ifFalse: [ high := mid ] ].
>> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>> ^primes at: high
>>
>> The code for this sequence is
>> 58 <15> pushTemp: 5
>> 59 <10> pushTemp: 0
>> 60 <B2> send: <
>> 61 <9B> jumpFalse: 66
>> 62 <13> pushTemp: 3
>> 63 <81 42> storeIntoTemp: 2
>> 65 <92> jumpTo: 69
>> 66 <13> pushTemp: 3
>> 67 <81 44> storeIntoTemp: 4
>> 69 <87> pop
>>
>> where-as the following would be better:
>>
>> 58 <15> pushTemp: 5
>> 59 <10> pushTemp: 0
>> 60 <B2> send: <
>> 61 <9B> jumpFalse: 66
>> 62 <13> pushTemp: 3
>> 63 <82 42> popIntoTemp: 2
>> 65 <92> jumpTo: 69
>> 66 <13> pushTemp: 3
>> 67 <82 44> popIntoTemp: 4
>>
>> The reason is that the code generator favours using a single pop for both
>> arms of the if:
>>
>> MessageNode>>sizeCodeForIf: encoder value: forValue
>> | thenExpr elseExpr branchSize thenSize elseSize |
>> thenExpr := arguments at: 1.
>> elseExpr := arguments at: 2.
>> (forValue
>> or: [(thenExpr isJust: NodeNil)
>> or: [elseExpr isJust: NodeNil]]) not
>> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>> ifTrue: "Two-armed IFs forEffect share a single pop"
>> [^super sizeCodeForEffect: encoder].
>> ...
>>
>> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>> | thenExpr thenSize elseExpr elseSize |
>> thenSize := sizes at: 1.
>> elseSize := sizes at: 2.
>> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>> "Two-armed IFs forEffect share a single pop"
>> [^super emitCodeForEffect: stack encoder: encoder].
>>
>> It would be nice if this only happened if doing so actually reduced the
>> size of the generated code.
>> --
>> best,
>> Eliot
>>
>
>
>
> --
> best,
> Eliot
>
>
>
>

Eliot Miranda-2

Re: here's a curio for compiler mavens...

On Tue, Feb 14, 2012 at 5:30 AM, Nicolas Cellier <[hidden email]> wrote:

Ah, I came up with a relatively explicit solution, but also quite
heavy in term of number of messages required to implement the
optimisation.
I'm curious to see your implementation :)

similar, but should I commit it? I think mine is fine since it is only assignments that have storepop forms, so your generality doesn't win anything. but i could be wrong.

!MessageNode methodsFor: 'code generation' stamp: 'eem 2/13/2012 15:25'!

ifThenElseForEffectShouldSharePop

| thenExpr elseExpr |

thenExpr := arguments at: 1.

elseExpr := arguments at: 2.

^((thenExpr isJust: NodeNil)

or: [(elseExpr isJust: NodeNil)

or: [thenExpr statements last isAssignmentNode

and: [elseExpr statements last isAssignmentNode]]]) not

"^(thenExpr isJust: NodeNil) not

and: [(elseExpr isJust: NodeNil) not

and: [thenExpr statements last isAssignmentNode not

or: [elseExpr statements last isAssignmentNode not]]]"! !

Also, I'm sure that multiple-jump-elimination proposed by Igor and
other kind of optimization would be a perfect subject for proving
concepts of NewCompiler IR infrastructure.

MessageNode>>doesEffectRequiresAPop: encoder
special > 0
ifTrue:
[^self perform: (MacroTesters at: special) with: encoder].
^super doesEffectRequiresAPop: encoder

MessageNode>>doesEffectForIfRequiresAPop: encoder
"If both branch require a pop instruction, then the pop is factored
after the jump.
In this case, the translation of this node also require a final pop
that can further be eliminated eventually."
| thenExpr elseExpr |

thenExpr := arguments at: 1.
elseExpr := arguments at: 2.

^((thenExpr isJust: NodeNil) or: [elseExpr isJust: NodeNil]) not
and: [(thenExpr doesEvaluatingEffectRequiresAPop: encoder) and:
[elseExpr doesEvaluatingEffectRequiresAPop: encoder]].

BlockNode>>doesEvaluatingEffectRequiresAPop: anEncoder
"Test whether a pop instruction is required to skip the value on the
stack when this block is inlined."
^statements size > 0 and: [statements last doesEffectRequiresAPop: anEncoder]

AssigmentNode>>doesEffectRequiresAPop: anEncoder
"Test whether a pop instruction is required to skip the value on the stack"
^(variable canStorePopInASingleInstruction: anEncoder) not

InstanceVariableNode>>canStorePopInASingleInstruction: anEncoder
^true

etc...

Le 14 février 2012 01:27, Eliot Miranda <[hidden email]> a écrit :

> OK, I fixed this (a pleasant Monday afternoon distraction) (at least I
> *think* I've fixed it; I still have to check the decompiler). But... in
> doing so I realised I had to fix CompiledMethod>>#= since I needed to see
> exactly which methods the compiler change affected (just committed as
> Kernel-eem.669). In doing this I noticed that recent compiler changes have
> not been applied by recompiling affected methods in the system. If you look
> at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to check
> that the compiler's output remains unchanged. In the 4.3 trunk image with
> CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that some 19654
> methods need recompiling:
>
> (SystemNavigation new allSelect: [:m| (m methodClass
> sourceMatchesBytecodeAt: m selector) not]) size 19654
>
> Being selective in recompiling, via e.g.
>
> SystemNavigation new allSelect: [:m| (m methodClass sourceMatchesBytecodeAt:
> m selector) ifFalse: [m methodClass recompile: m selector]. false]
>
> one gets to the desired
>
> (SystemNavigation new allSelect: [:m| (m methodClass
> sourceMatchesBytecodeAt: m selector) not]) size 0
>
> So the question is should we add an update that does the selective
> recompile? I think we should, e.g. by committing a version of Compiler that
> contains the selective recompile as a post-script. Agreed?
>
> On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda <[hidden email]>
> wrote:
>>
>> Hi All,
>>
>> (whitewash alert) I just had occasion to look at the bytecode
>> generated by the standard compiler for HashedCollection
>> class>>#goodPrimeAtLeast:. Here's the source, with the issue in bold:
>>
>> goodPrimeAtLeast: lowerLimit
>> "Answer the next good prime >= lowerlimit.
>> If lowerLimit is larger than the largest known good prime,
>> just make it odd."
>> | primes low mid high prime |
>> primes := self goodPrimes.
>> low := 1.
>> high := primes size.
>> lowerLimit > (primes at: high) ifTrue: [
>> ^lowerLimit bitOr: 1 ].
>> [ high - low <= 1 ] whileFalse: [
>> mid := high + low // 2.
>> prime := primes at: mid.
>> prime = lowerLimit ifTrue: [ ^prime ].
>> prime < lowerLimit
>> ifTrue: [ low := mid ]
>> ifFalse: [ high := mid ] ].
>> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>> ^primes at: high
>>
>> The code for this sequence is
>> 58 <15> pushTemp: 5
>> 59 <10> pushTemp: 0
>> 60 <B2> send: <
>> 61 <9B> jumpFalse: 66
>> 62 <13> pushTemp: 3
>> 63 <81 42> storeIntoTemp: 2
>> 65 <92> jumpTo: 69
>> 66 <13> pushTemp: 3
>> 67 <81 44> storeIntoTemp: 4
>> 69 <87> pop
>>
>> where-as the following would be better:
>>
>> 58 <15> pushTemp: 5
>> 59 <10> pushTemp: 0
>> 60 <B2> send: <
>> 61 <9B> jumpFalse: 66
>> 62 <13> pushTemp: 3
>> 63 <82 42> popIntoTemp: 2
>> 65 <92> jumpTo: 69
>> 66 <13> pushTemp: 3
>> 67 <82 44> popIntoTemp: 4
>>
>> The reason is that the code generator favours using a single pop for both
>> arms of the if:
>>
>> MessageNode>>sizeCodeForIf: encoder value: forValue
>> | thenExpr elseExpr branchSize thenSize elseSize |
>> thenExpr := arguments at: 1.
>> elseExpr := arguments at: 2.
>> (forValue
>> or: [(thenExpr isJust: NodeNil)
>> or: [elseExpr isJust: NodeNil]]) not
>> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>> ifTrue: "Two-armed IFs forEffect share a single pop"
>> [^super sizeCodeForEffect: encoder].
>> ...
>>
>> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>> | thenExpr thenSize elseExpr elseSize |
>> thenSize := sizes at: 1.
>> elseSize := sizes at: 2.
>> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>> "Two-armed IFs forEffect share a single pop"
>> [^super emitCodeForEffect: stack encoder: encoder].
>>
>> It would be nice if this only happened if doing so actually reduced the
>> size of the generated code.
>> --
>> best,
>> Eliot
>>
>
>
>
> --
> best,
> Eliot
>
>
>
>

--
best,

Eliot

ifThenElseForEffectShouldSharePop.st (6K) Download Attachment

Nicolas Cellier

Re: here's a curio for compiler mavens...

Le 14 février 2012 23:10, Eliot Miranda <[hidden email]> a écrit :

> On Tue, Feb 14, 2012 at 5:30 AM, Nicolas Cellier
> <[hidden email]> wrote:
>>
>> Ah, I came up with a relatively explicit solution, but also quite
>> heavy in term of number of messages required to implement the
>> optimisation.
>> I'm curious to see your implementation :)
>
>
> similar, but should I commit it? I think mine is fine since it is only
> assignments that have storepop forms, so your generality doesn't win
> anything. but i could be wrong.
>

I was thinking of

1) nested ifTrue: ifFalse:
cond1 ifTrue: [x1 := y] ifFalse: [cond2 ifTrue: [x2 := y] ifFalse: [x3 := y]]

elseExpr statements last is a message, not an assignment, though we
can factor the pop out...

2) inlined loop messages don't require a final pop...

x <= y ifTrue: [x to: y do: [:i | (self at: i) doSome]] ifFalse: [y
to: x do: [:i | (self at: i) doSome]].

An encoder pushing a value in both branches just to pop it after
wouldn't be that well behaved...

I won't give any version publicly because I crashed several images in
the bootstrap process, and don't know if the bug is in the bootstrap
or in the code.

Nicolas

> !MessageNode methodsFor: 'code generation' stamp: 'eem 2/13/2012 15:25'!
> ifThenElseForEffectShouldSharePop
> | thenExpr elseExpr |
> thenExpr := arguments at: 1.
> elseExpr := arguments at: 2.
> ^((thenExpr isJust: NodeNil)
> or: [(elseExpr isJust: NodeNil)
> or: [thenExpr statements last isAssignmentNode
> and: [elseExpr statements last isAssignmentNode]]]) not
>
> "^(thenExpr isJust: NodeNil) not
> and: [(elseExpr isJust: NodeNil) not
> and: [thenExpr statements last isAssignmentNode not
> or: [elseExpr statements last isAssignmentNode not]]]"! !
>
>> Also, I'm sure that multiple-jump-elimination proposed by Igor and
>> other kind of optimization would be a perfect subject for proving
>> concepts of NewCompiler IR infrastructure.
>>
>> MessageNode>>doesEffectRequiresAPop: encoder
>> special > 0
>> ifTrue:
>> [^self perform: (MacroTesters at: special) with:
>> encoder].
>> ^super doesEffectRequiresAPop: encoder
>>
>> MessageNode>>doesEffectForIfRequiresAPop: encoder
>> "If both branch require a pop instruction, then the pop is factored
>> after the jump.
>> In this case, the translation of this node also require a final pop
>> that can further be eliminated eventually."
>> | thenExpr elseExpr |
>> thenExpr := arguments at: 1.
>> elseExpr := arguments at: 2.
>> ^((thenExpr isJust: NodeNil) or: [elseExpr isJust: NodeNil]) not
>> and: [(thenExpr doesEvaluatingEffectRequiresAPop: encoder)
>> and:
>> [elseExpr doesEvaluatingEffectRequiresAPop: encoder]].
>>
>> BlockNode>>doesEvaluatingEffectRequiresAPop: anEncoder
>> "Test whether a pop instruction is required to skip the value on
>> the
>> stack when this block is inlined."
>> ^statements size > 0 and: [statements last doesEffectRequiresAPop:
>> anEncoder]
>>
>> AssigmentNode>>doesEffectRequiresAPop: anEncoder
>> "Test whether a pop instruction is required to skip the value on
>> the stack"
>> ^(variable canStorePopInASingleInstruction: anEncoder) not
>>
>> InstanceVariableNode>>canStorePopInASingleInstruction: anEncoder
>> ^true
>>
>> etc...
>>
>> Le 14 février 2012 01:27, Eliot Miranda <[hidden email]> a écrit
>> :
>> > OK, I fixed this (a pleasant Monday afternoon distraction) (at least I
>> > *think* I've fixed it; I still have to check the decompiler). But... in
>> > doing so I realised I had to fix CompiledMethod>>#= since I needed to
>> > see
>> > exactly which methods the compiler change affected (just committed as
>> > Kernel-eem.669). In doing this I noticed that recent compiler changes
>> > have
>> > not been applied by recompiling affected methods in the system. If you
>> > look
>> > at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to
>> > check
>> > that the compiler's output remains unchanged. In the 4.3 trunk image
>> > with
>> > CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that
>> > some 19654
>> > methods need recompiling:
>> >
>> > (SystemNavigation new allSelect: [:m| (m methodClass
>> > sourceMatchesBytecodeAt: m selector) not]) size 19654
>> >
>> > Being selective in recompiling, via e.g.
>> >
>> > SystemNavigation new allSelect: [:m| (m methodClass
>> > sourceMatchesBytecodeAt:
>> > m selector) ifFalse: [m methodClass recompile: m selector]. false]
>> >
>> > one gets to the desired
>> >
>> > (SystemNavigation new allSelect: [:m| (m methodClass
>> > sourceMatchesBytecodeAt: m selector) not]) size 0
>> >
>> > So the question is should we add an update that does the selective
>> > recompile? I think we should, e.g. by committing a version of Compiler
>> > that
>> > contains the selective recompile as a post-script. Agreed?
>> >
>> > On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda
>> > <[hidden email]>
>> > wrote:
>> >>
>> >> Hi All,
>> >>
>> >> (whitewash alert) I just had occasion to look at the bytecode
>> >> generated by the standard compiler for HashedCollection
>> >> class>>#goodPrimeAtLeast:. Here's the source, with the issue in bold:
>> >>
>> >> goodPrimeAtLeast: lowerLimit
>> >> "Answer the next good prime >= lowerlimit.
>> >> If lowerLimit is larger than the largest known good prime,
>> >> just make it odd."
>> >> | primes low mid high prime |
>> >> primes := self goodPrimes.
>> >> low := 1.
>> >> high := primes size.
>> >> lowerLimit > (primes at: high) ifTrue: [
>> >> ^lowerLimit bitOr: 1 ].
>> >> [ high - low <= 1 ] whileFalse: [
>> >> mid := high + low // 2.
>> >> prime := primes at: mid.
>> >> prime = lowerLimit ifTrue: [ ^prime ].
>> >> prime < lowerLimit
>> >> ifTrue: [ low := mid ]
>> >> ifFalse: [ high := mid ] ].
>> >> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>> >> ^primes at: high
>> >>
>> >> The code for this sequence is
>> >> 58 <15> pushTemp: 5
>> >> 59 <10> pushTemp: 0
>> >> 60 <B2> send: <
>> >> 61 <9B> jumpFalse: 66
>> >> 62 <13> pushTemp: 3
>> >> 63 <81 42> storeIntoTemp: 2
>> >> 65 <92> jumpTo: 69
>> >> 66 <13> pushTemp: 3
>> >> 67 <81 44> storeIntoTemp: 4
>> >> 69 <87> pop
>> >>
>> >> where-as the following would be better:
>> >>
>> >> 58 <15> pushTemp: 5
>> >> 59 <10> pushTemp: 0
>> >> 60 <B2> send: <
>> >> 61 <9B> jumpFalse: 66
>> >> 62 <13> pushTemp: 3
>> >> 63 <82 42> popIntoTemp: 2
>> >> 65 <92> jumpTo: 69
>> >> 66 <13> pushTemp: 3
>> >> 67 <82 44> popIntoTemp: 4
>> >>
>> >> The reason is that the code generator favours using a single pop for
>> >> both
>> >> arms of the if:
>> >>
>> >> MessageNode>>sizeCodeForIf: encoder value: forValue
>> >> | thenExpr elseExpr branchSize thenSize elseSize |
>> >> thenExpr := arguments at: 1.
>> >> elseExpr := arguments at: 2.
>> >> (forValue
>> >> or: [(thenExpr isJust: NodeNil)
>> >> or: [elseExpr isJust: NodeNil]]) not
>> >> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>> >> ifTrue: "Two-armed IFs forEffect share a single pop"
>> >> [^super sizeCodeForEffect: encoder].
>> >> ...
>> >>
>> >> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>> >> | thenExpr thenSize elseExpr elseSize |
>> >> thenSize := sizes at: 1.
>> >> elseSize := sizes at: 2.
>> >> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>> >> "Two-armed IFs forEffect share a single pop"
>> >> [^super emitCodeForEffect: stack encoder: encoder].
>> >>
>> >> It would be nice if this only happened if doing so actually reduced the
>> >> size of the generated code.
>> >> --
>> >> best,
>> >> Eliot
>> >>
>> >
>> >
>> >
>> > --
>> > best,
>> > Eliot
>> >
>> >
>> >
>> >
>>
>
>
>
> --
> best,
> Eliot
>
>
>
>

Igor Stasenko

Re: here's a curio for compiler mavens...

On 15 February 2012 00:50, Nicolas Cellier
<[hidden email]> wrote:

> Le 14 février 2012 23:10, Eliot Miranda <[hidden email]> a écrit :
>> On Tue, Feb 14, 2012 at 5:30 AM, Nicolas Cellier
>> <[hidden email]> wrote:
>>>
>>> Ah, I came up with a relatively explicit solution, but also quite
>>> heavy in term of number of messages required to implement the
>>> optimisation.
>>> I'm curious to see your implementation :)
>>
>>
>> similar, but should I commit it? I think mine is fine since it is only
>> assignments that have storepop forms, so your generality doesn't win
>> anything. but i could be wrong.
>>
>
> I was thinking of
>
> 1) nested ifTrue: ifFalse:
> cond1 ifTrue: [x1 := y] ifFalse: [cond2 ifTrue: [x2 := y] ifFalse: [x3 := y]]
>
> elseExpr statements last is a message, not an assignment, though we
> can factor the pop out...
>
> 2) inlined loop messages don't require a final pop...
>
> x <= y ifTrue: [x to: y do: [:i | (self at: i) doSome]] ifFalse: [y
> to: x do: [:i | (self at: i) doSome]].
>
> An encoder pushing a value in both branches just to pop it after
> wouldn't be that well behaved...
>
> I won't give any version publicly because I crashed several images in
> the bootstrap process, and don't know if the bug is in the bootstrap
> or in the code.
>

IMO, too much effort for such diminishing returns. :)

> Nicolas
>
>> !MessageNode methodsFor: 'code generation' stamp: 'eem 2/13/2012 15:25'!
>> ifThenElseForEffectShouldSharePop
>> | thenExpr elseExpr |
>> thenExpr := arguments at: 1.
>> elseExpr := arguments at: 2.
>> ^((thenExpr isJust: NodeNil)
>> or: [(elseExpr isJust: NodeNil)
>> or: [thenExpr statements last isAssignmentNode
>> and: [elseExpr statements last isAssignmentNode]]]) not
>>
>> "^(thenExpr isJust: NodeNil) not
>> and: [(elseExpr isJust: NodeNil) not
>> and: [thenExpr statements last isAssignmentNode not
>> or: [elseExpr statements last isAssignmentNode not]]]"! !
>>
>>> Also, I'm sure that multiple-jump-elimination proposed by Igor and
>>> other kind of optimization would be a perfect subject for proving
>>> concepts of NewCompiler IR infrastructure.
>>>
>>> MessageNode>>doesEffectRequiresAPop: encoder
>>> special > 0
>>> ifTrue:
>>> [^self perform: (MacroTesters at: special) with:
>>> encoder].
>>> ^super doesEffectRequiresAPop: encoder
>>>
>>> MessageNode>>doesEffectForIfRequiresAPop: encoder
>>> "If both branch require a pop instruction, then the pop is factored
>>> after the jump.
>>> In this case, the translation of this node also require a final pop
>>> that can further be eliminated eventually."
>>> | thenExpr elseExpr |
>>> thenExpr := arguments at: 1.
>>> elseExpr := arguments at: 2.
>>> ^((thenExpr isJust: NodeNil) or: [elseExpr isJust: NodeNil]) not
>>> and: [(thenExpr doesEvaluatingEffectRequiresAPop: encoder)
>>> and:
>>> [elseExpr doesEvaluatingEffectRequiresAPop: encoder]].
>>>
>>> BlockNode>>doesEvaluatingEffectRequiresAPop: anEncoder
>>> "Test whether a pop instruction is required to skip the value on
>>> the
>>> stack when this block is inlined."
>>> ^statements size > 0 and: [statements last doesEffectRequiresAPop:
>>> anEncoder]
>>>
>>> AssigmentNode>>doesEffectRequiresAPop: anEncoder
>>> "Test whether a pop instruction is required to skip the value on
>>> the stack"
>>> ^(variable canStorePopInASingleInstruction: anEncoder) not
>>>
>>> InstanceVariableNode>>canStorePopInASingleInstruction: anEncoder
>>> ^true
>>>
>>> etc...
>>>
>>> Le 14 février 2012 01:27, Eliot Miranda <[hidden email]> a écrit
>>> :
>>> > OK, I fixed this (a pleasant Monday afternoon distraction) (at least I
>>> > *think* I've fixed it; I still have to check the decompiler). But... in
>>> > doing so I realised I had to fix CompiledMethod>>#= since I needed to
>>> > see
>>> > exactly which methods the compiler change affected (just committed as
>>> > Kernel-eem.669). In doing this I noticed that recent compiler changes
>>> > have
>>> > not been applied by recompiling affected methods in the system. If you
>>> > look
>>> > at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to
>>> > check
>>> > that the compiler's output remains unchanged. In the 4.3 trunk image
>>> > with
>>> > CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that
>>> > some 19654
>>> > methods need recompiling:
>>> >
>>> > (SystemNavigation new allSelect: [:m| (m methodClass
>>> > sourceMatchesBytecodeAt: m selector) not]) size 19654
>>> >
>>> > Being selective in recompiling, via e.g.
>>> >
>>> > SystemNavigation new allSelect: [:m| (m methodClass
>>> > sourceMatchesBytecodeAt:
>>> > m selector) ifFalse: [m methodClass recompile: m selector]. false]
>>> >
>>> > one gets to the desired
>>> >
>>> > (SystemNavigation new allSelect: [:m| (m methodClass
>>> > sourceMatchesBytecodeAt: m selector) not]) size 0
>>> >
>>> > So the question is should we add an update that does the selective
>>> > recompile? I think we should, e.g. by committing a version of Compiler
>>> > that
>>> > contains the selective recompile as a post-script. Agreed?
>>> >
>>> > On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda
>>> > <[hidden email]>
>>> > wrote:
>>> >>
>>> >> Hi All,
>>> >>
>>> >> (whitewash alert) I just had occasion to look at the bytecode
>>> >> generated by the standard compiler for HashedCollection
>>> >> class>>#goodPrimeAtLeast:. Here's the source, with the issue in bold:
>>> >>
>>> >> goodPrimeAtLeast: lowerLimit
>>> >> "Answer the next good prime >= lowerlimit.
>>> >> If lowerLimit is larger than the largest known good prime,
>>> >> just make it odd."
>>> >> | primes low mid high prime |
>>> >> primes := self goodPrimes.
>>> >> low := 1.
>>> >> high := primes size.
>>> >> lowerLimit > (primes at: high) ifTrue: [
>>> >> ^lowerLimit bitOr: 1 ].
>>> >> [ high - low <= 1 ] whileFalse: [
>>> >> mid := high + low // 2.
>>> >> prime := primes at: mid.
>>> >> prime = lowerLimit ifTrue: [ ^prime ].
>>> >> prime < lowerLimit
>>> >> ifTrue: [ low := mid ]
>>> >> ifFalse: [ high := mid ] ].
>>> >> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>>> >> ^primes at: high
>>> >>
>>> >> The code for this sequence is
>>> >> 58 <15> pushTemp: 5
>>> >> 59 <10> pushTemp: 0
>>> >> 60 <B2> send: <
>>> >> 61 <9B> jumpFalse: 66
>>> >> 62 <13> pushTemp: 3
>>> >> 63 <81 42> storeIntoTemp: 2
>>> >> 65 <92> jumpTo: 69
>>> >> 66 <13> pushTemp: 3
>>> >> 67 <81 44> storeIntoTemp: 4
>>> >> 69 <87> pop
>>> >>
>>> >> where-as the following would be better:
>>> >>
>>> >> 58 <15> pushTemp: 5
>>> >> 59 <10> pushTemp: 0
>>> >> 60 <B2> send: <
>>> >> 61 <9B> jumpFalse: 66
>>> >> 62 <13> pushTemp: 3
>>> >> 63 <82 42> popIntoTemp: 2
>>> >> 65 <92> jumpTo: 69
>>> >> 66 <13> pushTemp: 3
>>> >> 67 <82 44> popIntoTemp: 4
>>> >>
>>> >> The reason is that the code generator favours using a single pop for
>>> >> both
>>> >> arms of the if:
>>> >>
>>> >> MessageNode>>sizeCodeForIf: encoder value: forValue
>>> >> | thenExpr elseExpr branchSize thenSize elseSize |
>>> >> thenExpr := arguments at: 1.
>>> >> elseExpr := arguments at: 2.
>>> >> (forValue
>>> >> or: [(thenExpr isJust: NodeNil)
>>> >> or: [elseExpr isJust: NodeNil]]) not
>>> >> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>>> >> ifTrue: "Two-armed IFs forEffect share a single pop"
>>> >> [^super sizeCodeForEffect: encoder].
>>> >> ...
>>> >>
>>> >> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>>> >> | thenExpr thenSize elseExpr elseSize |
>>> >> thenSize := sizes at: 1.
>>> >> elseSize := sizes at: 2.
>>> >> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>>> >> "Two-armed IFs forEffect share a single pop"
>>> >> [^super emitCodeForEffect: stack encoder: encoder].
>>> >>
>>> >> It would be nice if this only happened if doing so actually reduced the
>>> >> size of the generated code.
>>> >> --
>>> >> best,
>>> >> Eliot
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > best,
>>> > Eliot
>>> >
>>> >
>>> >
>>> >
>>>
>>
>>
>>
>> --
>> best,
>> Eliot
>>
>>
>>
>>
>

--
Best regards,
Igor Stasenko.

Eliot Miranda-2

Re: here's a curio for compiler mavens...

On Tue, Feb 14, 2012 at 2:55 PM, Igor Stasenko <[hidden email]> wrote:

On 15 February 2012 00:50, Nicolas Cellier

<[hidden email]> wrote:
> Le 14 février 2012 23:10, Eliot Miranda <[hidden email]> a écrit :
>> On Tue, Feb 14, 2012 at 5:30 AM, Nicolas Cellier
>> <[hidden email]> wrote:
>>>
>>> Ah, I came up with a relatively explicit solution, but also quite
>>> heavy in term of number of messages required to implement the
>>> optimisation.
>>> I'm curious to see your implementation :)
>>
>>
>> similar, but should I commit it? I think mine is fine since it is only
>> assignments that have storepop forms, so your generality doesn't win
>> anything. but i could be wrong.
>>
>
> I was thinking of
>
> 1) nested ifTrue: ifFalse:
> cond1 ifTrue: [x1 := y] ifFalse: [cond2 ifTrue: [x2 := y] ifFalse: [x3 := y]]
>
> elseExpr statements last is a message, not an assignment, though we
> can factor the pop out...
>
> 2) inlined loop messages don't require a final pop...
>
> x <= y ifTrue: [x to: y do: [:i | (self at: i) doSome]] ifFalse: [y
> to: x do: [:i | (self at: i) doSome]].
>
> An encoder pushing a value in both branches just to pop it after
> wouldn't be that well behaved...
>
> I won't give any version publicly because I crashed several images in
> the bootstrap process, and don't know if the bug is in the bootstrap
> or in the code.
>

IMO, too much effort for such diminishing returns. :)

That was my thought too. It would be interesting to quantify the benefits. I noticed some 451 methods that were improved by my simple assignment-only mod. How many additional methods get caught by yours Nicolas?

> Nicolas
>
>> !MessageNode methodsFor: 'code generation' stamp: 'eem 2/13/2012 15:25'!
>> ifThenElseForEffectShouldSharePop
>> | thenExpr elseExpr |
>> thenExpr := arguments at: 1.
>> elseExpr := arguments at: 2.
>> ^((thenExpr isJust: NodeNil)
>> or: [(elseExpr isJust: NodeNil)
>> or: [thenExpr statements last isAssignmentNode
>> and: [elseExpr statements last isAssignmentNode]]]) not
>>
>> "^(thenExpr isJust: NodeNil) not
>> and: [(elseExpr isJust: NodeNil) not
>> and: [thenExpr statements last isAssignmentNode not
>> or: [elseExpr statements last isAssignmentNode not]]]"! !
>>
>>> Also, I'm sure that multiple-jump-elimination proposed by Igor and
>>> other kind of optimization would be a perfect subject for proving
>>> concepts of NewCompiler IR infrastructure.
>>>
>>> MessageNode>>doesEffectRequiresAPop: encoder
>>> special > 0
>>> ifTrue:
>>> [^self perform: (MacroTesters at: special) with:
>>> encoder].
>>> ^super doesEffectRequiresAPop: encoder
>>>
>>> MessageNode>>doesEffectForIfRequiresAPop: encoder
>>> "If both branch require a pop instruction, then the pop is factored
>>> after the jump.
>>> In this case, the translation of this node also require a final pop
>>> that can further be eliminated eventually."
>>> | thenExpr elseExpr |
>>> thenExpr := arguments at: 1.
>>> elseExpr := arguments at: 2.
>>> ^((thenExpr isJust: NodeNil) or: [elseExpr isJust: NodeNil]) not
>>> and: [(thenExpr doesEvaluatingEffectRequiresAPop: encoder)
>>> and:
>>> [elseExpr doesEvaluatingEffectRequiresAPop: encoder]].
>>>
>>> BlockNode>>doesEvaluatingEffectRequiresAPop: anEncoder
>>> "Test whether a pop instruction is required to skip the value on
>>> the
>>> stack when this block is inlined."
>>> ^statements size > 0 and: [statements last doesEffectRequiresAPop:
>>> anEncoder]
>>>
>>> AssigmentNode>>doesEffectRequiresAPop: anEncoder
>>> "Test whether a pop instruction is required to skip the value on
>>> the stack"
>>> ^(variable canStorePopInASingleInstruction: anEncoder) not
>>>
>>> InstanceVariableNode>>canStorePopInASingleInstruction: anEncoder
>>> ^true
>>>
>>> etc...
>>>
>>> Le 14 février 2012 01:27, Eliot Miranda <[hidden email]> a écrit
>>> :
>>> > OK, I fixed this (a pleasant Monday afternoon distraction) (at least I
>>> > *think* I've fixed it; I still have to check the decompiler). But... in
>>> > doing so I realised I had to fix CompiledMethod>>#= since I needed to
>>> > see
>>> > exactly which methods the compiler change affected (just committed as
>>> > Kernel-eem.669). In doing this I noticed that recent compiler changes
>>> > have
>>> > not been applied by recompiling affected methods in the system. If you
>>> > look
>>> > at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to
>>> > check
>>> > that the compiler's output remains unchanged. In the 4.3 trunk image
>>> > with
>>> > CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that
>>> > some 19654
>>> > methods need recompiling:
>>> >
>>> > (SystemNavigation new allSelect: [:m| (m methodClass
>>> > sourceMatchesBytecodeAt: m selector) not]) size 19654
>>> >
>>> > Being selective in recompiling, via e.g.
>>> >
>>> > SystemNavigation new allSelect: [:m| (m methodClass
>>> > sourceMatchesBytecodeAt:
>>> > m selector) ifFalse: [m methodClass recompile: m selector]. false]
>>> >
>>> > one gets to the desired
>>> >
>>> > (SystemNavigation new allSelect: [:m| (m methodClass
>>> > sourceMatchesBytecodeAt: m selector) not]) size 0
>>> >
>>> > So the question is should we add an update that does the selective
>>> > recompile? I think we should, e.g. by committing a version of Compiler
>>> > that
>>> > contains the selective recompile as a post-script. Agreed?
>>> >
>>> > On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda
>>> > <[hidden email]>
>>> > wrote:
>>> >>
>>> >> Hi All,
>>> >>
>>> >> (whitewash alert) I just had occasion to look at the bytecode
>>> >> generated by the standard compiler for HashedCollection
>>> >> class>>#goodPrimeAtLeast:. Here's the source, with the issue in bold:
>>> >>
>>> >> goodPrimeAtLeast: lowerLimit
>>> >> "Answer the next good prime >= lowerlimit.
>>> >> If lowerLimit is larger than the largest known good prime,
>>> >> just make it odd."
>>> >> | primes low mid high prime |
>>> >> primes := self goodPrimes.
>>> >> low := 1.
>>> >> high := primes size.
>>> >> lowerLimit > (primes at: high) ifTrue: [
>>> >> ^lowerLimit bitOr: 1 ].
>>> >> [ high - low <= 1 ] whileFalse: [
>>> >> mid := high + low // 2.
>>> >> prime := primes at: mid.
>>> >> prime = lowerLimit ifTrue: [ ^prime ].
>>> >> prime < lowerLimit
>>> >> ifTrue: [ low := mid ]
>>> >> ifFalse: [ high := mid ] ].
>>> >> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>>> >> ^primes at: high
>>> >>
>>> >> The code for this sequence is
>>> >> 58 <15> pushTemp: 5
>>> >> 59 <10> pushTemp: 0
>>> >> 60 <B2> send: <
>>> >> 61 <9B> jumpFalse: 66
>>> >> 62 <13> pushTemp: 3
>>> >> 63 <81 42> storeIntoTemp: 2
>>> >> 65 <92> jumpTo: 69
>>> >> 66 <13> pushTemp: 3
>>> >> 67 <81 44> storeIntoTemp: 4
>>> >> 69 <87> pop
>>> >>
>>> >> where-as the following would be better:
>>> >>
>>> >> 58 <15> pushTemp: 5
>>> >> 59 <10> pushTemp: 0
>>> >> 60 <B2> send: <
>>> >> 61 <9B> jumpFalse: 66
>>> >> 62 <13> pushTemp: 3
>>> >> 63 <82 42> popIntoTemp: 2
>>> >> 65 <92> jumpTo: 69
>>> >> 66 <13> pushTemp: 3
>>> >> 67 <82 44> popIntoTemp: 4
>>> >>
>>> >> The reason is that the code generator favours using a single pop for
>>> >> both
>>> >> arms of the if:
>>> >>
>>> >> MessageNode>>sizeCodeForIf: encoder value: forValue
>>> >> | thenExpr elseExpr branchSize thenSize elseSize |
>>> >> thenExpr := arguments at: 1.
>>> >> elseExpr := arguments at: 2.
>>> >> (forValue
>>> >> or: [(thenExpr isJust: NodeNil)
>>> >> or: [elseExpr isJust: NodeNil]]) not
>>> >> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>>> >> ifTrue: "Two-armed IFs forEffect share a single pop"
>>> >> [^super sizeCodeForEffect: encoder].
>>> >> ...
>>> >>
>>> >> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>>> >> | thenExpr thenSize elseExpr elseSize |
>>> >> thenSize := sizes at: 1.
>>> >> elseSize := sizes at: 2.
>>> >> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>>> >> "Two-armed IFs forEffect share a single pop"
>>> >> [^super emitCodeForEffect: stack encoder: encoder].
>>> >>
>>> >> It would be nice if this only happened if doing so actually reduced the
>>> >> size of the generated code.
>>> >> --
>>> >> best,
>>> >> Eliot
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > best,
>>> > Eliot
>>> >
>>> >
>>> >
>>> >
>>>
>>
>>
>>
>> --
>> best,
>> Eliot
>>
>>
>>
>>
>

--
Best regards,
Igor Stasenko.

--
best,

Eliot

Nicolas Cellier

Re: here's a curio for compiler mavens...

It's not a clean image, but here are the results:

Eliot code
-> 503 methods

Modified Eliot code
doesEffectForIfRequiresAPop: encoder
"If both branch require a pop instruction, then the pop is factored
after the jump.
If a branch is empty or ends with assignment, it don't ends with a
pop, so we don't need to factor a pop"
^arguments noneSatisfy: [:branch | (branch isJust: NodeNil) or:
[branch statements last isAssignmentNode]].
-> 880 methods

Mine
doesEffectForIfRequiresAPop: encoder
"If both branch require a pop instruction, then the pop is factored
after the jump.
In this case, the translation of this node also require a final pop
that can further be eliminated eventually."
^arguments allSatisfy: [:branch | (branch isJust: NodeNil) not and:
[branch doesEvaluatingEffectRequiresAPop: encoder]].
-> 1202 methods (code attached)

And I think I forgot the case of ReturnNode which don't require a pop...

Nicolas

Le 15 février 2012 00:57, Eliot Miranda <[hidden email]> a écrit :

>
>
> On Tue, Feb 14, 2012 at 2:55 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> On 15 February 2012 00:50, Nicolas Cellier
>> <[hidden email]> wrote:
>> > Le 14 février 2012 23:10, Eliot Miranda <[hidden email]> a
>> > écrit :
>> >> On Tue, Feb 14, 2012 at 5:30 AM, Nicolas Cellier
>> >> <[hidden email]> wrote:
>> >>>
>> >>> Ah, I came up with a relatively explicit solution, but also quite
>> >>> heavy in term of number of messages required to implement the
>> >>> optimisation.
>> >>> I'm curious to see your implementation :)
>> >>
>> >>
>> >> similar, but should I commit it? I think mine is fine since it is only
>> >> assignments that have storepop forms, so your generality doesn't win
>> >> anything. but i could be wrong.
>> >>
>> >
>> > I was thinking of
>> >
>> > 1) nested ifTrue: ifFalse:
>> > cond1 ifTrue: [x1 := y] ifFalse: [cond2 ifTrue: [x2 := y] ifFalse: [x3
>> > := y]]
>> >
>> > elseExpr statements last is a message, not an assignment, though we
>> > can factor the pop out...
>> >
>> > 2) inlined loop messages don't require a final pop...
>> >
>> > x <= y ifTrue: [x to: y do: [:i | (self at: i) doSome]] ifFalse: [y
>> > to: x do: [:i | (self at: i) doSome]].
>> >
>> > An encoder pushing a value in both branches just to pop it after
>> > wouldn't be that well behaved...
>> >
>> > I won't give any version publicly because I crashed several images in
>> > the bootstrap process, and don't know if the bug is in the bootstrap
>> > or in the code.
>> >
>> IMO, too much effort for such diminishing returns. :)
>
>
> That was my thought too. It would be interesting to quantify the benefits.
> I noticed some 451 methods that were improved by my simple assignment-only
> mod. How many additional methods get caught by yours Nicolas?
>
>>
>>
>> > Nicolas
>> >
>> >> !MessageNode methodsFor: 'code generation' stamp: 'eem 2/13/2012
>> >> 15:25'!
>> >> ifThenElseForEffectShouldSharePop
>> >> | thenExpr elseExpr |
>> >> thenExpr := arguments at: 1.
>> >> elseExpr := arguments at: 2.
>> >> ^((thenExpr isJust: NodeNil)
>> >> or: [(elseExpr isJust: NodeNil)
>> >> or: [thenExpr statements last isAssignmentNode
>> >> and: [elseExpr statements last isAssignmentNode]]])
>> >> not
>> >>
>> >> "^(thenExpr isJust: NodeNil) not
>> >> and: [(elseExpr isJust: NodeNil) not
>> >> and: [thenExpr statements last isAssignmentNode not
>> >> or: [elseExpr statements last isAssignmentNode
>> >> not]]]"! !
>> >>
>> >>> Also, I'm sure that multiple-jump-elimination proposed by Igor and
>> >>> other kind of optimization would be a perfect subject for proving
>> >>> concepts of NewCompiler IR infrastructure.
>> >>>
>> >>> MessageNode>>doesEffectRequiresAPop: encoder
>> >>> special > 0
>> >>> ifTrue:
>> >>> [^self perform: (MacroTesters at: special)
>> >>> with:
>> >>> encoder].
>> >>> ^super doesEffectRequiresAPop: encoder
>> >>>
>> >>> MessageNode>>doesEffectForIfRequiresAPop: encoder
>> >>> "If both branch require a pop instruction, then the pop is
>> >>> factored
>> >>> after the jump.
>> >>> In this case, the translation of this node also require a final
>> >>> pop
>> >>> that can further be eliminated eventually."
>> >>> | thenExpr elseExpr |
>> >>> thenExpr := arguments at: 1.
>> >>> elseExpr := arguments at: 2.
>> >>> ^((thenExpr isJust: NodeNil) or: [elseExpr isJust: NodeNil])
>> >>> not
>> >>> and: [(thenExpr doesEvaluatingEffectRequiresAPop:
>> >>> encoder)
>> >>> and:
>> >>> [elseExpr doesEvaluatingEffectRequiresAPop: encoder]].
>> >>>
>> >>> BlockNode>>doesEvaluatingEffectRequiresAPop: anEncoder
>> >>> "Test whether a pop instruction is required to skip the value
>> >>> on
>> >>> the
>> >>> stack when this block is inlined."
>> >>> ^statements size > 0 and: [statements last
>> >>> doesEffectRequiresAPop:
>> >>> anEncoder]
>> >>>
>> >>> AssigmentNode>>doesEffectRequiresAPop: anEncoder
>> >>> "Test whether a pop instruction is required to skip the value
>> >>> on
>> >>> the stack"
>> >>> ^(variable canStorePopInASingleInstruction: anEncoder) not
>> >>>
>> >>> InstanceVariableNode>>canStorePopInASingleInstruction: anEncoder
>> >>> ^true
>> >>>
>> >>> etc...
>> >>>
>> >>> Le 14 février 2012 01:27, Eliot Miranda <[hidden email]> a
>> >>> écrit
>> >>> :
>> >>> > OK, I fixed this (a pleasant Monday afternoon distraction) (at least
>> >>> > I
>> >>> > *think* I've fixed it; I still have to check the decompiler).
>> >>> > But... in
>> >>> > doing so I realised I had to fix CompiledMethod>>#= since I needed
>> >>> > to
>> >>> > see
>> >>> > exactly which methods the compiler change affected (just committed
>> >>> > as
>> >>> > Kernel-eem.669). In doing this I noticed that recent compiler
>> >>> > changes
>> >>> > have
>> >>> > not been applied by recompiling affected methods in the system. If
>> >>> > you
>> >>> > look
>> >>> > at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to
>> >>> > check
>> >>> > that the compiler's output remains unchanged. In the 4.3 trunk
>> >>> > image
>> >>> > with
>> >>> > CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that
>> >>> > some 19654
>> >>> > methods need recompiling:
>> >>> >
>> >>> > (SystemNavigation new allSelect: [:m| (m methodClass
>> >>> > sourceMatchesBytecodeAt: m selector) not]) size 19654
>> >>> >
>> >>> > Being selective in recompiling, via e.g.
>> >>> >
>> >>> > SystemNavigation new allSelect: [:m| (m methodClass
>> >>> > sourceMatchesBytecodeAt:
>> >>> > m selector) ifFalse: [m methodClass recompile: m selector]. false]
>> >>> >
>> >>> > one gets to the desired
>> >>> >
>> >>> > (SystemNavigation new allSelect: [:m| (m methodClass
>> >>> > sourceMatchesBytecodeAt: m selector) not]) size 0
>> >>> >
>> >>> > So the question is should we add an update that does the selective
>> >>> > recompile? I think we should, e.g. by committing a version of
>> >>> > Compiler
>> >>> > that
>> >>> > contains the selective recompile as a post-script. Agreed?
>> >>> >
>> >>> > On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda
>> >>> > <[hidden email]>
>> >>> > wrote:
>> >>> >>
>> >>> >> Hi All,
>> >>> >>
>> >>> >> (whitewash alert) I just had occasion to look at the bytecode
>> >>> >> generated by the standard compiler for HashedCollection
>> >>> >> class>>#goodPrimeAtLeast:. Here's the source, with the issue in
>> >>> >> bold:
>> >>> >>
>> >>> >> goodPrimeAtLeast: lowerLimit
>> >>> >> "Answer the next good prime >= lowerlimit.
>> >>> >> If lowerLimit is larger than the largest known good prime,
>> >>> >> just make it odd."
>> >>> >> | primes low mid high prime |
>> >>> >> primes := self goodPrimes.
>> >>> >> low := 1.
>> >>> >> high := primes size.
>> >>> >> lowerLimit > (primes at: high) ifTrue: [
>> >>> >> ^lowerLimit bitOr: 1 ].
>> >>> >> [ high - low <= 1 ] whileFalse: [
>> >>> >> mid := high + low // 2.
>> >>> >> prime := primes at: mid.
>> >>> >> prime = lowerLimit ifTrue: [ ^prime ].
>> >>> >> prime < lowerLimit
>> >>> >> ifTrue: [ low := mid ]
>> >>> >> ifFalse: [ high := mid ] ].
>> >>> >> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>> >>> >> ^primes at: high
>> >>> >>
>> >>> >> The code for this sequence is
>> >>> >> 58 <15> pushTemp: 5
>> >>> >> 59 <10> pushTemp: 0
>> >>> >> 60 <B2> send: <
>> >>> >> 61 <9B> jumpFalse: 66
>> >>> >> 62 <13> pushTemp: 3
>> >>> >> 63 <81 42> storeIntoTemp: 2
>> >>> >> 65 <92> jumpTo: 69
>> >>> >> 66 <13> pushTemp: 3
>> >>> >> 67 <81 44> storeIntoTemp: 4
>> >>> >> 69 <87> pop
>> >>> >>
>> >>> >> where-as the following would be better:
>> >>> >>
>> >>> >> 58 <15> pushTemp: 5
>> >>> >> 59 <10> pushTemp: 0
>> >>> >> 60 <B2> send: <
>> >>> >> 61 <9B> jumpFalse: 66
>> >>> >> 62 <13> pushTemp: 3
>> >>> >> 63 <82 42> popIntoTemp: 2
>> >>> >> 65 <92> jumpTo: 69
>> >>> >> 66 <13> pushTemp: 3
>> >>> >> 67 <82 44> popIntoTemp: 4
>> >>> >>
>> >>> >> The reason is that the code generator favours using a single pop
>> >>> >> for
>> >>> >> both
>> >>> >> arms of the if:
>> >>> >>
>> >>> >> MessageNode>>sizeCodeForIf: encoder value: forValue
>> >>> >> | thenExpr elseExpr branchSize thenSize elseSize |
>> >>> >> thenExpr := arguments at: 1.
>> >>> >> elseExpr := arguments at: 2.
>> >>> >> (forValue
>> >>> >> or: [(thenExpr isJust: NodeNil)
>> >>> >> or: [elseExpr isJust: NodeNil]]) not
>> >>> >> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>> >>> >> ifTrue: "Two-armed IFs forEffect share a single pop"
>> >>> >> [^super sizeCodeForEffect: encoder].
>> >>> >> ...
>> >>> >>
>> >>> >> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>> >>> >> | thenExpr thenSize elseExpr elseSize |
>> >>> >> thenSize := sizes at: 1.
>> >>> >> elseSize := sizes at: 2.
>> >>> >> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>> >>> >> "Two-armed IFs forEffect share a single pop"
>> >>> >> [^super emitCodeForEffect: stack encoder: encoder].
>> >>> >>
>> >>> >> It would be nice if this only happened if doing so actually reduced
>> >>> >> the
>> >>> >> size of the generated code.
>> >>> >> --
>> >>> >> best,
>> >>> >> Eliot
>> >>> >>
>> >>> >
>> >>> >
>> >>> >
>> >>> > --
>> >>> > best,
>> >>> > Eliot
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> best,
>> >> Eliot
>> >>
>> >>
>> >>
>> >>
>> >
>>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko.
>>
>
>
>
> --
> best,
> Eliot
>
>
>
>

sharePop2.st (13K) Download Attachment

Nicolas Cellier

Re: here's a curio for compiler mavens...

1238 methods don't really need to factor pop if I also handle the case of return
Note that this does not always make a difference neither in method
length nor in execution time...
It does for example if you end with an assignment in one branch and a
return in the other, like:
cond ifTrue: [x := 0] ifFalse: [x := -1. ^self]. self doFurtherProcessing

So the above count does not really tell which methods are worth changing...

The case of assignment in single branch is interesting because we
replace a store and a pop with a single storePop.
cond ifTrue: [x := 0] ifFalse: [self doSome]. self doFurtherProcessing
Don't know if it gains anything when jitted though...

Nicolas

Le 15 février 2012 01:24, Nicolas Cellier
<[hidden email]> a écrit :

> It's not a clean image, but here are the results:
>
> Eliot code
> -> 503 methods
>
> Modified Eliot code
> doesEffectForIfRequiresAPop: encoder
> "If both branch require a pop instruction, then the pop is factored
> after the jump.
> If a branch is empty or ends with assignment, it don't ends with a
> pop, so we don't need to factor a pop"
> ^arguments noneSatisfy: [:branch | (branch isJust: NodeNil) or:
> [branch statements last isAssignmentNode]].
> -> 880 methods
>
> Mine
> doesEffectForIfRequiresAPop: encoder
> "If both branch require a pop instruction, then the pop is factored
> after the jump.
> In this case, the translation of this node also require a final pop
> that can further be eliminated eventually."
> ^arguments allSatisfy: [:branch | (branch isJust: NodeNil) not and:
> [branch doesEvaluatingEffectRequiresAPop: encoder]].
> -> 1202 methods (code attached)
>
> And I think I forgot the case of ReturnNode which don't require a pop...
>
> Nicolas
>
> Le 15 février 2012 00:57, Eliot Miranda <[hidden email]> a écrit :
>>
>>
>> On Tue, Feb 14, 2012 at 2:55 PM, Igor Stasenko <[hidden email]> wrote:
>>>
>>> On 15 February 2012 00:50, Nicolas Cellier
>>> <[hidden email]> wrote:
>>> > Le 14 février 2012 23:10, Eliot Miranda <[hidden email]> a
>>> > écrit :
>>> >> On Tue, Feb 14, 2012 at 5:30 AM, Nicolas Cellier
>>> >> <[hidden email]> wrote:
>>> >>>
>>> >>> Ah, I came up with a relatively explicit solution, but also quite
>>> >>> heavy in term of number of messages required to implement the
>>> >>> optimisation.
>>> >>> I'm curious to see your implementation :)
>>> >>
>>> >>
>>> >> similar, but should I commit it? I think mine is fine since it is only
>>> >> assignments that have storepop forms, so your generality doesn't win
>>> >> anything. but i could be wrong.
>>> >>
>>> >
>>> > I was thinking of
>>> >
>>> > 1) nested ifTrue: ifFalse:
>>> > cond1 ifTrue: [x1 := y] ifFalse: [cond2 ifTrue: [x2 := y] ifFalse: [x3
>>> > := y]]
>>> >
>>> > elseExpr statements last is a message, not an assignment, though we
>>> > can factor the pop out...
>>> >
>>> > 2) inlined loop messages don't require a final pop...
>>> >
>>> > x <= y ifTrue: [x to: y do: [:i | (self at: i) doSome]] ifFalse: [y
>>> > to: x do: [:i | (self at: i) doSome]].
>>> >
>>> > An encoder pushing a value in both branches just to pop it after
>>> > wouldn't be that well behaved...
>>> >
>>> > I won't give any version publicly because I crashed several images in
>>> > the bootstrap process, and don't know if the bug is in the bootstrap
>>> > or in the code.
>>> >
>>> IMO, too much effort for such diminishing returns. :)
>>
>>
>> That was my thought too. It would be interesting to quantify the benefits.
>> I noticed some 451 methods that were improved by my simple assignment-only
>> mod. How many additional methods get caught by yours Nicolas?
>>
>>>
>>>
>>> > Nicolas
>>> >
>>> >> !MessageNode methodsFor: 'code generation' stamp: 'eem 2/13/2012
>>> >> 15:25'!
>>> >> ifThenElseForEffectShouldSharePop
>>> >> | thenExpr elseExpr |
>>> >> thenExpr := arguments at: 1.
>>> >> elseExpr := arguments at: 2.
>>> >> ^((thenExpr isJust: NodeNil)
>>> >> or: [(elseExpr isJust: NodeNil)
>>> >> or: [thenExpr statements last isAssignmentNode
>>> >> and: [elseExpr statements last isAssignmentNode]]])
>>> >> not
>>> >>
>>> >> "^(thenExpr isJust: NodeNil) not
>>> >> and: [(elseExpr isJust: NodeNil) not
>>> >> and: [thenExpr statements last isAssignmentNode not
>>> >> or: [elseExpr statements last isAssignmentNode
>>> >> not]]]"! !
>>> >>
>>> >>> Also, I'm sure that multiple-jump-elimination proposed by Igor and
>>> >>> other kind of optimization would be a perfect subject for proving
>>> >>> concepts of NewCompiler IR infrastructure.
>>> >>>
>>> >>> MessageNode>>doesEffectRequiresAPop: encoder
>>> >>> special > 0
>>> >>> ifTrue:
>>> >>> [^self perform: (MacroTesters at: special)
>>> >>> with:
>>> >>> encoder].
>>> >>> ^super doesEffectRequiresAPop: encoder
>>> >>>
>>> >>> MessageNode>>doesEffectForIfRequiresAPop: encoder
>>> >>> "If both branch require a pop instruction, then the pop is
>>> >>> factored
>>> >>> after the jump.
>>> >>> In this case, the translation of this node also require a final
>>> >>> pop
>>> >>> that can further be eliminated eventually."
>>> >>> | thenExpr elseExpr |
>>> >>> thenExpr := arguments at: 1.
>>> >>> elseExpr := arguments at: 2.
>>> >>> ^((thenExpr isJust: NodeNil) or: [elseExpr isJust: NodeNil])
>>> >>> not
>>> >>> and: [(thenExpr doesEvaluatingEffectRequiresAPop:
>>> >>> encoder)
>>> >>> and:
>>> >>> [elseExpr doesEvaluatingEffectRequiresAPop: encoder]].
>>> >>>
>>> >>> BlockNode>>doesEvaluatingEffectRequiresAPop: anEncoder
>>> >>> "Test whether a pop instruction is required to skip the value
>>> >>> on
>>> >>> the
>>> >>> stack when this block is inlined."
>>> >>> ^statements size > 0 and: [statements last
>>> >>> doesEffectRequiresAPop:
>>> >>> anEncoder]
>>> >>>
>>> >>> AssigmentNode>>doesEffectRequiresAPop: anEncoder
>>> >>> "Test whether a pop instruction is required to skip the value
>>> >>> on
>>> >>> the stack"
>>> >>> ^(variable canStorePopInASingleInstruction: anEncoder) not
>>> >>>
>>> >>> InstanceVariableNode>>canStorePopInASingleInstruction: anEncoder
>>> >>> ^true
>>> >>>
>>> >>> etc...
>>> >>>
>>> >>> Le 14 février 2012 01:27, Eliot Miranda <[hidden email]> a
>>> >>> écrit
>>> >>> :
>>> >>> > OK, I fixed this (a pleasant Monday afternoon distraction) (at least
>>> >>> > I
>>> >>> > *think* I've fixed it; I still have to check the decompiler).
>>> >>> > But... in
>>> >>> > doing so I realised I had to fix CompiledMethod>>#= since I needed
>>> >>> > to
>>> >>> > see
>>> >>> > exactly which methods the compiler change affected (just committed
>>> >>> > as
>>> >>> > Kernel-eem.669). In doing this I noticed that recent compiler
>>> >>> > changes
>>> >>> > have
>>> >>> > not been applied by recompiling affected methods in the system. If
>>> >>> > you
>>> >>> > look
>>> >>> > at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to
>>> >>> > check
>>> >>> > that the compiler's output remains unchanged. In the 4.3 trunk
>>> >>> > image
>>> >>> > with
>>> >>> > CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that
>>> >>> > some 19654
>>> >>> > methods need recompiling:
>>> >>> >
>>> >>> > (SystemNavigation new allSelect: [:m| (m methodClass
>>> >>> > sourceMatchesBytecodeAt: m selector) not]) size 19654
>>> >>> >
>>> >>> > Being selective in recompiling, via e.g.
>>> >>> >
>>> >>> > SystemNavigation new allSelect: [:m| (m methodClass
>>> >>> > sourceMatchesBytecodeAt:
>>> >>> > m selector) ifFalse: [m methodClass recompile: m selector]. false]
>>> >>> >
>>> >>> > one gets to the desired
>>> >>> >
>>> >>> > (SystemNavigation new allSelect: [:m| (m methodClass
>>> >>> > sourceMatchesBytecodeAt: m selector) not]) size 0
>>> >>> >
>>> >>> > So the question is should we add an update that does the selective
>>> >>> > recompile? I think we should, e.g. by committing a version of
>>> >>> > Compiler
>>> >>> > that
>>> >>> > contains the selective recompile as a post-script. Agreed?
>>> >>> >
>>> >>> > On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda
>>> >>> > <[hidden email]>
>>> >>> > wrote:
>>> >>> >>
>>> >>> >> Hi All,
>>> >>> >>
>>> >>> >> (whitewash alert) I just had occasion to look at the bytecode
>>> >>> >> generated by the standard compiler for HashedCollection
>>> >>> >> class>>#goodPrimeAtLeast:. Here's the source, with the issue in
>>> >>> >> bold:
>>> >>> >>
>>> >>> >> goodPrimeAtLeast: lowerLimit
>>> >>> >> "Answer the next good prime >= lowerlimit.
>>> >>> >> If lowerLimit is larger than the largest known good prime,
>>> >>> >> just make it odd."
>>> >>> >> | primes low mid high prime |
>>> >>> >> primes := self goodPrimes.
>>> >>> >> low := 1.
>>> >>> >> high := primes size.
>>> >>> >> lowerLimit > (primes at: high) ifTrue: [
>>> >>> >> ^lowerLimit bitOr: 1 ].
>>> >>> >> [ high - low <= 1 ] whileFalse: [
>>> >>> >> mid := high + low // 2.
>>> >>> >> prime := primes at: mid.
>>> >>> >> prime = lowerLimit ifTrue: [ ^prime ].
>>> >>> >> prime < lowerLimit
>>> >>> >> ifTrue: [ low := mid ]
>>> >>> >> ifFalse: [ high := mid ] ].
>>> >>> >> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>>> >>> >> ^primes at: high
>>> >>> >>
>>> >>> >> The code for this sequence is
>>> >>> >> 58 <15> pushTemp: 5
>>> >>> >> 59 <10> pushTemp: 0
>>> >>> >> 60 <B2> send: <
>>> >>> >> 61 <9B> jumpFalse: 66
>>> >>> >> 62 <13> pushTemp: 3
>>> >>> >> 63 <81 42> storeIntoTemp: 2
>>> >>> >> 65 <92> jumpTo: 69
>>> >>> >> 66 <13> pushTemp: 3
>>> >>> >> 67 <81 44> storeIntoTemp: 4
>>> >>> >> 69 <87> pop
>>> >>> >>
>>> >>> >> where-as the following would be better:
>>> >>> >>
>>> >>> >> 58 <15> pushTemp: 5
>>> >>> >> 59 <10> pushTemp: 0
>>> >>> >> 60 <B2> send: <
>>> >>> >> 61 <9B> jumpFalse: 66
>>> >>> >> 62 <13> pushTemp: 3
>>> >>> >> 63 <82 42> popIntoTemp: 2
>>> >>> >> 65 <92> jumpTo: 69
>>> >>> >> 66 <13> pushTemp: 3
>>> >>> >> 67 <82 44> popIntoTemp: 4
>>> >>> >>
>>> >>> >> The reason is that the code generator favours using a single pop
>>> >>> >> for
>>> >>> >> both
>>> >>> >> arms of the if:
>>> >>> >>
>>> >>> >> MessageNode>>sizeCodeForIf: encoder value: forValue
>>> >>> >> | thenExpr elseExpr branchSize thenSize elseSize |
>>> >>> >> thenExpr := arguments at: 1.
>>> >>> >> elseExpr := arguments at: 2.
>>> >>> >> (forValue
>>> >>> >> or: [(thenExpr isJust: NodeNil)
>>> >>> >> or: [elseExpr isJust: NodeNil]]) not
>>> >>> >> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>>> >>> >> ifTrue: "Two-armed IFs forEffect share a single pop"
>>> >>> >> [^super sizeCodeForEffect: encoder].
>>> >>> >> ...
>>> >>> >>
>>> >>> >> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>>> >>> >> | thenExpr thenSize elseExpr elseSize |
>>> >>> >> thenSize := sizes at: 1.
>>> >>> >> elseSize := sizes at: 2.
>>> >>> >> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>>> >>> >> "Two-armed IFs forEffect share a single pop"
>>> >>> >> [^super emitCodeForEffect: stack encoder: encoder].
>>> >>> >>
>>> >>> >> It would be nice if this only happened if doing so actually reduced
>>> >>> >> the
>>> >>> >> size of the generated code.
>>> >>> >> --
>>> >>> >> best,
>>> >>> >> Eliot
>>> >>> >>
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> > --
>>> >>> > best,
>>> >>> > Eliot
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> best,
>>> >> Eliot
>>> >>
>>> >>
>>> >>
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Igor Stasenko.
>>>
>>
>>
>>
>> --
>> best,
>> Eliot
>>
>>
>>
>>

sharePop3.st (14K) Download Attachment