here's a curio for compiler mavens...

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

here's a curio for compiler mavens...

Eliot Miranda-2
Hi All,

    (whitewash alert)  I just had occasion to look at the bytecode generated by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:.  Here's the source, with the issue in bold:

goodPrimeAtLeast: lowerLimit
"Answer the next good prime >= lowerlimit.
If lowerLimit is larger than the largest known good prime,
just make it odd."
| primes low mid high prime |
primes := self goodPrimes.
low := 1.
high := primes size.
lowerLimit > (primes at: high) ifTrue: [
^lowerLimit bitOr: 1 ].
[ high - low <= 1 ] whileFalse: [
mid := high + low // 2.
prime := primes at: mid.
prime = lowerLimit ifTrue: [ ^prime ].
prime < lowerLimit
ifTrue: [ low := mid ]
ifFalse: [ high := mid ] ].
(primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
^primes at: high

The code for this sequence is
58 <15> pushTemp: 5
59 <10> pushTemp: 0
60 <B2> send: <
61 <9B> jumpFalse: 66
62 <13> pushTemp: 3
63 <81 42> storeIntoTemp: 2
65 <92> jumpTo: 69
66 <13> pushTemp: 3
67 <81 44> storeIntoTemp: 4
69 <87> pop

where-as the following would be better:

58 <15> pushTemp: 5
59 <10> pushTemp: 0
60 <B2> send: <
61 <9B> jumpFalse: 66
62 <13> pushTemp: 3
63 <82 42> popIntoTemp: 2
65 <92> jumpTo: 69
66 <13> pushTemp: 3
67 <82 44> popIntoTemp: 4

The reason is that the code generator favours using a single pop for both arms of the if:

MessageNode>>sizeCodeForIf: encoder value: forValue
| thenExpr elseExpr branchSize thenSize elseSize |
thenExpr := arguments at: 1.
elseExpr := arguments at: 2.
(forValue
or: [(thenExpr isJust: NodeNil)
or: [elseExpr isJust: NodeNil]]) not
"(...not ifTrue: avoids using ifFalse: alone during this compile)"
ifTrue:  "Two-armed IFs forEffect share a single pop"
[^super sizeCodeForEffect: encoder].
...

MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
| thenExpr thenSize elseExpr elseSize |
thenSize := sizes at: 1.
elseSize := sizes at: 2.
(forValue not and: [elseSize * thenSize > 0]) ifTrue:
"Two-armed IFs forEffect share a single pop"
[^super emitCodeForEffect: stack encoder: encoder].

It would be nice if this only happened if doing so actually reduced the size of the generated code.
--
best,
Eliot

Reply | Threaded
Open this post in threaded view
|

Re: [squeak-dev] here's a curio for compiler mavens...

Igor Stasenko
On 13 February 2012 20:58, Eliot Miranda <[hidden email]> wrote:

> Hi All,
>
>     (whitewash alert)  I just had occasion to look at the bytecode generated
> by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:.
>  Here's the source, with the issue in bold:
>
> goodPrimeAtLeast: lowerLimit
> "Answer the next good prime >= lowerlimit.
> If lowerLimit is larger than the largest known good prime,
> just make it odd."
> | primes low mid high prime |
> primes := self goodPrimes.
> low := 1.
> high := primes size.
> lowerLimit > (primes at: high) ifTrue: [
> ^lowerLimit bitOr: 1 ].
> [ high - low <= 1 ] whileFalse: [
> mid := high + low // 2.
> prime := primes at: mid.
> prime = lowerLimit ifTrue: [ ^prime ].
> prime < lowerLimit
> ifTrue: [ low := mid ]
> ifFalse: [ high := mid ] ].
> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
> ^primes at: high
>
> The code for this sequence is
> 58 <15> pushTemp: 5
> 59 <10> pushTemp: 0
> 60 <B2> send: <
> 61 <9B> jumpFalse: 66
> 62 <13> pushTemp: 3
> 63 <81 42> storeIntoTemp: 2
> 65 <92> jumpTo: 69
> 66 <13> pushTemp: 3
> 67 <81 44> storeIntoTemp: 4
> 69 <87> pop
>
> where-as the following would be better:
>
> 58 <15> pushTemp: 5
> 59 <10> pushTemp: 0
> 60 <B2> send: <
> 61 <9B> jumpFalse: 66
> 62 <13> pushTemp: 3
> 63 <82 42> popIntoTemp: 2
> 65 <92> jumpTo: 69
> 66 <13> pushTemp: 3
> 67 <82 44> popIntoTemp: 4
>

[ (a) ifTrue:[ b] ifFalse: [c] ] value
should answer b or c (depending on a)
now if you use pop instead of store, it will break.

Of course in while loop, the [ loop-body closure ] value is always
discarded, so yes we can
eliminate extra pop.
But you can do even more:

1. loop head
...
....
 61 <9B> jumpFalse: 66
 62 <13> pushTemp: 3
 63 <82 42> popIntoTemp: 2
** 65 <92> jumpTo: 1
 66 <13> pushTemp: 3
 67 <82 44> popIntoTemp: 4
 69  ...  jumpTo: 1

so, we can have 1 less jump when code enters first branch.


> The reason is that the code generator favours using a single pop for both
> arms of the if:
>
> MessageNode>>sizeCodeForIf: encoder value: forValue
> | thenExpr elseExpr branchSize thenSize elseSize |
> thenExpr := arguments at: 1.
> elseExpr := arguments at: 2.
> (forValue
> or: [(thenExpr isJust: NodeNil)
> or: [elseExpr isJust: NodeNil]]) not
> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
> ifTrue:  "Two-armed IFs forEffect share a single pop"
> [^super sizeCodeForEffect: encoder].
> ...
>
> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
> | thenExpr thenSize elseExpr elseSize |
> thenSize := sizes at: 1.
> elseSize := sizes at: 2.
> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
> "Two-armed IFs forEffect share a single pop"
> [^super emitCodeForEffect: stack encoder: encoder].
>
> It would be nice if this only happened if doing so actually reduced the size
> of the generated code.
> --
> best,
> Eliot
>
>
>
>



--
Best regards,
Igor Stasenko.

Reply | Threaded
Open this post in threaded view
|

Re: here's a curio for compiler mavens...

Eliot Miranda-2
In reply to this post by Eliot Miranda-2
OK, I fixed this (a pleasant Monday afternoon distraction) (at least I *think* I've fixed it; I still have to check the decompiler).  But... in doing so I realised I had to fix CompiledMethod>>#= since I needed to see exactly which methods the compiler change affected (just committed as Kernel-eem.669).  In doing this I noticed that recent compiler changes have not been applied by recompiling affected methods in the system.  If you look at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to check that the compiler's output remains unchanged.  In the 4.3 trunk image with CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that some 19654 methods need recompiling:

(SystemNavigation new allSelect: [:m| (m methodClass sourceMatchesBytecodeAt: m selector) not]) size 19654

Being selective in recompiling, via e.g.

SystemNavigation new allSelect: [:m| (m methodClass sourceMatchesBytecodeAt: m selector) ifFalse: [m methodClass recompile: m selector]. false]

one gets to the desired

(SystemNavigation new allSelect: [:m| (m methodClass sourceMatchesBytecodeAt: m selector) not]) size 0

So the question is should we add an update that does the selective recompile?  I think we should, e.g. by committing a version of Compiler that contains the selective recompile as a post-script.  Agreed?

On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda <[hidden email]> wrote:
Hi All,

    (whitewash alert)  I just had occasion to look at the bytecode generated by the standard compiler for HashedCollection class>>#goodPrimeAtLeast:.  Here's the source, with the issue in bold:

goodPrimeAtLeast: lowerLimit
"Answer the next good prime >= lowerlimit.
If lowerLimit is larger than the largest known good prime,
just make it odd."
| primes low mid high prime |
primes := self goodPrimes.
low := 1.
high := primes size.
lowerLimit > (primes at: high) ifTrue: [
^lowerLimit bitOr: 1 ].
[ high - low <= 1 ] whileFalse: [
mid := high + low // 2.
prime := primes at: mid.
prime = lowerLimit ifTrue: [ ^prime ].
prime < lowerLimit
ifTrue: [ low := mid ]
ifFalse: [ high := mid ] ].
(primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
^primes at: high

The code for this sequence is
58 <15> pushTemp: 5
59 <10> pushTemp: 0
60 <B2> send: <
61 <9B> jumpFalse: 66
62 <13> pushTemp: 3
63 <81 42> storeIntoTemp: 2
65 <92> jumpTo: 69
66 <13> pushTemp: 3
67 <81 44> storeIntoTemp: 4
69 <87> pop

where-as the following would be better:

58 <15> pushTemp: 5
59 <10> pushTemp: 0
60 <B2> send: <
61 <9B> jumpFalse: 66
62 <13> pushTemp: 3
63 <82 42> popIntoTemp: 2
65 <92> jumpTo: 69
66 <13> pushTemp: 3
67 <82 44> popIntoTemp: 4

The reason is that the code generator favours using a single pop for both arms of the if:

MessageNode>>sizeCodeForIf: encoder value: forValue
| thenExpr elseExpr branchSize thenSize elseSize |
thenExpr := arguments at: 1.
elseExpr := arguments at: 2.
(forValue
or: [(thenExpr isJust: NodeNil)
or: [elseExpr isJust: NodeNil]]) not
"(...not ifTrue: avoids using ifFalse: alone during this compile)"
ifTrue:  "Two-armed IFs forEffect share a single pop"
[^super sizeCodeForEffect: encoder].
...

MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
| thenExpr thenSize elseExpr elseSize |
thenSize := sizes at: 1.
elseSize := sizes at: 2.
(forValue not and: [elseSize * thenSize > 0]) ifTrue:
"Two-armed IFs forEffect share a single pop"
[^super emitCodeForEffect: stack encoder: encoder].

It would be nice if this only happened if doing so actually reduced the size of the generated code.
--
best,
Eliot




--
best,
Eliot