-ftlo option (link-time optimizer)

-ftlo option (link-time optimizer)

Ben Coman
 
This is the first I have heard of the -flto flag for the link-time optimizer, and of it providing additional type checking.  I was curious whether we use it.

cheers -ben

Re: -ftlo option (link-time optimizer)

Eliot Miranda-2
 
Hi Ben,

On Fri, Jan 4, 2019 at 8:32 PM Ben Coman <[hidden email]> wrote:
 
This is the first I have heard of the -flto flag for the link-time optimizer, and of it providing additional type checking.  I was curious whether we use it.

No, and I would counsel that we don't.  Link-time optimization is implementation-defined and so could have various effects on the binary.  There are some convenient implicit assumptions in the VM that, with a slim chance, it could violate.  For example, the primitiveFunctionPointer is tested and it is assumed that any value above 519 is a function pointer, and any value less than or equal to 519 is a quick primitive index (a primitive that answers an inst var or one of a handful of constants).  This eliminates the need to maintain both primitiveIndex and primitiveFunctionPointer in the method cache and shrinks it from eight slots per entry to four, which significantly speeds up primitive dispatch in the StackInterpreter.  But it is the kind of hack that *could* be violated by too aggressive an optimizer.  The VM is in the business of executing Smalltalk quickly, and that may not be fully aligned with executing its C code quickly.  IME it is best to be a little conservative with the C-level optimizations and be aggressive with the VM's own algorithmically/representationally implemented optimizations.
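To make the primitiveFunctionPointer convention concrete, here is a minimal sketch in C. It is not the VM's actual source: the threshold 519 comes from the description above, while `PrimitiveFn`, `encodeFunction`, `dispatchPrimitive`, `examplePrimitive` and the two counters are hypothetical names invented for illustration. It also assumes that an integer can round-trip through a function pointer, which is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* From the post: values at or below 519 are quick primitive indices. */
#define MAX_QUICK_PRIMITIVE_INDEX 519

typedef void (*PrimitiveFn)(void);

/* Hypothetical state, only here so the effect of a dispatch is observable. */
static int lastQuickIndex = -1;
static int functionCalls = 0;

/* Hypothetical stand-in for a real primitive function. */
static void examplePrimitive(void) { functionCalls++; }

/* Store a function address in the shared slot. The assert spells out the
   implicit assumption: no real function lives at an address <= 519. */
static uintptr_t encodeFunction(PrimitiveFn fn)
{
    uintptr_t bits = (uintptr_t)fn;
    assert(bits > MAX_QUICK_PRIMITIVE_INDEX);
    return bits;
}

/* A single slot serves as either a small index or a function address. */
static void dispatchPrimitive(uintptr_t primitiveFunctionPointer)
{
    if (primitiveFunctionPointer > MAX_QUICK_PRIMITIVE_INDEX) {
        /* Interpreted as a function address: call through it. */
        ((PrimitiveFn)primitiveFunctionPointer)();
    } else {
        /* Interpreted as a quick primitive index: record it. */
        lastQuickIndex = (int)primitiveFunctionPointer;
    }
}
```

Calling `dispatchPrimitive(264)` takes the index branch, while `dispatchPrimitive(encodeFunction(examplePrimitive))` takes the call branch. The point of the sketch is that nothing in the C language guarantees the two ranges stay disjoint; it merely holds on the platforms in use.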

Of course, once the JIT is involved, the system is spending less and less time in compiled C and more and more time in generated machine code.  At that point C-level optimizations have less and less effect on overall speed.

None of the above applies to the C code for the GC or for individual plugins.  But since the GC tries hard to take a small percentage of overall execution time anyway, use of aggressive optimization therein shouldn't pay back huge dividends.  If one can show a benefit for a particular *external* plugin (a separate shared object/dll), I would have no objection to it being more aggressively optimized.

And I am happy to be proved wrong.

_,,,^..^,,,_
best, Eliot

Re: -ftlo option (link-time optimizer)

Nicolas Cellier
 


Re: -ftlo option (link-time optimizer)

Eliot Miranda-2
 


On Jan 5, 2019, at 2:17 AM, Nicolas Cellier <[hidden email]> wrote:


+1.  Right!  And that’s such a nice cheap hack to load the floating-point registers.  It would be completely broken because link-time optimization would simply delete the function call given that it has no explicit effect.



Re: -ftlo option (link-time optimizer)

Ben Coman
 
Thanks Eliot & Nicolas.  It's always good to understand things in relation to specific circumstances.
cheers -ben


Re: -ftlo option (link-time optimizer)

Levente Uzonyi
In reply to this post by Eliot Miranda-2
 
Our last discussion[1] about LTO ended at a similar point. LTO can be
disabled for specific files. And, besides its optimization benefits, it
can also detect type mismatches in different modules.
I have a VM built with gcc 8 that mostly works. For example, I can update
an image from the trunk repository, but some floating point tests fail.

Levente

[1] http://forum.world.st/primitiveDigitCompare-is-slow-was-Re-squeak-dev-The-Inbox-Kernel-dtl-1015-mcz-td4890070i20.html#a4890675


Re: -ftlo option (link-time optimizer)

Levente Uzonyi
 
Well, I just checked what happens if I compile the VM with gcc 8, but no
LTO.
Actually the same issue exists. Float >> #basicAt: (primitive 38) always
returns 0 with that VM. So, I presume there's either undefined behavior,
an optimization issue or a compiler bug somewhere.
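As an illustration of the kind of undefined behavior that can surface with a newer compiler, and not a diagnosis of this particular failure, here is a sketch in C of reading the 32-bit words of a double. Reading them through a cast such as `*(uint32_t *)&value` violates C's aliasing rules, so an optimizer is free to produce surprising results; copying the bytes with `memcpy` is well defined. The helper name `floatWordAt` and the convention that index 1 is the most significant word are assumptions made for this example, as is a 64-bit IEEE 754 double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: answer one 32-bit word of a double.
   Assumed convention: index 1 is the most significant word, index 2 the
   least significant one. Assumes double is a 64-bit IEEE 754 value. */
static uint32_t floatWordAt(double value, int index)
{
    uint64_t bits;

    assert(index == 1 || index == 2);
    assert(sizeof bits == sizeof value);

    /* memcpy copies the object representation without breaking the
       aliasing rules, unlike dereferencing a cast pointer. */
    memcpy(&bits, &value, sizeof bits);

    /* Shifting the 64-bit integer keeps the result independent of the
       machine's byte order. */
    return index == 1 ? (uint32_t)(bits >> 32)
                      : (uint32_t)(bits & 0xFFFFFFFFu);
}
```

For example, `floatWordAt(1.0, 1)` yields `0x3FF00000` and `floatWordAt(1.0, 2)` yields `0`, since 1.0 is encoded as `0x3FF0000000000000`.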

Levente
