Awe and horror.

Awe and horror.

Alan Grimes
As I begin to grasp the scope of the optimizations in the interpreter
and C-Translator, I am filled with a sense of awe. These tweaks were
obviously done with a deep knowledge of Smalltalk, C, the behavior of C
compilers, and assembler. At the same time, I'm filled with horror as I
realize the people who wrote these must have sold their souls, left
kidneys, and firstborn offspring in order to pull it off.

The VM, as it exists today, is adequate for conventional Squeak as
it is presently used. Unfortunately, however, the scope and nature of
its optimizations make it difficult to modify and extend. The
non-obvious things the C-translator does make VM hacking extremely
inhospitable to newcomers. For example, I would look at the class
definition and would think to myself: OK, these "instance" variables
will be put into a structure that can be instantiated by appropriate
calls to C, and the class variables will probably end up being static
variables in the same source file, thereby approximating the behavior
of Smalltalk classes.
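
To make that expectation concrete, here is a minimal C sketch of the mapping
described above. It shows the naive guess, not what the translator actually
emits. The class name Counter, its variables, and every function name are
hypothetical and do not come from the VM sources.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical Smalltalk class "Counter" with instance variables
   `count` and `step` and one class variable `InstancesMade`. */
typedef struct Counter {
    int count;
    int step;
} Counter;

/* Class variable as a file-scope static: one copy shared by all instances. */
static int instancesMade = 0;

/* "Counter new" becomes an allocation call. Returns NULL on failure. */
Counter *counterNew(int step)
{
    Counter *self = malloc(sizeof *self);
    if (self == NULL)
        return NULL;
    self->count = 0;
    self->step = step;
    instancesMade++;
    return self;
}

/* An instance method takes the receiver as an explicit first argument. */
int counterIncrement(Counter *self)
{
    assert(self != NULL);
    self->count += self->step;
    return self->count;
}

/* Class-side accessor for the shared variable. */
int counterInstancesMade(void)
{
    return instancesMade;
}
```

Under this scheme each Smalltalk instance would correspond to one malloc'ed
structure, which is the behaviour the next paragraph says the translator does
not provide.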

Instead, class variables are optimized as constants, and while instance
variables are put in a structure, there is no way to instantiate the
structure or manage such instances should they be created. Much, much
worse, however, is that the C translator will behave differently
depending on which class it is processing and apply special hacks where
it thinks nobody will notice. The hapless newbie probably won't
discover these hacks until he has read at least 500 messages...

I'm going to try to fork my own version of the vmmaker, intended to be
much more flexible and robust to experimentation, though perhaps a bit
slower... (The current version of my custom VM is half as fast but still
usable...)

The current issue that I'm having trouble with is figuring out exactly
how the CCode generator interfaces with the system-wide compiler. I am
especially curious as to how to implement scope-reduced variables
(variables declared within a block instead of a function).
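
For readers unsure what "scope-reduced" means at the C level, the two
functions below compute the same result and differ only in where the
temporary is declared. This is a sketch of the target C only, with made-up
function names. It says nothing about how the code generator would have to
change to emit it.

```c
#include <assert.h>

/* Function-scope declaration: `tmp` is visible for the whole function. */
int sumOfSquaresFunctionScope(const int *values, int n)
{
    int total = 0;
    int tmp;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        tmp = values[i] * values[i];
        total += tmp;
    }
    return total;
}

/* Block-scope declaration: `tmp` exists only inside the loop body. */
int sumOfSquaresBlockScope(const int *values, int n)
{
    int total = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        int tmp = values[i] * values[i];
        total += tmp;
    }
    return total;
}
```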

How exactly are syntax elements mapped onto the translation classes?

In any event, I hope to have an interesting variation on the VM
technology someday...


--
Friends don't let friends use GCC 3.4.4
GCC 3.3.6 produces code that's twice as fast on x86!

http://users.rcn.com/alangrimes/


Re: Awe and horror.

Brad Fuller
As you plod along, can you give us updates of things you find, fix or
improve? That'd help the rest of us!

brad


Re: Awe and horror.

johnmci
In reply to this post by Alan Grimes

On 1-Nov-05, at 12:41 PM, Alan Grimes wrote:

> Instead, class variables are optimized as constants, and while  
> instance
> variables are put in a structure

The instance variables are put into a structure because on PowerPC, if
they are ordinary file-scope variables (static or non-static), it takes
an extra memory load to dereference the data storage pointer before the
variable can be loaded or stored. Using a structure avoids that extra
load; this is why the structure is there. Testing on Intel-based and 68K
machines showed no impact, so along the way we made it the default,
although I think you can choose to turn it off.
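
The two layouts being compared can be sketched in C as follows. This is an
illustration under assumptions, not the generated interpreter source: the
field names and function names are invented, and foo and fum are only
placeholder names. Whether the second form saves a load depends on the
platform and compiler.

```c
/* Layout 1: separate file-scope variables (hypothetical names).
   Each variable may need its own address lookup before it is accessed. */
static int stackPointerPlain;
static int instructionPointerPlain;

void stepPlain(void)
{
    instructionPointerPlain += 1;
    stackPointerPlain += 4;
}

/* Layout 2: the same variables grouped into one structure. */
struct foo {
    int stackPointer;
    int instructionPointer;
};
static struct foo fum;

void stepStruct(void)
{
    /* One base address, then a fixed offset for each field. */
    register struct foo *foo = &fum;

    foo->instructionPointer += 1;
    foo->stackPointer += 4;
}

/* Accessors so the two layouts can be compared from outside. */
int plainStackPointer(void)  { return stackPointerPlain; }
int structStackPointer(void) { return fum.stackPointer; }
```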

a) Usage of
    register struct foo * foo = &fum;
ensures that on PowerPC the foo pointer gets into a register if and
only if two or more references are made to the structure.

b) Some variables are not in the structure because they require
initialization; this could be changed by having a method that actually
does the initialization.

c) Over the years, arrays have sometimes gone into or out of the
structure on PowerPC, depending on compiler behaviour.

d) Technically, on register-happy machines you could tell GCC to let
register 42 contain the foo pointer, provided of course that you ensure
all plugins are happy with that rule.

e) Inlining has a modification so that if an instance variable is used
in multiple routines, and those routines are then folded into a single
routine, the variable is consolidated into a locally scoped variable.
The main user of this logic is the GC, where variables are shared
between different methods, which makes the algorithms easy to write,
but all those methods are folded into a single C procedure. This change
made a significant improvement in GC performance on register-happy
machines.

f) The interpreter case loop has logic to scope local variable usage to
a particular case statement, rather than to the entire C procedure.
With variables scoped to individual case statements, most compilers are
much happier to do register optimizations.

g) Lastly, gnuifying alters the case statement to use jump tables, which
is much more efficient.

h) Using the C++ inline keyword and not inlining the VM has in the past
produced lousy performance.

i) The inliner uses some rules to decide whether small routines can be
inlined; otherwise it follows the hint from self inline: boolean, if the
routine could be inlined and does not fail some rule other than length.
In the past I did a patch to say "yes, do the inline anyway" for this
procedure; I am not sure whether that is still in vmmaker.

j) Compiler optimizations can do ugly things with common code
elimination and so on, such as dragging part of the common send logic
into many of the individual bytecode cases. Testing across (many) gcc
versions will show you which is the best compiler for your platform.
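
Points f) and g) can be illustrated with a toy interpreter. The sketch below
is not the Squeak interpreter and not the output of the gnuify step: the
three-instruction bytecode set and all names are invented. The first function
keeps its temporaries inside one case block. The second uses GCC's "labels as
values" extension to dispatch through a table of label addresses, and falls
back to the switch version on compilers that do not define __GNUC__.

```c
#include <assert.h>

/* Hypothetical three-instruction bytecode set, not Squeak's. */
enum { OP_PUSH1, OP_ADD, OP_HALT };

/* Portable form: a switch in which a case declares its own locals. */
int runSwitch(const unsigned char *code)
{
    int stack[16];
    int sp = 0;

    for (;;) {
        switch (*code++) {
        case OP_PUSH1:
            assert(sp < 16);
            stack[sp++] = 1;
            break;
        case OP_ADD: {
            /* `rhs` and `lhs` are scoped to this case only. */
            int rhs = stack[--sp];
            int lhs = stack[--sp];
            stack[sp++] = lhs + rhs;
            break;
        }
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        default:
            return -1; /* unknown bytecode */
        }
    }
}

#if defined(__GNUC__)
/* GCC extension: jump straight to the handler via a table of label
   addresses. Unlike the switch, this sketch does not check for
   unknown bytecodes. */
int runJumpTable(const unsigned char *code)
{
    static void *const table[] = { &&push1, &&add, &&halt };
    int stack[16];
    int sp = 0;

#define DISPATCH() goto *table[*code++]
    DISPATCH();
push1:
    assert(sp < 16);
    stack[sp++] = 1;
    DISPATCH();
add: {
        int rhs = stack[--sp];
        int lhs = stack[--sp];
        stack[sp++] = lhs + rhs;
    }
    DISPATCH();
halt:
    return sp > 0 ? stack[sp - 1] : 0;
#undef DISPATCH
}
#else
/* Fallback for compilers without labels as values. */
int runJumpTable(const unsigned char *code)
{
    return runSwitch(code);
}
#endif
```

Both functions should return the same value for the same program, so a
comparison between them is a cheap way to check a hand-written dispatch table
against the plain switch.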

--
===========================================================================
John M. McIntosh <[hidden email]> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================