[squeak-dev] Anyone know the following about Slang?


Eliot Miranda-2
Hi All,

    does anyone know (or even better has anyone fixed it) how hard it is to make Slang inline methods that contain simple type declarations?

I would like to eliminate compile-time integer/pointer mismatch errors in the new StackInterpreter I'm writing.  I'd like to say things like:

callerSPOf: theFP
    "Answer the SP of the caller provided theFP is not a base frame.
     This points to the hottest item on the frame's stack."
    self var: #theFP type: 'char *'.
    self returnTypeC: 'char *'.
    self assert: (self isBaseFrame: theFP) not.
    ^theFP + FoxCallerSavedIP + ((self frameNumArgs: theFP) + 2 * BytesPerWord)

but Slang refuses to inline anything that has C declarations.  I'm guessing that the issue is moving the type information from the method to its inlined form.  It took me half a day to discover where Slang refuses to inline (should have looked in the obvious place CCodeGenerator>>collectInlineList, instead of in the inlining code :/ ).  So I'm afraid to waste the time trying to find out where the restriction bites.  Anyone know how to fix this or better still have a fix?
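
For readers less familiar with Slang, here is a rough sketch of the C that the typed method above is asking for. It is only an illustration: the sqInt typedef, the value of FoxCallerSavedIP, and the toy frameNumArgs and isBaseFrame helpers are assumptions made up for this example, not the definitions in the VM sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the VM sources. */
typedef intptr_t sqInt;                    /* word-sized integer (assumed) */
enum { BytesPerWord = sizeof(sqInt) };
enum { FoxCallerSavedIP = BytesPerWord };  /* assumed offset, illustration only */

/* Toy frame layout: pretend the argument count is the word stored at theFP. */
static sqInt frameNumArgs(char *theFP) { return *(sqInt *)theFP; }
static int isBaseFrame(char *theFP) { (void)theFP; return 0; }

/* Pointer in, pointer out: callers need no integer/pointer casts. */
static char *callerSPOf(char *theFP)
{
    assert(!isBaseFrame(theFP));
    /* Smalltalk evaluates binary operators left to right, so
       "numArgs + 2 * BytesPerWord" means (numArgs + 2) * BytesPerWord. */
    return theFP + FoxCallerSavedIP + ((frameNumArgs(theFP) + 2) * BytesPerWord);
}
```

With the argument and return value both declared as char *, a use such as `sp = callerSPOf(fp);` compiles without a pointer/integer warning, which is the effect the var:type: and returnTypeC: declarations are meant to have.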

best
Eliot



Re: [squeak-dev] Anyone know the following about Slang?

Igor Stasenko
Doesn't 'self inline: true' help?
Or can't you work around that by coercing the value to the
corresponding type at the call site? Like:

newSP := self cCoerce: (self callerSPOf: blabla) to: 'char *'.
And don't touch the return type of the function.
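
In C terms, the coercion suggested here would look roughly like the sketch below. The sqInt typedef and both function names are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t sqInt;   /* assumed word-sized integer type */

/* Untyped translation: the frame pointer travels as a plain integer. */
static sqInt callerSPOfUntyped(sqInt theFP, sqInt offsetInBytes)
{
    return theFP + offsetInBytes;
}

/* Every use that needs a pointer has to coerce the result itself. */
static char *callerSPAtUse(char *localFP, sqInt offsetInBytes)
{
    return (char *)callerSPOfUntyped((sqInt)localFP, offsetInBytes);
}
```

The definition stays free of C declarations, so it remains inlinable, but each call site carries a cast.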


--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Anyone know the following about Slang?

Eliot Miranda-2


On Thu, Jul 3, 2008 at 5:48 PM, Igor Stasenko <[hidden email]> wrote:
> Doesn't 'self inline: true' help?

No.  Slang refuses to inline anything containing a C declaration (returnTypeC:, var:type:, cCode:inSmalltalk:, etc.)

> Or, can't you circumvent that by coercing a value to corresponding
> type at call site? Like:
>
> newSP := self cCoerce: (self callerSPOf: blabla) to: 'char *'.

That's worse than the disease :)  There are many more uses than definitions.  So I want the uses to look clean and I'll tolerate noisy definitions.  There are also argument types to consider.  localIP has type char * for example, so where it is used as an argument I want the argument type to be char * or void *, etc.

> And don't touch the return type of function.


Re: [squeak-dev] Anyone know the following about Slang?

Igor Stasenko
I guess I know why it refuses to inline methods with declarations:
you may write something like:

method: arg1

| foo |
self var: #foo declareInC: 'void **foo = malloc(arg1)'.

^ foo.

The inliner simply moves any temps into the enclosing method.
But here you have a situation where a C-style declaration and an
assignment are combined.
And ANSI C prohibits declaring variables anywhere but at the start of a
function body (or nested block). C++ allows it :)
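
A small C sketch of the problem, under the assumption that an inlined temp is hoisted to the top of the caller as described above. The function names are made up for this example, and the nested block is just one possible way to keep such a declaration legal in C89, not something Slang is known to do.

```c
#include <assert.h>
#include <stdlib.h>

/* Out-of-line form: the declaration with its initializer is the first thing
   in the function body, which C89 accepts. */
static void **method(size_t arg1)
{
    void **foo = malloc(arg1);
    return foo;
}

/* Hand-inlined form. Hoisting "void **foo = malloc(arg1)" to the top of the
   caller would call malloc before arg1 has its value; emitting it after the
   assignment would be a declaration after a statement, which C89 rejects.
   Opening a nested block keeps the declaration legal and in the right place. */
static void **callerWithInlinedBody(size_t count)
{
    size_t arg1;
    void **result;

    arg1 = count * sizeof(void *);
    {
        void **foo = malloc(arg1);   /* start of a block: valid C89 */
        result = foo;
    }
    return result;
}
```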

Try commenting out a line of code in collectInlineList:

hasCCode := false. "m declarations size > 0."

It's not safe, and you'll get a lot of compiler errors.

GCC will inline your method anyways, so why bother?

--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Anyone know the following about Slang?

Eliot Miranda-2


On Thu, Jul 3, 2008 at 6:19 PM, Igor Stasenko <[hidden email]> wrote:
> Try commenting out a line of code in collectInlineList:
>
> hasCCode := false. "m declarations size > 0."
>
> It's not safe, and you'll get a lot of compiler errors.


I did that.  I got lots of errors :)
 
> GCC will inline your method anyways, so why bother?

The current sources use the localIP, localSP, localHomeContext, localReturnContext, localReturnValue scheme to get important variables in registers.  The translator rips out methods it can't inline that refer to these.  So lots of methods end up getting deleted unless they're inlined.

The localFoo scheme may or may not be important for performance.  I don't want to rip it out until I can measure in a working VM whether it has any effect or not.  I want to measure my new VM, not the old one.  The benefit of having localFP (the frame pointer) in a register in my VM is likely to be quite high.

I only want the type declarations to eliminate warnings.  So it's easier to do without the declarations.  But I'd like to have my cake and eat it too, dammit :)
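
The localFoo scheme can be pictured with a toy interpreter loop like the one below. This is a minimal sketch, not the Squeak interpreter: the three bytecodes and the long-based stack are invented for the example. It shows why a method that mentions localIP or localSP only makes sense when its body is pasted into interpret(): the locals do not exist anywhere else.

```c
#include <assert.h>

/* Interpreter state as globals: visible to every function, so the compiler
   generally has to reload and store them around calls. */
static unsigned char *instructionPointer;
static long *stackPointer;

/* Tiny made-up bytecode set, for illustration only. */
enum { PushOne = 0, Add = 1, Halt = 2 };

static void interpret(void)
{
    /* Copy the hot globals into locals that can live in registers. */
    unsigned char *localIP = instructionPointer;
    long *localSP = stackPointer;

    for (;;) {
        switch (*localIP++) {
        case PushOne:
            *++localSP = 1;
            break;
        case Add:
            localSP[-1] += localSP[0];
            localSP--;
            break;
        case Halt:
        default:
            /* Write the locals back before leaving the loop. */
            instructionPointer = localIP;
            stackPointer = localSP;
            return;
        }
    }
}
```

Any helper that touches localIP or localSP has to be textually inlined into the loop; as a separate C function it would have nothing to refer to.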



best
Eliot



Re: [squeak-dev] Anyone know the following about Slang?

timrowledge
As mentioned, this is likely a bit of a precautionary restriction
implemented way back, probably by John Maloney when at Apple. Any
cCode stuff stops inlining.

Given the fairly small set of types (ab)used you might specialise to  
the form #declareCharStar: #foo etc and instead of catching the  
various forms of #var:declareC: etc in TMethod>recordDeclarations it  
would let them through. There'd be some ugly fixups elsewhere though,  
guaranteed.

It has to be said that the current state of the Slang translation is  
just insane. It was a fairly ugly hack to start with and has been  
mangled, folded, spindled and mutilated ever since. As an example  
gleaned whilst taking a quick look for a solution for you, consider
  - TMethod>inlineCaseStatementBranchesIn:localizingVars: and its use  
of #hasNoCCode and
  - CCodeGenerator>collectInlineList and its non-use of hasCCode but a  
mangled inline almost equivalent.

The textual inlining is terribly poorly factored and horribly hacked.  
It makes decisions based on nonsense metrics like how many nodes in a  
tree and are there more than an arbitrary magic number. The code  
probably won't work if you don't inline! At least it didn't a while  
back. The inlining/internalising of the bytecode loop is nasty,  
resulting in near duplicates of many methods for no very good reason.

My guess - and it is only a guess based on a few limited experiments  
ages ago - is that it would be smarter to drop the textual inlining  
completely, make the inline: pragma result in placement of a gcc  
__inline__ (or whatever the hell it is these days) on the function  
declaration line and let the compiler handle it.
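
As a sketch of that idea, the translator would emit an ordinary C function carrying an inline hint instead of pasting the method body into each caller. The helper names and the simplified one-bit SmallInteger tagging below are assumptions used only to give the functions something to do.

```c
#include <assert.h>

/* "static inline" is the C99 spelling; GCC also accepts __inline__.
   The compiler decides whether to expand the body at each call. */
static inline int isIntegerObject(long oop)  { return (int)(oop & 1); }
static inline long integerValueOf(long oop)  { return oop >> 1; }
static inline long integerObjectOf(long val) { return (val << 1) | 1; }

/* Callers are written as plain calls; no textual inlining is needed. */
static long addSmallIntegers(long a, long b)
{
    assert(isIntegerObject(a) && isIntegerObject(b));
    return integerObjectOf(integerValueOf(a) + integerValueOf(b));
}
```

The trade-off raised elsewhere in the thread still applies: functions emitted this way cannot refer to locals of interpret() such as localIP or localSP.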

Or better yet, completely rewrite the whole damn thing to do the job
properly. Invent a better Slang. Add those bitfield-handling
capabilities you need, and structures.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Fractured Idiom:- MONAGE A TROIS - I am three years old




Re: [squeak-dev] Anyone know the following about Slang?

Igor Stasenko
In reply to this post by Eliot Miranda-2
2008/7/4 Eliot Miranda <[hidden email]>:
>>
>> GCC will inline your method anyways, so why bother?
>
> The current sources use the localIP localSP localHomeContext
> localReturnContext localReturnValue scheme to get important variables in
> registers.  The translator rips out methods it can't inline that refer to
> these.  So lots of methods end up getting deleted unless they're inlined.

Well, then I see no way out other than to use sqInt everywhere and put
conversions at each place where they are needed.

Or you can rewrite the method above to assign to localSP directly,
without returning a value (also declare localFP in the same way as the
other localFoos):

setCallerSP
    "Set the SP of the caller provided localFP is not a base frame.
     This points to the hottest item on the frame's stack."
    self assert: (self isBaseFrame: localFP) not.
    localSP := localFP + FoxCallerSavedIP + ((self frameNumArgs: localFP) + 2 * BytesPerWord)

>
> The localFoo scheme may or may not be important for performance.  I don't
> want to rip it out until I can measure in a working VM whether it has any
> effect or not.  I want to measure my new VM not the old one  The benefit of
> having localFP (the frame pointer) in a register in my VM is likely to be
> quite high.
>

localFoo along with the gnuifier makes a big difference: about a 30% speedup.



--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Anyone know the following about Slang?

johnmci
In reply to this post by timrowledge
Some six years back I looked at C inlining and found it did a poor
job. Maybe that has changed.
Also, I had changed the inline logic to say: oh, if this method
shouldn't be inlined, we lie; please force an inlining based on the
method name.

At the time there were only two methods that seemed likely candidates,
so it was never put into VMMaker.


On Jul 3, 2008, at 7:38 PM, tim Rowledge wrote:

> As mentioned this is likely  a bit of precautionary restriction  
> implemented way back, probably by John Maloney when at Apple. Any  
> cCode stuff stops inlining.

--
===========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================




Re: [squeak-dev] Anyone know the following about Slang?

Igor Stasenko
In reply to this post by timrowledge
After a couple of months of hacking Slang to make Hydra work, I looked
for ways to overcome the shortcomings that translation to C
introduces.
As a result, I invented my own translation from the Smalltalk AST to
lambda message sends, and even started prototyping a C translation
backend for it. But then I thought: why the hell do I need to translate
things to C if I have Exupery on hand, with which I can control any
operation down to the smallest detail I need?
The first thing I made is native method inlining, which inlines
code based not on knowledge of variable names or their types but
on the AST, which makes it truly indifferent to what you want to
inline or perform. In essence, lambdas are an abstract algorithm
representation, so it is possible to translate them easily to any form
(be it C, native code or anything else) :)


--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Anyone know the following about Slang?

Eliot Miranda-2
In reply to this post by timrowledge


On Thu, Jul 3, 2008 at 7:38 PM, tim Rowledge <[hidden email]> wrote:
> As mentioned this is likely a bit of precautionary restriction implemented
> way back, probably by John Maloney when at Apple. Any cCode stuff stops
> inlining.
>
> Given the fairly small set of types (ab)used you might specialise to the
> form #declareCharStar: #foo etc and instead of catching the various forms of
> #var:declareC: etc in TMethod>recordDeclarations it would let them through.
> There'd be some ugly fixups elsewhere though, guaranteed.
>
> It has to be said that the current state of the Slang translation is just
> insane. It was a fairly ugly hack to start with and has been mangled,
> folded, spindled and mutilated ever since. As an example gleaned whilst
> taking a quick look for a solution for you, consider
>  - TMethod>inlineCaseStatementBranchesIn:localizingVars: and its use of
> #hasNoCCode and
>  - CCodeGenerator>collectInlineList and its non-use of hasCCode but a
> mangled inline almost equivalent.


Thanks Tim!  That's what I needed.  Being pointed to the right place.  It has taken 20 minutes to understand the code and 20 minutes to fix it.  Thanks so much!!

> The textual inlining is terribly poorly factored and horribly hacked. It
> makes decisions based on nonsense metrics like how many nodes in a tree and
> are there more than an arbitrary magic number. The code probably won't work
> if you don't inline! At least it didn't a while back. The
> inlining/internalising of the bytecode loop is nasty, resulting in near
> duplicates of many methods for no very good reason.
>
> My guess - and it is only a guess based on a few limited experiments ages
> ago - is that it would be smarter to drop the textual inlining completely,
> make the inline: pragma result in placement of a gcc __inline__ (or whatever
> the hell it is these days) on the function declaration line and let the
> compiler handle it.
>
> Or better yet, completely rewrite the whole damn thing to do the job
> properly. Invent a better Slang. Add those bitfield-handling capabilities
> you need, and structures.

Right.  Anyone brave enough should have a go at this.  Right now I can work within the existing system's limitations and so it's not on my critical path (thank goodness).

The way that Dave Simmons and I thought would be a great way to do it is to run a simulation of the VM in some context that allows one to collect concrete type information.  This could be e.g.

- a simulation of the simulation using ContextPart>>runSimulated:contextAtEachStep:, that captures the receiver type(s) at each send site, very slow being a simulation of a simulation

- a simulation above a VM that has polymorphic inline caches, extracting the type information from the PICs a la adaptive optimization/speculative inlining

The simulation has to be run with an exhaustive case that fully exercises the VM simulation so no code is left untouched.  This can be checked (but not verified) by seeing that one has collected type information for all send sites in the VM code. (Not verified because it doesn't prove that for a given send site a different execution would not introduce another type).

Applying a closed-world assumption to the VM code (it doesn't change once compiled as a C program) one can then transform the Smalltalk code, decorated with type information, into C.  Monomorphic send sites map to procedure calls and/or textual inlining.  <handwave>Polymorphic send sites transform to some form of case statement based on some sort of discriminated union, or perhaps a warning to the programmer to try again</handwave>.
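
To make the hand-waved part a little more concrete, here is a minimal C sketch of a polymorphic send site lowered to a case statement over a discriminated union. The Shape example and all of its names are hypothetical; nothing here comes from the VM sources or from an existing translator.

```c
#include <assert.h>

/* Hypothetical send site "shape area", observed with exactly two receiver types. */
typedef enum { SquareTag, RectTag } ShapeTag;

typedef struct {
    ShapeTag tag;                    /* discriminator taken from the collected types */
    union {
        struct { long side; } square;
        struct { long width, height; } rect;
    } as;
} Shape;

/* Monomorphic send sites: each becomes a direct call (or is inlined). */
static long squareArea(long side) { return side * side; }
static long rectArea(long w, long h) { return w * h; }

/* Polymorphic send site: a case statement over the discriminator. */
static long area(const Shape *s)
{
    switch (s->tag) {
    case SquareTag: return squareArea(s->as.square.side);
    case RectTag:   return rectArea(s->as.rect.width, s->as.rect.height);
    }
    assert(0 && "receiver type not seen when types were collected");
    return 0;
}
```

The final assert marks the gap the paragraph above mentions: a receiver type that never showed up while collecting type information has no branch.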

<hint weight="heavy">IMO THIS WOULD MAKE AN EXCELLENT MASTER'S OR PhD TOPIC!!</hint>


Re: [squeak-dev] Anyone know the following about Slang?

timrowledge

On 4-Jul-08, at 6:35 AM, Eliot Miranda wrote:
> [snip]
> Thanks Tim!  That's what I needed.  Being pointed to the right  
> place.  It has taken 20 minutes to understand the code and 20  
> minutes to fix it.  Thanks so much!!
Nice to have actually achieved something this week; it's been one of  
those weeks...

Simulating simulating the VM to gather type data seems like a pretty  
complex project. I can't help feeling it would be simpler and faster  
to simply write the VM cleanly, with decent documentation and specs.

Igor, if you can produce a Better Slang With Lambdas, do please share
the code. There has to be some way of cleaning up the current mess.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Oyster (n.), a person who sprinkles his conversation with Yiddishisms.




Re: [squeak-dev] Anyone know the following about Slang?

Eliot Miranda-2


On Fri, Jul 4, 2008 at 5:22 PM, tim Rowledge <[hidden email]> wrote:

> On 4-Jul-08, at 6:35 AM, Eliot Miranda wrote:
>> [snip]
>> Thanks Tim!  That's what I needed.  Being pointed to the right place.
>> It has taken 20 minutes to understand the code and 20 minutes to fix
>> it.  Thanks so much!!
> Nice to have actually achieved something this week; it's been one of
> those weeks...
>
> Simulating simulating the VM to gather type data seems like a pretty
> complex project. I can't help feeling it would be simpler and faster
> to simply write the VM cleanly, with decent documentation and specs.

But that's exactly what this project would allow one to do.  One implements a VM in Smalltalk as cleanly designed as one can.  If implementing a JIT then include a simulator for the processor.  Then simulate this written-in-Smalltalk VM and translate this clean design to (e.g.) C removing polymorphism and generating vanilla code based on the type information collected.


Re: [squeak-dev] Anyone know the following about Slang?

Colin Putney

On 4-Jul-08, at 7:35 PM, Eliot Miranda wrote:

> But that's exactly what this project would allow one to do.  One  
> implements a VM in Smalltalk as cleanly designed as one can.  If  
> implementing a JIT then include a simulator for the processor.  Then  
> simulate this written-in-Smalltalk VM and translate this clean  
> design to (e.g.) C removing polymorphism and generating vanilla code  
> based on the type information collected.

I think collecting type information via the simulation is both  
overkill and insufficient. On one hand, the VM code is immutable while  
it runs, unlike Smalltalk code. Gathering type feedback at runtime is  
a great technique because it can respond to changing code, changing  
usage or the need for reflection. But the VM doesn't need any of that  
- it just needs a static snapshot of the possible types that could be  
encountered at each call site.

On the other hand, simulation may not be able to provide that. It runs
the risk of not exercising all the code paths and therefore generating
incomplete type information. Even with a complete test case to run in
the simulator, there's quite a bit of indeterminism in the system.
When interrupts happen, when garbage is collected, where objects
happen to be allocated, all that can affect the code flow in the system.

I think a static type inferencer would be a better bet. There are  
several out there, and it probably wouldn't be tricky to press them  
into service. Chuck, for example, adds a "specific implementors"  
command to the browser that fits the bill perfectly. The VM isn't  
likely to be hugely polymorphic in any case, so it probably wouldn't  
be difficult to track down oddities that the inferencer turns up.

Colin


Re: [squeak-dev] Anyone know the following about Slang?

Eliot Miranda-2


Yes, good point.  The VM is a closed world so the inferencer should be able to do a good job.  Hints from the programmer on how to resolve perform: could close any loopholes.  A much better approach.



Re: [squeak-dev] Anyone know the following about Slang?

David T. Lewis
In reply to this post by Eliot Miranda-2
On Thu, Jul 03, 2008 at 05:25:51PM -0700, Eliot Miranda wrote:
> Hi All,
>
>     does anyone know (or even better has anyone fixed it) how hard it is to
> make Slang inline methods that contain simple type declarations?

Over the last 2 weeks I've been playing around with replacing the memory access
macros in sqMemoryAccess.h with equivalent Smalltalk slang. Along the way I found
and fixed several slang issues that prevented inlining of methods with C
declarations, including the one related to case generation for the interpret()
loop. I'm in a rush right now (just happened to notice this thread) but I'll post
the results in a day or two when I have some free time.

btw, I do have the memory access methods working now, and with all the slang
inlining applied I'm within about 20% of the performance of the interpreter
using C macros. It's surprising how many little bugs got flushed out in the
process, although I suspect there is not a great deal of demand for a VM that
features a 20% performance degradation ;)
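
For context, the difference between the two styles of memory access looks roughly like this in C. It is only a sketch: the names longAtMacro, longAtFunction and longAtPutFunction are invented here and are not the definitions in sqMemoryAccess.h.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t sqInt;   /* assumed word-sized integer type */

/* Macro style: the preprocessor pastes the expression at every use,
   with no type checking of the argument. */
#define longAtMacro(ptr) (*(sqInt *)(ptr))

/* Function style, as a translator might generate it from a Smalltalk method.
   It matches the macro's speed only when the call is actually inlined. */
static inline sqInt longAtFunction(char *ptr)
{
    return *(sqInt *)ptr;
}

static inline sqInt longAtPutFunction(char *ptr, sqInt value)
{
    return *(sqInt *)ptr = value;
}
```

The function versions are type checked and can be simulated in Smalltalk, but every access that ends up as a real call instead of an inlined load or store costs time in the interpreter loop.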

Dave



Re: [squeak-dev] Anyone know the following about Slang?

Igor Stasenko
Is there a chance that the degradation is because you broke the gnuifier step?
I experienced the same degradation when the gnuifier failed to find and
replace text in interp.c,
so gnu-interp.c was just a copy of interp.c without modifications.

If not, then we should blame GCC for being incapable of inlining simple things :)


> Dave
>



--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Anyone know the following about Slang?

David T. Lewis
On Sat, Jul 05, 2008 at 10:56:19PM +0300, Igor Stasenko wrote:

> 2008/7/5 David T. Lewis <[hidden email]>:
> > btw, I do have the memory access methods working now, and with all the slang
> > inlining applied I'm within about 20% of the performance of the interpreter
> > using C macros. It's surprising how many little bugs got flushed out in the
> > process, although I suspect there is not a great deal of demand for a VM that
> > features a 20% performance degradation ;)
> >
>
> Is there a chance that the degradation is because you broke the gnuifier step?
> I experienced the same degradation when the gnuifier failed to find and
> replace text in interp.c,
> so gnu-interp.c was just a copy of interp.c without modifications.
>
> If not, then we should blame GCC for being incapable of inlining simple things :)

No, it's certainly nothing to blame on gcc or gnuify. I'm using Ian's build tools
without modification. I just rewrote the memory access routines as Smalltalk translated
to C. It was a big step just getting it to work at all (simple though it might sound).

When I first got something working in the form of methods translated to simple C
functions, I was seeing perhaps 40% of the performance of the normal access routines
(CPP macros or static inline functions). By gradually convincing the slang translator
to inline these functions, and by fixing up the various unintended side effects, I'm
now up to the point where I've recovered perhaps 80% of the performance of the
original CPP macros.
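
To make the forms being compared concrete, here is a rough C sketch of one memory read written as a CPP macro, as a static inline function, as a plain translated function, and as code where the translator has already pasted the body into the caller. Every name in it (heap, LONG_AT_MACRO, longAtInline, longAtFunction, sumFirstTwoInlined) is made up for illustration and is not taken from sqMemoryAccess.h or from the change sets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object memory; the real VM allocates this at startup. */
static int32_t heap[4] = {10, 20, 30, 40};

/* 1. CPP macro: expanded textually at every use site. */
#define LONG_AT_MACRO(byteOffset) (*(int32_t *)((char *)heap + (byteOffset)))

/* 2. Static inline function: type-checked, and usually inlined by the C compiler. */
static inline int32_t longAtInline(size_t byteOffset) {
    return *(int32_t *)((char *)heap + byteOffset);
}

/* 3. Plain function: roughly what a method translated without inlining becomes. */
int32_t longAtFunction(size_t byteOffset) {
    return *(int32_t *)((char *)heap + byteOffset);
}

/* A caller that still pays for two function calls. */
int32_t sumFirstTwoCalls(void) {
    return longAtFunction(0) + longAtFunction(4);
}

/* 4. The same caller after translator-level inlining: the accessor bodies
      have been pasted in, so no call remains in the generated C. */
int32_t sumFirstTwoInlined(void) {
    int32_t first;
    int32_t second;
    first = *(int32_t *)((char *)heap + 0);
    second = *(int32_t *)((char *)heap + 4);
    return first + second;
}
```

All four forms read the same word; they differ only in whether a call survives into the generated C, which is the part the translator's inlining controls.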

I'm pretty happy to have gotten this far with it. I do not yet know if it will be
possible to achieve the performance of the CPP macros, maybe yes maybe no. I'm not
being very scientific about this, just running '0 tinyBenchmarks' as I proceed.

To be clear, the inlining I was referring to is the inline performed by the
Smalltalk to C slang translation process, nothing to do with the later gnuify
phase or anything that gcc might do with it afterwards.
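
For readers unfamiliar with the gnuify phase: as far as I understand it, it rewrites the generated switch-based bytecode dispatch to use GCC's computed goto extension, which is a separate matter from the translator's inlining. The sketch below only shows the general shape of the two dispatch styles; the opcodes and function names are invented and this is not the real interp.c or gnu-interp.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-instruction bytecode set. */
enum { OP_PUSH1 = 0, OP_ADD = 1, OP_HALT = 2 };

/* Plain switch dispatch, the portable shape of a generated interpreter loop.
   The sketch assumes well-formed programs that end in OP_HALT. */
int runSwitch(const unsigned char *code) {
    int stack[16];
    int sp = 0;
    size_t pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH1: stack[sp++] = 1; break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
        case OP_HALT:  return sp > 0 ? stack[sp - 1] : 0;
        default:       return -1;   /* unknown bytecode */
        }
    }
}

#if defined(__GNUC__)
/* Computed-goto dispatch (GCC "labels as values" extension): each handler
   jumps straight to the next handler instead of returning to the switch.
   There is no default case, so only valid opcodes may be executed. */
int runThreaded(const unsigned char *code) {
    static void *jumpTable[] = { &&push1, &&add, &&halt };
    int stack[16];
    int sp = 0;
    size_t pc = 0;
    goto *jumpTable[code[pc++]];
push1:
    stack[sp++] = 1;
    goto *jumpTable[code[pc++]];
add:
    sp--;
    stack[sp - 1] += stack[sp];
    goto *jumpTable[code[pc++]];
halt:
    return sp > 0 ? stack[sp - 1] : 0;
}
#endif
```

If the rewrite step silently does nothing, the build still succeeds with the plain switch form, which matches the symptom Igor describes.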


Dave



Re: [squeak-dev] Anyone know the following about Slang?

David T. Lewis
In reply to this post by Eliot Miranda-2
On Thu, Jul 03, 2008 at 05:25:51PM -0700, Eliot Miranda wrote:
> Hi All,
>
>     does anyone know (or even better has anyone fixed it) how hard it is to
> make Slang inline methods that contain simple type declarations?

Eliot, the attached change sets are a snapshot of some tinkering I've been doing
that relates to Slang generation of inlined C code and declarations. I'm sorry it's
not tidied up better, but the short change sets with bug fixes may be of some use
to you in their current form. I do plan to post the relevant bug fixes on Mantis
some time soon.

HTH,

- Dave

------
"Change Set: PermitInliningCCode-dtl
Date: 6 July 2008
Author: David T. Lewis

Honor a 'self inline: true' request regardless of whether the method contains C code or declarations. Includes a change to case statement generation to support this inlining."!
------

"Change Set: CaseGenerationFixes-dtl
Date: 6 July 2008
Author: David T. Lewis

The main loop of the interpreter dispatches on bytecodes, with the case targets generated by TCaseStmtNode. There are a number of bytecode methods in the interpreter that make assumptions about slang generation of the case targets in order to support early fetching of the next bytecode. This change set adds #flag: markers to identify the interpreter methods that make this assumption, and adds a check to TCaseStmtNode to force the expected slang code generation regardless of the size of the parse tree for the case target."!
------
"Change Set: TMethod-renameVarsFix-dtl
Date: 4 July 2008
Author: David T. Lewis

Fix bug in TMethod>>renameVariablesUsing: that caused incorrect C variable declarations for temporary variables that are renamed during slang inlining."!

------
'From Squeak3.9 of 7 November 2006 [latest update: #7067] on 6 July 2008 at 10:15:15 pm'!
"Change Set: TMethod-TParseNode-comments-dtl
Date: 6 July 2008
Author: David T. Lewis

Class comments, and categorization of some TParseNode methods"!
------
'From Squeak3.9 of 7 November 2006 [latest update: #7067] on 6 July 2008 at 10:09:15 pm'!
"Change Set: MemoryAccess-unresolvedBugs-dtl
Date: 3 July 2008
Author: David T. Lewis

Interpreter>>loadFloatOrIntFrom: should be inlined, but there is now a bug in temp variable handling that prevents this. It is probably a pre-existing bug that was masked by the fact that methods with C declarations previously could not be inlined. I'm saving it here in this change set in case I feel motivated later to try to fix the bug."!
------
'From Squeak3.9 of 7 November 2006 [latest update: #7067] on 6 July 2008 at 10:26:02 pm'!
"Change Set: MemoryAccess-dtl
Date: 29 June 2008
Author: David T. Lewis

MemoryAccess defines the low level mapping of object memory addresses to the underlying machine address space.

Prerequisite change sets:
  PermitInliningCCode-dtl
  CaseGenerationFixes-dtl      
  TMethod-renameVarsFix-dtl
  TMethod-TParseNode-comments-dtl
  MemoryAccess-unresolvedBugs-dtl

This is intended to replace the traditional external definitions in sqMemoryAccess.h
"!




PermitInliningCCode-dtl.2.cs.gz (2K)
CaseGenerationFixes-dtl.4.cs.gz (2K)
TMethod-renameVarsFix-dtl.4.cs.gz (1K)
TMethod-TParseNode-comments-dtl.1.cs.gz (1012 bytes)
MemoryAccess-unresolvedBugs-dtl.4.cs.gz (958 bytes)
MemoryAccess-dtl.10.cs.gz (11K)
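
As a guess at the kind of problem the TMethod-renameVarsFix-dtl description refers to, the sketch below shows by hand what inlining has to get right when a callee's typed temporary clashes with a caller's temporary of the same name. The function names, the variable names and the ptr2 renaming scheme are all hypothetical; the actual output of the translator may look different.

```c
#include <assert.h>

/* Hypothetical object memory. */
static char memory[8] = "abcdefg";

/* Callee as a standalone function: its temp 'ptr' is declared 'char *'. */
static char byteAtCallee(long offset) {
    char *ptr;
    ptr = memory + offset;
    return *ptr;
}

/* Caller without inlining, for comparison. */
long sumTwoBytesCalls(long first, long second) {
    return byteAtCallee(first) + byteAtCallee(second);
}

/* Caller after hand-simulated inlining. The caller already owns an integer
   temp named 'ptr', so the callee's temp is renamed to 'ptr2'. The 'char *'
   declaration has to follow the rename; if it stayed attached to the old
   name, 'ptr2' would fall back to an integer type and the dereference
   below would not compile. */
long sumTwoBytesInlined(long first, long second) {
    long ptr;      /* caller's own temp */
    char *ptr2;    /* callee's temp, renamed, keeping its declared type */
    char a;
    char b;
    ptr = first;
    ptr2 = memory + ptr;
    a = *ptr2;
    ptr = second;
    ptr2 = memory + ptr;
    b = *ptr2;
    return a + b;
}
```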

Re: [squeak-dev] Anyone know the following about Slang?

David T. Lewis
In reply to this post by timrowledge
On Thu, Jul 03, 2008 at 07:38:27PM -0700, tim Rowledge wrote:
> The inlining/internalising of the bytecode loop is nasty,  
> resulting in near duplicates of many methods for no very good reason.

It might be nasty, but it's done that way for a good (albeit poorly
documented) reason. The near-duplicate case targets are written that
way to enable early fetching of the next bytecode, which presumably
carries a performance benefit on some machines.
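
A rough C sketch of what early fetching means, assuming a tiny made-up VM rather than the real interpreter: when one handler serves a range of bytecodes it needs currentBytecode until the end, so the next fetch has to wait; when each case is expanded with its bytecode value as a constant, the fetch can be issued first. All names below (receiverVars, runShared, runExpanded) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature VM state. Bytecodes 0..3 push a receiver
   variable; any other bytecode halts and answers the top of stack.
   Programs are assumed to be short and to end in a halting bytecode. */
static int receiverVars[4] = {100, 101, 102, 103};
static int stack[16];
static int sp;

/* Shared handler: currentBytecode is still needed to pick the variable,
   so the next bytecode can only be fetched after its last use. */
int runShared(const unsigned char *code) {
    size_t pc = 0;
    unsigned char currentBytecode = code[pc++];
    sp = 0;
    for (;;) {
        switch (currentBytecode) {
        case 0: case 1: case 2: case 3:
            stack[sp++] = receiverVars[currentBytecode & 3];
            currentBytecode = code[pc++];
            break;
        default:
            return sp > 0 ? stack[sp - 1] : -1;
        }
    }
}

/* Expanded handlers: each case is a near duplicate with the bytecode
   value folded in as a constant, so the fetch can come first. */
int runExpanded(const unsigned char *code) {
    size_t pc = 0;
    unsigned char currentBytecode = code[pc++];
    sp = 0;
    for (;;) {
        switch (currentBytecode) {
        case 0: currentBytecode = code[pc++]; stack[sp++] = receiverVars[0]; break;
        case 1: currentBytecode = code[pc++]; stack[sp++] = receiverVars[1]; break;
        case 2: currentBytecode = code[pc++]; stack[sp++] = receiverVars[2]; break;
        case 3: currentBytecode = code[pc++]; stack[sp++] = receiverVars[3]; break;
        default:
            return sp > 0 ? stack[sp - 1] : -1;
        }
    }
}
```

Both loops compute the same result; the expanded one simply gives the hardware a head start on loading the next bytecode, at the cost of repeated case bodies.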

Dave



Re: [squeak-dev] Anyone know the following about Slang?

Eliot Miranda-2
In reply to this post by David T. Lewis
David,

    thanks a lot!  I have some integration to do :)   Not least I want to integrate the pragma support in 3.9 back into 3.8 because using pragmas instead of the null statements (self var: #foo type: #barf) etc is so much nicer.  I also have to produce bootstraps for the closure compiler in a few images, e.g. Croquet 1.0, 3.9, 3.10 & Spoon.  Once I get the Croquet bootstrap done I would welcome volunteers for the others.

cheers!

P.S.  Brilliant British GP today!! (just finished watching it).  Lewis 1st and Rubens 3rd in the Honda.  Fantastic!

On Sun, Jul 6, 2008 at 8:19 PM, David T. Lewis <[hidden email]> wrote:
On Thu, Jul 03, 2008 at 05:25:51PM -0700, Eliot Miranda wrote:
> Hi All,
>
>     does anyone know (or even better has anyone fixed it) how hard it is to
> make Slang inline methods that contain simple type declarations?

Eliot, the attached change sets are a snapshot of some tinkering I've been doing
that relates to Slang generation of inlined C code and declarations. I'm sorry it's
not tidied up better, but the short change sets with bug fixes may be of some use
to you in their current form. I do plan to post the relevant bug fixes on Mantis
some time soon.

HTH,

- Dave

------
"Change Set:            PermitInliningCCode-dtl
Date:                   6 July 2008
Author:                 David T. Lewis

Honor a 'self inline: true' request regardless of whether the method contains C code or declarations. Includes a change to case statement generation to support this inlining."!
------

"Change Set:            CaseGenerationFixes-dtl
Date:                   6 July 2008
Author:                 David T. Lewis

The main loop of the interpreter dispatches on bytecodes, with the case targets generated by TCaseStmtNode. There are a number of bytecode methods in the interpreter that make assumptions about slang generation of the case targets in order to support early fetching of the next bytecode. This change set adds #flag: markers to identify the interpreter methods that make this assumption, and adds a check to TCaseStmtNode to force the expected slang code generation regardless of the size of the parse tree for the case target."!
------
"Change Set:            TMethod-renameVarsFix-dtl
Date:                   4 July 2008
Author:                 David T. Lewis

Fix bug in TMethod>>renameVariablesUsing: that caused incorrect C variable declarations for temporary variables that are renamed during slang inlining."!

------
'From Squeak3.9 of 7 November 2006 [latest update: #7067] on 6 July 2008 at 10:15:15 pm'!
"Change Set:            TMethod-TParseNode-comments-dtl
Date:                   6 July 2008
Author:                 David T. Lewis

Class comments, and categorization of some TParseNode methods"!
------
'From Squeak3.9 of 7 November 2006 [latest update: #7067] on 6 July 2008 at 10:09:15 pm'!
"Change Set:            MemoryAccess-unresolvedBugs-dtl
Date:                   3 July 2008
Author:                 David T. Lewis

Interpreter>>loadFloatOrIntFrom: should be inlined, but there is now a bug in temp variable handling that prevents this. It is probably a pre-existing bug that was masked by the fact that methods with C declarations previously could not be inlined. I'm saving it here in this change set in case I feel motivated later to try to fix the bug."!
------
'From Squeak3.9 of 7 November 2006 [latest update: #7067] on 6 July 2008 at 10:26:02 pm'!
"Change Set:            MemoryAccess-dtl
Date:                   29 June 2008
Author:                 David T. Lewis

MemoryAccess defines the low level mapping of object memory addresses to the underlying machine address space.

Prerequisite change sets:
 PermitInliningCCode-dtl
 CaseGenerationFixes-dtl
 TMethod-renameVarsFix-dtl
 TMethod-TParseNode-comments-dtl
 MemoryAccess-unresolvedBugs-dtl

This is intended to replace the traditional external definitions in sqMemoryAccess.h
"!



