Making a Slower VM

Sean P. DeNigris
Administrator
We often talk about making the VM faster. How about making it slower? In 1980, there were some optimizations that were needed for Smalltalk to be even usable, but now:
- Moore's Law has theoretically given us 131072 times more computing power (2^((2014-1980)/2) = 2^17)
- Cog runs up to 3x slower than C [1]
- Ruby, which is widely accepted, seems to be much slower than Cog [2]
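
The Moore's Law figure in the first bullet can be checked in a workspace. This is a minimal sketch, assuming one doubling every two years between 1980 and 2014; the variable names are only illustrative.

```smalltalk
| doublings factor |
"One doubling every two years from 1980 to 2014."
doublings := (2014 - 1980) / 2.    "34 / 2 reduces to the Integer 17"
factor := 2 raisedTo: doublings.   "2^17"
Transcript show: factor printString; cr.   "shows 131072"
```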

For example, inlined functions can be baffling for new users. I just ran into this myself when writing an #ifNil:ifNotNil: that was not picked up by the system [3], and Ungar and Smith describe several cases in the History of Self (pg. 9-5).

How many of these are premature optimizations that can be eliminated, or at least turned off by default until they're actually needed? I know Clement mentioned in [3] that some make a big difference, but it would certainly make the system more uniform and easy to understand.

[1] http://lists.gforge.inria.fr/pipermail/pharo-project/2011-February/042489.html
[2] http://benchmarksgame.alioth.debian.org/u32/benchmark.php?test=all&lang=yarv&lang2=gcc
[3] https://www.mail-archive.com/pharo-dev@lists.pharo.org/msg11694.html
Cheers,
Sean

Re: Making a Slower VM

Clément Béra
 
Hello Sean,

It's true that the Ruby interpreter and CPython are around 20x slower than Cog. But the use cases are different.

Firstly, their tool suites are not written in Ruby/Python, so they do not need speed to have a good IDE. For example, look at the new SqueakJS VM: because it is slower, Morphic is hardly usable, so they had to fall back on the old and fast MVC UI. We do not want to have to do that in Pharo/Squeak.

In addition, Ruby/Python work well thanks to their good integration with C, because a Ruby/Python programmer needs to bind performance-critical methods to C functions. In most cases, we do not bind performance-critical methods in Pharo/Squeak to C functions because we don't need to, and I don't think we want to.

So I wouldn't say that we could have Pharo/Squeak running 20x slower and still be happy.

One thing that you didn't mention is the Stack VM. This interpreter-based VM is less efficient than Cog (2x-10x slower) but much more flexible IMO. For example, overriding the interpretation of each message send or adding new bytecodes is quite easy. So in a sense we already have a slower, more flexible VM.

In addition, the Opal compiler's options allow disabling some of the optimized constructs you mention, but this is static information given through pragmas.

Disabling the inlining of conditionals would slow Pharo/Squeak down by 2.5x according to Urs Hölzle's PhD thesis, but recent attempts showed that the speed problem is even worse, because the kernel was optimized on the assumption that these constructs were inlined.
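
One rough way to get a feel for the cost of non-inlined conditionals is to time the same loop twice in a workspace. This is only a sketch under assumptions: it uses perform: to force a real send of #ifTrue:ifFalse:, which adds overhead of its own and needs real blocks, so the ratio it shows is not the 2.5x figure cited above. The iteration count n is arbitrary.

```smalltalk
| n inlinedMs sentMs |
n := 1000000.   "arbitrary iteration count"

"Literal block arguments: the compiler turns this ifTrue:ifFalse: into jump bytecodes."
inlinedMs := Time millisecondsToRun: [
    1 to: n do: [:i | i odd ifTrue: [1] ifFalse: [2]]].

"perform: bypasses the compiler's inlining, so ifTrue:ifFalse: is looked up and sent."
sentMs := Time millisecondsToRun: [
    1 to: n do: [:i | i odd perform: #ifTrue:ifFalse: with: [1] with: [2]]].

Transcript
    show: 'inlined: ', inlinedMs printString, ' ms'; cr;
    show: 'sent: ', sentMs printString, ' ms'; cr.
```

The absolute numbers depend heavily on the VM (Cog, Stack VM, plain interpreter) and on the machine, so they are best read as a relative comparison on one setup.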

Solution for this problem

As you may have seen, the Self VM does not have these optimized constructs, and that is not because it accepts being slower but because it has an adaptive recompiler. Currently I am working with Eliot on speculative inlining and other optimizations for Cog. You can see a description of the project here: http://clementbera.wordpress.com/2014/01/09/the-sista-chronicles-i-an-introduction-to-adaptive-recompilation/ . I wrote that article quickly so there might be some typos and English errors, but overall it should be OK. This is a big project, so a production-ready result is several months away, perhaps even a few years.

This will allow us both to increase Cog's performance and to reduce the code complexity caused by optimizations with inlined constructs.

Precise solutions need to be discussed and benchmarked, but since their performance impact will be lowered, we could have:
- ifNil:/ifNotNil: not inlined.
- all the specific messages as regular message sends in all cases (including #==): #(#+ 1 #- 1 #< 1 #> 1 #<= 1 #>= 1 #= 1 #~= 1 #* 1 #/ 1 #\\ 1 #@ 1 #bitShift: 1 #// 1 #bitAnd: 1 #bitOr: 1 #at: 1 #at:put: 2 #size 0 #next 0 #nextPut: 1 #atEnd 0 #== 1 nil 0 #blockCopy: 1 #value 0 #value: 1 #do: 1 #new 0 #new: 1 #x 0 #y 0)

Loop methods (whileTrue:, to:do:) are usually not a problem. The only constraint is that you cannot override these 4 methods (SmallInteger>>to:do:, SmallInteger>>to:by:do:, BlockClosure>>#whileTrue:, BlockClosure>>#whileFalse), but you can override these selectors in your own objects. If you want to override one of these methods, there's no simple solution without a performance cost (one solution is to rewrite them as primitives and stop inlining them in the compiler, but even then we'll have some performance cost that needs to be measured).

ifTrue:ifFalse: is the most complex case. I know Eliot has a plan for it. You can look at the video at the bottom of the Sista article, where at the end Eliot explains AoSta (the ancestor of Sista) and mentions something about mustBeBoolean.

Best,

Clément





Re: Making a Slower VM

Ben Coman
In reply to this post by Sean P. DeNigris
 
Sean P. DeNigris wrote:

>  
> We often talk about making the VM faster. How about making it slower? In
> 1980, there were some optimizations that were needed for Smalltalk to be
> even usable, but now:
> - Moore's Law has theoretically given us 131072 more computing power
> (2^((2014-1980)/2))
> - Cog runs up to 3x slower than C [1]
> - Ruby, which is widely accepted, seems to be much slower than Cog [2]
>
> For example, inlined functions can be baffling for new users.

Not VM related but it sparks a random idea - how about syntax
highlighting inlined messages with a different colour?
cheers -ben



Re: Making a Slower VM

Bert Freudenberg
In reply to this post by Sean P. DeNigris
 
On 09.02.2014, at 05:37, Sean P. DeNigris <[hidden email]> wrote:

> We often talk about making the VM faster. How about making it slower? In
> 1980, there were some optimizations that were needed for Smalltalk to be
> even usable, but now:
> - Moore's Law has theoretically given us 131072 more computing power
> (2^((2014-1980)/2))
> - Cog runs up to 3x slower than C [1]
> - Ruby, which is widely accepted, seems to be much slower than Cog [2]
>
> For example, inlined functions can be baffling for new users. I just ran
> into this myself when writing an #ifNil:ifNotNil: that was not picked up by
> the system [3], and Ungar and Smith describe several cases in the History of
> Self (pg. 9-5).
>
> How many of these are premature optimizations that can be eliminated, or at
> least turned off by default until they're actually needed? I know Clement
> mentioned in [3] that some make a big difference, but it would certainly
> make the system more uniform and easy to understand.
This is not a VM problem. The compiler is doing the inlining, not the VM. The VM just executes what it is told to.

If the VM encounters an #ifNil:ifNotNil: send, it will faithfully do a method lookup and execute that. It will even do that if it sees an #ifTrue:. There is no short-circuiting of actual message sends in the VM.

What *does* happen is that the Compiler replaces an #ifNil:ifNotNil: send with "== nil ifTrue:ifFalse:" and then compiles the latter into jump bytecodes. That means the VM never sees the original #ifNil:ifNotNil: message.
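
The effect described above can be observed from a workspace. This is a minimal sketch, assuming a Squeak or Pharo image of that era in which the compiler inlines ifNil:ifNotNil: and the classic class-definition message is available; the class NilLike and its category are hypothetical names created only for this illustration.

```smalltalk
| cls inlined sent |
"Hypothetical class that tries to override ifNil:ifNotNil:."
cls := Object subclass: #NilLike
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Scratch-Inlining'.
cls compile: 'ifNil: nilBlock ifNotNil: notNilBlock
    ^ #overridden'.

"Literal block arguments: the compiler emits an == nil test plus jumps, so the override is never looked up."
inlined := cls new ifNil: [#wasNil] ifNotNil: [:x | #wasNotNil].

"perform: is a real message send, so the VM does the lookup and finds the override."
sent := cls new perform: #ifNil:ifNotNil: with: [#wasNil] with: [:x | #wasNotNil].

Transcript show: inlined printString; cr; show: sent printString; cr.
```

With inlining enabled, the first expression is expected to answer #wasNotNil and the second #overridden, which is the kind of surprise the original post describes.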

It is pretty simple to turn off the Compiler's inlining of ifNil:ifNotNil:. It should also be pretty simple to make ifTrue:/ifFalse: be actual message sends, although I would expect a pretty big slowdown since it will need real blocks. But at least their Smalltalk implementation is "executable". It's harder for whileTrue:/whileFalse: because if you wanted to implement them with real messages you would need tail call optimization, which Smalltalk VMs don't usually do. Hence the implementation in the image that relies on compiler inlining.
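
To see why a loop built from real sends needs either tail call elimination or a growing stack, here is a minimal sketch that loops by recursion. The selector #sendWhileTrue: is hypothetical and is compiled onto BlockClosure only for illustration; it is not how the image implements #whileTrue:. The ifTrue:ifFalse: inside it is still inlined by the compiler.

```smalltalk
| i |
"Hypothetical selector, added for this illustration only."
BlockClosure compile: 'sendWhileTrue: aBlock
    "Loop by recursion. Without tail call elimination, every iteration leaves one more activation on the stack until the loop ends."
    ^ self value
        ifTrue: [aBlock value. self sendWhileTrue: aBlock]
        ifFalse: [nil]'.

i := 0.
[i < 1000] sendWhileTrue: [i := i + 1].
Transcript show: i printString; cr.   "shows 1000"
```

A thousand iterations are harmless, but the number of live contexts grows linearly with the iteration count, which is the memory concern for long-running loops.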

- Bert -




Re: Making a Slower VM

Colin Putney-3
 



On Sun, Feb 9, 2014 at 9:08 AM, Bert Freudenberg <[hidden email]> wrote:
 
> It is pretty simple to turn off the Compiler's inlining of ifNil:ifNotNil:. It should also be pretty simple to make ifTrue:/ifFalse: be actual message sends, although I would expect a pretty big slowdown since it will need real blocks. But at least their Smalltalk implementation is "executable". It's harder for whileTrue:/whileFalse: because if you wanted to implement them with real messages you would need tail call optimization, which Smalltalk VMs don't usually do. Hence the implementation in the image that relies on compiler inlining.

Well, since we're talking about de-optimizing here, you *could* do #whileTrue: without optimizing tail calls. It's just that it would be really slow, especially if you wanted to guard against run-away memory use for loops with lots of iterations. If you want to make things slower, the sky's the limit!

Colin


Re: Making a Slower VM

David T. Lewis
In reply to this post by Sean P. DeNigris
 
On Sat, Feb 08, 2014 at 08:37:46PM -0800, Sean P. DeNigris wrote:
>  
> We often talk about making the VM faster. How about making it slower?

We do not usually get too many requests to make the VM slower, what a
refreshing change of perspective ;-)

http://www.ispot.tv/ad/Y94D/xfinity-internet-traffic-featuring-bill-and-karolyn-slowsky

Joking aside, there actually is one legitimate reason for wanting a slow VM.
With high performance VMs and with ever faster hardware, it is very easy to
implement sloppy things in the image that go unnoticed until someone runs the
image on an old machine or on limited hardware. It is sometimes useful to
test on old hardware or on a slow VM to check for this.

I think someone mentioned it earlier, but a very easy way to produce an
intentionally slow VM is to generate the sources from VMMaker with the
inlining step disabled. The slang inliner is extremely effective, and turning
it off produces impressively sluggish results.

Dave


Re: Making a Slower VM

Bert Freudenberg
In reply to this post by Colin Putney-3
 
On 09.02.2014, at 17:49, Colin Putney <[hidden email]> wrote:

> On Sun, Feb 9, 2014 at 9:08 AM, Bert Freudenberg <[hidden email]> wrote:
>
>> It is pretty simple to turn off the Compiler's inlining of ifNil:ifNotNil:. It should also be pretty simple to make ifTrue:/ifFalse: be actual message sends, although I would expect a pretty big slowdown since it will need real blocks. But at least their Smalltalk implementation is "executable". It's harder for whileTrue:/whileFalse: because if you wanted to implement them with real messages you would need tail call optimization, which Smalltalk VMs don't usually do. Hence the implementation in the image that relies on compiler inlining.
>
> Well, since we're talking about de-optimizing here, you *could* do #whileTrue: without optimizing tail calls. It's just that it would be really slow, especially if you wanted to guard against run-away memory use for loops with lots of iterations. If you want to make things slower, the sky's the limit!

True. But in any case you would have to touch the implementation, otherwise you just get an infinite recursion :)

- Bert -



Re: Making a Slower VM

timrowledge
In reply to this post by David T. Lewis


On 09-02-2014, at 10:07 AM, David T. Lewis <[hidden email]> wrote:
> Joking aside, there actually is one legitimate reason for wanting a slow VM.
> With high performance VMs and with ever faster hardware, it is very easy to
> implement sloppy things in the image that go unnoticed until someone runs the
> image on an old machine or on limited hardware. It is sometimes useful to
> test on old hardware or on a slow VM to check for this.

The cheapest and easiest way to do it these days is to buy a Raspberry Pi. You’ll learn very quickly where you have used crappy algorithms or poor technique… though of course you do have to put up with X windows as well. Unless you try RISC OS, which although not able to make the raw compute performance faster at least has a window system that doesn’t send every pixel to the screen via Deep Space Network to the relay on Sedna.

> I think someone mentioned it earlier, but a very easy way to produce an
> intentionally slow VM is to generate the sources from VMMaker with the
> inlining step disabled. The slang inliner is extremely effective, and turning
> it off produces impressively sluggish results.

Does that actually work these days? Last I remember was that turning inlining off wouldn’t produce a buildable interp.c file. If someone has had the patience to make it work then I’m impressed.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: SDLI: Shift Disk Left Immediate



Re: Making a Slower VM

David T. Lewis
 
On Sun, Feb 09, 2014 at 10:23:37AM -0800, tim Rowledge wrote:

>
>
> On 09-02-2014, at 10:07 AM, David T. Lewis <[hidden email]> wrote:
> > Joking aside, there actually is one legitimate reason for wanting a slow VM.
> > With high performance VMs and with ever faster hardware, it is very easy to
> > implement sloppy things in the image that go unnoticed until someone runs the
> > image on an old machine or on limited hardware. It is sometimes useful to
> > test on old hardware or on a slow VM to check for this.
>
> The cheapest and easiest way to do it these days is to buy a Raspberry Pi. You’ll learn very quickly where you have used crappy algorithms or poor technique… though of course you do have to put up with X windows as well. Unless you try RISC OS, which although not able to make the raw compute performance faster at least has a window system that doesn’t send every pixel to the screen via Deep Space Network to the relay on Sedna.
>
> > I think someone mentioned it earlier, but a very easy way to produce an
> > intentionally slow VM is to generate the sources from VMMaker with the
> > inlining step disabled. The slang inliner is extremely effective, and turning
> > it off produces impressively sluggish results.
>
> Does that actually work these days? Last I remember was that turning inlining off wouldn’t produce a buildable interp.c file. If someone has had the patience to make it work then I’m impressed.
>

Dang it, you're right, it's not working. I guess I have not tried this in
a while, though I know that it used to work. Making things go slower seems
like a worthwhile thing to do on a Sunday afternoon, so I think I'll see
if I can fix it.

Dave
 

Re: Making a Slower VM

Eliot Miranda-2
 



On Sun, Feb 9, 2014 at 11:46 AM, David T. Lewis <[hidden email]> wrote:

> Dang it, you're right, it's not working. I guess I have not tried this in
> a while, though I know that it used to work. Making things go slower seems
> like a worthwhile thing to do on a Sunday afternoon, so I think I'll see
> if I can fix it.

I *think* the issue is the internal/external split brought about by the introduction of the localFoo variables, such as localSP and localIP.  This optimization absolutely depends on inlining.  Which reminds me that anyone who is interested in creating a StackInterpreter or CoInterpreter that *doesn't* use the internal methods and uses only stackPointer, framePointer and instructionPointer would have my full support.  I'm very curious to see what the performance of stack+internal vs stack-internal, and cog+internal vs cog-internal will be.  I'm hoping that the performance of the -internal versions is good enough that we could eliminate all that duplication.

--
best,
Eliot

Re: Making a Slower VM

timrowledge


On 10-02-2014, at 11:53 AM, Eliot Miranda <[hidden email]> wrote:
>
> I *think* the issue is the internal/external split brought about by the introduction of the localFoo variables, such as localSP and localIP.

It’s really hard to be sure but I suspect that this isn’t the (only) issue. IIRC we used to be able to make non-inlined VMs at one point and that was well after the internalFoo code was added.

OK, some quick email searching reveals some work done in ’03 by johnMcI, Craig & me.
Craig found the following code helped -

!'From Squeak3.6alpha of ''17 March 2003'' [latest update: #5325] on 21 July 2003 at 1:11:25 pm'!

!Interpreter methodsFor: 'contexts' stamp: 'crl 7/19/2003 15:59'!
primitiveFindNextUnwindContext
        "Primitive. Search up the context stack for the next method context marked for unwind handling from the receiver up to but not including the argument. Return nil if none found."
        | thisCntx nilOop aContext isUnwindMarked header meth pIndex |
        aContext _ self popStack.
        thisCntx _ self fetchPointer: SenderIndex ofObject: self popStack.
        nilOop _ nilObj.

        [(thisCntx = aContext) or: [thisCntx = nilOop]] whileFalse: [
                "Check whether the context being walked (thisCntx) is unwind-marked; aContext is only the search limit"
                header _ self baseHeader: thisCntx.
                (self isMethodContextHeader: header)
                        ifTrue: [
                                meth _ self fetchPointer: MethodIndex ofObject: thisCntx.
                                pIndex _ self primitiveIndexOf: meth.
                                isUnwindMarked _ pIndex == 198]
                        ifFalse: [isUnwindMarked _ false].
                isUnwindMarked ifTrue:[
                        self push: thisCntx.
                        ^nil].
                thisCntx _ self fetchPointer: SenderIndex ofObject: thisCntx].

        ^self push: nilOop! !

!Interpreter methodsFor: 'interpreter shell' stamp: 'crl 7/19/2003 15:33'!
interpret
        "This is the main interpreter loop. It normally loops forever, fetching and executing bytecodes. When running in the context of a browser plugin VM, however, it must return control to the browser periodically. This should done only when the state of the currently running Squeak thread is safely stored in the object heap. Since this is the case at the moment that a check for interrupts is performed, that is when we return to the browser if it is time to do so. Interrupt checks happen quite frequently."

        "record entry time when running as a browser plug-in"
        "self browserPluginInitialiseIfNeeded"
        self internalizeIPandSP.
        self fetchNextBytecode.
        [true] whileTrue: [self dispatchOn: currentBytecode in: BytecodeTable].
        localIP _ localIP - 1.  "undo the pre-increment of IP before returning"
        self externalizeIPandSP.
! !

!Interpreter methodsFor: 'return bytecodes' stamp: 'crl 7/19/2003 16:05'!
returnValueTo
        "Note: Assumed to be inlined into the dispatch loop."

        | nilOop thisCntx contextOfCaller localCntx localVal isUnwindMarked header meth pIndex |
        self inline: true.
        self sharedCodeNamed: 'commonReturn' inCase: 120.

        nilOop _ nilObj. "keep in a register"
        thisCntx _ activeContext.
        localCntx _ cntx.
        localVal _ val.

        "make sure we can return to the given context"
        ((localCntx = nilOop) or:
         [(self fetchPointer: InstructionPointerIndex ofObject: localCntx) = nilOop]) ifTrue: [
                "error: sender's instruction pointer or context is nil; cannot return"
                ^self internalCannotReturn: localVal].

        "If this return is not to our immediate predecessor (i.e. from a method to its sender, or from a block to its caller), scan the stack for the first unwind marked context and inform this context and let it deal with it. This provides a chance for ensure unwinding to occur."
        thisCntx _ self fetchPointer: SenderIndex ofObject: activeContext.

        "Just possibly a faster test would be to compare the homeContext and activeContext - they are of course different for blocks. Thus we might be able to optimise a touch by having a different returnTo for the blockreteurn (since we know that must return to caller) and then if active ~= home we must be doing a non-local return. I think. Maybe."
        [thisCntx = localCntx] whileFalse: [
                thisCntx = nilObj ifTrue:[
                        "error: sender's instruction pointer or context is nil; cannot return"
                        ^self internalCannotReturn: localVal].
                "Climb up stack towards localCntx. Break out to a send of #aboutToReturn:through: if an unwind marked context is found"
                header _ self baseHeader: thisCntx.
                (self isMethodContextHeader: header)
                        ifTrue: [
                                meth _ self fetchPointer: MethodIndex ofObject: thisCntx.
                                pIndex _ self primitiveIndexOf: meth.
                                isUnwindMarked _ pIndex == 198]
                        ifFalse: [isUnwindMarked _ false].

                isUnwindMarked ifTrue:[
                        "context is marked; break out"
                        ^self internalAboutToReturn: localVal through: thisCntx].
                thisCntx _ self fetchPointer: SenderIndex ofObject: thisCntx].

        "If we get here there is no unwind to worry about. Simply terminate the stack up to the localCntx - often just the sender of the method"
        thisCntx _ activeContext.
        [thisCntx = localCntx]
                whileFalse:
                ["climb up stack to localCntx"
                contextOfCaller _ self fetchPointer: SenderIndex ofObject: thisCntx.

                "zap exited contexts so any future attempted use will be caught"
                self storePointerUnchecked: SenderIndex ofObject: thisCntx withValue: nilOop.
                self storePointerUnchecked: InstructionPointerIndex ofObject: thisCntx withValue: nilOop.
                reclaimableContextCount > 0 ifTrue:
                        ["try to recycle this context"
                        reclaimableContextCount _ reclaimableContextCount - 1.
                        self recycleContextIfPossible: thisCntx].
                thisCntx _ contextOfCaller].

        activeContext _ thisCntx.
        (thisCntx < youngStart) ifTrue: [ self beRootIfOld: thisCntx ].

        self internalFetchContextRegisters: thisCntx.  "updates local IP and SP"
        self fetchNextBytecode.
        self internalPush: localVal.
! !

Shortly after that I released VMMaker3.6 with a note that it couldn’t produce a completely non-inlined VM because of a problem in fetchByte if globalstruct was enabled, and some odd problems in B2DPlugin. When VMMaker3.7 was released a year later (March ’04) I apparently thought it could make the core VM non-inlined. Since this is all a bazillion years ago I can’t remember any context to help extend the history.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Science is imagination equipped with grappling hooks.


Re: Making a Slower VM

David T. Lewis
 
I was looking at the trunk VMM yesterday and found that most of the issues
were just caused by accessor methods, where #foo and #foo: generate
conflicting foo(void) and foo(aParameter). In most cases, a convention of
#setFoo: rather than #foo: takes care of the problem. There were a few
other miscellaneous issues as well, but nothing that looked serious.
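
The accessor clash can be pictured with a small workspace sketch. It assumes, as a simplification, that the C function name is obtained by dropping the colons from the selector; the block cName is a hypothetical stand-in for that step, not the actual VMMaker code.

```smalltalk
| cName |
"Hypothetical stand-in for the selector-to-C-name step: drop the colons."
cName := [:selector | selector asString copyWithout: $:].

Transcript
    show: (cName value: #foo); cr;       "foo"
    show: (cName value: #foo:); cr;      "foo, the same name as the getter"
    show: (cName value: #setFoo:); cr.   "setFoo, no longer collides"
```

Under that assumption #foo and #foo: both map to foo, which matches the conflicting foo(void) and foo(aParameter) described above, while the #setFoo: convention gives the setter a distinct name.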

The variable 'memory' is a challenge because it is used extensively both
directly and through #memory and #memory:. I was considering changing the
variable name to something like memoryBase, and leaving the accessors
alone though I'm not sure that would be a very good idea.

I ran out of time yesterday and did not pursue it beyond this.

Dave

>
>
> On 10-02-2014, at 11:53 AM, Eliot Miranda <[hidden email]> wrote:
>>
>> I *think* the issue is the internal/external split brought abut by the
>> introduction of the localFoo variables, such as localSP and localIP.
>
> It’s really hard to be sure but I suspect that this isn’t the (only)
> issue. IIRC we used to be able to make non-inlined VMs at one point and
> that was well after the internalFoo code was added.
>
> OK, some quick email searching reveals some work done in ’03 by johnMcI,
> Craig & me.
> Craig found the following code helped -
>
> !'From Squeak3.6alpha of ''17 March 2003'' [latest update: #5325] on 21
> July 2003 at 1:11:25 pm'!
>
> !Interpreter methodsFor: 'contexts' stamp: 'crl 7/19/2003 15:59'!
> primitiveFindNextUnwindContext
> "Primitive. Search up the context stack for the next method context
> marked for unwind handling from the receiver up to but not including the
> argument. Return nil if none found."
> | thisCntx nilOop aContext isUnwindMarked header meth pIndex |
> aContext _ self popStack.
> thisCntx _ self fetchPointer: SenderIndex ofObject: self popStack.
> nilOop _ nilObj.
>
> [(thisCntx = aContext) or: [thisCntx = nilOop]] whileFalse: [
>
> header _ self baseHeader: aContext.
>
> (self isMethodContextHeader: header)
> ifTrue: [
> meth _ self fetchPointer: MethodIndex ofObject: aContext.
> pIndex _ self primitiveIndexOf: meth.
> isUnwindMarked _ pIndex == 198]
> ifFalse: [isUnwindMarked _ false].
> isUnwindMarked ifTrue:[
> self push: thisCntx.
> ^nil].
> thisCntx _ self fetchPointer: SenderIndex ofObject: thisCntx].
>
> ^self push: nilOop! !
>
> !Interpreter methodsFor: 'interpreter shell' stamp: 'crl 7/19/2003 15:33'!
> interpret
> "This is the main interpreter loop. It normally loops forever, fetching
> and executing bytecodes. When running in the context of a browser plugin
> VM, however, it must return control to the browser periodically. This
> should done only when the state of the currently running Squeak thread is
> safely stored in the object heap. Since this is the case at the moment
> that a check for interrupts is performed, that is when we return to the
> browser if it is time to do so. Interrupt checks happen quite
> frequently."
>
> "record entry time when running as a browser plug-in"
> "self browserPluginInitialiseIfNeeded"
> self internalizeIPandSP.
> self fetchNextBytecode.
> [true] whileTrue: [self dispatchOn: currentBytecode in: BytecodeTable].
> localIP _ localIP - 1.  "undo the pre-increment of IP before returning"
> self externalizeIPandSP.
> ! !
>
> !Interpreter methodsFor: 'return bytecodes' stamp: 'crl 7/19/2003 16:05'!
> returnValueTo
> "Note: Assumed to be inlined into the dispatch loop."
>
> | nilOop thisCntx contextOfCaller localCntx localVal isUnwindMarked
> header meth pIndex |
> self inline: true.
> self sharedCodeNamed: 'commonReturn' inCase: 120.
>
> nilOop _ nilObj. "keep in a register"
> thisCntx _ activeContext.
> localCntx _ cntx.
> localVal _ val.
>
> "make sure we can return to the given context"
> ((localCntx = nilOop) or:
> [(self fetchPointer: InstructionPointerIndex ofObject: localCntx) =
> nilOop]) ifTrue: [
> "error: sender's instruction pointer or context is nil; cannot return"
> ^self internalCannotReturn: localVal].
>
> "If this return is not to our immediate predecessor (i.e. from a method
> to its sender, or from a block to its caller), scan the stack for the
> first unwind marked context and inform this context and let it deal with
> it. This provides a chance for ensure unwinding to occur."
> thisCntx _ self fetchPointer: SenderIndex ofObject: activeContext.
>
> "Just possibly a faster test would be to compare the homeContext and
> activeContext - they are of course different for blocks. Thus we might be
> able to optimise a touch by having a different returnTo for the
> blockreteurn (since we know that must return to caller) and then if
> active ~= home we must be doing a non-local return. I think. Maybe."
> [thisCntx = localCntx] whileFalse: [
> thisCntx = nilObj ifTrue:[
> "error: sender's instruction pointer or context is nil; cannot return"
> ^self internalCannotReturn: localVal].
> "Climb up stack towards localCntx. Break out to a send of
> #aboutToReturn:through: if an unwind marked context is found"
> header _ self baseHeader: thisCntx.
>
> (self isMethodContextHeader: header)
> ifTrue: [
> meth _ self fetchPointer: MethodIndex ofObject: thisCntx.
> pIndex _ self primitiveIndexOf: meth.
> isUnwindMarked _ pIndex == 198]
> ifFalse: [isUnwindMarked _ false].
>
> isUnwindMarked ifTrue:[
> "context is marked; break out"
> ^self internalAboutToReturn: localVal through: thisCntx].
> thisCntx _ self fetchPointer: SenderIndex ofObject: thisCntx.
> ].
>
> "If we get here there is no unwind to worry about. Simply terminate the
> stack up to the localCntx - often just the sender of the method"
> thisCntx _ activeContext.
> [thisCntx = localCntx]
> whileFalse:
> ["climb up stack to localCntx"
> contextOfCaller _ self fetchPointer: SenderIndex ofObject: thisCntx.
>
> "zap exited contexts so any future attempted use will be caught"
> self storePointerUnchecked: SenderIndex ofObject: thisCntx withValue:
> nilOop.
> self storePointerUnchecked: InstructionPointerIndex ofObject: thisCntx
> withValue: nilOop.
> reclaimableContextCount > 0 ifTrue:
> ["try to recycle this context"
> reclaimableContextCount _ reclaimableContextCount - 1.
> self recycleContextIfPossible: thisCntx].
> thisCntx _ contextOfCaller].
>
> activeContext _ thisCntx.
> (thisCntx < youngStart) ifTrue: [ self beRootIfOld: thisCntx ].
>
> self internalFetchContextRegisters: thisCntx.  "updates local IP and SP"
> self fetchNextBytecode.
> self internalPush: localVal.
> ! !
>
> Shortly after that I released the VMMaker3.6 with a note that it couldn’t
> produce a completely non-inlined VM because of a problem in fetchByte if
> globalstruct was enabled, and some odd problems in B2DPlugin. When
> VMMaker3.7 was released a year later (March ’04) I apparently thought it
> could make the core vm non-inlined. Since this is all a bazillion years
> ago I can’t remember any context to help extend the history.
>
> tim
> --
> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> Science is imagination equipped with grappling hooks.
>



Re: Making a Slower VM

Nicolas Cellier
 
Hi David,
do you realize that Eliot is (ab)using this in COG in order to eliminate some direct cCode: '...' inclusion?
So setFoo: is not an option (or I misunderstood something).


2014-02-10 21:51 GMT+01:00 David T. Lewis <[hidden email]>:

I was looking at the trunk VMM yesterday and found that most of the issues
were just caused by accessor methods, where #foo and #foo: generate
conflicting foo(void) and foo(aParameter). In most cases, a convention of
#setFoo: rather than #foo: takes care of the problem. There were a few
other miscellaneous issues as well, but nothing that looked serious.
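
To make the clash concrete, here is a minimal C sketch. It is an illustration under assumptions, not actual VMMaker output: fooValue and the function bodies are hypothetical. The point is only that C has no overloading, so a getter #foo and a setter #foo: cannot both become a function named foo, while #setFoo: gets a distinct name.

```c
/* Hypothetical stand-in for an interpreter variable named foo. */
static long fooValue = 0;

/* What the getter #foo would translate to. */
static long foo(void) { return fooValue; }

/* A setter named #foo: would also translate to a function called foo,
   e.g.  static long foo(long aParameter) { ... }
   C has no overloading, so that second definition would conflict with
   the one above and the file would not compile. Renaming the selector
   to #setFoo: yields a distinct C name instead. */
static long setFoo(long aParameter) { return fooValue = aParameter; }
```

With the renamed setter, foo() and setFoo(42) can coexist in one translation unit.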

The variable 'memory' is a challenge because it is used extensively both
directly and through #memory and #memory:. I was considering changing the
variable name to something like memoryBase, and leaving the accessors
alone though I'm not sure that would be a very good idea.

I ran out of time yesterday and did not pursue it beyond this.

Dave

>
>
> On 10-02-2014, at 11:53 AM, Eliot Miranda <[hidden email]> wrote:
>>
>> I *think* the issue is the internal/external split brought about by the
>> introduction of the localFoo variables, such as localSP and localIP.
>
> It’s really hard to be sure but I suspect that this isn’t the (only)
> issue. IIRC we used to be able to make non-inlined VMs at one point and
> that was well after the internalFoo code was added.
>
> OK, some quick email searching reveals some work done in ’03 by johnMcI,
> Craig & me.
> Craig found the following code helped -
>
> !'From Squeak3.6alpha of ''17 March 2003'' [latest update: #5325] on 21
> July 2003 at 1:11:25 pm'!
>
> !Interpreter methodsFor: 'contexts' stamp: 'crl 7/19/2003 15:59'!
> primitiveFindNextUnwindContext
>       "Primitive. Search up the context stack for the next method context
> marked for unwind handling from the receiver up to but not including the
> argument. Return nil if none found."
>       | thisCntx nilOop aContext isUnwindMarked header meth pIndex |
>       aContext _ self popStack.
>       thisCntx _ self fetchPointer: SenderIndex ofObject: self popStack.
>       nilOop _ nilObj.
>
>       [(thisCntx = aContext) or: [thisCntx = nilOop]] whileFalse: [
>
>       header _ self baseHeader: aContext.
>
>       (self isMethodContextHeader: header)
>               ifTrue: [
>                       meth _ self fetchPointer: MethodIndex ofObject: aContext.
>                       pIndex _ self primitiveIndexOf: meth.
>                       isUnwindMarked _ pIndex == 198]
>               ifFalse: [isUnwindMarked _ false].
>               isUnwindMarked ifTrue:[
>                       self push: thisCntx.
>                       ^nil].
>               thisCntx _ self fetchPointer: SenderIndex ofObject: thisCntx].
>
>       ^self push: nilOop! !
>
> !Interpreter methodsFor: 'interpreter shell' stamp: 'crl 7/19/2003 15:33'!
> interpret
>       "This is the main interpreter loop. It normally loops forever, fetching
> and executing bytecodes. When running in the context of a browser plugin
> VM, however, it must return control to the browser periodically. This
> should be done only when the state of the currently running Squeak thread is
> safely stored in the object heap. Since this is the case at the moment
> that a check for interrupts is performed, that is when we return to the
> browser if it is time to do so. Interrupt checks happen quite
> frequently."
>
>       "record entry time when running as a browser plug-in"
>       "self browserPluginInitialiseIfNeeded"
>       self internalizeIPandSP.
>       self fetchNextBytecode.
>       [true] whileTrue: [self dispatchOn: currentBytecode in: BytecodeTable].
>       localIP _ localIP - 1.  "undo the pre-increment of IP before returning"
>       self externalizeIPandSP.
> ! !
>
> !Interpreter methodsFor: 'return bytecodes' stamp: 'crl 7/19/2003 16:05'!
> returnValueTo
>       "Note: Assumed to be inlined into the dispatch loop."
>
>       | nilOop thisCntx contextOfCaller localCntx localVal isUnwindMarked
> header meth pIndex |
>       self inline: true.
>       self sharedCodeNamed: 'commonReturn' inCase: 120.
>
>       nilOop _ nilObj. "keep in a register"
>       thisCntx _ activeContext.
>       localCntx _ cntx.
>       localVal _ val.
>
>       "make sure we can return to the given context"
>       ((localCntx = nilOop) or:
>        [(self fetchPointer: InstructionPointerIndex ofObject: localCntx) =
> nilOop]) ifTrue: [
>               "error: sender's instruction pointer or context is nil; cannot return"
>               ^self internalCannotReturn: localVal].
>
>       "If this return is not to our immediate predecessor (i.e. from a method
> to its sender, or from a block to its caller), scan the stack for the
> first unwind marked context and inform this context and let it deal with
> it. This provides a chance for ensure unwinding to occur."
>       thisCntx _ self fetchPointer: SenderIndex ofObject: activeContext.
>
>       "Just possibly a faster test would be to compare the homeContext and
> activeContext - they are of course different for blocks. Thus we might be
> able to optimise a touch by having a different returnTo for the
> blockreturn (since we know that must return to caller) and then if
> active ~= home we must be doing a non-local return. I think. Maybe."
>       [thisCntx = localCntx] whileFalse: [
>               thisCntx = nilObj ifTrue:[
>                       "error: sender's instruction pointer or context is nil; cannot return"
>                       ^self internalCannotReturn: localVal].
>               "Climb up stack towards localCntx. Break out to a send of
> #aboutToReturn:through: if an unwind marked context is found"
>       header _ self baseHeader: thisCntx.
>
>       (self isMethodContextHeader: header)
>               ifTrue: [
>                       meth _ self fetchPointer: MethodIndex ofObject: thisCntx.
>                       pIndex _ self primitiveIndexOf: meth.
>                       isUnwindMarked _ pIndex == 198]
>               ifFalse: [isUnwindMarked _ false].
>
>               isUnwindMarked ifTrue:[
>                       "context is marked; break out"
>                       ^self internalAboutToReturn: localVal through: thisCntx].
>               thisCntx _ self fetchPointer: SenderIndex ofObject: thisCntx.
> ].
>
>       "If we get here there is no unwind to worry about. Simply terminate the
> stack up to the localCntx - often just the sender of the method"
>       thisCntx _ activeContext.
>       [thisCntx = localCntx]
>               whileFalse:
>               ["climb up stack to localCntx"
>               contextOfCaller _ self fetchPointer: SenderIndex ofObject: thisCntx.
>
>               "zap exited contexts so any future attempted use will be caught"
>               self storePointerUnchecked: SenderIndex ofObject: thisCntx withValue:
> nilOop.
>               self storePointerUnchecked: InstructionPointerIndex ofObject: thisCntx
> withValue: nilOop.
>               reclaimableContextCount > 0 ifTrue:
>                       ["try to recycle this context"
>                       reclaimableContextCount _ reclaimableContextCount - 1.
>                       self recycleContextIfPossible: thisCntx].
>               thisCntx _ contextOfCaller].
>
>       activeContext _ thisCntx.
>       (thisCntx < youngStart) ifTrue: [ self beRootIfOld: thisCntx ].
>
>       self internalFetchContextRegisters: thisCntx.  "updates local IP and SP"
>       self fetchNextBytecode.
>       self internalPush: localVal.
> ! !
>
> Shortly after that I released the VMMaker3.6 with a note that it couldn’t
> produce a completely non-inlined VM because of a problem in fetchByte if
> globalstruct was enabled, and some odd problems in B2DPlugin. When
> VMMaker3.7 was released a year later (March ’04) I apparently thought it
> could make the core vm non-inlined. Since this is all a bazillion years
> ago I can’t remember any context to help extend the history.
>
> tim
> --
> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> Science is imagination equipped with grappling hooks.
>




Re: Making a Slower VM

Eliot Miranda-2
In reply to this post by David T. Lewis
 



On Mon, Feb 10, 2014 at 12:51 PM, David T. Lewis <[hidden email]> wrote:

I was looking at the trunk VMM yesterday and found that most of the issues
were just caused by accessor methods, where #foo and #foo: generate
conflicting foo(void) and foo(aParameter). In most cases, a convention of
#setFoo: rather than #foo: takes care of the problem. There were a few
other miscellaneous issues as well, but nothing that looked serious.

There's a more convenient hack:

memory
    <cmacro: '() GIV(memory)'>
    ^memory

memory: aValue
    ^memory := aValue
 

The variable 'memory' is a challenge because it is used extensively both
directly and through #memory and #memory:. I was considering changing the
variable name to something like memoryBase, and leaving the accessors
alone though I'm not sure that would be a very good idea.

See above.
 

I ran out of time yesterday and did not pursue it beyond this.

Dave


--
best,
Eliot

Re: Making a Slower VM

David T. Lewis
In reply to this post by Nicolas Cellier
 
On Mon, Feb 10, 2014 at 10:12:32PM +0100, Nicolas Cellier wrote:
>  
> Hi David,
> do you realize that Eliot is (ab)using this in COG in order to eliminate
> some direct cCode: '...' inclusion?
> So setFoo: is not an option (or I misunderstood something).
>

Hi Nicolas,

Actually I am not sure what you are referring to here, so probably I am
missing something. Can you explain why setFoo: would be a problem in Cog?
I cannot check it myself right now but I am interested to know if I am
missing something important.

Thanks,
Dave




Re: Making a Slower VM

Nicolas Cellier
 
Hi David,
What I meant is that COG depends on (self malloc: n) being translated to malloc(n), not setMalloc(n), for example (you can find many other cases by browsing unimplemented calls). But maybe foo was not a generic identifier in your case?
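
As a rough illustration of why the selector name matters, here is a small C sketch of the naming rule. It assumes the translator builds the C function name by concatenating the keyword parts of a selector and dropping the colons, which matches names such as fetchPointerofObject seen in generated interpreter sources. The helper cNameForSelector is hypothetical, and the real translator applies more rules than this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: derive a C function name from a Smalltalk selector
   by copying it without its colons, so "malloc:" becomes "malloc".
   The output is truncated if it does not fit in outSize bytes. */
static void cNameForSelector(const char *selector, char *out, size_t outSize)
{
    size_t j = 0;
    if (outSize == 0) return;
    for (size_t i = 0; selector[i] != '\0' && j + 1 < outSize; i++) {
        if (selector[i] != ':') out[j++] = selector[i];
    }
    out[j] = '\0';
}
```

Under this rule, malloc: maps to malloc, so (self malloc: n) can reach the C library call directly, while foo and foo: both map to foo, which is the accessor collision discussed earlier in the thread.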


2014-02-11 15:05 GMT+01:00 David T. Lewis <[hidden email]>:

On Mon, Feb 10, 2014 at 10:12:32PM +0100, Nicolas Cellier wrote:
>
> Hi David,
> do you realize that Eliot is (ab)using this in COG in order to eliminate
> some direct cCode: '...' inclusion?
> So setFoo: is not an option (or I misunderstood something).
>

Hi Nicolas,

Actually I am not sure what you are referring to here, so probably I am
missing something. Can you explain why setFoo: would be a problem in Cog?
I cannot check it myself right now but I am interested to know if I am
missing something important.

Thanks,
Dave


>
> 2014-02-10 21:51 GMT+01:00 David T. Lewis <[hidden email]>:
>
> >
> > I was looking at the trunk VMM yesterday and found that most of the issues
> > were just caused by accessor methods, where #foo and #foo: generate
> > conflicting foo(void) and foo(aParameter). In most cases, a convention of
> > #setFoo: rather than #foo: takes care of the problem. There were a few
> > other miscellaneous issues as well, but nothing that looked serious.
> >
> > The variable 'memory' is a challenge because it is used extensively both
> > directly and through #memory and #memory:. I was considering changing the
> > variable name to something like memoryBase, and leaving the accessors
> > alone though I'm not sure that would be a very good idea.
> >
> > I ran out of time yesterday and did not pursue it beyond this.
> >
> > Dave
> >
> > >
> > >
> > > On 10-02-2014, at 11:53 AM, Eliot Miranda <[hidden email]>
> > wrote:
> > >>
> > >> I *think* the issue is the internal/external split brought about by the
> > >> introduction of the localFoo variables, such as localSP and localIP.
> > >
> > > It's really hard to be sure but I suspect that this isn't the (only)
> > > issue. IIRC we used to be able to make non-inlined VMs at one point and
> > > that was well after the internalFoo code was added.
> > >
> > > OK, some quick email searching reveals some work done in '03 by johnMcI,
> > > Craig & me.
> > > Craig found the following code helped -
> > >
> > > !'From Squeak3.6alpha of ''17 March 2003'' [latest update: #5325] on 21
> > > July 2003 at 1:11:25 pm'!
> > >
> > > !Interpreter methodsFor: 'contexts' stamp: 'crl 7/19/2003 15:59'!
> > > primitiveFindNextUnwindContext
> > >       "Primitive. Search up the context stack for the next method context
> > > marked for unwind handling from the receiver up to but not including the
> > > argument. Return nil if none found."
> > >       | thisCntx nilOop aContext isUnwindMarked header meth pIndex |
> > >       aContext _ self popStack.
> > >       thisCntx _ self fetchPointer: SenderIndex ofObject: self popStack.
> > >       nilOop _ nilObj.
> > >
> > >       [(thisCntx = aContext) or: [thisCntx = nilOop]] whileFalse: [
> > >
> > >       header _ self baseHeader: aContext.
> > >
> > >       (self isMethodContextHeader: header)
> > >               ifTrue: [
> > >                       meth _ self fetchPointer: MethodIndex ofObject:
> > aContext.
> > >                       pIndex _ self primitiveIndexOf: meth.
> > >                       isUnwindMarked _ pIndex == 198]
> > >               ifFalse: [isUnwindMarked _ false].
> > >               isUnwindMarked ifTrue:[
> > >                       self push: thisCntx.
> > >                       ^nil].
> > >               thisCntx _ self fetchPointer: SenderIndex ofObject:
> > thisCntx].
> > >
> > >       ^self push: nilOop! !
> > >
> > > !Interpreter methodsFor: 'interpreter shell' stamp: 'crl 7/19/2003
> > 15:33'!
> > > interpret
> > >       "This is the main interpreter loop. It normally loops forever,
> > fetching
> > > and executing bytecodes. When running in the context of a browser plugin
> > > VM, however, it must return control to the browser periodically. This
> > > should be done only when the state of the currently running Squeak thread is
> > > safely stored in the object heap. Since this is the case at the moment
> > > that a check for interrupts is performed, that is when we return to the
> > > browser if it is time to do so. Interrupt checks happen quite
> > > frequently."
> > >
> > >       "record entry time when running as a browser plug-in"
> > >       "self browserPluginInitialiseIfNeeded"
> > >       self internalizeIPandSP.
> > >       self fetchNextBytecode.
> > >       [true] whileTrue: [self dispatchOn: currentBytecode in:
> > BytecodeTable].
> > >       localIP _ localIP - 1.  "undo the pre-increment of IP before
> > returning"
> > >       self externalizeIPandSP.
> > > ! !
> > >
> > > !Interpreter methodsFor: 'return bytecodes' stamp: 'crl 7/19/2003 16:05'!
> > > returnValueTo
> > >       "Note: Assumed to be inlined into the dispatch loop."
> > >
> > >       | nilOop thisCntx contextOfCaller localCntx localVal isUnwindMarked
> > > header meth pIndex |
> > >       self inline: true.
> > >       self sharedCodeNamed: 'commonReturn' inCase: 120.
> > >
> > >       nilOop _ nilObj. "keep in a register"
> > >       thisCntx _ activeContext.
> > >       localCntx _ cntx.
> > >       localVal _ val.
> > >
> > >       "make sure we can return to the given context"
> > >       ((localCntx = nilOop) or:
> > >        [(self fetchPointer: InstructionPointerIndex ofObject: localCntx) = nilOop]) ifTrue: [
> > >               "error: sender's instruction pointer or context is nil; cannot return"
> > >               ^self internalCannotReturn: localVal].
> > >
> > >       "If this return is not to our immediate predecessor (i.e. from a method
> > > to its sender, or from a block to its caller), scan the stack for the
> > > first unwind marked context and inform this context and let it deal with
> > > it. This provides a chance for ensure unwinding to occur."
> > >       thisCntx _ self fetchPointer: SenderIndex ofObject: activeContext.
> > >
> > >       "Just possibly a faster test would be to compare the homeContext and
> > > activeContext - they are of course different for blocks. Thus we might be
> > > able to optimise a touch by having a different returnTo for the
> > > block return (since we know that must return to caller) and then if
> > > active ~= home we must be doing a non-local return. I think. Maybe."
> > >       [thisCntx = localCntx] whileFalse: [
> > >               thisCntx = nilObj ifTrue:[
> > >                       "error: sender's instruction pointer or context is nil; cannot return"
> > >                       ^self internalCannotReturn: localVal].
> > >               "Climb up stack towards localCntx. Break out to a send of
> > > #aboutToReturn:through: if an unwind marked context is found"
> > >       header _ self baseHeader: thisCntx.
> > >
> > >       (self isMethodContextHeader: header)
> > >               ifTrue: [
> > >                       meth _ self fetchPointer: MethodIndex ofObject: thisCntx.
> > >                       pIndex _ self primitiveIndexOf: meth.
> > >                       isUnwindMarked _ pIndex == 198]
> > >               ifFalse: [isUnwindMarked _ false].
> > >
> > >               isUnwindMarked ifTrue:[
> > >                       "context is marked; break out"
> > >                       ^self internalAboutToReturn: localVal through: thisCntx].
> > >               thisCntx _ self fetchPointer: SenderIndex ofObject: thisCntx.
> > > ].
> > >
> > >       "If we get here there is no unwind to worry about. Simply terminate the
> > > stack up to the localCntx - often just the sender of the method"
> > >       thisCntx _ activeContext.
> > >       [thisCntx = localCntx]
> > >               whileFalse:
> > >               ["climb up stack to localCntx"
> > >               contextOfCaller _ self fetchPointer: SenderIndex ofObject: thisCntx.
> > >
> > >               "zap exited contexts so any future attempted use will be caught"
> > >               self storePointerUnchecked: SenderIndex ofObject: thisCntx withValue: nilOop.
> > >               self storePointerUnchecked: InstructionPointerIndex ofObject: thisCntx withValue: nilOop.
> > >               reclaimableContextCount > 0 ifTrue:
> > >                       ["try to recycle this context"
> > >                       reclaimableContextCount _ reclaimableContextCount - 1.
> > >                       self recycleContextIfPossible: thisCntx].
> > >               thisCntx _ contextOfCaller].
> > >
> > >       activeContext _ thisCntx.
> > >       (thisCntx < youngStart) ifTrue: [ self beRootIfOld: thisCntx ].
> > >
> > >       self internalFetchContextRegisters: thisCntx.  "updates local IP and SP"
> > >       self fetchNextBytecode.
> > >       self internalPush: localVal.
> > > ! !
> > >
> > > Shortly after that I released the VMMaker3.6 with a note that it couldn't
> > > produce a completely non-inlined VM because of a problem in fetchByte if
> > > globalstruct was enabled, and some odd problems in B2DPlugin. When
> > > VMMaker3.7 was released a year later (March '04) I apparently thought it
> > > could make the core VM non-inlined. Since this is all a bazillion years
> > > ago I can't remember any context to help extend the history.
> > >
> > > tim
> > > --
> > > tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> > > Science is imagination equipped with grappling hooks.
> > >
> >
> >
> >



Re: Making a Slower VM

David T. Lewis
 
OK thank you, I am aware of that trick so not a problem.

(But you should not blame Eliot, I think I started abusing slang that way
in OSProcessPlugin many years ago, so you can blame me just as well)

Thanks a lot,
Dave

>  Hi David,
> I wanted to say that COG depends on (self malloc: n) being translated to
> malloc(n) and not setMalloc(n), for example (you can find many others by
> browsing unimplemented calls), but maybe foo was not a generic ID in your
> case?
>
>
> 2014-02-11 15:05 GMT+01:00 David T. Lewis <[hidden email]>:
>
>>
>> On Mon, Feb 10, 2014 at 10:12:32PM +0100, Nicolas Cellier wrote:
>> >
>> > Hi David,
>> > do you realize that Eliot is (ab)using this in COG in order to eliminate
>> > some direct cCode: '...' inclusion?
>> > So setFoo: is not an option (or I misunderstood something)
>> >
>>
>> Hi Nicolas,
>>
>> Actually I am not sure what you are referring to here, so probably I am
>> missing something. Can you explain why setFoo: would be a problem in Cog?
>> I cannot check it myself right now but I am interested to know if I am
>> missing something important.
>>
>> Thanks,
>> Dave
>>
>>
>> >
>> > 2014-02-10 21:51 GMT+01:00 David T. Lewis <[hidden email]>:
>> >
>> > >
>> > > I was looking at the trunk VMM yesterday and found that most of the issues
>> > > were just caused by accessor methods, where #foo and #foo: generate
>> > > conflicting foo(void) and foo(aParameter). In most cases, a convention of
>> > > #setFoo: rather than #foo: takes care of the problem. There were a few
>> > > other miscellaneous issues as well, but nothing that looked serious.
>> > >
>> > > The variable 'memory' is a challenge because it is used extensively both
>> > > directly and through #memory and #memory:. I was considering changing the
>> > > variable name to something like memoryBase, and leaving the accessors
>> > > alone though I'm not sure that would be a very good idea.
>> > >
>> > > I ran out of time yesterday and did not pursue it beyond this.
>> > >
>> > > Dave
>> > >
>> > > >
>> > > >
>> > > > On 10-02-2014, at 11:53 AM, Eliot Miranda <[hidden email]> wrote:
>> > > >>
>> > > >> I *think* the issue is the internal/external split brought about by the
>> > > >> introduction of the localFoo variables, such as localSP and localIP.
>> > > >
>> > > > It's really hard to be sure but I suspect that this isn't the (only)
>> > > > issue. IIRC we used to be able to make non-inlined VMs at one point and
>> > > > that was well after the internalFoo code was added.
>> > > >
>> > > > OK, some quick email searching reveals some work done in '03 by johnMcI,
>> > > > Craig & me.
>> > > > Craig found the following code helped -
>> > > >
>> > > > Shortly after that I released the VMMaker3.6 with a note that it couldn't
>> > > > produce a completely non-inlined VM because of a problem in fetchByte if
>> > > > globalstruct was enabled, and some odd problems in B2DPlugin. When
>> > > > VMMaker3.7 was released a year later (March '04) I apparently thought it
>> > > > could make the core VM non-inlined. Since this is all a bazillion years
>> > > > ago I can't remember any context to help extend the history.
>> > > >
>> > > > tim
>> > > > --
>> > > > tim Rowledge; [hidden email]; http://www.rowledge.org/tim
>> > > > Science is imagination equipped with grappling hooks.
>> > > >
>> > >
>> > >
>> > >
>>
>>
>



Re: Making a Slower VM

Eliot Miranda-2
 



On Tue, Feb 11, 2014 at 11:05 AM, David T. Lewis <[hidden email]> wrote:

OK thank you, I am aware of that trick so not a problem.

(But you should not blame Eliot, I think I started abusing slang that way
in OSProcessPlugin many years ago, so you can blame me just as well)

Personally I find it far less of an abuse than the horrible cCode: 'aString...' idiom.  With "self malloc: n" I can look for senders etc., but more importantly I can actually implement it in the simulator.  You'll see in the Cog branch working implementations of str:n:cmp:, mem:mo:ve:, etc., which are actually required by the simulator.  Let me plead for those of you writing VM code to avoid cCode: as much as possible.  Use it to include code that only the simulator should use by all means, but please try and generate your C calls from Smalltalk code.

Here's the kind of thing I mean.  This coerces an address into a simulator's CogMethod:

printCogMethod: cogMethod
    <api>
    <var: #cogMethod type: #'CogMethod *'>
    | address primitive |
    self cCode: ''
        inSmalltalk:
            [self transcript ensureCr.
             cogMethod isInteger ifTrue:
                [^self printCogMethod: (self cCoerceSimple: cogMethod to: #'CogMethod *')]].
    address := cogMethod asInteger.
    self printHex: address;
        print: ' <-> ';
        printHex: address + cogMethod blockSize.
    cogMethod cmType = CMMethod ifTrue:
        ...

Here's the kind of thing to be avoided:

interpreterProxy success:
    ((interpreterProxy isBytes: oop)
        and: [(interpreterProxy slotSizeOf: oop) = (self cCode: 'sizeof(AsyncFile)')]).

It could be written as follows (and, if so, simulated!):

interpreterProxy success:
    ((interpreterProxy isBytes: oop)
        and: [(interpreterProxy slotSizeOf: oop) = (self sizeof: #AsyncFile)]).
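
For readers who have not seen the convention, here is a rough C sketch of what such sends are expected to turn into. It assumes, as the selectors mentioned in this thread suggest, that the translator forms the C function name by joining the keyword parts of the selector (str:n:cmp: giving strncmp, mem:mo:ve: giving memmove, malloc: giving malloc). The AsyncFile struct and the three helper functions are made up for illustration and are not the plugin's real definitions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the plugin's AsyncFile record. */
typedef struct {
    int sessionID;
    void *state;
} AsyncFile;

/* "self sizeof: #AsyncFile" is assumed to become sizeof(AsyncFile). */
static size_t asyncFileSize(void) {
    return sizeof(AsyncFile);
}

/* "self str: a n: b cmp: n" is assumed to become strncmp(a, b, n). */
static int samePrefix(const char *a, const char *b, size_t n) {
    return strncmp(a, b, n) == 0;
}

/* "self malloc: n" and "self mem: dst mo: src ve: n" are assumed to
   become malloc(n) and memmove(dst, src, n). */
static char *copyBytes(const char *src, size_t n) {
    char *dst = malloc(n);
    if (dst != NULL) {
        memmove(dst, src, n);
    }
    return dst;
}
```

Because each call starts life as an ordinary Smalltalk send, the simulator can supply its own implementation of the same selector, which is not possible for an opaque cCode: string.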

cheers!


Thanks a lot,
Dave


--
best,
Eliot

re: Making a Slower VM

ccrraaiigg
 

> Let me plead for those of you writing VM code to avoid cCode: as much
> as possible.  Use it to include code that only the simulator should
> use by all means, but please try and generate your C calls from
> Smalltalk code.

     Hear, hear!


-C

--
Craig Latta
www.netjam.org/resume
+31   6 2757 7177 (SMS ok)
+ 1 415 287 3547 (no SMS)


Re: Making a Slower VM

David T. Lewis
In reply to this post by timrowledge
 
On Sun, Feb 09, 2014 at 10:23:37AM -0800, tim Rowledge wrote:

>
> On 09-02-2014, at 10:07 AM, David T. Lewis <[hidden email]> wrote:
> >
> > I think someone mentioned it earlier, but a very easy way to produce an
> > intentionally slow VM is to generate the sources from VMMaker with the
> > inlining step disabled. The slang inliner is extremely effective, and turning
> > it off produces impressively sluggish results.
>
> Does that actually work these days? Last I remember was that turning
> inlining off wouldn't produce a buildable interp.c file. If someone has
> had the patience to make it work then I'm impressed.
>

You're right about one thing, it required a lot of patience ;-)

I did manage to get it working though, and the results are in VMMaker-dtl.342.

This turned out to be a useful exercise, as I flushed out a couple of type
declaration bugs along the way.

The major issue was that the refactoring of object memory and interpreter
into separate class hierarchies (which is a very good thing IMHO) requires
the use of accessor methods, and this leads to name conflicts in the generated
code if those accessor methods are not fully inlined.

I went with the approach of naming the accessors getFoo and setFoo: as well
as, for the case of array access, fooAt: and fooAt:put:. This is not very
pleasing from a readability point of view, but it is simple and it works.
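
To make the name clash concrete, here is a small C sketch. It assumes that the non-inlined translator emits one C function per Smalltalk method and derives the function name from the selector with the colons dropped, so #foo and #foo: both want to be called foo. C has no overloading, so the pair cannot coexist. The variable fooValue and the table array are illustrative only and are not copied from VMMaker-dtl.342.

```c
/* State that the accessors wrap (hypothetical name). */
static long fooValue = 0;

/* What #foo and #foo: would both translate to. A C compiler rejects
   this pair because the two definitions share one name:

     long foo(void)        { return fooValue; }
     long foo(long aValue) { return fooValue = aValue; }
*/

/* #getFoo and #setFoo: give each accessor its own C name. */
static long getFoo(void) {
    return fooValue;
}

static long setFoo(long aValue) {
    return fooValue = aValue;
}

/* Array-style access: #fooAt: and #fooAt:put:, with the C names
   assumed to be formed by joining the keyword parts. */
static long table[4];

static long fooAt(int index) {
    return table[index];
}

static long fooAtput(int index, long aValue) {
    return table[index] = aValue;
}
```

With full inlining the accessor bodies are pasted into their callers and the functions never appear in the generated file, which would explain why the clash only shows up once inlining is switched off.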

If I compile a VM with inlining disabled and compiler optimization turned
off, the result is about 1/8th the speed of the same interpreter VM built
normally.
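
As a rough picture of where the time goes (a sketch only, not a reproduction of the measurement above), compare the two code shapes below. The function and variable names are hypothetical. The first loop pays for a real call on every fetch and keeps the instruction pointer in a global; the second has the fetch pasted into the loop with the pointer in a local, much as localIP is used in the inlined interpreter.

```c
#include <stddef.h>

/* Hypothetical global instruction pointer used by the non-inlined shape. */
static const unsigned char *ip;

/* Non-inlined shape: every fetch is a separate function call. */
static unsigned char fetchNextByte(void) {
    return *ip++;
}

static long runWithCalls(const unsigned char *code, size_t n) {
    long sum = 0;
    ip = code;
    for (size_t i = 0; i < n; i++) {
        sum += fetchNextByte();
    }
    return sum;
}

/* Inlined shape: the fetch body sits in the loop and the pointer
   lives in a local that the compiler can keep in a register. */
static long runInlined(const unsigned char *code, size_t n) {
    const unsigned char *localIP = code;
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += *localIP++;
    }
    return sum;
}
```

Both functions compute the same result; only the shape of the generated code differs. With compiler optimization turned off as well, the C compiler will not fold the call away on its own.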

Dave