Imminent change to Spur image format

Eliot Miranda-2

Imminent change to Spur image format

Hi All,

as part of the Newspeak infrastructure we use at Cadence I implemented multiple bytecode set support and a lifting of the limits in a method on the number of literals and the span of branches about two years ago. This work involved adding a second interpretation to the bits in a method header, providing 16 bits of literal count. This was done by moving the primitive number out of the method header and into an optional callPrimitive bytecode, being the first bytecode of methods that have primitives.

Now in Spur I have the opportunity to use this expanded format for the existing bytecode set as well. The SqueakV3 set does not use bytecode 139, which is convenient to use for its callPrimitiveBytecode. The advantage is that if and when a new bytecode set is added, as is planned for the Sista VMs, the VM will not have to test method headers to decide which format they're in, because there will only be one.
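To make the callPrimitive idea concrete, here is a minimal sketch (in Python for illustration, not VM code) of recovering a primitive number from a method's bytecodes instead of from its header. Bytecode 139 comes from the message above; the two-byte little-endian operand layout and the names `CALL_PRIMITIVE` and `primitive_index_of` are assumptions made for this example.

```python
CALL_PRIMITIVE = 139  # unused in the SqueakV3 set, per the message above


def primitive_index_of(bytecodes: bytes) -> int:
    """Return the primitive number encoded at the start of a method, or 0.

    Assumed layout (illustrative only): 139, low byte, high byte.
    """
    if len(bytecodes) >= 3 and bytecodes[0] == CALL_PRIMITIVE:
        return bytecodes[1] | (bytecodes[2] << 8)
    return 0  # no leading callPrimitive bytecode: the method has no primitive


# A method whose first bytecode is callPrimitive with operand bytes 0x2A, 0x01.
print(primitive_index_of(bytes([139, 0x2A, 0x01, 0x70, 0x7C])))  # 298
```

With the primitive number held in the bytecodes, the header bits that used to hold it become available for the wider literal count.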

Alas I only just realised this last night, and implementing this for Spur now means that any existing Spur images will be invalid. But better I do this now than later, when Spur is in widespread use.

So this message is to warn you that any existing Spur images will have to be discarded and rebuilt soon, once I release new VMs. Apologies for the inconvenience. I do plan for the VMs to exit with a useful error message when loading Spur images with the old method header format, rather than just crashing. So I hope the transition won't be too painful.
--
best,

Eliot

Bert Freudenberg

Re: Imminent change to Spur image format

On 09.08.2014, at 00:46, Eliot Miranda <[hidden email]> wrote:

> Now in Spur I have the opportunity to use this expanded format for the existing bytecode set as well. The SqueakV3 set does not use bytecode 139, which is convenient to use for its callPrimitiveBytecode. The advantage is that if and when a new bytecode set is added, as is planned for the Sista VMs, the VM will not have to test method headers to decide which format they're in, because there will only be one.

Just curious: how does the VM know which bytecode set to use for a given method?

- Bert -


Clément Béra

Re: Imminent change to Spur image format

2014-08-11 17:37 GMT+02:00 Bert Freudenberg <[hidden email]>:

> Just curious: how does the VM know which bytecode set to use for a given method?

A bit is set or not in the compiled method header.

Bert Freudenberg

Re: Imminent change to Spur image format

On 11.08.2014, at 18:16, Clément Bera <[hidden email]> wrote:

>> Just curious: how does the VM know which bytecode set to use for a given method?
>
> A bit is set or not in the compiled method header.

But Eliot wrote "the VM will not have to test method headers"?

Also, with a single bit, how can there be more than one alternative bytecode set?

- Bert -


Eliot Miranda-2

Re: Imminent change to Spur image format

In reply to this post by Clément Béra

On Aug 11, 2014, at 9:16 AM, Clément Bera <[hidden email]> wrote:

>> Just curious: how does the VM know which bytecode set to use for a given method?
>
> A bit is set or not in the compiled method header.

And it's the sign bit. That means the flag bit is still available, and the bit adjacent to it, which used to be the msb of the primitive index, is free. More bits can be stolen from the most significant bits of the num literals field (the least significant 16 bits of the header) cuz 64k literals is a lot ;-). E.g. the Newspeak folks have their eyes set on those two free high bits for an access code (a protected bit and a private bit).

Eliot Miranda-2

Re: Imminent change to Spur image format

In reply to this post by Bert Freudenberg

Hi Bert,

On Mon, Aug 11, 2014 at 10:41 AM, Bert Freudenberg <[hidden email]> wrote:

>> A bit is set or not in the compiled method header.
>
> But Eliot wrote "the VM will not have to test method headers"?

Right. For a while the Newspeak VMs have supported two bytecode sets with two different header formats, the old format:

sign bit 0, header >= 0:

    (index 0)      9 bits: main part of primitive number (#primitive)
    (index 9)      8 bits: number of literals (#numLiterals)
    (index 17)     1 bit:  whether a large frame size is needed (#frameSize)
    (index 18)     6 bits: number of temporary variables (#numTemps)
    (index 24)     4 bits: number of arguments to the method (#numArgs)
    (index 28)     1 bit:  high-bit of primitive number (#primitive)
    (index 29)     1 bit:  flag bit, ignored by the VM (#flag)
    (index 30/63)  sign bit: 0 selects the Primary instruction set (#signFlag)

sign bit 1, header < 0:

    (index 0)      16 bits: number of literals (#numLiterals)
    (index 16)     1 bit:  has primitive
    (index 17)     1 bit:  whether a large frame size is needed (#frameSize)
    (index 18)     6 bits: number of temporary variables (#numTemps)
    (index 24)     4 bits: number of arguments to the method (#numArgs)
    (index 28)     2 bits: reserved for an access modifier (00-unused, 01-private, 10-protected, 11-public)
    (index 30/63)  sign bit: 1 selects the Secondary instruction set (e.g. NewsqueakV4) (#signFlag)

i.e. the Secondary Bytecode Set expands the number of literals to 65535 by assuming a CallPrimitive bytecode.
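The two layouts above can be sketched as a small decoder. This is an illustration in Python, not the VM's Slang: it assumes the header is handed over as a signed 31-bit integer value (so the sign bit is index 30), and the names `field` and `decode_header` and the dictionary keys are made up for the example.

```python
def field(header: int, index: int, width: int) -> int:
    """Extract `width` bits starting at bit `index` (two's complement view)."""
    return (header >> index) & ((1 << width) - 1)


def decode_header(header: int) -> dict:
    """Decode a method header according to the two layouts listed above."""
    fields = {
        "alternate_set": header < 0,               # sign bit selects the set
        "large_frame": bool(field(header, 17, 1)),
        "num_temps": field(header, 18, 6),
        "num_args": field(header, 24, 4),
    }
    if header < 0:   # sign bit 1: 16-bit literal count, primitive in bytecode
        fields["num_literals"] = field(header, 0, 16)
        fields["has_primitive"] = bool(field(header, 16, 1))
        fields["access_modifier"] = field(header, 28, 2)
    else:            # sign bit 0: primitive number split across two fields
        fields["num_literals"] = field(header, 9, 8)
        fields["primitive"] = field(header, 0, 9) | (field(header, 28, 1) << 9)
        fields["flag"] = bool(field(header, 29, 1))
    return fields


# Old format: primitive 677 (low 9 bits 165, high bit set), 200 literals.
old = 165 | (200 << 9) | (5 << 18) | (2 << 24) | (1 << 28)
# New format: 40000 literals, has primitive, sign bit (index 30) set.
new = (40000 | (1 << 16) | (3 << 18) | (1 << 24) | (1 << 30)) - (1 << 31)
print(decode_header(old)["primitive"], decode_header(new)["num_literals"])
```

The only field whose position differs between the two layouts, apart from the primitive bits, is the literal count, which is why code that needs numLiterals has to look at the sign bit first.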

So whenever the VM needed to know the numLiterals (e.g. the GC in visiting pointer fields in methods) it had to switch-hit:

StackInterpreter>>literalCountOfHeader: headerPointer
    <api>
    "We support two method header formats, as selected by the sign flag. Even if the VM only
     has one bytecode set, supporting the two formats here allows for instantiating methods in
     the other format for testing, etc."
    ^(self headerIndicatesAlternateBytecodeSet: headerPointer)
        ifTrue: [self literalCountOfAlternateHeader: headerPointer]
        ifFalse: [self literalCountOfOriginalHeader: headerPointer]

StackInterpreter>>headerIndicatesAlternateBytecodeSet: methodHeader
    <api>
    <inline: true>
    "A negative header selects the alternate bytecode set."
    ^methodHeader signedIntFromLong < 0

It's not a lot of work since the header has to be fetched anyway. But it's complexity, and things should be as simple as possible but no simpler ;-)

So since Spur is a chance for a fresh start it seemed like a good time to move to a single method header format, while keeping the ability to support multiple bytecode sets.
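For comparison with the switch-hitting code above, here is a rough sketch of what a single header format buys. It is written in Python for illustration and assumes the expanded layout (16-bit literal count in the low bits, sign bit selecting the bytecode set) applies to every method; the function names are hypothetical, not the VM's.

```python
def num_literals(header: int) -> int:
    """Single format: the literal count is always the low 16 bits."""
    return header & 0xFFFF


def uses_alternate_bytecode_set(header: int) -> bool:
    """The sign bit still selects the bytecode set, but not the layout."""
    return header < 0


primary = 300 | (2 << 24)                          # sign bit clear
secondary = (300 | (2 << 24) | (1 << 30)) - (1 << 31)  # sign bit set
# Same literal count either way, with no test of the sign bit needed.
print(num_literals(primary), num_literals(secondary))
```

The GC-style question "how many literals?" no longer depends on which bytecode set the method uses; only the interpreter's dispatch needs the sign bit.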

> Also, with a single bit, how can there be more than one alternative bytecode set?

One could, I suppose, put bits in a bytecode, just like the primitive is encoded in a callPrimitive: bytecode. But I don't like that. Instead, if one needs more, just add another bit. As described above there's room for a two bit field in the sign and flag bits combined.

One might take the view that only one additional set is needed. One can develop it, test it, then move to it by recompiling everything to it and switching. E.g. a snapshot operation that flips the sign bit on save could simply move methods from one set to another.
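The sign-bit flip mentioned here is a one-line bit operation. A minimal sketch in Python, assuming the header is a signed 31-bit value (sign bit at index 30) and that both sets already share one header layout; `flip_bytecode_set` is a hypothetical name, not an existing VM function.

```python
HEADER_BITS = 31                    # assumption: 32-bit VM SmallInteger payload
SIGN_BIT = 1 << (HEADER_BITS - 1)   # index 30


def flip_bytecode_set(header: int) -> int:
    """Toggle the sign bit, leaving every other header field untouched."""
    pattern = (header & ((1 << HEADER_BITS) - 1)) ^ SIGN_BIT
    # Convert the 31-bit pattern back to a signed value.
    return pattern - (1 << HEADER_BITS) if pattern & SIGN_BIT else pattern


header = 5 | (2 << 24)              # 5 literals, 2 args, primary set
flipped = flip_bytecode_set(header)
print(flipped < 0, flipped & 0xFFFF, flip_bytecode_set(flipped) == header)
```

Applied to every method at snapshot time, this would swap which set counts as primary without touching literal counts, temps or argument counts.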

However, if, as the Smalltalk-X folks do, one wants to directly support the bytecode of another language (Java, Python?) then having, say, 4 sets to choose from might be nice.
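A two-bit selector built from the sign and flag bits would give the four sets mentioned. The sketch below is purely hypothetical (Python for illustration): the bit positions follow the old layout (flag at index 29, sign at index 30 of a 31-bit value), and the table entries beyond the sets named in this thread are placeholders.

```python
SIGN_INDEX = 30   # sign bit of a signed 31-bit header value
FLAG_INDEX = 29   # flag bit position in the old layout

# Placeholder table: only the first two names appear in this thread.
BYTECODE_SETS = ("SqueakV3", "Sista", "unassigned-2", "unassigned-3")


def bytecode_set_index(header: int) -> int:
    """Combine sign and flag bits into a selector in the range 0..3."""
    sign = (header >> SIGN_INDEX) & 1
    flag = (header >> FLAG_INDEX) & 1
    # The sign bit is the low selector bit, so a plain negative header
    # still means "the one alternate set", as it does today.
    return sign | (flag << 1)


for h in (0, -(1 << 30), 1 << 29, -(1 << 29)):
    print(BYTECODE_SETS[bytecode_set_index(h)])
```

Keeping the sign bit as the low selector bit means existing methods that only ever set the sign bit keep their meaning if the second bit is added later.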

Right now the pressing need is to support the Sista bytecode set. Clément is making strong progress and I've implemented a fair amount of the VM support. We have the bytecode set defined, the inline primitives defined. I'll leave it to Clément to describe the optimizer status. In the VM we have the bytecode set implemented, the performance counters on conditional branches that call back into the image when they trip, and the class trap bytecode, used to check that objects are of the required classes before entering unsafe optimized code. That leaves the inlined primitives, of which I've implemented a handful of the simpler ones. So we're on course to have a prototype by Christmas ;-)
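As a rough picture of the counters on conditional branches, here is a toy model in Python. It is not how the VM implements them: the countdown scheme, the threshold value, and the names `CountingBranch` and `on_trip` are assumptions for illustration only.

```python
class CountingBranch:
    """Toy conditional branch that counts executions and trips a callback."""

    def __init__(self, threshold, on_trip):
        self.threshold = threshold      # hypothetical trip threshold
        self.remaining = threshold      # executions left before tripping
        self.on_trip = on_trip          # stands in for the call-back into the image
        self.taken = 0
        self.not_taken = 0

    def execute(self, condition: bool) -> bool:
        if condition:
            self.taken += 1
        else:
            self.not_taken += 1
        self.remaining -= 1
        if self.remaining == 0:
            self.on_trip(self)          # hand the profile to the optimizer
            self.remaining = self.threshold
        return condition


trips = []
branch = CountingBranch(3, lambda b: trips.append((b.taken, b.not_taken)))
for outcome in (True, False, True, True):
    branch.execute(outcome)
print(trips)  # [(2, 1)]
```

The taken/not-taken counts are the kind of information an image-level optimizer could use to decide which paths are worth optimizing.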

At a later stage we'll add an OptimizedContext (which in the VM will have an associated new frame format) that will have a pointer stack (as the current Context does) and a byte stack for handling raw data such as floating-point. The JIT will map (at least some) memory locations in the byte stack onto the floating-point registers for much improved floating-point performance.
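To illustrate the split between a pointer stack and a byte stack, here is a small Python sketch. It is only a model of the idea: the class name `OptimizedContextSketch`, its methods, and the choice of 8-byte little-endian doubles are assumptions, and it says nothing about the eventual frame format.

```python
import struct


class OptimizedContextSketch:
    """Toy context with one stack of object references and one of raw bytes."""

    def __init__(self):
        self.pointer_stack = []          # object references, as in Context
        self.byte_stack = bytearray()    # raw, untagged data such as floats

    def push_pointer(self, obj):
        self.pointer_stack.append(obj)

    def pop_pointer(self):
        return self.pointer_stack.pop()

    def push_float(self, value: float):
        # Store the raw 8-byte representation; no boxed Float is allocated.
        self.byte_stack += struct.pack("<d", value)

    def pop_float(self) -> float:
        if len(self.byte_stack) < 8:
            raise IndexError("byte stack underflow")
        raw = bytes(self.byte_stack[-8:])
        del self.byte_stack[-8:]
        return struct.unpack("<d", raw)[0]


ctx = OptimizedContextSketch()
ctx.push_pointer("receiver")
ctx.push_float(1.5)
ctx.push_float(2.25)
print(ctx.pop_float() + ctx.pop_float(), ctx.pop_pointer())  # 3.75 receiver
```

Because the byte stack holds unboxed values at fixed offsets, a JIT could in principle keep some of those slots in floating-point registers, which is the performance gain described above.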

--
best,

Eliot

Clément Béra

Re: Imminent change to Spur image format

2014-08-11 19:59 GMT+02:00 Eliot Miranda <[hidden email]>:

> Right now the pressing need is to support the Sista bytecode set. Clément is making strong progress and I've implemented a fair amount of the VM support. We have the bytecode set defined, the inline primitives defined. I'll leave it to Clément to describe the optimizer status.

This is difficult to say. I believe I will have lots of time from November 1st to mid-December to make lots of progress on the optimizer. It should be stable enough for some benchmarks by Christmas.

Now there are other points, such as lazy deoptimization, discarding optimized methods, fixing the settings to reach maximum performance, uncommon bugs found on regression tests, debugging and inspecting context transparently, stack replacement for on-the-fly optimization (for dynamic deoptimization it is done and easier) that may or may not take time.

A prototype should be there for Christmas, but production will take more time.
