New Cog VMs available

Eliot Miranda-2
... at http://www.mirandabanda.org/files/Cog/VM/VM.r3311/.

These should fix the regression introduced by the map changes in 3308.  They certainly fix the two crashes I've looked at, one an update of a Squeak trunk image and the other the startup of recent Newspeak images.  Apologies for the inconvenience.


CogVM binaries as per VMMaker.oscog-eem.1204/r3311

Cogits:

Fix regression in map machinery due to adding AnnotationExtension scheme.
findMapLocationForMcpc:inMethod: must not be confused by IsDisplacementX2N
bytes.  This is likely the cause of the recent crashes with r3308 and earlier.

Introduce marryFrameCopiesTemps and use it to
not copy temps in Spur context creation trampolines.

Change initial usage counts to keep more recently jitted methods around for
longer, and do *not* throw away PICs in freeOlderMethodsForCompaction, so that
there's a better chance of Sista finding send and branch data for the tripping
method.

extendedPushBytecode /does/ need a frame.

Don't save the header in a scratch register unless
it is useful to do so in the Spur at:[put:] primitives.

Fix slip in genGetNumBytesOf:into:.  And notice that
genGetFormatOf:into:baseHeaderIntoScratch: et al can use byte access
to get at format, as intended in the Spur header design.

Fix unlinking dynamic super sends.

Reduce false positives in access control violation reporting by marking the
super send we actually use as privileged. Remove unused Newspeak bytecodes.

Internal:

Fix code generation bug surfaced by inline primitives.  On x86 movb N(%reg),%rl
can only store into al, bl, cl & dl, whereas movzbl can store into any reg.  On
ARM move byte also zero-extends.  So change definition of MoveMbrR to always
zero-extend, use movzbl on x86 and remove all the MoveCq: 0 R: used to zero the
bits of the target of a MoveMb:r:R:.  And now that we have
genGetNumSlotsOf:into:, use it.
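
To illustrate the movb/movzbl point above, here is a minimal sketch that models a 32-bit register as a Python integer. The helper names `movb_load` and `movzbl_load` are hypothetical and only model the two x86 instructions; they are not part of the Cogit.

```python
def movb_load(dest_reg_value: int, byte: int) -> int:
    """Model x86 `movb mem, %rl`: only the low 8 bits of the register change."""
    return (dest_reg_value & 0xFFFFFF00) | (byte & 0xFF)


def movzbl_load(dest_reg_value: int, byte: int) -> int:
    """Model x86 `movzbl mem, %reg`: the byte is zero-extended to 32 bits."""
    return byte & 0xFF


stale = 0xDEADBEEF  # whatever happened to be in the target register

# Plain movb leaves the stale upper 24 bits in place.
print(hex(movb_load(stale, 0x7F)))    # 0xdeadbe7f

# movzbl gives a clean value with no separate zeroing step.
print(hex(movzbl_load(stale, 0x7F)))  # 0x7f

# The older two-step sequence (zero the register, then movb) gives the same
# result as a single movzbl, which is why the explicit zeroing can go.
print(movb_load(0, 0x7F) == movzbl_load(stale, 0x7F))  # True
```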

Fix a slip in genTrinaryInlinePrimitive:, meet constraint that the target must
be in ReceiverResultReg, and do a better job of register allocation therein.

Do dead code elimination for the branch following an inlined comparison (this
is done in genBinaryInlineComparison:opFalse:destReg: copying the scheme in
genSpecialSelectorEqualsEquals).

Do register allocation in the right place in genUnaryInlinePrimitive:.

Fix overflow slot access in genGetNumSlotsOf:into: et al.

Fix several slips in inline primitive generation: Object>>at:put: needs to
include a store check.  Some register allocation code was wrong.  Some results
needed converting to SmallIntegers and recording results as pushed on the sim
stack.

Change callPrimitiveBytecode to genCallPrimitiveBytecode in the Cogit.
Remove the misnomer genConvertIntegerToSmallIntegerInScratchReg:

Type of AbstractInstruction opcode must be unsigned now that we have
more than 128 opcodes (XCHGRR pushed things over the top).
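
A small sketch of why the signedness matters, assuming the opcode lives in an 8-bit field (the note implies this but does not say so outright). The opcode value 130 is made up for illustration.

```python
import ctypes


def as_signed_byte(opcode: int) -> int:
    """Reinterpret the low 8 bits of opcode as a signed char."""
    return ctypes.c_int8(opcode).value


def as_unsigned_byte(opcode: int) -> int:
    """Reinterpret the low 8 bits of opcode as an unsigned char."""
    return ctypes.c_uint8(opcode).value


hypothetical_opcode = 130  # any opcode number above 127 shows the problem

print(as_signed_byte(127), as_unsigned_byte(127))  # 127 127
print(as_signed_byte(hypothetical_opcode))         # -126
print(as_unsigned_byte(hypothetical_opcode))       # 130
```

With a signed field, every opcode numbered 128 or higher reads back as a negative number, so comparisons and table lookups keyed on the opcode go wrong.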

Lay the groundwork for 32-bit intra-zone jumps and calls on ARM by introducing
CallFull and JumpFull (and rewrites thereof) that are expected to span the full
address space, leaving Call/JumpLong to span merely the 16 MB code zone.  On x86
CallFull and JumpFull simply default to Call/JumpLong.
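
The distinction can be sketched with the reach of an ARM (A32) branch instruction, which encodes a signed 24-bit word offset relative to the instruction address plus 8, i.e. roughly +/-32 MB. The function names and the code-zone base address below are assumptions for illustration; the actual Cogit logic is not shown in this post.

```python
ARM_PC_BIAS = 8  # in ARM (A32) state the PC reads as instruction address + 8


def fits_arm_branch(instr_addr: int, target_addr: int) -> bool:
    """True if target is reachable by a single B/BL from instr_addr."""
    offset = target_addr - (instr_addr + ARM_PC_BIAS)
    return offset % 4 == 0 and -(1 << 25) <= offset < (1 << 25)


def choose_call(instr_addr: int, target_addr: int) -> str:
    """Pick the short form when it reaches, otherwise the full-range form."""
    return "Call" if fits_arm_branch(instr_addr, target_addr) else "CallFull"


CODE_ZONE_BASE = 0x10000000        # hypothetical placement of the code zone
CODE_ZONE_SIZE = 16 * 1024 * 1024  # the 16 MB zone mentioned above

# A target inside the zone is always within reach of a plain branch.
print(choose_call(CODE_ZONE_BASE, CODE_ZONE_BASE + CODE_ZONE_SIZE - 4))  # Call

# A target far outside the zone needs the full-address-space form.
print(choose_call(CODE_ZONE_BASE, 0x80000000))  # CallFull
```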

Replace bytecode trapIfNotInstanceOf by jumpIfNotInstanceOfOrPop.

Rewrote the JIT logic for traps to be able to write trap trampoline calls at
the end of the cogMethod.

Refactor the slot store and store check machinery to take an inFrame: argument
and hence deal with the store check in genInnerPrimitiveAtPut: on ARM.

Fix limitation with MoveRXbrR; can only do movb from
%al through %dl, so swap with %eax around movb.

Fix mistake with genGetNumBytesOf:into: by refactoring
genGetFormatOf:into:baseHeaderIntoScratch: into
genGetBits:ofFormatByteOf:into:baseHeaderIntoScratch:
and hence fetching and subtracting only odd bits of format.

Correct the in-line primitive SmallInteger comparisons; CmpXR is confusing ;-)

Fix var op var unsafe byte at:.  Result must be converted to SmallInteger.

Correct the generated Slang for the new register allocation code by adding a
read-before-written pass to C generation that initializes variables
read-before-written with 0 (the C equivalent of nil).
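
As a rough illustration of such a pass, the sketch below handles straight-line code only and represents a method body as a list of (kind, variable) accesses. The representation, the function names and the use of `sqInt` in the emitted declarations are assumptions for the example, not Slang's actual data structures.

```python
def read_before_written(accesses, local_names):
    """Return the locals whose first access is a read, in order of discovery."""
    written = set()
    needs_init = []
    for kind, name in accesses:
        if name not in local_names:
            continue  # ignore globals, arguments and so on
        if kind == "write":
            written.add(name)
        elif kind == "read" and name not in written and name not in needs_init:
            needs_init.append(name)
    return needs_init


def emit_declarations(local_names, needs_init):
    """Emit C declarations, zero-initializing only where required."""
    return [
        f"sqInt {name} = 0;" if name in needs_init else f"sqInt {name};"
        for name in local_names
    ]


body = [("read", "reg"), ("write", "reg"), ("write", "tmp"), ("read", "tmp")]
local_names = ["reg", "tmp"]

flagged = read_before_written(body, local_names)
print(flagged)                                  # ['reg']
print(emit_declarations(local_names, flagged))  # ['sqInt reg = 0;', 'sqInt tmp;']
```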

Fix a bug where register allocation sometimes marked ReceiverResultReg as
dead when it was still alive.

Added some abstraction over register allocation. This is now used in inline
primitives.




Re: New Cog VMs available

Eliot Miranda-2

On 16 Apr 2015, at 16:50, Eliot Miranda wrote:

... at http://www.mirandabanda.org/files/Cog/VM/VM.r3311/.

These should fix the regression introduced by the map changes in 3308.  They certainly fix the two crashes I've looked at, one an update of a squeak trunk image and the other the startup of recent Newspeak images.  Apologies for the inconvenience.

Well, this is embarrassing as usual but I'm still seeing crashes in the image update.  So I'll have to look deeper.  At least the Newspeak fix was real, but it didn't fix everything.




Re: New Cog VMs available

Eliot Miranda-2

On 16 Apr 2015, at 17:39, Eliot Miranda wrote:


On 16 Apr 2015, at 16:50, Eliot Miranda wrote:

... at http://www.mirandabanda.org/files/Cog/VM/VM.r3311/.

These should fix the regression introduced by the map changes in 3308.  They certainly fix the two crashes I've looked at, one an update of a squeak trunk image and the other the startup of recent Newspeak images.  Apologies for the inconvenience.

Well, this is embarrassing as usual but I'm still seeing crashes in the image update.  So I'll have to look deeper.  At least the Newspeak fix was real, but it didn't fix everything.

And I have finally found it.  A stupid copy-paste error that caused a failure to mark references to classes in linked sends.  Fixed VMs should be available in an hour or two.  Again apologies.





Re: [Pharo-dev] New Cog VMs available

Eliot Miranda-2
In reply to this post by Eliot Miranda-2


On Fri, Apr 17, 2015 at 5:16 PM, Clément Bera <[hidden email]> wrote:


2015-04-16 17:16 GMT-07:00 Sean P. DeNigris <[hidden email]>:
Eliot Miranda-2 wrote
> ...

Wow! That seems like a /lot/ of fixes... thanks :)

 
These past few weeks four of us (Tim, Ryan, Eliot and I) have been committing regularly to the JIT, so there are more fixes than when Eliot is working alone :-)

Finally we have a team working on it.  Come join us!!  There's /lots/ to work on.  See http://www.mirandabanda.org/cogblog/cog-projects/ & http://www.mirandabanda.org/cogblog/collaborators/. It is challenging, useful and FUN!


--
best,
Eliot



Re: [Pharo-dev] New Cog VMs available

David T. Lewis
On Fri, Apr 17, 2015 at 06:46:19PM -0700, Eliot Miranda wrote:

> On Fri, Apr 17, 2015 at 5:16 PM, Clément Bera <[hidden email]> wrote:
> >
> > 2015-04-16 17:16 GMT-07:00 Sean P. DeNigris <[hidden email]>:
> >
> >> Eliot Miranda-2 wrote
> >> > ...
> >>
> >> Wow! That seems like a /lot/ of fixes... thanks :)
> >>
> > This past few weeks we were 4 to commit regularly on the JIT (Tim, Ryan,
> > Eliot and I) so there are more fixes than when Eliot is working alone :-)
>
> Finally we have a team working on it.  Come join us!!  There's /lots/ to
> work on.  See http://www.mirandabanda.org/cogblog/cog-projects/ &
> http://www.mirandabanda.org/cogblog/collaborators/. It is challenging,
> useful and FUN!

The project list at http://www.mirandabanda.org/cogblog/cog-projects/ makes
for really good reading in its own right, even if it makes me feel just a bit
badly about my own lack of progress on that last item in the list. But we
will get there :-)

Kudos to Clément, Tim, Ryan and Eliot for the initiative and teamwork!

Dave



Re: [Pharo-dev] New Cog VMs available

Chris Muller-3
In reply to this post by Eliot Miranda-2
> But yesterday and today a wonderful thing has happened.  Clément has
> understood the core optimization and code generation structures in the JIT
> as well as I do and is now both improving the code quality and implementing
> more aggressive optimizations.  This is /so/ satisfying.  You know how good
> Clément is.  His input is so strong.  I am /very/ happy.  For example,
> Clément is currently modifying #== in the Spur JIT so that the code checks
> for forwarders only if #== is false, and that if #== is false and either
> object is a forwarder, it/them is/are followed and the #== retried.  This
> should speed up #== by 50%.  It won't make much difference at the macro
> level but this is a non-trivial optimization to write and so now we have two
> people who really understand the core optimizing JIT and I can now happily
> run in front of a bus.

This is like watching a gripping novel unfold!  Congratulations
Clément for joining Eliot in Spur's engine room, and to Eliot for
enjoying more company there, your elation is infectious.
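
For readers who want the #== change quoted above spelled out, here is a minimal sketch of the idea: compare first, and look for forwarders only when the comparison fails. The `Obj` class, its `forward_to` field and the function names are hypothetical stand-ins for Spur's forwarding objects, not the JIT's generated code.

```python
class Obj:
    """Toy heap object; forward_to is set when the object has been forwarded."""

    def __init__(self, name, forward_to=None):
        self.name = name
        self.forward_to = forward_to


def follow(obj):
    """Chase forwarding pointers to the live object."""
    while obj.forward_to is not None:
        obj = obj.forward_to
    return obj


def identical(a, b):
    """#== with the forwarder check moved off the fast path."""
    if a is b:
        return True  # common case: no forwarder check at all
    if a.forward_to is None and b.forward_to is None:
        return False  # genuinely different objects
    return identical(follow(a), follow(b))  # follow, then retry once


x = Obj("x")
old_x = Obj("old x", forward_to=x)  # a forwarder left behind by a become
y = Obj("y")

print(identical(x, x))      # True
print(identical(old_x, x))  # True
print(identical(x, y))      # False
```

The retry terminates because `follow` never returns a forwarder, so the second attempt takes one of the first two branches.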


Re: [Pharo-dev] New Cog VMs available

Levente Uzonyi-2
In reply to this post by Eliot Miranda-2
On Fri, 17 Apr 2015, Eliot Miranda wrote:

> Finally we have a team working on it.  Come join us!!  There's /lots/ to work on.  See http://www.mirandabanda.org/cogblog/cog-projects/ & http://www.mirandabanda.org/cogblog/collaborators/. It is challenging, useful and FUN!
"- a solution to the problem of free space in segments containing pinned
objects at snapshot time"

My naive approach would be to separate pinned objects from non-pinned
objects. Whenever you pin an object, it's moved to a "pinned" segment -
which is a segment with only pinned objects - and vice versa.

Levente



Re: [Pharo-dev] New Cog VMs available

Eliot Miranda-2
Hi Levente,

     the code is indeed written to cluster pinned objects.  Objects can only be punned in old space.  IIRC there's a bit in each segment header to say whether it contains pinned objects.  If such a segment exists then a newly pinned object in new space will be become'd into a pinned object in that segment.  But I expect the standard use case is pin, make an FFI call, unpin, so we shouldn't worry too much about clustering.  So if an object is already in old space the system merely sets the pin bit.  It may erroneously set the "segment contains pinned" bit too.  It would be better for the system to nominate a single segment to contain pinned objects.  I'll try and remember to take a look but you're welcome to too ;-)

Eliot (phone)
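
The behaviour described in this message can be sketched roughly as follows. This is a toy model built only from the description above (which is itself given from memory); the `Segment` and `Obj` classes and the `pin` function are hypothetical and do not mirror the Spur memory manager's code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Segment:
    objects: list = field(default_factory=list)
    contains_pinned: bool = False  # the per-segment header bit


@dataclass(eq=False)
class Obj:
    name: str
    in_new_space: bool = True
    pinned: bool = False
    forward_to: Optional["Obj"] = None


def pin(obj, segments):
    """Pin obj, returning the object that callers should use from now on."""
    if obj.in_new_space:
        # Prefer a segment that already holds pinned objects (clustering).
        seg = next((s for s in segments if s.contains_pinned), segments[0])
        copy = Obj(obj.name, in_new_space=False, pinned=True)
        seg.objects.append(copy)
        seg.contains_pinned = True
        obj.forward_to = copy  # the original is become'd to the copy
        return copy
    # Already in old space: just set the pin bit where the object lives.
    obj.pinned = True
    for seg in segments:
        if any(o is obj for o in seg.objects):
            seg.contains_pinned = True
    return obj


segments = [Segment(), Segment(contains_pinned=True)]

young = Obj("young")
pinned_young = pin(young, segments)
print(pinned_young in segments[1].objects)  # True

old = Obj("old", in_new_space=False)
segments[0].objects.append(old)
print(pin(old, segments) is old, segments[0].contains_pinned)  # True True
```

The last line shows the scattering discussed here: pinning an object that is already in old space marks whatever segment it happens to live in.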



[OT] New Cog VMs available

Martin Bähr
Excerpts from Eliot Miranda's message of 2015-04-18 18:49:16 +0200:
>      the code is indeed written to cluster pinned objects.  Objects can only be punned in old space.  

how do you pun an object?
it's like pinning, except you don't use a pin, but a pun, so shocking that the
object is stunned and won't move.

(apologies for the disruption)

greetings, martin.



Re: [Pharo-dev] New Cog VMs available

Levente Uzonyi-2
In reply to this post by Eliot Miranda-2
Hi Eliot,

I expect pinning to be used for things other than FFI. For example the
ExternalObjectTable is nothing but poor man's pinning, which can be nuked
when the VM supports pinning out of the box. In this case pinned objects
(Semaphores) will be all over the place due to files and sockets, and
their lifetime will also vary from a few seconds to years.

Not mixing pinned objects with non-pinned objects can also save the bit
used for marking pinned objects in the object header.
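
One way to read the header-bit remark: if pinned objects only ever live in dedicated segments, whether an object is pinned follows from its address alone. A minimal sketch, with made-up segment addresses and a hypothetical `is_pinned` helper:

```python
def is_pinned(address, pinned_segments):
    """True if address falls inside any (start, size) pinned segment."""
    return any(start <= address < start + size for start, size in pinned_segments)


# Hypothetical layout: one 1 MB segment reserved for pinned objects.
pinned_segments = [(0x20000000, 1 << 20)]

print(is_pinned(0x20000010, pinned_segments))  # True
print(is_pinned(0x30000010, pinned_segments))  # False
```

The trade-off is that the check becomes a range lookup rather than a single bit test in the object header.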

Levente
