
New Cog VMs available


Eliot Miranda-2

New Cog VMs available

... at http://www.mirandabanda.org/files/Cog/VM/VM.r3684

CogVM binaries as per VMMaker.oscog-eem.1834/r3684

General:

Correct undue sign extension while promoting 32-bit ints to 64 bits in
fetchLong64:ofObject:. This was causing trouble in the non-Spur object
memory (V3).
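To illustrate the class of bug described above, here is a minimal sketch in C. It is not the VM's fetchLong64:ofObject: code; both function names are hypothetical. When a 64-bit value is assembled from two 32-bit halves, a low half held in a signed type is sign-extended on promotion and overwrites the high half.

```c
#include <stdint.h>

/* Buggy: the low half is promoted as a signed value, so its sign bit is
   copied into the upper 32 bits before the OR. */
uint64_t join_halves_buggy(int32_t low, int32_t high)
{
    return ((uint64_t)(int64_t)high << 32) | (uint64_t)(int64_t)low;
}

/* Fixed: zero-extend each half through uint32_t before widening. */
uint64_t join_halves_fixed(int32_t low, int32_t high)
{
    return ((uint64_t)(uint32_t)high << 32) | (uint64_t)(uint32_t)low;
}
```

The two functions agree whenever the low half is non-negative, which is why this kind of bug can go unnoticed for a long time.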

Fix slowdown of update on 64-bit x64 by using int variables for the tides in
sqExternalSemaphores.c and inline assembler for the sqCompareAndSwap in
sqAtomicOps.h.
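For readers unfamiliar with compare-and-swap, the following sketch shows the operation's semantics. It deliberately uses C11 <stdatomic.h> rather than inline assembler, so it is not what sqAtomicOps.h does; compare_and_swap and atomic_increment are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically replace *var with newValue only if it still holds oldValue.
   Answers true if the swap happened. */
bool compare_and_swap(atomic_int *var, int oldValue, int newValue)
{
    return atomic_compare_exchange_strong(var, &oldValue, newValue);
}

/* Lock-free increment built on the helper: retry until no other thread
   changed the counter between the load and the swap. */
int atomic_increment(atomic_int *counter)
{
    int seen;
    do {
        seen = atomic_load(counter);
    } while (!compare_and_swap(counter, seen, seen + 1));
    return seen + 1;
}
```

A 32-bit int counter is enough for this pattern, which fits the note's choice of int variables for the tides.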

Fix slips in offset time primitives. They need to pop argumentCount + 1, not
argumentCount.
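The "+ 1" accounts for the receiver, which sits on the stack below the arguments. A toy model in C (the Stack type and function names are hypothetical, not the interpreter's own) shows the successful-return step:

```c
#include <assert.h>

#define STACK_MAX 16

/* Toy operand stack; sp counts the live entries. */
typedef struct {
    long slots[STACK_MAX];
    int sp;
} Stack;

void push(Stack *s, long value)
{
    assert(s->sp < STACK_MAX);
    s->slots[s->sp++] = value;
}

/* Successful primitive return: drop the arguments AND the receiver
   (argumentCount + 1 entries), then push the result. */
void pop_then_push(Stack *s, int argumentCount, long result)
{
    assert(s->sp >= argumentCount + 1);
    s->sp -= argumentCount + 1;
    push(s, result);
}
```

Popping only argumentCount entries would leave the receiver under the result, so the stack would grow by one slot on every call.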

Define an optional primitive as the fast primitive fail code so that unselected
optional primitives are in the primitiveTable as fast primitive fails.

Spur:

Have WideString>>at: fail to answer an out-of-range character in the
interpreter primitive (as well as in the JIT).

Fix bug in following state on primitive failure. The old code would
always follow to depth 1, even if the accessor depth was 0.
Hard-code primitiveSize's depth to 0 (accessing length in at:[put:]
and size causes the stack depth computation to answer 1 instead of
0 for these primitives).

Fix assert and dequeueMourner for case where mournQueue is nil.

- fixed a bug in accessing the receiver in the immutability primitive for
the mirror primitive.

- change primitiveSetOrHasIdentityHash to patch the class table if
the new hash is set to a behavior (the primitive knows it's a
behavior if the second (optional) argument is true).
For example:
    FullBlockClosure tryPrimitive: 161 withArgs: {38. true}

No longer follow the method and context fields in a closure in
activateNewClosureMethod:numArgs:mayContextSwitch:; the caller will
have failed if these are forwarders, so no need to check again.

Cogit:

Add a primitive that answers pc map data for methods which can be
used to better decorate methods in the VM Profiler. Refactor the pc
map enumeration facilities so that the Sista pic data primitive can
share the same enumerator. Do this by collapsing the
isBackwardBranch and annotation parameters into a single parameter.

Follow selectors in the openPICList post Spur become.

Fix the ARM's caller-saved register mask now that we can name all the actual
registers.

Reworked machine code generation of immutability so that for common stores it
uses a single trampoline for both store checks and immutability checks.

- improved support for register allocation: branch merges are now
successfully compiled with register moves instead of spilling.

Sista Cogit:

Don't bother to add counters to conditional jumps implementing and:
and or:. Added the remote inst var access bytecode in the Sista V1
bytecode set without interfering with existing code.

Plugins:

Upgrade LargeIntegersPlugin to v2.0
LargeInteger primitives now deal with 32-bit digits. No change to image code.
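As a rough picture of what arithmetic on 32-bit digits looks like, here is a sketch of multi-digit addition in C. It is not taken from LargeIntegersPlugin; add_digits32 and its little-endian digit layout are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two little-endian magnitudes of n 32-bit digits each into sum
   (also n digits). Answers the carry out of the top digit (0 or 1). */
uint32_t add_digits32(const uint32_t *a, const uint32_t *b,
                      uint32_t *sum, size_t n)
{
    uint64_t carry = 0;
    assert(a != NULL && b != NULL && sum != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        sum[i] = (uint32_t)t;   /* low 32 bits are the digit */
        carry = t >> 32;        /* high bits carry into the next digit */
    }
    return (uint32_t)carry;
}
```

A 64-bit intermediate holds the sum of two digits plus the carry without overflow, so each loop iteration handles four bytes of the number.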

Memory is 8-byte aligned on Spur. When storing 32/64-bit large
integer values, always fill all eight bytes whatever the
effectively used size, rather than bothering to distinguish sizes.

Generate integer type checking as C macros rather than direct/indirect
interpreterProxy function calls in plugins. This, and 32-bit accessing, mean
significantly faster large integer arithmetic.
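The difference between the two styles of type check can be sketched as follows. Everything here is hypothetical: the Proxy struct only imitates the idea of a function-pointer table handed to a plugin, and the low-bit tag test is an assumption chosen for illustration, not the generated VM code.

```c
#include <stdint.h>

typedef intptr_t oop_t;   /* hypothetical object reference type */

/* Hypothetical proxy: a table of function pointers. Every check made
   through it costs an indirect call. */
typedef struct {
    int (*isIntegerObject)(oop_t oop);
} Proxy;

int proxy_isIntegerObject(oop_t oop)
{
    return (int)(oop & 1);
}

/* Macro version: the same test, expanded inline at the call site.
   Assumes (for illustration only) that tagged integers have the low
   bit set. */
#define IS_INTEGER_OBJECT(oop) (((oop) & 1) != 0)
```

In a tight arithmetic loop the macro form lets the compiler fold the check into surrounding code, while the proxy form must perform a call for each operand.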

Fix primAlienReplace to use positiveMachineIntegerValueOf: instead of
positive32BitValueOf:.

BitBltPlugin operates on 32-bit word units, therefore it's better to declare its
operands as 'unsigned int' rather than sqInt. On a 32-bit VM, this doesn't change
anything, but on 64-bit Spur, it makes this snippet work:

    | wideString source pos blt expectedWideString |
    source := #[1 64 255 14 1 64 48 251].
    expectedWideString := WideString fromByteArray: source.
    wideString := WideString new: source size // 4.
    pos := 0.
    blt := (BitBlt
        toForm: (Form new hackBits: wideString))
        sourceForm: (Form new hackBits: source).
    blt
        combinationRule: Form over;
        sourceX: 0;
        sourceY: pos // 4;
        height: wideString byteSize // 4;
        width: 4;
        destX: 0;
        destY: 0;
        copyBits.
    wideString restoreEndianness.
    self assert: wideString = expectedWideString

Hence it fixes loading/diffing MCZs with wide characters.
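The note does not say which operation went wrong with sqInt operands, so the following C sketch only illustrates one way operand width can change the result of word-oriented code. sqInt64 is a stand-in typedef for a 64-bit sqInt, and both function names are hypothetical.

```c
#include <stdint.h>

typedef int64_t sqInt64;   /* stand-in for a 64-bit sqInt; an assumption */

/* With a 64-bit operand, bits shifted past bit 31 survive in the result. */
uint64_t shift_left_wide(sqInt64 word, int n)
{
    return (uint64_t)word << n;
}

/* With a 32-bit unsigned operand, the result is truncated to one word,
   which is what code working in 32-bit word units expects. */
uint32_t shift_left_word(uint32_t word, int n)
{
    return word << n;
}
```

On a 32-bit VM both types are 32 bits wide, so the two functions would behave identically there.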

Mac OS Builds:

Don't link with -fvisibility=hidden; it breaks external plugins. Use
-fvisibility=default instead. This fixes e.g. UnixOSProcessPlugin on Mac OS X.

Windows Builds:

The Windows VMs are no longer "dpiAware". If you want one-to-one pixel mapping, check the README for instructions. It's a simple edit.

_,,,^..^,,,_

best, Eliot

bpi

Re: Crash Dump on OS X (was: New Cog VMs available)

Hi Eliot,

I just found a way to reproduce it: Maximise the window by holding alt and clicking the green window button.

Cheers,
Bernhard

> On 24.04.2016 at 10:38, Bernhard Pieber <[hidden email]> wrote:
>
> Hi Eliot,
>
> Thanks for your VMs.
>
> I just had a crash on OS X 10.11.4. Maybe it’s helpful:
> <crash.dmp.zip>
>
> Cheers,
> Bernhard
>
>> On 23.04.2016 at 21:48, Eliot Miranda <[hidden email]> wrote:

Levente Uzonyi

Re: [Vm-dev] New Cog VMs available

In reply to this post by Eliot Miranda-2

Hi Holger,

I ran the Shootout benchmarks[1], which I'd recently updated, on
cogspurlinuxht 3397 and 3648 using the latest Squeak Trunk image, and the
latter performed better in three benchmarks: reverseComplement (-20%),
pidigits (-30%) and fasta (-40%).
For the rest, the performance of the two VMs was the same.
So, I suspect there's an area not covered by these benchmarks where you
experience the slowdown. Or there's some other difference responsible for
the slowdown.
Can you tell us more about your benchmark?
Have you tried profiling it?

Levente

[1] http://leves.web.elte.hu/squeak/Shootout-ul.19.mcz