Re: [Vm-dev] Where to get Monitor implementation based on primitives?

Eliot Miranda-2
Hi Denis, Hi Clément,  Hi Frank,

On Thu, Jan 7, 2016 at 5:34 AM, Clément Bera <[hidden email]> wrote:
> Hello,
>
> Eliot, please, you told me you had the code and Denis is interested.
>
> It uses 3 primitives for performance.

Forgive the delay.  I thought it proper to ask permission since the code was written while I was at Qwaq.  I'm attaching the code in a fairly raw state.  The code is MIT, but copyright 3DICC.

It is a plug-in replacement for Squeak's Mutex, and with a little ingenuity could be a replacement for Squeak's Monitor.  It is quicker because it uses three new primitives: one to enter a critical section and set the owner, one to exit the critical section and release the owner, and one to test whether a critical section is owned, entering it if it is unowned.  Using the primitives means fewer block activations and fewer ensure: blocks when entering and exiting the critical section, and that's the actual cause of the speed-up.

You can benchmark the code as is.  Here are some results on 32-bit Spur, on my 2.2GHz Core i7:

{Mutex new. Monitor new. CriticalSection new} collect:
[:cs| | n |
n := 0.
[cs critical: [n := n + 1]. cs critical: [n := n + 1]. cs critical: [n := n + 1]. cs critical: [n := n + 1]. cs critical: [n := n + 1].
cs critical: [n := n - 1]. cs critical: [n := n - 1]. cs critical: [n := n - 1]. cs critical: [n := n - 1]. cs critical: [n := n - 1].
n ] bench]

{Mutex new. Monitor new. CriticalSection new} collect:
[:cs| | n |
n := 0.
cs class name, ' -> ',
[cs critical: [n := n + 1]. cs critical: [n := n + 1]. cs critical: [n := n + 1]. cs critical: [n := n + 1]. cs critical: [n := n + 1].
cs critical: [n := n - 1]. cs critical: [n := n - 1]. cs critical: [n := n - 1]. cs critical: [n := n - 1]. cs critical: [n := n - 1].
n ] bench]

#( 'Mutex -> 440,000 per second. 2.27 microseconds per run.'
'Monitor -> 688,000 per second. 1.45 microseconds per run.'
'CriticalSection -> 1,110,000 per second. 900 nanoseconds per run.')
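
For scale, the ratios implied by those figures can be worked out in a workspace.  This is only arithmetic on the three rates quoted above, not a new measurement, and it assumes printShowingDecimalPlaces: is available in the image:

```smalltalk
"Rate of CriticalSection relative to each class, using the per-second figures quoted above."
| rates |
rates := { 'Mutex' -> 440000. 'Monitor' -> 688000. 'CriticalSection' -> 1110000 }.
rates collect: [:each |
	each key, ' -> ', ((1110000 / each value) printShowingDecimalPlaces: 2), 'x']
```

On these numbers CriticalSection runs at roughly 2.5 times the Mutex rate and 1.6 times the Monitor rate.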

Replacement is probably trivial; rename Mutex to OldMutex, rename CriticalSection to Mutex, recompile.  But there are lots of mutexes in the system and these are potentially owned.  Transforming unowned ones is trivial, but transforming owned ones is, I think, impossible.  But at least in my system there are no owned mutexes or monitors.

Frank (or anyone else), would you be interested in creating a replacement for Squeak's Monitor based on CriticalSection?


Here are the two business methods:
CriticalSection methods for mutual exclusion
critical: aBlock
    "Evaluate aBlock protected by the receiver."
    ^self primitiveEnterCriticalSection
        ifTrue: [aBlock value]
        ifFalse: [aBlock ensure: [self primitiveExitCriticalSection]]

critical: aBlock ifLocked: lockedBlock
    "Answer the evaluation of aBlock protected by the receiver.  If it is already in a critical
     section on behalf of some other process answer the evaluation of lockedBlock."
    ^self primitiveTestAndSetOwnershipOfCriticalSection
        ifNil: [lockedBlock value]
        ifNotNil:
            [:alreadyOwner|
             alreadyOwner
                ifTrue: [aBlock value]
                ifFalse: [aBlock ensure: [self primitiveExitCriticalSection]]]

and the primitives:
CriticalSection methods for private-primitives
primitiveEnterCriticalSection
    "Primitive. The receiver must be unowned or owned by the current process to proceed.
     Answer whether the receiver was already owned by the current process."
    <primitive: 186>
    self primitiveFailed
    "In the spirit of the following"
    "[owner ifNil:
        [owner := Processor activeProcess.
         ^false].
      owner = Processor activeProcess ifTrue:
        [^true].
      self addLast: Processor activeProcess.
      Processor activeProcess suspend] valueUnpreemptively"

primitiveExitCriticalSection
    "Primitive. Set the receiver to unowned and if any processes are waiting on
     the receiver then proceed the first one, indicating that the receiver is unowned."
    <primitive: 185>
    self primitiveFailed
    "In the spirit of the following"
    "[owner := nil.
      self isEmpty ifFalse:
        [process := self removeFirst.
         process resume]] valueUnpreemptively"

primitiveTestAndSetOwnershipOfCriticalSection
    "Primitive. Attempt to set the ownership of the receiver.
     If the receiver is unowned set its owningProcess to the
     activeProcess and answer false.  If the receiver is owned
     by the activeProcess answer true.  If the receiver is owned
     by some other process answer nil."
    <primitive: 187>
    self primitiveFailed
    "In the spirit of the following"
    "[owner ifNil:
        [owner := Processor activeProcess.
         ^false].
      owner = Processor activeProcess ifTrue: [^true].
      ^nil] valueUnpreemptively"
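
To make the three answers (false, true, nil) concrete, here is a small workspace sketch of how the two public methods would behave.  It assumes the attached CriticalSection class is loaded and that the VM implements primitives 185-187; the temp names are mine, and the snippet must not be run from a process that is already at the highest priority:

```smalltalk
"Sketch only: assumes CriticalSection is loaded and primitives 185-187 are present."
| cs nested fromOther |
cs := CriticalSection new.

"Re-entry by the owning process: the inner critical: gets true from
 primitiveEnterCriticalSection, so it just evaluates its block and leaves
 the release to the outer call."
nested := cs critical: [cs critical: [#inner]].

"A different process that finds the section owned takes the ifLocked: branch.
 It is forked at a higher priority so that it runs before the outer block ends."
cs critical:
	[[fromOther := cs critical: [#entered] ifLocked: [#locked]]
		forkAt: Processor activePriority + 1].

{ nested. fromOther }
```

If the primitives behave as their comments describe, nested should end up as #inner and fromOther as #locked.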

2016-01-07 13:24 GMT+01:00 Denis Kudriashov <[hidden email]>:
 
> Hello.
>
> I heard about a new Monitor implementation based on new primitives.
> Where can I get it?

_,,,^..^,,,_
best, Eliot

CriticalSection.st (4K) Download Attachment

Eliot Miranda-2
and here's a version with a better class comment

--
_,,,^..^,,,_
best, Eliot

CriticalSection.st (4K) Download Attachment

Eliot Miranda-2
Hi Ben,

On Thu, Jan 7, 2016 at 10:39 AM, Ben Coman <[hidden email]> wrote:

This is great Eliot. Thank you and 3DICC.  After loading the changeset
into Pharo-50515 (32 bit Spur) I get the following results on my
laptop i5-2520M @ 2.50GHz

#('Mutex -> 254,047 per second'
 'Monitor -> 450,442 per second'
 'CriticalSection -> 683,393 per second')

In a fresh Image "Mutex allInstances basicInspect" lists just two mutexes...
1. NetNameResolver-->ResolverMutex
2. ThreadSafeTranscript-->accessSemaphore

I hate myself for getting distracted, but I'm finding this is fun.  One can migrate to the new representation using normal Monticello loads in two steps:

1. In the first version, redefine Mutex and Monitor to subclass LinkedList and have their owner/ownerProcess inst var first (actually third, after firstLink & lastLink), and add the primitives.

2. In the next version, check that all Mutex and Monitor instances are unowned, and then redefine to discard the excess inst vars.

Let me test this before committing, and see that all tests are ok.
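
The "check that all instances are unowned" step could be done from a workspace along these lines.  This is a sketch, not part of the attached code, and it assumes the inst var names mentioned above: 'owner' in Mutex and 'ownerProcess' in Monitor.

```smalltalk
"Sketch: collect any Mutex or Monitor that is currently owned.
 Assumes the inst var names 'owner' (Mutex) and 'ownerProcess' (Monitor)."
| owned |
owned := (Mutex allSubInstances select: [:m | (m instVarNamed: 'owner') notNil]),
	(Monitor allSubInstances select: [:m | (m instVarNamed: 'ownerProcess') notNil]).
owned isEmpty
	ifTrue: ['All Mutex and Monitor instances are unowned']
	ifFalse: [owned]
```

An empty result would mean the second Monticello version can safely discard the old inst vars; anything else lists the instances that need attention first.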


cheers -ben



--
_,,,^..^,,,_
best, Eliot

Ben Coman
On Fri, Jan 8, 2016 at 2:51 AM, Eliot Miranda <[hidden email]> wrote:

> I hate myself for getting distracted, but I'm finding this is fun.  One can
> migrate to the new representation using normal Monticello loads by
>
> In the first version redefine Mutex and Monitor to subclass LinkedList and
> have their owner/ownerProcess inst var first (actually third after firstLink
> & lastLink), and add the primitives.
>
> In the next version check that all Mutex and Monitor instances are unowned
> and then redefine to discard excess inst vars
>
> Let me test this before committing, and see that all tests are ok.

Should Mutex and Monitor both directly subclass LinkedList and
duplicate the primitives in each?

Or should they both subclass CriticalSection which subclasses
LinkedList so the primitives are only defined once?

What effect would using the primitives from the superclass have on
performance? If any, I'd vote to optimise for duplication rather than
"nice" design, but our comments should document this.

cheers -ben


Eliot Miranda-2
Hi Ben,

On Thu, Jan 7, 2016 at 4:40 PM, Ben Coman <[hidden email]> wrote:

> Should Mutex and Monitor both directly subclass LinkedList and
> duplicate the primitives in each?
>
> Or should they both subclass CriticalSection which subclasses
> LinkedList so the primitives are only defined once?

That's a good idea.  Feel free to change the code, but test that the Monticello load handles this case properly first :-).  Actually, given that the default state of all the Mutex and Monitor instances in the image is unowned (owner process is nil), it'll just work anyway.  If we do that, we must make sure to include the 3DICC copyright in CriticalSection's class comment, and can eliminate it from the primitives.

> What effect would using the primitives from the superclass have on
> performance? If any, I'd vote to optimise for duplication rather than
> "nice" design, but our comments should document this.

Likely in the noise.  The inline caching machinery in the VM is far cheaper than the real overheads here, which are in block creation, process switches, and interpreter primitive invocation.

> cheers -ben




--
_,,,^..^,,,_
best, Eliot

Denis Kudriashov
I have now implemented a pragma-based approach that sets a local variable while a process is being terminated, so I can write methods like this:

critical: aBlock
    "Evaluate aBlock protected by the receiver."
    | lockAcquired |
    <lockAt: #lock trackStateAt: 1>
    lockAcquired := false.
    ^[
        lockAcquired := true.
        lockAcquired := lock wait.
        aBlock value
    ] ensure: [lockAcquired ifTrue: [lock signal]].

Process>>terminate detects a process waiting in such a method and pushes false into the variable lockAcquired (which is temp 1 here).
This approach allows me to use multiple "locks" (semaphores or whatever) in a single method, which I need for ReadWriteLock.



Frank Shearar-3
In reply to this post by Eliot Miranda-2
On 7 January 2016 at 17:12, Eliot Miranda <[hidden email]> wrote:

> Hi Denis, Hi Clément,  Hi Frank,
>
> On Thu, Jan 7, 2016 at 5:34 AM, Clément Bera <[hidden email]> wrote:
>>
>> Hello,
>>
>> Eliot, please, you told me you had the code and Denis is interested.
>>
>> It uses 3 primitives for performance.
>
>
> Forgive the delay.  I thought it proper to ask permission since the code was
> written while I was at Qwaq. I'm attaching the code in a fairly raw state,
> see the attached.  The code is MIT, but copyright 3DICC.
>
> It is a plugin replacement for Squeak's Mutex, and with a little ingenuity
> could be a replacement for Squeak's Monitor.  It is quicker because it uses
> three new primitives to manage entering a critical section and setting the
> owner, exiting the critical section and releasing the owner, and testing
> whether a critical section is owned, entering it if it is unowned.  The use of the
> primitives means fewer block activations and ensure: blocks in entering and
> exiting the critical section, and that's the actual cause of the speed-up.
>
> You can benchmark the code as is.  Here are some results on 32-bit Spur, on
> my 2.2GHz Core i7
>
> {Mutex new. Monitor new. CriticalSection new} collect:
> [:cs| | n |
> n := 0.
> [cs critical: [n := n + 1]. cs critical: [n := n + 1]. cs critical: [n := n
> + 1]. cs critical: [n := n + 1]. cs critical: [n := n + 1].
> cs critical: [n := n - 1]. cs critical: [n := n - 1]. cs critical: [n := n -
> 1]. cs critical: [n := n - 1]. cs critical: [n := n - 1].
> n ] bench]
>
> {Mutex new. Monitor new. CriticalSection new} collect:
> [:cs| | n |
> n := 0.
> cs class name, ' -> ',
> [cs critical: [n := n + 1]. cs critical: [n := n + 1]. cs critical: [n := n
> + 1]. cs critical: [n := n + 1]. cs critical: [n := n + 1].
> cs critical: [n := n - 1]. cs critical: [n := n - 1]. cs critical: [n := n -
> 1]. cs critical: [n := n - 1]. cs critical: [n := n - 1].
> n ] bench]
>
> #( 'Mutex -> 440,000 per second. 2.27 microseconds per run.'
> 'Monitor -> 688,000 per second. 1.45 microseconds per run.'
> 'CriticalSection -> 1,110,000 per second. 900 nanoseconds per run.')
>
> Replacement is probably trivial; rename Mutex to OldMutex, rename
> CriticalSection to Mutex, recompile.  But there are lots of mutexes in the
> system and these are potentially owned.  Transforming unowned ones is
> trivial, but transforming owned ones is, I think, impossible.  But at least
> in my system there are no owned mutexes or monitors.
>
> Frank (or anyone else), would you be interested in creating a replacement
> for Squeak's Monitor based on CriticalSection?

It sounds like an interesting problem. I watched the thread unfold and
I'm still not massively clear on what is required though? (Apologies
for the delay in responding: my cup overfloweth at the moment, but
should resume normal levels in a month or so.)

frank

Re: [Vm-dev] Where to get Monitor implementation based on primitives?

Ben Coman
In reply to this post by Eliot Miranda-2
On Fri, Jan 8, 2016 at 9:39 PM, Ben Coman <[hidden email]> wrote:

>
> btw, Looking at the following...
>   CriticalSection>>critical: aBlock
>       ^self primitiveEnterCriticalSection
>           ifTrue: [aBlock value]
>           ifFalse: [aBlock ensure: [self primitiveExitCriticalSection]]
>
> without intimate VM knowledge but seeing Eliot recently say
> "invocations of methods with primitives [...] are not suspension
> points, unless their primitives fail" -- I'm paranoid that since
> #ensure: primitive 198 always fails, there might be some small
> window for a race where a process might be terminated before
> #primitiveExitCriticalSection can be executed. I'll take a refutation
> of this to improve the method comment.
>
> The following instils more confidence...
>
>   CriticalSection2>>critical: aBlock
>       | reEntered |
>       reEntered := false.
>       [ reEntered := self primitiveEnterCriticalSection.
>          aBlock value ] ensure:
>          [ reEntered ifFalse: [ self primitiveExitCriticalSection] ]

Having spent more time considering this, I found I was off track here.
If a process waiting at #primitiveEnterCriticalSection is terminated,
reEntered is still false when the ensure: block runs, so
#primitiveExitCriticalSection is erroneously executed by a process
that never entered the critical section.

The proposal for Case17373, based on OwnedLock>>acquire, has the same
problem:

  Mutex>>critical: aBlock
      | lockAcquiredNotHere |
      <lockAt: #lock tracksStateAt: 1>
      lockAcquiredNotHere := true.
      ^[
          lockAcquiredNotHere := false.
          lockAcquiredNotHere := lock acquire.
          aBlock value
      ] ensure: [lockAcquiredNotHere ifFalse: [lock release]].

So now I believe the original #critical: proposed by Eliot is optimal.
* If the process is terminated while waiting in
#primitiveEnterCriticalSection, then the #ifFalse: branch is never
executed.
* When #primitiveEnterCriticalSection returns, the inlined
#ifTrue:ifFalse: can't be interrupted.
* #ensure: is a primitive which can't be interrupted.  By the time it
has done its usual "fail", which I raised concern over, it has
already done its job of making sure #primitiveExitCriticalSection is
executed.
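
One way to probe the first point from a workspace is sketched below. It is a rough experiment, not a proof: Mutex stands in for whichever lock class is under test, the delays are arbitrary, and all variable names are made up for the illustration. It terminates a process that is blocked waiting to enter, then checks whether another process can get in while the original holder is still inside.

```smalltalk
"Workspace probe; Mutex is a stand-in for the class under test.
 All names and delays here are assumptions for illustration."
| lock waiter holderDone waiterRan probeEntered enteredWhileHeld |
lock := Mutex new.
holderDone := false.
waiterRan := false.
probeEntered := false.
enteredWhileHeld := false.

"1. A holder takes the lock and keeps it for a while."
[lock critical: [(Delay forMilliseconds: 500) wait].
 holderDone := true] fork.
(Delay forMilliseconds: 50) wait.

"2. A second process blocks trying to enter and is terminated
    while it is still waiting."
waiter := [lock critical: [waiterRan := true]] fork.
(Delay forMilliseconds: 50) wait.
waiter terminate.

"3. A probe tries to enter while the holder should still be inside.
    It records whether it got in before the holder had finished."
[lock critical:
	[probeEntered := true.
	 enteredWhileHeld := holderDone not]] fork.
(Delay forMilliseconds: 600) wait.

Transcript
	show: 'terminated waiter ran its body: ', waiterRan printString; cr;
	show: 'probe eventually entered: ', probeEntered printString; cr;
	show: 'probe entered while lock was held: ', enteredWhileHeld printString; cr.
```

If the last line reports true, the terminated waiter's unwind released a lock it never owned; if the probe never enters, the lock was left in an unusable state.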