Why isn't signalSemaphoreWithIndex() thread-safe?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Why isn't signalSemaphoreWithIndex() thread-safe?

Joshua Gargus-2
 
Hi,

I've always assumed that signalSemaphoreWithIndex() must be thread-safe;
after all, it's the "official" mechanism for notifying the image that an
asynchronous event has occurred.  However, a pang of paranoia prompted
me to actually look at the code, and it seems clearly unsafe.  This is
bad, because I've been using it to signal events from separate native
threads.

What should we do about this?  It seems to me that it should be wrapped
in a critical section, using the appropriate platform-specific
synchronization primitives.

Thanks,
Josh
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

johnmci
 
Well there is no generic synchronizing code, so you are correct it has  
to be platform specific.
When I wrote the original code a decade? back in the macintosh os-9  
era there was no agreement
about doing platform specific implementations since at the time there  
was a tendency to keep all
the code as generic as possible.

So do you have a test case or scenario where the code fails?

On 2009-09-20, at 12:43 PM, Joshua Gargus wrote:

>
> Hi,
>
> I've always assumed that signalSemaphoreWithIndex() must be thread-
> safe;
> after all, it's the "official" mechanism for notifying the image  
> that an
> asynchronous event has occurred.  However, a pang of paranoia prompted
> me to actually look at the code, and it seems clearly unsafe.  This is
> bad, because I've been using it to signal events from separate native
> threads.
>
> What should we do about this?  It seems to me that it should be  
> wrapped
> in a critical section, using the appropriate platform-specific
> synchronization primitives.
>
> Thanks,
> Josh

--
=
=
=
========================================================================
John M. McIntosh <[hidden email]>   Twitter:  
squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
=
=
=
========================================================================




Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Eliot Miranda-2
In reply to this post by Joshua Gargus-2
 
Hi Josh,

On Sun, Sep 20, 2009 at 12:43 PM, Joshua Gargus <[hidden email]> wrote:

Hi,

I've always assumed that signalSemaphoreWithIndex() must be thread-safe;
after all, it's the "official" mechanism for notifying the image that an
asynchronous event has occurred.  However, a pang of paranoia prompted
me to actually look at the code, and it seems clearly unsafe.  This is
bad, because I've been using it to signal events from separate native
threads.

What should we do about this?  It seems to me that it should be wrapped
in a critical section, using the appropriate platform-specific
synchronization primitives.
 
There's no need for this heavyweight approach because the VM can only respond to these signals synchronously.  Instead we can use three variables per index to implement an exclusion-free solution, thusly:

typedef struct { int lock; // accessed via e.g. XCHG to protect signalRequest
                       int requests;
                       int responses;
           } ExternalSignalRequest;

ExternalSignalRequest *externalSignalRequests;
int checkExternalSignalRequests;

void
requestSignal(int i)
{
       while (!lock(&externalSignalRequests[i].lock))
             usleep(1);

       ++externalSignalRequests[i].requests;

        unlock(&externalSignalRequests[i].lock);

        checkExternalSignalRequests = 1;
        forceInterruptCheck();  // set a flag to cause the VM to check for interrupts; in the stack VM this is stackLimit := (usqInt)-1;
}
                       

The VM responds in checkEvents:

    if (checkExternalSignalRequests)
        for (i = 0; i < numExternalSemaphores; i++)
            while (externalSignalRequests[i]. responses != externalSignalRequests[i]. requests) {
                signalSemaphoreWithIndex(i);
                externalSignalRequests[i]. responses;
            }



Thanks,
Josh

Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Eliot Miranda-2
 


On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda <[hidden email]> wrote:
Hi Josh,

On Sun, Sep 20, 2009 at 12:43 PM, Joshua Gargus <[hidden email]> wrote:

Hi,

I've always assumed that signalSemaphoreWithIndex() must be thread-safe;
after all, it's the "official" mechanism for notifying the image that an
asynchronous event has occurred.  However, a pang of paranoia prompted
me to actually look at the code, and it seems clearly unsafe.  This is
bad, because I've been using it to signal events from separate native
threads.

What should we do about this?  It seems to me that it should be wrapped
in a critical section, using the appropriate platform-specific
synchronization primitives.
 
There's no need for this heavyweight approach because the VM can only respond to these signals synchronously.

On second reading I think what I wrote below is exactly what you implied.  Sorry.

 
 Instead we can use three variables per index to implement an exclusion-free solution, thusly:

typedef struct { int lock; // accessed via e.g. XCHG to protect signalRequest
                       int requests;
                       int responses;
           } ExternalSignalRequest;

ExternalSignalRequest *externalSignalRequests;
int checkExternalSignalRequests;

void
requestSignal(int i)
{
       while (!lock(&externalSignalRequests[i].lock))
             usleep(1);

       ++externalSignalRequests[i].requests;

        unlock(&externalSignalRequests[i].lock);

        checkExternalSignalRequests = 1;
        forceInterruptCheck();  // set a flag to cause the VM to check for interrupts; in the stack VM this is stackLimit := (usqInt)-1;
}
                       

The VM responds in checkEvents:

    if (checkExternalSignalRequests)
        for (i = 0; i < numExternalSemaphores; i++)
            while (externalSignalRequests[i]. responses != externalSignalRequests[i]. requests) {
                signalSemaphoreWithIndex(i);
                externalSignalRequests[i]. responses;
            }



Thanks,
Josh


Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Joshua Gargus-2
In reply to this post by johnmci
 
John M McIntosh wrote:

>
> Well there is no generic synchronizing code, so you are correct it has
> to be platform specific.
> When I wrote the original code a decade? back in the macintosh os-9
> era there was no agreement
> about doing platform specific implementations since at the time there
> was a tendency to keep all
> the code as generic as possible.
>
> So do you have a test case

No test case.

> or scenario where the code fails?

Just the obvious race condition.  Let's just look at the first case of
the if-statement; the case for the other buffer is symmetric.  Here's
the code in interp.c

    if (foo->semaphoresUseBufferA) {
        if (foo->semaphoresToSignalCountA < SemaphoresToSignalSize) {
            foo->semaphoresToSignalCountA += 1;
            foo->semaphoresToSignalA[foo->semaphoresToSignalCountA] = index;
        }
    }


Let's say that there are two native threads A and B that want to signal
semaphores with indices 7 and 8, and that there are no other semaphores
to be signalled.  Let's say that thread A is running until just after
"semaphoresToSignalCountA" is incremented, but is interrupted before it
assigns the index.  Then thread B runs, increments
"semaphoresToSignalCountA" again (so its value is now 2), and sets
foo->semaphoresToSignalA[2] = 8.  Then A resumes and stomps this value
by setting foo->semaphoresToSignalA[2] = 7.  Now the semaphore with
index 8 will not be signalled, and just as bad, the index stored in
oo->semaphoresToSignalA[1] is now garbage that will be treated as a
semaphore-index to signal.

Cheers,
Josh



>
> On 2009-09-20, at 12:43 PM, Joshua Gargus wrote:
>
>>
>> Hi,
>>
>> I've always assumed that signalSemaphoreWithIndex() must be thread-safe;
>> after all, it's the "official" mechanism for notifying the image that an
>> asynchronous event has occurred.  However, a pang of paranoia prompted
>> me to actually look at the code, and it seems clearly unsafe.  This is
>> bad, because I've been using it to signal events from separate native
>> threads.
>>
>> What should we do about this?  It seems to me that it should be wrapped
>> in a critical section, using the appropriate platform-specific
>> synchronization primitives.
>>
>> Thanks,
>> Josh
>
> --
> ===========================================================================
>
> John M. McIntosh <[hidden email]>   Twitter:
> squeaker68882
> Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
> ===========================================================================
>
>
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Andreas.Raab
 
Joshua Gargus wrote:

> Let's say that there are two native threads A and B that want to signal
> semaphores with indices 7 and 8, and that there are no other semaphores
> to be signalled.  Let's say that thread A is running until just after
> "semaphoresToSignalCountA" is incremented, but is interrupted before it
> assigns the index.  Then thread B runs, increments
> "semaphoresToSignalCountA" again (so its value is now 2), and sets
> foo->semaphoresToSignalA[2] = 8.  Then A resumes and stomps this value
> by setting foo->semaphoresToSignalA[2] = 7.  Now the semaphore with
> index 8 will not be signalled, and just as bad, the index stored in
> oo->semaphoresToSignalA[1] is now garbage that will be treated as a
> semaphore-index to signal.

On Windows, this case is handled by replacing signalSemaphoreWithIndex()
by synchronizedSignalSemaphoreWithIndex() (sqWin32Window.c) in the vm
proxy which serializes incoming signal requests.

Cheers,
   - Andreas
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Igor Stasenko
In reply to this post by Joshua Gargus-2

2009/9/21 Joshua Gargus <[hidden email]>:

>
> John M McIntosh wrote:
>>
>> Well there is no generic synchronizing code, so you are correct it has
>> to be platform specific.
>> When I wrote the original code a decade? back in the macintosh os-9
>> era there was no agreement
>> about doing platform specific implementations since at the time there
>> was a tendency to keep all
>> the code as generic as possible.
>>
>> So do you have a test case
>
> No test case.
>
>> or scenario where the code fails?
>
> Just the obvious race condition.  Let's just look at the first case of
> the if-statement; the case for the other buffer is symmetric.  Here's
> the code in interp.c
>
>    if (foo->semaphoresUseBufferA) {
>        if (foo->semaphoresToSignalCountA < SemaphoresToSignalSize) {
>            foo->semaphoresToSignalCountA += 1;
>            foo->semaphoresToSignalA[foo->semaphoresToSignalCountA] = index;
>        }
>    }
>
>
> Let's say that there are two native threads A and B that want to signal
> semaphores with indices 7 and 8, and that there are no other semaphores
> to be signalled.  Let's say that thread A is running until just after
> "semaphoresToSignalCountA" is incremented, but is interrupted before it
> assigns the index.  Then thread B runs, increments
> "semaphoresToSignalCountA" again (so its value is now 2), and sets
> foo->semaphoresToSignalA[2] = 8.  Then A resumes and stomps this value
> by setting foo->semaphoresToSignalA[2] = 7.  Now the semaphore with
> index 8 will not be signalled, and just as bad, the index stored in
> oo->semaphoresToSignalA[1] is now garbage that will be treated as a
> semaphore-index to signal.
>
> Cheers,
> Josh
>
>
>
+1 , its not thread-safe.

In Hydra VM, i made thread-safe event queue, and made signaling going though it.

In win32 Squeak vm, the above situation (race condition you have
described) is not possible, because
there are additional mutex, than makes sure that only single thread
can be inside a call of signaling semaphore
(See synchronousSemaphoreSignal() function).
But it doesn't makes this code smells better, for sure :)

--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Joshua Gargus-2
In reply to this post by Eliot Miranda-2
 
Eliot Miranda wrote:

>  
>
> ------------------------------------------------------------------------
>
>
>
> On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     Hi Josh,
>
>     On Sun, Sep 20, 2009 at 12:43 PM, Joshua Gargus <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>
>         Hi,
>
>         I've always assumed that signalSemaphoreWithIndex() must be
>         thread-safe;
>         after all, it's the "official" mechanism for notifying the
>         image that an
>         asynchronous event has occurred.  However, a pang of paranoia
>         prompted
>         me to actually look at the code, and it seems clearly unsafe.
>          This is
>         bad, because I've been using it to signal events from separate
>         native
>         threads.
>
>         What should we do about this?  It seems to me that it should
>         be wrapped
>         in a critical section, using the appropriate platform-specific
>         synchronization primitives.
>
>      
>     There's no need for this heavyweight approach because the VM can
>     only respond to these signals synchronously.
>
>
> On second reading I think what I wrote below is exactly what you
> implied.  Sorry.

Nope, you're always giving me too much credit ;-)  I wasn't thinking
about any particular implementation of thread-safety, only raising the
issue that (in my understanding) we don't have it.

>
>  
>
>      Instead we can use three variables per index to implement an
>     exclusion-free solution, thusly:
>
>     typedef struct { int lock; // accessed via e.g. XCHG to protect
>     signalRequest
>                            int requests;
>                            int responses;
>                } ExternalSignalRequest;
>
>     ExternalSignalRequest *externalSignalRequests;
>     int checkExternalSignalRequests;
>
>     void
>     requestSignal(int i)
>     {
>            while (!lock(&externalSignalRequests[i].lock))
>                  usleep(1);
>
>            ++externalSignalRequests[i].requests;
>
>             unlock(&externalSignalRequests[i].lock);
>
>             checkExternalSignalRequests = 1;
>             forceInterruptCheck();  // set a flag to cause the VM to
>     check for interrupts; in the stack VM this is stackLimit :=
>     (usqInt)-1;
>     }
>                            
>
>     The VM responds in checkEvents:
>
>         if (checkExternalSignalRequests)
>             for (i = 0; i < numExternalSemaphores; i++)
>                 while (externalSignalRequests[i]. responses
>     != externalSignalRequests[i]. requests) {
>                     signalSemaphoreWithIndex(i);
>                     externalSignalRequests[i]. responses;
>                 }
>


Am I correct that ExternalSignalRequest.reponses is never reset to zero,
instead eventually wrapping around?  That worried me for a minute, but I
see that it doesn't matter.  Neeto.

Your solution looks good; it's certainly nicer than a heavyweight
critical section, especially since contention will be infrequent.

Cheers,
Josh




>
>
>         Thanks,
>         Josh
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Igor Stasenko
In reply to this post by Andreas.Raab

2009/9/21 Andreas Raab <[hidden email]>:

>
> Joshua Gargus wrote:
>>
>> Let's say that there are two native threads A and B that want to signal
>> semaphores with indices 7 and 8, and that there are no other semaphores
>> to be signalled.  Let's say that thread A is running until just after
>> "semaphoresToSignalCountA" is incremented, but is interrupted before it
>> assigns the index.  Then thread B runs, increments
>> "semaphoresToSignalCountA" again (so its value is now 2), and sets
>> foo->semaphoresToSignalA[2] = 8.  Then A resumes and stomps this value
>> by setting foo->semaphoresToSignalA[2] = 7.  Now the semaphore with
>> index 8 will not be signalled, and just as bad, the index stored in
>> oo->semaphoresToSignalA[1] is now garbage that will be treated as a
>> semaphore-index to signal.
>
> On Windows, this case is handled by replacing signalSemaphoreWithIndex() by
> synchronizedSignalSemaphoreWithIndex() (sqWin32Window.c) in the vm proxy
> which serializes incoming signal requests.
>

What makes me think, that promoting synchronization protocol into
standard VM API would be nice
idea.
The platforms which doesn't provide multithreading, the function(s)
could be no-op. But on platforms
which providing it, they should be implemented to make VM thread safe.

> Cheers,
>  - Andreas
>



--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Igor Stasenko
In reply to this post by Joshua Gargus-2

2009/9/21 Joshua Gargus <[hidden email]>:

>
> Eliot Miranda wrote:
>>
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>> On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda
>> <[hidden email] <mailto:[hidden email]>> wrote:
>>
>>     Hi Josh,
>>
>>     On Sun, Sep 20, 2009 at 12:43 PM, Joshua Gargus <[hidden email]
>>     <mailto:[hidden email]>> wrote:
>>
>>
>>         Hi,
>>
>>         I've always assumed that signalSemaphoreWithIndex() must be
>>         thread-safe;
>>         after all, it's the "official" mechanism for notifying the
>>         image that an
>>         asynchronous event has occurred.  However, a pang of paranoia
>>         prompted
>>         me to actually look at the code, and it seems clearly unsafe.
>>          This is
>>         bad, because I've been using it to signal events from separate
>>         native
>>         threads.
>>
>>         What should we do about this?  It seems to me that it should
>>         be wrapped
>>         in a critical section, using the appropriate platform-specific
>>         synchronization primitives.
>>
>>
>>     There's no need for this heavyweight approach because the VM can
>>     only respond to these signals synchronously.
>>
>>
>> On second reading I think what I wrote below is exactly what you
>> implied.  Sorry.
>
> Nope, you're always giving me too much credit ;-)  I wasn't thinking
> about any particular implementation of thread-safety, only raising the
> issue that (in my understanding) we don't have it.
>
>>
>>
>>
>>      Instead we can use three variables per index to implement an
>>     exclusion-free solution, thusly:
>>
>>     typedef struct { int lock; // accessed via e.g. XCHG to protect
>>     signalRequest
>>                            int requests;
>>                            int responses;
>>                } ExternalSignalRequest;
>>
>>     ExternalSignalRequest *externalSignalRequests;
>>     int checkExternalSignalRequests;
>>
>>     void
>>     requestSignal(int i)
>>     {
>>            while (!lock(&externalSignalRequests[i].lock))
>>                  usleep(1);
>>
>>            ++externalSignalRequests[i].requests;
>>
>>             unlock(&externalSignalRequests[i].lock);
>>
>>             checkExternalSignalRequests = 1;
>>             forceInterruptCheck();  // set a flag to cause the VM to
>>     check for interrupts; in the stack VM this is stackLimit :=
>>     (usqInt)-1;
>>     }
>>
>>
>>     The VM responds in checkEvents:
>>
>>         if (checkExternalSignalRequests)
>>             for (i = 0; i < numExternalSemaphores; i++)
>>                 while (externalSignalRequests[i]. responses
>>     != externalSignalRequests[i]. requests) {
>>                     signalSemaphoreWithIndex(i);
>>                     externalSignalRequests[i]. responses;
>>                 }
>>
>
>
> Am I correct that ExternalSignalRequest.reponses is never reset to zero,
> instead eventually wrapping around?  That worried me for a minute, but I
> see that it doesn't matter.  Neeto.
>

Nope, it does resets the counters.. see signalExternalSemaphores  method.
...
                        semaphoresToSignalCountB = 0;
...
                        semaphoresToSignalCountA = 0;
...
and btw, this method is non-thread-safe, despite the
synchronizedSignalSemaphoreWithIndex() function.

Consider the code:
*****         semaphoresUseBufferA = !semaphoresUseBufferA;
                xArray = longAt((specialObjectsOop + BaseHeaderSize) +
(ExternalObjectsArray << ShiftForWord));
                xSize = stSizeOf(xArray);
                if (semaphoresUseBufferA) {

if non-VM thread is interrupted in the middle of signaling, just after
it checked the semaphoresUseBufferA flag,
but not yet added the signal to buffer, and OS deside to switch to VM
thread which enters the signalExternalSemaphores method, then signal
will be lost.

> Your solution looks good; it's certainly nicer than a heavyweight
> critical section, especially since contention will be infrequent.
>
> Cheers,
> Josh
>
>
>
>
>>
>>
>>         Thanks,
>>         Josh
>>
>>
>>
>
>



--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Joshua Gargus-2
 
Igor Stasenko wrote:

>  
> 2009/9/21 Joshua Gargus <[hidden email]>:
>  
>> Eliot Miranda wrote:
>>    
>>> ------------------------------------------------------------------------
>>>
>>>
>>>
>>> On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda
>>> <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>>      Instead we can use three variables per index to implement an
>>>     exclusion-free solution, thusly:
>>>
>>>     typedef struct { int lock; // accessed via e.g. XCHG to protect
>>>     signalRequest
>>>                            int requests;
>>>                            int responses;
>>>                } ExternalSignalRequest;
>>>
>>>     ExternalSignalRequest *externalSignalRequests;
>>>     int checkExternalSignalRequests;
>>>
>>>     void
>>>     requestSignal(int i)
>>>     {
>>>            while (!lock(&externalSignalRequests[i].lock))
>>>                  usleep(1);
>>>
>>>            ++externalSignalRequests[i].requests;
>>>
>>>             unlock(&externalSignalRequests[i].lock);
>>>
>>>             checkExternalSignalRequests = 1;
>>>             forceInterruptCheck();  // set a flag to cause the VM to
>>>     check for interrupts; in the stack VM this is stackLimit :=
>>>     (usqInt)-1;
>>>     }
>>>
>>>
>>>     The VM responds in checkEvents:
>>>
>>>         if (checkExternalSignalRequests)
>>>             for (i = 0; i < numExternalSemaphores; i++)
>>>                 while (externalSignalRequests[i]. responses
>>>     != externalSignalRequests[i]. requests) {
>>>                     signalSemaphoreWithIndex(i);
>>>                     externalSignalRequests[i]. responses;
>>>                 }
>>>
>>>      
>> Am I correct that ExternalSignalRequest.reponses is never reset to zero,
>> instead eventually wrapping around?  That worried me for a minute, but I
>> see that it doesn't matter.  Neeto.
>>
>>    
>
> Nope, it does resets the counters.. see signalExternalSemaphores  method.
>  

I was responding to Eliot's proposal, not the existing code.  If I
understand properly, he's proposing to maintain a per-semaphore count of
signal requests/responses, so there would be no
"semaphoresToSignalCountA/B".

Cheers,
Josh

Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Igor Stasenko

2009/9/21 Joshua Gargus <[hidden email]>:

>
> Igor Stasenko wrote:
>>
>> 2009/9/21 Joshua Gargus <[hidden email]>:
>>
>>> Eliot Miranda wrote:
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>
>>>>
>>>> On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda
>>>> <[hidden email] <mailto:[hidden email]>> wrote:
>>>>
>>>>      Instead we can use three variables per index to implement an
>>>>     exclusion-free solution, thusly:
>>>>
>>>>     typedef struct { int lock; // accessed via e.g. XCHG to protect
>>>>     signalRequest
>>>>                            int requests;
>>>>                            int responses;
>>>>                } ExternalSignalRequest;
>>>>
>>>>     ExternalSignalRequest *externalSignalRequests;
>>>>     int checkExternalSignalRequests;
>>>>
>>>>     void
>>>>     requestSignal(int i)
>>>>     {
>>>>            while (!lock(&externalSignalRequests[i].lock))
>>>>                  usleep(1);
>>>>
>>>>            ++externalSignalRequests[i].requests;
>>>>
>>>>             unlock(&externalSignalRequests[i].lock);
>>>>
>>>>             checkExternalSignalRequests = 1;
>>>>             forceInterruptCheck();  // set a flag to cause the VM to
>>>>     check for interrupts; in the stack VM this is stackLimit :=
>>>>     (usqInt)-1;
>>>>     }
>>>>
>>>>
>>>>     The VM responds in checkEvents:
>>>>
>>>>         if (checkExternalSignalRequests)
>>>>             for (i = 0; i < numExternalSemaphores; i++)
>>>>                 while (externalSignalRequests[i]. responses
>>>>     != externalSignalRequests[i]. requests) {
>>>>                     signalSemaphoreWithIndex(i);
>>>>                     externalSignalRequests[i]. responses;
>>>>                 }
>>>>
>>>>
>>> Am I correct that ExternalSignalRequest.reponses is never reset to zero,
>>> instead eventually wrapping around?  That worried me for a minute, but I
>>> see that it doesn't matter.  Neeto.
>>>
>>>
>>
>> Nope, it does resets the counters.. see signalExternalSemaphores  method.
>>
>
> I was responding to Eliot's proposal, not the existing code.  If I
> understand properly, he's proposing to maintain a per-semaphore count of
> signal requests/responses, so there would be no
> "semaphoresToSignalCountA/B".
>
Ah.. sorry for not reading carefully..
In proposed scenario i don't like 1 (one) thing, that checkForInterrupts will
spend a linear amount of time to check what semaphores need to be signaled
instead of spending linear amount of time to signal semaphores which
is to be signaled:

     for (i = 0; i < numExternalSemaphores; i++)

means, that if you having 1000 external semaphores, and only 1 of them
signaled currently, you have to
loop through all of them, inevitably :(

> Cheers,
> Josh
>
>



--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Eliot Miranda-2
In reply to this post by Joshua Gargus-2
 


On Sun, Sep 20, 2009 at 3:02 PM, Joshua Gargus <[hidden email]> wrote:

Eliot Miranda wrote:
>
>
> ------------------------------------------------------------------------
>
>
>
> On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     Hi Josh,
>
>     On Sun, Sep 20, 2009 at 12:43 PM, Joshua Gargus <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>
>         Hi,
>
>         I've always assumed that signalSemaphoreWithIndex() must be
>         thread-safe;
>         after all, it's the "official" mechanism for notifying the
>         image that an
>         asynchronous event has occurred.  However, a pang of paranoia
>         prompted
>         me to actually look at the code, and it seems clearly unsafe.
>          This is
>         bad, because I've been using it to signal events from separate
>         native
>         threads.
>
>         What should we do about this?  It seems to me that it should
>         be wrapped
>         in a critical section, using the appropriate platform-specific
>         synchronization primitives.
>
>
>     There's no need for this heavyweight approach because the VM can
>     only respond to these signals synchronously.
>
>
> On second reading I think what I wrote below is exactly what you
> implied.  Sorry.

Nope, you're always giving me too much credit ;-)  I wasn't thinking
about any particular implementation of thread-safety, only raising the
issue that (in my understanding) we don't have it.

>
>
>
>      Instead we can use three variables per index to implement an
>     exclusion-free solution, thusly:
>
>     typedef struct { int lock; // accessed via e.g. XCHG to protect
>     signalRequest
>                            int requests;
>                            int responses;
>                } ExternalSignalRequest;
>
>     ExternalSignalRequest *externalSignalRequests;
>     int checkExternalSignalRequests;
>
>     void
>     requestSignal(int i)
>     {
>            while (!lock(&externalSignalRequests[i].lock))
>                  usleep(1);
>
>            ++externalSignalRequests[i].requests;
>
>             unlock(&externalSignalRequests[i].lock);
>
>             checkExternalSignalRequests = 1;
>             forceInterruptCheck();  // set a flag to cause the VM to
>     check for interrupts; in the stack VM this is stackLimit :=
>     (usqInt)-1;
>     }
>
>
>     The VM responds in checkEvents:
>
>         if (checkExternalSignalRequests)
>             for (i = 0; i < numExternalSemaphores; i++)
>                 while (externalSignalRequests[i]. responses
>     != externalSignalRequests[i]. requests) {
>                     signalSemaphoreWithIndex(i);
>                     externalSignalRequests[i]. responses;
>                 }
>


Am I correct that ExternalSignalRequest.reponses is never reset to zero,
instead eventually wrapping around?  That worried me for a minute, but I
see that it doesn't matter.  Neeto.

Yes.  This could be Allen Shiffman's or Peter Deutsch's idea.  Its used in the VW VM.  Very nice.  It can only fail if the VM ignores at least 2^32 successive requests :)


Your solution looks good; it's certainly nicer than a heavyweight
critical section, especially since contention will be infrequent.

Right.  Of course if any one external semaphore is never signalled from more than one thread we don't even need that but the cost of the XCHG should be very low compared to the overall cost of the signal.


Cheers,
Josh




>
>
>         Thanks,
>         Josh
>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Eliot Miranda-2
In reply to this post by Igor Stasenko
 


On Sun, Sep 20, 2009 at 3:35 PM, Igor Stasenko <[hidden email]> wrote:

2009/9/21 Joshua Gargus <[hidden email]>:
>
> Igor Stasenko wrote:
>>
>> 2009/9/21 Joshua Gargus <[hidden email]>:
>>
>>> Eliot Miranda wrote:
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>
>>>>
>>>> On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda
>>>> <[hidden email] <mailto:[hidden email]>> wrote:
>>>>
>>>>      Instead we can use three variables per index to implement an
>>>>     exclusion-free solution, thusly:
>>>>
>>>>     typedef struct { int lock; // accessed via e.g. XCHG to protect
>>>>     signalRequest
>>>>                            int requests;
>>>>                            int responses;
>>>>                } ExternalSignalRequest;
>>>>
>>>>     ExternalSignalRequest *externalSignalRequests;
>>>>     int checkExternalSignalRequests;
>>>>
>>>>     void
>>>>     requestSignal(int i)
>>>>     {
>>>>            while (!lock(&externalSignalRequests[i].lock))
>>>>                  usleep(1);
>>>>
>>>>            ++externalSignalRequests[i].requests;
>>>>
>>>>             unlock(&externalSignalRequests[i].lock);
>>>>
>>>>             checkExternalSignalRequests = 1;
>>>>             forceInterruptCheck();  // set a flag to cause the VM to
>>>>     check for interrupts; in the stack VM this is stackLimit :=
>>>>     (usqInt)-1;
>>>>     }
>>>>
>>>>
>>>>     The VM responds in checkEvents:
>>>>
>>>>         if (checkExternalSignalRequests)
>>>>             for (i = 0; i < numExternalSemaphores; i++)
>>>>                 while (externalSignalRequests[i]. responses
>>>>     != externalSignalRequests[i]. requests) {
>>>>                     signalSemaphoreWithIndex(i);
>>>>                     externalSignalRequests[i]. responses;
>>>>                 }
>>>>
>>>>
>>> Am I correct that ExternalSignalRequest.reponses is never reset to zero,
>>> instead eventually wrapping around?  That worried me for a minute, but I
>>> see that it doesn't matter.  Neeto.
>>>
>>>
>>
>> Nope, it does resets the counters.. see signalExternalSemaphores  method.
>>
>
> I was responding to Eliot's proposal, not the existing code.  If I
> understand properly, he's proposing to maintain a per-semaphore count of
> signal requests/responses, so there would be no
> "semaphoresToSignalCountA/B".
>
Ah.. sorry for not reading carefully..
In proposed scenario i don't like 1 (one) thing, that checkForInterrupts will
spend a linear amount of time to check what semaphores need to be signaled
instead of spending linear amount of time to signal semaphores which
is to be signaled:

    for (i = 0; i < numExternalSemaphores; i++)

means, that if you having 1000 external semaphores, and only 1 of them
signaled currently, you have to
loop through all of them, inevitably :(

Yes, that caused a twinge.  Bit with a separate lock we can maintain high and low tides, e.g.

typedef struct { int lock; // accessed via e.g. XCHG to protect signalRequest
                       int requests;
                       int responses;
           } ExternalSignalRequest;

ExternalSignalRequest *externalSignalRequests;
int checkExternalSignalRequests;
int tideLock = 0;
int useTideA = 1;
int lowTideA = (unsigned)-1 >> 1, highTideA = 0;
int lowTideB = (unsigned)-1 >> 1, highTideB = 0;

void
requestSignal(int i)
{
       while (!lock(& tideLock))
             usleep(1);

       if (useTideA) {
            if (lowTideA > i) lowTideA = i;
            if (highTideA < i) highTideA = i);
       }
       else {
            if (lowTideB > i) lowTideB = i;
            if (highTideB < i) highTideB = i);
       }
 
       while (!lock(&externalSignalRequests[i].lock))
             usleep(1);

       ++externalSignalRequests[i].requests;

        unlock(&externalSignalRequests[i].lock);

        checkExternalSignalRequests = 1;
        forceInterruptCheck();  // set a flag to cause the VM to check for interrupts; in the stack VM this is stackLimit := (usqInt)-1;
}
                       

The VM responds in checkEvents:

    if (checkExternalSignalRequests) {
      checkExternalSignalRequests = 0; // forgot this in my first email
      if (useTideA) {
        useTideA = 0;
        for (i = lowTideA; i < highTideA; i++)
            while (externalSignalRequests[i]. responses != externalSignalRequests[i]. requests) {
                signalSemaphoreWithIndex(i);
                externalSignalRequests[i]. responses;
            }
            lowTideA = (unsigned)-1 >> 1, highTideA = 0;
      }
      else  {
        useTideA = 1;
        for (i = lowTideB; i < highTideB; i++)
            while (externalSignalRequests[i]. responses != externalSignalRequests[i]. requests) {
                signalSemaphoreWithIndex(i);
                externalSignalRequests[i]. responses;
            }
        lowTideB = (unsigned)-1 >> 1, highTideB = 0;
      }
 
and we should make lowTideA = (unsigned)-1 >> 1, highTideA = 0; a macro, to avoid the duplication.
Most of the time the low and high tides will get set to the same thing because very likely most of the time only one semaphore will get signalled.


Note that I did the moral equivalent of this with the processor scheduler's queue in both VW and in the Cog Squeak VMs and it speeds up wakeHighestPriority a lot.  The idea is to maintain in a variable the highest runnable priority, which is increased whenever a process is added to the runnable queue and is of higher priority than the high tide, and reduced in wakeHighestPriority to that of the first process found.  This avoids wakeHighestPriority testing 50 odd empty lists on the typical pass through runnable processes from 100 down to user priority.


We're getting close to the end of a release cycle so I expect we'll release the Stack Cog VM soon thereafter.  Sorry for the wait so far :/


> Cheers,
> Josh
>
>



--
Best regards,
Igor Stasenko AKA sig.

Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Igor Stasenko

2009/9/21 Eliot Miranda <[hidden email]>:

>
>
>
> On Sun, Sep 20, 2009 at 3:35 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> 2009/9/21 Joshua Gargus <[hidden email]>:
>> >
>> > Igor Stasenko wrote:
>> >>
>> >> 2009/9/21 Joshua Gargus <[hidden email]>:
>> >>
>> >>> Eliot Miranda wrote:
>> >>>
>> >>>> ------------------------------------------------------------------------
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda
>> >>>> <[hidden email] <mailto:[hidden email]>> wrote:
>> >>>>
>> >>>>      Instead we can use three variables per index to implement an
>> >>>>     exclusion-free solution, thusly:
>> >>>>
>> >>>>     typedef struct { int lock; // accessed via e.g. XCHG to protect
>> >>>>     signalRequest
>> >>>>                            int requests;
>> >>>>                            int responses;
>> >>>>                } ExternalSignalRequest;
>> >>>>
>> >>>>     ExternalSignalRequest *externalSignalRequests;
>> >>>>     int checkExternalSignalRequests;
>> >>>>
>> >>>>     void
>> >>>>     requestSignal(int i)
>> >>>>     {
>> >>>>            while (!lock(&externalSignalRequests[i].lock))
>> >>>>                  usleep(1);
>> >>>>
>> >>>>            ++externalSignalRequests[i].requests;
>> >>>>
>> >>>>             unlock(&externalSignalRequests[i].lock);
>> >>>>
>> >>>>             checkExternalSignalRequests = 1;
>> >>>>             forceInterruptCheck();  // set a flag to cause the VM to
>> >>>>     check for interrupts; in the stack VM this is stackLimit :=
>> >>>>     (usqInt)-1;
>> >>>>     }
>> >>>>
>> >>>>
>> >>>>     The VM responds in checkEvents:
>> >>>>
>> >>>>         if (checkExternalSignalRequests)
>> >>>>             for (i = 0; i < numExternalSemaphores; i++)
>> >>>>                 while (externalSignalRequests[i]. responses
>> >>>>     != externalSignalRequests[i]. requests) {
>> >>>>                     signalSemaphoreWithIndex(i);
>> >>>>                     externalSignalRequests[i]. responses;
>> >>>>                 }
>> >>>>
>> >>>>
>> >>> Am I correct that ExternalSignalRequest.reponses is never reset to zero,
>> >>> instead eventually wrapping around?  That worried me for a minute, but I
>> >>> see that it doesn't matter.  Neeto.
>> >>>
>> >>>
>> >>
>> >> Nope, it does resets the counters.. see signalExternalSemaphores  method.
>> >>
>> >
>> > I was responding to Eliot's proposal, not the existing code.  If I
>> > understand properly, he's proposing to maintain a per-semaphore count of
>> > signal requests/responses, so there would be no
>> > "semaphoresToSignalCountA/B".
>> >
>> Ah.. sorry for not reading carefully..
>> In proposed scenario i don't like 1 (one) thing, that checkForInterrupts will
>> spend a linear amount of time to check what semaphores need to be signaled
>> instead of spending linear amount of time to signal semaphores which
>> is to be signaled:
>>
>>     for (i = 0; i < numExternalSemaphores; i++)
>>
>> means, that if you having 1000 external semaphores, and only 1 of them
>> signaled currently, you have to
>> loop through all of them, inevitably :(
>
> Yes, that caused a twinge.  Bit with a separate lock we can maintain high and low tides, e.g.
> typedef struct { int lock; // accessed via e.g. XCHG to protect signalRequest
>                        int requests;
>                        int responses;
>            } ExternalSignalRequest;
> ExternalSignalRequest *externalSignalRequests;
> int checkExternalSignalRequests;
> int tideLock = 0;
> int useTideA = 1;
> int lowTideA = (unsigned)-1 >> 1, highTideA = 0;
> int lowTideB = (unsigned)-1 >> 1, highTideB = 0;
> void
> requestSignal(int i)
> {
>        while (!lock(& tideLock))
>              usleep(1);
>        if (useTideA) {
>             if (lowTideA > i) lowTideA = i;
>             if (highTideA < i) highTideA = i);
>        }
>        else {
>             if (lowTideB > i) lowTideB = i;
>             if (highTideB < i) highTideB = i);
>        }
>
>        while (!lock(&externalSignalRequests[i].lock))
>              usleep(1);
>        ++externalSignalRequests[i].requests;
>         unlock(&externalSignalRequests[i].lock);
>         checkExternalSignalRequests = 1;
>         forceInterruptCheck();  // set a flag to cause the VM to check for interrupts; in the stack VM this is stackLimit := (usqInt)-1;
> }
>
> The VM responds in checkEvents:
>     if (checkExternalSignalRequests) {

i think you missed here something like following:
        while (!lock(& tideLock))
              usleep(1);

.. and corresponding unlock() somewhere down below, where its safe.

>       checkExternalSignalRequests = 0; // forgot this in my first email
>       if (useTideA) {
>         useTideA = 0;
>         for (i = lowTideA; i < highTideA; i++)
>             while (externalSignalRequests[i]. responses != externalSignalRequests[i]. requests) {
>                 signalSemaphoreWithIndex(i);
>                 externalSignalRequests[i]. responses;
>             }
>             lowTideA = (unsigned)-1 >> 1, highTideA = 0;
>       }
>       else  {
>         useTideA = 1;
>         for (i = lowTideB; i < highTideB; i++)
>             while (externalSignalRequests[i]. responses != externalSignalRequests[i]. requests) {
>                 signalSemaphoreWithIndex(i);
>                 externalSignalRequests[i]. responses;
>             }
>         lowTideB = (unsigned)-1 >> 1, highTideB = 0;
>       }
>
> and we should make lowTideA = (unsigned)-1 >> 1, highTideA = 0; a macro, to avoid the duplication.
> Most of the time the low and high tides will get set to the same thing because very likely most of the time only one semaphore will get signalled.
>
> Note that I did the moral equivalent of this with the processor scheduler's queue in both VW and in the Cog Squeak VMs and it speeds up wakeHighestPriority a lot.  The idea is to maintain in a variable the highest runnable priority, which is increased whenever a process is added to the runnable queue and is of higher priority than the high tide, and reduced in wakeHighestPriority to that of the first process found.  This avoids wakeHighestPriority testing 50 odd empty lists on the typical pass through runnable processes from 100 down to user priority.

This can be defeated by following code (wild guess)

| proc time |
proc := [ [ Processor yield. ] repeat ] newProcess.
proc priority: Processor highestPriority.
proc resume.
time := [ 100000000 timesRepeat: [ "***" Processor yield. ] ] timeToRun.
proc terminate.
time

resulting run time should be very close for both VMs, where one using
and another not using a new quirk.
I may be wrong , maybe at "****" there should be just some code , like
1+1 , but not Processor yield, to make sure
that given loop get interrupted, but not yields execution by itself :)

Regarding scheduling, did you taken a look of my scheduling
refactorings, which moving scheduling logic to image side,
freeing VM from any knowledge of what Semaphores is?
I think it could be a good move. The only thing, which is concern is
speed. But this could be improved by making some primitives
which reducing the number of sends, needed to handle the scheduling
(most consuming is - adding/removing to list(s)). But not rest of
logic, which should belong to image side.
And in presence of JIT, this will no longer be issue.

>
> We're getting close to the end of a release cycle so I expect we'll release the Stack Cog VM soon thereafter.  Sorry for the wait so far :/
>>
>> > Cheers,
>> > Josh
>> >
>> >
>>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>


--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Eliot Miranda-2

On Sun, Sep 20, 2009 at 5:47 PM, Igor Stasenko <[hidden email]> wrote:

>
> 2009/9/21 Eliot Miranda <[hidden email]>:
>>
>>
>>
>> On Sun, Sep 20, 2009 at 3:35 PM, Igor Stasenko <[hidden email]> wrote:
>>>
>>> 2009/9/21 Joshua Gargus <[hidden email]>:
>>> >
>>> > Igor Stasenko wrote:
>>> >>
>>> >> 2009/9/21 Joshua Gargus <[hidden email]>:
>>> >>
>>> >>> Eliot Miranda wrote:
>>> >>>
>>> >>>> ------------------------------------------------------------------------
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Sun, Sep 20, 2009 at 1:00 PM, Eliot Miranda
>>> >>>> <[hidden email] <mailto:[hidden email]>> wrote:
>>> >>>>
>>> >>>>      Instead we can use three variables per index to implement an
>>> >>>>     exclusion-free solution, thusly:
>>> >>>>
>>> >>>>     typedef struct { int lock; // accessed via e.g. XCHG to protect
>>> >>>>     signalRequest
>>> >>>>                            int requests;
>>> >>>>                            int responses;
>>> >>>>                } ExternalSignalRequest;
>>> >>>>
>>> >>>>     ExternalSignalRequest *externalSignalRequests;
>>> >>>>     int checkExternalSignalRequests;
>>> >>>>
>>> >>>>     void
>>> >>>>     requestSignal(int i)
>>> >>>>     {
>>> >>>>            while (!lock(&externalSignalRequests[i].lock))
>>> >>>>                  usleep(1);
>>> >>>>
>>> >>>>            ++externalSignalRequests[i].requests;
>>> >>>>
>>> >>>>             unlock(&externalSignalRequests[i].lock);
>>> >>>>
>>> >>>>             checkExternalSignalRequests = 1;
>>> >>>>             forceInterruptCheck();  // set a flag to cause the VM to
>>> >>>>     check for interrupts; in the stack VM this is stackLimit :=
>>> >>>>     (usqInt)-1;
>>> >>>>     }
>>> >>>>
>>> >>>>
>>> >>>>     The VM responds in checkEvents:
>>> >>>>
>>> >>>>         if (checkExternalSignalRequests)
>>> >>>>             for (i = 0; i < numExternalSemaphores; i++)
>>> >>>>                 while (externalSignalRequests[i]. responses
>>> >>>>     != externalSignalRequests[i]. requests) {
>>> >>>>                     signalSemaphoreWithIndex(i);
>>> >>>>                     externalSignalRequests[i]. responses;
>>> >>>>                 }
>>> >>>>
>>> >>>>
>>> >>> Am I correct that ExternalSignalRequest.reponses is never reset to zero,
>>> >>> instead eventually wrapping around?  That worried me for a minute, but I
>>> >>> see that it doesn't matter.  Neeto.
>>> >>>
>>> >>>
>>> >>
>>> >> Nope, it does resets the counters.. see signalExternalSemaphores  method.
>>> >>
>>> >
>>> > I was responding to Eliot's proposal, not the existing code.  If I
>>> > understand properly, he's proposing to maintain a per-semaphore count of
>>> > signal requests/responses, so there would be no
>>> > "semaphoresToSignalCountA/B".
>>> >
>>> Ah.. sorry for not reading carefully..
>>> In proposed scenario i don't like 1 (one) thing, that checkForInterrupts will
>>> spend a linear amount of time to check what semaphores need to be signaled
>>> instead of spending linear amount of time to signal semaphores which
>>> is to be signaled:
>>>
>>>     for (i = 0; i < numExternalSemaphores; i++)
>>>
>>> means, that if you having 1000 external semaphores, and only 1 of them
>>> signaled currently, you have to
>>> loop through all of them, inevitably :(
>>
>> Yes, that caused a twinge.  Bit with a separate lock we can maintain high and low tides, e.g.
>> typedef struct { int lock; // accessed via e.g. XCHG to protect signalRequest
>>                        int requests;
>>                        int responses;
>>            } ExternalSignalRequest;
>> ExternalSignalRequest *externalSignalRequests;
>> int checkExternalSignalRequests;
>> int tideLock = 0;
>> int useTideA = 1;
>> int lowTideA = (unsigned)-1 >> 1, highTideA = 0;
>> int lowTideB = (unsigned)-1 >> 1, highTideB = 0;
>> void
>> requestSignal(int i)
>> {
>>        while (!lock(& tideLock))
>>              usleep(1);
>>        if (useTideA) {
>>             if (lowTideA > i) lowTideA = i;
>>             if (highTideA < i) highTideA = i);
>>        }
>>        else {
>>             if (lowTideB > i) lowTideB = i;
>>             if (highTideB < i) highTideB = i);
>>        }
>>
>>        while (!lock(&externalSignalRequests[i].lock))
>>              usleep(1);
>>        ++externalSignalRequests[i].requests;
>>         unlock(&externalSignalRequests[i].lock);
>>         checkExternalSignalRequests = 1;
>>         forceInterruptCheck();  // set a flag to cause the VM to check for interrupts; in the stack VM this is stackLimit := (usqInt)-1;
>> }
>>
>> The VM responds in checkEvents:
>>     if (checkExternalSignalRequests) {
>
> i think you missed here something like following:
>        while (!lock(& tideLock))
>              usleep(1);
>
> .. and corresponding unlock() somewhere down below, where its safe.

Good catch.  Thanks.

>
>>       checkExternalSignalRequests = 0; // forgot this in my first email
>>       if (useTideA) {
>>         useTideA = 0;
>>         for (i = lowTideA; i < highTideA; i++)
>>             while (externalSignalRequests[i]. responses != externalSignalRequests[i]. requests) {
>>                 signalSemaphoreWithIndex(i);
>>                 externalSignalRequests[i]. responses;
>>             }
>>             lowTideA = (unsigned)-1 >> 1, highTideA = 0;
>>       }
>>       else  {
>>         useTideA = 1;
>>         for (i = lowTideB; i < highTideB; i++)
>>             while (externalSignalRequests[i]. responses != externalSignalRequests[i]. requests) {
>>                 signalSemaphoreWithIndex(i);
>>                 externalSignalRequests[i]. responses;
>>             }
>>         lowTideB = (unsigned)-1 >> 1, highTideB = 0;
>>       }
>>
>> and we should make lowTideA = (unsigned)-1 >> 1, highTideA = 0; a macro, to avoid the duplication.
>> Most of the time the low and high tides will get set to the same thing because very likely most of the time only one semaphore will get signalled.
>>
>> Note that I did the moral equivalent of this with the processor scheduler's queue in both VW and in the Cog Squeak VMs and it speeds up wakeHighestPriority a lot.  The idea is to maintain in a variable the highest runnable priority, which is increased whenever a process is added to the runnable queue and is of higher priority than the high tide, and reduced in wakeHighestPriority to that of the first process found.  This avoids wakeHighestPriority testing 50 odd empty lists on the typical pass through runnable processes from 100 down to user priority.
>
> This can be defeated by following code (wild guess)
>
> | proc time |
> proc := [ [ Processor yield. ] repeat ] newProcess.
> proc priority: Processor highestPriority.
> proc resume.
> time := [ 100000000 timesRepeat: [ "***" Processor yield. ] ] timeToRun.
> proc terminate.
> time

That doesn't defeat it; it merely reduces its performance to the same
as the original naive algorithm.  But in a typical scenario the tide
scheme wins big.  What is broken about the tide scheme is that
changing a process's priority must be intercepted to correctly
maintain the high-tide.  The code in our Qwaq images is incorrect in
that it merely assigns the priority inst var which won't actually move
the process onto the right run queue if it is already runnable; i.e.
the priority: accessor can only be safely used at initialization time.
 In VM there's a primitive that takes care of this and hence makes it
easy to maintain the high tide correctly.


>
> resulting run time should be very close for both VMs, where one using
> and another not using a new quirk.
> I may be wrong , maybe at "****" there should be just some code , like
> 1+1 , but not Processor yield, to make sure
> that given loop get interrupted, but not yields execution by itself :)
>
> Regarding scheduling, did you taken a look of my scheduling
> refactorings, which moving scheduling logic to image side,
> freeing VM from any knowledge of what Semaphores is?
> I think it could be a good move. The only thing, which is concern is
> speed. But this could be improved by making some primitives
> which reducing the number of sends, needed to handle the scheduling
> (most consuming is - adding/removing to list(s)). But not rest of
> logic, which should belong to image side.
> And in presence of JIT, this will no longer be issue.
>
>>
>> We're getting close to the end of a release cycle so I expect we'll release the Stack Cog VM soon thereafter.  Sorry for the wait so far :/
>>>
>>> > Cheers,
>>> > Josh
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Igor Stasenko AKA sig.
>>
>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Igor Stasenko

2009/9/21 Eliot Miranda <[hidden email]>:

[snip]

>> This can be defeated by following code (wild guess)
>>
>> | proc time |
>> proc := [ [ Processor yield. ] repeat ] newProcess.
>> proc priority: Processor highestPriority.
>> proc resume.
>> time := [ 100000000 timesRepeat: [ "***" Processor yield. ] ] timeToRun.
>> proc terminate.
>> time
>
> That doesn't defeat it; it merely reduces its performance to the same
> as the original naive algorithm.

Right. This is what i wanted to illustrate. Sure thing, for typical
scenario your code is
certainly works better. :)

Btw, i am free to change my scheduler to use a linked list of
entries like:
(prioX, processesList) --> (prioY, processesList) --> ...

and simply do not waste the time & space holding the empty lists.
Or i am free to use any other scheme.. That's was the main reason for
removing the
scheduling from VM - and make it easy to improve things & correct bugs
without the need
of changing VM each time :)

>  But in a typical scenario the tide
> scheme wins big.  What is broken about the tide scheme is that
> changing a process's priority must be intercepted to correctly
> maintain the high-tide.  The code in our Qwaq images is incorrect in
> that it merely assigns the priority inst var which won't actually move
> the process onto the right run queue if it is already runnable; i.e.
> the priority: accessor can only be safely used at initialization time.
>  In VM there's a primitive that takes care of this and hence makes it
> easy to maintain the high tide correctly.
>
Right, here's the snippet from my new scheduling code :)

priority: anInteger
        "Set the receiver's priority to anInteger."
        Processor checkPriorityValue: anInteger.
        self isSuspended ifFalse: [
                "handle the potential need in rescheduling"
                Processor reschedule: self atPriority: anInteger
        ] ifTrue: [ priority := anInteger ]

>
>>
>> resulting run time should be very close for both VMs, where one using
>> and another not using a new quirk.
>> I may be wrong , maybe at "****" there should be just some code , like
>> 1+1 , but not Processor yield, to make sure
>> that given loop get interrupted, but not yields execution by itself :)
>>
>> Regarding scheduling, did you taken a look of my scheduling
>> refactorings, which moving scheduling logic to image side,
>> freeing VM from any knowledge of what Semaphores is?
>> I think it could be a good move. The only thing, which is concern is
>> speed. But this could be improved by making some primitives
>> which reducing the number of sends, needed to handle the scheduling
>> (most consuming is - adding/removing to list(s)). But not rest of
>> logic, which should belong to image side.
>> And in presence of JIT, this will no longer be issue.
>>
>>>
>>> We're getting close to the end of a release cycle so I expect we'll release the Stack Cog VM soon thereafter.  Sorry for the wait so far :/
>>>>
>>>> > Cheers,
>>>> > Josh
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Igor Stasenko AKA sig.
>>>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>
>



--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

johnmci
 
Er, did we reach a consensus about what to do?

My problem is determining what is best based on some the code concepts  
given (which people pointed out flaws in, or confusion about).
and the premise they work all the same on different hardware platforms  
intel, powerpc, arm, which I'm not sure about.

Perhaps a more heavy weight, platform dependent solution using the  
generic acceptable locking logic is required.

Er like
acquireTheHostPlatformIndexedSemaphoreLock()
{Do what ever is required to remember the semaphore index so that  
checkForInterrupts can find it, a queue perhaps?
releaseTheHostPlatformIndexSemaphoreLock()

I'd keep in mind

(a) How many times do we execute the signalExternalSemaphore logic per  
seconds, and
(b) if someone want to do  this a million times a second I think they  
can do their own "exotic" solution via overriding  
acquireTheHostPlatformIndexedSemaphoreLock &  
releaseTheHostPlatformIndexSemaphoreLock
(c) keep it simple so I don't have to worry how it works on powerpc,  
intel, and arm.

acquireTheHostPlatformIndexedSemaphoreLock/
releaseTheHostPlatformIndexSemaphoreLock
Obviously I'd just throw myself on the evil pthread solution.


Would we do a linked list, or queue for the semaphores, versus that  
fixed size list? A size I picked based on exploring network interrupt  
value rates on a mind numbling 200Mhz powerpc machine?


On 2009-09-20, at 7:00 PM, Igor Stasenko wrote:
>

--
=
=
=
========================================================================
John M. McIntosh <[hidden email]>   Twitter:  
squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
=
=
=
========================================================================




Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Eliot Miranda-2
 
Hi John,

On Mon, Sep 21, 2009 at 1:26 PM, John M McIntosh <[hidden email]> wrote:

Er, did we reach a consensus about what to do?

My problem is determining what is best based on some the code concepts given (which people pointed out flaws in, or confusion about).
and the premise they work all the same on different hardware platforms intel, powerpc, arm, which I'm not sure about.

Perhaps a more heavy weight, platform dependent solution using the generic acceptable locking logic is required.

No, no, no, no. no :)  I'm currently implementing the scheme for testing here at Qwaq.  So hang on for a day and I'll provide code to start from if our testing is positive.
 
Er like
acquireTheHostPlatformIndexedSemaphoreLock()
{Do what ever is required to remember the semaphore index so that checkForInterrupts can find it, a queue perhaps?
releaseTheHostPlatformIndexSemaphoreLock()

I'd keep in mind

(a) How many times do we execute the signalExternalSemaphore logic per seconds, and
(b) if someone want to do  this a million times a second I think they can do their own "exotic" solution via overriding acquireTheHostPlatformIndexedSemaphoreLock & releaseTheHostPlatformIndexSemaphoreLock
(c) keep it simple so I don't have to worry how it works on powerpc, intel, and arm.

acquireTheHostPlatformIndexedSemaphoreLock/releaseTheHostPlatformIndexSemaphoreLock
Obviously I'd just throw myself on the evil pthread solution.


Would we do a linked list, or queue for the semaphores, versus that fixed size list? A size I picked based on exploring network interrupt value rates on a mind numbling 200Mhz powerpc machine?


On 2009-09-20, at 7:00 PM, Igor Stasenko wrote:


--

===========================================================================
John M. McIntosh <[hidden email]>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================





Reply | Threaded
Open this post in threaded view
|

Re: Why isn't signalSemaphoreWithIndex() thread-safe?

Igor Stasenko
In reply to this post by johnmci

2009/9/21 John M McIntosh <[hidden email]>:

>
> Er, did we reach a consensus about what to do?
>
> My problem is determining what is best based on some the code concepts given
> (which people pointed out flaws in, or confusion about).
> and the premise they work all the same on different hardware platforms
> intel, powerpc, arm, which I'm not sure about.
>
> Perhaps a more heavy weight, platform dependent solution using the generic
> acceptable locking logic is required.
>
+1. As i mentioned before, it would be nice to extend the platform
API, which VM could use
to deal with multithreading.
And platforms which have no threading support could simply do nothing
in these functions.

> Er like
> acquireTheHostPlatformIndexedSemaphoreLock()
> {Do what ever is required to remember the semaphore index so that
> checkForInterrupts can find it, a queue perhaps?
> releaseTheHostPlatformIndexSemaphoreLock()
>
> I'd keep in mind
>
> (a) How many times do we execute the signalExternalSemaphore logic per
> seconds, and
> (b) if someone want to do  this a million times a second I think they can do
> their own "exotic" solution via overriding
> acquireTheHostPlatformIndexedSemaphoreLock &
> releaseTheHostPlatformIndexSemaphoreLock
> (c) keep it simple so I don't have to worry how it works on powerpc, intel,
> and arm.
>
> acquireTheHostPlatformIndexedSemaphoreLock/releaseTheHostPlatformIndexSemaphoreLock
> Obviously I'd just throw myself on the evil pthread solution.
>
>
> Would we do a linked list, or queue for the semaphores, versus that fixed
> size list? A size I picked based on exploring network interrupt value rates
> on a mind numbling 200Mhz powerpc machine?
>

My own experience is following:
- i implemented a shared queue to use in Hydra based on atomic xchg
available on Intel
platforms. It worked well, until i had chance to run Hydra on a
multicore PC, where it failed
just after a few seconds of running the VM, causing VM to freeze infinitely.
Obviously, because implementation was wrong :)
The moral of it that, its not a question, how often we need the
synchronization between threads,
but how correct it is :)


>
> On 2009-09-20, at 7:00 PM, Igor Stasenko wrote:
>>
>
> --
> ===========================================================================
> John M. McIntosh <[hidden email]>   Twitter:  squeaker68882
> Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
> ===========================================================================
>
>
>
>
>



--
Best regards,
Igor Stasenko AKA sig.
12