VM on Solaris (was: Camera sig fault on 64 bits machines.)


VM on Solaris (was: Camera sig fault on 64 bits machines.)

Andreas Wacknitz

This evening I worked further on the problems on OpenSolaris (OpenIndiana).
I finally got a pthread version running without superuser rights. But I don’t know whether this will really work (ATM it does for me)
because I removed the call to pthread_setschedparam in beatStateMachine, leaving the heartbeat thread with the same
priority as the VM thread. I tried to replace the pthread_setschedparam call with a similar pthread_setschedprio call but
with no luck (same problem: the call failed with "Not owner"). I don’t know whether this is a general problem with the pthreads implementation
on Solaris or just a problem with the gcc version (4.4.4) that comes with the OpenIndiana distribution I am using. Maybe this works only
with the compilers and libraries that are delivered by Oracle (Solaris 10 ships with gcc 3.4.3; Solaris Studio has its own compilers).

Second, I needed to change the implementation of ioUpdateVMTimezone because struct tm on Solaris does not have a tm_gmtoff field.
There seem to be copies of this function in all three heartbeat files, with the one in sqUnixITimerHeartbeat.c working for those OSes without
tm_gmtoff.
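
A minimal sketch of how a UTC offset can be derived when struct tm has no tm_gmtoff field, assuming only the POSIX calls localtime_r and gmtime_r are available. The name utc_offset_seconds is hypothetical and this is not the VM's ioUpdateVMTimezone; it only illustrates the idea of breaking the same instant down in local time and in UTC and subtracting the fields.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper (not taken from the VM sources): seconds east of UTC
 * for the instant t, computed without the non-standard tm_gmtoff field. */
long utc_offset_seconds(time_t t)
{
    struct tm local_tm, utc_tm;
    long days;

    if (localtime_r(&t, &local_tm) == NULL || gmtime_r(&t, &utc_tm) == NULL)
        return 0; /* conversion failed; report "no offset" */

    /* The local and UTC calendar dates differ by at most one day. */
    if (local_tm.tm_year != utc_tm.tm_year)
        days = (local_tm.tm_year > utc_tm.tm_year) ? 1 : -1;
    else
        days = local_tm.tm_yday - utc_tm.tm_yday;

    return days * 86400L
         + (local_tm.tm_hour - utc_tm.tm_hour) * 3600L
         + (local_tm.tm_min  - utc_tm.tm_min)  * 60L
         + (local_tm.tm_sec  - utc_tm.tm_sec);
}
```

Called as utc_offset_seconds(time(NULL)), this returns a positive value east of UTC and a negative value west of it, which is the same sign convention tm_gmtoff uses where it exists.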

NativeBoost doesn’t seem to work yet (at least UnixEnvironment>>environ raises an error: „failed to get a symbol address: environ“).

This VM gives me 686787391 bytecodes/sec and 80516849 sends/sec on my 6-year-old Sun Ultra 24 (2.4 GHz Intel Q9300).
I get similar values for the VM with the pthread_setschedparam call and superuser rights.
My 4-year-old iMac (2.8 GHz Core i5) gives me 829149797 bytecodes/sec and 117122195 sends/sec.
So the results seem comparable.

Andreas



Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Eliot Miranda-2
 
Hi Andreas,


On Tue, Apr 22, 2014 at 12:05 PM, Andreas Wacknitz <[hidden email]> wrote:

This evening I worked further on the problems on OpenSolaris (OpenIndiana).
I finally got a pthread version running without superuser rights. But I don’t know whether this will really work (ATM it does for me)
because I removed the call to pthread_setschedparam in beatStateMachine, leaving the heartbeat thread with the same
priority as the VM thread.

Alas, that will not work :-(. As soon as the image enters a hard loop (e.g. [true] whileTrue) the heartbeat thread will be blocked and the VM will never break out of the loop.
 
I tried to replace the pthread_setschedparam call with a similar pthread_setschedprio call but
with no luck (same problem: the call failed with "Not owner"). I don’t know whether this is a general problem with the pthreads implementation
on Solaris or just a problem with the gcc version (4.4.4) that comes with the OpenIndiana distribution I am using. Maybe this works only
with the compilers and libraries that are delivered by Oracle (Solaris 10 ships with gcc 3.4.3; Solaris Studio has its own compilers).

It's to do with pthreads, not with the compiler. On some implementations it requires special permission to create threads with different priorities. That used to be the case on Linux and it appears to be the case on OpenSolaris. Hence one is stuck with the itimer heartbeat.
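
A minimal sketch of the permission check described above, assuming a POSIX pthreads implementation. The helper try_raise_priority is hypothetical and is not the code in beatStateMachine; it only shows that the failure is an error number returned by the pthread call (a "Not owner" message is consistent with EPERM), so a caller can detect it at run time instead of exiting.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper, not copied from the VM: try to raise `thread` one
 * step above its current priority while keeping its scheduling policy.
 * Returns 0 on success, ERANGE if the policy offers no higher priority, or
 * the error number from the pthread call (EPERM when the caller lacks the
 * required privilege). */
int try_raise_priority(pthread_t thread)
{
    int policy;
    struct sched_param param;
    int err = pthread_getschedparam(thread, &policy, &param);

    if (err != 0)
        return err;
    if (param.sched_priority >= sched_get_priority_max(policy))
        return ERANGE;

    param.sched_priority += 1;
    return pthread_setschedparam(thread, policy, &param);
}
```

A caller that gets EPERM back could, for example, report the missing privilege or fall back to a different heartbeat mechanism; which of those is appropriate is a design decision this sketch does not settle.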

You /could/ try and implement the heartbeat in another process and use nice to change its priority.  I don't know how well that would work.  I've never tried it.

 
--
best,
Eliot

Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Andreas Wacknitz
 

On 22.04.2014 at 21:36, Eliot Miranda <[hidden email]> wrote:

Alas, that will not work :-(. As soon as the image enters a hard loop (e.g. [true] whileTrue) the heartbeat thread will be blocked and the VM will never break out of the loop.

How can I check this blockage? I started the VM with --pollpipe 1 and then [true] whileTrue in a Workspace. The GUI is blocked but the pipe is still rotating.


 
It's to do with pthreads, not with the compiler. On some implementations it requires special permission to create threads with different priorities. That used to be the case on Linux and it appears to be the case on OpenSolaris. Hence one is stuck with the itimer heartbeat.
Is there any implementation actually using sqUnixITimerHeartbeat.c? I don’t think that Solaris has special problems here.
Also, I cannot imagine that this situation is so uncommon that nobody else got it before SqueakVM. A higher prioritized thread should be possible for ordinary users.
At least that was my idea when I changed the code to use pthread_setschedprio. Changing the prio while keeping the policy seemed reasonable.
  

You /could/ try and implement the heartbeat in another process and use nice to change its priority.  I don't know how well that would work.  I've never tried it.
As I wrote before, I already have a simple C program that sends SIGALRM. It works, albeit very slowly.
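
A minimal sketch of the external-heartbeat idea discussed here, assuming POSIX signals. It is not the program mentioned above; the names install_tick_handler and send_heartbeats are hypothetical. The receiver counts SIGALRM deliveries, and the sender signals a target process at a fixed interval; a standalone sender built around send_heartbeats could be started under nice to change its priority.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h> /* getpid(), handy for trying both sides in one process */

/* Receiver side: number of SIGALRM "beats" seen so far. */
static volatile sig_atomic_t tick_count = 0;

static void on_tick(int signo)
{
    (void)signo;
    tick_count++;
}

/* Install the counting handler; returns 0 on success, -1 on failure. */
int install_tick_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGALRM, &sa, NULL);
}

/* Sender side: deliver `count` beats to process `target`, sleeping
 * `interval_ns` nanoseconds (less than one second) after each one.
 * Returns 0 on success, -1 if kill() fails. */
int send_heartbeats(pid_t target, int count, long interval_ns)
{
    struct timespec pause_time = { 0, interval_ns };
    int i;

    for (i = 0; i < count; i++) {
        if (kill(target, SIGALRM) != 0)
            return -1;
        nanosleep(&pause_time, NULL); /* may return early if interrupted */
    }
    return 0;
}
```

Note that standard signals are not queued: if the receiver has not yet handled one SIGALRM when the next arrives, the two are merged, so beats can be lost when the receiver is slow to run.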


Regards,
Andreas

Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Eliot Miranda-2
 



On Tue, Apr 22, 2014 at 1:10 PM, Andreas Wacknitz <[hidden email]> wrote:
 

On 22.04.2014 at 21:36, Eliot Miranda <[hidden email]> wrote:

Alas, that will not work :-(. As soon as the image enters a hard loop (e.g. [true] whileTrue) the heartbeat thread will be blocked and the VM will never break out of the loop.

How can I check this blockage? I started the VM with --pollpipe 1 and then [true] whileTrue in a Workspace. The GUI is blocked but the pipe is still rotating.

Can you interrupt with ctrl-period? If not, then I don't understand how the pipe is still rotating :-). If you can, then you're not blocking the system. Try e.g. [[true] whileTrue] forkAt: Processor highestPriority.

You are running the JIT right?

It's to do with pthreads, not with the compiler. On some implementations it requires special permission to create threads with different priorities. That used to be the case on Linux and it appears to be the case on OpenSolaris. Hence one is stuck with the itimer heartbeat.
Is there any implementation actually using sqUnixITimerHeartbeat.c?

Yes, but unhappily. We use it at Cadence because we have customers on pre-2.6.12 kernels. We have to e.g. switch off the heartbeat around certain external calls.
 
I don’t think that Solaris has special problems here.
Also, I cannot imagine that this situation is so uncommon that nobody else got it before SqueakVM. A higher prioritized thread should be possible for ordinary users.

It definitely wasn't the case on Linux.

At least that was my idea when I changed the code to use pthread_setschedprio. Changing the prio while keeping the policy seemed reasonable.

If you can lower the priority of the main thread that'll work too.
 
--
best,
Eliot

Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Andreas Wacknitz
 
Thanks again Eliot,

First, I solved the pthreads problem under OpenSolaris. While Solaris 10 doesn’t need special user privileges for thread control (at least within the same thread policy I guess),
users under Solaris 11 (and thus OpenSolaris) need the privilege „proc_priocntl“ to be given by an administrator.
(For those who are interested: usermod -K defaultpriv=basic,proc_priocntl andreas)

More below…

On 22.04.2014 at 22:31, Eliot Miranda <[hidden email]> wrote:




Can you interrupt with ctrl-period? If not, then I don't understand how the pipe is still rotating :-). If you can, then you're not blocking the system. Try e.g. [[true] whileTrue] forkAt: Processor highestPriority.
Yes, I can do that in both (with and without higher priority) BUT not when running this with highestPriority (again in BOTH versions!).


You are running the JIT right?
How to tell for sure? I started the VM with --trace. The last log entry is „IRBytecodeGenerator>>from:goto“.
The pipe is still rotating but ALT-. does not break the loop (maybe a problem of my Pharo image? I will try later with a Squeak image).


Yes, but unhappily. We use it at Cadence because we have customers on pre-2.6.12 kernels. We have to e.g. switch off the heartbeat around certain external calls.
I am still wondering about where the necessary sleep call will be generated in this case. I will check your latest VM sources. Maybe PharoVM is different here…

 
I don’t think that Solaris has special problems here.
Also, I cannot imagine that this situation is so uncommon that nobody else got it before SqueakVM. A higher prioritized thread should be possible for ordinary users.

It definitely wasn't the case on Linux.

I haven’t tried to build the VM for Solaris 10 yet, but I have tried a multi-threaded example program I found on the web. This example program runs 4 threads for computation
and does not run under OpenSolaris (same error code as the SqueakVM generates), but it runs without hassle on Solaris 10 SPARC. So I guess Solaris 10 will run the
two-threaded SqueakVM without problems.


Best regards,
Andreas


Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Eliot Miranda-2
 
Hi Andreas,


On Wed, Apr 23, 2014 at 9:35 AM, Andreas Wacknitz <[hidden email]> wrote:
 
Thanks again Eliot,

First, I solved the pthreads problem under OpenSolaris. While Solaris 10 doesn’t need special user privileges for thread control (at least within the same thread policy I guess),
users under Solaris 11 (and thus OpenSolaris) need the privilege „proc_priocntl“ to be given by an administrator.
(For those who are interested: usermod -K defaultpriv=basic,proc_priocntl andreas)

This is a pain :-).  You could either assume that people can always get the necessary permission and go with the threaded heartbeat (my preferred suggestion) or provide two VMs (always tedious).

Can you interrupt with ctrl-period? If not, then I don't understand how the pipe is still rotating :-). If you can, then you're not blocking the system. Try e.g. [[true] whileTrue] forkAt: Processor highestPriority.
Yes, I can do that in both (with and without higher priority) BUT not when running this with highestPriority (again in BOTH versions!).

Oops, that's right. It should be just "[[true] whileTrue] forkAt: Processor userPriority + 1". Obviously one can't interrupt something running higher than userInterrupt priority. Sorry, I was asleep.

You are running the JIT right?
How to tell for sure?

vm -version

If it includes a CoInterpreter line you're running the JIT.  e.g.
McStalker.macbuild$ oscfvm -version
/Users/eliot/Cog/oscogvm/macbuild/Fast.app/Contents/MacOS/Squeak
4.0 4.0.2894 Mac OS X built on Apr 14 2014 17:02:16 Compiler: 4.2.1 (Apple Inc. build 5666) (dot 3) [Production VM]
CoInterpreter VMMaker.oscog-eem.674 uuid: eefd603d-9638-4ad8-99c0-4ee12e87d49d Apr 14 2014
StackToRegisterMappingCogit VMMaker.oscog-eem.674 uuid: eefd603d-9638-4ad8-99c0-4ee12e87d49d Apr 14 2014
VM: r2894 http://www.squeakvm.org/svn/squeak/branches/Cog Date: 2014-04-14 15:32:11 -0700

 
Yes, but unhappily. We use it at Cadence because we have customers on pre-2.6.12 kernels. We have to e.g. switch off the heartbeat around certain external calls.
I am still wondering about where the necessary sleep call will be generated in this case. I will check your latest VM sources. Maybe PharoVM is different here…

Where is there a necessary sleep?

I haven’t tried to build the VM for Solaris 10 yet, but I have tried a multi-threaded example program I found on the web. This example program runs 4 threads for computation
and does not run under OpenSolaris (same error code as the SqueakVM generates), but it runs without hassle on Solaris 10 SPARC. So I guess Solaris 10 will run the
two-threaded SqueakVM without problems.

sigh...
 

--
best,
Eliot

Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Colin Putney-3



On 23 Apr, 2014 at 5:14:57 PM, Eliot Miranda ([hidden email]) wrote:
> This is a pain :-). You could either assume that people can always
> get the necessary permission and go with the threaded heartbeat
> (my preferred suggestion) or provide two VMs (always tedious).

Right. The number of non-administrators running Squeak/Pharo/Cuis on Solaris is going to be vanishingly small, and right now we know it’s zero. If circumstances change, that’s good news, and we can figure out how to accommodate the massive influx of new users.

--  
Colin Putney



Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Andreas Wacknitz
In reply to this post by Eliot Miranda-2
 

On 24.04.2014 at 00:14, Eliot Miranda <[hidden email]> wrote:

This is a pain :-).  You could either assume that people can always get the necessary permission and go with the threaded heartbeat (my preferred suggestion) or provide two VMs (always tedious).
Yes, I am considering going with the threaded heartbeat for OpenSolaris (I will also try to compile everything under Solaris 11.1, but that has lower priority for me as I am not really using it).
I have not yet decided whether the version without increased priority would be enough. At the moment everything seems to run fine with this version; I can interrupt "[[true] whileTrue] forkAt: Processor userInterruptPriority"
by ALT-.
But the whole thing isn’t finished yet, as FFI (NativeBoost) doesn’t seem to work (e.g. "UnixEnvironment environ" fails). I don’t know when I will find time to deal with that.



Oops, that's right. It should be just "[[true] whileTrue] forkAt: Processor userPriority + 1". Obviously one can't interrupt something running higher than userInterrupt priority. Sorry, I was asleep.
See above. OpenSolaris seems to run fine with the two threads having the same priority.



You are running the JIT right?
How to tell for sure?

vm -version

If it includes a CoInterpreter line you're running the JIT.  e.g.
McStalker.macbuild$ oscfvm -version
/Users/eliot/Cog/oscogvm/macbuild/Fast.app/Contents/MacOS/Squeak
4.0 4.0.2894 Mac OS X built on Apr 14 2014 17:02:16 Compiler: 4.2.1 (Apple Inc. build 5666) (dot 3) [Production VM]
CoInterpreter VMMaker.oscog-eem.674 uuid: eefd603d-9638-4ad8-99c0-4ee12e87d49d Apr 14 2014
StackToRegisterMappingCogit VMMaker.oscog-eem.674 uuid: eefd603d-9638-4ad8-99c0-4ee12e87d49d Apr 14 2014
VM: r2894 http://www.squeakvm.org/svn/squeak/branches/Cog Date: 2014-04-14 15:32:11 -0700

merkur pharo-without-higher-priority $ ./pharo --version
3.9-7 #1 22. April 2014 20:30:39 CEST gcc 4.7.3 [Production VM]
NBCoInterpreter NativeBoost-CogPlugin-GuillermoPolito.19 uuid: acc98e51-2fba-4841-a965-2975997bba66 Apr 22 2014
NBCogit NativeBoost-CogPlugin-GuillermoPolito.19 uuid: acc98e51-2fba-4841-a965-2975997bba66 Apr 22 2014
https://github.com/pharo-project/pharo-vm.git Commit: 9e648898f53aadb692f2dc95f432daedc449d432 Date: 2014-04-09 16:01:20 +0200 By: Esteban Lorenzano <[hidden email]
SunOS merkur 5.11 illumos-b6240e8 i86pc i386 i86pc
plugin path: /home/andreas/bin/pharo-without-higher-priority/ [default: /home/andreas/bin/pharo-without-higher-priority/]


What does this tell?


 
Yes, but unhappily. We use it at Cadence because we have customers on pre-2.6.12 kernels. We have to e.g. switch off the heartbeat around certain external calls.
I am still wondering about where the necessary sleep call will be generated in this case. I will check your latest VM sources. Maybe PharoVM is different here…

Where is there a necessary sleep?
My understanding is that the interrupt handler for the heartbeat is waiting for SIGALRM. Typically this is emitted by an expiring usleep or nanosleep call. I cannot see one in the code that is active
when compiling the pharo-vm with the ITIMER_HEARTBEAT flag set. If the VM_TICKER flag is set, there is a nanosleep call in the corresponding code.

The fact that my ITIMER_HEARTBEAT version runs when external SIGALRMs are triggered confirms my view that a source for this signal is missing.
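
For reference, on POSIX systems SIGALRM is conventionally generated by alarm() or by an interval timer armed with setitimer(ITIMER_REAL, ...); nanosleep and usleep do not raise it themselves. The following is a minimal sketch of such a signal source, assuming (as the file name sqUnixITimerHeartbeat.c suggests, but without having checked the VM sources) an interval-timer design. All function names here are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Number of timer-driven beats seen so far. */
static volatile sig_atomic_t heartbeat_ticks = 0;

static void on_heartbeat(int signo)
{
    (void)signo;
    heartbeat_ticks++;
}

/* Install a SIGALRM handler and arm a repeating real-time interval timer.
 * interval_us must be greater than zero. Returns 0 on success, -1 on error. */
int start_itimer_heartbeat(long interval_us)
{
    struct sigaction sa;
    struct itimerval timer;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_heartbeat;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    timer.it_interval.tv_sec = interval_us / 1000000L;
    timer.it_interval.tv_usec = interval_us % 1000000L;
    timer.it_value = timer.it_interval; /* first beat after one interval */
    return setitimer(ITIMER_REAL, &timer, NULL);
}

/* Disarm the timer (a zero it_value stops it). */
int stop_itimer_heartbeat(void)
{
    struct itimerval off;

    memset(&off, 0, sizeof off);
    return setitimer(ITIMER_REAL, &off, NULL);
}

/* Sleep until at least n beats have arrived. SIGALRM is blocked around the
 * check so that a beat cannot slip in between the test and the wait. */
void wait_for_heartbeats(int n)
{
    sigset_t block_alarm, previous;

    sigemptyset(&block_alarm);
    sigaddset(&block_alarm, SIGALRM);
    sigprocmask(SIG_BLOCK, &block_alarm, &previous);
    while (heartbeat_ticks < n)
        sigsuspend(&previous); /* atomically unblock and wait for a signal */
    sigprocmask(SIG_SETMASK, &previous, NULL);
}
```

If a build never arms such a timer (or an equivalent), nothing inside the process would produce SIGALRM, which would match the observation that the VM only runs while signals arrive from outside.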

Best regards,
Andreas


Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Eliot Miranda-2
 
Hi Andreas,


On Thu, Apr 24, 2014 at 9:58 AM, Andreas Wacknitz <[hidden email]> wrote:
 

On 24.04.2014 at 00:14, Eliot Miranda <[hidden email]> wrote:

Yes, I am considering going with the threaded heartbeat for OpenSolaris (I will also try to compile everything under Solaris 11.1, but that has lower priority for me as I am not really using it).
I have not yet decided whether the version without increased priority would be enough. At the moment everything seems to run fine with this version; I can interrupt "[[true] whileTrue] forkAt: Processor userInterruptPriority"
by ALT-.

That implies it is working.  But I would definitely make sure the heartbeat runs at a higher priority than the main thread.

One thing to check is that delays expire even when the system is fully busy, e.g.

| run s |
run := true.
s := Semaphore new.
[| i | i := 0. s wait. [run] whileTrue: [i := i + 1]] forkAt: Processor highestPriority - 1.
[(Delay forSeconds: 1) wait. run := false] forkAt: Processor highestPriority.
s signal

should lock up the system for 1 second. If the heartbeat is not advancing the clock used to check for delays, then the system will remain locked.

But the whole thing isn’t finished yet, as FFI (NativeBoost) doesn’t seem to work (e.g. "UnixEnvironment environ" fails). I don’t know when I will find time to deal with that.

Well, the code is still useful for the Squeak VM, so please commit if and when you have the heartbeat working to your satisfaction.
 
Oops, that's right. It should be just "[[true] whileTrue] forkAt: Processor userPriority + 1". Obviously one can't interrupt something running higher than userInterrupt priority. Sorry, I was asleep.
See above. OpenSolaris seems to run fine with the two threads having the same priority.

Since I've been here before, I know that examples can be constructed for which this will not work properly. My Delay example above should be one of them. All one needs is to arrange that the system is fully busy and shuts out the heartbeat thread, yet depends on the heartbeat thread to make progress (as in the Delay example; the heartbeat advances a low-resolution (~2ms) clock that is used to fire delays).
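
The dependency described above can be sketched in C, under stated assumptions: this is not the VM's heartbeat, and every name in it (start_heartbeat, spin_for_ms, low_res_clock_ms) is hypothetical. A heartbeat thread publishes a coarse clock roughly every 2 ms, and a busy loop that never sleeps can only finish if that thread keeps getting scheduled.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

static atomic_llong low_res_clock_ms; /* written only by the heartbeat */
static atomic_int heartbeat_running;

static long long monotonic_ms(void)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (long long)now.tv_sec * 1000LL + now.tv_nsec / 1000000L;
}

/* Heartbeat thread: wake about every 2 ms and publish the current time. */
static void *heartbeat_loop(void *unused)
{
    const struct timespec beat = { 0, 2 * 1000 * 1000 };

    (void)unused;
    while (atomic_load(&heartbeat_running)) {
        nanosleep(&beat, NULL);
        atomic_store(&low_res_clock_ms, monotonic_ms());
    }
    return NULL;
}

/* Start the heartbeat; returns 0 on success or a pthread error number. */
int start_heartbeat(pthread_t *thread)
{
    atomic_store(&low_res_clock_ms, monotonic_ms());
    atomic_store(&heartbeat_running, 1);
    return pthread_create(thread, NULL, heartbeat_loop, NULL);
}

void stop_heartbeat(pthread_t thread)
{
    atomic_store(&heartbeat_running, 0);
    pthread_join(thread, NULL);
}

/* Busy loop that reads only the published clock. It ends only if the
 * heartbeat thread keeps running; returns the elapsed time it observed. */
long long spin_for_ms(long long duration_ms)
{
    long long start = atomic_load(&low_res_clock_ms);

    while (atomic_load(&low_res_clock_ms) - start < duration_ms)
        ; /* spin: no sleeping and no system calls */
    return atomic_load(&low_res_clock_ms) - start;
}
```

On a typical time-sharing scheduler spin_for_ms returns even when both threads have equal priority, because the spinning thread is eventually preempted. Under a scheduling policy where an equal-priority busy thread is never preempted, the loop would not end, which is the failure mode the Delay example is meant to expose.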
 
You are running the JIT, right?
How to tell for sure?

vm -version

If it includes a CoInterpreter line you're running the JIT.  e.g.
McStalker.macbuild$ oscfvm -version
/Users/eliot/Cog/oscogvm/macbuild/Fast.app/Contents/MacOS/Squeak
4.0 4.0.2894 Mac OS X built on Apr 14 2014 17:02:16 Compiler: 4.2.1 (Apple Inc. build 5666) (dot 3) [Production VM]
CoInterpreter VMMaker.oscog-eem.674 uuid: eefd603d-9638-4ad8-99c0-4ee12e87d49d Apr 14 2014
StackToRegisterMappingCogit VMMaker.oscog-eem.674 uuid: eefd603d-9638-4ad8-99c0-4ee12e87d49d Apr 14 2014
VM: r2894 http://www.squeakvm.org/svn/squeak/branches/Cog Date: 2014-04-14 15:32:11 -0700

merkur pharo-without-higher-priority $ ./pharo --version
3.9-7 #1 22. April 2014 20:30:39 CEST gcc 4.7.3 [Production VM]
NBCoInterpreter NativeBoost-CogPlugin-GuillermoPolito.19 uuid: acc98e51-2fba-4841-a965-2975997bba66 Apr 22 2014
NBCogit NativeBoost-CogPlugin-GuillermoPolito.19 uuid: acc98e51-2fba-4841-a965-2975997bba66 Apr 22 2014
https://github.com/pharo-project/pharo-vm.git Commit: 9e648898f53aadb692f2dc95f432daedc449d432 Date: 2014-04-09 16:01:20 +0200 By: Esteban Lorenzano <[hidden email]
SunOS merkur 5.11 illumos-b6240e8 i86pc i386 i86pc
plugin path: /home/andreas/bin/pharo-without-higher-priority/ [default: /home/andreas/bin/pharo-without-higher-priority/]


What does this tell?

That you're using the JIT (NBCogit & NBCoInterpreter).
  
I started the VM with --trace. The last log is "IRBytecodeGenerator>>from:goto".
The pipe is still rotating but ALT-. does not break the loop (maybe a problem with my Pharo image? I will try later with a Squeak image).

I tried to replace the pthread_setschedparam call with a similar pthread_setschedprio call but
with no luck (same problem: failed call with "Not owner"). I don’t know whether this is a general problem with the pthreads implementation
on Solaris or just a problem with the gcc version (4.4.4) coming with the openindiana distribution I am using. Maybe this works only
with the compilers and libraries that are delivered by Oracle (Solaris 10 ships with gcc 3.4.3; Solaris Studio has its own compilers).

It's to do with pthreads.  Nothing to do with the compiler.  On some implementations it requires special permission to create threads with different priorities.  That used to be the case on Linux and it appears to be the case on OpenSolaris.  Hence one is stuck with the itimer heartbeat.
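To see the permission problem in isolation, here is a minimal sketch in C. It is not the VM's code; try_raise_own_priority is a hypothetical helper that only asks for one priority step above the calling thread's current setting and reports the error code. Without the needed privilege one would expect EPERM, which I believe is what Solaris prints as "Not owner"; on systems where the current policy has no higher priority level the call can fail with EINVAL instead.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to raise the calling thread's priority by one
 * step within its current scheduling policy.
 * Returns 0 on success, otherwise the pthread error number. */
int try_raise_own_priority(void)
{
    int policy;
    struct sched_param param;
    int err = pthread_getschedparam(pthread_self(), &policy, &param);

    if (err != 0)
        return err;

    param.sched_priority += 1;
    err = pthread_setschedparam(pthread_self(), policy, &param);
    if (err != 0)
        fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
    assert(err >= 0); /* pthread functions report errors as positive numbers */
    return err;
}
```

Calling try_raise_own_priority() once from main and printing the result is enough to tell whether the account running the VM may change thread priorities at all.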
Is there any implementation actually using sqUnixITimerHeartbeat.c?

Yes, but unhappily.  We use it at Cadence because we have customers on pre-2.6.12 kernels.  We have to e.g. switch off the heartbeat around certain external calls.
I am still wondering about where the necessary sleep call will be generated in this case. I will check your latest VM sources. Maybe PharoVM is different here…

Where is there a necessary sleep?
My understanding is that the interrupt handler for the heartbeat is waiting for SIGALRM. Typically this is emitted by an expiring usleep or nanosleep call. I cannot see one in the code that is active
when compiling the pharo-vm with the ITIMER_HEARTBEAT flag set. If the VM_TICKER flag is set there is a nanosleep call in the corresponding code.

But SIGALRM is also delivered by setitimer, as in

...
# define THE_ITIMER ITIMER_REAL
# define ITIMER_SIGNAL SIGALRM
...
if (setitimer(THE_ITIMER, &pulse, &pulse)) {

in platforms/unix/vm/sqUnixITimerHeartbeat.c.  I'm surprised this doesn't work on OpenSolaris.  In fact, I can't believe it doesn't work.  Something odd must be going on.
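A stand-alone test of this mechanism might help to separate a platform problem from a VM problem. The following is only a sketch, not code from sqUnixITimerHeartbeat.c: count_beats, on_alarm and elapsed_ms are made-up names, and the 2 ms period is simply the beat interval mentioned in this thread. It installs a SIGALRM handler with sigaction, arms a periodic ITIMER_REAL with setitimer, busy-waits for a while and returns how many signals arrived.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

static volatile sig_atomic_t beats = 0;

static void on_alarm(int sig) { (void)sig; beats++; }

/* Milliseconds elapsed since *since on the monotonic clock. */
static long elapsed_ms(const struct timespec *since)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - since->tv_sec) * 1000L
         + (now.tv_nsec - since->tv_nsec) / 1000000L;
}

/* Arm a periodic ITIMER_REAL with the given period, busy-wait for about
 * wait_ms, disarm the timer and return the number of SIGALRMs handled.
 * Returns -1 if the handler or the timer could not be set up. */
int count_beats(int period_ms, int wait_ms)
{
    struct sigaction sa;
    struct itimerval pulse, off;
    struct timespec start;

    assert(period_ms > 0); /* a zero period would disarm the timer */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;       /* note: no SA_RESETHAND */
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    beats = 0;
    pulse.it_interval.tv_sec  = period_ms / 1000;
    pulse.it_interval.tv_usec = (period_ms % 1000) * 1000;
    pulse.it_value = pulse.it_interval;
    if (setitimer(ITIMER_REAL, &pulse, NULL) != 0)
        return -1;

    /* Busy-wait instead of sleeping: a sleep would be cut short by SIGALRM. */
    clock_gettime(CLOCK_MONOTONIC, &start);
    while (elapsed_ms(&start) < wait_ms)
        ;

    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);  /* zero it_value disarms the timer */
    return (int)beats;
}
```

If count_beats(2, 100) returns a value well above zero on the target machine, setitimer itself is delivering SIGALRM and the problem is more likely in how the VM sets things up.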


The fact that my ITIMER_HEARTBEAT version runs when external SIGALRMs are triggered confirms my view that a source for this signal is missing.

Well, the source is there (the call to setitimer) but for some reason something is going wrong.  I suspect the signal handler fires only once.  SysV Unixes need a signal handler to be rearmed with a call to signal or sigaction in the handler.  This is daft, so by default sigaction avoids having to rearm (saving a costly system call on every signal).  The old behaviour can be reinstated using SA_RESETHAND.  Perhaps on OpenSolaris one has to explicitly use a flag that is the converse of SA_RESETHAND that says "don't reset the handler after delivery".

Here's the relevant excerpt from the Mac OS X manual page for sigaction:
"SA_RESETHAND    If this bit is set, the handler is reset back to SIG_DFL at the moment the signal is delivered."
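The difference can be checked directly on a given platform. This is a sketch under assumptions: handler_survives_delivery and on_usr1 are hypothetical names, SIGUSR1 is used so as not to interfere with any timer, and the process is assumed to be single-threaded so that raise delivers the signal before it returns. The function installs a handler with the given flags, delivers one signal, and then asks sigaction which disposition is in place.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t deliveries = 0;

static void on_usr1(int sig) { (void)sig; deliveries++; }

/* Install on_usr1 for SIGUSR1 with the given sa_flags, deliver one signal,
 * then query the current disposition.
 * Returns 1 if on_usr1 is still installed, 0 if it was reset, -1 on error. */
int handler_survives_delivery(int flags)
{
    struct sigaction sa, current;
    sigset_t set;

    /* Make sure SIGUSR1 is not blocked, otherwise nothing is delivered. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigprocmask(SIG_UNBLOCK, &set, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = flags;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    deliveries = 0;
    if (raise(SIGUSR1) != 0)
        return -1;
    assert(deliveries == 1); /* the handler ran exactly once */

    if (sigaction(SIGUSR1, NULL, &current) != 0)
        return -1;
    return current.sa_handler == on_usr1;
}
```

With flags 0 the handler should still be installed after delivery; with SA_RESETHAND it should have been replaced by SIG_DFL. A different result on OpenSolaris would point at the rearming problem suspected above.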

HTH,
--
best,
Eliot

Re: VM on Solaris (was: Camera sig fault on 64 bits machines.)

Eliot Miranda-2
 
Hi Andreas,

    I just read the OpenSolaris sigaction manual page and it has the expected semantics with SA_RESETHAND; i.e. one does /not/ have to do anything special to avoid having to reset the handler.  So I wonder whether ioInitHeartbeat is even being called.  You might check.  It ends with a call of setIntervalTimer(beatMilliseconds) which should set the heartbeat itimer going.


On Thu, Apr 24, 2014 at 11:28 AM, Eliot Miranda <[hidden email]> wrote:
Hi Andreas,


On Thu, Apr 24, 2014 at 9:58 AM, Andreas Wacknitz <[hidden email]> wrote:
 

On 24.04.2014 at 00:14, Eliot Miranda <[hidden email]> wrote:

Hi Andreas,


On Wed, Apr 23, 2014 at 9:35 AM, Andreas Wacknitz <[hidden email]> wrote:
 
Thanks again Eliot,

First, I solved the pthreads problem under OpenSolaris. While Solaris 10 doesn’t need special user privileges for thread control (at least within the same thread policy I guess),
users under Solaris 11 (and thus OpenSolaris) need the privilege „proc_priocntl“ to be given by an administrator.
(For those who are interested: usermod -K defaultpriv=basic,proc_priocntl andreas)

This is a pain :-).  You could either assume that people can always get the necessary permission and go with the threaded heartbeat (my preferred suggestion) or provide two VMs (always tedious).
Yes, I consider going with the threaded heartbeat for OpenSolaris (I will also try to compile everything under Solaris 11.1 but that’s on lower priority for me as I am not really using it.).
I have not yet decided whether the version without increased priority would be enough. At the moment everything seems to run fine with this version; I can interrupt "[[true] whileTrue] forkAt: Processor userInterruptPriority" with ALT-.

That implies it is working.  But I would definitely make sure the heartbeat runs at a higher priority than the main thread.

One thing to check is that delays expire even when the system is fully busy, e.g.

| run s |
run := true.
s := Semaphore new.
[| i | i := 0. s wait. [run] whileTrue: [i := i + 1]] forkAt: Processor highestPriority - 1.
[(Delay forSeconds: 1) wait. run := false] forkAt: Processor highestPriority.
s signal

should lock up the system for 1 second.  If the heartbeat is not advancing the clock used to check for delays then the system will remain locked.
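The same dependency can be shown outside the image. What follows is a rough sketch in C, assuming C11 atomics and pthreads; it is not how the VM is written, and spin_for, heartbeat, clock_ms and beating are made-up names. A heartbeat thread advances a coarse counter every 2 ms, and the main thread spins without sleeping or yielding until that counter has moved far enough. If the heartbeat thread never gets to run, the loop never ends, which is the lock-up described above.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

static atomic_long clock_ms = 0;   /* coarse clock, advanced only by the heartbeat */
static atomic_int  beating  = 1;

/* Heartbeat thread: sleep roughly 2 ms, then advance the clock by 2. */
static void *heartbeat(void *arg)
{
    struct timespec beat = { 0, 2 * 1000 * 1000 };
    (void)arg;
    while (atomic_load(&beating)) {
        nanosleep(&beat, NULL);
        atomic_fetch_add(&clock_ms, 2);
    }
    return NULL;
}

/* Start a heartbeat, busy-loop until the heartbeat-driven clock has advanced
 * by at least delay_ms, then stop the heartbeat.
 * Returns the clock advance observed, or -1 if the thread could not start. */
long spin_for(long delay_ms)
{
    pthread_t tid;
    long start = atomic_load(&clock_ms);
    long elapsed;

    atomic_store(&beating, 1);
    if (pthread_create(&tid, NULL, heartbeat, NULL) != 0)
        return -1;

    /* Fully busy: only the heartbeat thread can end this loop. */
    while ((elapsed = atomic_load(&clock_ms) - start) < delay_ms)
        ;

    atomic_store(&beating, 0);
    pthread_join(tid, NULL);
    return elapsed;
}
```

On a multi-core machine this returns almost regardless of priorities, which may be why running both threads at the same priority looks fine in casual testing; the interesting case is when the spinning thread and the heartbeat compete for a single core.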

But the whole thing isn’t finished yet as FFI (NativeBoost) doesn’t seem to work (e.g. "UnixEnvironment environ" fails). I don’t know when I will find time to deal with that.

Well, the code is still useful for the Squeak VM, so please commit if and when you have the heartbeat working to your satisfaction.
 

--
best,
Eliot

Re: VM on Solaris

Andreas Wacknitz
 
Hi Eliot,

On 24.04.14 at 20:32, Eliot Miranda wrote:
 


Hi Andreas,

    I just read the OpenSolaris sigaction manual page and it has the expected semantics with SA_RESETHAND; i.e. one does /not/ have to do anything special to avoid having to reset the handler.  So I wonder whether ioInitHeartbeat is even being called.  You might check.  It ends with a call of setIntervalTimer(beatMilliseconds) which should set the heartbeat itimer going.

I checked it. ioInitHeartbeat is being called. In fact setIntervalTimer(2) is being called periodically. I have no idea why setitimer doesn't fire SIGALRM.
It is initialized correctly. The man page states that SIGALRM should be fired after the expiration time.

Best regards,
Andreas

Re: VM on Solaris

Andreas Wacknitz
In reply to this post by Eliot Miranda-2
 

On 24.04.14 at 20:28, Eliot Miranda wrote:
 


Hi Andreas,


On Thu, Apr 24, 2014 at 9:58 AM, Andreas Wacknitz <[hidden email]> wrote:
 

On 24.04.2014 at 00:14, Eliot Miranda <[hidden email]> wrote:

Hi Andreas,


On Wed, Apr 23, 2014 at 9:35 AM, Andreas Wacknitz <[hidden email]> wrote:
 
Thanks again Eliot,

First, I solved the pthreads problem under OpenSolaris. While Solaris 10 doesn’t need special user privileges for thread control (at least within the same thread policy I guess),
users under Solaris 11 (and thus OpenSolaris) need the privilege „proc_priocntl“ to be given by an administrator.
(For those who are interested: usermod -K defaultpriv=basic,proc_priocntl andreas)

This is a pain :-).  You could either assume that people can always get the necessary permission and go with the threaded heartbeat (my preferred suggestion) or provide two VMs (always tedious).
Yes, I consider going with the threaded heartbeat for OpenSolaris (I will also try to compile everything under Solaris 11.1 but that’s on lower priority for me as I am not really using it.).
I have not yet decided whether the version without increased priority would be enough. At the moment everything seems to run fine with this version; I can interrupt "[[true] whileTrue] forkAt: Processor userInterruptPriority" with ALT-.

That implies it is working.  But I would definitely make sure the heartbeat runs at a higher priority than the main thread.

One thing to check is that delays expire even when the system is fully busy, e.g.

| run s |
run := true.
s := Semaphore new.
[| i | i := 0. s wait. [run] whileTrue: [i := i + 1]] forkAt: Processor highestPriority - 1.
[(Delay forSeconds: 1) wait. run := false] forkAt: Processor highestPriority.
s signal

should lock up the system for 1 second.  If the heartbeat is not advancing the clock used to check for delays then the system will remain locked.
It's working as you described: it locks for a second and then continues to work.


But the whole thing isn’t finished yet as FFI (NativeBoost) doesn’t seem to work (e.g. "UnixEnvironment environ" fails). I don’t know when I will find time to deal with that.

Well, the code is still useful for the Squeak VM, so please commit if and when you have the heartbeat working to your satisfaction.
I will do that. The next step is to get your Cog branch compiled and running :)
This is mostly for my own fun and experience. I am doing all of this in my spare time and thus cannot predict when it will be finished.

Best regards,
Andreas