IMO the best solution would be to add command line parameters to set the priorities - both main and heartbeat using relative values, and keep the current behavior if none was given. Ben's example would be 0 for both parameters. Then the error message on startup could just say to set one of the parameters to run the VM (e.g. 0,0 or -1,0, etc). AFAIK you can always decrease the priority of the threads on linux, so 0 and negative values should always work. Levente |
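A minimal sketch of the parsing side of this suggestion, in C since that is what the VM is written in. The `main,heartbeat` pair syntax (e.g. `-1,0`) is taken from the examples above; the function name `parse_priority_pair` and the way the VM would consume the two values are assumptions for illustration, not existing VM code.

```c
#include <stdio.h>

/* Hypothetical helper: parse a "main,heartbeat" pair of relative
 * priorities such as "0,0" or "-1,0".
 * Returns 0 and fills both outputs on success, -1 on malformed input. */
int parse_priority_pair(const char *arg, int *main_delta, int *heartbeat_delta)
{
    int m, h;
    char extra;

    /* Exactly two integers separated by a comma; the trailing %c only
     * converts if there is junk after the second number, in which case
     * sscanf returns 3 and the argument is rejected. */
    if (arg == NULL || sscanf(arg, "%d,%d%c", &m, &h, &extra) != 2)
        return -1;

    *main_delta = m;
    *heartbeat_delta = h;
    return 0;
}
```

With this shape, "keep the current behavior if none was given" simply means never calling the parser when the option is absent.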
In reply to this post by Ben Coman
> On 20 Mar 2017, at 17:43, Ben Coman <[hidden email]> wrote:
>

Hi!

> This is because its job is to cause Smalltalk code to break out at regular intervals to check for events. If the Smalltalk code is compute-intensive then it will prevent the heartbeat thread from running unless the heartbeat thread is running at a higher priority, and so it will be impossible to receive input keys, etc. (Note that if event collection was in a separate thread it would suffer the same issue; compute intensive code would block the event collection thread unless it was running at higher priority).
>
> Perhaps the heartbeat thread needed its static priority managed manually with Linux 2.4,
> but maybe that is not required in 2.6 ??

The "flaw" in this experiment is that the number of runnable threads is most likely smaller than or equal to the number of physical processors. You can use CPU affinity to bind your application to a single CPU and see the delay. I think the "jitter" will likely be higher.

holger |
In reply to this post by Levente Uzonyi
On Mon, Mar 20, 2017 at 09:07:16PM +0100, Levente Uzonyi wrote:
>
> IMO the best solution would be to add command line parameters to set the
> priorities - both main and heartbeat using relative values, and keep the
> current behavior if none was given. Ben's example would be 0 for both
> parameters.
> Then the error message on startup could just say to set one of the
> parameters to run the VM (e.g. 0,0 or -1,0, etc).

A much simpler solution is to just comment out the "exit(errno)" at line 332 of sqUnixHeartbeat.c. The resulting VM will work fine. If the /etc/security/limits.d/squeak.conf is not in place, then you will see the annoying but otherwise harmless warning:

pthread_setschedparam failed: Operation not permitted
This VM uses a thread heartbeat who requires a special configuration to work.
You need to allow it to run higher priority threads (real time), to allow clock to work properly
You need to add a conf file to /etc/security/limits.d, executing this:
sudo cat >/etc/security/limits.d/squeak.conf <<END
* hard rtprio 2
* soft rtprio 2
END

Ben's analysis may or may not be exactly right for all conditions of load, but directionally he is right on target. Clock jitter is of little or no interest to the general user, and the people who do care about it are fully capable of setting the necessary security configuration.

> AFAIK you can always decrease the priority of the threads on linux, so 0
> and negative values should always work.
>

I think this is correct, so another reasonable strategy would be to lower the priority of threads other than the heartbeat thread when the heartbeat priority cannot be raised. But if Ben's analysis is generally correct, then it may not even be worth the trouble of doing that.

Dave |
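For illustration, the "warn and carry on" behaviour described above might look roughly like the following. This is a sketch, not the code at line 332 of sqUnixHeartbeat.c: the helper name `try_raise_priority` and the choice of `SCHED_FIFO` are assumptions. One detail that is certain: `pthread_setschedparam` reports failure through its return value rather than through `errno`.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: ask for a real-time priority for `thread`.
 * On failure, print a warning and let the caller continue instead of
 * exiting. Returns 0 on success, otherwise the error number (for
 * example EPERM when no rtprio limit has been granted). */
int try_raise_priority(pthread_t thread, int rt_priority)
{
    struct sched_param param;
    int err;

    memset(&param, 0, sizeof param);
    param.sched_priority = rt_priority;

    err = pthread_setschedparam(thread, SCHED_FIFO, &param);
    if (err != 0)
        fprintf(stderr,
                "pthread_setschedparam failed: %s; continuing at default priority\n",
                strerror(err));
    return err;
}
```

A caller would invoke `try_raise_priority(heartbeatThread, 1)` (thread variable name assumed) and simply ignore a non-zero result.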
Hi David,
On Mon, Mar 20, 2017 at 5:00 PM, David T. Lewis <[hidden email]> wrote:
Feel free! Do it! Just add a comment to the source. If the /etc/security/limits.d/squeak. _,,,^..^,,,_ best, Eliot |
On Tue, Mar 21, 2017 at 1:44 AM, Eliot Miranda <[hidden email]> wrote:
+1 - Bert - |
In reply to this post by Holger Freyther
On Tue, Mar 21, 2017 at 6:18 AM, Holger Freyther <[hidden email]> wrote:
>
> > On 20 Mar 2017, at 17:43, Ben Coman <[hidden email]> wrote:
> >
> > On Sat, Jan 7, 2017 at 3:23 AM, Eliot Miranda <[hidden email]> wrote:
> > > This is because its job is to cause Smalltalk code to break out at regular intervals to check for events. If the Smalltalk code is compute-intensive then it will prevent the heartbeat thread from running unless the heartbeat thread is running at a higher priority, and so it will be impossible to receive input keys, etc. (Note that if event collection was in a separate thread it would suffer the same issue; compute intensive code would block the event collection thread unless it was running at higher priority).
> >
> > Perhaps the heartbeat thread needed its static priority managed manually with Linux 2.4,
> > but maybe that is not required in 2.6 ??
>
> The "flaw" in this experiment is that the number of runnable threads is most likely smaller than or equal to the number of physical processors. You can use CPU affinity to bind your application to a single CPU and see the delay. I think the "jitter" will likely be higher.
>

Great insight. I'll try two things: CPU affinity, and loading the other CPUs with my previous Fibonacci experiment. Keep in mind though that most modern desktop systems are overpowered, usually peaking at loads of less than 40% unless something goes haywire. And smaller systems like the Pi seem to default to operating as root, so they can bump the heartbeat priority.

cheers -ben |
In reply to this post by Eliot Miranda-2
On Sat, Jan 7, 2017 at 1:33 AM, Eliot Miranda <[hidden email]> wrote:
>
> Hi Guille,
>
>> On Jan 6, 2017, at 6:44 AM, Guillermo Polito <[hidden email]> wrote:
>>
>> Hi,
>>
>> I was checking the code in sqUnixHeartbeat.c to see how the heartbeat thread/itimer worked. It somehow bothers me that there are different compiled artifacts, one per option.
>>
>> What do you think about having a VM that manages that as an argument provided when we launch the VM? This would add some flexibility that we don't have right now because we make the decision at compile time.
>
> I think it's a fine idea but it isn't really the issue. The issue is that the itimer mechanism is problematic, especially for foreign code, and is therefore a stop gap. The itimer interrupts long-running system calls, which means that things like sound libraries break (at Qwaq I had to fix ALSA to get it to work with the itimer heartbeat). Since Pharo is becoming more reliant on external code it may impact us more going forward.

Just curious, what is the root cause of this itimer conflict? Is it that a SIGALRM in particular is issued, or just that the current execution is pre-empted to handle the signal - and is that a timing issue, or a concurrency problem where some state is invalidated?

Would it help to handle the signal in another thread?

I found this of mild interest also. Maybe of use if you want to hibernate your laptop without closing a running Image ?? (I'm not sure whether that's a problem anyway.)
* Deferrable timers for user space (https://lwn.net/Articles/588086) |
> On 28 Mar 2017, at 17:05, Ben Coman <[hidden email]> wrote:
>
> Just curious, what is the root cause of this itimer conflict?
> Is it that a SIGALRM in particular is issued, or just that the
> current execution is pre-empted to handle the signal - and is that a
> timing issue, or a concurrency problem where some state is
> invalidated?
>
> Would it help to handle the signal in another thread?
>

AFAICT the main issue is that it cannot be shared/multiplexed. E.g. if both ALSA and the VM install an event handler for the signal it is not clear who will win. And they will probably cancel each other's work. |
> On 28-03-2017, at 9:46 AM, Holger Freyther <[hidden email]> wrote:
>
>> On 28 Mar 2017, at 17:05, Ben Coman <[hidden email]> wrote:
>>
>> Just curious, what is the root cause of this itimer conflict?
>> Is it that a SIGALRM in particular is issued, or just that the
>> current execution is pre-empted to handle the signal - and is that a
>> timing issue, or a concurrency problem where some state is
>> invalidated?
>>
>> Would it help to handle the signal in another thread?
>>
>
> AFAICT the main issue is that it cannot be shared/multiplexed. E.g. if both ALSA and the VM install an event handler for the signal it is not clear who will win. And they will probably cancel each other's work.

Pretty much this; by advanced programming magic it manages to always work out to do *exactly* the worst thing.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: MT: Muddle Through |
In reply to this post by Ben Coman
On Tue, Mar 28, 2017 at 11:05 PM, Ben Coman <[hidden email]> wrote:
> On Sat, Jan 7, 2017 at 1:33 AM, Eliot Miranda <[hidden email]> wrote:
>>
>> Hi Guille,
>>
>>> On Jan 6, 2017, at 6:44 AM, Guillermo Polito <[hidden email]> wrote:
>>>
>>> Hi,
>>>
>>> I was checking the code in sqUnixHeartbeat.c to see how the heartbeat thread/itimer worked. It somehow bothers me that there are different compiled artifacts, one per option.
>>>
>>> What do you think about having a VM that manages that as an argument provided when we launch the VM? This would add some flexibility that we don't have right now because we make the decision at compile time.
>>
>> I think it's a fine idea but it isn't really the issue. The issue is that the itimer mechanism is problematic, especially for foreign code, and is therefore a stop gap. The itimer interrupts long-running system calls, which means that things like sound libraries break (at Qwaq I had to fix ALSA to get it to work with the itimer heartbeat). Since Pharo is becoming more reliant on external code it may impact us more going forward.
>
> Just curious, what is the root cause of this itimer conflict?
> Is it that a SIGALRM in particular is issued, or just that the
> current execution is pre-empted to handle the signal - and is that a
> timing issue, or a concurrency problem where some state is
> invalidated?
>
> Would it help to handle the signal in another thread?

bah, sorry, accidental bump sent too early. To continue..

POSIX timer_create() & timer_settime() might be worth considering, which can specify a different signal than SIGALRM, and also can apparently send the signal to a specific thread with SIGEV_THREAD_ID
https://linux.die.net/man/2/timer_create
https://lists.gt.net/linux/kernel/1289398

Now some places say there can only be one signal handler per signal per process, but IIUC in modern Linux there is no distinction between process and thread, so it's one signal handler per signal per thread.

Or perhaps you just need a particular callback function called when the timer expires, per Patryk's answer...
http://stackoverflow.com/questions/5740954/problem-in-timers-and-signal
Although creating a new thread each time may be too much overhead.

btw, did the change in Linux 2.6.12 from setitimer signalling individual threads to signalling only the main thread have any noticeable effect on the operation of the VM and external code like OSProcess and libraries? per answer by osgx...
http://stackoverflow.com/questions/2586926/setitimer-sigalrm-multithread-process-linux-c

cheers -ben

ahh, just saw this from Holger...
> AFAICT the main issue is that it cannot be shared/multiplexed. E.g. if both ALSA and the VM install an event handler for the signal it is not clear who will win. And they will probably cancel each other's work.

So I guess that means ALSA is using SIGALRM? Since that is what setitimer is hardcoded to send. I see a few mentions here...
http://git.alsa-project.org/?p=alsa-kernel.git&a=search&h=HEAD&st=commit&s=sigalrm

cheers -ben |
In reply to this post by Holger Freyther
Hi Holger,

> On Mar 28, 2017, at 9:46 AM, Holger Freyther <[hidden email]> wrote:
>
>> On 28 Mar 2017, at 17:05, Ben Coman <[hidden email]> wrote:
>>
>> Just curious, what is the root cause of this itimer conflict?
>> Is it that a SIGALRM in particular is issued, or just that the
>> current execution is pre-empted to handle the signal - and is that a
>> timing issue, or a concurrency problem where some state is
>> invalidated?
>>
>> Would it help to handle the signal in another thread?
>>
>
> AFAICT the main issue is that it cannot be shared/multiplexed. E.g. if both ALSA and the VM install an event handler for the signal it is not clear who will win. And they will probably cancel each other's work.

Not quite. The problem with ALSA is that it does not follow the convention for installing signal handlers so that they can be chained, nor uninstalling signal handlers so that the previous chain is restored on uninstall. The main problem is that the SIGALRM will interrupt system calls and it is common to find libraries either poorly written or poorly tested (perfectly understandably) so that the interruptions mean they effectively crash. ALSA is also in this category. |
In reply to this post by Ben Coman
Hi Ben,

> On Mar 28, 2017, at 10:35 AM, Ben Coman <[hidden email]> wrote:
>
>> On Tue, Mar 28, 2017 at 11:05 PM, Ben Coman <[hidden email]> wrote:
>>> On Sat, Jan 7, 2017 at 1:33 AM, Eliot Miranda <[hidden email]> wrote:
>>>
>>> Hi Guille,
>>>
>>>> On Jan 6, 2017, at 6:44 AM, Guillermo Polito <[hidden email]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I was checking the code in sqUnixHeartbeat.c to see how the heartbeat thread/itimer worked. It somehow bothers me that there are different compiled artifacts, one per option.
>>>>
>>>> What do you think about having a VM that manages that as an argument provided when we launch the VM? This would add some flexibility that we don't have right now because we make the decision at compile time.
>>>
>>> I think it's a fine idea but it isn't really the issue. The issue is that the itimer mechanism is problematic, especially for foreign code, and is therefore a stop gap. The itimer interrupts long-running system calls, which means that things like sound libraries break (at Qwaq I had to fix ALSA to get it to work with the itimer heartbeat). Since Pharo is becoming more reliant on external code it may impact us more going forward.
>>
>> Just curious, what is the root cause of this itimer conflict?
>> Is it that a SIGALRM in particular is issued, or just that the
>> current execution is pre-empted to handle the signal - and is that a
>> timing issue, or a concurrency problem where some state is
>> invalidated?
>>
>> Would it help to handle the signal in another thread?
>
> bah, sorry, accidental bump sent too early. To continue..
>
> POSIX timer_create() & timer_settime() might be worth considering,
> which can specify a different signal than SIGALRM,
> and also can apparently send the signal to a specific thread with
> SIGEV_THREAD_ID
> https://linux.die.net/man/2/timer_create
> https://lists.gt.net/linux/kernel/1289398
>
> Now some places say there can only be one signal handler per
> signal per process, but IIUC in modern Linux there is no distinction
> between process and thread, so it's one signal handler per signal per
> thread.
>
> Or perhaps you just need a particular callback function called when
> the timer expires, per Patryk's answer...
> http://stackoverflow.com/questions/5740954/problem-in-timers-and-signal
> Although creating a new thread each time may be too much overhead.
>
> btw, did the change in Linux 2.6.12 from setitimer signalling
> individual threads to signalling only the main thread have any
> noticeable effect on the operation of the VM and external code like
> OSProcess and libraries? per answer by osgx...
> http://stackoverflow.com/questions/2586926/setitimer-sigalrm-multithread-process-linux-c

I don't know. I do not have the cycles or the inclination to keep up with Linux kernel changes. Once something works I tend to move on.

I have noticed (but have to test more carefully) that the VM as amended by David Lewis not to abort if permission to raise thread priority is denied ran on my CentOS 6.5 installation without complaint after I renamed squeak.conf, logged out and logged back in. I need to check that this continues after a reboot rather than just log out/log in. But perhaps the thread priority issue is fixed.

Anyway, your interest and expertise is much appreciated. Feel free to experiment and make changes. Realize that I'm not "all over this" issue because there are many fish to fry. So I'm very thankful for your efforts here.

> cheers -ben
>
> ahh, just saw this from Holger...
>
>> AFAICT the main issue is that it cannot be shared/multiplexed. E.g. if both ALSA and the VM install an event handler for the signal it is not clear who will win. And they will probably cancel each other's work.
>
> So I guess that means ALSA is using SIGALRM?
> Since that is what setitimer is hardcoded to send. I see a few mentions here...
> http://git.alsa-project.org/?p=alsa-kernel.git&a=search&h=HEAD&st=commit&s=sigalrm
>
> cheers -ben |
On Wed, Mar 29, 2017 at 11:03 AM, Eliot Miranda <[hidden email]> wrote:
>
> Hi Ben,
>
>> On Mar 28, 2017, at 10:35 AM, Ben Coman <[hidden email]> wrote:
>>
>>> On Tue, Mar 28, 2017 at 11:05 PM, Ben Coman <[hidden email]> wrote:
>>>> On Sat, Jan 7, 2017 at 1:33 AM, Eliot Miranda <[hidden email]> wrote:
>>>>
>>>> Hi Guille,
>>>>
>>>>> On Jan 6, 2017, at 6:44 AM, Guillermo Polito <[hidden email]> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I was checking the code in sqUnixHeartbeat.c to see how the heartbeat thread/itimer worked. It somehow bothers me that there are different compiled artifacts, one per option.
>>>>>
>>>>> What do you think about having a VM that manages that as an argument provided when we launch the VM? This would add some flexibility that we don't have right now because we make the decision at compile time.
>>>>
>>>> I think it's a fine idea but it isn't really the issue. The issue is that the itimer mechanism is problematic, especially for foreign code, and is therefore a stop gap. The itimer interrupts long-running system calls, which means that things like sound libraries break (at Qwaq I had to fix ALSA to get it to work with the itimer heartbeat). Since Pharo is becoming more reliant on external code it may impact us more going forward.
>>>
>>> Just curious, what is the root cause of this itimer conflict?
>>> Is it that a SIGALRM in particular is issued, or just that the
>>> current execution is pre-empted to handle the signal - and is that a
>>> timing issue, or a concurrency problem where some state is
>>> invalidated?
>>>
>>> Would it help to handle the signal in another thread?
>>
>> bah, sorry, accidental bump sent too early. To continue..
>>
>> POSIX timer_create() & timer_settime() might be worth considering,
>> which can specify a different signal than SIGALRM,
>> and also can apparently send the signal to a specific thread with
>> SIGEV_THREAD_ID
>> https://linux.die.net/man/2/timer_create
>> https://lists.gt.net/linux/kernel/1289398
>>
>> Now some places say there can only be one signal handler per
>> signal per process, but IIUC in modern Linux there is no distinction
>> between process and thread, so it's one signal handler per signal per
>> thread.
>>
>> Or perhaps you just need a particular callback function called when
>> the timer expires, per Patryk's answer...
>> http://stackoverflow.com/questions/5740954/problem-in-timers-and-signal
>> Although creating a new thread each time may be too much overhead.
>>
>> btw, did the change in Linux 2.6.12 from setitimer signalling
>> individual threads to signalling only the main thread have any
>> noticeable effect on the operation of the VM and external code like
>> OSProcess and libraries? per answer by osgx...
>> http://stackoverflow.com/questions/2586926/setitimer-sigalrm-multithread-process-linux-c
>
> I don't know. I do not have the cycles or the inclination to keep up with Linux kernel changes. Once something works I tend to move on.
>
> I have noticed (but have to test more carefully) that the VM as amended by David Lewis not to abort if permission to raise thread priority is denied ran on my CentOS 6.5 installation without complaint after I renamed squeak.conf, logged out and logged back in. I need to check that this continues after a reboot rather than just log out/log in. But perhaps the thread priority issue is fixed.
>
> Anyway, your interest and expertise is much appreciated. Feel free to experiment and make changes.

No problem. Not that I'm an expert. I didn't know anything about this last week. I'm just good at leveraging Google search to learn things I don't know. My speculations are often a way of trying to check what I've learnt within our domain.

> Realize that I'm not "all over this" issue because there are many fish to fry. So I'm very thankful for your efforts here.

Cool. I just ask the questions that cross my mind to see if the knowledge is out there to tap.

cheers -ben |
In reply to this post by Guillermo Polito
On Fri, Jan 6, 2017 at 10:44 PM, Guillermo Polito <[hidden email]> wrote:
Can someone advise how to create a new VM setting and read/write to it from the Image? As a challenge, I'd actually like to play with dynamically switching the heartbeat between threaded & timer while the Image is running. Also it would be nice for the Image to be able to inspect what priorities the heartbeat thread is running at. As well, what is the simplest reliably crashing example of signal conflict with the timer-beat VM? cheers -ben |
Hi Ben,
On Sat, Apr 8, 2017 at 6:39 PM, Ben Coman <[hidden email]> wrote:
Give me an example of the kind of setting and I'll walk you through it. The kinds of settings I know about are:
1. has a command line keyword and argument to set it
2. has a persistent value stored in the image header and is set via vmParameterAt:[put:]
3. 1 + 2
Ugh :-( That's a lot of plumbing. Much easier to just put a printf in a custom version of the VM somewhere?
There isn't a way that reliably crashes the VM (it doesn't crash). Instead I posted an example which should demonstrate the problem. Here it is again.

[| infiniteLoop |
 infiniteLoop := [| sum | sum := 0.
                  [sum < 10] whileTrue: [sum := sum + (sum even ifTrue: [1] ifFalse: [-1])]] newProcess.
 infiniteLoop resume.
 Processor activeProcess priority: Processor activePriority + 1.
 (Delay forSeconds: 1) wait.
 infiniteLoop terminate.
 Processor activeProcess priority: Processor activePriority - 1] timeToRun 1001

Remember to test it on an older kernel which doesn't have the improved schedulers to confirm that the example locks up.
_,,,^..^,,,_ best, Eliot |
On Wed, Apr 12, 2017 at 9:18 AM, Eliot Miranda <[hidden email]> wrote:
>
> Hi Ben,
>
> On Sat, Apr 8, 2017 at 6:39 PM, Ben Coman <[hidden email]> wrote:
>>
>> On Fri, Jan 6, 2017 at 10:44 PM, Guillermo Polito <[hidden email]> wrote:
>>>
>>> Hi,
>>>
>>> I was checking the code in sqUnixHeartbeat.c to see how the heartbeat thread/itimer worked. It somehow bothers me that there are different compiled artifacts, one per option.
>>>
>>> What do you think about having a VM that manages that as an argument provided when we launch the VM? This would add some flexibility that we don't have right now because we make the decision at compile time.
>>>
>>
>> Can someone advise how to create a new VM setting and read/write to it from the Image?
>
> Give me an example of the kind of setting and I'll walk you through it.

Thanks. I'll ask further when I've got something more concrete to work with.

> The kinds of settings I know about are:
>
> 1. has a command line keyword and argument to set it
>
> 2. has a persistent value stored in the image header and is set via vmParameterAt:[put:]

[2.] is the one I was asking about, but I should also pay attention to [1.].

>
> 3. 1 + 2
>
>> As a challenge, I'd actually like to play with dynamically switching the heartbeat between threaded & timer while the Image is running.
>>
>> Also it would be nice for the Image to be able to inspect what priorities the heartbeat thread is running at.
>
> Ugh :-( That's a lot of plumbing. Much easier to just put a printf in a custom version of the VM somewhere?

Where's the challenge in that? ;) Whether it should be a default part of the VM is a different question.

>
>> As well, what is the simplest reliably crashing example of signal conflict with the timer-beat VM?
>
> There isn't a way that reliably crashes the VM (it doesn't crash). Instead I posted an example which should demonstrate the problem. Here it is again.
>
> [| infiniteLoop |
> infiniteLoop := [| sum | sum := 0. [sum < 10] whileTrue: [sum := sum + (sum even ifTrue: [1] ifFalse: [-1])]] newProcess.
> infiniteLoop resume.
> Processor activeProcess priority: Processor activePriority + 1.
> (Delay forSeconds: 1) wait.
> infiniteLoop terminate.
> Processor activeProcess priority: Processor activePriority - 1] timeToRun 1001

IIUC, that tests the thread-beat VM. Sorry I switched topics. I was looking for examples where OSProcess(?) or otherwise fails due to (presumed) conflict with the itimer-beat signals.

cheers -ben |