Can't the box be set up to do some WoL thing and go back to sleep when idling for a while?

This CPU usage is really annoying indeed.

Phil

On 9 Feb 2015 21:11, "Norbert Hartl" <[hidden email]> wrote:
I have an installation where a Pharo-powered device is used in a closed case. Over time that collects quite some heat. One reason for this is that the Pharo VM is taking approx. 6% CPU all the time. The only thing that happens is network/sockets. I suspended the UI thread in the image, but on this platform it doesn't help.
On 09-02-2015, at 12:33 PM, [hidden email] wrote:

> Can't the box be set up to do some WoL thing and go back to sleep when idling for a while?
>
> This CPU usage is really annoying indeed.

Assuming you are using a Stack or Cog VM, that will mostly be the heartbeat that checks for inputs and process switches and GC limits etc. Plus any remaining Morphic loop etc.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Two wrongs are only the beginning.
In reply to this post by philippeback
Nope, the device is an access point that serves Seaside and WebSockets to tablets. There are close to no options for having it switch itself off.

Yes, it is annoying. It is just one of those things, like 32-bit, where we are way behind and no resources are available to fix it.

Norbert
In reply to this post by timrowledge
> On 09.02.2015 at 21:39, tim Rowledge <[hidden email]> wrote:
>
> On 09-02-2015, at 12:33 PM, [hidden email] wrote:
>
>> Can't the box be set up to do some WoL thing and go back to sleep when idling for a while?
>>
>> This CPU usage is really annoying indeed.
>
> Assuming you are using a Stack or Cog VM, that will mostly be the heartbeat that checks for inputs and process switches and GC limits etc. Plus any remaining Morphic loop etc.

thanks,
Norbert
On 09-02-2015, at 1:00 PM, Norbert Hartl <[hidden email]> wrote:

>> On 09.02.2015 at 21:39, tim Rowledge <[hidden email]> wrote:
>>
>> On 09-02-2015, at 12:33 PM, [hidden email] wrote:
>>
>>> Can't the box be set up to do some WoL thing and go back to sleep when idling for a while?
>>>
>>> This CPU usage is really annoying indeed.
>>
>> Assuming you are using a Stack or Cog VM, that will mostly be the heartbeat that checks for inputs and process switches and GC limits etc. Plus any remaining Morphic loop etc.
>
> Thanks for the analysis. Not being an expert on _all_ those topics :) I'm asking myself if there are some tweaks (monkey patching is fine) to get rid of those? I certainly do not need any Morphic loop as long as it doesn't steer the network reception *cough* :)

We'd need some input from someone who really knows Morphic. You could try running in an MVC project and see if it makes any difference?

> Can the intervals to check inputs and process switches be stretched without opening the door to hell, because timing loops are tightly aligned?

Maybe. Eliot?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
What passes for common sense is always revisable
In reply to this post by timrowledge
Hi Tim,
On Mon, Feb 9, 2015 at 12:39 PM, tim Rowledge <[hidden email]> wrote:
No. The heartbeat is extremely cheap. It is the idle loop that calls ioRelinquishProcessorForMicroseconds, which in turn calls aioSleepForUsecs, which calls select:

(Delay forSeconds: 60) wait

gc prior. clear prior.
60.002 seconds; sampling frequency 1385 hz
7 samples in the VM (83120 samples in the entire program) 0.01% of total

3 samples in generated vm code 42.86% of entire vm ( 0.00% of total)
4 samples in vanilla vm code 57.14% of entire vm ( 0.00% of total)

% of generated vm code (% of total) (samples) (cumulative)
100.0% ( 0.00%)   ...others...          (3)      (100.0%)

% of vanilla vm code (% of total) (samples) (cumulative)
100.0% ( 0.00%)   ...others...          (4)      (100.0%)

83113 samples in the rest 99.99% of total

% of rest (% of total) (samples) (cumulative)
99.98% (99.97%)   select$DARWIN_EXTSN   (83095)  (99.98%)
 0.01% ( 0.01%)   mach_msg_trap         (10)     (99.99%)
 0.01% ( 0.01%)   ...others...          (8)      (100.0%)

Now using epoll would make the select cheaper, and I have changes for that. But the real solution is to combine this with an event-driven VM.

best,
Eliot
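For readers unfamiliar with the mechanism under discussion: below is a minimal sketch of a select()-based microsecond sleep, roughly the shape of what aioSleepForUsecs does. This is illustrative only, not the actual aio.c code; sleepForUsecs is an invented name. The idle process re-enters a call like this around a thousand times a second, and rebuilding the descriptor sets and crossing into the kernel on every call is where the idle CPU goes — the set-up overhead epoll later avoids.

#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>

/* Sleep for up to usecs microseconds, or until fd becomes readable.
   Pass fd = -1 to watch no descriptor at all. */
static int sleepForUsecs(int fd, long usecs)
{
    fd_set readFds;
    struct timeval timeout;

    FD_ZERO(&readFds);
    if (fd >= 0)
        FD_SET(fd, &readFds);

    timeout.tv_sec = usecs / 1000000;
    timeout.tv_usec = usecs % 1000000;

    /* Returns 0 on timeout, > 0 if a descriptor became ready. */
    return select(fd >= 0 ? fd + 1 : 0, fd >= 0 ? &readFds : NULL,
                  NULL, NULL, &timeout);
}

int main(void)
{
    int ready = sleepForUsecs(0, 1000);   /* watch stdin for 1 ms */
    printf("select returned %d\n", ready);
    return 0;
}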
On 09-02-2015, at 4:53 PM, Eliot Miranda <[hidden email]> wrote:

> No. The heartbeat is extremely cheap. It is the idle loop that calls ioRelinquishProcessorForMicroseconds, which in turn calls aioSleepForUsecs, which calls select:

Happy to be shown wrong. That means it most likely is Morphic and/or any other tucked-away processes. On a Pi it seems to be about 15% CPU time, so there is certainly some interest in reducing it!

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
There are no stupid questions. But, there are a lot of inquisitive idiots.
Hi Tim,
On Mon, Feb 9, 2015 at 5:02 PM, tim Rowledge <[hidden email]> wrote:
It is this one:

ProcessorScheduler>>idleProcess
    "A default background process which is invisible."

    [self relinquishProcessorForMicroseconds: 1000] repeat

If you recall the VW VM, it got rid of the background process, and when its scheduling loop finds nothing to run it calls a blocking routine.

best,
Eliot
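To make the contrast concrete, here is a hypothetical sketch of that VW-style arrangement: instead of a Smalltalk background process waking a thousand times a second, the VM's own scheduling loop blocks when nothing is runnable. All names here are invented for illustration; this is not Cog's actual code.

#include <poll.h>
#include <stdbool.h>

/* Assumed VM queries, for the sketch only. */
extern bool vmHasRunnableProcess(void);
extern long vmUsecsToNextDelayPop(void);   /* < 0 means no delay pending */
extern void vmRunTopPriorityProcess(void);

static void schedulingLoop(int ioFd)
{
    struct pollfd watched = { .fd = ioFd, .events = POLLIN };

    for (;;) {
        if (vmHasRunnableProcess()) {
            vmRunTopPriorityProcess();
            continue;
        }
        /* Nothing to run: block until I/O arrives or the next Delay
           is due. A timeout of -1 makes poll() wait indefinitely. */
        long usecs = vmUsecsToNextDelayPop();
        poll(&watched, 1, usecs < 0 ? -1 : (int)(usecs / 1000));
    }
}

With this shape, an idle image consumes essentially no CPU: the process sits blocked in the kernel until a socket or timer gives it a reason to run.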
On 09-02-2015, at 5:05 PM, Eliot Miranda <[hidden email]> wrote:

> It is this one:
>
> ProcessorScheduler>>idleProcess
>     "A default background process which is invisible."
>
>     [self relinquishProcessorForMicroseconds: 1000] repeat
>
> If you recall the VW VM, it got rid of the background process, and when its scheduling loop finds nothing to run it calls a blocking routine.

How embarrassingly obvious. Not having a good day today; the Pi camera stuff is going to drive me to hairlessness at this rate. Updating everything actually made it even worse! Sigh.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: JTC: Jump To Conclusions
On Mon, Feb 9, 2015 at 5:27 PM, tim Rowledge <[hidden email]> wrote:
Don't beat yourself up. I'm sure you were in good company and now you know...
best,
Eliot
In reply to this post by philippeback
It's a bit more complicated, and what platform you are on does matter. Just hunt in the Squeak mailing list 10 years back for getNextWakeupTick.
Possibly the Mac VM still calls getNextWakeupTick(), which returns the next time the VM has to wake up to service a delay pop. Normally that is less than 1/50 of a second out due to the Morphic polling cycle, say 16-20 milliseconds.

The idea I had was to sleep until the VM needs to wake up, since when the ioRelinquishProcessorForMicroseconds call is made we know we can sleep, and the VM knows exactly when the next time to wake up is. Unfortunately we have to deal with user interrupts (I/O, sockets, UI).

http://www.squeakvm.org/svn/squeak/branches/Cog/platforms/unix/vm/aio.c

I note you can't properly calculate the next wakeup tick in Smalltalk code due to the rather brittle code base in the Delay logic. Attempts I made a decade back always resulted in a deadlock situation, which is why that calculation is done in the VM.

I last took a serious look at this back in 2010 and found very strange oddities, such as ioRelinquishProcessorForMicroseconds being called when the wakeup time is now, or in the past... Obviously one needed to explore the stack traces to understand why no process was runnable, yet a process was scheduled to be woken.

Anyway, compare ioRelinquishProcessorForMicroseconds against whatever is being compiled for your target platform VM, and check what exactly HAVE_NANOSLEEP is when the VM is compiled. Also check idle CPU usage for, say, an OS X Squeak 4.2.5 VM against (I assume) a unix VM flavour; you can run both on the same OS X machine for comparison using the same image, etc.

On Tue, Feb 10, 2015 at 3:03 AM, Norbert Hartl <[hidden email]> wrote:
===========================================================================
John M. McIntosh <[hidden email]> https://www.linkedin.com/in/smalltalk
===========================================================================
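A rough sketch of the idea John describes — sleeping until the next delay pop rather than for a fixed interval. Here getNextWakeupUsecs() and ioUTCMicroseconds() are assumed stand-ins for the VM's accessors, and the guard mirrors the HAVE_NANOSLEEP build option he mentions; this is not the actual platform code.

#include <time.h>
#include <sys/select.h>

extern long long getNextWakeupUsecs(void);   /* assumed VM accessor */
extern long long ioUTCMicroseconds(void);    /* assumed clock accessor */

static void sleepUntilNextWakeup(void)
{
    long long usecs = getNextWakeupUsecs() - ioUTCMicroseconds();
    if (usecs <= 0)
        return;   /* the oddity John mentions: wakeup is now or past */
#if defined(HAVE_NANOSLEEP)
    /* nanosleep() returns early (with EINTR) if a signal arrives --
       which only helps if socket/UI activity actually raises one. */
    struct timespec ts = { usecs / 1000000, (usecs % 1000000) * 1000 };
    nanosleep(&ts, NULL);
#else
    struct timeval tv = { usecs / 1000000, usecs % 1000000 };
    select(0, NULL, NULL, NULL, &tv);
#endif
}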
In reply to this post by NorbertHartl
Hoi Norbert--

In 2003, while implementing remote messaging for what became the Naiad distributed module system[1], I noticed excessive CPU usage during idle by Squeak on MacOSX (and extremely poor remote messaging performance). I prepared alternate versions of ioRelinquishProcessorForMicroseconds, comparing:

- select() (AKA aioSleepForUsecs in Ian's aio API, my starting point)
- pthread_cond_timedwait()
- nanosleep()

pthread_cond_timedwait was the clear winner at the time. I wrote my own relinquish primitive as part of the Flow external streaming plugin[2], and I've been using it ever since. Still seems fine. I've mentioned this before.

thanks,

-C

[1] http://netjam.org/naiad
[2] http://netjam.org/flow

--
Craig Latta
netjam.org
+31 6 2757 7177 (SMS ok)
+ 1 415 287 3547 (no SMS)
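For the curious, a minimal sketch of the shape of such a condition-based relinquish (illustrative only; Flow's real implementation is in flow.c, linked later in this thread, and these names are inventions): the primitive waits on an "activity" condition with a deadline, so it wakes as soon as an I/O thread signals activity, or when the timeout elapses, whichever comes first.

#include <pthread.h>
#include <sys/time.h>

static pthread_mutex_t activityLock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  activityCond = PTHREAD_COND_INITIALIZER;

/* Called from the idle loop in place of a select()-based sleep. */
static void relinquishForUsecs(long usecs)
{
    struct timeval now;
    struct timespec deadline;

    /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME
       deadline, so convert the relative microsecond request. */
    gettimeofday(&now, NULL);
    deadline.tv_sec  = now.tv_sec + (now.tv_usec + usecs) / 1000000;
    deadline.tv_nsec = ((now.tv_usec + usecs) % 1000000) * 1000;

    pthread_mutex_lock(&activityLock);
    /* Wakes early if any thread signals activityCond; a spurious
       wakeup is harmless here, since ending the sleep early is the
       whole point. */
    pthread_cond_timedwait(&activityCond, &activityLock, &deadline);
    pthread_mutex_unlock(&activityLock);
}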
Craig, so how does using pthread_cond_timedwait affect socket processing? The promise of nanosleep was to wake up if an interrupt arrived, say on a socket. (Mind, I never actually confirmed this is the case; complete hearsay...)

On Thu, Feb 12, 2015 at 2:40 AM, Craig Latta <[hidden email]> wrote:
===========================================================================
John M. McIntosh <[hidden email]> https://www.linkedin.com/in/smalltalk
===========================================================================
On Thu, Feb 12, 2015 at 10:45 AM, John McIntosh <[hidden email]> wrote:
+1. What he said.

The problem with pthread_cond_timedwait, or any other merely delaying call, is that, unless all file descriptors have been set up to send signals on read/writability, and unless the blocking call is interruptible, the call may block for as long as it is asked, not until that or the read/writeability of the file descriptor.

IMO a better solution here is to a) use epoll or its equivalent kqueue; these are like select, but the state of which selectors to examine is kept in kernel space, so the set-up overhead is vastly reduced, and b) wait for no longer than the next scheduled delay if one is in progress.

Of course, the VM can do both of these things, and then there's no need for a background process at all. Instead, when the VM scheduler finds there's nothing to run it calls epoll or kqueue with either an infinite timeout (if no delay is in progress) or the time until the next delay expiration. Now, if only there was more time ;-)

It strikes me that the VM can have a flag that makes it behave like this so that e.g. some time in the Spur release cycle we can set the flag, nuke the background process, and get on with our lives.
best,
Eliot
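A sketch of the epoll half of that proposal (Linux-specific; kqueue is the BSD/Mac analogue). Apart from the epoll calls themselves, every name here is an assumption for illustration: descriptors are registered once, the interest set lives in kernel space, and the idle wait takes either the time to the next delay expiration or an infinite timeout.

#include <sys/epoll.h>

extern long vmMsecsToNextDelayPop(void);   /* assumed; < 0 means none */

static int epollFd = -1;

/* One-time registration; contrast with select(), which rebuilds the
   descriptor sets on every call. */
static void watchDescriptor(int fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLOUT,
                              .data.fd = fd };
    if (epollFd < 0)
        epollFd = epoll_create1(0);
    epoll_ctl(epollFd, EPOLL_CTL_ADD, fd, &ev);
}

/* Called by the scheduler when no process is runnable. */
static void idleUntilWork(void)
{
    struct epoll_event events[16];
    long msecs = vmMsecsToNextDelayPop();

    /* -1 asks epoll_wait() for an infinite timeout. */
    epoll_wait(epollFd, events, 16, msecs < 0 ? -1 : (int)msecs);
}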
I did look at using pthread_delay_np to delay the heartbeat thread, as my thought was: if the image is sleeping, why wake up to service the clock, etc.? Difficult to measure the outcome, but one should consider that option too.

On Thu, Feb 12, 2015 at 10:55 AM, Eliot Miranda <[hidden email]> wrote:
===========================================================================
John M. McIntosh <[hidden email]> https://www.linkedin.com/in/smalltalk
===========================================================================
In reply to this post by Eliot Miranda-2
Hi all--

Apologies, my newsreader's thread database got trashed, and I missed the responses to my previous message until now.

John McIntosh writes:

> Craig so how does using pthread_cond_timedwait affect socket
> processing?

It makes it actually work well. :) This was the whole point of using pthread_cond_timedwait. Please read the manpage at [1]. It waits until either a condition is met (hence the "cond") or a timeout elapses.

In the Flow virtual machine plugin, I have a synchronizedSignalSemaphoreWithIndex function that calls the usual signalSemaphoreWithIndex provided by the virtual machine, and also sets the activity condition that the relinquish primitive cares about. The host threads which service external I/O requests from primitives use synchronizedSignalSemaphoreWithIndex when signalling the semaphores on which Smalltalk-level code is waiting. This includes not only the semaphores for reading and writing sockets, but also those for activities with other external resources entirely, like MIDI ports. So you get a generalized scheme which is not tied to the arcana of any particular kind of external resource, and it works the same way on any platform which supports the POSIX API (which now is all the Unix-ish ones).

This has seemed the obvious way to go for over ten years now. Until I implemented this scheme, remote messaging throughput (and MIDI throughput) was horrible. Believe me, I tried all the other schemes that everyone has mentioned in the Squeak community and its descendants since 1996, and none of them were anything better than deeply embarrassing.

From the Flow plugin, check out flow.c[2], which implements synchronizedSignalSemaphoreWithIndex, the activity condition, and the relinquish primitive, and ip.c[3], which creates host threads to do background work for external resource primitives and uses synchronizedSignalSemaphoreWithIndex to coordinate with the Smalltalk-level code and the relinquish primitive.

It's so frustrating and weird that we're still talking about this in 2015.

> The promise of nanosleep was to wake up if an interrupt arrived say
> on a socket (Mind I never actually confirmed this the case, complete
> hearsay...)

Right, nanosleep promises this and doesn't deliver on MacOS, so I say forget it. pthread_cond_timedwait works as advertised on MacOS and Linux (all distros).

Eliot writes:

> +1. What [John] said.

...except John admitted himself that he hadn't verified his suggestion, and you both assumed for some reason that I didn't have the same goals in mind.

> The problem with pthread_cond_timedwait, or any other merely
> delaying call...

But pthread_cond_timedwait is *not* a "merely delaying call". It does exactly what we want (wait until *either* a condition is met or a timeout elapses), and it actually works, and the code is the same across POSIX platforms. What you go on to say is based on a false premise.

> ...is that, unless all file descriptors have been set up to send
> signals on read/writability and unless the blocking call is
> interruptible, the call may block for as long as it is asked, not
> until that or the read/writeability of the file descriptor.

In the scheme I described above, we can do what we need without using formal Unix signals at all (happily avoiding that whole can of worms). The notion of interruptible blocking calls is a red herring generally. All the blocking calls in Flow happen in host threads which are decoupled from any function call a Smalltalk primitive would make.
> IMO a better solution here is to a) use epoll or its equivalent
> kqueue; these are like select but the state of which selectors to
> examine is kept in kernel space, so the set-up overhead is vastly
> reduced, and b) wait for no longer than the next scheduled delay if
> one is in progress.

I claim they are not better solutions, because they don't work for all kinds of external resources (e.g., MIDI ports). Also, I found that "waiting for no longer than the next scheduled delay" is often still far too long, when there is external resource activity before that time comes.

> Of course, the VM can do both of these things, and then there's no
> need for a background [Smalltalk] process at all. Instead, when the
> VM scheduler finds there's nothing to run it calls epoll or kqueue
> with either an infinite timeout (if no delay is in progress) or the
> time until the next delay expiration.

This would still leave us with poor performance when using new kinds of external resources that don't use selectors. (That is, the external resource access would perform poorly; I'm sure the main virtual machine would scream right along, blissfully oblivious to it all. :)

> It strikes me that the VM can have a flag that makes it behave like
> this so that e.g. some time in the Spur release cycle we can set the
> flag, nuke the background process and get on with our lives.

If the only external resources in our lives were selector-using ones, I might agree.

thanks,

-C

[1] http://linux.die.net/man/3/pthread_cond_timedwait
[2] https://github.com/ccrraaiigg/flow/blob/master/flow.c
[3] https://github.com/ccrraaiigg/flow/blob/master/ip.c

--
Craig Latta
netjam.org
+31 6 2757 7177 (SMS ok)
+ 1 415 287 3547 (no SMS)
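The coordination Craig describes reduces to a small wrapper; a sketch follows (the real code is in flow.c above — apart from signalSemaphoreWithIndex, which the Squeak VM exports with an approximated signature here, the names are assumptions). Every external I/O completion signals the Smalltalk semaphore as usual and also pokes the condition that the relinquish primitive sleeps on, whatever kind of resource (socket, MIDI port, ...) did the signalling.

#include <pthread.h>

extern int signalSemaphoreWithIndex(int semaphoreIndex);   /* VM export */

extern pthread_mutex_t activityLock;   /* shared with the relinquish */
extern pthread_cond_t  activityCond;   /* primitive's timed wait      */

static int synchronizedSignalSemaphoreWithIndex(int semaphoreIndex)
{
    int result = signalSemaphoreWithIndex(semaphoreIndex);

    /* Wake the relinquish primitive immediately rather than letting
       it sleep out its timeout. */
    pthread_mutex_lock(&activityLock);
    pthread_cond_broadcast(&activityCond);
    pthread_mutex_unlock(&activityLock);
    return result;
}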
On Fri, Mar 6, 2015 at 10:47 PM, Craig Latta <[hidden email]> wrote:
(Sorry for this late response. I discovered it sitting in my Draft folder.)

Finding this an interesting topic, I googled around to learn more and bumped into a few things maybe of random interest for some.

* Condition variables performance of boost, Win32, and the C++11 standard library
https://codesequoia.wordpress.com/2013/03/27/condition-variables-performance-of-boost-win32-and-the-c11-standard-library/

* pthread_cond_timedwait behaving differently on different platforms
http://blogs.msdn.com/b/cellfish/archive/2009/09/01/pthread-cond-timedwait-behaving-differently-on-different-platforms.aspx

* pthread-win32 pthread_cond_timedwait is SLOW?
http://comp.programming.threads.narkive.com/fZU5gh0K/pthread-win32-pthread-cond-timedwait-is-slow

* Fast Event Processing in SDL (since Pharo is getting SDL)

cheers -ben