I have a server performing a heavy-lift operation that crashes with "out of
memory". Investigation revealed the lowSpaceWatcher did not give any of its #memoryHogs a chance to #freeSomeSpace because the VM actually ran out of memory just prior to that -- during the initial GC (called from #bytesLeft).

I notice #lowSpaceThreshold is only 400K, which seems like a very insignificant amount of memory by 2013 standards, so I upped it 10-fold to 4M:

	lowSpaceThreshold
		^ 4000000

My thought was that this would cause the VM to signal the LowSpaceSemaphore when only 4M was remaining rather than when only 400K was remaining. I'm allocating 700M to the server, so a LowSpace signal at 696M is absolutely fine!

Unfortunately, the 4M threshold seems to create all sorts of strange slowness and... problems, I guess.

I need lowSpaceWatcher to work; what is the proper way to fix this?
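In headroom terms, the intent is simply "signal when free space drops below the threshold". A minimal sketch of that arithmetic (plain Python for illustration, not Squeak code; the function name is invented):

```python
# Illustrative only: models when a low-space signal would fire for a given
# threshold, under a hard memory limit. Not Squeak API.

def low_space_signalled(limit_bytes, used_bytes, threshold_bytes):
    """True once free space has dropped below the threshold."""
    return (limit_bytes - used_bytes) < threshold_bytes

LIMIT = 700_000_000  # the 700M allocated to the server

# With the default 400K threshold the signal fires very late:
assert not low_space_signalled(LIMIT, 699_000_000, 400_000)
assert low_space_signalled(LIMIT, 699_700_000, 400_000)

# With a 4M threshold it fires around 696M used, leaving headroom for a GC:
assert low_space_signalled(LIMIT, 696_500_000, 4_000_000)
```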
On Mon, Mar 11, 2013 at 08:51:20PM -0500, Chris Muller wrote:
> I need lowSpaceWatcher to work, what is the proper way to fix this?

This is tricky business. The issues collected under this one may give some ideas: http://bugs.squeak.org/view.php?id=7655

400K is actually quite a lot of memory relative to the average size of objects that would typically be used in the image. Unless your image happens to be allocating some very large objects for some reason, it seems more likely that the problem has to do with waking up the low space watcher in time to do something about it.

<speculation>

I think it is also possible, on a unix system, that allowing the VM to allocate memory without limit might eventually crash the VM with a low space condition as the VM claims more and more virtual memory space, gradually leading to swap thrashing and a profoundly sluggish image that is unable to schedule anything, including the low space watcher process. When you are running Squeak interactively, you would notice the sluggish performance and hit the <alt><period> interrupt to stop it. But on a server process, it might just bog down and die.
I would suggest, for debugging, having your server process evaluate 'Smalltalk createStackOverflow' while running normally, and observing the failure mode. Then try running it again with a fixed memory size (using the -memory flag to the VM) in order to force the failure before the system starts swapping, and see if the symptoms are different.

</speculation>

Dave
On 11-03-2013, at 7:48 PM, David T. Lewis <[hidden email]> wrote:
> I think that it is also possible, on a unix system, that allowing the
> VM to allocate memory without limit might eventually crash the VM
> with a low space condition [...] But on a server process, it might
> just bog down and die.

There is, or at least was, a pathological condition that can cause nastiness. It afflicts RISC OS particularly because there is no virtual memory to hide the symptoms under the rug of gradual slowing.

I forget exactly where in the code it is (I'm pretty sure there is a Mantis report on the issue somewhere), but there is a place where memory is simply stolen from the lowspace buffer without any sort of nod to the lowspace mechanism. If anyone wants to look around right now, look for places where the memory expansion call is used and a bit later; consider what happens when there is no more memory to be got. In the GC-related code, some space is stolen from the remaining room. I could probably take a quick look for the evidence tomorrow.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful Latin Phrases:- Estne volumen in toga, an solum tibi libet me videre? = Is that a scroll in your toga, or are you just happy to see me?
In reply to this post by Chris Muller-4
You can always construct something in Smalltalk
to do more of what you want:
makeSpy

	| memoryEnd params |
	spy _ [
		[pig isTerminated] whileFalse: [
			params := Smalltalk getVMParameters.
			memoryEnd := params at: 3.
			memoryEnd > 400000000 ifTrue: [
				Processor remove: pig ifAbsent: [self halt].
				pig debugWithTitle: 'Interrupted from SpaceSpy'.
				self halt.
			].
			(Delay forMilliseconds: 60) wait.
		].
		'The pig was terminated ', Time now asString, ' ' displayAt: 0@20.
	] newProcess.
	spy priority: 45; resume.

In this case, as long as the suspect process is running, the spy checks periodically for above-normal memory use, terminating the pig when the limit is reached.

Cheers,
Bob

On 3/11/13 9:51 PM, Chris Muller wrote:
> I need lowSpaceWatcher to work, what is the proper way to fix this?
In reply to this post by timrowledge
On 11-03-2013, at 10:52 PM, tim Rowledge <[hidden email]> wrote:
> I forget exactly where in the code it is [...] but there is a place where
> memory is simply stolen from the lowspace buffer without any sort of nod
> to the lowspace mechanism.

Damn, can't find details. I know they're somewhere in my email archive…

Basically, somewhere around the allocateChunk/sufficientSpaceToAllocate/sufficientSpaceAfterGC/initializeMemoryFirstFree area, there is a loophole whereby the freeBlock base is set without any consideration of updating the low space notification stuff. I remember test cases where you could end up with a sudden leap to 4 bytes of free space or similar. Not generally survivable. It hit me because RISC OS doesn't do memory expansion, but on a server I rather suspect you have a top limit for virtual memory and the same case seems likely to obtain.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: PBC: Print and Break Chain
In reply to this post by Bob Arning-2
Yes, a polling approach is something I've used in the past. I wanted
to get away from it but may have to go back to that..

On Tue, Mar 12, 2013 at 6:03 AM, Bob Arning <[hidden email]> wrote:
> You can always construct something in Smalltalk to do more of what you want:
> [...]
> In this case, as long as the suspect process is running, the spy checks
> periodically for above-normal memory use, terminating the pig when reached.
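The polling approach is a generic pattern: a background loop samples memory use at intervals and intervenes once usage crosses a limit. A minimal sketch in Python (purely illustrative; `read_used_bytes` and the callback are hypothetical stand-ins, not Squeak API):

```python
import threading
import time

def make_spy(read_used_bytes, on_limit, limit_bytes, interval_s=0.06):
    """Start a daemon thread that polls memory use and fires on_limit
    once usage exceeds limit_bytes -- the same shape as Bob's spy."""
    stop = threading.Event()

    def watch():
        while not stop.is_set():
            if read_used_bytes() > limit_bytes:
                on_limit()
                return
            time.sleep(interval_s)

    threading.Thread(target=watch, daemon=True).start()
    return stop  # set() this to cancel the spy

# Simulated usage: memory "grows" each time it is sampled.
samples = iter([100, 350, 450])
fired = []
stop = make_spy(lambda: next(samples), lambda: fired.append(True),
                limit_bytes=400, interval_s=0.001)
time.sleep(0.1)
assert fired == [True]
```

The polling interval trades responsiveness against overhead, just as the 60 ms delay does in the Smalltalk version.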
In reply to this post by David T. Lewis
Here is the Smalltalk stack dump portion of the out of memory messages:
Smalltalk stack dump:
	0xbf7d3808 I SmalltalkImage>garbageCollect -1943985568: a(n) SmalltalkImage
	0xbf7d3828 I SmalltalkImage>bytesLeft -1943985568: a(n) SmalltalkImage
	0xbf7d3850 I SmalltalkImage>lowSpaceWatcher -1943985568: a(n) SmalltalkImage
	0x919f60c8 s [] in SmalltalkImage>installLowSpaceWatcher
	0x9012dea0 s [] in BlockClosure>newProcess

So it seems the lowSpaceWatcher WAS invoked -- it was just that the initial GC was the last thing it tried to do, just before it was about to let my app free a bunch of memory! In this case, I did use a maximum limit by specifying the -mmap parameter. As soon as it hit that, LowSpace was signaled; it just couldn't get to asking the first memoryHog to release space because it couldn't get past the initial GC.

Under these terms, the lowSpaceWatcher is essentially broken, and we should not allow something that important to sit around broken in Squeak 4.4.

Hence, my thought was along the lines of trying to get the VM to signal the LowSpaceSemaphore just a bit sooner so it would have enough to do that initial GC -- but I didn't know what you meant by 400K being "quite a lot"..?

- Chris

(A note about virtual memory): Every time I install an OS these days I never allocate any swap anymore because memory is cheap -- you can still run out of memory+virtual memory, so I'd rather things crash quickly than go slogging in a swamp and killing my HD before crashing.

On Mon, Mar 11, 2013 at 9:48 PM, David T. Lewis <[hidden email]> wrote:
> 400K is actually quite a lot of memory relative to the average size
> of objects that would typically be used in the image.
In reply to this post by timrowledge
On Tue, Mar 12, 2013 at 11:48:07AM -0700, tim Rowledge wrote:
> Basically, somewhere around the allocateChunk/sufficientSpaceToAllocate/sufficientSpaceAfterGC/initializeMemoryFirstFree area, there is a loophole whereby the freeBlock base is set without any consideration of updating the low space notification stuff.

Is it in this thread?

http://lists.squeakfoundation.org/pipermail/vm-dev/2005-May/000216.html

Googling "raab rowledge lowspace squeak" yields a number of relevant links :)

Dave
I don't see how #memoryHogs and #freeSomeSpace can work.

SystemDictionary>>lowSpaceWatcher
	...
	self memoryHogs isEmpty
		ifFalse: [free := self bytesLeft.
			self memoryHogs do: [:hog | hog freeSomeSpace].
			self bytesLeft > free ifTrue: [^ self installLowSpaceWatcher]].

#freeSomeSpace is not implemented, and SystemDictionary>>lowSpaceWatcher is the only sender of #memoryHogs.

Karl

On Wed, Mar 13, 2013 at 4:33 AM, David T. Lewis <[hidden email]> wrote:
On 18.03.2013 at 23:29, karl ramberg <[hidden email]> wrote:

> I don't see how #memoryHogs and #freeSomeSpace can work.

Memory hogs are supposed to implement #freeSomeSpace and register themselves with the lowSpaceWatcher.

Kind regards
Georg
Ok, I see.
So it's a feature not used in the 4.4 release image.
Karl
On Tue, Mar 19, 2013 at 9:12 AM, Georg Gollmann <[hidden email]> wrote:
More accurately: the base image doesn't have any memory hogs :)
frank

On 19 March 2013 11:21, karl ramberg <[hidden email]> wrote:
> Ok, I see.
> So it's a feature not used in the 4.4 release image.
>
> On Tue, Mar 19, 2013 at 9:12 AM, Georg Gollmann <[hidden email]> wrote:
>> Memory hogs are supposed to implement #freeSomeSpace and register
>> themselves with the lowSpaceWatcher.
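The protocol being discussed is simple: objects that can shed memory implement #freeSomeSpace and register with the watcher, which asks each of them in turn when space runs low. A sketch of that registry pattern in Python (illustrative only; the class and method names are invented, not Squeak's):

```python
class LowSpaceWatcher:
    """Toy model of the memoryHogs protocol: registered hogs are each
    asked to free_some_space() when a low-space condition is reported."""

    def __init__(self):
        self.memory_hogs = []

    def register(self, hog):
        self.memory_hogs.append(hog)

    def low_space_signalled(self):
        # Give every registered hog a chance to release memory.
        for hog in self.memory_hogs:
            hog.free_some_space()

class Cache:
    """Example hog: drops its cached contents when asked."""
    def __init__(self):
        self.entries = {"a": 1, "b": 2}
    def free_some_space(self):
        self.entries.clear()

watcher = LowSpaceWatcher()
cache = Cache()
watcher.register(cache)
watcher.low_space_signalled()
assert cache.entries == {}
```

As in the image, the watcher itself knows nothing about any particular hog; the base setup simply ships with an empty registry.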
:-)

On Tue, Mar 19, 2013 at 3:07 PM, Frank Shearar <[hidden email]> wrote:
> More accurately: the base image doesn't have any memory hogs :)
It can be used by external apps but at this time one would be
ill-advised to depend on it. As my prior note shows, the Cog VM does not signal the LowSpaceSemaphore until it is so maxed out on memory that the initial GC prior to calling #freeSomeSpace crashes the VM. There is no chance for any registered memory hog to act on the low-memory condition.

In my initial note I tried to ask whether the #lowSpaceThreshold could simply be increased to fix that, but did not get a clear answer.

On Tue, Mar 19, 2013 at 12:21 PM, karl ramberg <[hidden email]> wrote:
> :-)
On Wed, Mar 20, 2013 at 1:34 PM, Chris Muller <[hidden email]> wrote:
> It can be used by external apps but at this time one would be
> ill-advised to depend on it. As my prior note shows, the cog VM does not
> signal the LowSpaceSemaphore until it is so maxed out on memory that the
> initial GC prior to calling #freeSomeSpace crashes the VM.

Which is a bad bug. And, to be pedantic, is actually false. The Cog VM *does* signal low space, at least sometimes :) For example, the following just triggered low space at depth 476 in a trunk image on Mac OS X.
	| b |
	Smalltalk garbageCollect.
	b := #assigned.
	b := [:d :r |
		(d \\ 10) = 0 ifTrue: [Display reverse: (0@0 corner: d@d/10)].
		d >= 800 ifTrue:
			[[Project current interruptName: 'User Interrupt @ ', d printString] newProcess
				priority: Processor activePriority + 1;
				resume].
		b value: d + 1 value: (Array new: 1024 * 1024 / 4)].
	b value: 0 value: nil

The UserInterrupt thang is to try and reproduce a crash I had when I interrupted the simpler case below, which then crashed the VM:

	| b |
	Smalltalk garbageCollect.
	b := #assigned.
	b := [:d :r | b value: d + 1 value: (Array new: 1024 * 1024 / 4)].
	b value: 0 value: nil

> In my initial note I tried to ask whether the #lowSpaceThreshold could
> simply be increased to fix that but did not get a clear answer.

I don't know yet. I have a crash to investigate on Windows that points to the problem being the amount of space Cog reserves for flushing stack pages to the heap as contexts. Reproducible cases that do crash and don't take an age to run much appreciated :-/.
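The stress pattern in the snippets above — keep allocating a fixed-size block per recursion step until the heap runs dry — can be modelled against a simulated budget to show where a low-space signal would land (Python, illustrative only; all names are invented, and real VM behaviour involves GC and reserve space this toy ignores):

```python
def stress(budget_bytes, chunk_bytes, threshold_bytes, on_low_space):
    """Allocate chunk-sized blocks until free space would dip below the
    threshold, then invoke the low-space handler with the depth reached --
    a toy analogue of recursing with (Array new: 1024 * 1024 / 4) per step."""
    allocated = []
    depth = 0
    while budget_bytes - depth * chunk_bytes >= threshold_bytes:
        allocated.append(bytearray(chunk_bytes))  # keep every block live
        depth += 1
    on_low_space(depth)
    return depth

depths = []
d = stress(budget_bytes=10 * 1024 * 1024, chunk_bytes=1024 * 1024,
           threshold_bytes=2 * 1024 * 1024, on_low_space=depths.append)
# 10M budget, 1M chunks, 2M threshold: free space falls below the
# threshold after the 9th allocation.
assert d == 9 and depths == [9]
```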
best,
Eliot