> I did primitive error codes at Cadence and they'll very probably be
> making it into Cog real soon now. They're simpler than VisualWorks',
> being only symbols. So extracting more information, such as a related
> error code, requires a subsequent call. But I think the work I'm doing
> right now on eliminating pushRemappableOop:/popRemappableOop will
> enable me to have a structured object with a name and parameters,
> which is more generally useful.

So.... what is the anticipated way that will work? I'm not sure I see the need for a two-step process anyway. The version we did at Interval in '98 was very simple, providing a call for prims to stuff some object (we used SmallInts in practice, but anything was allowed) into the first slot in the context that got activated on the fail. We extended the primitive declaration pragma to allow optional naming of the temp var - the default was 'errorValue', I think - and that was it. Code following the prim call could use or ignore the temp.

> A nice thing is that the code is forwards and backwards compatible.
> One can use the VM to run older images. One can run images that
> contain the primitive error code on older VMs, where one simply gets
> a nil error code on primitive failure.

Sure, if the temp is used by return-code-unaware methods in an older image then it can't make much difference, because it would be assumed nil as normal. Running new images on older VMs isn't something we've ever much bothered with for Squeak, so don't worry about it.

I'm going to guess that "work I'm doing right now on eliminating pushRemappableOop:/popRemappableOop" relates to getting allocations out of primitives as much as possible, right? That would be nice.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
"Bother," said Pooh, reading his bank statement from Barings.
On Tue, Jul 1, 2008 at 2:23 PM, tim Rowledge <[hidden email]> wrote:
The compiler is modified to recognise <primitive: integer errorCode: identifier> and <primitive: identifier module: identifier errorCode: identifier>, convert this into one additional temp, and generate a long storeTemp as the first instruction of the method.

The VM is modified to, on primitive failure, check for the long storeTemp and store the error code if it sees it. Old images lack the storeTemp, so the VM does not store the value. Old VMs do perform the storeTemp, but because the storeTemp is storing into the stack top it is effectively a no-op. E.g. if the primitive looks like:

    primitiveAdoptInstance: anObject
        <primitive: 160 error: ec>
        ec == #'no modification' ifTrue:
            [^NoModificationError
                signal: self
                message: (Message
                            selector: #primitiveAdoptInstance:
                            arguments: {anObject})].
        self primitiveFailed

then the header says the method has 2 temps (anObject & ec). The VM initializes the context with stackp pointing at the second temp (ec). The method starts with a long storeTemp: 2. So the nil ec gets stored into itself. The compiler generates a long storeTemp to make it quicker to check for the method starting with a storeTemp.
No. This relates to not having to use pushRemappableOop:/popRemappableOop because it is extremely error-prone. For example, can you spot the bug in this:

    primitivePerform
        | performSelector newReceiver selectorIndex lookupClass performMethod |
        performSelector := messageSelector.
        performMethod := newMethod.
        messageSelector := self stackValue: argumentCount - 1.
        newReceiver := self stackValue: argumentCount.

        "NOTE: the following lookup may fail and be converted to #doesNotUnderstand:,
         so we must adjust argumentCount and slide args now, so that would work."

        "Slide arguments down over selector"
        argumentCount := argumentCount - 1.
        selectorIndex := self stackPointerIndex - argumentCount.
        self
            transfer: argumentCount
            fromIndex: selectorIndex + 1
            ofObject: activeContext
            toIndex: selectorIndex
            ofObject: activeContext.
        self pop: 1.
        lookupClass := self fetchClassOf: newReceiver.
        self findNewMethodInClass: lookupClass.

        "Only test CompiledMethods for argument count - other objects will have to take their chances"
        (self isCompiledMethod: newMethod) ifTrue:
            [self success: (self argumentCountOf: newMethod) = argumentCount].

        self successful
            ifTrue:
                [self executeNewMethodFromCache.
                 "Recursive xeq affects successFlag"
                 self initPrimCall]
            ifFalse:
                ["Slide the args back up (sigh) and re-insert the selector."
                 1 to: argumentCount do:
                    [:i |
                    self storePointer: argumentCount - i + 1 + selectorIndex
                        ofObject: activeContext
                        withValue: (self fetchPointer: argumentCount - i + selectorIndex
                                        ofObject: activeContext)].
                 self unPop: 1.
                 self storePointer: selectorIndex ofObject: activeContext withValue: messageSelector.
                 argumentCount := argumentCount + 1.
                 newMethod := performMethod.
                 messageSelector := performSelector]

Don't look further. Try and spot the bug first.

It starts with

    performSelector := messageSelector.
    performMethod := newMethod.

and ends with

    messageSelector := performSelector

which is only executed if there is a doesNotUnderstand:.
A doesNotUnderstand: may cause a GC in createActualMessage as it creates the message argument for doesNotUnderstand:. So this bug only bites if a perform is not understood when memory is on the verge of exhaustion.

In the context-to-stack-mapping VM I'm working on, the VM may do up to a page's worth of context allocations on, e.g., trying to store the sender of a context. Rather than try and plug all the holes, it is much safer to restrict garbage collection to happening between bytecodes (probably on sends and backward branches). To do this the GC has to maintain a reserve for the VM which is about the size of two stack pages, or probably 2k bytes, and the VM has to defer incremental collections from allocations until the following send or backward branch.
+1 to moving named primitive into image
+1 to make use of primitive returning value (this would affect the code generator, but hell, its way better to have a notion why it failed than simply guessing in the dark room, like any currently primitive failure code does) +1 to eliminating pushRemappableOop:/popRemappableOop . (A question however, how you would handle heap overflows, when evil primitive generates massive object or lots of tiny objects?) (a little OT .. ) Another question, is how would you sync there changes with Hydra (if you having plans of course)? I have many changes in CodeGenerator and some in VMMaker , to simplify my day's job :) In fact, it would be good to make a clean refactoring of it: - make inst vars of Interpreter/ObjectMemory be the interpreter instance state - make class vars be the part of VM global state - make pool vars be the constants - same could be applied to methods (instance/class side) Currently things are little messy, since i made incremental changes to existing model. Another thing, which doesn't makes me sleep well, is that by using thread specific storage, it would be possible to avoid passing interpreter instance to each function/primitive. Too bad, GCC (on windows) support a __thread__ specifier only in latest release , which is 'technology preview' how they calling it :) If we could use this, it would be possible to make Hydra nearly as fast as current Squeak VM. And there is no point to argue do we have right to use thread specific storage or not: consider that Hydra can run on platforms which can support threads. And since we already using threads, why we can't use thread-specific storage. ... (stopping OT) 2008/7/1 Eliot Miranda <[hidden email]>: > > > On Tue, Jul 1, 2008 at 1:34 PM, tim Rowledge <[hidden email]> wrote: >> >> On 1-Jul-08, at 1:20 PM, Eliot Miranda wrote: >>> >>> One doesn't have to *use* the FFI. If the FFI isn't exposed via a >>> primitive then no FFI. 
One can still have named primitives supported by the >>> image and not by the VM and not use the FFI. To call a named primitive in a >>> primitive plugin the following sequence occurs: >>> >>> the method containing a named primitive spec is activated and the >>> primitive call fails because its function pointer is null. >>> the failure code extracts the plugin name and invokes a primitive to load >>> the plugin library >>> the failure code extracts the primitive name and uses the lookup >>> primitive to find the function in the loaded plugin library >>> the failure code uses a primitive to slam the function pointer into the >>> method >>> the failure code uses the executeMethodWithArgs primitive to retry the >>> bound named primitive method >>> >>> So the FFI is an optional extra. One needs four primitives, load >>> library, lookup name in library, insert primitive function pointer. and >>> executemethodWithArgs (thanks Tim!). Slamming the function into the method >>> could also be done using, say, objectAt:. >>> >>> So one can still have a nice small safe VM and have no direct support for >>> named primitives in the VM. >>> >> >> >> Leaving aside my like of all-named-prims, I *like* this enhancement to the >> support for named prims. >> >> It would slightly complicate the post-prim-call code in each method >> because you would need to handle the failed-to-find-prim case as well as all >> the prim-failed cases. It would be helped by an old idea that I'm pretty >> sure eliot has plans for anyway (as indeed I have written about a few times) >> to add primitive error return values. For those that haven't heard of them, >> this is just a way of having the first temp in a context involved in a prim >> call be a slot for a return value from the prim if any error occurs. This >> means you can actually know wtf went wrong instead of guessing - "ooh, was >> the first arg a SmallInteger? Um, if not, was it a Float that might round >> nicley? 
Err, was the second arg a ByteArray with the first byte = 255?" etc. >> Instead you get "ah, errorValue was 1, so we have to explode the reactor >> core in order to stop the Dreen from eating the children". Much nicer. > > I did primitive error codes at Cadence and they'll very probably b making it > into Cog real soon now. They're simpler than VisualWorks', being only > symbols. So extracting more information, such as a related error code > requires a subsequent call. But I think the work I'm doing right now on > eliminating pushRemappableOop:/popRemappableOop will enable me to have a > structured object with a name and parameters, which is more generally > useful. > > A nice thing is that the code is forwards and backewards compatible. One > can use the VM to run older images. One can run images that contain the > primitive error code on older VMs, where one simply gets a nil error code on > primitive failure. > > > > > -- Best regards, Igor Stasenko AKA sig. |
On Tue, Jul 1, 2008 at 4:12 PM, Igor Stasenko <[hidden email]> wrote:

> +1 to moving named primitive into image

It is much the same as it is now. A large allocation attempt (e.g. in primitiveNewWithArg) either succeeds, possibly causing a garbage collection very soon after, or fails. The garbage collector is allowed to run in primitiveNewWithArg et al. But allocations at any other time cannot cause the garbage collector to run and are known to fit within the extra memory reserved for the interpreter. These allocations always succeed but may exceed some threshold which will cause a garbage collection very soon thereafter.

Memory will look something like this:

    -> start of heap
    ... lots of objects ...
    -> young start
    ... not so many objects ...
    -> free start/eden
    ... small amount of free space (a few hundred k) ...
    -> scavenge threshold
    ... free memory (megabytes) ...
    -> VM reserve start
    ... very few k (e.g. 8k) ...
    -> fwd block reserve
    -> end

The allocation pointer is free start/eden. A new object is allocated here and advances free start/eden. We will dispense with freeBlock and its header for speed.

primitiveNewWithArg (et al) can allocate up to VM reserve start. An attempt beyond that can trigger a garbage collection, but the retry cannot exceed VM reserve start.

Any other allocation can exceed VM reserve start but not fwd block reserve. Exceeding fwd block reserve is an internal fatal error that should not happen if VM reserve is large enough.

Any allocation that pushes free start/eden beyond scavenge threshold sets a flag that will cause the VM to perform an incrementalGC on the next send or backward branch.

The net result is that pointers only move during primitives that do substantial allocations (primitiveNewWithArg, anything that allocates lots of objects) and only these primitives need to use pushRemappableOop:/popRemappableOop.

The details can vary but the essentials are to arrange that for allocations in the course of execution (e.g.
creating a block, creating the message argument of a doesNotUnderstand:, creating a primitive failure value, flushing a stack page to heap contexts, etc., etc.) the garbage collector will not run, and so the VM does not need to manage pointers in local variables. It can assume these values will not change and remain valid until the next send or backward branch.

> (a little OT .. )

This *must* be done. I expect we'll spend quite a lot of time communicating to make this happen. I am about to extend the translator for the stack interpreter. But basically it's my job to fit Cog into Hydra.

My own OT:

BTW, did you see my comment about renaming Hydra? I think "String" is a much better name. Strings are composed of threads. Smalltalk comes from Alan Kay's distaste for the kinds of names people were using for programming languages in the 60's & 70's, names like Zeus and Thor. Hence Smalltalk. Hydra is another mythic name. Cog is a small element in a larger whole (and a cool advert by Honda). Anyway, think it over...

> I have many changes in CodeGenerator and some in VMMaker , to simplify

Could you email me your current VMMaker package, or a link to it? And could you summarise the changes you've made and why? I'll do the same.

> In fact, it would be good to make a clean refactoring of it:

What you need is an abstraction layer, e.g. implemented with macros, that insulates the Slang code from the platform thread system details. You then implement the abstraction layer as thinly as possible. I've done this with the threaded API extension of the VW FFI. It's not hard. You may be able to find GPL versions out there.

best
Eliot

> ... (stopping OT)
2008/7/2 Eliot Miranda <[hidden email]>:
> On Tue, Jul 1, 2008 at 4:12 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> +1 to moving named primitive into image
>>
>> +1 to make use of primitives returning a value (this would affect the code generator, but hell, it's way better to have a notion of why it failed than simply guessing in a dark room, as any current primitive failure code does)
>>
>> +1 to eliminating pushRemappableOop:/popRemappableOop. (A question, however: how would you handle heap overflows, when an evil primitive generates a massive object or lots of tiny objects?)
>
> It is much the same as it is now. A large allocation attempt (e.g. in primitiveNewWithArg) either succeeds, possibly causing a garbage collection very soon after, or fails. The garbage collector is allowed to run in primitiveNewWithArg et al. But allocations at any other time cannot cause the garbage collector to run and are known to fit within the extra memory reserved for the interpreter. These allocations always succeed but may exceed some threshold which will cause a garbage collection very soon thereafter.
>
> Memory will look something like this:
>
> -> start of heap
> ... lots of objects ...
> -> young start
> ... not so many objects ...
> -> free start/eden
> ... small amount of free space (a few hundred k) ...
> -> scavenge threshold
> ... free memory (megabytes) ...
> -> VM reserve start
> ... very few k (e.g. 8k) ...
> -> fwd block reserve
> -> end
>
> The allocation pointer is free start/eden. A new object is allocated here and advances free start/eden. We will dispense with freeBlock and its header for speed.
>
> primitiveNewWithArg (et al) can allocate up to VM reserve start. An attempt beyond that can trigger a garbage collection, but the retry cannot exceed VM reserve start.
>
> Any other allocation can exceed VM reserve start but not fwd block reserve. Exceeding fwd block reserve is an internal fatal error that should not happen if VM reserve is large enough.
> Any allocation that pushes free start/eden beyond scavenge threshold sets a flag that will cause the VM to perform an incrementalGC on the next send or backward branch.
>
> The net result is that pointers only move during primitives that do substantial allocations (primitiveNewWithArg, anything that allocates lots of objects) and only these primitives need to use pushRemappableOop:/popRemappableOop.
>
> The details can vary but the essentials are to arrange that for allocations in the course of execution (e.g. creating a block, creating the message argument of a doesNotUnderstand:, creating a primitive failure value, flushing a stack page to heap contexts, etc., etc.) the garbage collector will not run and so the VM does not need to manage pointers in local variables. It can assume these values will not change and remain valid until the next send or backward branch.

Agreed, most primitives can't pass over the 8k hard limit, except those which deal excessively with i/o (like taking a screenshot). Good: I found that both the socket and file prims use an array argument passed as a buffer, where the plugin stores received data. I know of at least a single place where it can create a ByteArray of arbitrary size (but this is introduced in Hydra, and we will get to it ;) ). In any case, removing the pushRemappableOop:/popRemappableOop pattern would require reviewing all the places where it's used, so at least for the core plugins, which live with the regular VM, this could be done in a single swift cut-and-throw-away manner.

>> (a little OT .. )
>>
>> Another question is how you would sync these changes with Hydra (if you have plans, of course)?
>
> This *must* be done. I expect we'll spend quite a lot of time communicating to make this happen. I am about to extend the translator for the stack interpreter. But basically it's my job to fit Cog into Hydra.
>
> My own OT:
>
> BTW, did you see my comment about renaming Hydra? I think "String" is a much better name.
> Strings are composed of threads. Smalltalk comes from Alan Kay's distaste for the kinds of names people were using for programming languages in the 60's & 70's, names like Zeus and Thor. Hence Smalltalk. Hydra is another mythic name. Cog is a small element in a larger whole (and a cool advert by Honda). Anyway, think it over...

Naming things is not my best talent. :) The name "Hydra" is not really my idea (originally it came from a colleague of Andreas, afaik). I can say that I like it: Hydra has many heads. But if you find it too pretentious, hmm. A "String" is directly opposite to it... it's too plain, boring & gray :) There can be many criteria for how to pick a name: some care that the name should sell itself, others want to put meaning in it (to conform to an idea). I personally prefer names which simply sound good. :)

>> I have many changes in CodeGenerator and some in VMMaker , to simplify
>> my day's job :)
>
> Could you email me your current VMMaker package, or a link to it? And could you summarise the changes you've made and why? I'll do the same.

A package is available at http://jabberwocky.croquetproject.org:8889/HydraVM. Just made a snapshot. Yes, I will write a summary; meanwhile you can read a summary about Hydra here: http://squeakvm.org/~sig/hydravm/devnotes.html

>> In fact, it would be good to make a clean refactoring of it:
>> - make inst vars of Interpreter/ObjectMemory be the interpreter instance state
>> - make class vars be part of the VM global state
>> - make pool vars be the constants
>> - same could be applied to methods (instance/class side)
>>
>> Currently things are a little messy, since I made incremental changes to the existing model.
>>
>> Another thing, which doesn't let me sleep well, is that by using thread-specific storage it would be possible to avoid passing the interpreter instance to each function/primitive.
>> Too bad, GCC (on Windows) supports a __thread__ specifier only in the latest release, which is a 'technology preview', as they call it :) If we could use this, it would be possible to make Hydra nearly as fast as the current Squeak VM. And there is no point in arguing whether we have the right to use thread-specific storage or not: consider that Hydra can run on platforms which support threads. And since we are already using threads, why can't we use thread-specific storage?
>
> What you need is an abstraction layer, e.g. implemented with macros, that insulates the Slang code from the platform thread system details. You then implement the abstraction layer as thinly as possible. I've done this with the threaded API extension of the VW FFI. It's not hard. You may be able to find GPL versions out there.
>
> best
> Eliot
>
>> ... (stopping OT)

--
Best regards,
Igor Stasenko AKA sig.
> I think "String" is a much better name. Strings are composed of threads.

But the word "string" is so overloaded, especially in computer science. It seemed to me that Igor was looking for a name that would evoke "multi-headedness", and "Hydra" certainly does that. I don't think it falls into the same category as names like "Zeus" or "Thor", which I assume were symptomatic of power fantasies with no further meaning. It's the lack of meaning that I see as the problem; I think the mythic quality is otherwise irrelevant.

While we're at it, though...

> Smalltalk comes from Alan Kay's distaste for the kinds of names people were using for programming languages in the 60's & 70's, names like Zeus and Thor. Hence Smalltalk.

I think "Smalltalk" was a terrible choice. Pretty much everyone to whom I've mentioned it thinks it's too long, and it immediately makes them think of annoying obligatory chit-chat, something they hate (as either speaker or listener).

And how unfortunately ironic would it be to act with such deference to an authority figure expressing his distaste for authority figures? :) I don't think we should treat Alan Kay like a god, either...

Oh, and Cog is a great name. :)

-C

--
Craig Latta
improvisational musical informaticist
www.netjam.org
Smalltalkers do: [:it | All with: Class, (And love: it)]
2008/7/2 Craig Latta <[hidden email]>:
>> I think "String" is a much better name. Strings are composed of threads.
>
> But the word "string" is so overloaded, especially in computer science. It seemed to me that Igor was looking for a name that would evoke "multi-headedness", and "Hydra" certainly does that. I don't think it falls into the same category as names like "Zeus" or "Thor", which I assume were symptomatic of power fantasies with no further meaning. It's the lack of meaning that I see as the problem; I think the mythic quality is otherwise irrelevant.

Yeah, Hydra could (small)talk a lot faster, because it has many heads.
- there is no Spoon nor a Fork, simply because Hydra swallowed them both :)

> While we're at it, though...
>
>> Smalltalk comes from Alan Kay's distaste for the kinds of names people were using for programming languages in the 60's & 70's, names like Zeus and Thor. Hence Smalltalk.
>
> I think "Smalltalk" was a terrible choice. Pretty much everyone to whom I've mentioned it thinks it's too long, and it immediately makes them think of annoying obligatory chit-chat, something they hate (as either speaker or listener).

I don't share your point here. The name is quite relevant to computing: if we are going to talk with computers, then to understand each other we need something in common - a language. A small language, which we can both understand and talk in :) And that's mostly why I like Smalltalk - it is small. It is inherently easy to learn and to talk in. Maybe there are other languages which would make a programmer's life as easy as Smalltalk does, but I haven't met them yet.

> And how unfortunately ironic would it be to act with such deference to an authority figure expressing his distaste for authority figures? :) I don't think we should treat Alan Kay like a god, either...
>
> Oh, and Cog is a great name. :)

--
Best regards,
Igor Stasenko AKA sig.
Igor Stasenko wrote:
> 2008/7/2 Eliot Miranda <[hidden email]>:
>> BTW, did you see my comment about renaming Hydra? I think "String" is a much better name. Strings are composed of threads. Smalltalk comes from Alan Kay's distaste for the kinds of names people were using for programming languages in the 60's & 70's, names like Zeus and Thor. Hence Smalltalk. Hydra is another mythic name. Cog is a small element in a larger whole (and a cool advert by Honda). Anyway, think it over...
>
> Naming things is not my best talent. :) The name "Hydra" is not really my idea (originally it came from a colleague of Andreas, afaik). I can say that I like it: Hydra has many heads. But if you find it too pretentious, hmm. A "String" is directly opposite to it... it's too plain, boring & gray :)

My alternative to Hydra would be "Couscous" [1] - lots of little things and you can have some decent Smalltalk with it ;-)

[1] http://en.wikipedia.org/wiki/Couscous

Cheers,
  - Andreas
On 1-Jul-08, at 3:07 PM, Eliot Miranda wrote:

> The compiler is modified to recognise <primitive: integer errorCode: identifier> and <primitive: identifier module: identifier errorCode: identifier> and convert this into one additional temp

Yup, that's what we did...

> and generate a long storeTemp as the first instruction of the method.
>
> The VM is modified to, on primitive failure, check for the long storeTemp and store the error code if it sees it.
>
> Old images lack the storeTemp so the VM does not store the value. Old VMs do perform the storeTemp, but because the storeTemp is storing into the stack top it is effectively a no-op. e.g. if the primitive looks like:

...but that is very clever as a way of handling back-compat. We didn't have to care about that at all, so it got no attention.

>> I'm going to guess that "work I'm doing right now on eliminating pushRemappableOop:/popRemappableOop" relates to getting allocations out of primitives as much as possible, right? That would be nice.
>
> No. This relates to not having to use pushRemappableOop:/popRemappableOop because it is extremely error-prone. For example, can you spot the bug in this:
>
> primitivePerform
>
> A doesNotUnderstand: may cause a GC in createActualMessage as it creates the message argument for doesNotUnderstand:. So this bug only bites if a perform is not understood when memory is on the verge of exhaustion.

Ouch! What fun.

> In the context-to-stack-mapping VM I'm working on, the VM may do up to a page's worth of context allocations on, e.g., trying to store the sender of a context. Rather than try and plug all the holes it is much safer to restrict garbage collection to happening between bytecodes (probably on sends and backward branches).
To do this the > GC has to maintain a reserve for the VM which is about the size of > two stack pages, or probably 2k bytes, and the VM has to defer > incremental collections from allocations until the send or backward > branch following. OK; be aware that there is a pathological case that might impact your code in this area, mostly restricted to non-virtual memory systems. Somewhere in the GC code it will try to grab more memory for forwarding blocks and if none is provided by the OS (as in RISC OS for example) then some of the reserved space will be stolen *without* proper checks and notifications. This can result in the system trying to handle a lowSpace with only a few hundred bytes of free memory. It doesn't go so well after that.... I've been trying to find relevant emails to illustrate better but no luck so far. I'm reasonably sure we never came up with a good solution but the problem surfaced about 4 years ago and just possibly got fixed somewhere. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim "Bother" said Piglet, as Pooh smeared him in honey. |
On Jul 1, 2008, at 5:06 PM, Eliot Miranda wrote:

> Any other allocation can exceed VM reserve start but not fwd block reserve. Exceeding fwd block reserve is an internal fatal error that should not happen if VM reserve is large enough.
>
> Any allocation that pushes free start/eden beyond scavenge threshold sets a flag that will cause the VM to perform an incrementalGC on the next send or backward branch.
>
> The net result is that pointers only move during primitives that do substantial allocations (primitiveNewWithArg, anything that allocates lots of objects) and only these primitives need to use pushRemappableOop:/popRemappableOop.
>
> The details can vary but the essentials are to arrange that for allocations in the course of execution (e.g. creating a block, creating the message argument of a doesNotUnderstand:, creating a primitive failure value, flushing a stack page to heap contexts, etc., etc.) the garbage collector will not run and so the VM does not need to manage pointers in local variables. It can assume these values will not change and remain valid until the next send or backward branch.

Back when Tim and I looked at allocation failures a few years back, I had built a VM where it would slide the end of young space into the forwarding block reserve area. That worked quite nicely, and you could chew memory right down to the number of bytes needed for the next MethodContext, at which point the VM would crash. However, that never went into VMMaker. The real issue is deciding how to stop the process that is fatally consuming memory. But that's a different issue.

As for the set-a-flag-for-an-IGC approach, also see forceTenureFlag via primitiveForceTenure, which some memory policies use to solve the problem of excessive slot scanning via the root table.

--
========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
> OK; be aware that there is a pathological case that might impact your code in this area, mostly restricted to non-virtual-memory systems. Somewhere in the GC code it will try to grab more memory for forwarding blocks, and if none is provided by the OS (as in RISC OS, for example) then some of the reserved space will be stolen *without* proper checks and notifications. This can result in the system trying to handle a lowSpace with only a few hundred bytes of free memory. It doesn't go so well after that... I've been trying to find relevant emails to illustrate this better but no luck so far. I'm reasonably sure we never came up with a good solution, but the problem surfaced about 4 years ago and just possibly got fixed somewhere.

Ah, well let me see if I can find them then.

--
========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
Ok, well the discussions date from late April 2005. Some are below.
On Jul 1, 2008, at 11:46 PM, tim Rowledge wrote:

> OK; be aware that there is a pathological case that might impact your code in this area, mostly restricted to non-virtual-memory systems. Somewhere in the GC code it will try to grab more memory for forwarding blocks, and if none is provided by the OS (as in RISC OS, for example) then some of the reserved space will be stolen *without* proper checks and notifications. This can result in the system trying to handle a lowSpace with only a few hundred bytes of free memory. It doesn't go so well after that... I've been trying to find relevant emails to illustrate this better but no luck so far. I'm reasonably sure we never came up with a good solution, but the problem surfaced about 4 years ago and just possibly got fixed somewhere.
>
> tim
> --
> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> "Bother" said Piglet, as Pooh smeared him in honey.

From: [hidden email]
Subject: initializeMemoryFirstFree
Date: April 28, 2005 10:03:36 PM PDT (CA)
To: [hidden email]

initializeMemoryFirstFree

Some thoughts:

a) We must have, oh say, 100,000 bytes free; reduce fwdBlockBytes if need be from its optimal calculation.
b) fwdBlockBytes must be > 100,000; if not, die. This is an arbitrary value; not sure what the minimum really should be.

    fwdBlockBytes = foo->totalObjectCount & 4294967292U;
    if (!((foo->memoryLimit - fwdBlockBytes - 100000) >= (firstFree + BaseHeaderSize))) {
        fwdBlockBytes = foo->memoryLimit - (firstFree + BaseHeaderSize) - 100000;
    }
    if (fwdBlockBytes < 0)
        error("Death no memory");

So this does allow me to see and get the dialog, but then I don't think the right process gets stopped, since I don't have control and I can't click/keyboard, and we die at error("Death no memory"). Is it suspending the UI process?

200,000 is also not much headroom; try 1,000,000 for better safety. & code at bottom to test with.
freememory fwdblocksBytes, endOfmemory memoryLimit 254204 759056 128479936 129238992 206044 759056 128479936 129238992 202500 759056 128479936 129238992 Fall under 200,000 gcmove moves some bytes about so we see 5 iterations. Note how fwdblocks changed from 759056 to 859152 and we cap that and give back 102404 bytes of free 102404 859152 128379840 129238992 102404 859152 128379840 129238992 102404 877184 128361808 129238992 102404 877184 128361808 129238992 102404 877184 128361808 129238992 JMMJMM TOSS signal LOWSPACE Yes under the 200,000 so do the signal. JMMJMM BEEEEEEEEEEEPPPPPPPPP This is the lowspace process waking up and running. JMMJMM primitiveSignalAtBytesLeft called lowspace process changed threshold to zero. watch how we carve away at fwdblock bytes keeping 100K + 4 bytes around 102404 865684 128373308 129238992 102404 871032 128367960 129238992 102404 871484 128367508 129238992 102404 865364 128373628 129238992 102404 871800 128367192 129238992 102404 869176 128369816 129238992 102404 873180 128365812 129238992 102404 871452 128367540 129238992 102404 864308 128374684 129238992 102404 864120 128374872 129238992 102404 870744 128368248 129238992 102404 870164 128368828 129238992 102404 870664 128368328 129238992 102404 866160 128372832 129238992 102404 869940 128369052 129238992 102404 868352 128370640 129238992 102404 864672 128374320 129238992 102404 870100 128368892 129238992 102404 822776 128416216 129238992 102404 872888 128366104 129238992 102404 872348 128366644 129238992 102404 871832 128367160 129238992 102404 870560 128368432 129238992 102404 869212 128369780 129238992 102404 867052 128371940 129238992 102404 867372 128371620 129238992 102404 867116 128371876 129238992 102404 869252 128369740 129238992 102404 867344 128371648 129238992 102404 866688 128372304 129238992 102404 864384 128374608 129238992 102404 863436 128375556 129238992 102404 860632 128378360 129238992 102404 858804 128380188 129238992 102404 860312 128378680 129238992 102404 
859980 128379012 129238992 102404 860172 128378820 129238992 102404 859692 128379300 129238992 102404 860108 128378884 129238992 102404 859524 128379468 129238992 102404 858408 128380584 129238992 102404 860200 128378792 129238992 102404 860200 128378792 129238992 102404 859804 128379188 129238992 102404 859424 128379568 129238992 102404 856880 128382112 129238992 102404 854388 128384604 129238992 102404 855116 128383876 129238992 102404 854748 128384244 129238992 102404 855036 128383956 129238992 102404 855688 128383304 129238992 102404 853028 128385964 129238992 102404 852592 128386400 129238992 102404 852292 128386700 129238992 102404 852940 128386052 129238992 102404 852552 128386440 129238992 102404 852624 128386368 129238992 102404 852732 128386260 129238992 102404 852036 128386956 129238992 102404 855304 128383688 129238992 102404 855000 128383992 129238992 102404 854744 128384248 129238992 102404 854668 128384324 129238992 102404 855340 128383652 129238992 102404 856028 128382964 129238992 102404 853628 128385364 129238992 102404 854504 128384488 129238992 102404 854248 128384744 129238992 102404 854852 128384140 129238992 102404 853556 128385436 129238992 102404 854204 128384788 129238992 102404 852532 128386460 129238992 102404 853840 128385152 129238992 102404 853564 128385428 129238992 102404 853088 128385904 129238992 102404 854196 128384796 129238992 102404 854164 128384828 129238992 102404 854176 128384816 129238992 102404 852560 128386432 129238992 102404 854456 128384536 129238992 102404 852580 128386412 129238992 102404 850528 128388464 129238992 102404 839720 128399272 129238992 102404 848844 128390148 129238992 102404 847364 128391628 129238992 102404 847684 128391308 129238992 102404 847044 128391948 129238992 102404 846788 128392204 129238992 102404 847236 128391756 129238992 102404 850816 128388176 129238992 102404 846164 128392828 129238992 102404 839576 128399416 129238992 102404 848688 128390304 129238992 102404 849004 128389988 129238992 
102404 848696 128390296 129238992 102404 838656 128400336 129238992 102404 851656 128387336 129238992 102404 797844 128441148 129238992 102404 793268 128445724 129238992 102404 791064 128447928 129238992 102404 841436 128397556 129238992 102404 840376 128398616 129238992 102404 839528 128399464 129238992 102404 786560 128452432 129238992 102404 835652 128403340 129238992 102404 835652 128403340 129238992 102404 783732 128455260 129238992 102404 782336 128456656 129238992 102404 836676 128402316 129238992 102404 832308 128406684 129238992 102404 832068 128406924 129238992 102404 832636 128406356 129238992 102404 833612 128405380 129238992 102404 834320 128404672 129238992 102404 833980 128405012 129238992 102404 835372 128403620 129238992 102404 780488 128458504 129238992 102404 778388 128460604 129238992 102404 826284 128412708 129238992 102404 827288 128411704 129238992 102404 827288 128411704 129238992 102404 827416 128411576 129238992 102404 827416 128411576 129238992 102404 827544 128411448 129238992 102404 827224 128411768 129238992 102404 828588 128410404 129238992 102404 826084 128412908 129238992 102404 774280 128464712 129238992 102404 837180 128401812 129238992 102404 840692 128398300 129238992 102404 841520 128397472 129238992 102404 796540 128442452 129238992 102404 748380 128490612 129238992 102404 700220 128538772 129238992 102404 652060 128586932 129238992 102404 603900 128635092 129238992 102404 555740 128683252 129238992 102404 507580 128731412 129238992 102404 459420 128779572 129238992 102404 411260 128827732 129238992 102404 363100 128875892 129238992 102404 314940 128924052 129238992 102404 266780 128972212 129238992 102404 218620 129020372 129238992 102404 170460 129068532 129238992 102404 122300 129116692 129238992 102404 74140 129164852 129238992 Grind down forward blocks to 32K, note that free goes to 95616, then 47456, then 2500. 
95616 32768 129206224 129238992 47456 32768 129206224 129238992 2500 32768 129206224 129238992 LOTS of these.... 2500 32768 129206224 129238992 2500 32768 129206224 129238992 16720 32768 129206224 129238992 2592 32768 129206224 129238992 2496 32768 129206224 129238992 2496 32768 129206224 129238992 2680 32768 129206224 129238992 2588 32768 129206224 129238992 2484 32768 129206224 129238992 2668 32768 129206224 129238992 2576 32768 129206224 129238992 2472 32768 129206224 129238992 2656 32768 129206224 129238992 2564 32768 129206224 129238992 2460 32768 129206224 129238992 2644 32768 129206224 129238992 2552 32768 129206224 129238992 2448 32768 129206224 129238992 2632 32768 129206224 129238992 2540 32768 129206224 129238992 2436 32768 129206224 129238992 2620 32768 129206224 129238992 2528 32768 129206224 129238992 2424 32768 129206224 129238992 2608 32768 129206224 129238992 2516 32768 129206224 129238992 2412 32768 129206224 129238992 2596 32768 129206224 129238992 2504 32768 129206224 129238992 2400 32768 129206224 129238992 2584 32768 129206224 129238992 2492 32768 129206224 129238992 2492 32768 129206224 129238992 2400 32768 129206224 129238992 2400 32768 129206224 129238992 2308 32768 129206224 129238992 2308 32768 129206224 129238992 2216 32768 129206224 129238992 2216 32768 129206224 129238992 2124 32768 129206224 129238992 2124 32768 129206224 129238992 2032 32768 129206224 129238992 2032 32768 129206224 129238992 1940 32768 129206224 129238992 1940 32768 129206224 129238992 1848 32768 129206224 129238992 1848 32768 129206224 129238992 1756 32768 129206224 129238992 1756 32768 129206224 129238992 1664 32768 129206224 129238992 1664 32768 129206224 129238992 1572 32768 129206224 129238992 1572 32768 129206224 129238992 1480 32768 129206224 129238992 1480 32768 129206224 129238992 1388 32768 129206224 129238992 1388 32768 129206224 129238992 1296 32768 129206224 129238992 1296 32768 129206224 129238992 1204 32768 129206224 129238992 1204 32768 129206224 
129238992 1112 32768 129206224 129238992 1112 32768 129206224 129238992 1020 32768 129206224 129238992 1020 32768 129206224 129238992 928 32768 129206224 129238992 928 32768 129206224 129238992 836 32768 129206224 129238992 836 32768 129206224 129238992 744 32768 129206224 129238992 744 32768 129206224 129238992 652 32768 129206224 129238992 652 32768 129206224 129238992 560 32768 129206224 129238992 560 32768 129206224 129238992 468 32768 129206224 129238992 468 32768 129206224 129238992 376 32768 129206224 129238992 376 32768 129206224 129238992 284 32768 129206224 129238992 284 32768 129206224 129238992 192 32768 129206224 129238992 192 32768 129206224 129238992 8 32768 129206224 129238992 8 32768 129206224 129238992

And we die, no space to allocate context record....

int initializeMemoryFirstFree(int firstFree) {
    register struct foo * foo = &fum;
    int fwdBlockBytes;

    fwdBlockBytes = foo->totalObjectCount & 4294967292U;
    if (!((foo->memoryLimit - fwdBlockBytes) >= ((firstFree + BaseHeaderSize) + (100 * 1024)))) {
        fwdBlockBytes = (foo->memoryLimit - (firstFree + BaseHeaderSize)) - (100 * 1024);
    }
    if (fwdBlockBytes < (32 * 1024)) {
        fwdBlockBytes = 32 * 1024;
        if (!((foo->memoryLimit - fwdBlockBytes) >= (firstFree + BaseHeaderSize))) {
            fwdBlockBytes = foo->memoryLimit - (firstFree + BaseHeaderSize);
        }
    }
    foo->endOfMemory = foo->memoryLimit - fwdBlockBytes;
    foo->freeBlock = firstFree;
    /* begin setSizeOfFree:to: */
    longAtput(foo->freeBlock, ((foo->endOfMemory - firstFree) & AllButTypeMask) | HeaderTypeFree);
    /* begin setSizeOfFree:to: */
    longAtput(foo->endOfMemory, (BaseHeaderSize & AllButTypeMask) | HeaderTypeFree);
    if (DoAssertionChecks) {
        if (!((foo->freeBlock < foo->endOfMemory) && (foo->endOfMemory < foo->memoryLimit))) {
            error("error in free space computation");
        }
        if (!((foo->endOfMemory + (foo->headerTypeBytes[(longAt(foo->endOfMemory)) & TypeMask])) == foo->endOfMemory)) {
            error("header format must have changed");
        }
        if (!((objectAfter(foo->freeBlock)) == foo->endOfMemory)) {
            error("free block not properly initialized");
        }
    }
}

...at 200,000 we don't see the signal, because after the full GC at the 200K boundary we gobble up all the memory for the fwdtable, leaving 4 or 6 bytes left. Then we immediately die because we can't allocate the next context record. For a 64MB memory block we've about a MB or so tied up in fwdspace; certainly it wants over 200K after the fullGC. As you noticed, changing the limit to be larger helps; I think 400,000 was about the cutoff for my test case that gave me the debugger. Let me check at 512MB.

On Apr 29, 2005, at 3:29 PM, Tim Rowledge wrote:

> OK, I have some slightly off-to-the-side news on this.
>
> A VM with the store-errant-process-oop + an image with dtl's code to make use
> of that is much more stable if the lowSpaceThreshold is raised a good bit. What
> is likely happening to your system is that the lowspace is being signalled but
> the long faring about with gc attempts means the event tickler is interrupted
> instead of the UI process 'at fault'. Thus the bad boy goes ahead and messes
> you up. With the fix in, things are a bit more sensible in that the 'right'
> process is interrupted. Of course, not much help if lots of processes are using
> up memory!
>
> Still, a code-recursion test - ie fill up memory with contexts - is passable
> with 'only' half-meg of lowSpaceThreshold. Likewise the fill memory with
> bitblts. It seems to need 1mb to survive the fill with Links though. This is at
> least encouraging enough to maybe help us find the problem with my tree walker
> changes that led down this path in the first place.
>
> The essence of the problem is that we really want one byte per object
> available to the fwdTable. In the worst case, we could have almost all of OM
> filled with plain Objects - ie 8byte chunks - and so would really wish for ~12%
> of available memory reserved. So much for direct pointers 'saving the
> wasted space of an object table', eh?
> > more later but I have to dash out for the dog's massage appointment. > Really. > > > tim > -- > Tim Rowledge, [hidden email], http://sumeru.stanford.edu/tim > Strange OpCodes: SEXI: Sign EXtend Integer > Apr 30th,, 2005 John M McIntosh wrote: > > On Apr 30, 2005, at 8:00 PM, Andreas Raab wrote: > >> Hi Tim - >> >>> After having problem trying to debug some TK4 code that blew up >>> with lowspace >>> problems but never let me catch and debug, I spent some time >>> adding the >>> lowspace-process stuff we recently discussed. I had to make a few >>> alterations >>> to match it up with the latest 64bit clean code but no problems >>> with that part. >> >> What am I missing? I don't remember low-space stuff - I only >> remember interrupt-related stuff. > > There was a mantis bug about low-space issues and some patchs to > record which process caused the lowspace signal. Mind this in my > opinion is wrong. > >> >>> Depending upon the exact size of object memory in use the 200kb >>> used as the >>> lowSpaceThreshold can be gobbled up in one swallow by the >>> initializeMemoryFirstFree: method making sure there is a byte per >>> object that >>> survived the markPhase. In using useUpMemory we can get to having >>> 4 bytes of >>> free space when the next allocate is attempted.... Ka-Boom. >> >> Well, so don't eat up the memory. There is no reason why >> initializeMemoryFirstFree: would have to reserve that much memory - >> like the comment says the reserve "should" be chosen so that >> compactions can be done in one pass but there is absolutely no such >> requirement. Multi-pass compactions have happened in the past and >> there is nothing wrong with them (in a low-space situation). >> >>> This assumes that we really need to have one byte per object of >>> course. The >>> original rationale was to keep the number of compact loops down to >>> eight (see >>> Dan's comment in initializeMemoryFirstFree:) for Alan's large demo >>> image. 
The >>> nicest solution would be to come up with a way to do our GC & >>> compacting >>> without needing any extra space. Commence headscratching now... >>> John suggested >>> making sure the fwd gets less than the byte-per-object if things >>> are tight, and >>> accpting the extra compaction loops. >> >> Yes. That's the only reasonable way of dealing with it. > > What happens is the fwdblocks calculation grabs all the available > free memory when it's recalculated after the full GC, the check for > this condition actually backs it off to allow one object header > free, 4 or 6 bytes I believe, usually you die right away because > someone attempts to allocate a new context record and we don't have > 98ish bytes free. I gave Tim a change set that attempts to maximise > freespace to 100K by reducing fwdblocks down to 32k, once you hit > the 32k limit freespace then heads towards zero of course. > > Note that once freespace goes under 200,000 we do signal the > lowspace semaphore btw. > > These changes do require a VM change, but we did notice as Tim > points out if you increase the lowspace threshold, say to 1MB in my > testing the other night we'll get the semaphore signaled with a > current VM, this would not occur before in an unaltered VM. > >> >>> Bad news- consider Tweak. With lots of processes whizzing away, >>> merely stopping >>> the one that did the allocation and triggered the lowspace is not >>> going to be >>> much good. Stopping everything except the utterly essential stuff >>> to debug the >>> lowspace will be needed. Probably. >> >> Uh, oh. Are you telling me that the "low space stuff" you are >> referring to above actually suspends the process that triggers the >> low-space condition? Bad, bad, bad idea. Ever considered that this >> might be the timer process? The finalization process? Low-space is >> *not* a per-process condition; suspending the currently running >> process is something that should be done with great care (if at all). 
>> >> Please, don't suspend that process - put it away for the image to >> examine but by all means do NOT suspend it. If you give me a nice >> clean semaphore signal for Tweak to handle a low-space condition I >> know perfectly well what to do but if you just suspend a random >> process which may have absolutely nothing with the low space >> condition, then, yes, we are in trouble (if this were a tweak >> scheduler process you'd be totally hosed). > > Tim and I were considering to suspend all user processes and others > we don't have knowledge of being untouchable, then I pointed out > Tweak spawns all these process, what do we do about them? Certainly > we can call something to say lowspace Mr Tweak beware... > > The Process Browser logic has a table identifying processes of the > VM, we assume a process the user created is causing the problem. > The earlier fix suggested to stop the process that was running when > the lowspace condition occurred, but I doubt you can 100% say that > is the process in question and could as you know be the finalization > process or other critical task. Still this is not harmful because > the evil process in question is still running and will terminate > your image in short order. > >> >> Cheers, >> - Andreas >> > -- = = = ======================================================================== John M. McIntosh <[hidden email]> 1-800-477-2659 Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com = = = ======================================================================== -- = = = ======================================================================== John M. McIntosh <[hidden email]> Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com = = = ======================================================================== |
2008/7/2 John M McIntosh <[hidden email]>:
[ big snip.... ]

A few thoughts about how the VM could determine which process(es) are memory hungry, to kill them without mercy, leaving the rest intact and avoid dying:

Add a 'memory policy' slot to Process instances, which can tell the VM whether a given process can be killed without damaging the most critical parts of the image. Let's say a value of 0 means do not kill under any circumstances, and values > 0 mark candidates for killing. Then we could assign each process a corresponding policy, and when the VM goes looking for a candidate to kill, it picks a process with a higher 'memory policy' slot value. So we could have the most important processes at value = 0, and UI and forked processes at value = 100+. Processes more important than the UI can sit in the middle, between 0 and 100.

Another rule - automatically incrementing the value of fork(ed) processes - could be useful too. In this way the slot could reflect fork depth, so one can potentially track process spawn chains.

What do you think - is adding a new slot to Process instances worth the stability? :)

One more thing which could be useful is informing the image about such an unconditional termination event. Either by registering a semaphore which will be signaled upon such an event, or by adding some kind of #onTerminate message, so the dying process can handle the event itself and perform the necessary countermeasures :) Or is there already any means to inform another process that some other process has terminated?

From my OS practice: an OS treats processes as waitable objects, so one could apply the following:

proc run.
proc wait. "wait till terminated"

This is easy to do if a process can be associated with a semaphore instance; when it terminates, it signals the semaphore.

--
Best regards, Igor Stasenko AKA sig.
On Wed, Jul 2, 2008 at 3:32 AM, Igor Stasenko <[hidden email]> wrote:

There is no need for this to be in the VM. Instead the process that runs once the LowSpaceSemaphore is signalled should be of a very high priority (probably above finalization but below timing priority). It then enumerates the runnable processes (processes waiting on semaphores are presumably not the ones chewing up memory). It then suspends any processes on the runnable queues that meet certain criteria.

Something that *could* be in the VM is accounting of how much space a process consumes. Add a slot to each process known to the VM called e.g. slotsAllocated. The VM computes the slots allocated between context switches. This is cheap to compute because it can compute how much space was allocated since the last garbage collection or process switch simply by subtracting the allocation pointer at the end of the previous GC or process switch from the current allocation pointer. The slots allocated since the last process switch are added to the slot in the old process and the slots allocated count zeroed on each context switch. We then have an accurate measure of how much space each process has allocated.

When the low space process runs it simply examines the runnable processes, checking their space allocation. It can either maintain a per-process allocation rate by computing the amount allocated since the last low space signal, or it can zero the per-process allocation count at each low space signal. This is much more flexible and less arbitrary than having the VM do it.

Similarly a VM that does context-to-stack mapping should be able to cheaply maintain a stack size count per process, since it only has to increase or decrease the current process's stack size on each stack page overflow/underflow and context switch.
Traversing a stack page's frames to get an accurate count of the number of "contexts" on each page is pretty quick (just a walk of the frame pointer -> caller frame pointer chain). Getting an approximation by dividing the used portion of a stack page by the minimum frame size is even quicker. You could then provide the image with either an accurate stack depth, or a good approximation thereof. That gives the low space process an easy job to identify a potentially infinitely recursive process.

In general it is good design to put mechanism in the VM and keep policy up in the image.
On Wed, 02 Jul 2008 16:06:42 +0200, Eliot Miranda wrote:
> On Wed, Jul 2, 2008 at 3:32 AM, Igor Stasenko wrote: > ... >> Few thoughts, about how VM could determine what process(es) is memory >> hungry to kill them w/o mercy, leaving rest intact and avoid dying: >> >> add 'memory policy' slot to Process instance, which can tell VM, is >> given process can be killed w/o damaging most critical parts of image. > > > There is no need for this to be in the VM. Instead the process that runs > once the LowSpaceSemaphore is signalled should be of a very high priority > (probably above finalization but below timing priority). It then > enumerates > the runnable processes (processes waiting on semaphores are resumably not > the ones chewing up memory). It then suspends any processes on the > runnable > queues that meet certain criteria. > > Something that *could* be in the VM is accounting of how much space a > process consumes. Add a slot to each process known to the VM called e.g. > slotsAllocated. The VM computes the slots allocated between context > switches. This is cheap to compute because it can compute how much space > was allocated since the last garbage collection or process switch simply > by > subtracting the allocation pointer at the end of the previous GC or > process > switch from the current allocation pointer. The slots allocated since > the > last process switch is added to the slot in the old process and the slots > allocated count zeroed on each context switch. We then have an accurate > measure of how much space each process has allocated. This is a very interesting point, and I would like to see memory consumption per process, that'd be great! But I'm not quite sure I understand how to balance the value of that slot, i.e. how freed space is subtracted. Could you elaborate a bit on this? Thank you! /Klaus |
Hi Klaus,
a request: please don't change the subject to Re: foo... because it breaks threading in gmail (and I guess elsewhere). All other replies have kept the subject unchanged.
On Wed, Jul 2, 2008 at 7:22 AM, Klaus D. Witzel <[hidden email]> wrote:
On Wed, 02 Jul 2008 16:06:42 +0200, Eliot Miranda wrote: The slot only measures allocations. We never subtract freed space. We don't know which processes the space freed during a garbage collection "belongs to" (were allocated by). But we don't need to know this. The heuristic to determine which processes should be suspended to stop run-away space allocation is that if a process has allocated a lot in recent time then it is likely to continue to do so. A process can't create lots of retained objects without allocating lots of objects. So while this scheme may find false positives, processes which allocate a lot but don't retain the storage, but it will find the processes that allocate a lot and do retain the storage allocated. Since the scheme is intended to stop computation before a fatal evet (running ojut of storage) these false positives are acceptable. Of course this heuristic is easy to fool if one constructs a pool of processes that conspire to pass the task of allocating space amongst themselves, so that by the time the system gets around to suspending an offending process it has passed the responsibility of behaving badly to another process, but that's a very artificial example. |
Hi Eliot,
on Wed, 02 Jul 2008 17:40:05 +0200, you wrote: > Hi Klaus, > > a request: please don't change the subject to Re: foo... because it > breaks threading in gmail (and I guess elsewhere). All other replies > have kept the subject unchanged. No, I didn't change the subject line. I replied to your - http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-July/129727.html and here's what the list server received from me - http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-July/129731.html and the gmane NNTP service that I'm using shows the same. Perhaps something happened on your gateway. Anyways. > On Wed, Jul 2, 2008 at 7:22 AM, Klaus D. Witzel wrote: > >> On Wed, 02 Jul 2008 16:06:42 +0200, Eliot Miranda wrote: ... >>> Something that *could* be in the VM is accounting of how much space a >>> process consumes. Add a slot to each process known to the VM called >>> e.g. >>> slotsAllocated. The VM computes the slots allocated between context >>> switches. This is cheap to compute because it can compute how much >>> space >>> was allocated since the last garbage collection or process switch >>> simply >>> by >>> subtracting the allocation pointer at the end of the previous GC or >>> process >>> switch from the current allocation pointer. The slots allocated since >>> the >>> last process switch is added to the slot in the old process and the >>> slots >>> allocated count zeroed on each context switch. We then have an >>> accurate >>> measure of how much space each process has allocated. >>> >> >> This is a very interesting point, and I would like to see memory >> consumption per process, that'd be great! But I'm not quite sure I >> understand how to balance the value of that slot, i.e. how freed space >> is >> subtracted. Could you elaborate a bit on this? Thank you! > > The slot only measures allocations. We never subtract freed space. We > don't know which processes the space freed during a garbage collection > "belongs to" (were allocated by). 
But we don't need to know this. The > heuristic to determine which processes should be suspended to stop > run-away > space allocation is that if a process has allocated a lot in recent time > then it is likely to continue to do so. A process can't create lots of > retained objects without allocating lots of objects. So while this > scheme > may find false positives, processes which allocate a lot but don't retain > the storage, but it will find the processes that allocate a lot and do > retain the storage allocated. Since the scheme is intended to stop > computation before a fatal evet (running ojut of storage) these false > positives are acceptable. Okay, I see what you mean. Thank you. /Klaus > Of course this heuristic is easy to fool if one constructs a pool of > processes that conspire to pass the task of allocating space amongst > themselves, so that by the time the system gets around to suspending an > offending process it has passed the responsibility of behaving badly to > another process, but that's a very artificial example. |
On Jul 2, 2008, at 7:06 AM, Eliot Miranda wrote: > Something that *could* be in the VM is accounting of how much space > a process consumes. Add a slot to each process known to the VM > called e.g. slotsAllocated. The VM computes the slots allocated > between context switches. This is cheap to compute because it can > compute how much space was allocated since the last garbage > collection or process switch simply by subtracting the allocation > pointer at the end of the previous GC or process switch from the > current allocation pointer. The slots allocated since the last > process switch is added to the slot in the old process and the slots > allocated count zeroed on each context switch. We then have an > accurate measure of how much space each process has allocated. Ya, there is exactly one place in the VM where the processor switch happens. In the past I tinkered with it to add some slots to Process to track wall clock time by process. At that time was I was thinking gee we could count bytes allocated and other things like socket usage etc, thus providing a bit more process consumption statistical data for Squeak threads ala unix processes. -- = = = ======================================================================== John M. McIntosh <[hidden email]> Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com = = = ======================================================================== |
On 2-Jul-08, at 12:12 AM, John M McIntosh wrote:

> Ok, well the discussions date from late April 2005. Some are below.

I also found

On 27-Apr-05, at 7:39 PM, Tim Rowledge wrote:

> In message <[hidden email]>
> John M McIntosh <[hidden email]> wrote:
>
>> question without issues. Lately we added some changes by Andreas for
>> correct weak array handling, some changes to how become: works, and my
>> work in VM GC statistical data, so I can't say which is at fault, if
>> any...
>
> I'm currently hip-deep in similar excrement after merging in the new
> lowspace-process handling and the gc instrumentation/weak pointer stuff.
> It _looks_ as if something is getting twisted in the general gc area
> since the free block size ends up being set to 4. As in '4' not 4k or
> 4mb, just 4. That quite unsurprisingly upsets the
> sufficientSpaceToAllocate: code and we get an Earth-shattering kaboom.
> Once my head has stopped spinning I'll try the lowspace changes without
> the gc changes.
>
> All to try to make it possible to debug some stupid problem in TK4 tree
> walking....

at last - I think this is probably the email where I tried to explain the problem -

On 30-Apr-05, at 7:00 PM, Tim Rowledge wrote:

> After having problems trying to debug some TK4 code that blew up with
> lowspace problems but never let me catch and debug, I spent some time
> adding the lowspace-process stuff we recently discussed. I had to make a
> few alterations to match it up with the latest 64bit clean code but no
> problems with that part.
>
> After building a VM I started testing with some of the methods in
> SystemDictionary 'memory space' - in particular #useUpMemory. It is
> perhaps fortunate that I did since the other #useUp* methods pretty much
> work once the lowspace process is caught by the vm and passed up to the
> image.
> After a _lot_ > of head scratching by John & I we found that with a gazillion tiny > objects > (Links are the smallest possible objects that can exist on their > own, plain > Objects would have to be contained in a collection and so would cost > the same 3 > words per object) cause a catastrophic GC explosion. What happens is > that > memory fills up until we get to signal lowspace and then we are in > danger. > Depending upon the exact size of object memory in use the 200kb used > as the > lowSpaceThreshold can be gobbled up in one swallow by the > initializeMemoryFirstFree: method making sure there is a byte per > object that > survived the markPhase. In using useUpMemory we can get to having 4 > bytes of > free space when the next allocate is attempted.... Ka-Boom. > Expanding the lowSpaceThreshold (along with the VM changes to report > the > process and avoid the accidental problem of interrupting > eventTickler) to a > couple of mb makes it ok on my machine and the threshold can be a > lot lower > with the other tests that create bigger (hence fewer) objects (hence > smaller > fwdTable needs). In the worst case, we could have a very large OM > filled with > very small objects all surviving markPhase; in such a case we would > need an > additional 1/12 of OM available for the fwdTable. So for a 30Mb > objectmemory we > ought to set the lowSPaceThreshold to 30/13 => 2.31Mb + actual space > needed to > run the notifier/debugger etc for reasonable safety. Or hide the > 2.31 Mb away > so the image never even knows it is there. If you are using virtual > memory and > a limit of 512Mb then you should perhaps secrete 40Mb some where safe. > > This assumes that we really need to have one byte per object of > course. The > original rationale was to keep the number of compact loops down to > eight (see > Dan's comment in initializeMemoryFirstFree:) for Alan's large demo > image. 
The > nicest solution would be to come up with a way to do our GC & > compacting > without needing any extra space. Commence headscratching now... John > suggested > making sure the fwd gets less than the byte-per-object if things are > tight, and > accpting the extra compaction loops. > > Good news- with the vm change and 2Mb lowSpaceThreshold I can > probably go back > and find my TK4 problem(s). > > Bad news- consider Tweak. With lots of processes whizzing away, > merely stopping > the one that did the allocation and triggered the lowspace is not > going to be > much good. Stopping everything except the utterly essential stuff to > debug the > lowspace will be needed. Probably. It's really depressing reading mail from back then and even earlier. There are long discussions from 2002 about changing the image format for version 4, clean compiled methods, closures........ tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Fractured Idiom:- LE ROI EST MORT. JIVE LE ROI - The King is dead. No kidding. |
On Wed, Jul 2, 2008 at 11:07 AM, John M McIntosh <[hidden email]> wrote:
Please maintain an option that keeps process context switches as fast as possible. I'm interested in Erlang style architectures and how they might be implemented in Smalltalk. Hydra+Cog+Newspeak would make a really nice platform for Erlang-style applications. -david |