Hi All,
I've started working on a new GC for Cog and have a few questions. 1. does anyone have a copy of Joel Bartlett's J. F. Bartlett. A generational, compacting collector for C++. In E. Jul and N.-C. Juul, editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in Object-Oriented Systems, Oct. 1990. ?
2. what's the largest number of classes you've got in your most complex image and/or what's the largest number of classes you're ever created in an image? a.k.a. is 64k classes enough (16 bits), or is 1m classes enough (24 bits)?
3. do you have any personal recommendations, or example code, of systems that combine generation scavenging and/or compaction with object pinning?
TIA, Eliot
|
On 6/15/2010 5:17 PM, Eliot Miranda wrote: Hi All, -- Dennis Smith +1 416.798.7948 Cherniak Software Development Corporation Fax: +1 416.798.0948 509-2001 Sheppard Avenue East [hidden email] Toronto, ON M2J 4Z8 <a class="moz-txt-link-freetext" href="sip:dennis@CherniakSoftware.com">sip:dennis@... Canada http://www.CherniakSoftware.com Entrance off Yorkland Blvd south of Sheppard Ave east of the DVP |
In reply to this post by Eliot Miranda-2
If people define classes by hand, I imagine 64k would be enough. But
suppose they are using light-weight classes, i.e. giving each object its own behavior. Then it would be fairly easy to have more than 64K of them, though 1M is still a reasonable limit. What is the difference in cost between choosing 64K as the limit and choosing 1M? I know, one byte, but what else would you use that byte for? -Ralph |
In reply to this post by Eliot Miranda-2
Hi, Eliot
On 16 June 2010 00:17, Eliot Miranda <[hidden email]> wrote: > Hi All, > I've started working on a new GC for Cog and have a few questions. > 1. does anyone have a copy of Joel Bartlett's J. F. Bartlett. A > generational, compacting collector for C++. In E. Jul and N.-C. Juul, > editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in Object-Oriented > Systems, Oct. 1990. ? Not this in particular. But about GC, i found a very interesting idea about run-time garbage collection. The idea is simple: at each allocation of N bytes, GC marks N*k bytes of object memory (where k>1). So, the mark phase overhead is proportionally distributed among memory allocations, which makes a usual allocation a bit slower, but having much less delays in garbage collection cycles, since once it marked everything, it just doing a sweep (& compaction). > 2. what's the largest number of classes you've got in your most complex > image and/or what's the largest number of classes you're ever created in an > image? a.k.a. is 64k classes enough (16 bits), or is 1m classes enough (24 > bits)? I would really like to not have a hard limit for number of classes in VM. This is for implementing a prototype-based stuff, where each object is an instance of itself, and hence it acting as class. For running a usual smalltalk system, i think the limit could be even less - like 4095 or 8095 classes. But i think that VM should not impose a hard limit on the number of classes. > 3. do you have any personal recommendations, or example code, of systems > that combine generation scavenging and/or compaction with object pinning? Never explored this direction :) How about making everything pinned from start (i.e. use a heap-based algorithms for allocation)? Then you don't have to do compaction, and do it only when saving an image. > TIA, > Eliot > -- Best regards, Igor Stasenko AKA sig. |
In reply to this post by Eliot Miranda-2
Can't resist the following unproductive comment: Object pinning? Ugh!!
I realize you probably, after much thought, have decided you cannot avoid pinning, but its consequences scare the you-know-what out of me. If you supported one class per object, then you could get rid of the identityHash field in the header, by making it an instance method. ;-) Oh wait a minute, since classes are objects, you would need oops larger than 64-bits each. ;-) - David On Jun 15, 2010, at 2:17 PM, Eliot Miranda wrote: Hi All, |
In reply to this post by Ralph Johnson
At Tue, 15 Jun 2010 16:44:57 -0500,
Ralph Johnson wrote: > > If people define classes by hand, I imagine 64k would be enough. But > suppose they are using light-weight classes, i.e. giving each object > its own behavior. Then it would be fairly easy to have more than 64K > of them, though 1M is still a reasonable limit. > > What is the difference in cost between choosing 64K as the limit and > choosing 1M? I know, one byte, but what else would you use that byte > for? If the next highest notch from 16 bit is 24 bit but not 20 bit, that would be 16M but not 1M. The big images in "our" use cases is with several million objects (not that big in other standard), so if each object theoretically had a distinct class, it would only barely fits. But in reality 16M classes and an object occupies around 40 byte each on avarage, only 6 instances or such per class on average before hitting 4GB. So, for 32-bit address space in mind, 16M classes would be a lot. 1M classes would be 100 instances on average (where the distribution of instance count is highly skewed so average would not make so much sense however), it still would be okay. (As Ralph asked, what are the other things that are competing the bits?) -- Yoshiki |
In reply to this post by Eliot Miranda-2
On Tue, Jun 15, 2010 at 11:17 PM, Eliot Miranda <[hidden email]> wrote:
Hi All, the internet does - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.8462 (click cached PDF) |
In reply to this post by Ralph Johnson
On Tue, Jun 15, 2010 at 2:44 PM, Ralph Johnson <[hidden email]> wrote: If people define classes by hand, I imagine 64k would be enough. But So the header format is 64 bits. Its not about the bits. Its about speed of access. We need of the order of 8 bits for GC flags (marked, free, forwarded, weak etc). We would like of the order of 8 bits for field size and 8 bits for inst size, with objects >= 255 32-bit fields having an overflow size field. That leaves 5 bytes, 40 bits for identity hash and class index. A symmetric 20-bits each gives us slightly slower class index access but provides for 1m classes (apologies for my math earlier). But a 16-bit class index would be faster, or at least have more compact code, on 86, and allow for identity hashes for non-classes of up to 16m. So if 64k classes is enough then a 16 bit class index field is the way to go. If its tight I would stick with 20/20. VW's 64-bit VM uses exactly this. Its essentially:
8 bits of GC flags (1 bit being pointers vs non-pointers) 8 bits of field size (64 bits per field) 8 bits of inst size for pointer objects/5 unused bits, 3 bits of odd size for non-pointers, so size = 8 * # of fields - odd size
20 bits of class index 20 bits of identity hash But its organized to put the 20 bits in the least significant half of each word, the inst and field sizes in bytes and the GC flags distributed:
| field size byte | 4 flags | 20 bit id hash | inst size/odd size byte | 4 flags | 20 bit class idx | so extracting the class idx, the high frequency operation done on every send, is a 64-bit fetch & mask, which in a 32-bit implementation would be a 32-bit fetch and mask. On 32-bits might be better to do
| field size byte | 20 bit id hash | 4 flags | 20 bit class idx | 4 flags | inst size/odd size byte | which would be a 32-bit fetch followed by a 12-bit shift, which is typically less code bytes than the 0xFFFFF mask, and fast now that we can assume a barrel shifter. But that would be incompatible with the 64-bit scheme. So I'll stick with the mask. Its still way shorter than the Cog code for the current compact class scheme.
Clearly with the 20/20 scheme its easy to steal a couple of bits for more GC flags by taking a bit from each 20 bit field. Further, with Squeak, the inst size byte might be pointless (although contexts are still indexable objects with fixed fields) and so I doubt bits are at a premium. Hence the real question is whether I should be whorish and go for the 16-bit compact fast class index access or not.
best Eliot
|
In reply to this post by Eliot Miranda-2
On Tue, Jun 15, 2010 at 2:17 PM, Eliot Miranda <[hidden email]> wrote: Hi All, thanks to all 5 of you who've provided this. Ta!
|
In reply to this post by Eliot Miranda-2
On 6/15/2010 3:32 PM, Eliot Miranda wrote:
> Clearly with the 20/20 scheme its easy to steal a couple of bits for > more GC flags by taking a bit from each 20 bit field. Further, with > Squeak, the inst size byte might be pointless (although contexts are > still indexable objects with fixed fields) and so I doubt bits are at a > premium. Hence the real question is whether I should be whorish and go > for the 16-bit compact fast class index access or not. My $.02: Start with 16 bits and declare the free bits "reserved". When someone complains for the first time, review and decide how many more bits to devote. Rationale #1: Rather than taking away resources when you notice you need an extra header bit, start with the smaller number but keep some room for growth if it becomes necessary. Rationale #2: YAGNI (i.e., we really don't know if we'll need more header bits or if we need more room for classes). Cheers, - Andreas |
On Tue, Jun 15, 2010 at 3:53 PM, Andreas Raab <[hidden email]> wrote:
Excellent suggestion.
|
In reply to this post by Igor Stasenko
Am 15.06.10 23:55 schrieb Igor Stasenko:
> For running a usual smalltalk system, i think the limit could be even > less - like 4095 or 8095 classes. I disagree. I have seen several systems which have between 10.000 and 20.000 classes, and over on the VWNC list, other people reported the same order of magnitude. Cheers, Joachim Geidel |
In reply to this post by Eliot Miranda-2
Hello Eliot Our current system has 14787 classes (including meta classes), I
can’t imagine I’ve ever worked on an image with more than twice that
number written by hand. I did once work on a classification system that created
classes programmatically. We probably had over 64K classes but less than 1m. Joerg From:
[hidden email] [mailto:[hidden email]]
On Behalf Of Eliot Miranda Hi All, I've started working on a new GC for Cog
and have a few questions. 1. does anyone have a copy of Joel Bartlett's J. F.
Bartlett. A generational, compacting collector for C++. In E. Jul and N.-C.
Juul, editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in
Object-Oriented Systems, Oct. 1990. ? 2.
what's the largest number of classes you've got in your most complex image
and/or what's the largest number of classes you're ever created in an image?
a.k.a. is 64k classes enough (16 bits), or is 1m classes enough (24
bits)? 3.
do you have any personal recommendations, or example code, of systems that
combine generation scavenging and/or compaction with object pinning? TIA, Eliot |
Free forum by Nabble | Edit this page |