Smalltalk › Squeak › Squeak - Dev

A couple of memory management related questions

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

13 messages Options

Eliot Miranda-2

A couple of memory management related questions

Hi All,

I've started working on a new GC for Cog and have a few questions.

1. does anyone have a copy of Joel Bartlett's J. F. Bartlett. A generational, compacting collector for C++. In E. Jul and N.-C. Juul, editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in Object-Oriented Systems, Oct. 1990. ?

2. what's the largest number of classes you've got in your most complex image and/or what's the largest number of classes you're ever created in an image? a.k.a. is 64k classes enough (16 bits), or is 1m classes enough (24 bits)?

3. do you have any personal recommendations, or example code, of systems that combine generation scavenging and/or compaction with object pinning?

TIA,

Eliot

Dennis smith-4

Re: [vwnc] A couple of memory management related questions

I have 12,000 classes (including everything), I expect that to grow, but probably not more than 30,000 total.

On 6/15/2010 5:17 PM, Eliot Miranda wrote:

Hi All,

I've started working on a new GC for Cog and have a few questions.

1. does anyone have a copy of Joel Bartlett's J. F. Bartlett. A generational, compacting collector for C++. In E. Jul and N.-C. Juul, editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in Object-Oriented Systems, Oct. 1990. ?

2. what's the largest number of classes you've got in your most complex image and/or what's the largest number of classes you're ever created in an image? a.k.a. is 64k classes enough (16 bits), or is 1m classes enough (24 bits)?

3. do you have any personal recommendations, or example code, of systems that combine generation scavenging and/or compaction with object pinning?

TIA,

Eliot
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
  

-- 
Dennis Smith                 		         +1 416.798.7948
Cherniak Software Development Corporation   Fax: +1 416.798.0948
509-2001 Sheppard Avenue East        [hidden email]
Toronto, ON M2J 4Z8              <a class="moz-txt-link-freetext" href="sip:dennis@CherniakSoftware.com">sip:dennis@...
Canada			         http://www.CherniakSoftware.com
Entrance off Yorkland Blvd south of Sheppard Ave east of the DVP

Ralph Johnson

Re: A couple of memory management related questions

In reply to this post by Eliot Miranda-2

If people define classes by hand, I imagine 64k would be enough. But
suppose they are using light-weight classes, i.e. giving each object
its own behavior. Then it would be fairly easy to have more than 64K
of them, though 1M is still a reasonable limit.

What is the difference in cost between choosing 64K as the limit and
choosing 1M? I know, one byte, but what else would you use that byte
for?

-Ralph

Igor Stasenko

Re: A couple of memory management related questions

In reply to this post by Eliot Miranda-2

Hi, Eliot

On 16 June 2010 00:17, Eliot Miranda <[hidden email]> wrote:
> Hi All,
> I've started working on a new GC for Cog and have a few questions.
> 1. does anyone have a copy of Joel Bartlett's J. F. Bartlett. A
> generational, compacting collector for C++. In E. Jul and N.-C. Juul,
> editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in Object-Oriented
> Systems, Oct. 1990. ?

Not this in particular. But about GC, i found a very interesting idea
about run-time
garbage collection.
The idea is simple: at each allocation of N bytes, GC marks N*k bytes
of object memory (where k>1).
So, the mark phase overhead is proportionally distributed among memory
allocations, which makes
a usual allocation a bit slower, but having much less delays in
garbage collection cycles, since
once it marked everything, it just doing a sweep (& compaction).

> 2. what's the largest number of classes you've got in your most complex
> image and/or what's the largest number of classes you're ever created in an
> image? a.k.a. is 64k classes enough (16 bits), or is 1m classes enough (24
> bits)?

I would really like to not have a hard limit for number of classes in VM.
This is for implementing a prototype-based stuff, where each object is
an instance of itself, and hence
it acting as class.
For running a usual smalltalk system, i think the limit could be even
less - like 4095 or 8095 classes.
But i think that VM should not impose a hard limit on the number of classes.

> 3. do you have any personal recommendations, or example code, of systems
> that combine generation scavenging and/or compaction with object pinning?

Never explored this direction :)
How about making everything pinned from start (i.e. use a heap-based
algorithms for allocation)?
Then you don't have to do compaction, and do it only when saving an image.

> TIA,
> Eliot
>

--
Best regards,
Igor Stasenko AKA sig.

ungar

Re: [Vm-dev] A couple of memory management related questions

In reply to this post by Eliot Miranda-2

Can't resist the following unproductive comment: Object pinning? Ugh!!

I realize you probably, after much thought, have decided you cannot avoid pinning, but its consequences scare the you-know-what out of me.

If you supported one class per object, then you could get rid of the identityHash field in the header, by making it an instance method. ;-)

Oh wait a minute, since classes are objects, you would need oops larger than 64-bits each. ;-)

- David

On Jun 15, 2010, at 2:17 PM, Eliot Miranda wrote:

Hi All,

I've started working on a new GC for Cog and have a few questions.

1. does anyone have a copy of Joel Bartlett's J. F. Bartlett. A generational, compacting collector for C++. In E. Jul and N.-C. Juul, editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in Object-Oriented Systems, Oct. 1990. ?

2. what's the largest number of classes you've got in your most complex image and/or what's the largest number of classes you're ever created in an image? a.k.a. is 64k classes enough (16 bits), or is 1m classes enough (24 bits)?

3. do you have any personal recommendations, or example code, of systems that combine generation scavenging and/or compaction with object pinning?

TIA,
Eliot

Yoshiki Ohshima-2

Re: A couple of memory management related questions

In reply to this post by Ralph Johnson

At Tue, 15 Jun 2010 16:44:57 -0500,
Ralph Johnson wrote:
>
> If people define classes by hand, I imagine 64k would be enough. But
> suppose they are using light-weight classes, i.e. giving each object
> its own behavior. Then it would be fairly easy to have more than 64K
> of them, though 1M is still a reasonable limit.
>
> What is the difference in cost between choosing 64K as the limit and
> choosing 1M? I know, one byte, but what else would you use that byte
> for?

If the next highest notch from 16 bit is 24 bit but not 20 bit, that
would be 16M but not 1M.

The big images in "our" use cases is with several million objects
(not that big in other standard), so if each object theoretically had
a distinct class, it would only barely fits. But in reality 16M
classes and an object occupies around 40 byte each on avarage, only 6
instances or such per class on average before hitting 4GB. So, for
32-bit address space in mind, 16M classes would be a lot. 1M classes
would be 100 instances on average (where the distribution of instance
count is highly skewed so average would not make so much sense
however), it still would be okay.

(As Ralph asked, what are the other things that are competing the
bits?)

-- Yoshiki

radoslav hodnicak

Re: A couple of memory management related questions

In reply to this post by Eliot Miranda-2

On Tue, Jun 15, 2010 at 11:17 PM, Eliot Miranda <[hidden email]> wrote:

Hi All,

I've started working on a new GC for Cog and have a few questions.

1. does anyone have a copy of Joel Bartlett's J. F. Bartlett. A generational, compacting collector for C++. In E. Jul and N.-C. Juul, editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in Object-Oriented Systems, Oct. 1990. ?

the internet does - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.8462
(click cached PDF)

Eliot Miranda-2

Re: A couple of memory management related questions

In reply to this post by Ralph Johnson

On Tue, Jun 15, 2010 at 2:44 PM, Ralph Johnson <[hidden email]> wrote:

If people define classes by hand, I imagine 64k would be enough. But
suppose they are using light-weight classes, i.e. giving each object
its own behavior. Then it would be fairly easy to have more than 64K
of them, though 1M is still a reasonable limit.

What is the difference in cost between choosing 64K as the limit and
choosing 1M? I know, one byte, but what else would you use that byte
for?

So the header format is 64 bits. Its not about the bits. Its about speed of access. We need of the order of 8 bits for GC flags (marked, free, forwarded, weak etc). We would like of the order of 8 bits for field size and 8 bits for inst size, with objects >= 255 32-bit fields having an overflow size field. That leaves 5 bytes, 40 bits for identity hash and class index. A symmetric 20-bits each gives us slightly slower class index access but provides for 1m classes (apologies for my math earlier). But a 16-bit class index would be faster, or at least have more compact code, on 86, and allow for identity hashes for non-classes of up to 16m. So if 64k classes is enough then a 16 bit class index field is the way to go. If its tight I would stick with 20/20. VW's 64-bit VM uses exactly this. Its essentially:

8 bits of GC flags (1 bit being pointers vs non-pointers)

8 bits of field size (64 bits per field)

8 bits of inst size for pointer objects/5 unused bits, 3 bits of odd size for non-pointers, so size = 8 * # of fields - odd size

20 bits of class index

20 bits of identity hash

But its organized to put the 20 bits in the least significant half of each word, the inst and field sizes in bytes and the GC flags distributed:

so extracting the class idx, the high frequency operation done on every send, is a 64-bit fetch & mask, which in a 32-bit implementation would be a 32-bit fetch and mask. On 32-bits might be better to do

which would be a 32-bit fetch followed by a 12-bit shift, which is typically less code bytes than the 0xFFFFF mask, and fast now that we can assume a barrel shifter. But that would be incompatible with the 64-bit scheme. So I'll stick with the mask. Its still way shorter than the Cog code for the current compact class scheme.

Clearly with the 20/20 scheme its easy to steal a couple of bits for more GC flags by taking a bit from each 20 bit field. Further, with Squeak, the inst size byte might be pointless (although contexts are still indexable objects with fixed fields) and so I doubt bits are at a premium. Hence the real question is whether I should be whorish and go for the 16-bit compact fast class index access or not.

best

Eliot

-Ralph

Eliot Miranda-2

Re: A couple of memory management related questions

In reply to this post by Eliot Miranda-2

On Tue, Jun 15, 2010 at 2:17 PM, Eliot Miranda <[hidden email]> wrote:

Hi All,

I've started working on a new GC for Cog and have a few questions.

1. does anyone have a copy of Joel Bartlett's J. F. Bartlett. A generational, compacting collector for C++. In E. Jul and N.-C. Juul, editors, OOPSLA/ECOOP ‘90 Workshop on Garbage Collection in Object-Oriented Systems, Oct. 1990. ?

thanks to all 5 of you who've provided this. Ta!

2. what's the largest number of classes you've got in your most complex image and/or what's the largest number of classes you're ever created in an image? a.k.a. is 64k classes enough (16 bits), or is 1m classes enough (24 bits)?

3. do you have any personal recommendations, or example code, of systems that combine generation scavenging and/or compaction with object pinning?

TIA,
Eliot

Andreas.Raab

Re: A couple of memory management related questions

In reply to this post by Eliot Miranda-2

On 6/15/2010 3:32 PM, Eliot Miranda wrote:
> Clearly with the 20/20 scheme its easy to steal a couple of bits for
> more GC flags by taking a bit from each 20 bit field. Further, with
> Squeak, the inst size byte might be pointless (although contexts are
> still indexable objects with fixed fields) and so I doubt bits are at a
> premium. Hence the real question is whether I should be whorish and go
> for the 16-bit compact fast class index access or not.

My $.02: Start with 16 bits and declare the free bits "reserved". When
someone complains for the first time, review and decide how many more
bits to devote. Rationale #1: Rather than taking away resources when you
notice you need an extra header bit, start with the smaller number but
keep some room for growth if it becomes necessary. Rationale #2: YAGNI
(i.e., we really don't know if we'll need more header bits or if we need
more room for classes).

Cheers,
- Andreas

Eliot Miranda-2

Re: A couple of memory management related questions

On Tue, Jun 15, 2010 at 3:53 PM, Andreas Raab <[hidden email]> wrote:

On 6/15/2010 3:32 PM, Eliot Miranda wrote:

Clearly with the 20/20 scheme its easy to steal a couple of bits for
more GC flags by taking a bit from each 20 bit field. Further, with
Squeak, the inst size byte might be pointless (although contexts are
still indexable objects with fixed fields) and so I doubt bits are at a
premium. Hence the real question is whether I should be whorish and go
for the 16-bit compact fast class index access or not.

My $.02: Start with 16 bits and declare the free bits "reserved". When someone complains for the first time, review and decide how many more bits to devote. Rationale #1: Rather than taking away resources when you notice you need an extra header bit, start with the smaller number but keep some room for growth if it becomes necessary. Rationale #2: YAGNI (i.e., we really don't know if we'll need more header bits or if we need more room for classes).

Excellent suggestion.

Cheers,
- Andreas

Joachim Geidel

Re: A couple of memory management related questions

In reply to this post by Igor Stasenko

Am 15.06.10 23:55 schrieb Igor Stasenko:
> For running a usual smalltalk system, i think the limit could be even
> less - like 4095 or 8095 classes.

I disagree. I have seen several systems which have between 10.000 and 20.000
classes, and over on the VWNC list, other people reported the same order of
magnitude.

Cheers,
Joachim Geidel

Joerg Beekmann, DeepCove Labs (YVR)

RE: A couple of memory management related questions

In reply to this post by Eliot Miranda-2

Hello Eliot

Our current system has 14787 classes (including meta classes), I can’t imagine I’ve ever worked on an image with more than twice that number written by hand. I did once work on a classification system that created classes programmatically. We probably had over 64K classes but less than 1m.

Joerg

From: [hidden email] [mailto:[hidden email]] On Behalf Of Eliot Miranda
Sent: Tuesday, June 15, 2010 2:17 PM
To: The general-purpose Squeak developers list; Pharo Development; vwnc NC; Squeak Virtual Machine Development Discussion
Subject: [squeak-dev] A couple of memory management related questions

Hi All,

I've started working on a new GC for Cog and have a few questions.

3. do you have any personal recommendations, or example code, of systems that combine generation scavenging and/or compaction with object pinning?

TIA,

Eliot