[squeak-dev] Using class indexes instead of reference in oop header (in COG)

Igor Stasenko
Hi all,

I wanted to ask Eliot about some details of how he is planning to use the new
object format in Cog.
I think it may be interesting to others, so I am posting it here.

At ESUG, Eliot mentioned that the new object format will be 64-bit
(regardless of platform), and that he will use a class index (pointing to
an entry in a global class table) in the header instead of a direct class
oop.
Currently, the Squeak VM uses the same principle for so-called compact
classes: the oop header stores an index into the compact classes
array, which allows the smallest possible oop header (32
bits).

So, in Cog, to determine an object's class, the VM first reads an index from
the object's header and then reads the class oop from the global class table.
But you need the reverse operation as well: translating a class oop into an index.
Every time you do #basicNew or #primitiveChangeClass, you need
to determine a class's index from its oop value.
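
To make the two directions concrete, here is a minimal sketch. It is only an illustrative model, not Cog's actual object format: `Obj`, `class_table`, and the function names are hypothetical, and classes are stood in for by strings. The forward lookup is a single table read, while the reverse lookup, done naively, is a linear scan, which is why the question matters.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    class_index: int   # stands in for the class-index field of the object header

# Hypothetical global class table: index -> class (strings stand in for class oops).
class_table = ["Object", "Array", "String"]

def class_of(obj):
    """Forward direction: one read of the table."""
    return class_table[obj.class_index]

def class_index_of(cls):
    """Reverse direction, done naively: a linear scan of the table."""
    return class_table.index(cls)

def basic_new(cls):
    """Instantiation has to stamp the class's index into the new header."""
    return Obj(class_index=class_index_of(cls))
```
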

So, the question is how this will be done?

Also, some classes may die, and obviously the VM needs to compact
the global class table at some point, to avoid maintaining a big
table in which only a few entries are used.
Suppose I create 1000+1 classes, so each class gets its own index in the table.
Then the first 1000 classes are gone (when the user uninstalls something).
Now you have 1000 free entries in the table. But the last one is still in
use, and making the table more compact would require
visiting each instance of that class and setting a new index value.


--
Best regards,
Igor Stasenko AKA sig.


Re: [squeak-dev] Using class indexes instead of reference in oop header (in COG)

Eliot Miranda-2


On Sun, Aug 31, 2008 at 7:37 AM, Igor Stasenko <[hidden email]> wrote:
> Hi all,
>
> I wanted to ask Eliot about some details of how he is planning to use the new
> object format in Cog.
> I think it may be interesting to others, so I am posting it here.
>
> At ESUG, Eliot mentioned that the new object format will be 64-bit
> (regardless of platform), and that he will use a class index (pointing to
> an entry in a global class table) in the header instead of a direct class
> oop.
> Currently, the Squeak VM uses the same principle for so-called compact
> classes: the oop header stores an index into the compact classes
> array, which allows the smallest possible oop header (32
> bits).
>
> So, in Cog, to determine an object's class, the VM first reads an index from
> the object's header and then reads the class oop from the global class table.
> But you need the reverse operation as well: translating a class oop into an index.
> Every time you do #basicNew or #primitiveChangeClass, you need
> to determine a class's index from its oop value.
>
> So, the question is how this will be done?

In VisualWorks the identity hash of an object is determined lazily when doing one of the following:

- sending a message to an object.  The object's class and the message's selector are queried for their identity hash.  If either object does not have one, it is assigned.  So the class object is identified from context; it is the class of the receiver.

- sending identityHash to a class.  In VisualWorks classes have a different version of the identityHash primitive.

In either case the class is assigned a hash that is the next unused slot in the class table, and entered into the table at that index.
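
The lazy scheme described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not VisualWorks code: `Klass`, `ClassTable`, and `index_for` are hypothetical names, and "the next unused slot" is modelled as simply the end of a growing list (slot reuse is ignored here).

```python
class Klass:
    """Stand-in for a class object; identity_hash is None until first needed."""
    def __init__(self, name):
        self.name = name
        self.identity_hash = None

class ClassTable:
    def __init__(self):
        self.slots = []   # index -> class

    def index_for(self, cls):
        # Assign lazily: the class gets a hash only when something asks for it.
        # The hash doubles as the class's index in the table.
        if cls.identity_hash is None:
            cls.identity_hash = len(self.slots)
            self.slots.append(cls)
        return cls.identity_hash

table = ClassTable()
point = Klass("Point")
rect = Klass("Rectangle")
```

With this arrangement the reverse mapping from class to index is just a read of the class's own hash field, so no search of the table is needed.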

In Squeak we assign hashes when objects are instantiated.  So we would need a special version of the new primitive to create instances of classes.  Metaclass would have to do something like implement new and basicNew with a primitive that assigns the identity hash and enters the class into the class table, or override new and basicNew with a primitive that assigns the hash and enters the class into the table after instantiation.
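
For contrast, the eager variant might look like the sketch below, where the hash is assigned at the moment the class is created. The `Metaclass` and `ClassObject` classes here are hypothetical Python stand-ins, not Squeak's real Metaclass or its primitives, and the table is again a plain growing list.

```python
class_table = []   # hypothetical global table: index -> class object

class ClassObject:
    def __init__(self, name):
        self.name = name
        self.identity_hash = None

class Metaclass:
    """Stand-in for a metaclass whose `new` creates a class."""
    def new(self, name):
        cls = ClassObject(name)
        # Eager: assign the hash and enter the class into the table
        # as part of instantiation, before anyone can ask for it.
        cls.identity_hash = len(class_table)
        class_table.append(cls)
        return cls
```
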


> Also, some classes may die, and obviously the VM needs to compact
> the global class table at some point, to avoid maintaining a big
> table in which only a few entries are used.
> Suppose I create 1000+1 classes, so each class gets its own index in the table.
> Then the first 1000 classes are gone (when the user uninstalls something).
> Now you have 1000 free entries in the table. But the last one is still in
> use, and making the table more compact would require
> visiting each instance of that class and setting a new index value.


With either the VW lazy or the Squeak eager approach, the class table is weak (a strong Array of WeakArray pages), so the indexes/hashes of garbage-collected classes can be reused.  In the VW VM I maintained an index to the first unused slot.  Whenever a class is reclaimed, its page in the class table is finalized.  The VM notices this and sets the first-unused-slot index to the minimum of the newly reclaimed slot and its current value.  This way the class table stays compact and doesn't grow unnecessarily large.
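
A rough model of that bookkeeping is sketched below. Several things are assumptions made for illustration: the page size, the names `ClassTable`, `add`, and `reclaimed`, and the use of plain lists where the real table uses WeakArray pages. Garbage collection is not modelled; the finalization notice is simulated by calling `reclaimed` explicitly.

```python
PAGE_SIZE = 4   # tiny pages for illustration only

class ClassTable:
    def __init__(self):
        self.pages = []       # strong list of pages; each page stands in for a WeakArray
        self.first_free = 0   # no unused slot exists below this index

    def add(self, cls):
        """Enter cls at the lowest unused slot and return its index."""
        index = self.first_free
        while True:
            page, offset = divmod(index, PAGE_SIZE)
            if page == len(self.pages):
                self.pages.append([None] * PAGE_SIZE)   # grow by one page
            if self.pages[page][offset] is None:
                break
            index += 1
        self.pages[page][offset] = cls
        self.first_free = index + 1
        return index

    def reclaimed(self, index):
        """Simulates the VM noticing that the class at index was collected."""
        page, offset = divmod(index, PAGE_SIZE)
        self.pages[page][offset] = None
        self.first_free = min(self.first_free, index)
```

Because freed slots are refilled before the table grows, surviving classes keep their indexes and no instance headers have to be rewritten.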

One nice thing I didn't mention is that one can use "class index puns".  There is nothing to prevent the VM from entering a class into the class table more than once.  So I entered WeakArray twice into the class table, and used one entry as a hidden class index for the pages of the class table itself.  When one does WeakArray allInstances, therefore, the class table pages are not found.  Further, the finalization machinery can easily identify a class table page, because it has a unique class index, and so the maintenance of the first unused class table slot index is very cheap.
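
The pun can be illustrated with a toy heap. Everything here is hypothetical: the index values 10 and 11, the dict-based objects, and the assumption that instance enumeration works by comparing the class index stored in each header against the class's public index.

```python
WEAK_ARRAY = "WeakArray"
WEAK_ARRAY_INDEX = 10          # hypothetical public index of WeakArray
CLASS_TABLE_PAGE_INDEX = 11    # hypothetical hidden second index for the same class

# The same class is entered twice, under two different indexes.
class_table = {
    WEAK_ARRAY_INDEX: WEAK_ARRAY,
    CLASS_TABLE_PAGE_INDEX: WEAK_ARRAY,
}

heap = [
    {"class_index": WEAK_ARRAY_INDEX, "name": "user weak array"},
    {"class_index": CLASS_TABLE_PAGE_INDEX, "name": "class table page 0"},
]

def class_of(obj):
    return class_table[obj["class_index"]]

def all_instances(public_index):
    # Enumerates by header index, so objects carrying the hidden pun are skipped.
    return [o for o in heap if o["class_index"] == public_index]

def is_class_table_page(obj):
    # The finalization machinery only needs a single integer comparison.
    return obj["class_index"] == CLASS_TABLE_PAGE_INDEX
```
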


Cheers!

Eliot



Re: [squeak-dev] Using class indexes instead of reference in oop header (in COG)

Kjell Godo
Instead of grouping methods into classes, what happens if you
group classes into methods instead?
 
In picoLARC all the methods that have the same name are
grouped together into something I call a generic function.
 
Message dispatch works like this:
 
The generic function is looked up and the classIndex of the receiver
is looked up in the message send.
 
The generic function has a dictionary of < classIndex method >
pairs.  The classIndex is looked up to return the method which
is then evaluated.
 
When you do it this way then methods that are implemented in
just one leaf class have a generic function that has only one
pair in it so no lookup is needed.  A leaf class has no subclasses.
 
Like you say there is an Array of Classes and when a method is
implemented in the Object Class then the dictionary in the
generic function for that method can be replaced by the Array
of Classes so the dictionary lookup becomes an Array lookup
in that generic function.
 
So doing it this way you have a speed up in the message
dispatch when the method is unique in a leaf Class or when
the method has an implementation in the Object Class.
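
The dispatch scheme described above could be sketched like this. It is not picoLARC code: `GenericFunction`, `add_method`, `send`, and the class indexes are hypothetical, and the single-entry shortcut assumes the receiver really is an instance of the one implementing class.

```python
class GenericFunction:
    """All methods that share one name, keyed by the receiver's class index."""
    def __init__(self, selector):
        self.selector = selector
        self.methods = {}   # classIndex -> method

    def add_method(self, class_index, method):
        self.methods[class_index] = method

    def send(self, receiver_class_index, *args):
        if len(self.methods) == 1:
            # Only one implementation exists, so the dictionary lookup is skipped.
            # (Assumes the receiver belongs to that one class.)
            method = next(iter(self.methods.values()))
        else:
            method = self.methods[receiver_class_index]
        return method(*args)

POINT, RECT = 0, 1   # hypothetical class indexes

area = GenericFunction("area")
area.add_method(RECT, lambda w, h: w * h)

describe = GenericFunction("describe")
describe.add_method(POINT, lambda: "a Point")
describe.add_method(RECT, lambda: "a Rectangle")
```

For a method implemented in Object, where every class index has an entry, the dictionary could be swapped for a list indexed directly by class index, which is the Array case mentioned above.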

On Mon, Sep 1, 2008 at 1:25 PM, Eliot Miranda <[hidden email]> wrote:
> [...]


Re: [squeak-dev] Using class indexes instead of reference in oop header (in COG)

Michael Haupt-3
Hi Kjell,

On Sun, Sep 7, 2008 at 4:32 AM, Kjell Godo <[hidden email]> wrote:
> Instead of grouping methods into classes, what happens if you
> group classes into methods instead?
>
> In picoLARC all the methods that have the same name are
> grouped together into something I call a generic function.

so did lots of people before you, I believe. Ever heard of CLOS? ;-)

scnr,

Michael