Understanding the object layout.

Mathieu Suen-2

Understanding the object layout.

Hi All,

I am trying to understand how objects are laid out in memory.
I found some information in libgst/gst.h.

So, if I understand correctly, the main structures are
object_s and oop_s.

object_s contains the size and class pointer plus the object's instance variables.
oop_s contains flags (for the GC, I guess) plus the pointer to an object_s.

My first question is: why two structures?

My second question concerns the object_s struct: why will there always be at least one data element?
OOP data[1]; /* variable length, may not be objects,
but will always be at least this
big. */

In other words: what would data contain if the instance is of a class with no instance variables?

Finally, I am wondering why objSize has type OOP. Is it because it is a real pointer to a SmallInteger?

I might not be guessing right.
Thanks for your answers.

Mth

_______________________________________________
help-smalltalk mailing list
[hidden email]
http://lists.gnu.org/mailman/listinfo/help-smalltalk

Paolo Bonzini-2

Re: Understanding the object layout.

> So, if I understand correctly, the main structures are
> object_s and oop_s.
>
> object_s contains the size and class pointer plus the object's instance variables.
> oop_s contains flags (for the GC, I guess) plus the pointer to an object_s.

Correct.

> My first question is: why two structures?

Because it allows us to move the object data without rewriting all the
pointers in the OOPs. The only case in which we have to scan all
the OOPs is for one-way become.
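
A minimal sketch of the indirection described here, not the actual gst.h
definitions. Only object_s, oop_s, objSize, data and OOP are names from
this thread; objClass, object, flags and move_body are assumptions made
for illustration. Other objects store the OOP (the handle), so moving a
body means updating a single pointer inside its oop_s.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct oop_s *OOP;

/* Object body: a two-word header followed by the instance variables. */
struct object_s {
    OOP objSize;   /* size word (the real encoding lives in gst.h) */
    OOP objClass;  /* assumed name for the class pointer */
    OOP data[1];   /* variable-length payload */
};

/* Fixed-size handle: everything else points here, never at the body. */
struct oop_s {
    struct object_s *object;  /* assumed name: current address of the body */
    unsigned long flags;      /* assumed name: GC bookkeeping bits */
};

/* Hypothetical helper: copy a body of `bytes` bytes to `dest` and update
   the one handle that refers to it. Objects that reference this one hold
   the OOP, so none of their fields need rewriting. */
static void move_body(OOP oop, void *dest, size_t bytes)
{
    assert(oop != NULL && dest != NULL);
    memmove(dest, oop->object, bytes);
    oop->object = (struct object_s *) dest;
}
```

After move_body(oop, new_place, size), every field elsewhere that stored
oop still reaches the object through oop->object.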

> My second question concerns the object_s struct: why will there always be at least one data element?
> OOP data[1]; /* variable length, may not be objects,
> but will always be at least this
> big. */

Actually not true. It's a common C idiom to declare data[1] when
there could be zero items.

However, the comment is not false, it is simply poorly written. It
means that even if you have bytes, shorts, or any kind of instance
variable other than pointers, the allocated size is always a multiple
of sizeof(OOP).

> In other words: what would data contain if the instance is of a class with no instance variables?

Only two words.
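
The two points above (the data[1] idiom and the rounding to a multiple
of sizeof(OOP)) can be sketched as follows. This is an illustration
under assumptions, not code from libgst: objClass, header_bytes and
object_bytes are hypothetical names. With the idiom, the header size is
the offset of data, so an object with no payload occupies exactly the
two header words.

```c
#include <assert.h>
#include <stddef.h>

typedef struct oop_s *OOP;  /* opaque here; only its size matters */

struct object_s {
    OOP objSize;
    OOP objClass;   /* assumed field name */
    OOP data[1];    /* pre-C99 spelling of a variable-length tail */
};

/* Bytes occupied by the header alone, i.e. by an object with no payload. */
static size_t header_bytes(void)
{
    return offsetof(struct object_s, data);
}

/* Total bytes for an object carrying `payload_bytes` bytes of raw data,
   with the payload rounded up to a whole number of OOP-sized words. */
static size_t object_bytes(size_t payload_bytes)
{
    size_t word = sizeof(OOP);
    size_t rounded = (payload_bytes + word - 1) / word * word;
    assert(rounded >= payload_bytes);  /* guard against overflow */
    return header_bytes() + rounded;
}
```

For example, object_bytes(0) is two words, while a one-byte payload
already costs a third full word.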

> Finally, I am wondering why objSize has type OOP. Is it because it is a real pointer to a SmallInteger?

Yes. It slightly simplifies some code (I think it was the Cheney
breadth-first scan during newspace garbage collection -
http://en.wikipedia.org/wiki/Cheney's_algorithm).
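
One plausible reading of why this helps, sketched under assumptions: if
SmallIntegers are stored as pointer-sized words with a tag bit, a
scanner can treat every word of an object uniformly and simply not
follow the tagged ones. The low-bit tag used below, and the names
from_size, to_size, is_int and count_followed, are assumptions for
illustration and are not taken from gst.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct oop_s {
    void *object;
    unsigned long flags;
} *OOP;

/* Assumed encoding: SmallIntegers have the low bit set. Real OOPs point
   at aligned oop_s structures, so their low bit is clear. */
static OOP from_size(uintptr_t n)
{
    assert(n <= (UINTPTR_MAX >> 1));  /* must fit once shifted */
    return (OOP) ((n << 1) | 1u);
}

static uintptr_t to_size(OOP oop)
{
    return (uintptr_t) oop >> 1;
}

static int is_int(OOP oop)
{
    return ((uintptr_t) oop & 1u) != 0;
}

/* Count the words in words[0..n) that a scanner would have to follow.
   A tagged size word needs no special case: it is skipped like any
   other SmallInteger. */
static size_t count_followed(const OOP *words, size_t n)
{
    size_t followed = 0;
    for (size_t i = 0; i < n; i++)
        if (words[i] != NULL && !is_int(words[i]))
            followed++;
    return followed;
}
```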

Paolo


Mathieu Suen-2

Re: Understanding the object layout.

On May 4, 2010, at 8:04 AM, Paolo Bonzini wrote:

>> So, if I understand correctly, the main structures are
>> object_s and oop_s.
>>
>> object_s contains the size and class pointer plus the object's instance variables.
>> oop_s contains flags (for the GC, I guess) plus the pointer to an object_s.
>
> Correct.
>
>> My first question is: why two structures?
>
> Because it allows us to move the object data without rewriting all the
> pointers in the OOPs. The only case in which we have to scan all
> the OOPs is for one-way become.

OK, so I guess that makes the GC compactor faster. But it also implies that
the oop_s structures are not compacted; only the object_s data is.

Are the oop_s and object_s structures in the same heap space?

>
>> My second question concerns the object_s struct: why will there always be at least one data element?
>> OOP data[1]; /* variable length, may not be objects,
>> but will always be at least this
>> big. */
>
> Actually not true. It's a common C idiom to declare data[1] when
> there could be zero items.
>
> However, the comment is not false, it is simply poorly written. It
> means that even if you have bytes, shorts, or any kind of instance
> variable other than pointers, the allocated size is always a multiple
> of sizeof(OOP).

OK, the C confused me, but it is true that you never allocate an object by hand, so the 1 doesn't mean anything. :)

Do you keep null-terminated strings?

As far as I remember, you have a way in the object declaration syntax (using a pragma) to declare the underlying object structure.

>
>> In other words: what would data contain if the instance is of a class with no instance variables?
>
> Only two words.
>
>> Finally, I am wondering why objSize has type OOP. Is it because it is a real pointer to a SmallInteger?
>
> Yes. It slightly simplifies some code (I think it was the Cheney
> breadth-first scan during newspace garbage collection -
> http://en.wikipedia.org/wiki/Cheney's_algorithm).

OK, thanks for your help.
I will try to understand how the GC is implemented.
It could also be interesting to discuss how to implement the copy-on-write feature.
I guess it might have some impact on the object layout.

>
> Paolo


Paolo Bonzini-2

Re: Understanding the object layout.

On 05/04/2010 07:52 PM, Mathieu Suen wrote:
>> Because it allows us to move the object data without rewriting all
>> the pointers in the OOPs. The only case in which we have to scan
>> all the OOPs is for one-way become.
>
> OK, so I guess that makes the GC compactor faster. But it also implies that
> the oop_s structures are not compacted; only the object_s data is.

All oop_s structures are the same size, so it's easy to reuse them with
a simple freelist without leaving space in the heap.
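
A minimal sketch of such a free list, assuming a fixed table of
handles. The union, TABLE_SIZE, table_init, oop_alloc and oop_free are
hypothetical names chosen for illustration; the real OOP table in
libgst is organized differently in its details. Because every slot has
the same size, a freed slot can be handed out again as is, so the table
never needs compacting.

```c
#include <assert.h>
#include <stddef.h>

struct object_s;  /* body layout is irrelevant here */

typedef struct oop_s {
    union {
        struct object_s *object;  /* while in use: address of the body */
        struct oop_s *next_free;  /* while free: link to the next free slot */
    } u;
    unsigned long flags;
} *OOP;

#define TABLE_SIZE 8  /* hypothetical, tiny for illustration */

static struct oop_s table[TABLE_SIZE];
static OOP free_list = NULL;

/* Thread every slot onto the free list, lowest index first. */
static void table_init(void)
{
    free_list = NULL;
    for (size_t i = TABLE_SIZE; i-- > 0; ) {
        table[i].u.next_free = free_list;
        table[i].flags = 0;
        free_list = &table[i];
    }
}

/* Pop a slot; returns NULL when the table is exhausted. */
static OOP oop_alloc(struct object_s *body)
{
    OOP oop = free_list;
    if (oop == NULL)
        return NULL;
    free_list = oop->u.next_free;
    oop->u.object = body;
    return oop;
}

/* Push a slot back; it will be the next one handed out. */
static void oop_free(OOP oop)
{
    assert(oop >= table && oop < table + TABLE_SIZE);
    oop->u.next_free = free_list;
    free_list = oop;
}
```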

> Are the oop_s and object_s structure in the same heap space?

No.

> Do you keep null-terminated strings?

No, only when needed (see _gst_to_cstring in libgst/dict.c).
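
The idea of terminating only on demand can be sketched like this. The
helper below is hypothetical and is not the real _gst_to_cstring, whose
signature is not shown in this thread; it only illustrates copying a
counted, unterminated byte sequence into a freshly allocated C string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `len` unterminated bytes into a new buffer
   and append the terminating NUL. The caller owns the result and must
   free() it. Returns NULL if allocation fails. */
static char *bytes_to_cstring(const char *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    char *result = malloc(len + 1);
    if (result == NULL)
        return NULL;
    if (len > 0)
        memcpy(result, bytes, len);
    result[len] = '\0';
    return result;
}
```

The stored string stays exactly len bytes long; the extra byte exists
only in the temporary copy handed to C code.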

> OK, thanks for your help. I will try to understand how the GC is
> implemented. It could also be interesting to discuss how to
> implement the copy-on-write feature. I guess it might have some
> impact on the object layout.

Maybe, or maybe not. I wonder if it would be enough to create another
subclass of CharacterArray and add some flags to be checked by the
#at:put: primitives. And of course some creative use of #become:. :-)

But a warning is needed: it is a hard feature. I'm sure you could
manage, but there is also other work that would be useful and a
gentler start. For example:

1) making a binary format for Smalltalk source code that already
includes bytecode etc. and teaching the VM to load it. Between the
Smalltalk compiler and gst-convert, there would be a lot of existing
code to leverage. Maybe it fits your expertise better too?

2) making an Objective-C bridge; there is Squeak code out there to take
inspiration from. Bonus points for making it work with both the Apple
and GNU runtimes.

3) think of what _you_ would need if you were writing scripts in
Smalltalk (bindings, whatever) and do that. This will hopefully provide
insights into both GST and the VM that you can use for other projects.

Paolo
