Hi, this is another simple (I hope) question: I have an instance of MethodDictionary, which is defined as

Dictionary variableSubclass: #MethodDictionary
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Kernel-Methods'

Looking at the format field, it says 3, which I understand to mean that instances have both fixed and indexed oops. The question is, what does that mean? Or better, why is it that way if the class only defines an array and a tally as instance variables?
Also, looking at the header, I see some strange things in the size field (it is bigger than what I'd expect). What is the format of the size field in this case? Thanks!
Javier. -- Javier Pimás Ciudad de Buenos Aires |
On Wed, Jun 8, 2011 at 5:35 PM, Javier Pimás <[hidden email]> wrote:
I think its own class comment says it all:

"I'm a special dictionary holding methods. I am just like a normal Dictionary, except that I am implemented differently. Each Class has an instance of MethodDictionary to hold the correspondence between selectors (names of methods) and methods themselves. In a normal Dictionary, the instance variable 'array' holds an array of Associations. Since there are thousands of methods in the system, these Associations waste space. Each MethodDictionary is a variable object, with the list of keys (selector Symbols) in the variable part of the instance. The variable 'array' holds the values, which are CompiledMethods."

If not, then ask what part is still missing ;) It is probably too basic for you, but I once wrote this: http://marianopeck.wordpress.com/2011/05/07/class-formats-and-compiledmethod-uniqueness/
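To make the class comment concrete, here is a minimal Python sketch (not Squeak code) of the layout it describes: the keys sit in the indexed part of the dictionary object itself, and the value for a key sits at the same index in the separate 'array'. The name MethodDictSketch, the use of Python's hash and the linear probing are assumptions made for illustration; the real hashing and growth logic are in the image.

```python
class MethodDictSketch:
    """Toy model: named part = tally + array, indexed part = keys."""

    def __init__(self, capacity=32):
        self.tally = 0                    # named inst var: number of entries
        self.array = [None] * capacity    # named inst var: values, a separate object
        self.keys = [None] * capacity     # stands in for the object's own indexed slots

    def _scan(self, selector):
        # Linear probing from the hash position (assumed, not taken from the image).
        size = len(self.keys)
        start = hash(selector) % size
        for step in range(size):
            i = (start + step) % size
            if self.keys[i] is None or self.keys[i] == selector:
                return i
        raise RuntimeError("table full")

    def at_put(self, selector, method):
        i = self._scan(selector)
        if self.keys[i] is None:
            self.tally += 1
        self.keys[i] = selector           # key goes into the indexed part
        self.array[i] = method            # value goes into 'array' at the same index
        return method

    def at(self, selector):
        i = self._scan(selector)
        if self.keys[i] is None:
            raise KeyError(selector)
        return self.array[i]
```

No Association objects are created: one entry costs one key slot plus one value slot.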
-- Mariano http://marianopeck.wordpress.com |
In reply to this post by melkyades
On Wed, Jun 8, 2011 at 8:35 AM, Javier Pimás <[hidden email]> wrote:
An object with both named and indexed inst vars is flat, i.e. it is only a single object; MethodContext is an example. An object with an array to hold its variable part, e.g. the current OrderedCollection, is not flat, i.e. it is two objects. The distinction has to do with the efficiency of implementing become. In the original Smalltalk-80 implementations and in the VisualWorks VM, objects in the heap are split into a header and a body, with the header containing a pointer to the body (and references to objects are pointers to object headers). This makes certain algorithms like compaction easy to implement, but it also results in a cheap become, since become only has to swap the body pointers in the two headers.
When Squeak was implemented it was decided to use flat objects in the VM, following the lead of David Ungar's Berkeley Smalltalk implementation, and of the subsequent Self implementations, all of which also use flat objects. Flat objects make for faster allocation and faster inst var access (since accessing an inst var doesn't require the double indirection of following the pointer to the header and then the pointer to the body). But it makes become very much more expensive, since in the worst case the VM must scan the entire heap looking for references to the objects in the become operation and replacing them by references to their corresponding objects. The solution David Ungar developed, which was adopted by Squeak, was to unflatten objects that used become to grow, such as OrderedCollection, Set and Dictionary, and use an array to hold their variable part.
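The cost difference can be sketched with two toy heaps in Python. This is an illustration under simplifying assumptions, not VM code: Ref, IndirectHeap and FlatHeap are hypothetical names, and only references stored in object fields are modelled (no roots, no GC).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A reference to an object, identified by its position in the heap."""
    target: int


class IndirectHeap:
    """Header/body split: a Ref names a header, the header points at the body."""

    def __init__(self):
        self.bodies = []                  # index = header, element = body

    def allocate(self, fields):
        self.bodies.append(list(fields))
        return Ref(len(self.bodies) - 1)

    def fields(self, ref):
        return self.bodies[ref.target]    # one extra indirection per access

    def become(self, a, b):
        # Swap the two body pointers; no other object is touched.
        i, j = a.target, b.target
        self.bodies[i], self.bodies[j] = self.bodies[j], self.bodies[i]
        return 2                          # headers touched


class FlatHeap:
    """Flat objects: a Ref names the object itself."""

    def __init__(self):
        self.objects = []

    def allocate(self, fields):
        self.objects.append(list(fields))
        return Ref(len(self.objects) - 1)

    def fields(self, ref):
        return self.objects[ref.target]

    def become(self, a, b):
        # Worst case: visit every field of every object and swap references.
        visited = 0
        for obj in self.objects:
            for k, value in enumerate(obj):
                visited += 1
                if value == a:
                    obj[k] = b
                elif value == b:
                    obj[k] = a
        return visited
```

In the indirect heap the work is constant; in the flat heap it grows with the total number of fields in the heap.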
A key point is that objects such as OrderedCollection, Set and Dictionary encapsulate their state and so are free to grow by allocating a larger array and copying the contents from the old to the new array. There is still an issue with streams, which have also been changed not to use become when growing their collections. It used to be the case that one could use streams to grow objects, since in Smalltalk-80 they used become. But this was not used very often and was easily worked around.
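A minimal Python sketch of growing without become, assuming a doubling policy (the class name GrowableCollection and the growth factor are illustrative choices, not taken from the Squeak sources):

```python
class GrowableCollection:
    """Collection that hides its backing array, so it can replace it freely."""

    def __init__(self, capacity=4):
        self.array = [None] * capacity    # private backing array
        self.size = 0

    def add(self, item):
        if self.size == len(self.array):
            self._grow()
        self.array[self.size] = item
        self.size += 1
        return item

    def _grow(self):
        # Allocate a larger array and copy the old contents across.
        bigger = [None] * (len(self.array) * 2)
        bigger[:self.size] = self.array[:self.size]
        # Only this object's own slot is updated; nobody else holds the old
        # array, so no heap-wide reference swap (become) is needed.
        self.array = bigger
```

Clients keep referring to the collection object, whose identity never changes; only the hidden array is replaced.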
HTH Eliot
|
Thanks to both Mariano and you for the quick answers, they helped a lot. Now let's get our hands dirty: I want to understand the internals of these tiny animals a bit better. I have a MethodDictionary with tally=1 and an array of size 32, filled with all nils except for the last position. Looking at it with gdb, including the header, I get

0x78c68dd4: 0x779cadf5 0x1848038d 0x00000003 0x78c68e64
0x78c68de4: 0x77831004 0x77831004 0x77831004 0x77831004
0x78c68df4: 0x77831004 0x77831004 0x77831004 0x77831004
0x78c68e04: 0x77831004 0x77831004 0x77831004 0x77831004
0x78c68e14: 0x77831004 0x77831004 0x77831004 0x77831004
0x78c68e24: 0x77831004 0x77831004 0x77831004 0x77831004
0x78c68e34: 0x77831004 0x77831004 0x77831004 0x77831004
0x78c68e44: 0x77831004 0x77831004 0x77831004 0x77831004
0x78c68e54: 0x77831004 0x77831004 0x77831004 0x778cb5d4
0x78c68e64: 0x00da3287 0x77831004 0x77831004 0x77831004

where the second int (0x1848038d) is the base header (the oop points to 0x78c68dd8). The first field is a smallint for 1, I guess. Now the questions: in the header, the size bits are 100011, why? Also, the second field would be the array oop if the object weren't flat, but in this case it seems to point to the position just past the last variable field; am I guessing right? How does the VM manage this flattening of the instance vars?
Thanks! Javier. On Wed, Jun 8, 2011 at 2:01 PM, Eliot Miranda <[hidden email]> wrote:
-- Javier Pimás Ciudad de Buenos Aires |
Hi Javier, On Wed, Jun 8, 2011 at 6:41 PM, Javier Pimás <[hidden email]> wrote:
Instead of using raw gdb, you could try using gdb plus the debug routines included in the VM (or use the VM simulator). e.g. set a breakpoint in interpret, run your favourite image (in this case an updated trunk squeak 4.2) and hence examine it immediately after loading:
Breakpoint 1, interpret () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:1930
1930 JUMP_TABLE;

find nil:
(gdb) call printOop(nilObj)
0x141fc004: a(n) UndefinedObject

find its class UndefinedObject:
(gdb) call printOop(fetchClassOf(nilObj))
0x146f0c34: a(n) UndefinedObject class
0x147ff778 0x14d5cdd8 0x5 0x141fc004 0x14255af0 0x141fc004 0x1438e6e0 0x14696580 0x141fc004 0x141fc004
0x1438e0cc

print its method dictionary, the second inst var, 0x14d5cdd8:
(gdb) call printOop(0x14d5cdd8)
0x14d5cdd8: a(n) MethodDictionary
0x55 0x14d5ceec 0x141fc004 0x141fc004 0x141fc004
0x1438fdac 0x141fc004 0x141fc004 0x141fc004 0x141fc004 0x141fc004 0x141fc004 0x141fc004 0x143a1448 0x141fc004
0x141fc004 0x141fc004 0x143c8290 0x141fc004 0x14386320 0x1511d0ec 0x1438fc90 0x141fc004 0x14386a70 0x1439d52c
0x143b53c8 0x1438db24 0x1483bc30 0x1439d53c 0x143a07c4 0x143c9634 0x143e6ca4 0x144095e4 0x14386730 0x14390080
0x1438ff04 0x14424624 0x14402638 0x1511d004 0x143869c4 0x14386530 0x1439f5d4 0x1438ffb4 0x143864b8 0x14386a08
0x143c20d4 0x14386a18 0x1438647c 0x143868cc 0x141fc004 0x1438f800 0x143aba20 0x141fc004 0x141fc004 0x14424608
0x1438b2f0 0x1438d1b0 0x144095fc 0x141fc004 0x141fc004 0x141fc004 0x143a4ce4 0x143c44e4 0x143c82a4

and hex 55 is 2 * 42 + 1, so the dictionary has 42 entries (how appropriate) and, by counting, 64 slots.
...
(gdb) call printOop(0x14d5ceec)
0x14d5ceec: a(n) Array
0x141fc004 0x141fc004 0x141fc004 0x14661cdc 0x141fc004
0x141fc004 0x141fc004 0x141fc004 0x141fc004 0x141fc004 0x141fc004 0x14661cf4 0x141fc004 0x141fc004 0x141fc004
0x14661d0c 0x141fc004 0x1486564c 0x1513faa0 0x14661d28 0x141fc004 0x14661d4c 0x14661d70 0x14661d88 0x14661da0
0x1483bcb4 0x14661dc0 0x14661dd8 0x14661df0 0x14661e28 0x14661e40 0x14661e58 0x14661e6c 0x14661ea4 0x14661ebc
0x14661ed4 0x1513fa80 0x14661ef4 0x14661f18 0x14661f2c 0x14661f40 0x14661f80 0x14661fa4 0x14661fb8 0x14661fd0
0x14661fe4 0x14661ff8 0x141fc004 0x1466200c 0x1466202c 0x141fc004 0x141fc004 0x14662048 0x146620cc 0x14662114
0x1466212c 0x141fc004 0x141fc004 0x141fc004 0x14865690 0x14662150 0x14662164 0x141fc004 0x14662180
Does it? Anyway, the header constants are set forth in ObjectMemory class>>initializeObjectHeaderConstants, and used in ObjectMemory's header access protocol in methods such as ObjectMemory>>sizeBitsOf:.
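As a cross-check against the raw dump earlier in the thread, here is a small Python sketch that splits a 32-bit base header into fields. The bit positions (type in bits 0-1, size in words in bits 2-7, format in bits 8-11, compact class index in bits 12-16) are my reading of the pre-Spur Squeak object format and should be verified against initializeObjectHeaderConstants; decode_base_header is a hypothetical helper, not a VM function.

```python
def decode_base_header(word):
    """Split a 32-bit base header word into fields (assumed bit layout)."""
    return {
        "type": word & 0x3,                          # header type, bits 0-1
        "size_words": (word >> 2) & 0x3F,            # size in 32-bit words, bits 2-7
        "format": (word >> 8) & 0xF,                 # object format, bits 8-11
        "compact_class_index": (word >> 12) & 0x1F,  # bits 12-16
    }


# Base header of the MethodDictionary from the dump (oop 0x78c68dd8).
method_dict = decode_base_header(0x1848038D)

# Header word of the object at 0x78c68e64, which the 'array' field points to.
array = decode_base_header(0x00DA3287)
```

Under that layout the MethodDictionary header gives size 35 (binary 100011) and format 3. If the size field counts words including the base header word, 35 = 1 header + 2 named fields (tally, array) + 32 indexed slots. The second header gives size 33 and format 2, which fits a 32-element Array plus its header word.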
Don't guess :) Read the source and work it out.
As far as the GC is concerned there is nothing special about the object; it is just a vector of object pointers. So the flattening (actually the stepping over of named inst vars) is handled by the at: and at:put: code, which is in Interpreter/StackInterpreter's indexing primitive support protocol in methods stObject:at:, stObject:at:put:, subscript:with:format: and subscript:with:storing:format:.
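The stepping over of named inst vars can be sketched in Python as follows. This is an illustration of the idea only, assuming the object is a plain list of slots with the named inst vars first; st_at and st_at_put are hypothetical names and do not reproduce the VM's stObject:at: code.

```python
def st_at(slots, fixed_fields, index):
    """1-based at: on an object whose first fixed_fields slots are named inst vars."""
    indexable_size = len(slots) - fixed_fields
    if not 1 <= index <= indexable_size:
        raise IndexError(index)
    # Skip the named inst vars, then convert from 1-based to 0-based.
    return slots[fixed_fields + index - 1]


def st_at_put(slots, fixed_fields, index, value):
    """1-based at:put: with the same offset arithmetic as st_at."""
    indexable_size = len(slots) - fixed_fields
    if not 1 <= index <= indexable_size:
        raise IndexError(index)
    slots[fixed_fields + index - 1] = value
    return value


# A MethodDictionary-like object: 2 named slots (tally, array), 32 indexed slots.
slots = [1, "array oop"] + [None] * 32
st_at_put(slots, 2, 32, "#someSelector")
```

The GC sees only the 34 slots; the split between named and indexed slots exists only in this index arithmetic.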
cheers, Eliot
|
In reply to this post by Eliot Miranda-2
Eliot Miranda wrote on Wed, 8 Jun 2011 10:01:17 -0700

> When Squeak was implemented it was decided to use flat objects in the VM,
> following the lead of David Ungar's Berkeley Smalltalk implementation, and of
> the subsequent Self implementations, all of which also use flat objects.

Given the background you mention and that all Vector objects in Self (Array in Smalltalk-80) can have named slots as well as indexed ones, you would think that the use of flat objects would be popular. But they are actually very rare. Method objects, for example, have a named variable called "literals" that holds a Vector and another variable named "bytecodes" that holds a ByteVector. The same is true for the various kinds of Collections.

> Flat objects make for faster allocation and faster inst var access (since
> accessing an inst var doesn't require the double indirection of following the
> pointer to the header and then the pointer to the body). But it makes become
> very much more expensive, since in the worst case the VM must scan the entire
> heap looking for references to the objects in the become operation and
> replacing them by references to their corresponding objects.

In theory there is no become in Self. In practice, the various object building primitives (such as _AddSlots: or _RemoveSlots:) do their work in place and so have the same cost as become. They are normally invoked by the user from the GUI and not used programmatically, so their performance isn't too critical (and scanning the entire heap is the most optimized operation of all, obtained by a special tag for header words and by allocating bytes and pointers from different ends of the heap).

> The solution David Ungar developed, which was adopted by Squeak, was to
> unflatten objects that used become to grow, such as OrderedCollection, Set
> and Dictionary, and use an array to hold their variable part.

Do you mean at runtime or at design time (converting the image once)? If the latter, I had the impression that this was first proposed at Tektronix (after the versions described in the green book). But I don't trust my memory on this.

-- Jecel |
In reply to this post by Eliot Miranda-2
Thanks again for your time and the detailed answer. I could finally understand what was going on. When you said it was flat, I first thought that meant that the whole array of associations instVar was somehow pasted into the MethodDictionary. Of course I was wrong; what happens is that only the keys are put there (now I know, this is obvious from the class description), and the array is just a normal object that holds the values. One unrelated question: if oops are aligned to 4-byte words, then there is a free bit in oops (the penultimate one). Is it used for something? Regards, Javier.
On Thu, Jun 9, 2011 at 3:34 PM, Eliot Miranda <[hidden email]> wrote:
-- Javier Pimás Ciudad de Buenos Aires |
On Thu, Jun 9, 2011 at 12:31 PM, Javier Pimás <[hidden email]> wrote:
No, but I plan on using it for immediate characters. So SmallIntegers would continue to occupy 0x....1 and oops 0x...00 and Characters would occupy 0x.......10.
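The tagging scheme described here can be sketched in Python for 32-bit words. The SmallInteger and object-pointer cases follow the dumps in this thread (0x55 decoding to 42, oops ending in binary 00); the Character case reflects only the plan stated above, not something the VM did at the time. classify_oop and small_integer_value are hypothetical helper names.

```python
def classify_oop(word):
    """Classify a 32-bit oop by its two low tag bits."""
    if word & 1:
        return "SmallInteger"        # low bit 1
    if word & 3 == 2:
        return "Character"           # low bits 10: planned use only
    return "object pointer"          # low bits 00: 4-byte aligned address


def small_integer_value(word):
    """Decode a tagged SmallInteger: the value sits above the tag bit."""
    if not word & 1:
        raise ValueError("not a SmallInteger")
    if word & 0x80000000:            # reinterpret the 32-bit word as signed
        word -= 1 << 32
    return word >> 1                 # arithmetic shift drops the tag bit
```

For example, the tally word 0x55 from the printOop output decodes to 42, and 0x00000003 from the raw dump decodes to 1.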
best, Eliot
|
In reply to this post by melkyades
On Thu, Jun 9, 2011 at 9:31 PM, Javier Pimás <[hidden email]> wrote:
yes
-- Mariano http://marianopeck.wordpress.com |
In reply to this post by Eliot Miranda-2
On Thu, Jun 9, 2011 at 12:14 PM, Jecel Assumpcao Jr. <[hidden email]> wrote:
Are you sure that Self allows named and indexable slots? What's an example in Self?
I'm sure it was incremental. There is a full become: in BS, but it's slow, and hence to speed the system up changes were made to certain core classes to reduce becoming. IIRC these changes were available in a change set called unbecomming.st :)
cheers Eliot
|
Eliot,

> Are you sure that Self allows named and indexable slots? What's an example in Self?

I'm pretty sure, but can't reach my copy of Craig Chambers' thesis right now and it doesn't seem to be available online. None of the other documentation mentions this. Sadly, Self 4.4 is now generating a segfault on this machine (my Fedora 10 installation is causing lots of problems like this). But looking at the Self sources (not VM) for 4.1.6 it seems that "processStack" is created by cloning a vector, removing the "parent" slot and then adding "cache", "format", "myProcess" as data slots and then "parent" as a constant parent slot. In the old UI1 experiment (Morphic is UI2) something similar is done for object "enumResult". Like I said, this is rarely used (these were the only examples I found).

> > Do you mean at runtime or at design time (converting the image once)? If
> > the latter, I had the impression that this was first proposed at
> > Tektronix (after the versions described in the green book). But I don't
> > trust my memory on this.
>
> I'm sure it was incremental. There is a full become: in BS, but it's slow, and hence to
> speed the system up changes were made to certain core classes to reduce becoming.
> IIRC these changes were available in a change set called unbecomming.st :)

It is very common for people to get excited by a new feature (like become in Smalltalk-78 or dynamic inheritance in Self) and try to use it as much as possible, even where "conventional" solutions would be as good or even better. Eventually it either dies out or is limited to where it makes sense.

-- Jecel |