On Sun, May 7, 2017 at 4:24 AM, Max Leske <[hidden email]> wrote:
It's been a while since I wrote this code so my understanding has been coming back in fits and starts. The limitation on the use of the hash bits field is in referring to "out pointers", objects that the saved segment refers to, not on objects internal to the segment. So I think it's fixable.
The hash field is used to map from an object in the heap to its object in the segment. Right now the mapping is from hash (22 bits) to location in the segment / 8, and so limits the size of the segment to 32 MB (2^22 * 8 bytes). If an extra level of indirection were added, so that the hash maps to an index in an array of oops, then the segment could contain up to 4m objects, and I think that'll be large enough for your use.
If that's still not enough then the algorithm will have to be rewritten to use the first field of the object in the heap to point to its location in the segment, with the original first field saved alongside.
So let me know. Would you be happy with a fix that provides up to 4m objects per segment or would you want to wait for something with a much higher limit?
I've sampled 10000 classes and 10000 model specific classes at random and got a median #byteSizeOfInstance of 60 bytes. For 4m objects that would mean an upper limit of 240 MB file size. That is enough for me at the moment, yes. I currently need around 120 MB (with a safety margin of 15 MB).
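The estimate above can be reproduced as a quick workspace calculation. This is only a back-of-the-envelope check. It assumes that every value of the 22-bit hash field is usable as an index and that the sampled 60-byte median is a fair average object size; neither is confirmed beyond what is said in this thread.

```smalltalk
| hashBits maxObjects medianBytes estimatedBytes |
hashBits := 22.                        "width of the hash field, per the thread"
maxObjects := 2 raisedTo: hashBits.    "assumed: one slot per possible hash value"
medianBytes := 60.                     "sampled median #byteSizeOfInstance"
estimatedBytes := maxObjects * medianBytes.
Transcript
    show: 'max objects: ', maxObjects printString; cr;
    show: 'estimated size in MB: ', (estimatedBytes // (1024 * 1024)) printString; cr.
```

This shows 4194304 objects and 240 MB, matching the "4m objects" and "240 MB" figures quoted in the thread.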
Not having to watch out for that limit would be nice but my priority is to be able to create and read segments.
How much time do you think it will take you to make the change? I just need a rough idea so I can plan my work around that.
Thanks for your help!
On May 9, 2017, at 7:30 AM, Max Leske <[hidden email]> wrote:
Agreed, but that's a bigger rewrite. It would have in common some of the approach taken with the new compactor which also uses the first field, in its case to point to eventual location, while saving the contents of the first field in an array off to the side.
(In Spur all objects, including zero-sized ones, have at least one field so that they can be converted into a forwarding pointer).
It shouldn't take more than a day; two at the most. You can help me by setting up a test case (although perhaps I can hack up a quick binary tree, so that might not help as much as I expect).
The primitive uses a routine that answers the objects to be written to the segment in an array. The primitive needs to be extended with a word array of the same size, let's call it the mapArray. The hash field of each object would then hold an index into the mapArray, and each mapArray entry would hold the pointer to that object's location in the segment data.
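The indirection described above can be modelled in a workspace. This is a toy sketch, not the VM code: an IdentityDictionary (hashOf) stands in for the hash bits in each object header, the 16-byte object size is invented for illustration, and all names other than mapArray are hypothetical.

```smalltalk
| objects hashOf mapArray offset locate |
objects := (1 to: 5) collect: [ :i | Object new ].
hashOf := IdentityDictionary new.     "stand-in for the hash field of each object"
mapArray := Array new: objects size.  "index -> byte offset in the segment data"
offset := 0.
objects withIndexDo: [ :each :index |
    hashOf at: each put: index.       "the hash field holds an index, not a scaled offset"
    mapArray at: index put: offset.
    offset := offset + 16 ].          "pretend every object occupies 16 bytes"
"Finding an object's copy in the segment now takes two steps: hash, then mapArray."
locate := [ :anObject | mapArray at: (hashOf at: anObject) ].
Transcript show: (locate value: objects third) printString; cr.
```

With these made-up sizes the third object maps to offset 32. Because the hash only has to number the objects, the segment's byte size is no longer tied to the width of the hash field; only the object count is.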
It might be just as easy/difficult to use a savedFirstFields array (also one element per object to be copied into the array) and point to location from first field, which would eliminate the upper limit immediately. I'll mull this over while at the DMV; a chore I have to do today.
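The savedFirstFields alternative can be modelled the same way. Again this is only an illustrative sketch under stated assumptions: two-element Arrays play the role of heap objects, the 16-byte stride is invented, and the real primitive would work on raw object memory rather than Smalltalk collections.

```smalltalk
| objects savedFirstFields offset |
objects := (1 to: 3) collect: [ :i | Array with: i * 100 with: #payload ].
savedFirstFields := Array new: objects size.   "one element per object to be copied"
offset := 0.
"Pass 1: stash each first field, then overwrite it with the segment location."
objects withIndexDo: [ :each :index |
    savedFirstFields at: index put: (each at: 1).
    each at: 1 put: offset.
    offset := offset + 16 ].
"During the copy, an object's location is read straight from its first field."
Transcript show: (objects last at: 1) printString; cr.
"Pass 2: restore the heap objects once the segment has been written."
objects withIndexDo: [ :each :index |
    each at: 1 put: (savedFirstFields at: index) ].
Transcript show: (objects last at: 1) printString; cr.
```

The first line shown is 32 (the location) and the second is 300 (the restored first field). Since the first field is a full word, the location is not limited by the hash width, which is why this variant removes the upper limit.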
[ Reposting with correct subject. Sorry! ]
I've tried to come up with a test case and have one that doesn't fail... Not sure what I'm missing. The test reports the sizes of the segment and its out pointers. I'm including it here, maybe you can work from that. I'm also appending my hacked version of ImageSegment for Pharo 6 (mainly use of FileSystem instead of FileDirectory).
I hope that helps.
[ Attachment: System-Object Storage Hacked.st (323K) ]
And of course I forgot the code:
| rootCollection numberOfLeaves holder |
"holder keeps a second reference to every leaf from outside the roots"
holder := OrderedCollection new.
numberOfLeaves := 1000000.
rootCollection := OrderedCollection new.
1 to: numberOfLeaves do: [ :i |
    | next |
    next := Object new.
    holder add: next.
    rootCollection add: i -> next ].
Transcript show: 'copying roots'; cr.
"Report failure if the copy signals an error (the handler is assumed to be on: Error do:)"
[ | im |
    im := ImageSegment new
        copyFromRoots: rootCollection sizeHint: 250000 areUnique: true;
        yourself.
    Transcript
        show: 'out pointers: ', (im outPointers size asString); cr;
        show: 'segment: ', (im segment size asString); cr ]
    on: Error
    do: [ :e |
        Transcript show: 'failed'; cr; cr.
        Processor activeProcess terminate ].
Transcript show: 'succeeded'; cr; cr.
Hi Max, Bert,
To integrate the below with Bert's rewriting of the importer, I'm thinking of renaming ImageSegment to NativeImageSegment. Hopefully we can get the native and the all-in-Smalltalk code to coexist. Is NativeImageSegment a good name? Would you prefer e.g. VMImageSegment or SpurImageSegment?
And we still have to handle fixing up of loaded objects. The load primitive answers the array of roots. We'd actually like an array of all loaded objects. Maybe I should add a variant of the load primitive that does this. Are you, Max, up to rewriting the post-load object fixup to avoid nextObject?
On Tue, May 9, 2017 at 10:15 AM, Max Leske <[hidden email]> wrote:
NativeImageSegment sounds good.
(I had renamed "my" class to OldImageSegment but ran into issues with old code breaking. Better to start clean.)
- Bert -
On Tue 9. May 2017 at 23:05, Eliot Miranda <[hidden email]> wrote:
Yes, I can do that. I'll probably need some help but I think I can handle it.