Re: [squeak-dev] Error in ImageSegment primitive?

Eliot Miranda-2
 
Hi Max,

On Sun, May 7, 2017 at 4:24 AM, Max Leske <[hidden email]> wrote:
Hi,

I'm trying to store an image segment with the latest pharo.cog.spur VM (32 bits) but keep failing. The segment should produce a file of around 60 MB. With an old V3 VM this is no problem at all. There, the WordArrayForSegment instance has a size of 4094179 but with the new VM I always run out of space because the primitive returns nil and, therefore, the word array size is constantly being increased.

It's been a while since I wrote this code so my understanding has been coming back in fits and starts.  The limitation on the use of the hash bits field is in referring to "out pointers", objects that the saved segment refers to, not on objects internal to the segment.  So I think it's fixable.

The hash field is used to map from an object in the heap to its object in the segment.  Right now the mapping is from hash (22 bits) to location in the segment / 8, and so limits the size of the segment to 500kb.  If an extra level of indirection was added so that hash maps to index in an array of oops, then the segment could contain up to 4m objects and I think that'll be large enough for your use.

If that's still not enough then the algorithm will have to be rewritten to use the first field of the object in the heap to point to its location in the segment, with the original first field saved alongside.

So let me know.  Would you be happy with a fix that provides up to 4m objects per segment or would you want to wait for something with a much higher limit?


I've built a debug VM and am stepping through the code but I don't have a clear understanding of everything that's happening. The failure happens on line 46626 of gcc3x-cointerp.c:

newOop = (copy - segStart) / 8;
if (newOop > (identityHashHalfWordMask())) {
        return PrimErrLimitExceeded;  // <--------------- failure
}

What I don't understand, for example, is why "newOop" is checked against "identityHashHalfWordMask()" and not against the segment end ("endSeg"). Here's a list of the current values of the variables upon failure:

objOop            sqInt    180812096
segAddr           sqInt    494731288
segStart          sqInt    461176856
endSeg            sqInt    815841432
bodySize          usqInt   64
contextSize       sqInt    335672448
copy              sqInt    494731288
hash              sqInt    0
hash1             sqInt    4194302
i                 sqInt    833574680
iLimiT            sqInt    833574688
methodHeader      sqInt    1193471
newOop            sqInt    4194304
numMediatedSlots  sqInt    833574688
numSlots          usqInt   14
oop               sqInt    142640272

As you can see, "endSeg" would be more than large enough to hold the object. Is it possible that there's an error here?
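
Plugging the dumped values into the quoted check shows why it trips. What follows is a minimal C sketch, not the VM source: HASH_BITS, HASH_MASK and the helper names are hypothetical, and the 22-bit width is taken from the reply above rather than from the VM headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 22 hash bits, as stated in the reply above; names are illustrative. */
#define HASH_BITS 22
#define HASH_MASK ((INT64_C(1) << HASH_BITS) - 1)   /* 4194303 */

/* Offset of the copy within the segment, in 8-byte units (mirrors the quoted check). */
static int64_t segment_index(int64_t copy, int64_t segStart) {
    assert(copy >= segStart);
    return (copy - segStart) / 8;
}

/* Nonzero when the index no longer fits in the hash field. */
static int exceeds_hash_limit(int64_t copy, int64_t segStart) {
    return segment_index(copy, segStart) > HASH_MASK;
}
```

With copy = 494731288 and segStart = 461176856 the difference is 33554432 bytes, so newOop comes out as 4194304, one past the largest value (4194303) that a 22-bit field can hold. endSeg never enters the comparison, which is consistent with the primitive failing even though the segment buffer itself still has room.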

Cheers,
Max




--
_,,,^..^,,,_
best, Eliot

Max Leske
 
Hi Eliot,


On 9 May 2017, at 14:32, [hidden email] wrote:

So let me know.  Would you be happy with a fix that provides up to 4m
objects per segment or would you want to wait for something with a much
higher limit?

I've sampled 10000 classes and 10000 model specific classes at random and got a median #byteSizeOfInstance of 60 bytes. For 4m objects that would mean an upper limit of 240 MB file size. That is enough for me at the moment, yes. I currently need around 120 MB (with a safety margin of 15 MB).
Not having to watch out for that limit would be nice but my priority is to be able to create and read segments.
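
The estimate above is a single multiplication, sketched here in C so the numbers can be rechecked. The function names are made up for illustration, and "4m" is read as 4,000,000; if the cap is really 2^22 objects the bound is slightly higher.

```c
#include <assert.h>
#include <stdint.h>

/* Rough upper bound on segment size: object cap times a typical object size. */
static uint64_t max_segment_bytes(uint64_t max_objects, uint64_t bytes_per_object) {
    assert(bytes_per_object == 0 || max_objects <= UINT64_MAX / bytes_per_object);
    return max_objects * bytes_per_object;
}

/* Nonzero when the required size fits under the estimated bound. */
static int fits(uint64_t needed_bytes, uint64_t max_objects, uint64_t bytes_per_object) {
    return needed_bytes <= max_segment_bytes(max_objects, bytes_per_object);
}
```

max_segment_bytes(4000000, 60) gives 240000000 bytes, and the 120 MB requirement fits under it. Since 60 bytes is a median rather than a mean, this is only a ballpark figure: a segment dominated by large objects would hit the object-count cap at a different file size.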

How much time do you think it will take you to make the change? I just need a rough idea so I can plan my work around that.


Thanks for your help!

Cheers,
Max




Eliot Miranda-2
 
Hi Max,

On May 9, 2017, at 7:30 AM, Max Leske <[hidden email]> wrote:

Hi Eliot,



I've sampled 10000 classes and 10000 model specific classes at random and got a median #byteSizeOfInstance of 60 bytes. For 4m objects that would mean an upper limit of 240 MB file size. That is enough for me at the moment, yes. I currently need around 120 MB (with a safety margin of 15 MB).
Not having to watch out for that limit would be nice but my priority is to be able to create and read segments.

Agreed, but that's a bigger rewrite.  It would have in common some of the approach taken with the new compactor which also uses the first field, in its case to point to eventual location, while saving the contents of the first field in an array off to the side.

(In Spur all objects, including zero-sized ones, have at least one field so that they can be converted into a forwarding pointer).

How much time do you think it will take you to make the change? I just need a rough idea so I can plan my work around that.

It shouldn't take more than a day; two at the most.  You can help me by setting up a test case (although perhaps I can hack up a quick binary tree, so that might not help as much as I expect).

The primitive uses a routine that answers the objects to be written to the segment in an array.  The primitive needs to be extended with a word array, let's call it the mapArray, as big as this to hold the indirection from hash field to index in the mapArray in which are held the pointers to locations in the segment data.
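
The mapArray idea can be sketched in C roughly as follows. This is a toy model under stated assumptions, not the primitive: MapObject is a stand-in for a heap object (not the Spur header layout), build_map and segment_offset_of are hypothetical names, and saving and restoring the objects' real hashes is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 22u
#define HASH_MASK ((1u << HASH_BITS) - 1u)

/* Toy stand-in for a heap object; not the Spur object layout. */
typedef struct {
    uint32_t hash;      /* 22-bit field, borrowed while the segment is built */
    uint32_t byteSize;  /* size of the object's copy in the segment */
} MapObject;

/* Give each object a map index through its hash field and record where its
   copy lands in the segment. mapArray must have room for count entries.
   Returns 0 on success, -1 if there are more objects than the hash can index. */
static int build_map(MapObject *objs, size_t count, uint64_t *mapArray) {
    assert(objs != NULL && mapArray != NULL);
    if (count > (size_t)HASH_MASK + 1) return -1;
    uint64_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        objs[i].hash = (uint32_t)i;   /* hash now names a slot, not a location */
        mapArray[i] = offset;         /* the slot holds the real location */
        offset += objs[i].byteSize;
    }
    return 0;
}

/* Heap object -> location of its copy, through the extra indirection. */
static uint64_t segment_offset_of(const MapObject *obj, const uint64_t *mapArray) {
    return mapArray[obj->hash & HASH_MASK];
}
```

The point of the indirection is that the 22-bit field now limits the number of objects rather than the byte offset, because each mapArray entry is a full-width value.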

It might be just as easy/difficult to use a savedFirstFields array (also one element per object to be copied into the array) and point to location from first field, which would eliminate the upper limit immediately.  I'll mull this over while at the DMV; a chore I have to do today.
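
For comparison, the savedFirstFields alternative might look like the sketch below. Again this is a toy model with hypothetical names (FieldObject, install_locations, restore_first_fields); it only illustrates the save, overwrite and restore steps described above and says nothing about how the real primitive would walk the heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy object: only the first field matters for this sketch. */
typedef struct {
    uintptr_t firstField;
    uint32_t byteSize;
} FieldObject;

/* Stash each first field off to the side, then overwrite it with the
   location of the object's copy in the segment. */
static void install_locations(FieldObject *objs, size_t count, uintptr_t *savedFirstFields) {
    assert(objs != NULL && savedFirstFields != NULL);
    uintptr_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        savedFirstFields[i] = objs[i].firstField;
        objs[i].firstField = offset;
        offset += objs[i].byteSize;
    }
}

/* Undo install_locations once the segment has been written. */
static void restore_first_fields(FieldObject *objs, size_t count, const uintptr_t *savedFirstFields) {
    assert(objs != NULL && savedFirstFields != NULL);
    for (size_t i = 0; i < count; i++) {
        objs[i].firstField = savedFirstFields[i];
    }
}
```

Because the first field is a full word, the location it holds is not squeezed into 22 bits, which is why this variant would remove the ceiling rather than raise it. The cost is the same as for the mapArray: one side-array element per copied object.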


Max Leske
 
[ Reposting with correct subject. Sorry! ]


On 9 May 2017, at 17:46, [hidden email] wrote:

It shouldn't take more than a day; two at the most.  You can help me by setting up a test case (although perhaps I can hack up a quick binary tree, so that might not help as much as I expect).

I've tried to come up with a test case and have one that doesn't fail... Not sure what I'm missing. The test reports the sizes of the segment and its out pointers. I'm including it here, maybe you can work from that. I'm also appending my hacked version of ImageSegment for Pharo 6 (mainly use of FileSystem instead of FileDirectory).

I hope that helps.

Cheers,
Max


Attachment: System-Object Storage Hacked.st (323K)

Max Leske
 
And of course I forgot the code:

| rootCollection numberOfLeaves holder |
Transcript open.
holder := OrderedCollection new.
numberOfLeaves := 1000000.
rootCollection := OrderedCollection new.
"Build the roots; holder also references every leaf object."
1 to: numberOfLeaves do: [ :i |
    | next |
    next := Object new.
    holder add: next.
    rootCollection add: i -> next ].

Transcript show: 'copying roots'; cr.
"Report the sizes on success; terminate the process on OutOfMemory."
[ | im |
    im := ImageSegment new
        copyFromRoots: rootCollection sizeHint: 250000 areUnique: true;
        state: #active;
        yourself.
    Transcript
        show: 'out pointers: ', (im outPointers size asString); cr;
        show: 'segment: ', (im segment size asString); cr ]
    on: OutOfMemory
    do: [
        Transcript show: 'failed'; cr; cr.
        Processor activeProcess terminate ].
Transcript show: 'succeeded'; cr; cr.



Eliot Miranda-2
 
Hi Max, Bert,

    To integrate this fix with Bert's rewrite of the importer, I'm thinking of renaming ImageSegment to NativeImageSegment.  Hopefully we can get the native and the all-in-Smalltalk code to coexist.  Is NativeImageSegment a good name?  Or would you prefer, e.g., VMImageSegment or SpurImageSegment?

And we still have to handle fixing up of loaded objects.  The load primitive answers the array of roots.  We'd actually like an array of all loaded objects.  Maybe I should add a variant of the load primitive that does this.  Are you, Max, up to rewriting the post-load object fixup to avoid nextObject?

--
_,,,^..^,,,_
best, Eliot

Bert Freudenberg
 
NativeImageSegment sounds good. 

(I had renamed "my" class to OldImageSegment but ran into issues with old code breaking. Better to start clean)

- Bert -


Max Leske
 

On 10 May 2017, at 01:14, [hidden email] wrote:

NativeImageSegment sounds good.

+1


On Tue 9. May 2017 at 23:05, Eliot Miranda <[hidden email]> wrote:


And we still have to handle fixing up of loaded objects.  The load
primitive answers the array of roots.  We'd actually like an array of all
loaded objects.  Maybe I should add a variant of the load primitive that
does this.  Are you, Max, up to rewriting the post-load object fixup to
avoid nextObject?

Yes, I can do that. I'll probably need some help but I think I can handle it.

Max