Why Sets need to be rehashed when loading them from an ImageSegment?

Mariano Martinez Peck
Hi. I was almost sure that Set instances written into an ImageSegment need to be rehashed once they are swapped in, and I even thought I understood why. But now I don't understand why at all. I would really appreciate it if anyone could give me a hint.

So... ImageSegment rehashes (at load time) all Set instances that were written into an ImageSegment and then exported with a ReferenceStream. Note that this rehash is ultimately called from ImageSegment >> comeFullyUpOnReload: smartRefStream.
When you export an ImageSegment with a reference stream, you also serialize the "outPointers" array. This is usually the case when you want to load the object subgraph into another image, or when you make backups so that you can start again from a clean image. Note that the Set rehash occurs only in this case. If we use ImageSegment just to swap out / in, there is no Set rehash.
So... I think this has something to do with WHY Set instances need to be rehashed.

For each object ImageSegment writes into the file, it writes both the object header and the slots. So the hash (12 bits in the object header) is copied. When the object is loaded back, it still has the same hash. It doesn't change.
So... if the hash doesn't change, why do Set instances need to be rehashed when they come back from an ImageSegment?

Thanks in advance,

Mariano
Re: Why Sets need to be rehashed when loading them from an ImageSegment?

Igor Stasenko
One example:

#foo identityHash in one image and #foo identityHash in another image
could be different, because these symbols could have been interned at
different times.
So, if you have an identity set or identity dictionary that uses
symbols as keys, which is often the case, you need to rehash it.

I just tried it:
image 1
#awdqedqwd identityHash  633602048
image 2
#awdqedqwd identityHash 398196736

Note that I specifically picked a symbol that was not yet interned
(did not exist) in the image.
Obviously, you cannot change the identityHash of symbols that are
already present in the image,
because you don't know where they are used outside of the objects you are loading.
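
As a side check on the two values above, here is a small Python sketch. Python is used only as a calculator; this is not code from either image. It tests whether each reported identityHash is a 12-bit number shifted left by a fixed amount. The 18-bit shift is an assumption inferred from the numbers themselves and is not stated anywhere in this thread.

```python
# The two identityHash values reported above for #awdqedqwd.
reported = {"image 1": 633602048, "image 2": 398196736}

HEADER_HASH_BITS = 12   # from the original post: 12 hash bits in the object header
SHIFT = 18              # assumption: inferred from the values, not stated in the thread


def split_scaled_hash(value, shift=SHIFT):
    """Return (quotient, remainder) of value with respect to 2**shift."""
    return divmod(value, 1 << shift)


header_hashes = {}
for image, value in reported.items():
    quotient, remainder = split_scaled_hash(value)
    header_hashes[image] = quotient
    # True when the value is an exact multiple of 2**SHIFT and the quotient fits in 12 bits.
    fits = remainder == 0 and quotient < (1 << HEADER_HASH_BITS)
    print(image, value, "->", quotient, "fits 12 bits:", fits)
```

Under that assumption, both values reduce to 12-bit numbers, and the two numbers differ, so the same symbol would start its lookup at a different position in each image.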


On 10 January 2011 10:51, Mariano Martinez Peck <[hidden email]> wrote:

> So... if the hash doesn't change, why do Set instances need to be rehashed when they
> come back from an ImageSegment?

--
Best regards,
Igor Stasenko AKA sig.

Re: Why Sets need to be rehashed when loading them from an ImageSegment?

Lukas Renggli
In reply to this post by Mariano Martinez Peck
Any object that uses an identity hash anywhere in its #hash might
change its hash, thus the rehashing.
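
The effect of a changed hash can be sketched with a toy model. This is Python, not the Squeak/Pharo Set implementation; `Obj` and `TinyIdentitySet` are hypothetical names, and the probing scheme is a simplification. It only shows that an element stored under one hash cannot be found under another until the set is rehashed.

```python
class Obj:
    """Stand-in for a Smalltalk object; identity_hash models the header hash bits."""

    def __init__(self, name, identity_hash):
        self.name = name
        self.identity_hash = identity_hash


class TinyIdentitySet:
    """Toy fixed-capacity set using linear probing on identity_hash (hypothetical)."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity

    def _scan_for(self, obj):
        # Start at the slot chosen by the hash; stop at obj or at the first empty slot.
        size = len(self.slots)
        start = obj.identity_hash % size
        for step in range(size):
            index = (start + step) % size
            if self.slots[index] is None or self.slots[index] is obj:
                return index
        raise RuntimeError("set is full")

    def add(self, obj):
        self.slots[self._scan_for(obj)] = obj

    def includes(self, obj):
        return self.slots[self._scan_for(obj)] is obj

    def rehash(self):
        # Re-insert every element using its current identity_hash.
        old = [o for o in self.slots if o is not None]
        self.slots = [None] * len(self.slots)
        for o in old:
            self.add(o)


key = Obj("#foo", identity_hash=2417)
s = TinyIdentitySet()
s.add(key)
found_before = s.includes(key)

# Model the slots being kept as-is while the key now answers a different hash.
key.identity_hash = 1519
found_stale = s.includes(key)

s.rehash()
found_after = s.includes(key)
print(found_before, found_stale, found_after)
```

The element never leaves the slot array; it is simply sitting where the new hash no longer looks for it.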

Lukas

--
Lukas Renggli
www.lukas-renggli.ch

Re: Why Sets need to be rehashed when loading them from an ImageSegment?

Mariano Martinez Peck
In reply to this post by Igor Stasenko


On Mon, Jan 10, 2011 at 1:25 PM, Igor Stasenko <[hidden email]> wrote:
> One example:
>
> #foo identityHash in one image and #foo identityHash in another image
> could be different, because these symbols could have been interned at
> different times.
> So, if you have an identity set or identity dictionary that uses
> symbols as keys, which is often the case, you need to rehash it.
>
> I just tried it:
> image 1
> #awdqedqwd identityHash  633602048
> image 2
> #awdqedqwd identityHash 398196736
>
> Note that I specifically picked a symbol that was not yet interned
> (did not exist) in the image.
> Obviously, you cannot change the identityHash of symbols that are
> already present in the image,
> because you don't know where they are used outside of the objects you are loading.


Uhhhhhh, this was a good example ;)
I forgot about these guys, heheh.
Thanks Igor. This is a clear example.

 

Re: Why Sets need to be rehashed when loading them from an ImageSegment?

Mariano Martinez Peck
In reply to this post by Lukas Renggli


On Mon, Jan 10, 2011 at 5:35 PM, Lukas Renggli <[hidden email]> wrote:
> Any object that uses an identity hash anywhere in its #hash might
> change its hash, thus the rehashing.

The question is what that "might" refers to.
After Igor's example, I understood that the only reason a Set has to be rehashed after loading an image segment is the tricky objects that ReferenceStream handles specially when reading them.
For example, if a Set instance included objects like nil, true, and false, then when it is loaded back it will point to the nil, true, and false of the image where the loading occurs (some instances, like true, false, and nil, are not even written to the file when using a reference stream). Here, the hashes of those objects can differ from the ones in the image where I wrote the Set (as with symbols).
And the same happens with Symbols. But is it correct that it only happens with these kinds of objects (mostly those for which ReferenceStream needs a #readXXX method)?
Because with the rest of the objects (classes) I don't see any problem, since #identityHash cannot change.

If all this is correct, then the problem is not that some objects of the Set might have changed their hash but, instead, that some objects of the Set are directly REPLACED by other instances (symbols, true, false, nil, etc.), which of course can have different hashes.
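
To make the "replaced, not changed" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Image` class, the by-name export, and the hash numbering are assumptions of the model, not how ReferenceStream is actually implemented. It only illustrates a set whose slot positions are preserved while a symbol is resolved to the receiving image's own instance.

```python
import itertools


class Image:
    """Toy image: interns symbols by name, handing out hashes in creation order."""

    def __init__(self, first_hash):
        self._next_hash = itertools.count(first_hash)
        self.symbols = {}

    def intern(self, name):
        # One Symbol per name per image; its hash depends on when it was created here.
        if name not in self.symbols:
            self.symbols[name] = {"name": name, "identity_hash": next(self._next_hash)}
        return self.symbols[name]


def slot_index(obj, capacity):
    return obj["identity_hash"] % capacity


def export_set(slots):
    # Model assumption: symbols are written by name, at the slot they occupied.
    return [None if s is None else s["name"] for s in slots]


def load_set(image, exported):
    # Each name is replaced by the receiving image's own Symbol; positions are kept.
    return [None if n is None else image.intern(n) for n in exported]


def misplaced(slots):
    # Names of elements that are not at the index their current hash selects.
    return [s["name"] for i, s in enumerate(slots)
            if s is not None and slot_index(s, len(slots)) != i]


def rehash(slots):
    fresh = [None] * len(slots)
    for s in filter(None, slots):
        i = slot_index(s, len(fresh))
        while fresh[i] is not None:      # linear probing on collision
            i = (i + 1) % len(fresh)
        fresh[i] = s
    return fresh


source, target = Image(first_hash=10), Image(first_hash=20)
target.intern("#other")                  # the target image already has its own symbols
foo = source.intern("#foo")
slots = [None] * 8
slots[slot_index(foo, 8)] = foo          # placed according to the source image's hash

loaded = load_set(target, export_set(slots))
stale = misplaced(loaded)
fixed = misplaced(rehash(loaded))
print(stale, fixed)
```

In this model no object ever changes its own hash. The loaded set simply holds a different `#foo` object than the one it was built with, which is why only the export path would need the rehash.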

Thanks!

Mariano

 