STON materialization corrupts a dictionary if the keys are references

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

STON materialization corrupts a dictionary if the keys are references

Peter Uhnak
Hi,

I ran into a problem with STON that seems to corrupt on materialization.

Consider the following script:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
c := OrderedCollection with: UUID new.
d := Dictionary with: c first -> nil.
o := { c . d }.

s := STON toStringPretty: o.

o2 := STON reader
allowComplexMapKeys: true;
on: s readStream;
next.
o2 second keys. "an Array(an UUID('571bbfb7-78b9-430c-9aad-3e0088a25f4c'))"
o2 second printString. "KeyNotFound: key an UUID(''571bbfb7-78b9-430c-9aad-3e0088a25f4c'') not found in Dictionary"
o2 second at: o2 second keys first. "the same error"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So even though it contains the key it fails to access it.

Interestingly enough if I serialize and materialize the "broken" dictionary again it suddenly works again.

Thanks,
Peter



Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2
Hi Peter,

Strange indeed. It did not fail for me in 5.0 with the standard STON version, but it does fail in 4.0 with the latest STON. Asking the bogus dictionary to #rehash seems to fix the problem. I will investigate and turn you example into a unit test.

Sven
 

> On 05 Apr 2016, at 11:14, Peter Uhnák <[hidden email]> wrote:
>
> Hi,
>
> I ran into a problem with STON that seems to corrupt on materialization.
>
> Consider the following script:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> c := OrderedCollection with: UUID new.
> d := Dictionary with: c first -> nil.
> o := { c . d }.
>
> s := STON toStringPretty: o.
>
> o2 := STON reader
> allowComplexMapKeys: true;
> on: s readStream;
> next.
>
> o2 second keys. "an Array(an UUID('571bbfb7-78b9-430c-9aad-3e0088a25f4c'))"
> o2 second printString. "KeyNotFound: key an UUID(''571bbfb7-78b9-430c-9aad-3e0088a25f4c'') not found in Dictionary"
> o2 second at: o2 second keys first. "the same error"
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> So even though it contains the key it fails to access it.
>
> Interestingly enough if I serialize and materialize the "broken" dictionary again it suddenly works again.
>
> Thanks,
> Peter
>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Peter Uhnak
This is
Pharo #50666
Moose 6.0 build #1437
STON-Core-SvenVanCaekenberghe.67

 (I also loaded STON-Core-SvenVanCaekenberghe.70 but the result is the same).

One thing to note: it doesn't fail always… for me it seems that about every ten or so attempts it succeeds.

Peter

On Tue, Apr 5, 2016 at 11:41 AM, Sven Van Caekenberghe <[hidden email]> wrote:
Hi Peter,

Strange indeed. It did not fail for me in 5.0 with the standard STON version, but it does fail in 4.0 with the latest STON. Asking the bogus dictionary to #rehash seems to fix the problem. I will investigate and turn you example into a unit test.

Sven

> On 05 Apr 2016, at 11:14, Peter Uhnák <[hidden email]> wrote:
>
> Hi,
>
> I ran into a problem with STON that seems to corrupt on materialization.
>
> Consider the following script:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> c := OrderedCollection with: UUID new.
> d := Dictionary with: c first -> nil.
> o := { c . d }.
>
> s := STON toStringPretty: o.
>
> o2 := STON reader
>       allowComplexMapKeys: true;
>       on: s readStream;
>       next.
>
> o2 second keys. "an Array(an UUID('571bbfb7-78b9-430c-9aad-3e0088a25f4c'))"
> o2 second printString. "KeyNotFound: key an UUID(''571bbfb7-78b9-430c-9aad-3e0088a25f4c'') not found in Dictionary"
> o2 second at: o2 second keys first. "the same error"
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> So even though it contains the key it fails to access it.
>
> Interestingly enough if I serialize and materialize the "broken" dictionary again it suddenly works again.
>
> Thanks,
> Peter
>
>
>



Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2

> On 05 Apr 2016, at 11:50, Peter Uhnák <[hidden email]> wrote:
>
> This is
> Pharo #50666
> Moose 6.0 build #1437
> STON-Core-SvenVanCaekenberghe.67
>
>  (I also loaded STON-Core-SvenVanCaekenberghe.70 but the result is the same).
>
> One thing to note: it doesn't fail always… for me it seems that about every ten or so attempts it succeeds.

Like I said, it is a hashing issue, sometimes it will be correct by accident.

I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.

Is it urgent ?

I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.

> Peter
>
> On Tue, Apr 5, 2016 at 11:41 AM, Sven Van Caekenberghe <[hidden email]> wrote:
> Hi Peter,
>
> Strange indeed. It did not fail for me in 5.0 with the standard STON version, but it does fail in 4.0 with the latest STON. Asking the bogus dictionary to #rehash seems to fix the problem. I will investigate and turn you example into a unit test.
>
> Sven
>
> > On 05 Apr 2016, at 11:14, Peter Uhnák <[hidden email]> wrote:
> >
> > Hi,
> >
> > I ran into a problem with STON that seems to corrupt on materialization.
> >
> > Consider the following script:
> >
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > c := OrderedCollection with: UUID new.
> > d := Dictionary with: c first -> nil.
> > o := { c . d }.
> >
> > s := STON toStringPretty: o.
> >
> > o2 := STON reader
> >       allowComplexMapKeys: true;
> >       on: s readStream;
> >       next.
> >
> > o2 second keys. "an Array(an UUID('571bbfb7-78b9-430c-9aad-3e0088a25f4c'))"
> > o2 second printString. "KeyNotFound: key an UUID(''571bbfb7-78b9-430c-9aad-3e0088a25f4c'') not found in Dictionary"
> > o2 second at: o2 second keys first. "the same error"
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > So even though it contains the key it fails to access it.
> >
> > Interestingly enough if I serialize and materialize the "broken" dictionary again it suddenly works again.
> >
> > Thanks,
> > Peter
> >
> >
> >
>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Peter Uhnak
Like I said, it is a hashing issue, sometimes it will be correct by accident.

I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.

The hardest thing often is to suspect the system or external libraries, but it wasn't that bad.

Is it urgent ?

I can rehash it for now (I have a trivial case), but it should be fixed eventually (it wouldn't be so easy to rehash more complex structures).

Thanks,
Peter
Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

CyrilFerlicot
In reply to this post by Sven Van Caekenberghe-2


On 05/04/2016 12:09, Sven Van Caekenberghe wrote:

> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>
> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>
> Is it urgent ?
>
> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>
>
>
Hi Sven,

I got the same kind of problem in a personal application.

I use Sets that I serialize and I had a lot of trouble because sometimes
some action had strange behaviours.

For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
not found in aSet'.

I am glad to hear that it is a Ston issue and not me that used sets in a
bad way :)

For me too it is not urgent since I have a not of university work for
the moment.

--
Cyril Ferlicot

http://www.synectique.eu

165 Avenue Bretagne
Lille 59000 France


signature.asc (817 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Nicolai Hess-3-2


2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:


On 05/04/2016 12:09, Sven Van Caekenberghe wrote:

> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>
> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>
> Is it urgent ?
>
> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>
>
>

Hi Sven,

I got the same kind of problem in a personal application.

I use Sets that I serialize and I had a lot of trouble because sometimes
some action had strange behaviours.

For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
not found in aSet'.

I am glad to hear that it is a Ston issue and not me that used sets in a
bad way :)

For me too it is not urgent since I have a not of university work for
the moment.



How are hashed collections created/filled during ston-parsing ?
If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
the index in the dictionary  (or other hashed collections) may be wrong.

 
--
Cyril Ferlicot

http://www.synectique.eu

165 Avenue Bretagne
Lille 59000 France


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2

> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>
>
>
> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>
>
> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>
> > Like I said, it is a hashing issue, sometimes it will be correct by accident.
> >
> > I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
> >
> > Is it urgent ?
> >
> > I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
> >
> >
> >
>
> Hi Sven,
>
> I got the same kind of problem in a personal application.
>
> I use Sets that I serialize and I had a lot of trouble because sometimes
> some action had strange behaviours.
>
> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
> not found in aSet'.
>
> I am glad to hear that it is a Ston issue and not me that used sets in a
> bad way :)
>
> For me too it is not urgent since I have a not of university work for
> the moment.
>
>
>
> How are hashed collections created/filled during ston-parsing ?
> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
> the index in the dictionary  (or other hashed collections) may be wrong.

Yes, that is indeed it, Nicolai.

But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.

> --
> Cyril Ferlicot
>
> http://www.synectique.eu
>
> 165 Avenue Bretagne
> Lille 59000 France


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2
https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements

fix coming

> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>
>>
>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>
>>
>>
>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>
>>
>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>
>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>
>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>
>>> Is it urgent ?
>>>
>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>
>>>
>>>
>>
>> Hi Sven,
>>
>> I got the same kind of problem in a personal application.
>>
>> I use Sets that I serialize and I had a lot of trouble because sometimes
>> some action had strange behaviours.
>>
>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
>> not found in aSet'.
>>
>> I am glad to hear that it is a Ston issue and not me that used sets in a
>> bad way :)
>>
>> For me too it is not urgent since I have a not of university work for
>> the moment.
>>
>>
>>
>> How are hashed collections created/filled during ston-parsing ?
>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
>> the index in the dictionary  (or other hashed collections) may be wrong.
>
> Yes, that is indeed it, Nicolai.
>
> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>
>> --
>> Cyril Ferlicot
>>
>> http://www.synectique.eu
>>
>> 165 Avenue Bretagne
>> Lille 59000 France


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2
Fix for review:

===
Name: STON-Core-SvenVanCaekenberghe.71
Author: SvenVanCaekenberghe
Time: 6 April 2016, 2:22:24.782251 pm
UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
Ancestors: STON-Core-SvenVanCaekenberghe.70

Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.

Add 3 new unit tests to STONReaderTests

#testDictionaryWithReferenceKeys
#testSetWithReferenceElements
#testDeepStructure

Fix Details

change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution

imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.

overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
===
Name: STON-Tests-SvenVanCaekenberghe.63
Author: SvenVanCaekenberghe
Time: 6 April 2016, 2:22:45.01986 pm
UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
Ancestors: STON-Tests-SvenVanCaekenberghe.62

(idem)
===

> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
>
> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>
> fix coming
>
>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>>>
>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>>
>>>
>>>
>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>>
>>>
>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>
>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>
>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>
>>>> Is it urgent ?
>>>>
>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>
>>>>
>>>>
>>>
>>> Hi Sven,
>>>
>>> I got the same kind of problem in a personal application.
>>>
>>> I use Sets that I serialize and I had a lot of trouble because sometimes
>>> some action had strange behaviours.
>>>
>>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
>>> not found in aSet'.
>>>
>>> I am glad to hear that it is a Ston issue and not me that used sets in a
>>> bad way :)
>>>
>>> For me too it is not urgent since I have a not of university work for
>>> the moment.
>>>
>>>
>>>
>>> How are hashed collections created/filled during ston-parsing ?
>>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
>>> the index in the dictionary  (or other hashed collections) may be wrong.
>>
>> Yes, that is indeed it, Nicolai.
>>
>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>
>>> --
>>> Cyril Ferlicot
>>>
>>> http://www.synectique.eu
>>>
>>> 165 Avenue Bretagne
>>> Lille 59000 France
>


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Henrik Nergaard
Hi Sven,

Instead of:

stonProcessSubObjects: block
        | didContainStonReferenceAsKey |
        didContainStonReferenceAsKey := self stonProcessSubObjects: block

        super stonProcessSubObjects: block.
       
        self isHealthy ifFalse: [ self rehash ]..
        super stonProcessSubObjects: block.
        didContainStonReferenceAsKey
                ifTrue: [ self rehash ]


One can do:

stonProcessSubObjects: block

    super stonProcessSubObjects: block.

     self isHealthy ifFalse: [ self rehash ].


And this could also be moved to HashedCollection as a generic solution for any hash based collection. Moving it there would require implementing #isHealthy as a subclass responsibility. (all current subclasses (Set, Dictionary) implements it).

What do you think?

Best regards,
Henrik


-----Original Message-----
From: Pharo-dev [mailto:[hidden email]] On Behalf Of Sven Van Caekenberghe
Sent: Wednesday, April 6, 2016 2:27 PM
To: Pharo Development List <[hidden email]>
Subject: Re: [Pharo-dev] [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Fix for review:

===
Name: STON-Core-SvenVanCaekenberghe.71
Author: SvenVanCaekenberghe
Time: 6 April 2016, 2:22:24.782251 pm
UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
Ancestors: STON-Core-SvenVanCaekenberghe.70

Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.

Add 3 new unit tests to STONReaderTests

#testDictionaryWithReferenceKeys
#testSetWithReferenceElements
#testDeepStructure

Fix Details

change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution

imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.

overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference) ===
Name: STON-Tests-SvenVanCaekenberghe.63
Author: SvenVanCaekenberghe
Time: 6 April 2016, 2:22:45.01986 pm
UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
Ancestors: STON-Tests-SvenVanCaekenberghe.62

(idem)
===

> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
>
> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Di
> ctionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>
> fix coming
>
>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>>>
>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>>
>>>
>>>
>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>>
>>>
>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>
>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>
>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>
>>>> Is it urgent ?
>>>>
>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>
>>>>
>>>>
>>>
>>> Hi Sven,
>>>
>>> I got the same kind of problem in a personal application.
>>>
>>> I use Sets that I serialize and I had a lot of trouble because
>>> sometimes some action had strange behaviours.
>>>
>>> For example in a set with element `aSet remove: aSet anyOne` raised
>>> 'XXX not found in aSet'.
>>>
>>> I am glad to hear that it is a Ston issue and not me that used sets
>>> in a bad way :)
>>>
>>> For me too it is not urgent since I have a not of university work
>>> for the moment.
>>>
>>>
>>>
>>> How are hashed collections created/filled during ston-parsing ?
>>> If the position in a hashed collection is created by a
>>> ston-reference, that is later replaced by the "real" object, the index in the dictionary  (or other hashed collections) may be wrong.
>>
>> Yes, that is indeed it, Nicolai.
>>
>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>
>>> --
>>> Cyril Ferlicot
>>>
>>> http://www.synectique.eu
>>>
>>> 165 Avenue Bretagne
>>> Lille 59000 France
>


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Nicolai Hess-3-2
In reply to this post by Sven Van Caekenberghe-2


2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
Fix for review:

===
Name: STON-Core-SvenVanCaekenberghe.71
Author: SvenVanCaekenberghe
Time: 6 April 2016, 2:22:24.782251 pm
UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
Ancestors: STON-Core-SvenVanCaekenberghe.70

Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.

Add 3 new unit tests to STONReaderTests

#testDictionaryWithReferenceKeys
#testSetWithReferenceElements
#testDeepStructure

Fix Details

change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution

imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.

overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
===
Name: STON-Tests-SvenVanCaekenberghe.63
Author: SvenVanCaekenberghe
Time: 6 April 2016, 2:22:45.01986 pm
UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
Ancestors: STON-Tests-SvenVanCaekenberghe.62

(idem)
===

Hi Sven,
instead of rehashing the dictionary for every ston reference,
wouldn't it work to remove and readd the value after processing the subobject:

Dictionary>>#stonProcessSubObjects: block
    self keys do:[:key |
        |value|
        value := block value:(self removeKey: key ifAbsent:[ nil]).
        self at: (block value: key) put: value].



        

> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
>
> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>
> fix coming
>
>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>>>
>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>>
>>>
>>>
>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>>
>>>
>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>
>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>
>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>
>>>> Is it urgent ?
>>>>
>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>
>>>>
>>>>
>>>
>>> Hi Sven,
>>>
>>> I got the same kind of problem in a personal application.
>>>
>>> I use Sets that I serialize and I had a lot of trouble because sometimes
>>> some action had strange behaviours.
>>>
>>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
>>> not found in aSet'.
>>>
>>> I am glad to hear that it is a Ston issue and not me that used sets in a
>>> bad way :)
>>>
>>> For me too it is not urgent since I have a not of university work for
>>> the moment.
>>>
>>>
>>>
>>> How are hashed collections created/filled during ston-parsing ?
>>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
>>> the index in the dictionary  (or other hashed collections) may be wrong.
>>
>> Yes, that is indeed it, Nicolai.
>>
>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>
>>> --
>>> Cyril Ferlicot
>>>
>>> http://www.synectique.eu
>>>
>>> 165 Avenue Bretagne
>>> Lille 59000 France
>



Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2
In reply to this post by Henrik Nergaard
Henrik,

> On 06 Apr 2016, at 14:55, Henrik Nergaard <[hidden email]> wrote:
>
> Hi Sven,
>
> Instead of:
>
> stonProcessSubObjects: block
> | didContainStonReferenceAsKey |
> didContainStonReferenceAsKey := self stonProcessSubObjects: block
>
> super stonProcessSubObjects: block.
>
> self isHealthy ifFalse: [ self rehash ]..
> super stonProcessSubObjects: block.
> didContainStonReferenceAsKey
> ifTrue: [ self rehash ]
>
>
> One can do:
>
> stonProcessSubObjects: block
>
>    super stonProcessSubObjects: block.
>
>     self isHealthy ifFalse: [ self rehash ].
>
>
> And this could also be moved to HashedCollection as a generic solution for any hash based collection. Moving it there would require implementing #isHealthy as a subclass responsibility. (all current subclasses (Set, Dictionary) implements it).
>
> What do you think?

First, thanks a lot for having a look.

Now, I did spend a lot of time thinking about the best approach, and your suggestion was one of the solutions that I considered (and it certainly is the most elegant one).

My goal is to make STON both correct and as fast as possible. When materialising, there are 2 phases, the second being the resolution of references. That is a separate phase because doing it directly 'in place' would result in cycles.

Consider the first optimisation one level up:

STONReader>>next
  | object |
  self consumeWhitespace.
  object := self parseValue.
  unresolvedReferences > 0
    ifTrue: [ self processSubObjectsOf: object ].
  ^ object

Unless there really are unresolved references, it skips the second phase completely.

Next, all Dictionaries and Sets are always freshly created and in principle (99% of the time) correct and #rehash is expensive (it rebuilds them again). We have to try to avoid this.

IMO, #isHealthy is more expensive than the manual tests that I added (#containsStonReferenceAsKey and #containsStonReference). Why ? Because #isHealthy always checks all elements, and checks all their hashes, while my tests back out immediately and don't do any hashing. Would you agree ?

Still, this is all a balancing act where the pro's and con's have to be carefully considered. Thanks for thinking along.

Sven

> Best regards,
> Henrik
>
>
> -----Original Message-----
> From: Pharo-dev [mailto:[hidden email]] On Behalf Of Sven Van Caekenberghe
> Sent: Wednesday, April 6, 2016 2:27 PM
> To: Pharo Development List <[hidden email]>
> Subject: Re: [Pharo-dev] [Pharo-users] STON materialization corrupts a dictionary if the keys are references
>
> Fix for review:
>
> ===
> Name: STON-Core-SvenVanCaekenberghe.71
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:24.782251 pm
> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
> Ancestors: STON-Core-SvenVanCaekenberghe.70
>
> Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
>
> Add 3 new unit tests to STONReaderTests
>
> #testDictionaryWithReferenceKeys
> #testSetWithReferenceElements
> #testDeepStructure
>
> Fix Details
>
> change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
>
> imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
>
> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference) ===
> Name: STON-Tests-SvenVanCaekenberghe.63
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:45.01986 pm
> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>
> (idem)
> ===
>
>> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Di
>> ctionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>>
>> fix coming
>>
>>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>>
>>>>
>>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>>>
>>>>
>>>>
>>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>>>
>>>>
>>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>>
>>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>>
>>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>>
>>>>> Is it urgent ?
>>>>>
>>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>>
>>>>>
>>>>>
>>>>
>>>> Hi Sven,
>>>>
>>>> I got the same kind of problem in a personal application.
>>>>
>>>> I use Sets that I serialize and I had a lot of trouble because
>>>> sometimes some action had strange behaviours.
>>>>
>>>> For example in a set with element `aSet remove: aSet anyOne` raised
>>>> 'XXX not found in aSet'.
>>>>
>>>> I am glad to hear that it is a Ston issue and not me that used sets
>>>> in a bad way :)
>>>>
>>>> For me too it is not urgent since I have a not of university work
>>>> for the moment.
>>>>
>>>>
>>>>
>>>> How are hashed collections created/filled during ston-parsing ?
>>>> If the position in a hashed collection is created by a
>>>> ston-reference, that is later replaced by the "real" object, the index in the dictionary  (or other hashed collections) may be wrong.
>>>
>>> Yes, that is indeed it, Nicolai.
>>>
>>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>>
>>>> --
>>>> Cyril Ferlicot
>>>>
>>>> http://www.synectique.eu
>>>>
>>>> 165 Avenue Bretagne
>>>> Lille 59000 France
>>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2
In reply to this post by Nicolai Hess-3-2
Hi Nicolai,

> On 06 Apr 2016, at 14:56, Nicolai Hess <[hidden email]> wrote:
>
>
>
> 2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
> Fix for review:
>
> ===
> Name: STON-Core-SvenVanCaekenberghe.71
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:24.782251 pm
> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
> Ancestors: STON-Core-SvenVanCaekenberghe.70
>
> Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
>
> Add 3 new unit tests to STONReaderTests
>
> #testDictionaryWithReferenceKeys
> #testSetWithReferenceElements
> #testDeepStructure
>
> Fix Details
>
> change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
>
> imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
>
> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
> ===
> Name: STON-Tests-SvenVanCaekenberghe.63
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:45.01986 pm
> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>
> (idem)
> ===

Thanks for looking at the code.

> Hi Sven,
> instead of rehashing the dictionary for every ston reference,

(It rehashes only once after resolving all references)

> wouldn't it work to remove and readd the value after processing the subobject:
>
> Dictionary>>#stonProcessSubObjects: block
>     self keys do:[:key |
>         |value|
>         value := block value:(self removeKey: key ifAbsent:[ nil]).
>         self at: (block value: key) put: value].

Interesting idea. I have to think about that approach.

Now, Object>>#stonProcessSubObjects: is very general and looks at named and indexed instance variables. But this probably could be replaced by something more high level and specific I guess.

Adding and removing each key/value has a cost too. I try to make the simplest case very efficient and only pay a price when really needed. Anyway, time for some calculations.

Thanks again for the suggestion !

Sven

> > On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
> >
> > https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
> >
> > fix coming
> >
> >> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
> >>
> >>>
> >>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
> >>>
> >>>
> >>>
> >>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
> >>>
> >>>
> >>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
> >>>
> >>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
> >>>>
> >>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
> >>>>
> >>>> Is it urgent ?
> >>>>
> >>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
> >>>>
> >>>>
> >>>>
> >>>
> >>> Hi Sven,
> >>>
> >>> I got the same kind of problem in a personal application.
> >>>
> >>> I use Sets that I serialize and I had a lot of trouble because sometimes
> >>> some action had strange behaviours.
> >>>
> >>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
> >>> not found in aSet'.
> >>>
> >>> I am glad to hear that it is a Ston issue and not me that used sets in a
> >>> bad way :)
> >>>
> >>> For me too it is not urgent since I have a not of university work for
> >>> the moment.
> >>>
> >>>
> >>>
> >>> How are hashed collections created/filled during ston-parsing ?
> >>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
> >>> the index in the dictionary  (or other hashed collections) may be wrong.
> >>
> >> Yes, that is indeed it, Nicolai.
> >>
> >> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
> >>
> >>> --
> >>> Cyril Ferlicot
> >>>
> >>> http://www.synectique.eu
> >>>
> >>> 165 Avenue Bretagne
> >>> Lille 59000 France
> >


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Nicolai Hess-3-2


2016-04-06 15:25 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
Hi Nicolai,

> On 06 Apr 2016, at 14:56, Nicolai Hess <[hidden email]> wrote:
>
>
>
> 2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
> Fix for review:
>
> ===
> Name: STON-Core-SvenVanCaekenberghe.71
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:24.782251 pm
> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
> Ancestors: STON-Core-SvenVanCaekenberghe.70
>
> Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
>
> Add 3 new unit tests to STONReaderTests
>
> #testDictionaryWithReferenceKeys
> #testSetWithReferenceElements
> #testDeepStructure
>
> Fix Details
>
> change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
>
> imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
>
> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
> ===
> Name: STON-Tests-SvenVanCaekenberghe.63
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:45.01986 pm
> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>
> (idem)
> ===

Thanks for looking at the code.

> Hi Sven,
> instead of rehashing the dictionary for every ston reference,

(It rehashes only once after resolving all references)

Ah, of course. I thought this would be called for every ston reference used as key.

 

> wouldn't it work to remove and readd the value after processing the subobject:
>
> Dictionary>>#stonProcessSubObjects: block
>     self keys do:[:key |
>         |value|
>         value := block value:(self removeKey: key ifAbsent:[ nil]).
>         self at: (block value: key) put: value].

Interesting idea. I have to think about that approach.

Now, Object>>#stonProcessSubObjects: is very general and looks at named and indexed instance variables. But this probably could be replaced by something more high level and specific I guess.

Adding and removing each key/value has a cost too. I try to make the simplest case very efficient and only pay a price when really needed. Anyway, time for some calculations.

ok, yes running over all keys isn't better than rehashing :-)
 

Thanks again for the suggestion !

Sven

> > On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
> >
> > https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
> >
> > fix coming
> >
> >> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
> >>
> >>>
> >>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
> >>>
> >>>
> >>>
> >>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
> >>>
> >>>
> >>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
> >>>
> >>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
> >>>>
> >>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
> >>>>
> >>>> Is it urgent ?
> >>>>
> >>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
> >>>>
> >>>>
> >>>>
> >>>
> >>> Hi Sven,
> >>>
> >>> I got the same kind of problem in a personal application.
> >>>
> >>> I use Sets that I serialize and I had a lot of trouble because sometimes
> >>> some action had strange behaviours.
> >>>
> >>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
> >>> not found in aSet'.
> >>>
> >>> I am glad to hear that it is a Ston issue and not me that used sets in a
> >>> bad way :)
> >>>
> >>> For me too it is not urgent since I have a not of university work for
> >>> the moment.
> >>>
> >>>
> >>>
> >>> How are hashed collections created/filled during ston-parsing ?
> >>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
> >>> the index in the dictionary  (or other hashed collections) may be wrong.
> >>
> >> Yes, that is indeed it, Nicolai.
> >>
> >> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
> >>
> >>> --
> >>> Cyril Ferlicot
> >>>
> >>> http://www.synectique.eu
> >>>
> >>> 165 Avenue Bretagne
> >>> Lille 59000 France
> >



Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2

> On 06 Apr 2016, at 15:34, Nicolai Hess <[hidden email]> wrote:
>
>
>
> 2016-04-06 15:25 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
> Hi Nicolai,
>
> > On 06 Apr 2016, at 14:56, Nicolai Hess <[hidden email]> wrote:
> >
> >
> >
> > 2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
> > Fix for review:
> >
> > ===
> > Name: STON-Core-SvenVanCaekenberghe.71
> > Author: SvenVanCaekenberghe
> > Time: 6 April 2016, 2:22:24.782251 pm
> > UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
> > Ancestors: STON-Core-SvenVanCaekenberghe.70
> >
> > Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
> >
> > Add 3 new unit tests to STONReaderTests
> >
> > #testDictionaryWithReferenceKeys
> > #testSetWithReferenceElements
> > #testDeepStructure
> >
> > Fix Details
> >
> > change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
> >
> > imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
> >
> > overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
> > ===
> > Name: STON-Tests-SvenVanCaekenberghe.63
> > Author: SvenVanCaekenberghe
> > Time: 6 April 2016, 2:22:45.01986 pm
> > UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
> > Ancestors: STON-Tests-SvenVanCaekenberghe.62
> >
> > (idem)
> > ===
>
> Thanks for looking at the code.
>
> > Hi Sven,
> > instead of rehashing the dictionary for every ston reference,
>
> (It rehashes only once after resolving all references)
>
> Ah, of course. I thought this would be called for every ston reference used as key.
>
>  
>
> > wouldn't it work to remove and readd the value after processing the subobject:
> >
> > Dictionary>>#stonProcessSubObjects: block
> >     self keys do:[:key |
> >         |value|
> >         value := block value:(self removeKey: key ifAbsent:[ nil]).
> >         self at: (block value: key) put: value].
>
> Interesting idea. I have to think about that approach.
>
> Now, Object>>#stonProcessSubObjects: is very general and looks at named and indexed instance variables. But this probably could be replaced by something more high level and specific I guess.
>
> Adding and removing each key/value has a cost too. I try to make the simplest case very efficient and only pay a price when really needed. Anyway, time for some calculations.
>
> ok, yes running over all keys isn't better than rehashing :-)

Still, your idea might be better than you would expect:

Now, there is 1 (partial) iteration to do the check before, 2 iterations over the keys and values arrays, then an optional full rehash (which also reallocates and thus generates garbage).

Your idea does only 1 iteration over keys, with remove and add (which also happens more efficiently on the arrays above), no check before, no rehash, probably no garbage generation at all in any case.

Like I said, I have to study it (especially the cost of remove/add, maybe that can be optimised a bit as well).

> Thanks again for the suggestion !
>
> Sven
>
> > > On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
> > >
> > > https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
> > >
> > > fix coming
> > >
> > >> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
> > >>
> > >>>
> > >>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
> > >>>
> > >>>
> > >>>
> > >>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
> > >>>
> > >>>
> > >>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
> > >>>
> > >>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
> > >>>>
> > >>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
> > >>>>
> > >>>> Is it urgent ?
> > >>>>
> > >>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
> > >>>>
> > >>>>
> > >>>>
> > >>>
> > >>> Hi Sven,
> > >>>
> > >>> I got the same kind of problem in a personal application.
> > >>>
> > >>> I use Sets that I serialize and I had a lot of trouble because sometimes
> > >>> some action had strange behaviours.
> > >>>
> > >>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
> > >>> not found in aSet'.
> > >>>
> > >>> I am glad to hear that it is a Ston issue and not me that used sets in a
> > >>> bad way :)
> > >>>
> > >>> For me too it is not urgent since I have a not of university work for
> > >>> the moment.
> > >>>
> > >>>
> > >>>
> > >>> How are hashed collections created/filled during ston-parsing ?
> > >>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
> > >>> the index in the dictionary  (or other hashed collections) may be wrong.
> > >>
> > >> Yes, that is indeed it, Nicolai.
> > >>
> > >> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
> > >>
> > >>> --
> > >>> Cyril Ferlicot
> > >>>
> > >>> http://www.synectique.eu
> > >>>
> > >>> 165 Avenue Bretagne
> > >>> Lille 59000 France
> > >


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Henrik Sperre Johansen
If you are iterating over a Set with incorrectly placed objects, remove: calls aren't going to do you much good ;)
Not to mention, even nilling the slot directly, then add:'ing, still means you have to scan subsequent entries up to the next empty slot for potentially better placement, if the object ended up being added elsewhere.
(IOW, if you're gonna do it, better iterate in reverse)

Cheers,
Henry

> On 06 Apr 2016, at 3:46 , Sven Van Caekenberghe <[hidden email]> wrote:
>
>
>> On 06 Apr 2016, at 15:34, Nicolai Hess <[hidden email]> wrote:
>>
>>
>>
>> 2016-04-06 15:25 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>> Hi Nicolai,
>>
>>> On 06 Apr 2016, at 14:56, Nicolai Hess <[hidden email]> wrote:
>>>
>>>
>>>
>>> 2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>> Fix for review:
>>>
>>> ===
>>> Name: STON-Core-SvenVanCaekenberghe.71
>>> Author: SvenVanCaekenberghe
>>> Time: 6 April 2016, 2:22:24.782251 pm
>>> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
>>> Ancestors: STON-Core-SvenVanCaekenberghe.70
>>>
>>> Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
>>>
>>> Add 3 new unit tests to STONReaderTests
>>>
>>> #testDictionaryWithReferenceKeys
>>> #testSetWithReferenceElements
>>> #testDeepStructure
>>>
>>> Fix Details
>>>
>>> change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
>>>
>>> imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
>>>
>>> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
>>> ===
>>> Name: STON-Tests-SvenVanCaekenberghe.63
>>> Author: SvenVanCaekenberghe
>>> Time: 6 April 2016, 2:22:45.01986 pm
>>> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
>>> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>>>
>>> (idem)
>>> ===
>>
>> Thanks for looking at the code.
>>
>>> Hi Sven,
>>> instead of rehashing the dictionary for every ston reference,
>>
>> (It rehashes only once after resolving all references)
>>
>> Ah, of course. I thought this would be called for every ston reference used as key.
>>
>>
>>
>>> wouldn't it work to remove and readd the value after processing the subobject:
>>>
>>> Dictionary>>#stonProcessSubObjects: block
>>>    self keys do:[:key |
>>>        |value|
>>>        value := block value:(self removeKey: key ifAbsent:[ nil]).
>>>        self at: (block value: key) put: value].
>>
>> Interesting idea. I have to think about that approach.
>>
>> Now, Object>>#stonProcessSubObjects: is very general and looks at named and indexed instance variables. But this probably could be replaced by something more high level and specific I guess.
>>
>> Adding and removing each key/value has a cost too. I try to make the simplest case very efficient and only pay a price when really needed. Anyway, time for some calculations.
>>
>> ok, yes running over all keys isn't better than rehashing :-)
>
> Still, your idea might be better than you would expect:
>
> Now, there is 1 (partial) iteration to do the check before, 2 iterations over the keys and values arrays, then an optional full rehash (which also reallocates and thus generates garbage).
>
> Your idea does only 1 iteration over keys, with remove and add (which also happens more efficiently on the arrays above), no check before, no rehash, probably no garbage generation at all in any case.
>
> Like I said, I have to study it (especially the cost of remove/add, maybe that can be optimised a bit as well).
>
>> Thanks again for the suggestion !
>>
>> Sven
>>
>>>> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>
>>>> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>>>>
>>>> fix coming
>>>>
>>>>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>
>>>>>>
>>>>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>>>>>
>>>>>>
>>>>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>>>>
>>>>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>>>>
>>>>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>>>>
>>>>>>> Is it urgent ?
>>>>>>>
>>>>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Hi Sven,
>>>>>>
>>>>>> I got the same kind of problem in a personal application.
>>>>>>
>>>>>> I use Sets that I serialize and I had a lot of trouble because sometimes
>>>>>> some action had strange behaviours.
>>>>>>
>>>>>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
>>>>>> not found in aSet'.
>>>>>>
>>>>>> I am glad to hear that it is a Ston issue and not me that used sets in a
>>>>>> bad way :)
>>>>>>
>>>>>> For me too it is not urgent since I have a not of university work for
>>>>>> the moment.
>>>>>>
>>>>>>
>>>>>>
>>>>>> How are hashed collections created/filled during ston-parsing ?
>>>>>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
>>>>>> the index in the dictionary  (or other hashed collections) may be wrong.
>>>>>
>>>>> Yes, that is indeed it, Nicolai.
>>>>>
>>>>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>>>>
>>>>>> --
>>>>>> Cyril Ferlicot
>>>>>>
>>>>>> http://www.synectique.eu
>>>>>>
>>>>>> 165 Avenue Bretagne
>>>>>> Lille 59000 France
>>>>
>
>


signature.asc (859 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe-2
Yeah, I just realised that iterating and changing a dictionary or set at the same time won't work ;-)

> On 06 Apr 2016, at 16:01, Henrik Johansen <[hidden email]> wrote:
>
> If you are iterating over a Set with incorrectly placed objects, remove: calls aren't going to do you much good ;)
> Not to mention, even nilling the slot directly, then add:'ing, still means you have to scan subsequent entries up to the next empty slot for potentially better placement, if the object ended up being added elsewhere.
> (IOW, if you're gonna do it, better iterate in reverse)
>
> Cheers,
> Henry
>
>> On 06 Apr 2016, at 3:46 , Sven Van Caekenberghe <[hidden email]> wrote:
>>
>>
>>> On 06 Apr 2016, at 15:34, Nicolai Hess <[hidden email]> wrote:
>>>
>>>
>>>
>>> 2016-04-06 15:25 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>> Hi Nicolai,
>>>
>>>> On 06 Apr 2016, at 14:56, Nicolai Hess <[hidden email]> wrote:
>>>>
>>>>
>>>>
>>>> 2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>>> Fix for review:
>>>>
>>>> ===
>>>> Name: STON-Core-SvenVanCaekenberghe.71
>>>> Author: SvenVanCaekenberghe
>>>> Time: 6 April 2016, 2:22:24.782251 pm
>>>> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
>>>> Ancestors: STON-Core-SvenVanCaekenberghe.70
>>>>
>>>> Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
>>>>
>>>> Add 3 new unit tests to STONReaderTests
>>>>
>>>> #testDictionaryWithReferenceKeys
>>>> #testSetWithReferenceElements
>>>> #testDeepStructure
>>>>
>>>> Fix Details
>>>>
>>>> change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
>>>>
>>>> imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
>>>>
>>>> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
>>>> ===
>>>> Name: STON-Tests-SvenVanCaekenberghe.63
>>>> Author: SvenVanCaekenberghe
>>>> Time: 6 April 2016, 2:22:45.01986 pm
>>>> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
>>>> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>>>>
>>>> (idem)
>>>> ===
>>>
>>> Thanks for looking at the code.
>>>
>>>> Hi Sven,
>>>> instead of rehashing the dictionary for every ston reference,
>>>
>>> (It rehashes only once after resolving all references)
>>>
>>> Ah, of course. I thought this would be called for every ston reference used as key.
>>>
>>>
>>>
>>>> wouldn't it work to remove and readd the value after processing the subobject:
>>>>
>>>> Dictionary>>#stonProcessSubObjects: block
>>>>   self keys do:[:key |
>>>>       |value|
>>>>       value := block value:(self removeKey: key ifAbsent:[ nil]).
>>>>       self at: (block value: key) put: value].
>>>
>>> Interesting idea. I have to think about that approach.
>>>
>>> Now, Object>>#stonProcessSubObjects: is very general and looks at named and indexed instance variables. But this probably could be replaced by something more high level and specific I guess.
>>>
>>> Adding and removing each key/value has a cost too. I try to make the simplest case very efficient and only pay a price when really needed. Anyway, time for some calculations.
>>>
>>> ok, yes running over all keys isn't better than rehashing :-)
>>
>> Still, your idea might be better than you would expect:
>>
>> Now, there is 1 (partial) iteration to do the check before, 2 iterations over the keys and values arrays, then an optional full rehash (which also reallocates and thus generates garbage).
>>
>> Your idea does only 1 iteration over keys, with remove and add (which also happens more efficiently on the arrays above), no check before, no rehash, probably no garbage generation at all in any case.
>>
>> Like I said, I have to study it (especially the cost of remove/add, maybe that can be optimised a bit as well).
>>
>>> Thanks again for the suggestion !
>>>
>>> Sven
>>>
>>>>> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>
>>>>> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>>>>>
>>>>> fix coming
>>>>>
>>>>>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>
>>>>>>>
>>>>>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>>>>>>
>>>>>>>
>>>>>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>>>>>
>>>>>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>>>>>
>>>>>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>>>>>
>>>>>>>> Is it urgent ?
>>>>>>>>
>>>>>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> Hi Sven,
>>>>>>>
>>>>>>> I got the same kind of problem in a personal application.
>>>>>>>
>>>>>>> I use Sets that I serialize and I had a lot of trouble because sometimes
>>>>>>> some action had strange behaviours.
>>>>>>>
>>>>>>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
>>>>>>> not found in aSet'.
>>>>>>>
>>>>>>> I am glad to hear that it is a Ston issue and not me that used sets in a
>>>>>>> bad way :)
>>>>>>>
>>>>>>> For me too it is not urgent since I have a not of university work for
>>>>>>> the moment.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> How are hashed collections created/filled during ston-parsing ?
>>>>>>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
>>>>>>> the index in the dictionary  (or other hashed collections) may be wrong.
>>>>>>
>>>>>> Yes, that is indeed it, Nicolai.
>>>>>>
>>>>>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>>>>>
>>>>>>> --
>>>>>>> Cyril Ferlicot
>>>>>>>
>>>>>>> http://www.synectique.eu
>>>>>>>
>>>>>>> 165 Avenue Bretagne
>>>>>>> Lille 59000 France
>>>>>
>>
>>
>


Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Henrik Sperre Johansen
The specified operation of fixing a badly hashed set will, if you go in reverse ;)
Removing (or, nilling, then checking for subsequent hits) only ever moves objects at indexes following the one you removed at (and have already verified to be correct)
Adding might result in the removed object being put at a lower index, which means you may process it twice, but that's not the end of the world (since the second time, index will be valid, and only one slot is affected).

Goes without saying such an operation does some arguably nasty stuff using HashedCollection internals, so would need to be implemented there...
Might've been a better compromise to write an optimized removeAllSuchThat: (followed by adding the same over again) fixing slots as it goes, ala the implementation in OrderedCollection. But no, keeping it consistent, it would have to be the values passed as block parameters :(

Cheers,
Henry

> On 06 Apr 2016, at 4:07 , Sven Van Caekenberghe <[hidden email]> wrote:
>
> Yeah, I just realised that iterating and changing a dictionary or set at the same time won't work ;-)
>
>> On 06 Apr 2016, at 16:01, Henrik Johansen <[hidden email]> wrote:
>>
>> If you are iterating over a Set with incorrectly placed objects, remove: calls aren't going to do you much good ;)
>> Not to mention, even nilling the slot directly, then add:'ing, still means you have to scan subsequent entries up to the next empty slot for potentially better placement, if the object ended up being added elsewhere.
>> (IOW, if you're gonna do it, better iterate in reverse)
>>
>> Cheers,
>> Henry
>>
>>> On 06 Apr 2016, at 3:46 , Sven Van Caekenberghe <[hidden email]> wrote:
>>>
>>>
>>>> On 06 Apr 2016, at 15:34, Nicolai Hess <[hidden email]> wrote:
>>>>
>>>>
>>>>
>>>> 2016-04-06 15:25 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>>> Hi Nicolai,
>>>>
>>>>> On 06 Apr 2016, at 14:56, Nicolai Hess <[hidden email]> wrote:
>>>>>
>>>>>
>>>>>
>>>>> 2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>>>> Fix for review:
>>>>>
>>>>> ===
>>>>> Name: STON-Core-SvenVanCaekenberghe.71
>>>>> Author: SvenVanCaekenberghe
>>>>> Time: 6 April 2016, 2:22:24.782251 pm
>>>>> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
>>>>> Ancestors: STON-Core-SvenVanCaekenberghe.70
>>>>>
>>>>> Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
>>>>>
>>>>> Add 3 new unit tests to STONReaderTests
>>>>>
>>>>> #testDictionaryWithReferenceKeys
>>>>> #testSetWithReferenceElements
>>>>> #testDeepStructure
>>>>>
>>>>> Fix Details
>>>>>
>>>>> change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
>>>>>
>>>>> imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
>>>>>
>>>>> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
>>>>> ===
>>>>> Name: STON-Tests-SvenVanCaekenberghe.63
>>>>> Author: SvenVanCaekenberghe
>>>>> Time: 6 April 2016, 2:22:45.01986 pm
>>>>> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
>>>>> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>>>>>
>>>>> (idem)
>>>>> ===
>>>>
>>>> Thanks for looking at the code.
>>>>
>>>>> Hi Sven,
>>>>> instead of rehashing the dictionary for every ston reference,
>>>>
>>>> (It rehashes only once after resolving all references)
>>>>
>>>> Ah, of course. I thought this would be called for every ston reference used as key.
>>>>
>>>>
>>>>
>>>>> wouldn't it work to remove and readd the value after processing the subobject:
>>>>>
>>>>> Dictionary>>#stonProcessSubObjects: block
>>>>>  self keys do:[:key |
>>>>>      |value|
>>>>>      value := block value:(self removeKey: key ifAbsent:[ nil]).
>>>>>      self at: (block value: key) put: value].
>>>>
>>>> Interesting idea. I have to think about that approach.
>>>>
>>>> Now, Object>>#stonProcessSubObjects: is very general and looks at named and indexed instance variables. But this probably could be replaced by something more high level and specific I guess.
>>>>
>>>> Adding and removing each key/value has a cost too. I try to make the simplest case very efficient and only pay a price when really needed. Anyway, time for some calculations.
>>>>
>>>> ok, yes running over all keys isn't better than rehashing :-)
>>>
>>> Still, your idea might be better than you would expect:
>>>
>>> Now, there is 1 (partial) iteration to do the check before, 2 iterations over the keys and values arrays, then an optional full rehash (which also reallocates and thus generates garbage).
>>>
>>> Your idea does only 1 iteration over keys, with remove and add (which also happens more efficiently on the arrays above), no check before, no rehash, probably no garbage generation at all in any case.
>>>
>>> Like I said, I have to study it (especially the cost of remove/add, maybe that can be optimised a bit as well).
>>>
>>>> Thanks again for the suggestion !
>>>>
>>>> Sven
>>>>
>>>>>> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>
>>>>>> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>>>>>>
>>>>>> fix coming
>>>>>>
>>>>>>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>>>>>>
>>>>>>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>>>>>>
>>>>>>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>>>>>>
>>>>>>>>> Is it urgent ?
>>>>>>>>>
>>>>>>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Sven,
>>>>>>>>
>>>>>>>> I got the same kind of problem in a personal application.
>>>>>>>>
>>>>>>>> I use Sets that I serialize and I had a lot of trouble because sometimes
>>>>>>>> some action had strange behaviours.
>>>>>>>>
>>>>>>>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
>>>>>>>> not found in aSet'.
>>>>>>>>
>>>>>>>> I am glad to hear that it is a Ston issue and not me that used sets in a
>>>>>>>> bad way :)
>>>>>>>>
>>>>>>>> For me too it is not urgent since I have a not of university work for
>>>>>>>> the moment.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> How are hashed collections created/filled during ston-parsing ?
>>>>>>>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
>>>>>>>> the index in the dictionary  (or other hashed collections) may be wrong.
>>>>>>>
>>>>>>> Yes, that is indeed it, Nicolai.
>>>>>>>
>>>>>>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>>>>>>
>>>>>>>> --
>>>>>>>> Cyril Ferlicot
>>>>>>>>
>>>>>>>> http://www.synectique.eu
>>>>>>>>
>>>>>>>> 165 Avenue Bretagne
>>>>>>>> Lille 59000 France
>>>>>>
>>>
>>>
>>
>
>


signature.asc (859 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Henrik Sperre Johansen
In reply to this post by Sven Van Caekenberghe-2
Not really about modifying during iteration, if an object you try to remove: is in the wrong slot, you'll most likely get a CannotBeFound error.

Which you can work around by iterating indexes and nilling directly, doing so in reverse merely minimizes the amount of work needed to potentially fill the slot when it is still empty after re-adding.
(Maintain index of next nil slot as well, test backwards from that, and you should get away with the minimal amount of work required... Just need to ensure it's still nil, and not the slot removed object was put in)

Needless to say, it'd probably end up pretty hairy, and still have worse
worse-case perf* than rehash, though average-case would probably be much better.

Cheers,
Henry

* worst-case performance in linearly hashed tables is *much* worse for remove than add, due to the potential fixups needed when the set is dense... Consider at:ifAbsent: perf experienced for missing entries in default sized Unicode tables with hash at start of longest non-nil sequence of slots, and multiply by 100

> On 6. apr. 2016, at 21.07, Sven Van Caekenberghe <[hidden email]> wrote:
>
> Yeah, I just realised that iterating and changing a dictionary or set at the same time won't work ;-)
>
>> On 06 Apr 2016, at 16:01, Henrik Johansen <[hidden email]> wrote:
>>
>> If you are iterating over a Set with incorrectly placed objects, remove: calls aren't going to do you much good ;)
>> Not to mention, even nilling the slot directly, then add:'ing, still means you have to scan subsequent entries up to the next empty slot for potentially better placement, if the object ended up being added elsewhere.
>> (IOW, if you're gonna do it, better iterate in reverse)
>>
>> Cheers,
>> Henry
>>
>>> On 06 Apr 2016, at 3:46 , Sven Van Caekenberghe <[hidden email]> wrote:
>>>
>>>
>>>> On 06 Apr 2016, at 15:34, Nicolai Hess <[hidden email]> wrote:
>>>>
>>>>
>>>>
>>>> 2016-04-06 15:25 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>>> Hi Nicolai,
>>>>
>>>>> On 06 Apr 2016, at 14:56, Nicolai Hess <[hidden email]> wrote:
>>>>>
>>>>>
>>>>>
>>>>> 2016-04-06 14:27 GMT+02:00 Sven Van Caekenberghe <[hidden email]>:
>>>>> Fix for review:
>>>>>
>>>>> ===
>>>>> Name: STON-Core-SvenVanCaekenberghe.71
>>>>> Author: SvenVanCaekenberghe
>>>>> Time: 6 April 2016, 2:22:24.782251 pm
>>>>> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
>>>>> Ancestors: STON-Core-SvenVanCaekenberghe.70
>>>>>
>>>>> Fix a bug where STONReferences occurring as keys in Dictionaries or elements in Sets caused those to be unhealthy after materialization. Thx to Peter Uhnák for reporting this issue.
>>>>>
>>>>> Add 3 new unit tests to STONReaderTests
>>>>>
>>>>> #testDictionaryWithReferenceKeys
>>>>> #testSetWithReferenceElements
>>>>> #testDeepStructure
>>>>>
>>>>> Fix Details
>>>>>
>>>>> change the implementation of STONReader>>#processSubObjectsOf: from iterative to recursive (see version 39 of 29 November 2012, this might be a functional regression, see #testDeepStructure; cleanup of stack instance variable for later) so that #stonProcessSubObjects: can be overwritten with code being executed before or after full reference resolution
>>>>>
>>>>> imho, recursion stack depth will be equal during both writing and reading, and should be acceptable.
>>>>>
>>>>> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the end, but only when needed (minimal optimalization, see Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference)
>>>>> ===
>>>>> Name: STON-Tests-SvenVanCaekenberghe.63
>>>>> Author: SvenVanCaekenberghe
>>>>> Time: 6 April 2016, 2:22:45.01986 pm
>>>>> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
>>>>> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>>>>>
>>>>> (idem)
>>>>> ===
>>>>
>>>> Thanks for looking at the code.
>>>>
>>>>> Hi Sven,
>>>>> instead of rehashing the dictionary for every ston reference,
>>>>
>>>> (It rehashes only once after resolving all references)
>>>>
>>>> Ah, of course. I thought this would be called for every ston reference used as key.
>>>>
>>>>
>>>>
>>>>> wouldn't it work to remove and readd the value after processing the subobject:
>>>>>
>>>>> Dictionary>>#stonProcessSubObjects: block
>>>>>  self keys do:[:key |
>>>>>      |value|
>>>>>      value := block value:(self removeKey: key ifAbsent:[ nil]).
>>>>>      self at: (block value: key) put: value].
>>>>
>>>> Interesting idea. I have to think about that approach.
>>>>
>>>> Now, Object>>#stonProcessSubObjects: is very general and looks at named and indexed instance variables. But this probably could be replaced by something more high level and specific I guess.
>>>>
>>>> Adding and removing each key/value has a cost too. I try to make the simplest case very efficient and only pay a price when really needed. Anyway, time for some calculations.
>>>>
>>>> ok, yes running over all keys isn't better than rehashing :-)
>>>
>>> Still, your idea might be better than you would expect:
>>>
>>> Now, there is 1 (partial) iteration to do the check before, 2 iterations over the keys and values arrays, then an optional full rehash (which also reallocates and thus generates garbage).
>>>
>>> Your idea does only 1 iteration over keys, with remove and add (which also happens more efficiently on the arrays above), no check before, no rehash, probably no garbage generation at all in any case.
>>>
>>> Like I said, I have to study it (especially the cost of remove/add, maybe that can be optimised a bit as well).
>>>
>>>> Thanks again for the suggestion !
>>>>
>>>> Sven
>>>>
>>>>>> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>
>>>>>> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Dictionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>>>>>>
>>>>>> fix coming
>>>>>>
>>>>>>>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque <[hidden email]>:
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>>>>>>>
>>>>>>>>> Like I said, it is a hashing issue, sometimes it will be correct by accident.
>>>>>>>>>
>>>>>>>>> I hope you did not have to much trouble with this bug, I guess it must have been hard to chase.
>>>>>>>>>
>>>>>>>>> Is it urgent ?
>>>>>>>>>
>>>>>>>>> I probably can give you a quick fix, but I would like to think a bit more about this, since rehashing each materialised dictionary seems expensive.
>>>>>>>>
>>>>>>>> Hi Sven,
>>>>>>>>
>>>>>>>> I got the same kind of problem in a personal application.
>>>>>>>>
>>>>>>>> I use Sets that I serialize and I had a lot of trouble because sometimes
>>>>>>>> some action had strange behaviours.
>>>>>>>>
>>>>>>>> For example in a set with element `aSet remove: aSet anyOne` raised 'XXX
>>>>>>>> not found in aSet'.
>>>>>>>>
>>>>>>>> I am glad to hear that it is a Ston issue and not me that used sets in a
>>>>>>>> bad way :)
>>>>>>>>
>>>>>>>> For me too it is not urgent since I have a not of university work for
>>>>>>>> the moment.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> How are hashed collections created/filled during ston-parsing ?
>>>>>>>> If the position in a hashed collection is created by a ston-reference, that is later replaced by the "real" object,
>>>>>>>> the index in the dictionary  (or other hashed collections) may be wrong.
>>>>>>>
>>>>>>> Yes, that is indeed it, Nicolai.
>>>>>>>
>>>>>>> But I would like to try to minimise the rehashing as it seems expensive. But first I need a more reliable failure.
>>>>>>>
>>>>>>>> --
>>>>>>>> Cyril Ferlicot
>>>>>>>>
>>>>>>>> http://www.synectique.eu
>>>>>>>>
>>>>>>>> 165 Avenue Bretagne
>>>>>>>> Lille 59000 France
>
>

12