Ephemerons - hope it is right this time :)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Ephemerons - hope it is right this time :)

Igor Stasenko
 
I have corrected implementation, hope it is right this time :)

If Ephemeron's key is almost-collectable (reachable only by ephemeron itself),
then it is reported to image side.
Then VM continues tracing ephemeron as object with all strong slots,
and if there are other ephemerons
which need to be reported as well, same procedure will be done for them.
It will loop until no other ephemerons with almost-collectable keys
can be found and nothing left to be traced.

See
http://code.google.com/p/cog/issues/detail?id=44

for the code.

(I will upload the VMMaker package with changes now).

--
Best regards,
Igor Stasenko AKA sig.

ephemerons-imageside.3.cs (12K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Ephemerons - hope it is right this time :)

Igor Stasenko

There is one slight thing, which i had doubts about.
Initially, when i was revising the algorithm, i followed the same road
as in paper:
"
In the third phase, the collector traces the remaining objects,
beginning at the ephemerons still on the queue. Any ephemerons
encountered in this phase are treated as ordinary objects, and all
fields traced’.
"

It means that VM will report only the topmost ephemerons (which has
keys non-reachable from roots),
but not consider ephemerons which are reachanble from those ones,
which will be discovered later, when queue will be processed.

The difference is, whether VM should also report nested ephemeron or not.
Consider a following test:

testNestedEphemerons

        " test that we will get a notification from nested ephemeron"
       
        | e key key2 val list count |

        list := WeakFinalizationList forEphemerons.

        e := Ephemeron newForList: list.

        key := Object new.
        key2 := Object new.

        e key: key value: (
                Array with: (
                (Ephemeron newForList: list)
                        key: key2 value: 1@2 )
        ).

        key := key2 := nil.
        Smalltalk garbageCollect.

        count := 0.
        list finalizationAction: [:eph | count := count + 1].
        list finalizeValues.
       
        self assert: count = 2
       

Here, the top ephemeron (e)
should be reported as one which having almost-collectable key.
But then VM has to mark its fields (otherwise they will be lost, and
we don't want that before image can handle it),
and discovers a nested ephemeron.
But nested one also has its key almost-collectable.
So, here the difference: should VM trace this ephemeron as usual
object (with all slots as strong references, like paper says),
or it should also report this ephemeron as one with almost-collectable key.

By following a paper description, at this stage, a nested ephemeron
should not be reported (so, in above test , a resulting count must be
== 1, but not 2).

This is quite obscure case (usually ephemerons not form a nested
loops), since then it is a question, which one should be finalized
first, and which one second.
So, for most use cases, this difference will not affect anything.
But if this will happen, it may lead to situation, that nested
ephemeron won't be finalized, so you could forget to free external
resources etc.
That's why i think, its better to report even nested ephemerons.
Otherwise if you finalize the topmost ephemeron and set its key (or
values) to nil,
then there will be no other references to nested ephemeron and it will
be garbage collected without any notice.
But if we report it, then image side can decide if it needs to be
finalized or not, regardless that the only reference to given
ephemeron are coming
from another ephemeron , which also needs to be finalized.

--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: Ephemerons - hope it is right this time :)

Eliot Miranda-2
In reply to this post by Igor Stasenko
 
Hi Igor,

On Thu, Jun 2, 2011 at 5:00 AM, Igor Stasenko <[hidden email]> wrote:
 
I have corrected implementation, hope it is right this time :)

If Ephemeron's key is almost-collectable (reachable only by ephemeron itself),
then it is reported to image side.

That's not right.  The correct definition is "reachable only by ephemerons" (plural).  The way to think about it is that at any stage in the GCs trace it has traced from the roots up to, but not through ephemerons that have keys not reachable from the roots.  So it has collected the current set of ephemerons whose keys are only reachable from ephemerons.  This set of ephemerons are to mourn (be activated, to be queued for finalization) since their keys are otherwise unreachable.  This set is added to the finalization queue and now tracing continues through those ephemerons.  This subsequent tracing may also encounter a new set of ephemerons whose keys are only reachable from ephemerons.  So this set is added to the finalization queue and so on until no more objects can be traced.  The GC can now finish and all the ephemerons in the finalization queue can mourn their keys.

This is really important as it allows one to add mounr behaviours independent of a particular object.  One can have more than one mourn action, each occurring through a different ephemeron.

And that reminds me, in your implementation how is an object marked as being an ephemeron?  Is it a header bit or does it depend on a class?  If it depends on a class that's an unsatisfactory, but acceptable limitation, since we can't create different types of ephemeron for different tasks.  Instead, if we want different mourn behaviours they have to be based on some mapping from ephemeron to something else, such as the dictionary an ephemeron is in (ephemerons are specifically designed to look like Associations so they can liv comfortably in Dictionaries).

HTH
Eliot
 
Then VM continues tracing ephemeron as object with all strong slots,
and if there are other ephemerons
which need to be reported as well, same procedure will be done for them.
It will loop until no other ephemerons with almost-collectable keys
can be found and nothing left to be traced.

See
http://code.google.com/p/cog/issues/detail?id=44

for the code.

(I will upload the VMMaker package with changes now).

--
Best regards,
Igor Stasenko AKA sig.


Reply | Threaded
Open this post in threaded view
|

Re: Ephemerons - hope it is right this time :)

Eliot Miranda-2
In reply to this post by Igor Stasenko
 
Hi Igor,

    you have identified a key point I was trying to explain earlier.

On Thu, Jun 2, 2011 at 5:36 AM, Igor Stasenko <[hidden email]> wrote:

There is one slight thing, which i had doubts about.
Initially, when i was revising the algorithm, i followed the same road
as in paper:
"
In the third phase, the collector traces the remaining objects,
beginning at the ephemerons still on the queue. Any ephemerons
encountered in this phase are treated as ordinary objects, and all
fields traced’.
"

This  was Barry Hayes' misunderstanding and the thing I had to fix in his initial implementation in the VW VM, wen I finished and released VW ephemerons.  It is not a third phase. Instead, the ephemeron detecting process is a sub-part of GC tracing that continues until tracing reaches the fixed point of not finding any more morning ephemerons.  Discovering a mourning ephemeron also discovers a new root (the ephemeron) but that root cannot be traced until all mourning ephemerons have been discovered in the current cycle.  Once the cycle is complete then the mourning ephemerons can be traced which in turn may discover more mourning ephemerons until the fixed point is reached.

So the paper is wrong.  I can show you private communications between myself and George Bosworth (the inventor of ephemerons in VSE) discussing precisely this, but I would prefer it if you would take my word for it :)


It means that VM will report only the topmost ephemerons (which has
keys non-reachable from roots),
but not consider ephemerons which are reachanble from those ones,
which will be discovered later, when queue will be processed.

The difference is, whether VM should also report nested ephemeron or not.

It must report nested ephemerons.
 
Consider a following test:

testNestedEphemerons

       " test that we will get a notification from nested ephemeron"

       | e key key2 val list count |

       list := WeakFinalizationList forEphemerons.

       e := Ephemeron newForList: list.

       key := Object new.
       key2 := Object new.

       e key: key value: (
               Array with: (
               (Ephemeron newForList: list)
                       key: key2 value: 1@2 )
       ).

       key := key2 := nil.
       Smalltalk garbageCollect.

       count := 0.
       list finalizationAction: [:eph | count := count + 1].
       list finalizeValues.

       self assert: count = 2


Here, the top ephemeron (e)
should be reported as one which having almost-collectable key.
But then VM has to mark its fields (otherwise they will be lost, and
we don't want that before image can handle it),
and discovers a nested ephemeron.
But nested one also has its key almost-collectable.
So, here the difference: should VM trace this ephemeron as usual
object (with all slots as strong references, like paper says),
or it should also report this ephemeron as one with almost-collectable key.

By following a paper description, at this stage, a nested ephemeron
should not be reported (so, in above test , a resulting count must be
== 1, but not 2).

This is quite obscure case (usually ephemerons not form a nested
loops), since then it is a question, which one should be finalized
first, and which one second.
So, for most use cases, this difference will not affect anything.
But if this will happen, it may lead to situation, that nested
ephemeron won't be finalized, so you could forget to free external
resources etc.
That's why i think, its better to report even nested ephemerons.
Otherwise if you finalize the topmost ephemeron and set its key (or
values) to nil,
then there will be no other references to nested ephemeron and it will
be garbage collected without any notice.
But if we report it, then image side can decide if it needs to be
finalized or not, regardless that the only reference to given
ephemeron are coming
from another ephemeron , which also needs to be finalized.

--
Best regards,
Igor Stasenko AKA sig.

HTH
Eliot 

Reply | Threaded
Open this post in threaded view
|

Re: Ephemerons - hope it is right this time :)

Igor Stasenko
In reply to this post by Eliot Miranda-2

On 2 June 2011 21:38, Eliot Miranda <[hidden email]> wrote:

>
> Hi Igor,
>
> On Thu, Jun 2, 2011 at 5:00 AM, Igor Stasenko <[hidden email]> wrote:
>>
>>
>> I have corrected implementation, hope it is right this time :)
>>
>> If Ephemeron's key is almost-collectable (reachable only by ephemeron itself),
>> then it is reported to image side.
>
> That's not right.  The correct definition is "reachable only by ephemerons" (plural).  The way to think about it is that at any stage in the GCs trace it has traced from the roots up to, but not through ephemerons that have keys not reachable from the roots.  So it has collected the current set of ephemerons whose keys are only reachable from ephemerons.  This set of ephemerons are to mourn (be activated, to be queued for finalization) since their keys are otherwise unreachable.  This set is added to the finalization queue and now tracing continues through those ephemerons.  This subsequent tracing may also encounter a new set of ephemerons whose keys are only reachable from ephemerons.  So this set is added to the finalization queue and so on until no more objects can be traced.  The GC can now finish and all the ephemerons in the finalization queue can mourn their keys.
> This is really important as it allows one to add mounr behaviours independent of a particular object.  One can have more than one mourn action, each occurring through a different ephemeron.

> And that reminds me, in your implementation how is an object marked as being an ephemeron?  Is it a header bit or does it depend on a class?

i didn't wanted to add new special object, so to tell if given object
as ephemeron there are multiple checks:
  - an object has weak slots
  - an object has at least 2 inst vars
  - first inst var points to an instance of ClassWeakFinalizer
the above conditions are same as for weak finalization scheme which i
introduced before.
Now, if instance of ClassWeakFinalizer has 2nd instance variable == true, then
the object in question is ephemeron.

(the list of conditions is a bit long, but i just don't want to add
new special object(s), and actually the overhead is not that much ;)

Btw, i think that i found a generic "object markers" scheme, which
allows to mark an object as a special one, but dont put too much
constraints to the object itself:
imagine that an object is special, if its first ivar points to an
instance of ClassSpecialObject
and then that instance could also contain an instance variable, like a
small int number , which denoting, what kind of special object it is.

With such markers, we easily could have 2^31 various kinds of special
objects, without need of extending special objects array at all.
Another way is to use unique instances of ClassSpecialObject to denote
a kind of special object. But then image side needs to have a
registeration mechanism and communicate with VM for registering them.
Anyways,by putting in a first inst var an instance of
ClassSpecialObject, it quite cheap to check then, if given object is a
"special one".

> If it depends on a class that's an unsatisfactory, but acceptable limitation, since we can't create different types of ephemeron for different tasks.

As you can see from the list of conditions above, you can use any
class for ephemeron. Or just subclass Ephemeron class, which is what i
expect will be most often used.

> Instead, if we want different mourn behaviours they have to be based on some mapping from ephemeron to something else, such as the >dictionary an ephemeron is in (ephemerons are specifically designed to look like Associations so they can liv comfortably in Dictionaries).

Sure, at image side there is a way to decide how to mourn ephemerons
one way or another.
VM are not participating in it (it just reports ones), so there is
virtually nothing which prevents you from implementing various
mourning actions.

My implementation allows to have ephemerons with any number of
variable slots >= 1 , means:
key only
key and value
key and many values , as many as you want

> HTH
> Eliot
>

--
Best regards,
Igor Stasenko AKA sig.
Reply | Threaded
Open this post in threaded view
|

Re: Ephemerons - hope it is right this time :)

Igor Stasenko
In reply to this post by Eliot Miranda-2

On 2 June 2011 21:44, Eliot Miranda <[hidden email]> wrote:

>
> Hi Igor,
>     you have identified a key point I was trying to explain earlier.
>
> On Thu, Jun 2, 2011 at 5:36 AM, Igor Stasenko <[hidden email]> wrote:
>>
>> There is one slight thing, which i had doubts about.
>> Initially, when i was revising the algorithm, i followed the same road
>> as in paper:
>> "
>> In the third phase, the collector traces the remaining objects,
>> beginning at the ephemerons still on the queue. Any ephemerons
>> encountered in this phase are treated as ordinary objects, and all
>> fields traced’.
>> "
>
> This  was Barry Hayes' misunderstanding and the thing I had to fix in his initial implementation in the VW VM, wen I finished and released VW ephemerons.  It is not a third phase. Instead, the ephemeron detecting process is a sub-part of GC tracing that continues until tracing reaches the fixed point of not finding any more morning ephemerons.  Discovering a mourning ephemeron also discovers a new root (the ephemeron) but that root cannot be traced until all mourning ephemerons have been discovered in the current cycle.  Once the cycle is complete then the mourning ephemerons can be traced which in turn may discover more mourning ephemerons until the fixed point is reached.
> So the paper is wrong.  I can show you private communications between myself and George Bosworth (the inventor of ephemerons in VSE) discussing precisely this, but I would prefer it if you would take my word for it :)

Yes, i felt that this "third" phase are not quite consistent with
purpose of ephemerons.
And moreover, i think , its not hurts to report something in advance
to what one expecting.
It's much worse when you don't receive a notification, when you expecting it.

>>
>> It means that VM will report only the topmost ephemerons (which has
>> keys non-reachable from roots),
>> but not consider ephemerons which are reachanble from those ones,
>> which will be discovered later, when queue will be processed.
>>
>> The difference is, whether VM should also report nested ephemeron or not.
>
> It must report nested ephemerons.
>

Kewl. Then i made right decision, despite that paper says don't :)


--
Best regards,
Igor Stasenko AKA sig.