Image Segment semantics and weakness


Image Segment semantics and weakness

Eliot Miranda-2
Hi All,

    I want to check my understanding of reference semantics for image segments, as I'm close to completing the Spur implementation. Specifically, the question is whether objects reachable only through weak pointers should be included in an image segment or not.

Remember that an image segment is created from the transitive closure of an Array of root objects, the segment roots. That is, we can think of an image segment as a set of objects created by tracing the object graph from the segment roots.

The segment always includes the segment roots. Apart from the roots themselves, objects that are also reachable from the roots of the system (the system roots: effectively the root environment, Smalltalk, and the stack of the current process) are excluded from the segment.

Consider a weak array in the transitive closure that is not reachable from the system roots, and hence should be included in the segment. Objects referenced from that weak array may be in one of three categories:

- reachable from the system roots (and hence not to be included in the segment)
- not reachable from the system roots, but reachable from the segment roots via strong pointers (and hence to be included in the segment)
- not reachable from the system roots, and not reachable from the segment roots via strong pointers

Should this last category be included in or excluded from the segment? I think that it makes no difference, and that excluding them is only an optimization. The argument is as follows. Imagine that immediately after loading the image segment there is a garbage collection. That garbage collection will collect all the objects in the last category, as they are only reachable from the weak arrays in the segment. Hence we are free to follow weak references as if they were strong when we create the image segment, leaving it to subsequent events to reclaim those objects.

An analogous argument accounts for objects reachable from ephemerons. Is my reasoning sound?
--
best,
Eliot
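
The argument in the post above can be tried out on a toy object graph. What follows is only a sketch in Python, not the Spur or Squeak primitive: the graph is two hypothetical dictionaries of strong and weak edges, the names `reachable`, `build_segment` and `survivors_after_load_and_gc` are made up for illustration, and it assumes that tracing from the system roots does not pass through the segment roots and that the loader keeps the segment roots alive after loading.

```python
def reachable(roots, strong, weak=None, stop_at=frozenset()):
    """Names reachable from roots. Weak edges are followed only if a weak map is given."""
    seen, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name in seen or name in stop_at:
            continue
        seen.add(name)
        stack.extend(strong.get(name, ()))
        if weak is not None:
            stack.extend(weak.get(name, ()))
    return seen


def build_segment(segment_roots, system_roots, strong, weak, follow_weak):
    """Segment = segment roots + (closure of segment roots - system-reachable objects)."""
    # Assumption: the system trace stops at the segment roots.
    system_reach = reachable(system_roots, strong, stop_at=frozenset(segment_roots))
    closure = reachable(segment_roots, strong, weak if follow_weak else None)
    return set(segment_roots) | (closure - system_reach)


def survivors_after_load_and_gc(segment, segment_roots, system_roots, strong):
    """Objects of a loaded segment that a GC keeps (GC follows strong pointers only)."""
    live = reachable(list(system_roots) + list(segment_roots), strong)
    return segment & live


# Hypothetical graph: "wa" is a weak array inside the segment.
STRONG = {
    "Smalltalk": ["shared"],
    "root": ["a", "wa"],
    "a": ["shared"],
    "orphan": ["orphan_child"],
}
WEAK = {"wa": ["shared", "a", "orphan"]}

lean = build_segment(["root"], ["Smalltalk"], STRONG, WEAK, follow_weak=False)
full = build_segment(["root"], ["Smalltalk"], STRONG, WEAK, follow_weak=True)

print(sorted(full - lean))  # ['orphan', 'orphan_child']
print(
    survivors_after_load_and_gc(lean, ["root"], ["Smalltalk"], STRONG)
    == survivors_after_load_and_gc(full, ["root"], ["Smalltalk"], STRONG)
)  # True
```

In this toy model, following weak pointers only makes the segment larger; after a load followed by a GC both policies leave the same objects. The sketch does not model an object gaining a strong reference between segment creation and load.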

Re: [squeak-dev] Image Segment semantics and weakness

J. Vuletich (mail lists)

Hi Eliot,



I think you are right. But there is a risk that someone somehow gains a strong reference to the object after the image segment was created, breaking our invariants when the segment is loaded again.

An object might be (not reachable / strongly reachable / weakly reachable) from system roots and / or segment roots. This gives us 9 possibilities (3 x 3). Six of them are easy (and I'll not go into them). The other three are tricky:

a- Not reachable from system roots. Weakly reachable from segment roots.
Do not include them. It is best to run a GC before building the image segment, to get rid of them (run termination, etc). This is to avoid the risk of the object somehow gaining a strong reference after the segment is built, making the segment miss the weak ref to it. Doing it this way would also mean that any objects affected by termination would be consistent, both in the image and in the segment.

b- Weakly reachable from system roots. Weakly reachable from segment roots.
Do not include them. If the object manages to survive by gaining a strong ref from the system roots, the weak ref will be repaired on segment load (am I right on this?). If the original object was included in the segment, then on segment load the weak ref would point to a duplicate object that is about to be collected (and maybe terminated?). In any case, doing it this way would also mean that any objects affected by termination would be consistent, both in the image and in the segment.

c- Weakly reachable from system roots. Strongly reachable from segment roots.
Do include them. It seems reasonable to run a GC and get rid of them after unloading the segment, to avoid the risk of the object somehow gaining a strong ref in the image, and being duplicated on segment load. But doing as I say means that we would be loading into the image an object that was already terminated, although in the state it had before running termination. Not really sure if this is ok. There could be some risk of objects in the segment being in some pre-termination state, with some objects in the image being in some post-termination state. In any case, this would suggest bad design... So perhaps it makes sense to throw an exception in these cases?

I hope this rant is of use.

Cheers,
Juan Vuletich
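
The 3 x 3 classification in the post above can be sketched in a few lines of Python. This is an illustration under assumptions, not code from any VM: the graph is a pair of hypothetical dictionaries, `kind` and `classify` are made-up names, and "weakly reachable" is taken to mean reachable only when weak pointers are followed. Only the three tricky cases carry a recommendation, since the post does not spell out the other six.

```python
def reachable(roots, strong, weak=None):
    """Names reachable from roots. Weak edges are followed only if a weak map is given."""
    seen, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(strong.get(name, ()))
        if weak is not None:
            stack.extend(weak.get(name, ()))
    return seen


def kind(obj, roots, strong, weak):
    """Return 'strong', 'weak' or 'none' for how obj is reachable from roots."""
    if obj in reachable(roots, strong):
        return "strong"
    if obj in reachable(roots, strong, weak):
        return "weak"  # reachable only through at least one weak pointer
    return "none"


# (reachability from system roots, reachability from segment roots) -> advice
TRICKY = {
    ("none", "weak"): "a: do not include",
    ("weak", "weak"): "b: do not include",
    ("weak", "strong"): "c: include",
}


def classify(obj, system_roots, segment_roots, strong, weak):
    key = (kind(obj, system_roots, strong, weak),
           kind(obj, segment_roots, strong, weak))
    return key, TRICKY.get(key, "easy case, not discussed in the post")


# Hypothetical graph with one weak array on each side.
STRONG = {"Smalltalk": ["sys_weak"], "root": ["seg_weak", "c"]}
WEAK = {"sys_weak": ["b", "c"], "seg_weak": ["a", "b"]}

for name in ("a", "b", "c"):
    print(name, classify(name, ["Smalltalk"], ["root"], STRONG, WEAK))
```

Each of the objects `a`, `b` and `c` in the toy graph lands in the tricky case with the same letter.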


Re: Image Segment semantics and weakness

stepharo
In reply to this post by Eliot Miranda-2
While I was a big fan of ImageSegment and proposed to Mariano to work on ImageSegment2 (it was the original idea for his PhD),
he convinced us that image segments were not worth their complexity.
So why do you want to have ImageSegment?

Stef


On 20/10/14 03:01, Eliot Miranda wrote:
> Specifically the question is whether objects reachable only through weak pointers should be included in an image segment or not.


Re: Image Segment semantics and weakness

Eliot Miranda-2


On Mon, Oct 20, 2014 at 8:26 AM, stepharo <[hidden email]> wrote:
> While I was a big fan of ImageSegment and proposed to Mariano to work on ImageSegment2 (it was the original idea for his PhD),
> he convinced us that image segments were not worth their complexity.

I absolutely agree.

> So why do you want to have ImageSegment?

Because of backwards compatibility. If Spur does not provide image segments then the barrier to entry for Terf, eToys and Squeak may be too high. Spur is supposed to be a plug-in replacement for Cog, not something that requires lots of effort to port to.
 
--
best,
Eliot

Re: Image Segment semantics and weakness

stepharo
In reply to this post by Eliot Miranda-2
Sorry for breaking the thread. I could not find the email with Thunderbird :(

Just for the record, look at what the netstyle people did: they used image segments and had to kill everything (processes and so on) because of possible pointers into the segment, so after saving a segment the image was basically dead, because some escaping pointers caused a kind of memory leak. Now they have migrated to Fuel.

So now I think that people are better off using Fuel. Image segments looked nice at first, but not when you look at them carefully, especially when you have an object not in the roots pointing to an object inside the graph. Then you have to do a GC... Mariano spent a year fighting with that, and his PhD is really nice.

Less magic, more stability, simpler VMs.

All the energy that you put into something from the past will not be put into things for the future (not counting the bug hunting).
Stef







Re: [Vm-dev] Re: Image Segment semantics and weakness

EstebanLM
In reply to this post by Eliot Miranda-2

On 20 Oct 2014, at 21:41, Eliot Miranda <[hidden email]> wrote:

> On Mon, Oct 20, 2014 at 8:26 AM, stepharo <[hidden email]> wrote:
>> While I was a big fan of ImageSegment and proposed to Mariano to work on ImageSegment2 (it was the original idea for his PhD),
>> he convinced us that image segments were not worth their complexity.
>
> I absolutely agree.
>
>> So why do you want to have ImageSegment?
>
> Because of backwards compatibility. If Spur does not provide image segments then the barrier to entry for Terf, eToys and Squeak may be too high. Spur is supposed to be a plug-in replacement for Cog, not something that requires lots of effort to port to.

But... (and tell me if I'm saying something stupid) it would probably be better to ask the people using ImageSegments to spend some time writing an adaptor to use Fuel (which is already there, works fine, and is faster than ImageSegment itself). In the not-so-long term, that is a better investment than having you replicate a technology that we all agree is not the best option (also, I would bet it is better to spend your valuable time on other things).
It is not as if there were no alternative to IS... and also, the IS binary format for Spur will not be compatible with the older one, so... why not?

Anyway, that's my 2c.

Esteban

 


Re: [squeak-dev] Image Segment semantics and weakness

Mariano Martinez Peck
In reply to this post by Eliot Miranda-2
Just a quick note I would like to share....
For my PhD, I did investigate ImageSegment very very deeply:


I didn't want to write Fuel just because. I took quite a lot of time to understand how the ImageSegment primitives worked. From that effort, I remember a few conclusions:

1) I found only a few users of ImageSegment.
2) The few users I found were NOT using ImageSegment for its real purpose, that is, object swapping. It was used instead as an object serializer. For that, they used #writeForExportOn:, which ended up using SmartRefStream for the rest of the objects.
3) I noticed I could achieve the same performance or even better with an OO serializer built at the language side, with all the benefits this means. Of course, having Cog helped here....

You can find some benchmark comparisons against IS. Also in my PhD: http://rmod.lille.inria.fr/archives/phd/PhD-2012-Martinez-Peck.pdf

Cheers, 





On Mon, Oct 20, 2014 at 9:56 PM, <[hidden email]> wrote:
> I think you are right. But there is a risk that someone somehow gains a
> strong reference to the object after the image segment was created,
> breaking our invariants when the segment is loaded again.






--
Mariano
http://marianopeck.wordpress.com

Re: [squeak-dev] Image Segment semantics and weakness

stepharo
Thanks, Mariano. This is what I call science in action. You convinced me back then, and Fuel is a success.

On 21/10/14 03:55, Mariano Martinez Peck wrote:
> I noticed I could achieve the same performance or even better with an OO serializer built at the language side, with all the benefits this means. Of course, having Cog helped here....