Re: [Pharo-dev] A weak/leak story

Eliot Miranda-2
Hi Guille,

On Apr 12, 2016, at 2:41 AM, Guille Polito <[hidden email]> wrote:

Hi list,

Pavel, Christophe and I have spent some time over these last weeks chasing the memory leaks we were seeing lately. It is a long story to tell, so this mail is divided into four parts:

1) A brief intro to weak structures and finalization in Pharo, for those that do not know,
2) A bit of history to explain what happened in pre-spur and post-spur,
3) The actual cause of the memory leak today,
4) How to avoid them in your application, and what we are going to do to prevent this in the future.

Forgive me for not responding with code immediately. Proper Ephemeron support is already in the Spur VM, plus a "proper" finalization queue, which allows us to drop the weak registries, which have a scaling problem. The proper Ephemeron support required ClassBuilder changes. I can provide Squeak code soon, but not until the end of the week; Clément and I have two presentations to prepare today and tomorrow, and it's 4am...

Replacing the weak registry with the proper finalization queue, in which both triggered ephemerons and weak collections that have lost references appear, means that individual ephemerons and weak collections can mourn, instead of the system having to scan all weak collections in all weak registries whenever a single weak collection loses a referent.

My own Ephemeron story is that when I tried to replace the weak registry with the proper finalization queue in Squeak last year, the mysterious symptom I had was the system running out of file descriptors and source file access ceasing to work. I haven't had time (or a collaborator or two) to dig further. So if there is a brave soul or two interested in getting ephemerons released and tested, I'd love to get in touch and get this done. We need ephemerons and I expect we want a scalable weak mourning scheme.

_,,,^..^,,,_ (phone)

For those that need/want/prefer just the practical explanation, you can jump over 2) and just read 1) and 3).

========================================================================
1. A weak explanation
========================================================================

To clean up objects upon garbage collection, Pharo and Squeak use a finalization mechanism based on a Weak Registry. That is, if you want to execute some cleanup (like closing a file) when an object is about to be collected, you have to put your object inside the weak registry together with the corresponding executor/finalizer object. The object you want to 'track' is held weakly by this weak registry, i.e., if the only reference to the object is from the weak registry, it will be chosen for garbage collection. When this object is collected, a special process in the Pharo image will send #finalize to your executor object, where you implement your cleanup.

To interact with the weak registry, there are two main subscription messages:

- #add:executor:

  Will add an object to the registry with the executor that is sent as argument.

- #add:

  Will add an object to the registry, and use as executor a 'shallow copy' of the object.

Some conclusions to draw from this:
 1) If the executor points strongly to the object that we want to collect, the object will never be collected. That is why the #add: message creates a copy of the object.
 2) If we do not provide an explicit executor, the registered object should already contain all information required for the finalization (like file handles or external pointers). If not, the shallow copy will not be able to finalize correctly.

Also:
 - Using weak objects/references does not guarantee that #finalize will be called; you need to put your object inside the registry!
 - Using weak objects/references does not guarantee that your object will be magically collected. You can still cause memory leaks!
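
As a rough illustration of point 1) and of #add:executor:, here is a minimal playground sketch. The class LeakDemoCloser, its package name and the handle value 42 are made up for this example; only WeakRegistry default and #add:executor: are taken from the description above. It assumes the Pharo class-definition message (Squeak would use category: instead of package:).

```smalltalk
| closerClass owner closer |

"LeakDemoCloser is a hypothetical executor class, defined only for this sketch."
closerClass := Object subclass: #LeakDemoCloser
	instanceVariableNames: 'handle'
	classVariableNames: ''
	package: 'LeakDemo'.
closerClass compile: 'handle: anInteger
	handle := anInteger'.
closerClass compile: 'finalize
	Transcript show: ''closing handle ''; show: handle printString; cr'.

"The object we want to track."
owner := Object new.

"The executor keeps only the raw handle (42 stands in for a file handle).
 It must NOT point back to owner, or owner would never be collected."
closer := closerClass new.
closer handle: 42.

WeakRegistry default add: owner executor: closer.

"Drop the last strong reference and force a collection."
owner := nil.
Smalltalk garbageCollect.
```

If the registry works as described, the finalization process should eventually send #finalize to the executor and a line should show up in the Transcript. With plain #add: the executor would instead be a shallow copy of owner, which is why the tracked object itself has to carry the handle in that case.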

========================================================================
2. A weak story
========================================================================

Historically, Pharo and Squeak have used the weak registry mentioned above. Because of the limitations that we mentioned, a different kind of weak structure called Ephemerons is required/more useful. To overcome some of these limitations, Igor (Hi Igor! maybe you're reading :)) implemented a couple of years ago a new finalization mechanism that, IIANM, worked as follows:

- Some weak objects could have a first instance variable with a special linked list
- When the object was about to be collected, instead it was removed from the weak structure and put into its container's linked list
- On the image side, a special process iterated all special linked lists and executed #finalize on the weak objects

This mechanism was called NewFinalization, in contrast with what was called LegacyFinalization. Of course these names are context dependent, since today's Pharo is back to the so-called legacy one ;). NewFinalization was implemented as the default finalization mechanism in Pharo, both on the VM and the image side. But the VM changes remained in the Pharo branch of development. After some discussions, I remember Igor and Eliot agreed that what they actually needed were Ephemerons, and since Eliot had started working on Spur at that time, he said he would provide Ephemeric classes with the new object format.

Basically, for those interested, an ephemeron is an association

  weak key -> strong value

with the special quality that, upon garbage collection, all references to the weak key that are reached from the strong value (directly or indirectly) are treated as weak. This allows the collection of the weak key even if the strong value points to it, but requires some more machinery in the GC/VM. You can read more here [1].

Until a couple of months/weeks ago, Pharo was using the NewFinalization mechanism with its special image and VM support. And Squeak was using the 'Legacy' one. And then Spur arrived.

So Spur arrived, and Eliot and Esteban made a lot of effort to simplify the VM's maintenance, and they merged both branches. As a consequence, the Pharo Spur VM no longer supported NewFinalization. At first this provoked some leaks because objects were not being finalized. A couple of weeks ago, we migrated the image code back to the 'Legacy' mechanism, see issue 17537 [2].

And then finalization was not working either. Neither was #finalize being called on executors, nor were objects in the weak registry being collected. As a symptom, opening any tool caused 30 new everlasting registrations in the weak registry, and no tools were collected.


========================================================================
3. The cause
========================================================================

After lots of digging, we finally found the particular issue that kept objects in the weak registry from being collected. In short, it is caused by the common belief that "weak objects are magical", which has led to weak references and finalizers being spread all over the system without proper care. It is particularly related to the usage of announcements.

To explain better, I made some pictures for you :)


***First, imagine you have a morph with its own local announcer. You subscribe to two events, and the graph will look like this.

<strong.png>

- the announcer knows two strong subscriptions
- the subscriptions know the announcer to be able to unregister
- the subscriptions know the registered object to send the message in case the event happens

This forms a closed graph that will be collected. No problem so far.


***Second, let's see what happens if we use weak subscriptions:

<weak.png>

- the announcer knows two weak subscriptions
- these weak subscriptions know the announcer strongly to be able to unregister
- they also know the subscriber object but weakly
- THE difference is made by the weak registry: a global object that manages when and how objects are finalized. In the case of announcers, the weak registry holds the subscriber morph weakly, and the weak announcer subscription strongly.

So far so good also: the references to the morph are weak. When the morph is collected, the weak registry will send #finalize to the announcement subscriptions. The subscriptions will then unregister from the announcer.
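
The first two configurations can be tried out with a playground sketch along these lines. It is an assumption-laden illustration rather than code from the investigation: the announcers are free-standing Announcer instances instead of a morph's own, Announcement is used as a catch-all event class, #changed is just a convenient unary message that Morph understands, and the WeakArray is only a probe to observe collection.

```smalltalk
| strongAnnouncer weakAnnouncer strongMorph weakMorph probe |
strongAnnouncer := Announcer new.
weakAnnouncer := Announcer new.
strongMorph := Morph new.
weakMorph := Morph new.

"The probe holds both morphs weakly, so it does not keep them alive."
probe := WeakArray with: strongMorph with: weakMorph.

"First case: only strong subscriptions on this announcer."
strongAnnouncer when: Announcement send: #changed to: strongMorph.

"Second case: only weak subscriptions on this announcer."
weakAnnouncer weak when: Announcement send: #changed to: weakMorph.

"Drop every reference held by this snippet, then collect."
strongMorph := nil.
weakMorph := nil.
strongAnnouncer := nil.
weakAnnouncer := nil.
Smalltalk garbageCollect.

"Inspect the probe: if nothing else references the morphs,
 both slots are expected to have been cleared."
probe
```

The point of keeping the two announcers separate is that each graph is then either fully strong and closed, or reachable from the registry only through weak references to the morph.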


***The really problematic case is the third one: mixing weak and strong subscriptions in the same announcer.

<both.png>

The object graph is just a mixture of the two other ones. One weak subscription and one strong subscription. BUT:

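 - there is a strong path from a global object (the weak registry) to the subscriber (the morph): the registry holds the weak subscription strongly, the weak subscription holds the announcer strongly, and the announcer's strong subscription holds the morph strongly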
 - then the morph is never collected
 - the weak registry never finalizes the weak announcement subscription
 - the graph remains there forever.


And these are the simple cases that show the problem. Imagine that you can have this same configuration but in cycles/chains among different morphs/announcements. Plus, this is aggravated by evil globals (e.g., the theme and the HandMorph remember the last focused morph, the system window class remembers the last top window even if it was closed...).
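
The mixed configuration can be sketched in a playground as follows, assuming an image that still uses the legacy weak registry described above. As before, the free-standing Announcer, the Announcement catch-all class, the #changed selector and the WeakArray probe are choices made for this illustration only.

```smalltalk
| announcer morph probe |
announcer := Announcer new.
morph := Morph new.

"Weak probe used only to observe whether the morph gets collected."
probe := WeakArray with: morph.

"One weak and one strong subscription on the SAME announcer."
announcer weak when: Announcement send: #changed to: morph.
announcer when: Announcement send: #changed to: morph.

"Drop every reference held by this snippet, then collect."
morph := nil.
announcer := nil.
Smalltalk garbageCollect.

"With the legacy registry the morph is expected to survive because of the path
 weak registry -> weak subscription -> announcer -> strong subscription -> morph.
 Inspect the first slot to see whether it was cleared."
probe first
```

On an image with working ephemerons the outcome may differ, since the reference from the registry side would then be treated as weak.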


========================================================================
4. The solution?
========================================================================

Our solution for the moment is simple. We would like to enforce the following two rules for announcements:

- announcers local to a morph should only be used strongly. YES, this may cause small hiccups and leaks, for example if you register a morph A with the announcer of another morph B. But in the long term, these two will form a closed graph and will be collected.

- announcers used globally, such as the System announcer, should be used only and uniquely in a weak manner. That way we ensure that they are loosely coupled for real.

So, please, please, do not use weak announcements unless you're really sure of what you're doing. At least, until we have ephemerons and we are sure everything works as expected. Ephemerons would solve this in a more natural way: if we model the weak registry subscription as an ephemeron, any reference to the weak #key that arrives from the #value will be treated as weak also.
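
For the global rule, a subscription could look like the sketch below. It assumes that SystemAnnouncer uniqueInstance is the global system announcer and that ClassAdded is one of its events; the Morph standing in for a tool and the #changed selector are placeholders chosen for this example.

```smalltalk
| tool |
tool := Morph new.   "placeholder for a real tool or window"

"Global announcer: subscribe weakly, so the system announcer
 never keeps the tool alive on its own."
SystemAnnouncer uniqueInstance weak
	when: ClassAdded send: #changed to: tool.

"When the tool is closed, unsubscribing explicitly avoids
 depending on finalization for the cleanup."
SystemAnnouncer uniqueInstance unsubscribe: tool.
```

Unsubscribing explicitly on close is a design choice of this sketch, not something the rules above require; it simply keeps the number of registrations that rely on the weak registry small.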

Other action points we are working on:
- fixing tools to follow the rules above
- We are also writing tests to check that tools (gt*, Nautilus, Rubric, FT) do not leak.
- chasing other small memory leaks created by stepping, focus global variables...


((fogbugz allIssues select: [ :each | each relatedToLeak ])
    flatCollect: [ :each | each participants ])
        do: #thanks



[1] https://en.wikipedia.org/wiki/Ephemeron
[2] https://pharo.fogbugz.com/f/cases/17537/SystemAnnouncer-has-far-too-many-subscriptions


Levente Uzonyi
Hi Eliot,

I think the only interesting question is: can I create an ephemeron in a
Spur image?
If yes, then weak collections can be significantly improved. If the answer
is no, then the difference made by individual weak collection finalization
will be insignificant for most users, since there are usually fewer than
four WeakRegistries in an image, while other weak collections don't rely
on the finalizationProcess. (One might get the impression that
WeakKeyDictionary does, but that won't work as one might think unless the
dictionary is the internal collection of a WeakRegistry.)
Of course, the above only applies to Squeak.

Levente


Re: [Vm-dev] Re: [squeak-dev] Re: [Pharo-dev] A weak/leak story

Eliot Miranda-2


On Tue, Apr 12, 2016 at 6:46 AM, Levente Uzonyi <[hidden email]> wrote:
 
Hi Eliot,

I think the only interesting question is: can I create an ephemeron in a Spur image?
If yes, then weak collections can be significantly improved. If the answer is no, then the difference made by individual weak collection finalization will be insignificant for most users, since there are usually fewer than four WeakRegistries in an image, while other weak collections don't rely on the finalizationProcess. (One might get the impression that WeakKeyDictionary does, but that won't work as one might think unless the dictionary is the internal collection of a WeakRegistry.)
Of course, the above only applies to Squeak.

The answer is yes. Find attached an Ephemeron definition (but with no methods) and the new finalisation scheme. To use the scheme one must evaluate Smalltalk supportsQueueingFinalization: true. Ephemeron needs a suitable mourn method, which would send #finalize to its key, but also remove the ephemeron from whatever registry we keep ephemerons in.

As I say, though, the first order of business is to find out why turning on the scheme causes file access to fail very soon after.
 
Levente

On Tue, 12 Apr 2016, Eliot Miranda wrote:

Hi Guille,

On Apr 12, 2016, at 2:41 AM, Guille Polito <[hidden email]> wrote:

      Hi list,

      With Pavel and Christophe we spend some time digging these last weaks chasing the memory leaks we were seeing lately. It is a long
      story to tell, so this mail is divided in three:

      1) A brief intro to weak structures and finalization in Pharo, for those that do not know,
      2) A bit of history to explain what happened in pre-spur and post-spur,
      3) The actual cause of the memory leak today,
      4) How to avoid them in your application, and what are we going to do to prevent this in the future.


    forgive me for not responding with code immediately.  Proper Ephemeron support is already in the Spur VM, plus a "proper" finalization queue,
which allows us to drop the weak registries, which have a scaling problem.  The proper Ephemeron support required ClassBuilder changes.  I can
provide Squeak code soon, but not until the end of the week; Clément and I have two presentations to prepare today and tomorrow, and it's 4am...

Replacing the weak registry with the proper finalization queue, in which appears both triggered ephemerons and weak collections that have lost
references, means that individual ephemerons and weak collections can mourn, instead of the system having to scan all weak collections in all weak
registries whenever a single weak collection loses a referent.

My own Ephemeron story is that when I tried to replace the weak registry with the proper finalization queue in Squeak last year the mysterious
symptom I had was the system running out of file descriptors and source file access stopping working.  I haven't had time (or a collaborator of
two) to dig further.  So if there is a brave soul or two interested in getting ephemerons released and tested I'd love to get in touch and get
this done.  We need ephemerons and I expect we want a scalable weak mourning scheme.

_,,,^..^,,,_ (phone)

      For those that need/want/prefer just the practical explanation, you can jump over 2) and just read 1) and 3).

      ========================================================================
      1. A weak explanation
      ========================================================================

      To cleanup objects upon garbage collection, Pharo and Squeak use a finalization mechanism based on a Weak Registry. That is, if you
      want to execute some cleanup (like closing a file) when an object is about to be collected, you have to put your object inside the
      weak registry with the corresponding executor/finalizer object. The object you want to 'track' is hold weakly by this weak registry
      i.e., if the only reference to the object is from the weak registry, it will be chosen for garbage collection. When this object is
      collected, a special process in the Pharo image will send #finalize to your executor object where you implement your cleanup.

      To interact with the weak registry, there are two main subscription messages:

      - #add:executor:

        Will add an object to the registry with the executor that is send as argument.

      - #add:

        Will add an object to the registry, and use as executor a 'shallow copy' of the object.

      Some conclusions to be made from this:
       1) If the executor points strongly to the object that we want to collect, it will never be collected. That is why the #add: message
      creates a copy of the object.
       2) If we do not provide an explicit executor, the registered object should already contain all information required for the
      finalization (like file handlers or external pointers). If not, the shallow copy will not be able to finalize correctly.

      Also:
       - Using weak objects/references do not guarantee that #finalize will be called, you need to put your object inside the registry!
       - Using weak objects/references do not guarantee that your object will be magically collected. You can still cause memory leaks!

      ========================================================================
      2. A weak story
      ========================================================================

      Pharo and Squeak use historically the weak registry mentioned above. Because of the limitations that we mentioned, a different kind of
      weak structure called Ephemerons is required/more useful. To overcome some of these limitations, Igor (Hi Igor! maybe you're reading
      :)) implemented a couple of years ago a new finalization mechanism that, IIANM, worked as follows:

      - Some weak objects could have a first instance variable with a special linked list
      - When the object was about to be collected, instead it was removed from the weak structure and put into its container's linked list
      - On the image side, a special process iterated all special linked lists and executed #finalize on the weak objects

      This mechanism was called NewFinalization, in contrast with what was called LegacyFinalization. Of course these names are context
      dependent, since today's Pharo is back to the so called legacy one ;). NewFinalization was implemented as the default finalization
      mechanim in Pharo, both in VM and image side. But the VM changes remained in the Pharo branch of development. After some discussions,
      I remember Igor and Eliot agreed that what they actually needed were Ephemerons, and since Eliot had started working on Spur at that
      time, he said he would provide Ephemeric classes with the new object format.

      Basically, for those interested, an ephemeron is an association

        weak key -> strong value

      with the special quality that upon garbage collection all references to the weak key that are computed from the strong value (directly
      or indirectly) are taken as weak. This allows the collection of the weak key even if the strong value points to it, but requires some
      more machinery in the GC/VM. You can read more in here [1].

      Until a couple of months/weeks ago, Pharo was using the NewFinalization mechanism with it's special image and VM support. And Squeak
      was using the 'Legacy' one. And then Spur arrived.

      So Spur arrived, and Eliot and Esteban made a lot of effort to simplify the VM's maintenance, and they merged both branches. As a
      conclusion, Pharo Spur VM did not support any more NewFinalization. This provoked at first some leaks because objects were not being
      finalized. A couple of weeks ago, we migrated back the image code to use the 'Legacy' mechanism, see issue 17537 [2].

      And then finalization was not working either. Nor #finalize was being called on executors, nor objects in the weak registry were
      collected. As a symptom, opening any tool will cause 30 new everlasting registrations into the weakregistry, and no tools were
      collected.


      ========================================================================
      3. The cause
      ========================================================================

      After lots of digging, we finally found the particular issue that kept objects in the weak registry from being collected. In
      short, it is caused by the common belief that "weak objects are magical", which has led to weak references and finalizers being
      spread all over the system without proper care, particularly in the usage of announcements.

      To explain better, I made some pictures for you :)


      ***First, imagine you have a morph with its own local announcer. You subscribe to two events, and the graph will look like this.

      <strong.png>

      - the announcer knows two strong subscriptions
      - the subscriptions know the announcer to be able to unregister
      - the subscriptions know the registered object to send the message in case the event happens

      This forms a closed graph that will be collected. No problem so far.


      ***Second, let's see what happens if we use weak subscriptions:

      <weak.png>

      - the announcer knows two weak subscriptions
      - these weak subscriptions know the announcer strongly to be able to unregister
      - they also know the subscriber object but weakly
      - THE difference is made by the weak registry: a global object that manages when and how objects are finalized. In the case of
      announcers, the weak registry will store weakly the subscriber morph, and strongly the weak announcer subscription.

      So far so good also: the references to the morph are weak. When the morph is collected, the weak registry will send #finalize to the
      announcement subscriptions. The subscriptions will then unregister themselves from the announcer.
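
      A minimal workspace sketch of this weak case, assuming a Pharo image with the Announcements framework. A plain Object stands in
      for the morph, and #update: is used only because every Object already understands it:

      ```smalltalk
      | announcer subscriber |
      announcer := Announcer new.
      subscriber := Object new.    "stands in for the subscriber morph"

      "Weak subscription: the subscription references the subscriber weakly."
      announcer weak when: Announcement send: #update: to: subscriber.

      "Drop the only strong reference and force a collection."
      subscriber := nil.
      Smalltalk garbageCollect.

      "Once finalization has run, the subscription is expected to have removed itself."
      announcer numberOfSubscriptions.
      ```

      Finalization runs in a separate process, so the count may not drop immediately after the garbage collection.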


      ***The really problematic case is the third one: mixing weak and strong subscriptions in the same announcer.

      <both.png>

      The object graph is just a mixture of the two other ones. One weak subscription and one strong subscription. BUT:

       - there is a strong path from a global object (the weak registry) to the subscriber (the morph)
       - then the morph is never collected
       - the weak registry never finalizes the weak announcement subscription
       - the graph remains there forever.
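
      The same configuration as a workspace sketch, under the same assumptions as before (a plain Object stands in for the morph,
      #update: is just a selector every Object understands). It is meant to reproduce the leak, so try it in a throwaway image:

      ```smalltalk
      | announcer subscriber |
      announcer := Announcer new.
      subscriber := Object new.    "stands in for the subscriber morph"

      "One strong and one weak subscription on the same announcer."
      announcer when: Announcement send: #update: to: subscriber.
      announcer weak when: Announcement send: #update: to: subscriber.

      "The weak registry holds the weak subscription strongly (as executor).
       From there: weak subscription -> announcer -> strong subscription -> subscriber."
      subscriber := nil.
      announcer := nil.
      Smalltalk garbageCollect.

      "According to the analysis above, neither the subscriber nor the subscriptions
       are collected, even though the workspace no longer references them."
      ```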


      And these are the simple cases that show the problem. Imagine that you can have this same configuration but in cycles/chains among
      different morphs/announcements. Plus this is aggravated by evil globals (e.g., the theme and the HandMorph remember the last focused
      morph, the system window class remembers the last top window even if it was closed...).


      ========================================================================
      4. The solution?
      ========================================================================

      Our solution for the moment is simple. We would like to enforce the following two rules for announcements:

      - announcers local to a morph should only be used strongly. YES, this may cause small hiccups and leaks, for example if you register a
      morph A to the announcer of another morph B. But in the long term, these two will form a closed graph and will be collected.

      - announcers used globally, such as the System announcer, should only ever be used in a weak manner. That way we ensure that
      they are loosely coupled for real.

      So, please, please, do not use weak announcements unless you're really sure of what you're doing. At least, until we have ephemerons
      and we are sure everything works as expected. Ephemerons would solve this in a more natural way: if we model the weak registry
      subscription as an ephemeron, any reference to the weak #key that arrives from the #value will be treated as weak also.
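
      The two rules can be sketched as follows, assuming a Pharo image. The local announcer and the subscriber are placeholders, and
      ClassAdded is used only as an example of a system announcement:

      ```smalltalk
      | localAnnouncer subscriber |
      localAnnouncer := Announcer new.    "stands in for an announcer owned by a morph"
      subscriber := Object new.           "stands in for the subscribing tool or morph"

      "Rule 1: announcer local to a morph -> strong subscription only."
      localAnnouncer when: Announcement send: #update: to: subscriber.

      "Rule 2: global announcer -> weak subscription only."
      SystemAnnouncer uniqueInstance weak
          when: ClassAdded send: #update: to: subscriber.

      "Explicit cleanup is still the safest option when the subscriber goes away."
      SystemAnnouncer uniqueInstance unsubscribe: subscriber.
      localAnnouncer unsubscribe: subscriber.
      ```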

      Other action points we are working on:
      - fixing tools to follow the rules above
      - We are also writing tests to check that tools (gt*, Nautilus, Rubric, FT) do not leak.
      - chasing other small memory leaks created by stepping, focus global variables...


      ((fogbugz allIssues select: [ :each | each relatedToLeak ])
          flatCollect: [ :each | each participants ])
              do: #thanks



      [1] https://en.wikipedia.org/wiki/Ephemeron
      [2] https://pharo.fogbugz.com/f/cases/17537/SystemAnnouncer-has-far-too-many-subscriptions

_,,,^..^,,,_
best, Eliot



Ephemeron.st (2K) Download Attachment
WeakArray newFinalization methods.st (6K) Download Attachment

Re: Re: [Vm-dev] Re: [squeak-dev] Re: [Pharo-dev] A weak/leak story

jelena
In reply to this post by Eliot Miranda-2



Re: Re: [Vm-dev] Re: [squeak-dev] Re: [Pharo-dev] A weak/leak story

Eliot Miranda-2
Hi Jelena,

    every message you've posted in the last few days has been delivered to the list completely empty.  Can you check your email lint?

On Tue, Apr 12, 2016 at 10:52 PM, <[hidden email]> wrote:





--
_,,,^..^,,,_
best, Eliot



Re: [Vm-dev] Re: [squeak-dev] Re: [Pharo-dev] A weak/leak story

Denis Kudriashov
In reply to this post by Eliot Miranda-2
Hi.

I opened issue 17990

2016-04-13 7:51 GMT+02:00 Eliot Miranda <[hidden email]>:


On Tue, Apr 12, 2016 at 6:46 AM, Levente Uzonyi <[hidden email]> wrote:
 
Hi Eliot,

I think the only interesting question is: can I create an ephemeron in a Spur image?
If yes, then weak collections can be significantly improved. If the answer is no, then the difference made by individual weak collection finalization will be insignificant for most users, since there are usually fewer than four WeakRegistries in an image, while other weak collections don't rely on the finalizationProcess. (One might get the impression that WeakKeyDictionary does, but that won't work as one might think unless the dictionary is the internal collection of a WeakRegistry.)
Of course, the above only applies to Squeak.

The answer is yes.  Find attached an Ephemeron definition (but with no methods) and the new finalisation scheme.  To use the scheme one must evaluate Smalltalk supportsQueueingFinalization: true.  Ephemeron needs a suitable mourn method, which would send #finalize to its key, and also remove the ephemeron from whatever registry we keep ephemerons in.
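
A rough sketch of what such a mourn method might look like. This is not the attached code: it assumes that Ephemeron has a key instance variable, and the class-side registry accessor is a purely hypothetical placeholder for "whatever registry we keep ephemerons in":

```smalltalk
"Enable the queue-based scheme (the expression quoted above)."
Smalltalk supportsQueueingFinalization: true.

"Hypothetical method on Ephemeron; #registry is an assumed class-side accessor."
Ephemeron >> mourn
    "Sent when only the ephemeron itself still reaches the key."
    key finalize.
    self class registry remove: self ifAbsent: [ ]
```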

As I say, though, the first order of business is to find out why turning on the scheme causes file access to fail very soon after.
 
Levente


_,,,^..^,,,_
best, Eliot