Growing large images: the case of Moose models

Clément Béra
Hello everyone,

This morning I investigated with Vincent Blondeau a problem reported by the Moose community a while ago: loading Moose models is slower in Spur (Pharo 5 and later) than in pre-Spur (Pharo 4 and older). In general, this problem affects anyone growing images to a significant size.

To investigate the problem, we loaded a 200Mb[3] Moose model into a 250Mb image, growing the image to 450Mb. Loading such a model takes 2 minutes on Spur VMs and 1m30s on pre-Spur VMs.

Using the stable Pharo VM, the analysis results were the following:
- total time spent loading the model: 2 minutes
- time spent in full GCs: 1 minute (4 full GCs)
- time spent in scavenges[1]: 15 seconds
Of the 2 minutes spent, 50% of the time goes to full GCs, 12.5% to scavenges, and 37.5% to executing code.
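A breakdown like this can be gathered from within the image itself. A minimal sketch, assuming the conventional Squeak/Pharo vmParameterAt: statistics indices (7 and 8 for the full-GC count and total milliseconds since startup, 9 and 10 for the same scavenge figures):

| fullGCs fullGCms scavenges scavengeMs |
"GC statistics since startup; the parameter indices below follow the
conventional Squeak/Pharo vmParameterAt: layout (assumed here)."
fullGCs := Smalltalk vm parameterAt: 7.
fullGCms := Smalltalk vm parameterAt: 8.
scavenges := Smalltalk vm parameterAt: 9.
scavengeMs := Smalltalk vm parameterAt: 10.
Transcript
	show: fullGCs printString , ' full GCs in ' , fullGCms printString , ' ms'; cr;
	show: scavenges printString , ' scavenges in ' , scavengeMs printString , ' ms'; cr

Sampling these counters before and after loading a model gives the kind of breakdown reported here.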

We then used the latest VM, which features the new compactor (VMs from early March 2017 onward). The full GC execution time went down from 1 minute to 2 seconds.

In addition, we increased the size of Eden[2] from 4Mb to 12Mb. Time spent in scavenges decreased from 15 seconds to 5 seconds.

Overall, loading the model is now taking ~50 seconds instead of 2 minutes.

To increase the Eden size, one needs to run a script similar to:

| currentEdenSize desiredEdenSize |
"Parameter 44 answers the current Eden size in bytes; parameter 45
sets the desired Eden size, taken into account at the next startup."
currentEdenSize := Smalltalk vm parameterAt: 44.
desiredEdenSize := currentEdenSize * 4.
Smalltalk vm parameterAt: 45 put: desiredEdenSize.

And then restart the image.

I hope this report can be useful for some of you. I will try to make a blog post out of it, detailing other GC settings one can change from the image to improve performance.

Best,

Clement

[1] A scavenge is basically a garbage collection of young objects only.
[2] Eden is basically the space where objects are initially allocated.
[3] All numbers in this report are orders of magnitude, not precise figures.




Re: Growing large images: the case of Moose models

CyrilFerlicot
On 03/03/2017 11:56, Clément Bera wrote:

Hi,

This is great! We will probably try it soon on our models.

Guillaume also had a question: what is the downside if we leave Eden at this size when we are not loading/exporting an MSE?

Because in our case we deploy a server that might need to read some MSE files, and with our current solution we cannot restart it. In that case it would be good to have more info to select the best Eden size for the server.

Thank you!

--
Cyril Ferlicot

http://www.synectique.eu

2 rue Jacques Prévert 01,
59650 Villeneuve d'ascq France


Re: Growing large images: the case of Moose models

Blondeau Vincent
In reply to this post by Clément Béra

Thanks Eliot for the implementation of the new compactor!

Just to add a piece of information: the imported MSE file has a size of 40Mb.

Cheers,

Vincent

 

 


Re: Growing large images: the case of Moose models

Clément Béra
In reply to this post by CyrilFerlicot


On Fri, Mar 3, 2017 at 12:12 PM, Cyril Ferlicot D. <[hidden email]> wrote:

> Hi,
>
> This is great! We will probably try it soon on our models.
>
> Guillaume also had a question: what is the downside if we leave Eden at
> this size when we are not loading/exporting an MSE?

There are two main drawbacks:
- You waste a bit of memory. If you increase Eden from 4Mb to 12Mb, you waste 8Mb.
- The user-visible pauses for scavenges may become more noticeable.

There are customers using a 64Mb Eden in production. It improves their performance; they do not mind wasting 60Mb on machines with 16Gb of RAM, and their application does heavy computation and does not need to be that responsive.

However, for UI applications (typically the IDE), the scavenge pauses may become significant enough to be noticed by the programmer. Maybe not at 12Mb, but certainly at 64Mb.

For your case:
- Do you care that your application uses a few more MB of RAM?
- Do you care if your application takes a couple of extra ms to answer a request? (In any case, the full GC takes much more time and also delays the answer right now.)
If you don't care, you can use a larger Eden. In any case, an Eden of 12 or 16Mb should be safe.
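If a fixed target is preferred over a multiplier, the desired size can also be set absolutely. A minimal sketch, using the same parameter 45 as the script from the original post (an image restart is still required):

"Request a 16Mb Eden; parameter 45 holds the desired Eden size in
bytes and is taken into account the next time the image starts."
Smalltalk vm parameterAt: 45 put: 16 * 1024 * 1024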
 
There are other settings that can be useful in your case. I will try to write a post about it.



Re: Growing large images: the case of Moose models

philippe.back@highoctane.be


On 3 Mar 2017 at 13:56, "Clément Bera" <[hidden email]> wrote:


I have images that do not shrink with the new compactor, and as there are no leaks I am wondering whether there is a way for me to diagnose why this happens. I routinely load 100MB of XML into the image, but these are only transient.
Shrinking works with a Pharo 6 image but not with a 5.0-based one. Strange.
Anyone willing to pair over the internet to have a look?
It is annoying because I am providing those images to customers, and an ever-growing image is not really great, especially with a non-standard tech I am trying to push forward there.

Phil


Re: Growing large images: the case of Moose models

Guillaume Larcheveque
In reply to this post by Clément Béra


On 2017-03-03 at 13:54 GMT+01:00, Clément Bera <[hidden email]> wrote:


> For your case:
> - Do you care that your application uses a few more MB of RAM?

No, we don't care about that.

> - Do you care if your application takes a couple of extra ms to answer a request? (In any case, the full GC takes much more time and also delays the answer right now.)

No, because we are only using Seaside and a browser to display visualisations and UI, so we don't care if it answers with a few extra ms; it will not freeze the UI at all.

> If you don't care, you can use a larger Eden. In any case, an Eden of 12 or 16Mb should be safe.
 

Thank you very much for your experiment, Clément and Vincent; this point is really important for us :-)

--
Guillaume Larcheveque


Re: Growing large images: the case of Moose models

CyrilFerlicot
On 03/03/2017 14:06, Guillaume Larcheveque wrote:
>
>     For your case:
>     - Do you care that your application use some more Mb of RAM ?
>
> No, we don't care about that

I agree

>
>     - Do you care if your application takes a couple extra ms to answer
>     a request ? (In any case, the full GC takes much more time and also
>     delays the answer right now)
>
> No, because we are only using seaside and browser to display
> visualisations and  UI, so we don't care that it answers with extra ms;
> it will not freeze at all the UI


In addition to Guillaume's comment: if it is a matter of ms we do not care, but a longer full GC can sometimes be problematic. Do you have an idea of how long the full GC would take depending on the Eden size? If it adds 500ms it should not be a problem; if it adds 2s it might be.



--
Cyril Ferlicot

http://www.synectique.eu

2 rue Jacques Prévert 01,
59650 Villeneuve d'ascq France



Re: [Moose-dev] Re: Growing large images: the case of Moose models

Tudor Girba-2
In reply to this post by Blondeau Vincent
Amazing work and great news!

Thanks a lot.

Doru

--

"Every thing has its own flow"

On 3 Mar 2017, at 13:31, Blondeau Vincent <[hidden email]> wrote:


Re: Growing large images: the case of Moose models

Clément Béra
In reply to this post by CyrilFerlicot


On Fri, Mar 3, 2017 at 2:10 PM, Cyril Ferlicot D. <[hidden email]> wrote:
> In addition to Guillaume's comment: if it is a matter of ms we do not care, but a longer full GC can sometimes be problematic. Do you have an idea of how long the full GC would take depending on the Eden size? If it adds 500ms it should not be a problem; if it adds 2s it might be.

I'm sorry, I was not clear.

The scavenges take up to a couple of ms and happen very frequently. Their duration depends on Eden size because Eden is what gets scavenged.

The full GC happens rarely. Its duration does not depend on Eden size but on old-space size. With the new compactor, it takes around 500ms for a 500Mb heap. I would assume it takes more time on larger heaps; on a 2Gb heap, I guess it could take 2 seconds. For this problem, we need to implement an incremental GC, which splits the full GC into multiple subtasks, allowing several small pauses instead of one big pause. The incremental GC implementation is on the TODO list. Your company is in the Pharo consortium, so if you discuss this problem during one of the consortium member meetings, you may ask to invest consortium money in this direction.



Re: Growing large images: the case of Moose models

Eliot Miranda-2
In reply to this post by philippe.back@highoctane.be
Hi Phil,

On Fri, Mar 3, 2017 at 5:04 AM, [hidden email] <[hidden email]> wrote:


I have images that do not shrink with the new compactor, and as there are no leaks, I am wondering if there is a way for me to diagnose why this happens. I am routinely loading 100MB of XML into the image, but these are only transient.

First of all, run the attached shell script on your image and report back what the "freespace" field value is. In the old compactor this would usually be tens to hundreds of kilobytes. With the new compactor this should be zero or perhaps a few hundred bytes (and I suspect the few hundred bytes should only show up in Newspeak images).
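For what it's worth, a related figure can also be read from inside the image. This is a sketch that assumes VM parameter 54 answers the total free old-space bytes on Spur (check your VM's `parameterAt:` documentation); it does not replace the header check above:

```smalltalk
"Sketch: inspect free old space from within a Spur image.
 Assumes parameter 54 answers the total free old-space bytes (nil on older VMs)."
| freeOldSpace |
freeOldSpace := Smalltalk vm parameterAt: 54.
Transcript show: 'Free old space: ', freeOldSpace printString, ' bytes'; cr
```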
 
Shrinking works with a Pharo 6 image but not with a 5.0-based one. Strange.

Look at the vm parameters for growth and shrinkage.
 
Anyone willing to pair over the internet to have a look?

Yes, but sometime next week.
 
It is annoying because I am providing those images to customers, and an ever-growing image is not really great, especially with a non-standard tech I am trying to push forward there.

Phil

--
Cyril Ferlicot

http://www.synectique.eu

2 rue Jacques Prévert 01,
59650 Villeneuve d'ascq France






--
_,,,^..^,,,_
best, Eliot

Re: Growing large images: the case of Moose models

philippeback
Eliot,

Script?

Best,
Phil



Re: Growing large images: the case of Moose models

Eliot Miranda-2
In reply to this post by Clément Béra
Hi Clément,

On Fri, Mar 3, 2017 at 4:54 AM, Clément Bera <[hidden email]> wrote:



There are 2 main downsides:
- You waste a bit of memory. If you increase from 4Mb to 12Mb, you waste 8Mb.
- The user-visible pauses for scavenges may be more significant.

There are customers using a 64Mb Eden in production. It improves their performance; they do not care about wasting 60Mb on a machine with 16Gb of RAM, and their application does heavy computation and does not need to be that responsive.

It is important to realize that scavenge time depends on the number of objects that survive, not on the size of new space.  So increasing the size of new space will only cause longer pause times when an application is growing the heap, which is the case when Moose reads models.  But if an application follows the standard pattern of creating many objects, most of which are collected, then a large Eden should not cause noticeably longer pause times.  This is because a scavenge copies the surviving objects from eden and past space into future space, overflowing into old space, tracing reachable objects only.  So only if lots of objects survive does scavenging touch lots of data.  If only a few objects survive, the scavenger touches (copies) only those objects.

The VM collects timing data, so you could do some experiments and measure the average scavenge time for the Moose application during its growth phase and then during its normal phase.  I think you'll find that the large new space is not an issue for normal usage.
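Such an experiment can be sketched from the image. The parameter indices here are my assumptions (9 for the scavenge count and 10 for the total milliseconds spent scavenging since startup, i.e. the classic incremental-GC counters), so check them against `parameterAt:` on your VM:

```smalltalk
"Sketch: average scavenge pause since startup.
 Assumes parameter 9 counts scavenges and parameter 10 their total milliseconds."
| scavenges totalMs |
scavenges := Smalltalk vm parameterAt: 9.
totalMs := Smalltalk vm parameterAt: 10.
Transcript
	show: 'Average scavenge: ';
	show: (totalMs / scavenges) asFloat printString;
	show: ' ms';
	cr
```

Running this once before and once after the growth phase, and differencing the counters, would give the per-phase averages.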

There is another variable that can affect pause times, and that is the number of stack pages.  The roots of a scavenging collection are the remembered table and the stack zone.  So the larger the stack zone, the more time is spent scanning the stack looking for references to objects in new space.  This is a difficult trade-off.  If one has lots of Smalltalk processes in one's application with lots of context switches between them (this is the threading benchmark), then one wants lots of stack pages, because otherwise a process switch may involve evicting some contexts from a stack page in order to make room for the top context of a newly runnable process.  But the more stack pages one uses, the slower scavenging becomes.

Cog's default used to be very high (160 pages IIRC) which was determined at Qwaq, whose Teatime application uses lots of futures.  I've reduced the default to 50 but it is an important variable to play with.
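To play with this variable, one could try something along these lines. The indices 42 (current number of stack pages) and 43 (desired number) are my assumption, mirroring the read/write pattern of the Eden parameters 44/45, so verify them before relying on this:

```smalltalk
"Sketch: read the current stack-page count and request the reduced default of 50.
 Assumes parameter 42 reads the count and parameter 43 sets the desired count."
| currentPages |
currentPages := Smalltalk vm parameterAt: 42.
Transcript show: 'Current stack pages: ', currentPages printString; cr.
Smalltalk vm parameterAt: 43 put: 50  "takes effect after the image restarts"
```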


However for UI applications (typically the IDE), the scavenge pauses may become significant enough to be noticed by the programmer. Maybe not at 12Mb, but certainly at 64Mb.

I doubt this very much (because of the argument above).  Remember that the scavenger is a garbage collector specifically designed to work well with systems like Smalltalk where lots of intermediate objects are created when computing results.  Scavenging doesn't touch objects that are reclaimed, only objects that survive.  So this works well.  I think you'll find that GUI applications fit this pattern very well, and so a large new space should not present a problem.
 


--
_,,,^..^,,,_
best, Eliot

Re: Growing large images: the case of Moose models

Eliot Miranda-2
In reply to this post by philippeback
oops :-)





--
_,,,^..^,,,_
best, Eliot

svihdr (2K) (attachment)

Re: Growing large images: the case of Moose models

CyrilFerlicot
In reply to this post by Eliot Miranda-2
Awesome!

I look forward to finding the time to test all this :)

--
Cyril Ferlicot

http://www.synectique.eu

2 rue Jacques Prévert 01,
59650 Villeneuve d'ascq France



Re: [squeak-dev] Growing large images: the case of Moose models

Clément Béra
In reply to this post by Eliot Miranda-2


On Mar 3, 2017 17:22, "Eliot Miranda" <[hidden email]> wrote:

I agree with you conceptually.

In my measurements, with a non-growing application and the exact same code, with a 3.8Mb Eden the scavenge takes 0.62ms (average over 504 scavenges), while it takes 0.97ms (average over 186 scavenges) with an 8.9Mb Eden. So with a 2.5 times bigger Eden, I get 1.5 times slower scavenges, and scavenges 2.5 times less frequent. An X times bigger Eden does not imply an X times slower scavenge, but there is a difference.

Obviously I would need measurements on different applications to conclude. I tried multiple times with different portions of code and got similar results. Maybe if Eden is larger, more objects survive per scavenge even though a smaller overall proportion of the scavenged objects survive? Or maybe I picked unconventional code (I used test suites)?

Based on what you said, I guess the only disadvantage of a larger Eden is then a bit of memory wasted. Not a problem for Gb-sized heaps on machines with large RAM.

I have not tried to change the number of stack pages. 

In any case, the main issue was the compactor, and you fixed it.







Re: Growing large images: the case of Moose models

stepharong
In reply to this post by Clément Béra
tx clement

this is indeed really important


--
Using Opera's mail client: http://www.opera.com/mail/

Re: Growing large images: the case of Moose models

Clément Béra
Hi again,

I summarised my experiment with Moose and 2 other experiments in a "Tuning the Pharo garbage collector" blog post here:

I may integrate some new methods in the VirtualMachine class in Pharo in the next few months to make VM tuning easier to deal with. I will update the scripts in the blog post accordingly if I do so.

Best,
