significant Moose-Core internal changes

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

significant Moose-Core internal changes

Tudor Girba-2
Hi,

I would like to signal that Diego and Stephan worked on some internal changes to Moose-Core that can have impact on other code. So, here is a little review to start the discussion:


- mooseName is now cached. This can have an impact if you rely on your model to be more dynamic. If you have a custom entity that can receive sub-entities at runtime, you might want to resetMooseName. For example, if you modify the scope of a FAMIXType, the mooseName has to be reset:

FAMIXType>>container: aContainerEntity
container := FMMultivalueLink on: self
update: #types
from: self container
to: aContainerEntity.
self resetMooseName

This is supposed to be an optimization, but it makes the code more complicated. To see if it is actually worth it, we should benchmark the impact.

@Diego, @Stephan: Did you do some benchmarking on this?


- An entity now answers if it hasUniqueMooseNameInModel. This is at the moment used when caching the entity by name in groups. For example, Associations do not have a unique name and thus, they are not cached.


- There are some parts of the code I do not understand. For example, the need of this code:

MooseGroupStorage>>becomeKind: elementStorageClass   
...
self do: [ :each | each hasUniqueMooseNameInModel ifTrue: [ each privateClearMooseName ] ].
...


Did I miss anything?


Cheers,
Doru


--

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
Reply | Threaded
Open this post in threaded view
|

Re: significant Moose-Core internal changes

Usman Bhatti
Hello,

I adapted our downstream tests to work with these changes and one of the things I remarked is that now you need to set the container of the entity before you add it to the model otherwise mooseName used for caching might contain nils. This happens because MooseGroupRuntimeStorage now caches the entities when the entities are added to the model (earlier the caching mechanism was lazy). So, I did a bit of fixing for our tools. So far, the tests are green.

Usman


On Wed, Sep 25, 2013 at 6:02 AM, Tudor Girba <[hidden email]> wrote:
Hi,

I would like to signal that Diego and Stephan worked on some internal changes to Moose-Core that can have impact on other code. So, here is a little review to start the discussion:


- mooseName is now cached. This can have an impact if you rely on your model to be more dynamic. If you have a custom entity that can receive sub-entities at runtime, you might want to resetMooseName. For example, if you modify the scope of a FAMIXType, the mooseName has to be reset:

FAMIXType>>container: aContainerEntity
container := FMMultivalueLink on: self
update: #types
from: self container
to: aContainerEntity.
self resetMooseName

This is supposed to be an optimization, but it makes the code more complicated. To see if it is actually worth it, we should benchmark the impact.

@Diego, @Stephan: Did you do some benchmarking on this?


- An entity now answers if it hasUniqueMooseNameInModel. This is at the moment used when caching the entity by name in groups. For example, Associations do not have a unique name and thus, they are not cached.


- There are some parts of the code I do not understand. For example, the need of this code:

MooseGroupStorage>>becomeKind: elementStorageClass   
...
self do: [ :each | each hasUniqueMooseNameInModel ifTrue: [ each privateClearMooseName ] ].
...


Did I miss anything?


Cheers,
Doru


--

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev



_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
Reply | Threaded
Open this post in threaded view
|

Re: significant Moose-Core internal changes

DiegoLont
In reply to this post by Tudor Girba-2
Hi Doru,

On Sep 25, 2013, at 6:02 AM, Tudor Girba wrote:
Hi,

I would like to signal that Diego and Stephan worked on some internal changes to Moose-Core that can have impact on other code. So, here is a little review to start the discussion:


- mooseName is now cached. This can have an impact if you rely on your model to be more dynamic. If you have a custom entity that can receive sub-entities at runtime, you might want to resetMooseName. For example, if you modify the scope of a FAMIXType, the mooseName has to be reset:

FAMIXType>>container: aContainerEntity
container := FMMultivalueLink on: self
update: #types
from: self container
to: aContainerEntity.
self resetMooseName

This is supposed to be an optimization, but it makes the code more complicated. To see if it is actually worth it, we should benchmark the impact.

@Diego, @Stephan: Did you do some benchmarking on this?

We did not run an official benchmark, but we used Moose to make a deprecation finder, to look at all recent code in Smalltalk hub if there are calls to deprecated methods. Actually it finds all calls made and stores them all, and we do not link them to deprecated methods yet, but the work of this last step is to build a list of deprecated methods. The deprecation finder uses a MonticelloImporter to import a single monticello packages into a new Moose model and then stores the signatures of the invocations.

Doing this, we experienced that this was rather slow, so we dived into the code to search for places we could optimize. We noted that especially the larger packages suffered from the lookup "entityAt:" to build the calls. It searched linear through all entities (about 64.000), to create the candidates for an invocation. Since there are plenty of invocations in a normal sized package, this took rather long.

After revising the code we had to build in a delay, not to bring smalltalkhub down. And with the delay (of 500 ms per monticello package) it is still faster then it used to be. So I think this proves the performance upgrade enough. But if you have suggestions for a good benchmark, we are willing to run this. We do not think the tests provide a good benchmark, since these contain small models and test for atypical use (renames and such). Maybe it is an idea to use running moose on moose as benchmark?

In the comments of the mooseName, it already states that the mooseName should not change. However we found out that plenty of code (especially tests) relied on the fact that it could change, so we decided to build in a failsafe. Yes, it makes the code more complicated, but it allows to set properties, that causes the mooseName to change, after it is added to the model (like the name and container).

- An entity now answers if it hasUniqueMooseNameInModel. This is at the moment used when caching the entity by name in groups. For example, Associations do not have a unique name and thus, they are not cached.

- There are some parts of the code I do not understand. For example, the need of this code:

MooseGroupStorage>>becomeKind: elementStorageClass
...
self do: [ :each | each hasUniqueMooseNameInModel ifTrue: [ each privateClearMooseName ] ].

This might be an old code snippet, that was needed to reset the cache. Later we made it possible to reset the cache more precise, so it is probably redundant now. We should test if we can remove it now. It makes sure that when you go from own type of storage to another type, all Moose names are recalculated.


Did I miss anything?


Cheers,
Doru


--

"Every thing has its own flow"
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev


_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
Reply | Threaded
Open this post in threaded view
|

Re: significant Moose-Core internal changes

DiegoLont
Hi,

Yes, maybe we have missed something … because the Smalltalkimporter shows no performance improvement.

When we look at the code and search for usages of "entityNamed:" we count a lot of occurrences in the test, and one in the monticello importer. 

So maybe a question that is more important is: do we need to be able to search on mooseName fast? Or does it suffice if the storage only stores by type: since this is the use case that is most apparent (allClasses, allTypes, etc.) and used. Importers need to link, so they need to be able to use a dictionary for name lookup. But this lookup could be private.

So our question is now:
- should we move our optimization to the monticello importer
and:
- throw out the by name.
- throw out the elements

Well, going to lunch now.

Cheers,
Diego

Hi Doru,

On Sep 25, 2013, at 6:02 AM, Tudor Girba wrote:
Hi,

I would like to signal that Diego and Stephan worked on some internal changes to Moose-Core that can have impact on other code. So, here is a little review to start the discussion:


- mooseName is now cached. This can have an impact if you rely on your model to be more dynamic. If you have a custom entity that can receive sub-entities at runtime, you might want to resetMooseName. For example, if you modify the scope of a FAMIXType, the mooseName has to be reset:

FAMIXType>>container: aContainerEntity
container := FMMultivalueLink on: self
update: #types
from: self container
to: aContainerEntity.
self resetMooseName

This is supposed to be an optimization, but it makes the code more complicated. To see if it is actually worth it, we should benchmark the impact.

@Diego, @Stephan: Did you do some benchmarking on this?

We did not run an official benchmark, but we used Moose to make a deprecation finder, to look at all recent code in Smalltalk hub if there are calls to deprecated methods. Actually it finds all calls made and stores them all, and we do not link them to deprecated methods yet, but the work of this last step is to build a list of deprecated methods. The deprecation finder uses a MonticelloImporter to import a single monticello packages into a new Moose model and then stores the signatures of the invocations.

Doing this, we experienced that this was rather slow, so we dived into the code to search for places we could optimize. We noted that especially the larger packages suffered from the lookup "entityAt:" to build the calls. It searched linear through all entities (about 64.000), to create the candidates for an invocation. Since there are plenty of invocations in a normal sized package, this took rather long.

After revising the code we had to build in a delay, not to bring smalltalkhub down. And with the delay (of 500 ms per monticello package) it is still faster then it used to be. So I think this proves the performance upgrade enough. But if you have suggestions for a good benchmark, we are willing to run this. We do not think the tests provide a good benchmark, since these contain small models and test for atypical use (renames and such). Maybe it is an idea to use running moose on moose as benchmark?

In the comments of the mooseName, it already states that the mooseName should not change. However we found out that plenty of code (especially tests) relied on the fact that it could change, so we decided to build in a failsafe. Yes, it makes the code more complicated, but it allows to set properties, that causes the mooseName to change, after it is added to the model (like the name and container).

- An entity now answers if it hasUniqueMooseNameInModel. This is at the moment used when caching the entity by name in groups. For example, Associations do not have a unique name and thus, they are not cached.

- There are some parts of the code I do not understand. For example, the need of this code:

MooseGroupStorage>>becomeKind: elementStorageClass
...
self do: [ :each | each hasUniqueMooseNameInModel ifTrue: [ each privateClearMooseName ] ].

This might be an old code snippet, that was needed to reset the cache. Later we made it possible to reset the cache more precise, so it is probably redundant now. We should test if we can remove it now. It makes sure that when you go from own type of storage to another type, all Moose names are recalculated.


Did I miss anything?


Cheers,
Doru


--

"Every thing has its own flow"
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev


_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
Reply | Threaded
Open this post in threaded view
|

Re: significant Moose-Core internal changes

abergel
In reply to this post by Tudor Girba-2
Thank you guys!

Alexandre

On 25 Sep 2013, at 01:02, Tudor Girba <[hidden email]> wrote:

Hi,

I would like to signal that Diego and Stephan worked on some internal changes to Moose-Core that can have impact on other code. So, here is a little review to start the discussion:


- mooseName is now cached. This can have an impact if you rely on your model to be more dynamic. If you have a custom entity that can receive sub-entities at runtime, you might want to resetMooseName. For example, if you modify the scope of a FAMIXType, the mooseName has to be reset:

FAMIXType>>container: aContainerEntity
container := FMMultivalueLink on: self
update: #types
from: self container
to: aContainerEntity.
self resetMooseName

This is supposed to be an optimization, but it makes the code more complicated. To see if it is actually worth it, we should benchmark the impact.

@Diego, @Stephan: Did you do some benchmarking on this?


- An entity now answers if it hasUniqueMooseNameInModel. This is at the moment used when caching the entity by name in groups. For example, Associations do not have a unique name and thus, they are not cached.


- There are some parts of the code I do not understand. For example, the need of this code:

MooseGroupStorage>>becomeKind: elementStorageClass   
...
self do: [ :each | each hasUniqueMooseNameInModel ifTrue: [ each privateClearMooseName ] ].
...


Did I miss anything?


Cheers,
Doru


--

"Every thing has its own flow"
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
Reply | Threaded
Open this post in threaded view
|

Re: significant Moose-Core internal changes

DiegoLont
In reply to this post by DiegoLont
Doru,

You were right we should add performance tests … so we ran some. And the only thing that has a significant speedup is the monticello import. We probably need to rewrite this not to use "entityNamed:" …

Importing whitestar (Delphi project using the Delphi importer) gives some interesting results:
Old version: ca. 12 min. 50
New version: ca. 12 min.
stripped version: ca. 10 min. 50.
That is: CPU-time pharo needed for loading + displaying the Moose model.

I guess that the new version improved performance comes from initializing the byName at the size of capacity instead of 10.000.

So, the question is now:
- should I commit this stripped version, where the byName is gone entirely and the elements are only calculated on demand.
(after I make the performance tweak to the Monticello Importer)
- should I revert the mooseName change (I guess I best leave out the privateState entirely to avoid confusion, since the old never caches anyway).

In the mean time I am running some more tests.

Cheers,
Diego

On Sep 25, 2013, at 12:05 PM, Diego Lont wrote:
- mooseName is now cached. This can have an impact if you rely on your model to be more dynamic. If you have a custom entity that can receive sub-entities at runtime, you might want to resetMooseName. For example, if you modify the scope of a FAMIXType, the mooseName has to be reset:

FAMIXType>>container: aContainerEntity
container := FMMultivalueLink on: self
update: #types
from: self container
to: aContainerEntity.
self resetMooseName

This is supposed to be an optimization, but it makes the code more complicated. To see if it is actually worth it, we should benchmark the impact.

@Diego, @Stephan: Did you do some benchmarking on this?


_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
Reply | Threaded
Open this post in threaded view
|

Re: significant Moose-Core internal changes

Tudor Girba-2
Hi,

I thought so. However (there always is a however :)), entityNamed: is mostly used when you program queries with Moose.

The goal of the name cache in the group was actually to support that use case. But, it's true that if it costs too much to build the cache, it is not worth it in the end.

So, for this purpose maybe it makes sense to keep the performance optimization. To check the advantage, we should compare the penalty on the group creation time with and without the name cache.

Cheers,
Doru



On Wed, Sep 25, 2013 at 2:53 PM, Diego Lont <[hidden email]> wrote:
Doru,

You were right we should add performance tests … so we ran some. And the only thing that has a significant speedup is the monticello import. We probably need to rewrite this not to use "entityNamed:" …

Importing whitestar (Delphi project using the Delphi importer) gives some interesting results:
Old version: ca. 12 min. 50
New version: ca. 12 min.
stripped version: ca. 10 min. 50.
That is: CPU-time pharo needed for loading + displaying the Moose model.

I guess that the new version improved performance comes from initializing the byName at the size of capacity instead of 10.000.

So, the question is now:
- should I commit this stripped version, where the byName is gone entirely and the elements are only calculated on demand.
(after I make the performance tweak to the Monticello Importer)
- should I revert the mooseName change (I guess I best leave out the privateState entirely to avoid confusion, since the old never caches anyway).

In the mean time I am running some more tests.

Cheers,
Diego

On Sep 25, 2013, at 12:05 PM, Diego Lont wrote:
- mooseName is now cached. This can have an impact if you rely on your model to be more dynamic. If you have a custom entity that can receive sub-entities at runtime, you might want to resetMooseName. For example, if you modify the scope of a FAMIXType, the mooseName has to be reset:

FAMIXType>>container: aContainerEntity
container := FMMultivalueLink on: self
update: #types
from: self container
to: aContainerEntity.
self resetMooseName

This is supposed to be an optimization, but it makes the code more complicated. To see if it is actually worth it, we should benchmark the impact.

@Diego, @Stephan: Did you do some benchmarking on this?


_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev




--

"Every thing has its own flow"

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev