To a unified software model repository for Moose.

Fabrizio Perin-3

Re: To a unified software model repository for Moose.

>
> Does anybody have any idea about any possible usage of such kind of information ?

You are talking about meta-information.
In FAMIX 1.0, the file header contained one entity describing the model.
We could either:
- have a separate file in MSE format (so the parser can be reused) containing the information and the file name of the model it refers to,
- or keep the information inside the MSE model itself, in an entity representing the model (we did that in some old Moose versions).

I would prefer a separate file. If the idea is to set up a repository of projects, the meta-information is about a project together with its MSE file, not just about the MSE file.
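
To make the separate-file option a bit more concrete, here is a minimal Python sketch (Python is used only for illustration; nothing below is part of Moose or Fame). It renders one metadata entity as an MSE-style s-expression, so that an MSE parser could in principle read it. The entity type `Meta.ModelInfo`, the property names, the file name `argouml.mse`, and all values are hypothetical, and the quote-doubling escape is an assumption about MSE string syntax.

```python
def render_metadata(entity_type, properties, entity_id=1):
    """Render one entity as an MSE-style s-expression.

    Assumption: strings are single-quoted and an embedded quote is doubled.
    """
    parts = []
    for name, value in properties.items():
        if isinstance(value, str):
            rendered = "'" + value.replace("'", "''") + "'"
        else:
            rendered = str(value)  # numbers are written as-is
        parts.append(f"({name} {rendered})")
    body = " ".join(parts)
    return f"(({entity_type} (id: {entity_id}) {body}))"


# Hypothetical sidecar content describing one model file (made-up values).
sidecar = render_metadata(
    "Meta.ModelInfo",
    {
        "modelFile": "argouml.mse",
        "extractor": "VerveineJ",
        "extractionDate": "2012-07-24",
        "numberOfClasses": 1200,
    },
)
print(sidecar)
```

The sidecar would sit next to the model file in the repository, so project-level information can change without touching the MSE file itself.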

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Mircea Filip Lungu-2

Re: To a unified software model repository for Moose.

Hi guys,

Some small observations:

Reduced Models

- Having reduced (partial) models is a nice idea. An even nicer one would be to have partial loading of the model. But until that happens, I guess the partial models could be a solution... And still, having the full Qualitas Corpus loaded even with partial models is impossible. I guess this is what Stef proposed to address by having the Project be just a proxy and loading it on demand, only that this might be too expensive.

In my experience, loading an MSE file for an average system takes between 3 and 5 minutes. Imagine that you want to collect the number of classes in each of the systems in the QC: 150 systems * 3 minutes each for loading the model = 7.5 hours. And imagine that after that you want to count the number of methods in each of the systems... This is why Andrea and I thought about caching Moose images preloaded with the systems in the Qualitas Corpus. That way I could run a new analysis on the entire corpus in 10 minutes: I open an image, load the latest version of my analysis package if needed, run the analysis, return the results, and close.
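
The back-of-the-envelope estimate above can be checked with a few lines of Python (used here only as a calculator; the 150 systems and the 3 to 5 minutes per load are the figures from this message):

```python
def corpus_load_hours(systems, minutes_per_load):
    """Total time, in hours, to load every system in the corpus once."""
    return systems * minutes_per_load / 60


for minutes in (3, 5):
    print(f"{minutes} min/system -> {corpus_load_hours(150, minutes)} hours")
```

So every pass over the corpus costs between 7.5 and 12.5 hours of loading alone, before any analysis runs.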

On the other hand, an alternative to this image-per-system approach could be to serialize the FAMIX models with Fuel and hope that loading an entire system that way would be waaay faster than loading an MSE file.
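
The "parse once, then reload a binary snapshot" idea can be sketched in Python, with `pickle` standing in for Fuel. This is only an analogy: `parse_mse` is a hypothetical placeholder for the slow import step, the `.snapshot` suffix is made up, and the sketch says nothing about how much faster Fuel would actually be.

```python
import os
import pickle


def parse_mse(path):
    """Hypothetical stand-in for the slow MSE import: one entry per line."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def load_model(mse_path):
    """Return the model, reusing a binary snapshot when it is up to date."""
    snapshot = mse_path + ".snapshot"
    if (os.path.exists(snapshot)
            and os.path.getmtime(snapshot) >= os.path.getmtime(mse_path)):
        with open(snapshot, "rb") as f:
            return pickle.load(f)   # fast path: no parsing
    model = parse_mse(mse_path)     # slow path: parse the text file
    with open(snapshot, "wb") as f:
        pickle.dump(model, f)       # cache for the next run
    return model
```

Unlike a saved image, the snapshot holds only the model, but it is still tied to the class definitions that were current when it was written.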

Evolutionary Analysis

- I seem to remember talking at some point with Andy Kellens, who mentioned that they had an incremental model of FAMIX for evolutionary analysis. This would allow them to represent only the deltas between two versions. We had this discussion last year at SATToSE. Does anybody know anything about that? That would be the right solution for doing evolutionary analysis. Otherwise we should just be happy with analyzing at most a dozen versions at once, as many as fit in memory. Or with loading things on demand, if we find a fast way.
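
To illustrate what "only representing the deltas" could mean, here is a minimal Python sketch. It is not the incremental FAMIX model mentioned above, whose design is not described in this thread: a version is reduced to a set of entity names, and `Delta`, `diff`, and `apply_delta` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    added: frozenset
    removed: frozenset


def diff(old, new):
    """Delta that turns version `old` into version `new`."""
    return Delta(frozenset(new - old), frozenset(old - new))


def apply_delta(base, delta):
    """Rebuild the newer version from the older one and a delta."""
    return (base - delta.removed) | delta.added


v1 = frozenset({"Parser", "Lexer"})
v2 = frozenset({"Parser", "Lexer", "Printer"})
v3 = frozenset({"Parser", "Printer"})

# Keep only the first version in full, plus one small delta per step.
history = [diff(v1, v2), diff(v2, v3)]

current = v1
for delta in history:
    current = apply_delta(current, delta)
```

After the loop, `current` equals `v3`, although only `v1` and two one-element deltas were stored.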

Memory

Did anybody look into using GemStone...?

Server

Jannik, I guess we could use one of the servers at SCG if needed for serving this information.

Cheers,

On Tue, Jul 24, 2012 at 11:40 AM, Fabrizio Perin <[hidden email]> wrote:

>
> Does anybody have any idea about any possible usage of such kind of information ?

You are talking about meta-information.
In FAMIX 1.0, the file header contained one entity describing the model.
We could either:
- have a separate file in MSE format (so the parser can be reused) containing the information and the file name of the model it refers to,
- or keep the information inside the MSE model itself, in an entity representing the model (we did that in some old Moose versions).

I would prefer a separate file. If the idea is to set up a repository of projects, the meta-information is about a project together with its MSE file, not just about the MSE file.

Nicolas Anquetil

Re: To a unified software model repository for Moose.

On 24/07/12 15:38, Mircea Filip Lungu wrote:

>
> In my experience, loading a MSE file for an average system takes
> between 3 and 5 minutes. Imagine that you want to collect the number
> of classes in each of the systems in the QC. you have 150 systems * 3
> minutes each for loading the model = 7.5 hours. And imagine that after
> that you want to count the number of methods in each of the systems...
> This is why with Andrea we thought about caching Moose images
> preloaded with the systems in the Qualitas Corpus. In this way I could
> run a new analysis on the entire corpus in 10 minutes: I open an
> image, load the latest version of my analysis package if needed, run
> the analysis, return the results and close.

- Images are dependent on the Smalltalk code.
In three years these images will be completely outdated, and one will need
to reload all of Pharo and Moose to be able to use them, which will
probably take much more than 5 minutes.

Which is not to say that loading should not be faster.
Actually, saving a model is very slow too...
Maybe it is something in the way Fame is implemented?

- From an experimental point of view, 8 hours is not a lot. You do it
once, you get your results, and that's it.

Andre Hora

Re: To a unified software model repository for Moose.

In reply to this post by Mircea Filip Lungu-2

On Tue, Jul 24, 2012 at 3:38 PM, Mircea Filip Lungu <[hidden email]> wrote:

> Hi guys,
>
> Some small observations:
>
> Reduced Models
> - Having reduced (partial) models is a nice idea.

Yes. You can get reduced models either by:

- (1) exporting a reduced (MSE) version from VerveineJ (I guess this is implemented);

- (2) or importing your full model into Moose and then exporting the entities you want (this is also implemented, but I am not sure it is integrated).

> An even nicer one would be to have partial loading of the model.

With this idea, you need to be aware that you will lose (a lot of) data, because several (derived) properties are calculated by Moose and are not directly available in your MSE file. For instance, suppose you want to load only the entities FAMIXPackage and FAMIXClass. Then your classes will not have information about their attributes, since FAMIXClass>>attributes is derived (in fact, you need to have FAMIXAttribute in your model). So, AFAIK, this is not a good idea, and option (1) above is not so good either. Option (2) is better.
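
The loss of derived information can be shown with a small Python sketch. It does not use Moose or Fame; the dictionary layout, `load`, and `attributes_of` are hypothetical, and only the entity type names come from the message above.

```python
# A toy "MSE file": a flat list of entities, as an importer might see them.
MSE_ENTITIES = [
    {"type": "FAMIXPackage", "name": "core"},
    {"type": "FAMIXClass", "name": "Parser", "package": "core"},
    {"type": "FAMIXAttribute", "name": "buffer", "parentType": "Parser"},
]


def load(entities, wanted_types=None):
    """Load everything, or only the entity types listed in `wanted_types`."""
    if wanted_types is None:
        return list(entities)
    return [e for e in entities if e["type"] in wanted_types]


def attributes_of(model, class_name):
    """Derived property: computed from the attribute entities in the model."""
    return [
        e["name"]
        for e in model
        if e["type"] == "FAMIXAttribute" and e["parentType"] == class_name
    ]


full = load(MSE_ENTITIES)
partial = load(MSE_ENTITIES, wanted_types={"FAMIXPackage", "FAMIXClass"})
```

With the full model, `attributes_of(full, "Parser")` finds `buffer`; with the partial one it returns an empty list, and nothing signals that the data was filtered out rather than absent.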

> Evolutionary Analysis
> - I seem to remember talking at some point with Andy Kellens and him mentioning that they had an incremental model of FAMIX for modeling evolutionary analysis. This would allow them to only represent the deltas between two versions. We had this discussion last year at Sattose. Does anybody know anything about that? That would be the right solution for doing evolutionary analysis. Otherwise we should just be happy with analyzing at most a dozen versions at once, as many as fit in memory. Or loading things on demand if we find a fast way.

If you need just class and package data, then I think Hismo will work pretty well. Models with just classes and packages are very small, and you can load hundreds of small models in Hismo.

--
Andre Hora

Andrea Caracciolo

Re: To a unified software model repository for Moose.

> - the idea of unified repository is very nice for research. It is important if we want to be able to share results, and reproduce experiments.
> Could it have some interest for industry projects?
> For example, it could be useful as a comparison baseline for a company to understand how good or bad its own software is compared to some other (presumably open source) projects ...

Our idea is to build an open repository containing a collection of models from a large group of projects (the Qualitas Corpus), where each of these models can be annotated and expanded with analysis results.
Analysis results could be anything (e.g., all classes of a certain project can be annotated with a certain metric; two classes can be related by a duplication relationship; etc.).
The resulting "expanded models" can be queried or downloaded (useful if you want to compare the data against the results of an industrial project analysis).
MSE files are used as a way to import a simple version of the project model (without any contributed analysis annotations) into Moose and enable new analyses.

> Yes. You can have reduced models either by:
> - (1) exporting a reduced (MSE) version from VerveineJ (I guess this is implemented);
> - (2) or, importing your full model in Moose and then exporting the entities you want (this is also implemented, not sure if this integrated);

Distributing reduced MSE models of a project does not make much sense to me. You would have to take multiple analysis requirements into consideration and build a reduced model for each of them.
Would it be possible (and would it make sense) to create a web service that interprets Moose queries, runs them on the specified target image, and returns a serialized representation of the resulting objects?
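
As a rough sketch of the request/response shape such a service could have (in Python, with no real HTTP layer and no Moose involved): the in-memory `MODELS` table stands in for the target images, and the query names `classCount` and `classNames`, like everything else here, are hypothetical.

```python
import json

# Hypothetical in-memory "images": project name -> list of entities.
MODELS = {
    "demo": [
        {"type": "FAMIXClass", "name": "Parser"},
        {"type": "FAMIXClass", "name": "Lexer"},
        {"type": "FAMIXMethod", "name": "parse"},
    ],
}

# Hypothetical named queries the service agrees to run.
QUERIES = {
    "classCount": lambda model: sum(e["type"] == "FAMIXClass" for e in model),
    "classNames": lambda model: sorted(
        e["name"] for e in model if e["type"] == "FAMIXClass"
    ),
}


def handle_request(body):
    """Run the requested query on the requested model; reply with JSON."""
    request = json.loads(body)
    model = MODELS[request["project"]]
    result = QUERIES[request["query"]](model)
    return json.dumps({"query": request["query"], "result": result})


reply = handle_request(json.dumps({"project": "demo", "query": "classCount"}))
```

Restricting clients to a fixed set of named queries is a simplification made for this sketch; interpreting arbitrary Moose queries, as asked above, would need a real query language and some sandboxing.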

> - keeping source code (and much more) is important because you never know what people would like to do in the future.
> It happened to a team in Brazil recently, they correlated some metrics to bugs.
> So they needed on top of the MSEs: access to a bug-tracking system to identify bug-identifiers; access to SVN commit comments to identify bug-fixing commits; access to the code to know what methods/classes were changed to correct the bug.
> Anyway, people want to see source code. If something looks strange in a visualization, you want to go back to the code to understand what's happening.

Since qualitascorpus.com already provides and maintains a good collection of Java projects, we thought about reusing that valuable source of information and building the MSE repository as a complement to the Qualitas Corpus.
This means that the MSE files would be generated and hosted by us, while the source code could be downloaded from qualitascorpus.com.

Andrea

Stéphane Ducasse

Re: To a unified software model repository for Moose.

>>
>
> Our idea is to build an open repository containing a collection of models from a large group of projects (qualitas corpus), where each of these models can be annotated and expanded with analysis results.
> Analysis results could be anything (i.e. all classes of a certain project can be annotated with a certain metric; two classes can be related by a duplication relationship; etc.. ).
> The resulting "expanded models" can be queried or downloaded (useful if you want to compare the data against results of an industrial project analysis).
> MSE files are used as a way to import a simple version (no contributed analysis results annotations) of the project model into moose and enable new analyses.

OK.
What I'm saying is that as soon as you want to do analysis, you also need the infrastructure to support partial information.

>
>> Yes. You can have reduced models either by:
>> - (1) exporting a reduced (MSE) version from VerveineJ (I guess this is implemented);
>> - (2) or, importing your full model in Moose and then exporting the entities you want (this is also implemented, not sure if this integrated);
>
> Distributing reduced MSE models of a project does not make much sense to me.

:)
My point is that having a server is one thing, but having an infrastructure to perform analyses is what matters, and this is what we brainstormed about.

> You have to take in consideration multiple analysis requirements and build a reduced model for each of them.
> Would it be possible (and would it make sense) to create a web service which interprets moose queries, runs them on the specified target image and returns a serialized representation of the returned objects ?
>
>
>> - keeping source code (and much more) is important because you never know what people would like to do in the future.
>> It happened to a team in Brazil recently, they correlated some metrics to bugs.
>> So they needed on top of the MSEs: access to a bug-tracking system to identify bug-identifiers; access to SVN commit comments to identify bug-fixing commits; access to the code to know what methods/classes were changed to correct the bug.
>> Anyway, people want to see source code. If something looks strange in a visualization, you want to go back to the code to understand what's happening.
>
> Since qualitascorpus.com already provides and maintains a good collection of java projects, we thought about reusing that valuable source of information and building the MSE repository as a complement to Qualitas Corpus.

I would always version the source code and the MSE file together (as well as the version of the tools used to extract it).
Disk space is cheap, and you always want to keep all the information if you can.

> This means that MSE files would be generated and hosted by us and source code can be downloaded from qualitascorpus.com.

I would follow an object-oriented approach: project with all data together and encapsulated.
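
One way to picture "all data together and encapsulated" is a single record per project version, sketched here in Python. The class `ProjectVersion`, its fields, and the file names are hypothetical and only illustrate the idea of versioning source, MSE file, and extractor together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectVersion:
    """One project version with all of its artifacts kept together."""
    name: str
    version: str
    source_archive: Optional[str] = None
    mse_file: Optional[str] = None
    extractor: Optional[str] = None
    extractor_version: Optional[str] = None

    def missing_artifacts(self):
        """Names of the artifacts that have not been recorded yet."""
        artifacts = {
            "source_archive": self.source_archive,
            "mse_file": self.mse_file,
            "extractor": self.extractor,
            "extractor_version": self.extractor_version,
        }
        return [key for key, value in artifacts.items() if value is None]


entry = ProjectVersion(
    name="argouml",
    version="0.34",
    source_archive="argouml-0.34-src.zip",
    mse_file="argouml-0.34.mse",
    extractor="VerveineJ",
)
```

Here `entry.missing_artifacts()` reports that the extractor version was not recorded, which is exactly the kind of gap that makes an experiment hard to reproduce later.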

Stéphane Ducasse

Re: To a unified software model repository for Moose.

In reply to this post by Andre Hora

On Jul 24, 2012, at 4:54 PM, Andre Hora wrote:

>
>
> On Tue, Jul 24, 2012 at 3:38 PM, Mircea Filip Lungu <[hidden email]> wrote:
> Hi guys,
>
> Some small observations:
>
> Reduced Models
> - Having reduced (partial) models is a nice idea.
>
> Yes. You can have reduced models either by:
> - (1) exporting a reduced (MSE) version from VerveineJ (I guess this is implemented);
> - (2) or, importing your full model in Moose and then exporting the entities you want (this is also implemented, not sure if this integrated);

Andre, didn't you implement a filter at loading time?
I thought that you or Cyril did it.

> An even nicer one would be to have partial loading of the model.

I think that using the MooseImportingContext we could do that easily.

jannik laval

Re: To a unified software model repository for Moose.

In reply to this post by Stéphane Ducasse

On Jul 23, 2012, at 11:22 PM, Stéphane Ducasse wrote:

>
>> +1
>>
>> Now what should I use ? InMoose that I do not control, or VerveineJ.... that is not open source ?
>> This is stupid but maybe, I should write mine :)
>
> give a try to see :)
> again we were planning to have a license so that verveineJ is usable for close friends.
> I'm sorry but we do not want to give everything for free all the time.

:) It was just a joke.
What I mean is that I am not sure this is a good message: "you can use Moose, it is open-source, but you must pay for importing data".
You are right, this is a commercial message, and maybe it is a bonus for Synectique...

Now, from a researcher's point of view, I think that using an open-source tool is easier than contacting people who may or may not answer. And if they do not? I will use another tool.
I am only giving my opinion on something that does not concern me directly.

Jannik

>
> Stef
>>
>> Jannik
>>
>> On Jul 23, 2012, at 1:28 PM, Tudor Girba wrote:
>>
>>> Sure. But, you see, even Jannik, who is a Moose insider, misunderstood
>>> the status of VerveineJ. This means we have to be clearer somehow.
>>>
>>> I added now explicitly a line saying that there is a commercial license.
>>>
>>> Should I remove the pointer of how to checkout the source as well?
>>>
>>> Doru
>>>
>>>
>>> On Mon, Jul 23, 2012 at 1:04 PM, Stéphane Ducasse
>>> <[hidden email]> wrote:
>>>> For our part, we will continue to work on VerveineJ, and the license will stay like that for a while, until we know where we are going.
>>>> So for us VerveineJ is important because we can control it, fix it…
>>>>
>>>> Stef
>>>>
>>>>> Hi Jannik,
>>>>>
>>>>>>> Some questions:
>>>>>>> - is there any reason to support VerveineJ instead of InFamix? Would it make sense to support both?
>>>>>>
>>>>>> This is a long discussion. I know that VerveineJ works fine. It is open-source and well-maintained.
>>>>>> Maybe Nicolas Anquetil has already done a comparison between the two importers.
>>>>>
>>>>> Unfortunately, VerveineJ is not open-source. It is at the moment
>>>>> available to be used, but as it stands, the license is not specified
>>>>> at all. This is why the page says that you should contact Stef for
>>>>> getting information about the license. It would be great to clarify
>>>>> this point to avoid confusion in the future :).
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Doru
>>> --
>>> www.tudorgirba.com
>>>
>>> "Every thing has its own flow"
>> ---
>> Jannik Laval

---
Jannik Laval

jannik laval

Re: To a unified software model repository for Moose.

In reply to this post by Nicolas Anquetil

Hi Nico,

On Jul 24, 2012, at 11:03 AM, Nicolas Anquetil wrote:

>
> some comments on all I read in this thread:
>
> - the idea of unified repository is very nice for research. It is important if we want to be able to share results, and reproduce experiments.
> Could it have some interest for industry projects?
> For example, it could be usefull as a comparison base for a company to understand how good or bad its own software is compared to some other (presumably open source) projects ...
>
>
> - keeping source code (and much more) is important because you never know what people would like to do in the future.
> It happened to a team in Brazil recently, they correlated some metrics to bugs.
> So they needed on top of the MSEs: access to a bug-tracking system to identify bug-identifiers; access to SVN commit comments to identify bug-fixing commits; access to the code to know what methods/classes were changed to correct the bug.
>
> Anyway, people want to see source code. If something looks strange in a visualization, you want to go back to the code to understand what's happening.
>
>
> - metadata: VerveineJ version.
> For now, what I do is keep a copy of VerveineJ along with the project. It is not big (< 20 MB).
> Anyway, it does not change that much; just keeping the date of the download should be enough for now to track versions.

This is what I do for now.

>
>
> - server generated MSEs.
> It can be a nice thing for small projects, but from my experience, generating MSEs is not trivial.
> Think about it like compiling a project (converting from one language to another).
> It requires setting up the path correctly, pointing to the right libraries, may be generating some source code, ...
> Not something you can automate.
> And for industry, it would not work. A company would be very cautious about sending its source code to some unknown server.
>
>
> - memory
> I mentioned to Jannik that a 2 GB image for a 650 MB MSE file looks like a huge difference to me.
> This is an issue because the VM is constrained in memory whereas the size of software systems does not seem to be :-)
> How can we deal with REAL BIG systems? 10 MLOCs for example ?
>

That is a problem. We can work around it by increasing the memory of the VM, but that does not really solve it.
Maybe a database...

>
> - verveineJ vs. inFamix
> I never really used inFamix, so I cannot tell.
> Doru seems to be the one that knows better both projects (and all of Moose as usual :-) ).
> May be he has some idea on the pros/cons of each tool ?
>
>
> - licence: We are trying to startup a business around moose. We think this is something that could be profitable to the entire community. On the other hand, this is not something easy to do and we need to take care not to shoot ourselves in the foot. It is like raising a baby, it is better to be a bit over protective and correct things afterwards than to be too much confident and risking a serious accident ...
> So the license is closed for now and we give explicit permission to friends to use it.
>
> And we did not yet formalize it at the moment simply because it does not rely only on us, it requires legal knowledge that we don't have, and above all there are a lot of other things to do that are more urgent.
>
> As for rewriting a new (open) one: yes, you could. But why lose time on this? Solutions exist right now that you can use and work with. If in the future the situation becomes really unbearable, then you can think of investing time in correcting it.
> (and as mentioned, don't underestimate the amount of work it represents. Sure, any of us can do it, but it means months of effort, do you really have that much free time to spend ?)

I do not really want to do this, but if Moose becomes closed, what about the community?
The reason why we moved from VisualWorks to Pharo is that Pharo is open-source and we control it.

Jannik

>
>
> nicolas

---
Jannik Laval

Stéphane Ducasse

Re: To a unified software model repository for Moose.

In reply to this post by jannik laval

>>>
>> I'm sorry but we do not want to give everything for free all the time.
>
> :) it was just a joke.
> I want to say that I am not sure the message is good: "you can use Moose, it is open-source, but must pay for importing data".
> You are right, this is a commercial message, and maybe it is a bonus for Synectique...

It is not a bonus. It is important. Our goal is to make sure that our company gets a chance, given how many companies there are in the field.

> Now, I have a researcher point of view and I think that using a tool that is open source is easier than contacting guys and maybe they answer or not. And if not ? I will use another tool.
Did you send an email and not get an answer? I guess not, so there is no problem.
I think that having a company around Moose is the best thing that can happen to Moose, even if it can make things more complex.
If we play smart, everybody can benefit from it.

If Synectique fails, then things will be far easier but much sadder.

> I only give my opinion on something that does not concern me directly.
:)

jannik laval

Re: To a unified software model repository for Moose.

In reply to this post by Mircea Filip Lungu-2

On Jul 24, 2012, at 3:38 PM, Mircea Filip Lungu wrote:

> In my experience, loading a MSE file for an average system takes between 3 and 5 minutes. Imagine that you want to collect the number of classes in each of the systems in the QC. you have 150 systems * 3 minutes each for loading the model = 7.5 hours. And imagine that after that you want to count the number of methods in each of the systems... This is why with Andrea we thought about caching Moose images preloaded with the systems in the Qualitas Corpus. In this way I could run a new analysis on the entire corpus in 10 minutes: I open an image, load the latest version of my analysis package if needed, run the analysis, return the results and close.
>
> On the other hand, maybe the alternative to this an image per system could be to serialize the FAMIX models with Fuel and hope that loading an entire system with it would be waaay faster than when loading an MSE file.

+1,

For example, loading Eclipse 3.4 takes several hours on a system with 4 GB of memory and an SSD...

Evolutionary Analysis
- I seem to remember talking at some point with Andy Kellens and him mentioning that they had an incremental model of FAMIX for modeling evolutionary analysis. This would allow them to only represent the deltas between two versions. We had this discussion last year at Sattose. Does anybody know anything about that? That would be the right solution for doing evolutionary analysis. Otherwise we should just be happy with analyzing at most a dozen versions at once, as many as fit in memory. Or loading things on demand if we find a fast way.
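To illustrate what "only represent the deltas between two versions" could mean, here is a small Python sketch. It is not the Orion implementation and makes a strong simplifying assumption: each version is reduced to a mapping from entity name to a fingerprint (for instance a hash of its source). All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    added: frozenset
    removed: frozenset
    changed: frozenset


def diff_versions(old: dict, new: dict) -> Delta:
    """Compare two versions given as {entity name: fingerprint} mappings."""
    old_keys, new_keys = set(old), set(new)
    return Delta(
        added=frozenset(new_keys - old_keys),
        removed=frozenset(old_keys - new_keys),
        # Present in both versions, but with a different fingerprint.
        changed=frozenset(k for k in old_keys & new_keys if old[k] != new[k]),
    )


# Illustrative data only.
v1 = {"Foo": "a1", "Bar": "b1"}
v2 = {"Foo": "a2", "Baz": "c1"}
delta = diff_versions(v1, v2)
```

Storing one full version plus a chain of such deltas keeps memory proportional to the amount of change rather than to the number of versions.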

Yes, I know it, it is my work :)

You can find the paper on my website (http://www.jannik-laval.eu/assets/files/papers/Lava10b-SCP-Orion.pdf).

The implementation is available on the moose website (http://www.moosetechnology.org/tools/orion)

There are some bugs and I have not used it for a year.

It is now possible for me to work on the implementation.

Memory
Did anybody look into using Gemstone...?

Server
Jannik I guess we could use one of the servers at SCG if needed for serving this information.

Good idea !

Jannik

Cheers,
M.

On Tue, Jul 24, 2012 at 11:40 AM, Fabrizio Perin <[hidden email]> wrote:

>
> Does anybody have any idea about any possible usage of such kind of information ?

you are talking about meta information
In famix 1.0
in the file header there was one entity describing the model.
what we could do is either
- have a separate file in mse format (= reuse of parser) containing information and the file name of the model to which they refer
- or have the information inside the mse model by having an entity representing the model (we did that in some old moose versions).

I would prefer a separate file. If the idea is to setup a repository of projects the meta info are about a project together with its mse file not just about the mse.

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

---

Jannik Laval

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Stéphane Ducasse

Re: To a unified software model repository for Moose.

In reply to this post by Nicolas Anquetil

> - memory
> I commented with Jannik that a 2 GB image for a 650 MB MSE file looks like a huge difference to me.
> This is an issue because the VM is constrained in memory whereas the size of software systems does not seem to be :-)
> How can we deal with REALLY BIG systems? 10 MLOC, for example?

I hope that in the future Pharo will be able to run on and use 64 bits.
The GC would have to be changed too. This is why the consortium is important: to be able to get some work done
to make the future safer.

It would also be interesting to experiment with some database backend and GemStone-like solutions. Now that FAMIX 3.0 is
fully meta-described (which was not the case with FAMIX 1.0 and 2.0), it should be easier.

Stef
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

jannik laval

Re: To a unified software model repository for Moose.

In reply to this post by Andrea Caracciolo

>
>> - keeping source code (and much more) is important because you never know what people would like to do in the future.
>> It happened to a team in Brazil recently, they correlated some metrics to bugs.
>> So they needed on top of the MSEs: access to a bug-tracking system to identify bug-identifiers; access to SVN commit comments to identify bug-fixing commits; access to the code to know what methods/classes were changed to correct the bug.
>> Anyway, people want to see source code. If something looks strange in a visualization, you want to go back to the code to understand what's happening.
>
> Since qualitascorpus.com already provides and maintains a good collection of java projects, we thought about reusing that valuable source of information and building the MSE repository as a complement to Qualitas Corpus.
> This means that MSE files would be generated and hosted by us and source code can be downloaded from qualitascorpus.com.
>
Ok, all the MSE files are generated, I have them.
I will share them.

For the metadata, I propose to use an xml file that is independent of the mse files / source files.
It could contain all the information that we discussed here.

Jannik

> Andrea
>
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

---
Jannik Laval

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

jannik laval

Re: To a unified software model repository for Moose.

In reply to this post by Stéphane Ducasse

On Jul 24, 2012, at 9:23 PM, Stéphane Ducasse wrote:

>>>>
>>> I'm sorry but we do not want to give everything for free all the time.
>>
>> :) it was just a joke.
>> I want to say that I am not sure the message is good: "you can use Moose, it is open-source, but must pay for importing data".
>> You are right, this is a commercial message, and maybe it is a bonus for Synectique...
>
> It is not a bonus. It is important. Our goal is to make sure that our company gets a chance in face of the too many companies in the field.
>
>> Now, I have a researcher point of view and I think that using a tool that is open source is easier than contacting guys and maybe they answer or not. And if not ? I will use another tool.
> did you send an email and did not get an answer? I guess not so there is no problem.

Not from you, but we are experimenting with some tools and the guys do not answer questions about their tools... so it is not possible to compare.

> I think that having a company around moose is the best that can happen to Moose even if it can make the system more complex.
> If we play smart everybody can benefit from it.
>
> If synectique fails then it will be far easier but much sadder.

You are probably right.
And I hope the best for Synectique.

Jannik

>
>> I only give my opinion on something that does not concern me directly.
> :)
>
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

---
Jannik Laval

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Usman Bhatti

Re: To a unified software model repository for Moose.

Hello all,

Rapidly sharing some points with you guys regarding Synectique and this community...

- VerveineJ should be available to you guys to conduct your experiments, extract MSEs and save them. As Stef mentioned, it should be free and open to use for the people in the community. This is because VJ has improved a lot thanks to feedback from the community and we do not want to lose this precious loop.

- Synectique is transitioning from the "not sure" to the "looks interesting" phase, so from September we should start to look for appropriate usage licences for VerveineJ, which will clarify the "not sure" situation around VJ.

- Maybe this is a Pharo-related issue, but very soon we need to look for a database to store analysis results because i) we need to store a lot of data and ii) we do not want to get into resolving issues related to data consistency, concurrent access, data merging, etc.

- Companies working in the domain of software analysis have this constant size problem: very soon we will face it too. Since this community gathers some of the most experienced people in software analysis, it would be good to see how we can innovate in Moose to resolve the problem.

And thanks a lot for all the tools created in this community, which serve us as enabling technology. Currently, we might not have sufficient time to improve the Moose infrastructure as much as we would like, but the least we can do is contribute our feedback on the tools and improve them whenever we can.

regards,

Usman

www.synectique.eu

On Tue, Jul 24, 2012 at 9:38 PM, jannik.laval <[hidden email]> wrote:

On Jul 24, 2012, at 9:23 PM, Stéphane Ducasse wrote:

>>>>
>>> I'm sorry but we do not want to give everything for free all the time.
>>
>> :) it was just a joke.
>> I want to say that I am not sure the message is good: "you can use Moose, it is open-source, but must pay for importing data".
>> You are right, this is a commercial message, and maybe it is a bonus for Synectique...
>
> It is not a bonus. It is important. Our goal is to make sure that our company gets a chance in face of the too many companies in the field.
>
>> Now, I have a researcher point of view and I think that using a tool that is open source is easier than contacting guys and maybe they answer or not. And if not ? I will use another tool.
> did you send an email and did not get an answer? I guess not so there is no problem.

Not from you, but we are experimenting with some tools and the guys do not answer questions about their tools... so it is not possible to compare.

> I think that having a company around moose is the best that can happen to Moose even if it can make the system more complex.
> If we play smart everybody can benefit from it.
>
> If synectique fails then it will be far easier but much sadder.

You are probably right.
And I hope the best for Synectique.

Jannik

>
>> I only give my opinion on something that does not concern me directly.
> :)
>
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

---
Jannik Laval

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Stéphane Ducasse

Re: To a unified software model repository for Moose.

In reply to this post by jannik laval

On Jul 24, 2012, at 9:33 PM, jannik.laval wrote:

>>
>>> - keeping source code (and much more) is important because you never know what people would like to do in the future.
>>> It happened to a team in Brazil recently, they correlated some metrics to bugs.
>>> So they needed on top of the MSEs: access to a bug-tracking system to identify bug-identifiers; access to SVN commit comments to identify bug-fixing commits; access to the code to know what methods/classes were changed to correct the bug.
>>> Anyway, people want to see source code. If something looks strange in a visualization, you want to go back to the code to understand what's happening.
>>
>> Since qualitascorpus.com already provides and maintains a good collection of java projects, we thought about reusing that valuable source of information and building the MSE repository as a complement to Qualitas Corpus.
>> This means that MSE files would be generated and hosted by us and source code can be downloaded from qualitascorpus.com.
>>
> Ok, all the MSE files are generated, I have them.
> I will share them.
>
>
> For the metadata, I propose to use an xml file that is independent of mse files / source files.

why xml?
We could use mse as well. It is usually much more readable than xml.

> It could contains all the informations that we discussed here.
>
> Jannik
>
>
>
>> Andrea
>>
>>
>>
>> _______________________________________________
>> Moose-dev mailing list
>> [hidden email]
>> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
> ---
> Jannik Laval
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Stéphane Ducasse

Re: To a unified software model repository for Moose.

In reply to this post by Stéphane Ducasse

On Jul 25, 2012, at 3:42 PM, Andrea Caracciolo wrote:

> Hi Stef,
>
> I read your replies in the mailing list and i wanted to clarify some doubts.
> As I understand, your points are the following:
> - extend the FAMIX meta-model adding new meta-data to make the concept of "project version" and "project" explicit.

not necessarily.
First the moose infrastructure.

> - supporting the exchange of partial models

Yes.

> To me it seems that both of the mentioned ideas contradict the purpose of the MSE format.
> MSE is good for packaging structural information about one single project snapshot.
> If you want to represent a project and all of its versions as a whole, you need to come up with a new representation format.

no :)
mse is a format to exchange data.

> Also shrinking models looks quite hard. You end up with a lot of broken dependencies and you need to generate a file which fits your personal analysis requirements.

Not really. Have a look at MooseImportingContext: it is there to make sure that you always save and load a coherent model. In fact I use it to import coherent Smalltalk models, so loading is the same. You look at the graph dependencies and make sure that if you remove attributes, you do not use something that requires them. This is the job of the ImportingContext to manage that for you.
We should ask Cyrille because I think I remember that he worked on partial loading. In the end, some graph dependencies were
expressed in FAME and I lost track.
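The idea of keeping a partial model coherent can be sketched as a dependency closure. The following Python sketch is not the MooseImportingContext API; the `requires` table and the entity names in it are made up for illustration. It only shows the principle that selecting something pulls in everything it requires.

```python
def coherent_subset(wanted: set, requires: dict) -> set:
    """Return the wanted kinds plus everything they transitively require."""
    result = set()
    stack = list(wanted)
    while stack:
        item = stack.pop()
        if item in result:
            continue  # already handled, also protects against cycles
        result.add(item)
        stack.extend(requires.get(item, ()))
    return result


# Hypothetical dependency table: each kind lists what it cannot live without.
requires = {
    "Invocation": {"Method"},
    "Method": {"Class"},
    "Class": {"Namespace"},
}
subset = coherent_subset({"Invocation"}, requires)
```

Asking only for invocations still yields methods, classes and namespaces, so the saved or loaded model has no dangling references.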

> In my opinion, the best option would be to transition to relational database as an intermediate representation medium.
Give it a try and let us know, but I have some doubts.

> Instead of exchanging models through mse files, you replicate a certain set of DB entries.

Saving graphs to relational databases is neither easy nor simple.
For the moment, imagine that we want to build a dashboard where we monitor some metrics.
You can write a script that:
- checks out the source code
- launches VerveineJ
- generates a new version in mse
- loads the mse file (filtering information if you can, where it makes sense)
- computes your metrics
- saves a stripped-down version of your model that fits the purpose of your analysis
Now, since you want to do a trend analysis, you load, following the strategies developed by andre,
all the stripped versions, or just 10 of them and the latest...
and you compute your dashboard data.
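The script described above could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: a loaded model is reduced to a dictionary of metric values, and the checkout, VerveineJ run and mse loading are hidden behind a single stubbed `extract` function rather than real tool invocations.

```python
from typing import Callable, Dict, List

Model = Dict[str, int]  # stand-in for a loaded model: metric name -> value


def strip_model(model: Model, keep: List[str]) -> Model:
    """Keep only the parts of the model that the analysis needs."""
    return {name: model[name] for name in keep if name in model}


def trend(history: List[Model], metric: str, last_n: int = 10) -> List[int]:
    """Values of one metric over the most recent stripped versions."""
    return [m[metric] for m in history[-last_n:] if metric in m]


def run_once(extract: Callable[[str], Model], revision: str,
             keep: List[str], history: List[Model]) -> None:
    """One iteration: build the model for a revision, strip it, store it."""
    model = extract(revision)  # hypothetical: checkout + VerveineJ + mse load
    history.append(strip_model(model, keep))


def fake_extract(revision: str) -> Model:
    """Stub standing in for the real extraction; the numbers are invented."""
    n = int(revision)
    return {"classes": 100 + n, "methods": 900 + 10 * n, "comments": 5000}


history: List[Model] = []
for rev in ["1", "2", "3"]:
    run_once(fake_extract, rev, ["classes", "methods"], history)

print(trend(history, "classes"))  # prints [101, 102, 103]
```

Only the stripped versions are kept in `history`, so the trend analysis never needs more than a few small models in memory at once.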

To me, a project is an entity managing files, sources and versions, on disk or wherever they are stored.
A project could then have different loading strategies, but well encapsulated.

> Parsing of MSE is replaced by recursively exploring the newly created database entries.
> This would also allow to have different strategies for loading famix objects in memory.
>
>
> Cheers
> _____________________________
> Andrea Caracciolo -- [hidden email]
> Software Composition Group
> University of Bern
>
>
>
>
>
>

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

jannik laval

Re: To a unified software model repository for Moose.

In reply to this post by Stéphane Ducasse

On Jul 25, 2012, at 3:59 PM, Stéphane Ducasse <[hidden email]> wrote:

>
> On Jul 24, 2012, at 9:33 PM, jannik.laval wrote:
>
>>>
>>>> - keeping source code (and much more) is important because you never know what people would like to do in the future.
>>>> It happened to a team in Brazil recently, they correlated some metrics to bugs.
>>>> So they needed on top of the MSEs: access to a bug-tracking system to identify bug-identifiers; access to SVN commit comments to identify bug-fixing commits; access to the code to know what methods/classes were changed to correct the bug.
>>>> Anyway, people want to see source code. If something looks strange in a visualization, you want to go back to the code to understand what's happening.
>>>
>>> Since qualitascorpus.com already provides and maintains a good collection of java projects, we thought about reusing that valuable source of information and building the MSE repository as a complement to Qualitas Corpus.
>>> This means that MSE files would be generated and hosted by us and source code can be downloaded from qualitascorpus.com.
>>>
>> Ok, all the MSE files are generated, I have them.
>> I will share them.
>>
>>
>> For the metadata, I propose to use an xml file that is independent of mse files / source files.
>
> why xml?
> We could use mse as well. It is usually much more readable than xml.

iep,

Just that xml is verbose :)
But mse is good too.

Jannik

>
>> It could contains all the informations that we discussed here.
>>
>> Jannik
>>
>>
>>
>>> Andrea
>>>
>>>
>>>
>>> _______________________________________________
>>> Moose-dev mailing list
>>> [hidden email]
>>> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>>
>> ---
>> Jannik Laval
>>
>>
>> _______________________________________________
>> Moose-dev mailing list
>> [hidden email]
>> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

---
Jannik Laval

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Mircea Filip Lungu-2

Re: To a unified software model repository for Moose.

In reply to this post by jannik laval

Yes, I know it, it is my work :)

You can find the paper on my website (http://www.jannik-laval.eu/assets/files/papers/Lava10b-SCP-Orion.pdf).

The implementation is available on the moose website (http://www.moosetechnology.org/tools/orion)

There are some bugs and I have not used it for a year.

It is now possible for me to work on the implementation.

That seems to be the best approach to supporting evolutionary analysis I've seen mentioned here yet :)

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Andrea Caracciolo

Re: To a unified software model repository for Moose.

In reply to this post by jannik laval

On Jul 26, 2012, at 7:29 PM, jannik.laval wrote:

>
> On Jul 25, 2012, at 3:59 PM, Stéphane Ducasse <[hidden email]> wrote:
>
>>
>> On Jul 24, 2012, at 9:33 PM, jannik.laval wrote:
>>
>>>>
>>>>> - keeping source code (and much more) is important because you never know what people would like to do in the future.
>>>>> It happened to a team in Brazil recently, they correlated some metrics to bugs.
>>>>> So they needed on top of the MSEs: access to a bug-tracking system to identify bug-identifiers; access to SVN commit comments to identify bug-fixing commits; access to the code to know what methods/classes were changed to correct the bug.
>>>>> Anyway, people want to see source code. If something looks strange in a visualization, you want to go back to the code to understand what's happening.
>>>>
>>>> Since qualitascorpus.com already provides and maintains a good collection of java projects, we thought about reusing that valuable source of information and building the MSE repository as a complement to Qualitas Corpus.
>>>> This means that MSE files would be generated and hosted by us and source code can be downloaded from qualitascorpus.com.
>>>>
>>> Ok, all the MSE files are generated, I have them.
>>> I will share them.
>>>
>>>
>>> For the metadata, I propose to use an xml file that is independent of mse files / source files.
>>
>> why xml?
>> We could use mse as well. It is usually much more readable than xml.
>
> iep,
>
> Just that xml is verbose :)
> But mse is good too.

A separate MSE file, or the same one that contains the FAMIX model?

I tried to write down a list of all the metadata needed:
- project name
- project version
- VerveineJ version (I would use the SVN revision number)

Is it enough?
Should we include the FAMIX version?
I didn't include the supported Moose version, because I don't think it's relevant and I don't see how this information should be interpreted.

The rest of the data should be the following:
- MSE of FAMIX model
- source code
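To make the lists above concrete, here is a Python sketch of one metadata record per project version, written out in an MSE-like notation. The entity name `Meta.Project`, the attribute names, the quote escaping and the sample values are all assumptions made for illustration; the output has not been checked against the actual MSE grammar.

```python
from dataclasses import dataclass, fields


@dataclass
class ProjectMetadata:
    project_name: str
    project_version: str
    verveinej_revision: str  # SVN revision of the extractor, as proposed
    mse_file: str            # file name of the FAMIX model
    source_archive: str      # where the matching source code lives


def to_mse_like(meta: ProjectMetadata) -> str:
    """Render the record as one MSE-style element (hypothetical entity name)."""
    lines = ["((Meta.Project"]
    for f in fields(meta):
        # Assumption: quotes inside strings are escaped by doubling them.
        value = str(getattr(meta, f.name)).replace("'", "''")
        lines.append(f"    ({f.name} '{value}')")
    lines[-1] += "))"
    return "\n".join(lines)


# Placeholder values, not real corpus data.
meta = ProjectMetadata("ant", "1.8.4", "r1234", "ant-1.8.4.mse", "ant-1.8.4-src.zip")
print(to_mse_like(meta))
```

Keeping this record in its own small file means it can be read without loading the much larger model file it points to.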

If there are multiple versions of a project, it would also be interesting to distribute an Orion model of all the versions.
Is it possible to build such a model with an MSE file for each version as input?
How much effort is needed to fix the bugs and make it work?
How do you look for changes between subsequent versions? Do you diff the source code of the analyzed code entities?

Would it make sense to also distribute a Fuel file containing the FAMIX model?
Would it make loading faster?

>
> Jannik
>
>>
>>> It could contains all the informations that we discussed here.
>>>
>>> Jannik
>>>
>>>
>>>
>>>> Andrea
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Moose-dev mailing list
>>>> [hidden email]
>>>> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>>>
>>> ---
>>> Jannik Laval
>>>
>>>
>>> _______________________________________________
>>> Moose-dev mailing list
>>> [hidden email]
>>> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>>
>>
>> _______________________________________________
>> Moose-dev mailing list
>> [hidden email]
>> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
> ---
> Jannik Laval
>
>
> _______________________________________________
> Moose-dev mailing list
> [hidden email]
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
