Hi,
I will soon need to analyze a large amount of Java code at once, and I am already sure that one Moose image will not be enough for all of it. I would therefore like to try persisting the model in a database. Can you point me to a persistence manager (OODBMS/ORDBMS) that I could use for that purpose?
Fabrizio
_______________________________________________ Moose-dev mailing list [hidden email] https://www.iam.unibe.ch/mailman/listinfo/moose-dev
Hi Fabrizio,
Maybe you should ask on the Pharo mailing list. To me, the problem is how to persist a graph of pointers. In early versions of Moose, a long time ago, there were no pointers to methods, classes, and so on, only ids. So the code was full of accessors like

    FAMIXMethod>>msClass
        ^ Model entityWithId: self msClassId

and you could easily plug a simple DB behind it. Then we decided to use pointers. Now it is not clear to me how we can save and load part of a large graph.
Stef
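The id-based design Stef describes can be sketched as follows. This is only an illustration, written in Python rather than Pharo; the class names, the `entity_with_id` method, and the dict-backed store are assumptions made for the example and are not taken from any Moose release. The point is that an entity holds the id of its owner and resolves it through the model, so the store behind the model can be swapped.

```python
from dataclasses import dataclass


@dataclass
class FamixClass:
    entity_id: int
    name: str


@dataclass
class FamixMethod:
    entity_id: int
    name: str
    class_id: int  # id of the owning class, not a direct pointer

    def ms_class(self, model: "Model") -> FamixClass:
        # Same idea as `Model entityWithId: self msClassId`: resolve the id on demand.
        return model.entity_with_id(self.class_id)


class Model:
    """Resolves entity ids through a pluggable store (hypothetical; here a plain dict)."""

    def __init__(self, store):
        self._store = store  # any mapping-like backend could be used instead

    def add(self, entity):
        self._store[entity.entity_id] = entity

    def entity_with_id(self, entity_id):
        return self._store[entity_id]


model = Model({})
model.add(FamixClass(1, "Point"))
model.add(FamixMethod(2, "x", class_id=1))
owner = model.entity_with_id(2).ms_class(model)
```

Because every cross-entity reference goes through `entity_with_id`, replacing the dict with a database-backed mapping would not change the entity classes. With direct pointers that indirection is gone, which is the difficulty raised above.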
Hi Stef,
How to store the elements in the DB will, I think, depend on which DBMS you use. With an ORDBMS, for example, we can replicate the structure of the MooseModel exactly within the DB. What I have in mind is a MoosePersistentModel object that holds a reference to the MooseModel stored in the DB and translates every operation for searching, adding, removing, and modifying an entity into executable SQL that it sends to the DB. It sounds really ambitious, but I think I will find out soon after I start development whether this idea can work.
Now I just need to identify which DBMS to use and decide which persistence framework to use in Pharo. I was thinking of Glorp (with a Postgres DB behind it) or Magma, although I have never used either of them. Do you think they would be suitable for my purpose? Do you have any other candidates to propose? Do you know of any good "getting started" documentation for Magma and Glorp?
Thanks a lot,
Fabrizio
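A rough sketch of the MoosePersistentModel idea from the message above, under several assumptions: it is written in Python with the standard `sqlite3` module standing in for Glorp and Postgres, and the single `entity` table, its columns, and all method names are hypothetical. It only shows the shape of "translate each model operation into SQL and send it to the DB".

```python
import sqlite3


class PersistentModel:
    """Hypothetical stand-in for the MoosePersistentModel idea."""

    def __init__(self, connection):
        self._db = connection
        # One table for all entities; parent_id links a method to its class, etc.
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS entity ("
            " id INTEGER PRIMARY KEY, kind TEXT NOT NULL,"
            " name TEXT NOT NULL, parent_id INTEGER REFERENCES entity(id))"
        )

    def add(self, kind, name, parent_id=None):
        cur = self._db.execute(
            "INSERT INTO entity (kind, name, parent_id) VALUES (?, ?, ?)",
            (kind, name, parent_id),
        )
        return cur.lastrowid  # id assigned by the database

    def remove(self, entity_id):
        self._db.execute("DELETE FROM entity WHERE id = ?", (entity_id,))

    def rename(self, entity_id, new_name):
        self._db.execute(
            "UPDATE entity SET name = ? WHERE id = ?", (new_name, entity_id)
        )

    def children_of(self, parent_id, kind):
        rows = self._db.execute(
            "SELECT name FROM entity WHERE parent_id = ? AND kind = ? ORDER BY name",
            (parent_id, kind),
        )
        return [name for (name,) in rows]


model = PersistentModel(sqlite3.connect(":memory:"))
point = model.add("class", "Point")
model.add("method", "x", parent_id=point)
y = model.add("method", "y", parent_id=point)
model.rename(y, "yCoordinate")
methods = model.children_of(point, "method")
```

Each navigation step (`children_of`) costs one query here, so an analysis that walks many edges of the graph would issue many small queries. Whether that is acceptable depends on the analysis.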
2013/8/19 Stéphane Ducasse <[hidden email]>
Hi Fabrizio,
The focus of a database is persistence, not performance on large datasets, so this will probably not help you. I would suggest the following directions for a solution:
- Use a more memory-efficient model than FAMIX (or reduce the memory consumption of this model).
- Export chunks of data using FUEL.
- Link several Moose images that share memory, or
- Link a Moose image to a GemStone to store the data and do the processing.
Regards,
Diego
On Aug 20, 2013, at 10:51 AM, Fabrizio Perin wrote:
Performance is always nice to have, but if you cannot run your analyses in the first place because you run out of memory, then performance is not the problem. I don't think FUEL is an option in my case: with a DB you can fetch a batch of rows at a time, so you can get at your data while keeping memory consumption manageable. With FUEL I would need to deserialize everything and then do the search. FUEL could help to set up a filesystem-based DB, in which the DBMS uses FUEL to serialize and deserialize objects quickly. On that matter, I would much rather use an existing persistence framework than build one from scratch.
Thanks,
Fabrizio
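The "fetch a batch of rows at a time" argument above can be illustrated with a small sketch. It assumes Python's standard `sqlite3` module and a made-up `method` table with a `loc` (lines of code) column; neither comes from Moose. The scan holds at most one batch of rows in memory at any time.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE method (id INTEGER PRIMARY KEY, loc INTEGER NOT NULL)")
# Synthetic data: 10,000 methods with sizes cycling through 0..49 lines.
db.executemany(
    "INSERT INTO method (loc) VALUES (?)", [(n % 50,) for n in range(10_000)]
)


def count_long_methods(connection, threshold, batch_size=500):
    """Scan all methods while holding at most `batch_size` rows in memory."""
    cursor = connection.execute("SELECT loc FROM method")
    long_methods = 0
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break  # no rows left
        long_methods += sum(1 for (loc,) in batch if loc > threshold)
    return long_methods


result = count_long_methods(db, threshold=39)
```

This pattern suits analyses that can be phrased as a scan or an aggregate over rows. It says nothing about analyses that follow references from one entity to another.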
2013/8/20 Diego Lont <[hidden email]>
In reply to this post by Fabrizio Perin-3
Hi Fabrizio,
Moose models are highly interconnected, and the analysis algorithms tend to touch a lot of other objects. There is very limited locality of reference, so an RDBMS (or NoSQL) cannot help you here. You would basically only be swapping objects in and out.
Stephan
In reply to this post by Fabrizio Perin-3
Hi Fabrizio,
How large is the MSE file? I understand your concern, but I do not think that a relational database would work well in this case. MongoDB would probably be a much better fit. At the same time, it would be cool to try GemStone and make FAMIX compatible with it (that should not be hard).
Cheers,
Doru
On Tue, Aug 20, 2013 at 11:31 AM, Fabrizio Perin <[hidden email]> wrote:
"Every thing has its own flow"
On Aug 20, 2013, at 12:02 PM, Tudor Girba <[hidden email]> wrote:
any database and if we would we need to control the licensing and any other legal aspects.
In reply to this post by Tudor Girba-2
Doru,
I would be willing to spend time helping with a GemStone-based persistence solution. I will be at ESUG in September, so that would be a great time to discuss the issues. I know that you are not planning to be there, but perhaps I could meet with someone else who is familiar with the Moose requirements. There are several approaches that I think would make sense, but it really depends on your requirements.
Dale
In reply to this post by Stéphane Ducasse
Stef,
A pluggable solution makes a lot of sense. I understand your concerns: you don't want Moose to be locked into a GemStone-only solution (for example). Maintaining flexibility here would be one of the considerations in working toward a solution.
Dale
In reply to this post by Dale Henrichs-3
On Aug 20, 2013, at 7:47 PM, Dale K. Henrichs <[hidden email]> wrote:
I will be there. Usman and Guillaume are not coming, but we can organize a Skype meeting.
In essence, Moose has models that are graphs of objects, like a code metamodel: a package contains classes, which contain methods; methods access instance variables and other methods. So as soon as we program something on top of FAMIX, we navigate pointers in this graph.
We did an experiment (mooseOnTheWeb) with Amber as a client and Moose as a server. It works when you do a query and get JSON objects back (but only the shallow objects); if you need to work on the graph, you need multiple queries to get from one piece of shallow information to the next.
Now, in essence, a solution with GS would to me mean moving Moose to GS in the long term, and I'm not sure that this is the way to go. Maybe I'm wrong.
Stef
That would be good!
I'm not thinking in terms of developing in Pharo and deploying in GemStone for this. The basic problem is that you've got Moose models that are too big to fit in the memory of Pharo. So my basic idea is to provide a smart data store for the Moose model and allow you to make queries against the GemStone DB until the size of the result set is small enough to fit in memory. Then the data (a subgraph?) would be transferred to Pharo, and all processing would be done completely in Pharo from that point forward. Perhaps we would use something like Fuel to ship these subgraphs efficiently over the wire.
There are other "tricks" that can come into play, but I think it is worth exploring the idea of using GemStone as a "smart datastore" where the line between the Pharo client and the GemStone server is somewhat blurred. Pharo can do some hefty data processing on its own, so GemStone isn't required to do all of the analysis. I'm just imagining that, in the face of a "too big for memory" dataset, some sort of "data reduction queries" can be performed on the server and the results shipped to Pharo for further analysis and visualization. I assume that only the Moose core classes need to be ported to GemStone. Maybe I'm naive :)
Dale
How does GS manage pointers to saved objects?
I ask because I could see a scenario (à la Marea and Fuel) where we store packages (a kind of root of FAMIX graphs) in GS (but it could be Fuel). The problem is how to handle pointers from one of these packages to another one that is not loaded.
Second problem: how do you prevent one query like allClasses from reloading everything into memory (which you cannot do, because you do not have enough memory; otherwise everything would already be in memory)?
Stef
----- Original Message -----
| From: "Stéphane Ducasse" <[hidden email]>
| To: "Moose-related development" <[hidden email]>
| Sent: Tuesday, August 20, 2013 2:30:18 PM
| Subject: [Moose-dev] Re: persisting moose models
|
| How GS manage pointers to a saved objects?

GemStone uses an object table, so every object reference in the body of an object on disk is an oop (unique object id) reference instead of a direct pointer. In the VM these oop references are turned into memory pointers on demand, as an instance variable is referenced. The VM caches persistent objects and can flush unmodified persistent objects under memory pressure. Dirty persistent objects and non-persistent objects are stored in a separate memory space; the only way to flush these objects is to do a commit. In practice you end up with a working set that covers a portion of the entire object graph, with new object references faulted in on demand and old objects flushed from memory. So you can operate on a million-element collection without having the whole collection in memory at any one time. Large objects (like a million-element collection) are broken up into a tree of large object nodes (~2000 oops per node).

| Because I could see a scenario (a la marea and fuel) where we store
| packages (kind of roots of famix graphs) into GS (but it could be
| Fuel)
| the problem is how to handle pointers pointing from one of these
| packages to another one that it not loaded.

The "natural" way to store things in GemStone would be as a complete object graph. You've got live objects and behavior that you can leverage, so one could perform queries like the ones I've mentioned to dynamically define an object graph that is then shipped to Pharo (using Fuel?). We can use oops to reference the "external objects" at the boundaries of the object subgraph. This is the area where we'd need to work out the details.
|
| Second problem: how do you prevent that one query like allClasses
| reloads everything in memory (which you cannot because you do not
| have enough memory
| else everything would be in memory).

If you have a query whose result set will be larger than memory, you make the result set itself persistent and do intermediate commits when "memory fills up". Because we don't use memory pointers for persistent objects, the whole result set doesn't have to be in memory at the same time.
Dale
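The object-table mechanism Dale describes (references stored as oops, objects faulted in on demand, unmodified objects flushed under memory pressure) can be mimicked with a toy model. This is a Python sketch under strong simplifying assumptions, not GemStone's implementation: the `ObjectTable` class, the dict acting as "disk", and the least-recently-used eviction policy are all invented for the illustration, and dirty objects and commits are not modeled at all.

```python
from collections import OrderedDict


class ObjectTable:
    """Toy object table: oops map to records on 'disk'; a bounded cache holds live copies."""

    def __init__(self, disk, cache_limit):
        self._disk = disk              # oop -> dict of fields; references are stored as oops
        self._cache = OrderedDict()    # oop -> in-memory copy, least recently used first
        self._cache_limit = cache_limit
        self.faults = 0                # number of times an object was loaded from disk

    def fetch(self, oop):
        if oop in self._cache:
            self._cache.move_to_end(oop)   # mark as recently used
            return self._cache[oop]
        self.faults += 1
        obj = dict(self._disk[oop])        # fault the object in from disk
        self._cache[oop] = obj
        if len(self._cache) > self._cache_limit:
            self._cache.popitem(last=False)  # flush the least recently used clean object
        return obj

    def resident(self):
        return len(self._cache)


# A chain of 1000 "methods", each referencing the next one by oop rather than by pointer.
disk = {
    oop: {"name": f"m{oop}", "next": oop + 1 if oop < 999 else None}
    for oop in range(1000)
}
table = ObjectTable(disk, cache_limit=10)

visited = 0
oop = 0
while oop is not None:
    oop = table.fetch(oop)["next"]  # following a reference means resolving an oop
    visited += 1
```

The traversal touches all 1000 objects while never keeping more than `cache_limit` of them resident, which is the working-set behaviour described in the reply. The price is one fault per object that is not already cached.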