Hi,
we have many nice analysis and visualization tools available (Moose, Roassal, ViDi, ...). Often they are focused on development or code. We also have refactoring tools.

What would be cool is a tool for doing this on (relational) databases and the data they contain:

- visualize databases (tables, data clusters, number of columns, ...)
- analyze whether tables satisfy some (external) business constraints and consistency rules
- refactor databases (moving, migrating columns, ...)
- easily build reports/visualizations of the contained data (extract, query, filter, ...)

What I have in mind is a tool where (after connecting to a possibly unknown database) you can easily get/build an overview of the design, quality and data of the database (similar to what Moose and ViDi do for code).

Is there any research, projects, ... for data mining already in Pharo? I guess this could make up a great toolset with Pharo as a base and make Pharo better known, because relational databases play a big role in the software business.

Any takers for the next Pharo success story? ;)

Bye
T.
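To make the "overview of a possibly unknown database" idea a bit more concrete, here is a minimal sketch in Python (not Pharo). It assumes a SQLite database, and the names `schema_overview` and `build_demo_db` as well as the demo tables are made up for illustration. It only collects the table names, column counts and row counts that a visualization could then be built on.

```python
import sqlite3


def schema_overview(conn):
    """Return {table: {"columns": n, "rows": n}} for every user table."""
    overview = {}
    names = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
        # Skip SQLite's internal bookkeeping tables.
        if not name.startswith("sqlite_")
    ]
    for name in names:
        # Quote the identifier, since table names cannot be bound as parameters.
        quoted = '"' + name.replace('"', '""') + '"'
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        (rows,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        overview[name] = {"columns": len(columns), "rows": rows}
    return overview


def build_demo_db():
    """Create a small in-memory database (hypothetical demo schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
        INSERT INTO customer (name) VALUES ('Ada'), ('Bob');
        INSERT INTO invoice (customer_id, total) VALUES (1, 10.0), (1, 5.5), (2, 7.0);
        """
    )
    return conn


if __name__ == "__main__":
    for table, stats in schema_overview(build_demo_db()).items():
        print(table, stats)
```

Other database engines expose the same information through their own catalog tables, so the queries would differ there.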
I would be happy to participate in such a project :)
Esteban A. Maringolo
In reply to this post by Torsten Bergmann
2015-10-02 14:34 GMT+02:00 Torsten Bergmann <[hidden email]>:

Hi,

I like the fact that you're saying database and not datasets ;)

> - analyze tables on satisfying some (external) business constraints

That depends on how powerful your constraint system is.

> - refactoring of databases (moving, migrating columns, ...)

Hmm, that one is schema migration. There is probably already a lot of work on the theory of that.

> - easy building reports/visualizations of included data (extract, query, filter, ...)

A business reporting generator. There is a lot of existing competition already doing very cool visualisations.

Some of us did a datathon in Paris with Pharo (Serge, Onil, me a bit remotely, Alexandre, Alvaro...). We had big issues opening a large file (~66GB of CSV data).

Would be nice ;)

Thierry
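For files of that size, one common way around the problem is to stream the rows instead of loading the whole file. The following is only a sketch in Python, assuming the difficulty was memory; the function name `count_values` and the column names are hypothetical. It keeps just a running tally per distinct value in memory.

```python
import csv
from collections import Counter


def count_values(lines, column):
    """Count how often each value occurs in `column`, one row at a time.

    `lines` can be an open text file or any iterable of CSV lines, so the
    full data set never has to fit in memory.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


if __name__ == "__main__":
    sample = ["city,amount\n", "Paris,10\n", "Lille,5\n", "Paris,7\n"]
    print(count_values(sample, "city"))
```

With a real file this would be called as `count_values(open(path, newline=""), "city")`; memory use then depends on the number of distinct values, not on the file size.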
In reply to this post by Torsten Bergmann
On 02-10-15 14:34, Torsten Bergmann wrote:
> What I think of is a tool where (after connecting to a possibly unknown database) you can easily
> get/build an overview on the design, quality and data of the database (similar to what Moose, ViDi does
> for code).

In general, this is difficult to do. Our tools assume that all data is in RAM and that you can easily calculate the metrics you want to display. That means that currently it works well for databases < ~2GB. With a 64-bit VM we can easily use ~500 GB in a workstation (let's see what the GC thinks of that) or 60GB in a cheap PC.

For analyzing legacy systems, the current limits are enough to analyze and migrate an ERP system for a 100-person company with 20 years of production data.

What can be extremely helpful is normalizing data while reading it from the database. You'll often find that you can reduce file size for the largest tables by a factor of 10. To get a quick idea of how well the data is normalized, try zipping a dump file.

Stephan
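The "zip a dump file" check can be sketched in a few lines of Python. This is only an illustration: the helper names `compression_ratio` and `read_chunks` are made up, and the compression ratio is just a rough proxy for redundancy, not a measure of normalization in the formal sense.

```python
import zlib


def compression_ratio(chunks, level=6):
    """Return uncompressed size divided by compressed size.

    `chunks` is any iterable of bytes objects, so a large dump can be
    streamed through the compressor without holding it in memory.
    """
    compressor = zlib.compressobj(level)
    raw = 0
    packed = 0
    for chunk in chunks:
        raw += len(chunk)
        packed += len(compressor.compress(chunk))
    packed += len(compressor.flush())
    return raw / packed if packed else 0.0


def read_chunks(path, size=1 << 20):
    """Yield a file's contents in blocks of `size` bytes."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(size)
            if not chunk:
                break
            yield chunk


if __name__ == "__main__":
    # Highly repetitive rows, as in a denormalized table dump.
    redundant = [b"Paris;France;Europe\n" * 1000]
    print(round(compression_ratio(redundant), 1))
```

For a real dump, `compression_ratio(read_chunks(path))` gives the figure. A high ratio suggests many repeated values that normalization while reading could factor out.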
In reply to this post by Torsten Bergmann
Anne Etien and Olivier Auverlot (with the help of Guillaume Larcheveque) are parsing SQL and building a representation of such objects in Moose. So contact Anne: [hidden email]