Analysis/Visualizations of DBs

Torsten Bergmann
Hi,

we have many nice analysis and visualization tools available (Moose, Roassal, ViDi, ...).
They are mostly focused on development or code. We also have refactoring tools.

What would be cool is a tool for doing the same on (relational) databases and
the data they contain:

 - visualize databases (tables, data clusters, number of columns, ...)
 - analyze whether tables satisfy some (external) business constraints
   and consistency rules
 - refactor databases (moving, migrating columns, ...)
 - easily build reports/visualizations of the contained data (extract, query, filter, ...)

What I am thinking of is a tool where (after connecting to a possibly unknown database) you can easily
get/build an overview of the design, quality and data of the database (similar to what Moose and ViDi do
for code).
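
To make the idea a bit more concrete, a first "overview" step could be as small as the following Python sketch. It is only an illustration and not part of any existing Pharo tool: it assumes an SQLite database accessed through Python's standard sqlite3 module, and the function name schema_overview and the sample tables are made up for the example.

```python
import sqlite3


def schema_overview(conn):
    """Return {table_name: (column_count, row_count)} for every user table.

    Assumption: `conn` is an open sqlite3 connection. Other database
    engines expose their catalog differently (e.g. information_schema).
    """
    overview = {}
    names = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
        if not name.startswith("sqlite_")  # skip SQLite's internal tables
    ]
    for name in names:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + name.replace('"', '""') + '"'
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        (rows,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        overview[name] = (len(columns), rows)
    return overview


if __name__ == "__main__":
    # Hypothetical sample schema, only used to demonstrate the function.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.execute("INSERT INTO customer (name) VALUES ('Ada')")
    for table, (ncols, nrows) in sorted(schema_overview(conn).items()):
        print(f"{table}: {ncols} columns, {nrows} rows")
```

The resulting dictionary (table name, number of columns, number of rows) is the kind of raw metric a visualization could then map to box sizes or colors.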

Is there any research, projects, ... for data mining already in Pharo? I guess this could make
up a great toolset with Pharo as a base and make it more known because in the software business
relational databases play a big role.

Any takers for the next Pharo success story? ;)

Bye
T.

Re: Analysis/Visualizations of DBs

Esteban A. Maringolo
I would be happy to participate in such a project :)
Esteban A. Maringolo



Re: Analysis/Visualizations of DBs

Thierry Goubier
In reply to this post by Torsten Bergmann


2015-10-02 14:34 GMT+02:00 Torsten Bergmann <[hidden email]>:
> Hi,
>
> we have many nice analysis and visualizations tools available (Moose, Roassal, ViDi, ...)
> Often it is focused on development or code. Also we have refactoring tools.
>
> What would be cool would be to have some tool for doing this on (relational) databases and
> included data:
>
>  - visualize databases (tables, data clusters, number of columns, ...)

I like the fact that you're saying database and not datasets ;)

>  - analyze tables on satisfying some (external) business constraints
>    and consistency rules

That depends on how powerful your constraint system is.

>  - refactoring of databases (moving, migrating columns, ...)

Hmm, that one is schema migration. There is probably already a lot of work on the theory of that.

>  - easy building reports/visualizations of included data (extract, query, filter, ...)

That is a business reporting generator. There are many existing competitors doing lots of very cool visualisations already.

> What I think of is a tool where (after connecting to a possibly unknown database) you can easily
> get/build an overview on the design, quality and data of the database (similar to what Moose, ViDi does
> for code).
>
> Is there any research, projects, ... for data mining already in Pharo? I guess this could make
> up a great toolset with Pharo as a base and make it more known because in the software business
> relational databases play a big role.

Some of us did a datathon in Paris with Pharo (Serge, Onil, me a bit remotely, Alexandre, Alvaro...). We had big issues opening a large file (~66GB of CSV data).

> Any takers for the next Pharo success story? ;)

Would be nice ;)

Thierry

> Bye
> T.


Re: Analysis/Visualizations of DBs

Stephan Eggermont-3
In reply to this post by Torsten Bergmann
On 02-10-15 14:34, Torsten Bergmann wrote:
> What I think of is a tool where (after connecting to a possibly unknown database) you can easily
> get/build an overview on the design, quality and data of the database (similar to what Moose, ViDi does
> for code).

In general, this is difficult to do. Our tools assume that all data is
in RAM and that you can easily calculate the metrics you want to display.
That means that currently it works well for databases smaller than
about 2GB. With a 64-bit VM we can easily use ~500GB in a workstation
(let's see what the GC thinks of that) or 60GB in a cheap PC.

For analyzing legacy systems, the current limits are enough to analyze
and migrate an ERP system for a 100-person company with 20 years of
production data. What can be extremely helpful is normalizing data while
reading it from the database. You'll often find that you can reduce file
size for the largest tables by a factor of 10. To get a quick idea of the
normalization of the data, try zipping a dump file: a dump that shrinks
a lot contains many repeated values.
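
The zipping test can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: it uses the standard zlib module instead of an actual zip tool, the function name compression_ratio is made up, and the result is a hint about redundancy, not a real measure of normalization.

```python
import zlib


def compression_ratio(path, chunk_size=1 << 20):
    """Stream a file through zlib and return original_size / compressed_size.

    Reading in chunks keeps memory use flat, so this also works for dump
    files that are much larger than RAM.
    """
    compressor = zlib.compressobj(6)  # 6 is zlib's usual default level
    raw = packed = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            raw += len(chunk)
            packed += len(compressor.compress(chunk))
    packed += len(compressor.flush())  # emit whatever is still buffered
    return raw / packed


if __name__ == "__main__":
    import sys

    # Usage: python ratio.py dump.sql   (the file name is just an example)
    print(f"compression ratio: {compression_ratio(sys.argv[1]):.1f}")
```

A ratio close to 1 means there is little left to squeeze out; a ratio of 10 or more suggests that the same values are stored over and over again.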

Stephan


Re: Analysis/Visualizations of DBs

stepharo
In reply to this post by Torsten Bergmann
Anne Etien and Olivier Auverlot (with the help of Guillaume Larcheveque)
are parsing SQL and building a representation of such objects in Moose.
So contact Anne:

[hidden email]
