Analysis/Visualizations of DBs

Torsten Bergmann

Analysis/Visualizations of DBs

Hi,

we have many nice analysis and visualization tools available (Moose, Roassal, ViDi, ...).
They are often focused on development or code. We also have refactoring tools.

It would be cool to have a tool that does the same for (relational) databases and the
data they contain:

- visualize databases (tables, data clusters, number of columns, ...)
- analyze whether tables satisfy some (external) business constraints
  and consistency rules
- refactor databases (moving, migrating columns, ...)
- easily build reports/visualizations of the contained data (extract, query, filter, ...)

What I have in mind is a tool where (after connecting to a possibly unknown database) you can easily
get/build an overview of the design, quality and data of the database (similar to what Moose and ViDi do
for code).
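
A minimal sketch of what such a first overview could look like, written in Python against SQLite purely for illustration (neither is mentioned in this thread, and `schema_overview` is a hypothetical name). It assumes the database exposes its catalog the way SQLite does; other engines would need their own catalog queries.

```python
import sqlite3


def schema_overview(conn):
    """Return one (table, column_count, row_count) tuple per user table."""
    overview = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (name,) in tables:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + name.replace('"', '""') + '"'
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        (rows,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        overview.append((name, len(columns), rows))
    return overview


if __name__ == "__main__":
    # Tiny in-memory database standing in for the "possibly unknown" one.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE customer (id INTEGER, name TEXT, city TEXT);"
        "CREATE TABLE invoice (id INTEGER, customer_id INTEGER);"
        "INSERT INTO customer VALUES (1, 'Ada', 'Paris');"
    )
    for table, n_columns, n_rows in schema_overview(conn):
        print(f"{table}: {n_columns} columns, {n_rows} rows")
```

Tuples like these are the kind of raw metrics a visualization layer could then map to shapes and sizes.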

Are there already any research efforts, projects, ... for data mining in Pharo? I guess this could make
up a great toolset with Pharo as a base and make Pharo better known, because relational databases
play a big role in the software business.

Any takers for the next Pharo success story? ;)

Bye
T.

Esteban A. Maringolo

Re: Analysis/Visualizations of DBs

I would be happy to participate in such a project :)
Esteban A. Maringolo

In reply to the post by Torsten Bergmann of 2015-10-02.

Thierry Goubier

Re: Analysis/Visualizations of DBs

In reply to this post by Torsten Bergmann

2015-10-02 14:34 GMT+02:00 Torsten Bergmann <[hidden email]>:

> Hi,
>
> we have many nice analysis and visualizations tools available (Moose, Roassal, ViDi, ...)
> Often it is focused on development or code. Also we have refactoring tools.
>
> What would be cool would be to have some tool for doing this on (relational) databases and
> included data:
>
> - visualize databases (tables, data clusters, number of columns, ...)

I like the fact that you say databases and not datasets ;)

> - analyze tables on satisfying some (external) business constraints
> and consistency rules

That depends on how powerful your constraint system is.

> - refactoring of databases (moving, migrating columns, ...)

Hmm, that one is schema migration. There is probably already a lot of work on the theory of that.

> - easy building reports/visualizations of included data (extract, query, filter, ...)

That is a business report generator. Many existing competitors are already doing lots of very cool visualisations.

> What I think of is a tool where (after connecting to a possibly unknown database) you can easily
> get/build an overview on the design, quality and data of the database (similar to what Moose, ViDi does
> for code).
>
> Is there any research, projects, ... for data mining already in Pharo? I guess this could make
> up a great toolset with Pharo as a base and make it more known because in the software business
> relational databases play a big role.

Some of us did a datathon in Paris with Pharo (Serge, Onil, me a bit remotely, Alexandre, Alvaro...). We had big issues opening a large file (~66GB of CSV data).
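
For a file of that size, one common workaround is to stream it row by row and keep only small counters in memory. The sketch below shows the idea in Python for illustration only: it is not what was done at the datathon, and `profile_csv` is a hypothetical name. It assumes the first row is a header.

```python
import csv
import io


def profile_csv(lines):
    """Stream rows from an iterable of text lines.

    Returns (row_count, {column: non_empty_count}) while holding only one
    row at a time in memory.
    """
    reader = csv.reader(lines)
    header = next(reader)  # assumption: the first row names the columns
    non_empty = [0] * len(header)
    row_count = 0
    for row in reader:
        row_count += 1
        # Ignore any extra cells beyond the header width.
        for i, value in enumerate(row[: len(header)]):
            if value != "":
                non_empty[i] += 1
    return row_count, dict(zip(header, non_empty))


if __name__ == "__main__":
    # An in-memory sample stands in for a real file; with a file on disk,
    # pass the object returned by open(path, newline="") instead.
    sample = io.StringIO("id,city\n1,Paris\n2,\n3,Lille\n")
    print(profile_csv(sample))
```

Memory use stays flat regardless of file size, at the cost of only supporting metrics that can be computed in a single pass.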

> Any takers for the next Pharo success story? ;)

Would be nice ;)

Thierry

> Bye
> T.

Stephan Eggermont-3

Re: Analysis/Visualizations of DBs

In reply to this post by Torsten Bergmann

On 02-10-15 14:34, Torsten Bergmann wrote:
> What I think of is a tool where (after connecting to a possibly unknown database) you can easily
> get/build an overview on the design, quality and data of the database (similar to what Moose, ViDi does
> for code).

In general, this is difficult to do. Our tools assume that all data is
in RAM and that you can easily calculate the metrics you want to display.
That means that currently it works well for databases < ~2GB. With a
64-bit VM we can easily use ~500 GB in a workstation (let's see what the GC
thinks of that) or 60GB in a cheap PC.

For analyzing legacy systems, the current limits are enough to analyze
and migrate an ERP system for a 100-people company with 20 years of
production data. What can be extremely helpful is normalizing data while
reading it from the database. You'll often find that you can reduce file
size for the largest tables by a factor of 10. To get a quick idea of the
normalization of the data, try zipping a dump file.
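
The zipping check can be sketched as follows, in Python for illustration and using DEFLATE from the standard `zlib` module as a stand-in for whatever zip tool you prefer. The names `compression_ratio` and `read_chunks` are hypothetical. A high ratio hints at a lot of repeated values, which is what you would expect from poorly normalized data; it is only a rough indicator.

```python
import zlib


def read_chunks(path, size=1 << 20):
    """Yield a file's bytes in fixed-size chunks so it never sits in RAM whole."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(size)
            if not chunk:
                break
            yield chunk


def compression_ratio(chunks):
    """Return raw_bytes / compressed_bytes for an iterable of byte chunks."""
    compressor = zlib.compressobj(6)
    raw = 0
    packed = 0
    for chunk in chunks:
        raw += len(chunk)
        packed += len(compressor.compress(chunk))
    packed += len(compressor.flush())
    return raw / packed


if __name__ == "__main__":
    # Synthetic stand-in for a dump file with heavily repeated values;
    # for a real dump, use compression_ratio(read_chunks(path)).
    redundant_dump = [b"Amsterdam,NL,active\n" * 1000]
    print(f"ratio: {compression_ratio(redundant_dump):.1f}")
```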

Stephan

stepharo

Re: Analysis/Visualizations of DBs

In reply to this post by Torsten Bergmann

Anne Etien and Olivier Auverlot (with the help of Guillaume Larcheveque)
are parsing SQL and building a representation of such objects in Moose.
So contact Anne:

[hidden email]

On 2/10/15 14:34, Torsten Bergmann wrote: