Re: Panama Papers: a case for reproducible research, data activism and frictionless data (powered by Pharo)
Posted by Offray Vladimir Luna Cárdenas on May 21, 2016; 2:52am
URL: https://forum.world.st/Panama-Papers-a-case-for-reproducible-research-data-activism-and-frictionless-data-powered-by-Pharo-tp4896412p4896469.html
Thanks Gastón for your interest.
I used CSV and imported it into SQLite, because that's the format
the ICIJ released their info in, and it lets me query aggregated
information in an easy way. I bridge SQLite with Pharo using UDBC,
and then the choropleth map was made with Roassal. Details are in
the blog post ;-).
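To give an idea of that bridge, here is a minimal sketch of querying
the imported database from Pharo. It assumes the UDBC-SQLite3 driver
of the time (newer images rename the classes to SQLite3Connection and
friends, and selector names may differ slightly), and the table and
column names ('entities', 'countries') are just placeholders for
whatever your CSV import produces, not the exact ICIJ schema:

  "Open the SQLite database built from the ICIJ CSV files and
   aggregate the number of entities per country."
  | connection cursor counts |
  connection := UDBCSQLite3Connection openOn: 'offshore_leaks.db'.
  [ cursor := connection execute:
      'SELECT countries, COUNT(*) AS total
       FROM entities
       GROUP BY countries
       ORDER BY total DESC'.
    counts := Dictionary new.
    cursor rows do: [ :row |
      counts at: (row at: 'countries') put: (row at: 'total') ] ]
      ensure: [ connection close ].
  counts

Something along those lines gives the country -> count data that the
Roassal choropleth can be built from.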
My first attempt was to load all the nodes (Entities in the
offshore leaks database) into Roassal and query/visualize them
directly, but with over 150k nodes the environment started to lag
and wasn't as responsive as I wanted for exploring the dataset.
That's why I quickly switched to SQLite. I think this keeps the
environment agile and covers a pretty good amount of the cases
where you work with tabular data, and even some specific graphs
could be replicated from the exported CSV files containing the
entities and their relationships (see the sketch below). My focus
was more on the accuracy of the visualization, trying to put the
rest of the territories on a Roassal map. If you're interested I
can put together a quick script to run the visualization/notebook
in your Moose image.
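As a rough illustration of replicating a graph from the exported
CSV files, something along these lines should work with NeoCSV and
Roassal2's RTMondrian. The file names and column positions are my
assumptions about the export (adjust them to the real files), and I
keep only a small subset of nodes precisely to avoid the lag I
mentioned above:

  "Read entity ids and relationships from the exported CSV files.
   Assumed layout: node id in the first column of Entities.csv,
   source/target ids in the first two columns of all_edges.csv."
  | entityIds edges ids someEdges b |
  entityIds := ((NeoCSVReader on: 'Entities.csv' asFileReference readStream)
      skipHeader; upToEnd) collect: [ :row | row first ].
  edges := (NeoCSVReader on: 'all_edges.csv' asFileReference readStream)
      skipHeader; upToEnd.

  "Keep a small subset so the environment stays responsive."
  ids := (entityIds first: 500) asSet.
  someEdges := edges select: [ :e |
      (ids includes: e first) and: [ ids includes: e second ] ].

  b := RTMondrian new.
  b shape circle size: 5.
  b nodes: ids.
  b edges source: someEdges connectFrom: #first to: #second.
  b layout force.
  b
  "Inspect the result in a Playground to render the view."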
I have not used Neo4j, but there will be a seminar on how it was
used in the Panama Papers investigation next Tuesday:
http://info.neo4j.com/0526-register.html
Cheers,
Offray
On 20/05/16 16:42, Gastón Dall' Oglio wrote:
Hi. Looks good :)
Just out of curiosity, what data format did you use? CSV,
SQLite?
A question: can you use Neo4reSt to store the data and
Pharo/Roassal to display it in a more or less friendly way, or
is there a lot of impedance between the graph models?