Gestome and OLAP

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

Gestome and OLAP

Diogenes Moreira
Someone has experience in feeding a OLAP Cube from a Gemstone or persist the data un DBMS...

Some advice, will be welcome.

Thanks in avances..

Best

http://about.me/diogenes.moreira
Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

James Foster-8
Hi Diogenes,

While GemStone is a wonderful object database, it does not have a ready solution for replicating to a relational database. This is the down side of having avoided the object-relational mismatch when building an application. Even so, it is not so hard to export a subset of your data for off-line analysis. There are a couple steps in this process. 

First, decide which objects you want to export and what to have as the relational representation of those objects. This step is very application-dependent and requires an understanding of the domain model and the reporting requirements.

Second, export those object to the appropriate relational data store. This requires an understanding of the target system and tools to transfer the data. GemConnect is a commercial, add-on product for GemStone/S 64 Bit (see http://gemstone.com/products/gemstone) and provides an interface to Oracle. (A free, open-source version is available for Sybase at http://sourceforge.net/projects/gemconnect/.) Alternatively, you could write your own interface using the FFI layer in version 3.0, a C UserAction, a socket interface, or a GCI application that communicates with both systems.

James

On Aug 15, 2011, at 7:01 AM, Diogenes Moreira wrote:

Someone has experience in feeding a OLAP Cube from a Gemstone or persist the data un DBMS...

Some advice, will be welcome.

Thanks in avances..

Best

http://about.me/diogenes.moreira

Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

Mariano Martinez Peck


On Mon, Aug 15, 2011 at 5:52 PM, James Foster <[hidden email]> wrote:
Hi Diogenes,

While GemStone is a wonderful object database, it does not have a ready solution for replicating to a relational database. This is the down side of having avoided the object-relational mismatch when building an application. Even so, it is not so hard to export a subset of your data for off-line analysis. There are a couple steps in this process. 

First, decide which objects you want to export and what to have as the relational representation of those objects. This step is very application-dependent and requires an understanding of the domain model and the reporting requirements.

Second, export those object to the appropriate relational data store. This requires an understanding of the target system and tools to transfer the data. GemConnect is a commercial, add-on product for GemStone/S 64 Bit (see http://gemstone.com/products/gemstone) and provides an interface to Oracle. (A free, open-source version is available for Sybase at http://sourceforge.net/projects/gemconnect/.) Alternatively, you could write your own interface using the FFI layer in version 3.0,

If Gemstone 3.0 now provides an easy FFI implementation, porting OpenDBXDriver (formely known as SqueakDBX) should be very easy. Just subclass OpenDBX and implement the 20 ffi methods. The "Smalltalk" part of OpenDBXDriver should be easy to port.


 
a C UserAction, a socket interface, or a GCI application that communicates with both systems.

James

On Aug 15, 2011, at 7:01 AM, Diogenes Moreira wrote:

Someone has experience in feeding a OLAP Cube from a Gemstone or persist the data un DBMS...

Some advice, will be welcome.

Thanks in avances..

Best

http://about.me/diogenes.moreira




--
Mariano
http://marianopeck.wordpress.com

Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

Diogenes Moreira
Thanks for your answers.


http://about.me/diogenes.moreira


On Mon, Aug 15, 2011 at 1:27 PM, Mariano Martinez Peck <[hidden email]> wrote:


On Mon, Aug 15, 2011 at 5:52 PM, James Foster <[hidden email]> wrote:
Hi Diogenes,

While GemStone is a wonderful object database, it does not have a ready solution for replicating to a relational database. This is the down side of having avoided the object-relational mismatch when building an application. Even so, it is not so hard to export a subset of your data for off-line analysis. There are a couple steps in this process. 

First, decide which objects you want to export and what to have as the relational representation of those objects. This step is very application-dependent and requires an understanding of the domain model and the reporting requirements.

Second, export those object to the appropriate relational data store. This requires an understanding of the target system and tools to transfer the data. GemConnect is a commercial, add-on product for GemStone/S 64 Bit (see http://gemstone.com/products/gemstone) and provides an interface to Oracle. (A free, open-source version is available for Sybase at http://sourceforge.net/projects/gemconnect/.) Alternatively, you could write your own interface using the FFI layer in version 3.0,

If Gemstone 3.0 now provides an easy FFI implementation, porting OpenDBXDriver (formely known as SqueakDBX) should be very easy. Just subclass OpenDBX and implement the 20 ffi methods. The "Smalltalk" part of OpenDBXDriver should be easy to port.


 
a C UserAction, a socket interface, or a GCI application that communicates with both systems.

James

On Aug 15, 2011, at 7:01 AM, Diogenes Moreira wrote:

Someone has experience in feeding a OLAP Cube from a Gemstone or persist the data un DBMS...

Some advice, will be welcome.

Thanks in avances..

Best

http://about.me/diogenes.moreira




--
Mariano
http://marianopeck.wordpress.com


Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

Stephan Eggermont-3
In reply to this post by Diogenes Moreira
Is there a special reason you want to do the OLAP outside of Gemstone?

Stephan

On 15 aug 2011, at 16:01, Diogenes Moreira wrote:

> Someone has experience in feeding a OLAP Cube from a Gemstone or persist the data un DBMS...
>
> Some advice, will be welcome.
>
> Thanks in avances..
>
> Best
>
> http://about.me/diogenes.moreira

Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

Diogenes Moreira
perhaps because of my ignorance.. I don't know OLAP CUBES or Dataminig tools, running on the GS, because it, my question.

 My approach to gemstone was through GLASS

Well, users use to explore the data though excel and olap cubes.. and run others BI tools.

if you give my some advice to kill the relational database.. i'll be very happy :)

Best.. 


http://about.me/diogenes.moreira


On Tue, Aug 16, 2011 at 8:59 PM, Stephan Eggermont <[hidden email]> wrote:
Is there a special reason you want to do the OLAP outside of Gemstone?

Stephan

On 15 aug 2011, at 16:01, Diogenes Moreira wrote:

> Someone has experience in feeding a OLAP Cube from a Gemstone or persist the data un DBMS...
>
> Some advice, will be welcome.
>
> Thanks in avances..
>
> Best
>
> http://about.me/diogenes.moreira


Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

dario trussardi

Hi,

you can read about my work alias "Statistics on-line " relative to Statistics Data Warehouse problematic.

 


Dario

perhaps because of my ignorance.. I don't know OLAP CUBES or Dataminig tools, running on the GS, because it, my question.

 My approach to gemstone was through GLASS

Well, users use to explore the data though excel and olap cubes.. and run others BI tools.

if you give my some advice to kill the relational database.. i'll be very happy :)

Best.. 


http://about.me/diogenes.moreira


On Tue, Aug 16, 2011 at 8:59 PM, Stephan Eggermont <[hidden email]> wrote:
Is there a special reason you want to do the OLAP outside of Gemstone?

Stephan

On 15 aug 2011, at 16:01, Diogenes Moreira wrote:

> Someone has experience in feeding a OLAP Cube from a Gemstone or persist the data un DBMS...
>
> Some advice, will be welcome.
>
> Thanks in avances..
>
> Best
>
> http://about.me/diogenes.moreira




PresentazioneStatistiche.pdf (345K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

Diogenes Moreira
Dario, 

Thank for you answare, but BI and Dataminig Tools are more complex than explore data..

Well, I'll try get money from the users.. to by a  GemsConnect.. that works with GLASS or need by other kind of lincence ??

Other way.. port DBXTalk to glass :)..


Best.

http://about.me/diogenes.moreira


On Thu, Aug 18, 2011 at 10:19 AM, Dario Trussardi <[hidden email]> wrote:

Hi,

you can read about my work alias "Statistics on-line " relative to Statistics Data Warehouse problematic.

 


Dario

perhaps because of my ignorance.. I don't know OLAP CUBES or Dataminig tools, running on the GS, because it, my question.

 My approach to gemstone was through GLASS

Well, users use to explore the data though excel and olap cubes.. and run others BI tools.

if you give my some advice to kill the relational database.. i'll be very happy :)

Best.. 


http://about.me/diogenes.moreira


On Tue, Aug 16, 2011 at 8:59 PM, Stephan Eggermont <[hidden email]> wrote:
Is there a special reason you want to do the OLAP outside of Gemstone?

Stephan

On 15 aug 2011, at 16:01, Diogenes Moreira wrote:

> Someone has experience in feeding a OLAP Cube from a Gemstone or persist the data un DBMS...
>
> Some advice, will be welcome.
>
> Thanks in avances..
>
> Best
>
> http://about.me/diogenes.moreira





Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

Nick
Well, I'll try get money from the users.. to by a  GemsConnect.. that works with GLASS or need by other kind of lincence ??

Other way.. port DBXTalk to glass :)..

I believe there are various Smalltalk implementations of the Postgres db (socket-based) protocol around and I'm sure I remember talk of one working in Gemstone.   
Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

Diogenes Moreira
Good point..

Enviado desde mi BlackBerry® de Claro Argentina


From: Nick Ager <[hidden email]>
Date: Thu, 18 Aug 2011 17:27:17 +0100
To: GemStone Seaside beta discussion<[hidden email]>
ReplyTo: GemStone Seaside beta discussion <[hidden email]>
Subject: Re: [GS/SS Beta] Gestome and OLAP

Well, I'll try get money from the users.. to by a  GemsConnect.. that works with GLASS or need by other kind of lincence ??

Other way.. port DBXTalk to glass :)..

I believe there are various Smalltalk implementations of the Postgres db (socket-based) protocol around and I'm sure I remember talk of one working in Gemstone.   
Reply | Threaded
Open this post in threaded view
|

Re: Gemstone and OLAP

Stephan Eggermont-3
In reply to this post by Diogenes Moreira

On 17 aug 2011, at 14:20, Diogenes Moreira wrote:

> perhaps because of my ignorance.. I don't know OLAP CUBES or Dataminig tools, running on the GS, because it, my question.
>
>  My approach to gemstone was through GLASS
>
> Well, users use to explore the data though excel and olap cubes.. and run others BI tools.
>
> if you give my some advice to kill the relational database.. i'll be very happy :)
 
It depends. There are lots of factors influencing such a decision:
- If the amount of data is small, every solution works. Pick the easiest.
- If you already have OLAP cubes and BI tools in place,
   just exporting the data in csv is probably easiest and fastest.
  All tools I'm aware of can read csv.
- The basic OLAP cube model is simple to implement in
  Smalltalk.  It is easy to add cached values (materialized subviews)
  and store data in a form optimal for your application. Connecting multiple
  cubes and working with time-based data is also easy in Smalltalk.
- OLAP is often done on a copy of the OLTP database to avoid slowing
  down the OLTP users. OLAP often needs to access large datasets.
  OLAP often considers history, while OLTP is focused on value now.
  The OLAP database therefore has a much larger working set. On the
  other hand the number of OLAP users often is much lower than the number
  of OLTP users.
- Do you need to provide data and analytics to one or a few specialists or
  do hundreds or thousands of users need to do real-time analytics?
  The specialists just need the data in a form they can use or
  can be trained to work with.  Real-time with lots of users needs a
  parallelized solution for ad-hoc queries or a limited model with
  pre-calculations.
- As long as the data fits in ram and your solution can use the ram,
  you need not have performance problems. SQL server has a 64G
  limit in the standard edition, you'll need Enterprise to use more.
  A Dell workstation with 192 GB ram costs about 15K euro today.
- Generic relational databases are exceptionally badly suited for BI/OLAP.
  They are designed for OLTP. Handling of time-based data is often very
  primitive. To get acceptable performance they need to materialize subviews,
  which tend to lead to a database size explosion when the dimension size
  (number of dimensions and elements and levels/dimension) grows.
  Database vendors are very interested in selling you these products,
  as you'll need lots of licenses.
- Gemstone is a much better fit with OLAP. You can implement optimized storage
  strategies easily (for inspiration, take a look at the Vertica white papers). OLAP is
  main memory intensive though, so you might outrun the free version quickly.
  If you run the OLAP on a copy of your database, you can consider putting all files
  on a large ram disk.
- MOOSE with Glamour and Mondrian is quite a nice tool for analyzing data
  and presenting results. You are limited in data size though. I didn't try it yet,
  but a MOOSE image should be able to just load Gemtools.

Some ten years ago I've build a distributed main memory cube engine in java.
I've not yet found the time (or a client) to re-implement it in Smalltalk.

Stephan Eggermont
Reply | Threaded
Open this post in threaded view
|

Re: Gemstone and OLAP

Diogenes Moreira
Hi stef.

Thanks for you answer
The cvs is a quick and good solution..because exists ETL process for import the data :)..

In this case, the data is operated by 8 analyst, but is a very larga database, arround 590 GB. 40 diferent cubes with an average of 20 dimmension for each one.

Precalculate aggregations is very important.

Over that cubes run spss.. Well a very large BI Platform.

I thinking replace a fronting applications.. Because my questions :)

Thanks again.
Best.
D
Enviado desde mi BlackBerry® de Claro Argentina

-----Original Message-----
From: Stephan Eggermont <[hidden email]>
Sender: [hidden email]
Date: Fri, 19 Aug 2011 11:30:56
To: GemStone Seaside beta discussion<[hidden email]>
Reply-To: GemStone Seaside beta discussion <[hidden email]>
Subject: Re: [GS/SS Beta] Gemstone and OLAP


On 17 aug 2011, at 14:20, Diogenes Moreira wrote:

> perhaps because of my ignorance.. I don't know OLAP CUBES or Dataminig tools, running on the GS, because it, my question.
>
>  My approach to gemstone was through GLASS
>
> Well, users use to explore the data though excel and olap cubes.. and run others BI tools.
>
> if you give my some advice to kill the relational database.. i'll be very happy :)
 
It depends. There are lots of factors influencing such a decision:
- If the amount of data is small, every solution works. Pick the easiest.
- If you already have OLAP cubes and BI tools in place,
   just exporting the data in csv is probably easiest and fastest.
  All tools I'm aware of can read csv.
- The basic OLAP cube model is simple to implement in
  Smalltalk.  It is easy to add cached values (materialized subviews)
  and store data in a form optimal for your application. Connecting multiple
  cubes and working with time-based data is also easy in Smalltalk.
- OLAP is often done on a copy of the OLTP database to avoid slowing
  down the OLTP users. OLAP often needs to access large datasets.
  OLAP often considers history, while OLTP is focused on value now.
  The OLAP database therefore has a much larger working set. On the
  other hand the number of OLAP users often is much lower than the number
  of OLTP users.
- Do you need to provide data and analytics to one or a few specialists or
  do hundreds or thousands of users need to do real-time analytics?
  The specialists just need the data in a form they can use or
  can be trained to work with.  Real-time with lots of users needs a
  parallelized solution for ad-hoc queries or a limited model with
  pre-calculations.
- As long as the data fits in ram and your solution can use the ram,
  you need not have performance problems. SQL server has a 64G
  limit in the standard edition, you'll need Enterprise to use more.
  A Dell workstation with 192 GB ram costs about 15K euro today.
- Generic relational databases are exceptionally badly suited for BI/OLAP.
  They are designed for OLTP. Handling of time-based data is often very
  primitive. To get acceptable performance they need to materialize subviews,
  which tend to lead to a database size explosion when the dimension size
  (number of dimensions and elements and levels/dimension) grows.
  Database vendors are very interested in selling you these products,
  as you'll need lots of licenses.
- Gemstone is a much better fit with OLAP. You can implement optimized storage
  strategies easily (for inspiration, take a look at the Vertica white papers). OLAP is
  main memory intensive though, so you might outrun the free version quickly.
  If you run the OLAP on a copy of your database, you can consider putting all files
  on a large ram disk.
- MOOSE with Glamour and Mondrian is quite a nice tool for analyzing data
  and presenting results. You are limited in data size though. I didn't try it yet,
  but a MOOSE image should be able to just load Gemtools.

Some ten years ago I've build a distributed main memory cube engine in java.
I've not yet found the time (or a client) to re-implement it in Smalltalk.

Stephan Eggermont
Reply | Threaded
Open this post in threaded view
|

Re: Gemstone and OLAP

Stephan Eggermont-3

On 19 aug 2011, at 14:44, [hidden email] wrote:
> Thanks for you answer
> The cvs is a quick and good solution..because exists ETL process for import the data :)..
>
> In this case, the data is operated by 8 analyst, but is a very larga database, arround 590 GB. 40 diferent cubes with an average of 20 dimmension for each one.

How much of that is real data? With 20 dimensions you probably need
a lot of materialized subviews.

> Precalculate aggregations is very important.

That depends on the architecture. With a distributed main-memory
model you might get away with a normalized model, and just calculate
on the fly.

> Over that cubes run spss.. Well a very large BI Platform.

SPSS is large and has a lot of functionality. If the users and management
are happy with it (and the performance)...

Stephan
Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

Juan-2
In reply to this post by Nick
On Thu, Aug 18, 2011 at 1:27 PM, Nick Ager <[hidden email]> wrote:
>> Well, I'll try get money from the users.. to by a  GemsConnect.. that
>> works with GLASS or need by other kind of lincence ??
>> Other way.. port DBXTalk to glass :)..
>
> I believe there are various Smalltalk implementations of the Postgres db
> (socket-based) protocol around and I'm sure I remember talk of one working
> in Gemstone.
Yes
Reply | Threaded
Open this post in threaded view
|

Re: Gestome and OLAP

YossiDM
If you're really desperate and do not want to write anything that is lower-level socket based, you could use web services or a CURL approach. Indeed, there are several online DBs, ERPs, and CRMs that work this way. CouchDB for example uses the CURL approach extensively. It's of course way more chatty and not the most efficient vs. a native protocol/sockets, but it requires minimal effort and a lot of products already have this.


For example, you could create a REST interface or simple web service wrapper for your basic CRUD operations. Microsoft Dynamics CRM for instance does this with a WCF service layer in .NET. There's one for SalesForce.com as well like this. You could just pick your favorite ORM, setup a simple wrapper on your relational DB, and expose some methods via web service.

JSON can be a happy communication language as most platforms can serialize/deserialize this. You simply need to take each object in Gemstone, serialize it to json, then deserialize it into Java, Python, C++, whatever. You could use protobuf instead if you're familiar with it and it suits your needs better.

The biggest problem here will be the amount of network traffic. If you're going to bulk insert 2 million rows, it's going to be slow if each row is a separate web service call. It will work, but certainly that is not designed for that. You need to be careful not to saturate your connection. I've run into issues needing to do massive inserts into things like Azure or Amazon SimpleDB. One way to speed it up is to do parallel queries/inserts and you can perhaps use a queuing or scheduler to rate limit if needed.

You can also just try to bulk ship the data over to the other system so it is at least physically there. It may take awhile to serialize your entire object graph though and could fail so be careful, but it could be as simple as sending over a file to the other side with a watcher process or via a web service call.  If it's all on the same machine, this becomes a lot easier, but I am guessing this is not the case with Gemstone and your relational DB.

Another approach would be to again forget the relational DB and do things in memory as mentioned. If you are not comfortable with FFI and that kind of thing, you could again create web services against SPSS or whatever products that wrap their API (if not already there). Again, there's going to be a speed issue.

You could also go the other direction and warp Gemstone data with web services. Generally I would think you would rather query using gemstone anyway. I'm not sure in what context you're doing this analysis, but this is something I've often thought of doing with my data. Essentially you should already be encapsulating your object queries in some way like find customers via some block. Just make your service interface take params that can be sent in by a web app or your data visualizer/stats package API capabilities. In my experience, most of these packages have a standard data format and really what happens is SQL, CSV, tab delimited, flat files, XML, whatever maps back to it. You can return your data in one of the compatible formats from gemstone or create a new supported format if you are able. Then you will just make a service call that returns the data in this format i.e. from C# it might look like IEnumerable<Customers> customers = myGemstoneOrdersServiceWrapperClient.FindCustomer(firstName, lastName); and then your stats package consumes the customers object as it normally would. Obviously again, this can be XML, json, csv, whatever works for your software.