Distributed OODBMS

Udo Schneider

Distributed OODBMS

All,

Is anybody aware of a distributed OODBMS for Smalltalk (preferably
Dolphin, of course)? Currently it would be sufficient to have a master
OODB (with r/w access) which gets replicated to another client (only r/o
access needed). So it would be a one-way replication. In the long term a
synchronizing approach would be preferred, but let's start small.

I searched the web and found nothing. Is anybody aware of something like
this for Smalltalk? Otherwise I thought about implementing a
distributing/synchronizing layer on top of OmniBase ... but that's just
food for thought.

CU,

Udo

Sebastián Sastre

Re: Distributed OODBMS

It seems that you can do something like that with GOODS. I made a
port of the GOODS client from Squeak to Dolphin, but honestly I don't
find it a good option. It's neither as flexible nor as reliable as
one would expect from an ODB. With OmniBase I get 1000 times the
performance that GOODS can offer from Smalltalk.

Another approach, I think, is to use Dolphin's images. Each image
can work as an ODB system itself: put a sufficiently reliable
distribution framework on top of it and you're done. For more
reliability, you can add a transactional framework.

I've made some tests with the "rST" distribution framework and it
works. Just build something to pass blocks so that the server does the
query processing (for instance, you send from the client an object with
the code 'myRepository select: [:e | 'John*' match: e completeName]' and
on the server that code is compiled and evaluated). Performance is
good.
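
To illustrate the block-passing idea, here is a minimal sketch in Python rather than Smalltalk (rST's actual API is not shown in this thread, so nothing below is rST). The names REPOSITORY and run_remote_query are made up, and eval stands in for the server-side compile-and-evaluate step. Evaluating client-supplied source is only reasonable among trusted peers; emptying __builtins__ is not a sandbox.

```python
from fnmatch import fnmatchcase

# Hypothetical server-side repository; the thread does not show a real schema.
REPOSITORY = [
    {"complete_name": "John Smith"},
    {"complete_name": "Johnny Doe"},
    {"complete_name": "Mary Jones"},
]


def run_remote_query(source, repository):
    """Compile query source received from a client and run it on the server."""
    # Names the query may use must live in the globals dict, because a lambda
    # created by eval resolves its free names through globals.
    env = {"__builtins__": {}, "match": fnmatchcase}
    predicate = eval(source, env)
    return [entry for entry in repository if predicate(entry)]


# Client side: only this short string crosses the network, not the data.
query = "lambda e: match(e['complete_name'], 'John*')"
matches = run_remote_query(query, REPOSITORY)
```

The point of the design is that filtering happens next to the data, so only the matching entries travel back to the client.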

I don't know of any reliable transactional framework to use with (or port to)
Dolphin. Perhaps somebody can give a clue on this.

cheers,

Sebastian

Udo Schneider

Re: Distributed OODBMS

Hi Sebastián,

thanks for your comments.

> It seems that you can make something like that with GOODS. I made a
> port of the GOODS client from Squeak to Dolphin. But sincerely I don't
> find it a good option. It's not as flexible, nor trustable choice as
> one could expect from an ODB. With OmniBase I have x1000 the
> performance that GOODS can offer from Smalltalk.
I took a look at GOODS but decided against it as it is a) not native
Smalltalk and b) it carries a lot of functionality not needed.

> Other approach I think you have, is to use dolphin's images. Each image
> can work as an ODB system itself if you put a distribution framework,
> trustable enough, on it, you're done. For more trustability, can add a
> transactional framework.
Although that's a good idea, I can't use it in my case. I won't be able to
start "replicated images" once they are needed. Instead the images are
always running on the different machines and access the database from
time to time, which should then be up to date with a master database.

If you take different images accessing the same shared OmniBase database,
you're pretty close to what I have in mind (including wonderful OmniBase
features like Multi-Version Concurrency Control). The only problem is
that the different images in my case do not have access to a shared
file system ... so I'd like to "replicate" OmniBase's functionality of
multiple clients accessing the same database over a network connection
... but without a shared file system.

> I've made some tests with the "rST" distribution framework and it
> works. Just make something to pass blocks so the server makes the
> queries processin (for instance you send from the client an object with
> the code 'myRepository select:[:e| 'John*' match: e completeName ]' and
> in the server that code is compiled and evaluated). Performance is
> cool.
I had a similar idea, but instead of using rST (and adding another
dependency) I went for ODBSerializer/ODBDeserializer. I
think that rST is both smaller (in terms of the serialized stream size)
and better (object reference vs. object copy), but I might be wrong.
Nevertheless, querying is not the main problem. My needs for the database
are pretty simple (a replicated Dictionary would do :-). But the objects
I'm storing are pretty big (>10MB is not unusual). I want to replicate
this data because once the client needs to access it, even a few
seconds of access time is too much. Hence the need for a local replica.

CU,

Udo

Eric Taylor

Re: Distributed OODBMS

In reply to this post by Udo Schneider

Udo,

Have you taken a look at Matisse? We've worked with Matisse quite a bit
under Eiffel, and it's a _fantastic_ OODBMS. They have a Smalltalk
binding for VisualWorks, but I'm sure someone with experience could port
that binding to Dolphin. They actually have _many_ bindings, including
a C API that compiles relatively easily to a .DLL, should the VW port
prove unsatisfactory.

They have a free developers' edition, but it does expire periodically.
You have to renew. They also have a great Open Source licensing scheme,
which would be available to you if you express an earnest interest in
creating yet another binding. In fact, all of their bindings have been
created this way. The next step is their Home Developer/Distribution
Licensing scheme. This is for a developer who is writing small business
applications that will require no more than two (2) simultaneous users.
It's $495/year. Otherwise, Matisse is pricey (but in line with other
products of its type).

Take a look at this link, particularly towards the bottom:
http://www.matisse.com/product_information/features/

Hope this helps.

Eric S. Taylor

Udo Schneider

Re: Distributed OODBMS

Eric,

thanks for the tip. I have already registered for the developer version and
am reading the docs and the Smalltalk binding ... Looks pretty
impressive so far.

Thanks,

Udo

James Foster-3

Re: Distributed OODBMS

In reply to this post by Udo Schneider

http://www.gemstone.com/products/smalltalk/ describes the GemStone/S Object
Server. Think of it as a multi-user shared Smalltalk image with database
semantics. You don't describe much about your needs, but this certainly
qualifies as a distributed OODBMS for Smalltalk.

Eric Taylor

Re: Distributed OODBMS

James,

"Distributed" in this case, if I understand Udo's needs correctly,
refers to the ability to distribute a single database across multiple
hard drives. The distribution may be in the form of replication of the
entire database, or a parceling out of various tables to various
locations. Formally, it has little to do with multiple users, although
multiple users are certainly implicit in the distribution scheme.

Can GemStone/S handle this kind of a scenario?

Eric S. Taylor

pax

Re: Distributed OODBMS

> "Distributed" in this case, if I understand Udo's needs correctly,
> refers to the ability to distribute a single database across multiple
> hard drives.

> Can GemStone/S handle this kind of a scenario?

Eric, Udo...

if I am not mistaken, the standard GemStone Object Server cannot handle
this scenario. The GemStone server would also need the "Enterprise
Option", which allows data/objects to be replicated across additional
object servers that are reachable across the "Enterprise Network".

This would obviously increase the deployment costs, as you would need to
pay for each object server/CPU in addition to the Enterprise Option per
server/CPU. However, if the project's budget supports the cost
of implementing this solution, it could be a viable option.

If not, I would say that it's time to roll up your sleeves and
build/modify to meet your requirements.

Pax

Udo Schneider

Re: Distributed OODBMS

In reply to this post by James Foster-3

James Foster wrote:
> http://www.gemstone.com/products/smalltalk/ describes the GemStone/S Object
> Server. Think of it as a multi-user shared Smalltalk image with database
> semantics. You don't describe much about your needs, but this certainly
> qualifies as a distributed OODBMS for Smalltalk.
GemStone might definitely be a good choice ... however, I'm pretty sure it
far exceeds what I'm searching for, in terms of complexity (to
set up, not to use!) and for sure in terms of pricing.
However, this might be one of the alternatives left as far as I can see.

CU,

Udo

Udo Schneider

Re: Distributed OODBMS

In reply to this post by Eric Taylor

Eric Taylor wrote:
> "Distributed" in this case, if I understand Udo's needs correctly,
> refers to the ability to distribute a single database across multiple
> hard drives. The distribution may be in the form of replication of the
> entire database, or a parceling out of various tables to various
> locations. Formally, it has little to do with multiple users, although
> multiple users are certainly implicit in the distribution scheme.
If you replace "hard drives" with "hard drives on multiple clients", it's
what I intended to say. Sorry if I didn't make myself very clear.

CU,

Udo

Udo Schneider

Re: Distributed OODBMS

In reply to this post by pax

Pax wrote:
> if I am not mistaken, the standard GemStone Object Server cannot handle
> this scenario. The GemStone server would also need the "Enterprise
> Option" which allows data/objects to be replicated across additional
> object servers that are reachable across the "Enterprise Network".
Didn't know that to be honest. This does increase the price tag even
more, correct?

> This would obviously increase the deployment costs as you would need to
> pay for each object server/cpu in addtion to the enterprise option per
> server/cpu. However, if the projects budgets supports the economic cost
> for implementing this solution, it could be a viable option.
/My/ budget would support that for sure :-)

> If not, I would say that its time to roll up your sleeves and
> build/modify to meet your requirements.
I think I have no choice other than doing it on my own ... let's see
where I can leverage OmniBase code as I don't want to reinvent the wheel
completely...

Thanks to all for your help and suggestions.

CU,

Udo

Bruno Brasesco

Re: Distributed OODBMS

In reply to this post by Eric Taylor

Eric Taylor wrote:

OmniBase is suitable for this.
Three years ago or so, using Dolphin, I stored an OmniBase repository on
different computers (Windows and Linux, using Samba).

But I think (not sure) that GemStone/S supports this kind of distribution
through Repository Extent.

Regards, Bruno

James Foster-3

Re: Distributed OODBMS

In reply to this post by Eric Taylor

Eric,

GemStone/S stores data in a "repository" (similar to the "image" in other
Smalltalks). A repository is made up of one or more "extents." Each extent
is a file in the file system or a raw partition on a disk. If a system is
using multiple extents, they would typically be on multiple hard drives.
GemStone/S also supports the concept of a "replicate" for each extent, with
the recommendation that the replicates be on different drives from the
primary extent. An object can be "clustered" or assigned to a page on a
particular extent (in some versions the assignment is always honored; in
others it is treated as a recommendation).

James

James Foster-3

Re: Distributed OODBMS

In reply to this post by Udo Schneider

Udo,

Why do you need to have the database stored on hard drives on multiple
clients? If the goal is (simply!) to make the data available to multiple
client machines, then do you care if the data comes over the network and is
cached in RAM rather than being read from the hard disk? It seems that any
"distributed" solution will require some network communication (getting the
data from one machine's hard disk to another machine's hard disk).

GemStone/S nicely supports clients on multiple machines and has customers
with over a thousand concurrent distinct clients all sharing the same data.
If your interest is in redundancy (dealing with hard disk failure), then
having multiple disks on one machine is usually sufficient. If you want host
redundancy, then you can keep a recent backup on another machine and apply
transaction logs to the backup as a continuous restore. That sort of
"warm-standby" is used by a number of GemStone/S customers.

If you truly require the data to be duplicated on multiple client disks,
then you are dealing with a more complex problem. One interesting problem is
what happens when the network connection between the two hosts is lost.
Would each client be able to go on updating their own copy of the database?

As mentioned, GemStone/S has an add-on product (GemEnterprise) that provides
some of this sort of coordination, but it is much more complex (and I don't
think you will be able to replicate it without substantial investment).

James

Udo Schneider

Re: Distributed OODBMS

James Foster wrote:
> Why do you need to have the database stored on hard drives on multiple
> clients? If the goal is (simply!) to make the data available to multiple
> client machines, then do you care if the data comes over the network and is
> cached in RAM rather than being read from the hard disk? It seems that any
> "distributed" solution will require some network communication (getting the
> data from one machine's hard disk to another machine's hard disk).
The reason for replication on the local harddisk is simply the (access)
time needed to access the data. One advantage is that clients do not
need write access. The database is only changed on the master. Maybe
I'll just list a few conditions and requirements:

1) Works over "slow" networks (e.g. 10Mbit/s Ethernet) with high RTT
(>200ms is not uncommon).
2) Average number of clients is ~20 Clients getting the DB from 1 Master.
2) Big Database - ~2-4GB: Due to the size complete caching is not really
possible. Complete transfer does not work either due to 1) or due to
limited resources (read network bandwidth) of the master (2).
3) Get big objects fast (10-20 MB in <5s) - Problems with 1) and 2).
.
.
10) Fault tolerance. If the master (or the network!) dies clients should
be able to continue working for ~half a day.

I came to the conclusion that only "real" replication (where the db
resides on the local client's machine) would be able to cope with this.
However, I'm happy to consider other ways to do it.
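
A quick back-of-the-envelope check of requirements 1) to 3), as a Python sketch. It assumes 1 MB = 10^6 bytes, a fully usable 10 Mbit/s link and no protocol or round-trip overhead, so real numbers would be worse. The 4000 MB figure is the upper end of the stated database size.

```python
LINK_MBIT_PER_S = 10  # the "slow" network from requirement 1)


def transfer_seconds(size_mb, link_mbit_per_s=LINK_MBIT_PER_S):
    """Best-case time to move size_mb megabytes over the link."""
    return size_mb * 8 / link_mbit_per_s


small_object = transfer_seconds(10)   # 8.0 s, already above the 5 s target
large_object = transfer_seconds(20)   # 16.0 s
full_db = transfer_seconds(4000)      # 3200.0 s, about 53 minutes for 4 GB

# 20 clients each pulling a full copy through the master's single link:
all_clients_hours = 20 * full_db / 3600  # roughly 17.8 hours
```

Even in the best case a 10 MB object cannot arrive over this link in under 5 s, which is the arithmetic behind wanting the data on the client's own disk before it is needed.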

> GemStone/S nicely supports clients on multiple machines and has customers
> with over a thousand concurrent distinct clients all sharing the same data.
> If your interest is in redundancy (dealing with hard disk failure), then
> having multiple disks on one machine is usually sufficient. If you want host
> redundancy, then you can keep a recent backup on another machine and apply
> transaction logs to the backup as a continuous restore. That sort of
> "warm-standby" is used by a number of GemStone/S customers.
If I understood you correctly, I would have to keep "backups" on all the
client machines. Can the client machines then access this local backup,
or does every client machine then need its own GS/S server instance?

> If you truly require the data to be duplicated on multiple client disks,
> then you are dealing with a more complex problem. One interesting problem is
> what happens when the network connection between the two hosts is lost.
> Would each client be able to go on updating their own copy of the database?
As mentioned above, one criterion is that clients should be able to cope
with a non-working or unreachable master. On the other hand, it is not
"mission critical" if a) the server is not reachable for a few hours, as
the database contains a kind of "playlist" for the next few hours, or b)
different clients have different versions of the database.
As far as I understand the terminology, this means the database is not
duplicated but replicated ... however, I might be totally wrong.

> As mentioned, GemStone/S has an add-on product (GemEnterprise) that provides
> some of this sort of coordination, but it is much more complex (and I don't
> think you will be able to replicate it without substantial investment).
I might have to take a look into it. On the other hand, I think this
increases the price "a bit" :-)

Thanks for your help,

CU,

Udo

Chris Uppal-3

Re: Distributed OODBMS

Udo,

> 1) Works over "slow" networks (e.g. 10Mbit/s Ethernet) with high RTT
> (>200ms is not uncommon).
> 2) Average number of clients is ~20 Clients getting the DB from 1 Master.
> 2) Big Database - ~2-4GB: Due to the size complete caching is not really
> possible. Complete transfer does not work either due to 1) or due to
> limited resources (read network bandwidth) of the master (2).
> 3) Get big objects fast (10-20 MB in <5s) - Problems with 1) and 2).
> .
> .
> 10) Fault tolerance. If the master (or the network!) dies clients should
> be able to continue working for ~half a day.

That's a /nasty/ set of requirements !

Just a thought, but could you use some sort of broadcast protocol ? I'm
imagining a scenario where every client has a more-or-less complete copy of the
database. When a client wants to ensure that it is working with the most
up-to-date version of <some data> or when it finds that it doesn't have the
data at all, then it sends a request to the server. The server /broadcasts/
its response, and all the clients listen for those broadcasts. When they see a
new version of <some data> on the wire they update their own copy whether they
need it (at that minute) or not. If something goes wrong so that they don't
get a complete copy from the broadcast, then they just don't update their local
copy (except, perhaps, to mark it "stale"). The server might also have some
sort of "push" broadcast of changes as they are made (or as a background task
to consume idle bandwidth), so that clients don't often have to check back with
the server to see whether their cached copy of <some data> is still current.
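
A rough sketch of the client side of this scheme, in Python rather than Smalltalk and with the network left out. The message layout (key, version, chunk index, chunk count, payload) and the names Entry and Replica are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    version: int
    data: bytes
    stale: bool = False


class Replica:
    """Local copy that is updated from whatever the server broadcasts."""

    def __init__(self):
        self.entries = {}   # key -> Entry
        self.pending = {}   # (key, version) -> {chunk index: payload}

    def on_broadcast(self, key, version, index, total, payload):
        current = self.entries.get(key)
        if current is not None and version <= current.version:
            return  # nothing newer than what we already hold
        parts = self.pending.setdefault((key, version), {})
        parts[index] = payload
        if len(parts) == total:
            # Only a complete copy replaces the local one.
            data = b"".join(parts[i] for i in range(total))
            self.entries[key] = Entry(version, data)
            del self.pending[(key, version)]
        elif current is not None:
            # A newer version exists but we do not have all of it yet.
            current.stale = True
```

A client that misses a chunk keeps serving its old copy, marked stale, until a later broadcast or an explicit request completes the new version.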

How you layer that over, or under, an OODB is another problem ;-)

Another possibility -- if (some of) the clients are nearer in network terms to
each other than they are to the server, it /might/ be worth thinking about
P2P-style distribution of the data.

A last thought. Have you looked at the "Prevalence" concept / design-pattern ?
If not then that might give you some ideas. See www.prevayler.org for a
description, and a Java implementation (which you would ignore, of course ;-)
You'd be using -- say -- Omnibase, instead of keeping everything in RAM, but
the basic idea seems as if it might be applicable, especially the Prevalence
approach to propagating changes.

-- chris

Udo Schneider

Re: Distributed OODBMS

Chris Uppal wrote:
> That's a /nasty/ set of requirements !
Nobody said life would be easy :-)

> Just a thought, but could you use some sort of broadcast protocol ? I'm
> imagining a scenario where every client has a more-or-less complete copy of the
> database. When a client wants to ensure that it is working with the most
> up-to-date version of <some data> or when it finds that it doesn't have the
> data at all, then it sends a request to the server. The server /broadcasts/
> its response, and all the clients listen for those broadcasts. When they see a
> new version of <some data> on the wire they update their own copy whether they
> need it (at that minute) or not. If something goes wrong so that they don't
> get a complete copy from the broadcast, then they just don't update their local
> copy (except, perhaps, to mark it "stale"). The server might also have some
> sort of "push" broadcast of changes as they are made (or as a background task
> to consume idle bandwidth), so that clients don't often have to check back with
> the server to see whether their cached copy of <some data> is still current.

I didn't really think about broadcasts yet ... maybe my notion of
broadcast is too network oriented, and thus limited to one segment.
However, (UDP) multicast might be a good idea. It would
require reliably working UDP sockets and non-unicast IP address support in
Dolphin, and I would have to add a kind of stream layer (basically
imitating TCP) above that ... sounds like a lot of work ... but I think
/any/ solution I choose will be "a lot of work".

I recently came up with the idea of combining the rsync algorithm with
distribution over BitTorrent. I have both things in place in Smalltalk (but
don't ask about code quality :-). So the idea would be that the check for
which parts of the file(s) need to be updated would be done as in rsync,
whereas the real distribution takes place using BitTorrent (reducing the
bandwidth requirements for the master). But I think that's a) pretty
heavy in terms of implementation, b) not very "Smalltalkish" and c) only
deals with files, which brings us to ...

> How you layer that over, or under, an OODB is another problem ;-)
... the question of how to go from files to OODBs. I don't think that e.g.
OmniBase would like to have its files updated independently (if that is even
possible while file locks are in place).
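
For the file-level part, the rsync-style check of which parts of a file need updating could look roughly like this. It is a Python sketch and a deliberate simplification: real rsync combines a rolling weak checksum with a strong checksum so that it also finds data that has shifted position, whereas this version only compares fixed, aligned blocks. The names block_digests and changed_blocks are hypothetical, and a replica that is longer than the master is not handled.

```python
import hashlib


def block_digests(data, block_size):
    """SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(master, replica_digests, block_size):
    """Indices of master blocks the replica lacks or holds in a different state."""
    master_digests = block_digests(master, block_size)
    return [
        i for i, digest in enumerate(master_digests)
        if i >= len(replica_digests) or replica_digests[i] != digest
    ]


# The replica sends only its digests; the master answers with block numbers.
replica = b"aaaaXXXXcccc"
master = b"aaaabbbbccccdd"
to_fetch = changed_blocks(master, block_digests(replica, 4), 4)
```

The resulting block numbers are what would then be fetched from peers rather than from the master.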

> Another possibility -- if (some of) the clients are nearer in network terms to
> each other than they are to the server, it /might/ be worth thinking about
> P2P-style distribution of the data.
That's what I thought as well in regards to BT.

> A last thought. Have you looked at the "Prevalence" concept / design-pattern ?
> If not then that might give you some ideas. See www.prevayler.org for a
> description, and a Java implementation (which you would ignore, of course ;-)
> You'd be using -- say -- Omnibase, instead of keeping everything in RAM, but
> the basic idea seems as if it might be applicable, especially the Prevalence
> approach to propagating changes.
Yes, I did. As far as I understood the concept, it would be enough to
distribute the serialized objects once (or periodically) and then only
distribute the transaction log to the clients. I might have to look into
that. However, this would also be a complete rewrite from scratch. But
based on the answers I got, I think it will be a complete rewrite anyway,
regardless of which way I choose :-(

Thanks for your help.

CU,

Udo

Chris Uppal-3

Re: Distributed OODBMS

Udo,

> However (UDP-) multicast might be a good idea. However this would
> require reliable working UDP Sockets, non-unicast IP Address support in
> Dolphin and I would have to add a kind of stream layer (basically
> imitating TCP) above that .... sound like a lot of work ... but I think
> /any/ solution I have to choose will be "a lot of work".

BTW, note that if you are only using UDP to broadcast the state of a file with
write-once semantics (such as a change log), then the file itself can provide
much of the "reliable stream" layer. Clients don't need to track the state of
the "stream", only of which blocks are missing from their copy of the file.
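
A small sketch of that bookkeeping, in Python rather than Smalltalk. It assumes the log is cut into numbered blocks starting at 0 and that a block never changes once written; LogReceiver and its methods are made-up names, and the UDP transport itself is left out.

```python
class LogReceiver:
    """Tracks which blocks of a write-once log have arrived."""

    def __init__(self):
        self.blocks = {}        # block number -> payload
        self.highest_seen = -1

    def receive(self, number, payload):
        # Write-once: a block that is already stored is never overwritten,
        # so duplicated datagrams are harmless.
        self.blocks.setdefault(number, payload)
        self.highest_seen = max(self.highest_seen, number)

    def missing(self):
        """Block numbers to ask the server to send again."""
        return [n for n in range(self.highest_seen + 1) if n not in self.blocks]

    def usable_prefix(self):
        """The part of the log that is complete and safe to apply."""
        parts = []
        n = 0
        while n in self.blocks:
            parts.append(self.blocks[n])
            n += 1
        return b"".join(parts)
```

Arrival order does not matter here; the client only needs the gap list, and it applies the log up to the first gap.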

> I recently came up with the idea of combining the rsync alghorithm with
> ditribution over bittorrent. I have both things in place (but don't ask
> for code quality :-) in Smalltalk. So the idea would be that the check
> which part of the file(s) needs to be updated will be done like rsync
> whereas the real distribution takes place using bittorent (reducing the
> bandwidth requirements for the master). But I think that's a) pretty
> heavy in terms of implementation, b) not very "Smalltalkish" and c) only
> deals with files

Another point to consider is what happens when the DB is reorganised, for
instance garbage-collecting an OmniBase repository. It may be that the
operation is (mostly) deterministic, so you could synch the clients first, then
compact each DB independently (then synch again), but it might not be...

> ... the question how to go from files to OODBs. I don't think that e.g.
> OmniBase would like to have it's file updated independently (if even
> possible of file locks are in place).

I suppose you could disconnect from the repository before applying changes. I
don't know what the performance of that would be like. It might be perfectly
acceptable.

Otherwise, personally I'd be unhappy messing with an operating OODB at the file
level unless I understood its operation inside and out. Which probably means
that either I wrote it myself, or at least that I can work directly with its
author. YMMV ;-)

> > A last thought. Have you looked at the "Prevalence" concept /
> > design-pattern ? [...]
> Yes. I did. As far as I understood the concept it would be enough to
> distribute the serialized objects once (or perodically) and then only
> distribute the transaction log to the clients.

I may be misunderstanding you, but I think you can do a little better than
that. Keep the objects in some local OODB which your code connects to and uses
normally (i.e. with no awareness of distribution issues). A separate process
gets the serialised updates from the server (by whatever mechanism seems
appropriate), and applies those changes atomically to the same repository.
I.e. take the Prevalence design and s/RAM/OODB/g ;-)
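
[Editor's sketch] The "Prevalence with an OODB instead of RAM" idea above can be pictured as a replica that replays serialised transactions, in order, against a local store. The following is a minimal sketch under stated assumptions: `LocalStore` is a hypothetical dict-backed stand-in for the local OODB, a transaction is assumed to be a `(sequence_number, changes)` pair, and `None` as a value is assumed to mean deletion. None of this is OmniBase's API.

```python
import copy

class LocalStore:
    """Hypothetical stand-in for a local OODB: a dict of key -> object."""

    def __init__(self):
        self.objects = {}
        self.last_applied = 0  # sequence number of the last applied transaction

    def apply(self, txn):
        """Apply one transaction (seq, [(key, value), ...]) all-or-nothing."""
        seq, changes = txn
        if seq != self.last_applied + 1:
            raise ValueError(f"expected txn {self.last_applied + 1}, got {seq}")
        # Stage the changes on a copy so a failure leaves the store untouched.
        staged = copy.deepcopy(self.objects)
        for key, value in changes:
            if value is None:
                staged.pop(key, None)  # None marks a deletion in this sketch
            else:
                staged[key] = value
        self.objects = staged
        self.last_applied = seq

def replay(store, log):
    """Apply, in sequence order, every transaction the store has not seen yet."""
    for txn in sorted(log, key=lambda t: t[0]):
        if txn[0] > store.last_applied:
            store.apply(txn)

store = LocalStore()
log = [(2, [("b", 2), ("a", None)]), (1, [("a", 1)])]
replay(store, log)
```

Because the store remembers `last_applied`, replaying the same log twice is harmless, which makes it tolerant of duplicated deliveries from the server.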

Distributing the initial state of the DB is a different problem. Bit-Torrent,
cut a DVD, whatever...

-- chris

Udo Schneider

Re: Distributed OODBMS

Chris Uppal wrote:
> BTW, note that if you are only using UDP to broadcast the state of a file with
> write-once semantics (such as a change log), then the file itself can provide
> much of the "reliable stream" layer. Clients don't need to track the state of
> the "stream", only of which blocks are missing from their copy of the file.
Good point ... I'd just have to make sure that there is some feedback
from the clients, so that a client doesn't end up with an incompletely
updated db because one packet is missing.
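
[Editor's sketch] The point about clients tracking only which blocks of a write-once file they are missing, plus the feedback mentioned in the reply, could look roughly like this. `BlockReceiver` and its methods are hypothetical names; the sketch assumes blocks are numbered from 0 and may arrive in any order or not at all.

```python
class BlockReceiver:
    """Collects numbered blocks of an append-only file received out of order."""

    def __init__(self):
        self.blocks = {}         # block index -> bytes
        self.highest_seen = -1   # largest block index received so far

    def receive(self, index, data):
        self.blocks[index] = data
        self.highest_seen = max(self.highest_seen, index)

    def missing(self):
        """Block indices to request again from the master (the feedback channel)."""
        return [i for i in range(self.highest_seen + 1) if i not in self.blocks]

    def contiguous_prefix(self):
        """Bytes that are safe to apply: every block up to the first gap."""
        parts = []
        i = 0
        while i in self.blocks:
            parts.append(self.blocks[i])
            i += 1
        return b"".join(parts)

receiver = BlockReceiver()
receiver.receive(0, b"aa")
receiver.receive(2, b"cc")
receiver.receive(3, b"dd")
gaps = receiver.missing()
```

A client would apply only `contiguous_prefix()` to its db and report `missing()` back, so one lost packet delays the update rather than corrupting it.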

> Another point to consider is what happens when the DB is reorganised, for
> instance garbage-collecting an OmniBase repository. It may be that the
> operation is (mostly) deterministic, so you could synch the clients first, then
> compact each DB independently (then synch again), but it might not be...
I'd prefer to compact the db on the master only ... and then copy over
to the client.

> I suppose you could disconnect from the repository before applying changes. I
> don't know what the performance of that would be like. It might be perfectly
> acceptable.
This might be perfectly acceptable indeed ... the only thing would be
that the clients need to determine a safe time to close the db and
reopen the new db.

> Otherwise, personally I'd be unhappy messing with an operating OODB at the file
> level unless I understood its operation inside and out. Which probably means
> that either I wrote it myself, or at least that I can work directly with its
> author. YMMV ;-)
That's the good thing about OmniBase .... "read the source Luke!" :-)

> I may be misunderstanding you, but I think you can do a little better than
> that. Keep the objects in some local OODB which your code connects to and uses
> normally (i.e. with no awareness of distribution issues). A separate process
> gets the serialised updates from the server (by whatever mechanism seems
> appropriate), and applies those changes atomically to the same repository.
> I.e. take the Prevalence design and s/RAM/OODB/g ;-)
This might be a way to go ... (a good one to be honest!) ... I'm just
not sure whether it's so easy, as you have to keep the object path to
the root object. E.g.

Root
Key1 -> Object1 (method #object2 returns another object).

If the master now updates object2 and marks it as being dirty
(#markDirty) I'll have to serialize object2 and keep track of its
access path from the root ... some kind of generic #parentObject(s)
would be nice in this case :-)
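
[Editor's sketch] One way to get the effect of a generic "parent object" lookup is to walk the graph once from the root and record, for every reachable object, which parent and which key or accessor led to it. The `Node` class and both functions below are hypothetical illustrations, not part of OmniBase, and the sketch records only the first path found to each object.

```python
class Node:
    """Hypothetical persistent object with labelled references to other objects."""

    def __init__(self):
        self.children = {}  # label (key or accessor name) -> Node

def build_parent_index(root):
    """Walk the graph from root; map id(child) -> (parent, label)."""
    parents = {}
    seen = {id(root)}
    stack = [root]
    while stack:
        node = stack.pop()
        for label, child in node.children.items():
            if id(child) not in seen:  # guards against cycles and shared objects
                seen.add(id(child))
                parents[id(child)] = (node, label)
                stack.append(child)
    return parents

def access_path(parents, target):
    """Labels leading from the root to target; empty for the root itself."""
    path = []
    node = target
    while id(node) in parents:
        node, label = parents[id(node)]
        path.append(label)
    return list(reversed(path))

root, object1, object2 = Node(), Node(), Node()
root.children["Key1"] = object1
object1.children["object2"] = object2
index = build_parent_index(root)
path = access_path(index, object2)
```

The resulting path could be shipped alongside the serialised object so that a client knows where to attach the update.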

> Distributing the initial state of the DB is a different problem. Bit-Torrent,
> cut a DVD, whatever...
For the initial state I decided to create an online backup from the
OmniBase file and distribute it (see above). One thing I have to
investigate is whether it's possible to do this directly over the
network. I think by overriding/extending some methods in ODBFileStream
to send this over the network I should be able to replicate the db pretty
easily ... this might as well be a way to go to address the complete issue
if I had ...

Thanks for your help.

CU,

Udo

Bruno Brasesco

Re: Distributed OODBMS

OmniBase

With OmniBase you can use:
ODBObjectManager>>containerNew:path:

to store different containers across a network.

OmniBase>>containerNew: aString path: path
    ^objectManager containerNew: aString path: path

anOmniBase containerNew: 'Bank Instances' path: '\\MyHost\c:\omnibase\'.

If you store an ODBBTreeDictionary inside each container, you have a
distributed solution.

I tested this across 2 different OSes, Windows and Linux (using Samba).
Some instances were stored on Linux and some others on Windows.
It works perfectly.

No problems with object references between the OSes.

I do not know whether this is possible for you.
You only need a Windows network.

Regards Bruno

PS:
It is possible to store instances of the same class in different parts
of the network (by evaluating some criteria).