Design Principles Behind Smalltalk, Revisited

Re: Design Principles Behind Smalltalk, Revisited

J J-6
>From: "Paul D. Fernhout" <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: Design Principles Behind Smalltalk, Revisited
>Date: Tue, 26 Dec 2006 08:52:26 -0500
>
>Paul Graham has an essay on why languages become popular:
>   http://www.paulgraham.com/popular.html
>In the case of both C and C++, one should not discount the weight of AT&T,
>one of the largest and most widespread and visible companies of the time
>(as it ran a telephone monopoly).

Well, I realize this had something to do with it as well.  But I still think
the "have something now" factor played the biggest role.  And even with the
conversion from COBOL, C++ was the most widely used language for a while.
If you go with Smalltalk then you have to train your existing folks, but if
you go with C++ or Java you can leverage that huge base of programmers.

>Well, it is also true one big issue is that an Algol-like syntax with
>operator precedence (times over plus) is taught in K-12 school. That is a
>big advantage for a computer language to build on that, even as that
>precedence is arbitrary and Smalltalk is more consistent.

I actually find the operator precedence irrelevant.  Personally I always
ignored it and wrote the expression to read left-to-right how I wanted it
evaluated.  And I did this in school as well, when I was learning it.  I was
accustomed to the left-right orientation, so why learn a new one that applies
in just one area?
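
To make the precedence point concrete, here is a minimal Python sketch. It contrasts the school-style precedence (times over plus) with a strict left-to-right reading, which is how Smalltalk treats a chain of binary messages. The helper name eval_left_to_right and the token-list format are made up for this illustration.

```python
import operator

# Map operator symbols to functions; only four are needed for the sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_left_to_right(tokens):
    """Evaluate [number, op, number, op, ...] strictly left to right,
    ignoring conventional precedence (hypothetical helper)."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, tokens[i + 1])
    return result

print(2 + 3 * 4)                                # precedence applies: 14
print(eval_left_to_right([2, "+", 3, "*", 4]))  # left to right: 20
```

With precedence rules, parentheses are needed to get the left-to-right result; without them, parentheses are needed to get the schoolbook result. Either way the writer can force the intended order.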

>And you are right on how Java seemed an easy move for C++ programmers. Of
>course, now Ruby seems an easy move for Java programmers (and much of Ruby
>is based on Smalltalk ideas), so in a matter of time, we may see Ruby
>developers making the leap to a more self-documenting and flexible syntax.
>:-)

Let's hope!

>Still, Smalltalk syntax was supposedly designed to be easy for kids to
>learn. It is not that hard to learn the syntax. I've helped people in
>business learn it. It takes at most a week to become proficient in it (and
>often just a day). What is hard is to learn all the libraries. But, with
>more and more programmers learning things like Java or Python or Ruby, all
>systems with rich libraries, Ruby's being almost exactly Smalltalk's in
>many ways, making the leap to a new syntax would be a minor investment (and
>one worth taking because Smalltalk syntax is more extensible and
>self-documenting than any of those other languages').

You are right, the libraries are a challenge to learn.  But this is so in
any language that has any code for it at all.  For example, nearly any
Haskell tutorial you look at mentions the countless times the author has
rewritten something that was in the standard (never mind external)
libraries.

>So, why not people moving to Smalltalk (Squeak especially)?  People in
>Python or Perl or PHP or Ruby camps are not bemoaning "backward
>compatibility" as the reason for limited success and adoption.

But keep in mind, Python is in the same boat as Java.  I.e. if you know C++,
it's not much of a jump to learn.  And Perl is basically just bash/ksh++.
It is a language by and for sys admins (of which there were/are a great
many).  So if you had already been fighting with *sh all this time, Perl
wasn't such a departure.

>While everything you say is true, it is not true enough IMHO to be the main
>reason. What are the others and how can they be addressed to produce a
>popular free Smalltalk?

Honestly, I think we are just in the age of "the killer app".  You have to
have something everyone needs and no other language provides, to draw people
to you.  After that, the other things start to matter.

The things I see that squeak needs to address are: database access (I hear
this is being worked on by Alan Knight), advanced JIT-ish technology (I'm
expecting huge things from Exupery in this area), better thread handling and
the ability to take advantage of native threads (don't know the status of
this), modularity (I think Ralph Johnson will bring us a long way with this
one).

>Java is different, as you point out. However, Java is so different, and
>received so much attention, and incorporated so many Smalltalk-pioneered
>ideas in the JVM design and class libraries (Swing) that ten years after it
>has been introduced, it finally mostly works right as a self-contained
>environment. Not quite VisualWorks, but darn close in many ways by now, and
>it is free as in beer and is becoming free as in freedom (GPL). :-)

I think going with the GPL was actually a very bad move.  As far as I know,
the only people who are going to care about that will be people on the
fringe (who are, of course, irrelevant).  They could have gone with BSD or
MIT instead.

>But for both Java and Python, being able to be easily edited in vi (or
>emacs) or being able to use a conventional text oriented version control
>system were indeed big wins, as they reduced the learning curve and initial
>commitment to new ideas. Being able to use the familiar file manager to
>look at code was also of value. And going beyond vi, the fact that Java
>IDEs started to look like C++ IDEs was another big win on familiarity. And
>seeing each class in a separate file in the good old reliable file system
>was also comforting -- at least you knew where your source is, and could
>use grep or other tools to search and manipulate it and back it up in a
>familiar fashion.

Well these are good points, but I can tell you, after being in smalltalk for
a while and then doing some python at work I had a feeling of panic.  I
realized I was going to have to *do something* if I wanted revision control.
  In squeak it just happens and I never think about it unless I need to
revert.

The grep and all the things you mentioned are a matter of training, but yes,
a project to narrow the gap until people get off the crutches probably
wouldn't hurt.

>Extensive tested and debugged libraries on a variety of topics.

Well, first I would call this "work", not innovation. :)  If smalltalk had
the number of bodies working on it that Java has had we would have solved
cancer by now.

And this isn't only a smalltalk compatibility issue.  *No one* that isn't on
the JVM can use these.

>>Typically?  It is harder in every case, no matter how badly designed the
>>programming language.
>
>Well, Spanish to Portuguese might be easier than COBOL to OCaml? But COBOL
>to OCaml is hard for different reasons than syntax. :-)

Actually there is a story somewhere of a student who learned fluent Spanish
and then took a trip to Brazil.  He figured it was close enough, and it was.
  He adjusted very quickly.  But he has a terrible time speaking Spanish now
since the languages *are* so close.  Human language is just incredibly
complicated.  You really do, as you mentioned, have to absorb some of the
culture as well to get fluent.

>My point here wasn't that Squeak should change; it was just an example of
>how being different and staying entirely in Smalltalk might not have been a
>big win, compared to just having a VM written in, say, C. There remains the
>"conceptual" barrier of the VM domain, even as the "technical" one of
>syntax is removed.

I know what you mean.  I just think in this case it doesn't really apply.  
If the VM had been written in C then the people who worked on it probably
wouldn't have.  And that would have just resulted in no squeak at all.

>Translating primitives into C or Java, like for sound manipulation, seems
>like a bigger win. But even then, you have to be writing that code (or
>rewriting that code) in such a non-Smalltalk way semantically that it is
>still not clear to me if there is a lot of value in it. Especially when the
>alternative might be to just call an existing sound synthesis library
>written in Java or C. We now have Java for a good cross-platform language
>with equivalent to C++ performance, so it would have been a harder choice
>ten years previously  as to what cross-platform language to use if not C
>with all its quirks (Free Pascal?).

Well I think projects like Exupery will be important here.  If it works out,
then smalltalk code will be able to compete with C or Java in many cases.  
And Java is no better as a cross-platform language than Smalltalk is.  They
are both running on VMs.

Now if we ignore Java the language and consider the VM itself as the
computer that our languages compile to then ok.  But I don't think Java has
the best VM.  The problem is, the Java VM is made for a static language and
dynamic languages have to be bolted on.  Microsoft had the same problem
since they just wanted to basically fork Java.  But as I understand it, both
are moving toward the idea of having the VM be for dynamic languages and
build static languages on top of that.  If that is the case then something
like Strongtalk is already ahead of the game.

>But here again is an issue of culture. Who cares if Sun is "behind"; or if
>Squeak runs 30% slower without some extra dynamic dispatch opcode in the
>JVM? Speed is not Squeak's main problem. Being able to leverage Sun's JVM
>and the fact that you can call AWT classes in the same way for any platform
>Java runs on is a big win for Squeak IMHO, as it would reduce the
>maintenance burden of it in terms of complexity of the common code base,
>and would also make it easy to install one common package for any platform
>Java runs on. Ten years ago, or even five, I myself would have laughed at
>the value of this idea (as Java was so buggy and unstable and slow). But
>most of the bugs have been fixed, the 1.5 JVM shares memory across JVMs and
>does dynamic translation for speed, so Java finally, now that it is going
>free under the GPL, has the potential to be a great cross-platform tool
>where you get both a common base GUI window system as well as the ability
>to deliver fast primitives written in Java, as well as access to a lot of
>libraries someone else has already written and debugged for you.
>
>The Squeak community could admit that it would be a big win to leverage
>that "pink plane" success, even if it is "behind" and decide to move
>forward on top of it, but in other "blue plane" directions. Or it can
>continue to spend a lot of time dealing with time consuming basic issues
>relating to packaging and testing C code for lots of platforms (which
>essentially just duplicates the work the Java community is doing, but not
>as well because of more limited people power).

Well those are all good points.

>dot net is a non-starter because it is proprietary (and may be covered by
>patents). And I would not make this suggestion without basing it on Sun's
>move to the GPL for Java. There are several JVM Smalltalk already of
>course.

Don't make the mistake of assuming the world is how we really *want* it to
be.  dot.net is getting more popular all the time and may end up beating
Java in the end.  Linux has been GPL from the start but Windows is still the
king of the desktop and growing in the server realm.

And I would be hesitant *because* of Sun using the GPL.  The license has the
reputation for being viral (even if it isn't anymore) and therefore many
companies avoid it.  For example I work for a very large company who will
only allow GPL code to be used in isolation (i.e. as a stand alone program),
never something to build on top of for fear of having to give away trade
secrets.

A license that says you *must* make source code available isn't any more
free than what Microsoft provides.  It is just restricted in a different
way.

>   http://www.robert-tolksdorf.de/vmlanguages.html
>But none have the power of Squeak. And, building on Squeak's strengths, it
>could be an opportune time to also shake off licensing problems, say by
>carefully comparing with and using GNU Smalltalk code when possible, or by
>using an approach like Bistro to leverage Java libraries temporarily until
>replacement versions in Smalltalk could be written in a true "clean room"
>fashion.

Well gcc can optionally output XML instead of assembly code.  I wonder about
using something like this to convert C projects to smalltalk directly.  And
this may work for all the gcc compilers (e.g. Java).

>Why not have Squeak in that role too? But the deeper question is, why is it
>not there already, and why has, say, Talks2 not gotten more effort behind
>it?

It is a matter of people time.  Everyone always asks "why hasn't <the
project I'm interested in> gotten more effort behind it?".  There is no free
effort left to get behind it.  And these questions won't inspire like a
Braveheart speech before a battle.  All I can suggest is: if you believe in
it, get behind it.  Hopefully others will follow, but don't expect it.  I am
personally looking at ways to pay to get work done in squeak I want to see
done but don't have the time to do.  Maybe with rentacoder.com or something.

>And I think that issue has more to do with community issues and licensing
>issues than technology issues. (I myself would build on Talks2, right now
>except it is stuck in the same licensing ambiguity Squeak is; I'm hoping
>when Squeak gets that cleared up for itself, that Talks2 might follow).

I did a quick look at the talks2 page and it said you are granted all the
rights to anything you write on it, to sell or not sell.  What is ambiguous
about that?  It sounds as free as it gets to me.

Thanks,
J

Re: Design Principles Behind Smalltalk, Revisited

J J-6
In reply to this post by Paul D. Fernhout
>From: "Paul D. Fernhout" <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: Design Principles Behind Smalltalk, Revisited
>Date: Tue, 26 Dec 2006 21:39:30 -0500
>
>Perhaps the biggest single issue is, how do we have a community around new
>things inspired by Squeak? I brought this issue up many years ago, but was
>basically shot down in flames of people pushing "Squeak the artifact" not
>"Squeak the community". Still, it seems like it is community which makes
>the value in the free and open source world. Yet the Squeak community seems
>closely tied to Smalltalk-80 and Squeak-as-it-is, in part as a
>self-selecting process -- yet ironically as Alan Kay himself keeps saying
>he wants something better.

Well if there is a crowd that wants to keep Squeak tied to the blue book,
then a fork has to happen.  I don't think it is the case though.  And change
for change's sake isn't good either.  Smalltalk-80 had a lot of great ideas,
so care needs to be taken when breaking from the blue-book to ensure we are
going forward not backward.  Making some change because "well Java works
that way" would be a very bad idea.  On the other hand, traits was a
departure, but I think (so far) a good one.

Re: Design Principles Behind Smalltalk, Revisited

J J-6
In reply to this post by Jimmie Houchin-3
>From: Jimmie Houchin <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: Design Principles Behind Smalltalk, Revisited
>Date: Thu, 28 Dec 2006 11:36:02 -0600
>
>I've tended to use Python more functionally than OO. So Lua fits me better
>in that regard.

Have you looked at Haskell?  It is purely functional and amazingly
expressive.  Behind smalltalk, it is probably my second favorite at this
point.

Re: Design Principles Behind Smalltalk, Revisited

Jimmie Houchin-3
J J wrote:
>>
>> I've tended to use Python more functionally than OO. So Lua fits me
>> better in that regard.
>
> Have you looked at Haskell?  It is purely functional and amazingly
> expressive.  Behind smalltalk, it is probably my second favorite at this
> point.

Yes I have, but not in a while. I do need to revisit it. It looked
interesting. I asked a few questions on the mailing list. And for the
project I am currently working on it didn't seem to be the most
practical tool at that time.

I am doing lots of text processing. A few million objects and several
gigabytes of text. Constant daily text retrieval and processing.

But I will tell you this much. In this thread you flipped my world
upside down. :)

I've been spending time thinking about how I wanted to manage all my
data. Now, I'm not a professional programmer and have no explicit training.

I've avoided RDBMS because I read a lot about the Object Relational
mismatch in Squeak, Ruby, Python, etc. mailing lists. So how do I store
my millions of objects, and search and access them? I could easily store
them in files and search via Swish-e. But managing millions of files in
the file system is a kludge. Ugh. So I've been thinking that I'm working
harder on a kludge than it would be to learn SQL and use PostgreSQL.

And then you write:
"""But this observation is the reason OO databases haven't really taken
off:  An OO database will tend to model things how *your* application
wants to see them.  A traditional relational DBA will model things in
the most generic way he can so that *all* the applications can build the
view they need easily.  Relational DBA's tend to be of the view point:
The data will exist for the life of the company, while the applications
that access it come and go like the tide.  And one only needs to look at
the huge Java rewrites going on to know they are right."""

This stood out for me:
"""Relational DBA's tend to be of the view point: The data will exist
for the life of the company, while the applications that access it come
and go like the tide."""

I've been chewing on that. And it just rang true to me. Wow!!!

And I thought about my entire computing experience. I have all kinds of
data and documents that I've changed the application accessing them
many, many times. But the data format is paramount. And as I thought
about my projects. Still true.

So with that nudge from you, I sit at my desk right now reading one of
my several SQL books. Thanks. :)

I know for smaller datasets options increase. But I'm feeling good about
an RDB for this one. Now that I've had a little tweak to my thinking. :)

Thanks again.

Jimmie


Re: Design Principles Behind Smalltalk, Revisited

tblanchard

On Dec 30, 2006, at 9:09 AM, Jimmie Houchin wrote:

> I've avoided RDBMS because I read a lot about the Object Relational mismatch in Squeak, Ruby, Python, etc. mailing lists. So how do I store my millions of objects, search and access them. I could easily store them in files and search via Swish-e. But managing millions of files in the file system is a kludge. Ugh. So I've been thinking that I'm working harder on a kludge than it would be to learn SQL and use PostgreSQL.
>
> So with that nudge from you, I sit at my desk right now reading one of my several SQL books. Thanks. :)

Put down the book.  

You want to load up GLORP.  It rocks.  It is as easy to work with as an OODB, but much more flexible and is backed by PostgreSQL.  Grab a copy of squeak, load the postgres client, then load up glorp.  You're golden.

Except you need a meta model.  You can write one in glorp, or you can build one with a GUI like Apple's EOModeler, which is free with WebObjects.  The EOGlorp package will let GLORP work off of your EOModel files.  Once you have your meta model, glorp is just like working with objects.  You write queries in Smalltalk like

aDatabase readOneOf: User where: [:user | user login = 'jhouchin'].

-Todd Blanchard


Re: Design Principles Behind Smalltalk, Revisited

J J-6
>From: Todd Blanchard <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: Design Principles Behind Smalltalk, Revisited
>Date: Sun, 31 Dec 2006 21:26:22 -0800
>
>
>On Dec 30, 2006, at 9:09 AM, Jimmie Houchin wrote:
>
>>I've avoided RDBMS because I read a lot about the Object Relational  
>>mismatch in Squeak, Ruby, Python, etc. mailing lists. So how do I  store
>>my millions of objects, search and access them. I could  easily store them
>>in files and search via Swish-e. But managing  millions of files in the
>>file system is a kludge. Ugh. So I've been  thinking that I'm working harder
>>on a kludge than it would be to  learn SQL and use PostgreSQL.
>
>>So with that nudge from you, I sit at my desk right now reading one  of my
>>several SQL books. Thanks. :)
>
>Put down the book.
>
>You want to load up GLORP.  It rocks.  It is as easy to work with as  an
>OODB, but much more flexible and is backed by PostgreSQL.  Grab a  copy of
>squeak, load the postgres client, then load up glorp.  You're  golden.
>
>Except you need a meta model.  You can write one in glorp, you can  build
>one with a GUI like Apple's EOModeler - free with WebObjects.   The EOGlorp
>package will let GLORP work off of your EOModel files.   Once you have your
>meta model, glorp is just like working with  objects.  You write queries in
>Smalltalk like
>
>aDatabase readOneOf: User where: [:user | user login = 'jhouchin'].
>
>-Todd Blanchard

Tools like GLORP are very nice: they save you writing SQL directly.  But
look at your line of code:  it is SQL in message form.

I wasn't talking about using embedded SQL in code.  I was talking about the
back end data store.  IMO the data is often best modeled relationally.  Then
you can set up any views you want and then use something like GLORP to
access it.
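
As a rough illustration of that layering (relational tables underneath, a view on top, application code reading the view), here is a small sketch using Python's standard sqlite3 module rather than PostgreSQL and GLORP. The table, view, and function names are invented for the example; a mapper like GLORP would sit where the titles_for helper does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Generic relational model, not shaped for any one application.
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        login TEXT UNIQUE,
        name  TEXT
    );
    CREATE TABLE documents (
        id       INTEGER PRIMARY KEY,
        owner_id INTEGER REFERENCES users(id),
        title    TEXT
    );
    INSERT INTO users VALUES (1, 'jhouchin', 'Jimmie'), (2, 'todd', 'Todd');
    INSERT INTO documents VALUES (1, 1, 'notes'), (2, 1, 'drafts'), (3, 2, 'blog');

    -- A view that presents the data the way one application wants it.
    CREATE VIEW documents_by_login AS
        SELECT u.login AS login, d.title AS title
        FROM documents d JOIN users u ON u.id = d.owner_id;
""")

def titles_for(login):
    """Application-side access goes through the view (hypothetical helper)."""
    rows = conn.execute(
        "SELECT title FROM documents_by_login WHERE login = ? ORDER BY title",
        (login,),
    )
    return [title for (title,) in rows]

print(titles_for("jhouchin"))  # ['drafts', 'notes']
```

A second application could define its own view over the same two tables without either one changing the stored data.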

Re: Design Principles Behind Smalltalk, Revisited

tblanchard

On Jan 1, 2007, at 1:23 AM, J J wrote:

>> From: Todd Blanchard <[hidden email]>
>>
>> aDatabase readOneOf: User where: [:user | user login = 'jhouchin'].
>>
>> -Todd Blanchard
>
> Tools like GLORP are very nice: they save you writing SQL  
> directly.  But look at your line of code:  it is SQL in message form.

You know, that is a good point.  I think it would be easy to emulate  
the squeak collections operations though.
ie

database users detect: [:ea | ea login='todd'] -> readOneOf: User  
where:...
database users select: [] -> readManyOf:...
reject:....

Basically treating each entity as a collection.  That might be worth  
doing and should be pretty easy.
Good idea.
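
A minimal sketch of that idea, written in Python with sqlite3 so it can be run as-is. It is not GLORP's or Magma's implementation: the class names (Field, Row, EntityCollection) are hypothetical, and only equality tests are supported. The predicate is called once with a placeholder object, so the comparison inside it is recorded as a WHERE clause instead of being evaluated row by row.

```python
import sqlite3

class Field:
    """Placeholder for a column; '==' records SQL instead of comparing."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        # Return a (clause, parameters) pair rather than a boolean.
        return (f"{self.name} = ?", [value])

class Row:
    """Placeholder passed to the predicate; any attribute names a column."""
    def __getattr__(self, name):
        return Field(name)

class EntityCollection:
    """Collection-style protocol over one table (illustrative only)."""
    def __init__(self, conn, table):
        self.conn = conn
        self.table = table

    def _query(self, predicate, negate=False):
        clause, params = predicate(Row())
        if negate:
            clause = f"NOT ({clause})"
        sql = f"SELECT * FROM {self.table} WHERE {clause}"
        return self.conn.execute(sql, params).fetchall()

    def select(self, predicate):
        return self._query(predicate)

    def reject(self, predicate):
        return self._query(predicate, negate=True)

    def detect(self, predicate):
        rows = self._query(predicate)
        return rows[0] if rows else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (login TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("todd", "Todd"), ("jhouchin", "Jimmie")],
)
users = EntityCollection(conn, "users")

print(users.detect(lambda ea: ea.login == "todd"))  # ('todd', 'Todd')
print(users.reject(lambda ea: ea.login == "todd"))  # [('jhouchin', 'Jimmie')]
```

The filtering happens in the database, so an index on the column can be used even though the calling code reads like ordinary collection iteration.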

-Todd Blanchard


Re: Design Principles Behind Smalltalk, Revisited

Göran Krampe
Hi!

Todd Blanchard <[hidden email]> wrote:

>
> On Jan 1, 2007, at 1:23 AM, J J wrote:
>
> >> From: Todd Blanchard <[hidden email]>
> >>
> >> aDatabase readOneOf: User where: [:user | user login = 'jhouchin'].
> >>
> >> -Todd Blanchard
> >
> > Tools like GLORP are very nice: they save you writing SQL  
> > directly.  But look at your line of code:  it is SQL in message form.
>
> You know, that is a good point.  I think it would be easy to emulate  
> the squeak collections operations though.
> ie
>
> database users detect: [:ea | ea login='todd'] -> readOneOf: User  
> where:...
> database users select: [] -> readManyOf:...
> reject:....
>
> Basically treating each entity as a collection.  That might be worth  
> doing and should be pretty easy.
> Good idea.

Magma uses a similar trick in its query capabilities (in order to make
seemingly iterative block-code actually generate a query with
optimization and index-support). See here:
        http://wiki.squeak.org/squeak/5859

(search for #where: down that page)

I don't have time to post in this thread right now, so let me just mention
that I disagree with JJ :) regarding the arguments for using an RDB
instead of an ODB. There are of course arguments in both directions -
depending on context - but IMHO the lifecycle-argument is not as clear
cut as described.

regards, Göran

Re: Design Principles Behind Smalltalk, Revisited

J J-6
>From: [hidden email]
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: Design Principles Behind Smalltalk, Revisited
>Date: Tue, 2 Jan 2007 10:56:23 +0200
>
>I don't have time to post in this thread right now, so let me just mention
>that I disagree with JJ :) regarding the arguments for using an RDB
>instead of an ODB. There are of course arguments in both directions -
>depending on context - but IMHO the lifecycle-argument is not as clear
>cut as described.

Well, let me clarify my position a little.  I don't feel that ODB's are
useless or anything.  Things you see in the Rails demos should probably
have been in an ODB (or even just objects, as Ramon's "blog in 15 minutes"
showed).  I simply believe in the right tool for the right job, and you
can't beat an RDB in its domain.

It depends on what you are doing.  Sometimes in a powerful language like
smalltalk you just keep your data in objects and let image persistence
handle it.  Sometimes you want a little more so you write the data out to
files.  Sometimes you want to go even further, and this is when an ODB can
be a great solution.

But at the enterprise level (i.e. lots of different programs over a large
organization) I still see RDBMS as the winner.  And the reason I see it this
way is simply: SQL/RDB can be seen as a DSL system for dealing with set
data.  There is a tremendous amount of power built into it for this
particular domain that would be difficult to make more concise in another
way.  I suppose it is just a question of how comfortable one is with SQL.
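
One small, hedged illustration of SQL as a language for set data, using Python's sqlite3 module. The tables and customer names are invented; the point is only that set difference and intersection are single declarative statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ordered  (customer TEXT);
    CREATE TABLE invoiced (customer TEXT);
    INSERT INTO ordered  VALUES ('ann'), ('bob'), ('cy');
    INSERT INTO invoiced VALUES ('bob'), ('dee');
""")

def customers(sql):
    """Run a one-column query and return the names sorted (hypothetical helper)."""
    return sorted(name for (name,) in conn.execute(sql))

# Set difference: customers with an order but no invoice.
not_yet_invoiced = customers(
    "SELECT customer FROM ordered EXCEPT SELECT customer FROM invoiced"
)

# Set intersection: customers with both an order and an invoice.
both = customers(
    "SELECT customer FROM ordered INTERSECT SELECT customer FROM invoiced"
)

print(not_yet_invoiced)  # ['ann', 'cy']
print(both)              # ['bob']
```

The same results are easy to get with in-memory sets in most languages; the argument in the post is about doing this across large, shared data with one common notation.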

Re: Design Principles Behind Smalltalk, Revisited

bernard_notarianni
J J-6 wrote:
> But at the enterprise level (i.e. lots of different programs over a large
> organization) I still see RDBMS as the winner.  And the reason I see it this
> way is simply: SQL/RDB can be seen as a DSL system for dealing with set
> data.  There is a tremendous amount of power built into it for this
> particular domain that would be difficult to make more concise in another
> way.  I suppose it is just a question of how comfortable one is with SQL.

I have never had the opportunity to work on an object-oriented database (such as GemStone) used for integration between multiple applications, the way an RDB would be. I suppose it could offer a very efficient, simple, and powerful solution for integration.

Does anyone have feedback?

relational for what? [was: Design Principles Behind Smalltalk, Revisited]

Howard Stearns
In reply to this post by J J-6
J J wrote:
> ... I simply believe in the right tool for the right job,
> and you can't beat an RDB in its domain. ...

That's something I've never really understood: what is the domain in
which Relational Databases excel?

- Data too large to fit in memory? Well, most uses today may have been
too large to fit in memory 20 years ago, but aren't today. And even for
really big data sets today, networks are much faster than disk drives,
so a distributed database (e.g., a DHT) will be faster.   Sanity check:
Do you think Google uses an RDB for storing indexes and a cache of the WWW?

- Transactional processing with rollback, three-phase commit, etc?
Again, these don't appear to actually be used by the application servers
that get connected to the databases today. And if they were, would this
be a property of relational databases per se? Finally, in a world with
great distributed computing power, is centralized transaction processing
really a superior model?

- Set processing? I'm not sure what you mean by set data, JJ. I've seen
set theory taught in a procedural style, a functional style, and in an
object oriented style, but outside of ERP system training classes, I've
never seen it taught in a relational style. I'm not even sure what that
means. (Tables with other than one key, ...) That's not a proof that
relational is worse, but it does suggest to me that the premise is worth
questioning.

- Working with other applications that are designed to use RDB's? Maybe,
but that's a tautology, no?

I'm under the impression (could be wrong) that RDBMS were created to
solve a particular problem that may or may not have been true at the
time, but which is no longer the situation today. And what are called
RDBMS no longer actually conform to the original problem/solution space
anyway.

Regards,
-Howard

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

tblanchard
Funny, I just blogged about this.

http://www.blackbagops.net/?p=93

On Jan 2, 2007, at 6:18 AM, Howard Stearns wrote:

> J J wrote:
>>> ... I simply believe in the right tool for the right job,
>> and you can't beat an RDB in its domain. ...
>
> That's something I've never really understood: what is the domain  
> in which Relational Databases excel?
>
> - Data too large to fit in memory? Well, most uses today may have  
> been too large to fit in memory 20 years ago, but aren't today. And  
> even for really big data sets today, networks are much faster than  
> disk drives, so a distributed database (e.g., a DHT) will be  
> faster.   Sanity check: Do you think Google uses an RDB for storing  
> indexes and a cache of the WWW?
>
> - Transactional processing with rollback, three-phase commit, etc?  
> Again, these don't appear to actually be used by the application  
> servers that get connected to the databases today. And if they  
> were, would this be a property of relational databases per se?  
> Finally, in a world with great distributed computing power, is
> centralized transaction processing really a superior model?
>
> - Set processing? I'm not sure what you mean by set data, JJ. I've  
> seen set theory taught in a procedural style, a functional style,  
> and in an object oriented style, but outside of ERP system training  
> classes, I've never seen it taught in a relational style. I'm not  
> even sure what that means. (Tables with other than one key, ...)  
> That's not a proof that relational is worse, but it does suggest to  
> me that the premise is worth questioning.
>
> - Working with other applications that are designed to use RDB's?  
> Maybe, but that's a tautology, no?
>
> I'm under the impression (could be wrong) that RDBMS were created  
> to solve a particular problem that may or may not have been true at  
> the time, but which is no longer the situation today. And what are  
> called RDBMS no longer actually conform to the original problem/
> solution space anyway.
>
> Regards,
> -Howard
>



Re: Design Principles Behind Smalltalk, Revisited

Göran Krampe
In reply to this post by Göran Krampe
[hidden email] wrote:

> Hi!
>
> Todd Blanchard <[hidden email]> wrote:
> > Basically treating each entity as a collection.  That might be worth  
> > doing and should be pretty easy.
> > Good idea.
>
> Magma uses a similar trick in its query capabilities (in order to make
> seemingly iterative block-code actually generate a query with
> optimization and index-support). See here:
> http://wiki.squeak.org/squeak/5859

And oh, I forgot the first time - Avi did ROE which is also very
interesting:
        http://map.squeak.org/packagebyname/roe
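
The general trick of letting ordinary-looking predicate code build a query, instead of running it against every object, can be sketched roughly as follows. This is not how Magma or ROE is implemented; it is a minimal Python illustration in which every name (`Row`, `Field`, `Clause`, `where`) is made up for the example. The predicate is called once with a recording stand-in object, and the comparisons it performs are captured as a parameterized query.

```python
class Expr:
    """Base for captured query fragments; '&' combines two fragments."""
    def __and__(self, other):
        return And(self, other)


class Clause(Expr):
    """One captured comparison, e.g. age > 30."""
    def __init__(self, field, op, value):
        self.field, self.op, self.value = field, op, value

    def to_sql(self):
        return f"{self.field} {self.op} ?", [self.value]


class And(Expr):
    """Conjunction of two captured fragments."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def to_sql(self):
        lsql, lparams = self.left.to_sql()
        rsql, rparams = self.right.to_sql()
        return f"({lsql}) AND ({rsql})", lparams + rparams


class Field:
    """Stand-in for an attribute; comparisons record a Clause instead of a bool."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Clause(self.name, "=", value)

    def __gt__(self, value):
        return Clause(self.name, ">", value)

    def __lt__(self, value):
        return Clause(self.name, "<", value)


class Row:
    """Recording stand-in for a stored object: any attribute becomes a Field."""
    def __getattr__(self, name):
        return Field(name)


def where(table, predicate):
    """Run the predicate once against a Row and return (sql, params)."""
    sql, params = predicate(Row()).to_sql()
    return f"SELECT * FROM {table} WHERE {sql}", params
```

For example, `where("people", lambda p: (p.age > 30) & (p.city == "Oslo"))` yields a single parameterized statement that a query engine could then optimize and answer from its indexes.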

regards, Göran


RE: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

James Foster-4
In reply to this post by tblanchard
> -----Original Message-----
> From: [hidden email] [mailto:squeak-dev-
> [hidden email]] On Behalf Of Todd Blanchard
> Sent: Tuesday, January 02, 2007 7:17 AM
> To: The general-purpose Squeak developers list
> Subject: Re: relational for what? [was: Design Principles Behind
> Smalltalk,Revisited]
>
> Funny, I just blogged about this.
>
> http://www.blackbagops.net/?p=93
>

Todd,

As a long-time user of Object Databases (and now a vendor's employee) I
found your comments interesting. A few responses with regard to
GemStone/Smalltalk:
- Object accesses do not need to be done within a transaction.
- Locking is at the object (not page) level (and can be optimistic or
pessimistic).
- Classes are available that allow things like a queue to have multiple
producers and a single consumer without conflicts.
- A number of classes have automatic retry built in.
- GemBuilder provides client-side in-memory user-level caching.
- Schema migration is very flexible. You can have multiple versions of a
class, each of which may have live instances. The database does not have to
be off-line to update the schema.
- Most image-level bug fixes can be applied with the system in production.
- Users can be assigned to groups and objects can be assigned security based
on owner/group/world.
- Garbage collection is quite sophisticated and is not adversely impacted by
schema changes.

James



RE: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

J J-6
In reply to this post by Howard Stearns
>From: Howard Stearns <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: relational for what? [was: Design Principles Behind Smalltalk,
>Revisited]
>Date: Tue, 02 Jan 2007 08:18:24 -0600
>
>J J wrote:
>>>... I simply believe in the right tool for the right job,
>>and you can't beat an RDB in its domain. ...
>
>That's something I've never really understood: what is the domain in which
>Relational Databases excel?

Handling large amounts of enterprise data.  If you have never worked in a
large company, you probably won't appreciate this.  But in a large company
you have a *lot* of data, and different applications want to see different
parts of it.  In an RDBMS this is no problem: you normalize the data and
take one of a few strategies to supply it to the different consumers (e.g.
views, stored procedures, etc.).

>- Data too large to fit in memory? Well, most uses today may have been too
>large to fit in memory 20 years ago, but aren't today. And even for really
>big data sets today, networks are much faster than disk drives, so a
>distributed database (e.g., a DHT) will be faster.   Sanity check: Do you
>think Google uses an RDB for storing indexes and a cache of the WWW?

Are you serious with this (data too large to fit into memory)?  And if you
use a good RDBMS then you don't have to worry about disk speed or
distribution.  The DBAs can watch how the database is being used and tune
this (e.g. partition the data and move it to another CPU, etc.).

Oh, but you found one example where someone with a lot of data didn't use an
RDB.  I guess we can throw the whole technology sector in the trash.  Sanity
check:  Google is trying to keep a current snapshot of all websites and run
it on commodity hardware.  You could do exactly the same thing with far
fewer CPUs using a highly tuned, distributed RDBMS.  They chose to hand-tune
code instead of an RDBMS.

>- Transactional processing with rollback, three-phase commit, etc? Again,
>these don't appear to actually be used by the application servers that get
>connected to the databases today. And if they were, would this be a
>property of relational databases per se?

What data point are you using?  Sure, little blogs and things like that
probably don't use it, and they probably are the majority of database users.
But how much wealth (i.e. money and jobs) is being generated by those
compared to larger companies?

All the applications I write at work absolutely require such functionality
and I have no intention of writing it myself.

>Finally, in a world with great distributed computing power, is centralized
>transaction processing really a superior model?

Some people seem to think so:
http://lambda-the-ultimate.org/node/463

And there is more than that.  I believe in that paper (don't have time to
verify) they mention that hardware manufacturers are also starting to take
this approach because fine-grained locking is so bad.

>- Set processing? I'm not sure what you mean by set data, JJ. I've seen set
>theory taught in a procedural style, a functional style, and in an object
>oriented style, but outside of ERP system training classes, I've never seen
>it taught in a relational style. I'm not even sure what that means. (Tables
>with other than one key, ...) That's not a proof that relational is worse,
>but it does suggest to me that the premise is worth questioning.

I thought this was the common way of expressing the data operations one does
in an RDBMS.  To give an example of the power: not too long ago I had to
write a report about the state of various systems on the network in relation
to the applications that run on them.  My first approach was simply to read
the data into objects and extract the data via coding.  But after the
requirements for the reports changed a couple of times I got sick of
hand-writing joins, unions, etc. and just downloaded a database.  It took
about 5 minutes to set up the schema and import all the data.  After that I
could quickly generate any report the requesters could dream up.  Since SQL
is effectively a DSL over relational data, my code changed from many
statements to 1 per report.
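
To make the "one statement per report" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema, table names, and sample rows are invented for illustration and are not the actual data described above; the point is only that each new report becomes one declarative query rather than a hand-written join loop.

```python
import sqlite3


def build_inventory(conn):
    """Create a tiny, made-up schema of systems and the applications on them."""
    conn.executescript("""
        CREATE TABLE systems (host TEXT PRIMARY KEY, state TEXT);
        CREATE TABLE applications (name TEXT, host TEXT REFERENCES systems(host));
    """)
    conn.executemany(
        "INSERT INTO systems VALUES (?, ?)",
        [("db01", "up"), ("web01", "down"), ("web02", "up")],
    )
    conn.executemany(
        "INSERT INTO applications VALUES (?, ?)",
        [("billing", "db01"), ("portal", "web01"), ("portal", "web02")],
    )


def apps_on_down_hosts(conn):
    """Report 1: which applications have an instance on a host that is down?"""
    return conn.execute("""
        SELECT a.name, s.host
        FROM applications AS a JOIN systems AS s ON s.host = a.host
        WHERE s.state = 'down'
        ORDER BY a.name, s.host
    """).fetchall()


def hosts_per_app(conn):
    """Report 2: how many hosts does each application run on?"""
    return conn.execute("""
        SELECT name, COUNT(*) FROM applications
        GROUP BY name ORDER BY name
    """).fetchall()
```

When the requirements change, only the query text changes; the loading code and the data stay as they are.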

>- Working with other applications that are designed to use RDB's? Maybe,
>but that's a tautology, no?

Again, one has to work in a large company to appreciate the nature of
enterprise application development.

>I'm under the impression (could be wrong) that RDBMS were created to solve
>a particular problem that may or may not have been true at the time, but
>which is no longer the situation today. And what are called RDBMS no longer
>actually conform to the original problem/solution space anyway.

I don't know what the first RDBMS was created for, but what they are today
and have been for the span of my career is certainly not a solution to a
problem no one has.

The fact is, there are two basic kinds of databases: Relational and
Hierarchical (LDAP, OODB).  Each is good at dealing with certain kinds of
data and bad at others.




RE: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

Peter Crowther-2
> From: J J
> >From: Howard Stearns <[hidden email]>
> >That's something I've never really understood: what is the domain in
> >which Relational Databases excel?
>
> Handling large amounts of enterprise data.

Handling and dynamically querying large amounts of data where the data
format is not necessarily completely stable and ad-hoc query performance
is important.  "Large" here is "much larger than main memory of the
machine(s) concerned".  I routinely handle data sets of tens of gigs on
current commodity hardware - storing the data in RAM would be somewhat
faster, but too expensive for the available capital.

The strength of relational over other forms is in being able to form
arbitrary joins *relatively* efficiently, and hence in being able to
query across data many times larger than main memory without excessive
disk traffic.

Google isn't a good counter-example, as the ad-hoc querying is missing.
The types of queries done on the Google database are very limited and
are well known in advance.

                - Peter


RE: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

J J-6
In reply to this post by J J-6
>Some people seem to think so:
>http://lambda-the-ultimate.org/node/463

Damn it, that's the wrong paper.  I don't know what this one is about, but
it might be similar.  The paper I meant was on LTU somewhere within the last
couple of weeks.  I don't know how some of you people are so fast with
links. My mind records the summary, not the title, so I can never find the
same link in any reasonable amount of time. :(




Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

Howard Stearns
In reply to this post by J J-6
Yes, I'm quite serious. I'm asking what kinds of problems RDBMS are
uniquely best at solving (or at least no worse). I'm not asking whether
they CAN be used for this problem or that.  I'm asking this from an
engineering/mathematics perspective, not a business ("we've always done
things this way" or "we like this vendor") perspective.

I'm new to the Enterprise Software world, having been mostly in either
industrial or "hard problem" software. But the 3-tier application
architecture we use for financial processing at our 26 state campuses
(University of Wisconsin) appears to me to be typical: large numbers of
individual browsers (not communicating with each other) interact through
a Web server farm with the Application Servers. The overall application is
too large as implemented to allow the load to be accommodated, so it is
divided by functional area into a farm of individual applications that
do not talk directly to each other. This partitioning isn't very
successful, because the users tend to do the same functional activities
at the same times of day, so most of the applications sit idle while a
few are at their limit. I assumed that a single database was used so
that the RDBMS could ensure data consistency between all these different
applications.  But it turns out that the Oracle database can't handle
that, so instead, each functional area gets its own database.  Most of
the work done by the system (and most of the work of programmers like
me) is to COPY data from one table to another at night when the system
is otherwise quiet. [There is this Byzantine dance in which data is
copied from one ledger to the next, with various checks against yet
another set of ledgers. The whole thing is kept in sync by offsetting
entries ("double entry") that are reconciled once a month or once a year
when the system is shut down. Amazing.] The whole thing is kludged so
that nothing ends up handling more than a few gigs of records at a time.
  [Naively, it seems like the obvious solution for this (mathematically)
is a hashing operation to keep the data evenly distributed over
in-memory systems on a LAN, plus an in-memory cache of recently used
chunks. But let's assume I'm missing something. The task here is to
figure out what I'm not seeing.]
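
The naive scheme in the bracketed aside above (hash each key to pick a machine, plus a small cache of recently used chunks) might look roughly like this in Python. It is only a sketch under stated assumptions: the `Node` objects are plain dictionaries standing in for in-memory machines on a LAN, there is no networking, replication, or failure handling, and all class and method names are hypothetical.

```python
import hashlib
from collections import OrderedDict


class Node:
    """Stands in for one in-memory machine on the LAN."""
    def __init__(self, name):
        self.name = name
        self.store = {}


class PartitionedStore:
    """Spreads keys over nodes by hash and keeps a small LRU cache in front."""
    def __init__(self, nodes, cache_size=2):
        self.nodes = nodes
        self.cache_size = cache_size
        self.cache = OrderedDict()  # least recently used entry comes first

    def _node_for(self, key):
        # A stable hash of the key selects the node, so load spreads evenly.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def _remember(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used

    def put(self, key, value):
        self._node_for(key).store[key] = value
        self._remember(key, value)

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key]
        value = self._node_for(key).store[key]
        self._remember(key, value)
        return value
```

Plain modulo hashing moves most keys whenever a node is added or removed; a real system would more likely use consistent hashing, which is a different technique from the one shown here.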

Maybe this isn't typical, but it is the architecture that Oracle and its
PeopleSoft division push on us in their extensive training classes.
And it appears to be the architecture discussed in the higher education
IT conferences and Web sites in the U.S.

My experience with non-Enterprise Web/application software is also
limited, but installations I've encountered since -- when did Phil and
Alex's Excellent Web site come out? -- appear to also use partitioning
to keep the working sets down to a few gigs.

My friends at Ab Initio won't tell me what they do or how they do it,
but no one's claiming they use an RDBMS as Codd described it.

Anyway, either the data AS USED fits into memory or doesn't. If it does,
then what benefit is the relational math providing? If it doesn't, then
we have to ask whether the math techniques that were developed to
provide efficient random access over disks 20 years ago are still valid.
Is this still the fastest way? (Answer is no.) Is there some
circumstance in which it is the fastest? Or the safest? Or one that allows
us to do something that we could not do otherwise?

Having tools to allow a cult of specialists to break your own computing
model (the relational calculus) is not a feature, but a signal that
something is wrong.

I tried briefly to combine JJ's answer with Peter's to find an
appropriate niche. (Again, I'm trying to look at the math, not fit and
finish, availability of experienced programmers, color of brochure...)
For example, there could be a class of problems for which the data set is
a few 10's of gigs and needs to be operated on as a whole, and for which
queries are fairly arbitrary and exploratory, not production-oriented.
Etc. But I haven't been able to come up with one that doesn't have
better characteristics as a distributed system.  Maybe if we define the
problem as "and you only have one commodity box to do it on." That's
fair. Maybe that's it?  (Then we need to find an "enterprise" with only
one box...)


--
Howard Stearns
University of Wisconsin - Madison
Division of Information Technology
mailto:[hidden email]
jabber:[hidden email]
voice:+1-608-262-3724


Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

Andreas.Raab
Howard Stearns wrote:
> Yes, I'm quite serious. I'm asking what kinds of problems RDBMS are
> uniquely best at solving (or at least no worse). I'm not asking whether
> they CAN be used for this problem or that.  I'm asking this from an
> engineering/mathematics perspective, not a business ("we've always done
> things this way" or "we like this vendor") perspective.

The main benefit: They work. There is no question how to use them, apply
them to problems, map them into different domains etc. This has all been
worked out, there is nothing new to find out, just a book or two to
read. From an engineering perspective that is vastly advantageous since
it represents a solution with a proven track-record and no surprises.

Cheers,
   - Andreas


Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

Howard Stearns
Of course. No question.  Except, of course, where they don't. The 3-tier
enterprise software scenario is -- to me -- an example of it NOT working.

I used to write expert system software. A fellow once asked, "But
couldn't I do that with Fortran?" The answer was, "Yes, and you could do
it with pencil and paper, too, but you wouldn't want to."

There's a whole bunch of problems for which pencil and paper are good
enough, but maybe not ideal. Same for RDBMS. And there's all sorts of
practical considerations in this range. Worse is Better, End to End, and
whatever you like. No one is (I hope!) going to walk away from a
solution in-hand that is good enough.

There are also problems for which pencil and paper really aren't suited.
Same for RDBMS. They can be made to work with a great expenditure
of resources, chewing gum, baling wire, duct tape, vise grips, etc. And
half of all enterprise IT projects fail. And yet even with this
knowledge, there's still a 50% chance that you can make an RDBMS work on
the wrong kind of problem if you throw enough money at it.

What I'm trying to do -- and of course, this isn't a Squeak question at
all, but I hope it is a Squeak community question -- is to learn
what domain a perfectly running RDBMS is a good fit for by design,
compared with a perfectly running alternative (even a hypothetical one).


--
Howard Stearns
University of Wisconsin - Madison
Division of Information Technology
mailto:[hidden email]
jabber:[hidden email]
voice:+1-608-262-3724
