Smalltalk › Squeak › Squeak - Dev

Design Principles Behind Smalltalk, Revisited

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

79 messages Options

1234

Paul D. Fernhout

Re: Design Principles Behind Smalltalk, Revisited (humor)

Laurence Rozier wrote:

> While there are certainly valuable insights in "Data & Reality" and I would
> agree that some data objects are merely "tools of thought", *many* objects
> have meaning and exist independent of our view/model. Quantum physics does
> tell us that the boundries of "things" are hard to define precisely but
> "things" themselves as aggregates are held together by forces of nature not
> by external views. A keyboard can be remapped in software and different
> people using it can have different views of the individual key "objects".
> Even the keyboard itself could be viewed differently - a word processor,
> game controller, or a cash register. However, any observer, human, machine
> or otherwise observer of measurable physical characteristics of the
> keyboard
> will not see any changes. The wave-functions underlying all of the
> sub-atomic particles making up that keyboard have a unique history going
> back at least to just after the big bang.

Thanks for the other comments.

On the "keyboard" analogy:

Consider if you move to a Dvorack layout on your keyboard instead of
Qwerty. Then you might need to pry off all the keycaps and move them
around. Suddenly you do not have "a keyboard". What you have is a
collection of keycaps (perhaps some broken in the process of removing
them) plus a base (perhaps with a keyboard switch or two damaged by
prying). Your mind has followed this situation, where something you
thought was an object has now been decomposed into multiple items, some of
which even have subitems or subareas which are not obviously removable
(broken switches soldered on the keyboard base) yet behave differently. To
model this requires a lot of subtly with boundaries not being obvious --
with the boundaries fluidly moving around depending on the questions we
have or the intent we have or the actions we take.

Or, what if, say, a rabid ocelot has just wondered into your office?
http://en.wikipedia.org/wiki/Ocelot
Suddenly your mental model of your entire office might shift to -- what
item can I throw at the foaming-mouthed ocelot to keep it away from me and
give me enough time to escape past it through the door? The closest thing
at hand is the keyboard. Suddenly your mental model of the keyboard needs
to switch from "data entry device" to "ocelot management system". You have
to think about issues like will the cable be long enough if I throw it
as-is or will the plug disconnect from the computer if hurled with enough
force? Or will the computer itself move with it if I toss it? And all in
an instant. So, suddenly your whole mapping of the possibilities and uses
of your keyboard needs to change, and in less time than it takes that
rabid ocelot to move from your door to your desk. A typical ST80
simulation of a computer could not be used in that way, but your mind can
do it easily and quickly.

So there is a gap here between the flexibility of the way your brain
models physical objects and processes and intent and the way we build
limited computer models using ST80. Your brain makes the switch in
microseconds; it might take weeks or months to change a simulation of a
keyboard as data input to keyboard as thrown object (let alone model a
rabid ocelot :-). And there remain subtle problems -- is the keyboard an
independent "object" if you have to think about the cable and how it is
attached to another independent "object, the computer? Perhaps this set of
problems is just solvable with a good class library; if so, I haven't seen
it yet. :-) Perhaps the latest version of Inform?
http://www.inform-fiction.org/I7/Inform%207.html
But even there, it seems like a lot of hand crafting of rules specific to
the needs of the story. Essentially, our minds' model of reality is much
more subtle and fluid than that of "object" even if we appear to be seeing
them all the time. And it works so well we don't even notice these abrupt
shifts in representation -- except perhaps when we laugh as a perspective
shifting joke. :-) Consider:
http://www.funsulting.com/september_2004_newsletter.html
From there: "Illegal aliens have always been a problem in the United
States. Ask any [American] Indian."

Our mind has a much deeper and greater and more flexible command of the
notion of "objects" and "classes" in relation to "need" or "intent" than
the Smalltalk environment has, at the very least. And these perspective
shifts are often the basis of creativity. And it is exactly enhancing
creativity which is Smalltalk's stated design goal. So maybe we need a
software modeling environment for modeling jokes about objects and
classes. :-) Again from the above link: """Research has linked the
creative and humor portions of our brains. Several studies showed that
humor leads to creativity. One of the most creative uses of humor is seen
in the comedic style of Stephen Wright. His one liner’s take normal
everyday concepts and show us a creative, and playful, way of seeing them.
Here are some examples:
“I spilled Spot Remover on my dog... Now he's gone.”
“I went to a general store. They wouldn't let me buy anything specifically.”
Many of us hear his jokes and immediately see the humor in the different
perspective. Interestingly, by exposing ourselves to this kind of humor,
we are also more likely to be creative. Since the creative process
involves seeing new things or new points of view, humor is a logical jump
starter to creativity."""
Maybe, ultimately, the problem with Smalltalk and its very rigid class
based view of the world is it is too serious a programming environment?
Maybe it needs to lighten up a little? Learn to laugh at itself? :-)
How would one even begin to tell a joke (and get laughter in response) in
Smalltalk-80?

As someone else in the thread put it, it is a general principle of
mathematical model building what we are just making a simplification of
reality for our purposes. I'll agree, but I will still not let Smalltalk
off the hook -- since our mind is able to build and rebuild these models
seemingly in an instant -- even in the punch line of a joke -- whereas
Smalltalk coding takes a long time. And I only hold ST80 to such high
standards as it aspires to them (forget about C++; no hope of a sense of
humor there. :-) '

There is some sort of mismatch going on here between the mind and
Smalltalk's object model. What it is in its entirety I am not sure. But
clearly the tools at hand in Smalltalk-80 can't match the minds
flexibility in object-oriented (and other) modeling. Yet it is very much a
stated design goal in Dan's original paper to have the Smalltalk software
environment be a good match for how the mind actually works. So, here, as
exemplified by humor, we have a mismatch. Essentially, Smalltalk code
isn't funny. :-)

Granted eToys may be "fun", but that is not the same as being "funny".
How could you tell a joke to eToys and have it laugh? Or how could eToys
invent new jokes and tell them to you for your approval? Perhaps this
starts to border on AI?

Anyway, writing this inspired me to Google on programs that invent jokes,
and I got this:
http://news.bbc.co.uk/1/hi/technology/5275544.stm
"""Computer scientists in Scotland developed the program for children who
need to use computerised speech aids. The team said enabling non-speaking
children to use puns and other jokes would help them to develop their
language and communication skills. The researchers admitted some of the
computer-generated puns were terrible, but said the children who had tried
the technology loved them. ...Children using the software can choose a
word or compound word, which will form some or all of the punch line, from
the system's dictionary. The program then writes the joke's opener. It
works by comparing the selected word with other words in its dictionary
for phonetic similarity or concepts that link the words together, and then
fits them into a pun template. ... Dr Waller said: "The kids have been
superb, they have taken to the software like fish to water. They have been
regaling everybody with their jokes." She said it seemed to have boosted
their confidence as well as their language skills. "It gives these kids
the ability to control conversations, perhaps for the first time, it gives
them the ability to entertain other people. And their self-image improves
too." """

Related web sites:
http://www.computing.dundee.ac.uk/staff/awaller/research.asp
http://groups.inf.ed.ac.uk/standup/
From the last: "We are exploring how humour may be used to help
non-speaking children learn to use language more effectively. There is
evidence to suggest that language play, including using puns and other
jokes, has a beneficial effect on a child's developing language and
communication skills. Children with communication impairments are often
reliant on augmented communication aids in order to carry on
conversations, but these aids give little scope for generating novel
language. This inhibits experimentation with language and limits the
trying out of humorous ideas, which can in turn have a stultifying effect
on language development. We propose to address this deficiency in the
language environment of the non-speaking child by providing a software
tool which promotes humorous language play. Starting from our previous
research on the automated generation of punning riddles, we will design
and implement a program which allows the user to experiment with the
construction of simple jokes. The user interface of this system will be
specially designed to be accessible to children with communication and
physical disabilities. We will then test the efficacy of the system by
observing and evaluating the use of the software by the children."

Perhaps there in Dr. Waller's lab is the future of Smalltalk? :-)

> Today, more and more so-called information systems are being used not just
> for description but to augment/effect the external world. In this
> evolving hyperlinked
> meshverse of simulation and
> "reality"<http://www.meshverse.com/2006/11/20/hyperlinking-reality/>,
> data often enters into a symbiotic relationship with "reality" where
> changing views can change "reality". The "real" Mars Climate
> Orbiter<http://en.wikipedia.org/wiki/Mars_Climate_Orbiter>object was
> destroyed because it was dependent on the data a model object
> had. If one accepts that a paradigm shift is underway which Croquet offers
> something of value in, then there are important
> ramifications<http://croquet.funkencode.com/2006/04/24/the-64-billion-dollar-question/>for
> database and language choices.

Thanks for the links. I'll agree that as the "noosphere" or "nooverse"
continues to develops,
http://en.wikipedia.org/wiki/Pierre_Teilhard_de_Chardin
http://en.wikipedia.org/wiki/Noosphere
we'll see more bridging of mental models (incarnated in computers or not)
and the physical world, where such abstract constructs have unexpected
effects on the physical world. I heop this project has poticve effects in
that direction (intended to be a GPL'd matter replicator, which can
reproduce itself):
http://reprap.org/
Still, we have been seeing this link of model (data) and reality for some
time, and not just on an individual level -- I'm sure we've all had to
deal with government bureaucracies or corporate hierarchies or classroom
settings where our problem or need did not match the pigeonholes or
procedures the organization had for dealing with individuals (especially
creative ones. :-) How does a bureaucracy deal with humor? It often can't.
Consider:
"The Soviet Joke Book"
http://www.st-andrews.ac.uk/~pv/courses/sovrus/jokes.html
An anecdote told during the Brezhnev era: Stalin, Khrushchev and Brezhnev
were all travelling together in a railway carriage, when unexpectedly the
train stopped. Stalin put his head out of the window and shouted, "Shoot
the driver!" But the train didn't start moving. Khrushchev then shouted,
"Rehabilitate the driver!" But it still didn't move. Brezhnev then said,
"Comrades, Comrades, let's draw the curtains, turn on the gramophone and
let's pretend we're moving!" After Gorbachev came to power another line
was added, in which he suggests: "Comrades, let's get out and push."

The history of Smalltalk-80 is itself an example of that -- ST80 didn't
fit Steve Job's model when he saw it, so he ignored most of it, and gave
us only the GUI window part in the Macintosh. Or considering my comments
above, essentially, Steve did not get most of the joke. :-) The idea of
making source and development tools available to end users did not match
the notion of run-time fees, so we ended up with an absurd focus on
"packaging" and "image stripping" and "shrinking" even to this day, so
again considering the above, ParcPlace did not see the humor in a free
Smalltalk. :-) But now we do.

--Paul Fernhout

Jon Hylands

Re: Design Principles Behind Smalltalk, Revisited (humor)

On Wed, 03 Jan 2007 09:54:35 -0500, "Paul D. Fernhout"
<[hidden email]> wrote:

> There is some sort of mismatch going on here between the mind and
> Smalltalk's object model. What it is in its entirety I am not sure. But
> clearly the tools at hand in Smalltalk-80 can't match the minds
> flexibility in object-oriented (and other) modeling. Yet it is very much a
> stated design goal in Dan's original paper to have the Smalltalk software
> environment be a good match for how the mind actually works. So, here, as
> exemplified by humor, we have a mismatch. Essentially, Smalltalk code
> isn't funny. :-)

I'm working on some serious AI research right now, using Squeak (of
course). My idea of the brain (in terms of how we model it) is a virtual
machine, with very little Smalltalk code, and huge amounts of data that
gets stored and indexed. You can't model things like humor and emotions and
such in code - it gets modeled in data.

http://www.bioloid.info/tiki/tiki-index.php?page=MicroRaptor if anyone is
interested...

Later,
Jon

--------------------------------------------------------------
Jon Hylands [hidden email] http://www.huv.com/jon

Project: Micro Raptor (Small Biped Velociraptor Robot)
http://www.huv.com/blog

Paul D. Fernhout

Re: Design Principles Behind Smalltalk, Revisited (humor)

Jon Hylands wrote:
> On Wed, 03 Jan 2007 09:54:35 -0500, "Paul D. Fernhout"
> <[hidden email]> wrote:
>> Essentially, Smalltalk code isn't funny. :-)
>
> I'm working on some serious AI research right now, using Squeak (of
> course). My idea of the brain (in terms of how we model it) is a virtual
> machine, with very little Smalltalk code, and huge amounts of data that
> gets stored and indexed. You can't model things like humor and emotions and
> such in code - it gets modeled in data.

Of course, as LISP often shows, or Squeak's VM generation system, the line
between code and data can often get blurry. :-)

> http://www.bioloid.info/tiki/tiki-index.php?page=MicroRaptor
Interesting project. I'll be curious over time how you see Squeak needing
to change or expand to better support your AI and robotics related goals.

--Paul Fernhout

J J-6

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by tblanchard

A big +1 to most of this message (just not the anti-OODB stuff. They have
not had any poison for me yet)

>From: Todd Blanchard <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: relational for what? [was: Design Principles Behind
>Smalltalk,Revisited]
>Date: Tue, 2 Jan 2007 17:28:25 -0800
>
>
>On Jan 2, 2007, at 1:36 PM, [hidden email] wrote:
>
>>Using the RDB as a sharing ground for applications is IMHO really,
>>really bad. Sure, it *works* kinda, but very fast you end up with
>>replicated SQL statements all over the place. Then someone says "stored
>>procedures" and hey... why not consider OBJECTS? There is probably a
>>reason why people are so worked up about Services these days. :)
>
>So if it is objects instead of tables - how is this different?
>Uh, and the alternative would be what? Take a typical company that makes
>and sells stuff.
>
>They have customers (hopefully).
>
>The marketing guys want the customer's demographics and contacts to
>generate targeted messages.
>The accounting people what to know their credit status, payments, and
>order totals.
>The inventory/production planning guys don't really care who is buying
>what, but they want to see how much of each thing is going out the door.
>The product development people are looking for trends to spot new kinds of
>demand trends.
>The sales guys want recent order history, contact event logs, etc.
>
>There are many cross cutting concerns.
>
>If you take the naive object model you probably have
>Customers->>Accounts->>Orders->>Items-----(CatalogItem)->InventoryStatus
>
>Works for most traversals, you put the customers in a dictionary at the
>root by some identifier.
>But for the people who process orders or do shipping, this model is a
>drag. They just want orders, and items and navigating to all the orders
>by searching starting at customers is nuts. So maybe you add a second
>root for orders for them. Then there's the inventory stuff....
>
>Everybody wants a different view with different entry points. I'm talking
>enterprise architecture here - bigger than systems which is bigger than
>applications.
>
>Relational databases don't care about your relationships or roots -
>anything can be a root. Anything can be correlated. Any number of object
>models can map to a well normalized.
>
>RDBMS systems have a couple nice properties - you can produce lots of
>different views tailored to a viewpoint/area of responsibility. They
>guarantee data consistency in integrity. Something I find lacking from OO
>solutions.
>
>Here's a fun game. Build an OO db. Get a lot of data. Overtime,
>deprecate one class and start using another in its place. Give it to
>another developer who doesn't know the entire history. One day he deletes
>a class because it appears not to be referenced in the image anymore. 6
>months later try to traverse the entire db to build a report and find
>errors 'no such class for record'. What will you do?
>
>This has happened BTW to me. If I have long lived data, with different
>classes of users and different areas of responsibilities, I want the RDBMS
>because it is maximally flexible while providing the highest guarantees of
>data integrity and consistency. The problems I've heard described are the
>result of poor design and unwillingness to refactor as the business
>changes and grows.
>
>FWIW I have worked at everything from 5 person Mac software companies
>(anyone remember Tempo II?) to telcos, aerospace, government agencies, and
>the world's largest internet retailer (a scenario where the relational
>database turns out to be not the best fit overall). My solution selection
>hierarchy as the amount of data grows runs:
>
>1) In the image
>2) Image segments or PLists in directories
>3) RDBMS/ORM like glorp
>4) RDBMS with optimized SQL
>5) SOA
>
>I'm pretty sour on OODBMS's based on my long running experiences with
>them.
>
>-Todd Blanchard

>

_________________________________________________________________
Get live scores and news about your team: Add the Live.com Football Page
www.live.com/?addtemplate=football&icid=T001MSN30A0701

J J-6

Re: Design Principles Behind Smalltalk, Revisited

In reply to this post by Marcel Weiher

The reason I chose Java to use as a reference was because by every chart I
have seen it is by far the most used language and therefor an example of
"success" (well and it fits my arguments better :). And what I mean by
"success" is, regardless of what you or I think of the language, the number
of people using it have validated some of the ideas that were questioned in
the original email.

>From: Marcel Weiher <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: Design Principles Behind Smalltalk, Revisited
>Date: Tue, 2 Jan 2007 22:26:31 -0800
>
>
>On Dec 26, 2006, at 3:18 , J J wrote:
>
>>
>>>Again, to contrast with Python, Squeak wants to run the show, but Python
>>>plays nice with all the other free tools of the GNU/Linux ecosystem.
>>
>>I keep on seeing this, but it appears largely overstated. Java has it's
>>own VM, threads etc. as well.
>
>Yes, Java. I think Python is very different from Java in this context, as
>Java also wants to run the show, and I think this is where it is quite
>similar to Smalltalk. Python on the other hand is quite happy to play
>along with others, just like Ruby, Perl and, of course, C.
>
>[more java comparison]
>
>>
>>And if you mean more to address the tools, well yes you *can* edit Java
>>code in vi if you really want to. But no one really wants to. And if
>>your interface to the language is through some program anyway, then the
>>"barrier" of the code not being on the file system disappears.
>
>Once again, I think that Java is not a valid substitute for Python in this
>context. In my experience, hacking Python in or Java or Ruby in vi is not
>just doable but quite useful. I can't say the same for Java.
>
>Marcel
>
>

_________________________________________________________________
Dave vs. Carl: The Insignificant Championship Series. Who will win?
http://clk.atdmt.com/MSN/go/msnnkwsp0070000001msn/direct/01/?href=http://davevscarl.spaces.live.com/?icid=T001MSN38C07001

J J-6

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Marcel Weiher

>From: Marcel Weiher <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: relational for what? [was: Design Principles Behind
>Smalltalk,Revisited]
>Date: Tue, 2 Jan 2007 23:08:29 -0800
>
>Well, I have worked in a large-ish enterprise and my experience was that
>moving *away* from the RDB was central to improving performance around a
>hundred- to a thousandfold, with the bigger improvement for the project
>that completely eliminated the RDB.
>
>Har har. Sorry, but I have seen very few actually reusable data models.
>
>You are kidding, right?

Who are you people getting for DBA's? :)

>>Oh, but you found one example where someone with a lot of data didn't use
>>a RDB. I guess we can throw the whole technology sector in the trash.
>>Sanity check: google is trying to keep a current snapshot of all
>>websites and run it on commodity hardware. You could do exactly the same
>>thing with a lot less CPU's using a highly tuned, distributed RDBMS.
>
>That's a big claim, mister. Care to back it up?

And how do you propose I do that? I worked at a very very large retailer
for most of my career and they kept basically every transaction ever for
trending purposes. Now given the size of that company I would say it has to
be at least as large as Google's data (probably quite a bit bigger). Now
they didn't turn over queries in fractions of a second, but keep in mind the
general kind of queries they were dealing with. If they were limited to a
subset of possible queries like Google is, I believe they could produce
comparable times.

_________________________________________________________________
Your Hotmail address already works to sign into Windows Live Messenger! Get
it now
http://clk.atdmt.com/MSN/go/msnnkwme0020000001msn/direct/01/?href=http://get.live.com/messenger/overview

J J-6

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Howard Stearns

>From: Howard Stearns <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: relational for what? [was: Design Principles Behind Smalltalk,
> Revisited]
>Date: Tue, 02 Jan 2007 14:36:22 -0600
>
>Yes, I'm quite serious. I'm asking what kinds of problems RDBMS are
>uniquely best at solving (or at least no worse). I'm not asking whether
>they CAN be used for this problem or that. I'm asking this from an
>engineering/mathematics perspective, not a business ("we've always done
>things this way" or "we like this vendor") perspective.
>

<horror story ommited>

Honestly, it just seems to me like someone architected an awful system. I
know, for example, some databases (oracle I thought) can span a given DB
across boxes etc. with different methods of partitioning (e.g. some tables
here, some tables there, foreign keys between them, etc.).

You certainly shouldn't have to be copying data between tables. If nothing
better, you could install MySQL everywhere and turn on replication.

>Maybe this isn't typical, but it is the architecture that Oracle and its
>PeopleSoft division pushes on us in their extensive training classes. And
>it appears to be the architecture discussed in the higher education IT
>conferences and Web sites in the U.S.

Well, the big companies tend to push the most expensive option, not the best
for the data model. In my experience so far, I can think of no case where
we accepted what the vendors proposed before some serious threats etc..

>Anyway, either the data AS USED fits into memory or doesn't. If it does,
>then what benefit is the relational math providing? If it doesn't, then we
>have to ask whether the math techniques that were developed to provide
>efficient random access over disks 20 years ago are still valid. Is this
>still the fastest way? (Answer is no.) Is there some circumstance in which
>it is the fastest? Or the safest? Or allow us to do something that we could
>not do otherwise?

I still don't think the question has anything to do with "in memory" vs.
"not in memory" or "quickest way to access the disk". You can tune your
RDBMS to try to cache as much as possible in memory, and then it becomes a
contest of: is it faster for me to write all the code to do the joins, etc.
or take what they already have for possibly a run-time speed hit.

Or maybe a speed gain since the RDBMS can break up the table into different
"spaces" and run the query simultaniously in different threads. Of course
you can do that by hand, but then you are getting further behind what they
already have.

>I tried briefly to combine JJ's answer with Peter's to find an appropriate
>niche. (Again, I'm trying to look at the math, not fit and finish,
>availability of experienced programmers, color of brochure...) For exampe,
>there could be a class of problems for which the data set is a few 10's of
>gigs and needs to be operated on as a whole. And that queries are fairly
>arbitrary and exploratory, not production-oriented. Etc. But I haven't been
>able to come up with one that doesn't have better characteristics as a
>distributed system. Maybe if we define the problem as "and you only have
>one commodity box to do it on." That's fair. Maybe that's it? (Then we
>need to find an "enterprise" with only one box...)

Well, it's not going to be that ("you only have one commodity box"). When I
said I think you could do what Google is doing with an RDBMS if you really
wanted to I wasn't thinking of a few commidity boxes. I was thinking of
4-10 really enormous boxes (but my understanding was that google uses *lots*
of computers to do their work, no?).

In other words the RDBMS solution will be much more expensive
computer/software wise compared to what Google did.

_________________________________________________________________
Your Hotmail address already works to sign into Windows Live Messenger! Get
it now
http://clk.atdmt.com/MSN/go/msnnkwme0020000001msn/direct/01/?href=http://get.live.com/messenger/overview

J J-6

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Howard Stearns

>From: Howard Stearns <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: relational for what? [was: Design Principles Behind Smalltalk,
> Revisited]
>Date: Tue, 02 Jan 2007 15:16:22 -0600
>
>Of course. No question. Except, of course, where they don't. The 3-tier
>enterprise software scenario is -- to me -- an example of it NOT working.

I think this is due to your bad experiences with bad implementations.

>I used to write expert system software. A fellow once asked, "But couldn't
>I do that with Fortran?" The answer was, "Yes, and you could do it with
>pencil and paper, too, but you wouldn't want to."
>
>There's a whole bunch of problems for which pencil and paper are good
>enough, but maybe not ideal. Same for RDBMS. And there's all sorts of
>practical considerations in this range. Worse is Better, End to End, and
>whatever you like. No one is (I hope!) going to walk away from a solution
>in-hand that is good enough.
>
>There are also problems for which pencil and paper really aren't suited
>for. Same for RDBMS. They can be made to work with the great expenditure of
>resources, chewing gum, bailing wire, duct tape, vise grips, etc. And half
>of all enterprise IT projects fail. And yet even with this knowledge,
>there's still a 50% chance that you can make an RDBMS work on the wrong
>kind of problem if you throw enough money at it.

Completely agree.

>What I'm trying to do -- and of course, this isn't a Squeak question at
>all, but I hope it is a Squeak community question -- is try to learn what
>domain a perfectly running RDBMS is a good fit for by design, compared with
>a perfectly running alternative (even a hypothetical one).

Programmer time. How long will it take to make the RDBMS run perfectly (for
some definition of perfectly) vs. writing this alternative.

It is the same argument of using an existing DSL vs. just writing it by hand
in your favorite language.

_________________________________________________________________
>From photos to predictions, The MSN Entertainment Guide to Golden Globes has
it all. http://tv.msn.com/tv/globes2007/

tblanchard

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by J J-6

Based on the thoughtful responses I've gottten, I'll be taking a new
look at some oodb technologies - but I'm pretty sure I've made the
right move for my current project. Honestly, working with glorp is no
more or less complex than working with an oodb - they feel about the
same to me - especially since I derive the schema from the meta model
anyhow.

On Jan 3, 2007, at 12:47 PM, J J wrote:

> A big +1 to most of this message (just not the anti-OODB stuff.
> They have not had any poison for me yet)
>
>
>> From: Todd Blanchard <[hidden email]>
>> Reply-To: The general-purpose Squeak developers list<squeak-
>> [hidden email]>
>> To: The general-purpose Squeak developers list<squeak-
>> [hidden email]>
>> Subject: Re: relational for what? [was: Design Principles Behind
>> Smalltalk,Revisited]
>> Date: Tue, 2 Jan 2007 17:28:25 -0800
>>
>>
>> On Jan 2, 2007, at 1:36 PM, [hidden email] wrote:
>>
>>> Using the RDB as a sharing ground for applications is IMHO really,
>>> really bad. Sure, it *works* kinda, but very fast you end up with
>>> replicated SQL statements all over the place. Then someone says
>>> "stored
>>> procedures" and hey... why not consider OBJECTS? There is probably a
>>> reason why people are so worked up about Services these days. :)
>>
>> So if it is objects instead of tables - how is this different?
>> Uh, and the alternative would be what? Take a typical company
>> that makes and sells stuff.
>>
>> They have customers (hopefully).
>>
>> The marketing guys want the customer's demographics and contacts
>> to generate targeted messages.
>> The accounting people what to know their credit status, payments,
>> and order totals.
>> The inventory/production planning guys don't really care who is
>> buying what, but they want to see how much of each thing is going
>> out the door.
>> The product development people are looking for trends to spot new
>> kinds of demand trends.
>> The sales guys want recent order history, contact event logs, etc.
>>
>> There are many cross cutting concerns.
>>
>> If you take the naive object model you probably have
>> Customers->>Accounts->>Orders->>Items-----(CatalogItem)-
>> >InventoryStatus
>>
>> Works for most traversals, you put the customers in a dictionary
>> at the root by some identifier.
>> But for the people who process orders or do shipping, this model
>> is a drag. They just want orders, and items and navigating to
>> all the orders by searching starting at customers is nuts. So
>> maybe you add a second root for orders for them. Then there's
>> the inventory stuff....
>>
>> Everybody wants a different view with different entry points.
>> I'm talking enterprise architecture here - bigger than systems
>> which is bigger than applications.
>>
>> Relational databases don't care about your relationships or roots
>> - anything can be a root. Anything can be correlated. Any
>> number of object models can map to a well normalized.
>>
>> RDBMS systems have a couple nice properties - you can produce lots
>> of different views tailored to a viewpoint/area of
>> responsibility. They guarantee data consistency in integrity.
>> Something I find lacking from OO solutions.
>>
>> Here's a fun game. Build an OO db. Get a lot of data.
>> Overtime, deprecate one class and start using another in its
>> place. Give it to another developer who doesn't know the entire
>> history. One day he deletes a class because it appears not to be
>> referenced in the image anymore. 6 months later try to traverse
>> the entire db to build a report and find errors 'no such class
>> for record'. What will you do?
>>
>> This has happened BTW to me. If I have long lived data, with
>> different classes of users and different areas of
>> responsibilities, I want the RDBMS because it is maximally
>> flexible while providing the highest guarantees of data integrity
>> and consistency. The problems I've heard described are the
>> result of poor design and unwillingness to refactor as the
>> business changes and grows.
>>
>> FWIW I have worked at everything from 5 person Mac software
>> companies (anyone remember Tempo II?) to telcos, aerospace,
>> government agencies, and the world's largest internet retailer (a
>> scenario where the relational database turns out to be not the
>> best fit overall). My solution selection hierarchy as the amount
>> of data grows runs:
>>
>> 1) In the image
>> 2) Image segments or PLists in directories
>> 3) RDBMS/ORM like glorp
>> 4) RDBMS with optimized SQL
>> 5) SOA
>>
>> I'm pretty sour on OODBMS's based on my long running experiences
>> with them.
>>
>> -Todd Blanchard
>
>
>>
>
> _________________________________________________________________
> Get live scores and news about your team: Add the Live.com Football
> Page www.live.com/?addtemplate=football&icid=T001MSN30A0701
>
>

Marcel Weiher

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by J J-6

On Jan 3, 2007, at 13:39 , J J wrote:

>> From: Howard Stearns <[hidden email]>
>>
>> Of course. No question. Except, of course, where they don't. The 3-
>> tier enterprise software scenario is -- to me -- an example of it
>> NOT working.
>
> I think this is due to your bad experiences with bad implementations.

Obviously. And yours seems to be good experiences with good
implementations. What does that show us? Apart from that both good
and bad examples exist?

>>
>> What I'm trying to do -- and of course, this isn't a Squeak
>> question at all, but I hope it is a Squeak community question -- is
>> try to learn what domain a perfectly running RDBMS is a good fit
>> for by design, compared with a perfectly running alternative (even
>> a hypothetical one).
>
> Programmer time. How long will it take to make the RDBMS run
> perfectly (for some definition of perfectly) vs. writing this
> alternative.
>
> It is the same argument of using an existing DSL vs. just writing it
> by hand in your favorite language.

Precisely. If the problem domain is a good fit for the RDBMS/DSL,
data that naturally wants to be in 'tables', then it *may* be a win,
even after factoring in the inevitable overhead of overcoming
packaging mismatch. If the original problem is not naturally "table-
oriented", and many are not, then it's just not going to be a win.

Cheers,

Marcel

Marcel Weiher

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by J J-6

On Jan 3, 2007, at 13:01 , J J wrote:

[I wrote]

You are kidding, right?

Who are you people getting for DBA's? :)

Let's pull back some context here:

Are you serious with this (data too large to fit into memory)? And if you use a good RDBMS then you don't have to worry about disk speed or distribution.

Do you really, truly believe you don't have to worry about physical parameters such as disk speed just because there is an intermediate layer between you and your disk(s) called a RDBMS?

Then our vendor must have been really ignorant when they recommended that we get faster machines with faster disks in order to fix problems we were having where the database could not keep up with the (write) data rate. And reality must also have not known about this magical property of RDBMS to make you immune to physical limtations, because getting faster disks *did* actually solve the problem.

Or are you saying that what happens is that you pay someone else to worry about those parameters?

Of course, using a database in that scenario was actually not necessary, and the benefits that the vendor touted for their database-based system were quite irrelevant in our application context. We could have built a far (a) faster (b) simpler (c) more reliable and (d) cheaper system had we not bought into the "must use RDBMS because it makes everything better" voodoo and just kept the relevant data in application memory. This might have been a small amount of extra programming initially, but would it ever have paid off in maintenance.

Especially since we didn't actually own the data, we were just getting a feed from somewhere else.

Had the data been ours, that would have been a different story, because data-integrity is one RDBMS myth that I still believe in, as it hasn't been beaten out of me yet by encounters with reality...

Marcel

David T. Lewis

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Andreas.Raab

On Tue, Jan 02, 2007 at 09:57:40PM +0100, Andreas Raab wrote:

> Howard Stearns wrote:
> >Yes, I'm quite serious. I'm asking what kinds of problems RDBMS are
> >uniquely best at solving (or at least no worse). I'm not asking whether
> >they CAN be used for this problem or that. I'm asking this from an
> >engineering/mathematics perspective, not a business ("we've always done
> >things this way" or "we like this vendor") perspective.
>
> The main benefit: They work. There is no question how to use them, apply
> them to problems, map them into different domains etc. This has all been
> worked out, there is nothing new to find out, just a book or two to
> read. From an engineering perspective that is vastly advantageous since
> it represents a solution with a proven track-record and no surprises.

Quite right from an engineering perspective. But "proven track-record
and no surprises" is wrong, at least in the context of the larger
organizations for which RDBMS are considered appropriate. This has
very little to do with technology, mathematics, or engineering, and
lots to do with organizational behavior. An RDBMS scales extremely
well, but the human organizations associated with them do not.

One lesson that I take from Squeak is that the way people interact
with technology is important. It does not matter whether or not
Squeak is "fast" if it helps people to work with ideas and solve
problems quickly. More broadly, it does not matter if a technology
(RDBMS or whatever) scales well if it leads people and organizations
to behave as disfunctional groups of "architects," "data analysts,"
and so forth.

Dave

p.s. Ralph Johnson's earlier reply on this thread is an excellent
assessment, and would serve well as the last word on the topic.
My sincere apologies for indulging in a further reply ;)

Peter Crowther-2

RE: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Marcel Weiher

> From: Marcel Weiher
> > The DBA's can watch how the database is being used and tune this
> > (i.e. partition the data and move it to another CPU, etc., etc.).
>
> To some limited extent, yes. But they can't work miracles, and
> neither can the DB. In fact, if you know your access patterns, you
> can (almost) always do better than the DB, simply because there are
> fewer layers between you and the code.

[Note: I appear to be getting a slow and incomplete feed from squeak-dev
- someone else may have said what's below or the discussion may have
moved on. My apologies if so.]

*If you know your access patterns*, I agree. I suggest that in
many/most practical business applications you do *not* know your access
patterns, because the data is at the core of the business and will be
used in unexpected ways over time. The old schema will change slowly;
new pieces of schema will be bolted on at the edges as new applications
are accreted; many, many ad-hoc reports will be written that need to run
"fast enough" when they are required; and different functional areas of
the business will wish to view the data in their own way. Optimising
for any one set of access patterns counts as premature optimisation, and
will come back and bite you later. I claim that one of the advantages
of a normalised TP database is that it is agnostic about its access
patterns, and that indexes (and indexed views) can be added later and
changed dynamically to improve response times as the access patterns
change over time.

RDBs are "good enough" for quite a wide class of problems, and most
sensibly-designed relational databases have gentle fitness curves - as
the requirements change over time (which they do), there are
comparatively few changes that completely break the database. By
contrast, a highly-optimised storage structure is fragile - a
requirements change may completely screw the optimisation, or be screwed
by it. I don't want "right", I want "good enough for the job".

- Peter

J J-6

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Marcel Weiher

>From: Marcel Weiher <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: relational for what? [was: Design Principles Behind
>Smalltalk,Revisited]
>Date: Wed, 3 Jan 2007 16:36:42 -0800
>
>Obviously. And yours seems to be good experiences with good
>implementations. What does that show us? Apart from that both good and
>bad examples exist?

Well, it didn't tell us anything, it just reminded us that there are many
more bad IT people then good ones (or at least it sure seems so).

>Precisely. If the problem domain is a good fit for the RDBMS/DSL, data
>that naturally wants to be in 'tables', then it *may* be a win, even after
>factoring in the inevitable overhead of overcoming packaging mismatch. If
>the original problem is not naturally "table- oriented", and many are not,
>then it's just not going to be a win.

I agree. Trying to fit non-relational data into a DB because "it's what we
know" is bad.

_________________________________________________________________
Type your favorite song. Get a customized station. Try MSN Radio powered
by Pandora. http://radio.msn.com/?icid=T002MSN03A07001

J J-6

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Marcel Weiher

>From: Marcel Weiher <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: relational for what? [was: Design Principles Behind
>Smalltalk,Revisited]
>Date: Wed, 3 Jan 2007 20:41:58 -0800
>
>>>Are you serious with this (data too large to fit into memory)? And if
>>>you use a good RDBMS then you don't have to worry about disk speed or
>>>distribution.
>
>Do you really, truly believe you don't have to worry about physical
>parameters such as disk speed just because there is an intermediate layer
>between you and your disk(s) called a RDBMS?

Well no, someone has to worry about this. I guess when I said RDBM*S* I
meant RDBMS *team*.

>Or are you saying that what happens is that you pay someone else to worry
>about those parameters?

Sort of.

I just don't see the data point of "does the data fit into memory or not" as
being relavant to the discussion. If you have relational type data and you
want to run various reports that look at the data in various different ways
for reports, what does "have it in memory" have to do with anything?
Whether it fits or not, you still have to hand write code that does
relational joins and other things to deal with it.

My last (relevant) project would have easily fit in memory, but downloading
MySQL, building 3 tables and loading up the data was *vastly* faster then
hand writing all that stuff for about 10 reports that had to be run one
time.

>Of course, using a database in that scenario was actually not necessary,
>and the benefits that the vendor touted for their database- based system
>were quite irrelevant in our application context.

_________________________________________________________________
Your Hotmail address already works to sign into Windows Live Messenger! Get
it now
http://clk.atdmt.com/MSN/go/msnnkwme0020000001msn/direct/01/?href=http://get.live.com/messenger/overview

J J-6

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by David T. Lewis

>From: "David T. Lewis" <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: relational for what? [was: Design Principles Behind
>Smalltalk,Revisited]
>Date: Thu, 4 Jan 2007 00:26:40 -0500
>
>Quite right from an engineering perspective. But "proven track-record
>and no surprises" is wrong, at least in the context of the larger
>organizations for which RDBMS are considered appropriate. This has
>very little to do with technology, mathematics, or engineering, and
>lots to do with organizational behavior. An RDBMS scales extremely
>well, but the human organizations associated with them do not.

Well that's true. The worst things I have seen in my career in this context
were

1) DA's who apply silly standards to every table no matter what. We had
some data that happen to have strings in it (the host names of computers),
but since it was a string the DA's wanted us to break the string out to
another table(s) so that we could internationalize our application. No
matter how we explained it they just replied with the "Data standards"
document.

2) Developers (and by this I mean: The kind of person who probably uses Java
and only knows the OO that Java has) who inflict their will on the tables.
This is probably where most of the horror stories come from. Either the
table was designed by them from the start, or they gradually made
modifications to it that fit their world view. I have seen some pretty
awful results from this one.

>One lesson that I take from Squeak is that the way people interact
>with technology is important. It does not matter whether or not
>Squeak is "fast" if it helps people to work with ideas and solve
>problems quickly. More broadly, it does not matter if a technology
>(RDBMS or whatever) scales well if it leads people and organizations
>to behave as disfunctional groups of "architects," "data analysts,"
>and so forth.

Agreed.

>p.s. Ralph Johnson's earlier reply on this thread is an excellent
>assessment, and would serve well as the last word on the topic.
>My sincere apologies for indulging in a further reply ;)

Agreed.

_________________________________________________________________
Get live scores and news about your team: Add the Live.com Football Page
www.live.com/?addtemplate=football&icid=T001MSN30A0701

J J-6

RE: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Peter Crowther-2

>From: "Peter Crowther" <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: "The general-purpose Squeak developers
>list"<[hidden email]>
>Subject: RE: relational for what? [was: Design Principles Behind
>Smalltalk,Revisited]
>Date: Thu, 4 Jan 2007 08:28:49 -0000
>
> > From: Marcel Weiher
> > > The DBA's can watch how the database is being used and tune this
> > > (i.e. partition the data and move it to another CPU, etc., etc.).
> >
> > To some limited extent, yes. But they can't work miracles, and
> > neither can the DB. In fact, if you know your access patterns, you
> > can (almost) always do better than the DB, simply because there are
> > fewer layers between you and the code.
>
>[Note: I appear to be getting a slow and incomplete feed from squeak-dev
>- someone else may have said what's below or the discussion may have
>moved on. My apologies if so.]
>
>*If you know your access patterns*, I agree. I suggest that in
>many/most practical business applications you do *not* know your access
>patterns, because the data is at the core of the business and will be
>used in unexpected ways over time. The old schema will change slowly;
>new pieces of schema will be bolted on at the edges as new applications
>are accreted; many, many ad-hoc reports will be written that need to run
>"fast enough" when they are required; and different functional areas of
>the business will wish to view the data in their own way. Optimising
>for any one set of access patterns counts as premature optimisation, and
>will come back and bite you later. I claim that one of the advantages
>of a normalised TP database is that it is agnostic about its access
>patterns, and that indexes (and indexed views) can be added later and
>changed dynamically to improve response times as the access patterns
>change over time.
>
>RDBs are "good enough" for quite a wide class of problems, and most
>sensibly-designed relational databases have gentle fitness curves - as
>the requirements change over time (which they do), there are
>comparatively few changes that completely break the database. By
>contrast, a highly-optimised storage structure is fragile - a
>requirements change may completely screw the optimisation, or be screwed
>by it. I don't want "right", I want "good enough for the job".
>
> - Peter
>

_________________________________________________________________
>From photos to predictions, The MSN Entertainment Guide to Golden Globes has
it all. http://tv.msn.com/tv/globes2007/?icid=nctagline1

J J-6

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

In reply to this post by Joshua Gargus-2

>From: Joshua Gargus <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: Re: relational for what? [was: Design Principles Behind
>Smalltalk,Revisited]
>Date: Tue, 2 Jan 2007 15:21:48 -0800
>
>What, really? There are many possible reasons that Google don't use an
>RDBMS to index the web: stupidity, arrogance, excessive cost of an RDBMS,
>sound engineering decisions, or a combination of these.

I'm not saying Google are idiots. Clearly not. I was basically just
questioning using them as some sort of counter point against RDBM systems.
I think you could do the same thing they did with a RDBMS, but not on a
bunch of low end computers. You would have to spend some cash.

>According to the computer systems research community, Google has sound
>engineering reasons for its architecture; they have published papers at
>top conferences such as OSDI and SOSP. See http:// labs.google.com/papers
>("The Google File System" and "BigTable..." might be the most relevant to
>this conversation).

Yes of course. And look at what they are doing: Fault tolerant systems on a
large number of commodity boxes. Almost the opposite of an RDBMS. :)

>That's not rule out the possibility of stupidity, arrogance, excessive
>cost, etc.. But it does cast doubt on the unsubstantiated claim that
>Google could "do exactly the same thing with a lot less CPUs".

Well, it would be time consuming (and probably expensive) to prove, but I
still think the statement is ok. But it will be big boxes and big CPUs with
lots of through-put.

>As you mentioned in a follow-up email, this wasn't the paper you meant.
>Although it has nothing whatsoever to do with RDBMSes, I would recommend
>anyone who has enough free time to learn enough Haskell to read that
>paper.
>
>Did you happen to find the intended link?

Yes,
http://lambda-the-ultimate.org/node/1896

>Certainly RDBs are essential to the operations of the modern enterprise,
>but how much of this is because RDBs are really the best imaginable
>approach to this sort of thing, and how much is due to a complicated
>process of co-evolution that has resulted in the current enterprise
>software ecosystem?

Here I think you envision more religious fervor behind my words than exist.
It is nothing more then a "toolbox" issue for me. A problem comes up, what
is the fastest way to solve it weighed against the suspected length of the
project and how scalable the solution? For me there are times I reach for
the RDBMS. There are other times I would reach for an OODB (I plan to use
magma to persist my website). Or maybe a combination (I am *very* impressed
of what I have seen from GLORP so far), or maybe just stick it in memory.

Which is going to be the best? Well it is our jobs as engineers to weigh
all the factors and answer that question, but for every isolated case.

_________________________________________________________________
Type your favorite song. Get a customized station. Try MSN Radio powered
by Pandora. http://radio.msn.com/?icid=T002MSN03A07001

Joshua Gargus-2

Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]

On Jan 4, 2007, at 11:40 AM, J J wrote:
>
>> That's not rule out the possibility of stupidity, arrogance,
>> excessive cost, etc.. But it does cast doubt on the
>> unsubstantiated claim that Google could "do exactly the same
>> thing with a lot less CPUs".
>
> Well, it would be time consuming (and probably expensive) to prove,
> but I still think the statement is ok. But it will be big boxes
> and big CPUs with lots of through-put.

Well, if you say so. I'm no expert.

>
>> As you mentioned in a follow-up email, this wasn't the paper you
>> meant. Although it has nothing whatsoever to do with RDBMSes, I
>> would recommend anyone who has enough free time to learn enough
>> Haskell to read that paper.
>>
>> Did you happen to find the intended link?
>
> Yes,
> http://lambda-the-ultimate.org/node/1896

Thanks, that looks interesting. It actually is related to the
original link.

>
>> Certainly RDBs are essential to the operations of the modern
>> enterprise, but how much of this is because RDBs are really the
>> best imaginable approach to this sort of thing, and how much is
>> due to a complicated process of co-evolution that has resulted in
>> the current enterprise software ecosystem?
>
> Here I think you envision more religious fervor behind my words
> than exist.

My apologies, I can see how you might read it that way. I'm not
saying that you are arguing that RDBs are the best imaginable
approach; I was trying to re-state Howard's initial question. As I
understood it, the question was not about whether an RDBMS is the
appropriate choice in a given situation (given time and cost
constraints, etc.), but whether we know enough now to make
fundamentally better choices if we magically found ourselves with the
resources to "burn the disk packs" and start over.

Josh

>

1234