speed question

Howard Oh

speed question

Hi all

I'm trying out VisualWorks now. I accidentally visited Cincom and downloaded
theirs.

I found VW four times faster in performance than Dolphin, which was very
attractive and mysterious. Common sense tells me Dolphin should be faster,
since it is more dedicated to Windows.

Regards
Hwa Jong Oh

Frank Sergeant

Re: speed question

"Hwa Jong Oh" <[hidden email]> wrote in message
news:981hog$r70m0$[hidden email]...
> Hi all
>
> I'm trying out VisualWorks now. I accidentally visited Cincom and
> downloaded theirs.

I was going to try to make a joke along the lines of "I once 'accidentally'
visited a whorehouse in Tijuana one time" or "I accidentally attended an 8
o'clock class one time in college" but I decided against it.

> I found VW four times faster in performance than Dolphin which was very
> attractive and mysterious. With my common sense Dolphin should be faster
> since it is more dedicated to Windows.

You probably know all this, but I just felt like discussing speed. My goal,
my motto is

No optimization before its/it's time.

Sort of like the wine commercial: "We sell no wine before it's time (and boy
is it time! If we don't sell some soon we're going out of business.)".

I have found it far too easy to get suckered into thinking the time was
going somewhere it wasn't. So, I try harder than ever to code
straightforwardly without regard to performance, leaving the optimization
until later, when and if it becomes a proven problem.

> I found VW four times faster in performance than Dolphin which was very

If your overall application was four times faster in VW, then that is, of
course, significant. But, if it was four times faster on anything but your
actual application, then there is a danger that the speed ratio will not
apply to your actual application. So, extra speed may make little practical
difference, depending on where the bottlenecks are in your application.

As to how VW could possibly be faster than Dolphin, I understand VW does
considerable optimization, such as caching the results of method lookups so
they take much less time when repeated, perhaps even caching actual machine
code for the most used methods (I'm not sure of this). Eliot has posted
some specifics about this in comp.lang.smalltalk. So, yes, I see how VW
could be and probably is faster than Dolphin. It is just not clear whether
this matters.
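
To make the idea of caching method lookups concrete, here is a minimal sketch in Python rather than Smalltalk. It is not a description of how VW actually does it; the `Lookup` class, the `(class, selector)` cache key, and the `Animal`/`Dog` classes are all invented for the illustration. The slow path walks the superclass chain once, and every later send of the same selector to the same class is a single dictionary hit.

```python
class Lookup:
    """Hypothetical global method cache keyed by (class, selector)."""

    def __init__(self):
        self.cache = {}
        self.slow_lookups = 0  # how often the superclass chain was walked

    def find(self, cls, selector):
        key = (cls, selector)
        method = self.cache.get(key)
        if method is None:
            # Slow path: search the class and then each superclass in turn.
            self.slow_lookups += 1
            for klass in cls.__mro__:
                if selector in vars(klass):
                    method = vars(klass)[selector]
                    break
            else:
                raise AttributeError(selector)
            self.cache[key] = method
        return method


class Animal:
    def speak(self):
        return "..."


class Dog(Animal):
    pass  # inherits speak, so the lookup has to climb to Animal


lookup = Lookup()
for _ in range(1000):
    method = lookup.find(Dog, "speak")
result = method(Dog())
```

After the loop, `lookup.slow_lookups` is 1: only the first of the thousand sends paid for the full search.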

-- Frank
[hidden email]

Howard Oh

Re: speed question

Frank,

> I was going to try to make a joke along the lines of "I once 'accidentally'
> visited a whorehouse in Tijuana one time" or "I accidentally attended an 8
> o'clock class one time in college" but I decided against it.
>

Alright, do you want me to say that I visited there deliberately and am
posting a nasty message here? Well, I can't. Dolphin is the first Smalltalk
I have used seriously, and I use Dolphin much more than VW.

> my motto is
>
> No optimization before its/it's time.
>

I think it is every Smalltalk programmer's motto. I follow it, too.
Ian's ProfileViewer helped me a lot in attacking bottlenecks, and to my great
surprise I got a tenfold performance improvement.
But sadly, even that wasn't good enough.
I usually reach a dead end in the joyful journey of optimization, where the
final performance is still poor.
When I get there, ProfileViewer tells me that the bottleneck is a core
Smalltalk method, such as Array>>at:.

I hate it when fellow C programmers laugh at my application crawling like a
turtle. Can you help me out?

> If your overall application was four times faster in VW, then that is, of
> course, significant. But, if it was four times faster on anything but
> your actual application, then there is a danger that the speed ratio will
> not apply to your actual application. So, extra speed may make little
> practical difference, depending on where the bottlenecks are in your
> application.

Okay, I should be more patient and do more benchmarking. That interests me,
too.
Could you give me a large package of code that can be run on both systems?
Then I can see whether the difference disappears there.

> As to how VW could possibly be faster than Dolphin, I understand VW does
> considerable optimization, such as caching the results of method lookups
> so they take much less time when repeated, perhaps even caching actual
> machine code for the most used methods (I'm not sure of this). Eliot has
> posted some specifics about this in comp.lang.smalltalk. So, yes, I see
> how VW could be and probably is faster than Dolphin. It is just not clear
> whether this matters.

"Well, VW is a JIT, Dolphin is an interpreter. That's why VW is faster.
Dolphin uses native widgets on Windows though, which is why it may feel
snappier." says James A. Robertson at comp.lang.smalltalk

Nobody wants to write a slow program.
Many say that Smalltalk's memory management lets programmers relax about
crashes.
But they overlook that Smalltalk programmers end up nervous about
performance instead.
Am I the only one who worries about it?

VW has many defects, too. Its GUI and shortcuts are terribly unfriendly.
MVC is said to be hard to use.

PS: Please don't treat me like a traitor.

Best regards
Hwa Jong Oh

Andy Bower

Re: speed question

Hwa Jong,

> "Well, VW is a JIT, Dolphin is an interpreter. That's why VW is faster.
> Dolphin uses native widgets on Windows though, which is why it may feel
> snappier." says James A. Robertson at comp.lang.smalltalk

The Dolphin VM is a bytecode interpreter. The Dolphin compiler converts the
Smalltalk code in your methods into a series of byte codes that are then
executed in a loop by the VM. To get a feeling for what the bytecodes are
like try:

method := (Collection methodDictionary at: #select:).
method getSource. "Display it.. the Smalltalk source for
Collection>>select:"
method byteCodes. "Display it.. the byte codes"
Disassembler disassemble: method "Display it.. the bytecode language"
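
For readers who have never seen one, the "executed in a loop by the VM" part can be sketched as follows. This is a toy written in Python, not Dolphin's VM: the four opcodes (`PUSH`, `ADD`, `MUL`, `RETURN`) and their encoding are invented for the example and have nothing to do with Dolphin's real bytecode set.

```python
# Hypothetical instruction set, invented for this sketch.
PUSH, ADD, MUL, RETURN = range(4)


def interpret(bytecodes):
    """Fetch, decode, and execute one bytecode at a time until RETURN."""
    stack = []
    pc = 0  # index of the next bytecode to fetch
    while True:
        op = bytecodes[pc]
        pc += 1
        if op == PUSH:
            # The operand is stored inline, right after the opcode.
            stack.append(bytecodes[pc])
            pc += 1
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == RETURN:
            return stack.pop()
        else:
            raise ValueError(f"unknown bytecode {op}")


# (3 + 4) * 5 in stack-machine order
program = [PUSH, 3, PUSH, 4, ADD, PUSH, 5, MUL, RETURN]
result = interpret(program)
```

The dispatch in the `if`/`elif` chain is paid on every single bytecode, which is the overhead a JIT removes.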

Most other Smalltalks use a technique of JIT (Just In Time) compilation.
Their compilers also produce bytecodes which are stored with the compiled
methods. However, just before a method is about to be run the bytecodes are
re-compiled (translated is a better word) into real Pentium machine code
that can be executed directly by the computer's processor without the need
for an interpreter. Therefore, yes, a VM based on a JIT will typically be
much (2 to 10 times) faster than a pure interpreter. You can see this if you
run some benchmarks on the VMs.

Our paper "The Interpreter is Dead (slow) isn't it" aims to set out why we
don't consider a VM interpreter to be an outmoded design despite its
"apparent" reduced execution speed relative to a JIT. You can find the paper
at:

http://www.object-arts.com/Papers/TheInterpreterIsDead.PDF

On the first page of this you'll find some "micro-benchmarks" (by this I
mean benchmarks of relatively small pieces of code) that compare the VMs of
Dolphin 3 and VW. Both VMs have moved on over the last couple of years and
are faster than when the figures were compiled but the idea should be the
same. Notice the last column in the benchmark table. On "Towers of Hanoi"
Dolphin is 7 times slower than VW. On "String Compare" it is 2 times faster. So
there is obviously a wide range of performance ratios that can be deduced
from benchmarks and this is their danger.
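
A small calculation shows how far the overall ratio can swing. It uses only the two figures quoted above (7 times slower, 2 times faster) and one assumption that is not in the paper: that an application's running time on VW is split between Hanoi-like code and String-Compare-like code in some proportion `hanoi_share`.

```python
HANOI_RATIO = 7.0           # Dolphin time / VW time on "Towers of Hanoi"
STRING_COMPARE_RATIO = 0.5  # Dolphin time / VW time on "String Compare"


def dolphin_to_vw_ratio(hanoi_share):
    """Overall Dolphin/VW time ratio for a hypothetical workload mix.

    hanoi_share is the fraction of the VW running time spent in
    Hanoi-like code; the rest is assumed to be String-Compare-like.
    """
    return hanoi_share * HANOI_RATIO + (1.0 - hanoi_share) * STRING_COMPARE_RATIO


all_strings = dolphin_to_vw_ratio(0.0)   # 0.5: Dolphin twice as fast
half_half = dolphin_to_vw_ratio(0.5)     # 3.75: Dolphin clearly slower
all_hanoi = dolphin_to_vw_ratio(1.0)     # 7.0: Dolphin seven times slower
```

Depending on the mix, the same two measurements support anything from "twice as fast" to "seven times slower".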

So benchmarks are not everything; you have to be very careful how you
interpret the results. Consider the following issues which are brought up in
the above paper:

1) A JIT must translate a method into machine code at runtime and this
process takes time. Typically, the translated code is held in a cache so the
time consuming translation only occurs once. However, this translation
process means that methods that are only executed once or rarely will
typically run slower on a JIT than an interpreter. Benchmarks, since they
are run repeatedly, don't illustrate this difference. Similarly, JIT
applications will often be slower to start up because of the need to
translate the bytecodes for all methods the first time they are run.

2) The code cache required by a JIT will typically be of the order of
several megabytes and this increases the memory footprint of any application
running on a JIT VM.

3) Garbage collection. Most garbage collectors run at idle time or when the
memory footprint of the application grows large enough to demand it. This
means that many benchmarks will often run without incurring any garbage
collection penalty (the collection happens in idle time after the benchmark
has completed). Dolphin, however, has a two stage garbage collection, the
first stage of which runs incrementally while the code is running. This
means that the Dolphin benchmark figures will include the g/c hit while the
JIT figures typically may not.

4) External Interfacing. Most modern applications have extensive
communication with the "world outside of Smalltalk". Dolphin in particular
makes much use of its callouts to Windows DLLs, use of native Windows
widgets and COM interfacing. It is essential that this interface be fast and
yet the speed of this is not measured by any commonly used benchmark. This
sort of interfacing is a function of the VM and yet is independent of
whether a JIT or interpreter is used (although for reasons discussed in the
paper, an interpreter can have advantages in this area). We believe that the
external interfacing capabilities of the Dolphin VM are among the fastest in
the business.

5) VM Primitives. The operations implemented in primitives can have a
significant impact on the performance of the micro-benchmarks. Fortunately,
Dolphin comes off quite well here. In the above benchmarks, Dolphin is much
faster than VW on "String Compare" because it uses a primitive to do this.
Similarly, any tests using large integer arithmetic will most probably shine
in Dolphin because it has some of the fastest large integer primitives
available.
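
Point 1 can be put into numbers with a simple cost model. The figures below are invented purely for illustration (they are not measurements of Dolphin or VW): interpreting a method costs 10 time units per call, the translated code costs 2 units per call (a 5x speed-up, inside the 2 to 10 times range mentioned earlier), and translating the method once costs 200 units.

```python
def interpreter_time(calls, interp_cost):
    """Total time when every call is interpreted."""
    return calls * interp_cost


def jit_time(calls, translate_cost, native_cost):
    """Total time when the method is translated once, then run natively."""
    return translate_cost + calls * native_cost


def break_even_calls(translate_cost, interp_cost, native_cost):
    """Smallest call count at which the JIT is strictly faster.

    Assumes integer costs and interp_cost > native_cost.
    """
    return translate_cost // (interp_cost - native_cost) + 1


# Hypothetical costs in arbitrary time units.
TRANSLATE, INTERP, NATIVE = 200, 10, 2

calls_needed = break_even_calls(TRANSLATE, INTERP, NATIVE)
once_interp = interpreter_time(1, INTERP)        # 10
once_jit = jit_time(1, TRANSLATE, NATIVE)        # 202
```

With these made-up costs a method has to be called 26 times before translation pays for itself, and a method that runs only once is about twenty times slower under the JIT.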

My point here is that you should "beware of the power of the benchmark".
Benchmarks do not tell the whole story. Your application is as fast as it
appears. For example, both Dolphin and VW are similar applications; i.e.
Smalltalk development systems. And yet if you run the development
environments side by side Dolphin appears much snappier (faster) even though
the raw bytecode speed of the VM is significantly slower. This is because
Dolphin uses native Windows widgets and has better external i/f capabilities
than VW. It was this appearance of speed that led many people to assume that
Dolphin used a JIT VM until we "came clean" and released the above paper.

> Nobody wants to write a slow program.
> Many say that Smalltalk memory management makes programmers relax about
> crashing.
> But they are missing that Smalltalk programmers should be nervous about
> performance at the end.
> Am I the only one who worries about it?

Remember, speed isn't everything. If it was then I'd be using C++, C or
(heaven forbid) assembly language!

I don't worry about performance initially. The rule "Make it run. Make it
run correctly. Make it run fast." is a good one. As you say in your other
message you achieved a 10x speed up of your application by profiling and
refactoring the code. This is much more than you would achieve by swapping
VMs. However, if your application is still not acceptably fast then a switch
to another platform may be the only route.

> VW has many defects, too. Its GUI and shortcuts are terribly unfriendly.
> MVC is said to be hard to use.
>
> PS: Please don't treat me like a traitor.

Don't worry, we won't. If you prefer VW over Dolphin because it is the only
way to get your application to run fast enough then use it.

Best Regards,

Andy Bower
Dolphin Support
http://www.object-arts.com

---
Visit the Dolphin Smalltalk WikiWeb
http://www.object-arts.com/wiki/html/Dolphin/FrontPage.htm
---

Bill Schwab-2

Re: speed question

Andy and Hwa Jong,

Andy, one thing you did not mention in your reply is a generational GC; my
expectation is that this would help Dolphin's performance. Would it? If
so, do you have any thoughts on when it might appear or perhaps its relative
priority?

Hwa Jong, depending on the nature of your app, you might find that some of
VW's advantage is that it avoids native widgets =:0 I have one app that
is now in early production use and benefited greatly from _not_ using MS
widgets, and another that would be flatly impossible to make work with
native widgets - that one is still in development. In both cases, there are
large numbers of widgets on the screen. Another thing that can make a big
difference is to remove toolbars from windows; I once had to run Dolphin
(3.x) on a 486 machine that had some kind of Java middleware causing it to
run (crawl is a better term) in compatibility mode, and quickly deleting the
toolbars from the workspace and browser gave me a remarkably snappy IDE.
Also, if you have a large collection of floats or integers or other
(potentially) intrinsic types, you might see a boost from storing them in
FLOATArray, etc., rather than ordinary collections. StructureArray can be
quite slow compared to the other external array classes, so you would do
well to experiment with alternatives (as would I).
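
The reason packed numeric arrays can help is easiest to see by comparing storage. The sketch below uses Python's standard `array` module as an analogue; it says nothing about how Dolphin's FLOATArray is implemented, and the byte counts are for CPython, not for any Smalltalk.

```python
import sys
from array import array

values = [float(i) for i in range(10_000)]

boxed = list(values)          # each element is a separate float object
packed = array("d", values)   # raw 8-byte doubles stored back to back

# Storage for the packed form: one machine double per element.
packed_bytes = packed.itemsize * len(packed)

# Storage for the boxed form: the list's pointer table plus every float object.
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(v) for v in boxed)

same_total = sum(packed) == sum(boxed)
```

The packed array holds the same numbers in 80,000 bytes, while the boxed list needs several times that and scatters the values across the heap.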

This is starting to sound like another Wiki page...

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]

Andy Bower

Re: speed question

Bill,

> Andy, one thing you did not mention in your reply is a generational GC; my
> expectation is that this would help Dolphin's performance. Would it? If
> so, do you have any thoughts on when it might appear or perhaps its
> relative priority?

Yes, a generational collector would almost certainly improve the speed by a
significant amount. I'm not sure, of course, what exactly I mean by
"significant" though ;-)

We are not sure when such a beast will be implemented.

Best Regards,

Andy Bower
Dolphin Support
http://www.object-arts.com


Dave Harris-3

Re: speed question

In reply to this post by Howard Oh

[hidden email] (Hwa Jong Oh) wrote (abridged):
> When I reach there, ProfileViewer tells me that the core smalltalk
> method is the bottleneck, such as Array>>at:.

This could be because Array>>at: is slow, but more likely you are calling
it too many times. There may still be higher level or algorithmic
improvements you can make.
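
As an illustration of "calling it too many times", the sketch below counts element reads for two ways of finding a value in a sorted array. It is Python, not Smalltalk; `CountingArray` and its `at` method are invented stand-ins for an Array whose at: calls a profiler would report, and indexes are 0-based here.

```python
class CountingArray:
    """Hypothetical array wrapper that counts every element read."""

    def __init__(self, items):
        self._items = list(items)
        self.reads = 0

    def at(self, index):
        self.reads += 1
        return self._items[index]

    def __len__(self):
        return len(self._items)


def linear_index_of(arr, target):
    """Scan from the front: up to len(arr) reads."""
    for i in range(len(arr)):
        if arr.at(i) == target:
            return i
    return -1


def binary_index_of(arr, target):
    """Halve the search range each step: about log2(len(arr)) reads."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        value = arr.at(mid)
        if value == target:
            return mid
        if value < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1


linear = CountingArray(range(100_000))
binary = CountingArray(range(100_000))
linear_result = linear_index_of(linear, 99_999)
binary_result = binary_index_of(binary, 99_999)
```

Both searches find the same element, but the linear scan performs 100,000 reads and the binary search at most 17. In a profile, `at` would dominate in both cases; only the algorithm changed.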

Failing that, try to identify a core inner loop that you can write in C
and call from Dolphin. Dolphin is pretty good at interfacing with other
languages.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
[hidden email] | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Christopher J. Demers

Re: speed question

In reply to this post by Howard Oh

Hwa Jong Oh <[hidden email]> wrote in message
news:9821ur$r5njo$[hidden email]...
> I usually reach dead-end of the joyful journey of optimization where the
> final performance is poor.
> When I reach there, ProfileViewer tells me that the core smalltalk method
> is the bottleneck, such as Array>>at:.
>
> I hate when fellow C programmers laugh at my application going like a
> turtle. Can you help me out?
>

Have you looked at Smalltalk MT
( http://www.objectconnect.com/stmt_overview.htm )? I can only talk from a
theoretical perspective as I have not really used it much. The idea is that
it uses an optimizing compiler to generate native machine code directly (no
runtime JIT or interpreter overhead). It is supposed to be fast and
generate small executables. It is not inexpensive (~$800 for commercial
use, but I think they have an educational version for ~$100).

The last time I looked at Smalltalk MT (an old version) I thought it still
had some rough edges. I would not necessarily suggest replacing Dolphin
with it but rather you might consider using Smalltalk MT to develop
optimized code in an ActiveX DLL (as an alternative to doing the same thing
in C++) then call it from your Dolphin application. Ideally this might give
you the speed of optimized static compilation without having to resort to
C++. Perhaps there is someone that has had more actual experience with
Smalltalk MT that can comment on this notion from a practical perspective.

I have also looked at VisualWorks. Their Windows UI emulation never seemed
to look or work quite as I would have liked. I was also turned off by their
royalty pricing, and since we only target Windows, multi-OS support was never
important to us.

Good luck with your speed issues. I would be interested to hear how things
turn out.

Chris

Frank Sergeant

Re: speed question

In reply to this post by Howard Oh

"Hwa Jong Oh" <[hidden email]> wrote in message
news:9821ur$r5njo$[hidden email]...
> > I was going to try to make a joke along the lines of "I once
> > 'accidentally' ...

> Alright, do you want me to say that I visited there deliberately

No, no, I was only joking. It just struck me funny that someone would
accidentally visit Cincom.

> and posting nasty message here?

If I gave the appearance that I thought you had posted a nasty message here,
that was an incorrect appearance. I was very happy with your posting as it
gave me a chance to rant on the subject of speed; ranting at myself, not
you, certainly not at either VW or Dolphin.

> Well, I can't. Dolphin is my first seriously used Smalltalk VM. And I use
> Dolphin much more than VW.

VW was the Smalltalk I was working with when Smalltalk "clicked" for me, so
it will always have a place in my heart. I think it is perfectly reasonable
for us Dolphin fans to be aware of other Smalltalks and to compare them and
use them as we find appropriate. I have made a terrible error if I gave any
other impression.

> I hate when fellow C programmers laugh at my application going like a
> turtle. Can you help me out?

I hope my previous attempt at humor did not obscure the main point of my
message, which I repeat here:

> > If your overall application was four times faster in VW, then that is,
> > of course, significant. But, if it was four times faster on anything but
> > your actual application, then there is a danger that the speed ratio
> > will not apply to your actual application.

It wasn't clear to me from your posting whether you had found that VW
increased your actual application's speed by a factor of 4 or whether you
were speaking of isolated benchmarks.

> > So, extra speed may make little practical difference,
> > depending on where the bottlenecks are in your application.

Of course, I was thinking of my own application as I wrote the above
paragraph. In my case, I access a database over a TCP/IP connection to
materialize my domain objects. I start a brand new socket connection for
each database request (rather than maintaining a permanent socket
connection). I reckon (although even that is risky unless I measure it
specifically) I am spending so much time on the socket transfers that
doubling the speed of the rest of my application would not be *noticeable*.
In my case, it hardly matters if I increase the speed at which Dolphin
*waits* for the data. Furthermore, in spite of this, my application is fast
enough. So, right now, speed is not my worst problem, so I am doing my best
to refrain from addressing it. I may not have been clear in my previous
posting, but I find it *difficult* to avoid addressing speed. It is so very
tempting for me to spend time on optimization that it takes conscious effort
to avoid doing so. I don't always succeed.
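
The socket example is an instance of what is usually called Amdahl's law, and the arithmetic is short enough to write down. The 90%/10% split below is a made-up figure for illustration, not a measurement of the application described above.

```python
def overall_speedup(fraction_improved, local_speedup):
    """Amdahl's law: speed-up of the whole run when only part gets faster.

    fraction_improved: share of the original run time that is sped up.
    local_speedup: how many times faster that share becomes.
    """
    remaining = (1.0 - fraction_improved) + fraction_improved / local_speedup
    return 1.0 / remaining


# Hypothetical split: 90% of the time waiting on sockets, 10% running
# Smalltalk code, and a VM that makes the Smalltalk code twice as fast.
with_faster_vm = overall_speedup(0.10, 2.0)

# The same doubling when the Smalltalk code is the entire run time.
cpu_bound = overall_speedup(1.0, 2.0)
```

Under that assumption a VM twice as fast makes the whole application only about 5% faster (1 / 0.95), while the same VM doubles the speed of a purely CPU-bound program.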

It sounds like your application is quite a bit different from mine and/or
that you are advanced to a point where lack of speed *is* your worst
problem.

> Okay, I should be more patient to go on more benchmarking. That is my
> interest, too.

And, you are probably already doing this, but my understanding is that
vigorously applying "once and only once" with no optimization at all is the
best way to lay the groundwork for future optimization, as the system is
more modular and the true bottlenecks are easier to identify and then to
fix.

> You can give me a package of large code that can be run on both systems.
> Let me see if there is no difference there.

Your application is itself such a package, isn't it?

> But they are missing that Smalltalk programmers should be nervous about
> performance at the end.
> Am I the only one who worries about it?

I am sure you are not the only one who worries about it, and I think it is a
legitimate concern, as long as it is viewed at the "macro" level and not
just the "micro" level. For example, some years ago I did some contract
programming on the internals of a game engine. I was shocked at how wrong
my intuition could be as to where the actual bottlenecks lay. Steps taken
*outside* the main loop could be extremely inefficient without having *any*
effect upon the frame rate. On the other hand, various pieces of common
wisdom, such as unrolling loops (inside the main loop), could turn out to slow
down the frame rate rather than increase it. Counterintuitive, eh? The only thing
that works is to *measure* rather than guess. (Cache memory is an important
factor which can lead to slow, small code being faster than fast, large
code, if it affects the rate of cache misses. (This could be a point in
favor of Dolphin using a bytecode interpreter.))
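
"Measure rather than guess" needs very little machinery. The harness below is a minimal Python sketch; `best_time` and the two candidate functions are invented for the example, and the timings it produces depend entirely on the machine it runs on, so no particular winner is claimed here.

```python
import time


def best_time(func, repeats=5):
    """Run func several times and return the fastest wall-clock time.

    Taking the minimum reduces noise from other activity on the machine.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def sum_with_loop():
    total = 0
    for i in range(100_000):
        total += i
    return total


def sum_with_builtin():
    return sum(range(100_000))


# Both candidates must give the same answer before their speed is compared.
same_answer = sum_with_loop() == sum_with_builtin()

timings = {
    "loop": best_time(sum_with_loop),
    "builtin": best_time(sum_with_builtin),
}
```

Comparing `timings["loop"]` with `timings["builtin"]` on the real target machine replaces the guess with a number.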

> VW has many defects, too. Its GUI and shortcuts are terribly unfriendly.

Say, are you the VW critic or am I? I got on quite well with VW. That is,
I was very comfortable working in its IDE. I was using VWNC 3.x. It was
Dolphin I was lost in for the longest time. I'm glad I kept at it.

> MVC is said to be hard to use.

And, here again, this clicked for me pretty quickly. Plus, there are a
number of great books specifically directed at VW that help out here. I am
really looking forward to Dolphin having its own book. (Squeak has one and
soon will have another.)

> PS: Please don't treat me like a traitor.

I hope I have climbed out of the hole I dug myself into.

-- Frank
[hidden email]