"Eliot Miranda" <[hidden email]> wrote in message
news:[hidden email]...
> ...
> Interesting! [or alternatively "Ouch!", ed] The VW oldSpace allocator,
> used to allocate tenured objects (which is what this "let's keep tons of
> objects around" test stresses), in VW's case is poor w.r.t. a classic
> blue-book implementation, because VW doesn't organize its oldSpace free
> lists as an objectTableEntry (ote) holding onto an objectBody. Instead it
> keeps separate lists of free otes and objectBodies. So allocating an
> oldSpace object requires unlinking an ote from one free list and an
> objectBody from another. Further, the allocation code is not at all
> aggressively inlined and involves at least three procedure calls.
>
> Blair, if you're comfortable discussing it, what oldSpace free list
> organization does D5 use?

It too would have to allocate from two separate lists, one for the header
and one for the body, but it's not a classic generational collector so it
would perhaps do better on this unnatural test. Like I say, I don't think
it's very relevant to normal application performance. What is interesting,
however, is the speed of the Java collectors, which are presumably more
similar in design to VW's?

Regards

Blair
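[Editorial aside: the two-free-list organization Eliot describes can be
sketched roughly as below. This is a toy model for illustration only, not
VW's actual code; all class and field names here are invented.]

```java
// Toy model of the oldSpace free-list organization described above:
// free object-table entries (otes) and free object bodies live on two
// separate lists, so every tenured allocation must unlink one node from
// each list, versus a single unlink in a blue-book-style scheme where
// the ote and body travel together.
final class ObjectTableEntry {
    ObjectBody body;          // filled in at allocation time
    ObjectTableEntry next;    // link on the free-ote list
}

final class ObjectBody {
    int sizeInWords;
    ObjectBody next;          // link on the free-body list
}

final class OldSpace {
    private ObjectTableEntry freeOtes;
    private ObjectBody freeBodies;

    void addFreeOte(ObjectTableEntry e) { e.next = freeOtes; freeOtes = e; }
    void addFreeBody(ObjectBody b)      { b.next = freeBodies; freeBodies = b; }

    // Two unlinks per allocation; returns null when either list is
    // empty (a real VM would trigger a collection or grow the space).
    ObjectTableEntry allocate() {
        ObjectTableEntry e = freeOtes;
        ObjectBody b = freeBodies;
        if (e == null || b == null) return null;
        freeOtes = e.next;
        freeBodies = b.next;
        e.body = b;
        return e;
    }
}
```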
Blair McGlashan wrote:
> [snip]
>
> It too would have to allocate from two separate lists, one for header and
> one for body, but it's not a classic generational collector so would
> perhaps do better on this unnatural test. Like I say I don't think it's
> very relevant to normal application performance. What is interesting,
> however, is the speed of the Java collectors, which are presumably more
> similar in design to VW's?

I don't think so. Aren't many of the commercial Java offerings based on the
train algorithm? VW doesn't use the train algorithm. It has a
straightforward incremental collector, stop-the-world mark-sweep, and a
three-generation system (scavenged newSpace, incrementally collected
oldSpace, and uncollected [except for stop-the-world collection]
permSpace). The only thing exotic about the VW collector is ephemerons.

--
_______________,,,^..^,,,____________________________
Eliot Miranda              Smalltalk - Scene not herd
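[Editorial aside: the scavenged newSpace Eliot mentions gets its speed from
bump-pointer allocation. A rough sketch, illustrative only and not VW's
code, with invented names:]

```java
// Illustrative bump-pointer nursery: allocation is just a bounds check
// plus a pointer increment, which is why generational collectors make
// short-lived allocation so much cheaper than free-list searches.
final class Nursery {
    private final byte[] space;
    private int top = 0;

    Nursery(int sizeBytes) { space = new byte[sizeBytes]; }

    // Returns the offset of the new object, or -1 when a scavenge is
    // needed (a real VM would copy survivors to the next generation).
    int allocate(int sizeBytes) {
        if (top + sizeBytes > space.length) return -1;
        int obj = top;
        top += sizeBytes;
        return obj;
    }

    // After survivors are evacuated, the whole nursery is free again.
    void resetAfterScavenge() { top = 0; }
}
```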
Eliot Miranda wrote:
> Aren't many of the commercial Java offerings based
> on the train algorithm?

It's difficult to find out what the Java vendors use. IBM only seem to talk
details about their research JVMs (and some of it is pretty interesting). I
haven't yet found any data about anyone else except Sun.

Sun (of course the major player) manage to obfuscate what they're doing
pretty well too. Since they seem to change the GC with every major release,
but don't often update the documentation, it's difficult to work out what's
going on. However, my best guess (and it *is* a guess, note) is that the
current (J2SDK 1.4.1) JVM uses:

1) A perm space, perhaps; there was definitely one in 1.3, but there are
hints, no more, that it's gone away in 1.4.

2) A long-lived object space, collected either by mark-and-compact or by a
parallelised equivalent; the choice is configurable, defaulting to the
non-parallel version.

3) An intermediate space that is collected either by mark-and-compact or by
a train algorithm. Again the choice is configurable; the default is to use
mark-and-compact. The distinction between 2 and 3 is only mentioned once;
mostly the doc leaves you with the impression that the train algorithm is
used for all long-lived objects (if it's used at all).

4) A nursery consisting of an allocation area and a couple of alternating
survivor spaces. Optionally (as of 1.4.1), there's a parallelised GC
available for this space too.

At one time, Sun claimed that their latest JVM (I think this was around
1.3) had improved allocation in multi-threaded apps because it now used a
per-thread pool of some sort, but I don't see how that can be reconciled
with their other documentation.
The best links I can find are:

http://java.sun.com/docs/hotspot/gc/index.html
http://developer.java.sun.com/developer/technicalArticles/Networking/HotSpot/
http://developer.java.sun.com/developer/technicalArticles/Programming/turbo/

And there's some of their research stuff at:

http://research.sun.com/jtech/pubs/

which I'm sure won't have much news for Eliot or Blair, but there's some
pretty interesting stuff there for the rest of us (those of us who happen
to be sad VM-junkies, anyway ;-)

-- chris
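[Editorial aside: on later JVMs (Java 5 onwards, where the
java.lang.management API exists -- it postdates the 1.4.1 release discussed
above), you can ask the running VM which collectors it is actually using
instead of guessing from the documentation:]

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class GcProbe {
    public static void main(String[] args) {
        // Each collector bean names the heap generations it manages.
        for (GarbageCollectorMXBean gc :
                 ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName()
                + " -> " + String.join(", ", gc.getMemoryPoolNames()));
        }
        // The pools themselves (eden, survivor, old/tenured,
        // perm gen or metaspace, depending on the JVM version).
        for (MemoryPoolMXBean pool :
                 ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println("Pool: " + pool.getName());
        }
    }
}
```

The names printed vary by vendor and by flags such as -XX:+UseParallelGC,
which makes this a handy way to check what a given release defaults to.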
On the current S#.AOS VM I tried the following code:
|kTimes := 1000000, oc, run|
VM.gcMemory.
run := {
    Time millisecondsToRun: [
        oc := OrderedCollection new: kTimes.
        1 to: kTimes do: [:each | oc add: Rectangle new]
    ].
    Time millisecondsToRun: [1 to: kTimes do: [:each | (oc at: each) top]].
    Time millisecondsToRun: [oc do: [:each | each top]].
    Time millisecondsToRun: [
        |s| := 1.00000001.
        1 to: kTimes do: [:each | s := s * 1.00000001.].
    ].
}.
{run inject: 0 into: [:a:b| a+b], run}.

My machine info is:

OS Name: Microsoft Windows XP Professional
Version: 5.1.2600 Service Pack 1 Build 2600
OS Manufacturer: Microsoft Corporation
System Name: SATELLITE
System Manufacturer: TOSHIBA
System Model: Satellite 5105
System Type: X86-based PC
Processor: x86 Family 15 Model 2 Stepping 4 GenuineIntel ~1694 Mhz
BIOS Version/Date: TOSHIBA Version 1.70, 4/8/2002
SMBIOS Version: 2.3
Windows Directory: C:\WINDOWS
System Directory: C:\WINDOWS\System32
Boot Device: \Device\HarddiskVolume1
Locale: United States
Hardware Abstraction Layer: Version = "5.1.2600.1106 (xpsp1.020828-1920)"
User Name: SATELLITE\David Simmons
Time Zone: Pacific Standard Time
Total Physical Memory: 1,024.00 MB
Available Physical Memory: 419.88 MB
Total Virtual Memory: 2.65 GB
Available Virtual Memory: 1.53 GB
Page File Space: 1.65 GB
Page File: C:\pagefile.sys

Which yielded the following runs [all numbers in milliseconds]:

== 1M CASES ==
==============
Times = 1M [actual 1.7GHz mobile cpu speed]
RUN[0]: {1988, {1277, 37, 40, 634}}
RUN[1]: {741, {147, 35, 39, 520}}
RUN[2]: {354, {144, 35, 50, 125}}
RUN[3]: {356, {141, 34, 39, 142}}

Times = 1M [1.7GHz mobile cpu speed scaled to 1.9GHz equiv]
RUN[0]: {1778, {1142, 33, 35, 567}}
RUN[0]: {663, {131, 31, 34, 465}}
RUN[0]: {316, {128, 31, 44, 111}}
RUN[0]: {318, {126, 30, 34, 127}}

== 3M CASES ==
==============
Times = 3M [actual 1.7GHz mobile cpu speed]
RUN[0]: {5781, {3537, 106, 121, 2017}}
RUN[0]: {1172, {438, 104, 120, 510}}
RUN[0]: {952, {391, 117, 121, 323}}
RUN[0]: {948, {401, 104, 120, 323}}

Times = 3M [1.7GHz mobile cpu speed scaled to 1.9GHz equiv]
RUN[0]: {5172, {3164, 94, 108, 1804}}
RUN[0]: {1048, {391, 93, 107, 456}}
RUN[0]: {851, {349, 104, 108, 289}}
RUN[0]: {848, {358, 93, 107, 289}}

----------------------
Some things to note...
----------------------

o) The current AOS.VM build I ran this against does not have "ephemeral-gc"
services enabled -- which negatively affects all constructor times -- the
ephemeral gc, among other things, uses auto-inlined custom jitted #new
methods for every class.

The biggest performance impact of this would be on the [Double mul time]
4th value (within a given run/set), but it also impacts the performance of
the rectangle construction.

Typically when the "ephemeral-gc" is enabled it generates ephemeral objects
10-20 times faster (10x vs 20x depends largely on cpu cache
characteristics).

o) My machine has a lot of other stuff running [I'm building S#.NET with
VS.NET open, sucking down Squeak 3.4, etc] which tends to make the cited
numbers probably some 10-15% higher [longer times] than they might
otherwise be.

o) I ran the tests in a heavily loaded environment browser. If this
affected the tests it should have done so by hurting the performance
numbers.

What does this mean?
--------------------

1. In all likelihood, from comparative numbers I've run in the past, the
Rectangle.new method would probably run 2X or better than the current
version using a generic #new.

2. The 4th value (within a given run/set) is very likely to be
consistently less than 100ms if the ephemeral gc was enabled.

I would guess that with ephemeral-gc services enabled we would not see the
"RUN[0]" numbers but would instead see behavior consistent with the
"RUN[1]" and better cases. In all likelihood the numbers would be somewhat
better than the "RUN[4]" cases after the initial run (or some other gc
sizing trigger) occurred.

Are such benchmarks meaningful?
-------------------------------

I suspect they are probably not generally useful for hard-comparison
purposes, because +/- some 50% could be accounted for by variances in the
images and object memory at the time/environment in which the tests were
run.

However, running such tests is useful for identifying performance issues.
Most significantly for me, I found a policy bug in the VM regarding
resizeable object growth. I revised the policy as a result of this
benchmark. Which, as is often the case, turned out to be the most
important/useful aspect for me.

-- Dave S. [www.smallscript.org]

"Blair McGlashan" <[hidden email]> wrote in message
news:b3frad$1m4hep$[hidden email]...
> "Jochen Riekhof" <[hidden email]> wrote in message
> news:[hidden email]...
> > > > If that's the case then I'd expect a difference of less than
> > > > an order of magnitude between JITed Java and Dolphin
> > > > for *integer* arithmetic and integer arrays. The difference
> > > > is huge for floating point, though. (Presumably because of
> > > > Smalltalk's "boxed" floats.) If you are seeing a 20-to-1
> > > > difference then I'd guess that nearly all of it is down to
> > > > floating-point performance.
> > >
> > > This may well be.
> >
> > No, may not be :-).
> >
> > I did a VERY quick check on simple operations performance and here is
> > the result. I measured both server and client vm in hotspot and
> > interpreted mode vs. Dolphin. All code is appended. The reason the
> > server VM seems to be slow is that it does not have enough time to
> > "warm up". It does a lot of background analysis and compilation that
> > never pays off because the app runs only a few seconds at all. You can
> > expect the server vm to be faster than the client vm after a few
> > minutes.
> >
> > There is not a BIG difference in interpreted mode vs. Dolphin, except
> > iterators are half the speed of a do: operation.
> > Hotspot any version is much faster, however.
> > Float is only about a factor of two slower in dolphin as opposed to
> > your 1 to 20 guess.
> >
> > The most striking difference came from the memory management. Dolphin
> > needed more than 30 seconds to allocate the one million Rectangle
> > objects. After close of the workspace the env. froze for about 45
> > seconds for gc (I guess).
> >...
>
> I was pretty surprised by this, so I thought I'd look to see why. Just
> looking at the script, something that was immediately apparent is that
> your allocation test is actually allocating 3 million objects on
> Dolphin, vs 1 million on Java. This is because Smalltalk Rectangles are
> actually implemented as a pair of Point objects, whereas Java's is a
> single block of memory holding 4 integer values. Since this is a
> micro-benchmark designed to measure object allocation speed, I think it
> really ought to try and measure the same number of allocations. Note
> though that on VW, Rectangle class>>new answers an uninitialized
> Rectangle, so it is only performing 1 million allocations, at least if
> we ignore the allocations needed to grow the OrderedCollection. I
> noticed this when trying to run your benchmark on VW, as it failed on
> the second expression when attempting to access #top of the first
> Rectangle. Another point to note is that this isn't a particularly pure
> test of allocation speed, as Smalltalk has to send a few messages to
> initialize a Rectangle.
>
> Anyway, regardless of this, I tried out the following slight
> modification of your script on the 2.2Ghz P4 Xeon with 512Mb I happened
> to be using:
>
> start := Time millisecondClockValue.
> Transcript display: 'Alloc time: '; print: (Time millisecondsToRun: [
>     oc := OrderedCollection new.
>     "Use #origin:corner: so will also run on VW - note this actually
>     allocates 3 million objects"
>     1 to: 1000000 do: [:each | oc add: (Rectangle origin: 0@0 corner: 0@0)]]);
>     cr.
> Transcript display: 'Get (index) time: '; print: (Time millisecondsToRun:
>     [1 to: 1000000 do: [:each | (oc at: each) top]]); cr.
> Transcript display: 'Iterate (do) time: '; print: (Time millisecondsToRun:
>     [oc do: [:each | each top]]); cr.
> Transcript display: 'Double mul time: '; print: (Time millisecondsToRun:
>     [s := 1.00000001. 1 to: 1000000 do: [:each | s := s * 1.00000001.]]); cr.
> Transcript display: 'GC time: '; print: (Time millisecondsToRun: [oc := s :=
>     nil. MemoryManager current collectGarbage
>     "or ObjectMemory quickGC on VW"]); cr.
> Transcript display: 'Overall runtime: '; print: (Time
>     millisecondClockValue - start); cr
>
> These are the times I got from Dolphin 6 for the first and second runs,
> times in milliseconds:
>
> Alloc time: 4116
> Get (index) time: 422
> Iterate (do) time: 289
> Double mul time: 116
> GC time: 204
> Overall runtime: 5159
>
> Alloc time: 1408
> Get (index) time: 418
> Iterate (do) time: 290
> Double mul time: 105
> GC time: 211
> Overall runtime: 2441
>
> Running it a number of times, the figures varied a bit, but I haven't
> bothered to average them.
>
> As you can see the first run allocation time was significantly better
> than your experience. I didn't know your machine spec but assumed that
> it must be similar since the second run results are similar. I also
> didn't see any extended GC time, even if I replaced the #collectGarbage
> with a #compact, though doing that did mean that the subsequent run
> figures were not much faster than the first on the allocation test.
> Anyway, I thought this must be something massively improved in D6 vs D5
> (though I can't for the life of me think what :-)), so I went back to D5
> and got these results:
>
> Alloc time: 52363
> Get (index) time: 411
> Iterate (do) time: 284
> Double mul time: 123
> GC time: 190
> Overall runtime: 53375
>
> Alloc time: 1275
> Get (index) time: 414
> Iterate (do) time: 288
> Double mul time: 112
> GC time: 186
> Overall runtime: 2285
>
> I was happy that this coincided with your experience on the initial
> allocation behaviour (though not that D6 was 100mS slower on the
> subsequent run, even though this is probably just timing variability).
>
> I was still mystified as to the delay you experienced closing the
> workspace, since this didn't seem to be borne out by the forced GC
> timings (and if you insert a 'Rectangle primAllInstances size' at the
> end of the script, you'll see that those Rectangles really have been
> collected). So I thought I'd try out doing as you did, and simply
> closing the workspace, leaving the variables to be collected at idle
> time. To my surprise I experienced exactly the same lengthy freeze. I
> didn't measure its duration, but it was lengthy. I found that if I
> nilled out the workspace variables before closing the workspace, the
> delay did not occur, so I could only conclude that there is something
> very odd going on in the interaction between the view closing and the
> activities of the garbage collector. Obviously this needs to be
> investigated, but I don't think it is a fundamental performance problem
> in the Dolphin collector, as otherwise my other tests would also have
> shown that.
>
> As a point of reference I tried running the script on VWNC7. I had to
> change the Transcript #display: messages to #show:, and use
> "ObjectMemory quickGC" in place of "MemoryManager current
> collectGarbage" (it seemed the nearest equivalent), and this is what I
> got.
>
> Alloc time: 40849
> Get (index) time: 77
> Iterate (do) time: 51
> Double mul time: 327
> GC time: 116
> Overall runtime: 41445
>
> [Subsequent runs were similar]
>
> As you can see, performance on the initial allocation test was poor. I
> think this is because I either have insufficient memory to run the test
> in VW, or (more likely) the default memory policy/configuration is not
> appropriate for this test. Certainly there was an awful lot of flashing
> up of the GC and dustbin cursors while the test was running. So anyway,
> I don't think it is really a valid result, and I also think the FP mul
> figure is questionable, since once again this was probably
> over-influenced by GC activity.
>
> Anyway Jochen, I believe what has brought us to this point was your
> statement: "Performance is about factor twenty lower than Java HotSpot
> VM, ...". On this test at least, that would appear to be FUD, right? :-)
>
> [Frankly, though, I think you really need some more "macro" benchmarks,
> i.e. closer to an actual application, to draw any real performance
> conclusions]
>
> Regards
>
> Blair
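[Editorial aside: for readers wanting to reproduce the Java side of this
comparison, here is a reconstruction of the micro-benchmark in Java --
Jochen's actual code is not shown in this thread, so this is an
approximation of it using java.awt.Rectangle (whose field is `y` rather
than a #top message). Note that a modern JIT may partially eliminate the
dead read loops, so treat the timings loosely:]

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

public class MicroBench {
    static final int N = 1_000_000;

    // Wall-clock timing helper, mirroring Time millisecondsToRun:.
    static long timeMs(Runnable r) {
        long start = System.currentTimeMillis();
        r.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        List<Rectangle> oc = new ArrayList<>();
        System.out.println("Alloc time: " + timeMs(() -> {
            for (int i = 0; i < N; i++) oc.add(new Rectangle());
        }));
        System.out.println("Get (index) time: " + timeMs(() -> {
            for (int i = 0; i < N; i++) { int y = oc.get(i).y; }
        }));
        System.out.println("Iterate time: " + timeMs(() -> {
            for (Rectangle r : oc) { int y = r.y; }
        }));
        System.out.println("Double mul time: " + timeMs(() -> {
            double s = 1.00000001;
            for (int i = 0; i < N; i++) s *= 1.00000001;
        }));
    }
}
```

As Blair points out above, this allocates 1 million flat objects, where the
Smalltalk version allocates 3 million (a Rectangle plus two Points), so the
allocation figures are not directly comparable.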
Dag...
The:

> RUN[0]: ...
> RUN[0]: ...
> RUN[0]: ...
> RUN[0]: ...

forms should be read as:

> RUN[1]: ...
> RUN[2]: ...
> RUN[3]: ...
> RUN[4]: ...

It was my goof. I got interrupted in the midst of writing this to attend to
my lamb-roast dinner in the oven and forgot to come back and annotate them
properly. Sigh.

-- Dave S. [www.smallscript.org]

"David Simmons" <[hidden email]> wrote in message
news:[hidden email]...
> On the current S#.AOS VM I tried the following code:
> [snip]
FYI,
On a slightly less loaded machine test [just VS.NET and Outlook] using the same large image, yielded: == 1M CASES == ============== Times = 1M [actual 1.7GHz mobile cpu speed] RUN[0]: {1735, {996, 34, 40, 665}} RUN[1]: {719, {123, 34, 40, 522}} RUN[2]: {301, {132, 34, 39, 96}} Times = 1M [1.7GHz mobile cpu speed scaled to 1.9MHz equiv] RUN[0]: {1552, {891, 30, 35, 595}} RUN[1]: {643, {110, 30, 35, 467}} RUN[2]: {269, {118, 30, 34, 85}} ====================== ------ Errata (previous post) ------ RUN[0]: ... RUN[0]: ... RUN[0]: ... RUN[0]: ... forms, should be read as: RUN[0]: ... RUN[1]: ... RUN[2]: ... RUN[3]: ... -- Dave S. "David Simmons" <[hidden email]> wrote in message news:[hidden email]... > On the current S#.AOS VM I tried the following code: > > |kTimes := 1000000, oc, run| > VM.gcMemory. > run := { > Time millisecondsToRun: [ > oc := OrderedCollection new: kTimes. > 1 to: kTimes do: [:each | oc add: Rectangle new] > ]. > Time millisecondsToRun: [1 to: kTimes do: [:each | > (oc at: each) top]]. > Time millisecondsToRun: [oc do: [:each | each top]]. > Time millisecondsToRun: [ > |s| := 1.00000001. > 1 to: kTimes do: [:each | s := s * 1.00000001.]. > ]. > }. > {run inject: 0 into: [:a:b| a+b], run}. 
>
> My machine info is:
>
> OS Name Microsoft Windows XP Professional
> Version 5.1.2600 Service Pack 1 Build 2600
> OS Manufacturer Microsoft Corporation
> System Name SATELLITE
> System Manufacturer TOSHIBA
> System Model Satellite 5105
> System Type X86-based PC
> Processor x86 Family 15 Model 2 Stepping 4 GenuineIntel ~1694 Mhz
> BIOS Version/Date TOSHIBA Version 1.70, 4/8/2002
> SMBIOS Version 2.3
> Windows Directory C:\WINDOWS
> System Directory C:\WINDOWS\System32
> Boot Device \Device\HarddiskVolume1
> Locale United States
> Hardware Abstraction Layer Version = "5.1.2600.1106
> User Name SATELLITE\David Simmons
> Time Zone Pacific Standard Time
> Total Physical Memory 1,024.00 MB
> Available Physical Memory 419.88 MB
> Total Virtual Memory 2.65 GB
> Available Virtual Memory 1.53 GB
> Page File Space 1.65 GB
> Page File C:\pagefile.sys
>
> Which yielded the following runs [all numbers in milliseconds]:
>
> == 1M CASES ==
> ==============
> Times = 1M [actual 1.7GHz mobile cpu speed]
> RUN[0]: {1988, {1277, 37, 40, 634}}
> RUN[1]: {741, {147, 35, 39, 520}}
> RUN[2]: {354, {144, 35, 50, 125}}
> RUN[3]: {356, {141, 34, 39, 142}}
>
> Times = 1M [1.7GHz mobile cpu speed scaled to 1.9GHz equiv]
> RUN[0]: {1778, {1142, 33, 35, 567}}
> RUN[1]: {663, {131, 31, 34, 465}}
> RUN[2]: {316, {128, 31, 44, 111}}
> RUN[3]: {318, {126, 30, 34, 127}}
>
> == 3M CASES ==
> ==============
> Times = 3M [actual 1.7GHz mobile cpu speed]
> RUN[0]: {5781, {3537, 106, 121, 2017}}
> RUN[1]: {1172, {438, 104, 120, 510}}
> RUN[2]: {952, {391, 117, 121, 323}}
> RUN[3]: {948, {401, 104, 120, 323}}
>
> Times = 3M [1.7GHz mobile cpu speed scaled to 1.9GHz equiv]
> RUN[0]: {5172, {3164, 94, 108, 1804}}
> RUN[1]: {1048, {391, 93, 107, 456}}
> RUN[2]: {851, {349, 104, 108, 289}}
> RUN[3]: {848, {358, 93, 107, 289}}
>
> ----------------------
> Some things to note...
> ----------------------
>
> o) The current AOS.VM build I ran this against does not have services
> enabled -- which negatively affects all constructor times -- the ephemeral
> gc, among other things, uses auto-inlined custom jitted #new methods for
> every class.
>
> The biggest performance impact of this would be on the [Double mul time]
> 4th value (within a given run/set), but it also impacts the performance of
> the rectangle construction.
>
> Typically when the "ephemeral-gc" is enabled it generates ephemeral objects
> 10-20 times faster (10 vs 20x depends largely on cpu cache characteristics).
>
> o) My machine has a lot of other stuff running [I'm building S#.NET with
> VS.NET open, sucking down Squeak 3.4, etc] which tends to cause the cited
> numbers to be probably some 10-15% higher [longer times] than they might
> otherwise be.
>
> o) I ran the tests in a heavily loaded environment browser. If this affected
> the tests it should have done so by hurting the performance numbers.
>
> What does this mean?
> -------------------
>
> 1. In all likelihood, from comparative numbers I've run in the past, the
> Rectangle.new method would probably run 2X or better than the current
> version using a generic #new.
>
> 2. The 4th value (within a given run/set) is very likely to be less than
> 100ms if the ephemeral gc was enabled.
>
> I would guess that with ephemeral-gc services enabled we would not see the
> "RUN[0]" numbers but would instead see behavior consistent with the "RUN[1]"
> and better cases. In all likelihood the numbers would be somewhat better
> than the "RUN[3]" cases after the initial run (or some other gc sizing
> trigger) occurred.
>
> Are such benchmarks meaningful?
> ------------------------------
>
> I suspect they are probably not generally useful for hard-comparison
> purposes because +/- some 50% could be accounted for by variances in images
> and object memory at the time/environment in which the tests were run.
>
> However, running such tests is useful for identifying performance issues.
> Most significantly for me, I found a policy bug in the VM regarding
> resizeable object growth. I revised the policy as a result of this
> benchmark, which, as is often the case, turned out to be the most
> important/useful aspect for me.
>
> -- Dave S. [www.smallscript.org]
>
> "Blair McGlashan" <[hidden email]> wrote in message
> news:b3frad$1m4hep$[hidden email]...
> > "Jochen Riekhof" <[hidden email]> wrote in message
> > news:[hidden email]...
> > > > > If that's the case then I'd expect a difference of less than an
> > > > > order of magnitude between JITed Java and Dolphin for *integer*
> > > > > arithmetic and integer arrays. The difference is huge for floating
> > > > > point, though. (Presumably because of Smalltalk's "boxed" floats.)
> > > > > If you are seeing a 20-to-1 difference then I'd guess that nearly
> > > > > all of it is down to floating-point performance.
> > > >
> > > > This may well be.
> > >
> > > No, may not be :-).
> > >
> > > I did a VERY quick check on simple operations performance and here is
> > > the result. I measured both server and client vm, in hotspot and
> > > interpreted mode, vs. Dolphin. All code is appended. The reason the
> > > server VM seems to be slow is that it does not have enough time to
> > > "warm up". It does a lot of background analysis and compilation that
> > > never pays off because the app only runs for a few seconds. You can
> > > expect the server vm to be faster than the client vm after a few
> > > minutes.
> > >
> > > There is not a BIG difference in interpreted mode vs. Dolphin, except
> > > that iterators are half the speed of a do: operation. Hotspot, any
> > > version, is much faster, however. Float is only about factor two slower
> > > in Dolphin, as opposed to your 1 to 20 guess.
> > >
> > > The most striking difference came from the memory management.
> > > Dolphin needed more than 30 seconds to allocate the one million
> > > Rectangle objects. After closing the workspace the environment froze
> > > for about 45 seconds for gc (I guess).
> > >...
> >
> > I was pretty surprised by this, so I thought I'd look to see why. Just
> > looking at the script, something that was immediately apparent is that
> > your allocation test is actually allocating 3 million objects on Dolphin,
> > vs 1 million on Java. This is because Smalltalk Rectangles are actually
> > implemented as a pair of Point objects, whereas Java's is a single block
> > of memory holding 4 integer values. Since this is a micro-benchmark
> > designed to measure object allocation speed, I think it really ought to
> > try and measure the same number of allocations. Note though that on VW,
> > Rectangle class>>new answers an uninitialized Rectangle, so it is only
> > performing 1 million allocations, at least if we ignore the allocations
> > needed to grow the OrderedCollection. I noticed this when trying to run
> > your benchmark on VW, as it failed on the second expression when
> > attempting to access #top of the first Rectangle. Another point to note
> > is that this isn't a particularly pure test of allocation speed, as
> > Smalltalk has to send a few messages to initialize a Rectangle.
> >
> > Anyway, regardless of this, I tried out the following slight modification
> > of your script on the 2.2Ghz P4 Xeon with 512Mb I happened to be using:
> >
> > start := Time millisecondClockValue.
> > Transcript display: 'Alloc time: '; print: (Time millisecondsToRun: [
> >     oc := OrderedCollection new.
> >     "Use #origin:corner: so it will also run on VW - note this actually
> >     allocates 3 million objects"
> >     1 to: 1000000 do: [:each | oc add: (Rectangle origin: 0@0 corner: 0@0)]]); cr.
> > Transcript display: 'Get (index) time: '; print: (Time millisecondsToRun:
> >     [1 to: 1000000 do: [:each | (oc at: each) top]]); cr.
> > Transcript display: 'Iterate (do) time: '; print: (Time millisecondsToRun:
> >     [oc do: [:each | each top]]); cr.
> > Transcript display: 'Double mul time: '; print: (Time millisecondsToRun:
> >     [s := 1.00000001. 1 to: 1000000 do: [:each | s := s * 1.00000001.]]); cr.
> > Transcript display: 'GC time: '; print: (Time millisecondsToRun: [oc := s :=
> >     nil. MemoryManager current collectGarbage "or ObjectMemory quickGC on VW"]); cr.
> > Transcript display: 'Overall runtime: '; print: (Time
> >     millisecondClockValue - start); cr
> >
> > These are the times I got from Dolphin 6 for the first and second runs,
> > times in milliseconds:
> >
> > Alloc time: 4116
> > Get (index) time: 422
> > Iterate (do) time: 289
> > Double mul time: 116
> > GC time: 204
> > Overall runtime: 5159
> >
> > Alloc time: 1408
> > Get (index) time: 418
> > Iterate (do) time: 290
> > Double mul time: 105
> > GC time: 211
> > Overall runtime: 2441
> >
> > Running it a number of times, the figures varied a bit, but I haven't
> > bothered to average them.
> >
> > As you can see, the first run allocation time was significantly better
> > than your experience. I didn't know your machine spec, but assumed that it
> > must be similar since the second run results are similar. I also didn't
> > see any extended GC time, even if I replaced the #collectGarbage with a
> > #compact, though doing that did mean that the subsequent run figures were
> > not much faster than the first on the allocation test.
> > Anyway, I thought this must be something massively improved in D6 vs D5
> > (though I can't for the life of me think what :-)), so I went back to D5
> > and got these results:
> >
> > Alloc time: 52363
> > Get (index) time: 411
> > Iterate (do) time: 284
> > Double mul time: 123
> > GC time: 190
> > Overall runtime: 53375
> >
> > Alloc time: 1275
> > Get (index) time: 414
> > Iterate (do) time: 288
> > Double mul time: 112
> > GC time: 186
> > Overall runtime: 2285
> >
> > I was happy that this coincided with your experience on the initial
> > allocation behaviour (though not that D6 was 100mS slower on the
> > subsequent run, even though this is probably just timing variability).
> >
> > I was still mystified as to the delay you experienced closing the
> > workspace, since this didn't seem to be borne out by the forced GC timings
> > (and if you insert a 'Rectangle primAllInstances size' at the end of the
> > script, you'll see that those Rectangles really have been collected). So I
> > thought I'd try out doing as you did, and simply closing the workspace
> > leaving the variables to be collected at idle time. To my surprise I
> > experienced exactly the same lengthy freeze. I didn't measure its
> > duration, but it was lengthy. I found that if I nilled out the workspace
> > variables before closing the workspace, the delay did not occur, so I
> > could only conclude that there is something very odd going on in the
> > interaction between the view closing and the activities of the garbage
> > collector. Obviously this needs to be investigated, but I don't think it
> > is a fundamental performance problem in the Dolphin collector, as
> > otherwise my other tests would also have shown that.
> >
> > As a point of reference I tried running the script on VWNC7.
> > I had to change the Transcript #display: messages to #show:, and use
> > "ObjectMemory quickGC" in place of "MemoryManager current collectGarbage"
> > (it seemed the nearest equivalent), and this is what I got.
> >
> > Alloc time: 40849
> > Get (index) time: 77
> > Iterate (do) time: 51
> > Double mul time: 327
> > GC time: 116
> > Overall runtime: 41445
> >
> > [Subsequent runs were similar]
> >
> > As you can see, performance on the initial allocation test was poor. I
> > think this is because I either have insufficient memory to run the test
> > in VW, or (more likely) the default memory policy/configuration is not
> > appropriate for this test. Certainly there was an awful lot of flashing
> > up of the GC and dustbin cursors when the test was running. So anyway, I
> > don't think it's really a valid result, and I also think the FP mul figure
> > is questionable since once again this was probably over-influenced by GC
> > activity.
> >
> > Anyway Jochen, I believe what has brought us to this point was your
> > statement: "Performance is about factor twenty lower than Java HotSpot
> > VM, ..." On this test at least, that would appear to be FUD, right? :-)
> >
> > [Frankly, though, I think you really need some more "macro" benchmarks,
> > i.e. closer to an actual application, to draw any real performance
> > conclusions]
> >
> > Regards
> >
> > Blair
|
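Blair's point about allocation counts can be made concrete. The two layouts he contrasts, a Smalltalk-style Rectangle holding two Point objects versus Java's flat four-int java.awt.Rectangle, can be sketched in Java as follows (the class names here are illustrative, not from the thread):

```java
public class LayoutDemo {
    static class Point2 {                 // stands in for a Smalltalk Point
        int x, y;
        Point2(int x, int y) { this.x = x; this.y = y; }
    }

    // Smalltalk-style: constructing one rectangle allocates THREE objects
    // (the rectangle itself plus its origin and corner Points).
    static class BoxedRect {
        Point2 origin, corner;
        BoxedRect(Point2 origin, Point2 corner) { this.origin = origin; this.corner = corner; }
        int top() { return origin.y; }
    }

    // Java-style: one flat object holding four ints -- a single allocation.
    static class FlatRect {
        int x, y, width, height;
        FlatRect(int x, int y, int w, int h) { this.x = x; this.y = y; width = w; height = h; }
        int top() { return y; }
    }

    public static void main(String[] args) {
        BoxedRect b = new BoxedRect(new Point2(0, 0), new Point2(10, 10));
        FlatRect f = new FlatRect(0, 0, 10, 10);
        System.out.println(b.top() + " " + f.top()); // prints "0 0"
    }
}
```

This is why the same `1 to: 1000000` loop performs roughly three million allocations in Dolphin against one million in Java, before even counting the OrderedCollection growth Blair mentions.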
In reply to this post by Blair McGlashan
"Blair McGlashan" <[hidden email]> wrote in message
news:b334s0$1i941s$[hidden email]...
> Incidentally if you have installed PL2, you can easily view a list of all
> accelerator keys in the browsers by choosing the 'Key Bindings' command on
> the common Help menu.

I've installed PL2 and PL3, but I don't see that menu choice (and I really could use some keyboard shortcuts!).
|
Hi Mark,
> I've installed PL2 and PL3, but I don't see that menu choice (and I
> really could use some keyboard shortcuts!).

Looks like this patch needs initialising before it works in anything other than the Class or System Browser. Evaluate the following in a workspace and all (?) of the Help menus in browsers should gain a "Key Bindings" entry.

OAIDEExtensions initialize

-- Ian
|
"Ian Bartholomew" <[hidden email]> wrote in message
news:BAfda.1257$[hidden email]...
>
> > I've installed PL2 and PL3, but I don't see that menu choice (and I
> > really could use some keyboard shortcuts!).
>
> Looks like this patch needs initialising before it works in anything
> other than the Class or System Browser.

I don't have the menu choice there, either.

> Evaluate the following in a workspace and all (?) of the Help menus in
> browsers should gain a "Key Bindings" entry.
>
> OAIDEExtensions initialize

I don't have that class. Can it be that I don't have PL3? About Dolphin Smalltalk shows 5.0.3.

Anyway, I think another message from Blair or Andy says this was actually a V6 feature.
|
Mark,
>> OAIDEExtensions initialize
>
> I don't have that class. Can it be that I don't have PL3? About
> Dolphin Smalltalk shows 5.0.3.

You should have. It was introduced with Dolphin 5.0.0 to provide a way of hooking into a browser as it opened. It enables you to add additional functionality to existing tools without having to modify any existing resources.

FWIW - If I cut/paste the above text into my 5.0.3 it evaluates without a problem (I even spelt initialise the (in)correct way - something I often forget :-) )

> Anyway, I think another message from Blair or Andy says this was
> actually a V6 feature.

It works (once I reinitialised the class as above) in my 5.0.3. All of the browsers now have a "Key Bindings" option on the help menu that generates a html page containing the key bindings for the current tool and opens up your html browser on the result.

-- Ian
|
In reply to this post by Mark Wilden
"Mark Wilden" <[hidden email]> wrote in message
news:[hidden email]...
> "Ian Bartholomew" <[hidden email]> wrote in message
> news:BAfda.1257$[hidden email]...
> >....
> > OAIDEExtensions initialize
>
> I don't have that class. ....

Are you sure? It's a member of the 'Dolphin IDE Extension Example' package, which is located in the 'Object Arts\Samples\IDE\' folder. Have you perhaps uninstalled all the samples?

> ...
> Anyway, I think another message from Blair or Andy says this was actually a
> V6 feature.

It was originally, but we provided a version of it in PL2.

Regards

Blair
|
In reply to this post by Mark Wilden
Mark,
> I don't have that class. Can it be that I don't have PL3? About
> Dolphin Smalltalk shows 5.0.3.

Ahh, a light bulb has just switched on. I checked the patch list and it is shown as patch DSE #975 - for Dolphin Standard and Pro only. I guess you are using DVE? I don't know why it wasn't included in the DVE - it seems that it would be most applicable there. Possibly because something in the list creation process needs DSE or DPro?

> Anyway, I think another message from Blair or Andy says this was
> actually a V6 feature.

That bit appears to just be an enhancement where D6 will open the list in its own browser rather than spawning your default internet browser.

-- Ian
|
In reply to this post by Ian Bartholomew-18
"Ian Bartholomew" <[hidden email]> wrote in message
news:jQgda.603$[hidden email]...
> >
> > I don't have that class. Can it be that I don't have PL3? About
> > Dolphin Smalltalk shows 5.0.3.
>
> You should have. It was introduced with Dolphin 5.0.0 to provide a way
> of hooking into a browser as it opened. It enables you to add
> additional functionality to existing tools without having to modify any
> existing resources.

Okey-doke. I'd uninstalled that package (temporarily) along with the rest of the Samples (as you thought, Blair). I put it back and evaluated OAIDEExtensions initialize, but I still don't get any darn Help menu choice for key bindings in a CHB. :(
|
In reply to this post by Ian Bartholomew-18
"Ian Bartholomew" <[hidden email]> wrote in message
news:adhda.1277$[hidden email]...
> Mark,
>
> > I don't have that class. Can it be that I don't have PL3? About
> > Dolphin Smalltalk shows 5.0.3.
>
> Ahh, a light bulb has just switched on. I checked the patch list and it
> is shown as patch DSE #975 - for Dolphin Standard and Pro only. I guess
> you are using DVE?

No, I'm using the downloaded version of Pro. All the OAIDEExtensions class does, as far as I can see, is add a class comment pane context menu choice, 'Emit Class Layout Description', which does nothing that I can see.
|
"Mark Wilden" <[hidden email]> wrote in message
news:[hidden email]...
> a class comment pane context menu choice Emit
> Class Layout Description which does nothing that I can see.

I take that back--I can choose a protocol for each instance variable to put in the comment.
|
In reply to this post by Mark Wilden
"Mark Wilden" <[hidden email]> wrote in message
news:[hidden email]...
> "Ian Bartholomew" <[hidden email]> wrote in message
> news:jQgda.603$[hidden email]...
> >
> > > I don't have that class. Can it be that I don't have PL3? About
> > > Dolphin Smalltalk shows 5.0.3.
> >
> > You should have. It was introduced with Dolphin 5.0.0 to provide a way
> > of hooking into a browser as it opened. It enables you to add
> > additional functionality to existing tools without having to modify any
> > existing resources.
>
> Okey-doke. I'd uninstalled that package (temporarily) along with the rest
> of the Samples (as you thought, Blair). I put it back and evaluated
> OAIDEExtensions initialize, but I still don't get any darn Help menu choice
> for key bindings in a CHB. :(

Well of course if you reload the package after installing the patch, then you won't get any of the patched methods but the originals. We recommend patching a freshly installed image to avoid problems like this - also look out for errors on the Transcript when applying a patch; there shouldn't be any.

We will be releasing a complete new download of 5.0.4 or 5.1 soon, this week if all goes to plan. In the meantime you may want to try and repeat the patch process against a freshly installed image. First package and save your existing work, and also save the existing image to a new name as a backup. Then if you find anything missing you can still load up the old image, or use Ian's excellent Chunk Browser tool to pull it from the old change log.

Please follow these steps exactly when installing the patches:

1) Start LiveUpdate and wait for it to download the patches.
2) Sort the list of patches into ascending order (click the column header).
3) Select only Patch Level 2, and apply it.
4) Exit and restart LiveUpdate.
5) Select and apply Patch Level 3.
6) Save the image, and save another copy as a baseline 5.0.3 for later use.

These steps are necessary because of an issue with LiveUpdate itself, corrected by PL2.
Another (advanced) alternative would be to open the change log, locate the chunks relating to #975, and file them in.

Regards

Blair
|
"Blair McGlashan" <[hidden email]> wrote in message
news:b549s6$24pcnk$[hidden email]...
>
> Well of course if you reload the package after installing the patch, then
> you won't get any of the patched methods but the originals.

Ah, of course. :)

> Another (advanced) alternative would be to open the change log, locate the
> chunks relating to #975, and file them in.

It was as simple as that! Now I have my Key Bindings menu selection.

Thanks, Blair & Ian--now it's time for all us ex-pat British-Canadian-Californians to hit the hay. :)

BTW, the only evaluation question I had concerning purchase of Dolphin was "how active/current is the support community?" The question has been answered.

Thanks again, guys.
|