I became a Dolphin user after being a devoted Dylan fan (I still am, but
maybe a bit less so than before--and I mean the programming language not Bob), so I still follow the Dylan newsgroup. There was recently a post there from someone trying to gauge the speed of the one commercial Dylan implementation against Lisp, C++, and Python for computing some values from a heat conduction equation. Since I'm new to Smalltalk, I couldn't resist trying to do the algorithm in Smalltalk. My attempt is below. On my Pentium III 500 running NT 4.0, Dolphin runs in about 15 - 20 seconds. By the way C++ does this on a PII 450 in 0.2 seconds. After some help from the experts, the Dylan time got down to somewhere around 3.7 seconds. Are there any simple optimizations that would make a Smalltalk version faster? Note: I don't want to cause worthless language flaming. Just curious. Apologies for the formatting. I can only submit through deja.com from work. First the algorithm (which was simply typed into a workspace. It depends on the Array2D class which follows the algorithm: t1 := (Array2D new: 300 d2: 300) fill: 1.0. t2 := (Array2D new: 300 d2: 300) fill: 30.0. Transcript show: 'Heat Conduction Run' ; cr. (Interval from: 1 to: 10) do: [ :k | (Interval from: 2 to: 299) do: [ :i | (Interval from: 2 to: 299) do: [ :j | t2 row: i col:j put: ((t1 row: i col: j) + 1.0 + (0.1 * ((t1 row: i + 1 col: j) + (t1 row: i - 1 col: j) + (t1 row: i col: j + 1) + (t1 row: i col: j - 1) - (4.0 * (t1 row: i col: j))))). ]]. (Interval from: 2 to: 299) do: [ :i | (Interval from: 2 to: 299) do: [ :j | t1 row: i col:j put: ((t2 row: i col: j) + 1.0 + (0.1 * ((t2 row: i + 1 col: j) + (t2 row: i - 1 col: j) + (t2 row: i col: j + 1) + (t2 row: i col: j - 1) - (4.0 * (t2 row: i col: j))))). ]]. Transcript show: (t1 row: 150 col: 150) printString; cr; cr. ]. package! "Class Definitions"! Array variableSubclass: #Array2D instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' classInstanceVariableNames: ''! "Loose Methods"! "End of package definition"! Array2D comment: ''! Array2D guid: (GUID fromString: '{EB7F0C00-9732-4767-8B67-57A9CB9F1440}')! !Array2D categoriesForClass!Unclassified! ! !Array2D methodsFor! fill: aNumber ^ self do: [ :row | (Interval from: 1 to: (row size)) do: [ :index | row at: index put: aNumber ] ].! r: row c: col ^(self at: row) at: col.! r: row c: col put: aNumber ^(self at: row) at: col put: aNumber.! !Array2D class methodsFor! new: d1 d2: dim2 | temp | temp := ( super new: d1 ). (Interval from: 1 to: d1) do: [ :index | temp at: index put: (Array new: dim2 )]. ^temp. Sent via Deja.com http://www.deja.com/ |
> I became a Dolphin user after being a devoted Dylan fan (I still am, but
> maybe a bit less so than before--and I mean the programming language > not Bob), so I still follow the Dylan newsgroup. There was recently a > post there from someone trying to gauge the speed of the one commercial > Dylan implementation against Lisp, C++, and Python for computing some > values from a heat conduction equation. Since I'm new to Smalltalk, I > couldn't resist trying to do the algorithm in Smalltalk. My attempt is > below. On my Pentium III 500 running NT 4.0, Dolphin runs in about 15 - > 20 seconds. By the way C++ does this on a PII 450 in 0.2 seconds. > After some help from the experts, the Dylan time got down to somewhere > around 3.7 seconds. Are there any simple optimizations that would make > a Smalltalk version faster? Good Smalltalk code will generally run slower than good C/C++ code; but, good Smalltalk is a LOT easier to write than good C++. It is often true that Smalltalk will allow time for optimizations that would not happen in other languages. A refined Smalltalk app can be faster, and certainly more stable and easier to maintain, than the first C++ app that doesn't crash (assuming the C++ project even reaches that stage). I have a particular Smalltalk app doing screen updates with very reasonable performance (with a moderately patient user, it can still function on a 486-100); in other situations, I resort to a C++ DLL, especially for heavy number crunching. Numerical analysis is sometimes more easily done in C++ anyway due to available source code for many algorithms; Smalltalk wrappers make it easy to get the memory management correct, and to assemble small steps into larger ones. Finally, try experimenting with interactive creation of presentation graphics in VC++ :) For things that involve lots of tedious data capture, I've gone so far as to write wizard interfaces to do the error prone parts. One of the best things you can do is try Ian Bartholomew's profiler on your code. Bottlenecks tend to show themselves; if tuning in Smalltalk fails to give the desired performance, you can always create a DLL and do the expensive work in it. As a first suggestion, try replacing things like (1 to:3) do:[ :i | ] with 1 to:3 do:[ :i | ] - Ian's profiler will allow you to measure the difference to see if it's worth pursuing further. Have a good one, Bill -- Wilhelm K. Schwab, Ph.D. [hidden email] |
In reply to this post by johncwhi
John,
NumberCrunching in this way is not really something that Smalltalk is renowned for so please don't read too much into comparisons resulting from this test. FWIW, I've managed to knock 30% off of the execution time for the example you gave. I might be able to skim a bit more so I won't post until I've had a further think - or someone else does better. For future reference - The compiler inlines #to:do: so (in normal apps) using it instead of your Interval>>from:to: will certainly improve execution times. It doesn't make much difference to your example though. Ian |
> I might be able to skim a bit more so I won't post until I've had
> a further think Just over a 50% reduction is the best I can do. Homework: Time the workspace example using the code below and then make one alteration to the #updateValueRow:col: method - changing (current * 4) to (4 * current) - and time it again. Discuss <g> Ian Workspace - Time millisecondsToRun: [ t1 := (Array2D new: 300 d2: 300) fill: 1.0. t2 := (Array2D new: 300 d2: 300) fill: 30.0. Transcript show: 'Heat Conduction Run' ; cr. 10 timesRepeat: [ i := 1. [i <= 298] whileTrue: [ j := 2. [j <= 299] whileTrue: [ t2 updateRow: i col: j in: t1. j := j + 1]. i := i + 1]. i := 1. [i <= 298] whileTrue: [ j := 2. [j <= 299] whileTrue: [ t1 updateRow: i col: j in: t2. j := j + 1]. i := i + 1]. Transcript show: (t1 row: 150 col: 150) printString; cr; cr]] Class - Array variableSubclass: #Array2D instanceVariableNames: 'columns' classVariableNames: '' poolDictionaries: '' classInstanceVariableNames: '' Array2D class>>new: dim1 d2: dim2 ^(super new: dim1 * dim2) setColumns: dim2 Array2D>>updateRow: row col: col in: anArray self at: col + (row * columns) put: (anArray updateValueRow: row col: col) Array2D>>fill: aNumber self atAllPut: aNumber Array2D>>updateValueRow: row col: col | location current | current := self at: (location := col + (row * columns)). ^(((self at: location + columns) + (self at: location - columns) + (self at: location + 1) + (self at: location - 1) - (current * 4)) * 0.1) + current + 1 Array2D>>setColumns: anInteger columns := anInteger Array2D>>row: row col: col ^self at: row * columns + col |
Ian Bartholomew wrote:
> > I might be able to skim a bit more so I won't post until I've had > > a further think > > Just over a 50% reduction is the best I can do. Did you consider jumping into C for some critical parts ? -Panu |
In reply to this post by Ian Bartholomew
"Ian Bartholomew" <[hidden email]> wrote in message
news:942iqc$bflfi$[hidden email]... > John, > > NumberCrunching in this way is not really something that Smalltalk is > renowned for so please don't read too much into comparisons resulting from > this test. > > FWIW, I've managed to knock 30% off of the execution time for the example > you gave. I might be able to skim a bit more so I won't post until I've had > a further think - or someone else does better. > > For future reference - The compiler inlines #to:do: so (in normal apps) > using it instead of your Interval>>from:to: will certainly improve execution > times. It doesn't make much difference to your example though. > I realize that Smalltalk is not necessarily geared to "crunching numbers", but at the same time it's alway instructive to pose these kinds of questions. I also hoped that I might be able to introduce the Dylan language to Smalltalkers through this exercise. While Smalltalk is my language of choice right now, I've always loved the functional approach. Dylan combines the best of both worlds. Thanks for your insights into optimizing Smalltalk. I have learned something from this. |
In reply to this post by Ian Bartholomew
>
[rest of code snipped]
> Time millisecondsToRun: [ > t1 := (Array2D new: 300 d2: 300) fill: 1.0. > t2 := (Array2D new: 300 d2: 300) fill: 30.0. > > Transcript show: 'Heat Conduction Run' ; cr. > > 10 timesRepeat: [ > i := 1. > [i <= 298] whileTrue: [ > j := 2. > [j <= 299] whileTrue: [ > t2 updateRow: i col: j in: t1. > j := j + 1]. > i := i + 1]. > i := 1. > [i <= 298] whileTrue: [ > j := 2. > [j <= 299] whileTrue: [ > t1 updateRow: i col: j in: t2. > j := j + 1]. > i := i + 1]. > Transcript show: (t1 row: 150 col: 150) printString; cr; cr]] > > Class - > > Array variableSubclass: #Array2D > instanceVariableNames: 'columns' > classVariableNames: '' > poolDictionaries: '' > classInstanceVariableNames: '' > > Array2D class>>new: dim1 d2: dim2 > ^(super new: dim1 * dim2) setColumns: dim2 > > Array2D>>updateRow: row col: col in: anArray > self > at: col + (row * columns) > put: (anArray updateValueRow: row col: col) > > Array2D>>fill: aNumber > self atAllPut: aNumber > > Array2D>>updateValueRow: row col: col > | location current | > current := self at: (location := col + (row * columns)). > ^(((self at: location + columns) > + (self at: location - columns) > + (self at: location + 1) > + (self at: location - 1) > - (current * 4)) * 0.1) + current + 1 > It's interesting and a bit counter-intuitive to me that creating the method updateValueRow cuts down the execution time. There's obviously some message sending overhead here. Is that overhead dominated by the speed gained by not having to reference the array object in the old straight-line approach? If so, could the Dolphin compiler maybe do a bit of its own inlining if it were sufficiently sophisticated? (Just guessing, but this is probably what the FD (Functional Developer) Dylan compiler does in Dylan). |
In reply to this post by johncwhi
[hidden email] () wrote (abridged):
> (Interval from: 2 to: 299) do: That is one of the slowest ways of doing it. 2 to: 299 do: should be faster, but the difference is only about 12% on my machine. > r: row c: col > > ^(self at: row) at: col.! This is an array of arrays, with lots of double-indirection. It may be quicker to use a single array with a rectangular conversion. Eg: row: row col: col ^self at: (row - 1) * colSize + col This is closer to what the C++ will have been doing. However, the difference on my machine is marginal. I suspect we really need a method like: t1 transformBy: [:above :below :centre :left :right| ... ] yeilding: t2 where the block's arguments are the array elements surrounding the centre one, and the block answers the new centre value. A naive implementation would be like: transformBy: aBlock yeilding: anArray2D 2 to: self rowSize - 1 do: [:i| 2 to: self colSize - 1 do: [:j| anArray2D row: i col: j put: (aBlock value: (self row: i-1 col: j) value: (self row: i+1 col: j) value: (self row: i col: j) value: (self row: i col: j-1) value: (self row: i col: j+1))]] This separates the iteration logic from the code which decides the actual transform. It would then become feasible to inline the #row:col: messages manually, which would make it possible to cache various common subexpressions and loop invariants. Later... I tried this, and found I couldn't easily use a block as there isn't a 5-value message, so I send a message to self instead. I ended up with: fastTransformYeilding: anArray2D 2 to: rowSize - 1 do: [:i| |above below centre left right| above := self indexOf: i-1 col: 0. centre := self indexOf: i col: 0. below := self indexOf: i+1 col: 0. left := centre-1. right := centre + 1. 2 to: colSize - 1 do: [:j| anArray2D at: centre+j put: (self transformAbove: (self at: above+j) below: (self at: below+j) centre: (self at: centre+j) left: (self at: left+j) right: (self at: right+j)) ]] transformAbove: above below: below centre: centre left: left right: right ^1.0 + centre + (0.1 * below) + above + left + right - (4.0 * centre) This ran in about 40% of the time of the original, which is not as big an improvement as I'd hoped for. It probably goes beyond what you'd call a "simple optimisation", too. Inlining #transformAbove:... and using #basicAt: got the time down to 33% of the original, but the result was pretty ugly. Assuming your machine scales, and going with the 40% figure, we would be talking about 7 seconds versus Dylan's 3.7, or a factor of 2. To be honest I would have thought Dylan would be much closer to C++'s figure than Dolphin's. Dave Harris, Nottingham, UK | "Weave a circle round him thrice, [hidden email] | And close your eyes with holy dread, | For he on honey dew hath fed http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
In reply to this post by panu
Panu,
> Did you consider jumping into C for some critical parts ? Could have done (although I would have used Delphi) but the idea was to see what could be achieved using Smalltalk. I knocked it down to 40% of the original by moving the loops into the Array2D class (as Dave did in his example) and now, for my part, intend to leave it there <g> Ian |
In reply to this post by John Whittaker
John,
> It's interesting and a bit counter-intuitive to me that creating the method > updateValueRow cuts down the execution time. There's obviously some message > sending overhead here. Is that overhead dominated by the speed gained by > not having to reference the array object in the old straight-line approach? It's just reducing the total number of messages sent and also replacing the messages that are sent with much simpler ones (i.e. closer to Dolphins VM primitives) directly onto the flattened Array. Smalltalk is all about message sending - it does nothing else. The message sending code is _very_ optimised but each one obviously takes a finite time. If you reduce the number of direct message sends (and therefore the message sends that they would have invoked, etc) then you will also reduce the time taken. Ian |
In reply to this post by Ian Bartholomew
> Class -
> > Array variableSubclass: #Array2D > instanceVariableNames: 'columns' > classVariableNames: '' > poolDictionaries: '' > classInstanceVariableNames: '' > > Array2D class>>new: dim1 d2: dim2 > ^(super new: dim1 * dim2) setColumns: dim2 > > Array2D>>updateRow: row col: col in: anArray > self > at: col + (row * columns) > put: (anArray updateValueRow: row col: col) > > Array2D>>fill: aNumber > self atAllPut: aNumber > > Array2D>>updateValueRow: row col: col > | location current | > current := self at: (location := col + (row * columns)). > ^(((self at: location + columns) > + (self at: location - columns) > + (self at: location + 1) > + (self at: location - 1) > - (current * 4)) * 0.1) + current + 1 > > Array2D>>setColumns: anInteger > columns := anInteger > > Array2D>>row: row col: col > ^self at: row * columns + col > Maybe try changing the Array2D in an array of points? My 2p Ted |
In reply to this post by Dave Harris-3
[hidden email] (Dave Harris) wrote (abridged):
> transformAbove: above below: below centre: centre left: left right: > right > ^1.0 + centre + (0.1 * below) + above + left + right - (4.0 * > centre) Actually should be: ^centre + 1 + ((below + above + left + right + (centre * -4)) * 0.1) because I misread the original formatting. Rearranging the expression can make a difference. Intuitively I'd expect putting the literal first to be faster - because the compiler should realise #* cannot be overridden and do extra inlining. However, in practice the compiler doesn't seem to do this optimisation. Also, I'd have expected using 4.0 to be faster than 4 because it avoids an int-to-float conversion. In practice it is slightly slower. I guess the gain is small because the conversion is done inside the primitive, and outweighed by the extra cost from unboxing the float. Using + (centre * -4) is marginally (but consistently) faster than - (centre * 4). This implies that addition is faster than subtraction - perhaps an effect of caching (as the previous ops are also additions). After looking at Ian's code I tweaked my main loop a bit: fastTransformYielding: anArray2D 2 to: rowSize - 1 do: [:i| |start finish| start := i * (colSize-1) + 2. finish := start + colSize - 3. start to: finish do: [:centre| anArray2D basicAt: centre put: (self transformAbove: (self basicAt: centre-colSize) below: (self basicAt: centre+colSize) centre: (self basicAt: centre) left: (self basicAt: centre-1) right: (self basicAt: centre+1))]] This has fewer variables and does slightly less maths. I tried using #whileTrue: but it didn't seem to help. The difference in speed is marginal but I prefer this because it feels simpler than my previous version. I also prefer this to Ian's because I think it keeps all the grunge in one place. The #transformAbove:... routine actually makes the core equation clearer than the original, and #fastTransformYielding: is more reusable. It's 38% faster. Dave Harris, Nottingham, UK | "Weave a circle round him thrice, [hidden email] | And close your eyes with holy dread, | For he on honey dew hath fed http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
Dave,
> Rearranging the expression can make a difference. Yes. That was the reason for the "homework" comment in my original post. I knew that the simple reordering of the (centre * 4) expression would make a difference but I was surprised at the amount - giving an improvement of around 20% in the version I was testing at the time. > I also prefer this to Ian's because I think it keeps all the grunge in > one place. The #transformAbove:... routine actually makes the core > equation clearer than the original, and #fastTransformYielding: is more > reusable. It's 38% faster. Yes, I agree. It's much clearer and closer to "proper" Smalltalk than my (and John's) version. I was keeping too close to John's original by putting the looping in the workspace. Moving it into the method as well knocked another 10% off the timing. I've forced myself to delete all the files and tear up all the bits of paper concerning this now. Don't you find you get into the "one more little tweak and then I will go on to what I really should be doing" loop mode with something like this. Hours later you are still trying "one more little tweak". <g> Ian |
In reply to this post by John Whittaker
[hidden email] (John Whittaker) wrote (abridged):
> It's interesting and a bit counter-intuitive to me that creating the > method updateValueRow cuts down the execution time. This kind of thing is good OO design because it moves the responsibility closer to where it belongs, in the array object. This often leads to faster code because as a member, the method can exploit knowledge of the object's private representation; it can take short-cuts which would be dangerous or impossible for an outsider. In this case it also enables some caching. The location only has to be worked out once, rather than 6 times as in the original code. > There's obviously some message sending overhead here. Inlining a similar method in my version takes it from 38% to 33%. Dave Harris, Nottingham, UK | "Weave a circle round him thrice, [hidden email] | And close your eyes with holy dread, | For he on honey dew hath fed http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
In reply to this post by Ian Bartholomew
[hidden email] (Ian Bartholomew) wrote (abridged):
> Homework: Time the workspace example using the code below and then > make one alteration to the #updateValueRow:col: method - > changing (current * 4) to (4 * current) - and time it again. > > Discuss <g> The primitive #* for Float also works for SmallInteger arguments. The one for SmallInteger doesn't return the favour so you get a whole lot of extra code, including a double-dispatch and a conversion, in Smalltalk. If you use 4.0 instead of 4, the effect mostly goes away. Dave Harris, Nottingham, UK | "Weave a circle round him thrice, [hidden email] | And close your eyes with holy dread, | For he on honey dew hath fed http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
In reply to this post by johncwhi
So, after all of the previous suggestions, if you take the most efficient
code submitted and run on the same machine, what is final timing for Smalltalk? <[hidden email]> wrote in message news:9425ef$4on$[hidden email]... > I became a Dolphin user after being a devoted Dylan fan (I still am, but > maybe a bit less so than before--and I mean the programming language > not Bob), so I still follow the Dylan newsgroup. There was recently a > post there from someone trying to gauge the speed of the one commercial > Dylan implementation against Lisp, C++, and Python for computing some > values from a heat conduction equation. Since I'm new to Smalltalk, I > couldn't resist trying to do the algorithm in Smalltalk. My attempt is > below. On my Pentium III 500 running NT 4.0, Dolphin runs in about 15 - > 20 seconds. By the way C++ does this on a PII 450 in 0.2 seconds. > After some help from the experts, the Dylan time got down to somewhere > around 3.7 seconds. Are there any simple optimizations that would make > a Smalltalk version faster? > > Note: I don't want to cause worthless language flaming. Just curious. > > Apologies for the formatting. I can only submit through deja.com from > work. > > > First the algorithm (which was simply typed into a workspace. It > depends on the Array2D class which follows the algorithm: > > > t1 := (Array2D new: 300 d2: 300) fill: 1.0. > t2 := (Array2D new: 300 d2: 300) fill: 30.0. > > Transcript show: 'Heat Conduction Run' ; cr. > (Interval from: 1 to: 10) do: > [ :k | > (Interval from: 2 to: 299) do: > [ :i | > (Interval from: 2 to: 299) do: > [ :j | t2 row: i col:j put: ((t1 row: i col: j) + > 1.0 + > (0.1 * ((t1 row: i + 1 col: > j) + > (t1 row: i - 1 > col: j) + > (t1 row: i col: > j + 1) + > (t1 row: i col: j > - 1) - (4.0 * (t1 row: i col: j))))). ]]. > > > (Interval from: 2 to: 299) do: > [ :i | > (Interval from: 2 to: 299) do: > [ :j | t1 row: i col:j put: ((t2 row: i col: j) + > 1.0 + > (0.1 * ((t2 row: i + 1 col: > j) + > (t2 row: i - 1 > col: j) + > (t2 row: i col: > j + 1) + > (t2 row: i col: j > - 1) - (4.0 * (t2 row: i col: j))))). ]]. > Transcript show: (t1 row: 150 col: 150) printString; cr; cr. > ]. > > > > > package! > > "Class Definitions"! > > Array variableSubclass: #Array2D > instanceVariableNames: '' > classVariableNames: '' > poolDictionaries: '' > classInstanceVariableNames: ''! > "Loose Methods"! > > "End of package definition"! > > > > Array2D comment: ''! > > Array2D guid: (GUID fromString: > '{EB7F0C00-9732-4767-8B67-57A9CB9F1440}')! > > !Array2D categoriesForClass!Unclassified! ! > !Array2D methodsFor! > > fill: aNumber > > ^ self do: [ :row | (Interval from: 1 to: (row size)) > do: [ :index | row at: index put: aNumber ] > ].! > > r: row c: col > > ^(self at: row) at: col.! > > r: row c: col put: aNumber > > ^(self at: row) at: col put: aNumber.! > > > !Array2D class methodsFor! > > new: d1 d2: dim2 > | temp | > temp := ( super new: d1 ). > > (Interval from: 1 to: d1) do: > [ :index | temp at: index put: (Array new: dim2 )]. > ^temp. > > > Sent via Deja.com > http://www.deja.com/ |
In reply to this post by johncwhi
The code below is no longer object-oriented, but it is at least faster than
the original version. The same types of changes could probably be made to the C++ and Dylan code to speed those versions as well, so this doesn't necessarily help in any race. It takes about 40% of the original execution time, so that's over a degree of magnitude (base 2, that is). On Dolphin 4 it comes in at just over 5 seconds. test | rowDim colDim edge t1 t2 firstIndex rowReps colReps | rowDim := 300. colDim := 300. edge := 1. t1 := (Array new: rowDim * colDim) atAllPut: 1. t2 := (Array new: rowDim * colDim) atAllPut: 30. firstIndex := 1 + rowDim + edge. rowReps := rowDim - edge - edge. colReps := colDim - edge - edge. 10 timesRepeat: [ | rowIndex | rowIndex := firstIndex. rowReps timesRepeat: [ | colIndex | colIndex := rowIndex. colReps timesRepeat: [ t2 at: colIndex put: ( 1.0 + (0.6 * (t1 at: colIndex)) + (0.1 * ( (t1 at: colIndex + colDim) + (t1 at: colIndex - colDim) + (t1 at: colIndex - 1) + (t1 at: (colIndex := colIndex + 1)) )) ). ]. rowIndex := rowIndex + colDim. ]. rowIndex := firstIndex. rowReps timesRepeat: [ | colIndex | colIndex := rowIndex. colReps timesRepeat: [ t1 at: colIndex put: ( 1.0 + (0.6 * (t2 at: colIndex)) + (0.1 * ( (t2 at: colIndex + colDim) + (t2 at: colIndex - colDim) + (t2 at: colIndex - 1) + (t2 at: (colIndex := colIndex + 1)) )) ). ]. rowIndex := rowIndex + colDim. ]. Transcript show: (t1 at: (rowDim / 2 * rowDim - rowDim + (colDim / 2))) printString; cr. ]. timeTest | time | Transcript show: 'Heat Conduction Run' ; cr. time := Time millisecondsToRun: [ self test ]. Transcript show: 'Time in milliseconds was: '; show: time printString; cr. [hidden email] wrote: > I became a Dolphin user after being a devoted Dylan fan (I still am, but > maybe a bit less so than before--and I mean the programming language > not Bob), so I still follow the Dylan newsgroup. There was recently a > post there from someone trying to gauge the speed of the one commercial > Dylan implementation against Lisp, C++, and Python for computing some > values from a heat conduction equation. Since I'm new to Smalltalk, I > couldn't resist trying to do the algorithm in Smalltalk. My attempt is > below. On my Pentium III 500 running NT 4.0, Dolphin runs in about 15 - > 20 seconds. By the way C++ does this on a PII 450 in 0.2 seconds. > After some help from the experts, the Dylan time got down to somewhere > around 3.7 seconds. Are there any simple optimizations that would make > a Smalltalk version faster? > > Note: I don't want to cause worthless language flaming. Just curious. > > Apologies for the formatting. I can only submit through deja.com from > work. > > First the algorithm (which was simply typed into a workspace. It > depends on the Array2D class which follows the algorithm: > > t1 := (Array2D new: 300 d2: 300) fill: 1.0. > t2 := (Array2D new: 300 d2: 300) fill: 30.0. > > Transcript show: 'Heat Conduction Run' ; cr. > (Interval from: 1 to: 10) do: > [ :k | > (Interval from: 2 to: 299) do: > [ :i | > (Interval from: 2 to: 299) do: > [ :j | t2 row: i col:j put: ((t1 row: i col: j) + > 1.0 + > (0.1 * ((t1 row: i + 1 col: > j) + > (t1 row: i - 1 > col: j) + > (t1 row: i col: > j + 1) + > (t1 row: i col: j > - 1) - (4.0 * (t1 row: i col: j))))). ]]. > > (Interval from: 2 to: 299) do: > [ :i | > (Interval from: 2 to: 299) do: > [ :j | t1 row: i col:j put: ((t2 row: i col: j) + > 1.0 + > (0.1 * ((t2 row: i + 1 col: > j) + > (t2 row: i - 1 > col: j) + > (t2 row: i col: > j + 1) + > (t2 row: i col: j > - 1) - (4.0 * (t2 row: i col: j))))). ]]. > Transcript show: (t1 row: 150 col: 150) printString; cr; cr. > ]. > > package! > > "Class Definitions"! > > Array variableSubclass: #Array2D > instanceVariableNames: '' > classVariableNames: '' > poolDictionaries: '' > classInstanceVariableNames: ''! > "Loose Methods"! > > "End of package definition"! > > Array2D comment: ''! > > Array2D guid: (GUID fromString: > '{EB7F0C00-9732-4767-8B67-57A9CB9F1440}')! > > !Array2D categoriesForClass!Unclassified! ! > !Array2D methodsFor! > > fill: aNumber > > ^ self do: [ :row | (Interval from: 1 to: (row size)) > do: [ :index | row at: index put: aNumber ] > ].! > > r: row c: col > > ^(self at: row) at: col.! > > r: row c: col put: aNumber > > ^(self at: row) at: col put: aNumber.! > > !Array2D class methodsFor! > > new: d1 d2: dim2 > | temp | > temp := ( super new: d1 ). > > (Interval from: 1 to: d1) do: > [ :index | temp at: index put: (Array new: dim2 )]. > ^temp. > > Sent via Deja.com > http://www.deja.com/ |
In reply to this post by Dave Harris-3
On Wed, 17 Jan 2001, Dave Harris wrote:
> ... and using > #basicAt: got the time down to 33% of the original, but the result was > pretty ugly. It's strange, but on my machine (old Pentium - don't know, what are the results on other processors), Object>>at: is faster then Object>>basicAt: in 4.0 (it was slower in previous versions). Since many low level methods in Collection hierarchy are using #basicAt:, there may be a simple way to improve their performance just by applying to Object>>basicAt: same changes, that speeded up Object>>at:. Artur Zaroda [hidden email] |
In reply to this post by johncwhi
[hidden email] () wrote (abridged):
> By the way C++ does this on a PII 450 in 0.2 seconds. After some > help from the experts, the Dylan time got down to somewhere around > 3.7 seconds. Are you going to show us the Dylan code? I don't understand why Dylan was so slow. I don't know the FunDev product, but from the review at: http://www.byte.com/column/BYT20000628S0007 it is a static compiler. Are you sure it was used in "production mode"? Admittedly Dylan's time was twice as good as Dolphin's, but then Dolphin is interpreted bytecode. It would be mildly interesting to see how other Smalltalk implementations do. I would expect VW to be a little better but not a lot. I don't know how good SmalltalkMT is in practice, but it is a static compiler specifically aimed at speed so I'd like to think it could be within a few factors of C++. > Note: I don't want to cause worthless language flaming. Just curious. I can't resist mentioning I had quite a few "array subscript out of range" errors while trying to make it faster. I imagine the C++ code did not include range-checking. Dave Harris, Nottingham, UK | "Weave a circle round him thrice, [hidden email] | And close your eyes with holy dread, | For he on honey dew hath fed http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
In reply to this post by Artur Zaroda
[hidden email] (Artur Zaroda) wrote (abridged):
> It's strange, but on my machine (old Pentium - don't know, what are > the results on other processors), Object>>at: is faster then > Object>>basicAt: in 4.0 (it was slower in previous versions). To me it seemed about the same. I didn't notice any degradation, but I didn't notice any improvement either so I probably should have left it out. It didn't occur to me it might be slower. #basicAt: should avoid a message dispatch. Investigation shows that #at: is <primitive: 60> and #basicAt: is <primitive: 49>. There is nothing in the method comment to explain this difference, and indeed the comment for #at: implies they are implemented the same. Dave Harris, Nottingham, UK | "Weave a circle round him thrice, [hidden email] | And close your eyes with holy dread, | For he on honey dew hath fed http://www.bhresearch.co.uk/ | And drunk the milk of Paradise." |
Free forum by Nabble | Edit this page |