Blair,
Ultimately, a better algorithm would probably render irrelevant most of what follows. One answer is to use #< in preference to other operators, but it seems worthy of mention either way.

I am in the early stages of a number-crunching job. While I am not yet convinced that the output is correct, it runs, so I used Ian's profiler to see where the 24 seconds went - and this is not a large dataset :(

Two obviously expensive lines were of the form

    (floatArray at: (x - y + 1)) - (floatArray at: (x - z + 1))

and

    floatX > floatY

Seeing Magnitude mentioned in the profile, I investigated and changed the test to use #<. That took the run from 24 seconds to 20 seconds.

The first line is a little less encouraging, because it is listed as a primitive and seems to be responsible for >80% of the work. That makes me wonder whether moving it to a C/C++ DLL would offer a boost. Any ideas?

Another profile flagged a different line with a #<=. Changing to #< takes off another half second to almost a full second. I suspect the impact there is reduced because of the compiler optimization of #==. Equality of floats is not terribly meaningful, so I can probably just switch to #< to get the boost, but would a primitive form of #<= be reasonable? At least that way, one could easily reverse a test to eliminate the need for message sends.

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]
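[Editor's note: for concreteness, the rewrite Bill describes amounts to reversing the operands by hand. This is a sketch; the variable names come from his fragments, and `doSomething` is a hypothetical stand-in for the real branch body.]

```smalltalk
"Before: #> is inherited from Magnitude, so each comparison costs a
 full method activation that merely reverses the operands and sends #<."
(floatX > floatY) ifTrue: [self doSomething].

"After: #< is implemented as a direct primitive on Float, so the
 activation disappears. Note that the operands swap places."
(floatY < floatX) ifTrue: [self doSomething].
```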
"Bill Schwab" <[hidden email]> wrote in message
news:c79d8u$c3e$[hidden email]...

> Blair,
>
> Ultimately, a better algorithm would probably render irrelevant most of
> what follows. One answer is to use #< in preference to other operators,
> but it seems worthy of mention either way.
>
> I am in the early stages of a number-crunching job. While I am not yet
> convinced that the output is correct, it runs, so I used Ian's profiler
> to see where the 24 seconds went - and this is not a large dataset :(
>
> Two obviously expensive lines were of the form
>
>     (floatArray at: (x - y + 1)) - (floatArray at: (x - z + 1))
>
> and
>
>     floatX > floatY
>
> Seeing Magnitude mentioned in the profile, I investigated and changed
> the test to use #<. That took the run from 24 seconds to 20 seconds.

Well, a quick test on my 1.7GHz P4 laptop (on battery power) shows that it is necessary to perform about 20 million such comparisons to get that magnitude of difference (on 5.1; on D6, with its faster method activations, I needed 25 million). The difference is due to #< having a direct primitive implementation, while #> requires a method activation (which then uses #< by reversing the comparison).

> The first line is a little less encouraging, because it is listed as a
> primitive and seems to be responsible for >80% of the work. That makes
> me wonder whether moving it to a C/C++ DLL would offer a boost. Any ideas?

I assume you are using a FLOATArray - you could try going directly to the ByteArray inside the FLOATArray and invoking the ByteArray>>floatAtOffset: primitive directly. This will eliminate the method activation needed for FLOATArray>>at: (i.e. just inline the call to FLOATArray>>at:). This will help, but the real cost here is probably the instantiation of the objects that represent the intermediate values. Since two out of three of these instantiations occur in order to "deserialize" from the FLOATArray, it would probably run quite a lot faster if you traded (a lot of) memory for speed and held an Array of Float objects instead.
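[Editor's note: a sketch of the inlining Blair suggests, applied to the expression Bill profiled. It assumes the FLOATArray answers its underlying byte buffer via #bytes and holds 4-byte FLOAT elements; both details are assumptions that may need adjusting for a particular Dolphin version.]

```smalltalk
"Original form: two FLOATArray>>at: activations per evaluation."
(floatArray at: x - y + 1) - (floatArray at: x - z + 1).

"Inlined form: index the underlying bytes with the primitive directly.
 Element i (1-based) lives at byte offset (i - 1) * 4, so the 1-based
 index x - y + 1 becomes the byte offset (x - y) * 4."
| bytes |
bytes := floatArray bytes.
(bytes floatAtOffset: (x - y) * 4) - (bytes floatAtOffset: (x - z) * 4).
```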
To get real speed from number crunching of floats, though, you're probably better off writing that external DLL.

> Another profile flagged a different line with a #<=. Changing to #<
> takes off another half second to almost a full second. I suspect the
> impact there is reduced because of the compiler optimization of #==.
> Equality of floats is not terribly meaningful, so I can probably just
> switch to #< to get the boost, but would a primitive form of #<= be
> reasonable? At least that way, one could easily reverse a test to
> eliminate the need for message sends.

Well, you can write it as implemented in Magnitude and inherited by Float, i.e. floatX <= floatY is equivalent to (floatY < floatX) == false. This will eliminate any method activations. In many cases you can get rid of the == false, since the relational test is frequently followed by an ifTrue:ifFalse: that simply needs to be inverted.

Regards

Blair
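[Editor's note: the #<= transformation Blair describes, written out step by step. This is a sketch; `accept` and `reject` are hypothetical stand-ins for the real branch bodies.]

```smalltalk
"1. As written: #<= is inherited from Magnitude, costing an activation."
(floatX <= floatY) ifTrue: [self accept] ifFalse: [self reject].

"2. Expanded as Magnitude implements it: only the #< primitive and the
 compiler-optimized #== remain."
((floatY < floatX) == false) ifTrue: [self accept] ifFalse: [self reject].

"3. The == false disappears entirely by swapping the branches instead."
(floatY < floatX) ifTrue: [self reject] ifFalse: [self accept].
```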
Blair,
> Well, a quick test on my 1.7GHz P4 laptop (on battery power) shows that
> it is necessary to perform about 20 million such comparisons to get
> that magnitude of difference (on 5.1; on D6, with its faster method
> activations, I needed 25 million).

You're catching on :)

> I assume you are using a FLOATArray - you could try going directly to
> the ByteArray inside the FLOATArray and invoking the
> ByteArray>>floatAtOffset: primitive directly. This will eliminate the
> method activation needed for FLOATArray>>at: (i.e. just inline the call
> to FLOATArray>>at:). This will help, but the real cost here is probably
> the instantiation of the objects that represent the intermediate
> values. Since two out of three of these instantiations occur in order
> to "deserialize" from the FLOATArray, it would probably run quite a lot
> faster if you traded (a lot of) memory for speed and held an Array of
> Float objects instead. To get real speed from number crunching of
> floats, though, you're probably better off writing that external DLL.

Fair enough. I was forgetting about the deserialization overhead. Thanks!

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]