Hello Folks,
I have a method that mostly does arithmetic on FloatArrays using *=, /= and friends. It is mainly sums of products: (inputs * coefficients) sum. The first time I call it, it finishes in about a second; the next time I call it, it takes 8 seconds. The puzzling thing is that I get it back up to speed if I re-initialize the coefficients to new random numbers.

Inputs and coefficients are in the 0.0 to 1.0 range. This does not change during the calculations, because after every recalculation I normalise the coefficients using: coefficients /= coefficients sum.

I use some 800 samples as values and I have 100 Arrays of 150 coefficients, so 800*100*150 calculations.

The only thing I can think of is that the Arrays of 150 inputs are mainly populated with zeroes, with 1 to 10 non-zero values. So a lot of the coefficients end up as '7.00649232162409e-45'.

Wikipedia says the closest-to-zero number in a 32-bit float is some 1e-38, so maybe the above numbers get special (and time-consuming) treatment in the primitives?

I suspect this because if I add 1E-30 to every coefficient during normalization I get a continuous speed of 2 seconds. But in this case I get a significant number of incremental GCs (I verified the GC percentage in all 3 cases).

Is this some bug to put on Mantis? Some other measure to take? I'm completely at a loss here.

MessageTallys follow for all three cases.

Thanks

Herbert

First call to MessageTally:
 - 953 tallies, 953 msec.

**Tree**
100.0% {953ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
59.3% {565ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
|22.6% {215ms} Neuron>>sofmLearnAtLearnRate:
| |7.7% {73ms} FloatArray>>*=
| |6.2% {59ms} FloatArray>>+=
| |5.4% {51ms} FloatArray>>-
| | |4.1% {39ms} FloatArray>>-=
| |3.4% {32ms} primitives
|13.9% {132ms} Neuron>>normalizeCoefficients
| |7.8% {74ms} FloatArray>>/=
| |6.1% {58ms} primitives
|9.5% {91ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
| |8.5% {81ms} primitives
|8.3% {79ms} Array(Collection)>>max
| |5.2% {50ms} Float(Magnitude)>>max:
| |2.9% {28ms} Array(Collection)>>inject:into:
| | 2.9% {28ms} Array(SequenceableCollection)>>do:
|4.2% {40ms} OrderedCollection>>at:
34.6% {330ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
|21.8% {208ms} Array(SequenceableCollection)>>withIndexDo:
|12.4% {118ms} Neuron>>calculateOutput
| 8.4% {80ms} FloatArray>>*
| |6.5% {62ms} FloatArray>>*=
| 4.0% {38ms} primitives
3.5% {33ms} OrderedCollection(Collection)>>remove:
3.5% {33ms} OrderedCollection>>remove:ifAbsent:
3.4% {32ms} primitives

**Leaves**
30.3% {289ms} Array(SequenceableCollection)>>withIndexDo:
14.2% {135ms} FloatArray>>*=
7.8% {74ms} FloatArray>>/=
6.2% {59ms} FloatArray>>+=
6.1% {58ms} Neuron>>normalizeCoefficients
5.9% {56ms} OrderedCollection>>at:
5.4% {51ms} Float(Magnitude)>>max:
4.1% {39ms} FloatArray>>-=
4.0% {38ms} Neuron>>calculateOutput
3.4% {32ms} OrderedCollection>>remove:ifAbsent:
3.4% {32ms} Neuron>>sofmLearnAtLearnRate:
2.9% {28ms} Array(SequenceableCollection)>>do:

**Memory**
old +0 bytes
young +161,996 bytes
used +161,996 bytes
free -161,996 bytes

**GCs**
full 0 totalling 0ms (0.0% uptime)
incr 155 totalling 39ms (4.0% uptime), avg 0.0ms
tenures 0
root table 0 overflows

********************************************************************

Second call to MessageTally:
 - 8215 tallies, 8215 msec.

**Tree**
100.0% {8215ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
82.4% {6769ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
|35.2% {2892ms} Neuron>>sofmLearnAtLearnRate:
| |17.2% {1413ms} FloatArray>>+=
| |17.1% {1405ms} FloatArray>>*=
|28.9% {2374ms} Neuron>>normalizeCoefficients
| |19.6% {1610ms} primitives
| |9.3% {764ms} FloatArray>>/=
|16.7% {1372ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
| 16.6% {1364ms} primitives
16.8% {1380ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
15.2% {1249ms} Array(SequenceableCollection)>>withIndexDo:

**Leaves**
31.8% {2612ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
19.6% {1610ms} Neuron>>normalizeCoefficients
17.9% {1470ms} FloatArray>>*=
17.2% {1413ms} FloatArray>>+=
9.3% {764ms} FloatArray>>/=

**Memory**
old +0 bytes
young +169,864 bytes
used +169,864 bytes
free -169,864 bytes

**GCs**
full 0 totalling 0ms (0.0% uptime)
incr 203 totalling 54ms (1.0% uptime), avg 0.0ms
tenures 0
root table 0 overflows

***********************************************************************

nth call to MessageTally adding a small constant in normalisation:
 - 2096 tallies, 2100 msec.

**Tree**
100.0% {2100ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
83.5% {1754ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
|57.2% {1201ms} Neuron>>normalizeCoefficients
| |51.5% {1082ms} FloatArray>>+=
| | |47.8% {1004ms} Fraction>>asFloat
| | | |39.3% {825ms} LargePositiveInteger(Integer)>>asFloat
| | | | |38.5% {809ms} primitives
| | | |3.5% {74ms} SmallInteger(Magnitude)>>max:
| | |3.7% {78ms} primitives
| |2.9% {61ms} primitives
| |2.8% {59ms} FloatArray>>/=
|10.5% {221ms} Neuron>>sofmLearnAtLearnRate:
| |3.6% {76ms} FloatArray>>-
| | |2.9% {61ms} FloatArray>>-=
| |3.2% {67ms} FloatArray>>*=
| |2.8% {59ms} FloatArray>>+=
|9.6% {202ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
| |8.9% {187ms} primitives
|3.0% {63ms} Array(Collection)>>max
|2.4% {50ms} OrderedCollection>>at:
13.7% {288ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
8.6% {181ms} Array(SequenceableCollection)>>withIndexDo:
4.9% {103ms} Neuron>>calculateOutput
3.5% {74ms} FloatArray>>*
2.3% {48ms} FloatArray>>*=

**Leaves**
38.5% {809ms} LargePositiveInteger(Integer)>>asFloat
17.5% {368ms} Array(SequenceableCollection)>>withIndexDo:
6.5% {137ms} FloatArray>>+=
6.2% {130ms} SmallInteger(Magnitude)>>max:
5.6% {118ms} FloatArray>>*=
3.3% {69ms} OrderedCollection>>at:
2.9% {61ms} FloatArray>>-=
2.9% {61ms} Neuron>>normalizeCoefficients
2.8% {59ms} FloatArray>>/=

**Memory**
old +0 bytes
young +14,744 bytes
used +14,744 bytes
free -14,744 bytes

**GCs**
full 0 totalling 0ms (0.0% uptime)
incr 972 totalling 320ms (15.0% uptime), avg 0.0ms
tenures 0
root table 0 overflows
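
For readers following along, here is a minimal sketch of the normalisation step and the "add 1E-30" workaround described above. It is written in Python rather than Squeak and is not the original Neuron>>normalizeCoefficients code. The names to_f32, is_subnormal_f32, normalize and count_subnormals are hypothetical helpers, and 32-bit storage is imitated by round-tripping through the struct module. It only shows which values fall below the smallest normal 32-bit float; it does not reproduce the timing.

```python
import struct

# Smallest positive *normal* 32-bit float (about 1.18e-38).
F32_MIN_NORMAL = 2.0 ** -126


def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def is_subnormal_f32(x):
    """True if x is non-zero but smaller than the smallest normal float32."""
    return 0.0 < abs(x) < F32_MIN_NORMAL


def normalize(coeffs, floor=0.0):
    """Divide by the sum, then optionally add a small constant (assumed workaround)."""
    total = sum(coeffs)
    return [to_f32(to_f32(c / total) + floor) for c in coeffs]


def count_subnormals(coeffs):
    return sum(1 for c in coeffs if is_subnormal_f32(c))


# Mostly-zero style data with one tiny coefficient, as reported in the post.
coeffs = [to_f32(v) for v in (0.5, 0.5, 7.00649232162409e-45, 0.0)]

plain = normalize(coeffs)
floored = normalize(coeffs, floor=1e-30)

print("subnormals without floor:", count_subnormals(plain))
print("subnormals with 1e-30 floor:", count_subnormals(floored))
```

With the floor, every coefficient (including exact zeroes) ends up at or above 1e-30, which is well inside the normal float32 range.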
Herbert König wrote:
> The only thing I can think of is that the Arrays of 150 inputs are
> mainly populated with zeroes and 1 to 10 non zero values. So a lot of
> the coefficients end up as '7.00649232162409e-45'.
>
> Wikipedia says the closest to zero number in 32 Bit float is some
> 1e-38 so maybe the above numbers get a special (and time consuming)
> treatment in the primitives?

It is possible (though it would require specific testing) that the resulting floating-point values cause underflow or similar FP exceptions and are handled specially. Only a well-defined benchmark could tell. But just out of curiosity, what processor are you running?

Cheers,
 - Andreas
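
As a quick check on the quoted value, the following Python sketch decodes the IEEE 754 single-precision fields of a number. It is an illustration only: f32_fields and classify_f32 are hypothetical helper names, and Python stands in for whatever the FloatArray primitives do internally. An exponent field of zero with a non-zero mantissa marks a subnormal (denormal) value, which is the kind of number that may take a slow path on some processors.

```python
import struct


def f32_fields(x):
    """Return (sign, exponent, mantissa) of x stored as a 32-bit IEEE 754 float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return sign, exponent, mantissa


def classify_f32(x):
    """Classify x as zero, subnormal, normal, inf or nan in float32 terms."""
    _, exponent, mantissa = f32_fields(x)
    if exponent == 0:
        return "zero" if mantissa == 0 else "subnormal"
    if exponent == 0xFF:
        return "inf" if mantissa == 0 else "nan"
    return "normal"


for value in (7.00649232162409e-45, 1e-30, 0.0, 0.5):
    print(value, f32_fields(value), classify_f32(value))
```

The value from the post decodes to exponent 0 and mantissa 5, i.e. 5 * 2^-149, which is below the smallest normal float32 of roughly 1.18e-38, whereas 1e-30 is an ordinary normal number.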
In reply to this post by Herbert König
Herbert,
I cannot help with your FloatArray problem, but this one caught my eye:

> 30.3% {289ms} Array(SequenceableCollection)>>withIndexDo:

What are you using #withIndexDo: for? One third is a rather large percentage (provided this is not an aliasing error with the FloatArray prims).

- Bert -