FloatArray : puzzled about varying speed

FloatArray : puzzled about varying speed

Herbert König
Hello Folks,

I have a method which mainly calculates with FloatArray using *=, /=
and friends. It mostly computes sum products, i.e. (inputs * coefficients) sum.

When I call it the first time, it finishes in about a second; the next
time I call it, it takes 8 seconds.

The puzzling thing is that I get it back up to speed if I re-initialize
the coefficients to new random numbers. Inputs and coefficients are in
the 0.0 to 1.0 range, and this does not change during the calculations,
as after every recalculation I normalise the coefficients using:
coefficients /= coefficients sum.

I use some 800 samples as input values and I have 100 Arrays of 150
coefficients each, so 800 * 100 * 150 calculations.

The only thing I can think of is that the Arrays of 150 inputs are
mainly populated with zeroes, with only 1 to 10 non-zero values, so a
lot of the coefficients end up as '7.00649232162409e-45'.

Wikipedia says the smallest normal number in a 32-bit float is about
1e-38 (values below that are denormalized), so maybe the above numbers
get special (and time-consuming) treatment in the primitives?
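
A quick way to check this hypothesis would be something like the
following sketch, where 1.2e-38 approximates the smallest normal
single-precision magnitude:

    "count coefficients that are non-zero but below the smallest normal
     32-bit float magnitude, i.e. denormalized values"
    coefficients count: [:c | c ~= 0.0 and: [c abs < 1.2e-38]]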

I suspect this because, if I add 1e-30 to every coefficient during
normalisation, I get a consistent speed of 2 seconds.

But in this case I get a significant number of incremental GCs
(I verified the GC percentage in all three cases).
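
An alternative workaround (sketch only, untested) would be to flush the
denormalized coefficients to exact zero before normalising instead of
adding a constant. Also, the Fraction>>asFloat and
LargePositiveInteger(Integer)>>asFloat time in the third tally below
suggests the added constant was not a plain Float literal;
FloatArray>>+= converts a scalar argument with asFloat, so writing it
as 1.0e-30 should avoid that cost.

    "flush tiny (denormalized) coefficients to zero before normalising"
    1 to: coefficients size do: [:i |
        (coefficients at: i) abs < 1.0e-38
            ifTrue: [coefficients at: i put: 0.0]].
    coefficients /= coefficients sum.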

Is this a bug I should put on Mantis? Is there some other measure to
take? I'm completely at a loss here.

MessageTallies follow for all three cases.
 
Thanks

Herbert                          mailto:[hidden email]


First call to MessageTally:

 - 953 tallies, 953 msec.

**Tree**
100.0% {953ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
  59.3% {565ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
    |22.6% {215ms} Neuron>>sofmLearnAtLearnRate:
    |  |7.7% {73ms} FloatArray>>*=
    |  |6.2% {59ms} FloatArray>>+=
    |  |5.4% {51ms} FloatArray>>-
    |  |  |4.1% {39ms} FloatArray>>-=
    |  |3.4% {32ms} primitives
    |13.9% {132ms} Neuron>>normalizeCoefficients
    |  |7.8% {74ms} FloatArray>>/=
    |  |6.1% {58ms} primitives
    |9.5% {91ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
    |  |8.5% {81ms} primitives
    |8.3% {79ms} Array(Collection)>>max
    |  |5.2% {50ms} Float(Magnitude)>>max:
    |  |2.9% {28ms} Array(Collection)>>inject:into:
    |  |  2.9% {28ms} Array(SequenceableCollection)>>do:
    |4.2% {40ms} OrderedCollection>>at:
  34.6% {330ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
    |21.8% {208ms} Array(SequenceableCollection)>>withIndexDo:
    |12.4% {118ms} Neuron>>calculateOutput
    |  8.4% {80ms} FloatArray>>*
    |    |6.5% {62ms} FloatArray>>*=
    |  4.0% {38ms} primitives
  3.5% {33ms} OrderedCollection(Collection)>>remove:
    3.5% {33ms} OrderedCollection>>remove:ifAbsent:
      3.4% {32ms} primitives

**Leaves**
30.3% {289ms} Array(SequenceableCollection)>>withIndexDo:
14.2% {135ms} FloatArray>>*=
7.8% {74ms} FloatArray>>/=
6.2% {59ms} FloatArray>>+=
6.1% {58ms} Neuron>>normalizeCoefficients
5.9% {56ms} OrderedCollection>>at:
5.4% {51ms} Float(Magnitude)>>max:
4.1% {39ms} FloatArray>>-=
4.0% {38ms} Neuron>>calculateOutput
3.4% {32ms} OrderedCollection>>remove:ifAbsent:
3.4% {32ms} Neuron>>sofmLearnAtLearnRate:
2.9% {28ms} Array(SequenceableCollection)>>do:

**Memory**
        old                     +0 bytes
        young           +161,996 bytes
        used            +161,996 bytes
        free            -161,996 bytes

**GCs**
        full                    0 totalling 0ms (0.0% uptime)
        incr            155 totalling 39ms (4.0% uptime), avg 0.0ms
        tenures         0
        root table      0 overflows

********************************************************************

Second call to MessageTally:

 - 8215 tallies, 8215 msec.

**Tree**
100.0% {8215ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
  82.4% {6769ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
    |35.2% {2892ms} Neuron>>sofmLearnAtLearnRate:
    |  |17.2% {1413ms} FloatArray>>+=
    |  |17.1% {1405ms} FloatArray>>*=
    |28.9% {2374ms} Neuron>>normalizeCoefficients
    |  |19.6% {1610ms} primitives
    |  |9.3% {764ms} FloatArray>>/=
    |16.7% {1372ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
    |  16.6% {1364ms} primitives
  16.8% {1380ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
    15.2% {1249ms} Array(SequenceableCollection)>>withIndexDo:

**Leaves**
31.8% {2612ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
19.6% {1610ms} Neuron>>normalizeCoefficients
17.9% {1470ms} FloatArray>>*=
17.2% {1413ms} FloatArray>>+=
9.3% {764ms} FloatArray>>/=

**Memory**
        old                     +0 bytes
        young           +169,864 bytes
        used            +169,864 bytes
        free            -169,864 bytes

**GCs**
        full                    0 totalling 0ms (0.0% uptime)
        incr            203 totalling 54ms (1.0% uptime), avg 0.0ms
        tenures         0
        root table      0 overflows

***********************************************************************

Nth call to MessageTally, adding a small constant during normalisation:

 - 2096 tallies, 2100 msec.

**Tree**
100.0% {2100ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
  83.5% {1754ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
    |57.2% {1201ms} Neuron>>normalizeCoefficients
    |  |51.5% {1082ms} FloatArray>>+=
    |  |  |47.8% {1004ms} Fraction>>asFloat
    |  |  |  |39.3% {825ms} LargePositiveInteger(Integer)>>asFloat
    |  |  |  |  |38.5% {809ms} primitives
    |  |  |  |3.5% {74ms} SmallInteger(Magnitude)>>max:
    |  |  |3.7% {78ms} primitives
    |  |2.9% {61ms} primitives
    |  |2.8% {59ms} FloatArray>>/=
    |10.5% {221ms} Neuron>>sofmLearnAtLearnRate:
    |  |3.6% {76ms} FloatArray>>-
    |  |  |2.9% {61ms} FloatArray>>-=
    |  |3.2% {67ms} FloatArray>>*=
    |  |2.8% {59ms} FloatArray>>+=
    |9.6% {202ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
    |  |8.9% {187ms} primitives
    |3.0% {63ms} Array(Collection)>>max
    |2.4% {50ms} OrderedCollection>>at:
  13.7% {288ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
    8.6% {181ms} Array(SequenceableCollection)>>withIndexDo:
    4.9% {103ms} Neuron>>calculateOutput
      3.5% {74ms} FloatArray>>*
        2.3% {48ms} FloatArray>>*=

**Leaves**
38.5% {809ms} LargePositiveInteger(Integer)>>asFloat
17.5% {368ms} Array(SequenceableCollection)>>withIndexDo:
6.5% {137ms} FloatArray>>+=
6.2% {130ms} SmallInteger(Magnitude)>>max:
5.6% {118ms} FloatArray>>*=
3.3% {69ms} OrderedCollection>>at:
2.9% {61ms} FloatArray>>-=
2.9% {61ms} Neuron>>normalizeCoefficients
2.8% {59ms} FloatArray>>/=

**Memory**
        old                     +0 bytes
        young           +14,744 bytes
        used            +14,744 bytes
        free            -14,744 bytes

**GCs**
        full                    0 totalling 0ms (0.0% uptime)
        incr            972 totalling 320ms (15.0% uptime), avg 0.0ms
        tenures         0
        root table      0 overflows



Re: FloatArray : puzzled about varying speed

Andreas.Raab
Herbert König wrote:
> The only thing I can think of is that the Arrays of 150 inputs are
> mainly populated with zeroes and 1 to 10 non zero values. So a lot of
> the coefficients end up as  '7.00649232162409e-45'.
>
> Wikipedia says the closest to zero number in 32 Bit float is some
> 1e-38 so maybe the above numbers get a special (and time consuming)
> treatment in the primitives?

It is possible (though it would require specific testing) that the
resulting floating-point values cause underflow or similar FP exceptions
and may be handled specially. Only a well-defined benchmark could tell.
But just out of curiosity, what processor are you running?

Cheers,
   - Andreas
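
A benchmark along those lines might look like the following sketch: it
times FloatArray>>*= over an array of ordinary values versus an array
filled with a denormalized value, multiplying by 1.0 so the contents do
not change between iterations.

    "micro-benchmark: same bulk operation, normal vs. denormalized contents"
    | normals denormals |
    normals := FloatArray new: 150 withAll: 0.5.
    denormals := FloatArray new: 150 withAll: 7.0e-45.
    Transcript show: 'normal: ',
        [100000 timesRepeat: [normals *= 1.0]] timeToRun printString, ' ms'; cr.
    Transcript show: 'denormal: ',
        [100000 timesRepeat: [denormals *= 1.0]] timeToRun printString, ' ms'; cr.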


Re: FloatArray : puzzled about varying speed

Bert Freudenberg
In reply to this post by Herbert König
Herbert,

I cannot help with your FloatArray problem, but this one caught my eye:

> 30.3% {289ms} Array(SequenceableCollection)>>withIndexDo:

What are you using #withIndexDo: for? One third is a rather large
percentage (provided this is not an aliasing error with the
FloatArray prims).

- Bert -