"PMVector sum" is extremely slow, and two more issues

Atharva Khare

Hey everyone,


I found three issues in the library; they are listed below in order of severity. Please go through them and suggest better fixes, if any.


1. "PMVector sum" - Here is a comparison with Array:


a := ((1 to: 5000) collect: [ :i | i ]) asPMVector.
a sum. "Best of 10 runs: 568ms"

b := ((1 to: 25000000) collect: [ :i | i ]).
b sum. "Worst of 10 runs: 551ms"


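PMVector takes about as long to sum 5,000 elements as Array takes to sum 25,000,000, which means that PMVector sum is at least 5,000 times slower per element than Collection sum.

Possible fix: Don't override the sum method in PMVector. Is there any specific reason for the override?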
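
For anyone who wants to reproduce the numbers, here is a rough Playground sketch of the measurement. It assumes PolyMath is loaded (for asPMVector) and uses Time millisecondsToRun: for timing; the variable names bestA, bestB and slowdown are only illustrative, and the timings will differ from machine to machine.

```smalltalk
"Sketch: time ten runs of each sum and keep the fastest one."
| a b bestA bestB slowdown |
a := ((1 to: 5000) collect: [ :i | i ]) asPMVector.
b := (1 to: 25000000) collect: [ :i | i ].

"Each inner block answers the elapsed milliseconds of one run."
bestA := ((1 to: 10) collect: [ :run | Time millisecondsToRun: [ a sum ] ]) min.
bestB := ((1 to: 10) collect: [ :run | Time millisecondsToRun: [ b sum ] ]) min.

"Per-element slowdown of PMVector sum relative to Array sum.
 Assumes bestB is not 0, which holds for a collection this large."
slowdown := (bestA / a size) / (bestB / b size).
```

Comparing time per element rather than total time is what makes the two runs comparable, since the collections differ in size by a factor of 5,000.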


2. PMVector > and < operators modify the receiver in place:

x := #(1 2.5 3 4) asPMVector.
x > 2.
x. "==> a PMVector(false true true true)"

I have created a PR for it, here: https://github.com/PolyMathOrg/PolyMath/pull/123
Should additional operators like >= and <= be added in a similar fashion as well?
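
Until the fix is merged, a possible workaround is to build the boolean mask with collect:, which answers a new collection instead of changing the receiver. This is only a sketch and assumes PMVector inherits the standard collect: behaviour; the variable name mask is illustrative.

```smalltalk
"Sketch: element-wise comparison without touching the original vector."
| x mask |
x := #(1 2.5 3 4) asPMVector.

"collect: answers a new collection and leaves the receiver untouched."
mask := x collect: [ :each | each > 2 ].

"x still holds 1, 2.5, 3 and 4; mask holds false, true, true and true."
```

The same pattern works for >= and <= by changing the comparison inside the block.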

3. PMStandardScaler fails when scale = 0, which happens when all elements in a column are equal. Possible fix: add a "handleZeroScale" method that changes scale to 1 when it is 0.
Question: Should this method be added to PMDataTransformer? If we add more scalers in the future, all of them would need this message.
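
To make the proposed behaviour concrete, here is a minimal sketch of the substitution that "handleZeroScale" could perform. It is written against a plain Array, not against the actual PMStandardScaler internals; the variable names scales and safeScales are hypothetical stand-ins for the per-column scale values.

```smalltalk
"Hypothetical sketch: replace every zero scale with 1 so that a later
 division by the scale cannot fail for constant columns."
| scales safeScales |
scales := #(2.5 0 1.2).

safeScales := scales collect: [ :each |
	each = 0
		ifTrue: [ 1 ]
		ifFalse: [ each ] ].

"safeScales holds 2.5, 1 and 1.2; scales itself is unchanged."
```

With a scale of 1, a constant column is only centred and not rescaled, so every value in it maps to 0.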

--
You received this message because you are subscribed to the Google Groups "PolyMath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/polymath-project/3afcf939-5715-41dd-a35e-d12b98327bc9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Re: "PMVector sum" is extremely slow, and two more issues

Atharva Khare
Forgot to link the issues:
1. PMVector sum - https://github.com/PolyMathOrg/PolyMath/issues/125
2. PMVector > < operators - https://github.com/PolyMathOrg/PolyMath/issues/122
3. PMStandardScaler scale = 0 - https://github.com/PolyMathOrg/PolyMath/issues/124

On Sunday, May 26, 2019 at 5:59:24 PM UTC+5:30, Atharva Khare wrote:

Re: "PMVector sum" is extremely slow, and two more issues

SergeStinckwich
In reply to this post by Atharva Khare

On Sun, May 26, 2019 at 1:29 PM Atharva Khare <[hidden email]> wrote:

Hey everyone,


Dear Atharva,

 

I tried removing PMVector>>sum, and sum now runs much faster.
All the tests are green. I agree that this method should be removed.

Thank you for your commitment to PolyMath.

--
Serge Stinckwich

Int. Research Unit on Modelling/Simulation of Complex Systems (UMMISCO)
Sorbonne University (SU)
French National Research Institute for Sustainable Development (IRD)
University of Yaoundé I, Cameroon
"Programs must be written for people to read, and only incidentally for machines to execute."
https://twitter.com/SergeStinckwich
