
PCA implementation issue


SergeStinckwich

PCA implementation issue

Dear all,

I am trying to implement a PCA (Principal Component Analysis) algorithm based on Didier's existing code.

Here is an example using the latest version of PolyMath:
==============================================================
m := PMMatrix rows: #(#(-1 -1) #(-2 -1) #(-3 -2) #(1 1) #(2 1) #(3 2)).

"Compute PCA components"
pca := PMPrincipalComponentAnalyser new.
pca componentsNumber: 1.
pca fit: m.

"Return eigenvalues"
pca components.
 "#(6.616285933932035)"

"Return eigenvectors"
pca transformMatrix.
 "a PMVector(0.8384922379048739 -0.5449135408239332)"

pca transform: m.
"a PMVector(-0.29357869708094075)
a PMVector(-1.1320709349858147)
a PMVector(-1.4256496320667553)
a PMVector(0.29357869708094075)
a PMVector(1.1320709349858147)
a PMVector(1.4256496320667553)"

================================================================

When I do something similar in Python, I do not get exactly the same eigenvectors. The signs appear to be inverted on the diagonal, and I do not know why.
=================================================================
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> pca = PCA(n_components=1)
>>> pca.fit(X)
PCA(copy=True, iterated_power='auto', n_components=1, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)
>>> pca.components_
array([[-0.83849224, -0.54491354]])
>>> pca.transform(X)
array([[ 1.38340578],
       [ 2.22189802],
       [ 3.6053038 ],
       [-1.38340578],
       [-2.22189802],
       [-3.6053038 ]])
===============================================================

I could implement PCA with SVD instead (as scikit-learn does), but I would rather save the time.

The PolyMath implementation of PCA uses a Jacobi transformation of the covariance matrix of the data. The covariance matrix is symmetric, so the Jacobi transformation should work correctly, I think.
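For comparison, the covariance-based approach can be sketched in NumPy. This is only an illustration, not PolyMath's code: it swaps the Jacobi iteration for `numpy.linalg.eigh`, and the function name `pca_eigh` and its `ddof` parameter are made up for this example.

```python
import numpy as np

def pca_eigh(X, n_components, ddof=1):
    """PCA via eigendecomposition of the covariance matrix (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each column
    cov = np.cov(Xc, rowvar=False, ddof=ddof)    # symmetric covariance matrix
    # eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, order].T        # one component per row
    projected = Xc @ components.T                # scores of each sample
    return eigenvalues[order], components, projected

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
eigenvalues, components, projected = pca_eigh(X, 1)
print(eigenvalues)
print(components)   # determined only up to an overall sign
print(projected)
```

With `ddof=1` the leading eigenvalue is about 7.94; with `ddof=0` it is about 6.616, which appears to match the value PolyMath reports above. That PolyMath divides by n rather than n - 1 is an inference from the numbers, not something confirmed in its source. Note that `eigh` fixes each eigenvector only up to an overall sign.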

Does anyone have an idea of what is happening here?

Thank you.

A+

Serge Stinckwich
UMI UMMISCO 209 (SU/IRD/UY1)

"Programs must be written for people to read, and only incidentally for machines to execute."

http://www.doesnotunderstand.org/


SergeStinckwich

Re: PCA implementation issue

I have now implemented a PCA based on SVD, and I obtain the same result as with the Jacobi transformation.
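A small NumPy check of why the two routes are expected to agree, written as a sketch rather than as the PolyMath code: the right singular vectors of the centered data are eigenvectors of its covariance matrix, and the squared singular values divided by n - 1 are the eigenvalues. The variable names are chosen for this example only.

```python
import numpy as np

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)
Xc = X - X.mean(axis=0)

# Route 1: SVD of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_component = Vt[0]
svd_variance = s[0] ** 2 / (len(X) - 1)

# Route 2: eigendecomposition of the (symmetric) covariance matrix.
w, V = np.linalg.eigh(np.cov(Xc, rowvar=False))
eig_component = V[:, -1]       # eigh sorts ascending, so the last column is the largest
eig_variance = w[-1]

# Unit vectors along the same axis have a dot product of +1 or -1.
same_axis = np.isclose(abs(svd_component @ eig_component), 1.0)
print(same_axis, np.isclose(svd_variance, eig_variance))
```

Both routes give the same axis and the same variance; only the orientation of the axis is left undetermined.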

I think I found the explanation about the sign differences with the Python version:

https://stackoverflow.com/questions/44765682/in-sklearn-decomposition-pca-why-are-components-negative#44847053

I will implement a function that flips the signs, in order to get the same answer as in Python. This will make it easier to compare results with scikit-learn.
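One possible sign convention, sketched in NumPy: for each component, find the sample with the largest absolute score and orient the component so that this score is positive. The helper name `flip_signs` is hypothetical, and the rule is only modelled on the idea behind scikit-learn's sign flipping. That it reproduces scikit-learn's output in every version is an assumption, since the convention used there may differ between releases.

```python
import numpy as np

def flip_signs(components, scores):
    """Orient each component so its largest-magnitude score is positive.

    components: (k, d) array, one component per row.
    scores:     (n, k) array, projection of each sample on each component.
    """
    components = np.array(components, dtype=float)
    scores = np.array(scores, dtype=float)
    anchor_rows = np.argmax(np.abs(scores), axis=0)            # one row index per component
    signs = np.sign(scores[anchor_rows, np.arange(scores.shape[1])])
    signs[signs == 0] = 1.0                                    # leave all-zero columns alone
    return components * signs[:, np.newaxis], scores * signs

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
raw = Vt[:1]

# Whatever orientation the solver returns, the convention gives the same answer.
from_pos, scores_pos = flip_signs(raw, Xc @ raw.T)
from_neg, scores_neg = flip_signs(-raw, Xc @ (-raw).T)
print(from_pos)
print(np.allclose(from_pos, from_neg))
```

On this data set the rule picks the orientation shown in the Python session above, with both loadings negative. The point of any such convention is that the result no longer depends on which sign the underlying solver happens to return.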

A+


SergeStinckwich

Re: PCA implementation issue

I opened an issue describing the current situation:

https://github.com/PolyMathOrg/PolyMath/issues/81
