Java PCA transformation library

I wrote a Java library implementing the Principal Component Analysis (PCA) data transformation. About two years ago I needed such an algorithm implemented in Java and could not find one, hence this library.

The library, along with a more detailed description and binary and source packages, is available on GitHub: https://github.com/mkobos/pca_transform


4 Responses to Java PCA transformation library

  1. Majka says:

    Great, this is exactly what I was looking for :). Out of curiosity: Kobos from V LO? :D

  2. Ming says:

    How can I reconstruct the data after the PCA transformation?
    Below is the code I used. It is not working. Can you advise?

    for (int i = 0; i < transformedData.getRowDimension(); i++) {
        for (int k = 0; k < PCA_Dimension; k++) {
            for (int j = 0; j < transformedData.getColumnDimension(); j++) {
                restoredData[i][j] = restoredData[i][j] + transformedData.get(i, k)
                        * pca.getEigenvalue(k) * pca.getEigenvectorsMatrix().get(k, j);
            }
        }
    }

    • mateusz says:

      I assume that you use the whitening transformation. In that case, the transformation from the original space to the transformed space is Z = X V D^{-1/2}, where X is the original data matrix (data points are rows), V is the eigenvector matrix (eigenvectors are columns), D is the diagonal matrix with the eigenvalues on its diagonal, and Z is the transformed data matrix (data points are rows). The reverse transformation is X = Z D^{1/2} V^{-1} = Z D^{1/2} V^T, where the last step holds because V is orthogonal. It recovers the original vectors only if the number of dimensions hasn’t been reduced (see e.g. http://stats.stackexchange.com/a/229093/43697 for an explanation of the version with the rotation transformation).
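
      Written element by element, the reverse transformation makes the index order explicit. The symbol \lambda_k for the k-th eigenvalue (the k-th diagonal entry of D) is notation introduced here for illustration; everything else follows the definitions above.

```latex
% Reverse whitening, element-wise: (V^T)_{kj} = V_{jk}
X_{ij} = \sum_{k} Z_{ik} \, \sqrt{\lambda_k} \, V_{jk}

% Consistency check with the forward transformation Z = X V D^{-1/2}
Z D^{1/2} V^{T} = X V D^{-1/2} D^{1/2} V^{T} = X V V^{T} = X
```

      The eigenvector matrix is indexed as (j, k), not (k, j), and the eigenvalue enters under a square root.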

      In that context, I can see two problems with your code: the indexes are swapped when reading from `pca.getEigenvectorsMatrix()` (i.e., the V matrix), and the square root of the eigenvalue is missing.

      for (int i = 0; i < transformedData.getRowDimension(); i++) {
          for (int k = 0; k < pca.getOutputDimsNo(); k++) {
              for (int j = 0; j < transformedData.getColumnDimension(); j++) {
                  restoredData[i][j] = restoredData[i][j] + transformedData.get(i, k)
                          * Math.sqrt(pca.getEigenvalue(k)) * pca.getEigenvectorsMatrix().get(j, k);
              }
          }
      }

      Of course, this assumes that the `restoredData` matrix is initialized with zeros.
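
      To check the formulas independently of the library, the round trip can be sketched with plain arrays. This is a minimal illustration, not the library's implementation: the class `WhiteningRoundTrip`, its methods `whiten` and `restore`, and the sample eigenvectors and eigenvalues are all hypothetical, and the data is assumed to be already centered.

```java
import java.util.Arrays;

// Illustrative sketch only: plain arrays, no pca_transform or Jama classes.
// All names and sample values here are made up for the example.
class WhiteningRoundTrip {

    // Z = X V D^{-1/2}: project each row onto the eigenvectors (columns of v),
    // then divide by the square root of the matching eigenvalue.
    static double[][] whiten(double[][] x, double[][] v, double[] eigenvalues) {
        int rows = x.length, inDims = v.length, outDims = eigenvalues.length;
        double[][] z = new double[rows][outDims];
        for (int i = 0; i < rows; i++) {
            for (int k = 0; k < outDims; k++) {
                double sum = 0.0;
                for (int j = 0; j < inDims; j++) {
                    sum += x[i][j] * v[j][k];
                }
                z[i][k] = sum / Math.sqrt(eigenvalues[k]);
            }
        }
        return z;
    }

    // X = Z D^{1/2} V^T: note the (j, k) index order on v and the square root.
    static double[][] restore(double[][] z, double[][] v, double[] eigenvalues) {
        int rows = z.length, inDims = v.length, outDims = eigenvalues.length;
        double[][] x = new double[rows][inDims]; // Java zero-initializes arrays
        for (int i = 0; i < rows; i++) {
            for (int k = 0; k < outDims; k++) {
                for (int j = 0; j < inDims; j++) {
                    x[i][j] += z[i][k] * Math.sqrt(eigenvalues[k]) * v[j][k];
                }
            }
        }
        return x;
    }

    public static void main(String[] args) {
        // Orthonormal eigenvectors as columns, with assumed eigenvalues 4 and 1.
        double[][] v = {{0.6, -0.8}, {0.8, 0.6}};
        double[] eigenvalues = {4.0, 1.0};
        double[][] x = {{1.0, 2.0}, {-3.0, 0.5}};

        double[][] z = whiten(x, v, eigenvalues);
        double[][] restored = restore(z, v, eigenvalues);

        System.out.println("whitened: " + Arrays.deepToString(z));
        System.out.println("restored: " + Arrays.deepToString(restored));
    }
}
```

      With all dimensions kept, `restored` matches `x` up to floating-point rounding. If fewer eigenvectors are kept, the same loops return only an approximation of the original data.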
