Singular Value Decomposition
Table of Contents
1. Singular Value Decomposition (SVD)
   1.1. Eigendecomposition
   1.2. Principal Component Analysis (PCA)
1. Singular Value Decomposition (SVD)

What if we had a bunch of data and we didn't really know much about it? We'd like to look for patterns in the data and separate them out, so that we can understand the data better. We can use SVD to do this.

SVD states that any matrix can be represented as the product of three matrices:

$$A = U \Sigma V^T$$

- $U$ → rotation (applied last)
- $\Sigma$ → scaling
- $V^T$ → rotation (applied first)

For example (values rounded to two decimals):

$$
\begin{bmatrix} 1 & -1 & 2 \\ 3 & 2 & -2 \end{bmatrix}
\approx
\begin{bmatrix} -0.25 & 0.97 \\ 0.97 & 0.25 \end{bmatrix}
\begin{bmatrix} 4.21 & 0 & 0 \\ 0 & 2.29 & 0 \end{bmatrix}
\begin{bmatrix} 0.63 & 0.52 & -0.58 \\ 0.75 & -0.21 & 0.63 \\ -0.21 & 0.83 & 0.52 \end{bmatrix}
$$

Note: If we square each diagonal element of $\Sigma$ and divide it by the sum of all squared diagonal elements, we get the fraction of the variance explained by the corresponding column of $U$ (for mean-centered data). In the example above, the share belonging to the first column of $U$, $[-0.25, 0.97]^T$, is $\frac{4.21^2}{4.21^2 + 2.29^2} \approx 0.77$.

Note: The third column of $\Sigma$ is all zeros, so the third row of $V^T$ does not contribute to the product.

1.1. Eigendecomposition

Eigendecomposition states that a diagonalizable square matrix can be broken down into eigenvectors and eigenvalues.

A few problems with eigendecomposition:

- It only works on square matrices.
- The eigenvalues are not necessarily real and non-negative, so they cannot simply be normalized to lie between 0 and 1.
- The eigenvectors are not necessarily perpendicular.

SVD solves these problems as follows:

- It works on any matrix, not only square ones.
- The diagonal of $\Sigma$ holds the square roots of the eigenvalues of $AA^T$. These singular values are always real and non-negative, so after normalization they lie between 0 and 1.
- The rows of $V^T$ are the eigenvectors of $A^TA$, which can always be chosen perpendicular because $A^TA$ is symmetric.
- To get the columns of $U$, we can compute $u_i = \frac{A v_i}{\sigma_i}$ for each nonzero singular value $\sigma_i$.

We can think of SVD as a generalized version of eigendecomposition.

1.2. Principal Component Analysis (PCA)

The eigendecomposition of a symmetric matrix $C$ is

$$C = V L V^T$$

Now take a data matrix $A$ with $N$ rows (samples), standardize each column (i.e. subtract the mean and divide by the standard deviation), and divide $A^TA$ by $N-1$. This gives the correlation matrix:

$$C = \frac{A^T A}{N-1}$$

PCA is the eigendecomposition of this matrix. The problem is that forming $A^TA$ explicitly is typically not numerically stable. So, instead, PCA is usually computed by applying SVD to the standardized matrix $A$. In this case, the $U\Sigma$ term gives the principal components.

Note: SVD and PCA can be used for dimensionality reduction.

Note: SVD and PCA assume linear relationships between the features. There are also non-linear dimensionality reduction techniques, such as Kernel PCA.
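The relationships in sections 1, 1.1 and 1.2 can be checked numerically. The following is a minimal sketch in Python with NumPy (the choice of language and library is an assumption, as the text does not name one). The first half uses the 2 x 3 example matrix from section 1. The second half runs PCA through SVD on a small made-up data set; the names `X`, `Z`, `components` and `explained` are illustrative only.

```python
import numpy as np

# Example matrix from section 1 (2 x 3).
A = np.array([[1.0, -1.0, 2.0],
              [3.0, 2.0, -2.0]])

# Full SVD: U is 2x2, s holds the singular values, Vt is 3x3.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the 2x3 Sigma and check that U @ Sigma @ Vt reproduces A.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
assert np.allclose(U @ Sigma @ Vt, A)

# Singular values are the square roots of the eigenvalues of A A^T.
eigvals = np.linalg.eigvalsh(A @ A.T)[::-1]  # eigvalsh is ascending; reverse it
assert np.allclose(s, np.sqrt(eigvals))

# Each column of U follows from u_i = A v_i / sigma_i.
for i in range(len(s)):
    assert np.allclose(U[:, i], A @ Vt[i] / s[i])

print("singular values:", np.round(s, 2))  # approximately [4.21 2.29]

# --- PCA via SVD on made-up data (200 samples, 3 features) ---
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] = 0.8 * X[:, 0] + 0.2 * X[:, 2]  # make two features correlated

# Standardize each column: subtract the mean, divide by the standard deviation.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
N = Z.shape[0]

# Thin SVD of the standardized data; no need to form Z^T Z explicitly.
Uz, sz, Vzt = np.linalg.svd(Z, full_matrices=False)

# Principal components are U Sigma, which equals the data projected onto V.
components = Uz * sz
assert np.allclose(components, Z @ Vzt.T)

# Share of variance per component is proportional to the squared singular value.
explained = sz**2 / np.sum(sz**2)

# Cross-check against the eigenvalues of the correlation matrix Z^T Z / (N - 1).
corr = Z.T @ Z / (N - 1)
corr_eigvals = np.linalg.eigvalsh(corr)[::-1]
assert np.allclose(sz**2 / (N - 1), corr_eigvals)

print("explained variance ratio:", np.round(explained, 3))
```

Keeping only the first k columns of `components` gives a k-dimensional representation of the data, which is the dimensionality-reduction use mentioned in the notes.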