## Principal component analysis: problem statement
Map $x \in \mathbb{R}^d$ to $z \in \mathbb{R}^q$ with $q < d$.

A $q \times d$ matrix $A$ can represent a linear mapping:

$$z = Ax$$
Assume that $A A^T = I$ (the rows of $A$ are orthonormal).
## Minimising reconstruction error
Given $X \in \mathbb{R}^{d \times n}$, find $A$ that minimises the reconstruction error:

$$\min_{A,B} \sum_{i} \| x^i - B A x^i \|_2^2$$

If $q = d$, the error is zero (then $A$ is square and orthogonal, so $B = A^{-1}$ recovers $x^i$ exactly).
Solution:

$$B = A^T$$

$$\min_{A} \sum_i \| x^i - A^T A x^i \|^2 \quad \text{subject to } A A^T = I_{q \times q}$$

assuming the data is centered, i.e. $\frac{1}{n} \sum_{i} x^i = \begin{bmatrix} 0 & \cdots & 0 \end{bmatrix}^T$.
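As a minimal NumPy sketch (NumPy and the random data are assumptions; the notes name no library), the following builds an orthonormal-row $A$, compresses with $z = Ax$, and reconstructs with $B = A^T$:

```python
import numpy as np

rng = np.random.default_rng(0)
d, q, n = 5, 2, 100

# Orthonormal rows for A (q x d): take the Q factor of a random d x q matrix.
Q, _ = np.linalg.qr(rng.standard_normal((d, q)))
A = Q.T                          # shape (q, d); satisfies A @ A.T = I_q

X = rng.standard_normal((d, n))  # columns are the samples x^i
Z = A @ X                        # compressed representation, q x n
X_hat = A.T @ Z                  # reconstruction with B = A^T

# Reconstruction error Σ_i ||x^i − AᵀA x^i||²
err = np.sum((X - X_hat) ** 2)
```

This only demonstrates the mapping and the constraint; the random $A$ here is not the error-minimising one.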
## Eigenvalue decomposition
$$\begin{aligned}
X^T X u &= \lambda u \\
X^T X &= U^T \Lambda U \\
\text{where } \Lambda &= \operatorname{diag}(\lambda_1, \lambda_2, \cdots, \lambda_d) \\
&= \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_d \end{bmatrix}
\end{aligned}$$
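A small NumPy check of this decomposition (the symmetric matrix below is a made-up stand-in). Note `np.linalg.eigh` returns eigenvectors as *columns* of a matrix $V$ with $C = V \Lambda V^T$, so in the notes' convention $U = V^T$:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
C = M @ M.T                      # symmetric PSD matrix (stand-in for XᵀX)

# eigh: eigenvalues in ascending order, eigenvectors as columns of V.
lam, V = np.linalg.eigh(C)
U = V.T                          # rows of U are eigenvectors: C = Uᵀ Λ U

Lam = np.diag(lam)
reconstructed = U.T @ Lam @ U    # should equal C

# Each eigenpair satisfies C u = λ u.
checks = [np.allclose(C @ u, l * u) for u, l in zip(U, lam)]
```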
## PCA
Idea: given inputs $x^1, \cdots, x^n \in \mathbb{R}^d$, let $\mu = \frac{1}{n} \sum_{i} x^i$.

Then

$$C = \sum_i (x^i - \mu)(x^i - \mu)^T$$
Find the eigenvectors/eigenvalues of $C$:

$$C = U^T \Lambda U$$
The optimal $A$ stacks the $q$ eigenvectors with the largest eigenvalues as its rows:

$$A = \begin{bmatrix}
u_1^T \\
u_2^T \\
\vdots \\
u_q^T
\end{bmatrix}$$
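Putting the steps together, here is a hedged end-to-end sketch in NumPy (the function name `pca` and the synthetic data are mine, not from the notes): center the data, form $C$, eigendecompose, and keep the top-$q$ eigenvector rows.

```python
import numpy as np

def pca(X, q):
    """PCA via eigendecomposition of the scatter matrix.

    X: (d, n) data matrix whose columns are the samples x^i.
    Returns A (q, d) whose rows are top-q eigenvectors of C, and the mean mu.
    """
    mu = X.mean(axis=1, keepdims=True)   # μ = (1/n) Σ_i x^i
    Xc = X - mu                          # center the data
    C = Xc @ Xc.T                        # C = Σ_i (x^i − μ)(x^i − μ)ᵀ
    lam, V = np.linalg.eigh(C)           # eigenvalues ascending, eigvecs in columns
    idx = np.argsort(lam)[::-1][:q]      # indices of the q largest eigenvalues
    A = V[:, idx].T                      # rows u_1ᵀ, …, u_qᵀ
    return A, mu

# Synthetic data: variance mostly in the first two coordinates.
rng = np.random.default_rng(2)
X = rng.standard_normal((3, 200)) * np.array([[5.0], [1.0], [0.1]])

A, mu = pca(X, q=2)
Z = A @ (X - mu)         # project:      z = A (x − μ)
X_hat = A.T @ Z + mu     # reconstruct:  x̂ = Aᵀ z + μ
```

Keeping the two dominant directions, the reconstruction error is small relative to the total centered scatter of the data.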