Advance AI MCE-E205


Deep Neural Network

PCA (Principal Component Analysis)


PCA (Principal Component Analysis) is a dimensionality reduction technique that transforms high-dimensional data into a smaller set of new variables called principal components, while retaining as much of the original variance as possible.


The core idea

Imagine you have data with 100 features. Many of those features are correlated — they carry overlapping information. PCA finds new axes (principal components) that capture the maximum variance in the data, so you can represent the same data with far fewer dimensions without losing much information.


How it works (step by step)

1. Standardize the data — subtract mean, divide by standard deviation so all features are on the same scale.

2. Compute the covariance matrix — measures how features vary together. For n features, this is an n×n matrix.

3. Compute eigenvectors and eigenvalues — eigenvectors define the directions of maximum variance (the new axes); eigenvalues tell you how much variance each direction captures.

4. Sort by eigenvalue — rank the eigenvectors from highest to lowest eigenvalue. The first eigenvector explains the most variance.

5. Project the data — multiply the standardized data by the top-k eigenvectors to get k-dimensional data.

The key equation: if X is your standardized data matrix (m samples × n features) and V is the n×k matrix of top-k eigenvectors, then the reduced data is simply Z = X · V, an m×k matrix.
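The five steps above can be sketched from scratch in NumPy (the function name `pca` and the random toy data here are my own, not part of the lesson):

```python
import numpy as np

def pca(X, k):
    """PCA from scratch, following the five steps above."""
    # 1. Standardize: zero mean, unit variance per feature.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    # 2. Covariance matrix (n x n for n features).
    cov = np.cov(X_std, rowvar=False)
    # 3. Eigenvectors and eigenvalues of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # 4. Sort by eigenvalue, highest first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 5. Project onto the top-k eigenvectors: Z = X . V
    V = eigvecs[:, :k]
    return X_std @ V, eigvals

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))   # 200 samples, 5 features
Z, eigvals = pca(X, k=2)
print(Z.shape)                  # (200, 2)
```

A nice sanity check: the variance of each column of Z equals the corresponding eigenvalue, which is exactly what "eigenvalues tell you how much variance each direction captures" means.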


Simple example

Suppose you have height and weight data for 1000 people. These two are highly correlated — taller people tend to weigh more. PCA would find one principal component (roughly “overall body size”) that captures most of the variance, letting you represent each person with one number instead of two.
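The height/weight example can be simulated directly (the specific means, slopes, and noise levels below are my own choices, just to make the data strongly correlated):

```python
import numpy as np

rng = np.random.default_rng(1)
height = rng.normal(170, 10, size=1000)                 # cm
weight = 0.9 * height - 85 + rng.normal(0, 4, 1000)     # kg, tracks height closely

# Standardize, then eigendecompose the 2x2 covariance matrix.
X = np.column_stack([height, weight])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, _ = np.linalg.eigh(np.cov(X_std, rowvar=False))

explained = eigvals[::-1] / eigvals.sum()   # variance share, largest first
print(explained)   # first component carries most of the variance
```

With data this correlated, the first component ("overall body size") captures well over 90% of the variance, so one number per person loses very little.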


What PCA is used for

It’s used for data visualization (reducing to 2D or 3D to plot), speeding up machine learning (fewer features = faster training), noise reduction, and as a preprocessing step before clustering or classification.
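The noise-reduction use is worth a quick sketch. Here PCA is computed via SVD of the centered data, which is an equivalent route to the principal components; the low-rank toy data is my own construction:

```python
import numpy as np

rng = np.random.default_rng(2)
# "Clean" signal: 300 samples lying in a 2-D subspace of 10-D space.
basis = rng.normal(size=(2, 10))
clean = rng.normal(size=(300, 2)) @ basis
noisy = clean + rng.normal(0, 0.3, size=clean.shape)

# PCA via SVD of the centered data; keep the top 2 components and
# reconstruct. Variance off the 2-D subspace is mostly noise and is dropped.
mean = noisy.mean(axis=0)
_, _, Vt = np.linalg.svd(noisy - mean, full_matrices=False)
denoised = (noisy - mean) @ Vt[:2].T @ Vt[:2] + mean

err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)
print(err_noisy, err_denoised)   # denoised error is smaller
```

The reconstruction keeps the noise that falls inside the top-2 subspace but discards the noise in the other 8 directions, which is why the error drops.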


PCA vs Autoencoders (relevant to your course)

Since you’re teaching Unit 1, this connection is worth making in class. PCA finds a linear lower-dimensional representation. Autoencoders do the same thing but with nonlinear transformations via neural network layers, so they can capture far more complex structure in the data. In fact, a linear autoencoder (no activation functions) trained with mean-squared-error loss learns the same subspace that PCA finds, so PCA is essentially the linear special case of an autoencoder.
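To make the connection concrete in class, the PCA projection can be written in encoder/decoder form (the data and variable names below are mine; a linear autoencoder trained by gradient descent with MSE loss would converge to this same subspace):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 8))
Xc = X - X.mean(axis=0)

# PCA via SVD: V holds the top-k principal directions.
# Encoder: Z = Xc V (8 -> 3). Decoder: X_hat = Z V^T (3 -> 8).
# This is exactly a linear autoencoder with tied weights and no activations,
# and the PCA choice of V minimizes the mean-squared reconstruction error.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3
V = Vt[:k].T
Z = Xc @ V          # "encoder"
X_hat = Z @ V.T     # "decoder"
mse = np.mean((Xc - X_hat) ** 2)
print(mse)
```

The point for students: the nonlinear activations are the only thing an autoencoder adds on top of this picture.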