Dimension reduction in large data sets

PCA – some definitions, terminology

• If x is a vector of p variables, then the principal components (PCs) are the linear combinations $a_1^T x, a_2^T x, \ldots, a_p^T x$

• Although we can find p PCs, and sometimes the last few are useful (e.g. in finding outliers), for dimension reduction purposes we usually keep only the first few

• In the kth PC, the vector of coefficients or loadings, $a_k$, is chosen so that the variance of $a_k^T x$ is maximised, subject to the normalisation constraint $a_k^T a_k = 1$ and to successive PCs being uncorrelated
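
The definitions above can be checked numerically. The following is a minimal sketch, not a method taken from these slides: it assumes the loadings are obtained as eigenvectors of the sample covariance matrix (a standard way to solve the constrained maximisation), and the data matrix X, the helper name principal_components, and the choice to keep two PCs are all hypothetical, for illustration only.

```python
import numpy as np

def principal_components(X):
    """Return (loadings, variances) for the rows of X.

    Assumption: loadings are taken as eigenvectors of the sample
    covariance matrix, sorted by decreasing eigenvalue.
    """
    Xc = X - X.mean(axis=0)               # centre each variable
    S = np.cov(Xc, rowvar=False)          # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]     # reorder to descending variance
    return eigvecs[:, order], eigvals[order]

# Hypothetical data: 500 observations of p = 4 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))

A, variances = principal_components(X)   # column k of A is the loading vector a_k
scores = (X - X.mean(axis=0)) @ A        # column k holds the values of a_k^T x

norms = np.sum(A**2, axis=0)             # a_k^T a_k for each k
score_cov = np.cov(scores, rowvar=False) # covariance matrix of the PCs

# For dimension reduction, keep only the first few PCs (here, two).
reduced = scores[:, :2]
```

Under these assumptions, norms should be all ones (the normalisation constraint), score_cov should be diagonal (successive PCs are uncorrelated), and its diagonal entries should equal variances, in decreasing order.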