Use-cases for inverted PCA

1,593 views

probabl

A day ago

Comments: 20
@gilad_rubin · a month ago
Super cool!
@guilhermeduarte1932 · a month ago
Really Cool! I never thought of PCA in this way. Thanks for showing!
@user-eg8mt4im1i · a month ago
Another fun video, thanks! 🎉 I believe this compression/decompression could be used in fraud detection, couldn't it? The visualization is great!
@alexloftus8892 · a month ago
Cool! The way I'm thinking about this is that \hat{X_w} is at most rank 2 (in this case), since it appears as the result of a linear transformation of a rank 2 matrix. Therefore since it is the matrix that minimizes MSE loss from X_w, the span of \hat{X_w} defines the best-fitting (by MSE) 2D plane for the 64-dimensional datapoints. So an 'outlier', in this case, is 'any point which is far away from the plane'. Because we're doing PCA, the axes of that plane are the two directions of highest variance within those 64 dimensions. One thing to bear in mind here is that as you increase dimensionality, distances in the space increase. So you're going to have to change whatever threshold you use if you go up to, for example, 64x64 images instead of 8x8, and things might get weird.
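A minimal scikit-learn sketch of that idea (the digits dataset and the 1% cutoff are illustrative choices, not from the video):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)              # 8x8 digits, 64 features per row

pca = PCA(n_components=2).fit(X)
X_hat = pca.inverse_transform(pca.transform(X))  # project onto the 2D plane and back

# Reconstruction error = distance from each point to the best-fitting 2D plane
errors = np.linalg.norm(X - X_hat, axis=1)

# Flag the points furthest from the plane as candidate outliers
# (the 1% cutoff is an arbitrary choice for illustration)
threshold = np.quantile(errors, 0.99)
outliers = np.where(errors > threshold)[0]
print(f"{len(outliers)} candidate outliers out of {len(X)} samples")
```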
@ilearnthings123 · 5 days ago
genius
@Melle-sq4df · a month ago
I never thought of PCA this way, but it's clever :) Would you say that you basically re-invented the autoencoder? Or do you think this method will have subtle behavioral differences?
@probabl_ai · a month ago
PCA existed well before autoencoders did, so it feels strange to call it a reinvention. But there is a relationship for sure. If you remove the activation functions and just have a single hidden layer then it feels pretty much identical to me.
@zb9458 · a month ago
Amazing video! You really explained this in such a nice and easy step by step process, I feel my mental model of PCA is better now! I wonder, could you do a video on discrete cosine transform for encoding MNIST digits? I know patch-wise encoding is how JPG works, but I always got kind of lost in the linear algebra there...
@coopernik · a month ago
That’s how encoders and decoders are used to spot anomalies no?
@probabl_ai · a month ago
There's certainly a similar thing happening in neural network land. You could (somewhat handwavingly) look at PCA as an autoencoder with a single linear hidden layer of smaller dimension and no activation function, trained to reproduce its input.
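A small NumPy sketch of that correspondence (using the digits data as a stand-in): the fitted principal components act as a tied encoder/decoder pair with no activation function.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)
pca = PCA(n_components=2).fit(X)

W = pca.components_            # (2, 64): rows are the principal axes
mu = pca.mean_                 # PCA centers the data first

# "Encoder": project centered data down; "decoder": map back up with the same weights
Z = (X - mu) @ W.T             # latent codes, shape (n_samples, 2)
X_hat = Z @ W + mu             # reconstruction

# Matches scikit-learn's own round trip
assert np.allclose(X_hat, pca.inverse_transform(pca.transform(X)))
```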
@denniswatson4326 · 9 days ago
Great video by the way. Back in the day, before generative AI for images, I thought about trying this on something like the Olivetti faces dataset. Reduce the dimensions to some latent space and then draw from there randomly to make "PCA faces"? I wonder if there is a distribution over the dimensions of the latent space? That is to say, each individual dimension has some kind of non-flat distribution that would have to be drawn from in order to make a good face. I suppose if the chosen values were outliers then the face would be distorted?
@probabl_ai · 8 days ago
There will for sure be 'a distribution' in the latent space, but the question remains whether that distribution translates back to reality. PCA really is losing a bunch of information and there is no guarantee that you can recover it. If you really want to go and reconstruct, why not just use a neural network? You may appreciate this old blogpost of mine: koaning.io/posts/gaussian-auto-embeddings/
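A rough sketch of the "PCA faces" experiment with scikit-learn's Olivetti loader, under the (strong) assumption that each latent coordinate can be sampled independently from a Gaussian; this ignores correlations and any non-Gaussian structure, which is exactly why some samples come out distorted:

```python
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA

faces = fetch_olivetti_faces()        # downloads the dataset on first use
X = faces.data                        # (400, 4096) flattened 64x64 faces

pca = PCA(n_components=50).fit(X)
Z = pca.transform(X)

# Assume each latent dimension is roughly Gaussian and sample it independently
mu, sigma = Z.mean(axis=0), Z.std(axis=0)
rng = np.random.default_rng(0)
z_new = rng.normal(mu, sigma, size=(5, 50))

# Decode the sampled latent codes back into 64x64 "PCA faces"
fake_faces = pca.inverse_transform(z_new).reshape(-1, 64, 64)
```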
@denniswatson4326 · 9 days ago
Is it my imagination or do a lot of datasets exhibit that rotated square or diamond shape you see around 3:22 when reduced to two dimensions?
@carlomotta2742 · a month ago
Hey, cool video, you show a simplified and linearized version of an autoencoder! Btw, I was checking the notebook you provide: when you import `from wigglystuff import Slider2D`, what is wigglystuff actually? I could not find any reference on the web. Thanks!
@probabl_ai · a month ago
(Vincent here) It is a super new library, mostly personal, that I use for these demos. It's a collection of UI elements made with Anywidget to help explain stuff in Jupyter. Hence the name: widgets that wiggle to explain stuff. Wigglystuff!
@carlomotta2742 · a month ago
@@probabl_ai That's really awesome! Any plan to release it to make the notebooks reproducible, or can you recommend an alternative to get the interactive PCA graph? Thanks!
@probabl_ai · a month ago
@@carlomotta2742 The show notes of the video include a link to the notebook. Wigglystuff is on PyPI, so it's just a pip install away: pypi.org/project/wigglystuff/
@carlomotta2742 · a month ago
@@probabl_ai couldn't find it before - thank you!
@kanewilliams1653 · a month ago
One part I don't understand is at 5:38: why is it \hat{x_w} instead of just x_w? There is no epsilon/error term in PCA. And assuming T is invertible, wouldn't it just return exactly the same original matrix x_w?
@probabl_ai · a month ago
There is information loss when you go from the original input to the low-dimensional representation: the projection down to fewer dimensions is not invertible, because many different inputs collapse onto the same low-dimensional point. When you then map back, that information is still gone, which makes it impossible to fully reconstruct the original array.
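A quick way to see that loss, sketched with scikit-learn's digits data for a few choices of n_components:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)   # 64 features per sample

for k in (2, 10, 30, 64):
    pca = PCA(n_components=k).fit(X)
    X_hat = pca.inverse_transform(pca.transform(X))
    mse = np.mean((X - X_hat) ** 2)
    print(f"{k:>2} components -> reconstruction MSE {mse:.3f}")
# Only with all 64 components does the MSE drop to ~0;
# with fewer components the reconstruction is necessarily lossy.
```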