Is the Future of Linear Algebra.. Random?

219,201 views

Mutual Information


The machine learning consultancy: truetheta.io
Want to work together? See here: truetheta.io/about/#want-to-w...
"Randomization is arguably the most exciting and innovative idea to have hit linear algebra in a long time." - First line of the Blendenpik paper, H. Avron et al.
Follow up post: truetheta.io/concepts/linear-...
SOCIAL MEDIA
LinkedIn : / dj-rich-90b91753
Twitter : / duanejrich
Github: github.com/Duane321
SUPPORT
/ mutualinformation
SOURCES
Source [1] is the paper that caused me to create this video. [3], [7] and [8] provided a broad and technical view of randomization as a strategy for NLA. [9] and [12] informed me about the history of NLA. [2], [4], [5], [6], [10], [11], [13] and [14] provide concrete algorithms demonstrating the utility of randomization.
[1] Murray et al. Randomized Numerical Linear Algebra. arXiv:2302.11474v2 2023
[2] Melnichenko et al. CholeskyQR with Randomization and Pivoting for Tall Matrices (CQRRPT). arXiv:2311.08316v1 2023
[3] P. Drineas and M. Mahoney. RandNLA: Randomized Numerical Linear Algebra. Communications of the ACM. 2016
[4] N. Halko, P. Martinsson, and J. Tropp. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions. arXiv:0909.4061v2 2010
[5] Tropp et al. Fixed Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data. NeurIPS Proceedings. 2017
[6] X. Meng, M. Saunders, and M. Mahoney. LSRN: A Parallel Iterative Solver for Strongly Over- Or Underdetermined Systems. SIAM 2014
[7] D. Woodruff. Sketching as a Tool for Numerical Linear Algebra. IBM Research Almaden. 2015
[8] M. Mahoney. Randomized Algorithms for Matrices and Data. arXiv:1104.5557v3. 2011
[9] G. Golub and H van der Vorst. Eigenvalue Computation in the 20th Century. Journal of Computational and Applied Mathematics. 2000
[10] J. Duersch and M. Gu. Randomized QR with Column Pivoting. arXiv:1509.06820v2 2017
[11] Erichson et al. Randomized Matrix Decompositions Using R. Journal of Statistical Software. 2019
[12] J. Gentle et al. Software for Numerical Linear Algebra. Springer. 2017
[13] H. Avron, P. Maymounkov, and S. Toledo. Blendenpik: Supercharging LAPACK's Least-Squares Solver. SIAM. 2010
[14] M. Mahoney and P. Drineas. CUR Matrix Decompositions for Improved Data Analysis. Proceedings of the National Academy of Sciences. 2009
TIMESTAMPS
0:00 Significance of Numerical Linear Algebra (NLA)
1:35 The Paper
2:20 What is Linear Algebra?
5:57 What is Numerical Linear Algebra?
8:53 Some History
12:22 A Quick Tour of the Current Software Landscape
13:42 NLA Efficiency
16:06 Rand NLA's Efficiency
18:38 What is NLA doing (generally)?
20:11 Rand NLA Performance
26:24 What is NLA doing (a little less generally)?
31:30 A New Software Pillar
32:43 Why is Rand NLA Exceptional?
34:01 Follow Up Post and Thank You's

Comments: 422
@charilaosmylonas5046 · 1 month ago
Great video! I want to add a couple of references to what you mentioned in the video related to neural networks: 1. Ali Rahimi got the NeurIPS 2017 "test of time" award for a method called random kitchen sinks (kernel method with random features). 2. Choromanski (from Google) made a variation of this idea to alleviate the quadratic memory cost of self-attention in transformers (which also works like a charm - I tried it myself, and I'm still perplexed how it didn't become one of the main efficiency improvements for transformers). Check "Rethinking Attention with Performers". Thank you for the great work on the video - keep them coming please! :)
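A minimal sketch of the random-features idea mentioned above, assuming NumPy and the Gaussian (RBF) kernel; this is an illustrative reconstruction in the spirit of Rahimi & Recht's random features, not code from the papers, and the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 5, 2000                        # input dimension, number of random features

# Random features approximating the RBF kernel k(x, y) = exp(-||x - y||^2 / 2)
W = rng.standard_normal((D, d))       # random frequencies ~ N(0, I)
b = rng.uniform(0.0, 2 * np.pi, D)    # random phases
phi = lambda x: np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.standard_normal(d), rng.standard_normal(d)
print(phi(x) @ phi(y))                       # randomized estimate of k(x, y)
print(np.exp(-np.sum((x - y) ** 2) / 2.0))   # exact kernel value, should be close
```

The inner product of the random feature maps concentrates around the exact kernel value as D grows, which is what lets a linear model on phi(x) stand in for a kernel machine.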
@howuhh8960 · 1 month ago
it didn't because all efficient variations have significantly worse performance on retrieval tasks (associative recall for example), as all recent papers demonstrated
@Arithryka · 1 month ago
The Quadratic Memory Cost of Self-Attention in Transformers is my new band name
@theo1103 · 21 days ago
Is this a similar idea compared with the latent space in the transformer?
@hyperplano · 18 days ago
Rahimi got the award for the "Random Features for Large-Scale Kernel Machines" paper, not the random kitchen sinks one
@rileyjohnmurray7568 · 15 days ago
@@howuhh8960 do you have specific references for this claim? I'm not doubting you, I'm just really interested in learning more, and the literature is vast.
@octavianova1300 · 1 month ago
reminds me of that episode of veggie tales when larry was like "in the future, linear algebra will be randomly generated!"
@NoNameAtAll2 · 1 month ago
W E E D E A T E R
@rileymurray7437 · 1 month ago
Reminds you of what???
@jedediahjehoshaphat · 1 month ago
xD
@Godfather-qr6ej · 1 month ago
I thought it would be some nice science show, but it turns out to be some kids show : (
@notsojharedtroll23 · 1 month ago
​@@rileymurray7437 he means this video: kzfaq.info/get/bejne/oJqAm5NjzODVnY0.htmlsi=wb2atwfoSQaefrjL
@BJ52091 · 1 month ago
As a mathematician specializing in probability and random processes, I approve this message. N thumbs up where N ranges between 2.01 and 1.99 with 99% confidence!
@Mutual_Information · 1 month ago
Great to have you here!
@purungo · 1 month ago
So you're saying there's a 1 chance in roughly 10^16300 that you're giving him 3 thumbs up...
@frankjohnson123 · 1 month ago
My brother in Christ, use a discrete probability distribution.
@nile6076 · 1 month ago
Only if you assume a normal distribution! ​@@purungo
@sylv256 · 1 month ago
Is this just one big late april fool's? What the hell
@Dagobah359 · 1 month ago
3:03 Linear algebra professor, here. Please stop teaching that it's the rows of matrices which are vectors. Yes, both rows and columns of matrices correspond to vectors in separate vector spaces, but when they don't have the full picture yet, beginning students should be thinking of the columns of the matrix as 'the' vectors. I've had to spend so much work fixing the perspective of engineers in their masters program who only think of the rows as vectors. It's much easier to broaden a student's perspective from columns to also rows, than it is to broaden their perspective from rows to also columns.
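To make the column-versus-row picture concrete, here is a tiny illustrative NumPy example (my own, not from the video): the product A @ x is a linear combination of A's columns, while each entry of A @ x is a row of A dotted with x.

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])
x = np.array([10., -1.])

# Column picture: A @ x is x[0]*column_0 + x[1]*column_1
print(A @ x)
print(x[0] * A[:, 0] + x[1] * A[:, 1])               # the same vector

# Row picture: entry i of A @ x is row i of A dotted with x
print(np.array([A[i, :] @ x for i in range(A.shape[0])]))
```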
@rileyjohnmurray7568 · 15 days ago
Thanks for sharing this perspective! I've heard something similar from a professor when I did my PhD, and I generally agree with it. That said, I think introducing row-wise is not so bad *in the specific context of this video.* It seems like the natural thing to do if we want to compare scalar-valued nonlinear functions to scalar-valued linear functions. So if you're in a time crunch and you need to explain the concept of linearity in one minute (and with few equations), then this approach seems not so bad.
@laurenwrubleski7204 · 1 month ago
As a developer at AMD I feel somewhat obligated to note we have an equivalent to cuBLAS called rocBLAS, as well as an interface layer hipBLAS designed to compile code to make use of either AMD or NVIDIA GPUs.
@sucim · 1 month ago
but can your cards train imagenet without crashing?
@389martijn · 1 month ago
​@@sucimsheeeeeeeeesh
@johnisdoe · 1 month ago
Are you guys hiring?
@Zoragna · 1 month ago
OP forgot about BLAS being a standard so most implementations have been forgotten, it's weird to point at Nvidia
@cannaroe1213 · 1 month ago
As an AMD customer who recently bought a 6950XT for €600, I am disappointed to learn rocBLAS is not supported on my outdated 2 year old hardware.
@TimL_ · 1 month ago
The part about matrix multiplication reminded me of studying cache hit and miss patterns in university. Interesting video.
@charlesloeffler333 · 1 month ago
Another tidbit about LINPACK: One of its major strengths at the time it was written was that all of its double precision algorithms were truly double precision. At that time other packages often had double precision calculations hidden within the single precision routines, whereas their double precision counterparts did not have quad-precision parts anywhere inside. The LINPACK folks were extraordinarily concerned about numerical precision in all routines. It was a great package. It also provided the basis for Matlab
@scottmiller2591 · 1 month ago
Brunton, Kutz et al. in the paper you mentioned here "Randomized Matrix Decompositions using R," recommended in their paper using Nathan Halko's algo, developed at the CU Math department. B&K give some timing data, but the time and memory complexity were already computed by Halko, and he had implemented it in MATLAB for his paper - B&K ported it to R. Halko's paper from 2009 "FINDING STRUCTURE WITH RANDOMNESS: STOCHASTIC ALGORITHMS FOR CONSTRUCTING APPROXIMATE MATRIX DECOMPOSITIONS" laid this all out 7 years before the first draft of the B&K paper you referenced. Halko's office was a mile down the road from me at that time, and I implemented Python and R code based on his work (it was used in medical products, and my employer didn't let us publish). It does work quite well.
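For readers who want to see the gist of the Halko-style scheme discussed above, here is a minimal sketch of the randomized range-finder plus small SVD in NumPy. It is an illustrative reduction of the idea in [4], not the commenter's (or the paper's) implementation; the oversampling amount is an assumption and power iterations are omitted.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Sketch of the basic randomized SVD: sample range(A), then SVD a small matrix."""
    rng = np.random.default_rng(seed)
    # 1. Probe the range of A with a random Gaussian test matrix
    G = rng.standard_normal((A.shape[1], k + oversample))
    Y = A @ G
    # 2. Orthonormal basis Q whose span approximates range(A)
    Q, _ = np.linalg.qr(Y)
    # 3. Project A onto that basis and take an exact SVD of the small matrix
    B = Q.T @ A                                   # (k + oversample) x n
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]

# Example: a rank-20 matrix is recovered almost exactly from a rank-20 sketch
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 20)) @ rng.standard_normal((20, 1000))
U, s, Vt = randomized_svd(A, k=20)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))  # tiny relative error
```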
@Mutual_Information · 1 month ago
Very cool! The more I researched this, the more I realized the subject was deeper (older too) than I had realized with the first few papers I read. It's interesting to hear your on-the-ground experience of it, and I'm glad the video got your attention.
@ajarivas72 · 1 month ago
@@Mutual_Information Has anyone tried genetic algorithms instead of purely random approaches? In my experience, genetic algorithms are 100x faster than Monte Carlo simulations to obtain an optimum.
@skn123 · 27 days ago
Halko's algorithm helped me start my understanding of Laplacian eigenmaps and other dimensionality reduction methods.
@danielsantiagoaguilatorres9973 · 1 month ago
I'm writing a paper on a related topic. Didn't know about many of these papers, thanks for sharing! I really enjoyed your video
@pietheijn-vo1gt · 1 month ago
I have seen a very similar idea in compressed sensing. In compressed sensing we also use a randomized sampling matrix, because the errors can be considered as white noise. We can then use a denoising algorithm to recover the original data. In fact I know Philips MRI machines use this technique to speed up scans, because you have to take fewer pictures. Fascinating
@tamineabderrahmane248 · 1 month ago
random sampling to reconstruct the signal
@pietheijn-vo1gt · 1 month ago
@@tamineabderrahmane248... what?
@MrLonelyrager · 1 month ago
Compressed sensing is also useful for wireless communications. I studied its usage for sampling ultra wideband signals and indoor positioning. It only works accurately under certain sparsity assumptions. In MRI scans, their "Fourier transform" can be considered sparse, then we can use l1 denoising algorithms to recover the original signal.
@pietheijn-vo1gt · 1 month ago
@@MrLonelyrager yes correct, that's exactly what I used. In the form of ISTA (iterative shrinkage and thresholding) algorithms and its many (deep-learning) derivatives
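Since ISTA came up, here is a minimal sketch of the iteration for the sparse-recovery problem min_x 0.5·||Ax − b||² + λ·||x||₁. The dimensions, step size, and data are illustrative assumptions, not the MRI pipeline discussed above.

```python
import numpy as np

def ista(A, b, lam, n_iter=500):
    """Iterative shrinkage-thresholding for 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - b)                     # gradient of the smooth part
        z = x - g / L                             # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x

# Recover a sparse signal from a small number of random (compressed) measurements
rng = np.random.default_rng(0)
n, m = 200, 60                                    # signal length, number of measurements
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = rng.standard_normal(5)
A = rng.standard_normal((m, n)) / np.sqrt(m)
b = A @ x_true
print(np.linalg.norm(ista(A, b, lam=1e-3) - x_true))  # small if recovery succeeded
```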
@richardyim8914 · 1 month ago
Golub and Van Loan’s textbook is goated. I loved studying and learning numerical linear algebra for the first time in undergrad.
@makapaka8247 · 1 month ago
I'm finally far enough in education to see how well made your stuff is. Super excited to see a new one from you. Thanks for expanding people's horizons!
@Mutual_Information · 1 month ago
Glad to have you watching!
@zyansheep · 1 month ago
Dang, I absolutely love videos and articles that summarize the latest in a field of research and explain the concepts well!
@charlesity · 1 month ago
As always this is BRILLIANT. I started following your videos since I saw the GP regression video. Great content! Thank you very much.
@Apophlegmatis · 24 days ago
The nice thing is, with continuous systems (and everything in experienced life is continuous) the question is not "is it linear," but "on what scale is it functionally linear," which makes calculations of highly complex situations much simpler.
@Mutual_Information · 23 days ago
YES!
@noahgsolomon · 1 month ago
You discussed all the priors incredibly well. I didn’t even understand the premise of random in this context and now I leave with a lot more. Keep it up man ur videos are the bomb
@mgostIH · 1 month ago
I started reading this paper when you mentioned it on Twitter, forgot it was you who I got it from and was now so happy to see a video about it!
@Mutual_Information · 1 month ago
Yes! And good to see you here mgost
@marcegger7411 · 1 month ago
Damn... your videos are getting beyond excellent!
@bn8ws · 1 month ago
Outstanding content, instant sub. Keep up the good work!
@aleksszukovskis2074 · 1 month ago
its always a pleasure to watch this channel
@bluearctik3980 · 1 month ago
My first thought was "this is like journal club with DJ"! Great stuff - well researched and crisply delivered. More of this, if you please.
@deltaranged · 1 month ago
It feels like this video was made to match my exact interests LOL I've been interested in NLA for a while now, and I've recently studied more "traditional" randomized algorithms in uni for combinatorial tasks (e.g. Karger's Min-cut). It's interesting to see how they've recently made ways to combine the 2 paradigms. I'm excited to see where this field goes. Thanks for the video and for introducing me to the topic!
@Rockyzach88 · 1 month ago
YouTube has you in its palms. _laughs maniacally_
@Sino12 · 1 month ago
where do you study?
@AjaniTea · 1 month ago
This is a world class video. Thanks for posting this and keep it up!
@Stephen_Kelley · 1 month ago
Excellent video, really well paced.
@gaussology · 1 month ago
Wow, so much research went into this! It makes me even more motivated to read papers and produce videos 😀
@jondor654 · 1 month ago
Lovely type, great clarity .
@AlexGarel-xr9ri · 1 month ago
Incredible video with very good animations and script. Thank you !
@moisesbessalle · 1 month ago
Amazing video!
@razeo7068 · 5 hours ago
Amazing video. Had me hooked from start to finish. You gained a new subscriber
@JHillMD · 16 days ago
What a terrific video and channel. Great work! Subbed.
@piyushkumbhare5969 · 1 month ago
This is a really well made video, nice!
@MachineLearningStreetTalk · 1 month ago
Great video brother! 😍
@Mutual_Information · 1 month ago
Thank you MLST! You're among a rare bunch providing non-hyped or otherwise crazy takes on AI/ML, so it means a lot coming from you.
@JoeBurnett · 1 month ago
You are an amazing teacher! Thank you for explaining the topic in this manner. It really motivates me to continue learning about all things linear algebra!
@JonathanPlasse · 1 month ago
Awesome presentation, thank you!
@ernestoherreralegorreta137 · 1 month ago
Amazing explanation of a complex topic! You've got yourself a new subscriber.
@Mutual_Information · 1 month ago
Glad to have you!
@tiwiatg2186 · 1 month ago
Loving it loving it loving it!! Amazing video, amazing topic 👏
@braineaterzombie3981 · 1 month ago
This is exactly what i needed. Subscribed
@from_my_desk · 1 month ago
thanks a ton! this was eye-opening 😊
@billbez7465 · 1 month ago
Amazing video with great presentation. Thank you
@user-le1ho7sl7h · 1 month ago
I used one time random matrices for eigenvalue counts on intervals and it was amazing! Di Napoli, E., Polizzi, E., & Saad, Y. (2016). Efficient estimation of eigenvalue counts in an interval. Numerical Linear Algebra with Applications, 23(4), 674-692.
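The paper cited above counts eigenvalues in an interval by estimating the trace of a spectral projector, and the stochastic ingredient behind such estimates is a Hutchinson-style trace estimator. A minimal sketch of that estimator follows (my own illustrative NumPy, not the paper's algorithm, which adds polynomial/rational filtering on top):

```python
import numpy as np

def hutchinson_trace(apply_A, n, n_samples=100, seed=0):
    """Estimate tr(A) using only matrix-vector products with A."""
    rng = np.random.default_rng(seed)
    est = 0.0
    for _ in range(n_samples):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        est += z @ apply_A(z)
    return est / n_samples

A = np.random.default_rng(0).standard_normal((300, 300))
A = A @ A.T                                   # symmetric PSD test matrix
print(hutchinson_trace(lambda v: A @ v, 300))
print(np.trace(A))                            # the two numbers should roughly agree
```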
@chazhovnanian6897 · 4 days ago
you've GOT to post more, this stuff is amazing, im still in high school but learning about so-called 'mature' processes which become completely revolutionised really inspires me, thanks for this :)
@Pedritox0953 · 1 month ago
Great video!
@lbgstzockt8493 · 1 month ago
Very good video on a very interesting topic. Who would have thought that there is this much to gain in such a commonly used piece of mathematics.
@wiktorzdrojewski890 · 1 month ago
this feels like a good presentation topic for numerical methods seminar
@Otakutaru · 1 month ago
Adequate density of new information, and sublime narrative. Also, on point visuals
@hozaifas4811 · 1 month ago
We need more content creators like you ❤
@Mutual_Information · 1 month ago
Thank you. These videos take a while, so I wish I could upload more. But I'm confident I'll be doing YouTube for a long time.
@hozaifas4811 · 1 month ago
@@Mutual_Information Well ,This news made my day !
@CyberBlaster-fu2dz · 1 month ago
Great video, thank you!
@EkShunya · 1 month ago
Been a while since ur last post, thanks. Please make more often - I like what u make
@oceannuclear · 1 month ago
Oh my god, this forms a small part of my PhD thesis where I've been trying to understand LAPACK's advantage/disadvantage when it comes to inverting matrices. Having this video really helps me put things into context! Thank you very much for making this!
@tantzer6113 · 1 month ago
I enjoyed this video. Thank you.
@pygmalionsrobot1896 · 1 month ago
Whoa - very cool stuff !!
@iamr0b0tx · 1 month ago
This is a really good video 💯
@vNCAwizard · 1 month ago
An excellent presentation.
@KipIngram · 1 month ago
Fascinating. Thanks very much for filling us then on this.
@scottmiller2591 · 1 month ago
This was a nice walk down memory lane for me, and a good introduction for the beginner. It's nice to see SWE getting interested in these techniques, which have a very long history (like solving finite elements with diffusion decades ago, and compressed sensing). I enjoyed your video. A few notes:

It's useful to note that "random" projections started out as Gaussian, but it turns out very simple, in-memory, transformations let you use binary random numbers at high speed with little to no loss of accuracy. I think you had this in mind when talking about the random matrix S in sketch-and-solve.

BLAS sounds like blast, but without the t. I'm sure there's people who pronounce it like blahs. Software engineers mangle the pronunciation of everything, including other SWE packages, looking at you, Ubuntu users. However the first pronunciation is the pronunciation I have always heard in the applied linear algebra field.

FORTRAN doesn't end like "fortune," but rather ends with "tran," but maybe people pronounce "fortran" (uncapitalized) that way these days - IDK (see note above re: mangling; FORTRAN has been decapitalized since I started working with it).

Cholesky starts with a hard "K" sound, which is the only pronunciation you'll ever hear in NLA and linear algebra. It certainly is the way Cholesky pronounced it. Me, I always pronounce Numpy to sound like lumpy just to tweak people, even though I know better ☺. I've always pronounced CQRRPT as "corrupt," too, but because that's what the acronym looks like (my eyes are bad).

One way to explain how these work intuitively is to look at a PCA, similar to what you touched on with the illustration of covariance. If you know the rank is low, then there will be, say, k large PCA directions, and the rest will be small. If you perform random projection on the data, those large directions will almost certainly show up in your projections, with the remaining PCA directions certainly being no bigger than they were originally (projection is always non-expanding). This means the random projections will still contain large components of the strong PCA directions, and you only need to make sure you took enough random projections to avoid being unlucky enough to accidentally be very nearly normal with the strong PCA directions every time. The odds of you being unlucky go down with every random projection you add. You'd have to be very unlucky to take a photo of a stick from random directions, and have every photo of the stick be taken end-on. In most photos, it will look like a stick, not a point. Similarly, taking a photo of a piece of paper from random directions will look like a distorted rectangle, not a line segment. It's one case where the curse of dimensionality is actually working in your favor - several random projections almost guarantees they won't all be projections to an object that's the thickness of the paper.

I've been writing randomized algos for a long time (I have had arguments w engineers about how random SVD couldn't possibly work!), and love seeing random linear algebra libraries that are open and unit tested. I agree with your summary - a good algorithm is worth far more than good hardware. Looking forward to you tracking new developments in the future.
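The "photo of a stick" intuition above is easy to check numerically. A small illustrative NumPy sketch (my own, with assumed dimensions): random projections of a low-rank-plus-noise matrix roughly preserve its dominant singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 10_000, 20, 5
A = rng.standard_normal((n, k)) @ rng.standard_normal((k, d))   # k strong directions
A += 0.01 * rng.standard_normal((n, d))                         # plus small noise

# Compress the 10,000 rows down to 500 random Gaussian combinations
m = 500
S = rng.standard_normal((m, n)) / np.sqrt(m)
SA = S @ A

print(np.linalg.svd(A,  compute_uv=False)[:k])   # top singular values of the data
print(np.linalg.svd(SA, compute_uv=False)[:k])   # roughly the same, from the much smaller sketch
```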
@Mutual_Information · 1 month ago
This is the real test of a video. When an expert watches it and, with some small corrections, agrees that it gets the bulk of the message right. It's a reason I try to roll in an subject matter expert where I can. So I'm quite happy to have covered the topic appropriately in your view. (It's also a relief!) And I also wish I had thought of the analogy: "You'd have to be very unlucky to take a photo of a stick from random directions, and have every photo of the stick be taken end-on. In most photos, it will look like a stick, not a point." I would have included that if I had thought of it!
@scottmiller2591 · 1 month ago
@@Mutual_Information Agree absolutely!
@rileyjohnmurray7568 · 1 month ago
Jim Demmel and Jack Dongarra pronounced it "blahs" the last time I spoke with each of them. (~This morning and one month ago, respectively.) 😉
@Mutual_Information · 1 month ago
@@rileyjohnmurray7568 lol
@scottmiller2591 · 1 month ago
@@rileyjohnmurray7568 I hope they perk up ☺
@EE-wo5ty · 1 month ago
the quality on this editing is top notch, congratulations!!!
@broccoli322 · 1 month ago
Great stuff
@mohamedalmahditantaoui8422 · 4 days ago
I think you made the best numerical linear algebra video in the world - we really need more content like this. Keep up the good work.
@plfreeman111 · 9 days ago
"And if you aren't, you're probably doing something wrong." So very very true. Don't roll your own NLA code. You won't get it right and it certainly won't be faster. The corollary is "If you're inverting a matrix, you're probably doing something wrong." But that's a different problem I have to solve with newbies.
@firefoxmetzger9063 · 20 days ago
I realize that YT comments are not the best place to explain "complex" ideas, but here it goes anyway: The head bending relative difference piece reply is "just" a coordinate transformation. At 29:45, you lay ellipses atop each other and show the absolute approximation difference between the full sample and the sketch. The "trick" is to realize that this happens in the common (base) coordinate system and that nothing stops you from changing this coordinate system. For example, you can (a) move the origin to the centroid of the sketch, (b) rotate so that X and Y align with the semi-axis of the sketch, and (c) scale (asymmetrically) so that the sketches semi-axis have length 1. What happens to the ellipsoid of the full sample in this "sketch space"? Two things happen when plotting in the new coordinate system: (1) the ellipsoid of the sketch becomes a circle around the origin (semi-axes are all 1) by construction. (2) the ellipsoid of the full sample becomes an "almost" circle depending on the quality of the approximation of the full sample by the sketch. As sample size increases, centroids converges, semi-axes start aligning, and (importantly) semi-axes get stretched/squashed until they reach length 1. Again, this is for the full sample - the sketch is already "a perfect circle by construction". In other words, as we increase the sample size of the sketch the full sample looks more and more like a unit circle in "sketch space". We can now quantify the quality of the approximation using the ratio of the full sample's semi-axis in "sketch space". If there are no relative errors (perfect approximation), these become the ratio of radii of a circle which is always 1. Any other number is due to (relative) approximation error, lower is better, and it can't be less than 1. The claim now is that, even for small samples, this ratio is already low enough for practical use, i.e., sketches with just 10 points already yield good results.
@firefoxmetzger9063 · 20 days ago
If you understand the above, then the high-dimensional part becomes clear as well: In N dimensions a "hyper-ellipsoid" has N semi-axes, and the claim is that for real (aka. sparse) problems some of these semi-axes are really large and some are really small when measured in "problem space". This relationship applied to the 2D ellipsis you show at 29:45 means that the primary axis becomes really large (stretches beyond the screen size) and the secondary axis becomes really small (squished until the border lines touch each other due to line thickness). This will make the ellipsis plot "degenerate" and it will look like a line - which is boring to visualize.
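A small numerical illustration of the coordinate change described in these two comments (my own assumed data, not the video's): whiten by the sketch's covariance and measure the full sample's semi-axes in that "sketch space".

```python
import numpy as np

rng = np.random.default_rng(0)
# Full 2-D sample with a strongly elliptical covariance
X_full = rng.standard_normal((100_000, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
X_sketch = X_full[rng.choice(len(X_full), size=10, replace=False)]   # tiny "sketch"

C_full = np.cov(X_full, rowvar=False)
C_sketch = np.cov(X_sketch, rowvar=False)

# Whitening transform that maps the sketch's ellipse to the unit circle
W = np.linalg.inv(np.linalg.cholesky(C_sketch))
# Semi-axes of the full sample's ellipse, measured in "sketch space"
semi_axes = np.sqrt(np.linalg.eigvalsh(W @ C_full @ W.T))
print(semi_axes)   # both of order 1, even though the raw semi-axes differ by ~7x
```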
@Cats3141 · 7 days ago
Really superb presentation!
@ihatephysixs · 1 month ago
Awesome video
@the_master_of_cramp · 1 month ago
Great and clear video! Makes me wanna study more numerical LA... combined with probability theory, because it shows how inefficient many currently used algorithms likely are, and that randomized algorithms are usually insanely much faster while being approximately correct. So those randomized algorithms can basically be used anywhere we don't need to be 100% sure about the result (which is basically always, because our mathematical models are only approximations of what's going on in the world and thus are inaccurate anyway, and, as you mentioned, if data is used, it's noisy).
@michaeln.8185 · 1 month ago
Great video! Thank you for producing this!
@TrungHieuTu · 1 month ago
Very useful, thanks
@prithvidhyani1991 · 24 days ago
awesome video! also the soundtrack at the start is beautiful, which piece is it?
@pr0crastinatr · 1 month ago
Another neat explanation for why the randomized least-squares problem works is the Johnson-Lindenstrauss lemma. That lemma states that most vectors don't change length a lot when you multiply them by a random gaussian matrix, so the norm of S(Ax - b) is within (1-eps) to (1+eps) of the norm of Ax-b with high probability.
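A minimal sketch-and-solve demo for least squares, in line with the JL-style argument above (illustrative NumPy with assumed sizes; practical implementations use fast structured sketches rather than a dense Gaussian S):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Exact least squares
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)

# Sketch-and-solve: compress the n rows down to m << n rows with a Gaussian sketch S
m = 20 * d
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

print(np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full))  # small relative error
```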
@metromap9618 · 1 month ago
great video!
@General12th · 1 month ago
Hi DJ! I love improvements in algorithmic efficiency.
@nonamehere9658 · 1 month ago
The trick of multiplying by random S in argmin (SAx-Sb)^2 reminds me of the similar trick in the Freivalds' algorithm: instead of verifying matrix multiplication A*B==C we check A*B*x==C*x for a random vector x. Random projections FTW???
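For reference, a minimal sketch of the Freivalds check mentioned above (my own illustrative NumPy; the textbook version uses exact arithmetic over a finite field, here a floating-point tolerance stands in):

```python
import numpy as np

def freivalds_check(A, B, C, trials=20, seed=0):
    """Probabilistically verify A @ B == C using O(trials * n^2) work."""
    rng = np.random.default_rng(seed)
    n = C.shape[1]
    for _ in range(trials):
        x = rng.integers(0, 2, size=n).astype(float)     # random 0/1 vector
        if not np.allclose(A @ (B @ x), C @ x):
            return False        # definitely not equal
    return True                 # equal with probability >= 1 - 2**(-trials)

rng = np.random.default_rng(1)
A, B = rng.standard_normal((300, 300)), rng.standard_normal((300, 300))
C = A @ B
print(freivalds_check(A, B, C))     # True
C[0, 0] += 1.0
print(freivalds_check(A, B, C))     # almost surely False
```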
@Mutual_Information · 1 month ago
Sounds like it!
@johannguentherprzewalski · 1 month ago
Very interesting content! I did find that the video felt longer than expected. I was intrigued by the thumbnail and the promise of at least 10x speed improvement. However, it took quite a while to get to the papers and even longer to get to the explanation. The history definitely deserves its own video and most chapters could be much shorter.
@nikita_x44 · 1 month ago
linearity @ 4:43 is different linearity. linear functions in the sense of linear algebra must always pass through (0,0)
@sufyanali3992 · 1 month ago
I thought so too, the 2D line shown on the right is an affine function, not a linear function in the rigorous sense.
@KepleroGT · 1 month ago
Yep, otherwise the linearity of addition and multiplication which he just skipped over wouldn't apply and thus wouldn't be linear functions, or rather the correct term is linear map/transformation. Example: F(x,y,z) = (2x+y, 3y, z+5), (0,0,0) = F(0,0,0) is incorrect because F(0,0,0) = (0,0,5). The intent is to preserve the linearity of these operations so they can be applied similarly. If I want 2+2 or 2*2 I can't have 5
@Geenimetsuri · 29 days ago
I understood this. Thank you, great education!
@Mutual_Information · 29 days ago
That's a win!!
@DavidS-ji6qv · 1 month ago
Phenomenal video
@ericc6820 · 15 days ago
man I really wish these kinds of videos existed when I was in school. I would have reached my math potential instead of getting bored and losing interest because my teachers didn't know how to teach.
@chakrasamik · 1 month ago
Excellent ❤
@StratosFair · 1 month ago
As a grad student in theoretical machine learning, I have to say i'm blown away by the quality of your content, please keep videos like these coming !
@ryanjkim · 1 month ago
Really great thank you.
@pedroteran5885 · 2 days ago
I love how Volker Strassen did things so different from each other.
@wafikiri_ · 1 month ago
The first program I fed a computer was one I wrote in FORTRAN IV. It almost exhausted the memory capacity of the IBM machine, which was about 30 KBytes for the user (it used memory overlays, which we'd call banked memory today, in order to not exceed the available memory for programs).
@damondanieli · 1 month ago
Great video! One thing: “processor registers” not “registries”
@Mutual_Information · 1 month ago
I know.. lol damn it
@jamesedwards6173 · 1 month ago
lol, I caught that same thing.
@RepChris · 1 month ago
Of course i get this in my recommended a few days after my first numerical analysis lecture
@RepChris · 1 month ago
Which is a course i picked up (its semi-required) since it seems like a very useful thing to understand properly, even though i am not the best at advanced linear algebra and have PTSD from a previous professor and get a visceral reaction every time i see an epsilon, both of which are integral to most of the course
@Mutual_Information · 27 days ago
Well I hope math YouTube serves as a bit of PTSD therapy. I hope a shit professor doesn't get in the way of you enjoying a good thing.
@psl_schaefer · 1 month ago
As always great (very educative) content. I very much appreciate all the work you put into those videos!
@Ohmriginal722 · 1 month ago
Whenever randomness is involved you got me wanting to use Analogue processors for fast and low-power processing
@user-gv6fn6yt2u · 1 month ago
it's really mind-blowing how random numbers can achieve something this fast
@h.b.1285 · 1 month ago
Excellent video! This topic is not easy for the layperson (admittedly, the layperson that likes Linear Algebra), but it was clearly and very well structured.
@mtteslian9159 · 18 days ago
Amazing!!
@DawnOfTheComputer · 1 month ago
The math presentation and explanation alone was worth a sub, let alone the interesting topic.
@user-qp2ps1bk3b · 1 month ago
very nice!
@antiguarocks · 1 month ago
Reminds me of what my high school maths teacher said about being able to assess product quality on a production line with high accuracy by only sampling a few percent of the product items.
@DocM221 · 1 month ago
I've been through some basic linear algebra courses, but really the covariance problem struck me as an obvious one to a statistician. A statistician would never go and sample everybody; they would first determine how accurate they needed to be in their certainty, and then go about sampling exactly the number of people that satisfies that equation. I actually had to do this in my job! I can totally see how this will be a great tool used with data prediction and maybe hardware accelerators to make MASSIVE gains. We are in for a huge wild ride! Thanks for the video!
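The kind of sample-size calculation described above is standard; a tiny illustrative example with assumed numbers (not the commenter's actual figures):

```python
import math

# How many samples to estimate a proportion to within +/- E at a given confidence?
z = 1.96          # 95% confidence
E = 0.03          # desired margin of error (3 percentage points)
p = 0.5           # worst-case proportion
n = math.ceil(z**2 * p * (1 - p) / E**2)
print(n)          # ~1068 samples, regardless of how large the population is
```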
@ShivaTD420 · 1 month ago
If you take white noise and put a filter on it, you can produce every note, because every tone and semitone is in the noise.
@0x4849 · 28 days ago
Some small correction: At 4:50, assuming the plotted values follow y=f(x), f is actually not linear, since in the graph we see that f(0)/=0. At 8:22, you incorrectly refer to the computer's registers as "registries", but more importantly, data access speed depends much more on cache size than register size, as the latter can generally only hold 1-4 values (32-bit float in 128-bit register), which, while allowing the use of SIMD, is very restrictive in its use. A computer's cache is some intermediate between CPU and disk, which, if used efficiently, can indeed greatly reduce runtime.
@williamchen6099 · 1 month ago
wow this is an amazingly well produced and scripted video and delivered perfectly, how long did it take you to plan and execute it?
@Mutual_Information · 1 month ago
I was working on it since November, mostly on the weekends and sometimes in the evenings. I'd guess it took me over 150 hours. The stages are reading research, script writing, creating the on screen animations, re-writing the script with feedback (e.g. from Riley here), shooting the video, editing it, adding music, cleaning it up, sharing the video for feedback. It takes a lot longer than I like to admit.
@pythonguytube · 1 month ago
Worth pointing out that there is a modern sparse linear algebra package called GraphBLAS, that can be used not just for graphs (which generalize to sparse matrices) but also to any sparse matrix multiplication operation.
@rainaldkoch9093 · 1 month ago
Thanks!
@mohammedbelgoumri · 1 month ago
No better way to start the day than with an MI upload 🥳
@Mutual_Information · 1 month ago
Thank you, love hearing that!
@usernameisamyth · 14 days ago
great stuff
@maxheadrom3088 · 1 month ago
Nice video! Nice channel! The complicated part isn't multiplying ... it's inverting!
@minsookim-ql1he · 1 month ago
This is very interesting