But what is a neural network REALLY?

61,792 views

Algorithmic Simplicity

1 day ago

My submission for 2022 #SoME2. In this video I try to explain what a neural network is in the simplest way possible. That means no linear algebra, no calculus, and definitely no statistics. The aim is to be accessible to absolutely anyone.
00:00 Intro
00:47 Gauss & Parametric Regression
02:59 Fitting a Straight Line
06:39 Defining a 1-layer Neural Network
09:29 Defining a 2-layer Neural Network
Part of the motivation for making this video is to try to dispel some of the misunderstandings around #deeplearning and to highlight 1) just how simple the neural network algorithm actually is and 2) just how NOT like a human brain it is.
I also haven't seen Gauss's original discovery of parametric regression presented anywhere before, and I think it's a fun story to highlight just how far (and how little) data science has come in 200 years.
***************************
In full disclosure, planets do not orbit in straight lines, and Gauss did not fit a straight line to Ceres' positions, but rather an ellipse (in 3d).

Comments: 222
@dsagman 1 year ago
“Do neural networks work because they reason like a human? No. They work because they fit the data.” You should have added “boom. mic drop.” Excellent video!
@LuisPereira-bn8jq 1 year ago
Can't say I agree. I really liked the video as a whole, but that "drop" was the worst part of the video to me, since it's a bit of a strawman, for at least two reasons:
- Knowing what a complex system does "at a foundational level" is very far from allowing you to understand the system. After all, Biology is "just" applied Chemistry, which in turn is "just" applied Physics, but good luck explaining any complex biological system from physical principles alone.
- Much of what humans do doesn't use "reason" at all. A few years back I decided to start learning Japanese. And I recall that for the first few months of listening to random native Japanese speakers I'd have trouble even correctly identifying the syllables of their words. But after some time and more exposure to the sounds, grammar, and speech patterns, that gradually improved. Yet that improvement had little to do with me *reasoning* about the language, and was largely an unconscious process of my brain getting better at pattern recognition in the language.
At least when it comes to "pattern recognition" I see no compelling reason to declare that humans (and animals, for that matter) are doing anything fundamentally different from neural networks.
@algorithmicsimplicity 1 year ago
My comments about neural networks reasoning were in response to some of the recent discussions about large language models being conscious. My impression is that these discussions give people a wildly inaccurate view of what neural networks actually do. I just wanted to make it clear that all neural networks do is curve fitting. Sure you can say "neural networks are a function that map inputs to outputs" and "humans are a function that map inputs to outputs", therefore they are fundamentally doing the same thing. But there are important differences between humans and neural networks. For one thing, in the human's case the function is not learned by curve fitting. It is learned by Bayesian inference. Humans are born with an incredible amount of prior knowledge about the world, including what types of sounds human language can contain. This is why you were able to learn to recognize Japanese sounds in a few months, where it would take a neural network the equivalent of thousands of years worth of examples. If you want to say that neural networks are doing the same thing as humans that's fine, but you should equally be comfortable saying that random forests are doing the same thing as humans.
@danielguy3581 1 year ago
@@algorithmicsimplicity Whatever mechanism underlies human cognition, if it begets the same results as a neural network, then it can be said to also "merely" perform curve fitting. Whether that can also be described in terms of Bayesian inference would not invalidate that. Similarly, it is not helpful stating there's nothing to understand or use as a model in neurobiology since it is just atoms minimizing energy states.
@charletsebeth 1 year ago
Why ruin a good story with the truth?
@revimfadli4666 1 year ago
@@LuisPereira-bn8jq aren't you making a strawman yourself? Also wouldn't your language example still count as his "learning abstract hierarchies and concepts"?
@sissyrecovery 10 months ago
DUUUUUUDE. I watched all the people you'd expect, 3Blue1Brown, StatQuest etc. I was so lost. Then I was ranting to a friend about how exasperated I was, and he sent me this video. BAM. Everything clicked. You're the man. Also, love how you ended it with a banger.
@BurkeMcCabe 1 year ago
BRUH. This video gave me that amazing feeling when something clicks in your brain and everything all of a sudden makes sense! Thank you I have never seen neural networks explained in this way before.
@danyalS78 1 year ago
Copied🗿
@wissemrouin4814 1 year ago
exactly !
@Weberbros1 1 year ago
I was expecting a duplicate of many other neural network videos, but this was a perspective that I have not seen before! Awesome video!
@newchaoz 1 year ago
This is THE best intuition behind neural networks I have ever seen. Thanks for the great video!
@AdobadoFantastico 1 year ago
Getting some bit of math history context makes these extra enjoyable. Great video, explanation, and visualization.
@orik737 1 year ago
Oh my lord. I've been struggling with neural networks for awhile and I've always felt like I have a decent grasp on them but this video finally brought everything together. Beautiful introduction
@Flobbled 1 year ago
Simply and elegantly explained. The bit at the end was superb.
@zenithparsec 1 year ago
Except this just described one activation function, and did not show how it generalizes to all neural networks. Being so accessible means it couldn't explain ReLU in context. Don't get me wrong, it's a good explanation of how some variants of the ReLU activation function work, but it doesn't explain what a neural network really is, nor prove that your brain doesn't work by fitting data in a similar way.
@paulhamacher773 2 days ago
Brilliant explanation! Very glad I stumbled on your channel!
@aloufin 5 days ago
amazing viewpoint of explanation. Would've loved an additional segment using this viewpoint to do the MNIST image recognition
@algorithmicsimplicity 5 days ago
I explain how this viewpoint applies in the case of image classification in my video on CNNs: kzfaq.info/get/bejne/bs95l7p5z9LJeac.html
@dfparker2002 4 months ago
Best explanation of parametric calcs ever! Bias & weights have new meaning.
@videos4mydad 5 months ago
This is the best video I have ever seen on the internet that describes what a neural network actually is. The best and most powerful explanations are those that give you the intuitive meaning behind the math, and this video does it perfectly. When a video describes a neural network by jumping into matrices and talking about subscripts i's and j's, it's just talking about the mechanics and does absolutely nothing to make you understand what you're reading. Unfortunately, this is how most textbooks approach the subject, and it's also how many content creators approach it as well. This type of video only comes from someone who understands things so deeply that they're able to explain them in a way that involves almost zero math. I consider this video one of the true treasures of KZfaq involving artificial intelligence education.
@illeto 13 days ago
Fantastic video! I have been working with econometrics, data science, neural networks, and various kind of ML for 20 years but never thought of the ReLU neural networks as just a series of linear regressions until now!
@Stone87148 1 year ago
Building an intuitive understanding of the math behind neural networks is so important. Understanding the application of NNs gets the job done; understanding the math behind NNs makes the job fun. This video helps with the latter! Nice video!
@gregor-alic 1 year ago
Great video! I think this video finally shows what I was waiting for, namely what the purpose of multiple neurons / layers in a neural network is, intuitively. This is the first time I have actually seen it explained clearly, good job!
@igorbesel4910 1 year ago
Made my brain go boom. Seriously, thanks for sharing this perspective!
@stefanzander5956 1 year ago
One of the best introductory explanations about the foundational principles of neural networks. Well done and keep up the good work!
@dineshkumarramasamy9849 28 days ago
I always love to get the history lesson first. Excellent.
@PolyRocketMatt 1 year ago
This might actually be the clearest perspective on neural networks I have seen yet!
@PowerhouseCell 1 year ago
This is such a cool way of thinking about it! You did an amazing job discussing a popular topic in a refreshing way. I can't believe I just found your channel - as a video creator myself, I understand how much time this must have taken. Liked and subscribed 💛
@KIRA_VX 3 months ago
IMO one of the best explanations when it comes to the idea/fundamental concept of the NN. Please make more 🙏
@algorithmicsimplicity 3 months ago
Thank you so much! Don't worry, more videos are on the way!
@karkunow 9 months ago
Thank you! That is a really brilliant video! I have been using regressions often, but never knew that a neural network is kinda the same idea. Very enlightening!
@srbox2 5 months ago
This is flat out the best video on neural networks on the internet, provided you are not a complete newbie. Never have I had such an "ahaaa" moment. Clear, concise, easy to follow, going from 0 to hero effortlessly. Bravo.
@Deletaste 1 year ago
And with this single video, you earned my subscription.
@williamwilkinson2748 1 year ago
The best video I have seen in giving one an understanding of neural nets. Thank you. Excellent, looking for more from you.
@sharkbaitquinnbarbossa3162 1 year ago
This is a really great video!! Love the approach with parametric regression.
@doublynegative9015 1 year ago
Just watched Sebastian Lague's video on Neural Networks the other day, and whilst great as always, it was _such_ a standard method of explaining them. Because mostly I just see this explained in the same way each time. This was such a nice change, and really provided me with a different way to look at this. Seeing 'no lin-alg, no calc, no stats' really concerned me, but, you did a great job, just by trying to explain different parts. Such a great explanation - would recommend to others.
@ultraFilmwatch 9 months ago
Thank you thousands of times, you excellent teacher. Finally, I saw a high-quality and clear explanation of neural networks.
@napomokoetle 7 months ago
This is the clearest video I've ever seen on KZfaq on what a neural network is. Thank you so much... you are a star. Could I perhaps ask or encourage you to create, for many of us keen on learning neural networks on our own, a video practically illustrating the fundamental difference between supervised, unsupervised and reinforcement learning?
@geekinasuit8333 7 months ago
I was wondering myself exactly what a simulated NN actually is doing (not what it is, but what it is doing), and this explanation is the best by far, if not THE answer. One adjustment I will suggest: at the end, explain that a simulated NN is not required at all, and that alternative systems can also perform the same function, which begs the question: what exactly are the fundamental requirements needed for line fitting to occur? Yes, I like to generalize and get to the fundamentals.
@metrix7513 1 year ago
Like someone else said, I expected the video to be similar to all the others, but this one gave me so much more, very nice.
@jorgesolorio620 1 year ago
Where has this video been all my life! Amazing, simply amazing! We need more, please.
@ArghyadebBandyopadhyay 1 year ago
THIS was the missing piece of the puzzle I was looking for. This video helped me a lot. Thanks.
@jcorey333 3 months ago
Your channel is really amazing! Thanks for making videos.
@some1rational 1 year ago
Great video, this is an explanation I have not heard before. Also I don't know if that abrupt ending was purposefully sarcastic, but I thoroughly enjoyed it lol
@ChrisJV1883 9 months ago
I've loved all three of your videos, looking forward to more!
@symbolsforpangaea6951 1 year ago
Amazing explanations!! Thank you!!
@MrLegarcia 1 year ago
This straightforward explaining method can save thousands of kids from dropping out of school "due to math".
@ButcherTTV 9 days ago
good video! very easy to follow.
@lollmao249 2 months ago
This is EXCELLENT and the best video explaining intuitively what a neural network does. You are seriously brilliant.
@geekinasuit8333 7 months ago
Another explanation that's needed is the concept of gradient descent (GD); that's the generalized method used to figure out the best fit. Lots of systems use GD, including natural evolution; it's basically trial and error with adjustments, although there are various ways to make it work more efficiently, which can become quite complicated. You can even use GD to figure out better forms of the GD algorithm, that is, it can be used recursively on itself.
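A minimal sketch of that "nudge the parameters downhill" idea for fitting a straight line in Python (illustrative only; the data, step size, and finite-difference gradient are made-up choices, not anything from the video):

    # Fit y = a*x + b by nudging each parameter downhill on the total absolute error.
    data = [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9), (3.0, 7.2)]

    def loss(a, b):
        return sum(abs(y - (a * x + b)) for x, y in data)

    a, b, step, eps = 0.0, 0.0, 0.01, 1e-5
    for _ in range(5000):
        grad_a = (loss(a + eps, b) - loss(a, b)) / eps   # estimated slope of the error in a
        grad_b = (loss(a, b + eps) - loss(a, b)) / eps   # estimated slope of the error in b
        a -= step * grad_a
        b -= step * grad_b
    print(a, b, loss(a, b))   # ends near a ≈ 2, b ≈ 1 for this data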
@Muuip 8 months ago
Another great concise visual explanation! Thank you!👍
@DmitryRomanov 1 year ago
Thank you! Really beautiful point about layers and the exponential growth of the number of segments one can make!
@martinsanchez-hw4fi 1 year ago
Good one! Nice video. In Gauss's regression line one is not taking the perpendicular distances, though. But very cool video!
@aravindr7422 9 months ago
Wow, very good. Keep posting great content like this. You have real potential to explain complex topics in simpler ways. And there are people who just post content for the sake of posting and minting money. We need more people like you.
@yonnn7523 7 months ago
Wow, ReLU is an unexpected starting point to explain NNs, but it nicely demonstrates the flexibility of summing up weighted non-linear functions. Such a refreshing way!
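A rough sketch of that summing idea (a toy example, not code from the video): each hidden ReLU neuron contributes one kink, and the output layer's weighted sum of those kinked lines is a piecewise-linear curve.

    def relu(z):
        return max(0.0, z)

    def tiny_net(x, hidden, out_weights, out_bias):
        # hidden: list of (weight, bias) pairs, one per hidden ReLU neuron
        activations = [relu(w * x + b) for w, b in hidden]
        return sum(v * a for v, a in zip(out_weights, activations)) + out_bias

    # three neurons -> up to three kinks -> up to four straight segments
    hidden = [(1.0, 0.0), (1.0, -1.0), (1.0, -2.0)]   # kinks at x = 0, 1, 2
    out_weights = [1.0, -2.0, 2.0]
    for x in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]:
        print(x, tiny_net(x, hidden, out_weights, 0.0))   # traces a zigzag, not a line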
@andanssas 1 year ago
Great concise explanation, and it does work: it fits at least my brain's data like a glove! Not that I have a head shaped like a hand (or do I?), but you did light up some bulbs in there after watching those line animations fit better and better. However, what happens when the neural network fits too well? If you can briefly mention the overfitting problem in one of your next episodes, I'd greatly appreciate it. Looking forward to the CNN and transformer ones! 🦾🤖
@DaryonGaming 1 year ago
I'm positive I only got this recommended because of Veritasium's FFT video, but thank you YouTube algorithm nonetheless. What a brilliant explanation!
@Justarandomguyonyoutube12345 8 months ago
I wish I could like the video more than once.. Great job buddy
@bisaster5471 1 year ago
480p in 2022 surely takes me back in time. i love it!!
@karlbooklover 1 year ago
Most intuitive explanation I've seen.
@orangemash 1 year ago
Excellent! First time I've seen it explained like this.
@tunafllsh 1 year ago
Wow, this is a really interesting view of neural networks and what role layers play in them.
@sciencely8601 7 days ago
god bless you for this work
@ward_heimdal 9 months ago
Hands down the most enlightening ANN series on the net from my perspective, afaik. I'd be happy to pay 5 USD for the next video in the series.
@garagedoorvideos 1 year ago
8:47 --> 9:21 Is like watching my brain while I predict some trades. 🤣🤣🤣 "The reason why neural networks work....is that they fit the data" sweet stuff.
@SiimKoger 1 year ago
Might be the best and most rational neural networks video on KZfaq that I've seen 🤘🤘
@borisbadinoff1291 6 months ago
Brilliant! Lots to unload from the concluding sentence: neural networks work because they fit the data. Sounds like an even deeper issue than misalignment due to proxy-based training.
@gbeziuk 1 year ago
Great video. Special thanks for the historical background.
@koderksix 7 months ago
I like this video so much. It really shows that ANNs are really just, at the end of the day, glorified multivariate regression models.
@lewismassie 1 year ago
Oh wow. This was so much more than I was expecting. And then it all clicked right in at about 9:45
@metanick1837 9 months ago
Nicely explained!
@xt3708 9 months ago
This makes total sense, thank you. With the last observation of the video, how does that reconcile with statements from the OpenAI team regarding emergent properties of GPT4 that they didn't expect, or don't comprehend? I might be mixing apples and oranges, but if it's just curve fitting then why has something substantially changed?
@scarletsence 1 year ago
Actually, adding a bit of math to this video wouldn't hurt, as long as you pair it with visual representations of the graphs and formulas. But anyway, one of the most accessible explanations I have ever seen.
@is_this_youtube 1 year ago
This is such a good explanation
@colebrzezinski4059 7 months ago
This is a really good explanation
@4.0.4 8 months ago
This is the second video of yours that I've watched that gives me a eureka moment. Fantastic content. One thing I don't get is, people used to use the sigmoid function before ReLU, right? Was it just because natural neurons work like that and artificial ones were inspired by them?
@algorithmicsimplicity 8 months ago
Yes sigmoid was the most common activation function up until around 2010. The very earliest neural networks back in the 1950s all used sigmoid, supposedly to better model real neurons, and nobody questioned this choice for a long time. Interestingly, the very first convolutional neural network paper in 1980 used ReLU, and even though it was already clear that ReLU performed better than sigmoid back then, it still took another 30 years for ReLU to catch on and become the most popular choice.
@bassemmansour3163 1 year ago
👍 Super demonstration! How did you generate the graphics? Thanks!
@qrubmeeaz 8 months ago
Careful there! You should explicitly mention that you are taking the absolute values of the errors. (Usually we use squares). Without the squares (or abs), the positive and negative errors will kill each other off, and the simple regression does not have a unique solution. Without the squares (or abs), you can start with any intercept, and find a slope that will give you ZERO total error!!
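A quick numeric illustration of that cancellation (a made-up toy example): with signed errors a clearly bad line can still score a total error of zero, while absolute errors expose it.

    data = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]   # three points on y = x

    def signed_total(a, b):
        return sum(y - (a * x + b) for x, y in data)

    def abs_total(a, b):
        return sum(abs(y - (a * x + b)) for x, y in data)

    print(signed_total(1.0, 0.0), abs_total(1.0, 0.0))   # the true line y = x: 0.0 and 0.0
    print(signed_total(0.0, 1.0), abs_total(0.0, 1.0))   # flat line y = 1: signed errors cancel to 0.0, absolute total is 2.0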
@Scrawlerism 1 year ago
Damn you need and deserve more subscribers!
@saysoy1 1 year ago
I loved the video; would you please make another one explaining backpropagation?
@algorithmicsimplicity 1 year ago
Hopefully I will get around to making a back propagation video sometime, but my immediate plans are to make videos for CNNs and transformers.
@saysoy1 1 year ago
@@algorithmicsimplicity just don't stop man!
@Gravitation. 1 year ago
Beautiful! Could you do this type of video on other machine learning models, such as convolutions?
@algorithmicsimplicity 1 year ago
Yep, I am planning to do CNN and transformer videos next.
@master11111 1 year ago
That's a great explanation
@willturner1105 1 year ago
Love this!
@johnchessant3012 1 year ago
Great video!
@StephenGillie 1 year ago
Having worked with a simple single-layer 2-synapse neuron in a spreadsheet, I find this video vastly overexplains the topic at a high level, while not going into enough detail. It does, however, go over the linear regression needed for the synapse weight updates. Also it treats the massive regression testing as a benefit instead of a cost. One synapse per neuron in the layer above, or per input if the top layer. One neuron per output if the bottom layer. Middle layers define resolution, from this video at a rate of (neurons per layer)^(layers). Fun fact: Neural MAC (multiply-accumulate) chips can perform whole racks worth of computation. The efficiency gain here isn't so much in speed as it is reduction of power and space, by rearranging the compute units and using analog accumulation. In this way the MAC units more closely resemble our own neurons too.
@talsheaffer9988 24 days ago
Thanks for the vid! At about 10:30 you say a NN with n neurons in each of L layers expresses ~ n^L linear segments. Could this be a mistake? I think it's more like n^2 * L
@algorithmicsimplicity 24 days ago
The number of different linear segments is definitely at least exponential in the number of layers, e.g. proceedings.neurips.cc/paper_files/paper/2014/file/109d2dd3608f669ca17920c511c2a41e-Paper.pdf
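One classic construction makes the exponential growth concrete (a sketch for intuition, not taken from the video or the linked paper): use a 2-neuron ReLU "tent" as each layer; composing it doubles the number of linear pieces on [0, 1] with every extra layer.

    def relu(z):
        return max(0.0, z)

    def tent(x):
        # one "layer" built from two ReLU neurons; it has a single kink at x = 0.5
        return 2 * relu(x) - 4 * relu(x - 0.5)

    def network(x, layers):
        for _ in range(layers):
            x = tent(x)
        return x

    def count_segments(layers, samples=200000):
        # count slope changes of the network's output on a dense grid over [0, 1]
        xs = [i / samples for i in range(samples + 1)]
        ys = [network(x, layers) for x in xs]
        slopes = [(y1 - y0) * samples for y0, y1 in zip(ys, ys[1:])]
        changes = sum(1 for s0, s1 in zip(slopes, slopes[1:]) if abs(s1 - s0) > 1e-3)
        return changes + 1

    for depth in range(1, 6):
        print(depth, count_segments(depth))   # roughly 2, 4, 8, 16, 32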
@hibamajdy9769 7 months ago
Nice interpretation 😊. Please, can you make a video explaining how neural networks are used in, for example, digit recognition?
@justaname999 1 month ago
This is a really cool explanation I haven't seen before. But I have two questions: Where does overfitting fit in here? Would more neurons mean higher risk of overfitting, and do layers help or are they unrelated? And where would co-activation of multiple neurons fit in this explanation? E.g., combination of information from multiple sensory sources?
@algorithmicsimplicity 1 month ago
My video on CNNs talks about overfitting and how neural networks avoid it (kzfaq.info/get/bejne/bs95l7p5z9LJeac.html ) . It turns out that actually the more neurons and layers there are, the LESS neural nets overfit, but the reason is pretty unintuitive. From the neural nets perspective, there is no such thing as multiple sensory sources. Even if your input to the NN combines images and text, the neural net still just sees a vector as input, and it is still doing curve fitting just in a higher dimensional space (dimensions from image + dimensions from text).
@justaname999 1 month ago
@@algorithmicsimplicity Thank you! I had read that more neurons lead to less overfitting and thought it was counterintuitive, but I guess that must have carried over from the regular modeling approach where variables remain (or should remain) interpretable. I'll have a look at the other videos! Thanks. I guess my confusion stems from what you address at the end. We can fairly simply imitate some things via principles like Hebbian learning, but the fact that in actual brains it involves different interconnected systems makes me stumble. (And it shouldn't, because obviously these models are not actually like real brains.)
@redpanda8961 1 year ago
great video!
@LegenDUS2 1 year ago
Really nice video!
@oleksandrkatrusha9882 8 months ago
Amazing!
@promethful 1 year ago
Is this piecewise linear approximation of a network a feature of using the ReLU activation function? What if we use a sigmoid activation function instead?
@dasanoneia4730 8 months ago
Thanks needed this
@vasilin97 8 months ago
Great video! I am left with a question though. If the number of straight line segments in an NN with n neurons in each of the L layers is n^L, then why would we ever use n > 2? If we are constrained by the total number of neurons n*L, then n = 2 maximizes n^L. I have two guesses why we use n > 2: 1. (Hardware) Linear algebra is fast, especially on a GPU. We want to use vectors of larger sizes to make use of the parallelism. 2. (Math) Maybe the number of gradient descent steps needed to fit a deeper neural network is larger than to fit a shallower NN with wider layers? If you plan to make any more videos about this, this question would be great to address. If not, maybe you can reply here with your thoughts? Thank you!
@algorithmicsimplicity 8 months ago
Really good question. There are 2 reasons why neural networks tend to use very large n (usually several thousand) even though this means less representation capacity. The first is, as you guessed, it makes better use of GPU accelerators. You can't parallelize computation across layers, but you can parallelize computation across neurons within the same layer. The second, and more important reason, is that in practice we don't care that much about representation power. Realistically, as soon as you have 10 layers with a few hundred neurons in each, you already have enough representation power to fit any function in the universe. What we actually care about is generalization performance. Just because your network has the capacity to represent the target function, doesn't mean that it will learn the correct target function from the training data. It is much more likely that the network will just overfit to the training dataset. It turns out to be the case that the wider a neural network is, the better it generalizes. It is still an open area of research why this is the case, but there are a few hypotheses floating around. My other video on convolutional neural networks actually goes into one of the hypotheses a bit, in it I explain that the more neurons you have the more likely it is that they have good initializations, but it was a bit hand-wavy. I was planning to do a more in-depth video on this topic at some point.
@vasilin97 8 months ago
@@algorithmicsimplicity thank you for such a thoughtful reply! I'll watch your CNN video and all other videos you'll produce on this topic! I thought that the whole overfitting business is kind of obsolete nowadays, with LLMs having more neurons than the number of training data samples. This is only a rough understanding I've gained from some random articles, and would love to learn more about it. Do you have any suggestions for what to read or watch in this direction? As you noted in the video, there is lots of low-quality content about NNs out there, which makes it hard to find answers to even rather straightforward questions like whether overfitting is "still a thing" in large models.
@algorithmicsimplicity 8 months ago
@@vasilin97 The reason why we use such large neural networks is precisely because larger neural networks overfit less than smaller neural networks. This is a pretty counter-intuitive result, and is contrary to what traditional statistical learning theory predicts, but it is empirically observed over and over again. This phenomenon is known as "double descent", you should be able to find some good resources on this topic searching for that term, for example www.lesswrong.com/posts/FRv7ryoqtvSuqBxuT/understanding-deep-double-descent , medium.com/mlearning-ai/double-descent-8f92dfdc442f . The Wikipedia page on double descent is pretty good too.
@stefanrigger7675 1 year ago
Top notch video, one thing you might have mentioned is that you only deal with the one-dimensional case here.
@MARTIN-101 1 year ago
phenomenal
@y.8901 1 year ago
Is it right that, at 8:19, the input x is a 2d input (namely x and y)? Otherwise, how could the points be plotted in 2d?
@algorithmicsimplicity 1 year ago
All of the inputs in this video are 1d (just x), y is the output. If the input was 2d the graph would need to be in 3d.
@y.8901 1 year ago
@@algorithmicsimplicity Thank you, so how did you place your points if you had only 1 input (i.e. x, for example)? Wouldn't we need x and y, so 2 inputs?
@dann_y5319 1 month ago
Omg great video
@Null_Simplex 3 months ago
Thank you. This is far more intuitive than the usual interpretation with a nodes-and-edges graph, with the inputs bouncing back and forth between layers of the graph until it finally gets an output. What are the advantages and disadvantages of this method of approximating a function compared with polynomial interpolation?
@algorithmicsimplicity 3 months ago
For 1-dimensional inputs and outputs, there isn't much difference between them. For higher dimensional inputs polynomials become infeasible, since a polynomial would need coefficients for all of the interaction terms between the input variables (of which there are exponentially many). For this reason, neural nets are preferred when the input is high dimensional, as they simply apply a linear function to the input variables, and then an activation function to the result of that.
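A back-of-the-envelope sketch of that blow-up (illustrative only; the degree 5 and layer width 100 are arbitrary choices): a polynomial of degree at most k in d variables needs C(d+k, k) coefficients, while a single dense ReLU layer only adds parameters proportional to d.

    from math import comb

    def polynomial_terms(d, k):
        # number of coefficients a degree-<=k polynomial needs in d variables
        return comb(d + k, k)

    def relu_layer_params(d, n):
        # one dense layer of n ReLU neurons: n*d weights plus n biases
        return n * d + n

    for d in [2, 10, 100, 1000]:
        print(d, polynomial_terms(d, 5), relu_layer_params(d, 100))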
@HitAndMissLab 9 months ago
You are math God! . . . subscribed
@spadress 7 months ago
Very good video
@TheCebulon 1 year ago
Do you have a video of how to apply this to train a neural network? Would be awesome.
@kwinvdv 1 year ago
Neural network "training" is just model fitting. It is just that the proposed structure is quite versatile.
@abdulhakim4639 1 year ago
Whoa, easy to understand for me.
@computerconcepts3352 1 year ago
extremely underrated
@TheEmrobe 1 year ago
Brilliant.
@willowarkan2263 1 year ago
Might have been useful to explain why a nonlinear function like the ReLU is used as the transfer function, or that it's not the only common transfer function, which is kind of implied since you use the identity in the beginning, though that isn't nonlinear.
Also, the link to the human brain is in the structure of neurons: the conceptual foundation of a weighted sum transformed into a single value by a nonlinear process, dendritic signals combined into a single signal down the axon, at the end of which lie new connections to further neurons. Furthermore, in the case of computer vision the structure of the visual cortex served as an inspiration for the neural networks in that field. If memory serves, the neocognitron was one such network, not a learning network as its parameters were tweaked by humans till it did as desired, but foundational to convolutional neural networks.
Otherwise interesting enough, stressing the behind-the-scenes nature of neural networks, though maybe mentioning how classification relates to those regressions would have been cool too.
Btw, what learning scheme were you using? As far as I could tell it was some small random jump first added and then subtracted if the result got worse? I assume if the subtraction is even worse it rolls back the entire thing and rolls a new jump? I ask as it neither sounded nor looked like back propagation was used.
The thumbnail still bugs me: the graph representation of the network isn't wrong, it just shows something different, the nature of the interconnections between neurons in the network that are hard to see in the graph representation of the resulting regression. It's like saying the map of the route is wrong because it's not a photo of the destination.
@algorithmicsimplicity 1 year ago
All commonly used activations are either ReLU or smooth approximations to ReLU, so I disagree that it is useful to talk about alternatives in an introductory video. It is not helpful to think of modern neural networks as having anything to do with real brains. Yes, early neural networks were inspired by neuroscience experiments from the 1950s. But our modern understanding is that the way the brain actually works is vastly more complex than those 1950s experiments let on. Not to mention modern neural network architectures are even less inspired by those experiments (e.g. transformers). The learning scheme I used is: iterate through the parameters; increase the parameter value by 0.1; if the new loss is worse than the old loss, then decrease the parameter value by 0.2. That's it. I just repeated that operation. No backpropagation was used, as the purpose of backprop is only to speed up the computation (which was unnecessary for these toy examples). The graph representation of the network is absolutely wrong, as it misleads people into thinking the interconnections between neurons are relevant at all, when they have nothing to do with how or why neural nets work.
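For concreteness, a minimal sketch of that scheme (assuming an absolute-error loss and a small one-hidden-layer ReLU model; the actual model, data, and loss used in the video may differ):

    def relu(z):
        return max(0.0, z)

    data = [(x / 10, (x / 10) ** 2) for x in range(-10, 11)]   # toy target: y = x^2

    params = [0.0] * 9   # 3 hidden neurons, each with (input weight, bias, output weight)

    def predict(x, p):
        return sum(p[3*i + 2] * relu(p[3*i] * x + p[3*i + 1]) for i in range(3))

    def loss(p):
        return sum(abs(y - predict(x, p)) for x, y in data)

    for _ in range(2000):
        for i in range(len(params)):
            old = loss(params)
            params[i] += 0.1                # try nudging this parameter up
            if loss(params) > old:          # worse? step back past the start
                params[i] -= 0.2
    print(loss(params))   # loss drifts down from its starting value to a rough fit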
@willowarkan2263 1 year ago
@@algorithmicsimplicity I would argue that modern neural networks developed from the more primitive neural networks based on the then understanding of the brain. Also, I didn't write that neural networks work like the brain, just that their basic building blocks were inspired by the basic building blocks of the brain, and some structural aspects, as understood at the time, inspired the structure of NNs. So you are claiming that neither sigmoid, hyperbolic, nor any of the transfer functions common to RBFNNs are used? Or are so rarely used as to be negligible? Yes, ReLU and related functions are currently popular as the transfer function of hidden layers in deep learning and CNNs. Although some of the related functions start to depart quite a bit from the original, mostly keeping linearity in the positive domain, approximations notwithstanding. Looking through them, it feels like calling sigmoid and tanh functions related to the step function, which they kind of are, similarly solving issues with differentiability as the approximations to ReLU do. So you kind of used a grid search, on a 0.1-sized grid, to discretize the parameter space. Pretty sure an NN without connections isn't a network. Especially for back propagation they are inherent in its functioning; after all, it needs to propagate over those connections. I don't see what you mean by it misleads people or how the interconnectedness is meaningless. The fact that a single layer wasn't enough to solve a non-linearly separable problem is what kept the field inactive for at least a decade.
@algorithmicsimplicity 1 year ago
"So you are claiming that neither sigmoid, hyperbolic, nor any of the transfer functions common to RBFNN are used? Or are so rarely used as to be negligible?"
@kalisticmodiani2613 8 months ago
@@algorithmicsimplicity if we didn't need performance then we wouldn't have made so much progress in the last few years. Stars aligned with performant hardware and efficient algorithms.
@andreasdekrout5209 1 year ago
Thanks.
@theplotproject5911 1 year ago
this is gonna blow up