Running Neural Networks on Meshes of Light

197,783 views

Asianometry

1 day ago

I want to thank Alex Sludds for his efforts in helping me research and produce this video. Check out his work here: alexsludds.github.io
Links:
- The Asianometry Newsletter: asianometry.com
- Patreon: / asianometry
- The Podcast: anchor.fm/asianometry
- Twitter: / asianometry

Comments: 335
@harrykekgmail · 2 years ago
A big Thank You to Alex Sludds too (from a grateful audience)!
@outerspaceisalie · 1 year ago
@Nobody Important thats none of your business
@oscarruorochmolinacansino5907 · 1 year ago
@Nobody Important Time travel.
@stefanklaus6441 · 1 year ago
@Nobody Important The video may have been on private before?
@ivoryas1696 · 1 year ago
@@outerspaceisalie Indeed. Should we take him out? He might have learned too much...
@Hassanmohamed31152 · 1 year ago
I almost commented a few videos ago that you will have single-handedly staffed all US semiconductor fabs with engineers in the next 10 years just by posting. Happy to see you grow so much, even in your own niche, without clickbait.
@geneballay9590 · 1 year ago
"you will have single-handedly staffed all US semiconductor fabs with engineers in the next 10 years just by posting" - the same thought occurred to me. By producing these videos he is opening up a world of opportunities for others to see and consider.
@baptistedelplanque8859 · 1 year ago
This video is sponsored by chipactVPN
@wale7342 · 1 year ago
I'm getting my comp eng degree rn
@jaazz90 · 1 year ago
Definitely one of the topics I've always been interested in but never had a good source of information about. If I were younger I'd strongly consider going into it.
@clemenkok5758 · 1 year ago
As an ECE sophomore in college, I just wanted to say that you're playing an amazing role in developing the next generation of semiconductor engineering! :)
@guaposneeze · 1 year ago
FWIW, that observation that IO consumes more power than MAC operations in an AI accelerator is pretty universal across problem domains. I often quip that it's a silly accident of history that we call the metal boxes "computers" since almost none of the power, gate count, mass, etc., is actually used directly for computation. Most of computing in-practice is about getting the right data to the right place at the right time. I have a patent on internet-scale CDN configuration. But it's all the same at every scale. Pushing configuration data across the globe to the right server. Pushing weight data across the chip to the right IO pin. The memory/storage hierarchy instantly becomes the constraint as soon as you try to scale compute at any scale, in any domain. The ideas driving photonic compute for AI will be directly applicable to more seemingly mundane use cases.
@Soken50 · 1 year ago
It always comes down to logistics and thermodynamics, humanity's two biggest nemeses, doesn't it?
@0MoTheG · 1 year ago
It was not always so in the past. Leakage could also be a major contributor. But it was always known that the wires and power density would become the main problem, both in power and delay.
@aberroa1955 · 1 year ago
Electrical signals essentially propagate at the speed of light (it's not the charge carriers that transmit the signal, it's the electric field), so it's not signal propagation speed that allows high throughput, it's the ability to distinguish signals. Electrical signals get "smudged" along the way, but so do electromagnetic signals in waveguides; the optical ones just smudge remarkably less. Also, electrical logic gates take some time to transition from one state to another, and that's the major factor limiting throughput. If there were faster-switching transistors, higher frequencies would be available. I don't know about photonics, but it seems that there the transition is either much faster, or the architecture is completely different: instead of switching state, light is split along the way and goes through preconfigured logic gates, so processing is faster while going through the same transformations, but it takes some time to switch from one configuration to another. But it's possible the same results could be achieved with electronic components.
@davidb5205 · 1 year ago
I wouldn't say electrical signals travel at "essentially the speed of light." That applies to maybe radio waves in free space. But velocity factor/wave propagation speed is typically ~64% the speed of light (Cat 5 data cables) to ~90% the speed of light (RF signals). Without taking insulation into account which reduces VF further. Even if the jump is from 90% to 99% the speed of light, that optimization would result in huge improvement. But like you said, it's about ability to distinguish signals, the accuracy of detection at the receiving end. Without that it's unusable.
@aberroa1955 · 1 year ago
@@davidb5205 True, but overcomplicated; that's why there was the word "essentially" - because it's still multiple orders of magnitude faster than charge carriers move. Also, in photonics light travels significantly slower than c too, because it does so in a medium (glass, or whatever), which slows down electromagnetic wave propagation.
@XCSme · 1 year ago
If gate transition time is the bottleneck, shouldn't FPGAs or static circuits (simple wires) be free of this limitation? Or can you not implement matrix multiplication without using gates?
@aberroa1955 · 1 year ago
@@XCSme FPGAs aren't simple wires; they too use transistors and switch state each clock cycle based on input. Static circuits... well, as long as they don't use any capacitors and don't have too much capacitance or inductance of their own, they'd be lightning fast, but as useful as a plain wire or resistor, unable to compute anything. One could say that a transistor is a teeny-tiny capacitor, and transistors take time to charge. The less capacitance they have, the faster the charging. And although modern nanometer transistors have negligible capacitance, they still have some, and they need time to charge or discharge. If you have tons of transistors but they're mostly in parallel, you can raise the clock to maybe tens of GHz and it would be fine, but you won't be able to do many operations per cycle, only basic ones. If, on the other hand, you have the same number of transistors interconnected like in a CPU, then your frequency is limited by the last one in the chain: you need to be sure that each one in series before it had enough time to charge or discharge, otherwise the last one could end up in an incorrect state.
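A quick numeric sketch of the charge-time argument above: treating a gate as an RC node, the time to settle to 99% of its final voltage is -RC·ln(0.01). The resistance and capacitance figures below are made-up illustrative values, not real process numbers.

```python
import math

def rc_settle_time(r_ohm: float, c_farad: float, fraction: float = 0.99) -> float:
    """Time for an RC node to charge to `fraction` of its final voltage."""
    return -r_ohm * c_farad * math.log(1.0 - fraction)

# Hypothetical figures: a 1 kOhm driver charging a 1 fF gate.
t = rc_settle_time(1e3, 1e-15)
print(f"{t * 1e12:.2f} ps to 99%")  # about 4.6 ps
```

Shrinking either R or C shortens this settling time, which is the sense in which smaller gate capacitance allows faster switching.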
@XCSme · 1 year ago
@@aberroa1955 Thanks a lot for the response! I realized that my initial comment was a bit stupid to suggest an FPGA without logic gates, as that's what the G in the acronym stands for... That being said, is there really no way to compute anything without using transistors? What if you use the voltage value as the output? Say you have to do an addition: if you feed 0.5V and 0.3V at the inputs and link them in series, you should get 0.8V (maybe this is just an analog computer? en.wikipedia.org/wiki/Analog_computer ). Division could also be done with resistors: say you want to divide by 3, you feed 1V at the input and have 3 resistors/paths, and measuring the output voltage of one of the paths gives the input divided by 3.
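The divide-by-3 idea above is essentially a resistive voltage divider, and the arithmetic is plain Ohm's law. A tiny sketch (generic component values, not a real circuit design):

```python
def divider_out(v_in: float, r_top: float, r_bottom: float) -> float:
    """Output voltage of a two-resistor divider, tapped between r_top and r_bottom."""
    return v_in * r_bottom / (r_top + r_bottom)

# Tapping the bottom third of a chain of three equal resistors
# gives one third of the input voltage:
print(divider_out(1.0, 2.0, 1.0))  # 0.333...
```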
@watermans7357 · 1 year ago
I work in a research group which develops simulation tools for these photonic circuits. This video was very well explained. I can't wait to see what photonic circuits will be used for in the future. Thanks for making this video!
@raphaelcardoso7927 · 1 year ago
What types of tools do you develop?
@Soken50 · 1 year ago
He said at the end a 1D row of interferometers can perform like a 2D array by using time instead; would the same principle apply for 2D to 3D if the accuracy for 2D can be improved?
@watermans7357 · 1 year ago
@@raphaelcardoso7927 We develop tools that leverage artificial neural networks to simulate the performance of photonic devices. All of our software is free and open source.
@watermans7357 · 1 year ago
@@Soken50 I am not sure, since I don't deal with the theory sort of stuff; my work is mostly on the software development side of things.
@chandrasekarank8583 · 1 year ago
Hi, can we connect on LinkedIn?
@geneballay9590 · 1 year ago
Wow, your videos just get better and better. As I watched this I kept having flashbacks to my university math/physics discussions on matrix mechanics of more than 50 years ago, and realizing that those concepts remain important in today's world.
@paxdriver · 1 year ago
You did a really good job on this one, man. That's no small feat, bravo
@alexsludds1377 · 1 year ago
Great work Jon!
@lilacswithtea · 1 year ago
thanks to you, he really made light work of this topic!
@suntemple3121 · 1 year ago
Thank you Alex, all the best blessings to you and yours.🌟🌟🌟🌟🌟🌟🌟🌟
@billwhoever2830 · 1 year ago
1) Electrical signals also travel at the speed of light (the speed of light inside the conducting material); the signal is transmitted by the electromagnetic field. The main limiting factor in electronic computers is the capacitances inside them, the most basic being the capacitance of the FET gates. For a FET to switch, the gate needs to reach the desired charge, and although this is getting smaller with new nano-scale transistors, it is still there. The same applies to discharging those capacitances, which still takes time and also dumps all of their energy into heat. 2) The speed of light is a limiting factor even for photonics: a 4 GHz chip, something that might be in a modern computer, has a period of 0.25 ns between clock cycles, and light can only travel 75 mm in that time, and that's in the best case (in vacuum). A theoretical 40 GHz photonic computer would have a 0.025 ns (25 ps) period, and light would only be able to cover 7.5 mm. This means that even in a 40 GHz chip, the maximum datapath length inside a computation core is 7.5 mm in the best case. Photonic computers working at terahertz are almost certainly sci-fi. And of course this type of CPU, with such a small distance covered between clocks, will have very big memory bottlenecks (the time in cycles for data to be stored and recovered from memory) and will require the memory to be very, very close to the chip.
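The distances in point 2 follow directly from distance = c / f; a quick check of the commenter's numbers:

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def mm_per_cycle(freq_hz: float) -> float:
    """Distance (mm) light covers in vacuum during one clock period."""
    return C / freq_hz * 1000.0

for f in (4e9, 40e9):
    print(f"{f / 1e9:.0f} GHz: {mm_per_cycle(f):.1f} mm per cycle")
# 4 GHz gives ~75 mm, 40 GHz gives ~7.5 mm, as stated above.
```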
@tf_d · 1 year ago
This.
@billwhoever2830 · 1 year ago
@@tf_d I just noticed a mistake in my comment. I said that the datapath on a 40 GHz core would be 7.5 mm. In reality datapaths are normally pipelined, so each individual stage of the pipeline would be limited to that length (this definitely applies to electronic computers). Pipelines on CPUs today are around 8-20 stages long. I'm not sure whether pipelining would work in photonics, and I think there would need to be electronic circuits between the stages anyway.
@tf_d · 1 year ago
@@billwhoever2830 I don't see why pipelining wouldn't be possible with photonics, they're technically able to do anything that an electronic circuit can.
@cerebralm · 1 year ago
At 5:04, did you mean to write picojoule instead of petajoule?
@infinitumneo840 · 1 year ago
Silicon Photonics represent a quantum leap in technological speed and power efficiency. One major issue when dealing with light is the fact that you're reading the probabilities of the light waves. You run into quantum mechanics at this level. Light is sensitive to interference of the environment through quantum decoherence. I believe there will be a solution to this problem as our understanding of quantum systems evolves.
@Primarkka · 1 year ago
@@rufushawkins3950 Very simplified, it means something is quantized, as in a photon's energy is discrete in a way.
@venerable_nelson · 1 year ago
@@rufushawkins3950 When used as a noun it means small, when used as an adjective it means big. English!
@marilynlucas5128 · 1 year ago
Geometry is the key to solving the quantum mechanics problem
@unvergebeneid · 1 year ago
Wow, there is so much wrong with this one small comment that it would take an essay to pick it apart.
@benjybo · 1 year ago
How does this interference from the environment behave? Can it be algorithmically modeled? If so, I believe it might be possible to create a noise generator that mimics this behavior during neural network training. This could help "robustify" the neural networks, preparing them for inference on such optical devices - a software rather than hardware approach to mitigating the accuracy issue.
@tortugatech · 1 year ago
Great video as always! Keep them coming, love it!
@PlanetFrosty · 1 year ago
Excellent work! My company is one of those working on photonics/quantum compute InFlight as optical networks transit the world. Though quite different, great progress has been made.
@kice · 1 year ago
6:28 High bandwidth is not due to the physical transfer speed; in fact, electrical signals also move at close to the speed of light. Bandwidth is usually determined by how many bits per transfer and how many transfers per second. A normal GPU transfers a couple hundred bits at a few GHz.
@Steven_Edwards · 1 year ago
It's the speed of light through a medium... Switching to photonics removes the need for conductive metals and voltage transformation. Light TX/RX is a lot simpler, and it's a much clearer medium, so the speed of light is faster in that medium.
@mystifoxtech · 1 year ago
Small correction (I may be nitpicking) electricity can transmit data at speeds of 50%-99% speed of light
@bakedbeings · 1 year ago
Quick posts! Really enjoying your silicon rabbit hole.
@WildEngineering · 1 year ago
I'm really glad to see this tech being mentioned more and more.
@taktoa1 · 1 year ago
A few mistakes: 1. ML typically consists of many matrix-vector multiplication steps, not matrix-matrix multiplications. 2. At 5:02 you meant picojoules, not petajoules 3. As I understand it (not an expert in photonics, though I have worked on an ML accelerator), for a given level of accuracy a photonic matrix-vector multiplication circuit will consume more power than a digital one, mostly because of the digital-to-analog and analog-to-digital steps. So I think it's somewhat misleading to say that power is not the problem. 4. I think the last point about replacing one of the axes with time is also misleading. That can be done for any circuit ("time-multiplexing") and will proportionally decrease throughput. So it's far from a solution to the density problem.
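Point 4's time-multiplexing can be sketched in a few lines: a matrix-matrix product computed as one matrix-vector product per time step, which keeps the result identical while cutting throughput proportionally (a generic NumPy sketch, not any particular accelerator):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # stationary weight matrix
X = rng.standard_normal((3, 5))   # batch of 5 input vectors

# A 2D array computes W @ X in one shot; a 1D row of multipliers
# can instead stream the columns of X through, one per "time step".
cols = [W @ X[:, t] for t in range(X.shape[1])]  # 5 time steps
Y = np.stack(cols, axis=1)

assert np.allclose(Y, W @ X)  # same result, 5x lower throughput
```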
@trulyUnAssuming · 1 year ago
1. A 1xn matrix is a vector so eh... plus if you do batch learning you end up with true matrix-matrix products
@taktoa1 · 1 year ago
yeah, but I still feel like mentioning matrix-matrix multiplication is going to confuse the average viewer more than illuminate, compared to matrix-vector. most ML accelerators are built to accelerate matrix-vector products (e.g.: they use weight stationary systolic arrays). this is because accelerators rarely have the memory bandwidth to support matrix-matrix products at full throughput; they require the higher operational intensity of the static matrix/dynamic vector product.
@leonfa259 · 1 year ago
2. 27 orders of magnitude is a lot
@daniel_960_ · 1 year ago
Petajoules in a chip sounds fun
@jadeaffenjaeger6361 · 1 year ago
Convolutions are typically expressed using im2col, which makes them an instance of the matrix-matrix multiply. They are extremely common in vision-based applications, so I think the statement is absolutely justified!? I would consider the questions whether a matrix-matrix product is decomposed into matrix-vector multiplications in a given accelerator an implementation detail, rather than an inherent feature of the underlying problem.
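A minimal 1D im2col sketch illustrating the point: stacking sliding windows turns a convolution into a matrix product (single channel, stride 1, purely illustrative):

```python
import numpy as np

def im2col_1d(x: np.ndarray, k: int) -> np.ndarray:
    """Stack the length-k sliding windows of x as rows."""
    return np.stack([x[i:i + k] for i in range(len(x) - k + 1)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([1.0, 0.0, -1.0])

cols = im2col_1d(x, len(w))   # shape (3, 3): one row per output sample
y = cols @ w                  # the convolution, now a matrix-vector product

# Matches a direct sliding-window correlation:
assert np.allclose(y, np.correlate(x, w, mode="valid"))
```

In 2D the rows become flattened patches and the kernel becomes a matrix, giving the matrix-matrix multiply the comment describes.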
@mapp0v0 · 1 year ago
Have you heard of BrainChip's Akida chip? Currently in production. Akida is a neuromorphic system-on-chip designed for a wide range of markets, from edge inference and training at sub-1W power to high-performance data center applications. The architecture consists of three major parts: sensor interfaces, the conversion complex, and the neuron fabric. Akida incorporates a neuron fabric along with a processor complex used for system and data management as well as training and inference control. The chip's efficiency comes from its ability to take advantage of sparsity, with neurons only firing once a programmable threshold is exceeded. The NNs are feed-forward, and neurons learn through selective reinforcement or inhibition of synapses. Sensory data such as images are converted into spikes. The Akida NSoC has a neuron fabric comprising 1.2 million neurons and 10 billion synapses. For training, both supervised and unsupervised modes are supported. In supervised mode, the initial layers of the network are trained autonomously, with labels applied to the final fully-connected layer, which makes it possible for the networks to function as classification networks. Unsupervised learning from unlabeled data as well as label classification is possible.
@johanlarsson9805 · 1 year ago
Thanks for mentioning the paper! I knew I recognized this, and when you showed it I realized it was 4 years since I read it.
@QSecty · 1 year ago
I had the idea to do computing with light 5 years ago, but had no clue how to get it done. Glad to see a big step in computing!
@x2ul725 · 1 year ago
Such a fun video! Great work, guys!
@Erik-gg2vb · 1 year ago
I watched a YouTube video called "The next big step in computing" by Anastasi, where she mentions how they are trying to use light in an analog form, with different intensities, as a new way to compute. Not as in-depth as here, but still over my head.
@rahulmathew4970 · 1 year ago
Happy to know that I am not the only one following her
@mclilzenthepoet2331 · 1 year ago
Oy, another Anastasi follower, nice
@satadrudas3675 · 1 year ago
This was a very informative video. I am in fact working on a time-multiplexed SiPh matrix multiplication design like the one you mentioned towards the end of your video.
@stevengill1736 · 1 year ago
So there's an analog aspect to these calculators as well? Very cool... exactly what was wanted. Can't wait to see how this tech works out... cheers
@benjaminlynch9958 · 1 year ago
Awesome video. Another reminder of why I’m subscribed. 👍🏼 This technology is really cool. It seems like the use case to make this commercially viable is training massive neural networks rather than inference. It’s the training that is computationally expensive and requires stupid amounts of computing power. That’s a challenge that needs to be solved. Inference on the other hand is trivial by comparison. Almost every smartphone these days has a built in neural engine that can run inference in real time at less than a watt for relatively simple problems, and even moderate to large problems can be run through inference on a traditional modern CPU with no dedicated matrix multiplier.
@JamEngulfer · 1 year ago
I wonder if you got the photonics cheap enough and small enough, the accuracy could be improved by running the same calculation multiple times and averaging it. Though the extra electronics and redundancy might offset any gains made…
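The averaging idea does work in principle: independent zero-mean noise shrinks as 1/sqrt(N) with N repeats. A toy simulation (the noise level is an arbitrary made-up figure, not a measured device spec):

```python
import numpy as np

rng = np.random.default_rng(42)
true_value = 1.0
noise_sigma = 0.1     # hypothetical per-run analog noise
runs = 10_000

# One noisy "photonic" evaluation vs. the mean of 16 repeats.
single = true_value + rng.normal(0.0, noise_sigma, runs)
avg16 = true_value + rng.normal(0.0, noise_sigma, (runs, 16)).mean(axis=1)

print(single.std(), avg16.std())  # the averaged error is roughly 4x smaller
```

Whether the 16x extra evaluations (or hardware) cost less than the accuracy they buy is exactly the trade-off the comment raises.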
@punditgi · 1 year ago
Excellent video! Learned a lot. Well done! 😃
@hugod2000 · 1 year ago
Thank you for these fascinating videos.
@anteconfig5391 · 1 year ago
"Photonic neural networks" - that's a yummy combo of words. I hope this video doesn't disappoint.
@stefanklaus6441 · 1 year ago
I recently saw a great video on why our AND/OR gates will always dissipate energy. The answer "boils down" to entropy. Depending on how far into theory one wants to dabble, this might be pretty interesting content.
@SianaGearz · 1 year ago
Can you give a better set of keywords or a full title?
@stefanklaus6441 · 1 year ago
@@SianaGearz "Why pure information gives off heat" by Up and Atom
@VicenteSchmitt · 1 year ago
@@stefanklaus6441 Watched it yesterday, great video
@Soken50 · 1 year ago
Gates are used to make "bits" interact and potentially effect a change of state. That change will of course necessitate a certain amount of work; however tiny it is, we can't get a system to change states without expending energy somewhere.
@TaylorAlexander · 1 year ago
Thank you for this! I have been seriously wondering about Lightmatter and I just checked up on them recently. Looks like they’re hiring some powerful folks and hopefully going to be able to offer real products soon!
@benjybo · 1 year ago
Great video! Thank you very much for making it! I'm currently working on a research project on ultra-low-precision neural networks. I wanted to ask: would reducing the number of bits in the activations and/or weights to about 2-3 bits each (using state-of-the-art quantization methods) help with the accuracy and scale issues of photonic accelerators raised in this video? In general, most neural networks these days can be quantized down to 4 bits with almost no loss of performance using the latest quantization methods, so 8 bits might be a bit unnecessary if these methods are used.
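A rough sketch of uniform symmetric weight quantization, one simple scheme among many (state-of-the-art methods are considerably more involved), just to show how error grows as bits shrink:

```python
import numpy as np

def quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Uniform symmetric quantization of weights to `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive levels at 4 bits
    scale = np.abs(w).max() / levels
    return np.round(w / scale) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal(1000)

for b in (8, 4, 2):
    err = np.abs(quantize(w, b) - w).mean()
    print(f"{b}-bit mean abs error: {err:.4f}")
# Error grows as bits shrink; whether a network tolerates it at 2-3 bits
# is the empirical question the comment raises.
```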
@Kengur8 · 1 year ago
In my favourite sci-fi movie, Bicentennial Man, they kind of show a photonic brain, even though it's called positronic. I love it now...
@norik1616 · 1 year ago
From what I've read (ML is my main field), even AlphaZero (and definitely MuZero) run on a "high end PC". The training was done on TPUs and simulation on CPU servers.
@norik1616 · 1 year ago
Also, the problem is how the DL model is queried in a reinforcement learning scenario - it is queried thousands of times per step to simulate the game in its "state space" (evaluating a tree of future steps).
@pc_screen5478 · 1 year ago
Katago, which is a Go AI based on AlphaGo Zero with some extra improvements, is superhuman at just a couple hundred playouts, which on my computer (gtx 1650) only takes a couple seconds to achieve (about 3-5). On a high end computer this is achieved in less than a second per move. The original AlphaGo was a frankenstein of neural networks and needed a lot of MCTS rollouts to make up for it, subsequent Go AIs can be superhuman running on an iphone even
@nicholasgrippo1754 · 1 year ago
This is very interesting. Excited to see what the future brings in this space.
@gustavderkits8433 · 1 year ago
Good that you started looking at this. More presentations in this area should follow. Talk to more experts.
@ajeybs4030 · 2 months ago
I can't thank this channel enough. Good job.
@pmk_ · 1 year ago
You mention that the 2016 AlphaGo was run on 48 TPUs. Were these required for the inference step used during the matches? Or was the final trained version running on just the laptop we saw in the documentary? Thanks for the great video!
@aniksamiurrahman6365 · 1 year ago
Man, you are remarkable. Btw, do u do financial consulting for tech companies? Or plan to do in the future?
@miklov · 1 year ago
Fascinating. Thank you!
@paulmichaelfreedman8334 · 1 year ago
Excellent channel. Objective, serious and extremely informative. Channels like these are what make YouTube great. Not those bonehead vloggers.
@itonylee1 · 1 year ago
I wonder if it's possible to have a multi-layer stack of LED film do a similar task, since LEDs can both emit light and act as photodetectors?
@animeshthakur5693 · 1 year ago
LEDs aren't sensitive enough
@itonylee1 · 1 year ago
@@animeshthakur5693 Sure, but in theory it is possible to integrate LEDs within the semiconductor die process.
@AngDavies · 1 year ago
Yes, but you probably wouldn't want to, for this to work you want coherent light, which for LEDs is going to mean throwing away most of it. A laser is what you want here really
@AngDavies · 1 year ago
@@itonylee1 If I recall, integrating the light source well is actually one of the major pitfalls/cost centres that is as yet unresolved. Integrating it with the design means you don't need to align or tune it. But making light sources out of silicon is really hard.
@gspaulsson · 1 year ago
When Deep Blue beat Garry Kasparov, some wag said: "Sure, but how did it do in the post-game interview?" Probably wouldn't be hard to train a neural network to give trite answers to trite questions, with a few quips thrown in. "Mr. Deep. Can I call you Deep, or do you prefer Blue?" "Whichever you like." "OK, Deep, how do you think Mr. Kasparov played?" "Pretty well - for a human." "Why didn't you take his pawn at move 35?" "It wins at depth 6, but loses at 16. Humans are so slow."
@jimurrata6785 · 1 year ago
And today we have Meta's chatbot dissing Zuck! 🤣
@JorgetePanete · 1 year ago
the bit-flip that caused that move really broke Kasparov
@brandonblue2994 · 1 year ago
Was wondering when you would cover this.
@randomhandle721 · 1 year ago
Great video. I enjoyed watching it.
@Y2Kmeltdown · 1 year ago
Great video, really interesting to see how other fields are tackling the issue of power consumption. From what I understand, it is not a fair comparison to make between conventional neural networks and the human brain. The human brain works on a completely different mode of computation, where data storage and computation are unified and signals are carried through spike potentials. Hopefully photonics can be applied to designing analogs of spiking neural networks.
@dmurphydrtc · 1 year ago
Excellent summary, thanks.
@jacoblara4175 · 1 year ago
I wonder how this compares to the analog circuits that are being used to run neural networks.
@MrJazzCigar · 1 year ago
You are producing some excellent content, never a dull video…thank you!
@JorgetePanete · 1 year ago
I saw that analog computing could convert to digital and back every few steps to recover accuracy, with some circuit tradeoffs
@theAadi47 · 1 year ago
Amazing and insightful video. I take solace in the fact that the brain is much more efficient, if not the best, at specialised tasks. Let's hope the photonic innovators are able to find product-market fit, and who knows, for efficiency reasons we just might be able to simulate quantum computers before we actually build quantum computing at scale!
@salma-amlas · 11 months ago
Woah this is blowing my mind! It's amazing, the things nature has provided for us humans. And the human scientific collaborative effort never ceases to impress me. Thank you for this video.
@chavita4321 · 2 years ago
love this video! cheers from California
@Rockyzach88 · 6 months ago
When I was getting my chemistry degree I noticed multiple labs working on materials for things like this. It's cool seeing it hit YouTube.
@lachlanperrier2851 · 1 year ago
This is one of my favourites, don't know why it doesn't have more views
@zane62135 · 1 year ago
Wow...this is incredible!
@hugoboyce9648 · 1 year ago
The caliber of this video was very impressive!
@Dr7-1 · 1 year ago
Welcome back on Asianometry. Thanks for your answer. I’m not so ready! First my family. See you soon. I hope! DV
@htomerif · 1 year ago
It would be interesting to know some numbers. So far as I can find out, Google's TPUs use a slightly off-standard 16-bit floating point format for all of their data; you don't need the high accuracy of a 32- or 64-bit float, at least for inference. If the silicon photonics ADC/DAC has an effective end-to-end precision of an 8-bit float, then the gap between them and what is useful for AI is very, very large. If it's equivalent to 12 bits, then it's not as much of a problem. The other thing that would be nice to know is how much process variation there is in individual interferometers. One nice thing about digital electronics is that you fabricate a chip, you test it at speed, and if it gives you the right digital answers, the chip is good. With analog electronics, you might have an interferometer with a reliable 32-bit-equivalent signal-to-noise ratio, but non-linearity and variation between interferometers on the same chip might push the effective precision way down into the single digits, especially with a light path passing through multiple optical elements; testing every possible light path may be functionally impossible. With digital electronics, all you have to know is that one device's output falls within certain bounds to know that you can chain together an unlimited number of them with no loss of accuracy. With analog electronics, chaining them together always compounds the error, whether it's the RMS error of the noise floor being added or the multiplicative error in the actual signal. Anyway, I don't expect answers to these questions, but I think the answers determine whether photonic computing will be a thing in the future.
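The compounding-error concern above can be illustrated with a toy model: additive per-stage RMS noise accumulated over k chained analog stages grows like sqrt(k). The per-stage sigma is an arbitrary made-up figure, and real devices also add the non-linearities the comment mentions:

```python
import numpy as np

rng = np.random.default_rng(7)
SIGMA = 0.01      # hypothetical per-stage RMS noise
TRIALS = 20_000

def chained_rms(stages: int) -> float:
    """RMS error after a unit signal passes through `stages` noisy stages."""
    total_noise = rng.normal(0.0, SIGMA, (TRIALS, stages)).sum(axis=1)
    return float(total_noise.std())

for k in (1, 4, 16):
    print(f"{k:2d} stages: RMS error ~ {chained_rms(k):.4f}")
# Grows roughly as sqrt(k): each sqrt(k) doubling of noise costs
# about half a bit of effective precision.
```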
@Andrew-rc3vh · 1 year ago
I think there is a bit of an error in this video. The MZI is a passive device which uses a half-silvered mirror to create interference patterns, so there is no voltage applied to the MZI itself. What I suspect you may be referring to is the Kerr effect, where the refractive index changes with applied voltage; used with an MZI, this is likely what gives you the desired properties.
@10-AMPM-01 · 1 year ago
12:35 - That's really clever.... Easy to get bogged down by the "right and wrong" ways to use tools.
@uirwi9142 · 1 year ago
The part about AlphaGo and how many TPUs were used: it's no wonder I can't find a way to build an AI for StarCraft on my PC at home. Ambitious, but just not gonna happen, it seems. Never mind that, this talk/video was spectacular and incredibly informative. Thank you.
@matttaylor2009 · 1 year ago
Excellent channel
@krimsonsun10 · 1 year ago
In high school in the early 2000s I saw an article on photonics research from MIT aimed at replacing the buses on motherboards. The idea was to reduce heat loss and latency. I wonder if this is an offshoot of that research?
@2black1white3blue · 1 year ago
This is very interesting. Thanks
@BB-nz9rp · 11 months ago
Hi there, I really appreciate your content. Just a side note: I believe you meant 20 or 1 'pico'joules/MAC and not peta, which would be about 278 GWh, roughly 19k households/year?
@DaT0nkee · 1 year ago
Parallel operation can be achieved by using different wavelengths of light on the same chip simultaneously.
@jannegrey593 · 1 year ago
OK, it seems like an old video but also just released. It will probably be fantastic.
@MrTonypace · 1 year ago
Lightmatter has a talk about doing this at wafer scale coming up in two weeks at Hot Chips. I hope you can tell us what they're up to! (And Ranovus).
@HexerPsy · 1 year ago
Do photonics require extremely low temperatures, as qubits currently do? Quantum bits pick up noise from temperature, so those chips work most reliably at very low temperatures, close to 0 K. You end up with a machine that's mostly a multi-stage cooler with a chip on the tip. Are photonics the same?
@user-hx1ku8sp8c · 1 month ago
Would be great to get an update once we have more info on the Chinese Taichi photonic chip. Is it hype or real?
@thegame4027
@thegame4027 1 year ago
Small detail, but electrons don't move through the chip/wires. They just wiggle around; the energy is transmitted over the electric field around the conductor, not by the electrons. Doesn't really matter as your point is still valid, just a technicality.
@signalworks
@signalworks 1 year ago
Electrons do flow in low-frequency conductors, especially for DC power. The energy is indeed in the fields; the motion of the charge carriers is described by the edge of the fields (current and magnitude).
@sairam4588
@sairam4588 57 minutes ago
Extremely informative to the point 👉☝️
@seditt5146
@seditt5146 1 year ago
I created a neural network that I trained to work as a binary ALU. Even better for this, I trained "cells" that act as logic gates, and I would love to see my data encoded into glass so that it could function as a full ALU in light.
@antiprime4665
@antiprime4665 1 year ago
What is the point of using a neural network as an ALU?
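The idea in the thread above can be made concrete. Below is a minimal sketch (not the commenter's actual network) of single neurons with hand-picked weights acting as AND/OR gates, plus a two-layer combination acting as XOR, the same kind of structure a trained "cell" would converge to:

```python
import numpy as np

def neuron(x, w, b):
    """A single 'cell': weighted sum followed by a hard threshold."""
    return int(np.dot(w, x) + b > 0)

def AND(a, b):  # fires only when both inputs are 1
    return neuron([a, b], w=[1, 1], b=-1.5)

def OR(a, b):   # fires when either input is 1
    return neuron([a, b], w=[1, 1], b=-0.5)

def XOR(a, b):  # not linearly separable, so it needs two layers:
    return neuron([OR(a, b), AND(a, b)], w=[1, -2], b=-0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", AND(a, b), OR(a, b), XOR(a, b))
```

Since thresholded weighted sums are exactly what a photonic matrix-vector multiply computes, weights like these could in principle be "frozen" into an optical mesh.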
@nanobrains
@nanobrains 1 year ago
Thanks!
@darthmoomoo
@darthmoomoo 1 year ago
5:03 Are you sure it's petajoules? That's the energy equivalent of about a quarter of a megaton of TNT.
@T3hderk87
@T3hderk87 1 year ago
Holy crud.... That is insane. This reminds me of the x64 jump, and I think it will be as, if not more, significant.
@jasonkocher3513
@jasonkocher3513 1 year ago
I'm far, far away from this area of study, but I am an EE nonetheless... could they replace the "thin film heater" with a Piezo element on each of those interferometers to slightly deform the one leg? This stuff is so cool.
@10-AMPM-01
@10-AMPM-01 1 year ago
8:23 - I'm not very surprised. I figured it could be done. That kind of manufacturing isn't in my wheelhouse. But, architecture is, haha.
@lionelcliff
@lionelcliff 1 year ago
In the challenges section what does John mean when he states that the photonic chips aren't used for training, but only for 'inferences' due to their lower accuracy? Great Presentation btw !
@quaidcarlobulloch9300
@quaidcarlobulloch9300 1 year ago
12:41 LET's GO, literally called it because rate coding is how our neurons are organized!
@cubertmiso
@cubertmiso 1 year ago
Would you consider investing in any of the companies making some of the "axes" for the photonic/silicon era?
@NEWDAWNrealizingself
@NEWDAWNrealizingself 2 months ago
THANKS !
@youngmonk3801
@youngmonk3801 1 year ago
Are these light matrices forming AND, NOR, OR, XOR gates, etc? Or is this a different type of computing that isn't "Turing style" ? in other words, are neural networks different from these logic gates?
@profdc9501
@profdc9501 1 year ago
A small note, a petajoule is the amount of energy unleashed by a 250 kt nuclear bomb. You probably mean femtojoule. :) The structure of feedforward deep neural networks is unfortunately very sensitive to computation error which is why typically these often employ at least 32-bit floating point arithmetic. Backpropagation of these networks to update weights through many layers can result in cumulative error which limits model performance. For optical scaling operations, there are additional error sources due to quantum detection fluctuations, flaws in the optical system that cause scattering and coherent noise, sampling and quantization error, not to mention power consumption from electro-optical interfaces that can be quite substantial. There may be neural networks for which optical scaling operations are suitable, however, the conventional feedforward deep neural network, because of its reliance on precision matrix multiplication operations so that backpropagation can be performed using the adjoint operation, is going to be quite challenging. There are plenty of ideas and simulations floating around for this but very little in the way of actually attacking the real issues surrounding optical neural network implementations, just mostly hype.
@taktoa1
@taktoa1 1 year ago
I don't think anyone is interested in training on photonic accelerators, it's all inference. Quantization is very commonly employed to make inference cheaper, which results in errors similar to photonic accelerators, though smaller in magnitude (IIRC current photonic accelerator designs get 2-4 bits of precision, classical inference accelerators are typically in the 8-16 bit range). So I think most of what you're saying here is a non sequitur with regard to the published research.
@profdc9501
@profdc9501 1 year ago
@@taktoa1 Run something as simple as MNIST on an optical accelerator and get 99% accuracy and then we'll talk. The key with digital quantized neural networks is that despite the fact they're quantized they are also deterministic, that is, given an input, the output is the same each time, as there is no measurement noise. Therefore if you train with quantization error, the network can learn that error. However, analog physical systems have measurement error. It's not just that the optical system achieves the "equivalent" of 2-4 bits of precision, its that no matter how many average photons are used to represent a signal, there are going to be measurement outliers. Due to the nonlinear operations of ReLu and Maxpool, outliers due to measurement error can accumulate in deep neural network layers. So it seems to me that having many deep layers and nonlinear operations like ReLu and Maxpool make it extremely difficult for an analog multiplier, especially one susceptible to quantum noise, is going to produce reproducible, reliable inference. Because of the extreme sensitivity of feedforward neural networks to cumulative error, if training is performed digitally for inference that is to occur on an analog/optical computer, the training model must be extremely accurate, including effects of quantization, noise sources including Poisson, thermal, coherent noise, system manufacturing error, etc., and even then the variation due to measurement error may limit the ultimate inference accuracy. It may be required to train a neural network for each physical system because the manufacturing tolerances of two different optical chips may be too different for a network trained on one chip to work on another chip. Biological neural networks seem to work quite effectively without being deterministic despite the fact these are implemented on analog computer wetware. 
Deep feedforward neural networks seem like a poor fit for analog computing, especially quantum noise limited computing for which the power consumption is directly influenced by the number of photons required to achieve a certain SNR due to Poisson noise (SNR being proportional directly to the square root of power, and so SNR increasing only slowly with increased power consumption). Even other solutions that use electric charge (mythic.ai/) with similar electric charge quantization problems are limited in the number of layers that can be implemented. The whole reason why feedforward deep neural networks were created in the first place is because backpropagation is possible using a bit of clever calculus and the chain rule. Training is the problem, because if you don't have any other kind of neural network you can effectively train that is resistant to measurement error, analog computation is not going to be a viable solution for neural network inference. Neural network accelerators like the Tensor processor have sucked all of the air out of the room for research into any other kind of neural network architecture, and as long as this is the case, the market will not care about analog computers because the current feedforward deep neural networks were created for deterministic, digital machines.
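The shot-noise scaling described in the thread above (SNR growing only with the square root of the photon count) is easy to see numerically. A toy sketch, assuming an ideal Poisson-limited detector; the photon budgets are illustrative, not figures from any real accelerator:

```python
import numpy as np

rng = np.random.default_rng(0)

def analog_multiply(x, w, photons, trials=10_000):
    """Encode the product x*w as a mean photon count, then read it back
    through a Poisson-limited (shot-noise) detector."""
    mean_count = x * w * photons
    detected = rng.poisson(mean_count, size=trials)
    return detected / photons  # decode back to product units

for photons in (100, 10_000, 1_000_000):
    est = analog_multiply(0.5, 0.5, photons)
    snr = est.mean() / est.std()
    # SNR ~ sqrt(x*w*photons): 100x more optical power buys only ~10x SNR,
    # i.e. roughly 3.3 extra bits of effective precision per 100x power.
    print(f"{photons:>9} photons: SNR ~ {snr:7.1f} (~{np.log2(snr):.1f} bits)")
```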
@TaeruAlethea
@TaeruAlethea 1 year ago
What would be pretty wild would be using both time offsetting and wavelength multiplexing to increase throughput. If I understand it, it would be like light based hyperthreading, except you could do 3, 4, or more threads all independently. I guess it would just rely on how passive the structures would actually be.
@markkyn7851
@markkyn7851 1 year ago
I worked on this datacom side for photonic switches. The thing about wavelength multiplexing (WDM) when used with MZIs is that crosstalk can be a killer, depending on the MZIs used. Depending on the interconnect topology used for the MZI mesh, crosstalk can cascade through the MZI mesh ultimately increasing the "noise" level beyond practicality. This also inhibits scaling these meshes out, as you could imagine!
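A back-of-the-envelope sketch of why cascaded crosstalk limits mesh depth, as the comment above describes. The per-stage leakage numbers are illustrative, not measurements from real devices:

```python
import math

# If each MZI stage leaks a fraction eps of the signal onto the wrong path,
# only (1 - eps)**depth of it survives on the intended path through the mesh.
for eps_db in (-30, -20, -10):            # per-stage crosstalk
    eps = 10 ** (eps_db / 10)
    for depth in (8, 64, 256):
        remaining = (1 - eps) ** depth
        print(f"crosstalk {eps_db:>4} dB, depth {depth:>3}: "
              f"{10 * math.log10(remaining):8.2f} dB of signal left")
```

Even modest per-stage leakage compounds quickly, which is why deep meshes demand extremely clean individual MZIs.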
@amaz_ng
@amaz_ng 1 year ago
Can you do a report on Rigetti computing?
@googacct
@googacct 1 year ago
One thing that seems to be overlooked in the video is the use of a nonlinear activation function as part of the computation. I do not think matrix multiplies all by themselves give the desired effect.
@raphaelcardoso7927
@raphaelcardoso7927 1 year ago
Usually the nonlinearity is achieved outside of the photonic part :/
@TheRiskyBrothers
@TheRiskyBrothers 1 year ago
This is some real Metamorphosis of Prime Intellect shit. Also, this channel is great, keep it up 👍
@rohanofelvenpower5566
@rohanofelvenpower5566 1 year ago
Cloud Tensor Processing Units (TPUs) are Google's custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads. TPUs are designed from the ground up with the benefit of Google's deep experience and leadership in machine learning.
@TymexComputing
@TymexComputing 1 year ago
Nice episode :)
@AdityaChaudhary-oo7pr
@AdityaChaudhary-oo7pr 1 year ago
Awesome video
@user-it5bm2gf9p
@user-it5bm2gf9p 1 year ago
Very nice video. 10/10
@MoritzvonSchweinitz
@MoritzvonSchweinitz 1 year ago
I have no idea how any of this works, but maybe something funky with multiplexing could happen in this sector, similar to how DWDM reuses fibre optics and multiplies their bandwidth?
@BRUXXUS
@BRUXXUS 1 year ago
I see this being a much more viable path to future computing than quantum computers. Even if the chips are substantially bigger, they'll use far less energy and won't require cooling in the same way as traditional transistors. I think it's really exciting and I hope to see this continue to grow and advance!
@thomaspluck1515
@thomaspluck1515 1 year ago
Check out Xanadu Photonics; squeezed-state photons make quantum computing possible in photonics as well, although the photodetectors have to be cooled to cryogenic (liquid-helium) temperatures.
@superpie0000
@superpie0000 1 year ago
With what you said at about 10:10 about analog not having the accuracy: could I theoretically use multiple streams of analog to convey greater resolution? If I want more accuracy, I could have a 1s channel and a 1/2 channel for 2x the accuracy, the way binary (or any other number system) has place values. I'd imagine that if the error is on the reading side, and not in the light-multiplication part, this could work (I don't really understand light; spooky stuff, like magnets and electrons). However, in an op-amp implementation, I'd imagine the lower places would leach in more noise, as they carry heavier weight (this could be reduced by differential noise reduction, but EMF is unbeatable). Another solution would be to fill up the smaller place with value, then, as the first place fills, use the second as an extension, letting you stack the channels into one massive in-depth channel made up of many streams of light, or whatever the medium is. I don't know how any of this works whatsoever, but I would love to know if this method is of any use for improving accuracy at the expense of complexity.
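The place-value idea in the comment above can be tested numerically. A toy sketch, assuming each analog channel carries a value in [0, 1) with a fixed Gaussian read noise; the 0.02 noise figure and the 4-step split are made-up numbers for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
NOISE = 0.02          # per-channel read-noise std dev (assumed)
STEPS = 4             # coarse channel resolves 4 levels

def read_channel(value):
    """One noisy analog channel."""
    return value + rng.normal(0.0, NOISE)

def one_channel(x):
    return read_channel(x)

def two_channel(x):
    # Place-value split: a coarse channel holds x quantized to 1/STEPS,
    # a fine channel holds the remainder rescaled to [0, 1).
    coarse = np.floor(x * STEPS) / STEPS
    fine = (x - coarse) * STEPS
    # The noisy coarse reading can be snapped back to the nearest step,
    # since the steps are far apart relative to the noise; the fine channel
    # then contributes with 1/STEPS of the weight (and 1/STEPS of the noise).
    coarse_read = np.round(read_channel(coarse) * STEPS) / STEPS
    return coarse_read + read_channel(fine) / STEPS

xs = rng.uniform(0, 1, 5000)
err1 = np.std([one_channel(x) - x for x in xs])
err2 = np.std([two_channel(x) - x for x in xs])
print(f"one channel error ~ {err1:.4f}, two channels ~ {err2:.4f}")
```

Under these assumptions the two-channel scheme shrinks the error by roughly the coarse step count, at the cost of double the channels and the readout hardware, which matches the comment's accuracy-for-complexity trade.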
@nexusyang4832
@nexusyang4832 1 year ago
Thursdays I fry my brain with First We Feast in the morning and then educate myself at night with Asianometry.
@arjungoalset8442
@arjungoalset8442 11 days ago
Do you have a link to the Cornell paper?
@runforitman
@runforitman 1 year ago
6:19 voltage potential is also transmitted at the speed of light
@kasuha
@kasuha 1 year ago
There's some inconsistency in the argument. At the start you note that most energy is lost on data transfers, yet these are untouched by the photonics; they tackle the multiplication instead. And I can't help but notice that an important part of the photonic circuit is a heater, presumably to adjust the length of one of the paths and thereby the interference. So while there seems to be an obvious advantage in the speed of the multiplication itself, it's not clear how much energy, if any, it saves.