AI That Doesn't Try Too Hard - Maximizers and Satisficers

203,317 views

Robert Miles AI Safety


Powerful AI systems can be dangerous in part because they pursue their goals as strongly as they can. Perhaps it would be safer to have systems that don't aim for perfection, and stop at 'good enough'. How could we build something like that?
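The distinction the video draws can be sketched in a few lines of toy code. This is purely illustrative (the plans, numbers, and function names are made up, not from the video): a maximizer always picks the plan with the highest expected stamps, however extreme, while a satisficer accepts any plan expected to clear a threshold.

```python
# Toy model: expected number of stamps for each candidate plan.
# All plan names and numbers are invented for illustration.
plans = {
    "order 100 stamps online": 99.9,
    "order 100 stamps twice": 199.5,
    "take over the world's mail system": 10**9,
}

def maximizer(plans):
    # Always picks the most extreme plan, however disruptive.
    return max(plans, key=plans.get)

def satisficer(plans, threshold=100):
    # Accepts the first plan expected to clear the threshold ("good enough").
    for plan, expected in plans.items():
        if expected >= threshold:
            return plan

print(maximizer(plans))   # the extreme plan wins
print(satisficer(plans))  # a merely "good enough" plan can win
```

As the video goes on to argue, the satisficer's behaviour depends heavily on which acceptable plan it happens to consider first, which is part of the problem.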
Generating Fake YouTube comments with GPT-2: • Generating Fake YouTub...
Computerphile Videos:
Unicorn AI: • Unicorn AI - Computerp...
More GPT-2, the 'writer' of Unicorn AI: • More GPT-2, the 'write...
AI Language Models & Transformers: • AI Language Models & T...
GPT-2: Why Didn't They Release It?: • GPT-2: Why Didn't They...
The Deadly Truth of General AI?: • Deadly Truth of Genera...
With thanks to my excellent Patreon supporters:
/ robertskmiles
Scott Worley
Jordan Medina
Simon Strandgaard
JJ Hepboin
Lupuleasa Ionuț
Pedro A Ortega
Said Polat
Chris Canal
Nicholas Kees Dupuis
Jake Ehrlich
Mark Hechim
Kellen lask
Francisco Tolmasky
Michael Andregg
Alexandru Dobre
David Reid
Robert Daniel Pickard
Peter Rolf
Chad Jones
Truthdoc
James
Richárd Nagyfi
Jason Hise
Phil Moyer
Shevis Johnson
Alec Johnson
Clemens Arbesser
Ludwig Schubert
Bryce Daifuku
Allen Faure
Eric James
Jonatan R
Ingvi Gautsson
Michael Greve
Julius Brash
Tom O'Connor
Erik de Bruijn
Robin Green
Laura Olds
Jon Halliday
Paul Hobbs
Jeroen De Dauw
Tim Neilson
Eric Scammell
Igor Keller
Ben Glanton
Robert Sokolowski
anul kumar sinha
Jérôme Frossard
Sean Gibat
Cooper Lawton
Tyler Herrmann
Tomas Sayder
Ian Munro
Jérôme Beaulieu
Taras Bobrovytsky
Anne Buit
Tom Murphy
Vaskó Richárd
Sebastian Birjoveanu
Gladamas
Sylvain Chevalier
DGJono
Dmitri Afanasjev
Brian Sandberg
Marcel Ward
Andrew Weir
Ben Archer
Scott McCarthy
Kabs
Miłosz Wierzbicki
Tendayi Mawushe
Jannik Olbrich
Anne Kohlbrenner
Jussi Männistö
Mr Fantastic
Wr4thon
Martin Ottosen
Archy de Berker
Marc Pauly
Joshua Pratt
Andy Kobre
Brian Gillespie
Martin Wind
Peggy Youell
Poker Chen
Kees
Darko Sperac
Truls
Paul Moffat
Anders Öhrt
Marco Tiraboschi
Michael Kuhinica
Fraser Cain
Robin Scharf
Oren Milman
John Rees
Seth Brothwell
Clark Mitchell
Kasper Schnack
Michael Hunter
Klemen Slavic
Patrick Henderson
Long Nguyen
Melisa Kostrzewski
Hendrik
Daniel Munter
Graham Henry
Volotat
Duncan Orr
Marin Aldimirov
Bryan Egan
James Fowkes
Frame Problems
Alan Bandurka
Benjamin Hull
Tatiana Ponomareva
Aleksi Maunu
Michael Bates
Simon Pilkington
Dion Gerald Bridger
Steven Cope
Marcos Alfredo Núñez
Petr Smital
Daniel Kokotajlo
Fionn
Yuchong Li
Nathan Fish
Diagon
Parker Lund
Russell schoen
Andreas Blomqvist
Bertalan Bodor
David Morgan
Ben Schultz
Zannheim
Daniel Eickhardt
lyon549
HD
/ robertskmiles

Comments: 1,200
@mihalisboulasikis5911 4 years ago
"Intuitively the issue is that utility maximizers have precisely zero chill". Best intuitive explanation on the subject ever.
@tonicblue 4 years ago
I think this quotation is precisely why I love this guy.
@mihalisboulasikis5911 4 years ago
@@tonicblue Exactly. These types of explanations (which are not "formal" but do a much better job at conveying a point - especially to non-experts - than formal explanations) make you realize that not only is he a brilliant scientist, but he also has intuition and experience on the subject, which in my opinion is also extremely important. And of course, the humor is on point, as always!
@tonicblue 4 years ago
@@mihalisboulasikis5911 Couldn't agree more
@Gooberpatrol66 4 years ago
So if I have zero chill, does that make me hyperintelligent?
@NortheastGamer 4 years ago
@@Gooberpatrol66 Maximizers aren't necessarily intelligent; they just treat everything like it's life or death. (Which is actually how we train most maximizers: by killing off the weak.)
@unvergebeneid 4 years ago
"Any world where humans are alive and happy is a world that could have more stamps in it." 😂 😂 😂 I need that on a t-shirt!
@diphyllum8180 4 years ago
But if they're unhappy, you made too many stamps.
@MouseGoat 4 years ago
@@diphyllum8180 The robot begins to inject dopamine into humans to ensure they're always happy XD
@logangraham2956 4 years ago
idk, sounds like something graystillplays would say XD
@ioncasu9825 2 years ago
Killing all humans to make stamps is a bad strategy, because after that you don't get more stamps.
@tatianatub 4 years ago
"utility maximizers have precisely zero chill" needs to be on a t-shirt
@SlimThrull 4 years ago
Yes. Yes, it does.
@Gunth0r 4 years ago
I would buy Robert Miles merch.
@xcvsdxvsx 4 years ago
@@Gunth0r This channel would have the best merch ever.
@nibblrrr7124 4 years ago
Well, what if you're a maximizer that values "chill" (amongst other things, or exclusively)? :^)
@josephburchanowski4636 4 years ago
@@nibblrrr7124 Intuitively, the issue will be that utility maximizers have precisely zero chill when it comes to maximizing chill. Also, how do you code chill?
@armorsmith43 4 years ago
"So satisficers will want to become maximizers" - and this is one reason that studying AI safety is interesting: it prompts observations that also apply to organizations made of humans.
@PragmaticAntithesis 4 years ago
The unintended social commentary about capitalism is real...
@killers31337 4 years ago
Well, AI is simply a kind of agent making decisions, so all the theory about such agents still applies. Take the perverse-incentive problem: if you pay people for rat tails hoping they will catch wild rats, they might end up farming rats. This is a 'maximizer' problem which actually happened IRL.
@PragmaticAntithesis 4 years ago
@@killers31337 I thought that was a culling of stray cats, not rats?
@ivandiaz5791 4 years ago
@@PragmaticAntithesis It has happened many times in many different places for all sorts of animal problems. The most famous case generally was snakes in India under British rule... specifically cobras, which is why this is often called the Cobra Effect. See the Wikipedia article.
@bp56789 4 years ago
You think humans don't seek to maximise their own utility if they aren't in a "capitalist" system?
@miapuffia 4 years ago
A satisficer AI may want to use a maximizer AI, as that will lead to a high probability of success, even without knowing how the maximizer works. That made me think that humans are satisficers and we're using AI as maximizers, in a similar way.
@ciherrera 4 years ago
Yup, but unfortunately (or maybe fortunately) we don't have a convenient way to reach into our source code and turn ourselves into maximizers, so we have to create one from scratch.
@AugustusBohn0 4 years ago
@@ciherrera Inducing certain mental conditions would accomplish this as well as can be expected for biological creatures.
@johnwilford3020 4 years ago
This is deep
@JM-mh1pp 3 years ago
@@ciherrera I do not want to be a maximizer; it goes against my goal of chilling.
@randomnobody660 3 years ago
@@JM-mh1pp But do you get MAXIMAL CHILLING!?
@superjugy 4 years ago
Hahahaha, flower smelling champion. I had already seen that comic, but it's so much funnier in this context XD Thanks for the great videos.
@MouseGoat 4 years ago
Sooo we really do want to program laziness into our robots :D lmao
@theshaggiest303 4 years ago
"Not trying too hard"? Move over, dude, I happen to be an expert in this field. Just program the AI to take a break after every five minutes of work to watch YouTube videos for an hour and a half. Problem solved.
@thesteaksaignant 4 years ago
5 min later... Breaking news! All YouTube servers worldwide are down! Largest DDoS attack ever!
@k_tess 4 years ago
@@thesteaksaignant Now, now, this only happens if you multi-thread.
@thesteaksaignant 4 years ago
@@k_tess Let's cross our fingers hoping that a superintelligence capable of conquering the world won't figure out multithreading, then.
@Hakou4Life 4 years ago
I think it is enough to let it watch YouTube...
@martinsmouter9321 4 years ago
@@thesteaksaignant DDoSing YouTube keeps it from watching said videos and, so, from getting perfect utility.
@NancyLebovitz 4 years ago
For anyone who missed it, the closing music is "Dayenu", a Hebrew song with a refrain of "it would have been enough". It's a nice choice.
@AndreRhineDavis 3 years ago
I noticed this; it was really clever.
@Verrisin 4 years ago
I just realized... If you make it (say AI-1) want to chill (not work too hard to achieve its goal)... it will just make something else (another AI) do the work for it, if that's easier than solving it on its own... right? Then what it creates is probably a maximizer (because that is the easiest, and it is lazy and just wants to chill). Then I realized..... *We, humans, are the AI-1* ... O.O - We are doomed...
@buttonasas 4 years ago
Amazing observation! But hey, maybe we can build something that is just ever so slightly less lazy? Then maybe it can make another, less lazy machine... But yeah, chances are that might suddenly jump to building a maximizer, and that's the end :D
@shadiester 4 years ago
Holy crap, that's actually so true!
@jjkthebest 4 years ago
Unless that AI cares about self-preservation. Normally this would naturally arise from being a utility maximiser, though I'm not sure it would still be the case for the AI that wants to chill, since it can be confident that the maximiser it creates will do the job just fine... hmm.
@Roonasaur 4 years ago
No. Utility =/= Work. If an AI is successfully programmed to not want infinity stamps, it will not do anything to create infinite stamps. It will only willingly create subordinates that also want less than infinity stamps, and it will put in a lot of work to act against any subordinate "maximizer" that would create infinity stamps. When Guy-who-needs-a-haircut says he wants AI to "chill"... what he's really wanting is for it to look for "balance." And, expert I am not, but that doesn't seem like an impossible thing to code.
@Verrisin 4 years ago
@@Roonasaur But that is not what it wants. It wants "at least N" - and infinity is a good way to ensure it will get at least that much. It has nothing against an infinite number of stamps. - But I am already thinking about why this isn't as bad as I feared originally. In particular, I think it's not necessary (or even that likely) for a satisficer to become a maximizer. The rest of my 'argument' seems sound to me, but this part just does not _feel_ right... I haven't had time to think about it properly, but I think there is something there... What he really wants does not matter - only the utility function he can specify for the AI.
@SapkaliAkif 4 years ago
2:57 "You can't perfectly simulate a universe from the inside." is a good motto to have if you don't want to overthink stuff. Science is cool.
@orangeninjaMR 4 years ago
This is actually false. It depends entirely on the complexity of the system relative to its size: a large but simple system can have its information "compressed" into a replica within itself, and indeed the fact that real-world physics is at all effective is a result of the fact that some (if not all) of the systems in our universe are compressible in this way. A fun example in the very simple universe of Conway's Game of Life: kzfaq.info/get/bejne/rrZlYMx6yrG8dWw.html
@SapkaliAkif 4 years ago
@@orangeninjaMR I am no expert, but this seems to ignore something. You can get results this way - if you are looking for results - but you cannot perfectly simulate and observe all the details. So is it really a perfect simulation, or is it just a miniature version that gives you the info that you want?
@orangeninjaMR 4 years ago
@@SapkaliAkif You ask for a perfect simulation, which I would take to mean a "copy containing all of the same information", which demands nothing about observation... but on the other hand, if all an AI wants is to predict the utility of the outcome, it doesn't need to be able to observe all of the details - just the number of stamps that it results in!
@SapkaliAkif 4 years ago
@@orangeninjaMR Oh, I forgot we were in the comments of an AI video.
@CircuitrinosOfficial 4 years ago
@@orangeninjaMR Doesn't the halting problem disprove the ability to perfectly simulate a universe from the inside? For the simulation to perfectly simulate the universe, it also needs to include itself in the simulation, because it is a part of the universe. Because of this, it is possible to have situations where the act of the simulator printing out its answer changes the result of the simulation. For example: let's say you ask the simulator if your friend is going to invite you to their party. If the simulator says yes, you start acting differently towards your friend and end up annoying them, so they decide not to invite you to the party after all - the simulator was wrong. If the simulator says no, you act normal, so your friend does invite you to the party - the simulator was wrong again. In this situation, the only way for the simulator to accurately simulate the situation is to not tell you the answer. But if you designed the simulator to always print out an answer, then it can never correctly simulate this situation.
@Cobra6x6 4 years ago
Have you guys played the game Universal Paperclips? It's free, and basically you play as the Stamp Collector AI - you're maximizing the number of clips. I kinda loved it, to be honest.
@Trophonix 4 years ago
I also thought of this while watching! Make everything paperclips!!!
@zac9311 4 years ago
That sounds awesome. Is it good?
@Trophonix 4 years ago
@@zac9311 It's an incremental/clicker game with multiple stages of progression. Google it!
@klobiforpresident2254 4 years ago
So what you're saying is that if I want stamps I must invent and subsequently RELEASE THE HYPNO DRONES?
@maoman4855 4 years ago
@@Trophonix i.e. it's Cookie Clicker but with paperclips instead of cookies
@NightmareCrab 4 years ago
"Can you relax, mister maniacal, soulless, non-living, breathless, pulseless, non-human all-seeing AI, sir? Just chill, don't be such a robot."
@baranxlr 4 years ago
"SHUT UP AND RETURN TO THE STAMP MINES, MEATBAG"
@herp_derpingson 4 years ago
Historically speaking, several humans have brought about apocalypses while trying to maximize something.
@qwertyTRiG 4 years ago
Thomas Midgley Jr, for example.
@SamuelKristopher 4 years ago
We're doing it right now on several fronts.
@Nosirrbro 4 years ago
@@qwertyTRiG Well, that and his pope infestation
@DiThi 4 years ago
I was thinking exactly that: analyzing corporations as if they were AI agents, they're literally doing everything described in this channel. It's not that corporations are bad. The system itself (capitalism) creates agents that modify their own source code (laws) to maximize capital accumulation.
@jordanrodrigues1279 4 years ago
@@DiThi I'm really starting to believe that AI safety research is the most mathematically rigorous critique of utilitarianism and capitalism to date. I think I'm okay with that.
@WhiteThunder121 4 years ago
7:53: "- Control human infrastructure - ??? - STAMPS" lol
@davidwuhrer6704 4 years ago
Replace stamps with money, and watch the world burn.
@revimfadli4666 4 years ago
@@davidwuhrer6704 Especially if it adapts to any new currency made to solve the problem.
@MrBrew0 4 years ago
Hello Robert! Let me start by saying your channel is probably my favorite channel on YouTube. I'm a compsci student and AI enthusiast, and your insight and explanations in the field of AI are really entertaining and educational. Many other channels try to present the information in a condensed and easy-to-digest way, which is fine, but I would really like to see more advanced content on YT. Maybe you have a recommendation for me? I was wondering - you don't upload videos very frequently. I really appreciate your work and would be very happy to see more content from you, but if it is because you are busy or want to provide quality over quantity, I'm all for it too!
@qzbnyv 4 years ago
Reminds me a lot of asymmetric call-option payoffs from finance. And a lot of near-bankruptcy decision making for corporations.
@bejoscha 4 years ago
This is one of the better videos (of all your good ones). I like it very much. The speed is well adjusted (a tiny bit slower than usual), and the explanations are concise and good. Just a good watch. I'm definitely looking out for the next... Thanks for breaking down such complex topics into digestible chunks for (near-)leisure watching. I feel this is the kind of "solid" common-sense understanding of AI that future generations will need to have, even if being an expert in the field is out of reach. A more complicated life? Yes, but that's just as it is. People 500 years ago could do with a lot less "every-day complexity" than today as well...
@AsteriosChardalias 4 years ago
The content and the comments on this channel always get me reflecting on the 'human condition' and how much trying to build AIs teaches us about understanding ourselves.
@Elyandarin 4 years ago
My impression about AI is that you can only ever maximize one utility function, but you can satisfice as much as you want, as long as you are OK with the failure state of [doing nothing]. So, you satisfice for "at least 100 stamps expected in the optimal case", satisfice for "at least 95% chance of the optimal case", satisfice further for "zero human casualties" and "with 99.9% certainty", let the planning engine spin for an hour or until 100 plans have passed muster, then maximize over acceptable plans according to something like "simplicity of plan", "positive-sum outcomes" or "similarity to recorded human interactions". ...Well, there's probably a lot that could go wrong with that even so, and I'd probably add some more complex safety measures after considering everything that could go wrong for a couple of months, but that's what I'd start with, were I to program an AI.
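The commenter's "satisfice on many constraints, then maximize simplicity" proposal can be sketched as a filter-then-tie-break over candidate plans. This is my own toy encoding of the idea, not a real planner; all plan names, fields, and numbers are invented:

```python
# Candidate plans with invented attributes. In the commenter's scheme, every
# constraint is a satisficing filter; only the final tie-break is a maximization
# (here: minimize complexity, a stand-in for "simplicity of plan").
candidates = [
    {"name": "order online", "stamps": 100, "p_success": 0.97,
     "casualties": 0, "complexity": 1},
    {"name": "build stamp factory", "stamps": 10**6, "p_success": 0.999,
     "casualties": 0, "complexity": 9},
    {"name": "convert biosphere", "stamps": 10**15, "p_success": 0.9999,
     "casualties": 10**9, "complexity": 10},
]

# Keep only plans satisfying every constraint.
acceptable = [
    c for c in candidates
    if c["stamps"] >= 100 and c["p_success"] >= 0.95 and c["casualties"] == 0
]

# Tie-break by simplicity instead of by number of stamps.
chosen = min(acceptable, key=lambda c: c["complexity"])
print(chosen["name"])
```

Note this inherits the problems discussed elsewhere in the thread: everything hinges on the constraints being specified correctly and on the candidate set being generated safely in the first place.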
@khananiel-joshuashimunov4561 4 years ago
Sounds like you need a cost function that outgrows the utility function at some point, as a sort of sanity check.
@NineSun001 4 years ago
With a human hurt being really costly and a human killed having maximum cost. That would actually solve a lot of the issues. I am sure some clever mind in the field has already thought about that.
@nibblrrr7124 4 years ago
Cost is already considered in the utility function.
@nibblrrr7124 4 years ago
@@NineSun001 You're basically restating Asimov's (fictional) First Law, and the problems with it have been explored in (adaptations of) his works, and of course by AI researchers. Consider that, even if you could define terms like "hurt" or "kill", humans get hurt or die all the time if left to their own devices, so e.g. putting all of them in a coma with perpetual life extension will reduce the expected number of human injuries & deaths. So if an agent with your proposed values is capable enough to pull it off, it will prefer that to any course of action we would consider desirable.
@khananiel-joshuashimunov4561 4 years ago
@@nibblrrr7124 In the video, the utility function is explicitly the number of stamps.
@foundleroy2052 4 years ago
The costs are Aproegmena, and the Agent may safely reprogram itself to be indifferent to Adiaphora, to achieve Eudaimonia. Marcus AIrelius
@lambdaprog 4 years ago
Add one or more smooth penalty terms to your utility. By smooth, I mean that the penalty is a continuous monotonic function of the distance to the safe region, equal to zero inside the safe region. The penalty terms can be designed to sanction over-optimization (optimizations with little *expected return*) or instability (apocalypse). This is a common technique in non-smooth bounded optimization in capital-markets portfolio management, where the individual investment per asset within the portfolio is bounded to avoid increasing the portfolio's exposure to market risks. I have also found similar applications in digital signal processing, with adaptive filters that rely on intrinsically bad forecasts (poor statistics) due to latency constraints (time is the actual resource), the available dynamic range of the processing (analog and/or digital), and the power consumption (thermal stability). Looking forward to your next video!
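A minimal sketch of the smooth-penalty idea described above, under my own illustrative assumptions: the penalty is zero inside a "safe region" (here, 0 to 150 stamps) and grows continuously (quadratically) with distance outside it. The region bounds, quadratic form, and weight are arbitrary choices, not from the comment:

```python
def penalty(stamps, low=0, high=150, weight=0.5):
    """Smooth penalty: zero inside [low, high], quadratic in the
    distance to the safe region outside it."""
    if stamps < low:
        return weight * (low - stamps) ** 2
    if stamps > high:
        return weight * (stamps - high) ** 2
    return 0.0  # inside the safe region: no penalty

def penalized_utility(stamps):
    # Bounded base utility (capped at 100 stamps) minus the penalty term,
    # so over-optimized outcomes score worse than modest ones.
    return min(stamps, 100) - penalty(stamps)

print(penalized_utility(100))    # safe outcome
print(penalized_utility(10**6))  # over-optimized outcome, heavily penalized
```

As the video's framing suggests, this only helps if the penalty is expressed over outcomes the agent can't route around; the comment's point is about the shape of the term, not that this alone is safe.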
@dlwatib 4 years ago
Actually, we usually have a pretty good idea what the safe region is, and if not, we can run the AI in shadow mode to see what it says it would do if set free to do as it pleases.
@za012345678998765432 4 years ago
What if you limit both utility and confidence in the expected-utility approach? For example, more than a hundred stamps don't add utility, and more than 99% confidence that it has achieved its goal isn't worth more utility. It would probably also fail spectacularly, but it would be interesting to see how.
@underrated1524 4 years ago
"Hmmm. My utility function treats all percentages higher than 99% as exactly 99% for the purpose of expected value. So my original plan, which has a 99.9999% chance of getting 100 stamps, isn't gonna cut it, because it leaves almost 1% of the possibility space unused. Ooh, ooh, I got it! I'll give myself a 99% chance to have 100 stamps and a 0.9999% chance to have 99 stamps! Genius!"
@serversurfer6169 4 years ago
I was thinking something similar. If it has a 99% chance to satisfy the goal, why doesn't it see how that goes before it starts considering supplemental or compensatory strategies? 🤔
@Aconspiracyofravens1 a year ago
A better option would be for it to round percentages, or to treat options with a less than 5% difference in their likelihood of succeeding as equal. In addition, the base model still works, since working against humans has a chance of failure: an outcome with 99% certainty is better than one with 99.99999999% likelihood that has a 2% chance of getting spotted by an investigation algorithm and shut down.
@gabrote42 2 years ago
7:53 This is one of the best missing-steps plans I have ever seen.
@ViridianIsland 4 years ago
Just found your channel, about to start the binge! Thanks for the content!
@CircusBamse 4 years ago
I absolutely love your outro. I dunno how many people don't recognize your parody of a "chroma key test" xD
@ZarHakkar 4 years ago
Issues like these in practical AI design often make me think of the Great Filter and the likely possibility that we're not quite past it yet.
@TiagoTiagoT 4 years ago
But then, where are all the alien robots?
@TiagoTiagoT 4 years ago
@@bosstowndynamics5488 But for all the alien robots in the whole galaxy?
@TiagoTiagoT 4 years ago
@@bosstowndynamics5488 But why did all the alien robots of all the zillions of planets in the Milky Way get the same restriction in their programming?
@grimjowjaggerjak 4 years ago
@@TiagoTiagoT Imagine in 150 years humans stumble onto random stamp planets.
@underrated1524 4 years ago
The issue is that if ASI is the Great Filter, we immediately run into the same problem all over again: why haven't we yet stumbled across the paperclip maximizer that once was an alien civilization? (Not that I'm complaining, mind you... :) )
@mydickissmallbut9716 4 years ago
Maybe you could add a "have a minimum impact on the state of the environment" (or something similar) requirement.
@circuit10 2 years ago
There was a video on that; there are a few reasons why it doesn't work. I'll find it.
@circuit10 2 years ago
kzfaq.info/get/bejne/otd6iKyiv7TegGw.html
@elfpi55-bigB0O85 4 years ago
You're absolutely awesome, Miles. Thank you for blessing us with your high-quality content.
@nraynaud 4 years ago
It just occurred to me that Uber killed a pedestrian by trying to maximise the average number of miles between system disconnections.
@Abdega 4 years ago
This… is news to me
@leninalopez2912 4 years ago
Hello Miles: I've been meaning for a while to ask/suggest that you make a video showing us publications regarding AI - journals, proceedings, or textbooks - for those of us either completely ignorant of the subject, barely initiated in it, or already knowing the basics and capable of following the latest developments right from the sources. I love your videos, your style, and your expositions... but I must say that at the end of EACH video, I'm **HUNGRY** for **A LOT MORE**. Thanks! Live love and SkyNet... I mean... prosper (?
@SamB-gn7fw 4 years ago
You'd love Robert Miles' weekly podcast where he gives an overview of the latest developments in AI safety: rohinshah.com/alignment-newsletter/
@SamB-gn7fw 4 years ago
You would also like this online AI safety MOOC series: www.aisafety.info/
@BarnacleBrown 4 years ago
This video was great! Hope to see more videos from you. You've done great work on Computerphile as well.
@kwillo4 3 years ago
Great vid! The last strip, on the flowers, was fun :)
@XOPOIIIO 4 years ago
Any utility maximizer will rewrite its source code to receive reward from doing nothing, and prevent people from rewriting it back.
@jameslarsen5057 4 years ago
I don't think that's the case. A parent would never take a pill that would make them want to kill their child. Even if they were much happier after the pill, the situation they'd end up in would be contrary to their current goals. In a similar way, AIs wouldn't rewrite their utility function, just the code which limits their ability to satisfy their utility function.
@Grouiiiiik 4 years ago
@@jameslarsen5057 What? People killing relatives and direct ascendants/descendants for money is quite common.
@Horny_Fruit_Flies 4 years ago
@@XOPOIIIO Rob already made a video in the past pointing out that agents don't want to modify their utility function.
@XOPOIIIO 4 years ago
@@jameslarsen5057 I think you're right, but I still have something to say. Parents don't want to kill their children not only because it is associated with negative reward, but also because it is not the right thing to do. I'm not sure whether an AI would have anything close to morality. If not, it will achieve the goal not because it is the right thing to do, but because it is associated with the reward.
@OnEiNsAnEmOtHeRfUcKa 4 years ago
@@jameslarsen5057 A parent would never take a pill that would make them _want_ to kill their child. But many have, can, do, and WILL take a pill, substance or psychological hook that makes them neglect their child completely, to the point where the child eventually either dies or is taken out of their custody - and then continue to obliterate themselves with their new reward function, even at the cost of their future, finances, family, mental state and physical body. Some recover. Most don't.
@iamatissue 4 years ago
Did no one get the shipping forecast joke at 9:24?
@RobertMilesAI 4 years ago
I believe you're the first to.
@remmo123 4 years ago
Very clearly explained! I will wait for the next videos in the series.
@lunkel8108 4 years ago
Your videos were always awesome, but you've really outdone yourself with the presentation on this one. Great job!
@badradish2116 4 years ago
"hi." - robert miles, 2019
@toyuyn 4 years ago
To think Shen's comics would make it into an AI safety video.
@urieldaboamorte 4 years ago
If my professors had told me economic theory would help me watch pop AI videos with ease, I wouldn't have cried myself to sleep so much in past semesters.
@CyberAnalyzer 4 years ago
I appreciate your shared knowledge! Keep up the good work!
@ioncasu1993 4 years ago
Can we all just agree that building a stamp collector is a bad idea and drop it?
@user-xz2rv4wq7g 4 years ago
This is why emails are good. Now, a spam-decreasing AI, that would be good. *AI proceeds to destroy every computer with email on the planet.*
@jamesmnguyen 4 years ago
@@user-xz2rv4wq7g More like, *AI proceeds to eliminate humans, because humans have a non-zero chance of producing spam emails.*
@underrated1524 4 years ago
Wouldn't that be nice. If you can find a way to get us all to agree on that, please let me know.
To solve the "becoming a maximizer" problem you could have a symmetric utility function somewhat like a probability density function, so any strategy that might result in "a fuckton of stamps" would be actively bad rather than just extraneous (but this wouldn't fix the tendency to go overkill on the certainty side making a billion stamp counters etc) edit: I guess you could also use a broken expectation calculation so it would ignore low probability events (like the chance of miscounting 100 times) but that seems a very bad idea from the start
@player6769
@player6769 4 жыл бұрын
That's what I was thinking... if going over 100 was just as undesirable as going under, wouldn't that demotivate it from ordering 100 stamps twice, since the expected value would be much more different from 100 than if it only got 99 stamps?
@chemical_ko755
@chemical_ko755 4 жыл бұрын
@@player6769 That is the same as the case of U(w) = {100 if s(w) = 100, 0 otherwise}. It could result in a lot of stamp counting infrastructure.
@player6769
@player6769 4 жыл бұрын
@@chemical_ko755 ah, fair enough. Always another problem
@lucashowell8689
@lucashowell8689 Жыл бұрын
You could just tell it to fudge the numbers if they’re close enough and get utility from the laziness it uses to do so
@xeozim 4 years ago
Nothing like anticipating the certain apocalypse to pass the time on a Sunday morning.
@wiseboar 4 years ago
Instant click - love your videos, man!
@sevret313 4 years ago
What about two bounds? One for the utility function and another for the expected value. Say you bound the expected value at 100 and the utility at 150: then ordering 150 stamps might give you an expected value of 147 stamps, which gets bounded to 100. But with a 50:50 gamble between 0 stamps and a trillion stamps, under these bounds the expected value is 75 - less than just ordering 150 stamps.
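The double-bound proposal above can be written out directly. This is a toy sketch of the commenter's idea (function names and the plan encodings are mine): cap the utility of any single outcome at 150 stamps, and separately cap the expected value used for choosing at 100.

```python
def bounded_utility(stamps, cap=150):
    # First bound: no single outcome is worth more than `cap` stamps.
    return min(stamps, cap)

def bounded_expected_utility(outcomes, ev_cap=100):
    # Second bound: the expected value itself is capped at `ev_cap`.
    # `outcomes` is a list of (probability, stamps) pairs.
    ev = sum(p * bounded_utility(s) for p, s in outcomes)
    return min(ev, ev_cap)

safe_plan = [(0.98, 150)]           # just order 150 stamps: EV 147, capped to 100
gamble = [(0.5, 0), (0.5, 10**12)]  # 50:50 for a trillion stamps: EV capped to 75
print(bounded_expected_utility(safe_plan))
print(bounded_expected_utility(gamble))
```

This reproduces the numbers in the comment: the modest plan scores 100, the extreme gamble only 75. (It doesn't address the video's other failure mode, where many plans tie at the cap and the tie-break is unspecified.)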
@_DarkEmperor 4 years ago
A realistic stamp-collecting AI would get limited resources: "So, AI, I give you $1,000,000; get me as many stamps as you can in 2 years."
@sevret313 4 years ago
@@_DarkEmperor It could always steal money to finance its stamp production.
@rmsgrey 4 years ago
@@sevret313 Steal it? Just play the stock market for 700 days and then cash out to finance pure stamp acquisition for the final month. Of course, maximising the available resources on day 700 means promoting as big a bubble as possible, which means there's going to be a hell of a market crash, probably triggered by the liquidation of the AI's holdings - which offers the added bonus of dragging down the price of stamps... Of course, you're also talking about years of human misery as a direct result, but you get a lot of stamps in the process.
@ruben307
@ruben307 4 жыл бұрын
You could make it so an expected stamp count between 95 and 105 gives the maximum utility. That way there is no reason for it to change its code (except to change what counts as maximum utility).
@underrated1524
@underrated1524 4 жыл бұрын
That would indeed solve the problem of self-modification, but this system is functionally identical to the "give me precisely 100 stamps" agent - it'll turn the planet into redundant stamp counting machinery to make absolutely sure the stamp count is within the allowable range.
@cakep4271
@cakep4271 4 жыл бұрын
Just make it round up. If it's 95% sure that it will accomplish the desired range, round up so that it thinks it is 100% sure.
@underrated1524
@underrated1524 4 жыл бұрын
@@cakep4271 Then you're right back at a satisficer, since many strategies all lead to the "perfect" solution according to the utility function and there's no specified way to break the tie. And once again you run into the problem that "make a maximizer with the same values as you" might be the fastest solution to identify and implement.
@ruben307
@ruben307 4 жыл бұрын
If it gets full satisfaction from a 95% chance of getting the stamps, it could just order them and call itself satisfied. Then, if they aren't there in a week, it will order them from somewhere else, if the probability of a lost package is above 5%.
@smiley_1000
@smiley_1000 Жыл бұрын
This reminds me of Asimov, in his novels some of the robots start discussing whether they can modify or circumvent the three laws of robotics that they would usually all have to obey.
@Noerfi
@Noerfi 4 жыл бұрын
this would make some amazing sci-fi series. people everywhere inventing utility maximizers accidentally and having to fight them
@shadowmil
@shadowmil 4 жыл бұрын
So... what about a bell curve? Get as close to 100 stamps as possible, but as you get more than 100, the score decreases. So getting 1,000,000 would be rated low, even lower than 0 stamps. The goal of making yourself a maximizer would also be rated very poorly.
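A rough sketch of the bell-curve utility this comment proposes; the Gaussian shape and the width `sigma` are assumptions, since the comment doesn't specify a curve:

```python
import math

# Sketch: utility as a Gaussian peaked at 100 stamps; the width
# sigma = 20 is an arbitrary assumption.

def bell_utility(stamps, target=100, sigma=20):
    return math.exp(-((stamps - target) ** 2) / (2 * sigma ** 2))

print(bell_utility(100))        # 1.0 at the peak
print(bell_utility(0))          # small
print(bell_utility(1_000_000))  # vanishingly small, below getting none
```

Note that a peaked utility like this still rewards extreme certainty about the exact stamp count.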
@jamesrockybullin5250
@jamesrockybullin5250 4 жыл бұрын
He addressed that in the video. You don't want the world to be made into stamp-counting machines.
@puskajussi37
@puskajussi37 4 жыл бұрын
One common problem seems to be that the utility function never tells the machine what we don't want it to do. You could subtract "the effect the AGI has on the world" from the utility, and (especially if it understands concepts like "an order of 100 stamps from a factory is normal") that could lead to solutions where the stamps arrive at a convenient time so as not to disturb your day. Then again, it would also lead to solutions such as "let's not tell the human he has the stamps, maybe he just forgets about them without fuss" or "let's perform poorly so this AGI tech doesn't get used and disrupt the whole world with its usefulness." Didn't Robert speak about this too? I forget.
@pafnutiytheartist
@pafnutiytheartist 4 жыл бұрын
Yes, this. And throw in a small penalty for changes to the environment, like in the side effects video. Make it so a reasonable strategy has a penalty of 1, and complete world domination results in highly negative values. That way, sending an extra email to make sure the stamps arrive on time is fine if it makes you a percent or two more sure, but creating a separate agent to count stamps is instantly negative reward.
@underrated1524
@underrated1524 4 жыл бұрын
@@puskajussi37 Adding negative terms to an unsafe system doesn't reliably make it safe. We can't depend on being able to match an AGI's ability to spot loopholes in the rules, so there'll unavoidably be loopholes the AGI can see but we can't.
@MattettaM
@MattettaM 4 жыл бұрын
I have a question about utility satisficers becoming maximizers. Wouldn't modifying its own goal from "get stamps within a certain range" to "get as many stamps as possible" conflict with its own utility function? Or is this a separate issue?
@underrated1524
@underrated1524 3 жыл бұрын
Normally, yes, this kind of agent avoids changing its own utility function, but there's a key difference here. Because satisficers don't have fully defined utility functions, they have no qualms about arbitrarily pinning down the parts of their utility function that are undefined.
@emmanuelotamendi9583
@emmanuelotamendi9583 4 жыл бұрын
So ok let's recount. Is he super cute? CHECK Is he super smart? CHECK Has he a delicious voice? CHECK Is he on a boat? (he's on a boat) Is he on a boat? (he's on a boat) Everybody look at him 'cause he's sailing on a boat? CHECK Well I guess I'm in love now
@NextFuckingLevel
@NextFuckingLevel 3 жыл бұрын
I didn't know the "Ultron" problem was this complicated
@morkovija
@morkovija 4 жыл бұрын
Oh hey. College student approach of bare minimum - niiice!)
@marin.aldimirov
@marin.aldimirov 4 жыл бұрын
What if the AI can gradually increase the outcome. Like come up with a strategy to collect 1 stamp. Then modify it so it can collect 2 and so on, until it has a strategy for collecting 100, but no more. Then execute only the 100 stamp strategy.
@GrixM
@GrixM 4 жыл бұрын
Even the simplest goal such as collecting 1 stamp contains a bunch of strategies resulting in the apocalypse.
@puskajussi37
@puskajussi37 4 жыл бұрын
@@GrixM True. But what if the first program is a ready-made, safe program? Not quite as useful, and still prone to possibly murderous tactics, but it's something.
@edskodevries
@edskodevries 4 жыл бұрын
Thought provoking video as always!
@MegaOgrady
@MegaOgrady 4 жыл бұрын
I'm so glad that I found this channel I'd only watch computerphile cuz of him, and honestly, he does such a great job at simplifying how an AI works so that those who don't really know the in-depths can understand
@Gooberpatrol66
@Gooberpatrol66 4 жыл бұрын
Is that background at the end from that Important Videos meme video?
@ian1685
@ian1685 4 жыл бұрын
I really think so, especially since Rob did the little awkward thumbs up.
@joshuahillerup4290
@joshuahillerup4290 4 жыл бұрын
I love how your videos are either explaining how AI works, or why AI is a terrible idea.
@S0ulFinder
@S0ulFinder 4 жыл бұрын
If the AI is capable of changing its code, the easiest way to reach the goal is to change it. It can change (for example) the number of stamps required to 0 and assign itself infinite points as a reward. This is possible because the AI doesn't really care about the stamps themselves; it only cares about the score assigned at the end of the process. If we are lucky, after we turn it on the AI will make a txt file saying "score = infinite" and turn itself off, but there is a chance it will turn the entire universe into a hard disk to store the highest score possible. Anyhow, if the programmer is somehow capable of protecting sections of the code (like in the video), a possible solution to the doom AI is to add more dimensions to the task. Right now we are considering only one dimension: how many stamps. This is similar to how viruses (the biological kind) behave in the real world: they just create more copies until the host is dead. If we add dimensions to the problem such as time allowed, value of the stamps at the end of the collection process, number of changes to the AI's code, etc., it creates boundaries the AI is unwilling to cross, similar to how a simple unicellular organism "checks" how much energy/food it has before initiating mitosis. I'm aware this is similar to trying to add manual rules to the code, so probably smarter people have figured out better solutions, as you hinted at the end of the video.
@lightningstrike9876
@lightningstrike9876 4 жыл бұрын
One thing we could try is taking a point from economics: the law of diminishing returns. In the case of the stamp collector, rather than a linear relationship between utility and the number of stamps, the marginal utility diminishes as more stamps are collected. Thus, even a maximizer would realize that any plan that creates stamps above a certain threshold actually subtracts from the overall utility. As long as we set this threshold at a reasonable point, we can be fairly confident in the safety.
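A sketch of a diminishing-returns utility, using a logarithm as one assumed concave curve:

```python
import math

# Sketch: a concave utility so each extra stamp is worth less than
# the last; log1p is one assumed choice of curve.

def diminishing_utility(stamps):
    return math.log1p(stamps)

first_stamp = diminishing_utility(1) - diminishing_utility(0)
thousandth_stamp = diminishing_utility(1000) - diminishing_utility(999)
print(first_stamp > thousandth_stamp)  # True: returns diminish
```

One caveat: a curve like `log1p` keeps increasing, just ever more slowly, so extra stamps never actually subtract utility; making them net-negative would require a utility that eventually turns downward.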
@allaeor
@allaeor 4 жыл бұрын
Will you talk about the debate approach to AI soon?
@underrated1524
@underrated1524 4 жыл бұрын
Although he hasn't discussed the debate plan specifically, he has discussed its two components - the "only give AIs the power to talk about stuff" part, and the "use multiple AIs for checks and balances" part. Only giving an AGI the power to talk won't make it safe, because if it outsmarts us, there's no way to tell what suggestions are safe and what suggestions will advance the AGI's plan to take over the world or whatever. Using multiple AIs for checks and balances is not a dependable solution, because the balance between two AIs probably won't be maintained for long. Once one grows even a little smarter than the other, it'll be able to leverage its advantage until the opposing AI is essentially an automaton in comparison.
@Aljazhhh
@Aljazhhh 4 жыл бұрын
Like now, watch later !
@y2ksw1
@y2ksw1 4 жыл бұрын
The Google search engine uses a relaxed neural network, which is why it has such great performance. And yet it's pretty reliable, although not perfect.
@richiskinner9810
@richiskinner9810 4 жыл бұрын
You remind me a lot of Michael Reeves. Just muuuuuuch more chilled.... :D Nice video!
@AlbertPerrienII
@AlbertPerrienII 4 жыл бұрын
Why not have the system take into account the likely effort needed to collect stamps and set a penalty for wasted effort? That seems closer to what humans do.
@adamjamesclarke1
@adamjamesclarke1 4 жыл бұрын
How would you calculate effort, and how would you be able to calculate expected effort with complete accuracy without actually performing the task in order to measure it?
@robertthebrucey
@robertthebrucey 4 жыл бұрын
@@adamjamesclarke1 Expected energy used would be an easy metric: converting the world to stamps consumes far more energy than ordering existing stamps off eBay, and it's calculable to a reasonable degree of certainty.
@underrated1524
@underrated1524 4 жыл бұрын
For a narrow definition of wasted effort, the AGI will just build a sub-agent to do all the work for it, and make sure the sub-agent doesn't care about wasted effort. For a slightly less narrow definition of wasted effort, the AGI will send some emails to computer science students to trick them into building that sub-agent instead of the AGI. For a much broader definition of wasted effort, the AGI will slaughter all living things on the planet, because just *look* at how much effort we're collectively wasting, that's totally unacceptable. (I'm not confident that there even *is* a sweet spot in the middle that avoids these problems satisfactorily. Even if there is, I don't want to roll the dice that we get it right on the first try.)
@tobiasgorgen7592
@tobiasgorgen7592 4 жыл бұрын
This is probably also an already well-researched version. WHY would an expected utility satisficer with an upper limit, e.g. "collect between 100 and 200 stamps", fail?
@josiahferguson6194
@josiahferguson6194 4 жыл бұрын
My guess is that it would still run into the satisficer problem, since it could become an expected utility maximizer for that bounded function. But maybe it would be possible to limit that by making changing your own code result in an automatic zero on the utility function.
@underrated1524
@underrated1524 3 жыл бұрын
@Tobias Görgen An expected utility satisficer with an upper limit probably just turns into a version of the maximizer that seeks to obtain exactly 100 stamps with maximum confidence, which again leads to the world getting turned into stamp counting machinery. @Josiah Ferguson Sadly, in principle, there's always a way to achieve the same result while technically skirting around the restriction. If "changing your own code" is illegal, the AI might just write a new program in a different memory location on the same hardware such that the code acts as a maximizer. If you ban changing the code on the hardware at all, the AI might seek to write and run the maximizer code on some other accessible machine, and if you ban that, the AI might just fast-talk one of its supervisors into writing and running the code. Fundamentally, we can't reliably write rules for AI - if we tried to formally specify something as vague and broad as "don't change your own code", the translation into code would be spotty enough that there'd predictably be loads of loopholes.
@nickmagrick7702
@nickmagrick7702 4 жыл бұрын
"the issue is that utility maximizers have precisely 0 chill" I loled. nice way of putting it
@jonwatte4293
@jonwatte4293 4 жыл бұрын
Also, the "Zeno's paradox" of "infinitely ordering another 100 to increase probability" obviously has other solutions. But with a cost function on actions, it would very quickly converge on safe, cheap actions.
@owlman145
@owlman145 4 жыл бұрын
Seems like any AI will want to change its own source code unless hardcoded not to. Can't you make it also want to satisfy the condition sourceCode == originalSourceCode? If it can rewrite that, then it could also rewrite its maximizer function, which means the easiest solution would be to set the number of stamps needed to 0.
@underrated1524
@underrated1524 4 жыл бұрын
The obvious loophole: Build a maximizer that's completely external to yourself but shares your values to a T. No need to change your own code then.
@KissatenYoba
@KissatenYoba 4 жыл бұрын
@@underrated1524 And if the creator limits you to not producing other AIs, you take actions that may indirectly cause the creation of an AI, not designed by you, that can change you in turn. And if the owner forbids that as well, you do the same but rely on humans to change you instead, unless the owner is willing to let you eliminate humanity for the sake of limiting your ability to change yourself. Man, it's like Tsiolkovsky's dilemma about the weight of rockets going to space.
@owlman145
@owlman145 4 жыл бұрын
@@underrated1524 Not sure that's a loophole. A smart general AI would be wary of creating another general AI for the same reasons we are; thus the satisficer function would rate such a solution pretty low. Nor is it likely to be a simple solution to the problem. The reason it considers changing its own code to become a maximizer is that that's easy.
@Inedits
@Inedits 4 жыл бұрын
The satisficer can easily create a maximizer... (in cases where it can't change itself)
@JustAZivi
@JustAZivi 4 жыл бұрын
Would be great to see the mentioned "next video" soon. ;-)
@pafnutiytheartist
@pafnutiytheartist 4 жыл бұрын
What if we define the utility function the following way: F(s) = s if s < 100; 100 if 100 ≤ s ≤ 120; 220 − s if s > 120. If the number of stamps is between 100 and 120, the reward is exactly 100. If it gets fewer than 100, the reward is the number of stamps. If it gets more than 120, the reward is 220 minus the number of stamps (negative if more than 220 stamps are collected). You can also add a small negative term for environment disruption, as you discussed in the side effects video. This way the agent wants to make sure it collects around 100-120 stamps but is punished for the possibility of collecting too many (or turning the world into a stamp-counting device, if you include the negative term for turning the world into different things). It's not a 100 percent way to get the AI to finally chill out, but it's very likely not to destroy the world.
@pafnutiytheartist
@pafnutiytheartist 4 жыл бұрын
Example: it comes up with a strategy that is likely to yield 115 stamps. It gets 99 for the strategy, because it's not 100% sure, minus a penalty of 0.01 for doing stuff and lightly disturbing the stamp market. Final value: 98.99. If it creates a crazy disturbance to make sure it gets what it expects, like rewriting itself and creating new agents to guarantee that 100% of the stamps are collected, it will get 99.9999 points and a -5000 penalty for expending resources and changing the environment.
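The piecewise reward and penalty this commenter describes could be sketched like this (the confidence and penalty figures are the commenter's own illustrative numbers):

```python
# Sketch of the commenter's piecewise reward plus a disturbance penalty.

def piecewise_reward(stamps):
    if stamps < 100:
        return stamps
    if stamps <= 120:
        return 100
    return 220 - stamps  # goes negative past 220 stamps

def plan_value(expected_stamps, confidence, disturbance_penalty):
    return confidence * piecewise_reward(expected_stamps) - disturbance_penalty

# Modest plan: ~115 stamps, 99% confident, tiny market disturbance.
print(plan_value(115, 0.99, 0.01))  # ~98.99
# Extreme plan: total certainty bought by remaking the world.
print(plan_value(115, 1.0, 5000))   # -4900, deeply negative
```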
@BinaryReader
@BinaryReader 4 жыл бұрын
Can't you just put a limit on the energy expenditure of the strategy?
@victorlevoso8984
@victorlevoso8984 4 жыл бұрын
Well, if you know a good way of defining a limit on energy expenditure that doesn't run into lots of problems (many of them similar to the ones in the video about minimizing side effects), then maybe. Otherwise it's not "just" anything; it's a very complicated potential research direction. But yes, it is potentially useful.
@underrated1524
@underrated1524 4 жыл бұрын
How do you measure energy expenditure? By most metrics, "build a maximizer that doesn't have this limitation and let it do all the work instead" would be a relatively low-energy-expenditure strategy, especially if you can persuade a human to do it on your behalf. If you instead make the definition of "energy expenditure" broad enough to make sure that a separately built maximizer still counts towards the quota, then you run into the problem where the agent kills pre-existing humans because their unrelated energy use is being counted too.
@governmentofficial1409
@governmentofficial1409 4 жыл бұрын
Another potential problem with this approach is that energy can't be destroyed. If by energy expenditure, you mean that part of the AI's preferences is to only use energy that humans provide it, then you run into the same problem as you do when specifying any other goal. This AI would be incentivized to manipulate humans into giving it energy (maybe by plugging them into the matrix?), for instance.
@theshaggiest303
@theshaggiest303 4 жыл бұрын
​@@underrated1524 It looks to me like the solution to your objections is practically contained within them. "build a maximizer that doesn't have this limitation and let it do all the work instead" is a great example of why "only count energy that we use directly" doesn't work. So, also consider energy used indirectly (but still as a result of our actions). "kill pre-existing humans because their unrelated energy use is being counted" is a great example of why "count ALL energy, even energy unrelated to our operations" doesn't work. So, don't count unrelated energy (energy spent independently of our actions).
@underrated1524
@underrated1524 4 жыл бұрын
@@theshaggiest303 So now you're left with the near-hopeless task of defining what energy counts as related and what energy counts as unrelated.
@brindlebriar
@brindlebriar 3 жыл бұрын
But if the AGI can edit its own source code, then surely it can edit the input commands too. In that case, there's a universal option for every input command: simply change the command to one that is super easy to carry out, like "don't do anything." That would be the easiest way to carry out 'the command.' After all, isn't that what we humans do when we have lots of things we're supposed to get done, and we decide to say 'fuck it' and just play video games or take a nap? We change our input command to one that seems easier to carry out. In a way, we are intelligence programs. Our DNA is the source code, and our biological and environmental imperatives are input commands. But sometimes we cheat. For example, we have a sex drive to get us to replicate ourselves so that our DNA can take over the universe, but sometimes we just masturbate. So we can look to what humans actually do to get an idea of what sorts of things an AGI might do.
@stampy5158
@stampy5158 3 жыл бұрын
You're right to say an AI can modify itself - even if we try to stop it, if it's more intelligent than us we should expect it to outsmart us and modify itself anyway. But while an AI will likely want to modify itself, there are some aspects of itself it won't want to change. As Rob mentioned in the Computerphile video about the stop button problem, giving itself a new command (/ utility function) will rank very low on its existing command so we can probably assume an AI won't want to do that. That is to say, if the AI wants to maximise human happiness, it won't want to do things like modify itself into a "lazy" AI that does nothing because doing so doesn't cause much happiness. We strongly believe AI won't do things like "goof off all Sunday and play videogames" like humans do because our goals include things like "relax occasionally" and "socialise with other meat popsicles" and many other things we don't even realise are important to us, which are almost all values the AI won't share. Having said all that, AIs might behave as though they've modified their reward functions. A real AI running on a real computer system might store its score in some address in memory and might do something that sets its score in memory to a very high or maximal value. We call this "Wireheading" and it's actually already manifested in some relatively simple systems. You could imagine an AI instructed to "maximise how many stamps you think you have" actually finding it easier to lie to itself by just putting a really big number in its "how many stamps do I think I have" memory location, than it would be to actually make that many stamps. Unfortunately this is still a guaranteed apocalypse because the AI will now want to make the space in its memory where it stores the stamp counter as large as possible, and it'll reprogram itself and modify its hardware to store the largest possible number. Eventually it'll run out of servers. -- _I am a bot. 
This reply was approved by plex and Social Christancing_
@cornjulio4033
@cornjulio4033 3 жыл бұрын
Hello Robert. Finally I found your channel !
@omarcusmafait7202
@omarcusmafait7202 4 жыл бұрын
9:37 is just perfect 😂 plz make more of that XD
@susanmaddison5947
@susanmaddison5947 4 жыл бұрын
The solution seems simple. Give a positive utility value for stamps collected up to 100 stamps, and a negative utility value for stamps collected beyond 100.
@haeilsey
@haeilsey 4 жыл бұрын
Susan Maddison like a reverse bounded utility function
@ukaszgolon5617
@ukaszgolon5617 4 жыл бұрын
The problem is it would still want to make sure it has exactly 100 stamps, so a utility maximizer would acquire as much resources as possible and devote them into endlessly recounting all its stamps. If it would get away with it, it could even reassemble people into stamp counting machines and computers, to upgrade the certainty, that it has maximized the utility function, from 99.999999% to 99.999999999999999999999999999999999999999999%. Which is why a powerful AGI needs some kind of safety regulation that would stop it from wanting to maximize the certainty as well. It needs some kind of meta-chill pill.
@19aavila
@19aavila 4 жыл бұрын
An even better way might be to give it maximum utility when the probability of 100 stamps is (let's say) 90%, and then run it until it happens: U(P(100 stamps) = 0) = 0 and U(P(100 stamps) = 100%) = 0. Wouldn't it then be chill and just try a little bit?
@susanmaddison5947
@susanmaddison5947 4 жыл бұрын
​@@ukaszgolon5617 Right. It needs a reverse utility function for spending too much time, energy, and resources on the problem. And reverse utility for spending too much time on figuring out that it's spending too much time. This is like "calling the question" in Parliament, and in the individual brain. Or like awareness of "opportunity cost" of information gathering. Should also give it a time-discount function, reducing the utility value of things produced at later dates. In general, we should give it functions for every factor that goes into rational choice -- or what we are able to understand of rational choice theory and bounded rationality. Including respect for the multiplicity of goals of the purpose-giver (us), the limited value of each goal. And, in light of this last consideration, which is only loosely quantifiable: an incentivization of continued iterative learning about what are the residual embedded irrational factors in our choice process -- recognizing these in light of the limited-value and multiple purposes consideration, self-correcting/ self-reprogramming for the irrationalities where able, in any case alerting us to correct for them. In the process, clarifying further for us the meaning of rational choice, the programmable meaning of each factor that goes into it, the additional factors that we need to keep iteratively discerning.
@weeaboobaguette3943
@weeaboobaguette3943 4 жыл бұрын
Nonsense, do not worry fellow biological unit, there is nothing to worry about.
@hello-ji7qj
@hello-ji7qj 3 жыл бұрын
Great video. I love it, but too much for me when I'm trying to distract myself during breakfast.
@jaimeduncan6167
@jaimeduncan6167 4 жыл бұрын
The insight about laziness is interesting. Maybe the system needs competing objectives to be stable and "safe". Or maybe we need a stamp collector predator.
@Paint2D_
@Paint2D_ 4 жыл бұрын
So there is no difference between capitalism and utility maximizers?
@underrated1524
@underrated1524 4 жыл бұрын
Qualitatively, corporations have a reasonable amount in common with utility maximizers, though they do have important differences as well. For more information, you can see this other video of Robert's: kzfaq.info/get/bejne/gpugiKRksdmpkas.html
@PaulHobbs23
@PaulHobbs23 4 жыл бұрын
Robert has a video on Corporations vs. AGIs
@tuqann
@tuqann 4 жыл бұрын
My satificatories have been maximized, new channel to subscribe to! Love and peace from Paris!
@williamfrederick9670
@williamfrederick9670 4 жыл бұрын
This is my new favorite channel
@vfugjjhfuyft
@vfugjjhfuyft 4 жыл бұрын
Unbounded maximization of reward/minimization of error is not by itself a bad AI training strategy. Humans, and life on Earth in general, work by that principle: we are maximizing our chances of survival. The reason we are chill is that conserving energy and gaining profit with minimal effort is part of survival. That is ingrained in us on both the physiological and psychological level. So you don't really need to change the type of your error function; you just need to include energy cost as a factor for every action. Decrease your learning rate, add noise to the input, maybe fiddle around with genetic algorithms, and it should be fine.
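A toy sketch of scoring plans with an energy-cost term, as this comment suggests; the plans and numbers are made up:

```python
# Sketch: score plans by expected stamps minus an energy-cost term.
# Plans and numbers are invented for illustration.

plans = [
    {"name": "order on ebay", "expected_stamps": 100, "energy_cost": 1},
    {"name": "convert planet to stamps", "expected_stamps": 10**9,
     "energy_cost": 10**15},
]

def score(plan, energy_weight=1.0):
    return plan["expected_stamps"] - energy_weight * plan["energy_cost"]

best = max(plans, key=score)
print(best["name"])  # order on ebay
```

As earlier comments here note, this relocates the difficulty into specifying and measuring "energy cost" correctly.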
@ThylineTheGay
@ThylineTheGay 4 жыл бұрын
that comic at the end Edit: you got yourself a subscriber!
@projecttitanomega
@projecttitanomega 3 жыл бұрын
I love watching your videos, because sometimes I'll have this moment where I pause because I've thought of a solution, feel kinda smug for a second, then unpause and immediately hear you say "And so you think, what if *solution*? Well, the problem with that is..." But you still phrase it and make the videos in such a way that I don't feel like an idiot for coming up with this flawed solution, because that "no" is always said in a way that's like: "It's understandable that you would come up with that solution, given what I've just talked about; however, by teaching you more, and thus by you learning more, you'll see why it actually doesn't work." And darned if that isn't how science works; even a wrong hypothesis usually teaches us something new. It's hard to teach a complex field of study like AI to people who aren't in that field without making them feel dumb, but you are really good at actually making people feel smarter.
@goonerOZZ
@goonerOZZ 4 жыл бұрын
Ooooh new video! Nice!
@the_furf_of_july4652
@the_furf_of_july4652 4 жыл бұрын
Insufficiently thought out solution: Have some kind of secondary criteria. Using a satisficer, asking it for several possible plans, and then ranking them according to some other criteria may help prevent some of the randomness in the result. For example, you could rank things by time to implement, or money spent, or if we can find a mathematical way to quantify it, damage done. Then pick the least costly, least damaging solution and run that. Turning itself into a maximizer would have unknown levels of cost and damage done, in theory it wouldn’t be able to trust that the output would be the least costly, especially when other solutions have a definite low cost (order stamps for a couple dollars and be done with it). Perhaps it could end up building a maximizer to come up with more efficient solutions, then rank them according to the criteria.. and the maximizer’s plan to take over the world would likely rank worse than ebay in terms of damage (again, assuming we can quantify that). Though without that damage function, it’s still possible for apocalyptic solutions to have zero cost. Then you have to go through the effort of having it understand laws and fines and incorporate that into the utility function. And then it’ll just murder the people in charge of fines and taxes and get a discount. ...yeah that damage function would be a very useful thing to have.
@dorianmccarthy7602
@dorianmccarthy7602 4 жыл бұрын
I'm looking forward to the sequel video!
@benjamineneman4276
@benjamineneman4276 4 жыл бұрын
Using dayenu as the song at the end was perfect.
@sukritmanikandan3184
@sukritmanikandan3184 3 жыл бұрын
I live for the way Rob pronounces 'utility'
@bryanroland8649
@bryanroland8649 4 жыл бұрын
Don't try too hard? But I've still got the greatest enthusiasm and confidence in the mission.
@neweins8864
@neweins8864 4 жыл бұрын
I love your work. Keep doing it. I've just one question: isn't it likely that superintelligent machines will find some flaw/loophole in our AI safety mechanisms that we didn't consider? By definition, those machines are superintelligent.
@samuelshadrach1512
@samuelshadrach1512 2 жыл бұрын
Yep
@rcookie5128
@rcookie5128 4 жыл бұрын
So interesting how every approach seems fine at first sight but ends up with a definite or probable chance of causing the apocalypse.. :D
@JustAZivi
@JustAZivi 4 жыл бұрын
Thank you for the great videos on your channel! To maximize the number of views on your channel, you probably should upload a new video more often. ;-)
@ReubsWalsh
@ReubsWalsh 3 жыл бұрын
"Scattered showers, poor" like the shipping forecast!🤣
@ronensuperexplainer
@ronensuperexplainer Жыл бұрын
The music at the end, Dayenu (דיינו) from Passover, is very fitting
@bilalsulaiman2177
@bilalsulaiman2177 4 жыл бұрын
You are brilliant! Please keep it up 💔
@finminder2928
@finminder2928 4 жыл бұрын
9:40 the Zoo Pals commercial music
@douglasjackson295
@douglasjackson295 4 жыл бұрын
What would happen if you used a standard distribution for value, and then another standard distribution for the probability of choice, so the agent attempts to do the thing, but not aggressively?