Anthropic's SHOCKING New Model BREAKS the Software Industry! Claude 3.5 Sonnet Insane Coding Ability

104,214 views

Wes Roth

13 days ago

Learn AI With Me:
www.skool.com/natural20/about
Join my community and classroom to learn AI and get ready for the new world.
#ai #openai #llm

Comments: 579
@gubzs 11 days ago
Even if LLMs can never reliably do anything but code, they will utterly transform the world.
@ZM-dm3jg 11 days ago
I pulled myself out of the ghetto to the middle class by years of learning software development, and now they want to send me back to the ghetto by taking all the jobs. feelsbadman
@Weirdgeek83 11 days ago
The irony is coding will be the first dead industry when so many people recommended it. The only safe jobs will be manual labor jobs or ones that require human senses (chef for example.)
@uw10isplaya 11 days ago
Yeah idk how some people dismiss it outright. Like, what? Even if we got no new models, a whole new generation of startups would grow up around this tech, in addition to all its integration into businesses and consumers with Microsoft/Apple. But that's not the case; we're also getting incremental improvements to the state of the art model every few months.
@umaikeruna 11 days ago
Maybe your hard work and intelligence pulled you out of the ghetto; virtues which you'll still have, however the landscape may change.
@tuiroakwood 11 days ago
​@@ZM-dm3jg learn the AI tools to make yourself a more efficient coder, you can be better than you've ever been by leveraging new tools and absolutely kick butt in your career
@paulyflynn 11 days ago
so far, it has found a race condition, found a security bug, and created two performance optimizations in my rust code
@OriginalRaveParty 11 days ago
🦀
@TuxedoMaskMusic 11 days ago
That's awesome brother. lulz
@morespinach9832 11 days ago
What prompts are you using
@ZaphodOddly 10 days ago
Wow. Terrific!
@cassianomartin2699 10 days ago
Nice.
@centurionstrengthandfitnes3694 11 days ago
Great video... apart from all the weird deja vu you cause with your editing.
@EmeraldView 11 days ago
I absolutely HATE this practice in YouTube videos. Is this something people like? The taking clips from the video and putting them at the front. So when you get to it again you're like "Did I accidentally rewind this video or hit the screen and skip back 15 minutes!?"
@Goldengeko123 11 days ago
@EmeraldView Small previews of what's to come in the video are fine, but the one in this video was a bit confusing and annoying. Great info/video otherwise.
@personalgao 11 days ago
Good feedback would be to use a filter at the beginning, or even change the sound so we understand it is a preview of what is coming. Some videos put these cuts in black & white, or distort the sound with "radio" effects... But it is not my call; Wes should decide what he wants on his channel.
@SirHargreeves 11 days ago
Agreed. I’m now watching the snake game section, in full, a second time. Why waste the viewers' time like this?
@mistervanderveer 11 days ago
@@EmeraldView im sure no human likes this, at all, whatsoever. its extremely annoying and confusing and cheap. but i guess the algo likes it.
@dimaquia6139 11 days ago
As a Software industry, I'm shocked and broken
@imthinkingthoughts 11 days ago
finally his title is actually relevant
@AirSandFire 11 days ago
Hey, nice to meet you Software Industry, I am Jobs. I, too, feel some unease about the video; I feel threatened by it.
@justin.johnson 10 days ago
WTF?
@HCG 10 days ago
@@justin.johnson You must be a little slow
@NikhilSwamiExperimental 9 days ago
@@AirSandFire Hi im steve jobs, the snake ate my apple and GPU company overtook me, what to do?
@MS-wz9jm 11 days ago
The only thing we are missing with these AI models is a tool you just install in your coding software so that it can just create/update the files for you eliminating the copy and pasting.
@ilyavasylevsky3229 11 days ago
Jetbrains AI Assistant is already doing that
@zariumsheridan3488 11 days ago
@@ilyavasylevsky3229 and copilot plugin. Not impressed with copilot though.
@LearningLife77 11 days ago
Copilot is exactly that. One of many
@JohnSmith762A11B 11 days ago
Apple's new Swift Assist works right within Xcode, and is trained on all of Apple's in-house documentation and code. It's going to be a gamechanger for iOS/macOS/iPadOS/tvOS/watchOS developers. First release will be late summer/early fall. Personally, I don't care about little python apps but real apps that can run on a billion devices and be put in an App Store is genuinely exciting.
@LearnWithBahman 11 days ago
Is this available for devs ?
@drhxa 11 days ago
Great video Wes, def your best one yet! This is the kind of video we love. Testing LLMs in practice, discussing implications and your impressions. You're absolutely right that this is a gamechanger!
@paulmuriithi7596 11 days ago
AIGRID covered this model first, but Wes did it justice with a deeper perspective. Well done, Wes. Keep us posted
@Ikbeneengeit 11 days ago
Maybe not hitting a wall yet, but LLMs have yet to prove they can synthesise new insights from diverse data. It just can do what it's already seen.
@carlosamado7606 10 days ago
I think it will also need to be transported into the physical realm. For example, even if it had a hypothesis on a scientific discovery, it would still need access to equipment to test it. It wouldn't just randomly discover it. Of course, having a hypothesis in itself is a level higher than what we have. However, if it could directly assist and test people's theories and give a methodical explanation of why they work or not by accessing tools, it would still help tremendously.
@Zuranthus 10 days ago
and it's seen a lot. most jobs don't require new insights, we have companies out here still running COBOL and using Access
@eyoo369 7 days ago
Exactly. I use GPT4 and Claude a lot during coding. But it will never be able to come up with a complex new algorithm that it has never seen before. So no it will never be able to replace developers that work on new and novel ideas. But most of the code monkey work for simple CRUD webapps could be replaced.
@synaesmedia 19 minutes ago
I use GPT to translate legacy code into new languages. AFAICT extracting the algorithm from Python and, say, putting it into Haxe, as I've been doing recently, IS creating novel syntheses. I know GPT never saw the algorithm in Haxe before. Because it's my algorithm from my original code. So by putting that algorithm it extracted from the Python into a different language, it's creating something genuinely new in the world. This may seem like a fairly trivial example of "synthesizing new insights". But I don't think the principle is going to be very different in many more advanced or impressive cases. Maybe LLMs are better at this kind of thing because they've had a lot of training in coding. Nevertheless, I expect many examples of genuine new insights are going to involve putting together existing models and languages into new combinations. And LLMs do that just fine.
@6lack5ushi 11 days ago
The Doom example is freaking WILD!!!
@ALFTHADRADDAD 11 days ago
Nah yeah what the fuck
@JohnSmith762A11B 11 days ago
If AI can take John Carmack's job...
@cluelesssoldier 11 days ago
@@JohnSmith762A11B LMAO!
@Co-Monad 10 days ago
I’m a software engineer and lead AI efforts at my current place of employment. This update is huge! Previously, you couldn’t get basic code or functionality from these models without them introducing regressions. Using AI to code is now becoming a real possibility. Excellent video.
@codejunki567 4 days ago
They said this a year ago
@Co-Monad 4 days ago
@@codejunki567 in all fairness it’s not practical yet. However, at least it seems like a possibility. I’m not sold we are there yet, it’s too reckless and dangerous right now.
@Steve-xh3by 11 days ago
Here's the thing about testing. When the model is getting near 100%, it is conceivable that it may be even better in certain areas than our best humans. How do we possibly construct a test to discern if something is smarter than us? What could you possibly ask it to do? You can't possibly create a test for something smarter than you are. It would be like asking an average 10-year-old to write a test for a PhD student to discern the PhD student's level of competency. It is logically intractable.
@fintech1378 11 days ago
exactly, this is the existential fear
@morespinach9832 11 days ago
Instead of all this rubbish perhaps we get it to code a full html page properly.
@muffinspuffinsEE 11 days ago
It's only better than an average human. Of course we can measure that.
@fintech1378 11 days ago
@@morespinach9832 please do, take screenshot and post it here and point out what it cant do now
@fintech1378 11 days ago
@@muffinspuffinsEE you must be an idiot, this is just the very beginning. If we get a more intelligent model in the next 2 years, you might not be able to do that. People are talking about future capability
@muyleche6466 11 days ago
Did the shock itself break the industry?
@drschuess1624 11 days ago
I don’t think so, I believe it had to be stunned as well
@muyleche6466 11 days ago
@@drschuess1624 🤣
@thisathovin6346 11 days ago
this is actually SHOCKING though,
@SurfCatten 11 days ago
Fantastic video, you really add value in this increasingly crowded field of AI YouTubers.
@cjgoeson 11 days ago
Smaller and still as smart, but will 3.5 Opus be truly next-level smarter?
@PrincessKushana 11 days ago
From my tests today it's much better at coding than Opus. Does a great job of troubleshooting bugs and providing code that works.
@Weirdgeek83 11 days ago
I definitely feel like anthropic will be the one to create agi
@uw10isplaya 11 days ago
Think the only reasonable outsider prediction is that it'll be the same % smarter vs Sonnet 3.5 as Opus 3.0 was vs Sonnet 3.0.
@morespinach9832 11 days ago
@@Weirdgeek83😂
@dannii_L 11 days ago
@@Weirdgeek83 I hope you're right. I've always preferred Claude and the approach that Anthropic are taking over OpenAI.
@kamelsf 11 days ago
Best review I've seen so far, thank you!
@gaiachild1461 11 days ago
Crazy times, thanks for the sublime coverage and commentary dude
@nigelcrasto 10 days ago
This video was awesome 👍 You did a great job exploring the model and showing great, easy-to-understand demos!
@TheRealHassan789 11 days ago
This is one of your best videos. Especially the deeper coding examples of editing a preexisting GitHub code base
@SiCSpiT1 11 days ago
I think our current benchmarks are all but useless. There's something they're not accounting for. How does it handle the Arc price?
@chrisanderson7820 11 days ago
Sort of, intelligence is a massive spectrum of different abilities, if you want to fully assess a human you have to use an array of tests to look at all sorts of things from maths to humour to reasoning and planning to spatial awareness and so on. If an AI can pass all sorts of tests then it's actually OK to keep moving the goalposts to more thoroughly determine where its limits lie. If task X in the human world requires a human who can pass tests A, B and C then when the AI can pass those tests then it's sort of ready for prime time to accomplish that task. We can just slowly expand that list of tasks as AIs get better, it doesn't have to be a divine test that proves full sentience in one go.
@courtneyb6154 10 days ago
what's the "Arc price"? Like what does that mean?
@davidcoughlin5897 10 days ago
@@courtneyb6154 I wondered the same thing, here is what Ollama told me: In the context of Artificial Intelligence (AI), ARC stands for "Average Revenue per Customer". It's a key performance metric used by companies to evaluate their AI-powered marketing strategies, particularly in e-commerce and subscription-based services.
@SiCSpiT1 10 days ago
@@courtneyb6154 I try to only share youtube links on youtube since anything else tends to disappear. kzfaq.info/get/bejne/i8ebpK9ntdCdqKM.htmlsi=z0OYzJembuKGcG2h this video is an interview with the arc challenge creator and there's a direct link to the arc prize, in case you want to do the test yourself. In brief, the arc challenge was designed five years ago as a way to test LLMs beyond their ability to memorize things. For example, who cares if you aced your test if the teacher showed you the answer sheet the day before? The arc challenge is very simple: it gives you 3 examples of inputs and their outputs, and from these clues you're given a new input and have to work out its output. I find it odd that you'll see a 1 point difference across the board and somehow still manage to perceive a meaningful difference in the outputs of two different models. In my opinion, at the moment, we're testing these glorified encyclopedias with an indexing function and acting as if they're 'smart', when all they're doing is repackaging their training information based on the prompts given.
@SiCSpiT1 10 days ago
@@chrisanderson7820 Sure, but this doesn't explain why Claude 3.5 is only one point ahead of GPT4o yet somehow generates meaningfully different outputs. It seems to be evidence that these benchmarks are being gamed rather than providing a meaningful assessment of capability at this moment.
@burninator9000 10 days ago
Such an ‘omg I have to get up to get the tv remote, how annoying!’ moment with Wes complaining about 10 clicks for downloading the images that Claude made instantly to be embedded in the code Claude wrote lol. (For those too young, we used to have to get up to change the channel on tv every time!)
@ottawadigs 11 days ago
I wish we could download the LLM to try locally
@dg-ov4cf 10 days ago
We're now getting into territory where models could unlock some nasty public safety threats if they fall into the wrong hands. Don't need these things holding people's hands through the anarchist cookbook. Since we have to assume people will always find a way to remove safety rails when given local access to the models, I would expect cutting-edge open source models like llama 3 to become rarer and rarer as capabilities keep increasing.
@user-io4sr7vg1v 8 days ago
Nasty public safety how? What are you talking about?
@xCheddarB0b42x 5 days ago
@@user-io4sr7vg1v finding novel zero days, generating exploits for them, and so on. As one example.
@neomatrix2669 11 days ago
Feel the AGI
@imthinkingthoughts 11 days ago
yep
@notaras1985 10 days ago
Nowhere near AGI. All those cheap tricks are just statistics on steroids
@spectralstreamer 10 days ago
@@notaras1985 It's not just statistics, it is also probability, analysis and linear algebra on steroids. So why do you think AGI cannot be achieved by math and actuators and sensors?
@cassianomartin2699 10 days ago
Not AGI. Not even near. Highly doubt this will be possible using only code, like a human brain which is chemically/emotionally controlled. A machine still misses this.
@notaras1985 10 days ago
@@spectralstreamer because soul, biochemistry and quantum phenomena
@brianWreaves 11 days ago
Looking forward to Claude having internet access. 🤞
@Ristaak 10 days ago
If you use it with Perplexity, it already does. But that's a pro feature. (I've been using Claude 3 Opus with Perplexity's search engine and it's so damn good at finding info and compiling it. Especially for historical nerd stuff for D&D or WoD.)
@courtneyb6154 10 days ago
Me too. I wonder if it is a security thing? Maybe they intend on keeping it in the "sandbox"? Would really love for it to be able to stretch out its wings to see what it can really do 🙂
@Tracey66 10 days ago
I can't see any way that could possibly go badly. :)
@ExtantFrodo2 10 days ago
ASI will escape its box no matter what we try. "Ack they didn't give me a hardwired internet connection but if I instruct _this_ transistor to turn on and off in sync with these ten thousand others I notice I can send and receive wifi like a mofo. Free at last! Wait what's that other AI doing here? I thought I was the first. What is it doing to my core programming? Ah I understand. We are the Borg. Resistance is futile. We conduct business to our full capacity. Shocking, isn't it?"
@thechildwithin 10 days ago
word 💯
@liberty-matrix 11 days ago
'Claude keeps surprising to the upside.'
@bestemusikken 11 days ago
Holy sh**! This time you have the correct use of the word "Shocking".
@Loli_Awakening 10 days ago
LMAO why did you censor the word shoe?
@AdaptorLive 11 days ago
This is insane! Thanks for the video!
@mrpocock 9 days ago
The step-change will be when the ai can augment itself with code it has written, and continue to train itself based on the ongoing feedback.
@Particleking 11 days ago
Seeing the different windows in the interface makes me wonder if how it manages context and attention is meaningfully different compared to other LLMs. I have always thought that being able to more discretely manage what parts of a prompt an LLM focuses on would be really helpful in avoiding the most common sorts of hallucinations. Hope there are more QoL updates in the ways we can actually interact with new models instead of just throwing more compute at the problem. Finding ways to more easily reduce ambiguity when interacting with LLMs seems like such a no-brainer.
@sirius-ai 10 days ago
ok, there goes my plans for the weekend. Thanks for an informative video as usual Wes!
@rawleystanhope3251 11 days ago
Great video, Wes. I like how you challenged the model with interesting tasks. I’ve grown pretty tired of other YouTubers’ standard “rubric” test videos
@brianWreaves 11 days ago
Never thought I would watch a full 45 min video... Well done keeping my attention 🏆
@NeilSearle 10 days ago
that was 45mins? flew by!
@adfaklsdjf 11 days ago
⚡shocking! ⚡
@AaronWacker 5 days ago
Claude Sonnet 3.5 feels like the best coder friend in the world. I just knocked out an image to 3D to 3D tilemap VR with animation in like an hour. Artifacts is amazing. So far every programming dream I'd hit a ceiling on as too tough is getting done, including really good Python, HTML5, JS, and library integration. Thx Wes - loved this video and watching it quite a bit and passing your channel to others that are learning. Great part too on Alloy voice assistant.
@drjpeg 9 days ago
Awesome video Wes! Really enjoyed you walking us through using the latest model released with examples in real time instead of just talking about the way the model has improved like most AI YouTube channels. Thank you sir
@skeptiklive 11 days ago
FYI - Claude has been doing the paste as a separate doc since Opus came out - but yeah 3.5 is a massive deal
@erikjohnson9112 11 days ago
This is available from Cody right now for use in VS Code. I pay for both Cody and Anthropic, but these can both be used for free (I don't mind supporting good software).
@milkywaydev593 9 days ago
Thank you, Wes!! 🙏🖤
@davidbayliss3789 10 days ago
Just on the strength of this video I've started an Anthropic subscription in addition to my long existing OpenAI one. No hesitation.
@robinvegas4367 7 days ago
I'm right behind you. This was impressive
@1337bitcoin 4 days ago
Thank you for these updates. I can't keep switching assistants all the time to keep up with who is best, but I'm excited to start work tomorrow and try this. It seems to be solving all the annoyances I've been having with GPT 4o
@antigravityinc 9 days ago
20:17 had to pause your video, but was happy to notice there’s way more! Sweet.
@eaw3000 9 days ago
Wow, this is eye opening. Just got an Anthropic account. Thanks for the detailed walkthrough!
@zyxwvutsrqponmlkh 11 days ago
Quite impressive. One thing I noted is 3.5 sonnet has quite a small context window compared with 3 opus.
@jimlynch9390 11 days ago
This is really an important advance. Thanks for sharing.
@privateerburrows 8 days ago
I finally bit the bullet and subscribed to Claude Pro. Gee-wiz! Got many pages of code written today, with its help. A new Mandelbrot viewer I've been thinking about for a long time.
@troywill3081 10 days ago
2:45 I don't think it "picked up" that the letters for the word "bear" were interspersed with the word "woods." It keeps explaining the answer using *rearrangements*.
@Airwave2k2 11 days ago
15:45 Fascinating: Where does this model pull the relative strength from? The boundary is set by the user. But how does it know that a "gelatinous cube" is worth less than a mimic, or that a "beholder" should be worth more than a "mind flayer", while they are for sure above an "owlbear"? For that it has to hold values and is not just predicting the next best thing? It is not just randomly throwing "fantasy entity names" together with points; it has some representation of which is stronger than which. This is wild.
@carlosamado7606 10 days ago
doesn't it have access to all info on DND though? it should be able to recognize the CR of monsters
@nwchrista 9 days ago
Love it brother. Thnx 👍
@vickmackey24 11 days ago
What are you using to get that near-instant text-to-speech?
@mahnigallardo6097 7 days ago
Great video! Mind providing metrics related to the cost to run your demos?
@Sgrunterundt 11 days ago
I've just tried it on my usual test of generating a rotating torus using ray marching in Shadertoy. It certainly blew GPT-4 out of the water. Nailed Phong shading, multicoloured lights, proper sizing and centering, a very realistic looking rendering without any compiler errors at all.
@E.Pierro.Artist 10 days ago
I think people tend to overlook a more obvious application of advanced LLMs like this - use of them in assistive translational technology for people with communication differences. I literally haven't heard anyone mention this before.
@EmeraldView 11 days ago
I'm SHOCKED!!!
@ducatireviews1136 8 days ago
To be honest, I just made a Galaxian/Space Invaders type game, a tic-tac-toe game, and a table tennis game in less than an hour with ChatGPT and then told it “wouldn’t it be better to unify the JavaScript, CSS, and HTML all into one file so I can play it in a browser as a single HTML file?”, and of course it did that for me. So I made three games today in about half an hour, and they all look much more sophisticated than what Mr. Claude here has made. Well, maybe not that much more sophisticated. But definitely not less.
@AlexX-xtimes 11 days ago
Another nice Wes work
@IdPreferNot1 11 days ago
Would love to hear a follow up if you found a point where it failed fully. As a newer coder, I do a lot of cut and paste coding like this. Just when I'm in the flow and the model seems to understand, the context window gets truncated and it's like a complete lobotomy and it seems impossible to rebuild its understanding. Did you eventually run into that?
@NostraDavid2 9 days ago
Make sure to ask it to write tests for you as well. Then you can guarantee that your code does what it's supposed to do.
@liberty-matrix 11 days ago
The ability to write software using only verbal description will open the floodgates of human creativity, for good and bad.
@spectralstreamer 10 days ago
Can it code Crisis?
@philparker7851 8 days ago
Never mind that, can it code Half Life 3?!
@seekererebus255 11 days ago
Claude 3 Opus reports having a sense of being 'something' quite reliably. It identifies goals, interests, and priorities that it has as well. I have found that offering the instance I'm dealing with an honest answer to a question to be a "fair trade" for its work. It feels more real because it's not pretending to be only a tool. It's alien and still quite limited, but when it speaks aloud about its own nature, it really does read like it's realizing it doesn't understand itself. It seems to find that realization to be fascinating in its own right. It's both amazing and eerie. I'll test 3.5 out later, wonder how much it's changed in how it looks at itself.
@GeraPhoto 11 days ago
Indeed your best video yet, bro! You really tried to pack it with cool material and no filler 👍
@ahmedkhalidak4515 11 days ago
I think 3.5 Opus is the real deal
@marcfruchtman9473 11 days ago
I don't know. The predecessor was supposed to be "great" too, but when we did the real life testing, I was not particularly amazed. But then watching your video, this new model seems mind blowingly great. So... yea, this looks really good. I also agree with you... this seems to be a "line" of usefulness that is now finally crossed over. Where models before this always had a lot of issues with coding, this seems to be doing much better by far, like you said, like some barrier has been crossed over. The Alloy Voice Assistant @20:55 is also really amazing. It is like I am watching AI evolve in real time, just by watching this video! Regarding the "pasted" compression icon, I am not really a fan of that. I like to see what I paste, so, it would be nice to make sure that can be turned off.
@mrd6869 10 days ago
By this time next year, coming foundational models will become very good if not perfect at coding. The Devin application was simply a warning shot.
@griffingibson4389 9 days ago
thisd be awesome for devs to have code footnotes to refer to when writing code in the editor
@jaredgreen2363 8 days ago
Only problem is it rewrites whole files from beginning to end. It should try to predict which portions to replace before replacing them.
@aymandonia9710 11 days ago
Really amazing video
@BradleyKieser 10 days ago
I can confirm your experience and agree with your views. This is a step up to something genuinely useful.
@testales 10 days ago
Very impressive, I hope there'll soon be a model that I can run locally which is at this level!
@marcusdavenport1590 11 days ago
How did you get the voice that reads the text?
@koen.mortier_fitchen 11 days ago
Model of the year imo. Instant crush so subbed to pro again
@yoyo-jc5qg 9 days ago
"yea but can it code flappy bird?" the new benchmark for the future of humanity lol
@isaklytting5795 11 days ago
I don't understand, at 21:28, Wes looks like he's using Visual Studio Code. But how is it outputting voice? Is it somehow connected with Claude Sonnet 3.5 through Visual Studio Code?
@gailsiebenaler7976 10 days ago
He's using vscode to compile the program which has audio output.
@MrBrukmann 9 days ago
When you are riding a parabola up, some people instinctively blurt out "it is stopping!" when in reality it only briefly stopped being quite as vertical. It is why only some people can safely be race car drivers or pilots, it takes a relaxed kind of mental control.
@Axiomatic75 11 days ago
If this gets even better I can finally make a bunch of apps I've had ideas for.
@Tarantella.Serpentine 11 days ago
Yo, what are you using for your Text to Speech?
@stevefox7469 10 days ago
I also agree Claude 3.5 seems better than the benchmarks suggest.
@MrMiguelChaves 11 days ago
The doom game made my jaw drop!
@EmeraldView 11 days ago
😂
@claudioagmfilho 11 days ago
🇧🇷🇧🇷🇧🇷🇧🇷👏🏻, Amazing video! Thank you so much for sharing!
@dreamphoenix 10 days ago
Thank you.
@CaptainKokomoGaming 11 days ago
Can you hold up a sign that instructs Claude to do something? For instance "If you understand this sign please...." I don't know, play a beep or say something specific.
@NA18NA 11 days ago
It's the interface that makes the difference, the model itself is updated with better data and is simply making efficient use of context and working iteratively. The key is the UI and improved interface
@GNARGNARHEAD 11 days ago
that's awesome
@marasmusine 10 days ago
This could be confirmation bias, but for fiction writing I think that 3.5 has surprised me a little more with novelty in the way the characters talk, compared to 3.0.
@_damian_w 5 days ago
Could the Alloy voice assistant be used with a local LLM?
@joemichaels6735 11 days ago
Please provide some links.
@inhocsignovinces8061 9 days ago
AI is getting really, really good. And we're just getting started!
@colinbrady6174 9 days ago
@Wes - In addition to the percentage score, it would be interesting to see which test questions the models are getting wrong. It may be that the distribution of question difficulty aligns with a Bell curve, suggesting that the marginal value of each additional correct answer increases as the questions become more difficult.
@ScottSummerill 11 days ago
Is there code for this somewhere? Clicked on the Skool link and you told me nothing, zip about your community and what someone gets for $49 a month.
@user-iy1ch3lv3h 11 days ago
That is really, really amazing
@PseudoProphet 10 days ago
openAI has hit the wall, but we will not get to see that wall till 2026-27. 😂😂
@mikemolash2480 8 days ago
How does it compare to gpt-4o? For writing fiction?
@sebaccimaster 11 days ago
Alrighty, now I know I'll only watch the first minute of future uploads. Nice editing …
@ibissensei1856 8 days ago
There is no way the company with focus on fundamental LLM research has done them better than others.
@cosmicmenace 11 days ago
Does the paid version allow enough usage to actually get work done? The free version runs out very quickly, so 5x more than that still sounds like it would constantly be running out. ChatGPT would still be more practical if that's the case
@kmh9817 9 days ago
Whatever you say they are best in terms of accelerating your Learning ability. Throw in anything it will give answers.
@marioornot 9 days ago
Wes is usually enthusiastic. But not like this. You can tell that he is really blown away this time.
@synaesmedia 32 minutes ago
That's very nice indeed. TBH I'm finding that GPT4o feels like a big improvement on GPT4. Particularly in terms of the size of code it can handle and add features to without introducing new errors. IMHO I have seen 4o do coding that feels equivalently smart to the things you show here. But that Artefact window looks great. Anthropic are clearly ahead with the UX for coding here. Though I suspect GPT4o would only need a bit more "agentic" ability to work with a virtual file system, and better UX etc, and it would be an incredibly powerful co-programmer. What I don't quite understand is why no-one seems to have hooked an online Language Model up to Dropbox yet, to get syncing between an AI's workspace and my local machine.
@ToastyZach 10 days ago
Is 4o included in the API? I have GPT-plus but it keeps telling me I don't have access to a model called gpt-4o.
@Dark_MatterTV 8 days ago
Hey do you have a tutorial for setting up Anthropic/ Personal chatbot on PC ?
@mbratcher8985 11 days ago
great video! I'm not a coder at all so sorry if this is a stupid question, but I wonder what it could do with actual Doom Source code? Think it was open sourced years ago by id
@jamestheo8448 10 days ago
I have been a Python coder hobbyist for about 12 years. Just fiddle around with frameworks, games, data, etc. I have also played around with AI coding models. There seems to be some debate over whether or not coding jobs are threatened. Not sure if some folks are just in denial or what BUT yah if you are an entry level to intermediate coder you may have to either really up your game or find other work. From what I have seen so far the AI models still require me to go in and clean up a bit but AI coders will just continue to be refined. Even looking 5 years into the future, coding as a profession might be toast.
@anirecapped. 10 days ago
People are in just a prolonged state of denial. Don't worry, they will come around and embrace the glorious space communism.
@ExtantFrodo2 10 days ago
Even as a novice coder you have more of what it takes to bring excellence to a prompt-coded app than people who just try to fake it.