"Think Before You Speak" (Quiet-STaR) AI Experiment

7,270 views

All About AI

A day ago

"Think Before You Speak" (Quiet-STaR) AI Experiment
👊 Become a member and get access to GitHub and Code:
/ allaboutai
🤖 AI Engineer Course:
scrimba.com/?ref=allabtai
📧 Join the newsletter:
www.allabtai.com/newsletter/
🌐 My website:
www.allabtai.com
GitHub Repo:
github.com/AllAboutAI-YT/thin...
Quiet-STaR Paper:
arxiv.org/pdf/2403.09629.pdf
In this video I share what would be my GPT-5 dream feature, and I run an experiment with Claude 3 Opus where I try to create a "Think Before You Speak" System 2 (Quiet-STaR) style thinking experiment using prompting. Interesting results.
00:00 GPT-5 Dream Feature
02:34 System 2 Thinking Prompting
04:50 Writing the AI Code
11:07 Test 1 - Is it working?
14:06 Test 2 - Base Line
17:13 Test 3 - Final Answer
20:58 Test 4 - GPT-4 Eval
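To make the idea concrete, here is a minimal sketch of the two-pass "think before you speak" prompting loop described above. It assumes the Anthropic Python SDK and a Claude 3 Opus model ID; the prompts are illustrative and the actual code in the repo may differ.

```python
# Minimal two-pass "think before you speak" sketch (illustrative, not the repo code).
# Assumes the `anthropic` package is installed and ANTHROPIC_API_KEY is set.
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-3-opus-20240229"  # assumed model ID, swap in whatever you use

def ask(prompt: str, system: str | None = None) -> str:
    """Single call to the model, returning plain text."""
    kwargs = {"system": system} if system else {}
    resp = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
        **kwargs,
    )
    return resp.content[0].text

def think_before_you_speak(question: str) -> str:
    # Pass 1: a hidden "System 2" monologue the user never sees.
    thoughts = ask(
        f"Question: {question}\n"
        "Think step by step about how to answer. "
        "Write your reasoning only, not the final answer.",
        system="You are the model's inner monologue.",
    )
    # Pass 2: answer the user with the hidden reasoning as context.
    return ask(
        f"Question: {question}\n"
        f"Your private reasoning was:\n{thoughts}\n"
        "Now give the final answer to the user, concise and clear."
    )

if __name__ == "__main__":
    print(think_before_you_speak("What color is a banana?"))
```

Splitting the call this way keeps the reasoning out of the user-facing reply while still letting it shape the final answer.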

Comments: 76
@AllAboutAI a month ago
GitHub Repo: github.com/AllAboutAI-YT/think-before-you-speak
@AllAboutAI a month ago
awesome, just uploaded the code to the repo, feel free to check it out! let me know if you have any questions or issues, always happy to help :)
@davediamond a month ago
Brilliant as always. Thank you for sharing, legend!
@AllAboutAI a month ago
thnx a lot mate, really appreciate it :) thnx for tuning in and the kind words!
@TimTruth a month ago
Great points- thanks man
@AllAboutAI a month ago
no problem :) thnx for tuning in!
@Jeffben24 a month ago
Thank you. This is by far the best prompt engineering I have tested! I would like to express my gratitude to you. Can you share more prompt engineering methods?
@petersobolewski1354 a month ago
That's exactly my experiment ;) I'm trying to simulate System 1 and System 2 thinking by simulating the conversation flow before the response reaches the user :)
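As a rough illustration of this comment's variant, here is a short sketch that stages a System 1 / System 2 exchange internally before anything reaches the user. The `ask` callable is a hypothetical single-call wrapper around whatever LLM client you use.

```python
# Sketch of an internal System 1 / System 2 exchange (illustrative only).
# `ask` is a hypothetical wrapper: it sends one prompt and returns the model's text.
from typing import Callable

def system1_system2(question: str, ask: Callable[[str], str]) -> str:
    # System 1: fast, intuitive draft.
    fast = ask(f"Answer quickly and intuitively, in one or two sentences:\n{question}")
    # System 2: slow, careful review of the draft.
    critique = ask(
        f"Question: {question}\nQuick answer: {fast}\n"
        "Act as a slow, careful reviewer: is the quick answer correct? "
        "What is missing or wrong?"
    )
    # Only the reconciled answer reaches the user.
    return ask(
        f"Question: {question}\nQuick answer: {fast}\nReview: {critique}\n"
        "Write the final answer the user should actually see."
    )
```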
@OdinsTechAndGaming a month ago
Great video, as always
@AllAboutAI a month ago
thnx a lot, really appreciate the kind words :)
@BillyRybka a month ago
Yeah, Claude 3 is awesome! Finally an LLM that does what I want it to do lol
@AllAboutAI a month ago
yeah im really happy with the progress, hoping to see more cool stuff from claude in the future :)
@SuccessDynamics a month ago
Interesting approach! Thank you for your experiments, I am always happy to see your ideas. I immediately thought of systemic counseling. In short, the counselor would not give advice, but would motivate the client to pursue his wishes by asking open questions, or to come to the answer himself, since he already has the answer for himself anyway.
@AllAboutAI a month ago
thnx a lot, really appreciate the feedback! that's a super interesting perspective - i love the idea of guiding someone to find their own answer rather than just giving advice. definitely something i'll think more about as i continue exploring these kinds of systems. cheers!
@miker99 a month ago
You could give more compute to deeper chains of thought. Maybe ask several LLMs for responses, ask all of them to think more deeply about the subject, and then have each of them assess the others' answers and work out the best response. All of this takes a heap of compute though... but the result may be a very thorough and concise answer.
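A rough sketch of this multi-model idea, under the assumption of a hypothetical `ask_model(model, prompt)` helper for whichever LLM clients you have; it is not from the video's repo.

```python
# Sketch: several models draft answers, critique each other, then one model
# synthesizes the best final response. `ask_model` is a hypothetical helper
# that sends `prompt` to the named model and returns its text.
from typing import Callable

def ensemble_answer(question: str, models: list[str],
                    ask_model: Callable[[str, str], str]) -> str:
    # Step 1: each model drafts its own chain-of-thought answer.
    drafts = {m: ask_model(m, f"Think step by step, then answer:\n{question}")
              for m in models}
    joined = "\n\n".join(f"[{m}]\n{d}" for m, d in drafts.items())

    # Step 2: each model reviews all the drafts (including its own).
    reviews = [ask_model(
        m, f"Question: {question}\n\nCandidate answers:\n{joined}\n\n"
           "Point out errors and say which answer is strongest and why.")
        for m in models]

    # Step 3: one model works the drafts and reviews into a final answer.
    return ask_model(
        models[0],
        f"Question: {question}\n\nDrafts:\n{joined}\n\n"
        "Reviews:\n" + "\n\n".join(reviews) +
        "\n\nWrite the single best, concise final answer.")
```

As the comment notes, this is compute-heavy: it makes 2·N + 1 model calls for N models.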
@adebinha1984 5 days ago
I loved the video and definitely learned something there! I just can't seem to connect this kind of double-pass prompting technique with the Quiet-STaR paper itself, which to me doesn't seem to be a prompting technique but rather a mechanism built into the LLM itself.
@hope42 a month ago
This is exactly what I've been saying. In first grade I was asked to read aloud to the class. I had dyslexia but I didn't know it. I decided to read each sentence in my head, constructing it before I spoke. When I read it aloud, the teacher advanced me to advanced reading. It took until third grade for them to figure out that I had a reading problem, but I thought before I spoke with the sentences I read aloud. I guess I faked out the teachers for three grades before they found out I had dyslexia after putting me in advanced reading. How messed up is that?
@helix8847 a month ago
What's GPT-4? It's all about Claude 3 now. :P
@AllAboutAI a month ago
haha, yeah i do agree. I consider Claude 3 > GPT-4 for almost all tasks now for sure. Haiku is also amazing
@matthewfuller9760 a month ago
It would be neat to see any differences in math performance. I want to create a Spanish tutor, and it's helpful to have an AI agent that can count words precisely. Maybe that is a simple prompting issue. One more thought: could this work for image generation?
@avi7278 a month ago
Kris, have you tried the Python package "rich"?
@NeuroNap a month ago
Cool, love the video. I have signed up as a member but I can't find the GitHub repo, what should I do?
@AllAboutAI a month ago
hey! just send me your github username to kris@allabtai.com and i will add you to the private repo asap =)
@MarcusHast a month ago
This sounds like ideas explored in chain-of-thought prompting or multi-step prompting. AFAIK there is no way to really have the LLM "think more" about the answer by spending more compute directly on the problem. It will always spend the same amount of compute on the same number of tokens, that's just how the algorithm works. So if you want it to "think more" about a problem, you have to make it generate more tokens, and possibly have it summarize the result at the end. Naturally that will make it more expensive.
@AllAboutAI a month ago
hey, thnx for the comment :) yeah you are right, it is def inspired by chain-of-thought and multi-step prompting. and i agree, the only way to make it "think more" is to make it generate more tokens. but i think combining it with some of the "self-reflection" techniques could be interesting. for now, it's mostly just a fun experiment, and an easy way to learn some of the core concepts in an applied way. thnx for tuning in! :)
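To make the "generate more tokens, then summarize" point from this exchange concrete, here is a small sketch. The `ask` callable is again a hypothetical single-call wrapper, and the pass count is just an illustrative knob.

```python
# Sketch: spend more compute by forcing extra reasoning tokens, then summarize.
# `ask` is a hypothetical wrapper that sends one prompt and returns text.
from typing import Callable

def answer_with_budget(question: str, ask: Callable[[str], str],
                       reasoning_passes: int = 3) -> str:
    notes = ""
    for _ in range(reasoning_passes):
        # Each pass makes the model emit more reasoning tokens,
        # building on the notes from the previous passes.
        notes += "\n" + ask(
            f"Question: {question}\n"
            f"Reasoning so far:{notes or ' (none yet)'}\n"
            "Continue the reasoning. Do not give the final answer yet.")
    # Final pass: collapse the accumulated reasoning into one short answer.
    return ask(
        f"Question: {question}\nReasoning:\n{notes}\n"
        "Summarize this reasoning into a short final answer.")
```

As the comment says, the extra passes cost more tokens, so the "thinking budget" translates directly into extra expense.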
@darrylrogue7729 a month ago
Great concept! I think during Nvidia's most recent GTC keynote, Jensen pointed out that the architecture of the Blackwell GPUs includes a kind of 'smart router' at the hardware level, which allocates computing resources based on the task requirements.
@AllAboutAI a month ago
thnx! yeah, that's really interesting about the nvidia blackwell chips. i'll have to check that out, sounds like it could be a step towards what i was describing. always cool to see hardware innovations that can support more advanced ai systems. thanks for sharing that!
@darrylrogue7729 a month ago
I just had to confirm: I used the transcript from the GTC keynote and then asked GPT-4 if there is in fact an intelligent router, and this was the response: Yes, Jensen Huang mentioned Blackwell's capability to allocate compute resources dynamically, especially in relation to AI models and their computational demands. This capability is tied to Blackwell's Transformer engine, which can dynamically and automatically rescale and recast numerical formats to a lower precision whenever possible. This feature is crucial for AI applications because it allows for more efficient computation by adapting the precision and computational resources based on the specific needs of the task at hand. Essentially, this means Blackwell can adjust its compute allocation to optimize performance and efficiency for various AI model demands, whether it's processing simple tasks or complex AI algorithms. This approach not only maximizes computational efficiency but also enhances the overall performance of AI applications running on Blackwell.
@AllAboutAI a month ago
Thnx:) really great observation, will def look into that!
@darrylrogue7729 a month ago
@@AllAboutAI 8/10
@playthisnote a month ago
And if it has already answered the question, it would have that for reference, and the allocation of needed "thought time" would be reduced. That would also be the thinking behind why "what is a banana" is simpler: it has already been answered, or has more data behind it. So a new question you asked yesterday would take less time tomorrow, after you or others have asked it. Collectively, it would be understandable for OpenAI to collect the data as we ask. I think the API doesn't save questions. You could, however, for easy reference: every question with its answer could be backed up, and the AI could read the questions for easy data retrieval, only grabbing info from the backup when it's useful.
@AllAboutAI a month ago
thnx for the insightful comment :) yeah i think you are spot on, if the model has seen the question before, the response would be much faster. collecting a knowledge base of questions and answers is def the way to go for a prod system. excited to see what the future holds in this space. appreciate you tuning in and sharing your thoughts!
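A minimal sketch of the question-and-answer cache idea from this exchange. The JSON file path and the `ask` wrapper are illustrative assumptions, and a production system would want fuzzy matching rather than exact keys.

```python
# Sketch: keep a local store of past Q&A pairs and only call the model when
# the question has not been answered before. Path and `ask` are assumptions.
import json
import os
from typing import Callable

CACHE_PATH = "qa_cache.json"  # hypothetical location for the backup file

def cached_answer(question: str, ask: Callable[[str], str]) -> str:
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
    key = question.strip().lower()
    if key not in cache:  # only "think" when the question is new
        cache[key] = ask(question)
        with open(CACHE_PATH, "w") as f:
            json.dump(cache, f, indent=2)
    return cache[key]
```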
@skaramicke a month ago
We dedicate the same level of compute per token, yes, but one of those questions will yield a longer response, and for each subsequent token the response-so-far is part of the context. Quite often ChatGPT changes its mind during a response, which is in effect spending more compute on a more complex question. Also, you can already switch compute allocation for different questions by changing ChatGPT models to the crappier-and-faster ones.
@AllAboutAI a month ago
hey mate :) thnx for the detailed comment and feedback, rly appreciate it. agree 100% that the current ChatGPT model is quite dynamic in how it spends compute and tokens. cheers!
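As a small illustration of the model-switching point above, here is a sketch that routes easy questions to a cheaper model and hard ones to a stronger model. The model names and the `ask_model(model, prompt)` helper are placeholders, and the one-word complexity check is deliberately crude.

```python
# Sketch: a cheap-model triage call decides whether the question needs the
# slower, stronger model. Model IDs and `ask_model` are placeholder assumptions.
from typing import Callable

FAST_MODEL = "fast-model"      # placeholder for a cheap/fast model ID
STRONG_MODEL = "strong-model"  # placeholder for a slower, stronger model ID

def routed_answer(question: str, ask_model: Callable[[str, str], str]) -> str:
    verdict = ask_model(
        FAST_MODEL,
        f"Is this question SIMPLE or COMPLEX? Answer with one word.\n{question}")
    model = STRONG_MODEL if "COMPLEX" in verdict.upper() else FAST_MODEL
    return ask_model(model, question)
```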
@skaramicke a month ago
Thirdly, your idea is awesome and I’m implementing something similar in my agent manager thingie.
@aifpl a month ago
Great video! Is it even possible to give "thinking time" to an LLM today, or in the future? What do you think?
@AllAboutAI a month ago
thnx a lot! yeah i think in the future we could see llms with more customizable "thinking time" and compute allocation. it's a really interesting concept and i'm excited to see how the tech evolves. for now, i'd say we're still a bit limited, but the core idea is really cool. can't wait to see what happens next!
@aifpl a month ago
Okay cool, do you think GPT-5 or some other model provider? @@AllAboutAI
@unsolved_mysteriez a month ago
Great job :) Where can I find this code? And will this be a feature in GPT-5?
@AllAboutAI a month ago
thnx a lot :) the code for this is on my github, just check the link in the description. as for gpt-5, that's just a dream feature of mine for now, but who knows what the future holds! hope you enjoyed the vid.
@AntonBj3 a month ago
It would surprise me if they didn't have some method to choose an appropriate amount of compute based on token length. But maybe you are right.
@AllAboutAI a month ago
yeah, I would like this to happen in a different system, before any token that I can see is generated: some kind of architecture that uses time to come as close as possible to the best/correct answer with high accuracy. I don't wanna regenerate for a "better answer", I want the "best" the first time I ask.
@KrisPeteCOD a month ago
yeah this is good :) what do you think would be the biggest challenge in creating this system? and how do i become a member to get access to the code?
@AllAboutAI a month ago
thnx :) the biggest challenge is probably the compute and storage needed to save all those inner monologue thoughts. becoming a member is easy, just use the link in the description to join up!
@KrisPeteCOD a month ago
@@AllAboutAI 8/10
@agedbytes82 a month ago
Great
@AllAboutAI a month ago
thnx a lot :)
@yoagcur a month ago
So what is the answer? Options are green, yellow, brown, or black.
@jimgsewell a month ago
Also red, or a combination of some of those colors
@chrisbloem730 a month ago
When would AGI start to create its own YouTube channels?
@AllAboutAI a month ago
thnx for the question! i'm as excited about the future of agi as you are, but i think we're still a ways off from agi creating its own youtube channels. there's still a lot of work to be done on the technical and safety side before we get to that point. for now, i'm focused on making great ai-powered videos to share with the community. but i'll definitely keep an eye on the progress of agi, it's a fascinating area! let me know if you have any other questions.
@Ms.Robot. a month ago
Aww… i was hoping it would use a voice and personality to speak to you. 😊
@jimgsewell a month ago
I really enjoy your channel and I'm interested in experimenting with Quiet-STaR too. I'm sorry to say, but I think that you should spend more time thinking about this.

In your first example, "What color is a banana?", I believe you are looking for an easy answer of yellow. Great, but this isn't the correct answer. A banana can be yellow, green, red, brown, black, or a combination of some of those colors. So, I don't think that the correct answer is as easy as you think it is. Also, a single quick search of Wikipedia for the Pythagorean theorem yields multiple examples of proofs. You almost need to know the answer to a question to know if it is an easy or difficult question. Even then, LLMs don't necessarily process info the same as we do. What is easy for us could be more difficult for the LLM, and what is difficult for our brain may be a simple lookup for the LLM.

Further, I don't think that you are being fair with your baseline question. You ask a very different question and don't ask it to format the answer in the way that you want, then complain about the format. It seems that you are purposefully biasing the phrasing of your questions in order to get the result that you want: that using the Quiet-STaR method results in better answers.

I'm with you man, I too think that the Quiet-STaR method or something like it can provide better answers with lesser models, but I think that this project is flawed enough that it really doesn't illustrate that. I have enjoyed many of your projects, you are one of my favorite AI channels, but I think this project needs to be better thought out. I think the analysis shows this to be true. I'm looking forward to your next project. I hope that you revisit this think-before-you-speak idea again in the future, after you have given it more thought.
@AllAboutAI a month ago
hey, thnx for the thoughtful comment and feedback, really appreciate it :) you def raise some good points, and i agree that the examples might be a bit cherry-picked to prove my point. but i still think the general idea has potential, though ofc it needs more testing and thought. thnx for supporting the channel and tuning in, awesome to have you onboard and get this kind of input, keep it coming =)
@avgplayer a month ago
Would this method get us closer to 42?
@AllAboutAI a month ago
thnx for the question! haha, well i guess it would get us closer to 42 in a philosophical sense, but i'm not sure the compute power is quite there yet. maybe in a few years with gpt-5 or 6 we could start to really crack that big question ;)
@wurstelei1356 a month ago
Did you notice that 42 was only the first part of the answer, and 179 with access to a 42 is the full one?
@ThinkBigPod a month ago
I don't see this happening for years, to be honest. And your "select compute allocation" seems sus. Do you mean people can set thinking time and compute allocation as parameters before running the prompt?
@AllAboutAI a month ago
yeah, i know it's not realistic right now, but that's how i'd love to see it evolve in the future. you're right, letting folks set the compute and thinking time would be pretty neat. maybe one day!
@TheHistoryCode125 a month ago
This video tries to sound cutting-edge but it's mostly over-engineered fluff. The core idea - having the AI reflect before responding - is interesting, but here's the real takeaway: 1) You can achieve a similar effect with simpler prompts that encourage thoughtful, personal responses. 2) Asking the AI to rate its own responses is pointless - it doesn't understand nuance the way humans do. 3) True value lies in your ability to guide prompts for the results you want, not fancy systems.
@Romathefirst a month ago
Bro typed a whole lotta bs, dumb comment
@GodbornNoven a month ago
The effect you achieve through prompting is far worse.
@GodbornNoven a month ago
I completely disagree with your second point. I'd even argue it's better at rating responses than most humans.
@GodbornNoven a month ago
3) that's completely wrong too. There's only so much you can do with prompting 😂.
@lydellty a month ago
Q*
@tswiftly89 a month ago
nah I think you are delusional, I don't think GPT-5 is even close to this yet... big fail
@AllAboutAI a month ago
nah, i dont think so mate. this is just a bit of fun and creativity, not meant to be super realistic. i'm just exploring some ideas and having a bit of a think about what i'd love to see in the future. if you're not into it, no worries! i know ai hasn't quite gotten there yet, but it's fun to imagine :) hope you still enjoy the vids!
@tsap1 a month ago
What's your email? Do you have a Discord server?