RoPE Rotary Position Embedding to 100K context length

3,109 views

code_your_own_AI

1 day ago

RoPE (Rotary Position Embedding) explained in simple terms: how a relative position encoding is applied when calculating self-attention in Transformers, and how it enables extended context lengths in LLMs.
All rights with the authors:
RoFormer: Enhanced Transformer with Rotary Position Embedding (RoPE)
arxiv.org/pdf/...
#airesearch
#aiexplained
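
To make the idea in the description concrete, here is a minimal NumPy sketch of rotary position embedding in the spirit of the RoFormer paper linked above; the helper name rope_rotate and the parameters (base=10000, head dimension 64) are illustrative assumptions, not code from the video.

```python
# A minimal RoPE sketch (illustrative assumption, not code from the video):
# rotate each 2-D pair of a query/key vector by an angle that grows linearly
# with the token's position in the sequence.
import numpy as np

def rope_rotate(x: np.ndarray, position: int, base: float = 10000.0) -> np.ndarray:
    """Apply a rotary position embedding to one query/key vector of even length d."""
    d = x.shape[-1]
    # One frequency per 2-D pair: theta_i = base^(-2i/d), i = 0 .. d/2 - 1
    theta = base ** (-np.arange(0, d, 2) / d)
    angles = position * theta               # rotation angle for each pair
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[0::2], x[1::2]        # interleaved (x_{2i}, x_{2i+1}) pairs
    out = np.empty_like(x)
    out[0::2] = x_even * cos - x_odd * sin  # standard 2-D rotation of each pair
    out[1::2] = x_even * sin + x_odd * cos
    return out

# The attention score between a rotated query and key depends only on the
# relative offset between their positions, not on the absolute positions:
q, k = np.random.randn(64), np.random.randn(64)
print(np.dot(rope_rotate(q, 5),   rope_rotate(k, 2)))    # offset 3
print(np.dot(rope_rotate(q, 105), rope_rotate(k, 102)))  # same offset 3 -> same score
```

Because each 2-D pair is rotated by position × θ_i, the dot product of a rotated query and key depends only on their relative offset, which is what makes the encoding relative rather than absolute.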

Comments: 13
@xiaohuiwang9308 25 days ago
Best rope explanation ever
@code4AI 25 days ago
Thanks!
@MrEmbrance 1 month ago
First LLM-related YT channel that doesn't suck, thanks!
@LamontCranston-qh2rv 3 months ago
Thank you SO MUCH for providing such high quality content! Very much enjoying all your many videos! If you have a chance, I'd love to see you discuss the recent work in giving AI spatial reasoning, i.e. artificial "imagination" (in its natural form, very much a core feature of human thought). Perhaps one might think about the creation of a "right brain" to go along with the "left brain" language models we have now? (Please forgive the over-simplification of human neuroscience.) Thanks again! All the best to you sincerely!
@desmur36 3 months ago
Amazing content! The explanations are SO clear! Thank you!
@riyajatar1311 13 days ago
Nice explanation. Could you give a talk on how we can use these techniques with an existing model, perhaps with a notebook walkthrough?
@dumbol8126 25 days ago
Is there anywhere I can find an example of increasing the context length of a model that previously used absolute positional embeddings, where it is switched to rotary embeddings and then fine-tuned to work on longer sequences?
@hangjianyu 2 months ago
There is a mistake: smaller dimensions change more quickly, and larger dimensions change more slowly.
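(For readers checking this claim, a tiny illustrative sketch, assuming the standard RoPE frequencies rather than anything shown in the video: with θ_i = 10000^(−2i/d), the first dimension pairs get θ ≈ 1 and rotate quickly per position step, while the last pairs get θ ≈ 10⁻⁴ and rotate very slowly.)

```python
# Illustrative check of the dimension/frequency ordering (assumed standard
# RoPE frequencies, not code from the video).
import numpy as np

d = 128                                       # example head dimension
theta = 10000.0 ** (-np.arange(0, d, 2) / d)  # theta_i = 10000^(-2i/d)
print(theta[0], theta[-1])                    # ~1.0 (fast) vs ~1.2e-4 (slow)
```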
@mshonle 3 months ago
If one rotation is good, how about going into three-dimensional rotations and using quaternions? Is there any work using that?
@AYUSHSINGH-db6ev 3 months ago
Hi Sir! Really love your videos! How can we access your presentation slides?
@paratracker 3 months ago
Maybe it's obvious to YOU that the solution is that complex exponential, but I wish you hadn't assumed that WE would all find it as self-evident as you do.
@code4AI 3 months ago
I see what you mean. You know, I spent some days finding simple explanations for the not-so-self-explanatory RoPE algorithm, especially since I will build on this in my second video, where we examine more complex, more recent ideas about RoPE. I decided on an approach that enables my audience to understand the main ideas and methods and go from there. I recorded 90 minutes for the second part and am currently cutting it down to a maximum of 60 minutes, striking a balance to provide insights for all my viewers. I'll try harder ....
LongRoPE & Theta Scaling to 1 Mio Token (2/2)
58:30
code_your_own_AI
1.3K views
Rotary Positional Embeddings: Combining Absolute and Relative
11:17
Efficient NLP
30K views
The Attention Mechanism in Large Language Models
21:02
Serrano.Academy
91K views
Has Generative AI Already Peaked? - Computerphile
12:48
Computerphile
966K views
The math behind Attention: Keys, Queries, and Values matrices
36:16
Serrano.Academy
235K views
Transformers explained | The architecture behind LLMs
19:48
AI Coffee Break with Letitia
23K views
LongRoPE
1:59:05
hu-po
2.8K views
Rotary Positional Embeddings
30:18
Data Science Gems
3K views