RoPE Rotary Position Embedding to 100K context length

3,109 views

code_your_own_AI

1 day ago

RoPE (Rotary Position Embedding) explained in simple terms: how a relative position encoding is applied when calculating self-attention in Transformers, and how it enables extended context lengths in LLMs.
All rights with the authors:
RoFormer: Enhanced Transformer with Rotary Position Embedding (RoPE)
arxiv.org/pdf/...
#airesearch
#aiexplained
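
To make the idea in the description concrete, here is a minimal NumPy sketch of rotary position embedding in the spirit of the RoFormer paper linked above; the helper name rope_rotate and the parameters (base=10000, head dimension 64) are illustrative assumptions, not code from the video.

```python
# A minimal RoPE sketch (illustrative assumption, not code from the video):
# rotate each 2-D pair of a query/key vector by an angle that grows linearly
# with the token's position in the sequence.
import numpy as np

def rope_rotate(x: np.ndarray, position: int, base: float = 10000.0) -> np.ndarray:
    """Apply a rotary position embedding to one query/key vector of even length d."""
    d = x.shape[-1]
    # One frequency per 2-D pair: theta_i = base^(-2i/d), i = 0 .. d/2 - 1
    theta = base ** (-np.arange(0, d, 2) / d)
    angles = position * theta               # rotation angle for each pair
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[0::2], x[1::2]        # interleaved (x_{2i}, x_{2i+1}) pairs
    out = np.empty_like(x)
    out[0::2] = x_even * cos - x_odd * sin  # standard 2-D rotation of each pair
    out[1::2] = x_even * sin + x_odd * cos
    return out

# The attention score between a rotated query and key depends only on the
# relative offset between their positions, not on the absolute positions:
q, k = np.random.randn(64), np.random.randn(64)
print(np.dot(rope_rotate(q, 5),   rope_rotate(k, 2)))    # offset 3
print(np.dot(rope_rotate(q, 105), rope_rotate(k, 102)))  # same offset 3 -> same score
```

Because each 2-D pair is rotated by position × θ_i, the dot product of a rotated query and key depends only on their relative offset, which is what makes the encoding relative rather than absolute.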

Comments: 13
@xiaohuiwang9308 25 days ago
Best rope explanation ever
@code4AI 25 days ago
Thanks!
@MrEmbrance 1 month ago
First LLM-related YT channel that doesn't suck, thanks!
@LamontCranston-qh2rv 3 months ago
Thank you SO MUCH for providing such high quality content! Very much enjoying all your many videos! If you have a chance, I'd love to see you discuss the recent work in giving AI spatial reasoning, i.e. artificial "imagination" (in its natural form, very much a core feature of human thought). Perhaps one might think about the creation of a "right brain" to go along with the "left brain" language models we have now? (Please forgive the over-simplification of human neuroscience.) Thanks again! All the best to you sincerely!
@desmur36 3 months ago
Amazing content! The explanations are SO clear! Thank you!
@riyajatar1311 13 days ago
Nice explanation. Could you give a talk on how we can use these techniques with an existing model, perhaps with a notebook walkthrough?
@dumbol8126 25 days ago
Is there anywhere I can find an example of increasing the context length of a model that previously used absolute positional embeddings, where it is switched to rotary embeddings and then fine-tuned to work on longer sequences?
@hangjianyu 2 months ago
There is a mistake: smaller dimensions change more quickly, and larger dimensions change more slowly.
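(For readers checking this claim, a tiny illustrative sketch, assuming the standard RoPE frequencies rather than anything shown in the video: with θ_i = 10000^(−2i/d), the first dimension pairs get θ ≈ 1 and rotate quickly per position step, while the last pairs get θ ≈ 10⁻⁴ and rotate very slowly.)

```python
# Illustrative check of the dimension/frequency ordering (assumed standard
# RoPE frequencies, not code from the video).
import numpy as np

d = 128                                       # example head dimension
theta = 10000.0 ** (-np.arange(0, d, 2) / d)  # theta_i = 10000^(-2i/d)
print(theta[0], theta[-1])                    # ~1.0 (fast) vs ~1.2e-4 (slow)
```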
@mshonle 3 months ago
If one rotation is good, how about going into three-dimensional rotations and using quaternions? Is there any work using that?
@AYUSHSINGH-db6ev 3 months ago
Hi Sir! Really love your videos! How can we access your presentation slides?
@paratracker 3 months ago
Maybe it's obvious to YOU that the solution is that complex exponential, but I wish you hadn't assumed that WE would all find it as self-evident as you do.
@code4AI 3 months ago
I see what you mean. You know, I spent some days finding simple explanations for the not-so-self-explanatory RoPE algorithm, especially since I will build on this in my second video, where we examine more complex, more recent ideas about RoPE. I decided on an approach that enables my audience to understand the main ideas and methods and go from there. I recorded 90 minutes for the second part and am currently cutting it down to a maximum of 60 minutes, striking a balance to provide insights for all my viewers. I'll try harder ....
LongRoPE & Theta Scaling to 1 Mio Token (2/2)
58:30
code_your_own_AI
1.3K views
Rotary Positional Embeddings: Combining Absolute and Relative
11:17
Efficient NLP
30K views
The Attention Mechanism in Large Language Models
21:02
Serrano.Academy
91K views
Has Generative AI Already Peaked? - Computerphile
12:48
Computerphile
966K views
The math behind Attention: Keys, Queries, and Values matrices
36:16
Serrano.Academy
235K views
Transformers explained | The architecture behind LLMs
19:48
AI Coffee Break with Letitia
23K views
LongRoPE
1:59:05
hu-po
2.8K views
Rotary Positional Embeddings
30:18
Data Science Gems
3K views