Consistent Hashing | The Backend Engineering Show

  Рет қаралды 40,356

Hussein Nasser

Hussein Nasser

Күн бұрын

In this episode of the backend engineering show I discuss consistent hashing a very important algorithm in distributed computing specially in database systems such as Apache Cassandra and DynamoDB.
0:00 Intro
2:00 Problem of Distributed Systems
5:00 When to Distribute
7:00 Simple Hashing
9:30 Where Simple Hashing Breaks
11:40 Consistent Hashing
18:00 Adding a Server
21:15 Removing a Server
22:30 Limitations
Fundamentals of Networking for Effective Backends udemy course (link redirects to udemy with coupon)
network.husseinnasser.com
Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon)
database.husseinnasser.com
Introduction to NGINX (link redirects to udemy with coupon)
nginx.husseinnasser.com
Python on the Backend (link redirects to udemy with coupon)
python.husseinnasser.com
Become a Member on KZfaq
/ @hnasr
Arabic Software Engineering Channel
/ @husseinnasser
🔥 Members Only Content
• Members-only videos
🏭 Backend Engineering Videos in Order
backend.husseinnasser.com
💾 Database Engineering Videos
• Database Engineering
🎙️Listen to the Backend Engineering Podcast
husseinnasser.com/podcast
Gears and tools used on the Channel (affiliates)
🖼️ Slides and Thumbnail Design
Canva
partner.canva.com/c/2766475/6...
Stay Awesome,
Hussein

Пікірлер: 68
@hnasr
@hnasr 2 жыл бұрын
Learn about the fundamentals of Database Engineering Get my course database.husseinnasser.com
@bibekjoshi9021
@bibekjoshi9021 2 ай бұрын
I scoured internet to understand this for last 2 hours. And i stumbled upon hussein's video's again... everything is crystal clear now. Excited for your new course on OS , bought it but need to go through it 😁😁😁😁
@serathiuk
@serathiuk 23 күн бұрын
I am watching random videos on KZfaq to understand Consistent Hashing. With this video, the subject is clear now for me. I will read again the Chord paper, and now probably i will understand. I subscribed your channel. You are a good teacher.
@AleksandarT10
@AleksandarT10 2 жыл бұрын
Great topic, its amazing to see great products such as Dynamo and Cassandra use this advanced ideas! Would like to see more topics like that
@startup_cult
@startup_cult Жыл бұрын
This explanation is so beautiful man, loved it. Consistent Hashing feels like magic
@Altamashattari786
@Altamashattari786 2 жыл бұрын
Beautiful explanation of this complicated topic :) Thank you Hussein !
@PrashantKhanolkarUSA
@PrashantKhanolkarUSA Жыл бұрын
My mind is blown the way you explained this topic
@sharvyahmed
@sharvyahmed 2 ай бұрын
Hussein, you explain things so nicely! The depth of your knowledge can be easily seen. Keep doing the great work :)
@lakhveerchahal
@lakhveerchahal 2 жыл бұрын
What a great effort to explain all this!!
@driziiD
@driziiD Жыл бұрын
i finally get it thanks! everytime i visit your channel i'm rest assured that im going to actually understand what you're teaching :)
@vikramragunathan6392
@vikramragunathan6392 Жыл бұрын
Possibly the best explanation for Consistent Hashing. :)
@MallikarjunRaoShankesi
@MallikarjunRaoShankesi 4 ай бұрын
Excellent explanation! Seen many videos but this one cleared everything....
@potaraju92
@potaraju92 4 ай бұрын
You have incredible talent to teach
@kartheek6495
@kartheek6495 2 жыл бұрын
Thank you so much sir for making this concept super simple
@channuangadi7504
@channuangadi7504 9 ай бұрын
What an amazing explanation without any fancy writing and animation
@simplefinance5165
@simplefinance5165 2 жыл бұрын
I was avoiding this topic from a very long time due to complexity, you explained it so well. Thanks.
@brymstoner
@brymstoner 2 жыл бұрын
very interesting. thanks hussein! i do something similar for flat file db's. similar that is in ranging. so for every object between numbers n and n, find them in file x.
@addiegupta
@addiegupta 2 жыл бұрын
great video as always. thank you
@luisphilipe
@luisphilipe 2 жыл бұрын
Awesome content. Keep doing it!
@vinoths7140
@vinoths7140 5 ай бұрын
Amazing explanation, Thank You.
@Adi-xy8iu
@Adi-xy8iu 2 жыл бұрын
Great explanation!
@officialismailshah
@officialismailshah 11 ай бұрын
Nicely described your struggle show in this video keep it up bro
@samart3010
@samart3010 Жыл бұрын
What a great talk Hussain. ❤
@yousefkhaled2
@yousefkhaled2 2 жыл бұрын
nice video understood perfectly, thnx
@eshaanpandey7353
@eshaanpandey7353 8 ай бұрын
Nicely explained.
@harirambj
@harirambj Жыл бұрын
Great explanation
@programmingwithjavascript3579
@programmingwithjavascript3579 2 жыл бұрын
After watching gaurav sen video. I would say, hussain nesser is best👍💯 🙏
@kartheek6495
@kartheek6495 2 жыл бұрын
🤣
@fadygamilmahrousmasoud5863
@fadygamilmahrousmasoud5863 8 ай бұрын
and now this is what we call a state of art
@tanumoymajumdar874
@tanumoymajumdar874 Жыл бұрын
Beautiful.
@mohammedsalman3397
@mohammedsalman3397 2 жыл бұрын
are you reading my mind, I was just writing a hashtable for my opengl renderer to cache unifroms :)
@dhanushshetty7840
@dhanushshetty7840 2 жыл бұрын
What would happen if your server count goes above 360 in the above scenario? Do you have to reshuffle all keys one time using higher number?
@alejandrombc
@alejandrombc 2 жыл бұрын
Awesome video!!. One question with the ring topology, how can we mantain a “right” amount of load through the servers?. In your example if you add servers that go around the same degrees you might have a debalanced system (like 4 servers between 0-90 and just 2 between 180-270). Is this a limitation or I dont quite get it right? 🤔
@AjaSiva
@AjaSiva 2 жыл бұрын
in real life there are virtual nodes for each servers like s0_0, s0_1, s0_2..and so on. These are distributed in the ring, so eventually you will have a more balanced distribution
@himanshusharma9817
@himanshusharma9817 2 жыл бұрын
Can you talk about virtual nodes and data replication as the next part of this video ?
@khalidgria6863
@khalidgria6863 Жыл бұрын
very important algorithm in distributed computing specially in database systems such as Apache Cassandra and DynamoDB.
@HamzaKhan-oz2xm
@HamzaKhan-oz2xm 2 жыл бұрын
NIcely explained
@ravitejavankam2977
@ravitejavankam2977 2 жыл бұрын
Some ideas: 1. While adding server: we can have dedicated slave for each server and we can promote the slave as master like in this case slave of S90 will act as S50. 2. When removing server: instead of dedicated server we can also have slave on adjacent nodes for eg S90 slave will be in S0 and S180 and if we want to remove S90 then slave data of S90 in S180 will act as leader.
@shahman1
@shahman1 2 жыл бұрын
Can you elaborate the slave idea? Would the slave be another server along with S90 called S90slave (for example)? And when a new one is added (S50 in this case), the S90slave becomes S50 but with only the values that belong to S50?
@ravitejavankam2977
@ravitejavankam2977 Жыл бұрын
@@shahman1 Yes S90slave will act as leader for S50 and can discard data outside the degree.
@AJ-fb9pk
@AJ-fb9pk 2 жыл бұрын
I did not understand why to use consistent hashing, after watching this I now understand why and also only noticed why it is called consistent. The hashing that is consistent even when adding new nodes.
@AJ-fb9pk
@AJ-fb9pk 2 жыл бұрын
Almost consistent
@roshedulalamraju7936
@roshedulalamraju7936 Жыл бұрын
In consistent hashing , if one datanase is full and goes to read only mode but can't insert new data, how does this works?
@saurabhjagtap
@saurabhjagtap Жыл бұрын
Well taking this example, you cannot have more than 360 servers, right? What if the number of servers increases to 361, how will we handle such complexities? Even if we change the modulo to, lets say, 720: this will add up more to complexities, like earlier result was 1 (when we did %360) now it will be 361. How would we handle such scenarios?
@andresgutierrez1804
@andresgutierrez1804 2 жыл бұрын
Thx for the cupon
@anthonyoleinik6472
@anthonyoleinik6472 2 жыл бұрын
When we add another server to the ring, won’t that server and the next server in the ring get half the load? I.e the loads are now unbalanced. Is there a solution for this or is this a given caveat?
@anthonyoleinik6472
@anthonyoleinik6472 2 жыл бұрын
Interesting trade off… adding a server moves less data, but you end up with an unbalanced load
@anthonyoleinik6472
@anthonyoleinik6472 2 жыл бұрын
But I guess in the first place, if your load is initially balanced and you start running out of space, you run out of space on all the servers, not just one. So you would add N servers and your load is balanced again. 3 comments instead of just editing the first for that sweet sweet youtube algorithm :) great vid!
@hoangnguyendinh1107
@hoangnguyendinh1107 2 жыл бұрын
They add the concept of virtual nodes mean multiple virtual nodes are associated with one physical node and are distributed along the hash ring. Hence it may reduce the uneven distribution of moving data.
@anthonyoleinik6472
@anthonyoleinik6472 2 жыл бұрын
@@hoangnguyendinh1107 That is very very clever! I could see how that solves the problem. Thanks!
@danku1013
@danku1013 2 жыл бұрын
If you really think about it, that's not only about system design but also some algorithmic stuff that many people hate on the coding interviews. Please make more videos related to system design, there are a lot of interesting things(you probably covered most of them ahaha). I was nostalgic watching this video if you know what I mean. 😊
@jobosan4855
@jobosan4855 2 жыл бұрын
Index 3!
@sachinchauhan6489
@sachinchauhan6489 Жыл бұрын
Rendezvous hashing is more simple than consistent hashing and solves the same problem. But there are other tradeoffs.
@5590priyank
@5590priyank 2 жыл бұрын
when we add a new server, rather than new server talking to next server to transfer keys, how about it gets populated lazily as and when cache miss happens
@BlagiwRoblox
@BlagiwRoblox Жыл бұрын
Somebody shall ping the data number destribution when have an issue there you can't guess on that or just turn a script look for every single one in the array do there can be said upgrade to that do I can't spoil it, bc it is coming 👍
@SocialTransmission
@SocialTransmission Жыл бұрын
Does introducing a new server in your example not result result in the load being unevenly distributed? With 4 nodes, each node is receiving an equal proportion of the total range 0-360. Adding a node between S0 and S90 means that the range between S270 and S90 is served by 3 nodes while the rest of the hash range is still served by 2 nodes. It sounds like one needs to add 4 new nodes to get an even distribution of load.
@fedefede843
@fedefede843 10 ай бұрын
The most common way to mitigate this is using more than one hash function for the servers. A good value could be log(M), where M is the max number in the circle, in this case 360. This way you "ensure" the are not big chunks without a server in the circle and keys get distributed uniformly. Off course you pay the price of complexity, and when remove/add a server, you need to reallocate log(M) times. Cheers!
@AhmedMahmoud-wz6du
@AhmedMahmoud-wz6du 2 жыл бұрын
wow
@panixx8289
@panixx8289 2 жыл бұрын
Another problem is that data load not evenly distributed if many servers are close on the ring
@hnasr
@hnasr 2 жыл бұрын
Right! Hot spots.
@fedefede843
@fedefede843 10 ай бұрын
​@@hnasr a way to mitigate that is to have multiple hash functions for the servers.Cheers!
@BlagiwRoblox
@BlagiwRoblox Жыл бұрын
this is like a HEX
@BlagiwRoblox
@BlagiwRoblox Жыл бұрын
and then we play endless snake that consume more and more trouble how we explain it XD
@WeilongYou
@WeilongYou Жыл бұрын
So this video doesn't discussion V-Nodes yet?
@sathvikreddy5799
@sathvikreddy5799 2 жыл бұрын
Hussein I am a fan. Please pin this comment.
@PGinPublic
@PGinPublic 2 жыл бұрын
first
@parasite6731
@parasite6731 2 жыл бұрын
🎖️this for your Now enjoy
@krishnabirla16
@krishnabirla16 Жыл бұрын
Expected a better explanation. Expect a long follow up video.
The cost of Hash Tables | The Backend Engineering Show
25:26
Hussein Nasser
Рет қаралды 34 М.
DNS is beautiful
41:01
Hussein Nasser
Рет қаралды 46 М.
No empty
00:35
Mamasoboliha
Рет қаралды 8 МЛН
Spot The Fake Animal For $10,000
00:40
MrBeast
Рет қаралды 178 МЛН
DAD LEFT HIS OLD SOCKS ON THE COUCH…😱😂
00:24
JULI_PROETO
Рет қаралды 16 МЛН
Consistent hashing, Rendezvous hashing | Вопросы собеседований
48:08
Андрей Суховицкий
Рет қаралды 1,4 М.
Prime Video Swaps Microservices for Monolith: 90% Cost Reduction
35:10
Hussein Nasser
Рет қаралды 157 М.
Hashing Algorithms and Security - Computerphile
8:12
Computerphile
Рет қаралды 1,5 МЛН
Threads and Connections | The Backend Engineering Show
49:30
Hussein Nasser
Рет қаралды 63 М.
How Discord Stores Trillions of Messages | Deep Dive
1:08:33
Hussein Nasser
Рет қаралды 174 М.
Consistent Hashing | Algorithms You Should Know #1
8:04
ByteByteGo
Рет қаралды 294 М.
Memcached Architecture - Crash Course with Docker, Telnet, NodeJS
1:04:51
8 Товаров с Алиэкспресс, о которых ты мог и не знать!
49:47
РасПаковка ДваПаковка
Рет қаралды 166 М.
My iPhone 15 pro max 😱🫣😂
0:21
Nadir Show
Рет қаралды 211 М.
Частая ошибка геймеров? 😐 Dareu A710X
1:00
Вэйми
Рет қаралды 1,7 МЛН
Что делать если в телефон попала вода?
0:17
Лена Тропоцел
Рет қаралды 3 МЛН