Messenger/WhatsApp Design Deep Dive with Google SWE! | Systems Design Interview Question 4

9,417 views

Jordan has no life

Rumor has it that to be a menace in the DM, you must first learn about the platform in which they are sent - Proverbs 1:2
00:00 Introduction
02:18 Functional Requirements
04:10 Capacity Estimates
05:18 API Design
06:31 Database Schema
11:15 Architectural Overview

Comments: 70
@dorio5535 2 years ago
Just found your System Design Study Guide and your channel. Can't believe these resources are free to access! Keep up the good work 🙏🏼
@jordanhasnolife5163 2 years ago
I appreciate it man! All I ask is you share with your friends :)
@03chiku 4 months ago
Hi Jordan, I recently started following your videos. They are super informative, and I'd say they are among the best on YouTube as far as I know.
@DickWu1111 1 year ago
Again, words cannot express how thankful I am. I have a bunch of interviews coming up and this is saving me a ton of time. This is the best system design course on the internet :P Yes even better than the paid ones out there. You shouldn't be doing this for free to be honest LOL. About server-sent events, could you explain a little bit more on how the automatic re-connection works? Also I'm a bit confused on why only websockets have the thundering herd problem: if a partition is down and the load needs to be redistributed to the other servers, don't all of the options we have here (SSE, websockets, long polling) need the client to re-establish a connection with the server? how come only websockets have the thundering herd problem in this case?
@jordanhasnolife5163 1 year ago
Haha first off thanks man! I was studying it on my own anyways, so maybe one day I'll make a better paid version, but at the moment I'd rather help you all out and keep it free, and maybe one day the following that I gain will help me out in some other way. Server sent events also have the thundering herd problem. The reason that long polling doesn't is that those connections are gradually destroyed and reconnected, meaning they can go through the load balancer at random intervals, as opposed to all at once. I actually have a video devoted to this where I explain it better.
@CptDakrapa 1 year ago
@jordanhasnolife5163 I had the same question, nice to see your answer. I also want to thank you for these excellent videos. I was recently laid off and your videos make it so easy to prepare for the system design interviews due to the high information density. Thanks!
@jordanhasnolife5163 1 year ago
@CptDakrapa Glad to hear and best of luck with your search!
@mickeyp1291 7 months ago
Regional issues: the server events application service would be by region, reading from streams of logged-in users by region, and groups by latest timestamp (Redis clusters over a stream of the same name). My login svc updates MirrorMaker regarding regional replication, while polling the Redis cluster users K Groups list and sending a KV map of group to brokers, taking into consideration what topics are in what region, making messages closer. Replication would be handled by the regional replicator using the group to brokers_region list. when replication is updated I wrote this hurriedly so I hope it makes sense. The client has a group->brokers local map.
@jordanhasnolife5163 7 months ago
Yeah I think I'm a bit lost on this one, feel free to elaborate further!
@pashazzubuntu 2 years ago
The man, the myth, the legend! Thank you! My Uber interview is in a week, so I would like to get acquainted with your attempt at quadtrees, but I feel like it would be in a month at most...
@jordanhasnolife5163 2 years ago
Haha, I promise they're a lot less complicated than they seem! Just watch my geohash video if need be.
@kamalsmusic 2 years ago
I'm not sure the argument for long polling makes sense (avoiding the thundering herd problem). Even if you use long polling, a bunch of users will be connected to some host, and if that host goes down, those long poll connections will still need to be re-established to another host right? If you are using consistent hashing for long polling, all the users that get dropped when a server dies will just get mapped to the same server that is next on the ring
@jordanhasnolife5163 2 years ago
It's mostly when adding new servers :) dev.to/kevburnsjr/websockets-vs-long-polling-3a0o
@backend-engineering-yz9mp 3 months ago
@jordanhasnolife5163 This was a great reference.
@brandonwl 9 months ago
SSEs need to create a request for every single message sent to the server, but this is not the case with websockets since they are two-way communication. So wouldn't SSEs be less efficient? Moreover, custom reconnect logic needs to be created when a node goes down to prevent cascading failures, so the inability of websockets to auto-reconnect isn't necessarily a disadvantage. What do you think?
@jordanhasnolife5163 9 months ago
All reasonable to me, I think you can certainly make the case for websockets here
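The custom reconnect logic mentioned in this thread is commonly paired with randomized backoff, so that clients dropped by a failed node do not all hit the load balancer at the same instant. The following is only a minimal sketch of that idea: the `reconnect_delay` helper and its `base` and `cap` values are hypothetical and do not come from the video.

```python
import random
from typing import Optional


def reconnect_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                    rng: Optional[random.Random] = None) -> float:
    """Full-jitter exponential backoff.

    Returns a random delay in seconds, drawn uniformly from
    [0, min(cap, base * 2**attempt)], so retries spread out over time.
    """
    source = rng if rng is not None else random
    upper = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, upper)


# Hypothetical usage: delays a client might wait before retries 0..4.
delays = [reconnect_delay(attempt) for attempt in range(5)]
```

A client would sleep for the returned delay before each reconnect attempt; because every client draws its own random value, the reconnects arrive spread out instead of as one spike.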
@2tce 2 years ago
I still don't understand, from your explanation, how the chat servers will send/forward the chats/messages to the recipients. Those recipients (a subset of them) may not have active handshakes with the servers. Besides, the POST call from the client was towards the LB, not the server directly (although that can be transferred). This is not possible with SSE unless we use sockets or long polling 🤔
@jordanhasnolife5163 2 years ago
Yeah, if a client is offline and doesn't have an active connection with a server, it ought to pull messages from the DB when it comes back online, or it can also just resume from its last known position within the log based message queue.
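The "resume from its last known position" idea in the reply above can be sketched with an append-only log and a per-client offset. This is a toy in-memory model under stated assumptions: `ChatLog` and `Client` are hypothetical names standing in for one partition of a log-based message queue and a reconnecting device, not a real broker API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChatLog:
    """Append-only log; a message's offset is its index in the list."""
    entries: List[str] = field(default_factory=list)

    def append(self, message: str) -> int:
        self.entries.append(message)
        return len(self.entries) - 1  # offset of the message just written

    def read_from(self, offset: int) -> List[str]:
        return self.entries[offset:]


@dataclass
class Client:
    next_offset: int = 0  # first offset this client has not seen yet

    def catch_up(self, log: ChatLog) -> List[str]:
        """Fetch everything written since the last known position."""
        missed = log.read_from(self.next_offset)
        self.next_offset += len(missed)
        return missed


log = ChatLog()
client = Client()
log.append("hi")
seen_online = client.catch_up(log)    # client is connected and reads "hi"
log.append("are you there?")          # written while the client is offline
log.append("ping")
seen_after_reconnect = client.catch_up(log)  # only the two missed messages
```

Because the client stores only an offset, the server side does not need to track which messages each offline device has missed.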
@ohileshwaritagi2971 2 years ago
The legend is back!!
@jordanhasnolife5163 2 years ago
Can't be gone for too long - funny seeing you in the company DMs
@2tce 2 years ago
Very interesting. For my choice of a DB to store the messages, I'll go with Riak KV. It is also leaderless and has better multi-datacenter replication with hash rings per cluster per datacenter. Offers CRDTs resolving conflicts, and Vector clocks for concurrent message ordering (if need be - may be useful for chats 🤔). Finally, Riak has better Fault Tolerance etc etc etc.... One question: Considering that these messages keep coming infinitely, would a time series DB like Riak TS be useful?
@jordanhasnolife5163 2 years ago
I think that a time series DB could be good for this, definitely! I chose Cassandra due to leaderless replication, whereas I imagine that TSDBs would use single leader replication (though maybe Riak would not in its TSDB)
@naren_legha 10 months ago
"Offers CRDTs resolving conflicts": assuming each chat message is going to be written once (mostly), and updates are not frequent (moreover, a user can only modify their own messages), I am not sure what the source of conflicts would be?
@naren_legha 10 months ago
In place of a time series DB, can we not use the timestamp as a clustering key? Cassandra orders all records within a partition using clustering keys. E.g., the primary key for Cassandra can be (chat_id, ts, user_id)
@naren_legha 10 months ago
Regarding the ordering of messages, can we borrow the idea of version numbers from the "handling concurrent writes" topic to infer causal order? Each message will have a message sequence number. For each new message sent by user A, the message sequence number will be 1 + max(message seq number of all messages in the chat seen so far by user A). The wall clock can be used as a tie-break. This approach has limitations and UX factors need to be considered as well, but I feel using logical timestamps like this is a better approach than simply using wall-clock timestamps. Do you think this approach can work?
@jordanhasnolife5163 10 months ago
Can totally work, I just think it's overkill lol, but makes for a good discussion
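The "1 + max seen so far" scheme proposed above is essentially a Lamport-style logical clock. Here is a minimal sketch of it; the `Message`, `ChatParticipant`, and `display_order` names are hypothetical, and the wall-clock values are made up to show a sender whose clock runs behind.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    seq: int           # logical sequence number
    wall_clock: float  # sender's wall-clock time, used only to break ties
    sender: str
    text: str


class ChatParticipant:
    def __init__(self, name: str) -> None:
        self.name = name
        self.max_seen = 0  # highest sequence number seen in this chat

    def send(self, text: str, wall_clock: float) -> Message:
        msg = Message(self.max_seen + 1, wall_clock, self.name, text)
        self.max_seen = msg.seq
        return msg

    def receive(self, msg: Message) -> None:
        self.max_seen = max(self.max_seen, msg.seq)


def display_order(messages: List[Message]) -> List[Message]:
    # Sequence number first; wall clock and sender name only break ties.
    return sorted(messages, key=lambda m: (m.seq, m.wall_clock, m.sender))


a, b = ChatParticipant("A"), ChatParticipant("B")
first = b.send("hello", wall_clock=100.0)
a.receive(first)
reply = a.send("hi back", wall_clock=99.5)  # A's clock runs behind B's
ordered = [m.text for m in display_order([reply, first])]
# ordered == ["hello", "hi back"], even though the reply has the smaller wall-clock time
```

Sorting by wall clock alone would place the reply before the message it answers; the sequence number keeps a reply after anything its sender had already seen.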
@constantinekhvalin6038 1 year ago
11:50 65K is only a hard limit for a specific - - combination.
@jordanhasnolife5163 1 year ago
True!
@nubanimator 2 years ago
Can we get a design ticketmaster next? I appreciate your new perspective on these problems!
@jordanhasnolife5163 2 years ago
Not next but next few! Gimme a couple weeks then I got you
@sagarjvora1 2 years ago
This is great. One question- I have been using distributed systems and have seen consistent timestamps across various systems. You mentioned something about timestamps usually being a problem. Can you explain why?
@jordanhasnolife5163 2 years ago
When you have a bunch of different systems, timestamps cannot be exactly consistent with one another. This is because of something known as clock drift, where the quartz crystal in a computer tends to go slightly faster or slower than it should. You can sync up with a time server frequently using NTP, however even then you can't account for the time it takes for the result to go over the internet. I have a dedicated video about this in more detail which I'd recommend watching.
@2tce 2 years ago
This is why we use logical clocks. Atomic clocks come to the rescue but are SUPER expensive.
@yingchen592 1 year ago
could you elaborate more about "chat servers are going to communicate with each other using the consistent hashing pattern delegated by the load balancer"? Thanks!
@jordanhasnolife5163 1 year ago
Sure - let's imagine person A wants to send a message to person B. Person A is connected to server 1 via websocket, and person B is connected to server 2. How does server 1 know which server person B is connected to? It can use consistent hashing. If we know person B's user id, we know which server it is connected to, and then server 1 can get the IP of server 2 and send the message there.
@yingchen592 1 year ago
@jordanhasnolife5163 I see, looks like server 1 needs to send the message through an internal load balancer?
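The lookup described in this thread (server 1 working out which server person B is connected to from B's user id) can be sketched with a small consistent hash ring. This is an illustration under assumptions: the `HashRing` class, the server names, and the choice of MD5 with 100 virtual nodes per server are all hypothetical, not details from the video.

```python
import bisect
import hashlib
from typing import List


def _hash(key: str) -> int:
    # Any stable hash works; MD5 is used here only for even spreading.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, servers: List[str], vnodes: int = 100) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        # Each server owns several points ("virtual nodes") on the ring.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, user_id: str) -> str:
        # First ring point clockwise from the user's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(user_id)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["chat-1", "chat-2", "chat-3"])
target = ring.server_for("user-b")  # where messages for user-b should be forwarded
```

Any chat server that builds the ring from the same server list computes the same answer, so it can forward the message without asking a central directory. Removing a server only remaps the users that were assigned to it.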
@neek6327 2 years ago
Awesome vid! Just adding a thought. I think with chat apps consistency is more important than availability, so that's maybe why grokking prefers hbase as it uses that single leader region server architecture (which I learned from your hbase vid lol). Cassandra's cool too though as you can always tune for stronger consistency.
@jordanhasnolife5163 2 years ago
Interesting point and makes sense! The one reason though why we may not need as much consistency is due to the fact that the websockets are the ones propagating the messages as opposed to the database (at least as they're being sent). That being said you could definitely be right and I can tell you're learning fast!!
@neek6327 2 years ago
@jordanhasnolife5163 Ah gotcha, that makes sense. Ya I've been powering through your vids in preparation for interviews next month. They're a hidden gem. I feel like they won't be hidden long though! I've been recommending them and I am sure others are too.
@jordanhasnolife5163 2 years ago
@neek6327 Really appreciate that man! Good luck on the interviews lmk how they go!
@rajrsa 1 year ago
I only know about Server Sent Events in a theoretical sense. Does using SSE for sending messages mean that every message will be a stream that is consumed by the receiver? I am guessing it would take in a lot of different streams? So, is it possible that the client gets overloaded at a time like New Year's Eve, when a lot of people are sending messages?
@jordanhasnolife5163 1 year ago
IIRC it's just over the HTTP protocol
@johnsonjthomas 1 year ago
Regarding the issue of clients connecting to different chat servers, what if we use chatID as the load balancing key so that all the user chat sessions go to the same chat server? This way the single chat server would have all the websockets for all users on a particular chat (single chat between 2 users or group chat). I guess one downside of this approach would be if you have a large group chat and there is a high number of chat messages, all of this would be on a single chat server as compared to different servers.
@jordanhasnolife5163 1 year ago
So I thought about this when making the video - I think that the issue here is that most people are a part of many chats. If each chatId corresponds to a server, I may have to maintain hundreds of active websockets to servers, which isn't really very feasible for my phone (things like data usage, battery, etc.). I think in an ideal world, though, having every single person on just one chat server would be great :)
@prateekaggarwal3305 2 years ago
Why are we not using graph databases for this social media feature? I think it would be good to discuss why or why not.
@jordanhasnolife5163 2 years ago
I think it's out of scope for the problem but would be an interesting concept to try and do consistent hashing of users based on where they are in a social graph. Seems complex though, in the sense that you have to split up the social graph into regions and use those as a hash, maybe FB actually does something like this!
@scottlim5597 1 year ago
Can we use Kafka as the message broker? Each user acts as both producer and consumer, and the topic is the Chat_ID?
@jordanhasnolife5163 1 year ago
Can you elaborate on this? Yeah a message broker solution could be reasonable here, more likely you'd have people writing to queues corresponding to a chat id and then connect to the consumer via a websocket. Issue is you may need to maintain many websockets if you're in a bunch of chats.
@axings1 6 months ago
@jordanhasnolife5163 I think a message broker can just replace the consistent hashing part (used for server-to-server communication). The end user still connects to a server via websocket to receive messages.
@mayankkaushik6837 2 years ago
Such a great video! Really impressed with your knowledge, especially as a new grad. Where did you learn all this stuff from? Designing Data-Intensive Applications? I followed the grokking course but it lacks this sort of depth, e.g. which DB uses LSM or B-tree etc.
@jordanhasnolife5163 2 years ago
Yep definitely read designing data intensive applications, it's very interesting! I've done a decent amount of googling this stuff on my own outside of that, but that textbook was definitely the most worth my time
@2tce 2 years ago
@jordanhasnolife5163 Yup! That textbook did help a ton.
@valty3727 1 year ago
How would we address scaling the MySQL database? I was considering functional partitioning (where we separate the db into three dbs user, chat, userchat that can be replicated and sharded etc.) but at the same time that would slow down reads due to having to query multiple databases. That brings us to sharding, but if I'm being honest I don't even know how sharding based on something like user_id would work when we have multiple tables in the same database and 'chats' doesn't even have a user_id to shard with. thanks in advance!
@jordanhasnolife5163 1 year ago
Yeah I'd probably separate all user related metadata to MySQL tables which are sharded by userid and then all chats are on a separate Cassandra database sharded by chat id
@valty3727 1 year ago
@jordanhasnolife5163 ty! Another question I got: what are some hints that you should use a message queue? During the design of this problem I was thinking that maybe the chat service should feed new messages into a message queue, since that has the potential of being a bottleneck (sort of like what you did with the Post Service in your Twitter video), but here you didn't use one.
@jordanhasnolife5163 1 year ago
@valty3727 The reason that messages are not a bottleneck here and they are in Twitter is that in Twitter the post needs to be delivered to many places (the inbox of all of the followers). In this case the message is just being written once to the database. A good way to tell if you're going to need to use a message queue is if the message requires a lot of work to be done (hence it would be too much for our application servers), or a message requires multiple different writes to either the same or different databases.
@valty3727 1 year ago
@jordanhasnolife5163 thank you for responding i’m literally right about to have my interview!🙏🙏
@jordanhasnolife5163 1 year ago
@valty3727 Good luck! Where are you interviewing?
@raj_kundalia 8 months ago
thank you!
@franklinyao3833 2 years ago
I don't understand the phrase "map them with the front end" as a way to make sure everyone in the group has a consistent sequence of messages.
@jordanhasnolife5163 2 years ago
In front end development (specifically with React), you use a function called map on an array of data to show a list of items to a user. In this case, we'd be mapping an array of message objects, to a visible message bubble on the screen, and ordering them based on their assigned timestamp.
@franklinyao3833 2 years ago
@jordanhasnolife5163 But we know that clocks in a distributed system are not reliable, so how does the timestamp work?
@jordanhasnolife5163 2 years ago
@franklinyao3833 they're not reliable in the sense that the order isn't necessarily consistent with reality. But for a messenger app, this isn't a big deal. If I sent my message a millisecond before you but it looks like you sent yours first, who cares? The issue of distributed timestamps being problematic is much more important for something like analytics and metrics, where you want very accurate data, or perhaps in a database design where you want to be able to order all writes across partitions.
@franklinyao7597 1 year ago
@jordanhasnolife5163 How about the following situation: B sends A a message, then A reads it and sends B a message back. The second message hits another service, which generates a smaller timestamp. After B reads the second message and orders the messages, the order is reversed. This violates causality, since the second message was sent after user A had read the first message.
@AJ-ju7tl 8 months ago
@franklinyao7597 That won't happen: timestamps drift by milliseconds. The scenario you describe involves HTTP round trips, reading a message, and typing a response, all of which takes a minimum of 3 seconds.
@bkrichar 2 years ago
Where should the chat id come from?
@jordanhasnolife5163 2 years ago
Recall that the Chats table can be relational, so you can just use the ID from that. If the chats table is partitioned over many machines, you can use node+local_chat_id for the id
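The node+local_chat_id idea from the reply above can be sketched as follows. This is only an illustration: `ChatIdAllocator` is a hypothetical name, the in-memory counter stands in for whatever auto-incrementing ID each database partition would issue, and the `-` separator is an arbitrary choice.

```python
import itertools


class ChatIdAllocator:
    """Issues chat IDs of the form '<node_id>-<local counter>' for one partition."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self._counter = itertools.count(1)  # stand-in for a per-node auto-increment

    def next_id(self) -> str:
        return f"{self.node_id}-{next(self._counter)}"


node_a = ChatIdAllocator("node-a")
node_b = ChatIdAllocator("node-b")
ids = [node_a.next_id(), node_a.next_id(), node_b.next_id()]
# ids == ["node-a-1", "node-a-2", "node-b-1"]
```

Each node only needs its own counter, so no coordination between partitions is required to keep IDs unique.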