Twitter System Design - Microservices Architecture Part I


Twitter System Design - Microservices Architecture Part I - Google Interview Question

Views: 26,680

Days ago

This video covers the system design of the Twitter service. It is the first part of my Twitter system design video series. Here I discuss the microservices architecture of Twitter.
00:00 - Introduction
00:35 - Functional Requirements of Twitter System Design
06:00 - Non-Functional Requirements of Twitter System Design
08:05 - Application Programming Interface (API) Specs
13:20 - High-Level Microservices Architecture of Twitter Service
17:25 - Design of Tweet Service (includes database schema and generation of unique tweet id)
22:30 - Discussion on different mechanisms to shard the datastore
30:00 - Design of Social Graph Service
32:00 - Design of User Timeline Service
Distributed System Design Interviews Bible | Best online resource for System Design Interview Preparation is now online. Please visit: www.thinksoftw...?KZfaq-twitter
Please follow me on / think.software.community if you would like to be notified when new course chapters are added, when another round of mock interviews starts and you want to participate, or about any other updates. I will also take your suggestions about the course and the channel there.
Please check my other videos for more information about the following topics:
1. How to generate a unique id: • System Design Intervie...
2. Distributed Cache Design: • Distributed Cache Syst...
3. The right way to tackle the system design interviews: • The Format of Distribu...
Check out the following articles:
- How to Ace Object-Oriented Design Interviews: / how-to-ace-object-orie...
- Elevator System Design - A tricky technical interview question: / elevator-system-design...
- System Design of URL Shortening Service like TinyURL: / tinyurl-design-from-th...
- File Sharing Service Like Dropbox Or Google Drive - How To Tackle System Design Interview: / how-to-tackle-system-d...
- Design Twitter - Microservices Architecture of Twitter Service: / design-twitter-microse...
- How to Effectively Use Mock Interviews to Prepare for FAANG Software Engineering Interviews: / how-to-effectively-use...
- Payment Gateway System Design - How does the Stripe work: / payment-gateway-system...
I discuss distributed system design interview questions commonly asked at Google, Facebook, Netflix, Amazon, etc.
#FAANG #Facebook #Google #Amazon #Apple #Microsoft #Uber #Netflix #Oracle #Lyft #SystemDesign #Interview #ComputerProgramming

Comments: 97

@ThinkSoftware 4 years ago

Thanks for watching this video. Please let me know your feedback below.

@gersonadr2 1 year ago

I'm humbled by how good your content is. I'll binge watch the entire channel and become a better professional. Thanks!

@ThinkSoftware 1 year ago

Thanks for the comment 😊

@sakshipandeywishicudgetit 4 years ago

This is the best resource for understanding system design that I have come across so far. Thanks a lot for your efforts!

@ThinkSoftware 4 years ago

Thanks for the comment 🙂

@user-oy4kf5wr8l 3 years ago

buddy, u r really the best one on youtube! i am sure that many people will be willing to buy ur course!

@ThinkSoftware 3 years ago

Thanks for the comment 🙂

@0xggbrnr 4 years ago

Excellent video. I like that you pause to explain WHY you make each decision. Great job.

@ThinkSoftware 4 years ago

Thanks

@justinrao8458 2 years ago

Best Twitter Design video among all the ones on KZfaq. Should have had more likes!!!

@ThinkSoftware 2 years ago

Thanks for the comment 🙂

@akundlia 2 years ago

Thank you for the video. Great work! One comment: given that strong consistency is not a requirement, is W=4 needed for N=5? Even with W=2 or 3, the writes will eventually get replicated --> eventual consistency!

@ThinkSoftware 2 years ago

thanks for the comment

@govindaHari16 1 year ago

With N=5 and R+W>N, for W=2, R has to be at least 4. For W=3, R has to be at least 3. R = 3 or 4 are slower than what he proposed (W=4, R=2). Since Twitter is read heavy, we should keep R as small as possible.
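The quorum arithmetic in this thread can be checked with a few lines of Python. This is a minimal sketch that only assumes the overlap rule R + W > N mentioned in the comments; the function name `min_read_quorum` is made up for illustration.

```python
def min_read_quorum(n: int, w: int) -> int:
    """Smallest R with R + W > N, so every read set overlaps every write set."""
    if not 1 <= w <= n:
        raise ValueError("W must be between 1 and N")
    return n - w + 1


N = 5  # replica count used in the discussion
for w in range(1, N + 1):
    print(f"N={N} W={w} -> minimum R={min_read_quorum(N, w)}")
```

For N=5 this gives R=4 for W=2, R=3 for W=3 and R=2 for W=4, which matches the numbers quoted above. Dropping the overlap rule (for example W=2, R=2) is what trades read-your-writes behaviour for eventual consistency.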

@shubhangverma2428 3 years ago

You are a great teacher

@ThinkSoftware 3 years ago

Thanks for the comment 🙂

@AnkitGarg 2 years ago

Great video, thanks for the detailed explanation. At 30:10 you'll need a distributed transaction (2PC) to keep everything in sync. Or use event sourcing (but that's async).

@ThinkSoftware 2 years ago

Thanks for the comment 🙂

@uditagrawal6603 3 years ago

I think for the user timeline service we can use a wide-column database such as Cassandra, where the key would be the user ID and the columns would be tweets sorted in descending order of timestamp, sharded by user ID.

@ThinkSoftware 3 years ago

thanks for the comment :)

@ThakurArjun247 3 years ago

@Think Software, thanks for posting the awesome content, a few questions/suggestions: 1. How about using separate tables (or datastores) for separate services? The user timeline API can query a table sharded by userID, and the home timeline API can query a separate table (or datastore) that could be sharded by the IDs of the followees. In other words, model the data based on the queries that will run on it, similar to what we do in Cassandra data modelling. 2. It would be great to see separate data stores for all the services and the interactions between them (using APIs or queues). Once again great tutorial, please keep it up.

@ThinkSoftware 3 years ago

Thanks for the comment. Different microservices use different/separate data stores in the design that I have discussed.

@theghostwhowalk 4 years ago

@32.0 How to shard the User_Relation table: I think that since the system is read heavy, when we post a tweet we want all followers to get the tweet with minimum lag. Hence shard by the posting user's userID so that all followers of a user can be found in one shard. Then go to the respective shards and publish to the queue of each user's tweet feed. Sharding by tweetID would not make much sense.

@ThinkSoftware 4 years ago

Thanks for the comment 🙂
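The sharding idea from this thread (keep all followers of a posting user on one shard) can be sketched as follows. This is an illustrative in-memory model, not the design from the video: `NUM_SHARDS`, `shard_for`, `follow` and `followers_of` are hypothetical names, and real shards would be separate database nodes.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 8  # assumed shard count, for illustration only


def shard_for(user_id: str) -> int:
    """Map a user ID to a shard with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# One dict per shard, standing in for one database node each:
# followee_id -> set of follower_ids.
shards = [defaultdict(set) for _ in range(NUM_SHARDS)]


def follow(follower_id: str, followee_id: str) -> None:
    """Store the edge on the shard chosen by the followee (the posting user)."""
    shards[shard_for(followee_id)][followee_id].add(follower_id)


def followers_of(followee_id: str) -> set:
    """Single-shard lookup: every follower edge for this user lives together."""
    return set(shards[shard_for(followee_id)].get(followee_id, set()))
```

With this layout a fan-out for one tweet reads a single shard. The cost is that a user with very many followers concentrates data and load on that one shard, which is the skew concern raised elsewhere in the comments.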

@talivanov93 4 years ago

Great video, helped me a lot. Thank you!

@ThinkSoftware 4 years ago

Thanks for the comment 🙂

@vineetbhargava4141 2 years ago

Please keep making videos. One request: design a highly scalable flash sale system.

@ThinkSoftware 2 years ago

Thanks for the comment 🙂.

@SalmanAhmed-cn9qe 4 years ago

Hi sir, your videos are very good. Please add more videos on system design and OOP design if possible.

@ThinkSoftware 4 years ago

Thanks for the comment. More videos coming soon.

@mehranbehbahani3050 2 years ago

Question 1 -> Regarding keeping each user's tweets in memory in the timeline service: we are talking about 224 bytes x 100 tweets per user x 500,000,000 users = 11.2 TB of information! Is it really economical to store that much data in memory? How many machines would we even need (taking the number of replicas into consideration!)? Even if it is, based on your design this data is accessed only when a user's profile is accessed, which is not that frequent. Why not keep this data only for the hottest 20% of our users and evict the LRU entries? Question 2 -> If we assume that your implementation of the user timeline is feasible and really needed, why not have the Home Timeline service access Social Graph to get the list of the user's followees, then go to the User Timeline service and get the K most recent tweets for each of those followees (choosing K based on the user's number of followees so that the merged list has about 500 tweets and merging and ordering stay fast), then merge and sort them and return the feed? I understand it would be O(n log n), but it would still be fast considering the size of the data retrieved (about 500 tweets).
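The pull-based approach proposed in question 2 can be sketched with a k-way merge. This is only an illustration of the commenter's idea, under the assumptions that each per-user timeline is already sorted newest first and that tweets are `(timestamp, tweet_id)` tuples; `per_followee_k` and `build_home_timeline` are hypothetical names, and the ceiling division is just one possible way to pick K.

```python
import heapq
import math
from itertools import islice


def per_followee_k(num_followees: int, feed_size: int = 500) -> int:
    """One way to choose K so the merged candidates cover about feed_size tweets."""
    return max(1, math.ceil(feed_size / max(1, num_followees)))


def build_home_timeline(followee_timelines, limit):
    """Merge per-user timelines (each newest first) into one newest-first feed.

    Each tweet is a (timestamp, tweet_id) tuple.
    """
    merged = heapq.merge(*followee_timelines, key=lambda t: t[0], reverse=True)
    return list(islice(merged, limit))


alice = [(105, "a3"), (101, "a2"), (90, "a1")]
bob = [(103, "b2"), (95, "b1")]
print(build_home_timeline([alice, bob], limit=3))
# [(105, 'a3'), (103, 'b2'), (101, 'a2')]
```

Because the inputs are already sorted, the merge is lazy and only pulls as many tweets as the feed needs. The per-request cost still grows with the number of followees, which is the trade-off against precomputing the home timeline at write time.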

@fokkor 3 years ago

This is again an amazing video. I have 2 questions: 1. Regarding sharding by user ID: we can definitely solve the scale problem by adding more replicas, but what about data skew for the celebrity user shards? Isn't the idea in data storage to keep partition size deltas minimal? 2. Regarding storing tweets in memory to generate the timelines: I'm guessing this is going to be 140 + 60 additional bytes (user ID + timestamp, etc.). Storing 200 bytes * 100 * 300 Mn users = 6 TB of memory. If we pick machines with 16 GB of memory, it will take at least 375 app servers just for this. Is this a scalable, cost-efficient approach?

@ThinkSoftware 3 years ago

Thanks for the comment. These kinds of things are discussed in the upcoming course. Just to answer your #2 here: we won't be keeping all the tweets in cache, only the tweets for the active users. How we define active users is a different story.
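The back-of-the-envelope numbers quoted in the two comments above can be reproduced directly. The sketch below uses only the figures given by the commenters (tweet size, tweets per user, user counts, 16 GB machines, a 20% hot set) and decimal units; replication is ignored, and the function names are made up for illustration.

```python
TB = 10**12  # decimal terabyte
GB = 10**9   # decimal gigabyte


def timeline_cache_bytes(bytes_per_tweet: int, tweets_per_user: int, users: int) -> int:
    """Memory needed to keep the most recent tweets of every user in cache."""
    return bytes_per_tweet * tweets_per_user * users


def servers_needed(total_bytes: int, memory_per_server: int) -> int:
    """Ceiling division: number of servers if memory is the only constraint."""
    return -(-total_bytes // memory_per_server)


estimate_a = timeline_cache_bytes(224, 100, 500_000_000)
estimate_b = timeline_cache_bytes(200, 100, 300_000_000)
hot_users_only = estimate_a * 20 // 100  # cache only the hottest 20% of users

print(estimate_a / TB)                      # 11.2
print(estimate_b / TB)                      # 6.0
print(servers_needed(estimate_b, 16 * GB))  # 375
print(hot_users_only / TB)                  # 2.24
```

Restricting the cache to active users, as the reply suggests, scales the total linearly: a 20% hot set brings the 11.2 TB estimate down to 2.24 TB before replication.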

@Claudius025 4 years ago

Why isn't there a load balancer and routing service in front of the microservices such as home timeline, social graph, user timeline, etc.?

@ThinkSoftware 4 years ago

The high-level architecture is a logical diagram. When I discuss the individual service designs, I show the LB.

@ganzee6928 2 years ago

Curious, why should the user token be part of every API definition? Auth information can be part of the request headers.

@ThinkSoftware 2 years ago

Putting the user token in a request header is only applicable to a REST/HTTP implementation of an API. What if the API is implemented over some other RPC protocol?

@muralidhargali2596 1 year ago

Please let me know whether the APIs are REST based. If so, why don't the API signatures have GET, PUT methods?

@ThinkSoftware 1 year ago

Thanks for the question. I have answered a similar question before in other comments. Using REST or HTTP or even RPC is just an implementation of an API. Here we are discussing APIs at an abstract level without going into details of how an API will be implemented.

@experience147 4 years ago

Why do we want to generate the tweet ID using the same mechanism as in the TinyURL video? We can use a UUID, right, since we don't have a constraint on tweet ID length as we do for TinyURL?

@ThinkSoftware 4 years ago

UUID will work as well.
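To make the two options in this exchange concrete, here is a small sketch. The first function uses a random UUID, as the reply allows. The second is a hypothetical time-ordered 64-bit layout (timestamp, worker, sequence) shown only for contrast; it is an assumption for illustration and not the mechanism from the video or from Twitter itself.

```python
import itertools
import time
import uuid
from typing import Optional


def new_tweet_id_uuid() -> str:
    """Random 128-bit ID: no coordination needed, but not sortable by time."""
    return str(uuid.uuid4())


_sequence = itertools.count()  # per-process counter (assumed single worker process)


def new_tweet_id_time_ordered(worker_id: int, now_ms: Optional[int] = None) -> int:
    """Hypothetical layout: millisecond timestamp | 10-bit worker | 12-bit sequence.

    IDs from later milliseconds always compare greater. Within the same
    millisecond, ordering holds only until the 12-bit sequence wraps.
    """
    if not 0 <= worker_id < 1024:
        raise ValueError("worker_id must fit in 10 bits")
    timestamp = int(time.time() * 1000) if now_ms is None else now_ms
    sequence = next(_sequence) % 4096
    return (timestamp << 22) | (worker_id << 12) | sequence
```

A UUID is the simpler choice when length does not matter. A time-ordered ID is worth the extra machinery mainly when tweets need to be sorted or range-scanned by creation time using the ID alone.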

@AbhishekKumar-hr1sr 4 years ago

The readUserTimeline method should also take a userId parameter for reading the timeline of another user?

@ThinkSoftware 4 years ago

Yes you are correct.
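The API points from the last few threads (token as an explicit parameter, transport-agnostic signatures, and a user ID on the timeline read) can be expressed as an abstract interface. This is a sketch under assumptions: only the method name `readUserTimeline` comes from the discussion, while `Tweet`, `TimelineApi`, `InMemoryTimelineApi` and the parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Set


@dataclass(frozen=True)
class Tweet:
    tweet_id: str
    author_id: str
    text: str


class TimelineApi(Protocol):
    """Abstract signature: no HTTP verbs or headers, so it can map to REST or RPC."""

    def read_user_timeline(self, user_token: str, user_id: str, limit: int) -> List[Tweet]:
        ...


class InMemoryTimelineApi:
    """Toy implementation used only to exercise the signature."""

    def __init__(self, tweets_by_user: Dict[str, List[Tweet]], valid_tokens: Set[str]):
        self._tweets_by_user = tweets_by_user
        self._valid_tokens = valid_tokens

    def read_user_timeline(self, user_token: str, user_id: str, limit: int) -> List[Tweet]:
        # The token identifies the caller; user_id names whose timeline is read.
        if user_token not in self._valid_tokens:
            raise PermissionError("invalid user token")
        return self._tweets_by_user.get(user_id, [])[:limit]
```

A REST binding would likely move `user_token` into a header and `user_id` into the path, while an RPC binding could pass both as message fields. The abstract signature stays the same in either case.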

@MixedMatchedVideos 3 years ago

Hi, great video. I think one very important aspect is how fan-out scales for "celebrities". For example, if a celebrity has 100M followers and posts a tweet, the fan-out service would be under huge stress. How do you avoid that?

@ThinkSoftware 3 years ago

Thanks for the comment. This is something discussed in the upcoming course.

@VishalThakur-wo1vx 3 years ago

Loved the video. One concern that I see with sharding by userId is finding all the users who liked a tweet. In this case we will have to query multiple shards to find the users who liked the tweet. Is this right?

@ThinkSoftware 3 years ago

Thanks for the comment. This is discussed in the course.

@harrylin6282 4 years ago

Your video is very helpful. Thanks a lot! However, I have a question about the DB schema. When an interviewer asks how to choose between SQL and NoSQL, in most cases we will choose NoSQL, since it is more scalable. Suppose that I have chosen a NoSQL store like Cassandra in the interview: can I still design the schema (such as the User table and Tweet table @19:14) like yours? If not, how should I design the schema for NoSQL?

@ThinkSoftware 4 years ago

Thanks. I will have a future video on this, or else I will discuss it in my upcoming book/course.

@ThakurArjun247 3 years ago

@Harry Lin model data store based on the query patterns, great read here: tech.ebayinc.com/engineering/cassandra-data-modeling-best-practices-part-1/

@vyaassrinivasan5267 4 years ago

Very impressive work. I wonder from where do you learn these things - do you mind sharing?

@ThinkSoftware 4 years ago

Thanks. I am making these videos based on my 15 years of experience. I do go through the existing material available online via Google search, but I did not find it adequate.

@pallavibansal84 4 years ago

Instead of using the fanout service, could the tweet svc write directly to the DB, and then we have an event management system like Azure Event Grid that listens to DB updates on the tweet table and publishes different events (add / update / delete / read)? This way, we won't have to maintain the fanout svc, and Event Grid could auto scale without us having to worry about scale at all. Basically, one less piece in the arch. When a new svc comes up, like a trending svc or recommendations svc, all we need to do is subscribe to the relevant tweet events, and nothing else needs to change.

@ThinkSoftware 4 years ago

My understanding is (and you can correct me if I am wrong) that Azure Event Grid is a distribution system and not a queueing system. If an event is pushed in, it gets pushed out immediately, and if it doesn't get handled, it's gone forever, unless we send the undelivered events to a storage account. Also, Event Grid does not guarantee the order of the events. So in the end it depends on what requirements you have.

@pallavibansal84 4 years ago

In an event routing service (AWS SNS / Azure Event Grid), once it receives an event, it sends the event to all the registered subscribers and only deletes the event once all the subscribers send an OK; otherwise it has a retry mechanism with exponential backoff. It's true that when a service is down it could sometimes receive the events out of order, but in Twitter's case I feel the ordering is not as important. The only affected case is building the home timeline of a user, where a specific feed item won't be present for some time, but as soon as the service re-receives the event, it can process it and place the missing item at the correct position (depending on time).

@pallavibansal84 4 years ago

@John Cohen there are multiple retries with exponential backoff. So although there is a chance that it might fail, it's small. Btw, the same chance exists with a service call: it may fail to process the request even after a successful response, or it may not receive the request at all! The advantage is that you won't have to manage this new service, which is huge!
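The "retry with exponential backoff" behaviour mentioned in this thread can be sketched generically. This is not the API or the actual retry policy of AWS SNS or Azure Event Grid; `deliver_with_backoff`, its parameters and the doubling schedule are assumptions chosen for illustration.

```python
import time


def deliver_with_backoff(send, event, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Try to deliver an event, doubling the wait after each failed attempt.

    Returns True on success and False once max_attempts is exhausted, at which
    point the caller would have to drop or dead-letter the event.
    """
    for attempt in range(max_attempts):
        try:
            send(event)
            return True
        except Exception:
            if attempt == max_attempts - 1:
                return False
            sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...
    return False
```

The `sleep` argument is injectable so the schedule can be tested without waiting. Note that retries redeliver events later than newer ones, which is the out-of-order delivery both commenters refer to, so subscribers need to order by tweet timestamp rather than arrival order.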

@kumarmanish9046 2 years ago

Searching for tweets is missing from the functional requirements. It is an important feature.

@ThinkSoftware 2 years ago

Thanks for the comment. Search is discussed in part two of this video.

@ashwinisinha7100 4 years ago

At 25:13, when the system is read heavy, the number of servers for reads should be 4 and for writes should be 2, but you are saying just the opposite. Why?

@ThinkSoftware 4 years ago

I think you misunderstood it. I am talking about read/write quorums. All hosts will be serving reads/writes, but in a read-heavy system you need to read from only 2 servers for a successful read and write to at least 4 servers to declare a successful write. It is not that only 2 servers are available for reads and 4 for writes. I hope that clarifies it.

@supriyachugh2653 4 years ago

Can you please explain the two-layer / two-level shard approach?

@ThinkSoftware 4 years ago

I think I have discussed this in this video and some other videos as well.

@ashwinisinha7100 4 years ago

How will you use the linked list data structure in distributed systems, and scale that table of user ID and linked list?

@ThinkSoftware 4 years ago

The linked list is stored in an in-memory cache and not in a table in the data store. You don't store a single linked list across machines; you store a single linked list on one machine. The lists, say of user timeline tweets, are sharded by user ID.

@tusharsinha94 4 years ago

@@ThinkSoftware is the cache part of the app servers? In that case we would need to route all requests for a user to the same app server..

@ThinkSoftware 4 years ago

You should check how you can have a list stored in a distributed cache like memcached etc.
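The per-user list described in this thread can be modelled on a single cache node as a bounded, newest-first list. In this sketch `collections.deque` stands in for the linked list, and `UserTimelineCache` and its methods are hypothetical names; how the lists are spread across cache nodes (for example by hashing the user ID) is left out.

```python
from collections import deque
from itertools import islice


class UserTimelineCache:
    """Keeps at most max_tweets recent tweet IDs per user, newest first."""

    def __init__(self, max_tweets: int):
        self._max_tweets = max_tweets
        self._timelines = {}  # user_id -> deque of tweet IDs

    def add_tweet(self, user_id: str, tweet_id: str) -> None:
        timeline = self._timelines.setdefault(user_id, deque(maxlen=self._max_tweets))
        # appendleft on a full deque silently drops the oldest entry on the right.
        timeline.appendleft(tweet_id)

    def recent(self, user_id: str, count: int) -> list:
        return list(islice(self._timelines.get(user_id, ()), count))


cache = UserTimelineCache(max_tweets=3)
for tweet_id in ["t1", "t2", "t3", "t4"]:
    cache.add_tweet("alice", tweet_id)
print(cache.recent("alice", 10))  # ['t4', 't3', 't2']
```

The cap plays the role of the configurable K discussed elsewhere in the comments: it bounds memory per user regardless of how many tweets that user has posted.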

@akshaykhatavkar 3 years ago

Can you please talk about how will you shard follower and followee data?

@ThinkSoftware 3 years ago

Thanks for the comment. This is discussed in the course.

@uditagrawal6603 3 years ago

I think this table should be sharded based on the followee ID so that all the people following a user reside on the same shard, which is in line with our requirement for home timeline generation.

@helloworld6679 4 years ago

When would we see the next part of this video?

@ThinkSoftware 4 years ago

I was sick for almost two weeks. Hopefully will upload next week.

@bostonlights2749 3 years ago

32:00 - Design of User Timeline Service. Q1. Why is the user timeline in memory and not in the DB? It should be persisted, right? Or is it done for speed?

@ThinkSoftware 3 years ago

Thanks for the question. This depends on requirements such as the number of users for which you want to create the timeline in advance, along with cost. In our case, we are fine keeping it in cache and, if it is not present in cache, generating it from the DB.

@bostonlights2749 3 years ago

@@ThinkSoftware I get your point... Thnx!
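The read path described in the reply (serve the timeline from cache, and rebuild it from the database on a miss) looks roughly like this. It is a minimal sketch: `TimelineReader` and `load_from_db` are hypothetical, the cache is a plain dict, and eviction and expiry are omitted.

```python
class TimelineReader:
    """Serve a user's timeline from cache, falling back to the database on a miss."""

    def __init__(self, load_from_db):
        self._cache = {}                   # user_id -> list of tweet IDs
        self._load_from_db = load_from_db  # callable: user_id -> list of tweet IDs

    def get_timeline(self, user_id: str) -> list:
        timeline = self._cache.get(user_id)
        if timeline is None:
            # Cache miss: regenerate from the persistent store, then keep it hot.
            timeline = self._load_from_db(user_id)
            self._cache[user_id] = timeline
        return timeline
```

Since the tweets themselves are persisted by the tweet service, the cached timeline is derived data. Losing a cache node costs latency on the next read but no information.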

@vishalsarda9764 3 years ago

@@ThinkSoftware What type of DB is used to store information from the various services here? Is it an RDBMS like MySQL or NoSQL like Cassandra?

@ThinkSoftware 3 years ago

It depends on the service and its requirements. This question is handled in more detail in the course.

@vishalsarda9764 3 years ago

@@ThinkSoftware Got it. But is it even possible for NoSQL databases to have primary and secondary keys?

@kamalsmusic 4 years ago

What is the reasoning for using a write-through method when writing to the datastore instead of just using cache-aside? Also, for an application like Twitter consistency is less important than availability, so why do you need to ensure R+W>N?

@ThinkSoftware 4 years ago

Which service are you asking about?

@kamalsmusic 4 years ago

@@ThinkSoftware Basically for the tweet service, but really I am asking about the part at around 24:00 where you say we need to make sure the # of replicas we read from + the # of replicas we write to is > N. R+W>N means you get strong consistency, but this doesn't matter as much for Twitter, right? The write-through method for the cache is also for the tweet service basically (around 17:52).

@ThinkSoftware 4 years ago

For the Tweet service, consistency for the sender is important; otherwise it will be a confusing and bad experience for the sender if he does not find his tweet. We use write-through because we know that this tweet is likely to be accessed right away, so we avoid even a first read call to the database. Consistency is eventual in the case of the user's home timeline.
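The write-through reasoning in the reply can be illustrated with a toy store that counts database reads. This is a sketch under assumptions, not the service from the video: `WriteThroughTweetStore` is a hypothetical class, and plain dicts stand in for the database and the cache.

```python
class WriteThroughTweetStore:
    """On every write, update the database and the cache together."""

    def __init__(self):
        self._db = {}
        self._cache = {}
        self.db_reads = 0  # counts reads that had to go to the database

    def post_tweet(self, tweet_id: str, text: str) -> None:
        self._db[tweet_id] = text     # durable write first
        self._cache[tweet_id] = text  # then populate the cache in the same request

    def get_tweet(self, tweet_id: str):
        if tweet_id in self._cache:
            return self._cache[tweet_id]
        self.db_reads += 1
        text = self._db.get(tweet_id)
        if text is not None:
            self._cache[tweet_id] = text
        return text


store = WriteThroughTweetStore()
store.post_tweet("t1", "hello")
print(store.get_tweet("t1"), store.db_reads)  # hello 0
```

With cache-aside, the first read after posting would miss the cache and hit the database. Write-through removes that first miss at the cost of a slightly more expensive write, which fits the assumption that a fresh tweet is read right away.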

@nidage6385 2 years ago

At kzfaq.info/get/bejne/Z8V4fNOFp7uscac.html, it seems W+R>N is not enough to ensure consistency. Consider this: if we write to four instances and one fails, we return failure to the user, but the successful three are committed. So when the user reads from one of these three plus another instance that has not been written to (two reads in this case), the user can read the tweet he/she posted, which is not consistent since we already told the user the write failed. Is my understanding correct? Thanks.

@ThinkSoftware 2 years ago

Thanks for pointing out this edge scenario. How we handle such scenarios is discussed in the course.
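The edge case raised here can be enumerated exhaustively for the numbers in the comment (N=5, W=4, R=2, three replicas committed before the write was reported as failed). The sketch below only counts which read sets would see the tweet; it assumes a read returns the tweet if any contacted replica has it, and `reads_that_see` is a made-up helper.

```python
from itertools import combinations


def reads_that_see(n: int, r: int, committed: set) -> tuple:
    """Count read sets of size r (out of n replicas) that touch a committed replica."""
    read_sets = list(combinations(range(n), r))
    visible = [s for s in read_sets if committed & set(s)]
    return len(visible), len(read_sets)


# Write attempted on 4 of 5 replicas; one failed, so replicas 0, 1, 2 hold the
# tweet even though the client was told the write failed.
seen, total = reads_that_see(n=5, r=2, committed={0, 1, 2})
print(f"{seen} of {total} possible read sets return the 'failed' tweet")
# 9 of 10 possible read sets return the 'failed' tweet
```

So the quorum rule guarantees that acknowledged writes are visible, but says nothing about unacknowledged ones. Hiding or repairing partially applied writes needs an extra mechanism (for example rollback or read repair), which the reply defers to the course.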

@ras6746 4 years ago

Why is K = 1000 at ~33:00? This seems like overkill. Even 100 is too much.

@ThinkSoftware 4 years ago

I mentioned that this value is configurable and so we would be using analytics to come up with a suitable value for this. Of course to start with, we need to have some value and 1000 was just an example as mentioned in the video.

@ras6746 4 years ago

@@ThinkSoftware Thank you! Great job on the videos. Good quality. Keep it up.

@ThinkSoftware 4 years ago

Many thanks for the comments and kind words 🙂

@_paralaks_ 3 years ago

An interview takes 45 minutes yet your video is 35 minutes and there is a part 2!

@ThinkSoftware 3 years ago

You are not supposed to copy this video in your interviews, as it all depends on the situation and circumstances. I am covering many things here. An interviewer may not cover everything, but you cannot guess what he will cover. So it is better to be prepared for everything 🙂. You should check the mock interviews on my channel and in the course to get an idea.

@willchen8581 3 years ago

That's because he explains things in context of an average youtube audience. An actual interview... i mean for example there's a dialogue between TWO people... who watches this single teacher style class and thinks that this is what an interview is like?? Like you think the interviewer will explain the concepts like this to you?

@learnersparadise7492 2 years ago

You made writes go to 4 nodes; won't that increase write latency? Please don't do it just for jargon.

@ThinkSoftware 2 years ago

I think you didn't understand the reasoning. Thanks for the comment anyway.

@learnersparadise7492 2 years ago

@@ThinkSoftware Lol, that's the easy way out, "you did not understand". Writes have to be quick as well, like reads; fanout would help with read latency. You don't necessarily need to make the read acknowledgement count low. Strike a balance; bookish knowledge is dangerous, man. At worst the user sees the written result a little later.

@govindaHari16 1 year ago

@@learnersparadise7492 what is this attitude? You should be a bit respectful here. He is sharing all this valuable knowledge for free (even if he charged, it would still be worth it). Instead of encouraging him, you type this? About your question: Twitter is a read-heavy system (reads >> writes). Hence there is a trade-off between optimizing read or write latency. With W+R = 4+2, writes take a bit longer and reads are faster, which is what we want. If we did 3+3, writes get faster but reads get slower, hence not ideal. We can also do 5+1; this makes reads very fast but writes slower. But it is still a feasible solution.

@learnersparadise7492 2 years ago

make it crisp and short, you yourself feel lost along the way, lengthy and not very intuitive.