Messenger Application System Design
High-level system design for a basic messaging application.
Messaging plays an important role these days: we do much of our communication by exchanging text rather than talking on the phone. That is why I thought writing an article on designing a messaging application would be a good idea.
Let's get started...
Let's take a structured approach and wrap up the design process in 4 steps:
Write down the MVP.
Estimate the scale.
Design trade-offs.
System design deep dive.
Write down the MVP:
In this step we discuss the key functionalities of the product that we are going to design.
-> Send or receive a message.
-> 1:1 messaging.
-> Group messaging.
-> Conversations that messages belong to.
Estimate the scale of the application: In this step we must estimate the traffic that could hit our application.
Assume that we have 2 billion users, of which 400 million are daily active users.
At most, each of them would send 50 messages a day.
So the messages per day, at this upper bound, = 400 million x 50, which is 20 billion.
The write queries per second would be 20 billion / 86,400, roughly 231,000, and we assume the read queries are 4 times the write queries, roughly 926,000 per second.
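The traffic arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the constant names are made up for illustration and the values are the assumptions stated in this article.

```python
# Back-of-the-envelope traffic estimate using the article's assumptions.
DAILY_ACTIVE_USERS = 400_000_000
MESSAGES_PER_USER_PER_DAY = 50   # upper bound assumed in the article
SECONDS_PER_DAY = 86_400
READ_TO_WRITE_RATIO = 4          # reads assumed to be 4x writes

messages_per_day = DAILY_ACTIVE_USERS * MESSAGES_PER_USER_PER_DAY
write_qps = messages_per_day / SECONDS_PER_DAY
read_qps = write_qps * READ_TO_WRITE_RATIO

print(f"messages/day: {messages_per_day:,}")
print(f"write QPS:    {write_qps:,.0f}")
print(f"read QPS:     {read_qps:,.0f}")
```

These are averages over a whole day; real traffic is bursty, so peak load would likely be higher than these figures.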
Let's calculate the storage needed for the message content:
Assume each message is about 200 bytes.
So 200 bytes x 20 billion messages is equal to 4,000 GB, meaning 4 TB of messages per day. At this volume a single database node is not enough, so database sharding is needed.
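The storage figure can be verified the same way. This is a rough sketch that assumes an average message size of 200 bytes and 20 billion messages per day, uses decimal units (1 TB = 10^12 bytes), and ignores indexes, replication and metadata overhead. The yearly projection is my own extrapolation to show why a single node would not be enough.

```python
# Rough storage estimate for message content only.
AVG_MESSAGE_BYTES = 200            # assumed average message size
MESSAGES_PER_DAY = 20_000_000_000  # 400 million users x 50 messages

bytes_per_day = AVG_MESSAGE_BYTES * MESSAGES_PER_DAY
tb_per_day = bytes_per_day / 10**12   # decimal terabytes
tb_per_year = tb_per_day * 365        # extrapolation, not from the article

print(f"{tb_per_day:.0f} TB per day, about {tb_per_year:,.0f} TB per year")
```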
Design trade-offs: By the CAP theorem, when a network partition occurs we must choose one of the two: availability or consistency. In this application I prefer consistency, because I do not want my application to be eventually consistent. For example, if a user sends a message and only gets an acknowledgement 5 minutes later saying that the message failed to send, that is not a pleasant experience, I believe.
So I want my application to have reasonably low latency and high consistency, with availability compromised.
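One way to picture "consistency over availability" is a write path that refuses a message immediately when it cannot be stored on a majority of replicas, so the sender learns about the failure right away instead of minutes later. The sketch below is a toy illustration under that assumption; `Replica`, `WriteRejected` and `send_consistent` are hypothetical names, not part of any real database API.

```python
from dataclasses import dataclass, field


class WriteRejected(Exception):
    """Raised when a message cannot be stored on enough replicas."""


@dataclass
class Replica:
    name: str
    healthy: bool = True
    log: list = field(default_factory=list)

    def append(self, message: str) -> None:
        # A real replica would persist to disk; this one keeps a list.
        self.log.append(message)


def send_consistent(replicas: list, message: str) -> int:
    """Store the message only if a majority of replicas is reachable."""
    quorum = len(replicas) // 2 + 1
    reachable = [r for r in replicas if r.healthy]
    if len(reachable) < quorum:
        # Fail fast: the service is unavailable for writes, but it never
        # acknowledges a message that might later be lost.
        raise WriteRejected(f"only {len(reachable)} of {quorum} replicas reachable")
    for replica in reachable:
        replica.append(message)
    return len(reachable)
```

With three replicas, a send still succeeds when one is down, but is rejected as soon as two are down. That rejection is the availability we agreed to give up.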
Design Deep dive :
API design :
(a) sendMessage(sender_id, conversation_id, text, message_id, meta-data, timestamp);
(b) getConversationList(user_id, offset, limit);
(c) getMessages(user_id, conversation_id, offset, limit);
Here the sendMessage API is a simple POST request. I prefer conversation_id over receiver_id because in group chats a message has many receivers, so I treat every chat as a conversation and refer to it by an ID.
getConversationList is a GET API, responsible for fetching the list of all conversations the user has.
getMessages is again a GET API, which fetches the messages of a particular conversation.
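To make the three APIs concrete, here is a minimal in-memory sketch of how they could behave. It is not a production design: the class `MessagingService`, the helper `add_member`, the membership check, and the choice to return the newest items first are all my own assumptions for illustration, and Python-style names are used in place of the camelCase names above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Message:
    message_id: str
    conversation_id: str
    sender_id: str
    text: str
    timestamp: float
    metadata: dict = field(default_factory=dict)


class MessagingService:
    """In-memory stand-in for the three APIs."""

    def __init__(self):
        self.members = defaultdict(set)    # conversation_id -> user_ids
        self.messages = defaultdict(list)  # conversation_id -> [Message]
        self.latest = defaultdict(dict)    # user_id -> {conversation_id: last timestamp}

    def add_member(self, conversation_id, user_id):
        # Hypothetical helper: the article does not define how members join.
        self.members[conversation_id].add(user_id)
        self.latest[user_id].setdefault(conversation_id, 0.0)

    def _require_member(self, user_id, conversation_id):
        if user_id not in self.members[conversation_id]:
            raise PermissionError(f"{user_id} is not in {conversation_id}")

    def send_message(self, sender_id, conversation_id, text, message_id,
                     metadata, timestamp):
        self._require_member(sender_id, conversation_id)
        message = Message(message_id, conversation_id, sender_id, text,
                          timestamp, dict(metadata or {}))
        self.messages[conversation_id].append(message)
        # Bump the conversation for every member so their lists stay fresh.
        for user_id in self.members[conversation_id]:
            self.latest[user_id][conversation_id] = timestamp
        return message

    def get_conversation_list(self, user_id, offset=0, limit=20):
        # Most recently active conversations first.
        ordered = sorted(self.latest[user_id].items(),
                         key=lambda item: item[1], reverse=True)
        return [cid for cid, _ in ordered][offset:offset + limit]

    def get_messages(self, user_id, conversation_id, offset=0, limit=20):
        self._require_member(user_id, conversation_id)
        newest_first = sorted(self.messages[conversation_id],
                              key=lambda m: m.timestamp, reverse=True)
        return newest_first[offset:offset + limit]
```

Because every call takes a conversation_id rather than a receiver_id, the same code path serves 1:1 chats (two members) and group chats (many members).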
Database sharding and the sharding keys: I have selected a NoSQL database because the message content can vary. I have divided the database into three collections: i. user data collection ii. conversation data collection iii. latest conversation data collection.
User data sharding key: for the user data collection I have chosen user_id as the sharding key, so any query related to a user's data is an intra-shard query.
Conversation data sharding key: for the conversation data collection I have chosen conversation_id as the sharding key. This makes any query about a particular conversation an intra-shard query.
Latest conversation data sharding key: for the latest conversation data collection I have chosen user_id as the sharding key. This makes the query that fetches the conversations a user is involved in an intra-shard query.
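The routing implied by these sharding keys can be sketched as a hash of the key modulo the number of shards. The shard count, the collection names and the functions `shard_for` and `route` below are hypothetical and chosen only for illustration; a real NoSQL store would normally do this routing internally.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count

# Which field acts as the sharding key for each collection.
SHARD_KEYS = {
    "users": "user_id",
    "conversations": "conversation_id",
    "latest_conversations": "user_id",
}


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a sharding key to a shard index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(collection: str, document: dict) -> int:
    """Pick the shard for a document based on its collection's sharding key."""
    return shard_for(str(document[SHARD_KEYS[collection]]))
```

All messages of one conversation land on the same shard, so getMessages touches a single shard, and all "latest conversation" entries of one user land together, so getConversationList does too. Note that plain modulo hashing moves most keys when the shard count changes; consistent hashing is the usual way to reduce that.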
WebSockets are used to push the latest updates on any conversation to connected clients.
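The push model can be illustrated without a real network. In the sketch below an asyncio.Queue stands in for one open WebSocket connection, and a hypothetical `ConversationHub` fans each new event out to every connection subscribed to that conversation. A real server would replace the queue with a socket send; the names here are assumptions, not a specific library's API.

```python
import asyncio
from collections import defaultdict


class ConversationHub:
    """Tracks which connections are subscribed to which conversation."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # conversation_id -> connections

    def subscribe(self, conversation_id: str) -> asyncio.Queue:
        # The queue stands in for one open WebSocket connection.
        connection = asyncio.Queue()
        self.subscribers[conversation_id].add(connection)
        return connection

    def unsubscribe(self, conversation_id: str, connection: asyncio.Queue) -> None:
        self.subscribers[conversation_id].discard(connection)

    async def publish(self, conversation_id: str, event: dict) -> int:
        """Push an event to every subscriber; return how many received it."""
        targets = list(self.subscribers[conversation_id])
        for connection in targets:
            await connection.put(event)
        return len(targets)


async def demo():
    hub = ConversationHub()
    alice = hub.subscribe("c1")
    bob = hub.subscribe("c1")
    delivered = await hub.publish("c1", {"sender_id": "alice", "text": "hi"})
    return delivered, await alice.get(), await bob.get()
```

Running `asyncio.run(demo())` delivers the same event to both subscribers without either of them polling, which is the reason for preferring WebSockets over repeated getMessages calls.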