System design paradigm: Sharding

Scale out when the data is too big or there are too many writes

The primary-replica pattern and an in-memory cache solve the problem on the read side. But what if the total data is too big to fit on one host, or the write throughput is too high? The answer is intuitive: we split the data across multiple hosts. This is called sharding.

Notice that write latency is usually not an issue because

  • We expect writes to be slow for most applications
  • A write can return as soon as it is persisted to the commit log; we thereby trade strong consistency for lower latency (see the sketch after this list)
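
A minimal Python sketch of that second point (all names here are hypothetical, not from the post): the write path appends the record to an append-only commit log, fsyncs, and acknowledges immediately, while a background thread applies the record to the main store later.

```python
import json
import os
import queue
import threading

class CommitLogWriter:
    """Acknowledge a write once it is durable in the commit log;
    apply it to the main data store asynchronously."""

    def __init__(self, log_path):
        self.log = open(log_path, "ab")
        self.pending = queue.Queue()
        self.store = {}  # the "real" data structure, updated lazily
        threading.Thread(target=self._apply_loop, daemon=True).start()

    def write(self, key, value):
        record = json.dumps({"key": key, "value": value}).encode() + b"\n"
        self.log.write(record)
        self.log.flush()
        os.fsync(self.log.fileno())  # record is now durable on disk
        self.pending.put((key, value))
        # Returning here gives the client low write latency, but a read
        # issued right now may not see this write yet (weaker consistency).

    def _apply_loop(self):
        while True:
            key, value = self.pending.get()
            self.store[key] = value  # applied some time after the ack

log = CommitLogWriter("/tmp/commit.log")
log.write("order:1001", {"user_id": 42, "quantity": 3})  # returns after fsync
```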

Sharding key

A key decision in sharding is, given a record R, determining which shard it belongs to. The sharding design includes a key function that maps R to a key string. We then compute a hash of the key; the hash (an integer) modulo N is the shard ID, where N is the number of hosts.

Shard(R) = Hash(Key(R)) mod N
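
In code, the mapping might look like this minimal Python sketch (md5, N = 8, and the field names are assumptions; the post doesn't prescribe them):

```python
import hashlib

N = 8  # number of shard hosts (assumed)

def key_fn(record):
    # Key(R): extract the sharding key string from the record,
    # e.g. order_id as in the example below
    return str(record["order_id"])

def shard_id(record):
    # Shard(R) = Hash(Key(R)) mod N
    h = hashlib.md5(key_fn(record).encode()).hexdigest()
    return int(h, 16) % N

order = {"user_id": 42, "order_id": 1001, "quantity": 3}
print(shard_id(order))  # a deterministic shard in [0, N)
```

A stable hash such as md5 is used instead of Python's built-in hash(), which is randomized per process and would send the same key to different shards across restarts.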

Notice that Key(R) doesn’t have to be available at query time. For example, suppose R is <user_id, order_id, quantity> and we use order_id as the sharding key (Key(R) = order_id). We can still serve the query “find all orders for a given user” by running the query on all shards and merging the results.
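
Such a fan-out (scatter-gather) query might look like the sketch below, with an in-memory list standing in for the per-shard order tables (all names hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the order table on each of three shards.
SHARDS = [
    [{"user_id": 42, "order_id": 1001, "quantity": 3}],
    [{"user_id": 7, "order_id": 2002, "quantity": 1}],
    [{"user_id": 42, "order_id": 3003, "quantity": 5}],
]

def query_shard(shard, user_id):
    # In a real system this is a network call to the shard host.
    return [o for o in SHARDS[shard] if o["user_id"] == user_id]

def find_orders(user_id):
    # user_id is not the sharding key, so any shard may hold matching
    # orders: scatter the query to every shard, then gather the results.
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        parts = list(pool.map(lambda s: query_shard(s, user_id),
                              range(len(SHARDS))))
    return [order for part in parts for order in part]

print(find_orders(42))  # orders 1001 and 3003, from shards 0 and 2
```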
