Why Distributed Database

Before looking at the core concepts of distributed database systems, it helps to understand why they are needed.

Scalability:

If your data volume, read load, or write load grows beyond what a single machine can handle, you can spread the load across multiple machines.

High Availability or Fault Tolerance:

If your application needs to keep working even when a machine, the network, or an entire data centre goes down, you can use multiple machines for redundancy. When one fails, another can take over.

Latency:

If you have users around the world, you might want servers at various locations worldwide so that each user can be served from a data centre that is geographically close to them.
That saves users from waiting for network packets to travel long distances.

Scaling to Higher Load:

Vertical Scaling (shared-memory architecture):
- Cost grows faster than linearly: a machine twice as powerful typically costs more than twice as much.
- Limited fault tolerance.
- Limited to a single geographic location.

Horizontal Scaling (shared-nothing architecture):
- Can run on low-cost machines.
- Data can be distributed across multiple geographic regions, which reduces latency.
- High availability.
- Additional complexity.

Replication (Redundancy):
- Keeps a copy of the same data on several different nodes, potentially in different locations.
- Provides redundancy: if some nodes are unavailable, the data can still be served from the remaining nodes.
- Can also improve performance, because reads can be spread across replicas.
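
To make the redundancy point concrete, here is a minimal in-memory sketch in Python. The names Replica and ReplicatedStore are hypothetical and exist only for this illustration; they do not come from any real database. The sketch assumes the simplest possible scheme: every write is copied to all available replicas, and a read is answered by the first replica that is still up.

```python
class Replica:
    """One node holding a full copy of the data (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.available = True


class ReplicatedStore:
    """Naive replication: write to every available replica, read from any."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Copy the value to each replica that is currently reachable.
        for replica in self.replicas:
            if replica.available:
                replica.data[key] = value

    def read(self, key):
        # Serve the read from the first replica that is still up.
        for replica in self.replicas:
            if replica.available:
                return replica.data[key]
        raise RuntimeError("no replica available")


nodes = [Replica("a"), Replica("b"), Replica("c")]
store = ReplicatedStore(nodes)
store.write("user:1", "Alice")

nodes[0].available = False        # simulate one node going down
value = store.read("user:1")      # still served by a remaining node
```

This deliberately ignores the hard part: a replica that was down during a write misses that write and would return stale or missing data once it comes back. Handling that correctly is what real replication schemes are about.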

Partitioning (Data Sharding):
- Splits a big database into smaller subsets called partitions, so that different partitions can be assigned to different nodes.
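
One common way to decide which partition a record belongs to is to hash its key. The sketch below is only an illustration under that assumption: the function names partition_for and assign are hypothetical, and the "hash modulo number of partitions" rule is chosen because it is the simplest to show, not because the text above prescribes it.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition number in the range [0, num_partitions)."""
    # Use a stable hash so the same key always lands on the same partition,
    # even across processes (Python's built-in hash() is salted per process).
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def assign(keys, num_partitions):
    """Group keys by the partition they hash to."""
    partitions = {p: [] for p in range(num_partitions)}
    for key in keys:
        partitions[partition_for(key, num_partitions)].append(key)
    return partitions


keys = ["user:%d" % i for i in range(10)]
layout = assign(keys, num_partitions=3)
# Each entry of layout could then be stored on a different node.
```

A known drawback of the modulo rule is that changing the number of partitions moves most keys to a different partition, so systems that need to rebalance often use other schemes.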

The next part covers replication in detail.
