Understanding Database 9 - Distributed Database Systems(DDB)

2023. 1. 16. 08:33Data science/Database

반응형

Data Systems 

: combines different data storage technologies

: caches, indexes and data in sync with application code

: API(application programming interface) to hide architectural complexity

: There are three ways to deal with requests in this figure, 1. In-memory cache 2. Primary database 3. Full-text index

  1. Before checking DB, check the cache to see if the data is cached.
  2. If there is no cache, go to DB.
  3. If we get search requests, check the full-text index. But this makes to read a lot of documents which takes a long time
  4. When the operations don't need to process immediately (Asynchronous task - i.e. email), approach differently.

Martin Kleppmann, 2020

Q1. How do you ensure the data remains correct and complete, even when things go wrong internally?

Q2. How do you provide consistently good performance to clients, even when parts of your system are degraded?

Q3. How do you scale to handle an increase in load?

 

Reliability, Scalability and Maintainability

Reliability Scalability Maintainability
* perform as the user expected
* tolerate the user's mistake
* good enough performance for the required use case by the user
* prevent any unauthorized access or abuse
* Ability to cope with increased load
* described using load parameters
* or using fan-out
* using throughput(batch-processing) * response time(online system)
* Operability: running smoothly
* Simplicity: easy for new engineers
* Evolvability: easy to make changes in the future, adapting it for unanticipated use cases as requirements changes

Another question?! Case study of Twitter 

Approach 1. follows table, tweets table, and users table store data individually and link them to get the outcome

Approach 2. Make a time-lined tweets list per user in DB and print it out immediately. (Fan-out: Deliver tweet to each follower, up to 31M followers per user)

-> Which approach is better?????

 

Scaling Out 

  • Scaling up: Vertical scaling, upgrade to more powerful machines (Not used these days) 
  • Scaling out: Horizontal scaling, Distributing the load across multiple smaller machines(adding more instances)
  • Centralized database: Using one centralized DB in one area, the local office gets data from there. 
  • Distributed database: Using multiple local DB (user closer DB used) 

Replication

Replication means keeping a copy of the same data on multiple machines that are connected via a network

- keep data geographically close to users

- increase availability(even though other parts failed)

- increase read throughput(scale out the number of machines) 

: Single-leader, Multi-Leader or Leaderless replication can be applied 

  1. One of the replicas is designated as the leader(primary or master)
  2. The other replicas are known as followers (secondary, hot standby, slaves, read replicas)
  3. Read Query can be done by either leader or followers

Synchronous: data is consistent, but if the follower doesn't respond -> the writing doesn't work.

Asynchronous: Eventual consistency, but it will always work 

 

#Handling Node Outage 

Follower Failure: Catchup Recovery 

Leader Failure: Failover (1. Determine the leader fails, 2. Choosing a new leader, 3. Reconfigure the system to use a new leader) - When elects the leader, there is a vote. Details can be found in MongoDB documents. 

 

#Problem with Replication Lag: The figures show the problem, and the solution is given in the statement

1. Reading stale data solution: read your own write - after write, the replica is not up-to-dated

- when reading something the user may have modified, read it from the leader.

 

https://arpitbhayani.me/blogs/

2. Backward Time travel solution: Monotonic read - the update is yet to reach a certain replica. 

- Each user always makes their read from the same replica

 

 

 

https://arpitbhayani.me/blogs/monotonic-reads

3. Observer Dejavu solution: Consistent prefix read - For the Observer, the message from Mrs.cake shows up first.

- writes in that same order (read always see a consistent prefix)

- any writes related to each other should write in the same partition

Martin Kleppmann(2022)

Multi-Leader Replication

: Performance improvement, Tolerance of DC(Data Centre) outage and network Problem

: Client with offline operations(Calendar), Collaborative editing(Google docs) 

: There can be a conflict in writing data. 

 

#Solution of Write Conflicts (Multi-Leader Replication)

  1. making the conflict detection synchronous
  2. all writes for particular records go through the same leader 
  3. give each write a unique ID, and pick the write with the highest one is the winner(the latest one) 
  4. recording the conflict in an explicit data structure that preserves all info
  5. to be automatically resolved by the application or the user at a later time

https://towardsdatascience.com/database-replication-explained-10ff929bdf8a 

Excellent blog summarised the solution in detail. 

 

Leaderless Replication

: Quorom reading, writing - wait until at least N/2 + 1 nodes returns ACK (W+R>N)

: Read repair - stale data 

: Anti-entropy process - if they are not read, some values can be missed

: High consistency

https://arpitbhayani.me/blogs/leaderless-replication
https://arpitbhayani.me/blogs/leaderless-replication

Partitioning(Sharding)

In MongoDB, the term Sharding is used for Partitioning. 

: It is usually combined with replication, and they are stored on multiple nodes for fault tolerance(Allocation)

 

Partition by key range: Assign a continuous range of keys to each partition. 

Partition by Hash of Key: Risk of skew and hot spots, a hash can be sued instead of the key range. But, the capability of efficient range query will be lost.

 

Rebalancing

  • The query throughput increases - more CPUs required
  • The data size increases - more disk and RAM required 
  • A machine field - another machine to take over failed one's responsibility

Therefore, rebalancing can be used for data and request fairness and no downtime. 

The strategy: Fixed number of partitions.

Figure below, first it was 5 partitions in 4 nodes -> relocate the partitions and make them as 4 partitions in 5 nodes 

Martin Klepmann(2020)

 

Request Routing

  1. Allow clients to contact any node
  2. Send all requests from clients to the routing tier first 
  3. Require that clients be aware of the partitioning and the assignment of partitions to nodes 

https://xgwang.me/posts/ddia-6-partitioning/

 

반응형