Chapter 6:  Consistency and replication

Replication of data is very important in distributed systems

Two reasons for replication:  reliability and performance

- Reliability: Replicate to increase reliability (if one disk goes down, the other one can still work).
- Performance: Replicate a server and divide the work between the copies. Geographically: if the area is large, put a copy of the data close to the processes that use it. Caching a web page is another example of keeping a copy nearby.

PROBLEM:  Replication leads to consistency problems (when one copy is updated, the copies are no longer alike).

Before replicating a remote object across several machines:  HAVE TO PROTECT THE OBJECT AGAINST SIMULTANEOUS ACCESS BY MULTIPLE CLIENTS.

Two solutions to this problem:

1:   The object itself handles concurrent invocations (for example, by declaring the object's methods to be synchronized).

2:   The server where the object resides takes responsibility for the concurrency control.
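The first solution can be sketched roughly as follows. This is a minimal illustration, assuming Python threads stand in for concurrent clients; the names `Counter` and `run_clients` are made up for the example, and a lock plays the role of a synchronized method.

```python
import threading


class Counter:
    """A shared object that protects itself against simultaneous access."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # Only one thread at a time may execute this body, much like a
        # synchronized method.
        with self._lock:
            current = self._value
            self._value = current + 1

    def value(self):
        with self._lock:
            return self._value


def run_clients(counter, clients=4, calls=1000):
    """Hypothetical helper: several clients invoke the object concurrently."""

    def client():
        for _ in range(calls):
            counter.increment()

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Because every invocation takes the lock, no increment is lost, however the threads interleave.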

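When the object is replicated, additional synchronization is needed so that concurrent invocations are carried out in the correct order at every replica. Two approaches:

- Replication-aware object: the object itself handles the replication. It is then possible to adopt object-specific replication strategies.
- Make the distributed system responsible for the replication. This is easier for programmers.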

Need to synchronize all replicas:

- All replicas need to reach an agreement on when (in which order) an update is to be done locally.
- This can be done with Lamport timestamps, or by letting a coordinator assign the order. Both give a global ordering of the updates (global synchronization).
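A rough sketch of ordering updates with Lamport timestamps is shown here. It is simplified: the class `LamportClock` and the function `apply_in_agreed_order` are illustrative names, ties are broken by process id (an assumption), and a real totally ordered multicast would also need acknowledgements before an update is applied.

```python
class LamportClock:
    """Logical clock kept by one process."""

    def __init__(self, process_id):
        self.process_id = process_id
        self.time = 0

    def tick(self):
        """Advance the clock for a local event or a send; return a timestamp."""
        self.time += 1
        return (self.time, self.process_id)

    def receive(self, timestamp):
        """Merge the clock with the timestamp carried by a received message."""
        self.time = max(self.time, timestamp[0]) + 1


def apply_in_agreed_order(updates):
    """Apply (timestamp, value) updates sorted by timestamp.

    Timestamps are (time, process_id) pairs, so sorting gives every
    replica the same order no matter how the updates arrived.
    """
    value = None
    for _, new_value in sorted(updates, key=lambda update: update[0]):
        value = new_value
    return value
```

Two replicas that receive the same updates in a different arrival order still end up with the same final value, because both sort by the same timestamps.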


In the absence of a global clock, it is difficult to define precisely which write operation is the last one.

A consistency model is a contract between the processes and the data store: if the processes follow certain rules, the data store promises to work correctly. The model defines precisely what the results of read and write operations are when it comes to CONSISTENCY.


- Strict consistency: Every read returns the result of the most recent write. It relies on absolute global time, so it cannot be implemented in a distributed system, where it does not make sense to talk about "the most recent" write.
- Linearizability: Sequentially consistent, and in addition each operation is timestamped; operations are ordered according to these timestamps. Stronger than sequential consistency.
- Sequential: All processes see all operations in the same order, and the order of the operations within each process is respected. The operations are not ordered by time.
- Causal: Causally related writes must be seen in the same order by all processes. Concurrent writes may be seen in a different order on different machines.
- FIFO (First In, First Out): Same as causal, but dropping the causality requirement. Writes done by a SINGLE process are seen by all other processes in the order they were issued, for example by using sequence numbers. Writes from different processes may be seen in a different order.
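The sequence-number idea behind FIFO consistency can be sketched as follows. This is only an illustration under assumed names (`FifoReplica`, `receive`): each writer numbers its own writes from 0, and a replica holds back a write until all earlier writes from the same writer have been applied.

```python
class FifoReplica:
    """Applies the writes of each single writer in the order they were issued."""

    def __init__(self):
        self.applied = []     # (writer, value) pairs in the order applied here
        self._next_seq = {}   # next expected sequence number per writer
        self._pending = {}    # buffered out-of-order writes per writer

    def receive(self, writer, seq, value):
        self._pending.setdefault(writer, {})[seq] = value
        expected = self._next_seq.get(writer, 0)
        # Apply every buffered write from this writer that is now in order.
        while expected in self._pending[writer]:
            self.applied.append((writer, self._pending[writer].pop(expected)))
            expected += 1
        self._next_seq[writer] = expected
```

Writes from different writers are not ordered relative to each other here, which is exactly the freedom FIFO consistency allows.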

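Causally related:  A read that follows a write of the same data, and any write done after that read, are potentially causally related to the first write, because the later operation may depend on the value that was written. Causally related writes must be seen by all processes in the same order.

Concurrent:  The opposite of causally related; neither write can have influenced the other. Concurrent writes may be seen in a different order on different machines.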
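One common way to tell causally related operations from concurrent ones is a vector clock. The notes do not mention vector clocks, so this is a swapped-in technique for illustration only; the function names are made up, and each clock is assumed to be a tuple with one counter per process.

```python
def happened_before(a, b):
    """True if vector clock a causally precedes vector clock b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def relation(a, b):
    """Classify two events given their vector clocks."""
    if a == b:
        return "same event"
    if happened_before(a, b):
        return "a before b"
    if happened_before(b, a):
        return "b before a"
    # Neither clock dominates the other: the events cannot have
    # influenced each other.
    return "concurrent"
```

For example, `relation((1, 0), (1, 1))` reports that the first event precedes the second, while `relation((1, 0), (0, 1))` reports two concurrent events, which replicas may apply in either order.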

Weak:  Shared data can be counted on to be consistent only after a synchronization has been done (for example, when the shared data is locked and unlocked).



CLIENT-CENTRIC CONSISTENCY MODELS (consistency as seen by a single client, not by the server; this is why the examples have only one process, which accesses two local copies, L1 and L2):

Monotonic reads:

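Guarantees that once a process has read a value, every later read by that process returns the same value or a more recent one, never an older one. Setup: two local copies, one process. Example: reading and updating your calendar from different locations.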
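A minimal sketch of how a client could enforce monotonic reads is given here. It assumes each local copy carries a version number that grows with every update; the classes `Replica` and `MonotonicReadSession` are hypothetical names for this example.

```python
class Replica:
    """One local copy of the data (for example L1 or L2)."""

    def __init__(self, name):
        self.name = name
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Ignore updates that are older than what this copy already holds.
        if version > self.version:
            self.version, self.value = version, value


class MonotonicReadSession:
    """Remembers the newest version this client has read so far."""

    def __init__(self):
        self.last_seen = 0

    def read(self, replica):
        if replica.version < self.last_seen:
            # Reading here would show the client an older value than before.
            raise RuntimeError(f"{replica.name} is stale; wait for it to catch up")
        self.last_seen = replica.version
        return replica.value
```

A read at L2 is refused until L2 has received the updates that the client already saw at L1.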

Monotonic Writes:

A write operation by a process must finish before the next write operation by the same process. The earlier updates are taken along to whichever copy the next write is done on.

Read your writes:

Example: updating your web page and then viewing it in your web browser. You will always see your updates after you have written them; you will not see an old (cached) version of what you wrote.
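Read your writes can be sketched in a similar way. The sketch assumes a single writing client and per-copy version counters; `LocalCopy` and `ReadYourWritesSession` are illustrative names, not part of any real system.

```python
class LocalCopy:
    """One local copy of the data, with a simple version counter."""

    def __init__(self):
        self.version = 0
        self.value = None


class ReadYourWritesSession:
    """Tracks the version of this client's latest write."""

    def __init__(self):
        self.last_written = 0

    def write(self, copy, value):
        copy.version += 1
        copy.value = value
        self.last_written = copy.version

    def read(self, copies):
        # Skip copies (such as a stale cache) that have not yet
        # received this client's own latest write.
        for copy in copies:
            if copy.version >= self.last_written:
                return copy.value
        raise LookupError("no copy has received this client's writes yet")
```

If the client writes at one copy and then tries a stale copy first, the read falls through to a copy that already contains the write.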

Writes follow reads:

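A write that follows a read by the same process is done on the same or a more recent value than the one that was read. Example: you see the reactions to a posting only if you also have the original posting.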