Vector clock explained
Vector clock should not be confused with Version vector.
A vector clock is a data structure used for determining the partial ordering of events in a distributed system and detecting causality violations. Just as in Lamport timestamps, inter-process messages contain the state of the sending process's logical clock. A vector clock of a system of N processes is an array (vector) of N logical clocks, one clock per process. Each process keeps its own local copy of this array, holding the largest value it has learned so far for every process in the system.
Let $VC_i$ denote the vector clock maintained by process $i$. The clock updates proceed as follows:[1]
- Initially all clocks are zero.
- Each time a process experiences an internal event, it increments its own logical clock in the vector by one. For instance, upon an event at process $i$, it updates $VC_i[i] \leftarrow VC_i[i] + 1$.
- Each time a process sends a message, it increments its own logical clock in the vector by one (as in the bullet above, but not twice for the same event) then it pairs the message with a copy of its own vector and finally sends the pair.
- Each time a process receives a message-vector clock pair, it increments its own logical clock in the vector by one and updates each element in its vector by taking the maximum of the value in its own vector clock and the value in the vector in the received pair (for every element). For example, if process $i$ receives a message $(m, VC_j)$ from process $j$, it first increments its own logical clock in the vector by one, $VC_i[i] \leftarrow VC_i[i] + 1$, and then updates its entire vector by setting $VC_i[k] \leftarrow \max(VC_i[k], VC_j[k]), \forall k$.
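The update rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name `Process`, its method names, and the fixed system size `n` are assumptions made for the example, and message delivery is simulated by passing the returned pair directly to the receiver.

```python
from typing import List, Tuple


class Process:
    """One process in a fixed-size system (hypothetical helper class)."""

    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        self.vc: List[int] = [0] * n  # initially all clocks are zero

    def internal_event(self) -> None:
        # An internal event increments only the process's own entry.
        self.vc[self.pid] += 1

    def send(self, payload: str) -> Tuple[str, List[int]]:
        # Sending counts as an event: increment once, then attach a copy.
        self.vc[self.pid] += 1
        return payload, list(self.vc)

    def receive(self, message: Tuple[str, List[int]]) -> str:
        payload, sender_vc = message
        # Increment own entry, then take the element-wise maximum.
        self.vc[self.pid] += 1
        self.vc = [max(a, b) for a, b in zip(self.vc, sender_vc)]
        return payload


p0, p1, p2 = (Process(i, 3) for i in range(3))
p0.internal_event()       # p0.vc == [1, 0, 0]
msg = p0.send("hello")    # p0.vc == [2, 0, 0]
p1.receive(msg)           # p1.vc == [2, 1, 0]
msg2 = p1.send("relay")   # p1.vc == [2, 2, 0]
p2.receive(msg2)          # p2.vc == [2, 2, 1]
```

In this run, process 2 ends up knowing about events at process 0 even though the two never exchanged a message directly, because the knowledge travelled through process 1.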
History
Lamport originated the idea of logical clocks in 1978.[2] However, the logical clocks in that paper were scalars, not vectors. The generalization to vector time was developed several times, apparently independently, by different authors in the early 1980s.[3] At least 6 papers contain the concept.[4] The papers canonically cited in reference to vector clocks are Colin Fidge’s and Friedemann Mattern’s 1988 works,[5] [6] as they (independently) established the name "vector clock" and the mathematical properties of vector clocks.[3]
Partial ordering property
Vector clocks allow for the partial causal ordering of events. Define the following:
- $VC(x)$ denotes the vector clock of event $x$, and $VC(x)_z$ denotes the component of that clock for process $z$.
- $VC(x) < VC(y) \iff \forall z\,[VC(x)_z \le VC(y)_z] \land \exists z'\,[VC(x)_{z'} < VC(y)_{z'}]$
- In words: $VC(x)$ is less than $VC(y)$ if and only if $VC(x)_z$ is less than or equal to $VC(y)_z$ for all process indices $z$, and at least one of those relationships is strictly smaller (that is, $VC(x)_{z'} < VC(y)_{z'}$).
- $x \to y$ denotes that event $x$ happened before event $y$. It is defined as: if $x \to y$, then $VC(x) < VC(y)$.
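Properties:
- Antisymmetry: if $VC(a) < VC(b)$, then $\neg\,(VC(b) < VC(a))$.
- Transitivity: if $VC(a) < VC(b)$ and $VC(b) < VC(c)$, then $VC(a) < VC(c)$; or, if $a \to b$ and $b \to c$, then $a \to c$.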
Relation with other orders:
- Let $RT(x)$ be the real time when event $x$ occurs. If $VC(a) < VC(b)$, then $RT(a) < RT(b)$.
- Let $C(x)$ be the Lamport timestamp of event $x$. If $VC(a) < VC(b)$, then $C(a) < C(b)$.
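The comparison rule in this section translates directly into code. The sketch below is illustrative only: the function names `vc_less` and `concurrent` are assumptions made for the example. It also relies on the common convention that two distinct clocks ordered in neither direction are treated as concurrent.

```python
from typing import Sequence


def vc_less(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if every component of a is <= b and at least one is strictly less."""
    if len(a) != len(b):
        raise ValueError("vector clocks must have the same length")
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def concurrent(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if the clocks differ and neither is less than the other."""
    return tuple(a) != tuple(b) and not vc_less(a, b) and not vc_less(b, a)


a = [2, 0, 0]
b = [2, 1, 0]
c = [0, 0, 1]

print(vc_less(a, b))     # True: a is component-wise <= b, strictly less in entry 1
print(vc_less(b, a))     # False: the order cannot hold in both directions
print(concurrent(a, c))  # True: each clock is ahead of the other in some entry
```

Because the order is only partial, a caller usually needs three outcomes (before, after, concurrent) rather than the two a scalar timestamp comparison gives.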
Other mechanisms
- In 1999, Torres-Rojas and Ahamad developed Plausible Clocks, a mechanism that takes less space than vector clocks but that, in some cases, will totally order events that are causally concurrent.
- In 2005, Agarwal and Garg created Chain Clocks,[7] a system that tracks dependencies using vectors smaller than the number of processes and that adapts automatically to systems with a dynamic number of processes.
- In 2008, Almeida et al. introduced Interval Tree Clocks. This mechanism generalizes Vector Clocks and allows operation in dynamic environments where the identities and number of processes in the computation are not known in advance.
- In 2019, Lum Ramabaja proposed Bloom Clocks, a probabilistic data structure based on Bloom filters.[8] [9] Compared to a vector clock, the space used per node is fixed and does not depend on the number of nodes in a system. Comparing two clocks either produces a true negative (the clocks are not comparable), or else a suggestion that one clock precedes the other, with the possibility of a false positive where the two clocks are unrelated. The false positive rate decreases as more storage is allowed.
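To make the Bloom Clock comparison behaviour concrete, here is a rough sketch built on a counting Bloom filter. It is not the published construction: the class name `BloomClock`, the function `may_precede`, the sizes `m` and `k`, the SHA-256 based hashing, and the event identifiers are all assumptions chosen for illustration.

```python
import hashlib
from typing import List


class BloomClock:
    """Fixed-size clock: m counters, k hash positions per event (assumed layout)."""

    def __init__(self, m: int = 16, k: int = 3) -> None:
        self.m = m
        self.k = k
        self.counters: List[int] = [0] * m

    def _positions(self, event_id: str) -> List[int]:
        # Derive k counter indices from the event identifier (assumed scheme).
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{event_id}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def tick(self, event_id: str) -> None:
        # Record an event by incrementing each of its k counters.
        for pos in self._positions(event_id):
            self.counters[pos] += 1

    def merge(self, other_counters: List[int]) -> None:
        # On receipt, take the element-wise maximum with the sender's counters.
        self.counters = [max(a, b) for a, b in zip(self.counters, other_counters)]

    def snapshot(self) -> List[int]:
        return list(self.counters)


def may_precede(a: List[int], b: List[int]) -> bool:
    """False means a definitely does not precede b; True is only a suggestion."""
    return all(x <= y for x, y in zip(a, b))


clock = BloomClock()
clock.tick("node-a:1")
earlier = clock.snapshot()
clock.tick("node-a:2")
later = clock.snapshot()
```

Here `may_precede(earlier, later)` holds and `may_precede(later, earlier)` does not. For two unrelated clocks a `True` result can be a false positive, which becomes less likely as `m` grows.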
Notes and References
- Distributed Systems 3rd edition (2017). DISTRIBUTED-SYSTEMS.NET. 2021-03-21.
- Lamport, L. (1978). "Time, clocks, and the ordering of events in a distributed system". Communications of the ACM. 21 (7): 558–565. doi:10.1145/359545.359563. S2CID 215822405.
- Schwarz, Reinhard; Mattern, Friedemann (March 1994). "Detecting causal relationships in distributed computations: In search of the holy grail". Distributed Computing. 7 (3): 149–174. doi:10.1007/BF02277859. S2CID 3065996.
- Kuper, Lindsey. "Who invented vector clocks?". decomposition ∘ al. 8 April 2023. The papers are (in chronological order):
  - Fischer, Michael J.; Michael, Alan (1982). "Sacrificing serializability to attain high availability of data in an unreliable network". Proceedings of the 1st ACM SIGACT-SIGMOD symposium on Principles of database systems - PODS '82. p. 70. doi:10.1145/588111.588124. ISBN 0897910702. S2CID 8774876.
  - Parker, D.S.; Popek, G.J.; Rudisin, G.; Stoughton, A.; Walker, B.J.; Walton, E.; Chow, J.M.; Edwards, D.; Kiser, S.; Kline, C. (May 1983). "Detection of Mutual Inconsistency in Distributed Systems". IEEE Transactions on Software Engineering. SE-9 (3): 240–247. doi:10.1109/TSE.1983.236733. S2CID 2483222.
  - Wuu, Gene T.J.; Bernstein, Arthur J. (1984). "Efficient solutions to the replicated log and dictionary problems". Proceedings of the third annual ACM symposium on Principles of distributed computing - PODC '84. pp. 233–242. doi:10.1145/800222.806750. ISBN 0897911431. S2CID 2384672.
  - Strom, Rob; Yemini, Shaula (August 1985). "Optimistic recovery in distributed systems". ACM Transactions on Computer Systems. 3 (3): 204–226. doi:10.1145/3959.3962. S2CID 1941122.
  - Schmuck, Frank B. (November 1985). "Software clocks and the order of events in a distributed system" (unpublished).
  - Liskov, Barbara; Ladin, Rivka (1986). "Highly available distributed services and fault-tolerant distributed garbage collection". Proceedings of the fifth annual ACM symposium on Principles of distributed computing - PODC '86. pp. 29–39. doi:10.1145/10590.10593. ISBN 0897911989. S2CID 16148617.
  - Raynal, Michel (February 1987). "A distributed algorithm to prevent mutual drift between n logical clocks". Information Processing Letters. 24 (3): 199–202. doi:10.1016/0020-0190(87)90186-4.
- Fidge, Colin J. (February 1988). "Timestamps in message-passing systems that preserve the partial ordering". In K. Raymond (ed.). Proceedings of the 11th Australian Computer Science Conference (ACSC'88). Vol. 10, no. 1. pp. 56–66. Retrieved 2009-02-13.
- Mattern, Friedemann (October 1988). "Virtual Time and Global States of Distributed systems". In Cosnard, M. (ed.). Proc. Workshop on Parallel and Distributed Algorithms. Chateau de Bonas, France: Elsevier. pp. 215–226.
- Agarwal, Anurag; Garg, Vijay K. (17 July 2005). "Efficient dependency tracking for relevant events in shared-memory systems". Proceedings of the twenty-fourth annual ACM symposium on Principles of distributed computing. Association for Computing Machinery. pp. 19–28. doi:10.1145/1073814.1073818. ISBN 1-58113-994-2. S2CID 11779779. http://users.ece.utexas.edu/~garg/dist/agarwal-garg-DC.pdf. Retrieved 21 April 2021.
- Pozzetti, Tommaso; Kshemkalyani, Ajay D. (1 April 2021). "Resettable Encoded Vector Clock for Causality Analysis With an Application to Dynamic Race Detection". IEEE Transactions on Parallel and Distributed Systems. 32 (4): 772–785. doi:10.1109/TPDS.2020.3032293. S2CID 220362525.
- Kulkarni, Sandeep S; Appleton, Gabe; Nguyen, Duong (4 January 2022). "Achieving Causality with Physical Clocks". Proceedings of the 23rd International Conference on Distributed Computing and Networking. pp. 97–106. arXiv:2104.15099. doi:10.1145/3491003.3491009. ISBN 9781450395601. S2CID 233476293.