Lock (computer science) explained

In computer science, a lock or mutex (from mutual exclusion) is a synchronization primitive that prevents shared state from being modified or accessed by multiple threads of execution at once. Locks enforce mutual exclusion concurrency control policies, and since many implementation methods are possible, different lock designs exist for different applications.

Types

Generally, locks are advisory locks, where each thread cooperates by acquiring the lock before accessing the corresponding data. Some systems also implement mandatory locks, where an attempt at unauthorized access to a locked resource raises an exception in the entity attempting the access.

The simplest type of lock is a binary semaphore. It provides exclusive access to the locked data. Other schemes also provide shared access for reading data. Other widely implemented access modes are exclusive, intend-to-exclude and intend-to-upgrade.

Another way to classify locks is by what happens when the lock strategy prevents the progress of a thread. Most locking designs block the execution of the thread requesting the lock until it is allowed to access the locked resource. With a spinlock, the thread simply waits ("spins") until the lock becomes available. This is efficient if threads are blocked for a short time, because it avoids the overhead of operating system process re-scheduling. It is inefficient if the lock is held for a long time, or if the progress of the thread that is holding the lock depends on preemption of the locked thread.
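A minimal spinlock along these lines can be sketched with C11 atomics: the acquiring thread simply retries test-and-set until it observes the flag clear, never involving the scheduler.

```c
#include <assert.h>
#include <stdatomic.h>

// Minimal spinlock: busy-wait ("spin") until test-and-set observes
// the flag clear. No blocking, no operating-system re-scheduling.
static atomic_flag spin = ATOMIC_FLAG_INIT;

static void spin_lock(void) {
    while (atomic_flag_test_and_set(&spin)) {
        // spin: keep retrying until the holder clears the flag
    }
}

static void spin_unlock(void) {
    atomic_flag_clear(&spin);
}
```

`atomic_flag_test_and_set` returns the flag's previous value, so a return of `false` means the caller just transitioned the flag from clear to set and therefore owns the lock.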

Locks typically require hardware support for efficient implementation. This support usually takes the form of one or more atomic instructions such as "test-and-set", "fetch-and-add" or "compare-and-swap". These instructions allow a single process to test if the lock is free, and if free, acquire the lock in a single atomic operation.
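For instance, a lock acquire built on compare-and-swap tests for "free" and claims the lock in one indivisible step. This is a sketch using C11 atomics, with 0 meaning free and 1 meaning held:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int lock_word = 0;   // 0 = free, 1 = held

// Atomically: if the lock is free, take it. The test and the set
// cannot be separated by another thread's access.
static bool try_acquire(void) {
    int expected = 0;
    return atomic_compare_exchange_strong(&lock_word, &expected, 1);
}

static void release(void) {
    atomic_store(&lock_word, 0);
}
```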

Uniprocessor architectures have the option of using uninterruptible sequences of instructions—using special instructions or instruction prefixes to disable interrupts temporarily—but this technique does not work for multiprocessor shared-memory machines. Proper support for locks in a multiprocessor environment can require quite complex hardware or software support, with substantial synchronization issues.

The reason an atomic operation is required is concurrency, where more than one task may execute the same logic. For example, consider the following C code:

    if (lock == 0)
    {
        // lock free; set it
        lock = myPID;
    }

The above example does not guarantee that the task has the lock, since more than one task can be testing the lock at the same time. Since both tasks will detect that the lock is free, both will attempt to set the lock, neither knowing that the other is doing the same. Dekker's or Peterson's algorithms are possible substitutes if atomic locking operations are not available.
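Peterson's algorithm for two threads can be sketched as follows. It needs no atomic read-modify-write instruction, only strictly ordered loads and stores, modeled here with sequentially consistent C11 atomics:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

// Peterson's algorithm: mutual exclusion for exactly two threads
// (self = 0 or 1) using only ordered loads and stores.
static atomic_bool wants[2];
static atomic_int turn;

static void peterson_lock(int self) {
    int other = 1 - self;
    atomic_store(&wants[self], true);  // declare intent to enter
    atomic_store(&turn, other);        // yield priority to the other
    // Wait while the other thread wants in and it is its turn.
    while (atomic_load(&wants[other]) && atomic_load(&turn) == other) {
        // busy-wait
    }
}

static void peterson_unlock(int self) {
    atomic_store(&wants[self], false);
}
```

With no contention, a thread's entry loop exits immediately; under contention, the `turn` variable breaks the tie so exactly one thread proceeds.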

Careless use of locks can result in deadlock or livelock. A number of strategies can be used to avoid or recover from deadlocks or livelocks, both at design-time and at run-time. (The most common strategy is to standardize the lock acquisition sequences so that combinations of inter-dependent locks are always acquired in a specifically defined "cascade" order.)

Some languages support locks syntactically. An example in C# follows:

    public class Account // This is a monitor of an account
    {
        private decimal _balance = 0;
        private readonly object _balanceLock = new object();

        public void Deposit(decimal amount)
        {
            // Only one thread at a time may execute this block
            lock (_balanceLock)
            {
                _balance += amount;
            }
        }
    }

The code lock(this) can lead to problems if the instance can be accessed publicly.[1]

Similar to Java, C# can also synchronize entire methods, by using the MethodImplOptions.Synchronized attribute.[2] [3]

    [MethodImpl(MethodImplOptions.Synchronized)]
    public void SomeMethod()
    {
        // do stuff
    }

Granularity

Before being introduced to lock granularity, one needs to understand three concepts about locks:

- lock overhead: the extra resources for using locks, such as the memory space allocated for locks, the CPU time to initialize and destroy locks, and the time for acquiring or releasing locks;
- lock contention: this occurs whenever one process or thread attempts to acquire a lock held by another process or thread;
- deadlock: the situation when each of at least two tasks is waiting for a lock that the other task holds, so that neither can proceed.

There is a tradeoff between decreasing lock overhead and decreasing lock contention when choosing the number of locks in synchronization.

An important property of a lock is its granularity. The granularity is a measure of the amount of data the lock is protecting. In general, choosing a coarse granularity (a small number of locks, each protecting a large segment of data) results in less lock overhead when a single process is accessing the protected data, but worse performance when multiple processes are running concurrently. This is because of increased lock contention: the coarser the lock, the higher the likelihood that it will stop an unrelated process from proceeding. Conversely, using a fine granularity (a larger number of locks, each protecting a fairly small amount of data) increases the overhead of the locks themselves but reduces lock contention. Granular locking, where each process must hold multiple locks from a common set, can create subtle lock dependencies. This subtlety increases the chance that a programmer will unknowingly introduce a deadlock.
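The tradeoff can be sketched with lock striping: one mutex per hash bucket (fine granularity) instead of one mutex for the whole table (coarse granularity). The names below are illustrative, not a real API:

```c
#include <assert.h>
#include <pthread.h>

// Fine-grained ("striped") locking: one mutex per bucket, so updates
// to different buckets do not contend; the cost is more locks to
// initialize and manage (lock overhead).
#define NBUCKETS 16

static pthread_mutex_t bucket_lock[NBUCKETS];
static int bucket_count[NBUCKETS];

static void table_init(void) {
    for (int i = 0; i < NBUCKETS; i++)
        pthread_mutex_init(&bucket_lock[i], NULL);
}

static void table_increment(int key) {
    int b = key % NBUCKETS;                  // only this bucket's lock
    pthread_mutex_lock(&bucket_lock[b]);     // is taken, so threads on
    bucket_count[b]++;                       // other buckets proceed
    pthread_mutex_unlock(&bucket_lock[b]);   // in parallel
}
```

A coarse-grained version would replace the array of mutexes with a single table-wide mutex: less overhead, but every update would contend on it.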

In a database management system, for example, a lock could protect, in order of decreasing granularity, part of a field, a field, a record, a data page, or an entire table. Coarse granularity, such as using table locks, tends to give the best performance for a single user, whereas fine granularity, such as record locks, tends to give the best performance for multiple users.

Database locks

See main article: Lock (database). Database locks can be used as a means of ensuring transaction synchronicity, i.e. when making transaction processing concurrent (interleaving transactions), using two-phase locking ensures that the concurrent execution of the transactions turns out equivalent to some serial ordering of the transactions. However, deadlocks are an unfortunate side-effect of locking in databases. Deadlocks are either prevented by pre-determining the locking order between transactions or are detected using waits-for graphs. An alternative to locking for database synchronicity while avoiding deadlocks involves the use of totally ordered global timestamps.

There are mechanisms employed to manage the actions of multiple concurrent users on a database—the purpose is to prevent lost updates and dirty reads. The two types of locking are pessimistic locking and optimistic locking:

Where to use pessimistic locking: this is mainly used in environments where data contention (the degree to which users request data from the database system at any one time) is heavy, and where the cost of protecting data through locks is less than the cost of rolling back transactions if concurrency conflicts occur. Pessimistic concurrency is best implemented when lock times will be short, as in programmatic processing of records. Pessimistic concurrency requires a persistent connection to the database and is not a scalable option when users are interacting with data, because records might be locked for relatively long periods of time. It is not appropriate for use in Web application development.

Where to use optimistic locking: this is appropriate in environments where there is low contention for data, or where read-only access to data is required. Optimistic concurrency is used extensively in .NET to address the needs of mobile and disconnected applications,[4] where locking data rows for prolonged periods of time would be infeasible. Also, maintaining record locks requires a persistent connection to the database server, which is not possible in disconnected applications.
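The optimistic pattern can be sketched on a single record: snapshot a version counter, do the work without holding any lock, then commit only if the version is unchanged. This is illustrative only; real database systems compare row versions or timestamps at commit time and roll the transaction back on conflict.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

// Sketch of optimistic concurrency: read a version, compute without
// holding a lock, commit only if nobody changed the record meanwhile.
typedef struct {
    atomic_uint version;
    int value;
} record_t;

static bool optimistic_set(record_t *r, int new_value) {
    unsigned seen = atomic_load(&r->version);  // snapshot the version
    // ... long computation using the snapshot, no lock held ...
    // Commit succeeds only if the version was not bumped meanwhile.
    if (atomic_compare_exchange_strong(&r->version, &seen, seen + 1)) {
        r->value = new_value;
        return true;
    }
    return false;   // conflict detected: caller retries or rolls back
}
```

The key property is that no lock is held during the computation, so a disconnected or slow client cannot block anyone else.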

Lock compatibility table

Several variations and refinements of these major lock types exist, with respective variations of blocking behavior. If a first lock blocks a second lock, the two locks are called incompatible; otherwise the locks are compatible. Often, the blocking interactions between lock types are presented in the technical literature by a lock compatibility table. The following is an example with the common, major lock types:

Lock compatibility table (X marks an incompatible combination; an empty cell marks a compatible one)

    Lock type     read-lock    write-lock
    read-lock                  X
    write-lock    X            X

Comment: In some publications, the table entries are simply marked "compatible" or "incompatible", or respectively "yes" or "no".[5]

Disadvantages

Lock-based resource protection and thread/process synchronization have many disadvantages:

Some concurrency control strategies avoid some or all of these problems. For example, a funnel or serializing tokens can avoid the biggest problem: deadlocks. Alternatives to locking include non-blocking synchronization methods, like lock-free programming techniques and transactional memory. However, such alternative methods often require that the actual lock mechanisms be implemented at a more fundamental level of the operating software. Therefore, they may only relieve the application level from the details of implementing locks, with the problems listed above still needing to be dealt with beneath the application.
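As a small sketch of one such lock-free technique, a shared counter can be maintained with fetch-and-add alone: no lock exists, so no thread blocks and no deadlock is possible.

```c
#include <assert.h>
#include <stdatomic.h>

// Lock-free shared counter: fetch-and-add updates it atomically,
// so no thread ever blocks and no deadlock is possible.
static atomic_long hits;

static long record_hit(void) {
    // Returns the value the counter held before the increment.
    return atomic_fetch_add(&hits, 1);
}
```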

In most cases, proper locking depends on the CPU providing a method of atomic instruction stream synchronization (for example, the addition or deletion of an item into a pipeline requires that all contemporaneous operations needing to add or delete other items in the pipe be suspended during the manipulation of the memory content required to add or delete the specific item). Therefore, an application can often be more robust when it recognizes the burdens it places upon an operating system and is capable of gracefully handling the report of impossible demands.

Lack of composability

One of lock-based programming's biggest problems is that "locks don't compose": it is hard to combine small, correct lock-based modules into equally correct larger programs without modifying the modules or at least knowing about their internals. Simon Peyton Jones (an advocate of software transactional memory) gives the following example of a banking application:[6] design a class that allows multiple concurrent clients to deposit or withdraw money to an account; and give an algorithm to transfer money from one account to another. The lock-based solution to the first part of the problem is:

    class Account:
        member balance: Integer
        member mutex: Lock

        method deposit(n: Integer)
            mutex.lock
            balance ← balance + n
            mutex.unlock

        method withdraw(n: Integer)
            deposit(−n)

The second part of the problem is much more complicated. A routine that is correct for sequential programs would be

    function transfer(from: Account, to: Account, amount: Integer)
        from.withdraw(amount)
        to.deposit(amount)

In a concurrent program, this algorithm is incorrect because when one thread is halfway through, another might observe a state where the amount has been withdrawn from the first account but not yet deposited into the other account: money has gone missing from the system. This problem can only be fixed completely by taking locks on both accounts prior to changing either of them, but then the locks have to be taken according to some arbitrary, global ordering to prevent deadlock:

    function transfer(from: Account, to: Account, amount: Integer)
        if from < to    // arbitrary ordering on the locks
            from.lock
            to.lock
        else
            to.lock
            from.lock
        from.withdraw(amount)
        to.deposit(amount)
        from.unlock
        to.unlock

This solution gets more complicated when more locks are involved, and the function needs to know about all of the locks, so they cannot be hidden.
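The ordered-locking transfer above can be sketched concretely with POSIX mutexes, using the accounts' addresses as the arbitrary global order. The types and names here are illustrative:

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t mutex;
    long balance;
} account_t;

// Acquire both locks in a fixed global order (by address) so that
// two concurrent transfers in opposite directions can never deadlock.
static void transfer(account_t *from, account_t *to, long amount) {
    account_t *first  = (from < to) ? from : to;
    account_t *second = (from < to) ? to : from;
    pthread_mutex_lock(&first->mutex);
    pthread_mutex_lock(&second->mutex);
    from->balance -= amount;
    to->balance += amount;
    pthread_mutex_unlock(&second->mutex);
    pthread_mutex_unlock(&first->mutex);
}
```

Note that `transfer(&a, &b, …)` and `transfer(&b, &a, …)` take the two mutexes in the same order, which is exactly what rules out the circular wait.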

Language support

See also: Barrier (computer science). Programming languages vary in their support for synchronization; the notes below cover the mechanisms offered by specific languages.

Notes and References

  1. Web site: lock Statement (C# Reference).
  2. Web site: ThreadPoolPriority, and MethodImplAttribute. MSDN. 2011-11-22.
  3. Web site: 2011-11-22 . C# From a Java Developer's Perspective . dead . https://archive.today/20130102015335/http://www.25hoursaday.com/CsharpVsJava.html . 2013-01-02 .
  4. Web site: Designing Data Tier Components and Passing Data Through Tiers. Microsoft. August 2002. 2008-05-30. https://web.archive.org/web/20080508154329/http://msdn.microsoft.com/en-us/library/ms978496.aspx. 2008-05-08. dead.
  5. Web site: 2018-03-07 . Lock Based Concurrency Control Protocol in DBMS . 2023-12-28 . GeeksforGeeks . en-US.
  6. Encyclopedia: Beautiful concurrency . Simon . Peyton Jones . Beautiful Code: Leading Programmers Explain How They Think . Greg . Wilson . Andy . Oram . O'Reilly . 2007 .
  7. Book: ISO/IEC 8652:2007. Ada 2005 Reference Manual. Protected Units and Protected Objects. http://www.adaic.com/standards/1zrm/html/RM-9-4.html. A protected object provides coordinated access to shared data, through calls on its visible protected operations, which can be protected subprograms or protected entries.. 2010-02-27.
  8. Book: ISO/IEC 8652:2007. Ada 2005 Reference Manual. Example of Tasking and Synchronization. http://www.adaic.com/standards/1zrm/html/RM-9-11.html. 2010-02-27.
  9. Web site: Mutual Exclusion Locks. Marshall. Dave. March 1999. 2008-05-30.
  10. Web site: Synchronize. msdn.microsoft.com. 2008-05-30.
  11. Web site: Synchronization. Sun Microsystems. 2008-05-30.
  12. Web site: Apple Threading Reference. Apple, inc. 2009-10-17.
  13. Web site: NSLock Reference. Apple, inc. 2009-10-17.
  14. Web site: NSRecursiveLock Reference. Apple, inc. 2009-10-17.
  15. Web site: NSConditionLock Reference. Apple, inc. 2009-10-17.
  16. Web site: NSLocking Protocol Reference. Apple, inc. 2009-10-17.
  17. Web site: flock.
  18. Web site: The Mutex class. 2016-12-29. 2017-07-04. https://web.archive.org/web/20170704152552/http://php.net/manual/en/class.mutex.php. dead.
  19. Web site: Thread Synchronization Mechanisms in Python. Lundh. Fredrik. July 2007. 2008-05-30. 2020-11-01. https://web.archive.org/web/20201101025814/http://effbot.org/zone/thread-synchronization.htm. dead.
  20. Web site: Coarrays in the next Fortran Standard. John Reid. 2010. 2020-02-17.
  21. Web site: Programming Ruby: Threads and Processes. 2001. 2008-05-30.
  22. Web site: std::sync::Mutex - Rust . doc.rust-lang.org . 3 November 2020.
  23. Web site: Shared-State Concurrency - The Rust Programming Language . doc.rust-lang.org . 3 November 2020.
  24. Book: Marlow, Simon. Simon Marlow. August 2013. Parallel and Concurrent Programming in Haskell. O’Reilly Media. 9781449335946. Basic concurrency: threads and MVars.
  25. Book: Marlow, Simon. Simon Marlow. August 2013. Parallel and Concurrent Programming in Haskell. O’Reilly Media. 9781449335946. Software transactional memory.
  26. Web site: sync package - sync - pkg.go.dev. 2021-11-23. pkg.go.dev.