Global Sources
EE Times-India
EE Times-India > EDA/IP

Common problems in multi-core programming, Part 1: Threads, data races, deadlocks, live locks

Posted: 03 Mar 2009     Print Version  Bookmark and Share

Keywords:Parallel programming  multi-core processors  threading 

Fortunately, in such a scenario the high-level locking causes the low-level locks to be uncontended, and most lock implementations optimise the uncontended case. Hence the performance impact is somewhat mitigated, but for best performance the superfluous locks should be removed. Of course there are times when components should provide their own internal locking. This topic is discussed later in the discussion of thread-safe libraries.

Dealing with deadlocks
Race conditions are typically cured by adding a lock that protects the invariant that might otherwise be violated by interleaved operations. Unfortunately, locks have their own hazards, most notably deadlock. Figure 7.4 shows a deadlock involving two threads. Thread 1 has acquired lock A. Thread 2 has acquired lock B. Each thread is trying to acquire the other lock. Neither thread can proceed.

image name

Figure 7.4: Deadlock caused by cycle.

Though deadlock is often associated with locks, it can happen any time a thread tries to acquire exclusive access to two more shared resources. For example, the locks in Figure 7.4 could be files instead, where the threads are trying to acquire exclusive file access. Deadlock can occur only if the following four conditions hold true:
1. Access to each resource is exclusive.
2. A thread is allowed to hold one resource while requesting another.
3. No thread is willing to relinquish a resource that it has acquired.
4. There is a cycle of threads trying to acquire resources, where each resource is held by one thread and requested by another.

Deadlock can be avoided by breaking any one of these conditions.

Often the best way to avoid deadlock is to replicate a resource that requires exclusive access, so that each thread can have its own private copy. Each thread can access its own copy without needing a lock. The copies can be merged into a single shared copy of the resource at the end if necessary.

By eliminating locking, replication avoids deadlock and has the further benefit of possibly improving scalability, because the lock that was removed might have been a source of contention.

If replication cannot be done, that is, in such cases where there really must be only a single copy of the resource, common wisdom is to always acquire the resources (locks) in the same order. Consistently ordering acquisition prevents deadlock cycles. For instance, the deadlock in Figure 7.4 cannot occur if threads always acquire lock A before they acquire lock B.

The ordering rules that are most convenient depend upon the specific situation. If the locks all have associated names, even something as simple as alphabetical order works. This order may sound silly, but it has been successfully used on at least one large project.

For multiple locks in a data structure, the order is often based on the topology of the structure. In a linked list, for instance, the agreed upon order might be to lock items in the order they appear in the list.

In a tree structure, the order might be a pre-order traversal of the tree. Somewhat similarly, components often have a nested structure, where bigger components are built from smaller components. For components nested that way, a common order is to acquire locks in order from the outside to the inside.

If there is no obvious ordering of locks, a solution is to sort the locks by address. This approach requires that a thread know all locks that it needs to acquire before it acquires any of them.

For instance, perhaps a thread needs to swap two containers pointed to by pointers x and y, and each container is protected by a lock. The thread could compare "x < y" to determine which container comes first, and acquire the lock on the first container before acquiring a lock on the second container, as Figure 7.5 illustrates.

image name

Figure 7.5: Locks ordered by their addresses.

In large software projects, different programmers construct different components, and by necessity should not have to understand the inner workings of the other components. It follows that to prevent accidental deadlock, software components should try to avoid holding a lock while calling code outside the component, because the call chain may cycle around and create a deadlock cycle.

The third condition for deadlock is that no thread is willing to give up its claim on a resource. Thus another way of preventing deadlock is for a thread to give up its claim on a resource if it cannot acquire the other resources. For this purpose, mutexes often have some kind of "try lock" routine that allows a thread to attempt to acquire a lock, and give up if it cannot be acquired.

This approach is useful in scenarios where sorting the locks is impractical. Figure 7.6 sketches the logic for using a "try lock" approach to acquire two locks, A and B. In Figure 7.6, a thread tries to acquire both locks, and if it cannot, it release both locks and tries again.

image name

Figure 7.6: "Try and back off" logic.

Figure 7.6 has some timing delays in it to prevent the hazard of live lock. Live lock occurs when threads continually conflict with each other and back off. Figure 7.6 applies exponential backoff to avoid live lock. If a thread cannot acquire all the locks that it needs, it releases any that it acquired and waits for a random amount of time.

The random time is chosen from an interval that doubles each time the thread backs off. Eventually, the threads involved in the conflict will back off sufficiently that at least one will make progress. The disadvantage of backoff schemes is that they are not fair. There is no guarantee that a particular thread will make progress. If fairness is an issue, then it is probably best to use lock ordering to prevent deadlock.

- Shameem Akhter and Jason Roberts
  Intel Corp.

View the PDF document for more information.

 First Page Previous Page 1 • 2 • 3

Comment on "Common problems in multi-core progra..."
*  You can enter [0] more charecters.
*Verify code:


Visit Asia Webinars to learn about the latest in technology and get practical design tips.


Go to top             Connect on Facebook      Follow us on Twitter      Follow us on Orkut

Back to Top