
Chapter 3 - Part 2. Sharing data between threads

Unless specified otherwise, all the code is written against the C++23 standard and compiled with g++ 13.2.
command: `g++ -std=c++2b -O0 ch3.cpp -o out && ./out`
Deadlock
Imagine you lock the key for lock A inside a container secured by lock B, and lock the key for lock B inside a container secured by lock A. Now you cannot open lock A because you need key B, and you cannot open lock B because you need key A. This is called a deadlock.
```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
graph LR
    A[Lock A] --> B[Lock B]
    B --> A
```
A similar thing can happen with a mutex, because a mutex is also a kind of lock. Consider the following code:
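A minimal sketch of the situation (the exact bodies of `opA` and `opB` are assumptions; only the names and `lock_a` come from the explanation below):

```cpp
#include <mutex>

std::mutex lock_a;

void opB() {
    std::lock_guard<std::mutex> guard(lock_a);  // tries to lock lock_a while this thread already owns it
    // ... do something with the shared data ...
}

void opA() {
    std::lock_guard<std::mutex> guard(lock_a);  // acquires lock_a
    opB();                                      // opB tries to lock lock_a again: deadlock
}
```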
In this case, when you call `opA`, a deadlock will occur: `opA` has acquired ownership of `lock_a`, so `opB` has to wait for `opA` to finish, while `opA` cannot finish before `opB` finishes. Therefore, a deadlock occurs.
For this case, the C++ standard library has a special mutex called `std::recursive_mutex` that can be locked multiple times by the same thread. This way, a deadlock within a single thread can be avoided.
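A minimal sketch with the mutex type changed to `std::recursive_mutex` (the print statements are illustrative, not from the original listing):

```cpp
#include <iostream>
#include <mutex>

std::recursive_mutex lock_a;

void opB() {
    std::lock_guard<std::recursive_mutex> guard(lock_a);  // second lock by the same thread: OK
    std::cout << "opB done\n";
}

void opA() {
    std::lock_guard<std::recursive_mutex> guard(lock_a);  // first lock
    opB();
    std::cout << "opA done\n";
}

int main() {
    opA();  // prints "opB done" then "opA done" instead of deadlocking
}
```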
Both lock acquisitions succeed and the program finishes normally, which is exactly what we expect.
Another scenario where a deadlock can appear is when two threads execute two methods that both require two locks, but acquire them in different orders. Consider the following code:
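A minimal sketch of two operations that acquire the same pair of mutexes in opposite order (the function names `badOp1`/`badOp2` are assumptions suggested by the later `badOp3`; the mutex names follow the diagram and explanation below, and the object wrapping both mutexes is omitted for brevity):

```cpp
#include <mutex>
#include <thread>

std::mutex lock_a;
std::mutex lock_b;

void badOp1() {
    std::lock_guard<std::mutex> ga(lock_a);  // locks lock_a first...
    std::lock_guard<std::mutex> gb(lock_b);  // ...then lock_b
}

void badOp2() {
    std::lock_guard<std::mutex> gb(lock_b);  // locks lock_b first...
    std::lock_guard<std::mutex> ga(lock_a);  // ...then lock_a
}

int main() {
    std::thread th1(badOp1);
    std::thread th2(badOp2);
    th1.join();
    th2.join();  // may never return: th1 holds lock_a, th2 holds lock_b
}
```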
One possible execution order will be:
```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
sequenceDiagram
    participant A as th1
    participant B as th2
    participant C as obj
    A->>C: lock lock_a
    B->>C: lock lock_b
    A->>C: lock lock_b
    Note right of B: deadlock because<br/>b is locked by th2
    B->>C: lock lock_a
```
In this case, the deadlock occurs because the locking order differs: when `th1` needs `lock_b`, it has already been locked by `th2`, and when `th2` needs `lock_a`, it has already been locked by `th1`. Therefore, a deadlock occurs.
One way to avoid this is to make sure the locking order is always the same. Sometimes this is intuitive, but sometimes it's not. Consider a function that swaps the member variables of two objects:
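A minimal sketch of such a swap (only the class name `BadBadBad` comes from the later listing; the members `m` and `data` are assumptions):

```cpp
#include <mutex>
#include <utility>

class BadBadBad {
public:
    std::mutex m;
    int data = 0;
};

// The locking order depends on the argument order: obj1.m is always locked first.
void badSwap(BadBadBad& obj1, BadBadBad& obj2) {
    std::lock_guard<std::mutex> g1(obj1.m);
    std::lock_guard<std::mutex> g2(obj2.m);
    std::swap(obj1.data, obj2.data);
}
```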
The order of locking the two mutexes seems to be fixed across threads, but consider this code:
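A sketch of how the argument order flips the locking order (only the name `badOp3` survives from the original listing; the thread bodies are assumptions):

```cpp
#include <thread>

BadBadBad x, y;

void badOp3() {
    std::thread th1([] { badSwap(x, y); });  // locks x.m, then y.m
    std::thread th2([] { badSwap(y, x); });  // locks y.m, then x.m
    th1.join();
    th2.join();  // may deadlock: each thread holds the mutex the other one needs
}
```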
The parameters are swapped, so the locking order is swapped as well. Therefore, a deadlock can occur just like in the previous scenario.
The C++ standard library provides a way to avoid this problem: `std::lock` can lock multiple mutexes at the same time using a deadlock-avoidance algorithm, so this kind of deadlock can be avoided.
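A sketch of `goodSwap` using `std::lock` (the member names follow the earlier sketch and are assumptions):

```cpp
void goodSwap(BadBadBad& obj1, BadBadBad& obj2) {
    std::lock(obj1.m, obj2.m);  // acquires both mutexes without risking deadlock
    // adopt_lock: the guards take ownership of mutexes that are already locked
    std::lock_guard<std::mutex> g1(obj1.m, std::adopt_lock);
    std::lock_guard<std::mutex> g2(obj2.m, std::adopt_lock);
    std::swap(obj1.data, obj2.data);
}
```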
In this case, both mutexes are acquired together, so swapping the parameter order won't change whether a deadlock can occur. Also, since the mutexes are already locked by `std::lock`, we pass `std::adopt_lock` to tell each lock guard that the mutex is already locked and doesn't need to be locked again, only released at the end of the scope.
The C++ standard library provides an even more convenient tool for dealing with multiple mutexes that must be locked as a group: `std::scoped_lock`.
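A sketch of the same swap written with `std::scoped_lock` (again using the member names assumed earlier):

```cpp
void goodSwap2(BadBadBad& obj1, BadBadBad& obj2) {
    std::scoped_lock guard(obj1.m, obj2.m);  // locks both; unlocks them when the scope ends
    std::swap(obj1.data, obj2.data);
}
```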
`std::scoped_lock` is a RAII lock wrapper like `std::lock_guard`, but it accepts a list of mutexes and locks them using the same deadlock-avoidance algorithm as `std::lock`, with RAII support on top.
`std::scoped_lock` is a C++17 feature and relies on two language features: variadic templates and class template argument deduction. When you supply multiple mutexes, the class template deduces the correct mutex types from the parameters passed in.
Definition and implementation of `std::scoped_lock`:
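A simplified sketch of the idea behind the implementation (not the actual library source, which also specializes for zero and one mutex):

```cpp
#include <mutex>
#include <tuple>

template <class... MutexTypes>
class scoped_lock_sketch {
public:
    explicit scoped_lock_sketch(MutexTypes&... ms) : mutexes_(ms...) {
        std::lock(ms...);  // deadlock-avoiding lock of all mutexes (requires at least two)
    }
    ~scoped_lock_sketch() {
        // Unlock every managed mutex when the scope ends.
        std::apply([](auto&... ms) { (..., ms.unlock()); }, mutexes_);
    }
    scoped_lock_sketch(const scoped_lock_sketch&) = delete;
    scoped_lock_sketch& operator=(const scoped_lock_sketch&) = delete;
private:
    std::tuple<MutexTypes&...> mutexes_;  // references to the managed mutexes
};
```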
Even though `std::scoped_lock` solves the problem of locking multiple mutexes together, it cannot help when the mutexes are locked separately. Consider this code:
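A minimal sketch of what "locked separately" means (the function names are illustrative):

```cpp
#include <mutex>

std::mutex lock_a;
std::mutex lock_b;

void doB() {
    std::lock_guard<std::mutex> gb(lock_b);  // second mutex, acquired deep inside a call
    // ... work that needs lock_b ...
}

void doAThenB() {
    std::lock_guard<std::mutex> ga(lock_a);  // first mutex
    // ... work that needs lock_a ...
    doB();                                   // lock_b is acquired later, at a different call site
}

void doBThenA() {
    std::lock_guard<std::mutex> gb(lock_b);
    std::lock_guard<std::mutex> ga(lock_a);  // opposite order: can deadlock against doAThenB()
}
```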
In this case, `std::scoped_lock` cannot help because the two mutexes are locked at different points in the code. Here we have to rely on the programmer to make sure the locking order is consistent.
Guidelines to follow to avoid deadlock
- Avoid nested locks

  Most of the problems we discussed involve holding multiple locks at once. Even with tools like `std::lock` and `std::scoped_lock`, there are scenarios where we cannot use them and have to manage the locking order ourselves, which is error-prone.
- Avoid calling user-supplied functions while holding a lock

  Since the code is supplied by the user, you have no idea what it will do to locks or threads; it can do anything, which leads to unpredictable behavior. Thus, it's generally best not to let the user provide a function (whether as a `std::function`, a lambda expression, or another callable object) that runs while a lock is held.
- Always lock mutexes in a fixed order

  When nested locking is unavoidable and `std::lock` or `std::scoped_lock` cannot be used, you must make sure locks are always acquired in a fixed order. For example, in a doubly linked list we might want to increase read and write speed by giving each node its own mutex, reducing the granularity of locking. However, this can lead to multiple mutexes being locked in different places by the same function running on different threads. For example, when we delete a node, we must acquire the lock of the node being deleted and the locks of the nodes adjacent to it. Also, when two threads traverse the list in different directions, each must acquire the lock of the current node and of the next node. The same principle applies when traversing and deleting nodes at the same time. If two threads traverse the list in opposite directions, one possible execution sequence is this:

```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
sequenceDiagram
    participant A as th1
    participant B as th2
    participant C as list
    A->>C: lock i
    A->>C: obtain ptr i + 1
    B->>C: lock i + 1
    A->>C: lock i + 1
    B->>C: lock i
    Note right of B: deadlock because A is waiting for B to release i + 1<br/>and B is waiting for A to release i
    A->>C: unlock i
    B->>C: obtain ptr i
    B->>C: unlock i + 1
```
- Use a wrapper that lets us specify the lock and unlock sequence of mutexes

  We can define a wrapper around `std::mutex` that requires its locks to be acquired and released in a fixed order, and throws an exception otherwise.

  hierarchical mutex:

```cpp
#include <climits>
#include <mutex>
#include <stdexcept>

class HierarchicalMutex {
    using ull = unsigned long long;
private:
    std::mutex internal_lock;
    const ull this_layer;                  // the layer this mutex lives on
    static thread_local ull thread_layer;  // deepest layer currently locked by this thread
    ull prev_layer;                        // value of thread_layer to restore on unlock

    void checkLayerViolation() {
        // Only mutexes on a strictly lower layer than the current thread layer may be locked.
        if (thread_layer <= this_layer) throw std::logic_error("mutex layer violated!");
    }
    void updateLayerVal() {
        prev_layer = thread_layer;  // remember the previous layer so unlock() can restore it
        thread_layer = this_layer;
    }

public:
    explicit HierarchicalMutex(ull layer) : this_layer(layer), prev_layer(0) {}

    void lock() {
        checkLayerViolation();
        internal_lock.lock();
        updateLayerVal();
    }

    void unlock() {
        // Mutexes must be unlocked in the reverse order of locking.
        if (this_layer != thread_layer) throw std::logic_error("mutex layer violated!");
        thread_layer = prev_layer;
        internal_lock.unlock();
    }

    bool try_lock() {
        checkLayerViolation();
        if (!internal_lock.try_lock()) return false;
        updateLayerVal();
        return true;
    }
};

thread_local unsigned long long HierarchicalMutex::thread_layer = ULLONG_MAX;
```

  Let's interpret this code part by part:
  - The member variables at the top of the class hold the internal lock and the layer-state values. `this_layer` is initialized in the constructor and records which "layer" this mutex is on. `thread_layer` holds the "current deepest lock layer" of the calling thread; it is declared `static` and `thread_local`, so each thread has its own copy acting as per-thread state. `prev_layer` holds the previous `thread_layer`; this is necessary because once we unlock the mutex, we have to roll back to the previous layer by restoring `thread_layer` to its previous value.
  - Method `checkLayerViolation` checks whether we are trying to lock a mutex whose layer is not strictly below the layer of the last lock, i.e. a mutex that should have been locked before the ones already held.
  - Method `updateLayerVal` updates the layer values when a new lock is acquired: the current thread layer is saved into `prev_layer`, and `thread_layer` is set to the layer of the newly locked mutex.
  - The public methods `lock`, `unlock`, and `try_lock` allow the mutex to be used by a RAII container like `std::lock_guard`; the class doesn't have to inherit from `std::mutex`. Once these methods are implemented, it can be used like a normal mutex.
    When we lock the mutex, we check that the layer of this mutex is less than the current deepest lock layer, because we don't want to lock a mutex on a "higher layer" and break the locking order. Then, once the mutex is locked, we update the deepest layer to be this layer.
    When we unlock the mutex, we move up one layer by restoring the current deepest layer to the previous one. We also want to make sure we always unlock the mutex that was locked last, so we check that this mutex is the one on the deepest layer.
    `try_lock` follows the same principle as `lock`, but returns false if the internal mutex cannot be locked.
  - The out-of-class definition at the end initializes `thread_layer` to the maximum value of `unsigned long long`, so the first lock in a thread can always be acquired. (A short usage sketch of this class is given right after these guidelines.)
- Avoid inter-thread waiting

  These principles extend to scenarios where no nested lock appears. For example, when one thread is waiting for another thread that is itself waiting to acquire a lock, a deadlock can also occur.

```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
sequenceDiagram
    participant A as th1
    participant B as th2
    participant C as mutex
    A->>B: .join(th2)
    B->>C: wait to acquire lock
    Note right of A: deadlock because<br/>th1 is waiting for th2<br/>and th2 is waiting for a lock
```

  Even without a mutex, inter-thread waiting is dangerous: when two threads join each other and wait for each other to finish, a deadlock occurs.

```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
sequenceDiagram
    participant A as th1
    participant B as th2
    A->>B: .join(th2)
    B->>A: .join(th1)
    Note right of A: deadlock because<br/>th1 is waiting for th2<br/>and th2 is waiting for th1
```

  This is a bad practice that we should avoid. One good practice is to join threads only in the function that starts them, and to define a thread hierarchy so that only a higher-level thread can wait for a lower-level one.
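As promised above, a brief usage sketch of the hierarchical mutex (the layer values and function names here are illustrative, not from the original code; it continues from the class defined earlier):

```cpp
HierarchicalMutex high_layer(10000);
HierarchicalMutex low_layer(500);

void goodOrder() {
    std::lock_guard<HierarchicalMutex> g1(high_layer);  // 10000 < ULLONG_MAX: allowed
    std::lock_guard<HierarchicalMutex> g2(low_layer);   // 500 < 10000: allowed
}

void badOrder() {
    std::lock_guard<HierarchicalMutex> g1(low_layer);   // 500: allowed as the first lock
    std::lock_guard<HierarchicalMutex> g2(high_layer);  // 10000 >= 500: throws std::logic_error
}
```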
Other race condition scenarios and how to avoid them
Race conditions caused by lazy initialization
In many cases, we don't want to initialize a member of an object until we actually need it. This technique is called lazy initialization. For example, a singleton:
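A minimal sketch of such a singleton (the class name is an assumption; `Instance` and `data` follow the discussion below):

```cpp
#include <memory>

class Singleton {
public:
    static Singleton& Instance() {
        if (!data) data.reset(new Singleton);  // initialize only on first use
        return *data;
    }
private:
    Singleton() = default;
    static std::shared_ptr<Singleton> data;
};

std::shared_ptr<Singleton> Singleton::data = nullptr;
```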
This only initializes the object when we actually need it.
This code, however, is not thread-safe: when two threads call `Instance` at the same time, they might both find `data` to be `nullptr` and both initialize it, which is not what we want. We can wrap the check in a mutex, but then we end up with the infamous double-checked locking pattern:
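A sketch of the double-checked locking version (same assumed names, plus a mutex `m`):

```cpp
#include <memory>
#include <mutex>

class Singleton {
public:
    static Singleton& Instance() {
        if (!data) {                               // first check, without holding the lock
            std::lock_guard<std::mutex> guard(m);
            if (!data) data.reset(new Singleton);  // second check, with the lock held
        }
        return *data;
    }
private:
    Singleton() = default;
    static std::shared_ptr<Singleton> data;
    static std::mutex m;
};

std::shared_ptr<Singleton> Singleton::data = nullptr;
std::mutex Singleton::m;
```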
This code prevents the multiple-instance problem, but it introduces another race condition: initialization of `data` can take a long time, and another thread might observe that `data` is not `nullptr` (or at least appears not to be) and return a partially initialized object, which is not what we want.
We can of course make the lock coarser by holding the mutex during the whole check and initialization, but this makes the code slower, since a larger portion of it can no longer execute concurrently.
For this case, C++ offers a better solution: `std::call_once` and `std::once_flag`:
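A sketch of the `std::call_once` version (same assumed names, plus a `std::once_flag`):

```cpp
#include <memory>
#include <mutex>

class Singleton {
public:
    static Singleton& Instance() {
        // The lambda runs exactly once, no matter how many threads call Instance concurrently.
        std::call_once(flag, [] { data.reset(new Singleton); });
        return *data;
    }
private:
    Singleton() = default;
    static std::shared_ptr<Singleton> data;
    static std::once_flag flag;
};

std::shared_ptr<Singleton> Singleton::data = nullptr;
std::once_flag Singleton::flag;
```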
Warning: we pass a lambda to `std::call_once` here because `std::shared_ptr::reset` is an overloaded member function, so its name alone is an unresolved overloaded function type. Since we would have to pass a member function pointer, pick the right overload, and bind `data` as the object argument, it is much simpler to wrap the call in a lambda.
Race conditions caused by static variables
If we recall the memory model of a program, a static variable of a class is stored in the static area of the program, is shared among all instances of the class, and can be accessed through the class name. In C++, a static local variable is initialized the first time the program passes through its declaration, and before C++11 multiple threads could each believe they were the first one to initialize it. In that case, the static variable could be initialized multiple times and its value corrupted.
The initialization behavior was finally defined in C++11: the initialization of a static variable occurs exactly once, even under concurrent access.
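A minimal sketch of how this guarantee is typically used (the so-called Meyers singleton; the class name mirrors the sketches above):

```cpp
class Singleton {
public:
    static Singleton& Instance() {
        static Singleton instance;  // since C++11: initialized exactly once, even concurrently
        return instance;
    }
private:
    Singleton() = default;
};
```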
Optimize reads and writes of rarely updated data with `std::shared_mutex` and `std::shared_lock`
Sometimes we have data that is rarely updated (like a mapping from domain names to IP addresses). If we use a plain mutex to lock the whole data structure whenever we read it, we waste a lot of time, because concurrent reads without modification generally don't create any harmful race condition. For this scenario, the C++17 standard provides a tool: `std::shared_mutex`, with the corresponding RAII wrapper `std::shared_lock`.
Essentially, `std::shared_mutex` is a mutex with two locking modes, selected by the RAII wrapper used to lock it. When it is locked through `std::lock_guard`, it behaves like a normal, exclusive mutex. When it is locked through `std::shared_lock`, it only blocks wrappers that are not `std::shared_lock`, so many readers can hold it at the same time. This way, we can use `std::shared_lock` for reading operations and `std::lock_guard` for writing operations: while we are reading, only writing operations are blocked, and while we are writing, both reading and writing operations are blocked.
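A sketch of a rarely-updated DNS-style cache protected this way (the class and member names are assumptions):

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

class DnsCache {
public:
    // Readers lock in shared mode, so many lookups can run in parallel.
    std::string find(const std::string& domain) const {
        std::shared_lock<std::shared_mutex> guard(m);
        auto it = table.find(domain);
        return it == table.end() ? std::string{} : it->second;
    }

    // Writers lock exclusively, blocking both readers and other writers.
    void update(const std::string& domain, const std::string& ip) {
        std::lock_guard<std::shared_mutex> guard(m);
        table[domain] = ip;
    }

private:
    mutable std::shared_mutex m;
    std::map<std::string, std::string> table;
};
```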
Summary
- Race condition
  - Race conditions can be problematic, but we can use several techniques to protect against them:
    - Use well-designed, mutually exclusive protective wrappers
    - Keep in mind language features like `std::unique_ptr`
    - Wrap certain sets of actions into transactions so that they are indivisible
    - Balance the trade-off between granularity and performance
      - With a good choice of mutex, code can be optimized (such as with `std::unique_lock` and `std::shared_lock`).
  - Race conditions can also be avoided with `std::call_once`, `std::once_flag`, and static variables (since C++11).
  - Never let a user-supplied function acquire a reference or pointer to private data.
- Deadlock
  - There are several ways to avoid deadlock:
    - Avoid nested locks and choose a good granularity
    - If you can't, always lock the mutexes in a fixed order (by using `std::lock` or `std::scoped_lock`)
    - If you can't, use a wrapper that lets you specify the lock and unlock sequence of mutexes
    - If you can't, maybe revise the design decision.
  - Waiting for another thread to finish is also a kind of deadlock
    - Avoid inter-thread waiting
    - Identify the thread hierarchy to make sure that only a higher-level thread can wait for a lower-level one.
Associated code