
Chapter 3 - Part 2. Sharing data between threads

Unless specified otherwise, all the code is written against the C++23 standard and compiled with g++ 13.2.
command: `g++ -std=c++2b -O0 ch3.cpp -o out && ./out`
Deadlock
Imagine you lock the key for lock A inside a container secured by lock B, and lock the key for lock B inside a container secured by lock A. Now you cannot open lock A because you need key B, and you cannot open lock B because you need key A. This is called a deadlock.
```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
graph LR
    A[Lock A] --> B[Lock B]
    B --> A
```
A similar thing can happen with a mutex, because a mutex is also a kind of lock. Consider the following code:
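A minimal sketch of the situation (the exact bodies of `opA` and `opB` are assumptions; only the names and `lock_a` come from the explanation below):

```cpp
#include <mutex>

std::mutex lock_a;

void opB() {
    std::lock_guard<std::mutex> guard(lock_a);  // tries to lock lock_a while this thread already owns it
    // ... do something with the shared data ...
}

void opA() {
    std::lock_guard<std::mutex> guard(lock_a);  // acquires lock_a
    opB();                                      // opB tries to lock lock_a again: deadlock
}
```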
In this case, when you call `opA`, a deadlock will occur: `opA` has acquired ownership of `lock_a`, so `opB` has to wait for `opA` to finish, while `opA` cannot finish before `opB` finishes. Therefore, a deadlock occurs.
For this case, the C++ standard library has a special mutex called `std::recursive_mutex` that can be locked multiple times by the same thread. This way, a deadlock within a single thread can be avoided.
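A minimal sketch with the mutex type changed to `std::recursive_mutex` (the print statements are illustrative, not from the original listing):

```cpp
#include <iostream>
#include <mutex>

std::recursive_mutex lock_a;

void opB() {
    std::lock_guard<std::recursive_mutex> guard(lock_a);  // second lock by the same thread: OK
    std::cout << "opB done\n";
}

void opA() {
    std::lock_guard<std::recursive_mutex> guard(lock_a);  // first lock
    opB();
    std::cout << "opA done\n";
}

int main() {
    opA();  // prints "opB done" then "opA done" instead of deadlocking
}
```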
Both lock acquisitions succeed and the program finishes normally, which is exactly what we expect.
Another scenario where a deadlock can appear is when two threads execute two methods that both require two locks, but acquire them in different orders. Consider the following code:
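A minimal sketch of two operations that acquire the same pair of mutexes in opposite order (the function names `badOp1`/`badOp2` are assumptions suggested by the later `badOp3`; the mutex names follow the diagram and explanation below, and the object wrapping both mutexes is omitted for brevity):

```cpp
#include <mutex>
#include <thread>

std::mutex lock_a;
std::mutex lock_b;

void badOp1() {
    std::lock_guard<std::mutex> ga(lock_a);  // locks lock_a first...
    std::lock_guard<std::mutex> gb(lock_b);  // ...then lock_b
}

void badOp2() {
    std::lock_guard<std::mutex> gb(lock_b);  // locks lock_b first...
    std::lock_guard<std::mutex> ga(lock_a);  // ...then lock_a
}

int main() {
    std::thread th1(badOp1);
    std::thread th2(badOp2);
    th1.join();
    th2.join();  // may never return: th1 holds lock_a, th2 holds lock_b
}
```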
One possible execution order will be:
```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
sequenceDiagram
    participant A as th1
    participant B as th2
    participant C as obj
    A->>C: lock lock_a
    B->>C: lock lock_b
    A->>C: lock lock_b
    Note right of B: deadlock because<br/>b is locked by th2
    B->>C: lock lock_a
```
In this case, the deadlock occurs because the locking order differs: when `th1` needs `lock_b`, it has already been locked by `th2`, and when `th2` needs `lock_a`, it has already been locked by `th1`. Therefore, a deadlock occurs.
One way to avoid this is to make sure the locking order is always the same. Sometimes this is intuitive, but sometimes it's not. Consider a function that swaps the member variables of two objects:
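A minimal sketch of such a swap (only the class name `BadBadBad` comes from the later listing; the members `m` and `data` are assumptions):

```cpp
#include <mutex>
#include <utility>

class BadBadBad {
public:
    std::mutex m;
    int data = 0;
};

// The locking order depends on the argument order: obj1.m is always locked first.
void badSwap(BadBadBad& obj1, BadBadBad& obj2) {
    std::lock_guard<std::mutex> g1(obj1.m);
    std::lock_guard<std::mutex> g2(obj2.m);
    std::swap(obj1.data, obj2.data);
}
```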
The order of locking the two mutexes seems to be fixed across threads, but consider this code:
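A sketch of how the argument order flips the locking order (only the name `badOp3` survives from the original listing; the thread bodies are assumptions):

```cpp
#include <thread>

BadBadBad x, y;

void badOp3() {
    std::thread th1([] { badSwap(x, y); });  // locks x.m, then y.m
    std::thread th2([] { badSwap(y, x); });  // locks y.m, then x.m
    th1.join();
    th2.join();  // may deadlock: each thread holds the mutex the other one needs
}
```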
The parameters are swapped, so the locking order is swapped as well. Therefore, a deadlock can occur just like in the previous scenario.
The C++ standard library provides a way to avoid this problem: `std::lock` can lock multiple mutexes at the same time using a deadlock-avoidance algorithm, so this kind of deadlock can be avoided.
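A sketch of `goodSwap` using `std::lock` (the member names follow the earlier sketch and are assumptions):

```cpp
void goodSwap(BadBadBad& obj1, BadBadBad& obj2) {
    std::lock(obj1.m, obj2.m);  // acquires both mutexes without risking deadlock
    // adopt_lock: the guards take ownership of mutexes that are already locked
    std::lock_guard<std::mutex> g1(obj1.m, std::adopt_lock);
    std::lock_guard<std::mutex> g2(obj2.m, std::adopt_lock);
    std::swap(obj1.data, obj2.data);
}
```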
In this case, both mutexes are acquired together, so swapping the parameter order won't change whether a deadlock can occur. Also, since the mutexes are already locked by `std::lock`, we pass `std::adopt_lock` to tell each lock guard that the mutex is already locked and doesn't need to be locked again, only released at the end of the scope.
The C++ standard library provides an even more convenient tool for dealing with multiple mutexes that must be locked as a group: `std::scoped_lock`.
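A sketch of the same swap written with `std::scoped_lock` (again using the member names assumed earlier):

```cpp
void goodSwap2(BadBadBad& obj1, BadBadBad& obj2) {
    std::scoped_lock guard(obj1.m, obj2.m);  // locks both; unlocks them when the scope ends
    std::swap(obj1.data, obj2.data);
}
```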
`std::scoped_lock` is a RAII lock wrapper like `std::lock_guard`, but it accepts a list of mutexes and locks them using the same deadlock-avoidance algorithm as `std::lock`, with RAII support on top.
`std::scoped_lock` is a C++17 feature and relies on two language features: variadic templates and class template argument deduction. When you supply multiple mutexes, the class template deduces the correct mutex types from the parameters passed in.
Definition and implementation of `std::scoped_lock`:
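A simplified sketch of the idea behind the implementation (not the actual library source, which also specializes for zero and one mutex):

```cpp
#include <mutex>
#include <tuple>

template <class... MutexTypes>
class scoped_lock_sketch {
public:
    explicit scoped_lock_sketch(MutexTypes&... ms) : mutexes_(ms...) {
        std::lock(ms...);  // deadlock-avoiding lock of all mutexes (requires at least two)
    }
    ~scoped_lock_sketch() {
        // Unlock every managed mutex when the scope ends.
        std::apply([](auto&... ms) { (..., ms.unlock()); }, mutexes_);
    }
    scoped_lock_sketch(const scoped_lock_sketch&) = delete;
    scoped_lock_sketch& operator=(const scoped_lock_sketch&) = delete;
private:
    std::tuple<MutexTypes&...> mutexes_;  // references to the managed mutexes
};
```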
Even though `std::scoped_lock` solves the problem of locking multiple mutexes together, it cannot help when the mutexes are locked separately. Consider this code:
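A minimal sketch of what "locked separately" means (the function names are illustrative):

```cpp
#include <mutex>

std::mutex lock_a;
std::mutex lock_b;

void doB() {
    std::lock_guard<std::mutex> gb(lock_b);  // second mutex, acquired deep inside a call
    // ... work that needs lock_b ...
}

void doAThenB() {
    std::lock_guard<std::mutex> ga(lock_a);  // first mutex
    // ... work that needs lock_a ...
    doB();                                   // lock_b is acquired later, at a different call site
}

void doBThenA() {
    std::lock_guard<std::mutex> gb(lock_b);
    std::lock_guard<std::mutex> ga(lock_a);  // opposite order: can deadlock against doAThenB()
}
```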
In this case, `std::scoped_lock` cannot help because the two mutexes are locked at different points in the code. Here we have to rely on the programmer to make sure the locking order is consistent.
Guidelines to follow to avoid deadlock
- Avoid nested locks

  Most of the problems we discussed involve holding multiple locks at once. Even with tools like `std::lock` and `std::scoped_lock`, there are scenarios where we cannot use them and have to manage the locking order ourselves, which is error-prone.
- Avoid calling user-supplied functions while holding a lock

  Since the code is supplied by the user, you have no idea what it will do to locks or threads; it can do anything, which leads to unpredictable behavior. Thus, it's generally best not to let the user provide a function (whether as a `std::function`, a lambda expression, or another callable object) that runs while a lock is held.
- Always lock mutexes in a fixed order

  When nested locking is unavoidable and `std::lock` or `std::scoped_lock` cannot be used, you must make sure locks are always acquired in a fixed order. For example, in a doubly linked list we might want to increase read and write speed by giving each node its own mutex, reducing the granularity of locking. However, this can lead to multiple mutexes being locked in different places by the same function running on different threads. For example, when we delete a node, we must acquire the lock of the node being deleted and the locks of the nodes adjacent to it. Also, when two threads traverse the list in different directions, each must acquire the lock of the current node and of the next node. The same principle applies when traversing and deleting nodes at the same time. If two threads traverse the list in opposite directions, one possible execution sequence is this:

```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
sequenceDiagram
    participant A as th1
    participant B as th2
    participant C as list
    A->>C: lock i
    A->>C: obtain ptr i + 1
    B->>C: lock i + 1
    A->>C: lock i + 1
    B->>C: lock i
    Note right of B: deadlock because A is waiting for B to release i + 1<br/>and B is waiting for A to release i
    A->>C: unlock i
    B->>C: obtain ptr i
    B->>C: unlock i + 1
```
- Use a wrapper that lets us specify the lock and unlock sequence of mutexes

  We can define a wrapper around `std::mutex` that requires its locks to be acquired and released in a fixed order, and throws an exception otherwise.

  hierarchical mutex:

```cpp
#include <climits>
#include <mutex>
#include <stdexcept>

class HierarchicalMutex {
    using ull = unsigned long long;
private:
    std::mutex internal_lock;
    const ull this_layer;                  // the layer this mutex lives on
    static thread_local ull thread_layer;  // deepest layer currently locked by this thread
    ull prev_layer;                        // value of thread_layer to restore on unlock

    void checkLayerViolation() {
        // Only mutexes on a strictly lower layer than the current thread layer may be locked.
        if (thread_layer <= this_layer) throw std::logic_error("mutex layer violated!");
    }
    void updateLayerVal() {
        prev_layer = thread_layer;  // remember the previous layer so unlock() can restore it
        thread_layer = this_layer;
    }

public:
    explicit HierarchicalMutex(ull layer) : this_layer(layer), prev_layer(0) {}

    void lock() {
        checkLayerViolation();
        internal_lock.lock();
        updateLayerVal();
    }

    void unlock() {
        // Mutexes must be unlocked in the reverse order of locking.
        if (this_layer != thread_layer) throw std::logic_error("mutex layer violated!");
        thread_layer = prev_layer;
        internal_lock.unlock();
    }

    bool try_lock() {
        checkLayerViolation();
        if (!internal_lock.try_lock()) return false;
        updateLayerVal();
        return true;
    }
};

thread_local unsigned long long HierarchicalMutex::thread_layer = ULLONG_MAX;
```

  Let's interpret this code part by part:
  - The member variables at the top of the class hold the internal lock and the layer-state values. `this_layer` is initialized in the constructor and records which "layer" this mutex is on. `thread_layer` holds the "current deepest lock layer" of the calling thread; it is declared `static` and `thread_local`, so each thread has its own copy acting as per-thread state. `prev_layer` holds the previous `thread_layer`; this is necessary because once we unlock the mutex, we have to roll back to the previous layer by restoring `thread_layer` to its previous value.
  - Method `checkLayerViolation` checks whether we are trying to lock a mutex whose layer is not strictly below the layer of the last lock, i.e. a mutex that should have been locked before the ones already held.
  - Method `updateLayerVal` updates the layer values when a new lock is acquired: the current thread layer is saved into `prev_layer`, and `thread_layer` is set to the layer of the newly locked mutex.
  - The public methods `lock`, `unlock`, and `try_lock` allow the mutex to be used by a RAII container like `std::lock_guard`; the class doesn't have to inherit from `std::mutex`. Once these methods are implemented, it can be used like a normal mutex.
    When we lock the mutex, we check that the layer of this mutex is less than the current deepest lock layer, because we don't want to lock a mutex on a "higher layer" and break the locking order. Then, once the mutex is locked, we update the deepest layer to be this layer.
    When we unlock the mutex, we move up one layer by restoring the current deepest layer to the previous one. We also want to make sure we always unlock the mutex that was locked last, so we check that this mutex is the one on the deepest layer.
    `try_lock` follows the same principle as `lock`, but returns false if the internal mutex cannot be locked.
  - The out-of-class definition at the end initializes `thread_layer` to the maximum value of `unsigned long long`, so the first lock in a thread can always be acquired. (A short usage sketch of this class is given right after these guidelines.)
- Avoid inter-thread waiting

  These principles extend to scenarios where no nested lock appears. For example, when one thread is waiting for another thread that is itself waiting to acquire a lock, a deadlock can also occur.

```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
sequenceDiagram
    participant A as th1
    participant B as th2
    participant C as mutex
    A->>B: .join(th2)
    B->>C: wait to acquire lock
    Note right of A: deadlock because<br/>th1 is waiting for th2<br/>and th2 is waiting for a lock
```

  Even without a mutex, inter-thread waiting is dangerous: when two threads join each other and wait for each other to finish, a deadlock occurs.

```mermaid
%%{ init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#add8e6', 'noteTextColor': '#0f0f0f', 'noteBkgColor': '#f5f6fa', 'textColor': '#EAA1AF' } } }%%
sequenceDiagram
    participant A as th1
    participant B as th2
    A->>B: .join(th2)
    B->>A: .join(th1)
    Note right of A: deadlock because<br/>th1 is waiting for th2<br/>and th2 is waiting for th1
```

  This is a bad practice that we should avoid. One good practice is to join threads only in the function that starts them, and to define a thread hierarchy so that only a higher-level thread can wait for a lower-level one.
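As promised above, a brief usage sketch of the hierarchical mutex (the layer values and function names here are illustrative, not from the original code; it continues from the class defined earlier):

```cpp
HierarchicalMutex high_layer(10000);
HierarchicalMutex low_layer(500);

void goodOrder() {
    std::lock_guard<HierarchicalMutex> g1(high_layer);  // 10000 < ULLONG_MAX: allowed
    std::lock_guard<HierarchicalMutex> g2(low_layer);   // 500 < 10000: allowed
}

void badOrder() {
    std::lock_guard<HierarchicalMutex> g1(low_layer);   // 500: allowed as the first lock
    std::lock_guard<HierarchicalMutex> g2(high_layer);  // 10000 >= 500: throws std::logic_error
}
```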
Other race condition scenarios and how to avoid them
Race conditions caused by lazy initialization
In many cases, we don't want to initialize a member of an object until we actually need it. This technique is called lazy initialization. For example, a singleton:
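A minimal sketch of such a singleton (the class name is an assumption; `Instance` and `data` follow the discussion below):

```cpp
#include <memory>

class Singleton {
public:
    static Singleton& Instance() {
        if (!data) data.reset(new Singleton);  // initialize only on first use
        return *data;
    }
private:
    Singleton() = default;
    static std::shared_ptr<Singleton> data;
};

std::shared_ptr<Singleton> Singleton::data = nullptr;
```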
This only initializes the object when we actually need it.
This code, however, is not thread-safe: when two threads call `Instance` at the same time, they might both find `data` to be `nullptr` and both initialize it, which is not what we want. We can wrap the check in a mutex, but then we end up with the infamous double-checked locking pattern:
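A sketch of the double-checked locking version (same assumed names, plus a mutex `m`):

```cpp
#include <memory>
#include <mutex>

class Singleton {
public:
    static Singleton& Instance() {
        if (!data) {                               // first check, without holding the lock
            std::lock_guard<std::mutex> guard(m);
            if (!data) data.reset(new Singleton);  // second check, with the lock held
        }
        return *data;
    }
private:
    Singleton() = default;
    static std::shared_ptr<Singleton> data;
    static std::mutex m;
};

std::shared_ptr<Singleton> Singleton::data = nullptr;
std::mutex Singleton::m;
```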
This code prevents the multiple-instance problem, but it introduces another race condition: initialization of `data` can take a long time, and another thread might observe that `data` is not `nullptr` (or at least appears not to be) and return a partially initialized object, which is not what we want.
We can of course make the lock coarser by holding the mutex during the whole check and initialization, but this makes the code slower, since a larger portion of it can no longer execute concurrently.
For this case, C++ offers a better solution: `std::call_once` and `std::once_flag`:
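A sketch of the `std::call_once` version (same assumed names, plus a `std::once_flag`):

```cpp
#include <memory>
#include <mutex>

class Singleton {
public:
    static Singleton& Instance() {
        // The lambda runs exactly once, no matter how many threads call Instance concurrently.
        std::call_once(flag, [] { data.reset(new Singleton); });
        return *data;
    }
private:
    Singleton() = default;
    static std::shared_ptr<Singleton> data;
    static std::once_flag flag;
};

std::shared_ptr<Singleton> Singleton::data = nullptr;
std::once_flag Singleton::flag;
```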
Warning: we pass a lambda to `std::call_once` here because `std::shared_ptr::reset` is an overloaded member function, so its name alone is an unresolved overloaded function type. Since we would have to pass a member function pointer, pick the right overload, and bind `data` as the object argument, it is much simpler to wrap the call in a lambda.
Race conditions caused by static variables
If we recall the memory model of a program, a static variable of a class is stored in the static area of the program, is shared among all instances of the class, and can be accessed through the class name. In C++, a static local variable is initialized the first time the program passes through its declaration, and before C++11 multiple threads could each believe they were the first one to initialize it. In that case, the static variable could be initialized multiple times and its value corrupted.
The initialization behavior was finally defined in C++11: the initialization of a static variable occurs exactly once, even under concurrent access.
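A minimal sketch of how this guarantee is typically used (the so-called Meyers singleton; the class name mirrors the sketches above):

```cpp
class Singleton {
public:
    static Singleton& Instance() {
        static Singleton instance;  // since C++11: initialized exactly once, even concurrently
        return instance;
    }
private:
    Singleton() = default;
};
```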
Optimize reads and writes of rarely updated data with `std::shared_mutex` and `std::shared_lock`
Sometimes we have data that is rarely updated (like a mapping from domain names to IP addresses). If we use a plain mutex to lock the whole data structure whenever we read it, we waste a lot of time, because concurrent reads without modification generally don't create any harmful race condition. For this scenario, the C++17 standard provides a tool: `std::shared_mutex`, with the corresponding RAII wrapper `std::shared_lock`.
Essentially, `std::shared_mutex` is a mutex with two locking modes, selected by the RAII wrapper used to lock it. When it is locked through `std::lock_guard`, it behaves like a normal, exclusive mutex. When it is locked through `std::shared_lock`, it only blocks wrappers that are not `std::shared_lock`, so many readers can hold it at the same time. This way, we can use `std::shared_lock` for reading operations and `std::lock_guard` for writing operations: while we are reading, only writing operations are blocked, and while we are writing, both reading and writing operations are blocked.
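A sketch of a rarely-updated DNS-style cache protected this way (the class and member names are assumptions):

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

class DnsCache {
public:
    // Readers lock in shared mode, so many lookups can run in parallel.
    std::string find(const std::string& domain) const {
        std::shared_lock<std::shared_mutex> guard(m);
        auto it = table.find(domain);
        return it == table.end() ? std::string{} : it->second;
    }

    // Writers lock exclusively, blocking both readers and other writers.
    void update(const std::string& domain, const std::string& ip) {
        std::lock_guard<std::shared_mutex> guard(m);
        table[domain] = ip;
    }

private:
    mutable std::shared_mutex m;
    std::map<std::string, std::string> table;
};
```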
Summary
- Race condition
  - Race conditions can be problematic, but we can use several techniques to protect against them:
    - Use well-designed, mutually exclusive protective wrappers
    - Keep in mind language features like `std::unique_ptr`
    - Wrap certain sets of actions into transactions so that they are indivisible
    - Balance the trade-off between granularity and performance
      - With a good choice of mutex, code can be optimized (such as with `std::unique_lock` and `std::shared_lock`).
  - Race conditions can also be avoided with `std::call_once`, `std::once_flag`, and static variables (since C++11).
  - Never let a user-supplied function acquire a reference or pointer to private data.
- Deadlock
  - There are several ways to avoid deadlock:
    - Avoid nested locks and choose a good granularity
    - If you can't, always lock the mutexes in a fixed order (by using `std::lock` or `std::scoped_lock`)
    - If you can't, use a wrapper that lets you specify the lock and unlock sequence of mutexes
    - If you can't, maybe revise the design decision.
  - Waiting for another thread to finish is also a kind of deadlock
    - Avoid inter-thread waiting
    - Identify the thread hierarchy to make sure that only a higher-level thread can wait for a lower-level one.
Associated code