Ataraxia through Epoché: [C++] memory model as for high performance concerns

Tips

The downside of sequential consistency is that it can hurt performance.
Where an operation needs atomicity but no ordering with other memory operations, use atomics with a relaxed memory order instead.
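
As a minimal sketch of the tip above, assume a hypothetical helper count_events (not from the original post) in which several threads bump one shared counter and only the final total matters. In that situation relaxed ordering is enough, because nothing else is published through the counter.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: counts events from several threads.
// Each increment is atomic, but relaxed ordering imposes no ordering
// on the surrounding memory operations.
inline long count_events(int num_threads, int increments_per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with the completion of each thread, so the
    // relaxed load below still observes every increment.
    for (auto& w : workers) {
        w.join();
    }
    return counter.load(std::memory_order_relaxed);
}
```

Relaxed ordering would not be sufficient if the counter were used as a flag that tells another thread some data is ready; that case needs release/acquire.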

std::shared_ptr
To satisfy thread safety requirements, the reference counters are typically incremented using an equivalent of std::atomic::fetch_add with std::memory_order_relaxed (decrementing requires stronger ordering to safely destroy the control block).
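
The asymmetry between incrementing and decrementing can be sketched with a hypothetical RefCount class. This is an illustration of the idea only, not the control block layout of any real std::shared_ptr implementation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference counter, for illustration only.
class RefCount {
public:
    // Taking another reference: the caller already holds one, so the
    // object cannot be destroyed concurrently and no ordering is needed.
    void acquire() noexcept {
        count_.fetch_add(1, std::memory_order_relaxed);
    }

    // Dropping a reference: returns true if this was the last one.
    // The release half publishes this thread's writes to the object;
    // the acquire half lets the thread that reaches zero observe the
    // writes of every other owner before it destroys the object.
    bool release() noexcept {
        return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    long use_count() const noexcept {
        return count_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<long> count_{1};  // the creator holds the first reference
};
```

When release() returns true, the caller is the only remaining owner and may delete the managed object and the counter itself.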

Performance guidelines

Correctness over performance
Avoid contention
Minimize the time spent in critical sections.
Avoid blocking operations (POSIX system calls, synchronous calls)
Be aware of the number of threads relative to the number of CPU cores
Thread priorities

Important for lowering the latency of tasks

Avoid priority inversion (ref: Go's goroutine model, where a goroutine that has been starving for 1 ms is given high priority)
i.e., a thread with high priority is waiting to acquire a lock that is currently held by a low-priority thread.
For real-time applications, we cannot use locks to protect any shared resources that need to be accessed by real-time threads.
A thread that produces real-time audio, for example, runs with the highest possible priority. To avoid priority inversion, the audio thread must not call any function (including std::malloc()) that might block and cause a context switch.
Thread affinity; a request to the scheduler that some threads should be executed on a particular core if possible, to minimize cache misses.
False sharing
Pad each element in the array so that two adjacent elements cannot reside on the same cache line.
Since C++17, there is a portable way of doing this using the std::hardware_destructive_interference_size constant defined in <new> in combination with the alignas specifier.

// Squeeze data into the same cache line to promote true sharing

std::hardware_constructive_interference_size

// Separate data into different cache lines to avoid false sharing

std::hardware_destructive_interference_size

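#include <new>     // std::hardware_destructive_interference_size
#include <vector>

// each vector element owns a cache line
struct alignas(std::hardware_destructive_interference_size) Element {
	int counter_{};
};
// num_threads: the number of worker threads, defined elsewhere
auto elements = std::vector<Element>(num_threads);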
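
To show how such padded elements might be used, here is a sketch in which each thread increments only its own slot. The names kCacheLine, PaddedCounter and sum_per_thread_counters are hypothetical, and the 64-byte fallback is an assumption for standard libraries that do not provide the C++17 constant.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <thread>
#include <vector>

// Use the C++17 constant when the library provides it; otherwise fall
// back to 64 bytes (an assumption, common on x86-64).
#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t kCacheLine =
    std::hardware_destructive_interference_size;
#else
inline constexpr std::size_t kCacheLine = 64;
#endif

// Each counter is aligned to, and therefore padded out to, a cache line.
struct alignas(kCacheLine) PaddedCounter {
    long value{};
};

// Every thread writes only to its own slot, so no atomics are needed,
// and the padding keeps neighbouring slots off the same cache line.
inline long sum_per_thread_counters(int num_threads,
                                    int increments_per_thread) {
    std::vector<PaddedCounter> counters(
        static_cast<std::size_t>(num_threads));
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < counters.size(); ++t) {
        workers.emplace_back([&counters, t, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                ++counters[t].value;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    // Reading after join() is safe: all worker writes are visible here.
    long total = 0;
    for (const auto& c : counters) {
        total += c.value;
    }
    return total;
}
```

The trade-off is memory: every counter now occupies a full cache line, which is usually acceptable for a handful of per-thread slots but not for large arrays.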

Mar 16, 2022
