Reference:
https://www.youtube.com/watch?v=ZQFzMfHIxng
- Lock-free means 'FAST'
- Algorithm rules supreme
- 'Wait-free' has nothing to do with time
- Wait-free refers to the number of compute 'steps'
- Steps do not have to be of the same duration
- Atomic operations do not guarantee good performance
What types can be made atomic?
- Any trivially copyable type can be made atomic
- Continuous chunk of memory
- Copying the object means copying all bits(memcpy)
- No virtual functions, noexcept constructor
Operation:
std::atomic<int> x{0};
++x;
x++;
x += 1;
x |= 2;
int y = x * 2;
x = y + 1;
x *= 2; // no atomic multiply; not compile
x = x + 1; // not atomic; atomic read x store in the register follow by atomic write x
x = x * 2; // not atomic; atomic read x store in the register follow by atomic write x
// same compiles to IR.
++x;
x += 1;
x = x + 1;
What is so special about CAS?
- Compare-and-swap (CAS) is used in most lock-free(not wait-free) algorithms
std::atomic x{0};
int x0 = x;
while(!x.compare_exchange_strong(x0, x0+1)){}
- even:
while(!x.compare_exchange_strong(x0, x0*2)){} // x0*2 is not atomic
Spinlock as using FUTEX with pause, slower if thread number > core number
std::atomic is not always lock free;
Judge at run-time due to runtime memory alignment.
Atomic operation do wait on each other,
- in particular, write operation do
- read-only operations can scale near-perfectly.
Be aware of NUMA architecture cache-line
Spurious wake up
Atomic queue; lock free implement
and memory-barrier plays the role; only store and release issues memory barrier operation.
reference:
- processing invalid queue (RMB)
- write store buffer to cache. (WMB)
CAS
Read is faster than write, keep this in mind. Thus the memory order setting is different.
For read, use more relax orders.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.