Jun 16, 2019

[C++] C++ Concurrency In Action, Second edition, recap [Ch.7]

Designing lock-free concurrent data structures

Lock-free does not imply wait-free: lock-free code typically relies on a CAS (compare/exchange) loop, so an individual thread may have to retry until its view of the data is consistent with what other threads have published.

FUTEX note: http://vsdmars.blogspot.com/2019/03/futex-futex-skim-through.html



Types of nonblocking data structures

Operation types

  • Obstruction-Free
    If all other threads are paused, then any given thread will complete its operation in a bounded number of steps.
  • Lock-Free
    If multiple threads are operating on a data structure, then after a bounded number of steps one of them will complete its operation.
  • Wait-Free
    Every thread operating on a data structure will complete its operation in a bounded number of steps, even if other threads are also operating on the data structure.
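The difference between wait-free and lock-free can be illustrated with std::atomic. A minimal sketch (the function names are mine):

```cpp
#include <atomic>

std::atomic<int> counter{0};

// Wait-free on typical hardware: fetch_add compiles to a single atomic
// read-modify-write instruction, so it completes in a bounded number of
// steps no matter what other threads do.
void increment_wait_free() {
    counter.fetch_add(1, std::memory_order_relaxed);
}

// Lock-free but not wait-free: the compare/exchange loop may retry
// indefinitely for one unlucky thread, but every failed retry means some
// other thread's CAS succeeded, so the system as a whole makes progress.
void increment_lock_free() {
    int expected = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(expected, expected + 1,
                                          std::memory_order_relaxed)) {
        // on failure, expected is refreshed with the current value; retry
    }
}
```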



Data structures

  • Lock-free data structures
    Algorithms that use compare/exchange operations on the data structure often have loops in them.
    The reason for using a compare/exchange operation is that another thread might have modified the data in the meantime, in which case the code will need to redo part of its operation before trying the compare/exchange again.
    This code can still be lock-free if the compare/exchange would eventually succeed if the other threads were suspended.
    If it didn't, you'd have a spin lock, which is non-blocking but not lock-free.
    Caution: these loops can result in one thread being subject to starvation.
  • Wait-free data structures
    A wait-free data structure is a lock-free data structure with the additional property that every thread accessing the data structure can complete its operation within a bounded number of steps, regardless of the behavior of other threads.
    Writing wait-free data structures correctly is extremely hard.
    Be sure that the benefit outweighs the cost before designing one.
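As a concrete example of such a compare/exchange loop, here is a minimal sketch of a lock-free stack in the spirit of the book's chapter 7 listings. The class and member names are illustrative; the pop deliberately leaks nodes, mirroring the book's first cut before safe memory reclamation (hazard pointers, reference counting) is introduced:

```cpp
#include <atomic>

template <typename T>
class lock_free_stack {
    struct node {
        T data;
        node* next;
        node(T const& d) : data(d), next(nullptr) {}
    };
    std::atomic<node*> head{nullptr};
public:
    void push(T const& data) {
        node* const new_node = new node(data);
        new_node->next = head.load();
        // If another thread changed head in the meantime, the CAS fails,
        // new_node->next is refreshed with the current head, and we retry.
        while (!head.compare_exchange_weak(new_node->next, new_node)) {
        }
    }
    // Simplified pop: leaks the node, because reclaiming it safely while
    // other threads may still be dereferencing it is the hard part.
    bool pop(T& result) {
        node* old_head = head.load();
        while (old_head &&
               !head.compare_exchange_weak(old_head, old_head->next)) {
        }
        if (!old_head) return false;
        result = old_head->data;
        return true;
    }
};
```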



The pros and cons of lock-free data structures

  • Enable maximum concurrency.
    With a lock-free data structure, some thread makes progress with every step.
    With a wait-free data structure, every thread can make forward progress, regardless of what the other threads are doing; there’s no need for waiting.
    This is a desirable property to have but hard to achieve.
    It's all too easy to end up writing what's essentially a spin lock.
  • Robustness
    • With locks we might:
      • Forget to unlock a mutex.
      • Deadlock.
    • However, lock-free code has its own hazards:
      • To avoid the undefined behavior associated with a data race, you must use atomic operations for the modifications.
      • But that alone isn't enough; you must also ensure that changes become visible to other threads in the correct order.
      • All this means that writing thread-safe data structures without using locks is considerably harder than writing them with locks.
      • Livelock occurs when two threads each try to change the data structure, but for each thread the changes made by the other require its operation to be restarted, so both threads loop and try again.
        Livelock saps performance rather than causing long-term problems, but it's still something to watch out for.
      • By definition, wait-free code can't suffer from live lock because there’s always an upper limit on the number of steps needed to perform an operation.
        The flip side here is that the algorithm is likely more complex than the alternative and may require more steps even when no other thread is accessing the data structure.
    • This brings us to another downside of lock-free and wait-free code: 
      • Although it can increase the potential for concurrency of operations on a data structure and reduce the time an individual thread spends waiting, it may well decrease overall performance.
        • First, the atomic operations used for lock-free code can be much slower than non-atomic operations, and there'll likely be more of them in a lock-free data structure than in the mutex locking code for a lock-based data structure.
        • Not only that, but the hardware must synchronize data between threads that access the same atomic variables (the cache ping-pong effect).
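A sketch of that ping-pong effect (the names are mine): both threads below produce the correct count, but because they contend on the single cache line holding the atomic, the hardware must shuttle it between cores, so this typically runs far slower than giving each thread its own counter and summing at the end.

```cpp
#include <atomic>
#include <thread>

// Two threads repeatedly modify the same atomic variable; the cache line
// holding it bounces between their cores on every update.
std::atomic<long> shared_count{0};

void hammer(int iterations) {
    for (int i = 0; i < iterations; ++i)
        shared_count.fetch_add(1, std::memory_order_relaxed);
}

long run_ping_pong(int iterations_per_thread) {
    std::thread t1(hammer, iterations_per_thread);
    std::thread t2(hammer, iterations_per_thread);
    t1.join();
    t2.join();
    return shared_count.load();  // correct, just slow under contention
}
```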

TBD
