Jun 20, 2020

[Go][design] high-throughput, low-contention in-memory cache

Using https://github.com/dgraph-io/ristretto as the example project.


Fast Access:

batching:

  • batch cache accesses in striped ring buffers and apply each batch to the global state in one go, amortizing lock acquisition (the BP-Wrapper technique; see the ring-buffer sketch under Get below).

generate a fast hash:

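Ristretto avoids per-Get allocation by hashing keys with the Go runtime's internal memhash via go:linkname. A minimal sketch of that trick, assuming a wrapper name of my own choosing (ristretto's z package does something very similar):

```go
package main

import (
	"fmt"
	"unsafe"
)

// memhash links against the Go runtime's internal hash (AES-accelerated
// on amd64). Body-less declarations like this also need an empty .s file
// in the package so the compiler accepts them.
//
//go:noescape
//go:linkname memhash runtime.memhash
func memhash(p unsafe.Pointer, seed, len uintptr) uintptr

// stringStruct mirrors the runtime's string header.
type stringStruct struct {
	str unsafe.Pointer
	len int
}

// memHashString hashes a string with zero allocations; the name is mine.
func memHashString(s string) uint64 {
	ss := (*stringStruct)(unsafe.Pointer(&s))
	return uint64(memhash(ss.str, 0, uintptr(ss.len)))
}

func main() {
	fmt.Println(memHashString("ristretto"))
}
```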

High Concurrency and Contention Resistance:

Get (TinyLFU):

  • use sync.Pool to implement striped, lossy ring buffers: they perform very well under concurrency, at the cost of occasionally losing some accesses, which is acceptable for frequency hints.
  • when a stripe fills, the batch is handed to the policy over a channel for an async update; if the consumer is busy, the batch is dropped. A sketch follows this list.
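
A minimal sketch of such a striped, lossy ring buffer; the names and stripe size are illustrative, not ristretto's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

const stripeCap = 64 // hypothetical stripe size

// consumer drains batches of key hashes; Push may refuse (return false)
// when busy, and the batch is simply dropped -- lossy by design.
type consumer interface {
	Push(keys []uint64) bool
}

// stripe is one small buffer; sync.Pool tends to hand each P its own
// stripe, so appends rarely contend.
type stripe struct {
	cons consumer
	keys []uint64
}

func (s *stripe) add(key uint64) {
	s.keys = append(s.keys, key)
	if len(s.keys) >= cap(s.keys) {
		if s.cons.Push(s.keys) {
			s.keys = make([]uint64, 0, stripeCap)
		} else {
			s.keys = s.keys[:0] // consumer busy: drop the batch
		}
	}
}

// ringBuffer is the striped, lossy buffer; Get paths call Push with the
// key's hash to record an access.
type ringBuffer struct{ pool *sync.Pool }

func newRingBuffer(c consumer) *ringBuffer {
	return &ringBuffer{pool: &sync.Pool{New: func() interface{} {
		return &stripe{cons: c, keys: make([]uint64, 0, stripeCap)}
	}}}
}

func (r *ringBuffer) Push(key uint64) {
	s := r.pool.Get().(*stripe)
	s.add(key)
	r.pool.Put(s)
}

// countConsumer just counts drained keys for the demo.
type countConsumer struct{ n int }

func (c *countConsumer) Push(keys []uint64) bool { c.n += len(keys); return true }

func main() {
	c := &countConsumer{}
	rb := newRingBuffer(c)
	for i := uint64(0); i < 1000; i++ {
		rb.Push(i)
	}
	fmt.Println("drained:", c.n) // a multiple of stripeCap; the rest sits buffered
}
```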

Set (TinyLFU + count-min sketch):

  • use a channel to capture Sets, dropping them on the floor if the channel is full, to avoid contention (sketch below).
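
A minimal sketch of that non-blocking send; the item type and buffer size are illustrative:

```go
package main

import "fmt"

// item is what a Set pushes through the buffer; fields are illustrative.
type item struct {
	key  uint64
	cost int64
}

// trySet queues a Set for async processing by a policy goroutine. When the
// buffer is full it drops the item instead of blocking: a lost Set is
// cheaper than contention on the hot path.
func trySet(setBuf chan *item, it *item) bool {
	select {
	case setBuf <- it:
		return true
	default:
		return false // channel full: dropped on the floor
	}
}

func main() {
	setBuf := make(chan *item, 2) // tiny buffer just to demonstrate drops
	for i := uint64(0); i < 4; i++ {
		fmt.Printf("set %d queued=%v\n", i, trySet(setBuf, &item{key: i, cost: 1}))
	}
}
```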

Memory Bounding:

  • An infinitely large cache is practically impossible.
  • A cache must be bounded in size. Many cache libraries treat cache size as the number of entries; this is naive and only works when every value has the same size. Most workloads have variable-sized values: one value may cost a few bytes, another a few kilobytes, yet another a few megabytes. Treating them all as having the same memory cost isn't realistic.

Idea:

  • Attach a cost to every key-value.
  • Users specify what that cost is when calling Set (usage sketch after this list).
  • When the cache is operating at capacity, a heavy item could displace many lightweight items.
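
Ristretto's public API reflects this; roughly, per its README (the parameter values here are illustrative):

```go
package main

import (
	"fmt"

	"github.com/dgraph-io/ristretto"
)

func main() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e7,     // keys to track frequency of (~10x max entries)
		MaxCost:     1 << 30, // total cost budget, e.g. "one GB worth of values"
		BufferItems: 64,      // per-Get ring-buffer stripe size
	})
	if err != nil {
		panic(err)
	}
	// The caller decides what cost means; bytes is a natural choice.
	big := make([]byte, 1<<20)
	cache.Set("big", big, int64(len(big))) // one heavy item...
	cache.Set("tiny", "x", 1)              // ...vs. a lightweight one
	cache.Wait()                           // Sets are async; wait for them to land
	if v, ok := cache.Get("tiny"); ok {
		fmt.Println(v)
	}
}
```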

Admission Policy:

  • What should we let into the cache when capacity is hit?
    The goal is to admit a new item only if it is more “valuable” than the items it would displace, keeping in mind that estimating value itself adds overhead.
  • Using LFU, not LRU:
    Paper: TinyLFU: A Highly Efficient Cache Admission Policy
    • The main idea is to only admit a new item if its frequency estimate is higher than that of the item being evicted.
    • Use a count-min sketch to keep those estimates cheap in space and time (sketch after this list).
    • After N key increments, all counters get halved.
    • That halving acts as decay: a key that has not been seen for a while has its counter drift back toward zero.
  • To find an eviction candidate with low ɛ (frequency estimate), sample a handful of keys using the randomness of Go's map iteration order.
  • If the incoming key's ɛ is lower than the minimum ɛ among the sampled keys, reject the incoming key; otherwise, evict the sampled key and store the incoming one.
  • Note that the cost of the incoming key plays no role in choosing eviction candidates.
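
A toy count-min sketch with the halving decay and the admission check; the per-row hash is a made-up mixer and the counters are uint8, whereas ristretto's real sketch packs 4-bit counters:

```go
package main

import "fmt"

// rows x width counters; each key is hashed once per row and the estimate
// is the minimum over rows.
const (
	rows  = 4
	width = 1 << 10 // power of two so masking replaces modulo
)

type cmSketch struct {
	table [rows][width]uint8
	incrs int
}

// hash is an illustrative per-row mixer; real sketches use seeded hashes.
func (s *cmSketch) hash(key uint64, row int) uint64 {
	key ^= uint64(row) * 0x9e3779b97f4a7c15
	key *= 0xff51afd7ed558ccd
	return key & (width - 1)
}

func (s *cmSketch) Increment(key uint64) {
	for r := 0; r < rows; r++ {
		if c := &s.table[r][s.hash(key, r)]; *c < 255 {
			*c++
		}
	}
	// After N increments, halve every counter: stale keys decay toward zero.
	if s.incrs++; s.incrs >= width*10 {
		s.reset()
	}
}

func (s *cmSketch) Estimate(key uint64) uint8 {
	min := uint8(255)
	for r := 0; r < rows; r++ {
		if c := s.table[r][s.hash(key, r)]; c < min {
			min = c
		}
	}
	return min
}

func (s *cmSketch) reset() {
	s.incrs = 0
	for r := range s.table {
		for i := range s.table[r] {
			s.table[r][i] /= 2
		}
	}
}

// admit is the TinyLFU decision: evict the sampled victim only if the
// incoming key's estimate beats it.
func admit(s *cmSketch, incoming, victim uint64) bool {
	return s.Estimate(incoming) > s.Estimate(victim)
}

func main() {
	s := &cmSketch{}
	for i := 0; i < 5; i++ {
		s.Increment(42)
	}
	fmt.Println(s.Estimate(42), admit(s, 42, 7)) // 5 true (with high probability)
}
```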

DoorKeeper:

  • Before counting a key in the TinyLFU, use a bloom filter to first check whether the key has been seen before.
  • Only if the key is already present in the bloom filter is it counted in the TinyLFU; a first-time key only sets its bloom-filter bits. This keeps one-hit wonders from polluting the counters (sketch below).
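
A sketch of the doorkeeper idea, with a map standing in for the bloom filter and plain counts standing in for the sketch, to keep it short:

```go
package main

import "fmt"

// doorkeeper absorbs the first occurrence of each key so one-hit wonders
// never reach the TinyLFU counters. A real doorkeeper is a bloom filter.
type doorkeeper map[uint64]struct{}

// addIfNotHas reports whether the key was newly added.
func (d doorkeeper) addIfNotHas(key uint64) bool {
	if _, ok := d[key]; ok {
		return false
	}
	d[key] = struct{}{}
	return true
}

type tinyLFU struct {
	door   doorkeeper
	counts map[uint64]int // stand-in for the count-min sketch
}

func (t *tinyLFU) record(key uint64) {
	if t.door.addIfNotHas(key) {
		return // first sighting: only the doorkeeper remembers it
	}
	t.counts[key]++ // repeat sighting: now worth counting
}

func main() {
	t := &tinyLFU{door: doorkeeper{}, counts: map[uint64]int{}}
	for _, k := range []uint64{1, 2, 2, 2, 3} {
		t.record(k)
	}
	fmt.Println(t.counts) // map[2:2] -- keys seen once never pollute counters
}
```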

Metrics:

  • When using atomic counters for metrics, beware of false sharing of cache lines between CPU cores.
  • A cache line is typically 64 bytes (it is on my ThinkPad), so pad each atomic.AddUint64 counter onto its own cache line (sketch below).
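
A minimal sketch of cache-line padding for such counters; the layout is my own illustration (ristretto stripes its metrics differently), assuming 64-byte lines:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// paddedCounter gives each counter its own 64-byte cache line, so cores
// incrementing different metrics never invalidate each other's lines.
type paddedCounter struct {
	n uint64
	_ [7]uint64 // pad the struct to 64 bytes, one full cache line
}

type metrics struct {
	hits   paddedCounter
	misses paddedCounter
}

func (m *metrics) Hit()  { atomic.AddUint64(&m.hits.n, 1) }
func (m *metrics) Miss() { atomic.AddUint64(&m.misses.n, 1) }

func main() {
	var m metrics
	m.Hit()
	m.Hit()
	m.Miss()
	fmt.Println(atomic.LoadUint64(&m.hits.n), atomic.LoadUint64(&m.misses.n))
}
```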



Reference:

Design Of A Modern Cache (Benjamin Manes)
http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html

BP-Wrapper: A System Framework Making Any Replacement Algorithms (Almost) Lock Contention Free
https://dgraph.io/blog/refs/bp_wrapper.pdf

Adaptive Software Cache Management
https://dgraph.io/blog/refs/Adaptive%20Software%20Cache%20Management.pdf

TinyLFU: A Highly Efficient Cache Admission Policy
https://dgraph.io/blog/refs/TinyLFU%20-%20A%20Highly%20Efficient%20Cache%20Admission%20Policy.pdf

The State of Caching in Go
https://dgraph.io/blog/post/caching-in-go/

Count–min sketch
https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch
