Jul 8, 2022

[algorithm] hot key detection

An algorithm/data structure for hot-key detection

Reference:


Requirement

  1. The QPS threshold is defined as such that there are limited keys are hot in the defined time range.
  2. Hot keys should be detected in less than 30 seconds, without significant CPU, memory overhead, or latency increase to the system.


Input

Hashed Key (or itself should be unique)


Output

Hot index of the key.


Idea

  • Using 2 level filter.
  • Each level is a data structure that utilizes CPU pre-fetch.
  • We choose std::vector in this case.
  • The size of std::vector is predefined; no runtime adjustment allowed.


Level-1 vector(std::vector<int>)

hash function

/**
 * bucketsMask: bucket number - 1; bucket number should be the power of 2.
 * i.e. assert((numBuckets & (numBuckets - 1)) == 0);
 * Thus the bucketsMask would become 0x111...111
 * This indicates how many buckets and mimic consistent hash ring nodes.
 *
 */
size_t l1HashFunction(uint64_t hash) const { return hash & bucketsMask; }
Level-1 vector counter result is used as lever to detect if proceed to Level-2.
By proceeding to Level-2 means:
  1. If level-1 counter value < (l1Threshold/2); return 0.
  2. If level-1 counter > l1Threshold, return level-2 counter result
  3. Otherwise, insert the input key/hash value to level-2 counter if there's
    open space left. We check for up to defined bucket to see if there are available slot.
  4. Return result which is normalized between (1 ~ 255).


Level-2 vector(std::vector<L2Record>)

hash function

size_t l2HashFunction(uint64_t hash) const {
    return (hash * 351551 /* prime */) & bucketsMask;
}
Create another "Base Nodes" counts based on input hash value.


Data structure

struct L2Record {
    uint64_t hash{0};
    uint32_t count{0};
    uint32_t hashHits{0};
};

At this stage, we didn't record the 'exact' input key; thus we use L2Record's hash to assign the exact input key/hash. 
L2Record's count is the Level-2 l2HashFunction "Base node" count; it's the count of multiple input key/hash value that has the same l2HashFunction value. 
And if L2Record's count < hotnessMultiplier; we exit early with current L2Record's count result. Otherwise we decided to insert the input key/hash value as a new entry to the Level-2 count.

hotnessMultiplier can be 4 or 6.

There are chances input key/hash value will not be recorded/inserted due to there are
more than certain keys are hot; which violates the contract of this algorithm.


Now we can focus the maintenance of the data structure.
The concept of 'decay' is used and the algorithm does not use extra thread for the clean up due to avoiding lock which reduces the performance.
We check the maintenance threshold for each API call.

Maintenance comes in 4 parts:
  1. move holes
  2. decay Level-1 and Level-2 data structure value.
  3. update l1Threshold_ base on numbers of the current non-zero value in Level-2
    data structure(i.e. std::vector<L2Record>)
  4. The maintenance threshold is calculated at runtime based on current l1Threshold_.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.