Mar 25, 2026

[intel] 5-level paging (LA57) affects MSB pointer tagging

Resource:
https://en.wikipedia.org/wiki/Intel_5-level_paging

The adoption of 5-level paging (also known as LA57) allows the virtual address space to expand from the traditional 48 bits (256 TiB) to 57 bits (128 PiB).




In 5-level paging (LA57), the "free" or unused bits are indeed bits 57 through 63, which totals 7 bits. However, there is a catch: those bits aren't truly "free" for software to store random data (like tags or metadata) unless a specific hardware feature like Intel LAM (Linear Address Masking) or AMD UAI (Upper Address Ignore) is enabled.

The Breakdown of the 64-bit Address

Here is how the 64 bits are partitioned in a 5-level paging system:

  • Bits 0–11: Page Offset (4 KB boundaries).
  • Bits 12–56: The actual translation bits used by the five levels of page tables 9 x 5 = 45 bits
  • Bits 57–63: The 7 unused bits.


The "Canonical" Constraint

The CPU enforces a rule called Canonical Form. For an address to be valid:
Bits 57 through 63 must be an exact copy of bit 56.
If bit 56 is 0, then bits 57–63 must all be 0.
If bit 56 is 1, then bits 57–63 must all be 1.

If software tries to use an address where those 7 bits are "dirty" (containing random data), the CPU will trigger a General Protection Fault (#GP). This is why programmers can't simply use those 7 bits for pointers without the masking features mentioned above.

[math] Empirical distribution function

 



Example:


Let's look at a real example with 5 marbles of different sizes. Imagine their sizes are 1, 2, 2, 3, 5

Here is what our "marble staircase" looks like:

How to read your stairs:

  1. The Bottom (Ground): We start at 0 because we haven't seen any marbles yet.

  2. Size 1: Find first marble! We take a step up. Now we've seen1/5 (or 20%) of the marbles.

  3. Size 2: Find two marbles that are the same size! This makes a double step up. Now we've seen 3/5 (or 60%) of the marbles.

  4. Size 3: Another marble! Another step up to 4/5 (or 80%).

  5. Size 5: The last, biggest marble! One final step takes you to the very top 100%.

[math] Poisson distribution

 

  • Lambda is the average.

  • k is the number you are guessing will happen.


Example:


Hash Table Collisions

While hash functions aim for uniformity, the number of keys that map to a specific "bucket" in a large hash table can be modeled using Poisson.

  • If you have n keys and m buckets, and n/m is small, the number of items in a bucket follows a Poisson distribution with Lambda = n/m

  • This helps in estimating the frequency of collisions and optimizing the size of the table.

Mar 19, 2026

[C++] uintN_t guaranteed to be exactly N bits

uintN_t is guaranteed to be exactly N bits with no padding if it exists, so 

sizeof(uint8_t) != sizeof(int32_t)

is guaranteed, if they both exist.

Mar 3, 2026

[C++] define function in the header works better for LTO

 




[C++] Elaborated type specifier

Reference:
https://en.cppreference.com/w/cpp/language/elaborated_type_specifier.html
https://vsdmars.blogspot.com/2014/02/c11-extended-friend-declaration.html

class T
{
public:
    class U;
private:
    int U;
};
 
int main()
{
    int T;
    T t; // error: the local variable T is found
    class T t; // OK: finds ::T, the local variable T is ignored
    T::U* u; // error: lookup of T::U finds the private data member
    class T::U* u; // OK: the data member is ignored
}

template<typename T>
struct Node
{
    struct Node* Next; // OK: lookup of Node finds the injected-class-name
    struct Data* Data; // OK: declares type Data at global scope
                       // and also declares the data member Data
    friend class ::List; // error: cannot introduce a qualified name
    enum Kind* kind; // error: cannot introduce an enum
};
 
Data* p; // OK: struct Data has been declared

template<typename T>
class Node
{
    friend class T; // error: type parameter cannot appear in an elaborated type specifier;
                    // note that similar declaration `friend T;` is OK.
};
 
class A {};
enum b { f, t };
 
int main()
{
    class A a; // OK: equivalent to 'A a;'
    enum b flag; // OK: equivalent to 'b flag;'
}

enum class E { a, b };
enum E x = E::a; // OK
enum class E y = E::b; // error: 'enum class' cannot introduce an elaborated type specifier
 
struct A {};
class A a; // OK
`