Apr 20, 2026

[Rust] method resolution

Method Resolution and Borrowing

This is where the magic happens for your daily coding. When you call a method:

  1. Rust tries to find the method on the type itself (taking the receiver by value).

  2. If that fails, it tries borrowing the receiver (&t, then &mut t) and looks again.

  3. If that fails, it dereferences the receiver (*t) and repeats the search on the resulting type.



The Search Algorithm

Rust looks for the method by checking these candidate types, in order:

1. Direct Match (Value and References)

First, it checks the type of the receiver (T) and its immediate references:

  • T (The type itself)

  • &T (Immutable reference)

  • &mut T (Mutable reference)

2. Deref Coercion (The "Unwrapping" Loop)

If no match is found, Rust uses the Deref trait to "unwrap" the type. If T implements Deref<Target = U>, it adds U to the search list and repeats step 1:

  • U

  • &U

  • &mut U

This continues recursively. If U derefs to V, it checks V, &V, &mut V. This is why you can call &str methods on a Box<String>.
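A minimal sketch of that deref chain in action, using String and str from the standard library (chars is an inherent method on str, not on String or Box):

```rust
fn main() {
    let b: Box<String> = Box::new(String::from("héllo"));
    // `chars` lives on `str`, two Deref hops away from the receiver:
    // Box<String> -> String -> str. The compiler walks that chain for us.
    assert_eq!(b.chars().count(), 5);
    // Desugared, method resolution inserted the derefs and the autoref itself:
    assert_eq!(str::chars(&**b).count(), 5);
}
```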

3. Unsized Coercion (Array to Slice)

If the candidate type is an array [T; N], Rust will also attempt to coerce it to the slice type [T] and search there.
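A quick sketch of this coercion: first and iter are standard-library methods defined on the slice type, yet they are callable on an array receiver.

```rust
fn main() {
    let arr: [i32; 3] = [1, 2, 3];
    // `first` and `iter` are defined on [i32] (the slice), not on [i32; 3];
    // the array receiver is unsize-coerced to a slice during method lookup.
    assert_eq!(arr.first(), Some(&1));
    assert_eq!(arr.iter().sum::<i32>(), 6);
}
```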


The Three Transformation Rules

For every candidate type P found in the search above, Rust tries to match the method signature by attempting these three transformations in this exact order:

  1. Identity: receiver (The type matches exactly).

  2. Autoref: &receiver (Rust borrows it for you).

  3. Autoref-Mut: &mut receiver (Rust borrows it mutably).


Why the order matters

If a type has both an immutable foo(&self) and a mutable foo(&mut self), and you call it on a value, Rust will pick the immutable one first if it satisfies the call. This prevents accidental mutation.
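A sketch of this preference, using a made-up trait implemented for both &A and &mut A (since two inherent methods can't share a name, trait impls are the easiest way to see the tie-break):

```rust
struct A;

trait Speak {
    fn speak(self) -> &'static str;
}

// One impl for shared references, one for exclusive references.
impl Speak for &A {
    fn speak(self) -> &'static str { "shared" }
}
impl Speak for &mut A {
    fn speak(self) -> &'static str { "exclusive" }
}

fn main() {
    let mut a = A;
    // Autoref tries `&a` before `&mut a`, so the `&A` impl wins,
    // even though `a` is mutable and the `&mut A` impl would also fit.
    assert_eq!(a.speak(), "shared");
    // The `&mut A` impl is only chosen when you ask for it explicitly:
    assert_eq!((&mut a).speak(), "exclusive");
}
```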


Special Cases & Edge Cases

1. Shadowing (Inherent vs. Trait)

Rust has a strict hierarchy for where the method is defined:

  • Inherent Impls: Methods defined directly on the struct (impl MyStruct { ... }) always win.

  • Trait Impls: Methods from traits only count if the trait is in scope (use path::to::Trait).

  • Collision: If two traits in scope provide the same method name for the same type, the compiler throws an error, and you must use Fully Qualified Syntax: Trait::method(&receiver).
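The collision case can be sketched with two hypothetical traits (Pilot and Gundam are made-up names for illustration):

```rust
struct Robot;

trait Pilot { fn name(&self) -> &'static str; }
trait Gundam { fn name(&self) -> &'static str; }

impl Pilot for Robot { fn name(&self) -> &'static str { "pilot" } }
impl Gundam for Robot { fn name(&self) -> &'static str { "gundam" } }

fn main() {
    let r = Robot;
    // r.name(); // ERROR: multiple applicable items in scope
    // Disambiguate by naming the trait:
    assert_eq!(Pilot::name(&r), "pilot");
    // Or use the fully qualified form:
    assert_eq!(<Robot as Gundam>::name(&r), "gundam");
}
```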

2. The "Dot" vs. "Function" Syntax

The auto-deref/autoref magic only happens with dot notation (a.b()).

If you use the associated function syntax (Type::method(a)), you must provide the exact type expected by the function signature. No coercion will help you there.
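A small sketch of the difference, using Vec::len from the standard library:

```rust
fn main() {
    let v = vec![1, 2, 3];
    // Dot syntax: autoref kicks in, the compiler borrows `v` for you.
    assert_eq!(v.len(), 3);
    // Function syntax: you must pass exactly the `&Vec<i32>` the signature wants.
    assert_eq!(Vec::len(&v), 3);
    // Vec::len(v) would not compile: expected `&Vec<i32>`, found `Vec<i32>`.
}
```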

[AI] Tuning jargon

Supervised Fine-Tuning (SFT)

Parameter-Efficient Fine-Tuning (PEFT):

  • LoRA / QLoRA: The industry standard. Instead of changing the whole model, you add tiny "adapter" layers. This reduces the VRAM requirement by up to 90%.

  • DoRA (Weight-Decomposed Low-Rank Adaptation): A newer 2025/2026 favorite that decouples the magnitude and direction of weight updates, often yielding better results than LoRA.


SDFT (Self-Distillation Fine-Tuning): A breakthrough method (popularized by MIT in early 2026) where the model uses its own reasoning to generate better training data for itself, reducing the need for human-labeled sets.

SFT vs. RLHF (The "Teacher" vs. the "Critic")
It is helpful to think of SFT as the first step in a two-part education:

  • SFT (The Teacher): Tells the model, "Here is exactly how a good answer looks. Copy this."

  • RLHF/RFT (The Critic): Comes after SFT. It tells the model, "You gave me three answers; this one is better than that one." This is used for "alignment": making the model safer, more polite, or better at complex reasoning (like DeepSeek-R1 or OpenAI’s o1).


Why Use SFT?

  • Domain Expertise: Teaching a model medical, legal, or proprietary company jargon.

  • Style/Voice: Ensuring the AI sounds like your specific brand (e.g., "Professional yet cheeky").

  • Format Constraints: Forcing the model to always output valid JSON or specific code structures.

  • Efficiency: A fine-tuned 7B model can often outperform a generic 70B model on a specific, narrow task.

Pro Tip: In 2026, the mantra is "Quality over Quantity." 1,000 extremely high-quality, human-verified examples will almost always result in a better model than 50,000 noisy, machine-generated ones.

Apr 3, 2026

[SIMD][SVE] example


#include <iostream>
#include <arm_sve.h> // SVE intrinsics for modern ARM chips (compile with -march=armv8-a+sve)

// A function to add two arrays together using SVE
void add_arrays_sve(float* A, float* B, float* C, int n) {
    int i = 0;
    
    // Keep looping until we have processed all 'n' elements
    while (i < n) {
        // 1. THE MAGIC TAPE (Predicate)
        // This generates a true/false mask based on how many numbers are left.
        // It tells the CPU to "turn off" slots we don't need so we don't crash.
        svbool_t mask = svwhilelt_b32(i, n);

        // 2. Load data from A and B into our stretchy vectors, using the mask
        svfloat32_t vecA = svld1_f32(mask, &A[i]);
        svfloat32_t vecB = svld1_f32(mask, &B[i]);

        // 3. Add the vectors together, safely ignoring masked-off slots
        svfloat32_t vecC = svadd_f32_z(mask, vecA, vecB);

        // 4. Store the results back into standard memory
        svst1_f32(mask, &C[i], vecC);

        // 5. THE STRETCHY PART
        // svcntw() asks the CPU: "How many 32-bit words fit in your vector?"
        // We move forward by that amount, whether it's 4, 8, 16, or 64!
        i += svcntw(); 
    }
}

int main() {
    int n = 10; // 10 numbers total
    float A[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    float B[10] = {10, 10, 10, 10, 10, 10, 10, 10, 10, 10};
    float C[10] = {0}; // Where the answers go

    add_arrays_sve(A, B, C, n);

    std::cout << "Results of SVE math: ";
    for(int i = 0; i < n; ++i) {
        std::cout << C[i] << " ";
    }
    std::cout << "\n";

    return 0;
}

Mar 25, 2026

[intel] 5-level paging (LA57) affects MSB pointer tagging

Resource:
https://en.wikipedia.org/wiki/Intel_5-level_paging

The adoption of 5-level paging (also known as LA57) allows the virtual address space to expand from the traditional 48 bits (256 TiB) to 57 bits (128 PiB).




In 5-level paging (LA57), the "free" or unused bits are indeed bits 57 through 63, which totals 7 bits. However, there is a catch: those bits aren't truly "free" for software to store random data (like tags or metadata) unless a specific hardware feature like Intel LAM (Linear Address Masking) or AMD UAI (Upper Address Ignore) is enabled.

The Breakdown of the 64-bit Address

Here is how the 64 bits are partitioned in a 5-level paging system:

  • Bits 0–11: Page offset (position within a 4 KiB page).
  • Bits 12–56: The 45 translation bits consumed by the five levels of page tables (5 levels × 9 bits per level = 45 bits).
  • Bits 57–63: The 7 unused bits.


The "Canonical" Constraint

The CPU enforces a rule called Canonical Form. For an address to be valid, bits 57 through 63 must be an exact copy of bit 56:

  • If bit 56 is 0, then bits 57–63 must all be 0.
  • If bit 56 is 1, then bits 57–63 must all be 1.

If software tries to use an address where those 7 bits are "dirty" (containing random data), the CPU will trigger a General Protection Fault (#GP). This is why programmers can't simply use those 7 bits for pointers without the masking features mentioned above.
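A small sketch (assuming the 57-bit LA57 layout described above) of how software could check the canonical rule before dereferencing a possibly tagged pointer; the function name is made up:

```rust
/// Returns true if `addr` is canonical under 5-level paging (LA57):
/// bits 57..63 must all equal bit 56 (sign extension of the 57-bit address).
fn is_canonical_la57(addr: u64) -> bool {
    // Shifting left by 7 and arithmetically back right by 7 replicates
    // bit 56 into bits 57..63; a canonical address comes back unchanged.
    (((addr as i64) << 7) >> 7) as u64 == addr
}

fn main() {
    assert!(is_canonical_la57(0x0000_7fff_ffff_f000)); // lower half, bit 56 = 0
    assert!(is_canonical_la57(0xffff_8000_0000_0000)); // upper half, bit 56 = 1
    assert!(!is_canonical_la57(0x0100_0000_0000_0000)); // "dirty" tag bits -> #GP
}
```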

[math] Empirical distribution function

F_n(x) = (number of data points ≤ x) / n



Example:


Let's look at a real example with 5 marbles of different sizes. Imagine their sizes are 1, 2, 2, 3, 5.

How to read the "marble staircase" (the plot of the ECDF):

  1. The Bottom (Ground): We start at 0 because we haven't seen any marbles yet.

  2. Size 1: We find the first marble! We take a step up. Now we've seen 1/5 (or 20%) of the marbles.

  3. Size 2: We find two marbles of the same size! That makes a double-height step. Now we've seen 3/5 (or 60%) of the marbles.

  4. Size 3: Another marble! Another step up, to 4/5 (or 80%).

  5. Size 5: The last, biggest marble! One final step takes us to the very top: 100%.
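In symbols, the staircase is the empirical distribution function of the sample x_1, …, x_n (standard definition, stated here for reference):

```latex
F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{x_i \le x\}
```

For the five marbles, F_5(2) = 3/5 and F_5(3) = 4/5, matching steps 3 and 4 above.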

[math] Poisson distribution

P(X = k) = λ^k · e^(−λ) / k!

  • Lambda is the average.

  • k is the number you are guessing will happen.


Example:


Hash Table Collisions

While hash functions aim for uniformity, the number of keys that map to a specific "bucket" in a large hash table can be modeled using Poisson.

  • If you have n keys and m buckets, and n/m is small, the number of items in a bucket follows a Poisson distribution with Lambda = n/m

  • This helps in estimating the frequency of collisions and optimizing the size of the table.
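A quick sketch of that estimate, with a hypothetical table of n = 1000 keys in m = 1000 buckets (so λ = 1); the pmf is computed iteratively to avoid overflowing k!:

```rust
// Poisson pmf: P(X = k) = lambda^k * e^(-lambda) / k!
fn poisson_pmf(lambda: f64, k: u32) -> f64 {
    let mut p = (-lambda).exp();
    for i in 1..=k {
        p *= lambda / i as f64;
    }
    p
}

fn main() {
    // Hypothetical table: n = 1000 keys hashed into m = 1000 buckets, so lambda = 1.
    let lambda = 1.0_f64;
    // About e^-1 (~36.8%) of buckets stay empty, and the same fraction hold exactly one key.
    println!("P(0 keys)  = {:.3}", poisson_pmf(lambda, 0));
    println!("P(1 key)   = {:.3}", poisson_pmf(lambda, 1));
    // Chains of 3 or more keys are already rare:
    let p_le_2: f64 = (0..=2).map(|k| poisson_pmf(lambda, k)).sum();
    println!("P(>=3 keys) = {:.3}", 1.0 - p_le_2);
}
```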

Mar 19, 2026

[C++] uintN_t guaranteed to be exactly N bits

uintN_t (and intN_t) are guaranteed to be exactly N bits wide with no padding bits, if they exist at all (the exact-width typedefs in <cstdint> are optional). So

sizeof(uint8_t) != sizeof(int32_t)

is guaranteed whenever both types exist.