Dec 26, 2019

[hugepage][kernel][notes]

Reference:
k8s feature-gates:
https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/

HugePages:
https://wiki.debian.org/Hugepages

Huge pages part 1, Introduction: https://lwn.net/Articles/374424/
Huge pages part 2, Interfaces: https://lwn.net/Articles/375096/

mmap: http://man7.org/linux/man-pages/man2/mmap.2.html

golang mmap: https://godoc.org/golang.org/x/exp/mmap

Use mmap With Care:
https://www.sublimetext.com/blog/articles/use-mmap-with-care
A nice write-up on using mmap() to map large files into memory. There are some gotchas to be aware of when the files are located on a network drive (e.g. NFS).


The design of virtual memory (VM) is "one of the engineering triumphs of the computer age" [denning96].


VM implementations take advantage of the principle of locality [denning71] by storing recent translations in a cache called the Translation Lookaside Buffer (TLB) [casep78,smith82,henessny90].


The amount of memory that can be translated by this cache is referred to as the "TLB reach" and depends on the size of the page and the number of TLB entries.
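
To make the "TLB reach" idea concrete, here is a minimal sketch that multiplies the number of entries by the page size. The 512-entry TLB and the 4 KiB / 2 MiB page sizes are assumptions chosen for illustration, not figures from the articles above.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of memory the TLB can translate without taking a miss."""
    return entries * page_size


# Hypothetical TLB with 512 entries (an assumption for illustration).
entries = 512
small_reach = tlb_reach(entries, 4 * KIB)  # base pages
huge_reach = tlb_reach(entries, 2 * MIB)   # huge pages

print(f"4 KiB pages: {small_reach // MIB} MiB of reach")
print(f"2 MiB pages: {huge_reach // MIB} MiB of reach")
```

With the same number of entries, the larger page size multiplies the reach by the ratio of the page sizes.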


Thus, a percentage of a program's execution time is spent accessing the TLB and servicing TLB misses.


Cost formulas:
Cycles(tlbhit) = TLBHitRate * TLBHitPenalty
Cycles(tlbmiss_cache) = TLBMissRate(cache) * TLBMissPenalty(cache)
Cycles(tlbmiss_full) = TLBMissRate(full) * TLBMissPenalty(full)
TLBMissCycles = Cycles(tlbmiss_cache) + Cycles(tlbmiss_full)
TLBMissTime = TLBMissCycles / ClockRate
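
The miss-time formulas can be evaluated with a small sketch like the one below. It assumes each "rate" is a count of events over the measured run, so that rate times penalty gives cycles; the sample numbers are made up for illustration and are not measurements.

```python
def tlb_miss_time(miss_rate_cache, miss_penalty_cache,
                  miss_rate_full, miss_penalty_full, clock_rate_hz):
    """Seconds spent servicing TLB misses, following the formulas above."""
    cycles_cache = miss_rate_cache * miss_penalty_cache  # Cycles(tlbmiss_cache)
    cycles_full = miss_rate_full * miss_penalty_full     # Cycles(tlbmiss_full)
    tlb_miss_cycles = cycles_cache + cycles_full         # TLBMissCycles
    return tlb_miss_cycles / clock_rate_hz               # TLBMissTime


# Hypothetical run: 1e9 cheap misses at 10 cycles each, 1e8 expensive
# misses at 100 cycles each, on a 2 GHz clock.
seconds = tlb_miss_time(1_000_000_000, 10, 100_000_000, 100, 2_000_000_000)
print(f"TLB miss time: {seconds} s")
```

Comparing this figure with the total run time shows whether the miss rate is worth attacking.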


If the TLB miss time is a large percentage of overall program execution, then the time should be invested to reduce the miss rate and achieve better performance.


The benefits of huge pages are twofold:
  1. Fewer translations requiring fewer cycles. 
  2. Address translation information is typically stored in the L2 cache. With huge pages, more cache space is available for application data, which means that fewer cycles are spent accessing main memory. 


Getting the current page size:
  C:     sysconf(_SC_PAGE_SIZE)
  shell: $ getconf PAGE_SIZE
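
The same lookup can be done from Python, sketched below. The helper `huge_page_size_bytes` is a hypothetical name introduced here; it assumes the Linux `/proc/meminfo` format, where the `Hugepagesize:` line reports the default huge page size in kB.

```python
import os

# Base page size, the same value sysconf(_SC_PAGE_SIZE) returns in C.
page_size = os.sysconf("SC_PAGE_SIZE")
print("base page size:", page_size)


def huge_page_size_bytes(meminfo_text):
    """Return the default huge page size in bytes, or None if not listed."""
    for line in meminfo_text.splitlines():
        if line.startswith("Hugepagesize:"):
            # The line looks like "Hugepagesize:       2048 kB".
            return int(line.split()[1]) * 1024
    return None


try:
    with open("/proc/meminfo") as f:
        print("huge page size:", huge_page_size_bytes(f.read()))
except OSError:
    # /proc/meminfo is Linux-specific; other systems will land here.
    print("huge page size: unknown (no /proc/meminfo)")
```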


Huge Page Fault Behaviour:
  1. In the initial Linux support, huge pages were faulted in at the same time as mmap() was called. Per the LWN article, since kernel 2.6.18 they are faulted on first reference, like normal pages.
  2. Beware that with a MAP_PRIVATE mapping, a fork() should be followed by an exec() immediately. Otherwise any modification of the mapped memory triggers copy-on-write (COW), and a COW fault can make either the parent or the child fail if a huge page cannot be allocated.
  3. There is no support for the paging of huge pages to backing storage.


How to use huge pages:
  1. Shared memory, i.e. shmget() with the SHM_HUGETLB flag.
  2. The RAM-based filesystem "hugetlbfs".
    Every file on this filesystem is backed by huge pages and is accessed with mmap() or read().
    $ mount -t hugetlbfs none /mnt/hugetlbfs -o pagesize=64K
    mount parameters:
    size= // specifies (in bytes; the "K," "M," and "G" suffixes are understood) the maximum amount of memory used by this mount.
    nr_inodes= // the number of files that can exist on the mount point which, in effect, limits the number of possible mappings.
    These options can be used to divvy up the available huge pages to groups or users in a shared system.
  3. Anonymous mmap(), i.e. mmap() with the flags MAP_ANONYMOUS|MAP_HUGETLB.
  4. libhugetlbfs Allocation APIs
  5. Automatic backing of memory regions, i.e. preloading a dynamic library (libhugetlbfs) to override calls such as shmget().
    $ hugeadm
    $ hugectl
  6. Shared memory: when libhugetlbfs is preloaded or linked and the environment variable HUGETLB_SHM is set to yes, libhugetlbfs overrides all calls to shmget().
  7. Heap: when libhugetlbfs is preloaded or linked and the environment variable HUGETLB_MORECORE is set to yes, libhugetlbfs configures the __morecore hook so that malloc() requests use huge pages.
    $ hugectl --heap
  8. Text and data: this is accomplished by linking against libhugetlbfs and specifying -Wl,--hugetlbfs-align, assuming the installed version of binutils is sufficiently recent.
    If the application is to be invoked multiple times, it is worth sharing that data by specifying the --share-text switch.
    For a binary that was not relinked, the environment variable HUGETLB_FORCE_ELFMAP can be set to yes to request that its segments be remapped anyway.
    $ hugectl --text --data --bss ./target-application
  9. Stack, no support.
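
As a rough sketch of option 3 (anonymous mmap() with MAP_HUGETLB), the Python snippet below tries a huge-page mapping and falls back to normal pages when the kernel refuses, for example because no huge pages are reserved. The helper name `map_anonymous`, the fallback constant 0x40000 (the value on x86-64 and other asm-generic Linux ports) and the 2 MiB huge page size are assumptions for illustration.

```python
import mmap
import sys

# Use the constant if this Python build exposes it; otherwise assume the
# Linux asm-generic value (correct on x86-64, differs on a few other ports).
MAP_HUGETLB = getattr(mmap, "MAP_HUGETLB", 0x40000)
HUGE_PAGE_SIZE = 2 * 1024 * 1024  # assumed default huge page size (2 MiB)


def map_anonymous(length):
    """Return (mapping, used_huge_pages) for an anonymous private mapping."""
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    if sys.platform.startswith("linux"):
        try:
            # length must be a multiple of the huge page size.
            return mmap.mmap(-1, length, flags=flags | MAP_HUGETLB), True
        except OSError:
            pass  # typically ENOMEM: no huge pages available
    return mmap.mmap(-1, length, flags=flags), False


buf, used_huge = map_anonymous(HUGE_PAGE_SIZE)
buf[:5] = b"hello"  # touching the memory faults the page in
print("huge pages used:", used_huge, "| first bytes:", bytes(buf[:5]))
buf.close()
```

Whether the first attempt succeeds depends on the pool size, which can be inspected through /proc/sys/vm/nr_hugepages.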
