Reference:
Unlocking Modern CPU Power - Next-Gen C++ Optimization Techniques - Fedor G Pikus - C++Now 2024
https://vsdmars.blogspot.com/2016/01/likely-or-unlikely-easy-misleading.html
https://vsdmars.blogspot.com/2022/10/book-art-of-writing-efficient-programs.html
RCU:
https://vsdmars.blogspot.com/2024/07/c-rcu.html
TLB:
https://vsdmars.blogspot.com/2020/07/virtual-memory-refresh.html
https://vsdmars.blogspot.com/2020/07/pacific-2018re-read-designing-for.html
https://vsdmars.blogspot.com/2018/11/pacific-2018-designing-for-efficient.html
Modern CPUs rely on caches and pipelining to a much greater degree than earlier generations did.
The penalty for not using caches and for disrupting pipelines is therefore far greater.
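One common way a pipeline gets disrupted is a mispredicted branch. A minimal sketch, assuming unpredictable input data: the two functions below compute the same count, one with a data-dependent branch and one without. The names count_branchy and count_branchless are made up for illustration, and an optimizing compiler may well turn either form into the other, so any difference has to be measured rather than assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts values >= threshold using a data-dependent branch.
// On unpredictable data the branch predictor may guess wrong often,
// and each wrong guess discards work already in the pipeline.
std::size_t count_branchy(const std::vector<std::int32_t>& values,
                          std::int32_t threshold) {
    std::size_t count = 0;
    for (std::int32_t v : values) {
        if (v >= threshold) {
            ++count;
        }
    }
    return count;
}

// Same result without a branch in the source: the comparison yields
// false or true, which converts to 0 or 1 and is added unconditionally.
std::size_t count_branchless(const std::vector<std::int32_t>& values,
                             std::int32_t threshold) {
    std::size_t count = 0;
    for (std::int32_t v : values) {
        count += static_cast<std::size_t>(v >= threshold);
    }
    return count;
}
```

Timing both on sorted and on shuffled input is one way to see whether the branch matters on a given machine and compiler.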
Memory access is characterized by bandwidth and latency.
Bandwidth is much higher than 'one word per latency period': memory can stream many words in the time a single access takes to complete.
Random access speed is limited by latency.
Sequential access speed is limited by bandwidth.
Prefetch attempts to predict future memory accesses and transfers memory content into cache in advance.
Random access defeats prediction.
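The sequential versus random distinction can be sketched as a small experiment. This is a rough illustration, not a rigorous benchmark: both passes read the same elements through an index vector, so only the visiting order differs. All function names here are hypothetical, and the data set should be much larger than the last-level cache for the effect to show.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Sums data[order[i]] for every i. The visiting order is the only
// thing that changes between a sequential and a random pass.
std::uint64_t sum_in_order(const std::vector<std::uint64_t>& data,
                           const std::vector<std::size_t>& order) {
    std::uint64_t total = 0;
    for (std::size_t index : order) {
        total += data[index];
    }
    return total;
}

// Indices 0, 1, 2, ...: consecutive addresses, easy to prefetch.
std::vector<std::size_t> sequential_order(std::size_t n) {
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    return order;
}

// The same indices shuffled: the next address is not predictable.
std::vector<std::size_t> shuffled_order(std::size_t n, std::uint32_t seed) {
    std::vector<std::size_t> order = sequential_order(n);
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    return order;
}

// Wall-clock seconds for one pass; the sum is written to `result`
// so the compiler cannot discard the loop.
double seconds_for_pass(const std::vector<std::uint64_t>& data,
                        const std::vector<std::size_t>& order,
                        std::uint64_t& result) {
    const auto start = std::chrono::steady_clock::now();
    result = sum_in_order(data, order);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling seconds_for_pass once with sequential_order(n) and once with shuffled_order(n, seed) on the same data gives two timings to compare. Both sums must be equal, which is a quick check that the passes did the same work.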
When TLB entries become outdated, the kernel flushes them through a 'TLB shootdown', which is delivered as an inter-processor interrupt. The shootdown handler is kernel code that runs on each interrupted CPU, so the time it takes is counted as 'system time' in the profiler.
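Because shootdown work shows up as system time, a quick first check is how a process's CPU time splits between user and system. A minimal sketch, assuming a POSIX system with getrusage; the CpuTimes struct and process_cpu_times function are made-up names. This does not isolate TLB shootdowns: it only reports the total system-time share, which includes every other kind of kernel work done on behalf of the process.

```cpp
#include <sys/resource.h>
#include <sys/time.h>

// Hypothetical helper type: CPU time split the way a profiler reports it.
struct CpuTimes {
    double user_seconds;
    double system_seconds;
};

// Returns the CPU time consumed so far by the calling process.
// Both fields are set to -1.0 if getrusage fails.
CpuTimes process_cpu_times() {
    struct rusage usage{};
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return CpuTimes{-1.0, -1.0};
    }
    // timeval stores whole seconds plus microseconds.
    auto to_seconds = [](const timeval& tv) {
        return static_cast<double>(tv.tv_sec) +
               static_cast<double>(tv.tv_usec) / 1e6;
    };
    return CpuTimes{to_seconds(usage.ru_utime), to_seconds(usage.ru_stime)};
}
```

Sampling this before and after a region of interest and subtracting gives the user and system time spent in that region. An unexpectedly large system share is a hint to look closer with a real profiler.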
2) Kernel tuning