Showing posts with label golang_code_design. Show all posts

Mar 13, 2021

[Go] Design

Reference:
https://dave.cheney.net/2016/08/20/solid-go-design

Bad code

Old fashion articles

Rigid. Is the code rigid? Does it have a straight jacket of overbearing types and parameters, that making modification difficult?
Fragile. Is the code fragile? Does the slightest change ripple through the code base causing untold havoc?
Immobile. Is the code hard to refactor? Is it one keystroke away from an import loop?
Complex. Is there code for the sake of having code, are things over-engineered?
Verbose. Is it just exhausting to use the code? When you look at it, can you even tell what this code is trying to do?

SOLID

Single Responsibility Principle

A class should have one, and only one, reason to change.
Coupling & Cohesion
Package names, don't use name like 'common', 'util' which are too broad.
Unix philosophy; small, sharp tools which combine to solve larger tasks.

Open / Closed Principle
Software entities should be open for extension, but closed for modification.
Liskov Substitution Principle

Two types are substitutable if they exhibit behaviour
such that the caller is unable to tell the difference.
Require no more, promise no less.

Interface Segregation Principle

Clients should not be forced to depend on methods they do not use.
A great rule of thumb for Go is accept interfaces, return structs.

Dependency Inversion Principle

High-level modules should not depend on low-level modules. Both should depend on abstractions.
Abstractions should not depend on details. Details should depend on abstractions.
In Go, import graph must be acyclic.
All things being equal the import graph of a well designed Go program should be a wide, and relatively flat, rather than tall and narrow.
If you have a package whose functions cannot operate without enlisting the aid of another package,that is perhaps a sign that code is not well factored along package boundaries.

Focus

Go programmers need to start talking less about frameworks, and start talking more about design.
We need to stop focusing on performance at all cost, and focus instead on reuse at all costs.
What I want to hear is people talking about how to design Go programs in a way that is well engineered, decoupled, reusable, and above all responsive to change.

Jul 4, 2020

[Pacific++ 2018][re-read] "Designing for Efficient Cache Usage" - Scott McMillan

Reference:
Latency Numbers Every Programmer Should Know:
https://colin-scott.github.io/personal_website/research/interactive_latency.html

Cache Lines

Compact data structure

Padding
https://vsdmars.blogspot.com/2014/01/c-c-lost-art-of-c-structure-packing-by.html
http://vsdmars.blogspot.com/2018/09/golangc-padding.html
Honor access cache pattern (sequential, do not jumping) http://vsdmars.blogspot.com/2017/11/cppcon-2014-data-oriented-design-mike.html
https://vsdmars.blogspot.com/2018/11/cppcon-2018-oop-is-dead-long-live-data.html
https://vsdmars.blogspot.com/2018/11/cppcon-2016-high-performance-code-201.html

Hardware Prefetch

Predictable access patterns are faster
We want sequential locality

Access Locality

Cache locality

Spatial
Temporal

Beware the algorithm/data structure to use honors cache locality.
Look beyond just big-O notation as constant-time costs can differ significantly.
Large benefit in hitting faster cache levels.
In C++, allocators matters. In Go(lang), the standard implement dealt with this problem.
In Go(lang), use flat map instead of std map.

Multipe CPU Core Considerations

MESI
http://vsdmars.blogspot.com/2018/09/concurrency-c-wrap-up-2018.html
http://vsdmars.blogspot.com/2019/06/c-c-concurrency-in-action-second.html
Take care to avoid data sharing problems.

Write Combined Memory

Accumulate writes to flush as 64Bytes operations
Partial buffer flush causes: (avoid this)

Not writing all bytes convered by a buffer
Writing too many streams at once
Atomic read-modify-write operations

Write Combined memory read causes:

C++ bit fields
Optimization
Virtually always an accident (read the source code of implement)
Solution: Expose write-only interface

Non-temporal writes on x86:
Optimizing Cache Usage With Nontemporal Accesses

Use compiler intrinsics:

SSE2

_mm_stream_si32: store 4 bytes
_mm_Stream_si128: store 16 bytes

_mm256_stream_si256: store 32 bytes

AVX-512

_mm512_stream_si512: store 64 bytes

Address Translation

hugepage: https://vsdmars.blogspot.com/2019/12/hugepagekernelnotes.html#more
https://www.kernel.org/doc/html/latest/core-api/cachetlb.html
TLB flush optimization: https://lwn.net/Articles/684934/
TLB size (4KiB pages)

Intel Skylake Series:

L1: 64 entries => 256 4KiB addressed
L2 shared: 1536 entries => 6144KiB addressed

Thus, use hugepage!

Platform-specific
Not directly pageable
Difficult/slow to allocate
Linux: Use hugepage

Jun 20, 2020

[Go][design] high through-put low contention in memory cache

Project https://github.com/dgraph-io/ristretto as example

Fast Access:

batching:

use ring-buffer for each lock and update the global cache.

generate a fast hash:

use runtime.memhash from Go Runtime.

High Concurrency and Contention Resistance:

Get (tinyLFU):

use sync.Pool to implement striped, lossy ring-buffers that have great performance with little loss of data to update frequency.
Use with channel for async update.

Set (tinyLFU + Count–min sketch):

use a channel to capture the Sets, dropping them on the floor if the channel is full to avoid contention.

Memory Bounding:

An infinitely large cache is practically impossible.
A cache must be bounded in size. Many cache libraries would consider cache size to be the number of elements, this is naive and only works in a workload where values are of identical size. Most workloads, however, have variable-sized values. One value could cost a few bytes, another a few kilobytes and yet another, a few megabytes.Treating them as having the same memory cost isn't realistic.

Idea:

Attach a cost to every key-value.
Users can specify what that cost is when calling Set.
When the cache is operating at capacity, a heavy item could displace many lightweight items.

Admission Policy:

What should we let into the cache when the capacity is hit?
The goal, obviously, is to let in new items if they are more “valuable” than the current items. (there is overhead)
Using LFU, not LRU:
Paper: TinyLFU: A Highly Efficient Cache Admission Policy

The main idea is to only let in a new item if its estimate is higher than that of the item being evicted.
Use count-min sketch.
After N key increments, the counters get halved.
Use TTL. A key that has not been seen for a while would have its counter get reset to zero.

To find a key with low ɛ, use randomness sampling, which provided by iterating through Go's map key.
If the incoming key's ɛ is higher than ɛ from randomness sampling, reject the incoming key, otherwise, incoming key is stored.
Note that the cost of the incoming key does not play a factor in choosing the eviction keys.

DoorKeeper:

When capacity hits, use a bloom filter to first check if the key has been seen before.
Only if the key is already present in the bloom filter, is it inserted into the TinyLFU.

Metrics:

When using atomic counters for metrics, be ware of false sharing between caches among CPU cores.
While cache line size, on my thinkpad is 64-bytes, use atomic.AddUint64 as atomic counter.

Reference:

Design Of A Modern Cache (Benjamin Manes)
http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html

BP-Wrapper: A System Framework Making Any Replacement Algorithms (Almost) Lock Contention Free
https://dgraph.io/blog/refs/bp_wrapper.pdf

Adaptive Software Cache Management
https://dgraph.io/blog/refs/Adaptive%20Software%20Cache%20Management.pdf

TinyLFU: A Highly Efficient Cache Admission Policy
https://dgraph.io/blog/refs/TinyLFU%20-%20A%20Highly%20Efficient%20Cache%20Admission%20Policy.pdf

The State of Caching in Go
https://dgraph.io/blog/post/caching-in-go/

Count–min sketch
https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch

Sep 28, 2019

[Go] Naming

Reference:
https://talks.golang.org/2014/names.slide

Name it in:

MixedCase.
Consistent (easy to guess),
Short (easy to type),
Accurate (easy to understand).

The greater the distance between a name's declaration and its uses,
the longer the name should be.
i.e using single char naming for 'for' loop is just fine.

Prefer i to index.
Prefer r to reader.
Prefer b to buffer.

Function parameters are like local variables,
but they also serve as documentation.
e.g Where the types are descriptive, they should be short:

func AfterFunc(d Duration, f func()) *Timer
func Escape(w io.Writer, s []byte)

Where the types are more ambiguous, the names may provide documentation:

func Unix(sec, nsec int64) Time
func HasPrefix(s, prefix []byte) bool

Return values on exported functions should only be named for
documentation purposes.
These are good examples of named return values:

func Copy(dst Writer, src Reader) (written int64, err error)
func ScanBytes(data []byte, atEOF bool) (advance int, token []byte, err error)

Receivers are a special kind of argument.
By convention, they are one or two characters that reflect the receiver type,
because they typically appear on almost every line.
Receiver names should be consistent across a type's methods.
(Don't use r in one method and rdr in another.)

Exported names are qualified by their package names.
Remember this when naming exported
variables, constants, functions, and types.
That's why we have bytes.Buffer and strings.Reader,
not ~~bytes.ByteBuffer~~ and ~~strings.StringReader~~.

Interfaces that specify just one method are usually just that function
name with 'er' appended to it.
Sometimes the result isn't correct English, but we do it anyway.
When an interface includes multiple methods,
choose a name that accurately describes its purpose
(examples: net.Conn, http.ResponseWriter, io.ReadWriter).

Error types should be of the form FooError:

type ExitError struct {
    ...
}

Error values should be of the form ErrFoo:

var ErrFormat = errors.New("image: unknown format")

Choose package names that lend meaning to the names they export.
Steer clear of util, common, and the like.

The last component of a package path should be the same as the package name.
Avoid stutter in repository and package paths.
Avoid upper case letters (not all file systems are case sensitive).
For libraries, it often works to put the package code in the repo root:

"github.com/golang/oauth2" // package oauth2

[Go] project directory layout and etc.

Reference:
https://github.com/golang-standards/project-layout
https://blog.golang.org/v2-go-modules

Publishing v2 and beyond:

e.g v2 directory

github.com/googleapis/gax-go/v2

Go for Industrial Programming:
https://peter.bourgon.org/go-for-industrial-programming/
Optimizing for mean time to recovery (MTTR) instead of mean time between failure (MTBF).

Mitchell Hashimoto - Advanced Testing with Go:
https://www.youtube.com/watch?v=8hQG7QlcLBk

How you name your variable:
https://talks.golang.org/2014/names.slide
https://golang.org/doc/effective_go.html#names
https://blog.golang.org/package-names
https://rakyll.org/style-packages/ (No plurals on package naming)

Code review comments:
https://github.com/golang/go/wiki/CodeReviewComments

Feb 17, 2019

[structure concurrency][Go][design] Graceful Shutdown

Recently I'm working on Actor design utilizing with Golang's channel.

Link: https://github.com/vsdmars/actor

While bumped into Martin Sústrik's blog post:
Graceful Shutdown: http://250bpm.com/blog:146
Along with the discussion: https://trio.discourse.group/t/graceful-shutdown/93
Structured concurrency resources: https://trio.discourse.group/t/structured-concurrency-resources/21

It's a nice summary, touched the well known concept of pthread cancellation point as well as Golang's context.Done()/Cancel() pattern.

Notes:
In-band cancel:
through application level channel

out-of-band cancel:
through language run-time

Sending a graceful shutdown request cannot possibly be a feature of the language. It must be done manually by the application.

POSIX C thread:
Reference:
https://stackoverflow.com/a/27374983
http://man7.org/linux/man-pages/man7/pthreads.7.html (Cancellation points)

The rules of hard cancellation are strict:
Once a coroutine has been hard-canceled the very next blocking call will immediately return ECANCELED.
In other words, every single blocking call is a cancellation point.

As an additional note, if looking at the POSIX requirement for cancel points, virtually all blocking interfaces are required to be cancel points.
Otherwise, on any completely blocked thread (in such call), there would be no safe way to terminate that thread.

Graceful shutdown can terminate the coroutine only when it is idling and waiting for a new request.

Reasoning:

If we allowed graceful shutdown to terminate the coroutine while it is trying to send the reply, it would mean that a request could go unanswered.

And that doesn't deserve to be called "graceful shutdown".
(e.g. Golang, context.Done() pattern)

Definition summarize:

Hard cancellation:

- Is triggered via an invisible communication channel created by the language runtime.

- It manifests itself inside the target coroutine as an error code (ECANCELED in libdill) or an exception (Cancelled in Trio).

- The error (or the exception) can be returned from any blocking call.

- In response to it, the coroutine is not expected to do any application-specific work. It should just exit.

Graceful shutdown:

- Is triggered via an application-specific channel.

- Manifests itself inside the target coroutine as a plain old message.

- The message may only be received at specific, application-defined points in the coroutine.

- In response to it, the coroutine can do arbitrary amount of application-specific work.

Hard cancellation is fully managed by the language.

Graceful shutdown is fully managed by the application.

Golang library reference:

go-resiliency/deadline: https://github.com/eapache/go-resiliency/tree/master/deadline

Godoc: https://godoc.org/github.com/eapache/go-resiliency/deadline#example-Deadline

Nov 2, 2018

[cppcon 2018] OOP Is Dead, Long Live Data-oriented Design - Stoyan Nikolov

Data-Oiented Design

OOP marries data with operations

Heterogeneous data is brought together by a 'logical' black box object.
The object is used in vastly different contexts
Hides 'state' all over the place
Impact on

Performance
Scalability
Modifiability
Testability

Why? Cache miss~

Data-oriented design

Like Golang, data first
Separates data from logic
Structs and functions live independent lives
Data is regarded as information that has to be transformed
The logic embraces the data
Does not try to hide the logic
Leads to functions that work on arrays
Reorganizes data according to it's usage

If we aren't going to use a piece of information, why packs it together?

Avoids 'hidden state'
No virtual calls
Promotes deep domain knowledge
Reference:
http://vsdmars.blogspot.com/2017/11/cppcon-2014-data-oriented-design-mike.html

Examples from Chromium code base :-)

--
class CORE_EXPORT Animation final: public ~
--

So, for OOP in Chromium:

Uses more than 6 non-trivial classes
Objects contain smart pointers to other objects
Interpolation uses abstract classes to handle different property types
CSS Animations directly 'reach out' to other systems - coupling
Calling events
Setting values in DOM element
What's the lifetime of elements being synchronized?

DOD:

Data operations

Tick -> 99.9%
Add
Remove
Pause
...

Tick Input

Definition
Time

Tick Output

Changed properties
New property values
Who owns the new values

Design for 'many animations',
i.e many objects

Define a type:

struct AnimationController{
    AnimationState* as_ [];
};

// Golang style.
// No shared_ptr, every instance of this type
// has it's own value. 
// Thread safe.
struct AnimationState{
    AnimationID Id;
    time StartTime;
    time PauseTime;
    ...
};

// Avoid type erasure, use template
template<typename T>
struct AnimationStateProperty : public AnimationState {
    AnimatedDefiniationFrames<T> Keyframes;
};


// We can't use vector<baseType>
// But since we know every property types,
// create vector for each type
CSSVector<AnimationStateProperty<ZIndex>> m_ZIndexActiveAnimState;

// Iterates them for every CSSVector types

With above design, keep in mind,
std::vector
is the best container to avoid cache misses!
(continuous memory, sequential container)

Avoid branches:

Keep lists per-boolean 'flag'
Separate Active and Inactive animations
i.e Base on the states we have, put object into a list of the same state.
avoid using 'if branch' test.
Avoid 'if (isActive)'
If there are too many states, try to cut down the size of states, or put the state that changes most into 'list' style.

Add API to the caller:

We don't have OOP style object, thus
no member functions!
i.e Animation.Play()
Use free function taking ID!
i.e
void PlayAnimation(AnimationID aid);

Key points:

Keep data flat (Golang style)

Maximise cache usage
No RTTI
Amortized dynamic allocations
Some read-only duplication improves performance and readability

Existence-based predication

Reduce branching
Apply the same operation on a whole table

Id-Based handles

No pointers
Allow rearranging internal memory

Table-based output

No external dependencies
Easy to reason about the flow

Scalability:

OOP multi-threading

Complicated

DoD multi-threading

Group state into list
Each task/job/thread keeps a private table of modified data
Join merges the tables (thread.join)
Classic fork-join

Testability:

OOP case

Hard to mock(lots of types)
Hidden states
Asserting correct state is difficult - multiple output points(VERY BAD DESIGN)

DOD case

Contract style design
Easier to mock(less types)
Asserting correct state is easy

Modifiability:

Hard to modify base types
But, easy to do 'quick' changes, because we have if branches

FP style. Building blocks
A bit harder to to quick changes, but with FP, we have monoid.

Downsides of DOD:

Correct data separation can be hard

Know the problem well

Existence-based predication is not always feasible(or easy)
'Quick' modifications can be tough

What to keep from OOP:

Simple struct with simple methods are fine
Keep polymorphism & interface under control
Use template
Use 'impl'

Extra reference:

Open addressing: http://www.mathcs.emory.edu/~cheung/Courses/323/Syllabus/Map/open-addr.html

Aug 12, 2018

[Go] On the uses and misuses of panics in Go - Eli Bendersky

https://eli.thegreenplace.net/2018/on-the-uses-and-misuses-of-panics-in-go/

Start my reading of Eli Bendersky's blog since my days in VMWare, while I was once a rookie in C++.

tl;dl;
Short take away:
Sometimes, it's better to be pragmatic than a zealot.

I found that in golang world people are a bit stubborn and insist to stick on to some idiom, which by sometimes doesn't make much sense (i.e try to mandate passing context to every callee function etc.)

C++ gives us freedom, thus we code by reasoning; however, in golang, emm... emm... emm...

Nov 19, 2017

[CPPCON 2014] Data-oriented Design - Mike Acton

video:
https://www.youtube.com/watch?v=rX0ItVEVjHc

No Exception
No Template
No IOSTREAM
No Multiple Inheritance
Seldom Operator overloading
No RTTI (turn off) i.e no virtual function
No STL
Custom allocators
Custom debugging tools (into the code)

The purpose of all programs, and all parts of those programs, is to transform data from one form to another.

If you don't understand the data you don't understand the problem.

Conversely, understand the problem by understanding the data.

Different problems require different solutions.

If you don't understand the cost of solving the problem, you don't understand the problem.

If you don't understand the hardware, you can't reason about the cost of solving the problem.

Everything is a data problem. Including usability, maintenance, debug-ability etc.
It's not code problem.

Avoid adding problem into the problem while solving the problem.

Latency and throughput are only the same in sequential systems.

Rule of thumb: Where there is one, there are many. Try looking on the time axis.

Code is NEVER more important then data.

Software is NOT platform. Hardware is.

There's no ideal abstract solution.

Help the compiler reasoning.

Don't put evaluation inside a loop. Put it outside.

Good programming is HARD.
Bad programming is EASY.

Design pattersn are spoonfeed material for brainless programmers incapable of independent thought, who will be resolved to producing code as mediocre as the design patterns they use to create it.