Showing posts with label golang_high_performance. Show all posts

Apr 27, 2020

[Go] tidwall/evio dig

Reference:

  1. syscall.SO_REUSEADDR
    https://github.com/tidwall/evio/blob/master/vendor/github.com/kavu/go_reuseport/udp.go#L117
    reusePort
    https://github.com/tidwall/evio/blob/master/vendor/github.com/kavu/go_reuseport/tcp.go#L116
    example code: (var reusePort = 0x0F)
  2. if err = syscall.SetsockoptInt(fd, syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1); err != nil {
      syscall.Close(fd)
      return nil, err
     }
    
     if err = syscall.SetsockoptInt(fd, syscall.SOL_SOCKET, reusePort, 1); err != nil {
      syscall.Close(fd)
      return nil, err
     }
    
  3. *os.File is used as the carrier for the Unix file descriptor, which a child process can inherit across fork()
    https://github.com/tidwall/evio/blob/master/vendor/github.com/kavu/go_reuseport/udp.go#L129
    e.g
    file = os.NewFile(uintptr(fd), getSocketFileName(proto, addr))
    if l, err = net.FilePacketConn(file); err != nil {
        return nil, err
    }
  4. Use net.FilePacketConn to turn the socket's *os.File into a net.PacketConn. https://golang.org/pkg/net/#FilePacketConn
  5. Set non-blocking I/O:
    https://golang.org/pkg/syscall/#SetNonblock
  6. Use of epoll:
    epoll_ctl
    epoll_create1 https://github.com/tidwall/evio/blob/master/internal/internal_linux.go#L20
  7. Produce/consume through an eventfd (syscall.SYS_EVENTFD2)
    Writing to the eventfd manually wakes up a blocked epoll wait.
    https://github.com/tidwall/evio/blob/master/internal/internal_linux.go#L27
    https://github.com/tidwall/evio/blob/master/internal/internal_linux.go#L71
  8. epoll wait, same as in C but in Go:
    https://github.com/tidwall/evio/blob/master/internal/internal_linux.go#L54
  9. Use of syscall.ForkLock (a sync.RWMutex)
    It guards the gap between creating a file descriptor and marking it close-on-exec (CLOEXEC).
    syscall.ForkExec takes syscall.ForkLock.Lock() while forking,
    so a fork cannot happen while another goroutine holds the read lock.
    In other words, code that creates an FD and then sets CLOEXEC in a separate call
    (before kernel 2.6.23 this could not be done atomically in open())
    should hold syscall.ForkLock.RLock() across both calls.

    Quote from source:
    https://github.com/golang/go/blob/go1.14.1/src/syscall/exec_unix.go#L18
    Lock synchronizing creation of new file descriptors with fork.
    Time to use it:
    1) Pipe. Does not block. Use the ForkLock.
    2) Socket. Does not block. Use the ForkLock.
    3) Accept. If using non-blocking mode, use the ForkLock.
    Otherwise, live with the race.
    4) Open. Can block. Use O_CLOEXEC if available (Linux).
    Otherwise, live with the race.
    5) Dup. Does not block. Use the ForkLock. On Linux, could use fcntl F_DUPFD_CLOEXEC  instead of the ForkLock, but only for dup(fd, -1).
  10. A Mutex must not be copied after first use.
    https://golang.org/src/sync/mutex.go
  11. socketcall:
    http://man7.org/linux/man-pages/man2/socketcall.2.html
    https://golang.org/src/syscall/asm_linux_386.s?h=socketcall#L128
  12. From Go 1.9, process creation on Linux uses clone(2) with CLONE_VFORK and CLONE_VM (similar in spirit to posix_spawn) instead of a plain fork
  13. The syscall package has no wrapper for the timerfd_create system call; it can still be invoked with
    syscall.Syscall6
  14. More Unix system call wrappers live under golang.org/x/sys/unix:
     https://pkg.go.dev/golang.org/x/sys/unix?tab=doc
  15. internal/poll.SendFile wraps the sendfile(2) system call
    https://golang.org/pkg/internal/poll/#SendFile which is fast because the copy happens inside kernel space.
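
Items 6 to 8 (epoll plus an eventfd used as a manual wake-up) can be sketched as below. This is a minimal Linux-only sketch, not evio's actual code: the poller type and the newPoller, wake and wait names are made up for illustration, and only calls from the standard syscall package are used.

```go
package main

import (
	"fmt"
	"syscall"
)

// poller pairs an epoll instance with an eventfd used only for wake-ups.
// (Hypothetical type for illustration; not taken from evio.)
type poller struct {
	epfd int // epoll instance
	wfd  int // eventfd registered with epoll
}

func newPoller() (*poller, error) {
	epfd, err := syscall.EpollCreate1(0)
	if err != nil {
		return nil, err
	}
	// eventfd2(initval=0, flags=0): the syscall package has no wrapper for it.
	r0, _, errno := syscall.Syscall(syscall.SYS_EVENTFD2, 0, 0, 0)
	if errno != 0 {
		syscall.Close(epfd)
		return nil, errno
	}
	wfd := int(r0)
	ev := syscall.EpollEvent{Events: syscall.EPOLLIN, Fd: int32(wfd)}
	if err := syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, wfd, &ev); err != nil {
		syscall.Close(wfd)
		syscall.Close(epfd)
		return nil, err
	}
	return &poller{epfd: epfd, wfd: wfd}, nil
}

// wake adds a non-zero value to the eventfd counter, which makes the
// descriptor readable and therefore wakes up a blocked epoll wait.
func (p *poller) wake() error {
	_, err := syscall.Write(p.wfd, []byte{1, 0, 0, 0, 0, 0, 0, 0})
	return err
}

// wait blocks for up to timeoutMs milliseconds and reports whether the
// wake-up eventfd fired. Reading the eventfd resets its counter.
func (p *poller) wait(timeoutMs int) (bool, error) {
	events := make([]syscall.EpollEvent, 8)
	for {
		n, err := syscall.EpollWait(p.epfd, events, timeoutMs)
		if err == syscall.EINTR {
			continue // interrupted by a signal, retry
		}
		if err != nil {
			return false, err
		}
		woken := false
		for i := 0; i < n; i++ {
			if int(events[i].Fd) == p.wfd {
				var buf [8]byte
				if _, err := syscall.Read(p.wfd, buf[:]); err != nil {
					return false, err
				}
				woken = true
			}
		}
		return woken, nil
	}
}

func main() {
	p, err := newPoller()
	if err != nil {
		panic(err)
	}
	before, _ := p.wait(0)
	if err := p.wake(); err != nil {
		panic(err)
	}
	after, _ := p.wait(1000)
	fmt.Println("woken before wake():", before, "after wake():", after)
}
```

In a real event loop the same epoll instance would also hold the listening and connection sockets; the eventfd is just one more registered descriptor.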

Apr 19, 2020

[Go] Micro optimizing Go Code (2018 GopherCon)

Micro optimizing Go Code
https://www.youtube.com/watch?v=keydVd-Zn80

Benchmarking tools:
https://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go
https://godoc.org/golang.org/x/perf/cmd/benchstat
https://golang.org/pkg/net/http/pprof/
https://golang.org/pkg/runtime/pprof/

Consider writing a function which can be inlined.

This will show which functions cannot be inlined:
$ go test -gcflags="-m=2" 2>&1 | ag "too complex"


Inlining rules in Go (as presented in the talk; later compiler releases have relaxed some of them):
1. No nonlinear control flow:
    for, range, select, break, defer, type switch
2. No recover (panic is OK)
3. No certain runtime funcs and no non-intrinsic assembly
See the Go compiler internals: inl.go


Consider writing code that avoids index bounds checks.
This will show where bounds checks remain, i.e. where bounds check elimination (BCE) did not happen:
$ go test -gcflags="-d=ssa/check_bce/debug=1"

BCE:
http://vsdmars.blogspot.com/search/label/golang_runtime_bce
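
A common way to help BCE is to touch the highest index first, so one check covers the accesses that follow. The sketch below is an illustration only (sum4Checked and sum4Hinted are made-up names); whether each check is actually removed depends on the compiler version, so verify with the check_bce flag above.

```go
package main

import "fmt"

// sum4Checked indexes in ascending order; the compiler may emit a bounds
// check for each access because each index is larger than the last one proven.
func sum4Checked(b []byte) int {
	return int(b[0]) + int(b[1]) + int(b[2]) + int(b[3])
}

// sum4Hinted touches the highest index first. If that access succeeds,
// len(b) >= 4 is known, so the remaining accesses need no further checks.
func sum4Hinted(b []byte) int {
	_ = b[3] // bounds check hint
	return int(b[0]) + int(b[1]) + int(b[2]) + int(b[3])
}

func main() {
	data := []byte{1, 2, 3, 4}
	fmt.Println(sum4Checked(data), sum4Hinted(data))
}
```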

Optimizing table lookup (all old-school opt from C++/C):
1. propagate constants
2. unroll loops
3. reuse previously-allocated local variables
4. reduce indirection
5. Consider using an array instead of a slice (Go specific)
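
A small sketch of point 5, assuming a hex-digit lookup as the table (hexTable and fromHex are names invented here, not from the talk). Because the table is a [256]int8 array and the index is a byte, the index can never be out of range, so no bounds check is needed and there is no slice header to dereference.

```go
package main

import "fmt"

// hexTable maps every possible byte to its hex digit value, or -1.
// It is an array, not a slice, so its length is known at compile time.
var hexTable = func() (t [256]int8) {
	for i := range t {
		t[i] = -1
	}
	for c := byte('0'); c <= '9'; c++ {
		t[c] = int8(c - '0')
	}
	for c := byte('a'); c <= 'f'; c++ {
		t[c] = int8(c-'a') + 10
	}
	for c := byte('A'); c <= 'F'; c++ {
		t[c] = int8(c-'A') + 10
	}
	return
}()

// fromHex returns the value of a hex digit and whether c was a valid digit.
func fromHex(c byte) (int, bool) {
	v := hexTable[c] // byte index into a 256-entry array: always in range
	return int(v), v >= 0
}

func main() {
	fmt.Println(fromHex('f'))
	fmt.Println(fromHex('G'))
}
```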

Reuse the slice. (sliceForAppend, also consider using sync.Pool 
http://vsdmars.blogspot.com/2020/04/go-use-syncpool-with-sense.html)
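
Slice reuse in its simplest form is the append-to-dst pattern: the caller owns the buffer and passes buf[:0] so the backing array is reused across calls. This is a sketch of that general idea, not the sliceForAppend helper mentioned above; appendPair is a made-up function.

```go
package main

import "fmt"

// appendPair appends "key=value;" to dst and returns the extended slice.
// It allocates only if dst does not have enough capacity.
func appendPair(dst []byte, key, value string) []byte {
	dst = append(dst, key...)
	dst = append(dst, '=')
	dst = append(dst, value...)
	return append(dst, ';')
}

func main() {
	buf := make([]byte, 0, 64) // allocated once, reused for every pair
	pairs := [][2]string{{"a", "1"}, {"b", "2"}}
	for _, kv := range pairs {
		buf = appendPair(buf[:0], kv[0], kv[1]) // reset length, keep capacity
		fmt.Printf("%s (cap %d)\n", buf, cap(buf))
	}
}
```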

Nov 20, 2019

[Go] High performance, design lessons from the fasthttp open source project

While working on my stealth-mode library project, vact,
I found several design decisions in the fasthttp project worth following.

Jotting them down here.

Server:
https://github.com/valyala/fasthttp

Client:
https://godoc.org/github.com/valyala/fasthttp#Client
HostClient
https://godoc.org/github.com/valyala/fasthttp#HostClient
PipelineClient
https://godoc.org/github.com/valyala/fasthttp#PipelineClient
(Subject to head-of-line blocking because it uses pipelined requests)

Reference:
HTTP/1.1 pipelining
https://en.wikipedia.org/wiki/HTTP_pipelining
HTTP/2
https://en.wikipedia.org/wiki/HTTP/2
QUIC
https://en.wikipedia.org/wiki/QUIC


Tips:
  1. Re-use object
    func fetchURLS(ch <-chan string) {
        var r Response
        for url := range ch {
            Get(url, &r)
            processResponse(&r)
            // reset response object, so it may be re-used
            r.Reset()
        }
    }
    
  2. Do not put an object back into the pool if some header/tag is present.
    Close/recycle such a connection/object instead.
  3. Server may stop responding, so goroutines calling Get() may
    pile up indefinitely.
    One way to detect it:
    (Check graceful shutdown: http://vsdmars.blogspot.com/2019/02/golangdesign-graceful-shutdown.html)
  4. func GetTimeout(url string, timeout time.Duration, resp *Response) {
        select {
            case concurrencyCh <- struct{}{}:
            default:
                // too many goroutines are blocked in Get(), return back to caller
                return
        }
        ch := make(chan struct{})
    
        go func() {
            Get(url, resp)
            <-concurrencyCh // release the concurrency slot
            close(ch)
        }()
    
        select {
            case <-ch:
            case <-time.After(timeout):
                // Get() keeps running in the background and may still write to resp
        }
    }
  5. Set up a timeout for every kind of network connection!
    This applies to TCP connections as well.
  6. conn := dialHost(url)
    conn.SetReadDeadline(time.Now().Add(readTimeout))
    conn.SetWriteDeadline(time.Now().Add(writeTimeout))
    
  7. Should the connection be closed when a request times out?
    Only if you want to DoS the remote server; otherwise, let it time out.
  8. Pool size should ALWAYS be limited.
  9. After a connection spike, leftover connections may remain.
    Limit the connection lifetime.
  10. Each TCP dial may trigger a DNS request.
    We could cache the host -> IP mapping for a period of time, or
    rely on the host's DNS cache.
  11. An HTTP client can be smart and dial each IP behind a single
    domain name in round-robin order.
  12. Each dial should have a timeout.