Ataraxia through Epoché: [Design][Software Engineering][C++] Using/Design type effectively

"On the whole, I'm inclined to say that when in doubt, make a new type."
– Martin Fowler, When to Make a Type
"Don't set a flag; set the data."
– Leo Brodie, Thinking Forth

This talk lays out an essential way to think about types as a tool for abstraction design.
It also suggests how to view Go's multiple-return-value paradigm: treat the returned values as a whole, that is, as a single sum type.

Types as sets of values:

A type, like a function in mathematics, defines a domain of values.
If two types have the same value domain, we can consider them equivalent
(but not equal).
Algebraically, a type is identified with the number of values that inhabit it.

e.g.
How many values?

bool;  // 2, true, false
char;  // 256
void;  // 0
struct Foo {};  // 1
enum FireSwampDangers : int8_t {   // 3
    FLAME_SPURTS,
    LIGHTNING_SAND,
    ROUSES
};

template <typename T> // as many values as T
struct Foo {
    T m_t;
};

Aggregating Types:

When two types are "concatenated" into one compound type,
we multiply the # of inhabitants of the components.
This kind of compounding gives us a product type.
e.g.
How many values?

std::pair<char, bool>;  // 256 * 2

struct Foo {  // 256 * 2
    char a;
    bool b;
};

std::tuple<bool, bool, bool>;  // 2 * 2 * 2 = 8

template <typename T, typename U>  // (# of values in T) * (# of values in U)
struct Foo {
    T m_t;
    U m_u;
};
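
The product rule can be checked by the compiler. What follows is a minimal sketch, assuming C++17 and an 8-bit char; `cardinality` is a hypothetical trait invented for this example, not a standard library facility.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical trait: the number of values that inhabit T.
// The primary template is left undefined so unknown types fail to compile.
template <typename T>
struct cardinality;

template <>
struct cardinality<bool> {
    static constexpr std::size_t value = 2;
};

// Assumption: char is 8 bits wide, as in the counts above.
template <>
struct cardinality<char> {
    static constexpr std::size_t value = 256;
};

// Product rule: multiply the counts of the components.
template <typename T, typename U>
struct cardinality<std::pair<T, U>> {
    static constexpr std::size_t value =
        cardinality<T>::value * cardinality<U>::value;
};

template <typename... Ts>
struct cardinality<std::tuple<Ts...>> {
    static constexpr std::size_t value =
        (cardinality<Ts>::value * ... * std::size_t{1});
};

static_assert(cardinality<std::pair<char, bool>>::value == 256 * 2);
static_assert(cardinality<std::tuple<bool, bool, bool>>::value == 8);
```

Note that the empty tuple comes out as 1, which matches the empty struct counted earlier.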

Alternating Types:

When two types are "alternated" into one compound type,
we add the # of inhabitants of the components.
This kind of compounding gives us a sum type.
e.g.
How many values?

std::optional<char>;  // 256 + 1
std::variant<char, bool>;  // 256 + 2

template <typename T, typename U>  // (# of values in T) + (# of values in U)
struct Foo {
    std::variant<T, U> m_v;
};
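
One way to convince yourself of the sum rule is to list every inhabitant. This is a small sketch, assuming C++17; the `Danger` enum and the array names are made up for the example, and the rare valueless_by_exception state of std::variant is ignored.

```cpp
#include <array>
#include <cstdint>
#include <optional>
#include <variant>

enum class Danger : std::int8_t { FlameSpurts, LightningSand, Rouses };  // 3 named values

// Every inhabitant of std::optional<bool>: "nothing", or one of bool's two values.
inline constexpr std::array<std::optional<bool>, 3> kAllOptionalBools{
    std::nullopt, false, true};

// Every inhabitant of std::variant<bool, Danger>: a bool OR a Danger, never both.
inline constexpr std::array<std::variant<bool, Danger>, 5> kAllBoolOrDanger{
    false, true, Danger::FlameSpurts, Danger::LightningSand, Danger::Rouses};

static_assert(kAllOptionalBools.size() == 1 + 2);
static_assert(kAllBoolOrDanger.size() == 2 + 3);
```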

Function Types:

The number of values of a function is the number of different ways we can draw arrows between the inputs and the outputs.
When we have a function from A to B,
we raise the # of inhabitants of B to the power of the # of inhabitants of A.
Related: currying, a foundation of lambda calculus: https://en.wikipedia.org/wiki/Currying
e.g.
How many values?

bool f(bool);  // 4
char f(bool);  // 256 * 256 = 65,536

enum class Foo {
    BAR,
    BAZ,
    QUUX
};
char f(Foo);   // 256 * 256 * 256 = 16,777,216

template <class T, class U>  // (# of values in U) ^ (# of values in T)
U f(T);
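
The exponent rule is easiest to see when the input type is small enough to tabulate. A sketch, assuming C++17: each function bool -> bool is written as a lookup table of its two outputs, and `ipow`, `BoolFn` and `kAllBoolFns` are names invented for this example.

```cpp
#include <array>
#include <cstddef>

// Integer power, used to compute |B| ^ |A|.
constexpr std::size_t ipow(std::size_t base, std::size_t exp) {
    std::size_t result = 1;
    for (std::size_t i = 0; i < exp; ++i) result *= base;
    return result;
}

// A function bool -> bool is fully described by its outputs for false and true.
using BoolFn = std::array<bool, 2>;  // {f(false), f(true)}

// All 2 ^ 2 = 4 such functions.
inline constexpr std::array<BoolFn, 4> kAllBoolFns{{
    {false, false},  // constant false
    {false, true},   // identity
    {true, false},   // logical not
    {true, true},    // constant true
}};

static_assert(kAllBoolFns.size() == ipow(2, 2));
static_assert(ipow(256, 2) == 65'536);      // char f(bool)
static_assert(ipow(256, 3) == 16'777'216);  // char f(Foo), Foo has 3 values
```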

These counting rules tell us when two types are equivalent:

e.g.

Equivalence:

template <typename T>  // (# of values in T) + (# of values in T)
struct Foo {
    std::variant<T, T> m_v;
};

template <typename T>  // (# of values in T) * 2
struct Bar {
    T m_t;
    bool m_b;
};
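
Equivalent here means we can convert back and forth without losing information. A minimal sketch of that round trip, assuming C++17; `to_bar` and `to_foo` are hypothetical helper names, and the mapping of index 0 to `false` is an arbitrary choice.

```cpp
#include <variant>

template <typename T>
struct Foo {
    std::variant<T, T> m_v;
};

template <typename T>
struct Bar {
    T m_t;
    bool m_b;
};

// Foo -> Bar: the index of the active alternative becomes the bool.
template <typename T>
constexpr Bar<T> to_bar(const Foo<T>& foo) {
    if (foo.m_v.index() == 0) return Bar<T>{std::get<0>(foo.m_v), false};
    return Bar<T>{std::get<1>(foo.m_v), true};
}

// Bar -> Foo: the bool selects which alternative to fill.
template <typename T>
constexpr Foo<T> to_foo(const Bar<T>& bar) {
    if (bar.m_b) return Foo<T>{std::variant<T, T>{std::in_place_index<1>, bar.m_t}};
    return Foo<T>{std::variant<T, T>{std::in_place_index<0>, bar.m_t}};
}

// Converting there and back again preserves both the value and the flag.
static_assert(to_bar(to_foo(Bar<int>{42, true})).m_t == 42);
static_assert(to_bar(to_foo(Bar<int>{42, true})).m_b);
```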

Algebraic Datatypes:

An algebra of datatypes gives us the ability to reason about the equality of types, so that we can:

find equivalent formulations that are
    more natural
    more easily understood
    more efficient
identify mismatches between state spaces and the types used to
implement them
eliminate illegal states by making them inexpressible

Making illegal states unrepresentable:

std::variant is a game changer because it allows us to express types (more) properly,
so that (more) illegal states are unrepresentable.

Let's use sum types (variant, optional) together with product types (structs):
e.g.
Old way:

enum class ConnectionState {
    DISCONNECTED,
    CONNECTING,
    CONNECTED,
    CONNECTION_INTERRUPTED
};

struct Connection {
    ConnectionState m_connectionState;
    std::string m_serverAddress;
    ConnectionId m_id;
    std::chrono::system_clock::time_point m_connectedTime;
    std::chrono::milliseconds m_lastPingTime;
    Timer m_reconnectTimer;
};

New way:

struct Connection {
    std::string m_serverAddress;

    struct Disconnected {};
    struct Connecting {};
    struct Connected {
        ConnectionId m_id;
        std::chrono::system_clock::time_point m_connectedTime;
        std::optional<std::chrono::milliseconds> m_lastPingTime;
    };

    struct ConnectionInterrupted {
        std::chrono::system_clock::time_point m_disconnectedTime;
        Timer m_reconnectTimer;
    };

    std::variant<Disconnected,
                 Connecting,
                 Connected,
                 ConnectionInterrupted> m_connection;
};
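
To see what the new layout buys us, here is a sketch of reading the state with std::visit, assuming C++17. `ConnectionId` and `Timer` are hypothetical stand-ins for types these notes leave undefined, and `describe` is a helper invented for the example.

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-ins for types that are not defined in these notes.
using ConnectionId = int;
struct Timer {};

struct Connection {
    std::string m_serverAddress;

    struct Disconnected {};
    struct Connecting {};
    struct Connected {
        ConnectionId m_id;
        std::chrono::system_clock::time_point m_connectedTime;
        std::optional<std::chrono::milliseconds> m_lastPingTime;
    };
    struct ConnectionInterrupted {
        std::chrono::system_clock::time_point m_disconnectedTime;
        Timer m_reconnectTimer;
    };

    std::variant<Disconnected, Connecting, Connected, ConnectionInterrupted> m_connection;
};

// Each branch can only reach the data that exists in its own state.
inline std::string describe(const Connection& c) {
    return std::visit(
        [&](const auto& state) -> std::string {
            using S = std::decay_t<decltype(state)>;
            if constexpr (std::is_same_v<S, Connection::Disconnected>) {
                return "disconnected from " + c.m_serverAddress;
            } else if constexpr (std::is_same_v<S, Connection::Connecting>) {
                return "connecting to " + c.m_serverAddress;
            } else if constexpr (std::is_same_v<S, Connection::Connected>) {
                return "connected, id " + std::to_string(state.m_id);
            } else {
                return "interrupted, waiting to reconnect";
            }
        },
        c.m_connection);
}

inline void describe_example() {
    Connection c{"example.org", Connection::Disconnected{}};
    assert(describe(c) == "disconnected from example.org");
    c.m_connection =
        Connection::Connected{7, std::chrono::system_clock::now(), std::nullopt};
    assert(describe(c) == "connected, id 7");
}
```

With the old struct, nothing stops code from reading m_id while disconnected. Here that access does not compile, because Disconnected has no m_id member.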

Old way:

class Friend {
    std::string m_alias;
    bool m_aliasPopulated;
    ...
};

New way:

class Friend {
    std::optional<std::string> m_alias;
    ...
};
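
A small usage sketch of the optional version, assuming C++17. The `m_name` member and the three member functions are hypothetical additions for the example; the original class elides its other contents.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

class Friend {
public:
    explicit Friend(std::string name) : m_name(std::move(name)) {}

    void set_alias(std::string alias) { m_alias = std::move(alias); }
    void clear_alias() { m_alias.reset(); }

    // Falls back to the name when no alias is set; there is no flag to keep in sync.
    std::string display_name() const { return m_alias.value_or(m_name); }

private:
    std::string m_name;                  // hypothetical, added for the example
    std::optional<std::string> m_alias;  // empty means "no alias"
};

inline void friend_example() {
    Friend f{"Westley"};
    assert(f.display_name() == "Westley");
    f.set_alias("Dread Pirate Roberts");
    assert(f.display_name() == "Dread Pirate Roberts");
}
```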

This gives modern C++ a new way to express some classic design patterns:

(For background on the patterns themselves, I recommend Robert Nystrom's Game Programming Patterns.)

Command
Composite
State
Interpreter

The addition of sum types to C++ offers an alternative formulation for some
design patterns.
State machines and expressions are naturally modeled with sum types.
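
As one illustration, the Command pattern can be written as a variant of plain structs instead of a class hierarchy. This is a sketch under assumptions, using C++17: `Position`, `MoveBy`, `Reset` and `execute` are invented for the example, and `overloaded` is the usual helper for building a visitor out of lambdas.

```cpp
#include <cassert>
#include <variant>

struct Position {
    int x = 0;
    int y = 0;
};

// Each command is a plain struct that carries only its own data.
struct MoveBy {
    int dx;
    int dy;
};
struct Reset {};

using Command = std::variant<MoveBy, Reset>;

// Helper that merges several lambdas into one visitor.
template <class... Fs>
struct overloaded : Fs... {
    using Fs::operator()...;
};
template <class... Fs>
overloaded(Fs...) -> overloaded<Fs...>;

// Applies one command and returns the new position.
inline Position execute(Position p, const Command& cmd) {
    return std::visit(
        overloaded{
            [&](const MoveBy& m) { return Position{p.x + m.dx, p.y + m.dy}; },
            [](const Reset&) { return Position{}; },
        },
        cmd);
}

inline void command_example() {
    Position p = execute(Position{1, 2}, MoveBy{3, 4});
    assert(p.x == 4 && p.y == 6);
    p = execute(p, Reset{});
    assert(p.x == 0 && p.y == 0);
}
```

If a new command is added to the variant and the visitor does not handle it, the call to std::visit fails to compile, which is the kind of check a virtual-function hierarchy cannot give for free.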

Designing with types:

std::variant and std::optional are valuable tools that allow us to model
the state of our business logic more accurately.
When you match the types to the domain accurately, certain categories of
tests just disappear. (Consider Data Oriented Design)

Fitting types to their function more accurately makes code easier to
understand and removes pitfalls.
The bigger the code-base and the more vital the functionality, the more
value there is in correct representation with types.

Using types to constrain behavior:

"Phantom types" is one technique that helps us to model the behavior of
our business logic in the type system. Illegal behavior becomes a type error.

e.g.

Old way:

std::string GetFormData();
std::string SanitizeFormData(const std::string&);
void ExecuteQuery(const std::string&);

New way:

template <typename T>
struct FormData {
    explicit FormData(const std::string& input) : m_input(input) {}
    std::string m_input;
};
struct sanitized {};
struct unsanitized {};

FormData<unsanitized> GetFormData();

std::optional<FormData<sanitized>>
SanitizeFormData(const FormData<unsanitized>&);

void ExecuteQuery(const FormData<sanitized>&);
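
Filling in the bodies shows where the compiler steps in. A minimal sketch, assuming C++17; the sanitizing policy (reject anything containing a quote or semicolon) and the fixed input "alice" are toy assumptions, and ExecuteQuery is deliberately a no-op.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

struct sanitized {};
struct unsanitized {};

// Tag is a phantom parameter: it never appears in the data, only in the type.
template <typename Tag>
struct FormData {
    explicit FormData(std::string input) : m_input(std::move(input)) {}
    std::string m_input;
};

// Stand-in for reading user input (hypothetical fixed value).
inline FormData<unsanitized> GetFormData() {
    return FormData<unsanitized>{"alice"};
}

// Toy policy, assumed for this sketch: reject input containing ' or ;
inline std::optional<FormData<sanitized>> SanitizeFormData(
    const FormData<unsanitized>& raw) {
    if (raw.m_input.find_first_of("';") != std::string::npos) return std::nullopt;
    return FormData<sanitized>{raw.m_input};
}

// A real implementation would talk to a database; this one only type-checks.
inline void ExecuteQuery(const FormData<sanitized>&) {}

// Passing unsanitized data is rejected at compile time.
static_assert(!std::is_invocable_v<decltype(&ExecuteQuery), const FormData<unsanitized>&>);
static_assert(std::is_invocable_v<decltype(&ExecuteQuery), const FormData<sanitized>&>);

inline void phantom_example() {
    auto clean = SanitizeFormData(GetFormData());
    assert(clean.has_value());
    ExecuteQuery(*clean);
    assert(!SanitizeFormData(FormData<unsanitized>{"x'; DROP TABLE users"}).has_value());
}
```

The only way to obtain a FormData<sanitized> through this interface is to go through SanitizeFormData, so forgetting the sanitizing step becomes a type error rather than a runtime bug.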

Total functions:

A total function is a function that is defined for all inputs in its domain.
Writing total functions with well-typed signatures can tell us a lot about functionality.
Using types appropriately makes interfaces unsurprising, safer to use and harder to misuse.
Total functions make more test categories vanish.
Effectively using types can reduce test code.

Name this function:

(having lambda calculus knowledge is essential to understand what's going on next)

template <typename T>
T f(T);
// identity
// int f(int);

template <typename T, typename U>
T f(pair<T, U>);
// first

template <typename T>
T f(bool, T, T);
// select

template <typename T, typename U>
U f(function<U(T)>, T);
// apply or call

template <typename T>
vector<T> f(vector<T>);
// reverse, shuffle, ...

template <typename T>
optional<T> f(vector<T>);

template <typename T, typename U>
vector<U> f(function<U(T)>, vector<T>);
// transform

template <typename T>
vector<T> f(function<bool(T)>, vector<T>);
// remove_if, partition, ...

template <typename K, typename V>
optional<V> f(map<K, V>, K);
// lookup

template <typename T>
T f(vector<T>);
// Not possible! It's a partial function - the vector might be empty.
// T& vector<T>::front();

template <typename T>
T f(optional<T>);
// Not possible!

template <typename K, typename V>
V f(map<K, V>, K);
// Not possible! (The key might not be in the map.)
// V& map<K, V>::operator[](const K&);
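
The partial signatures above become total once the failure case is part of the return type. A sketch, assuming C++17; `safe_front` and `lookup` are hypothetical names, not standard library functions.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <vector>

// Total version of "first element": an empty vector maps to std::nullopt.
template <typename T>
std::optional<T> safe_front(const std::vector<T>& v) {
    if (v.empty()) return std::nullopt;
    return v.front();
}

// Total lookup: a missing key maps to std::nullopt and the map is not modified.
template <typename K, typename V>
std::optional<V> lookup(const std::map<K, V>& m, const K& key) {
    const auto it = m.find(key);
    if (it == m.end()) return std::nullopt;
    return it->second;
}

inline void total_example() {
    assert(!safe_front(std::vector<int>{}).has_value());
    assert(safe_front(std::vector<int>{3, 1, 2}) == 3);
    const std::map<int, char> m{{1, 'a'}};
    assert(lookup(m, 1) == 'a');
    assert(!lookup(m, 2).has_value());
}
```

Callers now have to handle the empty case explicitly, so tests for "what happens on an empty vector" or "what happens on a missing key" no longer need to probe undefined behavior or silent insertion.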

Take away:

Make illegal states unrepresentable

Use std::variant and std::optional for formulations that are
    more natural
    a better fit for the business logic state

Use phantom types for safety
    Make illegal behavior a compile error

Write total functions
    Unsurprising behavior
    Easy to use, hard to misuse

Jan 5, 2019

[Design][Software Engineering][C++] Using/Design type effectively - Ben Deane@Blizzard