Jan 5, 2019

[Design][Software Engineering][C++] Using/Designing types effectively - Ben Deane@Blizzard

"On the whole, I'm inclined to say that when in doubt, make a new type."
                                                                 – Martin Fowler, When to Make a Type
"Don't set a flag; set the data."
                                                                 – Leo Brodie, Thinking Forth



I consider this talk an essential guide to designing abstractions with types.
Go's multiple-return-value idiom can be approached with the same concept: consider the returned values as a whole and model them as a sum type.


Types as sets of values:

A type, like a function in mathematics, defines a domain of values.
If two types have value domains of the same size, we can consider them equivalent
(but not equal).
Algebraically, a type is characterized by the number of values that inhabit it.

e.g.
How many values?
bool;  // 2, true, false
char;  // 256
void;  // 0
struct Foo {};  // 1
enum FireSwampDangers : int8_t {   // 3
    FLAME_SPURTS,
    LIGHTNING_SAND,
    ROUSES
};

template <typename T> // as many values as T
struct Foo {
    T m_t;
};
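
For the built-in integral types, the counts above can be checked at compile time. This is a minimal sketch; `inhabitants` is a hypothetical helper that does not appear in the talk, and the 256 assumes 8-bit bytes.

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the talk): counts the values of an
// integral type as max - min + 1.
template <typename T>
constexpr long long inhabitants() {
    return static_cast<long long>(std::numeric_limits<T>::max()) -
           static_cast<long long>(std::numeric_limits<T>::min()) + 1;
}

static_assert(CHAR_BIT == 8, "the 256 below assumes 8-bit bytes");
static_assert(inhabitants<bool>() == 2, "true and false");
static_assert(inhabitants<char>() == 256, "one value per bit pattern");
// FireSwampDangers names 3 values, but its int8_t underlying type has
// room for 256.
static_assert(inhabitants<std::int8_t>() == 256, "int8_t");
```

The last assertion hints at a mismatch the later sections care about: the storage type can hold more values than the domain actually has.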


Aggregating Types:

When two types are "concatenated" into one compound type,
we multiply the # of inhabitants of the components.
This kind of compounding gives us a product type.
e.g.
How many values?
std::pair<char, bool>;  // 256 * 2

struct Foo {  // 256 * 2
    char a;
    bool b;
};

std::tuple<bool, bool, bool>;  // 2 * 2 * 2 = 8

template <typename T, typename U>  // (# of values in T) * (# of values in U)
struct Foo {
    T m_t;
    U m_u;
};
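
One way to convince yourself of the multiplication rule is to enumerate every value of a small product type and count the distinct results. A sketch, where `Danger` and `count_product_values` are names made up for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <tuple>

enum class Danger { FlameSpurts, LightningSand, Rouses };  // 3 values

// Builds every value of the product type tuple<bool, Danger> and
// returns how many distinct values there are.
inline std::size_t count_product_values() {
    std::set<std::tuple<bool, Danger>> seen;
    for (bool b : {false, true}) {
        for (Danger d : {Danger::FlameSpurts, Danger::LightningSand,
                         Danger::Rouses}) {
            seen.insert(std::make_tuple(b, d));
        }
    }
    return seen.size();  // 2 * 3
}
```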


Alternating Types:

When two types are "alternated" into one compound type,
we add the # of inhabitants of the components.
This kind of compounding gives us a sum type.
e.g.
How many values?
std::optional<char>;  // 256 + 1
std::variant<char, bool>;  // 256 + 2

template <typename T, typename U>  // (# of values in T) + (# of values in U)
struct Foo {
    std::variant<T, U> m_v;
};
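
The addition rule can be checked the same way, by listing every value of a small sum type. This is only a sketch: the function names are hypothetical, and `std::monostate` stands in for a type with exactly one value.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <set>
#include <utility>
#include <variant>

// optional<bool>: the two bool values plus "nothing".
inline std::size_t count_optional_bool() {
    std::set<std::optional<bool>> seen{std::nullopt, false, true};
    return seen.size();  // 2 + 1
}

// variant<bool, monostate>: two bool values plus the single monostate value.
inline std::size_t count_variant_bool_unit() {
    using V = std::variant<bool, std::monostate>;
    std::set<V> seen{V{false}, V{true}, V{std::monostate{}}};
    return seen.size();  // 2 + 1
}

// variant<bool, bool>: the index keeps the two alternatives apart,
// so the same bool in a different alternative is a different value.
inline std::size_t count_variant_bool_bool() {
    using V = std::variant<bool, bool>;
    std::set<V> seen{
        V{std::in_place_index<0>, false}, V{std::in_place_index<0>, true},
        V{std::in_place_index<1>, false}, V{std::in_place_index<1>, true}};
    return seen.size();  // 2 + 2
}
```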


Function Types:

The number of values of a function is the number of different ways we can draw arrows between the inputs and the outputs.
When we have a function from A to B,
we raise the # of inhabitants of B to the power of the # of inhabitants of A.
Currying, a foundation of lambda calculus: https://en.wikipedia.org/wiki/Currying
e.g.
How many values?
bool f(bool);  // 4
char f(bool);  // 256 * 256 = 65,536

enum class Foo {
    BAR,
    BAZ,
    QUUX
};
char f(Foo);   // 256 * 256 * 256 = 16,777,216

template <class T, class U>  // (# of values in U) ^ (# of values in T)
U f(T);
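
The "4" for bool f(bool) can be made concrete by writing each function down as a table of outputs. A sketch that only counts pure, total functions (side effects are ignored); `BoolToBool`, `all_bool_to_bool` and `evaluate` are hypothetical names:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <set>

// A total function bool -> bool is fully described by its two outputs:
// table[0] is f(false), table[1] is f(true).
using BoolToBool = std::array<bool, 2>;

inline std::set<BoolToBool> all_bool_to_bool() {
    std::set<BoolToBool> functions;
    for (bool on_false : {false, true}) {
        for (bool on_true : {false, true}) {
            functions.insert(BoolToBool{on_false, on_true});
        }
    }
    // constant-false, identity, negation, constant-true
    return functions;
}

// Looks up the output of the tabulated function f for input x.
inline bool evaluate(const BoolToBool& f, bool x) { return f[x ? 1 : 0]; }
```

Each of the 2 inputs independently picks one of 2 outputs, which is where 2 ^ 2 comes from.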


The definitions above show us how to recognize equivalent types:

e.g.
Equivalence:
template <typename T>  // T + T = 2 * T
struct Foo {
    std::variant<T, T> m_v;
};
template <typename T>  // T * 2
struct Bar {
    T m_t;
    bool m_b;
};
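
Equivalent here means that values can be converted back and forth without losing information. A sketch of that round trip, using aliases instead of the Foo/Bar structs; `Sum`, `Product`, `to_product` and `to_sum` are names invented for this example, and a valueless variant is not considered:

```cpp
#include <cassert>
#include <utility>
#include <variant>

template <typename T>
using Sum = std::variant<T, T>;      // T + T
template <typename T>
using Product = std::pair<T, bool>;  // T * 2

// Sum -> Product: the variant index becomes the bool.
template <typename T>
Product<T> to_product(const Sum<T>& s) {
    return s.index() == 0 ? Product<T>{std::get<0>(s), false}
                          : Product<T>{std::get<1>(s), true};
}

// Product -> Sum: the bool picks the alternative.
template <typename T>
Sum<T> to_sum(const Product<T>& p) {
    return p.second ? Sum<T>{std::in_place_index<1>, p.first}
                    : Sum<T>{std::in_place_index<0>, p.first};
}
```

Converting in either direction and back yields the starting value, so the two formulations carry the same information and we are free to pick the more natural one.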


Algebraic Datatypes:

  • the ability to reason about equality of types
  • to find equivalent formulations
    • more natural
    • more easily understood
    • more efficient
  • to identify mismatches between state spaces and the types used to
    implement them
  • to eliminate illegal states by making them inexpressible


Making illegal states unrepresentable:

std::variant is a game changer because it allows us to express types (more) properly,
so that (more) illegal states are unrepresentable.

Let's use sum types (variant, optional) as well as product types (structs):
e.g.
Old way:
enum class ConnectionState {
    DISCONNECTED,
    CONNECTING,
    CONNECTED,
    CONNECTION_INTERRUPTED
};

struct Connection {
    ConnectionState m_connectionState;
    std::string m_serverAddress;
    ConnectionId m_id;
    std::chrono::system_clock::time_point m_connectedTime;
    std::chrono::milliseconds m_lastPingTime;
    Timer m_reconnectTimer;
};

New way:
struct Connection {
    std::string m_serverAddress;

    struct Disconnected {};
    struct Connecting {};
    struct Connected {
        ConnectionId m_id;
        std::chrono::system_clock::time_point m_connectedTime;
        std::optional<std::chrono::milliseconds> m_lastPingTime;
    };

    struct ConnectionInterrupted {
        std::chrono::system_clock::time_point m_disconnectedTime;
        Timer m_reconnectTimer;
    };

    std::variant<Disconnected,
                 Connecting,
                 Connected,
                 ConnectionInterrupted> m_connection;
};
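
With the variant formulation, code that reads the state has to say which alternative it expects. A sketch of how that might look with std::visit; `ConnectionId` and `Timer` are placeholders because the post does not define them, and `overloaded`, `current_id` and `state_name` are hypothetical helpers:

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <string>
#include <variant>

// Placeholders: the post does not define these two types.
using ConnectionId = int;
struct Timer {};

struct Connection {
    std::string m_serverAddress;

    struct Disconnected {};
    struct Connecting {};
    struct Connected {
        ConnectionId m_id;
        std::chrono::system_clock::time_point m_connectedTime;
        std::optional<std::chrono::milliseconds> m_lastPingTime;
    };
    struct ConnectionInterrupted {
        std::chrono::system_clock::time_point m_disconnectedTime;
        Timer m_reconnectTimer;
    };

    std::variant<Disconnected, Connecting, Connected, ConnectionInterrupted>
        m_connection;
};

// Common C++17 idiom for building one visitor out of several lambdas.
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

// The id exists only while connected, so the result is optional.
inline std::optional<ConnectionId> current_id(const Connection& c) {
    return std::visit(
        overloaded{
            [](const Connection::Connected& s) -> std::optional<ConnectionId> {
                return s.m_id;
            },
            [](const auto&) -> std::optional<ConnectionId> {
                return std::nullopt;
            }},
        c.m_connection);
}

// Every alternative must be handled, or the call does not compile.
inline const char* state_name(const Connection& c) {
    return std::visit(
        overloaded{
            [](const Connection::Disconnected&) { return "disconnected"; },
            [](const Connection::Connecting&) { return "connecting"; },
            [](const Connection::Connected&) { return "connected"; },
            [](const Connection::ConnectionInterrupted&) { return "interrupted"; }},
        c.m_connection);
}
```

In the old struct, nothing stopped code from reading m_id while disconnected. Here there is no m_id to read unless the variant holds Connected.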

Old way:
class Friend {
    std::string m_alias;
    bool m_aliasPopulated;
    ...
};

New way:
class Friend {
    std::optional<std::string> m_alias;
    ...
};
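
A filled-in sketch of the optional version. The post elides the rest of the class, so `m_accountName` and all member functions here are assumptions made for illustration only:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

class Friend {
public:
    explicit Friend(std::string accountName)
        : m_accountName(std::move(accountName)) {}

    void SetAlias(std::string alias) { m_alias = std::move(alias); }
    void ClearAlias() { m_alias.reset(); }
    bool HasAlias() const { return m_alias.has_value(); }

    // Falls back to the account name when no alias is set.
    std::string DisplayName() const { return m_alias.value_or(m_accountName); }

private:
    std::string m_accountName;           // hypothetical member
    std::optional<std::string> m_alias;  // "no alias" is its own state
};
```

With the bool flag, m_alias and m_aliasPopulated could disagree. With optional, an empty alias and a missing alias are two distinct, well-defined states.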


Thus, sum types give modern C++ a new way to formulate some design patterns:

  • Command
  • Composite
  • State
  • Interpreter
The addition of sum types to C++ offers an alternative formulation for these
patterns.
State machines and expressions are naturally modeled with sum types.
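
As one illustration, the Command pattern might be sketched with a variant instead of a class hierarchy. The command set (`Insert`, `Erase`, `Clear`) and the `Executor` visitor are invented for this example and are not taken from the talk:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>

// Each command is a plain struct carrying only the data it needs.
struct Insert { std::string text; };
struct Erase { std::size_t count; };  // erase up to count trailing characters
struct Clear {};

using Command = std::variant<Insert, Erase, Clear>;

// One overload per command replaces one virtual execute() per subclass.
struct Executor {
    std::string& buffer;

    void operator()(const Insert& c) const { buffer += c.text; }
    void operator()(const Erase& c) const {
        const std::size_t n = std::min<std::size_t>(c.count, buffer.size());
        buffer.erase(buffer.size() - n);
    }
    void operator()(const Clear&) const { buffer.clear(); }
};

inline void execute(std::string& buffer, const Command& cmd) {
    std::visit(Executor{buffer}, cmd);
}
```

Adding a new command to the variant without adding an overload to Executor is a compile error, which is the kind of check a base-class hierarchy does not give for free.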


Designing with types:

std::variant and std::optional are valuable tools that allow us to model
the state of our business logic more accurately.
When you match the types to the domain accurately, certain categories of
tests just disappear. (Consider Data Oriented Design)

Fitting types to their function more accurately makes code easier to
understand and removes pitfalls.
The bigger the code-base and the more vital the functionality, the more
value there is in correct representation with types.


Using types to constrain behavior:

"Phantom types" is one technique that helps us to model the behavior of
our business logic in the type system. Illegal behavior becomes a type error.
e.g.
Old way:
std::string GetFormData();
std::string SanitizeFormData(const std::string&);
void ExecuteQuery(const std::string&);

New way:
template <typename T>
struct FormData {
    explicit FormData(const std::string& input) : m_input(input) {}
    std::string m_input;
};
struct sanitized {};
struct unsanitized {};

FormData<unsanitized> GetFormData();

std::optional<FormData<sanitized>>
SanitizeFormData(const FormData<unsanitized>&);

void ExecuteQuery(const FormData<sanitized>&);
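
A compilable sketch of the phantom-type version. The function bodies are stand-ins: the input string and the "letters and digits only" rule are assumptions for the example, not the sanitization the talk has in mind.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <string>
#include <type_traits>

struct sanitized {};
struct unsanitized {};

// T is a phantom parameter: it never appears in the data, only in the type.
template <typename T>
struct FormData {
    explicit FormData(const std::string& input) : m_input(input) {}
    std::string m_input;
};

// Stand-in for reading user input (hypothetical fixed value).
inline FormData<unsanitized> GetFormData() {
    return FormData<unsanitized>{"alice42"};
}

// Hypothetical rule: accept only ASCII letters and digits.
inline std::optional<FormData<sanitized>>
SanitizeFormData(const FormData<unsanitized>& raw) {
    for (unsigned char ch : raw.m_input) {
        if (!std::isalnum(ch)) return std::nullopt;
    }
    return FormData<sanitized>{raw.m_input};
}

// Placeholder body: a real implementation would run the query.
inline void ExecuteQuery(const FormData<sanitized>& query) { (void)query; }

// Passing unsanitized data to ExecuteQuery is rejected by the compiler.
static_assert(!std::is_invocable_v<decltype(&ExecuteQuery),
                                   const FormData<unsanitized>&>,
              "unsanitized data must not reach the query");
```

In this sketch anyone can still construct a FormData<sanitized> directly. Restricting that constructor to the sanitizer would close the gap, at the cost of a little more code.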


Total functions:

A total function is a function that is defined for all inputs in its domain.
Writing total functions with well-typed signatures can tell us a lot about functionality.
Using types appropriately makes interfaces unsurprising, safer to use and harder to misuse.
Total functions make more test categories vanish.
Effectively using types can reduce test code.
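
A small sketch of turning a partial function into a total one. `head` is a hypothetical name; std::vector itself offers only front(), which has undefined behavior on an empty vector.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Total: defined for every vector, including the empty one.
template <typename T>
std::optional<T> head(const std::vector<T>& v) {
    if (v.empty()) return std::nullopt;
    return v.front();
}
```

The empty case is now part of the signature, so a caller cannot forget it, and there is no "called front() on an empty vector" test left to write.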


Name this function:

(Some knowledge of lambda calculus helps in understanding what follows.)
template <typename T>
T f(T);
// identity
// int f(int);

template <typename T, typename U>
T f(pair<T, U>);
// first

template <typename T>
T f(bool, T, T);
// select

template <typename T, typename U>
U f(function<U(T)>, T);
// apply or call

template <typename T>
vector<T> f(vector<T>);
// reverse, shuffle, ...

template <typename T>
optional<T> f(vector<T>);

template <typename T, typename U>
vector<U> f(function<U(T)>, vector<T>);
// transform

template <typename T>
vector<T> f(function<bool(T)>, vector<T>);
// remove_if, partition, ...

template <typename K, typename V>
optional<V> f(map<K, V>, K);
// lookup

template <typename T>
T f(vector<T>);
// Not possible! It's a partial function - the vector might be empty.
// T& vector<T>::front();

template <typename T>
T f(optional<T>);
// Not possible!

template <typename K, typename V>
V f(map<K, V>, K);
// Not possible! (The key might not be in the map.)
// V& map<K, V>::operator[](const K&);
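
For a few of the signatures above, the type leaves so little freedom that the obvious implementation is close to the only reasonable one. A sketch of three of them, placed in a hypothetical `sketch` namespace to avoid clashing with existing names:

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

namespace sketch {

// T f(pair<T, U>): the only T available is the one inside the pair.
template <typename T, typename U>
T first(std::pair<T, U> p) { return p.first; }

// T f(bool, T, T): the bool can only choose between the two T values.
template <typename T>
T select(bool condition, T if_true, T if_false) {
    return condition ? if_true : if_false;
}

// optional<V> f(map<K, V>, K): total, because a missing key maps to nullopt.
template <typename K, typename V>
std::optional<V> lookup(const std::map<K, V>& m, const K& key) {
    const auto it = m.find(key);
    if (it == m.end()) return std::nullopt;
    return it->second;
}

}  // namespace sketch
```

Which argument `select` returns for true is a convention the type cannot express; the order used here is an assumption.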


Takeaways:

  • Make illegal states unrepresentable
  • Use std::variant and std::optional for formulations that are
    • more natural
    • fit the business logic state better
  • Use phantom types for safety
    • Make illegal behavior a compile error
  • Write total functions
    • Unsurprising behavior
    • Easy to use, hard to misuse


