Ataraxia through Epoché: [Cppcon 2016] High Performance Code 201: Hybrid Data Structures

std::vector's problem, no SSO(small size optimization)

Why? Because STD says when move a vector, it's iterator can't be invalidated.

That implies std::vector's iterator as pointer points to heap memory.

However, SSO vector when moves, it copies, and invalidates the iterators.

Domain specific data structure has it's purpose of efficiency.
(Less corner cases, easier to design.)

Besides domain specific data structure,

why not just using customized allocator with std::vector?

Thus SSO for vector becomes:

template<typename T, int N>
using SmallVector = std::vector<T, short_alloc<T, N>>;

void fun(){
    SmallVector<int, 4>::allocator_type::arena_type a;
    SmallVector<int> v{a};
}

But...
It doesn't work well with interface boundary,
i.e due to std::vector has allocator as type argument.

void jump(SmallVector<int, 4> &v);

void fun(){
    // Taking short_alloc<int, 8>
    SmallVector<int, 8>::allocator_type::arena_type a;
    SmallVector<int> v{a};
    jump(v);  // doesn't work... callee taking short_alloc<int, 4>
}

Another issue, callee's return type could reference to memory on callee's stack..
i.e

SmallVector<int, 4> fun(){
    SmallVector<int, 4>::allocator_type::arena_type a;
    SmallVector<int> v{a};
    return v; // BAD
}

All of all, the SmallVector loses it's value semantics.
Which is IMPORTANT.

With Domain Specific Type, these issues solved.
SmallVector type in Clang:

https://github.com/llvm-mirror/clang/blob/f3b7928366f63b51ffc97e74f8afcff497c57e8d/test/SemaCXX/invalid-instantiated-field-decl.cpp#L14

template <typename T, unsigned N>
class SmallVector : public SmallVectorImpl<T> {
  typedef typename SmallVectorImpl<T>::U U; // expected-error {{no type named 'U' in 'SmallVectorImpl<CallSite>'}}
  enum {

    MinUs = (static_cast<unsigned int>(sizeof(T))*N + // expected-error {{invalid application of 'sizeof' to an incomplete type 'CallSite'}}
             static_cast<unsigned int>(sizeof(U)) - 1) /
            static_cast<unsigned int>(sizeof(U)),
    NumInlineEltsElts = MinUs
  };
  U InlineElts[NumInlineEltsElts];
public:
  SmallVector() : SmallVectorImpl<T>(NumInlineEltsElts) {
  }

};

Small-size optimization is best when the values are small.
(in C++, copy by value is the default mechanism, although not being well/widely known...)

Design:

Give large objects address identity.
i.e Use object's memory address as identity avoids object's content equality test.

SmallVector<std::unique_ptr<BigObject>, 4> Objects;

// FAST
class BumpPtrAllocator {
    constexpr int SlabSize = 4096;
    SmallVector<void *, 4> Slabs;
    void *CurPtr, *End;

public:
    void *Allocate(int Size) P
        if (Size >= (End - CurPtr)) {
            CurPtr = malloc(SlabSize);
            End = CurPtr + SlabSize;
            Slabs.push_back(CurPtr);
        }

        void *Ptr = CurPtr;
        CurPtr += Size;
        return Ptr;
    }
    // ...
};

If pointers are too large, use an index.
Aggressively pack the bits.
PointerIntPair:
http://llvm.org/doxygen/classllvm_1_1PointerIntPair.html
PointerEmbeddedInt:
http://llvm.org/doxygen/classllvm_1_1PointerEmbeddedInt.html
TinyPtrVector:
http://llvm.org/doxygen/classllvm_1_1TinyPtrVector.html
Thus, we have SmallMutiMap:

template<typename KeyT, typename ValueT>
using SmallMultiMap = SmallDenseMap<KeyT, TinyPtrVector<ValueT>>;

Use bitfields everywhere.
Sometimes, we need an ordering.
i.e comparison operator.
Where possible, sort the vector.
i.e gives you a linear BST, works well with CPU pre-fetching.
What if there's no intrinsic ordering?
We have SmallSetVector:
(Has a set, has a vector, when insert, check data in set, and insert into vector.)
http://llvm.org/doxygen/classllvm_1_1SmallSetVector.html

Ataraxia through Epoché

Nov 14, 2018

[Cppcon 2016] High Performance Code 201: Hybrid Data Structures - Chandler Carruth

No comments:

Post a Comment