allocPool - A simple & high performance object pool using modern C++

This is allocPool, an object pool in a single header file that lets you avoid expensive allocations at runtime. It preallocates objects in the constructor (using multiple threads), then offers two functions: getPtr() and returnPtr(ptr).

Using C++ concepts, the pool template requires the given class to have a default constructor and a .reset() function. reset() is used to clean an object before handing it to another caller, instead of reallocating it.

We avoid false sharing by keeping a large amount of work per thread, which should keep threads from writing to the same cache lines. In the event that cache lines would overlap, we use fewer threads with a bit more work per thread. This option is controlled by the bool enableFalseSharingMitigations parameter in the constructor. It is on by default and can be disabled if you have big objects, where writing pointers into the std::vector will not be the bottleneck and more threads will benefit performance.
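The README does not spell out how the thread count is chosen, so the following is only a sketch of the general idea under stated assumptions: a 64-byte cache line, a made-up minimum of 16 cache lines of pointers per thread, and the hypothetical names pickThreadCount, preallocate and Particle. Each thread fills its own contiguous slice of a presized vector, so threads can only touch the same cache line at slice boundaries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines (common on x86-64, not guaranteed everywhere).
constexpr std::size_t kCacheLine = 64;

// Hypothetical heuristic: with mitigation on, cap the thread count so that
// every thread writes at least 16 cache lines' worth of pointers.
inline std::size_t pickThreadCount(std::size_t objects, std::size_t hwThreads, bool mitigate) {
    hwThreads = std::max<std::size_t>(hwThreads, 1);
    if (objects == 0) return 1;
    std::size_t threads = std::min(hwThreads, objects);
    if (mitigate) {
        const std::size_t minPerThread = 16 * (kCacheLine / sizeof(void*));
        threads = std::min(threads, std::max<std::size_t>(objects / minPerThread, 1));
    }
    return threads;
}

// Constructs `objects` instances of T, one contiguous slice per thread.
template <typename T>
std::vector<std::unique_ptr<T>> preallocate(std::size_t objects, bool mitigate = true) {
    std::vector<std::unique_ptr<T>> out(objects);  // presized: no reallocation while threads run
    const std::size_t threads =
        pickThreadCount(objects, std::thread::hardware_concurrency(), mitigate);
    assert(threads >= 1);
    const std::size_t chunk = (objects + threads - 1) / threads;  // ceiling division
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < threads; ++t) {
        const std::size_t begin = std::min(t * chunk, objects);
        const std::size_t end = std::min(begin + chunk, objects);
        workers.emplace_back([&out, begin, end] {
            // Each thread writes only to its own elements, so no lock is needed.
            for (std::size_t i = begin; i < end; ++i) out[i] = std::make_unique<T>();
        });
    }
    for (auto& w : workers) w.join();
    return out;
}

// Hypothetical example type.
struct Particle {
    float x = 0.0f;
    void reset() { x = 0.0f; }
};
```

Calling preallocate<Particle>(10000) returns a fully constructed vector. Passing false for mitigate lets the sketch use every hardware thread, which mirrors the trade-off described above for large objects.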

The pool uses a hashmap and a pivot to make returnPtr(ptr) extremely fast. When saving the pointers, however, the main bottleneck is locking and unlocking the hashmap's mutex. This is necessary because a std::unordered_map cannot be written to concurrently, even at different hashes.
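One plausible reading of "a hashmap and a pivot" is sketched here; it is an illustration, not the header's actual code. Pointers live in a vector split by a pivot index into an in-use part and a free part, and the hashmap maps each pointer to its current position so that returnPtr can find and swap it in constant time. The names PivotPool, Counter and available() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

template <typename T>
class PivotPool {
public:
    explicit PivotPool(std::size_t n) {
        storage_.reserve(n);
        slots_.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            storage_.push_back(std::make_unique<T>());
            slots_.push_back(storage_.back().get());
            index_[slots_.back()] = i;
        }
    }

    // Hands out the object at the pivot; returns nullptr when none is free.
    T* getPtr() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pivot_ == slots_.size()) return nullptr;
        return slots_[pivot_++];
    }

    // O(1) on average: look up the slot, swap it with the last in-use slot, move the pivot.
    void returnPtr(T* ptr) {
        assert(ptr != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = index_.find(ptr);
        if (it == index_.end() || it->second >= pivot_) return;  // unknown or already free
        ptr->reset();  // cleaned under the lock, before anyone else can obtain it
        const std::size_t pos = it->second;
        const std::size_t last = --pivot_;
        T* other = slots_[last];
        std::swap(slots_[pos], slots_[last]);
        it->second = last;
        index_[other] = pos;  // key already exists, so no rehash occurs
    }

    std::size_t available() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return slots_.size() - pivot_;
    }

private:
    std::vector<std::unique_ptr<T>> storage_;     // owns the objects
    std::vector<T*> slots_;                       // [0, pivot_) in use, [pivot_, size) free
    std::unordered_map<T*, std::size_t> index_;   // pointer -> position in slots_
    std::size_t pivot_ = 0;
    mutable std::mutex mutex_;
};

// Hypothetical example type.
struct Counter {
    int value = 0;
    void reset() { value = 0; }
};
```

Every call takes the same mutex, which illustrates why the lock becomes the bottleneck once the lookup itself is cheap.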

The pool automatically grows to twice its size when its maximum capacity is reached.
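The doubling behaviour can be sketched as follows. This is a simplified, single-threaded illustration with hypothetical names (GrowingPool, Widget, capacity()); it uses a plain free list rather than the hashmap and pivot, and it is not the header's implementation. Objects are owned through std::unique_ptr, so pointers already handed out stay valid when the pool grows.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

template <typename T>
class GrowingPool {
public:
    explicit GrowingPool(std::size_t initial) { addObjects(initial); }

    T* getPtr() {
        if (free_.empty()) {
            // Exhausted: add as many objects as are currently owned, doubling the capacity.
            addObjects(owned_.empty() ? 1 : owned_.size());
        }
        T* ptr = free_.back();
        free_.pop_back();
        return ptr;
    }

    void returnPtr(T* ptr) {
        assert(ptr != nullptr);
        ptr->reset();  // clean before the next caller receives it
        free_.push_back(ptr);
    }

    std::size_t capacity() const { return owned_.size(); }

private:
    void addObjects(std::size_t count) {
        owned_.reserve(owned_.size() + count);
        for (std::size_t i = 0; i < count; ++i) {
            owned_.push_back(std::make_unique<T>());
            free_.push_back(owned_.back().get());
        }
    }

    std::vector<std::unique_ptr<T>> owned_;  // owns every object ever created
    std::vector<T*> free_;                   // objects currently available
};

// Hypothetical example type.
struct Widget {
    int id = 0;
    void reset() { id = 0; }
};
```

With an initial size of 2, the third getPtr() call grows the pool to a capacity of 4.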

Performance

With a simple stub class and a pool of 10000 objects, taking a pointer from the pool and giving it back for each element is significantly faster than allocating each object by hand.

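class stub {
public:
    // Simulate a non-trivial constructor with a small busy loop.
    stub() {
        for (int j{}; j < 1000; j++) { i++; }
    }
    // Required by the pool; there is nothing to clean in this stub.
    void reset() {}

private:
    int i = 15;
};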

On Linux:

Time (milliseconds) required for allocations without pool: 21
Time (milliseconds) required for allocations with pool: 3
Time (milliseconds) required for real allocations when constructing pool: 9

On Windows:

Time (milliseconds) required for allocations without pool: 62
Time (milliseconds) required for allocations with pool: 6
Time (milliseconds) required for real allocations when constructing pool: 51

This trivial example shows some performance improvement, which would be much larger if the allocation and construction/destruction of the objects were more complex.

When the allocator is very fast (such as glibc's on Linux), this approach may not be necessary, but with slower allocators (such as Windows' default), it is worth considering.
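The benchmark harness itself is not shown in this README, so the following is only a rough sketch of how the "without pool" figure could be measured with std::chrono. The helper names elapsedMs and timeWithoutPool are hypothetical. An optimizing compiler may simplify or remove such a trivial loop, so absolute numbers will differ from those above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>

// Same stub class as in the benchmark description.
class stub {
public:
    stub() {
        for (int j{}; j < 1000; j++) { i++; }
    }
    void reset() {}

private:
    int i = 15;
};

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
long long elapsedMs(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// Baseline: allocate, use and free `count` objects one by one.
inline long long timeWithoutPool(std::size_t count) {
    assert(count > 0);
    return elapsedMs([count] {
        for (std::size_t n = 0; n < count; ++n) {
            auto obj = std::make_unique<stub>();  // heap allocation + constructor
            obj->reset();
        }                                         // destructor + deallocation
    });
}
```

The pool-side measurement would pass a second callable to elapsedMs that calls getPtr() and returnPtr(ptr) in the same loop, with the pool constructed (and timed) separately.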

Safety

AddressSanitizer, LeakSanitizer and ThreadSanitizer have been used to ensure the safety of the class. Tests have been added to ensure the correct behavior in all cases.