A simple & high performance object pool using modern C++
Timothée Leclaire-Fournier 3c5c9b4a97 allocPool: Try to avoid false sharing.
A benchmark with a pool size of 50 objects gives these values for the initArray() function, with and without the false sharing mitigations:
- With mitigations: 5508 microseconds
- Without mitigations: 9075 microseconds
2024-03-11 19:44:38 -04:00
.idea allocPool: First commit 2024-03-01 14:21:52 -05:00
.clang-format allocPool: First commit 2024-03-01 14:21:52 -05:00
.gitignore allocPool: Fix threading issue and add benchmark. 2024-03-01 16:30:51 -05:00
allocPool.hpp allocPool: Try to avoid false sharing. 2024-03-11 19:44:38 -04:00
CMakeLists.txt Meta: Turn on asan 2024-03-01 17:20:08 -05:00
main.cpp allocPool: Try to avoid false sharing. 2024-03-11 19:44:38 -04:00
README.md allocPool: Fix compilation on Windows 2024-03-03 14:40:55 -05:00
tests.cpp allocPool: Fix threading issue and add benchmark. 2024-03-01 16:30:51 -05:00
tests.hpp allocPool: Fix threading issue and add benchmark. 2024-03-01 16:30:51 -05:00

allocPool - A simple & high performance object pool using modern C++

This is allocPool, an object pool in a single header file that allows you to avoid expensive allocations at runtime. It preallocates objects in the constructor (using threads), then offers two functions: getPtr() and returnPtr(ptr).

Using C++ concepts, the template requires the given class to have a default constructor and a .reset() function. The pool uses .reset() to clean objects before giving them to another caller.

We avoid false sharing by giving each thread a large amount of work, which should keep cache lines from being shared between threads. While this pool uses a hashmap and a pivot to make returnPtr(ptr) extremely fast, the main bottleneck during construction is locking and unlocking the hashmap's mutex. This locking is needed because a std::unordered_map cannot be written to concurrently, even at different hashes.
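The construction strategy described above can be sketched as follows. This is not the code from allocPool.hpp: `Item`, `Built`, and `buildInParallel` are hypothetical names, and taking the mutex once per chunk is an assumption of this sketch. Each worker fills one contiguous range, so neighbouring threads can only touch the same cache line at chunk boundaries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical pooled type.
struct Item {
    int value = 0;
    void reset() { value = 0; }
};

// Hypothetical result type: the objects plus a pointer-to-position map.
struct Built {
    std::vector<std::unique_ptr<Item>> objects;
    std::unordered_map<Item*, std::size_t> index;
};

inline Built buildInParallel(std::size_t count, std::size_t threadCount) {
    Built out;
    out.objects.resize(count);  // sized up front; never resized while threads run
    std::mutex indexMutex;      // std::unordered_map is not safe for concurrent writes

    threadCount = std::max<std::size_t>(1, std::min(threadCount, count));
    const std::size_t chunk = (count + threadCount - 1) / threadCount;

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < threadCount; ++t) {
        const std::size_t begin = std::min(count, t * chunk);
        const std::size_t end = std::min(count, begin + chunk);
        workers.emplace_back([&out, &indexMutex, begin, end] {
            // Distinct vector elements can be written without a lock.
            for (std::size_t i = begin; i < end; ++i) {
                out.objects[i] = std::make_unique<Item>();
            }
            // Assumption of this sketch: lock once per chunk, not per object.
            std::lock_guard<std::mutex> lock(indexMutex);
            for (std::size_t i = begin; i < end; ++i) {
                out.index.emplace(out.objects[i].get(), i);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    assert(out.index.size() == count);
    return out;
}
```

With larger chunks, fewer boundary elements are shared between threads and the mutex is taken less often, at the cost of coarser load balancing.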

The pool grows automatically when its maximum capacity is reached, though growing carries a performance penalty.
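To make the pivot, the hashmap, and the growth behaviour concrete, here is a single-threaded sketch of one possible layout. It is an assumption for illustration, not the implementation in allocPool.hpp: `SketchPool`, `Counter`, the doubling growth policy, and the helper accessors are all hypothetical. Only getPtr() and returnPtr(ptr) mirror names from the README.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical pooled type with the required reset().
struct Counter {
    int value = 0;
    void reset() { value = 0; }
};

template <typename T>
class SketchPool {
public:
    explicit SketchPool(std::size_t count) { grow(count); }

    T* getPtr() {
        // Pool exhausted: allocate more objects (doubling is an assumption).
        if (pivot_ == slots_.size()) {
            grow(slots_.empty() ? 1 : slots_.size());
        }
        return slots_[pivot_++];
    }

    void returnPtr(T* ptr) {
        auto it = index_.find(ptr);
        // Ignore pointers the pool does not own or that are already free.
        if (it == index_.end() || it->second >= pivot_) {
            return;
        }
        ptr->reset();
        // Swap the returned slot with the last in-use slot, then move the
        // pivot back by one. No search is needed thanks to the hashmap.
        const std::size_t pos = it->second;
        const std::size_t last = pivot_ - 1;
        std::swap(slots_[pos], slots_[last]);
        index_[slots_[pos]] = pos;
        index_[slots_[last]] = last;
        --pivot_;
    }

    std::size_t available() const { return slots_.size() - pivot_; }
    std::size_t capacity() const { return slots_.size(); }

private:
    void grow(std::size_t extra) {
        for (std::size_t i = 0; i < extra; ++i) {
            storage_.push_back(std::make_unique<T>());
            T* raw = storage_.back().get();
            index_[raw] = slots_.size();
            slots_.push_back(raw);  // new objects land in the free region
        }
    }

    std::vector<std::unique_ptr<T>> storage_;    // owns the objects
    std::vector<T*> slots_;                      // [0, pivot_) in use, rest free
    std::unordered_map<T*, std::size_t> index_;  // pointer -> position in slots_
    std::size_t pivot_ = 0;
};

// Usage: take an object, give it back, then exhaust the pool to force growth.
inline void demoRoundTrip() {
    SketchPool<Counter> pool(2);
    Counter* first = pool.getPtr();
    first->value = 7;
    pool.returnPtr(first);
    assert(first->value == 0);  // reset() ran on return
    assert(pool.available() == 2);
}
```

In this layout returnPtr(ptr) costs one hash lookup, one swap, and two map updates on average, while growth pays for fresh allocations at the moment the pool runs dry.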

Performance

With a simple stub class and a pool of 10000 objects, taking a pointer from the pool and giving it back for each element is significantly faster than allocating each object by hand.

class stub {
public:
    stub() {
        // Busy loop that simulates a non-trivial constructor.
        for (int j{}; j < 1000; j++) { i++; }
    }
    void reset() {}

private:
    int i = 15;
};
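For reference, the "without pool" case could be timed with a harness along these lines. This is a sketch and not the code in main.cpp or tests.cpp: `elapsedMs` and `timeWithoutPool` are hypothetical names, and an optimizing compiler may simplify the constructor loop, so absolute numbers will vary.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>

// Same shape as the stub class used in this README.
class stub {
public:
    stub() {
        for (int j{}; j < 1000; j++) { i++; }
    }
    void reset() {}

private:
    int i = 15;
};

// Hypothetical helper: runs fn once and returns the elapsed milliseconds.
template <typename Fn>
long long elapsedMs(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start);  // steady_clock never goes backwards
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// Hypothetical benchmark: allocate and free `count` objects one at a time.
inline long long timeWithoutPool(std::size_t count) {
    return elapsedMs([count] {
        for (std::size_t n = 0; n < count; ++n) {
            auto obj = std::make_unique<stub>();  // heap allocation + constructor
            obj->reset();
        }  // obj is destroyed and freed at the end of each iteration
    });
}
```

Calling `timeWithoutPool(10000)` gives a figure comparable in spirit to the first line of each result set; the pooled case would wrap getPtr() and returnPtr(ptr) calls in the same `elapsedMs` helper.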

On Linux:

Time (milliseconds) required for allocations without pool: 21
Time (milliseconds) required for allocations with pool: 3
Time (milliseconds) required for real allocations when constructing pool: 9

On Windows:

Time (milliseconds) required for allocations without pool: 62
Time (milliseconds) required for allocations with pool: 6
Time (milliseconds) required for real allocations when constructing pool: 51

This trivial example shows some performance improvement; the gain would be much larger if the allocation, construction, or destruction of the objects were more complex.

When the allocator is very fast (such as glibc's on Linux), this approach may not be necessary, but with slow allocators (such as Windows' default), it may be worth considering.

Safety

AddressSanitizer, LeakSanitizer and ThreadSanitizer have been used to ensure the safety of the class. Tests have been added to ensure the correct behavior in all cases.