allocPool - A simple & high performance object pool using modern C++
This is allocPool, a single-header object pool that lets you avoid expensive
allocations at runtime. It preallocates objects in the constructor (using
threads), then offers two functions: getPtr() and returnPtr(ptr).
Using C++ concepts, the template requires the given class to have a default
constructor and a .reset() function. reset() is used to clean objects before
handing them to another caller, instead of reallocating them.
We avoid false sharing by keeping a high amount of work per thread, which should
keep cache lines from being shared between threads. In the event that cache
lines would overlap, we use fewer threads with a bit more work per thread.
This option is enabled with the bool enableFalseSharingMitigations parameter in
the constructor. It is on by default and can be disabled if you have big
objects, where writing pointers into the std::vector will not be the bottleneck
and more threads will improve performance.
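One way to picture the trade-off is a helper that caps the thread count so every thread fills at least one cache line of pointer slots. This is a sketch under assumptions, not allocPool's actual logic: pickThreadCount and kCacheLineBytes are hypothetical names, and the 64-byte line size is only a common default.

```cpp
#include <algorithm>
#include <cstddef>

// Assumed cache-line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t kCacheLineBytes = 64;

// Hypothetical helper, not part of allocPool's API.
// Returns how many threads to use to fill `objectCount` pointer slots.
constexpr std::size_t pickThreadCount(std::size_t objectCount,
                                      std::size_t hardwareThreads,
                                      bool enableFalseSharingMitigations) {
    hardwareThreads = std::max<std::size_t>(hardwareThreads, 1);
    if (objectCount == 0) {
        return 1;
    }
    if (!enableFalseSharingMitigations) {
        // Use as many threads as the hardware offers, at most one per object.
        return std::min(hardwareThreads, objectCount);
    }
    // Number of pointers that fit in one cache line (8 on a typical 64-bit target).
    constexpr std::size_t slotsPerLine = kCacheLineBytes / sizeof(void*);
    // Cap the thread count so each thread gets at least one full line of slots.
    const std::size_t maxThreads =
        std::max<std::size_t>(objectCount / slotsPerLine, 1);
    return std::min(hardwareThreads, maxThreads);
}
```

This only reduces sharing. Chunk boundaries can still straddle a line unless each chunk is a multiple of slotsPerLine and the vector's storage is line-aligned.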
While this pool uses a hashmap and a pivot to make returnPtr(ptr) extremely
fast, the main bottleneck when saving the pointers is locking and unlocking the
hashmap's mutex. This is needed because a std::unordered_map cannot be written
to concurrently, even at different hashes.
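To illustrate how a hashmap and a pivot can work together, here is a minimal sketch. It is an assumption about the general technique, not a copy of allocPool's source: PivotPool, Widget, inUse() and the single shared mutex are all hypothetical, construction is single-threaded, and a full pool returns nullptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical pooled type used only for this example.
struct Widget {
    int value = 0;
    void reset() { value = 0; }
};

// Illustrative only: names and layout are assumptions, not allocPool's source.
template <typename T>
class PivotPool {
public:
    explicit PivotPool(std::size_t capacity) {
        slots_.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i) {
            slots_.push_back(std::make_unique<T>());
            index_[slots_.back().get()] = i;  // remember where each object lives
        }
    }

    // Slots before the pivot are in use; the slot at the pivot is handed out next.
    T* getPtr() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pivot_ == slots_.size()) {
            return nullptr;  // pool exhausted in this sketch
        }
        return slots_[pivot_++].get();
    }

    // Finds the slot through the hashmap, then swaps it with the last in-use slot.
    void returnPtr(T* ptr) {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto it = index_.find(ptr);
        if (it == index_.end() || it->second >= pivot_) {
            return;  // unknown pointer, or already returned
        }
        assert(pivot_ > 0);
        ptr->reset();
        const std::size_t pos = it->second;
        const std::size_t last = pivot_ - 1;
        std::swap(slots_[pos], slots_[last]);
        index_[slots_[pos].get()] = pos;
        index_[slots_[last].get()] = last;
        --pivot_;  // the returned object now sits at the pivot
    }

    std::size_t inUse() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pivot_;
    }

private:
    std::vector<std::unique_ptr<T>> slots_;
    std::unordered_map<T*, std::size_t> index_;
    std::size_t pivot_ = 0;
    mutable std::mutex mutex_;
};
```

The hashmap lookup takes constant time on average and the swap is constant, so returning a pointer never requires scanning the vector.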
The pool automatically grows to twice its size when its maximum capacity is reached.
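A doubling step might look like the sketch below. It assumes the pool stores individually heap-allocated objects behind a vector of owning pointers, which the description above does not confirm. growToDouble is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: doubles the number of preallocated objects.
// An empty pool grows to a single object.
template <typename T>
void growToDouble(std::vector<std::unique_ptr<T>>& slots) {
    const std::size_t oldSize = slots.size();
    const std::size_t newSize = (oldSize == 0) ? 1 : oldSize * 2;
    slots.reserve(newSize);
    for (std::size_t i = oldSize; i < newSize; ++i) {
        slots.push_back(std::make_unique<T>());
    }
    assert(slots.size() == newSize);
}
```

Under that assumption, pointers already handed out stay valid during growth, because only the vector of pointers is reallocated, not the objects themselves.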
Performance
With a simple stub class and a pool of 10000 objects, taking a pointer from the pool and giving it back for each element is significantly faster than allocating and freeing each object by hand.
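class stub {
public:
    // Simulates a small amount of construction work.
    stub() {
        for (int j{}; j < 1000; j++) { i++; }
    }
    // Required by the pool; nothing to clean up here.
    void reset() {}
private:
    int i = 15;
};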
On Linux:
Time (milliseconds) required for allocations without pool: 21
Time (milliseconds) required for allocations with pool: 3
Time (milliseconds) required for real allocations when constructing pool: 9
On Windows:
Time (milliseconds) required for allocations without pool: 62
Time (milliseconds) required for allocations with pool: 6
Time (milliseconds) required for real allocations when constructing pool: 51
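For reference, a "without pool" figure like the ones above could be measured with a harness along these lines. This is a hypothetical sketch, not the benchmark that produced the numbers: timeRawAllocationsMs is an invented name, and the exact allocation pattern of the original test is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Same shape as the stub class shown earlier.
class stub {
public:
    stub() {
        for (int j{}; j < 1000; j++) { i++; }
    }
    void reset() {}
private:
    int i = 15;
};

// Hypothetical harness: allocates and frees `count` objects without a pool
// and returns the elapsed wall-clock time in milliseconds.
inline long long timeRawAllocationsMs(std::size_t count) {
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::unique_ptr<stub>> objects;
    objects.reserve(count);
    for (std::size_t n = 0; n < count; ++n) {
        objects.push_back(std::make_unique<stub>());  // one heap allocation each
    }
    assert(objects.size() == count);
    objects.clear();  // frees every object
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Results will vary with the compiler, optimization level, and allocator. An optimizing compiler may fold the constructor's loop into a single addition.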
This trivial example shows some performance improvement; the gain would be much larger if allocating, constructing, and destroying the objects were more expensive.
When the allocator is very fast (such as glibc's on Linux), this approach may not be necessary, but with a slow allocator (such as the Windows default), it is worth considering.
Safety
AddressSanitizer, LeakSanitizer and ThreadSanitizer have been used to ensure the safety of the class. Tests have been added to ensure the correct behavior in all cases.