I am working on a single-producer, single-consumer ring buffer implementation. I have two requirements:
- Align a single heap allocated instance of a ring buffer to a cache line.
- Align a field within a ring buffer to a cache line (to prevent false sharing).
My class looks something like:
#define CACHE_LINE_SIZE 64 // To be used later.
template<typename T, uint64_t num_events>
class RingBuffer { // This needs to be aligned to a cache line.
public:
....
private:
std::atomic<int64_t> publisher_sequence_ ;
int64_t cached_consumer_sequence_;
T* events_;
std::atomic<int64_t> consumer_sequence_; // This needs to be aligned to a cache line.
};
Let me first tackle point 1, i.e. aligning a single heap-allocated instance of the class. There are a few ways:
Use the C++11 alignas(..) specifier:
template<typename T, uint64_t num_events>
class alignas(CACHE_LINE_SIZE) RingBuffer {
public:
....
private:
// All the private fields.
};
Use posix_memalign(..) + placement new without altering the class definition. This suffers from not being platform independent:
void* buffer;
if (posix_memalign(&buffer, 64, sizeof(processor::RingBuffer<int, kRingBufferSize>)) != 0) {
    perror("posix_memalign did not work!");
    abort();
}
// Use placement new on a cache aligned buffer.
auto ring_buffer = new(buffer) processor::RingBuffer<int, kRingBufferSize>();
Use the GCC/Clang extension __attribute__ ((aligned(#))):
template<typename T, uint64_t num_events>
class RingBuffer {
public:
....
private:
// All the private fields.
} __attribute__ ((aligned(CACHE_LINE_SIZE)));
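For the posix_memalign(..) + placement new route, the allocation is only half of the story: an object built this way also needs an explicit destructor call and a matching free(..). The following is a minimal sketch of that lifecycle, assuming a POSIX platform and a 64-byte cache line. Widget, kCacheLineSize, is_cache_aligned, make_aligned_widget and destroy_aligned_widget are hypothetical names used only for illustration; Widget stands in for the ring buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>        // placement new
#include <stdlib.h>   // posix_memalign, free (POSIX only)

constexpr std::size_t kCacheLineSize = 64;  // assumption: 64-byte cache lines

// Hypothetical stand-in for the ring buffer; only its size matters here.
struct Widget {
    std::int64_t a = 0;
    std::int64_t b = 0;
};

// Returns true if p is a multiple of kCacheLineSize.
inline bool is_cache_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLineSize == 0;
}

// Allocates cache-line-aligned storage and constructs a Widget in it.
// Returns nullptr if the allocation fails.
inline Widget* make_aligned_widget() {
    void* raw = nullptr;
    if (posix_memalign(&raw, kCacheLineSize, sizeof(Widget)) != 0) {
        return nullptr;
    }
    return new (raw) Widget();
}

// An object created with placement new needs an explicit destructor call.
// The storage came from posix_memalign, so it is released with free().
inline void destroy_aligned_widget(Widget* w) {
    if (w == nullptr) return;
    w->~Widget();
    free(w);
}
```

A caller would pair make_aligned_widget() with destroy_aligned_widget(..) rather than delete, since the memory was not obtained from operator new.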
I tried to use the C11 standardized aligned_alloc(..) function instead of posix_memalign(..), but GCC 4.8.1 on Ubuntu 12.04 could not find the definition in stdlib.h.
Are all of these guaranteed to do the same thing? My goal is cache-line alignment, so any method that has some limit on alignment (say, double word) will not do. Platform independence, which would point to using the standardized alignas(..), is a secondary goal.
I am not clear on whether alignas(..) and __attribute__((aligned(#))) have some limit which could be below the cache line size on the machine. I can't reproduce this any more, but while printing addresses I think I did not always get 64-byte-aligned addresses with alignas(..). By contrast, posix_memalign(..) seemed to always work. Again, I cannot reproduce this any more, so maybe I was making a mistake.
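One possible explanation for the occasional misaligned address: alignas(..) changes the alignment requirement of the type, but before C++17 a plain new expression was not required to honor alignments stricter than the default, so a heap instance could land on a 16-byte boundary. Stack and static instances are aligned by the compiler. The sketch below checks both cases; AlignedBuffer, kCacheLineSize, is_cache_aligned, heap_alignment_guaranteed and heap_instance_is_aligned are hypothetical names, and the 64-byte line size is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLineSize = 64;  // assumption: 64-byte cache lines

// Hypothetical stand-in for the ring buffer class.
struct alignas(kCacheLineSize) AlignedBuffer {
    std::int64_t head = 0;
    std::int64_t tail = 0;
};

// alignas raises both the alignment and (via padding) the size of the type.
static_assert(alignof(AlignedBuffer) == kCacheLineSize,
              "type alignment should equal the cache line size");
static_assert(sizeof(AlignedBuffer) % kCacheLineSize == 0,
              "size should be a multiple of the alignment");

// Returns true if p is a multiple of kCacheLineSize.
inline bool is_cache_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLineSize == 0;
}

// True when the compiler advertises C++17 aligned new, in which case a
// plain new expression must respect alignof(AlignedBuffer).
constexpr bool heap_alignment_guaranteed() {
#if defined(__cpp_aligned_new)
    return true;
#else
    return false;
#endif
}

// Allocates one instance with plain new and reports whether it happened
// to land on a cache-line boundary.
inline bool heap_instance_is_aligned() {
    AlignedBuffer* p = new AlignedBuffer();
    const bool ok = is_cache_aligned(p);
    delete p;
    return ok;
}
```

When heap_alignment_guaranteed() is false, heap_instance_is_aligned() may still return true by luck, which would be consistent with a problem that shows up only sometimes.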
The second aim is to align a field within a class/struct to a cache line. I am doing this to prevent false sharing. I have tried the following ways:
Use the C++11 alignas(..) specifier:
template<typename T, uint64_t num_events>
class RingBuffer { // This needs to be aligned to a cache line.
public:
...
private:
std::atomic<int64_t> publisher_sequence_ ;
int64_t cached_consumer_sequence_;
T* events_;
std::atomic<int64_t> consumer_sequence_ alignas(CACHE_LINE_SIZE);
};
Use the GCC/Clang extension __attribute__ ((aligned(#))):
template<typename T, uint64_t num_events>
class RingBuffer { // This needs to be aligned to a cache line.
public:
...
private:
std::atomic<int64_t> publisher_sequence_ ;
int64_t cached_consumer_sequence_;
T* events_;
std::atomic<int64_t> consumer_sequence_ __attribute__ ((aligned (CACHE_LINE_SIZE)));
};
Both of these methods seem to place consumer_sequence_ at an offset of 64 bytes from the beginning of the object, so whether consumer_sequence_ is cache-line aligned depends on whether the object itself is cache-line aligned. My question here is: are there any better ways to do the same?
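The dependence on the object's own alignment can be made explicit with compile-time checks. A member declared with alignas(..) also raises the alignment requirement of the enclosing type, so any storage that respects alignof of the type puts the member on a cache-line boundary. The sketch below assumes a 64-byte line; Sequences, kCacheLineSize and field_is_cache_aligned are hypothetical names, and the members are public only so that offsetof is applied to a standard-layout type.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLineSize = 64;  // assumption: 64-byte cache lines

// Hypothetical, simplified version of the ring buffer's sequence fields.
struct Sequences {
    // Written by the producer.
    alignas(kCacheLineSize) std::atomic<std::int64_t> publisher_sequence{0};
    std::int64_t cached_consumer_sequence = 0;
    // Written by the consumer; starts on its own cache line.
    alignas(kCacheLineSize) std::atomic<std::int64_t> consumer_sequence{0};
};

// The member's alignment propagates to the enclosing type.
static_assert(alignof(Sequences) == kCacheLineSize,
              "member alignas should raise the alignment of the struct");
// The offset is a multiple of the line size, so the field is aligned
// whenever the object itself is.
static_assert(offsetof(Sequences, consumer_sequence) % kCacheLineSize == 0,
              "consumer_sequence should start a new cache line");
// Trailing padding keeps the next array element from sharing the last line.
static_assert(sizeof(Sequences) % kCacheLineSize == 0,
              "size should be a multiple of the cache line size");

// Runtime check of a concrete instance.
inline bool field_is_cache_aligned(const Sequences& s) {
    const auto addr = reinterpret_cast<std::uintptr_t>(&s.consumer_sequence);
    return addr % kCacheLineSize == 0;
}
```

With this layout the remaining question is only how the object's storage is obtained: a stack or static instance is aligned by the compiler, while a heap instance needs an allocation path that honors the 64-byte requirement.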
EDIT:
The reason aligned_alloc did not work on my machine was that I was on eglibc 2.15 (Ubuntu 12.04). It worked on a later version of eglibc.
From the man page: The function aligned_alloc() was added to glibc in version 2.16.
This makes it pretty useless for me since I cannot require such a recent version of eglibc/glibc.