C++ templates vs function pointers for an AVX-agnostic system


I want to create a computational block that uses AVX when it is available. To do that, I want to write a template parameterized on the specific AVX level, compile all versions, and at runtime use the one supported by the local CPU. I plan for it to look something like this (for simplicity, assume that at least one AVX level is available):

    enum AvxT : int
    {
        AVX_512,
        AVX_256,
        AVX_128
    };

    template<AvxT E>
    class AvxCalc
    {
    public:
        using Cell = std::conditional_t<E == AvxT::AVX_128, __m128i,
                     std::conditional_t<E == AvxT::AVX_256, __m256i, __m512i>
        >;

        template<AvxT T = E>
        std::enable_if_t<T == AvxT::AVX_128, AvxCalc<E>>
        operator + (const AvxCalc<E> & other) const;

        // ... [the other AvxT declarations]

    private:
        Cell data;
    };

    template<AvxT E>
    template<AvxT T>
    std::enable_if_t<T == AvxT::AVX_128, AvxCalc<E>>
    AvxCalc<E>::operator+(const AvxCalc<E> & other) const
    {
        return AvxCalc<E>(_mm_add_epi32(data, other.data));
    }

    // ... [the other AvxT implementations]

Somewhere else there will be a router:

    if (haveAvx128()) // non constexpr router
    {
        AvxCalc<AvxT::AVX_128> calc;
        // ... [do calculations]
    }

The question is: if this is compiled on a machine that supports any of these AVX levels and then distributed, will it crash on CPUs that are not compatible with, say, AVX-512? Is this idea viable at all?
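For reference, the runtime routing described above could be sketched with a function pointer, as below. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes GCC or Clang on x86, where `__builtin_cpu_supports` and the per-function `target` attribute are available. The names `AddFn`, `addScalar`, `addAvx2`, and `selectAdd` are hypothetical and do not come from the code above. The point it illustrates is that a CPU without a given extension raises an illegal-instruction fault only if it actually executes such an instruction, so the wider instructions must stay confined to functions that are reached only after the runtime check.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: GCC/Clang on x86. On other targets only the scalar path is built.
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#endif

// Hypothetical kernel signature: element-wise out[i] = a[i] + b[i].
using AddFn = void (*)(const std::int32_t*, const std::int32_t*,
                       std::int32_t*, std::size_t);

// Portable fallback, compiled with baseline flags, safe on every CPU.
static void addScalar(const std::int32_t* a, const std::int32_t* b,
                      std::int32_t* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

#ifdef HAVE_X86_DISPATCH
// Only this function is compiled for AVX2, so AVX2 instructions cannot
// leak into code that runs before the CPU check.
__attribute__((target("avx2")))
static void addAvx2(const std::int32_t* a, const std::int32_t* b,
                    std::int32_t* out, std::size_t n)
{
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
        __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i),
                            _mm256_add_epi32(va, vb));
    }
    for (; i < n; ++i)  // leftover elements that do not fill a full vector
        out[i] = a[i] + b[i];
}
#endif

// Non-constexpr router: queries the CPU once and returns the matching kernel.
AddFn selectAdd()
{
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx2"))
        return addAvx2;
#endif
    return addScalar;
}
```

A caller would typically obtain the pointer once, for example `AddFn add = selectAdd();`, and then call `add(a, b, out, n)` in the hot path. The same pattern would extend to an AVX-512 kernel with its own `target` attribute and its own check, whereas building the whole translation unit with a flag such as `-mavx512f` lets the compiler emit those instructions anywhere, including outside the guarded branch.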
