Making non-CUDA code compatible with CUDA by using std functions where cuda::std equivalents exist


I'm attempting to write a function that bit-casts a float/double/float16 to an unsigned integer of the same size. In host code this is straightforward:

template<std::floating_point T>
[[nodiscard]] constexpr auto bit_cast_unsigned(T t) {
    if constexpr (sizeof(T) == 1) { return std::bit_cast<std::uint8_t>(t); }
    if constexpr (sizeof(T) == 2) { return std::bit_cast<std::uint16_t>(t); }
    if constexpr (sizeof(T) == 4) { return std::bit_cast<std::uint32_t>(t); }
    if constexpr (sizeof(T) == 8) { return std::bit_cast<std::uint64_t>(t); }
}

However, I also want to support this in device code, and the function has to work at runtime, not just at compile time, so I have to do the following:

template<std::floating_point T>
[[nodiscard]] __host__ __device__ constexpr auto bit_cast_unsigned(T t) {
    if constexpr (sizeof(T) == 1) { return cuda::std::bit_cast<std::uint8_t>(t); }
    if constexpr (sizeof(T) == 2) { return cuda::std::bit_cast<std::uint16_t>(t); }
    if constexpr (sizeof(T) == 4) { return cuda::std::bit_cast<std::uint32_t>(t); }
    if constexpr (sizeof(T) == 8) { return cuda::std::bit_cast<std::uint64_t>(t); }
}

The problem is that this doesn't compile in a codebase that isn't using CUDA, and I don't want to make CUDA a requirement for this function to exist. But I still want to use the cuda::std:: equivalents when they are available.

Is there a way to do something like:

#if defined(CUDA_ENABLED)
    #include <cuda/bit>
    #define PROJECT_HOST_DEVICE_STD_NAMESPACE cuda::std
    #define PROJECT_HOST_DEVICE __host__ __device__
#else
    #include <bit>
    #define PROJECT_HOST_DEVICE_STD_NAMESPACE std
    #define PROJECT_HOST_DEVICE
#endif

...

template<std::floating_point T>
[[nodiscard]] PROJECT_HOST_DEVICE constexpr auto bit_cast_unsigned(T t) {
    if constexpr (sizeof(T) == 1) { return PROJECT_HOST_DEVICE_STD_NAMESPACE::bit_cast<std::uint8_t>(t); }
    if constexpr (sizeof(T) == 2) { return PROJECT_HOST_DEVICE_STD_NAMESPACE::bit_cast<std::uint16_t>(t); }
    if constexpr (sizeof(T) == 4) { return PROJECT_HOST_DEVICE_STD_NAMESPACE::bit_cast<std::uint32_t>(t); }
    if constexpr (sizeof(T) == 8) { return PROJECT_HOST_DEVICE_STD_NAMESPACE::bit_cast<std::uint64_t>(t); }
}

I've seen posts like "CUDA and nvcc: using the preprocessor to choose between float or double" which suggest keying on __CUDACC__. However, I'm worried I'll run into linker issues or similar if one translation unit compiles bit_cast_unsigned in a .cpp file without CUDA enabled (e.g. pulled in transitively through vcpkg), while another piece of code, either in a CUDA-enabled library or a final CUDA-enabled executable, uses the same bit_cast_unsigned with CUDA enabled.

Can I use __CUDACC__ like this without issue across CUDA and non-CUDA codebases (which may even be mixed together, as in vcpkg), or is there another solution?
