This is not a question about how to write a CUDA kernel for Torch: I have done that, and the kernel is confirmed to work as expected.
My problem is with compilation. After making the kernel call one of the methods of my Torch object, I run into compilation problems.
    // in "MyModule.h" header
    class CMyModule : public torch::nn::Module {
        void DoSomethingWithoutUsingKernel() const;
        void CallMyKernel() const;
    };

    // in regular "MyModule.cpp" file
    void CMyModule::DoSomethingWithoutUsingKernel() const {
        // this compiles fine
    }

    // in CUDA "MyModule.cu" file
    #define WIN32_LEAN_AND_MEAN
    #include <windows.h>
    #undef min
    #undef max
    #include "torch/torch.h"
    #include "MyModule.h"
    #include <cuda.h>
    #include <cuda_runtime_api.h>

    static __global__ void MyKernel() {
    }

    void CMyModule::CallMyKernel() const {
        MyKernel<<< 1, 1 >>>();
    }

My regular C++ compiler is set this way (with the additional compiler option /Zc:__cplusplus, which forces the __cplusplus macro to report the value required by the Standard):


The CUDA compiler is given only the extra parameter --std c++20; otherwise it inherits its settings from the regular C++ compiler (if I understand correctly):

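Because the two compilers are configured separately, it may help to check which language standard each one actually reports. The following is a minimal diagnostic sketch, not part of the project above: the helper names `ReportedCppStandard` and `PrintReportedCppStandard` are hypothetical, and the idea is to compile the same code once from a `.cpp` file and once from a `.cu` file and compare the printed values.

```cpp
#include <cstdio>

// Hypothetical helper: returns the value of __cplusplus as seen by
// whichever compiler builds this translation unit.
long ReportedCppStandard() {
    return static_cast<long>(__cplusplus);
}

// Hypothetical helper: prints the value with a label so that the output
// of the .cpp build and the .cu build can be told apart.
void PrintReportedCppStandard(const char* label) {
    std::printf("%s: __cplusplus = %ld\n", label, ReportedCppStandard());
}
```

Calling `PrintReportedCppStandard("cpp")` from the regular file and `PrintReportedCppStandard("cu")` from the CUDA file shows whether both builds agree. MSVC reports 199711L unless /Zc:__cplusplus is in effect, while the value the Standard requires for C++20 is 202002L.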
How I have to compile:
1. I run the Build → Build Solution command.
2. I get the following error:

3. I open the project configuration dialog:

4. I do NOT make any changes. I just close the dialog by clicking OK straight away.
5. I run the Build → Build Solution command again.
6. This time the build miraculously succeeds and the application works as desired.

I have to open the configuration dialog in Step 3, otherwise the compilation fails.
So my question is obvious - how do I correctly configure the project so that it is compiled in just one go without opening+closing the configuration dialog? (Or is this a bug in Visual Studio 2022?)
Here I'm attaching the full compiler output (sorry for the external link, I don't know how to upload it to StackOverflow as a file).
Of course, I could modify the code so that LibTorch and the CUDA runtime never meet in the same source file, as below; in that case the project compiles in a single run without opening the configuration dialog. But out of curiosity, I'd like to know what I'm doing wrong in the code above.
    // in "MyModule.h" header
    void MyModule_CallMyKernel(); // announce there is such function elsewhere

    class CMyModule : public torch::nn::Module {
        void DoSomethingWithoutUsingKernel() const;
    };

    // in regular "MyModule.cpp" file
    void CMyModule::DoSomethingWithoutUsingKernel() const {
        // this compiles fine
    }

    // in CUDA "MyModule.cu" file
    #define WIN32_LEAN_AND_MEAN
    #include <windows.h>
    #undef min
    #undef max
    #include <cuda.h>
    #include <cuda_runtime_api.h>

    static __global__ void MyKernel() {
    }

    void MyModule_CallMyKernel() {
        MyKernel<<< 1, 1 >>>();
    }
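With this split, the member function could presumably still be kept for callers by defining it in the regular .cpp file and forwarding to the free function. Below is a minimal single-file sketch of that forwarding pattern, under stated assumptions: it uses neither LibTorch nor CUDA, the kernel launch is replaced by a counter, and `g_launchCount` and `LaunchCount` are hypothetical names introduced only for the demonstration.

```cpp
// --- would live in "MyModule.h" ---
void MyModule_CallMyKernel();   // defined in the .cu file; no Torch types in its signature

class CMyModule /* : public torch::nn::Module in the real project */ {
public:
    void CallMyKernel() const;  // member kept so that callers stay unchanged
};

// --- would live in "MyModule.cpp" (host compiler only, free to include torch/torch.h) ---
void CMyModule::CallMyKernel() const {
    MyModule_CallMyKernel();    // forward to the free function
}

// --- would live in "MyModule.cu" (nvcc, no Torch headers) ---
static int g_launchCount = 0;   // hypothetical stand-in for the effect of the kernel

void MyModule_CallMyKernel() {
    // real project: MyKernel<<< 1, 1 >>>();
    ++g_launchCount;
}

// Hypothetical accessor used only to observe the stand-in.
int LaunchCount() {
    return g_launchCount;
}
```

The point of the pattern is that only the free function crosses the boundary between the two compilers, so the .cu file never has to see the class definition or any Torch header.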