Several simple examples for popular neural network toolkits calling custom CUDA operators
Neural Network CUDA Example

Several simple examples for neural network toolkits (PyTorch, TensorFlow, etc.) calling custom CUDA operators.

We provide several ways to compile the CUDA kernels and their C++ wrappers: JIT, setuptools, and CMake. We also provide several Python scripts that call the CUDA kernels, covering kernel time statistics and model training. For more accurate timing, run the code under nvprof or nsys.

Environments

NVIDIA Driver: 418.116.00
CUDA: 11.0
Python: 3.7.3
PyTorch: 1.7.0+cu110
TensorFlow: […]
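The kernel time statistics mentioned above could be collected along the following lines. This is only a minimal sketch, not the repository's own script: the helper name `time_kernel` and its parameters `sync`, `warmup`, and `repeats` are hypothetical, and the operator being timed is a CPU stand-in so the snippet runs without a GPU. GPU launches are asynchronous, so a synchronization call before and after each timed run is assumed to be needed for the wall-clock numbers to mean anything.

```python
import time
from statistics import mean


def time_kernel(fn, sync=None, warmup=3, repeats=10):
    """Return a list of per-call wall-clock times in microseconds.

    fn      -- zero-argument callable that launches the operator (hypothetical)
    sync    -- optional callable that blocks until the device is idle;
               defaults to a no-op for CPU-only code
    warmup  -- untimed calls that absorb one-off costs such as JIT compilation
    repeats -- number of timed calls
    """
    if sync is None:
        sync = lambda: None

    # Warm-up runs are not timed.
    for _ in range(warmup):
        fn()
    sync()

    times_us = []
    for _ in range(repeats):
        sync()                      # make sure earlier work has finished
        start = time.perf_counter()
        fn()
        sync()                      # wait for the launched work to complete
        times_us.append((time.perf_counter() - start) * 1e6)
    return times_us


if __name__ == "__main__":
    # CPU stand-in for a custom operator: element-wise add of two lists.
    a = list(range(1000))
    b = list(range(1000))
    times = time_kernel(lambda: [x + y for x, y in zip(a, b)])
    print(f"mean time over {len(times)} runs: {mean(times):.1f} us")
```

With PyTorch, one would presumably pass `torch.cuda.synchronize` as `sync` and a lambda that calls the custom operator as `fn`. Host-side timers still include launch overhead, which is why a profiler such as nvprof or nsys remains the more accurate option.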