They also miss that in CUDA's case it is an ecosystem.
Actually it is C, C++, Fortran, OpenACC and OpenMP, plus PTX support for Java, Haskell, Julia and C#, alongside the libraries, IDE tooling and GPU graphical debugging.
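To make that concrete (my own sketch, not something from the comments or the toolkit docs): the C++ part of that ecosystem is just ordinary C++ plus a couple of extensions, a __global__ qualifier for device functions and the <<<grid, block>>> launch syntax.

    // vector_add.cu -- minimal CUDA C++ sketch (illustrative only)
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void vector_add(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per element
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;
        float *a, *b, *c;
        cudaMallocManaged((void**)&a, n * sizeof(float));  // unified memory keeps the example short
        cudaMallocManaged((void**)&b, n * sizeof(float));
        cudaMallocManaged((void**)&c, n * sizeof(float));
        for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        vector_add<<<(n + 255) / 256, 256>>>(a, b, c, n);  // grid of 256-thread blocks
        cudaDeviceSynchronize();

        printf("c[0] = %f\n", c[0]);  // expect 3.0
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }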
Likewise Metal is plain C++14 plus extensions.
On the graphics side, HLSL dominates, followed by GLSL and now Slang. Then there are MSL, PSSL and whatever NVN uses.
By the way, at GTC NVIDIA announced going all in on Python JIT compilers for CUDA, with feature parity with the existing C++ tooling. There is also a new IR for array programming, Tile IR.
This is supposed to be used in place of CUDA, HIP, Metal, Vulkan, OpenGL, etc. It targets the hardware directly, so it doesn't need their support as such.
The site also seems to clearly state it's a work in progress. It's just an interesting blog post...
I am a complete noob with GPUs, but is AMDGCN the older generation, with the newer one being RDNA? If you generate a binary for AMDGCN, will it run on the newest cards?
Also, I thought that these GPU ISAs were "proprietary". I wonder how reliable the binary generation can be.
AMD's ISAs change with almost every generation, so LLVM[1] keeps the architecture name "amdgcn" and handles the variation through the model flag (e.g., -mcpu=gfx1030 for RDNA2, -mcpu=gfx1100 for RDNA3).
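Roughly what that looks like in practice (my own sketch; the file name and build line are assumptions, but --offload-arch and gcnArchName are standard HIP): the same HIP C++ source is compiled into a fat binary with one code object per gfx target, and the runtime loads whichever one matches the installed card.

    // gfx_probe.cpp -- illustrative HIP C++ sketch (not from the comments above)
    // build sketch, flags assumed: hipcc --offload-arch=gfx1030 --offload-arch=gfx1100 gfx_probe.cpp -o gfx_probe
    //   gfx1030 = RDNA2, gfx1100 = RDNA3; the fat binary carries one code object per listed target.
    #include <hip/hip_runtime.h>
    #include <cstdio>

    __global__ void fill(float* out, float value) {
        out[threadIdx.x] = value;  // trivial device code so there is something to offload
    }

    int main() {
        hipDeviceProp_t prop;
        hipGetDeviceProperties(&prop, 0);
        // gcnArchName is the same gfxNNNN string the compiler flags name, e.g. "gfx1100"
        printf("device 0: %s (%s)\n", prop.name, prop.gcnArchName);

        float* out;
        hipMalloc((void**)&out, 64 * sizeof(float));
        fill<<<1, 64>>>(out, 1.0f);   // launch 64 threads in one block
        hipDeviceSynchronize();
        hipFree(out);
        return 0;
    }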
> I thought that these GPU ISAs were "proprietary"
The PTX spec[2] is publicly available, but the actual hardware assembly (SASS) is not. Although I believe Nsight allows you to view it.
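A rough sketch of how you can look at both layers yourself (the kernel, file names and exact invocations are my own example, using the standard nvcc and cuobjdump tools):

    // scale.cu -- tiny kernel used only to illustrate inspecting PTX vs SASS
    __global__ void scale(float* data, float factor) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // no bounds check, illustration only
        data[i] *= factor;
    }

    // Inspection steps (invocations assumed from the standard CUDA tools):
    //   nvcc -arch=sm_86 -ptx scale.cu -o scale.ptx      # documented virtual ISA, human-readable
    //   nvcc -arch=sm_86 -cubin scale.cu -o scale.cubin
    //   cuobjdump -sass scale.cubin                      # undocumented hardware ISA, view-only
    // Nsight Compute shows the same SASS interleaved with source when profiling a real app.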
If LLVM can target AMD GPUs, what exactly prevents AMD and ROCm from supporting all the damn GPUs?
At this point I'm convinced that the real problem with AMD GPUs isn't necessarily the compilers (although they do produce mediocre code) or even the hardware itself, but some crappy C++ driver code that can't handle running graphics and compute at the same time. The datacenter GPUs never had to run graphics in the first place, so they are safe.
In my experience, the compiler, compute drivers, and HIP runtime work fine for all modern AMD GPUs. The only parts of the stack that don't run on all GPUs are the math and AI libraries. And that is mostly because AMD isn't building and testing those libraries for unsupported GPUs. The actual work required to enable functional support was straightforward enough that I ported them myself when packaging the libraries for Debian. Though, I had a lot of help on the testing.
From my understanding, Vulkan and OpenGL are nice, but the true performance lies in the vendor-specific toolkits (i.e., CUDA and Metal).
Wrapping the vendor-provided frameworks is liable to break, and that isn't tenable for someone who wants to do this on a professional basis.
1. https://llvm.org/docs/AMDGPUUsage.html#processors
2. https://docs.nvidia.com/cuda/parallel-thread-execution
See the Debian Trixie Supported GPU list: https://salsa.debian.org/rocm-team/community/team-project/-/...
I say surprisingly, because I'd expect Rust support to be more mature than Zig's.