Any such books, courses, etc. would be much appreciated.
Multiflow still has some relevant ideas [2]
Programming on Parallel Machines: GPU, Multicore, Clusters and More. Gives you a look at some of the issues [3]
SPIRV-VM is a virtual machine for executing SPIR-V shaders [4]
NyuziRaster: Optimizing Rasterizer Performance and Energy in the Nyuzi Open Source GPU [5]
Ocelot is a modular dynamic compilation framework for heterogeneous systems, providing various backend targets for CUDA programs and analysis modules for the PTX virtual instruction set. [6]
glslang is the Khronos-reference front end for GLSL/ESSL, partial front end for HLSL, and a SPIR-V generator.
[1]: https://www.goodreads.com/book/show/83895.Computer_Organizat...
[2]: https://en.wikipedia.org/wiki/Multiflow
[3]: http://heather.cs.ucdavis.edu/parprocbook
[4]: https://github.com/dfranx/SPIRV-VM
[5]: https://www.cs.binghamton.edu/~millerti/nyuziraster.pdf
From there, hopefully you'll have the intuition to actually evaluate whether a given resource being recommended here is any good.
https://scholarworks.iu.edu/dspace/items/3ab772c9-92c9-4f59-...
This makes a rather spectacular mess of the tooling. Instead of localising the CUDA semantics in Clang, we scatter it throughout the entire compiler pipeline, where it does especially nasty things to register allocation and generally obstructs non-CUDA programming models. It's remarkably difficult to persuade GPU people that this is a bad thing.
Also, the GPU programming languages rely on very large compiler runtimes to paper over the CPU-host/GPU-target assumption, which also dates from long ago, so expect to find a lot of complexity spread across multiple libraries that act somewhat like compiler-rt. In principle those are optional, but in practice the compiler usually emits a lot of symbols that resolve to various vendor libraries.
Great book series on the subject of HPC. Not sure if it actually touches on GPUs. Great material anyway. BONUS: it's free!
Some of the stuff in this playlist might be relevant to you, though it is mostly about programming GPUs in a functional language that compiles to CUDA. The author (me) sometimes works on the language during the video, either fixing bugs or adding new features.
1. play around with the NVPTX LLVM backend and/or try compiling CUDA with Clang,
2. get familiar with the PTX ISA,
3. play around with ptxas + nvdisasm.
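
To make steps 1 and 3 concrete, here is a minimal sketch of a kernel small enough that the generated PTX and SASS stay readable. The file name `saxpy.cu` and the `sm_70` target are assumptions for illustration; pick the architecture that matches your GPU, and note that Clang needs to be able to find a CUDA toolkit install (pass `--cuda-path=` if it can't).

```cuda
// saxpy.cu (hypothetical file name): y = a*x + y, one element per thread.
// extern "C" keeps the symbol unmangled so it is easy to spot in the PTX.
extern "C" __global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the tail of the array
        y[i] = a * x[i] + y[i];
    }
}

// Step 1: device-only compile with Clang, emitting PTX as text
// (sm_70 is an assumed target):
//   clang++ -x cuda --cuda-device-only --cuda-gpu-arch=sm_70 -O2 -S saxpy.cu -o saxpy.ptx
//
// Step 3: assemble the PTX into a cubin, then disassemble it to SASS:
//   ptxas -arch=sm_70 saxpy.ptx -o saxpy.cubin
//   nvdisasm saxpy.cubin
```

Reading `saxpy.ptx` next to the PTX ISA document (step 2), and then comparing it with the `nvdisasm` output, is a reasonable way to see what ptxas does beyond what LLVM already did.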
SPIR-V is important in the compute shader space, especially because DXIL and Metal's AIR are similar. I'm going to link three articles critical of SPIR-V: [3], [4], [5].
WebGPU [6] is interesting for a number of reasons, largely because they're trying to actually nail down the semantics, and also make it safe (see the uniformity analysis [7] in particular, which is a very "compiler" approach to a GPU-specific problem). Both the Tint and naga projects are open source, with lots of high quality discussion in the issue trackers.
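
As a rough illustration of the kind of bug uniformity analysis is aimed at, here is a sketch in CUDA rather than WGSL. The kernel names and the block size of 64 threads are assumptions made up for the example. CUDA accepts both kernels; as far as I understand the WGSL rules, the equivalent of the first one (a workgroup barrier under a condition that depends on the invocation index) would be rejected at shader creation time.

```cuda
// Hypothetical kernels; both assume they are launched with 64 threads per block.

// Problematic: the barrier sits under a condition that depends on threadIdx.x,
// so only half of the block reaches it. This compiles, but the behaviour is
// undefined and it may hang.
__global__ void divergent_barrier(float *data) {
    __shared__ float tile[64];
    tile[threadIdx.x] = data[threadIdx.x];
    if (threadIdx.x < 32) {
        __syncthreads();  // not reached by every thread in the block
        data[threadIdx.x] = tile[threadIdx.x + 32];
    }
}

// Uniform: every thread reaches the barrier, and only the work after it
// is conditional.
__global__ void uniform_barrier(float *data) {
    __shared__ float tile[64];
    tile[threadIdx.x] = data[threadIdx.x];
    __syncthreads();      // reached by all 64 threads
    if (threadIdx.x < 32) {
        data[threadIdx.x] = tile[threadIdx.x + 32];
    }
}
```

The "compiler" part is that whether a condition is uniform across the workgroup gets decided by a static analysis over the program, instead of being left as a rule the programmer has to remember.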
Shader languages suck, and we really need a good one. Promising approaches are Circle [8] (which is C++ based, very advanced but not open source), and Slang [9] (an evolution of HLSL). The Vcc work (also related to [4]) is worth studying.
Best of luck! This is a fascinating, if frustrating, space, and there's lots of room to improve things.
[1]: https://www.gfxstrand.net/faith/blog/
[1a]: https://www.collabora.com/news-and-blog/blog/2024/04/25/re-c...
[2]: https://mastodon.gamedev.place/@gfxstrand
[3]: https://kvark.github.io/spirv/2021/05/01/spirv-horrors.html
[4]: https://xol.io/blah/the-trouble-with-spirv/
[5]: https://themaister.net/blog/2022/08/21/my-personal-hell-of-t...
[6]: https://github.com/gpuweb/gpuweb
[7]: https://www.w3.org/TR/2022/WD-WGSL-20220505/#uniformity-over...