Recently, I encountered a situation where I needed to contribute to Triton without access to any GPU.
The only device I had was my M3 Mac. After some experimentation, I found that Triton has a well-designed abstraction layer, so you don’t need a GPU if you only want to work at the compiler level (i.e., MLIR and LLVM).
The method is already in the codebase, but it isn’t officially documented. I hope this guide saves someone else some time.