That's really interesting. Only macOS instructions though? Seems like something that would easily run on Linux as well.
(I'd love to hook my server's GPU into the local LLM workloads on my main workstation, which currently fall back to the CPU whenever they need more VRAM than the workstation has.)