All the calculations could be done beforehand and stored, so the only thing left for the delayed draw is to set the buffer.
I haven't looked at the code yet, so I'm not sure how much, if anything, it would save.
You could also group pixels that are far away from each other into a single call. It's a compromise, but I think it would maintain the effect.
No?
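To make the idea concrete, here is a rough sketch in Python. I haven't seen the actual code, so everything in it is an assumption: the names precompute, strided_batches and draw_batch are made up, the buffer is assumed to be a flat row-major byte array, and "far away" is approximated by taking every Nth pixel in buffer order.

```python
from typing import Callable, List


def precompute(width: int, height: int,
               shade: Callable[[int, int], int]) -> List[int]:
    """Do all the per-pixel calculations once, up front, and store them."""
    return [shade(x, y) for y in range(height) for x in range(width)]


def strided_batches(pixel_count: int, batch_count: int) -> List[List[int]]:
    """Group buffer indices so each batch holds pixels spread across the buffer.

    Batch k holds indices k, k + batch_count, k + 2 * batch_count, ...
    """
    return [list(range(k, pixel_count, batch_count))
            for k in range(batch_count)]


def draw_batch(buffer: bytearray, values: List[int], batch: List[int]) -> None:
    """The delayed draw: no calculation left, it only sets the buffer."""
    for i in batch:
        buffer[i] = values[i]


# Hypothetical 4x2 image with a placeholder shading function.
width, height = 4, 2
values = precompute(width, height, lambda x, y: (x * 10 + y) % 256)
buffer = bytearray(width * height)

# Each batch would be one delayed call; here they just run back to back.
for batch in strided_batches(len(values), 4):
    draw_batch(buffer, values, batch)
```

With 4 batches instead of 8 single-pixel calls, the number of delayed calls is halved. Whether the effect still looks right depends on how the batches are chosen, so the stride here is just a starting point.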
Anyone can run an AI model, even on very weak hardware; there are plenty of small open models for this.
Training an AI requires very strong hardware, but that is not an impossible hurdle, as the models on Hugging Face show.