> You seem to be moving the goalposts. First it was using in production, now it's building foundation models.
I'm not moving the goalposts, but maybe I wasn't clear. Corporations adopting fine-tuning and inference isn't especially relevant to H100 sales, which are the main cash cow (~$10B of revenue at 80-90% margins) and the driver of Nvidia's massive market-cap growth. What is relevant is corporations like Inflection AI building 22k-H100 clusters.
I work in academia, where PyTorch is more common, but are people in industry who fine-tune LLMs actually working directly with CUDA enough for it to be a big moat?
Companies are using both A100 and H100 for inference. The datacenter numbers include both, and I don't believe they break it down further. And no, by and large, enterprises are not building out big clusters themselves, but they account for much of the demand behind all those cloud-provider build-outs.
No, almost no one is using CUDA directly. But if you are a vendor with a new equivalent, integrating it with a framework like PyTorch is no small feat; there's a lot of work required from various parties. A rough sketch of typical user-level PyTorch code is below.
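To make the shape of the moat concrete, here's a minimal sketch of what everyday PyTorch code looks like. Nothing in it is CUDA-specific; the lock-in lives below this layer, in the kernels and dispatcher integration a competing vendor has to supply before their device string can slot in where `"cuda"` does:

```python
import torch
import torch.nn as nn

# User code targets PyTorch's device abstraction, not CUDA itself.
# Swapping hardware is a one-line change, provided the vendor has
# done the backend work to make their device show up here.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
y = model(x)  # dispatched to vendor kernels (cuDNN/cuBLAS on Nvidia)
print(y.shape, y.device)
```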
It would be interesting to see the breakdown of H100/A100 users. I would expect most inference users are like my lab and max out at a single DGX node, rather than running the large clusters that make up the bulk of the spend.
PyTorch has gotten a lot better on TPUs this year; I don't believe there's much of a performance hit now. JAX and TF (I don't use the latter anymore) of course work. I've never used Gaudi2, but it apparently works.
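For anyone who hasn't tried it, PyTorch-on-TPU goes through the torch_xla package. A minimal sketch, assuming a TPU VM with torch_xla installed:

```python
import torch
import torch_xla.core.xla_model as xm

# torch_xla exposes the TPU as just another PyTorch device.
device = xm.xla_device()

a = torch.randn(2048, 2048, device=device)
b = torch.randn(2048, 2048, device=device)
c = a @ b

# XLA traces lazily; mark_step() flushes the graph to the TPU.
xm.mark_step()
print(c[0, :4])
```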
All of this is to say: is it possible that Intel gets its fabs working, partners with its long-time ally Microsoft and with OpenAI to extend Triton to Gaudi3 or Gaudi4, and becomes a threat to Nvidia within 2-3 years? Absolutely.
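What makes that scenario plausible is that Triton kernels are written against Triton's own block-level abstractions rather than CUDA. Here's the canonical vector-add example; the launch currently targets CUDA tensors because Nvidia is the only mature backend, but note the kernel body contains no CUDA-specific code, so retargeting it is a compiler-backend problem for Intel, not a rewrite for users:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide tile.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```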
Is it similarly possible that Google ramps up development on JAX and TPUv5? Sure.
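The same portability argument applies on Google's side: JAX code is written against XLA and runs unchanged on CPU, GPU, or TPU. A minimal sketch (the layer function here is just an illustrative example, not any particular library's API):

```python
import jax
import jax.numpy as jnp

@jax.jit
def mlp_layer(w, b, x):
    # XLA compiles this for whatever backend is available
    # (CPU, Nvidia GPU, or TPU) with no source changes.
    return jax.nn.relu(x @ w + b)

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
w = jax.random.normal(k1, (512, 512))
b = jax.random.normal(k2, (512,))
x = jax.random.normal(k3, (32, 512))
print(mlp_layer(w, b, x).shape, jax.devices())
```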
Neither of these possibilities, regardless of whether you think they're probable or improbable, would need a decade to catch up to Nvidia.