And .ain, which was even better but now seems to be half lost to time (no Wikipedia article, just a few links repeating the same fragments of info, like http://justsolve.archiveteam.org/wiki/AIN)
I previously ran 150,000 AMD GPUs in all conditions at 100% utilization for years. I currently have a multi-million-dollar cluster of enterprise AMD GPUs.
A couple of real-world points:
1. They generally don't just fail. More likely a repairable component on a board fails and you can send it out to be repaired.
2. For my current stuff, I have a 3 year pro support contract that can be extended. Anything happens, Dell goes and fixes it. We also haven't had someone in our cage at the DC in over 6 months now.
I have to maintain our GPUs. Generally the worst parts are the watercooling pressure, the HVAC, and the power. I can run it stable only at 300W per CPU; the normal max is 310W. With throttling to 300W it has been stable for a year. Before that it had already burned two mainboards, with lots of downtime.
My experience is that power problems stem from poor power delivery and/or poor airflow.
I'm convinced that this is why we haven't had any issues in our current location. Zero outside air, zero dust, and insanely well-built, no-expense-spared airflow and power supply / management.
It is easier to do in the cloud than it is to do with actual hardware though, because you'll need enough hardware to do the migration. There is a capital moat around that.
I feel like the company that can figure out how to 100% safely live migrate any VMware workload to another "cheaper" solution will do quite well.
The fact that there are no tests is a non-starter for me. AI mostly writes them for you now, so there really is no excuse to not have them, especially for a library that people are going to depend on.
I've been running tests from a gitignored folder (see the play/record commands in the package.json), since setting up thoughtful testing infra that's not just mocking everything is going to take an evening or two. But it's on the roadmap and will be added soon!!
Well that's embarrassing! I reported it as if it wasn't a joke. I thought the joke issue was this one about translating everything to Chinese: https://github.com/tldraw/tldraw/issues/8092
If it was a joke (the test suite issue), then it was a really shit joke. It reads more like backtracking. I don't think _you_ should feel any embarrassment.