Hacker News | fdefitte's comments

The dog ships faster because it has zero opinions about the architecture.

Agreed on ZLUDA being the practical choice. This project is more impressive as a "build a GPU compiler from scratch" exercise than as something you'd actually use for ML workloads. The custom instruction encoding without LLVM is genuinely cool though, even if the C subset limitation makes it a non-starter for most real CUDA codebases.
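To make "custom instruction encoding without LLVM" concrete: it means hand-writing the bit-packing that LLVM's MC layer would normally generate for you. A toy sketch (the field layout below is entirely hypothetical, not the project's actual format):

```python
# Toy hand-rolled instruction encoder: pack an opcode and three
# register operands into a 32-bit word. Field widths are invented
# for illustration (8 bits each), NOT this project's real encoding.

def encode(opcode: int, rd: int, rs1: int, rs2: int) -> int:
    """Pack four 8-bit fields into one 32-bit instruction word."""
    assert all(0 <= f < 256 for f in (opcode, rd, rs1, rs2))
    return (opcode << 24) | (rd << 16) | (rs1 << 8) | rs2

def decode(word: int) -> tuple[int, int, int, int]:
    """Inverse of encode: unpack the four 8-bit fields."""
    return (word >> 24) & 0xFF, (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF

word = encode(0x12, 3, 4, 5)
assert decode(word) == (0x12, 3, 4, 5)
```

Trivial at this size, but a real ISA has variable field widths, immediates, and addressing modes, which is why doing it from scratch is the impressive part.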


ZLUDA doesn't have full coverage, though, which means only a subset of CUDA codebases can be ported successfully; they've focused on 80/20 coverage of the core math libraries.

Specifically:

cuBLAS (partial), cuBLASLt (partial), cuDNN (partial), cuFFT, cuSPARSE, NVML (very limited)

Notably missing: cuSPARSELt, cuSOLVER, cuRAND, cuTENSOR, NPP, nvJPEG, nvCOMP, NCCL, OptiX

I'd estimate it's around 20% of CUDA library coverage.
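A quick way to triage whether a given codebase falls in the uncovered subset is to scan its includes for the missing libraries. A rough sketch (the header names are my assumptions for the standard CUDA headers of the libraries listed above; verify against your toolkit):

```python
# Triage sketch: flag CUDA headers from libraries ZLUDA reportedly
# does not cover. Header names are assumed, not verified against ZLUDA.
import re

UNSUPPORTED_HEADERS = {
    "cusparseLt.h", "cusolverDn.h", "curand.h", "cutensor.h",
    "npp.h", "nvjpeg.h", "nvcomp.h", "nccl.h", "optix.h",
}

INCLUDE_RE = re.compile(r'#include\s*[<"]([^">]+)[">]')

def unsupported_includes(source: str) -> set[str]:
    """Return the unsupported CUDA headers referenced by one source file."""
    return {h for h in INCLUDE_RE.findall(source) if h in UNSUPPORTED_HEADERS}

src = '#include <curand.h>\n#include <cublas_v2.h>\n'
assert unsupported_includes(src) == {"curand.h"}
```

Anything flagged means a port would need a rewrite of that dependency, not just ZLUDA.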


The filter used to be effort. You had to care enough to spend weeks on something, which meant you probably understood the problem deeply. Now that filter is gone and we get a flood of "I prompted this in 20 minutes" posts where the author can't answer a single follow-up about their own code. The interesting Show HNs still exist, they're just buried under noise.


> The filter used to be effort. You had to care enough to spend weeks on something, which meant you probably understood the problem deeply. Now that filter is gone and we get a flood of "I prompted this in 20 minutes" posts where the author can't answer a single follow-up about their own code. The interesting Show HNs still exist, they're just buried under noise.

More's the pity. I'm prepping for a Show HN with a completely hand-coded project I started in December. It will be finished around the end of March. I vibe-coded all the docs, because I spent all my time on the code.

I have no idea if the Show HN is going to be at all useful, but I pivoted multiple times on various implementation decisions. Had it been coded by an AI, I don't think there would have been any pivot.

The value comes precisely from the pivots. An AI would have plodded on ahead anyway with a broken model of the problem space.


The 8% one-shot number is honestly better than I expected for a model this capable. The real question is what sits around the model. If you're running agents in production you need monitoring and kill switches anyway; the model being "safe enough" is necessary but never sufficient. Nobody should be deploying computer-use agents without observability around what they're actually doing.
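The shape of that monitoring/kill-switch layer is simple to sketch (everything here is hypothetical, a minimal illustration of the guardrail pattern, not any product's API):

```python
# Minimal guardrail sketch: every agent action is logged, a deny-list
# blocks obviously dangerous ones, and a kill switch halts the loop.
# Action names and the deny-list are invented for illustration.

class KillSwitch:
    def __init__(self):
        self.tripped = False

    def trip(self):
        self.tripped = True

def run_agent(actions, kill_switch, deny=("delete_account", "send_payment")):
    """Execute agent-proposed actions with logging and a hard stop."""
    log = []
    for action in actions:
        if kill_switch.tripped:
            log.append(("halted", action))
            break
        if action in deny:
            log.append(("blocked", action))
            continue
        log.append(("executed", action))  # a real system dispatches here
    return log

ks = KillSwitch()
trace = run_agent(["click", "send_payment", "scroll"], ks)
assert trace == [("executed", "click"), ("blocked", "send_payment"), ("executed", "scroll")]
```

The point is that the log exists regardless of what the model does, which is what makes "necessary but not sufficient" actionable.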


The "native multimodal agents" framing is interesting. Everyone's focused on benchmark numbers but the real question is whether these models can actually hold context across multi-step tool use without losing the plot. That's where most open models still fall apart imo.


That 95% payout only works if you already know what good looks like. The sketchy part is when you can't tell the diff between correct and almost-correct. That's where stuff goes sideways.


Skills are great for static stuff but they kinda fall apart when the agent needs to interact with live state. WebMCP actually fills a real gap there imo.


What prevents them from working with live state? Coding agents deal fine with the live state of evolving source code, so why can't they watch a web page or whatever update over time? This seems like a micro-optimization that requires explicit work from the site developer. Long term, I just don't see this taking off versus agents using sites directly. A more viable long-term feature would be a way to let agents scroll the page or hover over menus without affecting the user's own view.
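The "just watch the page" approach really is this simple in outline: poll, hash the content, react on change. A self-contained sketch (the snapshots are stubbed; a real agent would fetch them with an HTTP client or browser driver):

```python
# Sketch of change detection over live page state: hash each snapshot
# of the page body and report when it differs from the previous one.
import hashlib

def digest(body: bytes) -> str:
    return hashlib.sha256(body).hexdigest()

def detect_changes(snapshots):
    """Yield indices of snapshots whose content differs from the previous one."""
    last = None
    for i, body in enumerate(snapshots):
        d = digest(body)
        if last is not None and d != last:
            yield i
        last = d

pages = [b"<p>price: 10</p>", b"<p>price: 10</p>", b"<p>price: 12</p>"]
assert list(detect_changes(pages)) == [2]
```

No site-developer cooperation needed, which is the crux of the argument above.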


Agent teams working autonomously sounds cool until you actually try it. We've been running multi-agent setups and honestly the failure modes are hilarious. They don't crash, they just quietly do the wrong thing and act super confident about it.


AI offshore teams in "yes cultures".


Hi everyone! Super happy to release this package. We feel like evals belong in CI like unit tests, and should be easy to set up and run automatically. Can't wait to get your feedback!
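For readers unfamiliar with the idea: an eval in CI is just an assertion over model output that the test runner can fail on. A generic sketch of the pattern (this is not the package's actual API, which I haven't inspected; the model is stubbed so it runs standalone):

```python
# Generic "evals as unit tests" pattern: run (prompt, expected) cases
# against a model and compute a pass rate CI can threshold on.

def exact_match_eval(model, cases):
    """Run (prompt, expected) cases and return the fraction that pass."""
    passed = sum(1 for prompt, expected in cases if model(prompt) == expected)
    return passed / len(cases)

# Stub standing in for a real LLM call.
fake_model = {"2+2?": "4", "capital of France?": "Paris"}.get

score = exact_match_eval(fake_model, [("2+2?", "4"), ("capital of France?", "Berlin")])
assert score == 0.5  # CI would fail the build if score < threshold
```

Exact match is the simplest scorer; real suites swap in fuzzier graders, but the CI wiring stays the same.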


Free guide, enjoy! Made with ♥ by the Basalt team

