Hacker News | fdefitte's comments

The dog ships faster because it has zero opinions about the architecture.

Agreed on ZLUDA being the practical choice. This project is more impressive as a "build a GPU compiler from scratch" exercise than as something you'd actually use for ML workloads. The custom instruction encoding without LLVM is genuinely cool though, even if the C subset limitation makes it a non-starter for most real CUDA codebases.
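To make "custom instruction encoding without LLVM" concrete: it means hand-writing the bit-packing that LLVM's MC layer would normally generate for you. A toy sketch (the field layout below is entirely hypothetical, not the project's actual format):

```python
# Toy hand-rolled instruction encoder: pack an opcode and three
# register operands into a 32-bit word. Field widths are invented
# for illustration (8 bits each), NOT this project's real encoding.

def encode(opcode: int, rd: int, rs1: int, rs2: int) -> int:
    """Pack four 8-bit fields into one 32-bit instruction word."""
    assert all(0 <= f < 256 for f in (opcode, rd, rs1, rs2))
    return (opcode << 24) | (rd << 16) | (rs1 << 8) | rs2

def decode(word: int) -> tuple[int, int, int, int]:
    """Inverse of encode: unpack the four 8-bit fields."""
    return (word >> 24) & 0xFF, (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF

word = encode(0x12, 3, 4, 5)
assert decode(word) == (0x12, 3, 4, 5)
```

Trivial at this size, but a real ISA has variable field widths, immediates, and addressing modes, which is why doing it from scratch is the impressive part.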


ZLUDA doesn't have full coverage, though, which means only a subset of CUDA codebases can be ported successfully; they've focused on 80/20 coverage of the core math libraries.

Specifically:

cuBLAS (partial), cuBLASLt (partial), cuDNN (partial), cuFFT, cuSPARSE, NVML (very limited)

Notably missing: cuSPARSELt, cuSOLVER, cuRAND, cuTENSOR, NPP, nvJPEG, nvCOMP, NCCL, OptiX

I'd estimate it's around 20% of CUDA library coverage.
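A quick way to triage whether a given codebase falls in the uncovered subset is to scan its includes for the missing libraries. A rough sketch (the header names are my assumptions for the standard CUDA headers of the libraries listed above; verify against your toolkit):

```python
# Triage sketch: flag CUDA headers from libraries ZLUDA reportedly
# does not cover. Header names are assumed, not verified against ZLUDA.
import re

UNSUPPORTED_HEADERS = {
    "cusparseLt.h", "cusolverDn.h", "curand.h", "cutensor.h",
    "npp.h", "nvjpeg.h", "nvcomp.h", "nccl.h", "optix.h",
}

INCLUDE_RE = re.compile(r'#include\s*[<"]([^">]+)[">]')

def unsupported_includes(source: str) -> set[str]:
    """Return the unsupported CUDA headers referenced by one source file."""
    return {h for h in INCLUDE_RE.findall(source) if h in UNSUPPORTED_HEADERS}

src = '#include <curand.h>\n#include <cublas_v2.h>\n'
assert unsupported_includes(src) == {"curand.h"}
```

Anything flagged means a port would need a rewrite of that dependency, not just ZLUDA.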


The filter used to be effort. You had to care enough to spend weeks on something, which meant you probably understood the problem deeply. Now that filter is gone and we get a flood of "I prompted this in 20 minutes" posts where the author can't answer a single follow-up about their own code. The interesting Show HNs still exist, they're just buried under noise.


> The filter used to be effort. You had to care enough to spend weeks on something, which meant you probably understood the problem deeply. Now that filter is gone and we get a flood of "I prompted this in 20 minutes" posts where the author can't answer a single follow-up about their own code. The interesting Show HNs still exist, they're just buried under noise.

More's the pity. I'm prepping for a Show HN with a completely hand-coded project I started in December. It will be finished around the end of March. I vibe-coded all the docs, because I spent all my time on the code.

I have no idea if the Show HN is going to be at all useful, but I pivoted multiple times on various implementation decisions. Had it been coded by an AI, I don't think there would have been any pivot.

The value comes precisely from the pivots. An AI would have plodded on ahead anyway with a broken model of the problem space.


The 8% one-shot number is honestly better than I expected for a model this capable. The real question is what sits around the model. If you're running agents in production you need monitoring and kill switches anyway; the model being "safe enough" is necessary but never sufficient. Nobody should be deploying computer-use agents without observability around what they're actually doing.
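The shape of that monitoring/kill-switch layer is simple to sketch (everything here is hypothetical, a minimal illustration of the guardrail pattern, not any product's API):

```python
# Minimal guardrail sketch: every agent action is logged, a deny-list
# blocks obviously dangerous ones, and a kill switch halts the loop.
# Action names and the deny-list are invented for illustration.

class KillSwitch:
    def __init__(self):
        self.tripped = False

    def trip(self):
        self.tripped = True

def run_agent(actions, kill_switch, deny=("delete_account", "send_payment")):
    """Execute agent-proposed actions with logging and a hard stop."""
    log = []
    for action in actions:
        if kill_switch.tripped:
            log.append(("halted", action))
            break
        if action in deny:
            log.append(("blocked", action))
            continue
        log.append(("executed", action))  # a real system dispatches here
    return log

ks = KillSwitch()
trace = run_agent(["click", "send_payment", "scroll"], ks)
assert trace == [("executed", "click"), ("blocked", "send_payment"), ("executed", "scroll")]
```

The point is that the log exists regardless of what the model does, which is what makes "necessary but not sufficient" actionable.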


The "native multimodal agents" framing is interesting. Everyone's focused on benchmark numbers but the real question is whether these models can actually hold context across multi-step tool use without losing the plot. That's where most open models still fall apart imo.


That 95% payout only works if you already know what good looks like. The sketchy part is when you can't tell the diff between correct and almost-correct. That's where stuff goes sideways.


Skills are great for static stuff but they kinda fall apart when the agent needs to interact with live state. WebMCP actually fills a real gap there imo.


What prevents them from working with live state? Coding agents deal fine with the live state of evolving source code, so why can't they watch a web page or whatever update over time? This seems like a micro-optimization that requires explicit work from the site developer. Long term, I just don't see this taking off versus agents using sites directly. A more viable long-term feature would be a way to let agents scroll the page or hover over menus without affecting the user's own view.
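The "just watch the page" approach really is this simple in outline: poll, hash the content, react on change. A self-contained sketch (the snapshots are stubbed; a real agent would fetch them with an HTTP client or browser driver):

```python
# Sketch of change detection over live page state: hash each snapshot
# of the page body and report when it differs from the previous one.
import hashlib

def digest(body: bytes) -> str:
    return hashlib.sha256(body).hexdigest()

def detect_changes(snapshots):
    """Yield indices of snapshots whose content differs from the previous one."""
    last = None
    for i, body in enumerate(snapshots):
        d = digest(body)
        if last is not None and d != last:
            yield i
        last = d

pages = [b"<p>price: 10</p>", b"<p>price: 10</p>", b"<p>price: 12</p>"]
assert list(detect_changes(pages)) == [2]
```

No site-developer cooperation needed, which is the crux of the argument above.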


Agent teams working autonomously sounds cool until you actually try it. We've been running multi-agent setups and honestly the failure modes are hilarious. They don't crash, they just quietly do the wrong thing and act super confident about it.


AI offshore teams in "yes cultures".


Hi everyone! Super happy to release this package. We feel like evals belong in CI like unit tests, and should be easy to set up and run automatically. Can't wait to get your feedback!
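For readers unfamiliar with the idea: an eval in CI is just an assertion over model output that the test runner can fail on. A generic sketch of the pattern (this is not the package's actual API, which I haven't inspected; the model is stubbed so it runs standalone):

```python
# Generic "evals as unit tests" pattern: run (prompt, expected) cases
# against a model and compute a pass rate CI can threshold on.

def exact_match_eval(model, cases):
    """Run (prompt, expected) cases and return the fraction that pass."""
    passed = sum(1 for prompt, expected in cases if model(prompt) == expected)
    return passed / len(cases)

# Stub standing in for a real LLM call.
fake_model = {"2+2?": "4", "capital of France?": "Paris"}.get

score = exact_match_eval(fake_model, [("2+2?", "4"), ("capital of France?", "Berlin")])
assert score == 0.5  # CI would fail the build if score < threshold
```

Exact match is the simplest scorer; real suites swap in fuzzier graders, but the CI wiring stays the same.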


Free guide, enjoy! Made with ♥ by the Basalt team

