Havoc · 2026-03-11T17:59:18 1773251958

You can already run some models on the NPUs in the Rockchip RK3588 SBCs which are pretty abundant.

A Claude 4.6 they are most certainly not, but if you get through the janky AF software ecosystem they can run small LLMs reasonably well with basically zero CPU/GPU usage.

Havoc · 2026-03-10T19:35:48 1773171348

They're definitely inferior to proper tests, but even weak CC tests on top of CC code are an improvement over no tests. If CC does make a change that shifts something dramatically, even a weak test may flag enough to get CC to investigate.

Even better though - external test suites. Recently made an S3 server, and the LLM made quick work of the MVP. Then I found a Ceph S3 test suite that I could run against it and oh boy. Ended up working really well as TDD though.

aray07 · 2026-03-10T19:37:36 1773171456

Yeah, I have been hearing a lot more about this concept of “digital twins”, where you have high-fidelity versions of external services to run tests against. You can take the API docs of these external services and give them to Claude. Wonder if that is where we will be heading.

didgeoridoo · 2026-03-10T19:45:49 1773171949

Isn’t this just an API sandbox? Many services have a test/sandbox mode. I do wish they were more common outside of fintech.

Havoc · 2026-03-10T17:47:49 1773164869

Not controversial per se, but it’ll go the same way as Netflix - once it’s got adoption they’ll crank enshittification up to 11.

Havoc · 2026-03-10T16:26:52 1773160012

Crazy writeup.

Author is right about the base64 part. It does seem weird that the model can decode and understand it at the same time. And I guess what makes it weird is that we just sorta accept that this works for, say, English and German (i.e. normal use), but when it's framed as base64 it suddenly stops feeling intuitive.

dinobones · 2026-03-10T17:02:36 1773162156

why tho? it's just an alternate alphabet/set of symbols.
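One caveat on the "alternate alphabet" framing, as a minimal sketch using only Python's standard `base64` module: the encoding maps every 3 input bytes to 4 output symbols, so it is not a letter-for-letter substitution. Prepending a single character shifts the byte alignment and changes the whole encoded string. The example strings are arbitrary illustrations, not taken from the writeup.

```python
import base64


def b64(s: str) -> str:
    """Base64-encode a UTF-8 string and return the result as ASCII text."""
    return base64.b64encode(s.encode("utf-8")).decode("ascii")


plain = "hello world"
shifted = "a" + plain  # same text, shifted by one byte

enc_plain = b64(plain)
enc_shifted = b64(shifted)

print(enc_plain)    # aGVsbG8gd29ybGQ=
print(enc_shifted)  # YWhlbGxvIHdvcmxk

# The shifted encoding does not contain the original encoding as a substring,
# even though the plaintext does contain the original plaintext.
print(enc_plain.rstrip("=") in enc_shifted)  # False
```

So the same word can show up as several different symbol sequences depending on where it falls relative to the 3-byte boundaries.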

dnhkng · 2026-03-10T17:32:26 1773163946

Because it's generally expected that models only work 'in distribution', i.e. they work on stuff they have previously seen.

They almost certainly have never seen regular conversations in Base64 in their training set, so it's weird that it 'just works'.

Does that make sense?

fweimer · 2026-03-10T21:01:44 1773176504

If you do not properly MIME-decode email, you end up with at least some base64-encoded conversations.
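A minimal sketch of how that happens, using Python's standard `email` and `base64` modules. The addresses, subject, and message text are made up for illustration. A scraper that reads the raw payload without honoring `Content-Transfer-Encoding` ends up storing the Base64 text instead of the conversation.

```python
import base64
from email import message_from_string

# Hypothetical message text, for illustration only.
text = "Are we still on for lunch tomorrow?"
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

# A raw email whose body uses the base64 transfer encoding.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: lunch\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + encoded + "\n"
)

msg = message_from_string(raw)

# What you get if you skip MIME decoding: the Base64 text itself.
naive_body = msg.get_payload()

# What you get if you decode according to Content-Transfer-Encoding.
decoded_body = msg.get_payload(decode=True).decode("utf-8")

print(naive_body.strip())
print(decoded_body)
```

The readable sentence never appears in the raw message, only its Base64 form, which is what would land in a carelessly scraped corpus.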

dormento · 2026-03-10T17:36:01 1773164161

For all we know, AI tech companies could theoretically have converted all of the "acquired" (ahem!) training set material into base64 and used it for training as well, just like you would encode, say, Japanese romaji or Hebrew written in the English alphabet.

dtj1123 · 2026-03-10T18:18:03 1773166683

Unlikely that every company would have bothered to do this.

idiotsecant · 2026-03-10T19:00:55 1773169255

'Yes, I know we already trained on all that data, but now I want you to convert to base64 and train it again! at enormous cost!'

adcoleman6 · 2026-03-11T12:34:49 1773232489

On the contrary, it could be a deliberate attempt to augment or diversify the dataset.

gwern · 2026-03-11T01:49:46 1773193786

> They almost certainly have never seen regular conversations in Base64 in their training set, so it's weird that it 'just works'.

People use Base64 to store payloads of many arbitrary things, including web pages or screenshots, both deliberately and erroneously. So models have almost certainly seen regular conversations in Base64 in their 10TB+ text training sets scraped from billions of web pages, files, mangled emails, etc.

dnhkng · 2026-03-11T06:41:21 1773211281

Yes, that's true.

But that points again to the main idea: The model has learnt to transform Base64 into a form it can already use in the 'regular' thinking structures.

The alternative is that there is an entire parallel structure just for Base64, which based on my 'chats' with LLMs in that format seems implausible; it acts like the regular model.

If there is a 'translation' organ in the model, why not math or emotion-processing organs? That's what I set out to find, and they are illustrated in the heatmaps.

Also, any writing tips from the Master blogger himself? Huge fan (squeal!)

Havoc · 2026-03-10T10:05:01 1773137101

> Plus who knows what open routed providers do in term quantization

The quantisation is shown in the provider section.

Havoc · 2026-03-10T07:57:58 1773129478

Well on the plus side at least we’ll see how this story ends. Seems MS is going all in with win 12 on subscription and AI everything

Havoc · 2026-03-10T02:14:18 1773108858

What’s that North Korean Linux flavor called again?

tmtvl · 2026-03-10T02:48:59 1773110939

Red Star. I'd sooner use Berry, Kylin, or SUSE if I wanted to avoid the Noid- I mean, avoid U.S.-based distros.

selfhoster11 · 2026-03-10T02:47:28 1773110848

Red Star OS.

Havoc · 2026-03-10T02:10:54 1773108654

That’s some serious out of the box thinking

Havoc · 2026-03-09T22:33:13 1773095593

I'd think thin copper sheet on something soft would also work. That indentation will probably outlast any sort of ink

Havoc · 2026-03-09T15:52:47 1773071567

Sounds a lot like Zuckerberg getting caught on a hot mic at the Trump dinner about how many billions Meta is investing

All made up bullshit numbers