Hacker News | clbrmbr's comments

Do you have a more detailed trace that shows the system reminders? Do you know in which order it was fed into the LLM call that resulted in the bad reasoning?

Huh. I’m missing out I guess. Is there a plugin you use for spinning them up? Heavy superpowers/CC user here.

I think they're talking about the Agent Teams feature in Claude Code: https://code.claude.com/docs/en/agent-teams

Not everyone. If your business is chill and you are REALLY thoughtful and respectful with newsletters, you will be rewarded with open rates well in excess of 50%.

Hahah yeah, if you play with LoRAs on local models you will see this a lot. Most often I see it hallucinate a user turn or a system message.

This. The models struggle with differentiating tool responses from user messages.

The trouble is that these are language models with only a veneer of RL giving them awareness of the user turn. They have very little pretraining on the idea of being inside a computer with different people and systems talking to them at once. There's more that needs to happen than eliciting a pre-learned persona.
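A minimal sketch of why the roles blur: user turns, tool results, and system messages all collapse into one flat token stream before the model sees them. The template tokens below are hypothetical, loosely modeled on ChatML-style formats, not any specific vendor's.

```python
# Serialize chat messages into the flat string the model actually consumes.
# Role markers like <|tool|> are a thin convention layered on pretraining,
# not something the model deeply "understands".

def render(messages):
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}\n<|end|>")
    return "\n".join(parts)

conversation = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "List the repo files."},
    {"role": "tool", "content": "README.md\nmain.py"},   # tool output...
    {"role": "user", "content": "Now summarize main.py."},  # ...vs. a real user turn
]

prompt = render(conversation)
print(prompt)
```

From the model's perspective, the only thing distinguishing the tool result from a user instruction is the role token itself, which is why weakly-tuned local models so easily hallucinate extra turns.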


Same here. Auto mode is NOT ok. Sadly, smaller models cannot be trusted with access to Bash.
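One way to avoid trusting a small model with raw Bash is to gate every proposed command through an allowlist. This is a hedged sketch; the command set and parsing here are my own assumptions, not Claude Code's actual permission policy.

```python
# Approve a model-proposed shell command only if its first token is on a
# conservative allowlist. Anything unparseable is rejected outright.
import shlex

SAFE_COMMANDS = {"ls", "cat", "grep", "git"}

def is_allowed(command: str) -> bool:
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes etc.
    return bool(tokens) and tokens[0] in SAFE_COMMANDS

print(is_allowed("git status"))  # True
print(is_allowed("rm -rf /"))    # False
```

A real gate would also need to handle shell metacharacters (`;`, `&&`, pipes) that smuggle a second command past the first-token check.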

man i miss GPT-4.5

Here to say I'm one of those people who did my first Show HN recently, and it was 100% due to the lowered activation energy to build something awesome with Claude. Not 45min, but took about 6 hours of my time, and benefitted from testing against a 10yr old firmware codebase at my startup.

So I guess I'm saying the ideal rate of Show HN posts has probably gone way up. Unfortunately it's also resulting in a lower SNR. Not sure what to do about it tho.


I’d love to see a post that clearly walks through how this works for some examples to give the intuition.

And then, how much is really saved on training for a non-trivial model?

And is this applicable to deep models?


Sure, just posted on Reddit: https://www.reddit.com/r/deeplearning/comments/1ro8uw2/analy... with benchmarks and details. And yes, it works for LLMs, LSTMs, RNNs, CNNs, and more.

Thanks. Do come back to post with your tutorials. I'd recommend going quite granular and being didactic. Take somebody who understands gradient descent on MLPs and ELI5 the analytical part. Try to anticipate some of the doubts (there will be many! gradient descent from random init is dogma at this point).

This would be awesome. Even titles and shasums could be enough.
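A hedged sketch of the "titles and shasums" idea: a minimal manifest pairing each entry's title with a SHA-256 digest of its content, enough to detect duplicates or tampering without storing the content itself. The entry names here are made up for illustration.

```python
# Build a title -> SHA-256 manifest; the digest alone identifies content.
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

entries = {
    "Show HN: my project": b"...project page content...",
    "Ask HN: a question": b"...question body...",
}

manifest = {title: sha256_hex(body) for title, body in entries.items()}
for title, digest in manifest.items():
    print(f"{digest[:12]}  {title}")
```

Comparing two such manifests flags any entry whose content changed, since any byte difference yields a different digest.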
