Hacker News | zby's comments

I laughed - but I don't want more of this

You don't want more of this on Hacker News?

Hmm - but has its incidence actually increased, or have other causes just declined faster?

I just found the xkcd that expresses my opinion on this:

https://xkcd.com/810/

I am surprised that apparently I am in a minority here.


I also feel the frustration of LLM reverse-compression - when a whole article is generated from a single sentence. But when I post something edited by AI, it is usually the result of a long back and forth of editing and revising. I guess I could post the whole conversation thread - but it would be very long.

Personally I would just like to read the best comments.


Here is my theory about weaving deterministic code and prompts: https://github.com/zby/llm-do/blob/main/docs/theory.md - plus a library that implements the unified call space I propose.

I think co-recursion between prompts and code is crucial, but I also think that the ephemeral nature of code in Recursive Language Models is impeding deploy-time learning (https://github.com/zby/llm-do/blob/main/kb/notes/deploy-time...).


I think these rules should have a pre-determined shelf life. They are not bad given the current state of the world - they push in the right direction - but they complicate the law, and I bet there will be many second-order outcomes that are hard to predict now. Besides that, once the capabilities for reuse are built, they should be self-sustaining - so the second-order outcomes will eventually dominate.


The instructions are standard documents - but that is not all. What the system adds is an index of all skills, built from their descriptions, that is passed to the LLM in each conversation. The idea is to let the LLM read a skill when it is needed instead of loading it into context upfront. Humans use indexes too - but not in this way. There are some analogies, though, with GUIs and how they enhance the discoverability of features for humans.

I wish they had arranged it around READMEs. I have a directory with my tasks and a README.md there - before codex had skills, it already understood that it needed to read the README when dealing with tasks. The skills system is less directory-dependent, so it is a bit more universal - but I am not sure that is really needed.
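A minimal sketch of how such a description index could work. The `skills/<name>/SKILL.md` layout and every function name here are my own hypothetical illustration, not Anthropic's actual implementation:

```python
from pathlib import Path

def load_skill_descriptions(skills_dir):
    """Collect (name, description, path) triples from each skill.

    Hypothetical layout: skills/<name>/SKILL.md, where the first line
    is the skill name and the second a one-sentence description.
    """
    index = []
    for skill_file in sorted(Path(skills_dir).glob("*/SKILL.md")):
        lines = skill_file.read_text().splitlines()
        name = lines[0].lstrip("# ").strip()
        description = lines[1].strip() if len(lines) > 1 else ""
        index.append((name, description, str(skill_file)))
    return index

def build_index_prompt(index):
    """Render the index that gets injected into every conversation.

    Only names and descriptions enter the context; the skill body is
    read later, on demand, when the model decides it is relevant.
    """
    header = "Available skills (read the file only when needed):\n"
    rows = [f"- {name}: {desc} [{path}]" for name, desc, path in index]
    return header + "\n".join(rows)
```

The point of the design is visible in the output: the prompt contains the one-line descriptions but not the skill bodies, so context cost stays proportional to the number of skills, not their total length.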


> Humans use indexes too - but not in this way.

What's different?


Hmm - maybe I should not call it an index - people look up stuff in an index when needed. Here the whole index is inserted into the conversation - it is as if, when starting a task, a human read the whole table of contents of the manual for that task.


Claude reads from .claude/instructions.md whenever you make a new convo as a default thing. I usually have Claude add things like project layout info and summaries, preferred tooling to use, etc. So there's a reasonable expectation of how it should run. If it starts 'forgetting' I tell it to re-read it.


No, Claude Code reads the CLAUDE.md in the root of your project. It's case-sensitive, so it has to be exactly that, too. GitHub Copilot reads from .github/copilot-instructions.md and supposedly AGENTS.md. Antigravity reads AGENTS.md and pulls subagents and the like from a .agents directory. This is probably why you have to remind it to re-read it so much - the harness isn't loading it for you.
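The file names above come from the comment; the lookup logic below is my own simplified sketch, not any harness's real code, but it shows why a file in the wrong place is silently ignored:

```python
import os

# Per-harness instruction files, as described above. The resolution
# logic is an illustration only, not any harness's actual behavior.
HARNESS_FILES = {
    "claude-code": ["CLAUDE.md"],  # case-sensitive, project root
    "github-copilot": [".github/copilot-instructions.md", "AGENTS.md"],
    "antigravity": ["AGENTS.md"],
}

def resolve_instructions(harness, project_root):
    """Return the first instruction file this harness would auto-load,
    or None - in which case your instructions are never seen and the
    model has to be told to read the file by hand."""
    for rel in HARNESS_FILES.get(harness, []):
        path = os.path.join(project_root, rel)
        if os.path.isfile(path):
            return path
    return None
```

In this model a project that only has `.claude/instructions.md` resolves to `None` for `claude-code`, which would explain the apparent "forgetting": the file is only read when someone explicitly asks.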


> What the system adds is an index of all skills, built from their descriptions, that is passed to the llm in each conversation. The idea is to let the llm read the skill when it is needed and not load it into context upfront.

This is different from swagger / OpenAPI how?

I get that cross-trained web front-end devs set a new low bar for professional amnesia and not-invented-here-ism, but maybe we could not do that yet another time?


> This is different from swagger / OpenAPI how?

Because the descriptions aren't API specs and the things described aren't APIs.

It's more like a structure for human-readable descriptions in an annotated table of contents for a recipe book than it is like OpenAPI.


> This is different from swagger / OpenAPI how?

In the sense that Swagger / OpenAPI is for API endpoints, but most of the "skills" you need for your agents are not based on API endpoints.


I mean conceptually.

Why not just extend the OpenAPI specification to skills? Instead of recreating something that's essentially communicating the same information?

T minus a couple years before someone declares that down-mapping skills into a known verb enumeration promotes better skill organization...


> Why not just extend the OpenAPI specification to skills?

Because approximately none of what exists in the existing OpenAPI specification is relevant to the task, and nothing needed for the tasks is relevant to the current OpenAPI use case, so trying to jam one use case into a tool designed for the other would be pure nonsense.

It’s like needing to drive nails and asking why grab a hammer when you already have a screwdriver.


You think indexing skills in increasingly structured, parameterized formats has nothing to do with documenting REST API endpoints?


Reasoning is recursive - you cannot isolate where it should be symbolic and where it should be LLM-based (fuzzy/neural). This is the idea that started https://github.com/zby/llm-do - there is also RLM: https://alexzhang13.github.io/blog/2025/rlm/ . RLM is simpler - but my approach also has some advantages.
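A toy sketch of that co-recursion - deterministic code and model calls alternating down the call stack. The `llm` function is a stub standing in for a real model call, and none of the names here are llm-do's actual API:

```python
def llm(prompt):
    """Stub for a model call. A real system would send the prompt to
    an LLM; here we fake the fuzzy step so the control flow runs."""
    # Pretend the model decides whether a task is atomic and, if not,
    # how to split it into subtasks.
    if "+" in prompt:
        left, right = prompt.split("+", 1)
        return {"atomic": False, "subtasks": [left.strip(), right.strip()]}
    return {"atomic": True, "answer": f"done({prompt.strip()})"}

def solve(task):
    """Deterministic code drives the recursion; the LLM decides where
    to split. Symbolic and fuzzy steps interleave at every level, so
    there is no fixed boundary between them."""
    decision = llm(task)  # fuzzy step
    if decision["atomic"]:
        return decision["answer"]
    results = [solve(t) for t in decision["subtasks"]]  # symbolic step
    return " & ".join(results)
```

For example, `solve("a + b + c")` recurses twice and returns `done(a) & done(b) & done(c)` - the decomposition depth is decided per node, not fixed upfront.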


I think the AI community is sleeping hard on proper symbolic recursion. The computer has gigabytes of very accurate "context" available if you start stacking frames. Any strategy that happens inside token space will never scale the same way.

Depth first, slow turtle recursion is likely the best way to reason through the hardest problems. It's also much more efficient compared to things that look more like breadth first search (gas town).
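One rough way to see the efficiency claim: in depth-first recursion only the current chain of stack frames is live at any moment, while a breadth-first strategy must hold its whole frontier at once. A small counting sketch (my own illustration) over a complete tree:

```python
from collections import deque

def dfs_max_held(depth, branching):
    """Depth-first over a complete tree: at most one frame per level
    is alive at a time, so held context grows with depth only."""
    max_held = 0

    def visit(level, held):
        nonlocal max_held
        max_held = max(max_held, held)
        if level == depth:
            return
        for _ in range(branching):
            visit(level + 1, held + 1)

    visit(0, 1)
    return max_held

def bfs_max_held(depth, branching):
    """Breadth-first over the same tree: the entire frontier is held
    simultaneously, so held context grows with the tree's width."""
    frontier = deque([0])  # queue of node depths
    max_held = len(frontier)
    while frontier:
        level = frontier.popleft()
        if level < depth:
            frontier.extend([level + 1] * branching)
        max_held = max(max_held, len(frontier) + 1)  # +1 for popped node
    return max_held
```

For a depth-5 tree with branching factor 3, the depth-first walk holds 6 contexts at its peak while the breadth-first one holds 244 - the gap that makes the "slow turtle" attractive when each held context is expensive.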


I only agree with that statement if you're drawing from the set of all possible problems a priori. For any individual domain I think it's likely you can bound your analysis. This ties into the no free lunch theorem.


Computers are finite - but we use an unbounded model for thinking about them - because it simplifies many things.


Pi probably has the best architecture, and being written in JavaScript it is well positioned to use the browser sandbox architecture that I think is the future for AI agents.

I only wish the author changed his stance on vendor extensions: https://github.com/badlogic/pi-mono/discussions/254
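One common way to handle vendor extensions is to standardize the fields every provider shares and pass provider-specific extras through verbatim. A generic sketch of that pattern - the field names are illustrative, not pi's actual types:

```python
from dataclasses import dataclass, field

@dataclass
class ChatRequest:
    """The intersection: fields every provider understands.
    `vendor` carries the union: provider-specific extras passed
    through verbatim. (Illustrative names, not pi's real API.)"""
    model: str
    messages: list
    temperature: float = 1.0
    vendor: dict = field(default_factory=dict)

def to_provider_payload(req):
    """Serialize the standardized core, then splice extensions in."""
    payload = {
        "model": req.model,
        "messages": req.messages,
        "temperature": req.temperature,
    }
    payload.update(req.vendor)  # vendor extensions pass through untouched
    return payload
```

The abstraction stays small and portable, but no provider capability is unreachable - callers who need a vendor knob set it in `vendor` and accept the portability cost locally.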


“standardize the intersection, expose the union” is a great phrase I hadn’t heard articulated before


I got the wording from an LLM. I knew this pattern existed in all traditional tools - but I did not know the name.


You’ve never heard it before because explicitly signaling “I know basic set theory” is kind of cringy


I would second that - do skills reliably work for you? I mean, are they reliably injected when there is a need, as opposed to being actively called for (which in my opinion defeats the purpose of skills - because I can always ask the LLM to read a document and then do something with the new knowledge)?

I have a feeling that codex still does not do this reliably - so I still use normal README files, which it loads quite intelligently, and that works better than discovery via skills.

