Hacker News

Yeah, I remember in undergrad I was working on using transfer learning to train an object detector. Basically you only needed ~100 images to get the model to detect the new object really well.
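For anyone who hasn't seen the recipe: the usual trick is to freeze a pretrained backbone and train only a small new head on your ~100 images. A minimal sketch in PyTorch (the backbone here is a toy random-weight stand-in; in practice you'd load pretrained weights, e.g. from torchvision):

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained feature extractor.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone: only the new head will get gradients.
for p in backbone.parameters():
    p.requires_grad = False

head = nn.Linear(16, 2)  # new 2-class head for the target object
model = nn.Sequential(backbone, head)

# Only the head's parameters go to the optimizer.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(4, 3, 32, 32)  # 4 fake RGB images
loss = nn.CrossEntropyLoss()(model(x), torch.tensor([0, 1, 0, 1]))
loss.backward()
optimizer.step()
```

Because the frozen backbone already encodes generic visual features, the head has very few parameters to fit, which is why a small labeled set is enough.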

I'm not sure what the analogous term is for a similar process on LLMs, but that will be huge when there is a service for it.



LLMs can do that without any examples (zero-shot) or with one or a few demonstrations in the prompt (few-shot), provided you can describe the task within the limited context window.
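Few-shot here just means stacking labeled demonstrations into the prompt and letting the model complete the pattern, with no weight updates. A sketch (the task and labels are purely illustrative):

```python
# A few labeled demonstrations, then the new input; the model
# is expected to continue the pattern.
examples = [
    ("I loved this movie", "positive"),
    ("Terrible, a waste of time", "negative"),
]

def few_shot_prompt(examples, query):
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

prompt = few_shot_prompt(examples, "Surprisingly good")
print(prompt)
```

The whole "training set" lives in the prompt, which is exactly why the context window is the limiting factor.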

If, for example, you want the model to learn to use a very large API, or to access the knowledge in a whole book, it might need fine-tuning.
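Fine-tuning data is typically prepared as prompt/completion or chat-message pairs in a JSONL file, one example per line. A sketch, assuming an OpenAI-style chat format (the API question and answer here are made up):

```python
import json

# Hypothetical examples distilled from a large API's documentation.
records = [
    {"messages": [
        {"role": "user", "content": "How do I list widgets?"},
        {"role": "assistant",
         "content": "Call client.widgets.list(page_size=50)."},
    ]},
]

# Each line of the JSONL file is one independent training example.
with open("train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

with open("train.jsonl") as f:
    loaded = [json.loads(line) for line in f]
```

The exact schema varies by provider and trainer, but the one-example-per-line JSONL shape is close to universal.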


Could I just train a very small LLM on an English dictionary + Python + large API documentation + a large Python code base?

Then do some chat fine-tuning (like what HF did with StarCoder to get StarChat)?

And end up with a lightweight LLM that knows the docs and code for the thing I need it for?

After that, maybe incrementally fine-tune the model as part of your CI/CD process?
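One way the CI/CD step could generate fresh fine-tuning examples automatically is to mine (question, docstring) pairs out of the code base on each commit. A rough sketch using the stdlib `ast` module; the prompt template is pure invention:

```python
import ast

# In CI this would be read from the changed files in the commit.
source = '''
def fetch_user(user_id: int) -> dict:
    """Return the user record for user_id."""
    ...
'''

pairs = []
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.FunctionDef):
        doc = ast.get_docstring(node)
        if doc:
            # One (prompt, completion) pair per documented function.
            pairs.append((f"What does {node.name} do?", doc))

print(pairs)
```

Pairs like these could then be appended to the JSONL training set and fed to an incremental fine-tuning run, so the model tracks the code as it evolves.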


How similar was the object to other objects?

E.g., were you trying to distinguish an object vs nothing, a bicycle vs a fish, a bird vs a squirrel, or two different species of songbird at a feeder?

How much would the training requirements increase or decrease moving up or down that scale?



