Hacker News
adebayoj 2 hours ago | on: Show HN: Steerling-8B, a language model that can e...
You are exactly right: it guides the model, during training, with concepts and the dictionary. This is important because post hoc dictionary learning for interpretability is not currently reliable:
https://www.arxiv.org/abs/2602.14111