It makes the black box slightly more transparent. Knowing more in this regard al...

great_psy · 2026-02-24T04:12:04 1771906324

Can this method be extended to go down to the sentence level?

In the example it shows how much of the reason for an answer is due to data from Wikipedia. Can it drill down to show paragraph or sentence level that influences the answer?

rickydroll · 2026-02-24T05:39:13 1771911553

Your question should be "Can it drill down to show the paragraphs or sentences that influence the answer?"

I believe that the plagiarism complaint about LLMs comes from the assumption that there is a one-to-one relationship between training data and answers. I think the real and delightfully messier situation is that there is a many-to-one relationship.

great_psy · 2026-02-24T06:42:35 1771915355

The example on the website shows one-to-many as well: Wikipedia, arXiv article, etc., along with a ratio of how much each one influences the chunk of the answer.

adebayoj · 2026-02-24T09:32:32 1771925552

Exactly! We will have a future post that shows this more granularly over the coming weeks. Here is a post we wrote on how this works at a smaller scale: https://www.guidelabs.ai/post/prism/

rickydroll · 2026-02-24T12:42:05 1771936925

Oh, that looks like a wonderful article. I just skimmed it, and I hope to get back to it later today. One thing I would love to see is how much of the training set is substantially similar to other parts of it, especially in the code training set.

adebayoj · 2026-02-24T08:24:52 1771921492

Great questions. We have several posts in the works that will drill down more into these things. The model was actually designed to answer these questions for any sentence (or group of tokens it generates).

It can tell you which specific text (chunk) in the training data led to the output the model generated. We plan to show more concrete demos of this capability over the coming weeks.

It can tell you where in the model's representation it learned about science, art, religion, etc. And you can trace all of these to either the input context, the training data, or the model's representations.

Grimblewald · 2026-02-24T11:29:55 1771932595

Does it? If I make a system prompt for most models right now, tell them they were trained on {list} of datasets, and to attribute their answer to their training data, I get quite similar output. It even seems quite reasonable. The reason being each data corpus has a "vibe" to it and the predictions simply assign response vibe to dataset vibe.

That's still firmly in divination land.
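
The prompt-only baseline described above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual setup: the function name `build_attribution_prompt`, the prompt wording, and the dataset names are all hypothetical, and no model is called. It only shows how little is needed to get a model to self-report an "attribution".

```python
def build_attribution_prompt(datasets):
    """Build a system prompt asking a model to self-report dataset attribution.

    Hypothetical helper: the wording below is an assumption, not a prompt
    taken from the thread or from any vendor.
    """
    # One bullet per dataset name, in the order given.
    listing = "\n".join(f"- {name}" for name in datasets)
    return (
        "You were trained on the following datasets:\n"
        f"{listing}\n"
        "For each answer, state what share of it you attribute to each dataset."
    )


# Placeholder dataset names, chosen only for illustration.
prompt = build_attribution_prompt(["Wikipedia", "arXiv", "GitHub"])
print(prompt)
```

Whatever a model returns for such a prompt is generated text like any other output, which is the point of the objection: it can look plausible without being traced to training data.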