Isn’t that the same thing? The non-fine-tuned models also have assumptions based on corpus and training. I don’t think there’s such a thing as a purely objective probability of the next token.
It's very different. We don't know exactly what the model considers good after fine-tuning (which can lead to surprising cases of misalignment), while the probability that something is the next token in the training distribution is well defined. I don't know exactly how they measure it, but they can apparently measure the "loss", which (I think) says how far the model's predicted probabilities are from the actual next tokens in the data.
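To make that "loss" point concrete, here's a rough sketch (my own illustration, not from anyone's actual training code, assuming the standard cross-entropy loss for language models): the loss on one prediction is just the negative log of the probability the model assigned to the token that actually came next.

```python
import math

def cross_entropy_loss(predicted_probs, actual_next_token):
    """Loss for one prediction: -log(probability assigned to the true next token)."""
    return -math.log(predicted_probs[actual_next_token])

# Toy model output over a tiny vocabulary (numbers made up for illustration):
probs = {"cat": 0.7, "dog": 0.2, "car": 0.1}

# If "cat" really was the next token, the loss is small...
low_loss = cross_entropy_loss(probs, "cat")   # -ln(0.7), about 0.36

# ...and if the model put low probability on the true token, the loss is large.
high_loss = cross_entropy_loss(probs, "car")  # -ln(0.1), about 2.30
```

So low average loss over the training data means the model's predicted distribution is close to the empirical next-token distribution, which is why loss works as a measure of "how close to the real probabilities" the model is.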
But fine-tuning is very different from (pre)training. Pretraining proceeds via unsupervised learning on massive amounts of data and compute, while fine-tuning uses much smaller amounts, with supervised learning (instruction tuning) and reinforcement learning (RLHF, constitutional AI).