I don't think the evidence bears that out [0]. I agree that GPT-4 is way better than GPT-3.5, but I don't think most of the OSS models are even close to GPT-3.5. Vicuna comes closer for simple tasks/conversation, but it still doesn't match GPT-3.5 elsewhere IMO, even though GPT-3.5 is also not great at complex tasks.
This is fair; my only evidence is my personal experience, most of which has consisted of trying models on Huggingface or similar, which isn't persuasive.
I will say, based on my cursory glance, that a lot of the tasks here seem odd for a chat AI, though there are certainly applications that might use them. E.g. asking whether someone insulted another person given a transcript of their conversation seems like a somewhat tough sell to me. Nevertheless, ChatGPT performed better. Was that because ChatGPT has a better architecture, a better training set overall, or simply more examples related to these kinds of questions in its training set? Is that even knowable?