
The model says it has been trained on a variety of languages, but its current version is English-only. Its translations are bad, probably because it uses a translation model on top of its output. Sometimes the mistakes it makes make it obvious that it is translating English articles.


I don't know what it uses to output non-English text, but if you address it in a different language, it'll quite happily reply in that language. And I'd say it understands it just fine, not really any worse than English - although it sometimes gets the word forms wrong in inflected languages.


But I was asking it for rhyming words in Dutch (because of Sinterklaas), and it suggested words that would rhyme if pronounced as English...


Yup, it doesn't rhyme for me either when I ask for rhymes or poetry in Russian (but I don't think it'd rhyme even in English).

I think that makes sense if the supermajority of the training data was in English - that would bias it heavily toward English by default. So even when it speaks another language, it still "thinks" of the words in English terms for questions like "how does this sound?". I suspect it might be possible to craft a sufficiently detailed prompt to work around that.


I found the same thing happening in my native tongue. It would be much more exciting if it had a "native" understanding of foreign languages.


The original GPT-3 also produced bad translations, just from scraps of foreign languages in its dataset. These seem worse, though, so perhaps you're right that ChatGPT is using another model, the way it uses Internet searches.



