A large portion of the model is common (and cross-trained) across multiple languages; only a small part handles language-specific encoding/decoding. That means not only is it easier for them to add a new language, but you can also expect similar performance across languages.
Also, you can mix and match encoders and decoders to translate between whichever languages you want, and it will just work. Previously, there was a separate model for each language pair.
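To make the mix-and-match idea concrete, here is a toy sketch (purely hypothetical, nothing like the real model's internals): each language gets its own small encoder and decoder, a single shared core does the language-independent work, and any encoder can be paired with any decoder. N languages then need 2N components instead of one model per language pair.

```python
def make_encoder(lang):
    # Maps source text into a (toy) shared representation.
    return lambda text: {"src": lang, "tokens": text.split()}

def make_decoder(lang):
    # Maps the shared representation back into (toy) target text.
    return lambda rep: f"[{lang}] " + " ".join(rep["tokens"])

# One encoder and one decoder per language, not one model per pair.
encoders = {lang: make_encoder(lang) for lang in ("en", "fr", "de")}
decoders = {lang: make_decoder(lang) for lang in ("en", "fr", "de")}

def shared_core(rep):
    # Stands in for the large cross-trained portion common to all languages.
    return rep

def translate(text, src, tgt):
    # Pick any encoder and any decoder; the shared core sits in between.
    return decoders[tgt](shared_core(encoders[src](text)))

# Pairing the "en" encoder with the "fr" decoder just works,
# even though no en->fr component was built explicitly.
print(translate("hello world", "en", "fr"))
```

With 3 languages this gives 9 possible directions from only 6 language-specific parts, which is the point: the pairwise-model approach would need a full model for each of those directions.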