This speaks very much to the idea that LLMs are in some sense a ridiculously effective, somewhat lossy compression algorithm that has been applied to the whole internet.
It's a good way to frame base models that have only been pretrained.
However, modern frontier models have undergone rounds of fine-tuning, RLHF (reinforcement learning from human feedback), and RLVR (RL from verifiable rewards) that turn them into something else. The compressed internet is still in there, but it's wrapped in problem-solving and people-pleasing circuitry.
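The compression framing can be made concrete with Shannon's source-coding bound: under an entropy coder, a symbol the model assigns probability p costs about -log2(p) bits, so a better predictor literally yields a shorter encoding. A minimal sketch (the example text and unigram model are hypothetical, just to show the effect):

```python
import math
from collections import Counter

# Toy illustration of "a predictive model is a compressor":
# an entropy coder spends about -log2(p) bits per symbol of model
# probability p, so better prediction means fewer total bits.
text = "the cat sat on the mat and the cat sat on the hat"
symbols = text.split()

# Baseline: a uniform model over the vocabulary (knows nothing).
vocab = set(symbols)
uniform_bits = len(symbols) * math.log2(len(vocab))

# A slightly better predictor: unigram frequencies fit to the text.
counts = Counter(symbols)
total = len(symbols)
unigram_bits = sum(-math.log2(counts[w] / total) for w in symbols)

print(f"uniform model: {uniform_bits:.1f} bits")
print(f"unigram model: {unigram_bits:.1f} bits")  # strictly fewer bits
```

An LLM plays the same role as the unigram model here, just with vastly better predictions, which is why its weights can be read as a lossy compressed copy of the training corpus.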