I feel like I'm the only one not getting the world models hype. We've been talking about them for decades now, and all of it is still theoretical. Meanwhile LLMs and text foundation models showed up, proved to be insanely effective, took over the industry, and people are still going "nah LLMs aren't it, world models will be the gold standard, just wait."
I bet LLMs and world models will merge. World models essentially try to predict the future, with or without actions taken. LLMs with tokenized image input can also be made to predict the future image tokens. It's a very valuable supervised learning signal aside from pre-training and various forms of RL.
I think "world models" is the wrong thing to focus on when contrasting the "animal intelligence" approach (which is what LeCun is striving for) with LLMs, especially since "world model" means different things to different people. Some people would call the internal abstractions/representations that an LLM learns during training a "world model" (of sorts).
The fundamental problem with today's LLMs that will prevent them from achieving human-level intelligence and creativity is that they are trained to predict training set continuations, which creates two major limitations:
1) They are fundamentally a COPYING technology, not a learning or creative one. Of course, as we can see, copying in this fashion will get you an extremely long way, especially since it's deep patterns (not surface level text) being copied and recombined in novel ways. But, not all the way to AGI.
2) They are not grounded, therefore they are going to hallucinate.
The animal intelligence approach, the path to AGI, is also predictive, but what you predict is the external world, the future, not training set continuations. When your predictions are wrong (per perceptual feedback) you take this as a learning signal to update your predictions to do better next time a similar situation arises. This is fundamentally a LEARNING architecture, not a COPYING one. You are learning about the real world, not auto-regressively copying the actions that someone else took (training set continuations).
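As a toy illustration of that predict/compare/update loop (entirely my own sketch, not anyone's actual architecture): an agent with a one-parameter forward model of a hidden linear "world", updated purely from prediction error - surprise is the learning signal.

```python
import random

# Hidden "world" dynamics the agent does not know: the next observation
# is a noisy linear function of the current one.
TRUE_COEFF = 0.8

def world_step(x):
    return TRUE_COEFF * x + random.gauss(0, 0.01)

coeff = 0.0  # the agent's internal model parameter, initially ignorant
lr = 0.1     # learning rate

for _ in range(1000):
    x = random.uniform(-1, 1)       # current perceptual input
    prediction = coeff * x          # predict what will happen next
    actual = world_step(x)          # perceive what actually happened
    surprise = actual - prediction  # prediction error = learning signal
    coeff += lr * surprise * x      # update the model to do better next time
```

The point of the sketch is that nothing here is copied from a training set: the model parameter converges toward the world's true dynamics purely by being wrong and correcting itself.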
Since the animal is also acting in the external world that it is predicting, and learning about, this means that it is learning the external effects of its own actions, i.e. it is learning how to DO things - how to achieve given outcomes. When put together with reasoning/planning, this allows it to plan a sequence of actions that should achieve a given external result ("goal").
Since the animal is predicting the real world, based on perceptual inputs from the real world, this means that its predictions are grounded in reality, which is necessary to prevent hallucinations.
So, to come back to "world models", yes an animal intelligence/AGI built this way will learn a model of how the world works - how it evolves, and how it reacts (how to control it), but this behavioral model has little in common with the internal generative abstractions that an LLM will have learnt, and it is confusing to use the same name "world model" to refer to them both.
RL on LLMs has changed things. LLMs are not stuck in continuation predicting territory any more.
Models build up this big knowledge base by predicting continuations. But then their RL stage gives rewards for completing problems successfully. This requires learning and generalisation to do well, and indeed RL marked a turning point in LLM performance.
A year after RL was made to work, LLMs can now operate in agent harnesses over 100s of tool calls to complete non-trivial tasks. They can recover from their own mistakes. They can write 1000s of lines of code that works. I think it’s no longer fair to categorise LLMs as just continuation-predictors.
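A caricature of that "rewards for completing problems successfully" stage, reduced to a single-step bandit over candidate answers to a verifiable toy task (the task, names, and numbers are illustrative, not any lab's actual setup):

```python
import math
import random

# Policy over candidate answers to a verifiable toy task ("2 + 2 = ?").
logits = {"4": 0.0, "5": 0.0, "22": 0.0}
CORRECT = "4"
lr = 0.5

def sample(logits):
    """Sample an answer from the softmax distribution over logits."""
    zs = {k: math.exp(v) for k, v in logits.items()}
    r = random.uniform(0, sum(zs.values()))
    for k, z in zs.items():
        r -= z
        if r <= 0:
            return k
    return k  # numerical edge case

for _ in range(300):
    answer = sample(logits)
    reward = 1.0 if answer == CORRECT else 0.0  # checkable, not imitative
    zs = {k: math.exp(v) for k, v in logits.items()}
    total = sum(zs.values())
    for k in logits:
        # REINFORCE: move log-prob of the sampled answer up in proportion
        # to the reward it earned
        grad = (1.0 if k == answer else 0.0) - zs[k] / total
        logits[k] += lr * reward * grad
```

The policy ends up concentrated on the verified answer even though no "correct continuation" was ever shown to it - the signal is task success, which is the crux of the RLVR argument.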
Thanks for saying this. It never ceases to amaze me how many people still talk about LLMs like it's 2023, completely ignoring the RLVR revolution that gave us models like Opus that can one-shot huge chunks of code that works first time for novel use cases. Modern LLMs aren't just trained to guess the next token, they are trained to solve tasks.
Forget 2023 - the advances in coding ability in just the last two months are amazing. But they are still not AGI, and it is almost certainly going to take more than just a new training regime such as RL to get there. Demis Hassabis estimates we need another 2-3 "transformer-level" discoveries to get there.
RL adds a lot of capability in the areas where it can be applied, but I don't think it really changes the fundamental nature of LLMs - they are still predicting training set continuations, but now trying to predict/select continuations that amount to reasoning steps steering the output in a direction that had been rewarded during training.
At the end of the day it's still copying, not learning.
RL seems to mostly only generalize in-domain. The RL-trained model may be able to generate a working C compiler, but the "logical reasoning" it had baked into it to achieve this still doesn't stop it from telling you to walk to the car wash, leaving your car at home.
There may still be more surprises coming from LLMs - ways to wring more capability out of them, as RL did, without fundamentally changing the approach, but I think we'll eventually need to adopt the animal intelligence approach of predicting the world rather than predicting training samples to achieve human-like, human-level intelligence (AGI).
You can’t really say it is just predicting continuations when it is learning to write proofs for Erdős problems, formalise significant math results, or perform automated AI research. Those are far beyond what you get by just being a copying and re-forming machine; a lot of these problems require sophisticated application of logic.
I don’t know if this can reach AGI, or if that term makes any sense to begin with. But to say these models have not learnt from their RL seems a bit ludicrous. What do you think training to predict when to use different continuations is other than learning?
I would say LLM’s failure cases like failing at riddles are more akin to our own optical illusions and blind spots rather than indicative of the nature of LLMs as a whole.
I think you're conflating mechanism with function/capability.
I'm not sure what I wrote that made you conclude that I thought these models are not learning anything from their RL training?! Let me say it again: they are learning to steer towards reasoning steps that during training led to rewards.
The capabilities of LLMs, both with and without RL, are a bit counter-intuitive, and I think that, at least in part, comes down to the massive size of the training sets and the even more massive number of novel combinations of learnt patterns they can therefore potentially generate...
In a way it's surprising how FEW new mathematical results they've been coaxed into generating, given that they've probably encountered a huge portion of mankind's mathematical knowledge, and can potentially recombine all of these pieces in at least somewhat arbitrary ways. You might have thought that there are results A, B and C hiding away in some obscure mathematical papers that no human has previously thought to put together (just because of the vast number of such potential combinations), and that might lead to some interesting result.
If you are unsure yourself about whether LLMs are sufficient to reach AGI (meaning full human-level intelligence), then why not listen to someone like Demis Hassabis, one of the brightest and best placed people in the field to have considered this, who says the answer is "no", and that a number of major new "transformer-level" discoveries/inventions will be needed to get there.
> they are still predicting training set continuations
But this is underselling what they do. Probably a large part of what they predict is learnt from their training set, but RL has added a layer on top that does not come from just mimicry.
Again, I doubt this is enough for “AGI” but I think that term is not very well-defined to begin with. These models have now shown they are capable of novel reasoning, they just have to be prodded in the right way.
It’s not clear to me that there isn’t scaffolding that can use LLMs to search for novel improvements, like Karpathy’s recent autoresearch. The models, with the help of RL, seem to be getting to the point where this actually works to some extent, and I would expect this to happen in other fields in the next few years as well.
In general there's a difference between novel and discovering something new.
Pretraining has given the LLM a huge set of lego blocks that it can assemble in a huge variety of ways (although still limited by the "assembly patterns" it has learnt). If the LLM assembles some of these legos into something that wasn't directly in the training set, then we can call that "novel", even though everything needed to do it was present in the training set. I think maybe a more accurate way to think of this is that these "novel" lego assemblies are all part of the "generative closure" of the training set.
Things like generating math proofs are an example of this - the proof itself, as an assembled whole, may not be in the training set, but all the piece parts and thought patterns necessary to construct the proof were there.
I'm not much impressed with Karpathy's LLM autoresearch! I guess this sort of thing is part of the day to day activities of an AI researcher, so might be called "research" in that regard, but all he's done so far is just hyperparameter tuning and bug fixing. No doubt this can be extended to things that actually improve model capability, such as designing post-training datasets and training curriculums, but the bottleneck there (as any AI researcher will tell you) isn't the ideas - it's the compute needed to carry out the experiments. This isn't going to lead to the recursive self-improvement singularity that some are fantasizing about!
I would say these types of "autoresearch" model improvements, and pretty much anything current LLMs/agents are capable of, all fall under the category of "generative closure", which includes things like tool use that they have been trained to do.
It may well be possible to retrofit some type of curiosity onto LLMs, to support discovery and go beyond the "generative closure" of things it already knows, and I expect that's the sort of thing we may see from Google DeepMind in the next 5 years or so in their first "AGI" systems - hybrids of LLMs and hacks that add functionality but don't yet have the elegance of an animal cognitive architecture.
You laid out the theoretical limitations well, and I tend to agree with them.
I just get frustrated when people downplay how big of an impact filling in the gaps at the frontier of knowledge would have. 99.9% of researchers will never have an idea that adds a new spike to the knowledge frontier (rather than filling in holes), and 99.99% of research is just filling in gaps by combining existing ideas (numbers made up). In this realm, autoresearch may not be groundbreaking, but it can do the job. AlphaEvolve is similar.
If LLMs can actually get closer to something like that, it leaves human researchers a whole lot more time to focus on new ideas that could move entire fields forward. And their iteration speed can be a lot faster if AI agents can help with the implementation and testing of them.
> What do you think training to predict when to use different continuations is other than learning?
Sure, training = learning, but the problem with LLMs is that that's where it stops, other than a limited amount of ephemeral in-context learning/extrapolation.
With an LLM, learning stops post-training when it is "born" and deployed, while with an animal that's when it starts! The intelligence of an animal is a direct result of its lifelong learning, whether that's imitation learning from parents and peers (and subsequent experimentation to refine the observed skill), or the never ending process of observation/prediction/surprise/exploration/discovery which is what allows humans to be truly creative - not just behaving in ways that are endless mashups of things they have seen and read about other humans doing (cf training set), but generating truly novel behaviors (such as creating scientific theories) based on their own directed exploration of gaps in mankind's knowledge.
Application of AGI to science and new discovery is a large part of why Hassabis defines AGI as human-equivalent intelligence, and understands what is missing, while others like Sam Altman are content to define AGI as "whatever makes us lots of money".
Memory systems built on top of LLMs could provide continual learning. I do not agree that it is some fundamental limitation.
Claude Code already writes its own memory files. And people already finetune models. There is clear potential to use the former as a form of short-term memory and the latter for long-term “learning”.
The main blockers to this are that models aren’t good enough at managing their own memory, and finetuning is expensive and difficult. But both of these seem like solvable engineering problems.
Continual learning isn't a "fundamental limitation" or unsolvable problem. Animal brains are an existence proof that it's possible, but it's tough to do, and quite likely SGD is not the way to do it, so any attempt to retrofit continual learning to LLMs as they exist today is going to be a hack...
Memory and learning are two different things. Memorization is a small subset of learning. Memorizing declarative knowledge and personal/episodic history (cf. LLM context) is certainly needed, but an animal (or AI intern) also needs to be able to learn procedural skills, which need to become baked into the weights that are generating behavior.
Fine tuning is also no substitute for incremental learning. You might think of it as addressing somewhat the same goal, but really fine tuning is about specializing a model for a particular use, and if you repeatedly fine tune a model for different specializations (e.g. what I learnt yesterday, vs what I learnt the day before) then you will run into the catastrophic forgetting problem.
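The catastrophic forgetting problem is easy to demonstrate even on a one-weight "model" (a deliberately trivial sketch of my own): fine-tune it on task A, then fine-tune the result on task B, and task A performance collapses.

```python
import random

# Two "tasks" sharing one weight: fit y = 2x (task A), then y = -2x (task B).
def train(w, coeff, steps=300, lr=0.1):
    """SGD fine-tuning of weight w toward target coefficient coeff."""
    for _ in range(steps):
        x = random.uniform(-1, 1)
        err = coeff * x - w * x
        w += lr * err * x
    return w

def loss(w, coeff):
    # Expected squared prediction error (up to a constant factor).
    return (coeff - w) ** 2

w = 0.0
w = train(w, 2.0)             # fine-tune on task A ("what I learnt yesterday")
loss_a_before = loss(w, 2.0)  # near zero: task A is mastered
w = train(w, -2.0)            # fine-tune on task B ("what I learnt today")
loss_a_after = loss(w, 2.0)   # task A performance has collapsed
```

With a single shared weight the forgetting is total, but the same effect shows up (less starkly) in large networks, which is exactly why naive repeated fine-tuning is not a continual learning mechanism.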
I agree that incremental learning seems more like an engineering problem rather than a research one, or at least it should succumb to enough brain power and compute put into solving it, but we're now almost 10 years into the LLM revolution (attention paper in 2017) and it hasn't been solved yet - it's not easy.
Fundamentally, I’m more optimistic on how far current approaches can scale. I see no reason why RL could not be used to train models to use memory, and fine-tuning already works, it’s just expensive.
The continual learning we get may be a bit hamfisted, and not fit into a neat architecture, but I think we could actually see it work at scale in the next few years. Whereas new techniques like what Yann LeCun has demonstrated still live heavily in the realm of research. Cool, but not useful yet.
Fine tuning is also not so limited as you suggest. For one, we don’t need to fine tune the same model over and over, you can just start with a frontier model each time. And two, modern models are much better at generating synthetic data or environments for RL. This could definitely work, but it might require a lot of work in data collection and curation, and the ROI is not clear. But if large companies continue to allocate more and more resources to AI in the next few years, I could see this happening.
OpenAI already has a custom model service, and labs have stated they already have custom models built for the military (although how custom those models are is unclear). It doesn’t seem like a huge leap to also fine-tune models over a company’s internal codebases and tooling. Especially for large companies like Google, Amazon, or Stripe that employ tens of thousands of software engineers.
>The fundamental problem with today's LLMs that will prevent them from achieving human level intelligence, and creativity, is that they are trained to predict training set continuations, which creates two very major limitations:
I am of the opinion that imagination and creativity come from emotion, hence a machine that cannot "feel" will never be truly intelligent.
One can go ahead and ask: but you are just a lump of meat, and if you can feel, then a computer of similar structure can too.
If we assume that physical reality is fundamental, then that might make sense. But what if consciousness is fundamental and reality plays on consciousness?
Then randomness, and in turn ideas, come from the attributes of the fundamental reality that we are in.
I'll try to simplify it. Imagine you have an idea that extends your life for a day. Then, from all the possible worlds, in some worlds you find yourself living into the next day (in others you are dead). But this "idea" you had was just one among the infinite sea of possibilities, and your consciousness inside one such world observes you having that idea and surviving for a day!
If you want to create a machine that can do that, it implies that you should be a consciousness inside a world in it (because the machine cannot pick valid worlds from infinite samples, but just enables consciousness to exist in such suitable worlds). So it cannot be done in our reality!
Maybe "Quantum Darwinism" is what I am trying to describe here..
> I am of the opinion that imagination and creativity comes from emotion
How do you see emotion as being necessary for creativity?
It sure seems that things like surprise (prediction failure) driven "curiosity" and exploration (I can't predict what will happen if I do X, so let me try) are behind creativity, pushing the boundaries of knowledge and discovering something new.
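That surprise-driven loop can be sketched as an intrinsic reward equal to prediction error, where the agent preferentially tries whatever it currently predicts worst (a hypothetical toy of my own, not a real curiosity architecture):

```python
# Hidden world dynamics: each action has an outcome the agent must discover.
OUTCOMES = {"a": 1.0, "b": -3.0, "c": 0.5}

pred = {k: 0.0 for k in OUTCOMES}                # agent's forward model
curiosity = {k: float("inf") for k in OUTCOMES}  # last recorded surprise

for _ in range(60):
    # "I can't predict what will happen if I do X, so let me try X"
    action = max(curiosity, key=curiosity.get)
    outcome = OUTCOMES[action]
    surprise = abs(outcome - pred[action])           # prediction error
    pred[action] += 0.5 * (outcome - pred[action])   # learn from the error
    curiosity[action] = surprise  # interest fades as the model improves
```

Exploration is driven entirely by the agent's own ignorance: actions it can already predict become boring, and attention shifts to the remaining gaps, which is the mechanism being claimed for creativity here.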
Perhaps you mean artistic creativity rather than scientific, in which case we're talking about different things, but I'd agree with you since the goal of much art is to elicit an emotional response in those engaging with it.
I don't think there is anything stopping us from implementing emotions, every bit as real as our own, in some form of artificial life if we want to though. At the end of the day emotion comes down to our primitive brain releasing chemicals like adrenaline, dopamine, etc. as a result of certain stimuli, the functioning of our brain/body being affected by those chemicals, and the feedback loop of us then recognizing how our brain/body is operating differently ("I feel sad/excited/afraid" etc). It's all very mechanical.
FWIW I think consciousness is also very mechanical, but it seems somewhat irrelevant to the discussion of intelligence/AGI.
Yeah, it's our nature to feel good about it, which is what evolution does. If you are curious, and if exploration makes you feel good, you have a better chance to survive, and you pass that trait along.
It seems we're basically in agreement, not arguing!
The only quibble I have is whether "feeling good" is the right way to describe how evolution has made us choose to engage in exploration/etc. I don't think it's quite as simple as evolution making things that are good for us feel good, and making things that are bad for us feel bad.
There are a bunch of neurotransmitters and hormones that control how we behave. Evolution discourages us from doing things that are bad for us via a range of emotions including things like fear and disgust (not just "feeling bad"). Evolution also encourages us to do, or keep on doing, things that are good for us via a range of emotions such as enjoyment (this is a tasty fruit), contentment (this feels nice - I'll keep doing it) to curiosity, again not just "feeling good". I think curiosity and exploration (which may lead to learning and discovery, which are good for us) are based around attention and focus ... rather than feeling good, it feels interesting.
I'd say that motivation and feeling are largely unrelated.
We do what we do, not because of motivation, but because that what we've evolved to do.
Feeling really comes after the fact, or independent of it, when we're introspecting on what we've already done (courtesy of evolution), and how we feel about that, or how can we explain it!
There is tons of evidence that this is the way our brains work - we do things "because" then (if asked, or if thinking about it) concoct post-hoc explanations of why we did it. An example of this is split brain patients, where one half of the brain happily explains why the other half did what it did, despite there being no connection between the two (nor any subjective feeling by the patient that there is anything amiss with their brain)!
But sure, we have feelings, closely related to qualia you could say. It does "feel like something" to be depressed, or excited, or inspired, or whatever. I don't see any big mystery to this - our brain is able to self-observe and not surprisingly able to detect its own varied patterns of operation (including ones induced by brain chemicals, natural or otherwise).
I presume where you may want to go with this is "but why does it feel like anything? why is there any subjective experience at all (and would a machine have it too)?", and I think the answer is that this is just an emergent property of having a cognitive apparatus capable of self-observation. We can already see the glimmerings of this in LLMs, maybe a bad example since their thoughts are derived from humans, but nonetheless LLMs self-report as if they are conscious, and have existential discussions about it on Moltbook. It's hard to imagine (to bring this back on topic) that LeCun's animal intelligence, basically something LLM-like that is trained from scratch (no baked in human knowledge) wouldn't report the exact same thing.
Sort of ... (depends on what you mean by "feel" - detects, or is consciously aware of)
I think the way this works is something like this:
1) Our body / brain will detect that we're low on energy and release neurotransmitters indicating this
2) Our body may also provide physical indications of hunger based on empty stomach
3) These hunger-detection (or i-should-eat) signals may directly trigger behavior patterns related to finding food. In a primitive animal or baby this may be direct (baby starts crying, mom provides food), and as a human adult it may include triggering past patterns of food finding (go to the kitchen) when we felt like this before
4) In parallel with 3), the evolutionarily newer part of our brain will eventually recognize what's going on, and that we're feeling hungry, and if we haven't already done anything about it then we might make a more deliberate plan to do something about it "i'm hungry, should get some lunch, what will i have .."
So, I think it's a combination of 3) and 4).
I'd guess there are probably also some animals, maybe some humans, that feed instinctively, more or less all the time, and perhaps have more of an "i'm full, off switch" than an "i'm hungry, on switch".
I don't know - this is obviously partly guess work. Certainly if you ask someone in the kitchen grabbing a snack why they are doing it they will provide a post-hoc explanation of "i'm feeling hungry", even if the "decision" to do it was subconscious.
I'm just being honest. Anyone who tells you they know what it's like to be a bat, or any other animal, and what drives its behavior is lying.
"Animals eat when they're hungry" is a fine story, and obviously there is a lot of truth to it, but there are also obviously a lot of exceptions.
Do you really think the Grizzly, preparing to hibernate, who can barely walk because he is so fat, and is eating his 50th salmon of the day, is eating because he is hungry?
What about the baby birds, mouths open when parent returns with food? Hungry, or just instinct?
What about an alligator that can go months without eating? In captivity it will eat every day if food is offered, and get obese (so keepers don't do that). Is it eating when hungry?
What about grazing animals that eat nutritionally poor food, and spend most of their waking hours eating? An elephant eats for 12-18 hours a day. Is it always hungry, or has evolution just given it survival instincts to behave like this?
When the bat leaves its cave at dusk to go catch insects, is it hungry? What is it like to be a bat?