Regardless of your opinion of Yann or his views on autoregressive models being "sufficient" for what most would describe as AGI or ASI, this is probably a good thing for Europe. We need more well-capitalized labs that aren't US- or China-centric, and while I do like Mistral, they just haven't been keeping up on the frontier of model performance and seem like they've sort of pivoted into being integration specialists and consultants for EU corporations. That's fine and they've got to make money, but fully ceding the research front is not a good way to keep the EU competitive.
brandonb 3 hours ago [-]
LeCun's technical approach with AMI will likely be based on JEPA, which is also a very different approach than most US-based or Chinese AI labs are taking.
If you're looking to learn about JEPA, LeCun's vision document "A Path Towards Autonomous Machine Intelligence" is long but sketches out a very comprehensive vision of AI research:
https://openreview.net/pdf?id=BZ5a1r-kVsf
Training JEPA models is within reach, even for startups. For example, we're a 3-person startup that trained a health timeseries JEPA. There are JEPA models for computer vision and even for LLMs.
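For anyone curious what that looks like concretely, here is a minimal sketch of the JEPA objective for a timeseries, assuming a PyTorch-style setup (the encoders, shapes, and the 80-step split are made up for illustration, not our actual model):

    import torch
    import torch.nn as nn

    # Toy encoders; real JEPA models typically use transformers or CNNs.
    context_encoder = nn.GRU(input_size=8, hidden_size=64, batch_first=True)
    target_encoder = nn.GRU(input_size=8, hidden_size=64, batch_first=True)
    predictor = nn.Linear(64, 64)

    # The target encoder starts as a copy and is only ever updated by EMA, never by backprop.
    target_encoder.load_state_dict(context_encoder.state_dict())
    for p in target_encoder.parameters():
        p.requires_grad = False

    def jepa_loss(series):
        # series: (batch, time, channels); hide the tail of the series from the context encoder.
        ctx, tgt = series[:, :80], series[:, 80:]
        _, h_ctx = context_encoder(ctx)
        with torch.no_grad():
            _, h_tgt = target_encoder(tgt)
        pred = predictor(h_ctx[-1])  # predict the *embedding* of the hidden part...
        return nn.functional.mse_loss(pred, h_tgt[-1])  # ...not the raw future samples

    @torch.no_grad()
    def ema_update(tau=0.99):
        for p_t, p_c in zip(target_encoder.parameters(), context_encoder.parameters()):
            p_t.mul_(tau).add_(p_c, alpha=1 - tau)

The point is that the loss lives in embedding space rather than input space, which is what separates JEPA from reconstruction-based objectives like masked autoencoding.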
You don't need a $1B seed round to do interesting things here. We need more interesting, orthogonal ideas in AI. So I think it's good we're going to have a heavyweight lab in Europe alongside the US and China.
sanderjd 2 hours ago [-]
Have you published anything about your health time series model? Sounds interesting!
BTW, I went to your website looking for this, but didn't find your blog. I do now see that it's linked in the footer, but I was looking for it in the hamburger menu.
jsnell 2 hours ago [-]
I don't think it's "regardless"; your opinion on whether LeCun is right should be highly correlated with your opinion on whether this is good for Europe.
If you think that LLMs are sufficient and RSI is imminent (<1 year), this is horrible for Europe. It is a distracting boondoggle exactly at the wrong time.
andrepd 2 hours ago [-]
It's been 6 months away for 5 years now. In that time we've seen relatively mild incremental changes, not any qualitative ones. It's probably not 6 months away.
AStrangeMorrow 1 hours ago [-]
Yeah. I feel like, as with many projects, the last 20% takes 80% of the time, and IMHO we are not in the last 20%.
Sure, LLMs are getting better and better, and at least for me more and more useful and more and more correct. Arguably better than humans at many tasks, yet terribly lagging behind at some others.
Coding-wise, one of the things they do “best”, there are still many issues. For me the biggest ones remain lack of initiative and lack of reliable memory. When I use one to write code, the first manifests as sticking to a suboptimal yet overly complex approach quite often. The lack of memory shows in that I have to keep reminding it of edge cases (else it often breaks functionality), or telling it to stop reinventing the wheel instead of using functions/classes already implemented in the project.
All that can be mitigated by careful prompting, but no matter the claims about information recall accuracy, I still find that even with that information in the prompt it is quite unreliable.
And more generally, the simple fact that when you talk to one, the only way to “store” these memories is externally (i.e. not by updating the weights), is kind of like dealing with someone who can’t retain memories and has to keep writing things down to even have a small chance of coping. I get that updating the weights is possible in theory but just not practical, still.
mfru 28 minutes ago [-]
Reminds me of how cold fusion reactors have been only 5 years away for decades now
lordmathis 47 minutes ago [-]
It's 6 months away the same way coding is apparently "solved" now.
HarHarVeryFunny 28 minutes ago [-]
I think we - in the last few months - are very close to, if not already at, the point where "coding" is solved. That doesn't mean that software design or software engineering is solved, but it does mean that a SOTA model like GPT 5.4 or Opus 4.6 has a good chance of being able to code up a working version of whatever you specify, within reason.
What's still missing is the general reasoning ability to plan what to build or how to attack novel problems - how to assess the consequences of deciding to build something a given way - and I doubt that auto-regressively trained LLMs are the way to get there, but there is a huge swathe of apps so boilerplate in nature that this isn't the limitation.
I think that LeCun is on the right track to AGI with JEPA - hardly a unique insight, but it's significant to now have a well-funded lab pursuing this approach. Whether they are successful, or timely, will depend on whether this startup executes as a blue-skies research lab or in more of an urgent engineering mode. I think at this point most of the things needed for AGI are engineering challenges rather than what I'd consider research problems.
basket_horse 2 hours ago [-]
But I swear this time is different! Just give me another 6 months!
andrepd 31 minutes ago [-]
And another 6 trillion dollars :^)
next_xibalba 1 hours ago [-]
> RSI
Wait, we have another acronym to track. Is this the same/different than AGI and/or ASI?
mietek 1 hours ago [-]
Some people should definitely be getting Repetitive Strain Injury from all the hyping up of LLMs.
robrenaud 58 minutes ago [-]
Recursive self improvement. It's when AI speeds up the development of the next AI.
Hm, Singapore looks more like "one of their bases"; they will have offices in Paris, Montréal, Singapore and New York (according to both this article and the interview Yann Le Cun did this morning on France Inter, the most-listened-to radio station in France).
Of course, each relevant newspaper in those areas highlights that it's coming to their place, but it really seems to be distributed.
rubzah 1 hours ago [-]
All your base are belong to Yann LeCun.
RamblingCTO 17 minutes ago [-]
Which would be a good idea, as a European. I'd hate to see the investment go to waste on taxes that are spent on stupid shit anyway. It should go into R&D, not fighting bureaucracy.
fnands 4 hours ago [-]
Probably just a satellite office.
Might be to stay close to some of Yann's collaborators, like Xavier Bresson at NUS
stingraycharles 5 hours ago [-]
That's a Singaporean newspaper, though; not sure if it's objectively their main base or just one of them
throwpoaster 3 hours ago [-]
"Show me the incentive and I will show you the outcome."
Almost certainly the IP will be held in Singapore for tax reasons.
re-thc 5 hours ago [-]
> they are setting up in Singapore as their base
Europe in general has been tightening up its rules / taxes / laws around startups / companies, especially tech and remote.
It's been less friendly these days.
Signez 5 hours ago [-]
Yann Le Cun literally said this morning on the radio in France that it is headquartered in Paris and will pay taxes in France. Go figure…
roromainmain 4 hours ago [-]
For such companies, France also offers generous R&D tax credits (Crédit Impôt Recherche): companies can recover roughly 30% of eligible R&D expenses incurred in France as a tax credit, which can eventually be refunded (in cash) if the company has no taxable profit.
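As a rough sketch of the mechanics (hypothetical numbers; the real CIR has eligibility tests and thresholds this ignores):

    # Hypothetical illustration of the ~30% Credit Impot Recherche.
    eligible_rd_expenses = 10_000_000        # EUR of eligible R&D spend in France
    credit = 0.30 * eligible_rd_expenses     # 3,000,000 EUR tax credit

    corporate_tax_due = 0                    # e.g. a pre-profit startup
    offset = min(credit, corporate_tax_due)  # the credit first offsets tax due
    refundable = credit - offset             # the rest can eventually be refunded in cash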
storus 3 hours ago [-]
Is that alongside the 100% of R&D expenses that can be amortized against taxes when a company has taxable profit covering them?
roromainmain 3 hours ago [-]
Yes indeed, if the company is profitable.
ttoinou 4 hours ago [-]
No, he said something like “well yes, only for the parts of profits made in France”
mi_lk 4 hours ago [-]
Doesn’t he live in New York himself? Although not sure if that matters depending on his role
kvgr 5 hours ago [-]
There will be no corporate taxes for a long time, so all's good.
Imustaskforhelp 2 hours ago [-]
This is a Singaporean news article from a Singaporean company [0] (had to look it up).
As such, they are more likely to focus on Singapore news and exaggerate the claims.
Singapore isn't the key location. From what I am seeing online, France is the major location.
Singapore is just one of the more satellite-like offices. They have many offices around the world, it seems.
[0]: https://www.sgpbusiness.com/company/Sph-Media-Limited
While I’d love there to be a European frontier model, I do very much enjoy Mistral. For the price and speed it outperforms any other model for my use cases (language-learning-related formatting, non-code, non-research).
vessenes 1 hours ago [-]
I’m a partner in a fund that wrote a small check into this (I have no private knowledge of the deal). While I agree that one’s opinion on autoregressive models doesn’t matter, I think the fact of whether or not autoregressive models work matters a lot, and particularly so in LeCun’s case.
What’s different about investing in this than investing in say a young researcher’s startup, or Ilya’s superintelligence? In both those cases, if a model architecture isn’t working out, I believe they will pivot. In YL’s case, I’m not sure that is true.
In that light, this bet is a bet on YL’s current view of the world. If his view is accurate, this is very good for Europe. If inaccurate, then this is sort of a nothing-burger; company will likely exit for roughly the investment amount - that money would not have gone to smaller European startups anyway - it’s a wash.
FWIW, I don’t think the original complaint about auto-regression “errors exist, errors always multiply under sequential token choice, ergo errors are endemic and this architecture sucks” is intellectually that compelling. Here: “world model errors exist, world model errors will always multiply under sequential token choice, ergo world model errors are endemic and this architecture sucks.” See what I did there?
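For reference, the compounding argument being parodied here is usually formalized as: assume an independent per-token error rate ε, then

    P(\text{sequence correct}) = (1 - \varepsilon)^n \longrightarrow 0 \quad \text{as } n \to \infty

The independence-and-no-recovery assumption is doing all the work; drop it, and the exponential decay no longer follows, for LLM token streams and world-model rollouts alike.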
On the other hand, we have a lot of unused training tokens in videos, I’d like very much to talk to a model with excellent ‘world’ knowledge and frontier textual capabilities, and I hope this goes well. Either way, as you say, Europe needs a frontier model company and this could be it.
giancarlostoro 4 hours ago [-]
I didn't really know who he was, so I went and found his Wikipedia page, which is written like either he wrote it himself to stroke his ego, or someone who likes him wrote it to stroke his ego:
> He is the Jacob T. Schwartz Professor of Computer Science at the Courant Institute of Mathematical Sciences at New York University. He served as Chief AI Scientist at Meta Platforms before leaving to work on his own startup company.
That entire sentence before the remark about his service at Meta could have been axed; it's weird to me when people compare themselves to someone else who is well known. It's the most Kanye West thing you can do. Mind you, the more I read about him, the more I discovered he is in fact egotistical. Good luck having a serious engineering team with someone who is egotistical.
pama 4 hours ago [-]
You underestimate academia. Any academic who reads these two sentences only focuses on the first one: he has a named chair at Courant. In Germany, being a Prof is added to your ID card/passport and becomes part of your official name, like knighthood in other countries.
dr_hooo 1 hours ago [-]
Not true regarding the IDs; only PhD titles can be added, not job descriptions. Source: academia person in Germany.
DeathArrow 19 minutes ago [-]
It seems Germans add their PhD titles even to their nicknames. :)
timr 4 hours ago [-]
It's not comparing him to anyone. He has an endowed professorship. This is standard in academia, and you give the name because a) it's prestigious for the recipient and b) it strokes the ego of the donor.
This is just the official name of a chair at NYU. I'm not even sure Jacob T. Schwartz is more well known than Yann LeCun.
stephencanon 3 hours ago [-]
Yann is definitely more well-known outside of academia. Inside academia, it's going to depend a lot on your specific background and how old you are.
bobwaycott 4 hours ago [-]
That’s not a comparison to another person. That’s his job title. It is not uncommon for universities to have distinguished chairs within departments named after a notable person—in this case, the founder of NYU’s Department of Computer Science.
g947o 3 hours ago [-]
Eh, that paragraph reads perfectly normal to me.
Either you have not read enough Wikipedia pages, or you have too much to complain about. (Or both.)
neversupervised 1 hours ago [-]
Is it good? This will almost certainly fail. Not because of Yann or Europe, but because these sorts of hyper-hyped projects fail. SSI and Thinking Machines haven't lived up to the hype.
ma2rten 1 hours ago [-]
Erm... OpenAI was hyped when it started, and it took 6 years to take off. It's way too early to declare that SSI and Thinking Machines have failed.
koakuma-chan 34 minutes ago [-]
They took money and haven't released anything. How are they doing?
A_D_E_P_T 5 hours ago [-]
Justifiable.
There are a lot more degrees of freedom in world models.
LLMs are fundamentally capped because they only learn from static text -- human communications about the world -- rather than from the world itself, which is why they can remix existing ideas but find it all but impossible to produce genuinely novel discoveries or inventions. A well-funded and well-run startup building physical world models (grounded in spatiotemporal understanding, not just language patterns) would be attacking what I see as the actual bottleneck to AGI. Even if they succeed only partially, they may unlock the kind of generalization and creative spark that current LLMs structurally can't reach.
jnd-cz 2 hours ago [-]
The sum of human knowledge is more than enough to come up with innovative ideas, and not every field works directly with the physical world. Still, I would say there's enough information in written history to create a virtual simulation of a 3D world with all physical laws applying (to a certain degree, because computation is limited).
What current LLMs lack is inner motivation to create something on their own without being prompted. To think in their free time (whatever that means for batch, on demand processing), to reflect and learn, eventually to self modify.
I have a simple brain, limited knowledge, limited attention span, limited context memory. Yet I create stuff based on what I see and read online. Nothing special: sometimes more based on someone else's project, sometimes on my own ideas, which I have no doubt aren't that unique among 8 billion other people. Yet consulting with AI provides me with more ideas applicable to my current vision of what I want to achieve. Sure, it's mostly based on generally known (not always known to me) good practices. But my thoughts work the same way, only more limited by what I have slowly learned so far in my life.
daxfohl 44 minutes ago [-]
I guess you need two things to make that happen. First, more specialization among models and an ability to evolve; else you get all instances thinking roughly the same thing, or deer-in-the-headlights where they don't know which of the millions of options they should think about. Second, fewer guardrails; there's only so much you can do by pure thought.
The problem is, idk if we're ready to have millions of distinct, evolving, self-executing models running wild without guardrails. It seems like a contradiction: you can't achieve true cognition from a machine while artificially restricting its boundaries, and you can't lift the boundaries without impacting safety.
andy12_ 5 hours ago [-]
I don't understand this view. How I see it, the fundamental bottleneck to AGI is continual learning and backpropagation. Models today are static, and human brains don't learn or adapt themselves with anything close to backpropagation. World models don't solve any of these problems; they are fundamentally the same kind of deep learning architectures we are used to working with. Heck, if you think learning from the world itself is the bottleneck, you can just put a vision-action LLM in a reinforcement learning loop in a robotic/simulated body.
zelphirkalt 5 hours ago [-]
> I don't understand this view. How I see it, the fundamental bottleneck to AGI is continual learning and backpropagation. Models today are static, and human brains don't learn or adapt themselves with anything close to backpropagation.
Even with continuous backpropagation and "learning", enriching the training data (so-called online learning), the limitations will not disappear. LLMs will not be able to conclude things about the world based on fact and deduction. They only consider what is likely given their training data. They will not foresee/anticipate events that are unlikely or non-existent in their training data but are bound to happen due to real-world circumstances. They are not intelligent in that way.
Whether humans always apply that much effort to conclude these things is another question. The point is that humans fundamentally are capable of doing that, while LLMs are structurally not.
The problems are structural/architectural. I think it will take another 2-3 major leaps in architectures before these AI models reach human-level general intelligence, if they ever reach it. So far they can "merely" often "fake it" when things are statistically common in their training data.
andy12_ 5 hours ago [-]
> Even with continuous backpropagation and "learning"
That's what I said. Backpropagation cannot be enough; that's not how neurons work in the slightest. When you put biological neurons in a Pong environment they learn to play not through some kind of loss or reward function; they self-organize to avoid unpredictable stimulation. As far as I know, no architecture learns in such an unsupervised way.
https://www.sciencedirect.com/science/article/pii/S089662732...
Forgive me for being ignorant, but "loss" in the supervised-learning ML context encodes how unlikely (high loss) or likely (low loss) the network's prediction of the output was, given the input.
This sounds very similar to me to what neurons do (avoid unpredictable stimulation).
andy12_ 2 hours ago [-]
So, I have been thinking about this for a little while. Imagine a model f that takes a world state x and makes a prediction y. At a high level, a traditional supervised model is trained like this:

    f(x)=y' => loss(y',y) => how good was my prediction? Train f through backprop with that error.

A model trained with reinforcement learning is more like this, where m(y') is the resulting world state of taking the action y' the model predicted:

    f(x)=y' => m(y')=z => reward(z) => how good was the state I was in based on my actions? Train f with an algorithm like REINFORCE with the reward, as the world m is a non-differentiable black box.

While a group of neurons is more like predicting the resulting world state of taking my action, g(x,y'), and trying to learn by both tuning g and the action taken f(x):

    f(x)=y' => m(y')=z => g(x,y')=z' => loss(z,z') => how predictable were the results of my actions? Train g normally with backprop, and train f with an algorithm like REINFORCE with negative surprise as a reward.

After talking with GPT5.2 for a little while, it seems like Curiosity-driven Exploration by Self-supervised Prediction [1] might be an architecture similar to the one I described for neurons? But with the twist that f is rewarded by making the prediction error bigger (not smaller!) as a proxy for "curiosity".
[1] https://arxiv.org/pdf/1705.05363
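A compact sketch of that third scheme, for the curious (a toy setup I made up; the actual ICM architecture in [1] differs in the details):

    import torch
    import torch.nn as nn

    policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))           # f: state -> action logits
    world_model = nn.Sequential(nn.Linear(4 + 2, 32), nn.ReLU(), nn.Linear(32, 4))  # g: (state, action) -> next state

    def losses(x, env_step):
        # x: (batch, 4) current states; env_step: the black-box world m.
        dist = torch.distributions.Categorical(logits=policy(x))
        a = dist.sample()                                          # f picks an action
        z = env_step(x, a)                                         # true next state from m
        a_onehot = nn.functional.one_hot(a, num_classes=2).float()
        z_pred = world_model(torch.cat([x, a_onehot], dim=-1))     # g's prediction
        surprise = ((z_pred - z.detach()) ** 2).mean(dim=-1)       # per-sample prediction error
        g_loss = surprise.mean()                                   # g learns to model the world
        reward = -surprise.detach()                                # neuron-style: avoid the unpredictable
        f_loss = -(dist.log_prob(a) * reward).mean()               # REINFORCE update for f
        return g_loss + f_loss

Flip the sign of the reward and you get the curiosity variant from the paper, where f actively seeks out whatever g predicts badly.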
I think people MOSTLY foresee and anticipate events in OUR training data, which mostly comprises information collected by our senses.
Our training data is a lot more diverse than an LLM's. We also leverage our senses as a carrier for communicating abstract ideas, using audio and visual channels that may or may not be grounded in reality. We have TV shows, video games, programming languages and all sorts of rich and interesting things we can engage with that do not reflect our fundamental reality.
Like LLMs, we can hallucinate while we sleep or we can delude ourselves with untethered ideas, but UNLIKE LLMs, we can steer our own learning corpus. We can train ourselves with our own untethered “hallucinations” or we can render them in art and share them with others so they can include it in their training corpus.
Our hallucinations are often just erroneous models of the world. When we render it into something that has aesthetic appeal, we might call it art.
If the hallucination helps us understand some aspect of something, we call it a conjecture or hypothesis.
We live in a rich world filled with rich training data. We don’t magically anticipate events not in our training data, but we’re also not void of creativity (“hallucinations”) either.
Most of us are stochastic parrots most of the time. We’ve only gotten this far because there are so many of us and we’ve been on this earth for many generations.
Most of us are dazzled and instinctively driven to mimic the ideas that a small minority of people “hallucinate”.
There is no shame in mimicking or being a stochastic parrot. These are critical features that helped our ancestors survive.
jstummbillig 4 hours ago [-]
> They will not foresee/anticipate events, that are unlikely or non-existent in their training data, but are bound to happen due to real world circumstances. They are not intelligent in that way.
Can you be a bit more specific at all about the bounds? Maybe via an example?
wiz21c 5 hours ago [-]
I'm sure that if a car appeared from nowhere in the middle of your living room, you would not be prepared at all.
So my question is: when is there enough training data that you can handle 99.99% of the world?
ben_w 5 hours ago [-]
> Models today are static, and human brains don't learn or adapt themselves with anything close to backpropagation.
While I suspect the latter is a real problem (because all mammal brains* are much more example-efficient than all ML), the former is more about productisation than a fundamental thing: the models can be continuously updated already, but that makes it hard to deal with regressions. You kinda want an artefact with a version stamp that doesn't change itself before you release the update, especially as this isn't like normal software where specific features can be toggled on or off in isolation from everything else.
* I think. Also, I'm saying "mammal" because of an absence of evidence (to my *totally amateur* skill level) not evidence of absence.
charcircuit 5 minutes ago [-]
Agents have the ability to learn continually.
10xDev 5 hours ago [-]
The fact that models aren't continually updating seems more like a feature. I want to know the model is exactly the same as it was the last time I used it. Any new information it needs can be stored in its context window or in a file to read the next time it needs to access it.
kergonath 4 hours ago [-]
> The fact that models aren't continually updating seems more like a feature.
I think this is true to some extent: we like our tools to be predictable. But we’ve already made one jump by going from deterministic programs to stochastic models. I am sure the moment a self-evolving AI shows up that clears the "useful enough" threshold, we’ll make that jump as well.
10xDev 29 minutes ago [-]
Stochasticity and unpredictability aren't exactly the same. I would claim current LLMs are generally predictable, even if not as predictable as a deterministic program.
jnd-cz 2 hours ago [-]
Unless you use your own local models, you don't even know when OpenAI or Anthropic tweaked the model, more or less. One week it's version x, next week it's version y. Just like your operating system is continuously evolving, from smaller patches of specific apps to a whole new kernel version and OS release.
10xDev 32 minutes ago [-]
There is still a huge gap between a model continuously updating itself and weekly patches by a specialist team. The former would make things unpredictable.
A_D_E_P_T 5 hours ago [-]
You could have continual learning on text and still be stuck in the same "remixing baseline human communications" trap. It's a nasty one, very hard to avoid, possibly even structurally unavoidable.
As for the "just put a vision LLM in a robot body" suggestion: People are trying this (e.g. Physical Intelligence) and it looks like it's extraordinarily hard! The results so far suggest that bolting perception and embodiment onto a language-model core doesn't produce any kind of causal understanding. The architecture behind the integration of sensory streams, persistent object representations, and modeling time and causality is critically important... and that's where world models come in.
energy123 5 hours ago [-]
I don't understand why online learning is that necessary. If you took Einstein at 40 and surgically removed his hippocampus so he can't learn anything he didn't already know (meaning no online learning), that's still a very useful AGI. A hippocampus is a nice upgrade to that, but not super obviously on the critical path.
staticman2 3 hours ago [-]
> If you took Einstein at 40 and surgically removed his hippocampus so he can't learn anything he didn't already know (meaning no online learning), that's still a very useful AGI.
I like how people are accepting this dubious assertion that Einstein would be "useful" if you surgically removed his hippocampus and engaging with this.
It also calls this Einstein an AGI rather than a disabled human???
daxfohl 39 minutes ago [-]
He basically said that himself:
"Reading, after a certain age, diverts the mind too much from its creative pursuits. Any man who reads too much and uses his own brain too little falls into lazy habits of thinking".
-- Albert Einstein
zelphirkalt 5 hours ago [-]
I guess the sheer amount and variety of information you would need to pre-encode to get an Einstein at 40 is huge: the everyday stream of high-resolution video, the actions and consequences, and the thoughts and ideas he had in every single moment until the age of 40. That includes social interactions, like a conversation and the other person's facial expressions, in combination with what was said and background knowledge about that person. Even a single conversation's data is a huge amount of data.
But one might say that the brain is not lossless ... True, good point. But in what way is it lossy? Can that be simulated well enough to learn an Einstein? What gives events significance is very subjective.
a-french-anon 2 hours ago [-]
Kinda a moot point in my eyes because I very much doubt you can arrive at the same result without the same learning process.
jeltz 3 hours ago [-]
It could possibly be useful but I don't see why it would be AGI.
andy12_ 5 hours ago [-]
That's true. Though would that hippocampus-less Einstein be able to keep making novel, complex discoveries from that point forward? Seems difficult. He would rapidly reach the limits of his short-term memory (the same way current models rapidly reach the limits of their context windows).
andsoitis 5 hours ago [-]
Where does that training data come from?
robrenaud 47 minutes ago [-]
Was Alphago's move 37 original?
In the last step of training LLMs, reinforcement learning from verified rewards, LLMs are trained to maximize the probability of solving problems using their own output, with a reward signal akin to winning in Go. It's not just imitating human-written text.
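Roughly, the verified-reward loop looks like this (a bare-bones sketch; model.sample, model.log_prob, and extract_final_answer are hypothetical stand-ins, and real pipelines use PPO/GRPO-style updates, but the reward structure is the point):

    def verified_reward(problem, completion):
        answer = extract_final_answer(completion)   # programmatic checker, not a human label
        return 1.0 if answer == problem.ground_truth else 0.0

    for problem in train_set:
        completions = model.sample(problem.prompt, n=8)       # the model's own outputs
        rewards = [verified_reward(problem, c) for c in completions]
        baseline = sum(rewards) / len(rewards)                # group-average baseline
        for c, r in zip(completions, rewards):
            # Raise the log-probability of completions that beat the group average.
            loss = -(r - baseline) * model.log_prob(c, problem.prompt)
            loss.backward()
        optimizer.step(); optimizer.zero_grad()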
Fwiw, I agree that world models and some kind of learning from interacting with physical reality, rather than massive amounts of digitized gym environments is likely necessary for a breakthrough for AGI.
Unearned5161 5 hours ago [-]
I have a pet peeve with the concept of "a genuinely novel discovery or invention", what do you imagine this to be? Can you point me towards a discovery or invention that was "genuinely novel", ever?
I don't think it makes sense conceptually unless you're literally referring to discovering new physical things like elements or something.
Humans are remixers of ideas. That's all we do all the time. Our thoughts and actions are dictated by our environment and memories; everything must necessarily be built up from pre-existing parts.
bonesss 3 hours ago [-]
Genuinely novel discovery or invention?
Einstein’s theory of relativity springs to mind, which is deeply counter-intuitive and relies on the interaction of forces unknowable to our basic Newtonian senses.
There’s an argument that it’s all turtles (someone told him about universes, he read about gravity, etc), but there are novel maths and novel types of math that arise around and for such theories which would indicate an objective positive expansion of understanding and concept volume.
davidfarrell 4 hours ago [-]
W Brian Arthur's book "The Nature of Technology" provides a framework for classifying new technology as elemental vs innovative that I find helpful. For example, the Hunt-McIlroy diff operates on the phenomenon that ordered correspondence survives editing. That was an invention (discovery of a natural phenomenon and a means to harness it). Myers diff improves the performance by exploiting the fact that text changes are sparse. That's innovation. A Python app using libdiff, that's engineering.
And then you might say in terms of "descendants": invention > innovation > engineering. But it's just a perspective.
A_D_E_P_T 5 hours ago [-]
Suno is transformer-based; in a way it's a heavily modified LLM.
You can't get Suno to do anything that's not in its training data. It is physically incapable of inventing a new musical genre. No matter how detailed the instructions you give it, and even if you cheat and provide it with actual MP3 examples of what you want it to create, it is impossible.
The same goes for LLMs and invention generally, which is why they've made no important scientific discoveries.
You can learn a lot by playing with Suno.
0x3f 4 hours ago [-]
Novel things can be incremental. I don't think LLMs can do that either, at least I've never seen one do it.
10xDev 5 hours ago [-]
Whether it is text or an image, it is just bits for a computer. A token can represent anything.
A_D_E_P_T 5 hours ago [-]
Sure, but don't conflate the representation format with the structure of what's being represented.
Everything is bits to a computer, but text training data captures the flattened, after-the-fact residue of baseline human thought: Someone's written description of how something works. (At best!)
A world model would need to capture the underlying causal, spatial, and temporal structure of reality itself -- the thing itself, that which generates those descriptions.
You can tokenize an image just as easily as a sentence, sure, but a pile of images and text won't give you a relation between the system and the world. A world model, in theory, can. I mean, we ought to be sufficient proof of this, in a sense...
firecall 5 hours ago [-]
It’s worth noting how our human relationship or understanding of our world model changed as our tools to inspect and describe our world advanced.
So when we think about capturing any underlying structure of reality itself, we are constrained by the tools at hand.
The capability of the tool forms the description which grants the level of understanding.
whiplash451 3 hours ago [-]
The term LLM is confusing your point because VLMs belong to the same bin according to Yann.
Using the term autoregressive models instead might help.
ml-anon 45 minutes ago [-]
Honestly, how do people who know so little have this much confidence to post here?
energy123 5 hours ago [-]
Why can't LLMs (transformers trained on multimodal token sequences, potentially containing spatiotemporal information) be a world model?
> One major critique LeCun raises is that LLMs operate only in the realm of language, which is a simple, discrete space compared to the continuous, complex physical world we live in. LLMs can solve math problems or answer trivia because such tasks reduce to pattern completion on text, but they lack any meaningful grounding in physical reality. LeCun points out a striking paradox: we now have language models that can pass the bar exam, solve equations, and compute integrals, yet “where is our domestic robot? Where is a robot that’s as good as a cat in the physical world?” Even a house cat effortlessly navigates the 3D world and manipulates objects — abilities that current AI notably lacks. As LeCun observes, “We don’t think the tasks that a cat can accomplish are smart, but in fact, they are.”
energy123 5 hours ago [-]
But they don't only operate on language? They operate on token sequences, which can be images, coordinates, time, language, etc.
kergonath 4 hours ago [-]
It’s an interesting observation, but I think you have it backwards. The examples you give are all using discrete symbols to represent something real and communicating this description to other entities. I would argue that all your examples are languages.
samrus 4 hours ago [-]
What does the first L stand for? That's not just vestigial; their model of the world is formed almost exclusively from language, rather than from a range of things contributing significantly, as for humans.
The biggest thing that's missing is actual feedback on their decisions. They have no "idea" of that, because transformers and embeddings don't model that yet. And language descriptions and image representations of feedback aren't enough. They are too disjointed. It needs more.
bsenftner 5 hours ago [-]
There will be no "unlocking of AGI" until we develop a new science capable of artificial comprehension. Comprehension is the cornucopia that produces everything we are, given raw stimulus an entire communicating Universe is generated with a plethora of highly advanceds predator/prey characters in an infinitely complex dynamic, and human science and technology have no lead how to artificially make sense of that in a simultaneous unifying whole. That's comprehension.
chilmers 5 hours ago [-]
Ironically, your comment is practically incomprehensible.
copperx 5 hours ago [-]
These two comments above me capture Slashdot in the early 2000s.
rvz 5 hours ago [-]
A lot more justifiable than say, Thinking Machines at least. But we will "see".
World models and vision seem like a great use case for robotics, which I can imagine being the main driver of AMI.
az226 3 hours ago [-]
Yann LeCun seeks $5B+ valuation for world model startup AMI (Amilabs).
He has hired LeBrun to the helm as CEO.
AMI has also hired LeFunde as CFO and LeTune as head of post-training.
They’re also considering hiring LeMune as Head of Growth and LePrune to lead inference efficiency.
I was thinking the same. Are all the people he hires LeSomething, like those working at Bolson Construction having -son as a suffix?
dude250711 2 hours ago [-]
First grinding LEetcode, now having to have 'Le' in the name?
I have no chance in AI industry...
andrepd 2 hours ago [-]
Bolson-ass hiring policy.
Oras 5 hours ago [-]
> But this is not an applied AI company.
There is absolutely no doubt about Yann's impact on AI/ML, but he had access to many more resources at Meta, and we didn't see anything.
It could be a management issue, though, and I sincerely wish we will see more competition, but from what I quoted above, it does not seem like it.
Understanding the world through videos (mentioned in the article) is just what video models have already done, and they are getting pretty good (see Seedance, Kling, Sora, etc.). So I'm not quite sure how what he proposes would work.
torginus 2 hours ago [-]
Most folks get paid a lot more in a corporate job than tinkering at home - using the 'follow the money' logic it would make sense they would produce their most inspired works as 9-5 full stack engineers.
But passion and the freedom to explore are often more important than resources
stein1946 3 hours ago [-]
> There is absolutely no doubt about Yann's impact on AI/ML, but he had access to many more resources at Meta, and we didn't see anything.
That's true for 99% of scientists, but dismissing their opinion based on them not having done world-shattering / groundbreaking research is probably not the way to go.
> I sincerely wish we will see more competition
I really wish we won't; science isn't markets.
> Understanding the world through videos
The word "understanding" is doing a lot of heavy lifting here. I find myself prompting again and again for corrections on an image or a summary, and "it" still does not "understand" and keeps doing the same thing over and over again.
nashadelic 1 hours ago [-]
Your take is brutal but spot on
boccaff 4 hours ago [-]
Llama models pushed the envelope for a while, and having them "open-weight" allowed a lot of tinkering. I would say that most fine-tuned models evolved from work on top of Llama models.
oefrha 4 hours ago [-]
Llama wasn’t Yann LeCun’s work and he was openly critical of LLMs, so it’s not very relevant in this context.
> My only contribution was to push for Llama 2 to be open sourced.
Quite a big contribution in practice.
oefrha 19 minutes ago [-]
Sure, but I don't think that's relevant for a startup with $1B of VC money either. Meta can afford to (attempt to) commoditize their complement.
YetAnotherNick 2 hours ago [-]
> we didn't see anything.
Is this a troll? Even if we just ignore Llama, Meta invented and released so much foundational research and open source code. I would say that the computer vision field would be years behind if Meta hadn't published core research like DETR or MAE.
_giorgio_ 4 hours ago [-]
I can’t reconcile this dichotomy: most of the landmark deep learning papers were developed with what, by today’s standards, were almost ridiculously small training budgets — from Transformers to dropout, and so on.
So I keep wondering: if his idea is really that good — and I genuinely hope it is — why hasn’t it led to anything truly groundbreaking yet? It can’t just be a matter of needing more data or more researchers. You tell me :-D
samrus 3 hours ago [-]
It's a matter of needing more time, which is a resource even SV VCs are scared to throw around. Look at the timeline of all these advancements and how long they took:
Lecun introduced backprop for deep learning back in 1989
Hinton published about contrastive divergence in next-token prediction in 2002
Alexnet was 2012
Word2vec was 2013
Seq2seq was 2014
AiAYN was 2017
UnicornAI was 2019
Instructgpt was 2022
This makes a lot of people think that things are just accelerating and they can be along for the ride. But it's the years and years of foundational research that allow this to be done. That toll has to be paid for the successors of LLMs to be able to reason properly and operate in the world the way humans do. That sowing won't happen as fast as the reaping did. LeCun wants to plant those seeds; the others, who only want to eat the fruit, don't get that they have to wait.
_giorgio_ 40 minutes ago [-]
If his ideas had real substance, we would have seen substantial results by now.
He introduced I-JEPA in 2023, so almost three years ago at this point.
If he still hasn’t produced anything truly meaningful after all these years at Meta, when is that supposed to happen? Yann LeCun has been at Facebook/Meta since December 2013.
Your chronological sequence is interesting, but it refers to a time when the number of researchers and the amount of compute available were a tiny fraction of what they are today.
the_real_cher 5 hours ago [-]
He was suffocated by the corporate aspect of Meta, I suspect.
mihaitoth 1 hours ago [-]
This couldn't have happened soon enough, for two reasons.
1) the world has become a bit too focused on LLMs (although I agree that the benefits & new horizons that LLMs bring are real). We need research on other types of models to continue.
2) I almost wrote "Europe needs some aces". Although I'm European, my attitude is not at all one of competition. This is not a card game. What Europe DOES need is an ATTRACTIVE WORKPLACE, so that talent that is useful for AI can also find a place to work here, not only overseas!
The link does not work; it goes into a loop at the verify-human check with some weird redirect
Looks like you appended the original URL to the end
storus 2 hours ago [-]
Wasn't there some recent argument that world models won't achieve AGI either, due to overlooking the normative framework, failing to learn the fundamental symmetries of the world purely from data, and collapsing in multi-step reasoning? JEPA sacrifices fidelity for abstract representation, yet how does that help in the real world, where fidelity is the most important point? It's like relying on differential equations, only to soon find out they cover a minuscule amount of real-world problems and almost all interesting problems are unsolvable by them.
paxys 3 hours ago [-]
I feel like I'm the only one not getting the world models hype. We've been talking about them for decades now, and all of it is still theoretical. Meanwhile LLMs and text foundation models showed up, proved to be insanely effective, took over the industry, and people are still going "nah LLMs aren't it, world models will be the gold standard, just wait."
pendenthistory 2 hours ago [-]
I bet LLMs and world models will merge. World models essentially try to predict the future, with or without actions taken. LLMs with tokenized image input can also be made to predict the future image tokens. It's a very valuable supervised learning signal aside from pre-training and various forms of RL.
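Mechanically nothing new is needed at the loss level for that; a sketch, assuming frames are already quantized into discrete tokens by some image tokenizer (names here are hypothetical):

    # One interleaved sequence; next-token cross-entropy covers both modalities.
    seq = text_tokens + [IMG] + frame_t0_tokens + [IMG] + frame_t1_tokens
    logits = model(seq[:-1])
    loss = cross_entropy(logits, seq[1:])  # "predicting the future" = predicting frame t1's tokens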
mmaunder 1 hours ago [-]
That's between 1 and 10 training runs on a large foundational model, depending on pricing discounts and how much they manage to optimize it. I priced this out last night on AWS, which is admittedly expensive, but models have also gotten larger.
npn 5 hours ago [-]
I wish him luck.
Recently all papers have been about LLMs; it brings on fatigue.
As GPT-style models are nearly reaching their limits, a new architecture could bring new discoveries.
secondary_op 4 hours ago [-]
That being said, Yann LeCun's twitter reposts are below average IQ.
goldenarm 3 hours ago [-]
Do you have a recent example ?
htrp 2 hours ago [-]
Impressive that the round was 100% oversubscribed, but to be expected when it's the prof who trained a good chunk of the current AI founders.
insydian 5 hours ago [-]
As someone in the tech Twitter sphere, this is Yann and his ideas performing a suplex on LLM-based companies. It is completely unfathomable to start an AI research company… only sell off 20% and have 1 billion for screwing around for a few years.
insydian 5 hours ago [-]
I liken this to watching a godzilla esque movie. Just grab some popcorn and enjoy the ride.
That article is from June 2025 so may be out of date, and the definition of "seed round" is a bit fuzzy.
_giorgio_ 4 hours ago [-]
Thinking Machines looks half-dead already.
The giant seed round proves investors were willing to fund Mira Murati, not that the company had built anything durable.
Within months, it had already lost cofounder Andrew Tulloch to Meta, then cofounders Barret Zoph and Luke Metz plus researcher Sam Schoenholz to OpenAI; WIRED also reported that at least three other researchers left. At that point, citing it as evidence of real competitive momentum feels weak.
az226 3 hours ago [-]
Was just a grift
hnarayanan 2 hours ago [-]
Shock, gasp.
ardawen 1 hours ago [-]
Does anyone have a sense of how funding like this is typically allocated?
How much tends to go toward compute/training versus researchers, infrastructure, and general operations?
imjonse 2 hours ago [-]
At least some of that money should definitely go towards improving his powerpoint slides on JEPA related work :)
A fair amount of negative comments here, but Yann might very well be the person who brings the Bell Labs culture back to life. It’s been badly missing, and not just in Europe.
margorczynski 5 hours ago [-]
He couldn't achieve even parity with LLMs during his days at Meta (most probably with billions in resources at his disposal), but he'll succeed now? What is the pitch?
samrus 3 hours ago [-]
The pitch isn't to try to squeeze money out of a product like Altman does. It's to lay the groundwork for the next evolution in AI. LLMs were built on decades of work and they've hit their limits. We'll need to invest a lot of time building foundations, without getting any tangible yield, for the next step to work. Get too greedy and you'll be stuck.
itigges22 4 hours ago [-]
I just saw a post from Yann mentioning that AMI Labs is hiring too!
sylware 4 hours ago [-]
If, for even one second, they get into a position that threatens Big Tech AI (mostly if not entirely US-based) in any way, they will be raided by international finance to be dismantled and poached hardcore by some massive US "investment funds" (which look more and more like "weaponized" international finance!!). Only China is very immune to international finance. Those funds have tens of thousands of billions of dollars; basically, in a world of money, there is near-zero resistance.
rvz 5 hours ago [-]
Once again, US companies and VCs are in this seed round. Just like Mistral with their seed round.
Europe again missing out, until AMI reaches a much higher valuation with an obvious use case in robotics.
Either AMI reaches a $100B+ valuation (likely) or it becomes a Thinking Machines Lab with investors questioning its valuation (very unlikely, since world models have a use case in vision and robotics).
embedding-shape 5 hours ago [-]
> Europe again missing out
I can't read the article, but American investors investing into European companies, isn't US the one missing out here? Or does "Europe" "win" when European investors invest in US companies? How does that work in your head?
joe_mamba 2 hours ago [-]
>isn't US the one missing out here?
Why would the US miss out here? The US invests in something = the US owns part of something.
This isn't a zero sum game.
embedding-shape 49 minutes ago [-]
> Why would the US miss out here?
Personally I don't believe anyone is missing out on anything here.
But rvz earlier claimed that Europe is missing out, because US investors are investing in a European company. That's kind of surprising to me, so asking if they also believe that the US is "missing out" whenever European investors invest in US companies, or if that sentiment only goes one way.
thibaut_barrere 5 hours ago [-]
It is well enough to attract worthy talent & produce interesting outcomes.
myth_drannon 3 hours ago [-]
This could have been 1000 seed rounds. We are creating technological deserts by going all-in on AI and star personalities.
net01 3 hours ago [-]
Because for these investors, the opportunity cost of this is higher than that of other startups.
I agree with you; there should be more diversity in investments in EU startups, but ¯\_(ツ)_/¯ not my money.
abmmgb 5 hours ago [-]
Not based on true valuation, unless h-index has become a valuation metric lol
Academics don’t always make great entrepreneurs
general1465 5 hours ago [-]
Here you can see why it is so hard to compete as a European startup with US startups: abysmal access to money. An investment of 1B USD in Europe is glorified as the largest seed ever, but in the USA it is just another Tuesday.
weego 5 hours ago [-]
A billion seed is not an every day event anywhere.
mattmaroon 5 hours ago [-]
Not at all. A quick google turns up evidence of 4. There may be more but I think probably not many.
s08148692 5 hours ago [-]
For a foundation AI lab with a world famous AI researcher at the helm though, it's not so impressive. Won't even touch the sides of the hardware costs they'd need to be anywhere near competitive
compounding_it 5 hours ago [-]
Europeans have free healthcare and retirement. They consider putting their money toward long-term benefits, not just becoming CEO on Tuesday and declaring bankruptcy on Wednesday.
general1465 5 hours ago [-]
It is not free, we just pay taxes.
ExpertAdvisor01 5 hours ago [-]
Retirement is the worst.
You are basically forced to pay into an unsustainable system (at least in Germany).
It already has to be subsidized by taxes.
joe_mamba 2 hours ago [-]
Exactly. State retirement in Europe is neither free nor great. We pay extra in taxes for it, and it's only great for present-day retirees, not for those paying into the system right now who will retire in the future. It's the same as US Social Security; it's not some extra perk that Europeans have over Americans.
Top tier scientists aren't gonna be swayed by European state retirement systems.
ExpertAdvisor01 5 hours ago [-]
Free healthcare and retirement?
ExpertAdvisor01 5 hours ago [-]
It is a universal system but definitely not free.
In Germany you pay on average 17.5% of your salary for health insurance and 18.6% for retirement.
However, contribution caps exist: 70k for healthcare and 100k for retirement.
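Illustrating the cap arithmetic with the numbers above (ignoring that real German contributions are split between employer and employee):

    def contribution(salary, rate, cap):
        # A flat percentage of salary, but only up to the cap.
        return rate * min(salary, cap)

    contribution(150_000, 0.175, 70_000)   # healthcare: 12,250/year, flat above 70k
    contribution(150_000, 0.186, 100_000)  # retirement: 18,600/year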
MrBuddyCasino 5 hours ago [-]
„free“
oceansky 5 hours ago [-]
A startup reaching a $1B valuation is so rare that such companies are called unicorns.
As the other commenter pointed out, this is a $1B seed.
ArnoVW 5 hours ago [-]
Actually, they raised $1.03 billion at a $3.5 billion valuation.
dude250711 2 hours ago [-]
Yes, the faster they get used to the thought that losing a billion is not a big deal, the better.
mentalgear 5 hours ago [-]
Adds up: we are seeing a clear exodus of both capital and talent from the US, given the current US administration's shift toward cronyism, and the EU stands as the most compelling alternative, with a uniform market of 500 million people and the last major federation truly committed to the rule of law.
drstewart 5 hours ago [-]
"Exodus of capital" as if OpenAI didn't just raise 115b
gmerc 2 hours ago [-]
That's a bonfire of capital poured into a gaping hole in the ground, with zero chance outside of "military pork" and "overcharging the taxpayer" of ever making their money back.
The brain capital loss here is what's going to spook investors.
whiplash451 3 hours ago [-]
You lost me at “uniform”…
Rendered at 16:23:16 GMT+0000 (Coordinated Universal Time) with Vercel.
If you're looking to learn about JEPA, LeCun's vision document "A Path Towards Autonomous Machine Intelligence" is long but sketches out a very comprehensive vision of AI research: https://openreview.net/pdf?id=BZ5a1r-kVsf
Training JEPA models within reach, even for startups. For example, we're a 3-person startup who trained a health timeseries JEPA. There are JEPA models for computer vision and (even) for LLMs.
You don't need a $1B seed round to do interesting things here. We need more interesting, orthogonal ideas in AI. So I think it's good we're going to have a heavyweight lab in Europe alongside the US and China.
BTW, I went to your website looking for this, but didn't find your blog. I do now see that it's linked in the footer, but I was looking for it in the hamburger menu.
If you think that LLMs are sufficient and RSI is imminent (<1 year), this is horrible for Europe. It is a distracting boondoggle exactly at the wrong time.
Sure LLMs are getting better and better, and at least for me more and more useful, and more and more correct. Arguably better than humans at many tasks yet terribly lacking behind in some others.
Coding wise, one of the things it does “best”, it still has many issues: For me still some of the biggest issues are still lack of initiative and lack of reliable memory. When I do use it to write code the first manifests for me by often sticking to a suboptimal yet overly complex approach quite often. And lack of memory in that I have to keep reminding it of edge cases (else it often breaks functionality), or to stop reinventing the wheel instead of using functions/classes already implemented in the project.
All that can be mitigated by careful prompting, but no matter the claim about information recall accuracy I still find that even with that information in the prompt it is quite unreliable.
And more generally the simple fact that when you talk to one the only way to “store” these memories is externally (ie not by updating the weights), is kinda like dealing with someone that can’t retain memories and has to keep writing things down to even get a small chance to cope. I get that updating the weights is possible in theory but just not practical, still.
What's still missing is the general reasoning ability to plan what to build or how to attack novel problems - how to assess the consequences of deciding to build something a given way, and I doubt that auto-regressively trained LLMs is the way to get there, but there is a huge swathe of apps that are so boilerplate in nature that this isn't the limitation.
I think that LeCun is on the right track to AGI with JEPA - hardly a unique insight, but significant to now have a well funded lab pursuing this approach. Whether they are successful, or timely, will depend if this startup executes as a blue skies research lab, or in more of an urgent engineering mode. I think at this point most of the things needed for AGI are more engineering challenges rather than what I'd consider as research problems.
Wait, we have another acronym to track. Is this the same/different than AGI and/or ASI?
Of course, each relevant newspaper on those areas highlight that it's coming to their place, but it really seems to be distributed.
Might be to be close to some of Yann's collaborators like Xavier Bresson at NUS
Almost certainly the IP will be held in Singapore for tax reasons.
Europe in general has been tightening up their rules / taxes / laws around startups / companies especially tech and remote.
It's been less friendly. these days.
As such, They are more likely to talk about singapore news and exaggerate the claims.
Singapore isn't the Key location. From what I am seeing online, France is the major location.
Singapore is just one of the more satellite like offices. They have many offices around the world it seems.
[0]: https://www.sgpbusiness.com/company/Sph-Media-Limited
What’s different about investing in this than investing in say a young researcher’s startup, or Ilya’s superintelligence? In both those cases, if a model architecture isn’t working out, I believe they will pivot. In YL’s case, I’m not sure that is true.
In that light, this bet is a bet on YL’s current view of the world. If his view is accurate, this is very good for Europe. If inaccurate, then this is sort of a nothing-burger; company will likely exit for roughly the investment amount - that money would not have gone to smaller European startups anyway - it’s a wash.
FWIW, I don’t think the original complaint about auto-regression “errors exist, errors always multiply under sequential token choice, ergo errors are endemic and this architecture sucks” is intellectually that compelling. Here: “world model errors exist, world model errors will always multiply under sequential token choice, ergo world model errors are endemic and this architecture sucks.” See what I did there?
On the other hand, we have a lot of unused training tokens in videos, I’d like very much to talk to a model with excellent ‘world’ knowledge and frontier textual capabilities, and I hope this goes well. Either way, as you say, Europe needs a frontier model company and this could be it.
> He is the Jacob T. Schwartz Professor of Computer Science at the Courant Institute of Mathematical Sciences at New York University. He served as Chief AI Scientist at Meta Platforms before leaving to work on his own startup company.
That entire sentence before the remarks about him service at Meta could have been axed, its weird to me when people compare themselves to someone else who is well known. It's the most Kanye West thing you can do. Mind you the more I read about him, the more I discovered he is in fact egotistical. Good luck having a serious engineering team with someone who is egotistical.
This is just the official name of a chair at NYU. I'm not even sure Jacob T. Schwartz is more well known than Yann LeCun
Either you have not read enough Wikipedia pages, or you have too much to complain about. (Or both.)
There are a lot more degrees of freedom in world models.
LLMs are fundamentally capped because they only learn from static text -- human communications about the world -- rather than from the world itself, which is why they can remix existing ideas but find it all but impossible to produce genuinely novel discoveries or inventions. A well-funded and well-run startup building physical world models (grounded in spatiotemporal understanding, not just language patterns) would be attacking what I see as the actual bottleneck to AGI. Even if they succeed only partially, they may unlock the kind of generalization and creative spark that current LLMs structurally can't reach.
What current LLMs lack is inner motivation to create something on their own without being prompted. To think in their free time (whatever that means for batch, on demand processing), to reflect and learn, eventually to self modify.
I have a simple brain, limited knowledge, limited attention span, limited context memory. Yet I create stuff based what I see, read online. Nothing special, sometimes more based on someone else's project, sometimes on my own ideas which I have no doubt aren't that unique among 8 billions of other people. Yet consulting with AI provides me with more ideas applicable to my current vision of what I want to achieve. Sure it's mostly based on generally known (not always known to me) good practices. But my thoughts are the same way, only more limited by what I have slowly learned so far in my life.
The problem is, idk if we're ready to have millions of distinct, evolving, self-executing models running wild without guardrails. It seems like a contradiction: you can't achieve true cognition from a machine while artificially restricting its boundaries, and you can't lift the boundaries without impacting safety.
Even with continuous backpropagation and "learning" that enriches the training data, so-called online learning, the limitations will not disappear. LLMs will not be able to conclude things about the world from fact and deduction; they only consider what is likely given their training data. They will not foresee or anticipate events that are unlikely or absent in their training data but are bound to happen due to real-world circumstances. They are not intelligent in that way.
Whether humans always apply that much effort to conclude these things is another question. The point is, that humans fundamentally are capable of doing that, while LLMs are structurally not.
The problems are structural/architectural. I think it will take another 2-3 major leaps in architectures, before these AI models reach human level general intelligence, if they ever reach it. So far they can "merely" often "fake it" when things are statistically common in their training data.
That's what I said. Backpropagation cannot be enough; that's not how neurons work in the slightest. When you put biological neurons in a Pong environment they learn to play not through some kind of loss or reward function; they self-organize to avoid unpredictable stimulation. As far as I know, no architecture learns in such an unsupervised way.
https://www.sciencedirect.com/science/article/pii/S089662732...
This sounds to me very similar to what neurons do (avoid unpredictable stimulation):
f(x)=y' => loss(y',y) => how good was my prediction? Train f through backprop with that error.
A model trained with reinforcement learning is more similar to this, where m(y) is the resulting world state of taking an action y the model predicted:
f(x)=y' => m(y')=z => reward(z) => how good was the state I was in based on my actions? Train f with an algorithm like REINFORCE with the reward, as the world m is a non-differentiable black-box.
A group of neurons, by contrast, is more like predicting the resulting world state of taking my action, g(x,y), and trying to learn by tuning both g and the action taken, f(x):
f(x)=y' => m(y')=z => g(x,y)=z' => loss(z,z') => how predictable were the results of my actions? Train g normally with backprop, and train f with an algorithm like REINFORCE, with negative surprise as the reward.
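Concretely, here is a minimal PyTorch sketch of that last scheme. Everything in it is a made-up stand-in for illustration: m is a toy non-differentiable "world", and f and g are small MLPs following the notation above.

    import torch
    import torch.nn as nn

    f = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))  # action model: f(x) = y'
    g = nn.Sequential(nn.Linear(5, 32), nn.ReLU(), nn.Linear(32, 4))  # predictor: g(x, y) = z'
    opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
    opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)

    def m(x, y):
        # Toy stand-in for the world: non-differentiable, observation only.
        with torch.no_grad():
            return x + (y.float().unsqueeze(-1) * 2 - 1) * 0.1

    for step in range(1000):
        x = torch.randn(64, 4)                        # batch of current states
        dist = torch.distributions.Categorical(logits=f(x))
        y = dist.sample()                             # sampled action, not differentiable
        z = m(x, y)                                   # actual next state from the world
        z_pred = g(torch.cat([x, y.float().unsqueeze(-1)], dim=-1))
        surprise = ((z_pred - z) ** 2).mean(dim=-1)   # per-sample prediction error

        opt_g.zero_grad()
        surprise.mean().backward()                    # train g normally with backprop
        opt_g.step()

        reward = -surprise.detach()                   # negative surprise as the reward
        opt_f.zero_grad()
        (-(dist.log_prob(y) * reward).mean()).backward()  # REINFORCE update for f
        opt_f.step()

Flipping the sign (reward = surprise.detach()) gives roughly the curiosity-style objective mentioned below, where f seeks out states its predictor gets wrong instead of avoiding them.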
After talking with GPT5.2 for a little while, it seems like Curiosity-driven Exploration by Self-supervised Prediction [1] might be an architecture similar to the one I described for neurons? But with the twist that f is rewarded for making the prediction error bigger (not smaller!) as a proxy for "curiosity".
[1] https://arxiv.org/pdf/1705.05363
Our training data is a lot more diverse than an LLM's. We also leverage our senses as a carrier for communicating abstract ideas, using audio and visual channels that may or may not be grounded in reality. We have TV shows, video games, programming languages, and all sorts of rich and interesting things we can engage with that do not reflect our fundamental reality.
Like LLMs, we can hallucinate while we sleep or we can delude ourselves with untethered ideas, but UNLIKE LLMs, we can steer our own learning corpus. We can train ourselves with our own untethered “hallucinations” or we can render them in art and share them with others so they can include it in their training corpus.
Our hallucinations are often just erroneous models of the world. When we render it into something that has aesthetic appeal, we might call it art.
If the hallucination helps us understand some aspect of something, we call it a conjecture or hypothesis.
We live in a rich world filled with rich training data. We don’t magically anticipate events not in our training data, but we’re also not void of creativity (“hallucinations”) either.
Most of us are stochastic parrots most of the time. We’ve only gotten this far because there are so many of us and we’ve been on this earth for many generations.
Most of us are dazzled and instinctively driven to mimic the ideas that a small minority of people “hallucinate”.
There is no shame in mimicking or being a stochastic parrot. These are critical features that helped our ancestors survive.
Can you be a bit more specific about those bounds? Maybe via an example?
So my question is: when is there enough training data that you can handle 99.99% of the world?
While I suspect the latter is a real problem (because all mammal brains* are much more example-efficient than any ML), the former is more about productisation than anything fundamental: the models can already be continuously updated, but that makes it hard to deal with regressions. You kinda want an artefact with a version stamp that doesn't change itself before you release the update, especially as this isn't like normal software where specific features can be toggled on or off in isolation from everything else.
* I think. Also, I'm saying "mammal" because of an absence of evidence (to my *totally amateur* skill level) not evidence of absence.
I think this is true to some extent: we like our tools to be predictable. But we've already made one jump by going from deterministic programs to stochastic models. I am sure the moment a self-evolving AI shows up that clears the "useful enough" threshold, we'll make that jump as well.
As for the "just put a vision LLM in a robot body" suggestion: People are trying this (e.g. Physical Intelligence) and it looks like it's extraordinarily hard! The results so far suggest that bolting perception and embodiment onto a language-model core doesn't produce any kind of causal understanding. The architecture behind the integration of sensory streams, persistent object representations, and modeling time and causality is critically important... and that's where world models come in.
I like how people are accepting this dubious assertion that Einstein would be "useful" if you surgically removed his hippocampus, and are engaging with it.
It also calls this Einstein an AGI rather than a disabled human???
"Reading, after a certain age, diverts the mind too much from its creative pursuits. Any man who reads too much and uses his own brain too little falls into lazy habits of thinking".
-- Albert Einstein
But one might say that the brain is not lossless ... True, good point. But in what way is it lossy? Can that be simulated well enough to learn an Einstein? What gives events significance is very subjective.
In the last step of training LLMs, reinforcement learning from verifiable rewards (RLVR), LLMs are trained to maximize the probability of solving problems using their own output, driven by a reward signal akin to winning in Go. It's not just imitating human-written text.
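For what it's worth, a minimal sketch of the "verifiable reward" part, assuming a toy arithmetic task whose final answer can be checked exactly. generate is a hypothetical stand-in for sampling from the model, and the actual policy-gradient update is omitted:

    def generate(prompt: str) -> str:
        # Hypothetical stand-in for sampling a chain-of-thought from the model.
        return "Let's see: 6 * 7 = 42. Answer: 42"

    def verifiable_reward(output: str, ground_truth: str) -> float:
        # Binary, mechanically checkable signal, akin to win/lose in Go:
        # reward 1.0 only if the extracted final answer matches exactly.
        answer = output.rsplit("Answer:", 1)[-1].strip()
        return 1.0 if answer == ground_truth.strip() else 0.0

    samples = [generate("What is 6 * 7? End with 'Answer: <n>'.") for _ in range(8)]
    rewards = [verifiable_reward(s, "42") for s in samples]  # drives the RL update

In practice the verifier is task-specific (unit tests for code, exact-match or symbolic checks for math), but the binary, checkable shape of the signal is the common thread.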
Fwiw, I agree that world models and some kind of learning from interacting with physical reality, rather than massive amounts of digitized gym environments is likely necessary for a breakthrough for AGI.
I don't think it makes sense conceptually unless you're literally referring to discovering new physical things like elements or something.
Humans are remixers of ideas. That's all we do all the time. Our thoughts and actions are dictated by our environment and memories; everything must necessarily be built up from pre-existing parts.
Einstein’s theory of relativity springs to mind, which is deeply counter-intuitive and relies on the interaction of forces unknowable to our basic Newtonian senses.
There’s an argument that it’s all turtles (someone told him about universes, he read about gravity, etc), but there are novel maths and novel types of math that arise around and for such theories which would indicate an objective positive expansion of understanding and concept volume.
You can't get Suno to do anything that's not in its training data. It is physically incapable of inventing a new musical genre. No matter how detailed the instructions you give it, and even if you cheat and provide it with actual MP3 examples of what you want it to create, it is impossible.
The same goes for LLMs and invention generally, which is why they've made no important scientific discoveries.
You can learn a lot by playing with Suno.
Everything is bits to a computer, but text training data captures the flattened, after-the-fact residue of baseline human thought: Someone's written description of how something works. (At best!)
A world model would need to capture the underlying causal, spatial, and temporal structure of reality itself -- the thing itself, that which generates those descriptions.
You can tokenize an image just as easily as a sentence, sure, but a pile of images and text won't give you a relation between the system and the world. A world model, in theory, can. I mean, we ought to be sufficient proof of this, in a sense...
So when we think about capturing any underlying structure of reality itself, we are constrained by the tools at hand.
The capability of the tool shapes the description, and the description sets the level of understanding.
Using the term autoregressive models instead might help.
> One major critique LeCun raises is that LLMs operate only in the realm of language, which is a simple, discrete space compared to the continuous, complex physical world we live in. LLMs can solve math problems or answer trivia because such tasks reduce to pattern completion on text, but they lack any meaningful grounding in physical reality. LeCun points out a striking paradox: we now have language models that can pass the bar exam, solve equations, and compute integrals, yet “where is our domestic robot? Where is a robot that’s as good as a cat in the physical world?” Even a house cat effortlessly navigates the 3D world and manipulates objects — abilities that current AI notably lacks. As LeCun observes, “We don’t think the tasks that a cat can accomplish are smart, but in fact, they are.”
The biggest thing that's missing is actual feedback on their decisions. They have no idea of that, because transformers and embeddings don't model it yet. And language descriptions and image representations of feedback aren't enough; they are too disjointed. It needs more.
World models and vision seem like a great use case for robotics, which I can imagine being the main driver of AMI.
He has hired LeBrun to take the helm as CEO.
AMI has also hired LeFunde as CFO and LeTune as head of post-training.
They’re also considering hiring LeMune as Head of Growth and LePrune to lead inference efficiency.
https://techcrunch.com/2025/12/19/yann-lecun-confirms-his-ne...
I have no chance in the AI industry...
There is absolutely no doubt about Yann's impact on AI/ML, but he had access to many more resources at Meta, and we didn't see anything.
It could be a management issue, though, and I sincerely hope we will see more competition, but from what I quoted above, it does not seem like it.
Understanding the world through videos (mentioned in the article) is just what video models have already done, and they are getting pretty good (see Seedance, Kling, Sora, etc.). So I'm not quite sure how what he proposes would work.
But passion and freedom to explore are often more important than resources.
That's true for 99% of scientists, but dismissing their opinion based on them not having done world-shattering, groundbreaking research is probably not the way to go.
> I sincerely hope we will see more competition
I really hope we don't; science isn't markets.
> Understanding the world through videos
The word "understanding" is doing a lot of heavy lifting here. I find myself prompting again and again for corrections to an image or a summary, and "it" still does not "understand" and keeps doing the same thing over and over again.
Source: himself https://x.com/ylecun/status/1993840625142436160 (“I never worked on any Llama.”) and a million previous reports and tweets from him.
Quite a big contribution in practice.
Is this a troll? Even if we just ignore Llama, Meta invented and released so much foundational research and open-source code. I would say the computer vision field would be years behind if Meta hadn't published core research like DETR or MAE.
So I keep wondering: if his idea is really that good — and I genuinely hope it is — why hasn’t it led to anything truly groundbreaking yet? It can’t just be a matter of needing more data or more researchers. You tell me :-D
LeCun introduced backprop for deep learning back in 1989.
Hinton published contrastive divergence in 2002.
AlexNet was 2012.
word2vec was 2013.
seq2seq was 2014.
"Attention Is All You Need" was 2017.
UnicornAI (GPT-2's unicorn demo) was 2019.
InstructGPT was 2022.
This makes a lot of people think that things are just accelerating and they can be along for the ride. But it's the years and years of foundational research that allow this to be done. That toll has to be paid before the successors of LLMs can reason properly and operate in the world the way humans do. The sowing won't happen as fast as the reaping did. LeCun wants to plant those seeds; the others, who only want to eat the fruit, don't get that they have to wait.
If he still hasn’t produced anything truly meaningful after all these years at Meta, when is that supposed to happen? Yann LeCun has been at Facebook/Meta since December 2013.
Your chronological sequence is interesting, but it refers to a time when the number of researchers and the amount of compute available were a tiny fraction of what they are today.
1) the world has become a bit too focused on LLMs (although I agree that the benefits & new horizons that LLMs bring are real). We need research on other types of models to continue.
2) I almost wrote "Europe needs some aces". Although I'm European, my attitude is not at all one of competition. This is not a card game. What Europe DOES need is an ATTRACTIVE WORKPLACE, so that talent that is useful for AI can also find a place to work here, not only overseas!
Looks like you appended the original URL to the end
Recently all papers are about LLMs; it brings on fatigue.
As GPT is almost reaching its limits, a new architecture could bring new discoveries.
That article is from June 2025 so may be out of date, and the definition of "seed round" is a bit fuzzy.
The giant seed round proves investors were willing to fund Mira Murati, not that the company had built anything durable.
Within months, it had already lost cofounder Andrew Tulloch to Meta, then cofounders Barret Zoph and Luke Metz plus researcher Sam Schoenholz to OpenAI; WIRED also reported that at least three other researchers left. At that point, citing it as evidence of real competitive momentum feels weak.
Europe again missing out, until AMI reaches a much higher valuation with an obvious use case in robotics.
Either AMI reaches a $100B+ valuation (likely), or it becomes another Thinking Machines Lab with investors questioning its valuation (very unlikely, since world models have a use case in vision and robotics).
I can't read the article, but with American investors investing in a European company, isn't the US the one missing out here? Or does "Europe" "win" when European investors invest in US companies? How does that work in your head?
Why would the US miss out here? The US invests in something = the US owns part of something.
This isn't a zero sum game.
Personally I don't believe anyone is missing out on anything here.
But rvz earlier claimed that Europe is missing out, because US investors are investing in a European company. That's kind of surprising to me, so asking if they also believe that the US is "missing out" whenever European investors invest in US companies, or if that sentiment only goes one way.
I agree with you; there should be more diversity in investments in EU startups, but ¯\_(ツ)_/¯ not my money.
Academics don't always make great entrepreneurs.
Top tier scientists aren't gonna be swayed by European state retirement systems.
As the other commenter pointed out, this is a $1B seed.