Yann LeCun being interviewed by John Werner at Imagination in Action in Davos, Switzerland. (Photo credit: Patrick Tighe)
As January comes to an end, many of us who attended the annual summit at Davos are pondering next steps, considering the context of AI today, and still trying to parse the interactions between us humans and the ever-evolving AI agents that will accommodate us, inspire us, rival us, and generally make us re-evaluate our place in the world. I interviewed Yann LeCun at our annual Imagination in Action event (I put the event together; it’s free to attend, and it’s designed to foster discussion of timely, important topics). The result was an eye-opening series of revelations about how artificial intelligence research is changing, and what it might lead to relatively soon.
Getting Realistic About AGI
First, do we now “have AGI”?
Speaking on the prospect of artificial general intelligence, LeCun suggested the term is a misnomer, because human intelligence, in his view, is not general. He prefers the phrase “human-level intelligence,” and while he acknowledged that we are approaching this type of AI, we’re not likely to see it this year or next.
“We need a few conceptual breakthroughs,” LeCun said, explaining the deficits of today’s LLMs in more detail. The gist of his argument was this: although there are absolutely reasons to hype today’s LLMs as intelligent, we have to remember that humans still have the edge in knowing how to navigate the physical world. LeCun spoke rather pointedly about this, explaining that although LLMs can do a lot of intellectual work, they don’t have the world knowledge to rival humans at many aspects of life. In other words, they’re book-smart, but not street-smart.
LeCun put it this way:
“If you want intelligent behavior, you need a system to be able to anticipate what’s going to happen in the world, and also predict the consequences of its actions. If you can do this, then it can plan a sequence of actions to arrive at a particular objective. And that’s what’s missing. That’s the concept of a world model. You’re not going to get intelligent behavior without that.”
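To make that idea concrete, here is a minimal sketch (my illustration, not LeCun’s code) of objective-driven planning with a world model: the planner imagines the consequences of candidate action sequences and keeps whichever one ends closest to the goal. The world_model and plan functions are hypothetical stand-ins for much richer learned components.

```python
# A toy model-based planner: roll candidate action sequences forward
# through a world model and choose the sequence whose predicted end
# state is nearest the objective. All names here are illustrative.
import itertools

def world_model(state: float, action: float) -> float:
    """Toy dynamics: predict the consequence of taking an action."""
    return state + action

def plan(state: float, goal: float, horizon: int = 3) -> list[float]:
    """Search action sequences; keep the one predicted to land nearest the goal."""
    actions = [-1.0, 0.0, 1.0]
    best_seq, best_cost = None, float("inf")
    for seq in itertools.product(actions, repeat=horizon):
        s = state
        for a in seq:              # anticipate what happens in the world
            s = world_model(s, a)  # predict the consequence of the action
        cost = abs(goal - s)       # how far from the objective we end up
        if cost < best_cost:
            best_seq, best_cost = list(seq), cost
    return best_seq

print(plan(state=0.0, goal=2.0))  # a 3-step plan reaching 2.0, e.g. [0.0, 1.0, 1.0]
```

A real system would replace the toy dynamics with a learned predictive model and the exhaustive search with gradient-based or hierarchical planning, but the loop (predict, evaluate against the objective, choose) is the shape of the argument LeCun is making.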
He pointed to the example of autonomous vehicles, which I thought was a good move.
“We have millions of hours of training data to train autonomous cars, and we still don’t have level five autonomous driving (capability),” he noted. “So this tells you (that) the basic architecture is not there.”
In response to this fundamental lack of real-world knowledge, LeCun suggested a “physical AI revolution” is coming. But challenges remain.
“Unfortunately, the real world is messy,” he said. “Sensory data is high-dimensional, continuous, noisy, and generative architectures do not work with this kind of data. So the type of architecture that we use for LLMs and generative AI does not apply to the real world. The next revolution of AI, which is coming fast, is going to be AI systems that understand the real world. Systems that understand high-dimensional, continuous noisy data like video, like sensor data. Systems that can build predictive models of how their environment is going to evolve, and what their effect on the environment is. Systems that can plan, they can reason at the core level. Systems that are controllable and safe, so that you give them a task, and they accomplish it.”
But What About AI Agents?
LeCun also addressed the boom in “agentic AI,” still contending that we will not reach human-level intelligence by building agents on LLMs, an approach that he called “a disaster.”
“How can a system possibly plan a sequence of actions if it can’t predict the consequences of its actions?” he asked rhetorically. “So if you want intelligent behavior, you need a system to be able to anticipate what’s going to happen in the world, and also predict the consequences of its actions. If you can do this, and it can plan a sequence of actions to arrive at a particular objective … that’s what’s missing.”
Advanced Machine Intelligence
Recently, LeCun made headlines by announcing his own business, Advanced Machine Intelligence, so I asked him what the company wants to do, and how long it may take.
The goal, he explained, is to build systems that can work intelligently from these world models.
“If you have such a world model (and a resulting system), you can plan a sequence of actions to accomplish a task,” he said, citing a vision paper he wrote on this subject, and talks from 2022 that are online. “We have systems now that we can train, completely self-supervised, on unlabeled videos, and those systems understand video, represent it really well, can predict missing parts in a video… they also have acquired a certain sense of common sense.”
For example:
“If you show (these types of models) a video where something impossible happens, they tell you ‘this is impossible,’” he continued. “You throw a ball in the air, and the ball stops, or it disappears; the system says, ‘No, this is completely incompatible with what I’ve observed during my training.’”
You can imagine how impressive this type of thing would be, and how it represents a radical departure from, say, your garden-variety chatbot, which, in comparison, just seems like a digital parrot.
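One plausible way to operationalize that kind of check, sketched here under my own assumptions rather than from any published recipe, is to score each moment of a video by how badly the model’s prediction misses what actually happens next: physically impossible events produce large prediction errors. The surprise function and THRESHOLD below are hypothetical.

```python
# A hedged sketch of "surprise" scoring: flag an observation as
# implausible when it diverges sharply from what the model predicted.
import numpy as np

def surprise(predicted: np.ndarray, observed: np.ndarray) -> float:
    """Prediction error in representation space; higher means more surprising."""
    return float(np.mean((predicted - observed) ** 2))

THRESHOLD = 1.0  # hypothetical cutoff, calibrated on ordinary videos

pred = np.zeros(8)                # model expects the ball to keep falling
obs_normal = np.zeros(8)          # the ball does keep falling
obs_impossible = np.full(8, 3.0)  # the ball vanishes mid-flight

print(surprise(pred, obs_normal) > THRESHOLD)      # False: plausible
print(surprise(pred, obs_impossible) > THRESHOLD)  # True: flagged as impossible
```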
The foundation for this is something LeCun pioneered called JEPA (Joint Embedding Predictive Architecture), and it’s a work in progress.
“We already have prototypes that work, but we want to generalize the methodology so that it applies to any modality, any data, any sensor data,” he said. “So then we can build, from data, phenomenological models of complex systems … an industrial process of any kind, manufacturing process, chemical plant, a turbo jet engine, a whole airplane, perhaps, you know, chemical reactions, a living cell. Everything in the world is complicated, because it’s an emerging collective phenomenon of really complex systems, and we can only build (limited) models of those things.”
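For the technically curious, here is a heavily simplified sketch of the joint-embedding idea, assuming a generic PyTorch setup rather than Meta’s actual implementation: instead of generating pixels, a predictor tries to match the embedding of a masked target produced by a separate target encoder, so the loss lives in representation space, where noisy or unpredictable detail can be abstracted away.

```python
# A minimal JEPA-style training step (illustrative, not Meta's code):
# predict the embedding of a hidden target from the visible context.
import torch
import torch.nn as nn

dim = 64
context_encoder = nn.Sequential(nn.Linear(128, dim), nn.ReLU(), nn.Linear(dim, dim))
target_encoder  = nn.Sequential(nn.Linear(128, dim), nn.ReLU(), nn.Linear(dim, dim))
predictor       = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

context = torch.randn(32, 128)  # visible patches of a video clip (stand-in data)
target  = torch.randn(32, 128)  # masked patches the model must account for

with torch.no_grad():           # no gradients flow through the target branch
    target_emb = target_encoder(target)

pred_emb = predictor(context_encoder(context))
loss = nn.functional.mse_loss(pred_emb, target_emb)  # loss in embedding space
loss.backward()
```

In practice the target encoder is typically an exponential moving average of the context encoder to keep the representations from collapsing; the no_grad block above is a stand-in for that detail.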
Digital Twinning and the Laplace Demon
I had already been thinking that the above approach sounds a lot like extraordinarily complex digital twinning, but LeCun suggested there’s a level of abstraction that we have to factor in. The next part of the interview became fairly profound, as he compared the idea of simulating everything in an extremely complex system to coming at things with more of a diagnostic and targeted view.
“The way we can understand what’s taking place right now in this room is through psychology, maybe a little bit of science, you know, things like that,” LeCun explained. “Not at the level of quantum field theory, or particle physics, or atomic physics, or molecules, or proteins or … cells or organisms.”
Is Alignment the Right Frame?
Another thing I asked LeCun about was AI alignment, the frantic effort by companies and people to direct AI in appropriate ways.
The bottom line, he suggested, is that tomorrow’s systems will be different, and our impression that we’ll be working with dressed-up LLMs as human-like entities is misguided. LeCun noted:
“If you imagine that future AI systems that have humanlike intelligence will be LLMs, which of course is not going to happen, you say, ‘Oh my god, that’s going to be dangerous.’”
If, on the other hand, you think of these future systems as objective-driven world-responders that are smarter in specific ways, you can see that the problem will largely be solved.
The Digital Commons
Another point that LeCun spoke about at length is the need for open systems and open research.
In building “predictive architecture” and developing context capabilities for AI agents, he suggested, we need to apply the open source philosophy, a “consortium” approach, not a set of walled gardens. LeCun noted how the best open source models are often Chinese, and how stakeholders use these open models to innovate.
He was also pretty assertive about taking a “bottom up, not top down” approach to AI, one that cuts against the grain of the industrial views of the twentieth century (and certainly of the feudal centuries before it). LeCun argues we need a new framework for the new millennium, one where knowledge is not siloed, and humanity works together for the common good.
Some Challenges, and Solutions
Later, LeCun went over some of the potential pitfalls of this important change, like the concentration of power, and the potential for human misuse of AI.
“The most important risk of AI is that in the near future, when our entire digital diet will be mediated by AI systems, if those AI systems come from a handful of proprietary companies on the west coast of the U.S. or in China, we’re in big trouble for the health of democracy, cultural diversity, linguistic diversity, value systems,” he said. “So we need a highly diverse population of AI assistants, for the same reason we need diversity in the press, and that can only happen with open source.”
The open communities that LeCun is calling for are already in play in academia: check out this example from Drexel, where planners are trying to build this type of open forum.
“The Digital Commons is designed to foster an open, supportive, and collaborative environment where faculty and professional staff at Drexel can explore and share the creative ways they are using AI,” spokespersons write.
We have this kind of mentality at MIT, too, where the Media Lab and other offices are building the connective tissue that lets AI work benefit from broad collaboration.
You can watch the rest of LeCun’s interview here, or navigate over to YouTube. The central idea, open source technology and open research, is driving a generation of innovators who understand that openness is the key to egalitarian outcomes in the twenty-first century. Let’s keep that in mind as we move forward, and stay open to ideas about objective-driven models that can understand the world around them.
