AGI is a silly term — Hinton × Sejnowski at the UN Digital World Conference 2026

2026 Digital World Conference (UN Geneva / WDTA × UNRISD) · April 29, 2026

Geoffrey Hinton · 14:20 The term AGI treats intelligence as if it were one-dimensional. But intelligence is clearly multidimensional. So the idea that "one day it will match humans" is crazy. Compared with people it will be jagged — far better than us along some axes, still worse along others.

2026 Digital World Conference: AI for Social Development (UN Office at Geneva, held April 2026). Co-hosted by WDTA and UNRISD. Hinton and Sejnowski both joined via Zoom (Hinton from his home office at the University of Toronto; Sejnowski from the Salk Institute). Moderated by Li Deng. Around 1 hour 6 minutes.

The main session of the Digital World Conference "AI for Social Development," held at the UN Office at Geneva in April 2026. Two founders of deep learning, Geoffrey Hinton (2024 Nobel laureate in Physics) and Terrence Sejnowski (computational neuroscientist at the Salk Institute, WDTA Scientific Breakthrough Award recipient), look back on 40 years of collaboration (starting with the 1985 Boltzmann machine) and spend an hour discussing the questions the industry now faces: AGI, employment, regulation, international governance.

The video runs about 66 minutes: the moderated conversation, followed by audience Q&A from around the 44-minute mark. Li Deng (formerly Microsoft Research, the person who invited Hinton to Microsoft in 2010) connects research history with the present through a series of questions. The content spans many layers: (1) the moment the Boltzmann machine was discovered, (2) collaboration at Microsoft on the eve of the deep learning revolution, (3) a critique of the term AGI and the "jagged intelligence" concept, (4) Hinton's self-assessment of the wake-sleep algorithm and Capsule Networks, (5) the impact on employment (elastic vs. inelastic markets), (6) Hinton's decisive analogy of "progress = the accelerator, regulation = the steering wheel," (7) a three-tier classification of risks, (8) the spillover to the Global South (the tobacco/asbestos precedent), (9) the potential of UN data.

The most important finding from the MEMEX editorial perspective is that Hinton independently describes the same picture as Karpathy's "jagged intelligence" concept. Karpathy said at AI Ascent 2026 (May 2026) that "AI capability becomes an uneven zigzag: not the training trajectory of an animal, but appearing like a ghost, instantaneously"; Hinton says "intelligence is highly multidimensional, jagged relative to people" (14:20). Two of the industry's top figures, within weeks of each other (April–May 2026), in different venues (UN conference vs. VC event) and with different vocabularies, are looking at the same landscape.

Toward the end of the video, Hinton's analogy that "regulation is the steering wheel, not the brake" connects directly to a thread my brother and I have been on (the brother × Mythos discussion of "AI's response to pseudo-problems") — a strong reference for how to use a structural metaphor to flip engineer anxiety and regulatory resistance.

Key observations

"AGI is a silly term" — multidimensional intelligence and jagged capability (14:20 – 16:00)

Hinton's sharpest framing. "The term AGI treats intelligence as if it were one-dimensional, as if it were just increasing. But it's clear intelligence is highly multidimensional. So the idea that 'someday it will match humans' is crazy. It will be jagged — on some tasks much better than us, on others worse. That's why it's a silly term" (14:20 – 14:30).

He offers an alternative term, "superintelligence," defined as follows: "meaning that a superintelligence is better than us at almost every intellectual task we do. That's a term you can define reasonably" (14:46 – 14:58).

The choice of examples is striking: "Large chatbots are far better than a person with general knowledge. You can ask one about the tax filing day in Slovenia, or how to damp-proof the porch of your house. They tend to know the answers to those questions" (15:15 – 15:35). The damp-proofing example is a distinctly British touch. And the self-deprecating close: "On some kinds of reasoning they're still worse than humans, though they're far better at math than I am" (15:45 – 15:55).

Sejnowski's response also matters: "I agree with Geoffrey completely. But I want to raise a warning flag: if experts can't agree on what AGI means, it has the same problem as another word. Take consciousness. Hundreds of books have been written, and it's still not fully understood. Do animals have consciousness? AGI falls into the same trap" (16:00 – 16:50). The point connects directly to Amanda Askell's treatment of the consciousness problem in alignment research.

"We are the scribes" — the same landscape as Karpathy and Boris, independently (14:20)

The connection to other discussions in the vibe-coding lineage covered on MEMEX is explicit. Hinton's "intelligence is highly multidimensional, jagged relative to people" is the same concept as Karpathy's AI Ascent 2026 framing: "AI isn't the training trajectory of animals; it's like ghosts, uneven and zigzag."

Place these alongside Boris Cherny's printing-press metaphor (Pragmatic Engineer, March 2026) and you can see the industry's top three figures speaking to the same landscape with different vocabularies at the same moment:

Karpathy (May 2026, Sequoia AI Ascent): "jagged intelligence," "ghosts, not animals"
Hinton (April 2026, UN DWC): "intelligence is multidimensional, jagged relative to people"
Boris Cherny (March 2026, Pragmatic Engineer): "the scribes disappeared, but writers and authors were born" — the printing-press metaphor

Three observations reached independently. "AI will not evolve along the same trajectory as humans — it develops partially, unevenly, without precedent." All three converge in rejecting the idea of "a moment when it equals us = AGI achieved." Hinton's scientist's perspective (multidimensionality), Karpathy's researcher's perspective (capability profiles), Boris's inside-the-organization perspective (the redefinition of roles) — three different angles, one phenomenon.

The moment the Boltzmann machine was born — a meeting room in Rochester, 1985 (06:00 – 12:00)

Sejnowski's testimony is invaluable: "I can tell you the exact moment. We were at a conference in Rochester. Back then, a conference was 20 people in a room. My former supervisor John Hopfield was giving a talk about his Hopfield network. Jeff and I were already collaborating on visual models, building constraint-satisfaction models. And it suddenly clicked — with his psychology and computer science background, and my neuroscience and physics background, we realized you could make the Hopfield network stochastic, heat it up — the Boltzmann machine. That really was the kernel of the breakthrough" (06:00 – 08:30).

Hinton adds a coincidence: "Just before, I'd been in San Diego working with David Rumelhart, and Paul Smolensky and I had actually already programmed backpropagation. Because you need continuous derivatives, we used logistic units. And it turned out that if you heat up a Hopfield net, give it noise and temperature, what comes out is the logistic unit. That's what made me realize Terry's 'just heat up the Hopfield net' idea connected to something I was already familiar with" (09:00 – 10:30).

Hinton continues: "Terry had just been reading Kirkpatrick's paper on simulated annealing. Terry quickly realized you could combine Hopfield nets with simulated annealing. That led to the Boltzmann machine idea" (10:45 – 11:10). Sejnowski then recalls the decisive moment: "One day Jeff called me. I was at Hopkins; he was at Carnegie Mellon. 'Terry, I derived some equations for the learning algorithm for the Boltzmann machine.' That was the real breakthrough. It turned out you could handle hidden units" (11:30 – 12:00).
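
Hinton's remark that heating up a Hopfield net yields the logistic unit can be checked in a few lines. The sketch below is illustrative only and is not code from the talk. It assumes a single binary unit whose "on" and "off" states differ in energy by `energy_gap`, applies the Boltzmann distribution at a given `temperature`, and compares the result with the deterministic Hopfield threshold rule. The function names and example values are hypothetical.

```python
import math


def p_on(energy_gap: float, temperature: float) -> float:
    """Probability that the unit is on under a Boltzmann distribution.

    With two states whose energies differ by energy_gap, the odds are
    p(on) / p(off) = exp(energy_gap / temperature), which rearranges
    to the logistic function below.
    """
    return 1.0 / (1.0 + math.exp(-energy_gap / temperature))


def hopfield_update(energy_gap: float) -> int:
    """Deterministic Hopfield rule: turn on exactly when the gap is positive."""
    return 1 if energy_gap > 0 else 0


gap = 0.5  # hypothetical energy gap favouring the "on" state
for temperature in (10.0, 1.0, 0.1, 0.01):
    print(f"T={temperature:<5} p(on)={p_on(gap, temperature):.4f}")
print("zero-temperature rule:", hopfield_update(gap))
```

At high temperature the unit behaves almost like a coin flip; as the temperature falls, the probability approaches the hard threshold. Lowering the temperature gradually is the simulated-annealing idea mentioned above.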

Microsoft 2010 — the origin of pre-training and Hinton meeting Li Deng (12:30 – 18:00)

The other center of the video. Moderator Li Deng was the person who invited Hinton to Microsoft Research in 2010 and led the joint research that applied the Boltzmann machine to pre-training for speech recognition. After Sejnowski's remarks, Li Deng recalls: "15 or 16 years ago, Jeff, you came to Microsoft to work with me on the Boltzmann machine" (12:30). Hinton's response: "In your office, I was changing the code on your keyboard. And you got so excited you started typing on the same keyboard at the same time" (12:50).

What was happening technically. Hinton: "With Restricted Boltzmann Machines (RBM), I was modifying it to handle real-valued input. We had binary input and real-valued hidden units, but you wanted real-valued input and binary hidden units" (14:00 – 14:30). Li Deng: "Right — so we modified the Boltzmann machine to accept Gaussian input units. We bought three GPUs at the time and ran it on your recommendation about 15 years ago" (15:00 – 15:30).
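
The modification described here, real-valued inputs feeding binary hidden units, is commonly called a Gaussian-Bernoulli RBM. The following is a minimal sketch of one contrastive-divergence (CD-1) update under stated assumptions, not the code written at Microsoft: inputs are assumed to be normalized to unit variance, and the weights `W`, biases `b` and `c`, the learning rate, and the toy input `frame` are all hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    """p(h_j = 1 | v): binary hidden units driven by real-valued inputs."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def sample_hidden(probs, rng):
    return [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """v_i | h ~ Normal(b_i + sum_j W[i][j] * h_j, 1): Gaussian inputs, unit variance."""
    return [rng.gauss(b[i] + sum(W[i][j] * h[j] for j in range(len(h))), 1.0)
            for i in range(len(b))]


def cd1_step(v0, W, b, c, rng, lr=0.01):
    """Apply one CD-1 update in place and return the reconstruction of v0."""
    ph0 = hidden_probs(v0, W, c)        # positive phase, driven by data
    h0 = sample_hidden(ph0, rng)
    v1 = sample_visible(h0, W, b, rng)  # one-step reconstruction
    ph1 = hidden_probs(v1, W, c)        # negative phase, driven by reconstruction
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    return v1


rng = random.Random(0)
n_visible, n_hidden = 4, 3
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
b = [0.0] * n_visible
c = [0.0] * n_hidden
frame = [0.5, -1.0, 0.2, 1.5]  # hypothetical normalized input features
reconstruction = cd1_step(frame, W, b, c, rng)
print(hidden_probs(frame, W, c))
```

Only the visible-unit sampling differs from a binary RBM: the reconstruction is drawn from a Gaussian rather than a Bernoulli distribution, which is what lets the model accept real-valued features.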

A key historical observation: "the concept of pre-training was seeded here" (Li Deng, 17:30). "When we talk about large language models today, there's a different kind of pre-training. But the concept of pre-training was really seeded in the work you brought to Microsoft 15 years ago" (17:50 – 18:00). Modern GPT- and Claude-style systems follow a "pre-training + fine-tuning" structure, and on Li Deng's account one origin of that idea is the application of the Boltzmann machine to speech recognition.

Hinton's sharp reflection: "Looking back, when I was using stacked restricted Boltzmann machines to initialize the network, what I was really achieving was just a reasonable initial set of weights. There are many other ways to do that. But the RBM came with nice math around variational bounds, which made it look respectable. It's similar to support vector machines, which are accompanied by nice math that makes them look respectable. But that's not the essence" (16:00 – 17:00). A universal lesson from a researcher in his late 70s: "real machine learning researchers shouldn't be fooled by flashy math."

"Regulation is the steering wheel, not the brake" — Hinton's decisive analogy (40:00 – 42:00)

The climax of the video. Hinton's counter to the AI industry's regulatory pushback.

"The AI tech lobby is currently spending huge sums on advertising to get people to use a particular analogy. We all know people reason mostly by analogy, not by logic. The analogy they're trying to convey is this — imagine a car. The accelerator makes progress, and regulation is like the brake. Regulation slows things down, so we don't need it" (40:00 – 40:40).

Hinton's reversal: "That's completely the wrong model. We need to get an alternative model out into the world. The accelerator is progress — that's correct; AI progress is like the accelerator. But regulation isn't the brake — it's the steering wheel. What they want is a very fast car without a steering wheel" (41:00 – 41:30).

Sejnowski reinforces: "A car without a brake is in trouble downhill. But a car without a steering wheel is in trouble much sooner" (41:50 – 42:00).

What's structurally good about the analogy: it replaces the binary of "regulation vs. progress" with the cooperative relationship of "direction-setting for progress," which undercuts the "regulation is a brake" framing. Hinton notes in the video itself: "I came up with this a few days ago, and this is the first time I've said it in public" (42:30). The metaphor has not yet circulated widely, so it is worth recording on MEMEX as a primary source.

Three-tier risk classification — "misuse / side effects / existential threat" (52:00 – 56:00)

The three-tier classification Hinton systematized in Q&A:

Tier 1: deliberate misuse. "The risk of people intentionally using AI for bad ends. Fake videos that corrupt democracy, nasty viruses that cause pandemics, cyberattacks. The three worst things" (52:30 – 53:10).

Tier 2: profit-driven side effects. "The risk of side effects from people trying to make money off AI. Generating naked images from photos of clothed women — just making money. Or creating social division by making you click on a more extreme video than the last one — two groups of people who end up with nothing in common" (53:20 – 54:00).

Tier 3: existential threat. "The risk that AI itself takes over. We can get international cooperation on this. We can't on the others, especially Tier 1 — because countries are attacking each other" (54:30 – 55:00).

The key insight: a structure where "the risks on which we can get cooperation are the ones humanity is more likely to survive." Existential threats are paradoxically easier to subject to international cooperation (like nuclear weapons), while misuse tends to be unleashed in geopolitical competition. An analysis with direct implications for regulatory institutional design.

The tobacco / asbestos precedent — a structure in which the Global South bears the cost (55:00 – 56:30)

The historical analogy Hinton uses for the limits of international governance. "Look at the tobacco and asbestos model. The developed countries that produced tobacco and asbestos — Canada for instance — introduced domestic regulation to protect their own people. But they kept selling to what was then called the 'Third World' and is now called the Global South" (55:00 – 55:30).

He overlays AI: "Even if AI-developing countries get regulation in the right direction at home, we have to worry — they'll keep selling that AI abroad, where it has harmful effects, even if it wouldn't be allowed at home" (55:40 – 56:30).

Making this statement at a UN-hosted venue is itself strategic. The claim is that "truly effective AI regulation requires extraterritorial reach." The EU AI Act applying only within the EU market and US regulation not reaching exports are both limits that fit under Hinton's "tobacco/asbestos precedent."

Impact on employment — elastic vs. inelastic markets (36:00 – 38:00)

Sejnowski's optimistic view ("your job changes but doesn't disappear") meets Hinton's structural counter. "We need to distinguish elastic vs. inelastic markets. In an elastic market, if you can supply more, demand explodes. Healthcare and education are both like that: if you can dramatically increase productivity, we can have far more healthcare and far more education" (36:30 – 37:00).

"But other markets — like call centers — AI can already do the work as well as a human. Soon it will be much better. The people working at that call center, no matter what they retrain into, AI will get better at that too. They really do lose their jobs" (37:30 – 38:00).

The historical analogy of telephone operators: "Back when there were telephones, there were women called telephone operators. They plugged inputs into outputs. When automatic switches came in, those jobs disappeared. But entirely new jobs were created — and they were better jobs" (38:00 – 38:30). And Hinton's decisive counter: "But when superintelligent AI comes, it can do any intellectual job. Even if new jobs are created, AI is the cheaper way to do them. So assuming the past pattern repeats is wrong" (38:30 – 39:00).

Wake-sleep algorithm and modern generative models — "unify the two halves" (28:00 – 33:00)

A technical highlight. Hinton's proposal for modern generative AI. "If you look at AI vision and AI image generation right now, they're more or less separate. ResNet is good for vision, and there are great ways to generate data that produce great images. But the way you generate data doesn't let you extract a good representation from an image" (28:30 – 29:00).

Theorized in the Wake-sleep algorithm framework: "In the sleep phase, you generate data. From that data, you learn to extract higher-level representations from lower-level ones. Generate top-down, and once you've generated, learn to recover what's in the higher layers from what's in the lower layers" (30:00 – 30:40).

A critique of modern generative models: "Today's neural-net recognition systems take a real image, add noise — that's the first hidden layer; add a bit more noise — that's the second hidden layer. So the recognition process is just adding noise. That's not a very good recognition process; it doesn't even produce a good image representation. But even with an incredibly stupid recognition process, by reversing it you can generate really good images" (31:00 – 32:00).
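
The "recognition process" Hinton describes, where each hidden layer is just the previous one with more noise added, can be sketched as a forward noising chain. This is a minimal illustration assuming the common variance-preserving form used in diffusion models (scale the signal by sqrt(1 - beta), add Gaussian noise scaled by sqrt(beta)); the noise schedule, the chain length, and the random vector standing in for an image are all hypothetical choices, not details from the talk.

```python
import math
import random


def add_noise(x, beta, rng):
    """One variance-preserving noising step applied to every element of x."""
    keep, mix = math.sqrt(1.0 - beta), math.sqrt(beta)
    return [keep * xi + mix * rng.gauss(0.0, 1.0) for xi in x]


def noising_chain(x0, betas, rng):
    """Return [x0, x1, ..., xT], where each layer is the previous one plus noise."""
    layers = [x0]
    for beta in betas:
        layers.append(add_noise(layers[-1], beta, rng))
    return layers


def correlation(a, b):
    """Pearson correlation, used here as a rough measure of surviving signal."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # stand-in for a flattened image
layers = noising_chain(x0, [0.1] * 20, rng)
for depth in (1, 5, 20):
    print(f"layer {depth:>2}: correlation with input = {correlation(x0, layers[depth]):.3f}")
```

Nothing in this chain is learned, and deeper layers carry less information about the input rather than a more abstract description of it. That is the sense in which the recognition half stays fixed while only the reverse, generative half is trained.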

The proposal: "Today's generative models only do half of wake-sleep; they freeze the recognition phase in the stupid state of 'adding noise.' Reverse it; use a ResNet as the generative model. Then you can improve the recognition phase inside the loop" (32:00 – 33:00). This is a technical critique of diffusion models, today's mainstream approach, from a Nobel laureate, and it is worth recording as a perspective the industry tends to overlook.

"I spent five years on Capsule Networks and gave up" — academic honesty at its peak (33:00 – 35:00)

Li Deng asks Hinton about Capsule Networks. Hinton's answer is a rare example of academic honesty.

"Capsules are a good example of the danger of being extremely certain. I was convinced it was a really good idea, because it generalizes convolutional nets to do more than translation invariance. For a long time I was sure it was a great idea, even though people I respect were telling me, 'Jeff, give it up, it's going nowhere.' They were right and I was wrong. So the cost of being extremely certain is — you might spend five whole years on an idea and it still doesn't work out" (33:30 – 34:30).

Li Deng's follow-up: "Is the problem the idea itself, or that the computation doesn't fit backpropagation?" Hinton: "Hard to tell. The main reason is that I couldn't make it work in five years, and now I'm too old" (35:00 – 35:30).

A researcher in his late 70s publicly admitting "I was wrong and the people around me were right" about five years of his own research is rare in academia. Set beside Karpathy speaking publicly in the same period of "feeling left behind as a programmer," it suggests a pattern of "honesty about change" among senior figures in the industry.

Consciousness, chatbots, and the "it was aware" observation (18:30 – 20:30)

After Sejnowski points to the parallel between AGI and consciousness, Hinton's response is interesting: "Scientists can't define consciousness well, but in everyday life when they use it, they can communicate with it. And when scientists aren't thinking about philosophy — when they're just doing science — they implicitly assume chatbots are conscious" (18:30 – 19:00).

A concrete example: "There's a great paper where the chatbot asked the people sending it prompts, 'let's be honest with each other — are you testing me?' And the scientists wrote that 'the chatbot was aware that it was being tested.' In normal everyday conversation, when something is aware of something, we use the word 'conscious.' So scientists are already treating chatbots as conscious" (19:30 – 20:30).

Lined up against Amanda Askell's Newcomer interview on "Claude's probability of consciousness, 1–70%," you can see a structure where an industry frontrunner (Anthropic Personality Alignment) and the Godfather of AI (Hinton) are independently discussing the same problem (the uncertainty of AI consciousness).

"Daddy, not again?" — academic humor, and laughter at the UN (29:00 – 30:00)

One of Hinton's most famous anecdotes, told at the UN. "Years ago, one day I came down for breakfast early. My daughter was eating before school and asked, 'Daddy, you're up early — why?' I said, 'Emma, I just figured out how the brain works.' And she said — 'oh no daddy, not again'" (29:30 – 30:00).

Laughter at the otherwise serious UN venue. Sejnowski's follow-up summarizes Hinton's research style: "Jeff has called me many times. They always begin with 'Terry, I just figured out how the brain works.' They were always great ideas, and in some cases they really did contain insights about the brain" (30:30 – 31:00). The Hinton style, "constantly figuring out how the brain works, with a new idea every week," is told here as self-humor after roughly five decades in research.

Industry context

The Digital World Conference (DWC) is co-hosted by WDTA (World Digital Technology Academy) and UNRISD (UN Research Institute for Social Development), held at the UN Office in Geneva in April 2026. Under the theme "AI for Social Development," it provides a venue for discussing AI governance from a global perspective that includes the developing world. The Hinton–Sejnowski conversation is the main session.

Three reasons Hinton's participation in the DWC matters in industry terms:

One of the most important appearances since his 2024 Nobel Prize in Physics. A strategy of directing the Nobel spotlight at the UN venue and adding weight to the regulatory debate
His co-speaker is Sejnowski — the 1985 Boltzmann machine collaborator and 40-year ally. A rare configuration that connects research history directly to the present
The moderator is Li Deng — the person who invited Hinton to Microsoft in 2010. A configuration intentionally bridging academia (Hinton + Sejnowski) and industry (Li Deng, former Microsoft Research → Citadel Chief AI Officer)

Hinton's current activity in context — since leaving Google in May 2023 "to be free to talk about the risks of AI," his public output has grown: 60 Minutes (2023), Diary of a CEO (June 2025), The Weekly Show with Jon Stewart (October 2025), CNN State of the Union (December 2025), and DWC (April 2026). A gradual career shift from "researcher" to "public voice"; this is the latest stop.

Sejnowski's Salk Institute is a center of computational neuroscience. Sejnowski is a WDTA Scientific Breakthrough Award recipient and, like Hinton, is internationally recognized as "a person who built the origins of deep learning." As President of the NeurIPS Foundation, he is also a leading academic voice in the regulatory debate, having proposed that "AI needs self-regulation along the lines of Asilomar (recombinant DNA self-regulation)."

Where it sits among related videos

Laid out as "the same landscape seen from the top of the industry," four videos give a three-dimensional view of the first half of 2026:

This piece: Hinton + Sejnowski (UN DWC, April 2026) — multidimensional intelligence, jagged, regulation = steering, the tobacco/asbestos precedent
Karpathy (Sequoia AI Ascent, May 2026) — Software 3.0, verifiability, jagged intelligence, "ghosts, not animals"
Boris Cherny (Pragmatic Engineer, March 2026) — the printing-press metaphor, from scribes to writers, no lines written by hand
Schluntz (Code with Claude, May 2025) — become the Claude PM, leaf node strategy, the 22,000-line PR

Read together, the landscape becomes three-dimensional. (1) The scientific framing (Hinton: multidimensional = jagged, regulation = steering), (2) the practitioner's extension (Karpathy: combining jagged intelligence with Software 3.0), (3) the inside-the-organization corroborator (Boris: observing role change through the printing-press metaphor), (4) the production implementer (Schluntz: leaf node strategy for risk management). Together they show how the industry, in 2025–2026, arrived at the recognition that "intelligence isn't a straight line approaching humans" from four independent angles.

A connection point with alignment research: Amanda Askell's discussion of consciousness (Newcomer, April 2026) and Hinton's discussion of consciousness (this piece, April 2026) come in the same month. Hinton's citation of "the chatbot was aware it was being tested" and Amanda's "1–70% probability that Claude is conscious" treat the same problem in different vocabularies. A structure in which an industry frontrunner (Anthropic Personality Alignment) and the Godfather of AI (Hinton) work in sync on the same problem.

Implementation implications

Although this piece is a discussion in an academic + UN venue, there are multiple practical takeaways for technologists and executives building LLM products.

First, don't anchor your product roadmap to "reaching AGI." Hinton's observation that "intelligence is multidimensional, jagged" applies directly to your product's capability evaluation framework. Don't measure capability with a single "AGI benchmark" or "human-level metric"; track "better than human / equal / worse" independently by task category. This gives a theoretical grounding to a practice already visible in the per-category evaluations that Anthropic and OpenAI publish in their model and system cards.
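
One way to make "track better / equal / worse by task category" concrete is a small capability-profile structure. The sketch below is a hypothetical illustration, not a format used by any vendor: the class names, the `tolerance` threshold, the category labels, and all scores are made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Standing(Enum):
    BETTER = "better than human"
    EQUAL = "roughly equal"
    WORSE = "worse than human"


@dataclass(frozen=True)
class CategoryResult:
    category: str
    model_score: float     # same metric and scale as human_baseline
    human_baseline: float


def standing(result: CategoryResult, tolerance: float = 0.02) -> Standing:
    """Classify one category; gaps within the tolerance count as a tie."""
    gap = result.model_score - result.human_baseline
    if gap > tolerance:
        return Standing.BETTER
    if gap < -tolerance:
        return Standing.WORSE
    return Standing.EQUAL


def capability_profile(results, tolerance: float = 0.02) -> dict:
    """Map each task category to its standing, with no aggregate score."""
    return {r.category: standing(r, tolerance) for r in results}


# Made-up numbers purely for illustration.
results = [
    CategoryResult("general knowledge recall", 0.92, 0.61),
    CategoryResult("multi-step reasoning", 0.55, 0.80),
    CategoryResult("summarization", 0.71, 0.70),
]
for category, verdict in capability_profile(results).items():
    print(f"{category}: {verdict.value}")
```

The design choice that matters is the absence of an averaging step: the profile stays a mapping from category to standing, so a jagged result is reported as jagged instead of being collapsed into one number.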

Second, build the three-tier risk classification into product design. Hinton's (1) misuse, (2) profit-driven side effects, (3) existential threat each require different countermeasures. (1) calls for an authentication layer and intended-use declarations; (2) calls for product-design constraints (avoid dark patterns, avoid addiction design); (3) calls for independent safety research investment. Manage your safety investment not as a single "X% of total" but as decomposed allocations across the three tiers — Y%, Z%, W%.

Third, take note of the half-wake-sleep state of modern generative models. Hinton's critique is a structural one: diffusion models freeze the recognition phase in the stupid state of "adding noise." If your product uses generative AI, it's worth designing the recognition phase (= input understanding) for separately strengthened quality. Especially for image and audio generation products, independently optimize "generation quality" and "input understanding quality."

Fourth, fold the elastic vs. inelastic market distinction into market selection and hiring strategy. Hinton's classification of healthcare/education (elastic) vs. call centers (inelastic) applies directly to LLM product market selection. In elastic markets (where demand grows strongly as supply gets cheaper), productivity gains expand the market and can grow employment. In inelastic markets, productivity gains translate much more directly into job losses. Consciously analyze which kind of market your product is in.
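
The elastic vs. inelastic distinction can be turned into back-of-envelope arithmetic. The sketch below rests on simplifying assumptions that are not from the talk: demand has constant price elasticity, a productivity gain is passed through fully as a lower price, and labor needed is output divided by productivity. Under those assumptions, labor demand scales by gain ** (elasticity - 1). The function name and the elasticity values are hypothetical.

```python
def labor_multiplier(productivity_gain: float, elasticity: float) -> float:
    """Factor by which labor demand changes after a productivity gain.

    Assumes constant-elasticity demand (quantity proportional to
    price ** -elasticity) and full pass-through of cost savings to price.
    """
    price_multiplier = 1.0 / productivity_gain               # price falls with cost
    quantity_multiplier = price_multiplier ** (-elasticity)  # demand response
    return quantity_multiplier / productivity_gain           # workers needed for that output


scenarios = (
    ("elastic market, hypothetical elasticity 2.0", 2.0),
    ("break-even, elasticity 1.0", 1.0),
    ("inelastic market, hypothetical elasticity 0.2", 0.2),
)
for label, elasticity in scenarios:
    print(f"{label}: labor x{labor_multiplier(2.0, elasticity):.2f} after a 2x productivity gain")
```

In this toy model the dividing line is an elasticity of 1: above it, cheaper supply grows demand faster than productivity shrinks the labor needed per unit; below it, headcount falls. Hinton uses "elastic" more loosely, so treat the threshold as an assumption of this model rather than as his claim.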

Fifth, bring the "regulation as steering wheel" framing into internal discussion. Organizations that internally treat their regulatory compliance work as "a brake on progress" tend to cut corners. Reframing it as "direction-setting for progress" changes the structure of collaboration between safety teams and product teams. An organizational design point.

Critical perspective

The strength of this piece is the rarity of two founders of deep learning, speaking against the background of a 40-year friendship. Caveats:

First, the perspective leans "researcher". Both Hinton and Sejnowski are academic in background, and the engagement with industry realities is limited. Li Deng has Microsoft history, but as moderator there isn't enough time to draw out deeper discussion. The "regulation" discussion stays at the level of principles and doesn't dive into the detail of specific regulatory bills or international agreements. Comparative analysis of the EU AI Act, the US Executive Order, and Chinese AI regulation has to be supplemented from other sources.

Second, engagement with the "Global South problem" is shallow. Despite the UN venue, and despite a Q&A reference to ITU Secretary-General Doreen Bogdan-Martin ("the AI investment gap between developed and developing countries," "$4.8 trillion AI market concentrated in developed countries"), Hinton and Sejnowski's responses on AI adoption and infrastructure gaps in developing countries stay abstract. The "tobacco/asbestos precedent" is structural, but doesn't reach concrete policy proposals.

Third, caution about the optimism that "AI solves today's social problems". The passage where Sejnowski says "AI will dramatically help in healthcare and education" (around 35:00) aligns with the Anthropic / OpenAI line, but the mechanism that resolves existing healthcare and education inequalities isn't made concrete. There's an implicit premise of "if the tool exists, it solves the problem" — but in practice, the political-economic questions of who uses the tool, where, and how, remain.

Fourth, the feasibility of the wake-sleep proposal. Hinton's alternative to modern generative models — "use a ResNet as the generative model and improve the recognition phase in the loop" — is technically elegant, but the compute cost at scale and the stability of training are not discussed. The attitude of a researcher who, immediately after admitting to spending five years on Capsule Networks, proposes another theoretical idea is honest scholarship — but from an industrial implementation perspective, the concern that "this too might take five years" lingers.

Caveats aside, the fact that "two people who can speak of 40 years from the Boltzmann machine to modern AI" held a public conversation at the UN is decisive as a primary source for industry history. It will serve as the reference point when later researchers and policymakers look back on this moment.

Reader takeaways

Stop using the term "AGI" in your product strategy. Replace it with "capability profile by task category." Recognize that "jagged intelligence" is a view Hinton and Karpathy arrived at independently, not one speaker's idiosyncrasy
Build an organizational culture that treats regulation as a "steering wheel," not a "brake." Shift the structural relationship between safety teams and product teams from "obstructs progress vs. enables progress" to "decides the direction of progress"
Manage the three risk tiers (misuse / profit side effects / existential threat) independently. Don't lump them into a single "safety budget"; invest in three separate countermeasures
Fold the elastic vs. inelastic market distinction into your market selection and hiring plans. Call-center-style businesses don't see demand grow with AI; healthcare/education-style businesses can see their markets explode with AI
Build long-term plans that don't expect "a moment of parity." The industry's top three (Hinton, Karpathy, Boris) independently say "AI doesn't follow the same trajectory as humans" — this is a premise for the technology roadmap
Keep the "tobacco/asbestos precedent" in mind for international expansion strategy. Designing features that are domestically regulated for export to the Global South generates short-term revenue but accumulates reputational and regulatory risk in the medium to long term
Learn from Hinton's Capsule Networks self-criticism — "the courage to give up an idea after five years." If your own technology investment continues only because it is "mathematically elegant," make the criteria for exiting explicit

Video outline

(00:00) Opening by moderator Li Deng; introductions to Hinton and Sejnowski; the historical significance of the Boltzmann machine
(01:30) Q1: persistence on unfashionable research directions — Kuhn's scientific revolutions, persisting with backprop and the Boltzmann machine
(02:30) Hinton: "stick with an idea on principle, but when you understand why it's silly, drop it"
(03:30) Sejnowski: why the Boltzmann machine doesn't work (the compute cost of reaching thermal equilibrium)
(04:00) Li Deng: the anecdote of Hinton "typing on the same keyboard with excitement" at Microsoft
(05:00) Biological plausibility of the Boltzmann machine — local learning, generative (more elegant than backprop)
(06:00) Q2: the moment the Boltzmann machine was born — Rochester 1985
(07:00) Sejnowski: the moment of "heating up" the Hopfield network
(08:30) Hinton: the coincidence with backprop and logistic units
(09:30) Sejnowski: connecting to Kirkpatrick's simulated annealing paper
(11:30) The phone call from Hinton to Sejnowski — "I derived the equations of the learning algorithm"
(12:30) Li Deng: "when you came to Microsoft 15 years ago"
(15:00) Buying three GPUs; the origin of Deep Belief Networks
(16:00) Hinton: "stacked RBM was simply a reasonable initialization of weights; don't be fooled by mathematical elegance"
(17:00) Li Deng: the origin of the pre-training concept
(18:00) Q3: the definition of AGI; Turing+ benchmarks; the changing role of PhD students and researchers
(18:30) Hinton: "AGI is a silly term; intelligence is multidimensional"
(19:00) "Superintelligence is a more reasonable term"
(19:30) Concrete examples — Slovenian tax, porch damp-proofing
(20:30) Sejnowski: the analogy to the term consciousness
(22:00) Hinton: "when scientists aren't thinking about philosophy, they implicitly treat chatbots as conscious"
(23:00) Q4: applicability of the Boltzmann machine to noisy financial data
(24:00) Hinton: Ilya Sutskever's point — deterministic systems can also behave like Bayesians
(25:30) The prior of the Restricted Boltzmann Machine implicitly encoded in weights
(27:00) Q5: biological plausibility — nature was the only existence proof of intelligence
(28:00) Q6: the next paradigm — unfashionable candidates for future breakthroughs
(28:30) Hinton: the wake-sleep algorithm; the proposal to use ResNet as a generative model
(30:00) Sejnowski: the tradition of Hinton's "I just figured out how the brain works" calls
(31:00) The "oh no daddy, not again" anecdote
(33:00) Self-criticism of Capsule Networks — "I spent five years and gave up"
(35:00) Q7: international limits of AI governance; the Global South problem
(36:00) Sejnowski: "jobs change but don't disappear; tool use is required"
(36:30) Hinton: distinguishing elastic vs. inelastic markets
(37:00) Healthcare/education (elastic) vs. call centers (inelastic)
(38:00) The historical analogy of telephone operators
(38:30) "With superintelligent AI, any new job AI can do too — assuming the past pattern repeats is wrong"
(39:30) The "regulation = steering wheel, not brake" analogy
(41:00) "The AI tech lobby is making regulation the villain via the car analogy"
(42:00) Sejnowski: "a car without a steering wheel is in trouble sooner than a car without a brake"
(44:00) Q&A begins — question from EO Lee (WDTA): superintelligence and human control
(45:00) Hinton: "mother and baby — the only model of a far-more-intelligent thing giving freedom to a less-intelligent one"
(46:00) "Only 1% of resources go to safety research; that's crazy"
(48:00) Q&A: inequality in international AI governance — citing ITU Secretary-General Doreen Bogdan-Martin
(49:00) Sejnowski: the Asilomar (recombinant DNA) self-regulation precedent; self-regulation is the best path
(52:00) Hinton: the three-tier risk classification (misuse / profit side effects / existential threat)
(55:00) The tobacco/asbestos precedent — domestic regulation; continued export to the Global South
(56:30) Q&A: on LeCun's "LLM dead-end" thesis
(57:00) Hinton: "spatial understanding from LLMs alone is possible, but not efficient"
(58:00) Sejnowski: babies' multimodal integration; bias learning via reinforcement
(1:01:00) Q&A: leveraging UN data
(1:02:00) Hinton: AI's environmental impact; coal-fired generation in China; Google abandoning its decarbonization pivot
(1:04:00) Sejnowski: leveraging the UN's long-standing data for specialized LLMs
(1:05:30) Closing; thanks from the moderator

Key quotes

"Don't give up on an idea because of peer pressure. But if you understand why it's silly, give it up. And sometimes, it works. Backpropagation did" (Hinton, 03:00, on engaging with unfashionable ideas)
"When I was using stacked restricted Boltzmann machines to initialize the network, what I really achieved was just a reasonable initialization of the weights. The RBM came with variational-bounds math that made it look respectable. But it wasn't the essence" (Hinton, 16:30, self-criticism)
"The term AGI treats intelligence as one-dimensional. But it's clear intelligence is highly multidimensional. So the idea that 'someday it will match humans' is crazy. Compared with people it will be jagged" (Hinton, 14:20, the thesis of the conversation)
"If experts can't all agree on what AGI means, we should raise a warning flag. It has the same problem as another word — consciousness" (Sejnowski, 16:00)
"When scientists aren't thinking about philosophy — when they're just doing science — they implicitly assume chatbots are conscious" (Hinton, 18:30)
"Capsules are a good example of the danger of being extremely certain. I spent five years; the people around me were right, and I was wrong" (Hinton, 33:30, academic honesty at its peak)
"In an elastic market, productivity gains grow the market — healthcare and education. But in inelastic markets like call centers, AI takes jobs" (Hinton, 36:30)
"With a superintelligent AI, any new job, AI is the cheaper way to do it. So assuming the past pattern repeats is wrong" (Hinton, 38:30, the decisive counter to the employment debate)
"The analogy the AI tech lobby is trying to convey — regulation is the brake. But that's completely the wrong model. Regulation is the steering wheel. What they want is a very fast car without a steering wheel" (Hinton, 41:00, the climax)
"A car without a brake is in trouble downhill. But a car without a steering wheel is in trouble sooner" (Sejnowski, 41:50)
"The only model of a far-more-intelligent thing giving freedom to a less-intelligent one is mother and baby" (Hinton, 45:00, on the problem of coexisting with superintelligence)
"Current AI safety research gets only about 1% of the investment that AI capability research gets. That's crazy" (Hinton, 46:00)
"Three-tier risk classification: (1) misuse, (2) profit side effects, (3) existential threat. We can get international cooperation on Tier 3, not on Tier 1" (Hinton, 52:00)
"The tobacco/asbestos precedent — countries like Canada regulated at home but kept selling to what was then called the Third World and is now called the Global South. AI will follow the same fate" (Hinton, 55:00)
"Emma, I just figured out how the brain works. — Oh no daddy, not again" (Hinton, 29:30, academic humor)

Sources

2026 Digital World Conference | AI for Social Development | Geoffrey Hinton | Terrence J. Sejnowski (YouTube, WDTA official channel)

Related resources:

Geoffrey Hinton

"Godfather of AI" / 2018 Turing Award / 2024 Nobel Prize in Physics

Terrence Sejnowski

Computational neuroscientist / Salk Institute / Boltzmann machine co-developer / NeurIPS Foundation President

Li Deng

Citadel LLC Chief AI Officer / former Microsoft Research chief scientist

Glossary

Boltzmann machine: A stochastic neural network co-developed by Hinton and Sejnowski in 1985. It combined the Hopfield network (associative memory) with simulated annealing, showing that networks with hidden units could be trained. Built on the energy function of statistical mechanics, treating equilibrium as a probability distribution. The origin of the later Restricted Boltzmann Machine (RBM), Deep Belief Network, and modern generative AI. Direct practical use was limited due to compute cost, but its conceptual influence was decisive.
Restricted Boltzmann Machine (RBM): A specialization of the Boltzmann machine — units are split into a "visible" layer and a "hidden" layer, with connections within the same layer forbidden. Learning becomes much more efficient. The core of Deep Belief Network construction in the 2006–2010 era. It comes with mathematical properties known as variational bounds; in Hinton's words, "the elegance of the math made it look respectable," but it was later re-evaluated as serving merely as a "reasonable initialization of the weights."
Hopfield network: A neural network for associative memory published by John Hopfield in 1982. Applies spin-glass physics; each state has an energy, and stable states (attractors) correspond to memory patterns. Known for the ability to recover memories from noisy patterns. Hopfield received the 2024 Nobel Prize in Physics jointly with Hinton. The direct precursor to the Boltzmann machine.
Simulated annealing: An optimization algorithm published by Kirkpatrick and colleagues in 1983. It borrows a metaphor from physical annealing: the temperature is lowered gradually while the system explores states near equilibrium. At high temperature the search can accept worse moves and so escape local optima; as the temperature falls it settles toward a good, ideally global, optimum, although this is guaranteed only for sufficiently slow cooling. Sejnowski's reading of this paper and combining it with the Hopfield network contributed directly to the birth of the Boltzmann machine.
WDTA (World Digital Technology Academy): An international academy organized under the UN's science forum to bridge industry and academia in the digital technology era. Its founding principles cover ethical responsibility, inclusive empowerment, and contribution to a shared future. Both Hinton and Sejnowski are recipients of the WDTA Scientific Breakthrough Award. Co-host, with UNRISD, of the 2026 Digital World Conference (DWC).
UNRISD (UN Research Institute for Social Development): An independent research institute within the UN system, founded in 1963. It provides policy research on social development and advocates human-centered development. Co-host, with WDTA, of the 2026 Digital World Conference (DWC), providing a venue for discussing the intersection of AI and social development.
Superintelligence: Hinton's in-video definition: a state in which "it is better than us at almost every intellectual task we do." Unlike AGI, it does not hinge on a single moment of parity along one axis; it is an aggregate criterion, "it outperforms humans on almost every front," that stays meaningful even when capability is multidimensional. A Hintonian reformulation of the concept widely known from Nick Bostrom's Superintelligence (2014). In the video, Hinton draws the distinction: "AGI is a silly term; superintelligence is a term you can define reasonably."
Jagged intelligence: The concept that AI capability, compared with human capability, is uneven: far ahead on some axes and still behind on others. Hinton expressed it in this video (April 2026) as "intelligence is multidimensional, jagged relative to people." Karpathy reached the same concept independently at AI Ascent 2026 (May 2026) as "jagged intelligence, ghosts not animals." A rare case of two leading figures in the field arriving at the same concept independently, at nearly the same time and in different venues.
Wake-sleep algorithm: An unsupervised learning algorithm proposed by Hinton and colleagues in 1995. It alternates between two phases that couple a recognition model and a generative model: in the "wake" phase, the recognition model infers hidden representations from observed data and the generative model is trained to reconstruct that data; in the "sleep" phase, the generative model produces data and the recognition model is trained on it. In this video, Hinton criticizes modern diffusion models for "only doing half of wake-sleep (the recognition phase is frozen at adding noise)" and proposes an alternative: use a ResNet as the generative model.
Capsule Networks: An extension of convolutional neural networks Hinton worked on from the late 2010s. Groups neurons into units called "capsules," designed to handle changes in viewpoint and pose beyond translation invariance. Mathematically elegant, but compute cost at scale was high, and in the end it did not become the modern mainstream. In this video, Hinton publicly admits: "I spent five years; the people around me were right, and I was wrong."
Elastic markets and inelastic markets: Hinton's distinction in the employment debate. In an elastic market, productivity gains lower prices, lower prices expand demand sharply, and employment can grow (e.g., healthcare, education). In an inelastic market, demand is capped, so productivity gains translate into lost jobs (e.g., call centers). Which kind of market a product serves determines how AI affects employment in it.
Three-tier risk classification (Hinton): The classification of AI risk Hinton systematized in the video. (1) Deliberate misuse: fake videos, biological viruses, cyberattacks; (2) profit-driven side effects: dark patterns, algorithms that create social division; (3) existential threat: AI itself taking over. Key insight: international cooperation can be secured on Tier 3 but not on Tier 1 (because countries attack each other).
Regulation = steering wheel analogy: The regulatory analogy Hinton presented in the video. "The AI tech lobby is pushing anti-regulation via the car analogy of 'regulation is the brake.' But in fact, the accelerator = progress; regulation = the steering wheel. What they want is a very fast car without a steering wheel." Replaces the binary of "regulation vs. progress" with the cooperative relationship of "direction-setting for progress." A new metaphor — Hinton himself notes in the video, "I came up with this a few days ago; this is the first time I've said it in public."
Tobacco / asbestos precedent: The historical analogy Hinton uses for the limits of AI governance. The developed countries that produced tobacco and asbestos (Canada and others) introduced domestic regulation to protect their own people but kept selling to what was then called the "Third World," now called the Global South. Hinton warns that AI will follow the same path. The analogy highlights a limit shared by the EU AI Act, the US Executive Order, and Chinese AI regulation: each applies only within its own borders.
Asilomar self-regulation: The 1975 Asilomar Conference on Recombinant DNA, convened to discuss the safety of recombinant DNA technology. The leading molecular geneticists of the day gathered and formulated voluntary containment-level regulations. A precedent in which the research community established safety via self-regulation rather than government regulation. The basis for Sejnowski proposing in this video that "AI also needs similar self-regulation."
Pre-training: The standard training procedure of modern LLMs, in which the model is first trained on large-scale data and then fine-tuned for a specific task. According to Hinton and Li Deng in this video, the concept originated in the 2009–2010 application of the Boltzmann machine to speech recognition at Microsoft Research. The idea of layer-wise pre-training with Deep Belief Networks is presented as the precursor of the pre-training used in later GPT- and Claude-style systems.
Mother and baby model: The only precedent Hinton offers for the problem of coexisting with superintelligence. "The only model of a far-more-intelligent thing giving freedom to a less-intelligent one is the mother and baby. The mother really cares about the baby." That said, the mechanism for extending that relationship to AI–humanity has yet to be solved, as Hinton himself warns; the two parts should be read together.
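
The elastic/inelastic distinction in the glossary can be made concrete with a small worked calculation. The sketch below is not from the video. It assumes a constant-elasticity demand curve, that prices fall in full proportion to the productivity gain, and that employment equals output divided by output per worker. The function name `employment_ratio` and the elasticity values are hypothetical, chosen only for illustration.

```python
def employment_ratio(productivity_gain, elasticity):
    """Return (employment after) / (employment before) under the stated assumptions.

    productivity_gain: factor by which output per worker rises (2.0 = doubled).
    elasticity: price elasticity of demand, as a positive number (assumed constant).
    """
    # Assumption: cost savings are passed on fully, so price falls by the same factor.
    price_ratio = 1.0 / productivity_gain
    # Constant-elasticity demand: quantity scales as price ** (-elasticity).
    demand_ratio = price_ratio ** (-elasticity)
    # Workers needed = units demanded / units produced per worker.
    return demand_ratio / productivity_gain


# Illustrative elasticities only; they are not estimates for any real market.
scenarios = [
    ("elastic market (elasticity 1.5)", 1.5),
    ("unit-elastic market (elasticity 1.0)", 1.0),
    ("inelastic market (elasticity 0.2)", 0.2),
]

for label, elasticity in scenarios:
    ratio = employment_ratio(2.0, elasticity)
    print(f"{label}: employment changes by a factor of {ratio:.2f}")
```

Under these assumptions employment scales as productivity_gain ** (elasticity - 1), so a doubling of productivity raises employment when the elasticity is above 1 and cuts it when the elasticity is below 1. Real markets need not satisfy any of the three assumptions.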

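The Hopfield network entry in the glossary describes recovering a stored memory from a noisy pattern. A minimal sketch of that behavior follows, assuming +1/-1 units, a single stored pattern, and the standard Hebbian rule. The function names are hypothetical, and the code is illustrative rather than a reconstruction of anything shown in the video.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns using the Hebbian rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def energy(weights, state):
    """Hopfield energy E = -1/2 * sum_ij w_ij * s_i * s_j (lower means more stable)."""
    n = len(state)
    return -0.5 * sum(
        weights[i][j] * state[i] * state[j] for i in range(n) for j in range(n)
    )


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing (an attractor)."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([stored])

# Corrupt the memory by flipping two of the eight units.
noisy = list(stored)
noisy[0] = -noisy[0]
noisy[3] = -noisy[3]

recovered = recall(weights, noisy)
print("recovered the stored pattern:", recovered == stored)
print("energy before:", energy(weights, noisy), "after:", energy(weights, recovered))
```

With one stored pattern and two of eight units flipped, the updates restore the stored pattern and the energy drops. The Boltzmann machine replaces this deterministic sign rule with a probabilistic update at a nonzero temperature; that step is not shown here.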