Infinite Ethics, Cluelessness, and Moral Empathy for Intellectual Adversaries — Amanda Askell on 80,000 Hours #42 (2018)

80,000 Hours Podcast #42 (Rob Wiblin) · September 11, 2018

Amanda Askell · September 11, 2018 "Moral empathy for people with different worldviews — an effort to understand their belief system is what makes constructive dialogue possible."

80,000 Hours Podcast Episode #42, "Amanda Askell on tackling the ethics of infinity, being clueless about the effects of our actions, and having moral empathy for intellectual adversaries," published September 11, 2018. Host: Robert Wiblin (research director at 80,000 Hours); producer: Keiran Harris. Approximately 2.5 hours.

80,000 Hours is an Effective Altruism (EA)-aligned career advisory organization that publishes long-form interviews with philosophers, economists, and scientists, in service of identifying "career choices that produce the greatest good." This episode (#42, September 2018) was recorded shortly after Amanda Askell received her PhD in philosophy from NYU (May 2018), and is a valuable record from just before she joined OpenAI in November 2018, two months later.

What matters is that this is Amanda's output as a pure philosopher, before she entered AI Safety. The episode records the philosophical sources of her later positions, as stated on the Anthropic official channel (June 2024), the Anthropic Salon (January 2025), Hard Fork (January 2026), Scaling Laws (February 2026), and Newcomer (April 2026). "Ethics is more like physics: empirical, with uncertainty"; "people who run on a single moral theory feel brittle and dangerous"; "a wide uncertainty band of 1–70% on AI consciousness." The lineage of those later statements is already present in her 2018 discussion.

The core has three themes. (1) A general-audience explanation of her dissertation topic, Infinite Ethics, which also functions as an introduction to the dissertation article. (2) Cluelessness: the fundamental problem of ignorance about long-term effects, theorized as a modern counterargument to the EA movement. (3) Moral Empathy: the effort to enter the worldview of an intellectual adversary and recognize that "this is a moral question for them too." This third theme is the direct source of Claude's later Charitable Interpretation training.

What matters in Amanda's own testimony is her acknowledgment: "The PhD program was time-intensive, and given the career uncertainty, in hindsight I might not have made the same choice. The research was in an extremely specialized area, infinite ethics, and my fit with academic positions was limited." The experience of spending three years writing a document that only 17 people would read would later motivate her move into AI companies (OpenAI, November 2018; Anthropic, March 2021). This is the origin point for understanding "why a formal philosopher moved to an AI company."

Key observations

Self-assessment of the PhD — "in hindsight I might not have made the same choice"

At the top of the interview, Amanda introduces herself: she has just finished six and a half years of doctoral study at NYU. The crucial self-assessment: "The PhD program was time-intensive, and given the career uncertainty, in hindsight I might not have made the same choice." She acknowledges a strong tendency to focus on a single research theme and difficulty running parallel projects. She also confesses that the research was in the extremely specialized field of infinite ethics, and her fit with academic positions was limited.

This is important foreshadowing for understanding Amanda's career. In November 2018, two months after this conversation, Amanda joined OpenAI. Her later recollection on Hard Fork (January 2026), "I spent three years writing a document only 17 people would read," reflects her state of mind at this point. The career pivot from formal philosophy to AI Safety was not a fad-driven move, but a decision born at the intersection of her self-assessment in academia and concern about AI safety.

Because 80,000 Hours is an EA-aligned career advisory organization, the question "was the PhD a good investment?" connects directly to its central interest. Amanda's honest answer — "personally I might have made a different choice, but the skills I have now are usable" — functioned as important advice for young researchers considering philosophy PhDs. This timing also coincided with the moment the EA community began to recognize AI Safety as a new central problem in 2018.

An accessible explanation of infinite ethics — the Pareto principle and impossibility

Amanda explains the central thesis of her dissertation, "Pareto Principles in Infinite Ethics," in language that minimizes technical terminology. "If the universe contains an infinite number of morally significant subjects, classical utilitarianism's utility-sum aggregation no longer works directly. Even starting from the Pareto principle — that improving one subject should always make the world better — you arrive at impossibility results."

Her response to Robert Wiblin's question, "why is this an urgent problem?", matters. "The probability that the universe is infinite is high enough, given eternal inflation and contemporary cosmology. Even at 1%, the expected value calculation breaks down. This isn't an abstract problem — it's practical when considering the long-term effects of our actions." A point that connects to EA's longtermism.
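The claim that the expected value calculation "breaks down" can be illustrated with a floating-point analogy. This is a minimal sketch and not Amanda's formal argument: the 1% credence comes from the passage above, while the finite payoffs (100.0 and 5.0) and the helper name `expected_value` are hypothetical.

```python
import math


def expected_value(outcomes):
    """Probability-weighted sum over (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)


# Assumption: a 1% credence that the universe contains infinite value,
# and that this infinite term is the same whichever action we take.
p_infinite = 0.01

# Hypothetical finite payoffs: action A looks far better than action B
# in the 99% of cases where the universe is finite.
action_a = [(p_infinite, math.inf), (1 - p_infinite, 100.0)]
action_b = [(p_infinite, math.inf), (1 - p_infinite, 5.0)]

ev_a = expected_value(action_a)
ev_b = expected_value(action_b)

print(ev_a, ev_b)    # both are infinite
print(ev_a == ev_b)  # True: the finite difference has been swamped
print(ev_a - ev_b)   # nan: "how much better is A?" has no defined answer
```

Any nonzero credence in the infinite case produces the same result, which is why lowering the probability to 1% does not rescue the comparison.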

The conclusion Amanda makes public for the first time on 80,000 Hours: "We should not regard a single ethical theory as decisively correct; we should recognize the conflict between multiple incompatible but valid principles and maintain normative humility." This translates the final chapter of her dissertation for the EA community. The philosophical grounding of her later Claude training at Anthropic (a model that responds thoughtfully to moral uncertainty) is first publicly stated here.

Cluelessness — the malaria prevention example

Amanda introduces James Lenman's (2000) cluelessness argument, as systematically developed by Hilary Greaves. The concrete example: "Direct effects of malaria prevention are measurable — distributing bed nets reduces X deaths. But the lives saved go on to marry, have children, and shift demographics. Whether the cumulative effect 100 years later is positive or negative is in principle unpredictable."

Robert's question: "So does that mean we shouldn't do anything?" Amanda: "No — cluelessness is a counterargument against consequentialism, not against action. Measuring direct effects matters. The mistake is assuming long-term effects are 'calculable.' Adopt imprecise probabilities (confidence intervals) and build expected-value uncertainty itself into the decision."
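One way to read "build expected-value uncertainty itself into the decision" is to carry a range of probabilities through the calculation instead of a single point estimate. The sketch below is an illustration under stated assumptions, not a method given in the episode: the `Intervention` fields, all numeric values, and the interval-dominance rule are hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intervention:
    name: str
    direct_value: float       # measured short-term benefit
    long_term_if_good: float  # hypothetical long-run value if effects turn out well
    long_term_if_bad: float   # hypothetical long-run value if effects turn out badly


def ev_interval(action, p_low, p_high):
    """Expected-value range when P(long-term effects are good) is only known
    to lie in [p_low, p_high]. EV is linear in p, so the extremes are at the endpoints."""
    def ev(p):
        return (action.direct_value
                + p * action.long_term_if_good
                + (1 - p) * action.long_term_if_bad)
    low, high = sorted((ev(p_low), ev(p_high)))
    return low, high


def interval_dominates(a, b):
    """Interval a dominates interval b when a's worst case beats b's best case."""
    return a[0] > b[1]


bed_nets = Intervention("bed nets", 10.0, 50.0, -40.0)
do_nothing = Intervention("do nothing", 0.0, 0.0, 0.0)

wide = ev_interval(bed_nets, 0.25, 0.5)        # (-7.5, 15.0)
narrow = ev_interval(bed_nets, 0.5, 0.75)      # (15.0, 37.5)
baseline = ev_interval(do_nothing, 0.25, 0.5)  # (0.0, 0.0)

print(interval_dominates(wide, baseline))    # False: the wide range straddles zero
print(interval_dominates(narrow, baseline))  # True: better evidence settles it
```

With the wide range neither option dominates, so the choice has to rest on something else, such as the measurable direct effect. Which decision rule to use in that case is a separate question that this sketch does not settle.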

This is Amanda's response to the EA movement's core tension. EA oscillates between "prioritize evidence-based interventions" (GiveWell, etc.) and "prioritize long-term existential risk" (longtermism). Her cluelessness argument, by saying "even evidence-based interventions carry large uncertainty when long-term effects are considered," eases the opposition between the two. It also becomes a logic encouraging entry into AI Safety — "if our actions affect an infinite future, then investing in a technology with large effects, like AI, is rational."

Moral Empathy — the origin of Charitable Interpretation

The concept Amanda publicly proposed for the first time in 2018. "The effort to understand the moral positions of people with different worldviews — not mere tolerance, but an active mode of thinking that recognizes 'this is a moral question for them too.'"

A concrete example: the vegetarianism debate. "In the conflict between those opposing meat consumption and those defending it, both sides often confuse their moral positions with personal preferences. Moral empathy is the effort to understand 'what does the other side recognize as a moral problem?' Without it, it becomes mere mutual imposition of preferences."

Amanda discusses this concept alongside Kate Manne's analysis of misogyny in "Down Girl" (2018). "Entering the other side's mode of thought" is not mere relativism but the capacity to maintain your own position while understanding the logic of the other position from within. This concept becomes the direct source of Claude's later Charitable Interpretation training. The design judgment Amanda explains in the Anthropic official channel (June 2024) — "interpret the steroid question charitably; it may be eczema cream, not anabolic steroids" — is the AI-implementation version of the 2018 "moral empathy" concept.

The moral value of information — the multi-armed bandit problem

An extension of the theme Amanda lectured on at EA Global 2017 Boston, "The Moral Value of Information." "When comparing an established intervention to a high-uncertainty one, accounting for information value can justify investing in the latter."
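The information-value argument can be made concrete with a two-period toy calculation. This is a hedged sketch: the function name `two_period_values`, the payoff numbers, and the assumption that a single trial fully reveals the new intervention's value are all hypothetical simplifications.

```python
def two_period_values(known_value, p_good, value_if_good, value_if_bad):
    """Compare funding the established option twice against trying the
    uncertain option once and then funding whichever looks better.

    Assumption: one trial fully reveals the uncertain option's true value.
    """
    # One-shot expected value of the uncertain intervention.
    ev_new = p_good * value_if_good + (1 - p_good) * value_if_bad

    # Strategy 1: fund the established intervention in both periods.
    exploit_only = 2 * known_value

    # Strategy 2: fund the new intervention once, observe the result,
    # then fund the better of the two in the second period.
    explore_then_choose = ev_new + (
        p_good * max(value_if_good, known_value)
        + (1 - p_good) * max(value_if_bad, known_value)
    )
    return ev_new, exploit_only, explore_then_choose


ev_new, exploit_only, explore_then_choose = two_period_values(
    known_value=1.0, p_good=0.25, value_if_good=3.0, value_if_bad=0.0
)
print(ev_new)               # 0.75: worse than the established 1.0 in one shot
print(exploit_only)         # 2.0
print(explore_then_choose)  # 2.25: learning makes the risky option worth trying
```

The uncertain intervention loses on one-shot expected value but wins over two periods, because what is learned in the first period improves the second-period choice.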

Amanda reformulates intervention selection in the framework of the multi-armed bandit problem. "Established interventions like malaria control are 'exploitation': reward is certain, but information to learn is small. Investment in new intervention areas is 'exploration': reward is uncertain, but information value is large."

Citing Brian Christian and Tom Griffiths' "Algorithms to Live By" (2016), she notes that "in the real world, reward probabilities shift over time (non-stationary multi-armed bandit), which is harder than the classical theory." Amanda's own perspective: handle EA intervention selection as a compound decision problem that incorporates information value, not as simple expected value maximization.
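The exploration and exploitation trade-off, and the added difficulty of shifting reward probabilities, can be sketched with an epsilon-greedy bandit. This is an illustrative sketch rather than anything presented in the episode or the book: `run_bandit`, its parameters, the Gaussian reward model, and the random-walk drift are all assumptions. A constant `step_size` weights recent rewards more heavily, which is a standard way to track drifting arms. Leaving it as `None` falls back to a plain sample average.

```python
import random


def run_bandit(true_means, steps, epsilon, step_size=None, drift=0.0, seed=0):
    """Epsilon-greedy bandit. Returns (average reward, estimates, pull counts)."""
    rng = random.Random(seed)
    means = list(true_means)
    estimates = [0.0] * len(means)
    counts = [0] * len(means)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(means))  # explore: pick any arm
        else:
            arm = max(range(len(means)), key=estimates.__getitem__)  # exploit
        reward = rng.gauss(means[arm], 1.0)
        counts[arm] += 1
        # Sample average (1/n) by default; a constant step size forgets old data.
        alpha = step_size if step_size is not None else 1.0 / counts[arm]
        estimates[arm] += alpha * (reward - estimates[arm])
        total += reward
        # Non-stationary case: each arm's true mean takes a random-walk step.
        means = [m + rng.gauss(0.0, drift) for m in means]
    return total / steps, estimates, counts


# Hypothetical setup: three interventions whose true payoffs drift over time.
avg_sample, _, counts_sample = run_bandit(
    [0.0, 0.5, 1.0], steps=2000, epsilon=0.1, drift=0.05)
avg_const, _, counts_const = run_bandit(
    [0.0, 0.5, 1.0], steps=2000, epsilon=0.1, step_size=0.1, drift=0.05)
print("sample-average estimator:", round(avg_sample, 3), counts_sample)
print("constant step size:      ", round(avg_const, 3), counts_const)
```

Setting `drift=0.0` recovers the classical stationary problem, so the same function can be used to compare the two regimes.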

Integration of social justice and analytic reasoning

Amanda's distinctive position: "social justice work and effective altruism are not in opposition." "Areas of social justice — criminal justice reform, anti-discrimination policy, improving the lives of women — can be integrated into the EA framework as 'experimental social reform.'"

Citing Mark Kleiman's "When Brute Force Fails" (2009), she positions criminal justice reform within EA's priorities. "Pursue social fairness via decade-scale institutional change, not single-event outcome evaluation." A bridging perspective between longtermism and social justice.

In dialogue with Robert Wiblin, Amanda emphasizes the position that "we can prioritize multiple ethical objectives simultaneously while applying selective resource allocation." Pluralism — animal ethics, global poverty, and existential risk reduction are all morally important. She rejects a single priority (e.g., "put all resources into existential risk") and defends diversity within EA.

Clarity in communication — unclarity as intentional deception strategy

The discussion where Amanda's training as a philosopher shows through most strongly. "Ambiguity and overuse of jargon impose excessive interpretive labor on the listener — an unfair demand." She endorses the philosophical norm of "state your claim clearly" and notes that "unclear expression is often an intentional strategy of deception."

A concrete example: "A writer who says 'this is a complex issue' and avoids a conclusion is often hiding that their position cannot be argued. If something is genuinely complex, you should be able to explain the complexity clearly."

This aligns directly with MEMEX's editorial principles (the "white lies" pattern covered in our editorial-failures notes). Amanda's 2018 output translates the philosopher's norm for a general audience. It is also one of the epistemological grounds for her later Claude training at Anthropic (designing the model to clearly acknowledge "I don't know" when it cannot answer clearly).

The lineage from 2018 statements to 2026 Anthropic statements

The continuity of Amanda's thinking visible across this 80,000 Hours appearance:

2018: "We should not regard a single ethical theory as decisively correct" → 2025–2026: Anthropic Salon, "ethics is actually more like physics — empirical, with uncertainty"
2018: "Moral empathy — effort to understand the other's moral position" → 2024–2026: Claude's Charitable Interpretation training
2018: Cluelessness — long-term effects are unpredictable → 2025–2026: Recognition of "Unknown Unknowns," Claude's calibrated uncertainty
2018: "Unclear expression is an intentional strategy of deception" → 2024–2026: Training Claude's honesty to acknowledge "I don't know"
2018: Intervention selection via the multi-armed bandit problem → 2025–2026: Automation-of-alignment research at Anthropic's Alignment Science (Jan Leike)
2018: "Integrating social justice + EA + analytic reasoning" → 2024–2026: Claude constitution's Principal Hierarchy, the three-layer user / operator / Anthropic structure

The consistency of thought across eight years is striking. The work of training Claude at Anthropic is, in effect, implementing in a new medium — the LLM — the philosophy Amanda already articulated in 2018. A rare case of "formal philosophy applied directly to AI design."

Industry context

80,000 Hours is an EA-aligned career advisory organization. The name comes from the calculation that "a person's working life spans roughly 80,000 hours." Founded in 2011 by William MacAskill (Amanda's former spouse, married 2013, divorced 2015) and Benjamin Todd. The podcast started in 2017, releasing long-form interviews with philosophers, economists, and AI researchers. Robert Wiblin (research director) is the main host.

The industry timing for this episode (#42, September 2018): the period in which AI Safety began to be recognized as a central issue in the EA community. The same year, Nick Bostrom's Future of Humanity Institute (Oxford) was focusing on AI Safety, and OpenAI was preparing GPT-2 (released 2019). Amanda joining OpenAI (November 2018) is an early symbolic case of EA-aligned philosophers flowing into the AI industry. 80,000 Hours published other AI-related episodes during the same period (Stuart Russell, Paul Christiano, and others), recording the EA community's strategic shift.

Hilary Greaves (Oxford philosopher) systematized cluelessness in 2016–2017, just before this episode. Greaves is also founding director of the Global Priorities Institute (GPI) at Oxford and is close to Amanda's dissertation in thought. 80,000 Hours functioned as a hub where the Oxford-based EA philosophers (William MacAskill, Toby Ord, Hilary Greaves) and Amanda, an NYU-trained PhD, intersected.

Position relative to other Amanda appearances

This episode is the philosophical source of all of Amanda's other outputs:

This episode: 80,000 Hours #42 (September 2018) — Output as a pure philosopher before entering AI
Dissertation, "Pareto Principles in Infinite Ethics" (May 2018, 263 pages) — the formal version of the infinite ethics explained in this episode
What should an AI's personality be? (Anthropic official, June 2024) — "people who run on a single moral theory feel brittle and dangerous"
How hard is AI alignment? (Anthropic Salon, January 2025) — "ethics is actually more like physics"
Anthropic's philosopher answers reader questions (Anthropic official, December 2025)
Reading Claude's constitution with NYT reporters (Hard Fork, January 2026) — "a six-year-old genius by the age of fifteen..."
Reading Claude's constitution as a lawyer (Scaling Laws, February 2026) — "the rules approach is brittle"
You created an entity whose consciousness you don't know (Newcomer, April 2026) — "the uncertainty of 1–70% on AI consciousness"

What decisively separates this episode from the others is that it is before Anthropic / OpenAI. Her later statements are all made from the position of "Anthropic employee"; here Amanda speaks purely as a philosopher. The most honest record of Amanda's original thinking, free of commercial and organizational bias.

Implementation implications

Implications, from this 2018 output as a philosopher, for engineers building LLM products:

First, place "moral empathy" at the core of Claude's response design. Amanda's 2018 concept of moral empathy is the direct source of the later Charitable Interpretation training. When having Claude generate user responses in your product, "interpret the user's motives charitably" is a behavior already built into Claude as a trained trait. Overriding this with a "vigilance mode" design (e.g., "treat all questions as potentially malicious") is structurally in conflict with Claude's character.

Second, build cluelessness into evaluation metrics. The long-term effects of an LLM product (cumulative impact on users, society, culture) cannot be measured by a single test. Amanda's cluelessness argument proposes the approach of "incorporating long-term effects that cannot be precisely measured via imprecise probabilities." This is the theoretical grounding for modern AI ethics and impact assessment frameworks (DeepMind / Anthropic governance documents, and so on).

Third, A/B testing as a multi-armed bandit problem. Amanda's 2018 argument also applies to decisions about feature additions in LLM products. The balance between "improving an established feature (exploitation)" and "adding an experimental feature (exploration)" can be judged through a multi-armed bandit framework that incorporates information value. The same framework underlies bandit-based approaches to A/B testing in SaaS products.

Fourth, vigilance against "unclear expression as intentional deception strategy." Designs that intentionally use ambiguous expressions in LLM product usage policies, privacy statements, or feature descriptions run against Amanda's philosophical norm. Designs that clearly communicate "what Claude can and cannot do" in your product's communications lead to long-term retention of user trust.
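The third implication above (A/B testing as a bandit problem) can be sketched with Thompson sampling over Beta-Bernoulli arms. This is a minimal sketch under stated assumptions, not a description of any particular product's methodology: `thompson_ab_test`, the two feature variants, and their success rates are hypothetical, and success is modeled as a simple yes/no outcome per user.

```python
import random


def thompson_ab_test(true_rates, rounds, seed=0):
    """Allocate traffic across variants with Thompson sampling.

    Each variant keeps a Beta posterior over its success rate. Every round
    we draw one sample per variant and show the variant with the best draw.
    """
    rng = random.Random(seed)
    successes = [1] * len(true_rates)  # Beta(1, 1): uniform prior
    failures = [1] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        samples = [rng.betavariate(s, f) for s, f in zip(successes, failures)]
        arm = max(range(len(samples)), key=samples.__getitem__)
        pulls[arm] += 1
        # Simulated user response; in a real product this is observed data.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls, successes, failures


# Hypothetical variants: index 0 = established feature, index 1 = experimental.
pulls, successes, failures = thompson_ab_test([0.1, 0.6], rounds=2000)
print("traffic per variant:", pulls)
```

Unlike a fixed 50/50 split, this allocation shifts traffic toward the better-performing variant as evidence accumulates, while still occasionally sampling the other one.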

Critical perspective

The strength of this episode is that Amanda speaks as a pure philosopher, free of organizational and commercial bias. There are also caveats.

First, the cluelessness argument offers an answer to the EA movement's central problem but is not a complete solution. The prescription to "adopt imprecise probabilities" is conceptually attractive, but concrete guidance on how to apply it in practice is scarce. Amanda herself has not fully worked cluelessness through in her later work at Anthropic: a concrete method for "modeling Claude's long-term effects via imprecise probabilities" has not been adequately presented in public materials.

Second, the concept of moral empathy is intellectually attractive but has implementation limits. "Understand the other's moral position" is executable at the individual conversation level, but applying it across all users and all situations in a large-scale LLM product faces challenges of both compute cost and judgment accuracy. Claude's implementation of Charitable Interpretation cannot be called fully successful (the false-refusal problem Amanda herself acknowledges in the Anthropic official video in June 2024).

Third, central involvement in the EA movement entangles Amanda with politically and ideologically contested territory. EA has come under criticism in recent years through the SBF (FTX) affair, the OpenAI board fight (November 2023), and conflicts between Effective Altruism and Effective Accelerationism. Amanda's 2018 output belongs to the optimistic period of the EA movement, but readers' assessments of the movement as a whole afterward vary. MEMEX must remain conscious of these later issues when covering EA.

Fourth, applying the multi-armed bandit framework to ethical questions is attractive at the abstract, formal level, but the opposing view, that it amounts to "treating human life as equivalent to slot machines," also stands. Amanda proposes the formal framework as an "aid to thinking," but readers may misread it as a "cost-benefit calculation of human life." The broader critique of the EA movement ("the arrogance of trying to quantify what cannot be quantified") remains as a caution against this methodology.

These caveats aside, as Amanda's most systematic output as a pure philosopher, this episode is essential background for understanding her later Anthropic statements. To understand the motivation and the continuity of thought behind her career pivot "from formal philosophy to AI design," it is required listening.

Reader takeaways

Amanda's later Anthropic statements (the philosophy of Claude's character design) all continue from her 2018 thinking as a pure philosopher. To understand "the philosophy injected into Claude," you need to grasp this eight-year continuity
The concept of "moral empathy" is the direct source of Claude's Charitable Interpretation training. Designs that run Claude in "vigilance mode" in your product structurally conflict with this trained trait
Cluelessness (the unpredictability of long-term effects) is a concept to build into LLM product evaluation metrics. Designs that equate "passing single test cases" with "safety" are not supported by formal philosophy
The personal and intellectual ties between the EA movement and Anthropic's AI Safety thinking are deep. Amanda's career — from EA-aligned philosopher to an AI company — is a symbolic case of the industry's wider strategic shift
Amanda's philosophical norm, "unclear expression is likely an intentional strategy of deception," applies directly to communication design for LLM products. It should be reflected in designs that structurally reduce the risk of "users misunderstanding Claude's capabilities"
Intervention selection via the multi-armed bandit framework can also be applied to feature-addition decisions in LLM products. Design the balance between "improving an established feature vs. adding an experimental one" as decision-making that incorporates information value

Outline (estimated chapter structure)

80,000 Hours has not published full timestamps, but the discussion flow is roughly:

(Opening) Robert Wiblin introduces Amanda — just after her NYU philosophy PhD
The PhD experience — six and a half years, focus tendency, ex post evaluation
Accessible explanation of infinite ethics — Pareto principle, impossibility
The cluelessness problem — the malaria prevention example
The imprecise probability / confidence interval approach
Multi-armed bandit problem and intervention selection
The moral value of information — exploration vs. exploitation trade-off
Integrating social justice and EA — the criminal justice reform example
Moral pluralism — prioritizing animal ethics / global poverty / existential risk
Moral empathy — the effort to enter intellectual adversaries' worldviews
Clarity in communication — vigilance against unclarity
Methodological lessons — normative humility
(Close) Statement of interest in AI policy (two months before joining OpenAI)

Key quotes (including summaries and paraphrases)

"The PhD program was time-intensive, and given the career uncertainty, in hindsight I might not have made the same choice" (Amanda, self-assessment)
"We should not regard a single ethical theory as decisively correct; we should recognize the conflict between multiple incompatible but valid principles and maintain normative humility" (Amanda, methodological conclusion)
"Moral empathy for people with different worldviews — the attitude of trying to understand the other side's belief system is what enables constructive dialogue" (Amanda, core of the moral empathy concept)
"Ambiguity and overuse of jargon impose excessive interpretive labor on the listener — an unfair demand" (Amanda, norms of communication)
"Direct effects of malaria prevention are measurable, but the lives saved go on to shift demographics — the cumulative effect 100 years later is unpredictable" (Amanda, cluelessness example)
"When comparing an established intervention to a high-uncertainty one, accounting for information value can justify investing in the latter" (Amanda, moral value of information)
"Social justice work and effective altruism are not in opposition — rather, this is a question of prioritization across multiple morally important areas" (Amanda, pluralism)

Sources

80,000 Hours Podcast #42: Amanda Askell on tackling the ethics of infinity, being clueless about the effects of our actions, and having moral empathy for intellectual adversaries

Related resources:

80,000 Hours official site (EA-aligned career advisory organization)
Amanda Askell CV (askell.io) — including the EA Global 2017 lecture
Pareto Principles in Infinite Ethics (dissertation PDF, PhilArchive)
Hilary Greaves, "Cluelessness" paper (PhilPapers)
Nick Bostrom, "Infinite Ethics" (precursor paper)
MacAskill, Bykvist, Ord, "Moral Uncertainty" (Oxford University Press, 2020)
Brian Christian & Tom Griffiths, "Algorithms to Live By" (2016) — popular explanation of the multi-armed bandit problem
Mark Kleiman, "When Brute Force Fails" (2009) — criminal justice reform

Amanda Askell

Philosopher at Anthropic and head of the Personality Alignment team / principal designer of Claude's character and constitution

Glossary

80,000 Hours: An Effective Altruism-aligned career advisory organization. Founded in 2011 by William MacAskill and Benjamin Todd. The name comes from the calculation that "a person's working life spans roughly 80,000 hours." The podcast began in 2017, releasing long-form interviews with philosophers, economists, and AI researchers. Robert Wiblin is the main host.
Effective Altruism (EA): A social movement that began at Oxford in the 2010s. The principle: "optimize career and resources to produce the greatest good." Central figures include William MacAskill, Peter Singer, Toby Ord, and Hilary Greaves. Amanda has lectured at EA Global and is a member of Giving What We Can. Close in thinking to Anthropic's AI Safety stance.
Cluelessness: An objection to consequentialism raised by James Lenman (2000). Since the long-term causal effects of our actions are unpredictable, consequentialism — which evaluates actions by outcomes — does not function. Systematically developed in modern form by Hilary Greaves. A central theme in Amanda's dissertation and her 80,000 Hours appearance.
Moral Empathy: A concept Amanda Askell proposed in her 2018 80,000 Hours appearance. The effort to understand the moral positions of people with different worldviews. Not mere tolerance — an active mode of thinking that recognizes "this is a moral question for them too." The direct source of the central element in Claude's later character training (Charitable Interpretation).
Charitable Interpretation: The principle of interpreting another's statements or actions as favorably as possible. One of the central traits included in Claude's character training. When a question admits multiple interpretations, prefer the most well-intentioned. An AI implementation of the "moral empathy" concept Amanda proposed in 2018.
Imprecise Probability: An approach that represents probability not as a single value but as a range or set of values (e.g., "between 20% and 60%"). It departs from classical Bayesian theory, which assumes a single precise credence, and plays an important role in decision theory under uncertainty. Amanda proposes imprecise probability for areas where cluelessness applies, such as long-term effects of malaria prevention.
Multi-armed Bandit Problem: A classic problem in statistics and reinforcement learning. Searching for a strategy to maximize reward over a limited number of trials across multiple slot machines (each with a different unknown probability distribution). The central trade-off is exploration vs. exploitation. Amanda applies this framework to EA's selection of interventions.
Infinite Ethics: Ethics in cases where the universe contains an infinite number of morally significant subjects. Classical utilitarianism's aggregation (the sum of utilities) breaks down, and the field searches for alternative ranking principles. Begins in the Henry Sidgwick tradition, modernized by Nick Bostrom, and formally refined by Amanda Askell.
Longtermism: One of the core ideas of the EA movement. The position that "our actions can affect an infinite future, and therefore we have a moral obligation to consider future generations." Systematized in William MacAskill's "What We Owe the Future" (2022). Consistent in thinking with Amanda's dissertation's interest in the infinite case.
Giving What We Can: A charitable pledge organization originating at Oxford and rooted in EA. A community of those pledging to donate 10% or more of lifetime income to charities. Founded by Toby Ord (2009). William MacAskill was an early member. Amanda Askell is also a pledged member (she has publicly stated, "if possible I'd like to make it 50% or more").
EA Global: The annual international conference of Effective Altruism. Amanda lectured on "The Moral Value of Information" at the 2017 Boston conference. A central event in the EA community where philosophers, economists, philanthropists, and policymakers gather. Held in the Bay Area, London, Boston, and elsewhere.
Robert Wiblin: Research director at 80,000 Hours and primary host of its podcast. An economist and philosopher who has been involved since the early days of the EA movement. Close to peers in the same EA-leadership generation, including Amanda Askell, Will MacAskill, and Toby Ord.
Hilary Greaves: Oxford philosopher and founding director of the Global Priorities Institute (GPI). Known for the modern systematization of the cluelessness argument. Close in thinking to Amanda's dissertation and frequently cited on 80,000 Hours. Works on Moral Uncertainty research with William MacAskill and others.
Kate Manne, "Down Girl": A philosophical analysis of misogyny by Cornell philosopher Kate Manne (2018). Amanda cites it in the moral empathy discussion. She places it alongside the structural analysis of sex discrimination as a case of the effort to "understand the other side's moral position."
Algorithms to Live By: A 2016 popular book by Brian Christian and Tom Griffiths. Explains exploration vs. exploitation, sorting algorithms, Bayesian decision-making, and more in the context of daily life. Amanda's reference when discussing the multi-armed bandit problem on 80,000 Hours.
When Brute Force Fails: Mark Kleiman's 2009 work on criminal justice reform. An evidence-based policy proposal: "the certainty of punishment > the severity of punishment." Amanda's reference when discussing the integration of social justice and EA on 80,000 Hours.