Who Gets to Teach the Machine Right from Wrong?

Last week, representatives from Anthropic and OpenAI sat down with religious leaders from across the theological spectrum — Hindu, Baha'i, Sikh, Greek Orthodox, Mormon — for what was billed as the inaugural "Faith-AI Covenant" roundtable in New York. Organized by the Geneva-based Interfaith Alliance for Safer Communities, the stated goal was to figure out how to infuse morality and ethics into fast-developing artificial intelligence. More roundtables are planned in Beijing, Nairobi, and Abu Dhabi.

It's a striking image: the architects of systems reshaping how humanity works, thinks, and communicates, sitting across from representatives of traditions that have been thinking about human nature for millennia. Easy to mock. Easy to celebrate. Having spent serious time in both worlds, I'm skeptical of how casually each is being invoked here.

There is a real argument for bringing religious and philosophical traditions into conversations about AI ethics, and it deserves to be made honestly before it's interrogated. Religious traditions have spent centuries wrestling with questions that AI developers are only beginning to ask in earnest: What do we owe each other? Who bears responsibility for harm caused at a distance, through systems we design but do not fully control? What does it mean to act justly under conditions of radical uncertainty? These are not engineering questions. They are not market questions. And the frameworks that dominate Silicon Valley — optimization, scale, user engagement — are, on their own, morally impoverished.

Aristotle identified something in the Nicomachean Ethics that remains stubbornly relevant: practical wisdom, what he called phronesis, cannot be reduced to rules. It is a capacity developed through experience, judgment, and character. You cannot encode phronesis. You can only cultivate it, in people, over time.

More fundamentally: AI systems already encode values. They do this through training data, moderation policies, human feedback loops, and edge-case decisions made by engineers under deadline pressure. The question was never whether AI has values. It has always been whose values, and who decided.

In 2021, researchers at the Allen Institute for AI released a system called Delphi. Its purpose was audacious: generate moral judgments about ethical scenarios submitted by users. Ask it whether it's acceptable to steal bread to feed a starving child. Ask it whether lying to protect someone's feelings is justified. Delphi would return a verdict — it's okay, it's wrong, it depends.

The results were instructive, and not in the way the researchers intended. Delphi produced outputs that were biased, contradictory, and at times genuinely disturbing — confidently delivering judgments that reflected the particular demographics and assumptions baked into its training data rather than anything resembling moral reasoning. It had learned to sound like moral authority while doing something categorically different: pattern-matching on aggregated human opinion.

Ludwig Wittgenstein observed that if a lion could speak, we could not understand him — because genuine comprehension is inseparable from the form of life that generates it. Delphi inverts the problem. It speaks our language with apparent fluency. But the fluency is the trap. It creates a false sense of moral presence, the impression that something is reasoning when it is calculating, that something is judging when it is predicting.

Hannah Arendt's concept of the banality of evil is uncomfortable to invoke but difficult to avoid. Arendt's insight was that catastrophic harm does not require malice — it requires only thoughtlessness, the suspension of genuine moral reasoning in favor of procedural compliance. A system that processes ethical questions at scale without the capacity for genuine reasoning is not evil. It is something potentially more dangerous: thoughtlessness institutionalized, automated, and deployed at speed.

Delphi was, at its core, an attempt at mechanistic casuistry — the case-based moral reasoning with deep roots in Catholic and Jewish ethical traditions. But casuistry, properly practiced, is not a lookup table. It is a discipline embedded in interpretive community, sustained by centuries of argument, and held accountable by the ongoing practice of asking why, not just what. Delphi extracted the form and discarded the substance. What remained told us less about machine morality than about the limits of reducing ethics to prediction.

Which brings us to what makes the Faith-AI Covenant genuinely difficult, and it isn't primarily technical. Religious traditions do not agree with each other. More importantly, they do not agree within themselves — on gender and authority, on punishment and mercy, on the nature of conscience and the limits of forgiveness, on whether moral truth is discovered or revealed, absolute or contextual, universal or communal.

Thomas Aquinas built the most systematic moral theology the Western tradition has produced, and even within Thomism, applying principle to particular cases requires prudentia — a situated, embodied, relational judgment that cannot be formalized without remainder. If the most architecturally rigorous tradition in Western ethics resists full codification, what exactly are we expecting a language model to capture?

Isaiah Berlin spent much of his intellectual life on a related problem: that genuinely important human values are sometimes irreconcilable, not because we haven't thought hard enough, but because the conflict is real. Liberty and equality. Justice and mercy. Tradition and progress. Berlin's point was not that we should stop trying to choose well. It was that we should stop expecting a unified framework to resolve the tension for us.

The skeptics of the Faith-AI Covenant raise a sharper version of this concern. Rumman Chowdhury, CEO of Humane Intelligence and former U.S. science envoy for AI, has called these efforts at best a distraction, at worst a way of diverting attention from structural questions about power and accountability that ethics language tends to obscure. There is a name for that pattern: ethics washing — the substitution of values discourse for governance, of roundtables for regulation, of principles documents for accountability mechanisms. The concern is not that religion has nothing to offer. It is that no single moral authority — religious, corporate, governmental, or algorithmic — should govern systems this powerful without pluralism, transparency, and genuine accountability structures.

John Rawls proposed a useful thought experiment: imagine designing the principles that will govern a society without knowing where in that society you will be born — your class, your culture, your tradition. What principles would you choose from behind that veil of ignorance? The question translates uncomfortably but usefully to AI governance. If you did not know which tradition or value system would be embedded in the systems governing your information environment, your medical decisions, your employment prospects — what oversight structures would you demand? The answer is probably not whichever tradition was best represented in the room that day.

A more durable approach involves pluralistic input, not as compromise but as epistemological necessity. Religious perspectives bring centuries of moral reflection. Secular philosophy brings frameworks for interrogating assumptions. Democratic oversight brings legitimacy. Legal governance brings accountability. Technical safeguards bring constraint. Transparency makes all of it auditable. The goal is not consensus, which may be impossible, but structured accountability across multiple frameworks simultaneously. Worth naming too: the most concerning version of this story isn't religion influencing AI. It is AI becoming the moral authority — treated as oracle, as arbiter, as the system beyond which there is no appeal.

Simone Weil wrote that attention is the rarest and purest form of generosity. She meant something precise: a quality of presence directed at a particular person, in a particular moment, that perceives what that person actually needs rather than what we expect or prefer them to need. It is irreducibly particular. It is an act of will as much as perception. It is exactly what no system optimized for scale can replicate — not because the engineers aren't trying, but because the architecture works against it.

I've sat in rooms where AI systems get shaped at the implementation level — not at roundtables, but in working sessions where someone has to decide which edge cases to prioritize, which harms to weight more heavily, which communities get included in testing, whose complaints get escalated and whose get closed. Those decisions are where values actually land. The Faith-AI Covenant is a conversation at altitude. It can surface frameworks, name stakes, build relationships across communities that rarely talk to each other. That has value. But it doesn't answer the ground-level question that matters most: who is in that room, what are they accountable to, and when something goes wrong, who gets to say so?

Religious traditions have spent millennia asking who bears responsibility for harm. That is exactly the right question. The challenge is ensuring the answer is structural, plural, and enforceable — not ceremonial.

