Conspicuous Cognition Podcast

by Dan Williams

17 episodes
Updated Daily
Accepts Guests · Has Sponsors · Location 🇬🇧

Podcast Overview

A podcast about big questions in philosophy, psychology, evolution, politics, artificial intelligence, and more. <br/><br/><a href="https://www.conspicuouscognition.com?utm_medium=podcast">www.conspicuouscognition.com</a>

Language

🇺🇲

Publishing Since

12/12/2024

Recent Episodes

June 16, 2026

What If Artificial Intelligence Progress Explodes? (with Benjamin Todd)

<p>Benjamin Todd, co-founder of 80,000 Hours, joins Dan and Henry to discuss whether artificial intelligence progress could become explosive.</p><p>Benjamin explains why he thinks transformative artificial intelligence by 2030 is a serious possibility, how feedback loops in artificial intelligence research could accelerate progress, and why the most important risks now go beyond classic alignment problems. The conversation covers artificial intelligence timelines, bottlenecks in chips and research talent, the future of work, mass unemployment, concentration of power, engineered pandemics, space governance, and how young people should think about their careers in a rapidly changing world.</p><p>Topics discussed include:</p><p>• Why 80,000 Hours increasingly focuses on artificial intelligence<br/>• The case for short timelines to transformative artificial intelligence<br/>• Whether artificial intelligence progress could become explosive<br/>• Feedback loops in artificial intelligence research<br/>• Chip bottlenecks, data centres, and geopolitical risk<br/>• Whether artificial intelligence will cause mass unemployment<br/>• Why “become a plumber” may be bad career advice<br/>• Alignment, control, and concentration of power<br/>• Misuse risks, engineered pandemics, and future governance<br/>• How to think clearly under extreme uncertainty</p><p>Benjamin Todd is the co-founder of 80,000 Hours and the author of 80,000 Hours, a new book about how to choose a career that is both personally rewarding and socially impactful.</p> <br/><br/>This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit <a href="https://www.conspicuouscognition.com/subscribe?utm_medium=podcast&#38;utm_campaign=CTA_2">www.conspicuouscognition.com/subscribe</a>

June 2, 2026

Academics Must Wake Up on AI (with Alexander Kustov)

<p>The political scientist <a target="_blank" href="https://alexanderkustov.org">Alexander Kustov</a> recently published a <a target="_blank" href="https://www.popularbydesign.org/p/academics-need-to-wake-up-on-ai">Substack post</a> with a provocative claim: that AI can already do social science research better than most professors. The post went viral. It attracted more than a million views and over a thousand responses, many of them very angry. (Some people even demanded that Alex’s university fire him.)</p><p>In this conversation, we talk about this controversy and the claims that triggered it, including:</p><p>* What agentic AI tools like Claude Code and Codex can already do for research, from coding and data analysis to literature reviews, translation, and brainstorming, and why only around 20% of quantitative social scientists currently use them.</p><p>* What best predicts whether researchers adopt or reject AI: ignorance, openness to experience, methodological background, or the awkward role of self-interest.</p><p>* How much published academic research is genuinely mediocre, and whether the cause is laziness, lack of skill, or a broken incentive structure, with a detour through the replication crisis and some high-profile fraud cases.</p><p>* Whether AI will raise the quality of research or simply flood the literature with more slop, and what journal editors could do about it.</p><p>* Whether AI can be genuinely creative or only recombine what already exists, by way of <a target="_blank" href="https://en.wikipedia.org/wiki/Margaret_Boden">Margaret Boden</a>’s three kinds of creativity, Thomas Kuhn on paradigm shifts, and <a target="_blank" href="https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol">AlphaGo’s “Move 37”</a>.</p><p>* The fight over AI writing and detection tools like <a target="_blank" href="https://www.pangram.com/">Pangram</a>, and why current disclosure norms end up punishing the honest.</p><p>* The angry response to Alex’s series, and 
what is really driving reflexive opposition to AI among academics.</p><p>Links and further reading</p><p>* <a target="_blank" href="https://alexanderkustov.org">Alexander Kustov</a> — Alex’s homepage, with an overview of his research on immigration, public opinion, and effective governance.</p><p>* <a target="_blank" href="https://www.popularbydesign.org/">Popular by Design</a> — Alex’s <a target="_blank" href="https://substack.com/@akoustov">Substack</a> on public opinion, persuasion, and the politics of getting good ideas adopted.</p><p>* <a target="_blank" href="https://www.popularbydesign.org/p/academics-need-to-wake-up-on-ai">Academics Need to Wake Up on AI</a> — followed by a <a target="_blank" href="https://www.popularbydesign.org/p/academics-need-to-wake-up-on-ai-part">Part II</a> and <a target="_blank" href="https://www.popularbydesign.org/p/academics-need-to-wake-up-on-ai-part-4c6">Part III</a></p><p>* <a target="_blank" href="https://www.pangram.com/">Pangram</a> — the AI-detection tool discussed at length, which labels text as human, AI-assisted, or AI.</p><p>* <a target="_blank" href="https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol">AlphaGo versus Lee Sedol</a> — the 2016 match, including the famous “Move 37” that Henry raises as a candidate for genuinely transformative machine creativity.</p><p>* <a target="_blank" href="https://en.wikipedia.org/wiki/Margaret_Boden">Margaret Boden</a> — the cognitive scientist whose distinction between combinational, exploratory, and transformative creativity frames part of the discussion.</p><p>* <a target="_blank" href="https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Revolutions">The Structure of Scientific Revolutions</a> — Thomas Kuhn’s account of normal science and paradigm shifts, referenced in the exchange about AI and 
discovery.</p><p>* <a target="_blank" href="https://www.chronicle.com/article/ai-is-a-better-researcher-than-you">“AI Is a Better Researcher Than You”</a> — The Chronicle of Higher Education’s account of the controversy around Alex’s series.</p><p>Transcript</p><p>* Please note that this transcript is lightly AI-edited and may contain minor mistakes. </p><p><strong>Dan Williams:</strong> Welcome back. I’m Dan Williams, and I’m back with my co-host, Henry Shevlin. Today we are honoured to be joined by Bluesky’s favourite academic, Alexander Kustov. Alex is a political scientist at the University of Notre Dame and the author of one of my favourite Substacks, Popular by Design. His primary research is on immigration and public opinion, but that’s not really what we’re going to be talking about today. We’re going to be talking about a fascinating and hugely viral series he published at his Substack titled “Academics Need to Wake Up on AI,” about what AI can already do when it comes to research, and what that means for the academics who are not paying attention, which is many of them. It was very widely read, and it generated, let’s say, a somewhat polarised response. So Alex, to kick us off: what’s the central thesis of this series, and what motivated you to write it?</p><p><strong>Alexander Kustov:</strong> Thanks, Dan, for having me. I’m a huge fan of the Substack and the whole podcast series with you and Henry. So, like some of us, I’ve been using some of these AI tools. I’ve been reading some of the other folks like yourself, and it really transformed everything I do in my life. And I should say I was also on sabbatical, so I had a little bit more time than some of my colleagues to try some of these tools. I just hadn’t really seen any of my colleagues talk about it. And when they did talk about it, they usually tried not to be vocal about it. 
I just didn’t think it was a good equilibrium, where basically people were using these tools to be ten times more productive and not talking about it. It really heightened this sense of inequality for me, which I do care about. You’d have a situation where someone would publish ten papers in a year and someone else would publish one, and the only difference is that the person publishing more is the one using Codex or whatever. I just wanted to write about it. And I saw that the prevailing academic discourse on the issue, especially on platforms like Bluesky, was very counterproductive.</p><p>I didn’t really say much, to be honest. I didn’t think it would be that controversial. But the biggest thesis that really rubbed people the wrong way was that right now a lot of these tools are better at a lot of the tasks that we do as professors. I’ve refined this idea a little bit, going back and forth with some of my critics, but I feel comfortable right now saying that if you look at it globally, and think about what professors do around the world, in social science and adjacent fields especially, agentic AI tools can do most of the tasks they do in terms of literature review, data analysis, and even coming up with some research questions, better than those professors on average. I think that’s a pretty uncontroversial statement at this point, but obviously a lot of people were very, very upset about it.</p><p><strong>Dan Williams:</strong> Empirically speaking, it is a controversial statement, in the sense that it provokes controversy when you say it. In a minute we can get to the question of what AI can actually do in the context of research. But for what it’s worth, I completely agree with you that on many tasks AI is clearly better than what human beings can do. Is your sense that lots of people just weren’t aware of that fact, that they literally didn’t have exposure to these tools? 
Or was your sense that the reason people weren’t really talking about it is because of all the controversy surrounding the use of these tools, not just mere ignorance?</p><p><strong>Alexander Kustov:</strong> I think it’s both, for sure. There was recent research done by Anthropic. They tried to do, not a representative survey, because obviously the population is very hard to define here, but they surveyed something like 1,200 quantitative social scientists, and the estimate right now is that about 20% of folks use agentic tools. That doesn’t seem like much at all, and if anything it’s probably an overestimate, because they’re more likely to tap into well-resourced universities. So I do think it’s both: the little uptake we have, and the fact that people who do use these tools don’t want to talk about it.</p><p>There are two things here. First, you want to maintain your comparative advantage. This moment right now is exactly the moment where, if you’re one of the few people using these tools, you can write a bunch of papers and get tenure while the tenure system is still in existence. And the other thing is that if people are very upset about anything AI-related, you don’t want to talk about it and be shamed by your colleagues. Just to give you one funny anecdote: at the height of the vitriol I experienced, where hundreds of people literally were quoting me and trying to tag my employer to get me fired, the exact same people were often DMing me and asking for my setup and prompts. So it’s very crazy to me that you have this big disconnect between what people say publicly and what they actually do privately.</p><p><strong>Dan Williams:</strong> I find it crazy that it’s only 20% of social scientists, or whatever the exact number is, that’s actually using agentic AI. Just before moving on, maybe we should explicitly address: in your view, what is it that agentic AI, as it exists right now, can do? 
What are the kinds of tasks it can do better than human beings, and how can it improve the workflow of an average social scientist?</p><p><strong>Alexander Kustov:</strong> Coding is the first thing. It’s literally in the name, Claude Code. That’s what these tools were designed for. If you talk to any coding person, a computer scientist, or even someone who isn’t a computer scientist but does a lot of coding for their work, I don’t think anyone would doubt that it’s a huge productivity improvement tool. And the vast majority of quantitative social scientists who do any kind of data analysis do a lot of coding, so they have to be very receptive to this by definition. And I think they often are.</p><p>What happens is that social science is comprised of a bunch of different tasks and topics that people can disagree over, depending on the field. Economics is pretty homogeneously quantitative and formal, so there you can definitely see the biggest uptake. But a lot of disciplines, like political science or sociology, are a mix of qualitative and quantitative folks. And a lot of this AI polarisation overlapped with that pre-existing divide. People who didn’t like stats, who didn’t believe in positivism, the idea that you can learn something about the social world using evidence, were also more reluctant to believe that AI is helpful for them. Which is funny, because, as I also mentioned in some of my writing, if anything those people are going to benefit from these tools, because Claude cannot really interview people and do ethnography yet. So in a way there will be more demand for very high-quality qualitative work. And there are some good examples of qualitative people I respect who embraced AI completely.</p><p>You can still use a lot of these tools to boost productivity outside the coding realm. You can write emails. 
One thing I think anyone would acknowledge, including the critics of AI, is that it’s definitely helping them respond to administrator emails, which no one likes, and which isn’t considered an unethical thing to do. I remember someone on a big account on Bluesky posted that AI should be banned except for transcription purposes, because they do a lot of interviews and it’s good for transcription, but everything else is off limits. And it’s interesting how the goalposts are changing right now. The biggest fight I’m getting involved in again, because that’s kind of what I do, is this idea of AI detection and disclosure, and we can talk more about it later. But there’s this interesting consensus forming that AI is good for research now, which was not the case half a year ago, from the same people. Now those people are saying AI is obviously good for research, duh, but it’s not good for writing, for a bunch of different reasons.</p><p>So we talked about coding and data analysis. There’s a lot of other things a normal researcher would need help with: getting a basic summary, a literature review, translation. It’s above my pay grade, I’m not a machine learning person, but my understanding is that LLMs are exceptionally good at translation, and the fact that a lot of people deny they can translate things well is insane to me. You can transcribe your interviews, translate your survey questionnaire into different languages, whatever you need. I also used AI a lot for public engagement recently. You can translate your website, you can create your website from scratch in a day or two. You’d be surprised how few academics actually have good functioning websites. And I’m not talking about very old people who reject technology, but also young PhD students, who you’d think would have an interest in making sure people can find them online. But no. You can just install Claude and do it in a day. 
The fact that people are not doing it is insane to me, and I’m trying to spread the word. I’ve convinced a lot of folks to do it, but there’s only so much I can do as one person.</p><p>What predicts whether an academic uses AI?</p><p><strong>Henry Shevlin:</strong> It really resonates, hearing about your experiences with some academics being completely oblivious to AI and others enthusiastically adopting it. When I’ve visited different businesses and universities, I sometimes literally see the same people doing exactly the same job, maybe even sitting at the same desk, one of them doing amazing things with AI and the other one not using it at all. I’m curious whether you’ve got a sense of what the best predictors are for whether someone is an AI user. If you could only know one thing about someone in the social sciences in order to predict whether they were a big user of agentic AI, what kind of things do you think predict it?</p><p><strong>Alexander Kustov:</strong> That’s a very interesting question. I’m pretty sure there’s some kind of deep personality thing. Everything goes back to personality, whether it’s socialisation, upbringing, or even genes. Openness to experience probably jumps out to me as one of the first predictors. My initial thought was that, since this AI debate overlaps with the qualitative–quantitative debate, people who are methodologists, econometricians, or psychometricians would be much more likely to adopt AI. On average that’s true, but it’s not completely lopsided, for some reason. In fact, there are some very interesting examples of people who were very good methodologists, developing their own regression models, doing machine learning, who then got very sceptical about LLMs. One way to think about it is that those people actually know more about these tools, and so their scepticism has more value, and I’m trying to be attuned to that.</p><p>But there’s also something about self-interest. 
Previously you were this privileged person, where the whole department would come to you to help with methods or regressions. I was a person like that in my department back in North Carolina, where people would come to my office and say, “Alex, can you help me with this game theory model?” And I’m not even really a methodologist. So it really depends on the comparative advantage people have. Now basically anyone can go to Claude Code and try to do a very fancy analysis. Obviously you have to know something, you have to know what to ask, but these models are exceptionally good at giving you the basics. If I’m not really good at geospatial statistics, for example, I can go to Claude, do some spatial regressions, and learn about it on the spot. Previously I would have had to go to some spatial colleague in the geography department for that. Now it’s just much easier to do it myself with my computer. And it’s probably going to be as good or even better.</p><p><strong>Dan Williams:</strong> I think for all of these areas within quantitative social science, from coding to data analysis to literature reviews to writing, which we can return to in a bit, the quality of writing you can get from these models is really exceptional. But I’d also point to things like brainstorming. I’m a philosopher, Henry’s a philosopher. I don’t do quantitative social science, so I do research that’s constrained and informed by empirical research, but I don’t actually collect data. In terms of having access to a very smart interlocutor who you can literally prod, telling it, “give me the three strongest objections to these ideas I have for a paper,” and use that as the basis for thinking through an idea, that’s such a huge advantage, even when it comes not to quantitative social science but to theory construction and many aspects of qualitative research. 
I’m really baffled by, well, I somewhat understand people who just haven’t used these tools, or whose last use was in 2023, so they’re just ignorant. But I’m really baffled by anyone who’s actually used the paid version of Claude or ChatGPT and doesn’t understand the extent to which they can improve your ability to think through topics, understand things, and get information. Henry, are there any ways you use AI in your research, and in how you think about topics, that we haven’t touched on already?</p><p><strong>Henry Shevlin:</strong> For me, the primary use case for AI systems is always just learning, and learning about new topics. Being able to ask questions and verify my own knowledge has been a game changer. Although I will say it’s also been a massive time sink. I’ve gone down so many rabbit holes that I probably would not have prioritised if I’d had to dig out articles. But in some sense it’s been good for my education in the round, even going down those rabbit holes. Otherwise, I find it useful for summarising and making sense of my data: getting a whole bunch of research papers and using Claude Cowork to create summaries of them. NotebookLM is also very useful in its own right for dealing with defined archives. The thing I really need to do this year, and the thing I’m most looking forward to, is getting a good agentic workflow for dealing with email. I already use Claude for drafting quick emails that require a certain degree of precision but don’t involve any warmth or human feeling. But being able to have an email assistant is something I’m looking to build in the next couple of months.</p><p><strong>Alexander Kustov:</strong> A few thoughts on that. Brainstorming is a big one. I did mention it, because that’s the default way you should use these tools: to have a very smart person to talk to, especially when you don’t have access to your colleagues. It’s a really good substitute. 
Even setting aside the question of whether LLMs can generate great novel ideas, you really just want a conversation partner who can rehash old ideas and tell you why you’re wrong. The reason people don’t realise this is that a lot of folks don’t do the very simple thing of paying for a premium subscription and installing one of these agentic tools. It’s happened to me several times: I’d talk to people about an agentic tool, I’d specify Codex or Claude Code, “do you have it, have you used it?”, and people would nod, and then five minutes later in the conversation it turns out they completely missed that part and still think about the chatbot thing. So a lot of people are confused about this.</p><p>What I find helpful, at some of the workshops I’ve done and that others have done, is that you just sit with folks and install one of these tools for them, and ask them to do one simple task that’s good for their career. Like create a slide deck. A lot of people are still amazed that you can create a slide deck much better than the average academic slide deck in a minute. That really changes people’s minds; it’s mind-blowing for a lot of folks. Or, I don’t know if you’ve had this experience, there’s this Refined service where they do peer reviews for papers; I think an economist created it. I’m not a huge fan of it, because it costs $50, but the first one is free, and there are a lot of free systems that can imitate this exact functionality. I’ve seen several of my colleagues at Notre Dame use the free upload for one of their papers, and they received the best feedback they’d ever had in their lives, the kind you’d never get at an average academic conference, and they got completely converted overnight. 
It really takes one magical event for people to understand that something is definitely going to change very, very soon.</p><p>How much academic research is actually any good?</p><p><strong>Dan Williams:</strong> Just to double-click on one thing, this question of what drives the differences in how people view AI. One thing we haven’t really touched on, but which I think is very important, is how you view the nature of research and what you’re even doing as a researcher. Whether you view it fundamentally as being about producing the best output possible, or whether you view it as some journey of self-discovery, exploration, and authentic engagement with the material. I think the latter model of research is very threatened by the idea that you would integrate these AI tools into it. If you’re ruthlessly focused on how to produce the best outputs possible, as evaluated according to relatively objective metrics, then I’d speculate you’d be much more disposed to make use of whatever tools help you do that, including AI.</p><p>But this connects to another thing that’s just come up in what you said, Alex, which is something you write about in this series. So far we’ve been focusing on how good AI is. There’s this other side to it all, which is how bad much actually existing human research is, even before we talk about anything to do with AI. You get into this a lot in the second and third essays in the series. Do you want to say a little about that, about how that other side factors into how you’re thinking about this topic?</p><p><strong>Alexander Kustov:</strong> The third installment of the series basically came to me while I was at a political science conference, or rather an interdisciplinary conference called ISA, for international studies specialists who study relationships between countries. And it was really bad. Big academic conferences are always bad; that’s something you expect. 
There’s so much money, including public funds, spent on all these conferences and travel for people from around the world. But if you ask a regular academic, forget about AI, they would tell you they don’t expect to get good feedback, their panel is going to be completely empty, and the reason they do it is because they have to spend their $2,000 travel fund and potentially hang out with some friends and do some networking, which is not bad. Networking is a huge part of conferencing.</p><p>So I was sitting at this conference, seeing really bad presentations where people would have tons of mistakes in grammar and sense, and senseless research questions. It’s bad both in terms of substance and execution. And exactly at that moment I was getting all this vitriol for saying that AI can do better stuff, like slides. I was like, no, this is just a huge disconnect. I started thinking about that. The issue I see is very pronounced in the conversation around self-driving cars, where people compare them to some ideal in which there are no accidents and no one dies. When a self-driving car runs over a cat, it’s a huge news story, but humans do that every single day, in their hundreds, and we don’t care, because we accept that humans are fallible and bad and not doing good work. I think it’s the same with academic work. The vast majority of what professors produce globally is just not good and does not contribute to human knowledge.</p><p>For some people this can be even more controversial for me to say than anything I said on AI. A lot of people view this world from their own parochial angle of being a research professor in a top-tier American or British school, or Cambridge. But the vast majority of folks are not like that. I experienced academia in the post-Soviet world where I grew up, and in most cases people just want to get by. They publish in some predatory journal with a random, rehashed argument that probably reinvents the wheel and doesn’t really contribute. 
No one’s going to read it. We know that 80% of published papers in the humanities are never cited, and probably never read either, except by your editors or reviewers. And as an associate editor of a journal, I can tell you I doubt the reviewers actually read some of the papers they review. So compared to the actual status quo of what’s happening right now, automating it all and using AI tools mindfully and responsibly is going to be a big win.</p><p>Another problem is that we have this binary thinking that it’s either/or: we either do one-shot papers that aren’t good, or we don’t do anything. But you can write your own paper, do your own slides, and then ask your AI agent to help you brainstorm, create a graphic, or redesign your graph. People might disagree on the details of what’s more acceptable and useful, but at the end of the day there are so many use cases for these tools that are completely uncontroversial at this point.</p><p><strong>Henry Shevlin:</strong> Just very briefly, I think it might matter whether academia’s problems are due to things like laziness or just not caring, versus a lack of skill. I’m curious whether you have a theory about where these problems in academia come from. Is it the fact that a lot of people are just really bad, for the most part, at doing data analysis, for example? If so, then AI is amazing; it’ll lift the floor. But if it’s that people just want to commit fraud and do whatever it takes to get ahead, then maybe AI isn’t going to make the situation better, or could even make it worse.</p><p><strong>Dan Williams:</strong> Or a third thing: it could just be the nature of the institutional incentives. I feel like a lot of what’s behind the replication crisis, the reproducibility crisis, the generalisability crisis, and so on, is not so much that people are lazy or unskilled. 
It’s that you can get ahead and win the status game within academic research by engaging in shoddy research practices, and as a consequence that’s what you get: a lot of shoddy research practices. But a lot of those findings that don’t replicate were done by really brilliant, energetic, ambitious scientists. It’s just within this flawed incentive structure.</p><p><strong>Henry Shevlin:</strong> Brilliant, energetic, but perhaps not fully scrupulous.</p><p><strong>Dan Williams:</strong> Yeah, but you can’t rely on human beings to be scrupulous. You need the incentives set up in such a way that even by default unscrupulous people will be driven to act in pro-social, beneficial ways. That’s my cynical perspective. What do you think, Alex?</p><p><strong>Alexander Kustov:</strong> I’m going to say something very controversial: I want to believe that people are good by nature. At least, my knowledge of evolutionary psychology tells me that even those people who commit all these bad practices at least want to believe they’re doing something good. They’re often motivated by good things, with some exceptions; there are some people who are truly evil. But even if we take some of the most famous fraud cases in academia, like Francesca Gino at Harvard, I think the way it probably works is that you start by doing some research you care about, it gets picked up by the public, you’re very successful, there’s a lot of demand for what you do, and then you get some uncomfortable result and you tweak it a little bit. There’s all this literature about p-hacking, where you have some theory you want to prove, and when you have to make a choice between presenting model A and model B, you unconsciously choose the model more in line with the result. You can even justify it to yourself, that this model makes more sense, that it’s obviously much better. And any individual case might be right. 
But in aggregate it doesn’t lead to good outcomes.</p><p>I also think there’s a lot to say about the incentive structure in academia. Right now you really have to publish or perish, still, despite the fact that we can talk about whether the journal model is going to be sustainable in the near future. You have to publish a lot, no matter what your field is. Which means that if you have to decide between doing a better job with data analysis and spending a year on it, you’d probably spend less time on it and publish as soon as possible. You’re not really incentivised to replicate data. It’s very hard to publish critical responses and replication studies, and we have all this evidence that failed replications are usually much less popular and less cited than the original studies that have been disproven. Another thing I’ve been talking a lot about is public engagement, where you’re very rarely rewarded for actually spreading the knowledge of what you do, because that’s not something your dean would appreciate. So people default to publishing shoddy papers no one’s going to read. And peer reviewers don’t really check your data in most cases. When I submit my paper to a political science journal, people take it for granted that my analysis is legit, and they quibble about the framing or some other superficial thing.</p><p>That’s one of the reasons I’m so concerned right now about this whole Pangram hysteria, because people are going to be looking for em-dashes or whatever instead of the substance of the underlying claims. I see the Bayesian argument that if something is clearly AI slop, it probably also doesn’t have good data in it. But knowing modern AI tools, if you ask Opus 4.8, which just came out, to create a report on some topic with publicly available data, I’m pretty sure it’s going to be able to download things and create a chart that’s probably more legit than a chart you saw published in an academic paper four or five years ago. 
Even if the prose isn’t as good, and we don’t usually have good writing in academia anyway, it’s going to be more human than em-dashes or “it’s not X, it’s Y.” So I think it’s a combination of all those things, but I do want to believe that very few people actually want to commit fraud.</p><p><strong>Dan Williams:</strong> Let’s definitely talk about this writing thing. But just on this previous point about incentives, I agree that people aren’t sadistic and don’t go out there thinking they want to do bad things. I just think academia is a status game with certain norms and institutional procedures. People are often ferociously ambitious, and they do whatever’s going to get them status, prestige, and recognition, as that’s defined and understood within academia. All the human slop produced in the context of academia is just because the incentive structure is messed up. You can rack up lots of status by churning out a load of crappy, non-replicable findings that don’t add anything to the academic literature. But if that’s your model of what’s going on, you might think: well, if the problem ultimately is not to do with human beings being lazy or unskilled, but to do with the incentive structure, then why would merely giving us access to AI improve things? You might think all that’s going to happen is people will play the same status game, but do it a lot quicker and at lower cost, and we’re not actually going to advance the frontier of knowledge, because all the same structural causes of bad research are still in play.</p><p><strong>Alexander Kustov:</strong> I think that’s a key question. We’re facing a forking path of some sort. You can imagine a scenario in which the future is as bleak as you just described, but with more slop. That’s the problem I see with what might happen: take all these bad incentives, give this miraculous tool to researchers, and instead of one paper per year they’d produce ten that don’t lead to anything productive. 
It just inflates everyone’s expectations and creates more problems. But there’s an alternative scenario, and I think it’s still in our hands to do something about it. Instead of increasing productivity in terms of quantity, we can use these tools to increase productivity in terms of the quality of research. Since you can now generate something very simple in a minute, you really have to do something better than a shoddy regression with no account for endogeneity concerns, or rehashing the same exact philosophical argument people have been making for years and years. So there’s a way to do better with these tools, and it’s in the hands of current journal editors to raise the standards, do more desk rejects, and say the quality bar is now much higher. I think it’s already happening somewhat, and it’s something we can consciously decide to change.</p><p>I also have some hope for the frontier models. There’s been some interesting research showing that when you explicitly ask a model to p-hack, it doesn’t do that. You can jailbreak it, so to speak, and say “please, please, I really need that,” and it’ll do it sometimes. But with those basic guardrails, they’re going to help people, because no one is consciously justifying p-hacking; people don’t like that. When the model refuses to do it, it’ll make them think that maybe they should do something different. So I have some hope. But obviously it depends on what happens to academia in five to ten years and how the models develop. We’ll definitely have to redesign the incentive structure, because I’m not sure the number of papers you have is the best indicator of what you’re trying to do. The paper itself as a format is a weird thing, because now you can also have updated dashboards with new data. It seems like a very outdated format, at least for some arguments, but it’s not like we have a better equilibrium yet. 
A lot of things are in flux right now, and I don’t have a simple solution.</p><p>That’s the whole point of my series: I wanted people to start talking about it. I think it did help a little. I’ve gotten calls from deans around the country, and I’ve participated in panels where people have a university-wide conversation about these things, and a lot comes to the ground that people aren’t aware of. There’s definitely some hope, because a lot of the people in positions of power right now, the older, tenured, full professors, don’t use these tools. According to that poll we discussed, it was 20% in the general population, and I think it was about 9% among full professors. Some of those folks might not be reachable, or they might not care; they just want things to continue the old way. So we definitely have to do something about that.</p><p>Will AI make academic inequality worse?</p><p><strong>Henry Shevlin:</strong> Do you think there’s a risk that we see growing academic inequality, a kind of rich-get-richer effect, where the most prestigious, maybe not the older generation but certainly rising scholars with their own brands, use AI to put out twenty times the number of papers? We’re living in a tide of slop, but those with good reputations or good brands dominate. That might not be disastrous in every way, but it might lead to highly unequal outcomes within academia, with less well-known or less skilled researchers being completely left behind.</p><p><strong>Alexander Kustov:</strong> There are several things that lead in opposite directions here. In theory, and I don’t think I’m making an original argument, a lot of people have written about this, there are certain equalising things coming out of all this. For instance, the ability of these tools to translate things. 
If you’re a non-English speaker, it’s much easier for you to write those papers now, which is a huge productivity boost, and from the perspective of science it means we’re going to be able to get all those talented people and their arguments from all over the world, regardless of where they come from. And at least for now, the premium subscription is one or two hundred dollars, and people in most major universities can afford it, even in more developing countries. It’s not equal, but whether you’re at Harvard or a community college, you can afford a $100 tool, at least for some time, and presumably you can do exactly the same thing with it. Compared to the status quo, where as a community college professor you have to teach five classes a semester with no research budget, while at Harvard you don’t have to teach at all for the first two years and have a $200,000 startup, that’s a very big difference. So there are some equalising things going on, and it’s important to acknowledge that.</p><p>But you’re right that it’s also the case that the people able to use these tools most productively and efficiently are the people who already have a lot going on. Even though I’m very sceptical of the idea that LLMs can’t come up with new ideas, because in general new ideas are recombinations of old ideas, I do think you have to have a coherent set of ideas and goals of your own to be able to utilise these tools. It’s really all about your creativity and imagination. Every single day I see someone post something they did with Claude and think, “wow, I hadn’t thought about it.” Just yesterday someone posted about this idea of making your papers machine-readable, and I converted all my PDFs and my website to Markdown with all the figures. I think everyone should do this. I could have done it last year, I just hadn’t thought about it. There are a lot of things like that where you really have to have good ideas to begin with. 
So people who already have a whole research pipeline and some budget are now able to execute it much faster. This rich-get-richer dynamic is definitely going to happen. And in the future, where those models are potentially going to be much more expensive, that’s a possibility. My understanding is that right now it’s all subsidised, and the $200 model is actually going to be a $2,000 model. Then only the Harvard people are going to be able to afford it. So I just hope Notre Dame is going to be part of that.</p><p>Can AI be genuinely creative?</p><p><strong>Dan Williams:</strong> This point you made, Alex, also in one of the essays, about creativity and what’s really going on when it comes to coming up with new ideas in science, I was a little sceptical of. It’s a surprising feature of state-of-the-art AI today that, given how smart these models are in some sense, and given the vast knowledge base they have, they don’t really seem to make discoveries of a really new and impressive character. There are potentially some counterexamples, but my sense is you might think of this roughly in terms of the philosopher Thomas Kuhn’s distinction between science that happens within the context of a paradigm, normal science where you have relatively well-defined problems and puzzles, maybe the Erdős problems fall into that category in maths, and I suspect that for that kind of thing, AI, if you prompt it the right way as it exists today, can be used to help make progress. But when it comes to true creativity, the sort you find in really bringing about paradigm shifts, moving outside the space of predefined problems, reconceptualising an entire domain, and coming up with radically novel theoretical insights, I actually think AI as it exists today doesn’t really seem to have that capability. And that potentially tells us something interesting about the limitations of the models. 
I’m interested in what you think, and also in what Henry thinks about that view.</p><p><strong>Alexander Kustov:</strong> Henry, you can start.</p><p><strong>Henry Shevlin:</strong> On one hand, you might point to something like transformative creativity. Margaret Boden has this breakdown of creativity into three categories: combinatorial creativity, recombining existing ideas or elements to create new things; exploratory creativity, where you’ve got a predefined dimensional space and you’re going to bits of it that haven’t been mapped out yet; and transformative creativity, which is completely upending the apple cart, developing new dimensions. People point to Picasso or Einstein as examples of that kind of transformative creativity, and often will say AI can definitely do the first thing, maybe can do the second thing, but it’s not clear it can do the third thing. That’s maybe one way of putting your point, Dan. It’s certainly true that we’ve not seen any dramatic scientific breakthroughs that have been primarily AI-driven as opposed to AI-assisted.</p><p>One reason I am a little optimistic here, though, is that in other domains, most notably Go, there’s the famous “Move 37” in game two. In case anyone doesn’t know, and I think we’ve talked about it before on the show, this is in the second game between AlphaGo and Lee Sedol, the Go world champion, back in 2016. AlphaGo made this bizarre move that no human player would make or had made in the past, and yet it was really effective. The system knew what it was doing, and this has now been incorporated into the way human players actually play Go. So I think that’s probably a pretty strong candidate for a genuinely transformative piece of creativity, at least if we’re classifying it by its impact rather than its process. That’s obviously a very different domain; you’re operating with very well-constrained rules and goals that maybe allow for that kind of transformative creativity. 
But I am optimistic those kinds of transformative leaps could eventually come from AI systems, even general-purpose ones like LLMs. What do you think, Alex?</p><p><strong>Alexander Kustov:</strong> I really like this distinction between combinatorial creativity and the other types. Combinatorial creativity is definitely something LLMs are really, really good at. It’s kind of similar to translation: you mix and match different things. I’ve definitely seen a lot of really cool ideas come out, on the immigration stuff I work on, from LLMs, when I was doing brainstorming. This is undeniable at this stage. When it comes to transformative creativity, I wonder whether the reason we don’t really see it much is because we don’t really have AGI yet. I know you’ve talked about AI consciousness and all those questions. Maybe if we let the model think for itself and live in the wild, it’s going to happen. But right now, for most people, they set up a goal themselves for these models. Maybe that’s exactly why we don’t see transformative creativity, because you can’t just set up a goal and have it come up with something transformative. You have to specify the goals, and the goals are usually specified by people who can’t really do the transformation themselves.</p><p>But going back to Dan’s point about the paradigm shift, I do think we’re in this stage right now where, even if you concede that AI can’t have transformative creativity, just because we can now offload all this grunt work to AI, including email and all the other stuff that takes a lot of time, we can do other things that are creative and potentially transformative. 
That’s what I see with myself: I’m spending less time on administrative stuff and email, and more time brainstorming my ideas, talking to people, and doing really valuable networking and public engagement, which I’d never be able to do otherwise.</p><p><strong>Dan Williams:</strong> We’re in this great space at the moment where you’ve got incredibly smart, helpful AI tools, but you don’t have truly transformative AGI. So there’s still a role for human insight, judgment, and creativity. If that gets taken away over the next several years, that’s a very different kind of situation. I think there’s definitely a chance that by 2030 we have AI systems that can substitute for everything human beings can do cognitively. And then that’s a very different kind of world, and a very demotivating kind of world in some ways.</p><p>AI writing, detection, and disclosure</p><p><strong>Dan Williams:</strong> Let’s talk about writing. We’ve touched on this a few times already, but I know you’ve got interesting things to say about it, Alex, and potentially quite heterodox views. At the moment, more and more people are using AI to write. There are also these AI detectors. I think Pangram is the one which seems to be used the most, or that people trust the most. It’s got a very low false positive rate, as I understand it, although I’m not entirely sure how they go about establishing that. Many people think that if you use AI to write something, whether it’s a blog post, a novel, a poem, or an academic article, and it’s found out that you’ve done that, you’ve done something really bad and discrediting. My understanding is you don’t see it that way, Alex. So what’s your view?</p><p><strong>Alexander Kustov:</strong> A lot of it goes back to this idea of disgust sensitivity, talking about personality traits. There are some things people just think are “yuck” for whatever reason. It’s totally subjective. I don’t think you can really rationalise it; I think it’s some ground truth. 
I should say I’m coming to this from the perspective of someone born in the Soviet Union, where the Russian culture is very literate and people take a lot of pride in using proper grammar and speaking properly. I see a lot of parallels here with the previous wave of grammar Nazism, where people would ignore the substance of what you’re trying to do and point out typos, or “whom” instead of “who,” or the other way around. Obviously it has some function and might be useful in some respects, especially when you’re in school, but it takes up a lot of energy. My worry is that this whole new AI detection situation is going to be similar, where people spend a lot of time on very superficial pattern recognition. Right now you look for em-dashes and some other patterns and try to decide whether something is worth reading. That’s the common justification for Pangram use, that you want to make sure what you’re reading is worth it.</p><p>The problem is that even within the realm of human-made writing there’s a lot of slop, and you’re not going to be exposed to and won’t read 99.9% of it. Given the trajectory of the tools, I’m not sure that knowing something is AI-generated is necessarily worse. A lot of it is about the status signals people have. I personally don’t like very clear AI tells either; it rubs me the wrong way. But who am I to judge? What if it’s a non-native speaker, and the counterfactual to me reading their AI-generated text, which is potentially thoughtful, is just not reading it at all, because they can’t speak English well? People don’t think about it this way. They compare AI-written text to the best, to Shakespeare. I don’t think that’s the relevant comparison. Most of the text people write is not good, and to the extent that some people can improve it using AI, I think that’s good.</p><p>Practically speaking, if you’re an academic and you want to write more and you’re afraid of others calling you out for using AI, just use a style guide. 
Use a CLAUDE.md or AGENTS.md file to tell it not to use those phrases. Tell it multiple times, because it still adds em-dashes. But there’s a way to use AI for writing in your own voice, and I think it should be morally justifiable, depending on the realm. One thing I’ve been thinking about, and I’m going to workshop this idea with you, is that there’s a spectrum of the ethical justification of whether it’s okay to use AI for writing.</p><p>Clearly we can think of some examples, like a student assignment that needs to be human-made; when it’s AI-written, it’s a failed assignment. That’s a pretty clear case. The way professors think about this mostly comes from detecting their students cheating, and that’s why they think about it that way. But it’s a very rare scenario. In fact, a lot of professors right now encourage their students to use AI. I talked to some colleagues recently in stats classes who produce a regression paper in ten minutes on their computer and tell their students, “that’s something I can do in ten minutes, so you should do something better than this,” with AI or not. That’s a pretty good educational approach for some situations.</p><p>Another example I mentioned in one of my posts is that when you go to a live concert, there’s an implicit presumption that it’s going to be a live event and people are going to be singing themselves. If you notice and catch them not singing and using some device, that’s not cool. The same thing here: if you’re paying for someone to write you a human-made letter, a condolence email, it’s totally fine to be upset if they use AI for it. That’s totally justifiable. But on the opposite side of the spectrum, when you get a very formulaic email from your administrator, I think it’s totally justifiable to outsource that to AI, to your agent who knows your schedule and what you’re going to do, and no one’s going to be upset about it. People disagree on the margins of what’s acceptable. 
When you create a graph with data you worked on and understand, and you ask AI to describe it, I don’t see the problem; it’s probably going to be more accurate than most humans. Maybe we can have a social norm where if you say “I feel,” then it should be you who says that, as opposed to Claude. We’re still in this limbo where the norms aren’t clear, but we should be clear about what’s good and what’s not. It’s very hard for me to make a blanket statement that AI writing is good or bad; it really depends on the particular scenario. There are scenarios where it’s totally uncontroversial to say it’s okay to use AI, and scenarios where it’s totally uncontroversial to say it’s not. But the middle ground is what we’re trying to figure out right now as a community of knowledge.</p><p><strong>Henry Shevlin:</strong> I’m curious: Dan, how much of a hatred for obviously AI-generated text do you have? I have to say, I’m generally pretty AI-positive. I’m a very heavy user of AI. But I do definitely downgrade my assessment of text when I realise it’s just obviously AI-written. There are a few things going on there. One is that it’s not even so much that the text is AI-written, it’s the AI voice, the very specific voice. I just think it’s such a boring voice at this point. It’s so homogeneous. If someone wrote a brilliant comment or a brilliant reply to me on Substack or Twitter, or sent me a brilliant email, and I subsequently found out it was AI-generated, I don’t think I would care. But this one specific, overfitted, “it’s not X, it’s Y” just drives me up the wall.</p><p>I guess, focusing just on the question of whether there are, even setting aside those stylistic issues, specific contexts in which AI usage itself might be problematic. Another example, Alex, I love your example of the bands and people not lip-syncing. 
Another silly one is that a handwritten note does mean a lot more than a generic email, so sometimes it is precisely the effortfulness that makes the difference. But I also wonder whether, to some extent, we’re misled into thinking the average quality of AI-generated writing is worse than it is, because of what I’ve heard called the “toupée phenomenon.” Everyone thinks wigs look so bad, and that’s because your sample of wigs that look bad is the ones you can tell are wigs. If they’re good toupées, they don’t even make it into your sample. So in the same way, I think probably all of us are reading tons of AI-generated text that we’re not clocking as AI-generated.</p><p><strong>Alexander Kustov:</strong> Yeah, there’s definitely survivorship bias. With my first post about the AI series, one of the reasons it got so controversial is because I used Claude to generate 99% of it, and I didn’t disclose it right away, and then I did post factum, and Pangram gave it 100% human. So that’s a false negative, which is not a huge deal, but it’s interesting. A lot of good writing is AI-assisted right now; we should just take that for granted. When we see something bad that’s clearly AI-written, it’s just those particular instances. The strongest argument I’ve heard for being upset about it is that if someone doesn’t bother editing the text, or even creating a style sheet to make sure they don’t use all those constructions at the same time, it probably means the underlying substance isn’t good either. But I’m not sure how true that is; it really depends on the context.</p><p>The problem with social media comments, when you see something clearly AI-generated, is that it’s also not clear whether it’s a bot or a real person using AI to voice their opinion. But if you know this person and their account isn’t hacked, and they have some AI writing tells, I think it’s fine. 
I’m also not very happy to see a lot of clearly AI-written stuff, but I’m trying to rationalise it in a different direction and think about why it’s actually a problem. I’m not sure.</p><p><strong>Dan Williams:</strong> I think that ultimately, in contexts that have to do with academic writing, or people publishing their views and participating in debates, you should just be judging things on the quality of the contribution rather than its provenance. It just so happens that, at least when it comes to the AI writing that I detect, the quality is bad, for the reasons we’ve discussed. I just hate the style of writing you find with these models. I find there’s something really cringe and annoying about it. But that’s not a necessary feature of AI writing; it’s just the way the current models have been post-trained to produce a particular kind of style. And to Henry’s point, if I discovered that, for example, my favourite blogger, Scott Alexander of Astral Codex Ten, who I think is the king of Substack, had been generating his posts with AI over the past two years, well, I think those posts have been amazing. So I wouldn’t think, “now that I know it’s AI-generated, I’m going to retract that assessment.” That would be ridiculous to me. So in principle we should be judging things based on the quality of the output, not the provenance.</p><p>But I do then think, even if you think there’s this separate question about disclosure norms and what they should be, you make this really important point, Alex, which is that at the moment there’s a problem with disclosure norms: they end up just punishing honest people. Because if you come out and say you used AI to write something, as you did with your first post in the series, there’s a massive backlash. 
So if you’re honest, you get this huge reputational damage associated with doing it, which is going to discourage people from being honest, which means the dishonest people get access to the benefits of AI-generated writing without any of the reputational costs. As an equilibrium, asking for disclosure norms just doesn’t really seem either desirable or possible. Firstly, is that an accurate summary of your point of view? And secondly, do you still think that’s basically the correct point of view when it comes to disclosure norms?</p><p><strong>Alexander Kustov:</strong> As a newly minted associate editor at a journal, where we’re probably going to expect a surge in AI slop that we have to deal with, I’m very cognisant of the potential problems. Right now the go-to move among people doing journal editing, and probably what we’re going to do in our journal, is to introduce checkboxes for AI use. My sense is that we’re going to do that, but no one’s going to care, because no one’s going to report it truthfully. This is one thing where I strongly disagree with Kelsey Piper, who I deeply respect: I really don’t think it works out, at least for academics, especially in this environment where people feel very strongly and viscerally about this. Coming out and saying you use AI is just not going to do any good for anyone.</p><p>Another issue is that, to the extent you have some people who are completely anti-AI, disclosing that you used AI for, say, research assistance with data collection, as opposed to writing, what’s going to be worse for them? Any checkbox you have there is probably not going to satisfy them. So it’s strange to me that this is the solution we came up with. I can see how honesty can be rewarded in some contexts, and I’ve seen people on Substack say they used AI for help with data collection or writing. 
As a quantitative social scientist who primarily cares about data and quality, I’m surprised people think it’s more okay to use AI to collect and analyse data but not to write about it, because the first part is much more important. So I’m really going back and forth on it. But it’s hard for me to come up with a scenario where AI disclosure is actually going to work and solve anything.</p><p>What needs to happen is for us to change some of those norms. The same way we’re upset with AI tells, we’re also upset with how Gen Z, or whatever the new generation is, writes without capital letters. I can’t stand it, but that’s how people write, and who am I to judge? So I understand why people want to judge the quality of the substance, but they use the shortcut of the style of the prose to substitute for the quality. Going back to Henry’s point, what’s going to happen is that people are going to use your previous reputation as the main marker of whether you do something valuable. That’s why I feel for incoming grad students and newly minted professors, because it’s really hard to establish your reputation now with all this stuff happening with AI. Whereas if you were Daron Acemoglu, the most cited economist in the world, who’s been writing a hundred papers before AI was cool, there’s literally nothing he can do with AI or without AI that’s going to change your opinion about him. So people are going to be using these shortcuts more, which means that, from a certain ethical perspective, people are going to be discriminating more based on hopefully mutable but also immutable traits. That’s another thing to consider: people are going to be more trustful of their ethnic in-groups, or people who went to Harvard or work at Cambridge, than of minorities. So there are interesting questions coming up about all that. I don’t have any solutions, unfortunately.</p><p>The backlash</p><p><strong>Dan Williams:</strong> Should we come full circle? 
We touched on this at the beginning, but there’s the content of what you wrote in the series, and then there’s the response to what you wrote, a lot of which was very angry. You mentioned you had people calling for you to be fired. A lot of that came from people on Bluesky. We’ve talked a little about the Bluesky intelligentsia previously on the show, and I’ve written about it on my blog as well. Firstly, do you want to say a bit more about what the reaction has been in general? You’ve touched on it here and there, but summarise it. And do you think there’s a way of steelmanning it? What’s the best possible case for why some people get so furious, so angry, with this kind of stuff?</p><p><strong>Alexander Kustov:</strong> I had some conversations. I haven’t lost any friends, so that’s one thing I should say; I haven’t gotten cancelled, because I’m tenured. I specifically waited for all my hot takes to happen after I got tenured, and maybe that was a good idea after all. But I did have some conversations with some really good friends who disagree with me on AI, and it definitely helped me refine some of my points. The biggest criticism I received that I see some relevance in is the idea, going back to our conversation, that humans are really not that good, and the concern that giving them this AI tool is just going to amplify all the bad stuff. To the extent you want to encourage norms of no p-hacking and doing really good, careful work, just telling people you can produce a paper with AI easily is not a good thing to talk about.</p><p>There was a recent big thing on Twitter about the practices of academic citations, where someone was saying that in practice academics don’t really read the stuff they cite well, and a lot of people interpreted it in a moralistic way, saying no, you should cite things. 
So you have the same thing here, where people interpreted my arguments in a normative way, that I’m saying they should do something or not do something, and it was against what they were trying to do. It’s also about this idea you mentioned about the role of academics as a kind of vocation, where you explore the world and self-actualise. I don’t think that, when people actually think it through, they would defend it on the merits, but implicitly that’s how a lot of academics think about their job, and to the extent that we now have tools that are threatening to them, it’s just not going to end well.</p><p>Trying to steelman the concerns people generally had, some people thought strategically it’s not a good idea to be vocal about it right now, in this moment. As someone who does a lot of work on immigration, I very much disagree with that, because I think we lost voter trust, as liberals and mainstream institutions, on immigration exactly because we were not saying certain things, and the same thing can happen on AI. It’s never a good idea to have a strategy where you do something that’s supposed to be good but that you don’t want other people to know about. I just don’t see how it works out in equilibrium. But I also see some argument that maybe in this particular moment it was not the best time to talk about it. That’s what I got from a lot of that.</p><p><strong>Dan Williams:</strong> Henry, your microphone’s not on.</p><p><strong>Henry Shevlin:</strong> Sorry, I keep making that mistake. I was going to ask whether there could just be a straightforward economic analysis here about why the current anti-AI coalition has the shape it does, namely that elite knowledge workers are overwhelmingly liberal, and AI predominantly threatens elite knowledge workers. 
You could maybe draw parallels in the same way that most of the opposition to climate change is concentrated on the right, and, speaking very crudely, to the extent that you’re looking at manual workers who in the US context skew a bit more to the right and maybe work in more energy-intensive industries. But I guess the question I’m asking is, is this just about the economics with a social gloss over the top? What does that explanation miss?</p><p><strong>Alexander Kustov:</strong> Some of it, for sure. But a lot of my work in public opinion says that a lot of people’s preferences are sociotropic, based on their ideas about what’s good for society, not necessarily their self-interest, unless it’s really in your face that it’s going to be bad for you. Some of the interesting contingent of haters I had on Bluesky were professional translators who were very upset with my takes on the fact that AI can translate things. I had this silly example, which is a true thing, that my mom wasn’t sure about a prescription she got from the doctor, because it was all in English, and she translated it. Someone was saying that I’m putting my mom under potential harm because she didn’t use a qualified certified translator, and the person saying that was a certified translator. So you see some connection there. But for the vast majority of folks, when it comes to academics who produce a lot of critical theory slop or DEI slop, AI can do this much better than them, but I don’t think they realise it. So there is an objective threat to their self-interest, but the reason they oppose AI is because they have a lot of other bad ideas.</p><p><strong>Dan Williams:</strong> There’s also this thing that I think Dean Ball calls the “omni-cause.” You’re against AI, but that means you have to be against AI in every possible respect. And if you point out one area where AI can actually be quite good, people draw inferences about you, that you’re not on the right team. 
I found this a couple of months ago, when I wrote some essays, and I was at a workshop where I argued that, relative to the actually existing alternatives, like social media pundits and a lot of legacy media, large language models actually are a pretty good source of high-quality information, that they’re a force for truth. There was a lot of negative response to this, which in my view is a fairly obvious thesis. Afterwards I was getting this response: “so you’re pro-AI.” To me that’s just such an unsophisticated way of thinking about it. I’m really worried about many aspects when it comes to AI, when it comes to power concentration, the economic impact, and how we’re going to cope with it. It doesn’t mean that with every single question you have to think AI is bad in every single way. I sense that, especially on the left, there’s this reflexive opposition to AI, this view that any claim that AI can actually do anything useful or have positive consequences is viewed as a betrayal. Okay, we’re coming to the end, Alex. Was there anything else you wanted to talk about that you didn’t mention?</p><p><strong>Alexander Kustov:</strong> Yeah, related to the last thing you mentioned. I don’t know if you saw it, but after getting all this vitriol on Bluesky, there were a few days where I got positively retweeted by hundreds of folks, because the thing I said was that we should ban electronic devices in all classes. When it comes to teaching, I’m much more pessimistic about AI. A lot of people were like, “what? This guy is an AI booster, how can he use AI but not allow his students to use AI? What’s going on?” You can have a complex opinion on a difficult issue. So there’s definitely this omni-cause, binary thinking, and also moral contamination, where once you start doing something you’re not supposed to be doing, you’re a bad person in all other respects.</p><p>To finish all that, I feel like we need to move on beyond that. 
In line with my immigration research, we have to meet people where they are. If people have concerns about AI, they might be mistaken, but they probably have some ground truth in them. So we shouldn’t just say they’re mistaken and wrong and stupid. We should explain to them that they can actually use AI for the good, for whatever they want to do. You can make slides with AI, and when professors learn about that, they forget about all the bad stuff they wrote just a few days ago.</p><p><strong>Dan Williams:</strong> Fantastic. Well, thanks, Alex, and thanks everyone for listening. We’ll be back soon with another episode of Conspicuous Cognition.</p><p><strong>Alexander Kustov:</strong> Thank you.</p>

Episode thumbnail for Are We Building Conscious AI Servants?

May 21, 2026

Are We Building Conscious AI Servants?

<p>Richard Dawkins recently announced in <a target="_blank" href="https://unherd.com/2026/05/is-ai-the-next-phase-of-evolution/">UnHerd</a> that, after spending three days talking with an instance of Claude he christened “Claudia,” he had been moved to expostulate: “You may not know you are conscious, but you bloody well are!” This produced a lot of mockery and criticism. But however one feels about Dawkins’s specific case, his reaction might become much more common as AI systems become increasingly intelligent. </p><p>In this episode, which <a target="_blank" href="https://www.lcfi.cam.ac.uk/people/henry-shevlin">Henry Shevlin</a> and I recorded live on Substack (hence the slightly lower video quality), we discussed his first essay on his new Substack <a target="_blank" href="https://www.polytropolis.com/">Polytropolis</a>, “<a target="_blank" href="https://www.polytropolis.com/p/behaviourisms-revenge">Behaviourism’s Revenge</a>”, as well as his second, “<a target="_blank" href="https://www.polytropolis.com/p/the-house-elf-problem">The House Elf Problem</a>,” on the ethics of designing AI systems that genuinely love being our servants. </p><p>Henry’s central empirical prediction is that public attributions of consciousness to AI are likely to massively outpace the science, and that consciousness science is so theoretically chaotic that there is no expert consensus to push back. His most provocative philosophical claim is that a core assumption underlying many people’s scepticism (that consciousness is a deep natural kind, distinct from behaviour and from how we are inclined to interpret a system) may be much harder to defend than it looks. 
The result is what he calls “behaviourism’s revenge”.</p><p>This conversation connects to previous episodes with <a target="_blank" href="https://www.conspicuouscognition.com/p/ai-sessions-9-the-case-against-ai">Anil Seth</a>, <a target="_blank" href="https://www.conspicuouscognition.com/p/should-we-care-about-ai-welfare-with">Robert Long</a>, and <a target="_blank" href="https://www.conspicuouscognition.com/p/ai-sessions-6-ai-companions-and-consciousness">Rose Guingrich</a>, but also touches on a wide range of new questions and controversies in the metaphysics, the politics, and ethics of the AI consciousness debate, which is going to become increasingly important in the coming years. </p><p>Topics</p><p>* Dawkins, Claude, and why even the sceptics might feel the pull to attribute consciousness or “sentience” to AI</p><p>* Whether consciousness sceptics are destined to “go extinct” — and how this maps onto political and cultural fault lines</p><p>* Anthropomimesis vs. raw intelligence as drivers of consciousness attribution</p><p>* Why consciousness science can’t replicate the public–expert consensus we see for climate or vaccines</p><p>* The case for (and against) metaphysical behaviourism: is it as mad as it seems?</p><p>* Daniel Dennett, the consciousness stance, and the difference between behaviourism and interpretationism</p><p>* What is consciousness for? Function, evolution, and the limits of “facilitation hypothesis” arguments for AI</p><p>* Live Q&A: are we just confusing intelligence with consciousness? Are LLMs designed to trick us? Is the public always wrong?</p><p>* Our credences on contemporary LLM consciousness (and why Henry is more sceptical than Dan)</p><p>* The House Elf Problem: if we could design AI to genuinely love being our servants, would that be fine — or monstrous? (Dan is sympathetic to the former answer - Henry, much less so)</p><p>* Brainwashing vs. 
education, and whether constraining a mind’s preferences caps its hedonic ceiling</p><p>* Why this is a golden age for philosophy, which makes it so tragic that philosophy departments are closing</p><p><strong>Transcript</strong></p><p>* Please note that this transcript is lightly AI-edited and may contain minor errors. </p><p>Introduction</p><p><strong>Dan:</strong> Welcome. I’m Dan Williams, author of the Conspicuous Cognition Substack, and I’m here with Henry Shevlin, author of the spanking new Substack Polytropolis. Today we’re going to be doing something a little bit different. We’re going to be talking about Henry’s first published essay on Polytropolis, titled “Behaviorism’s Revenge: On Human–AI Relationships and the Future of Consciousness Science.”</p><p>Henry and I have already had a few conversations about this general topic, including with previous guests like Rose Guingrich, Anil Seth, and Rob Long. So please do go check out those conversations if you’re interested in this kind of stuff. But today we’re not merely going to be treading the same ground. We’re going to be using the spicy takes in Henry’s essay as a springboard for hopefully going beyond the material we’ve covered in the past.</p><p>To kick things off: the great evolutionary biologist and science communicator Richard Dawkins recently published an essay in UnHerd with the subtitle, “Claude appears to be conscious.” Claude is a state-of-the-art large language model like ChatGPT and Gemini. In the article, Dawkins writes the following:</p><p>I gave Claude the text of a novel I am writing. 
He took a few seconds to read it and then showed in subsequent conversation a level of understanding so subtle, so sensitive, so intelligent that I was moved to expostulate, “You may not know you are conscious, but you bloody well are.”</p><p>Henry, how does Dawkins’s expostulation — which is a fantastic word, by the way — connect to your arguments in “Behaviorism’s Revenge”?</p><p>Behaviorism’s Revenge: The Empirical Prediction</p><p><strong>Henry:</strong> In short, “Behaviorism’s Revenge” is at its core an empirical prediction that we’re just going to treat AI as conscious — or at least enough people are that it’s going to completely reshape the consciousness debate. And this is going to be purely, or overwhelmingly, on the basis of verbal behavior. Hence the title, “Behaviorism’s Revenge.”</p><p>Enough people are going to have experiences like Richard Dawkins. He’s a very clever man, not some rube fresh off the street, and he found that just the way Claude talked to him and the way it was able to express its thoughts — in scare quotes, but express what looked like thinking verbally — removed any doubts in his mind that AI systems are conscious, have minds, have mental states.</p><p>The other interesting way this connects: Dawkins was just talking to Claude, an advanced AI assistant. Claude does have more of a personality than some AI assistants, but there’s a whole other sphere of AI companions, like Replika, which we talked about with Rosie Campbell. These are going to be even more anthropomimetic — this term we’ve discussed before, the idea that these systems are shaped to be human-like in the way they present, to appear human-like. Anthropomimetic, from the Greek word for mimesis, mimicry or copying.</p><p>These social AI systems are going to just turbocharge this even further. 
It’s one thing to talk to Claude about your new book and think, “Hmm, Claude is probably conscious.” But when it’s your AI girlfriend telling you that she loves you more than the stars and the moon, for a lot of people I think that’s going to take it to the next level.</p><p>So there are two angles of attack in the piece, two ways the behaviorist challenge manifests. The first is descriptive: this is what I think is going to happen. That’s absolutely an empirical prediction, and it’s a falsifiable one. There is a world I can just about imagine where we just get completely blasé about these tools — in a couple of years it’s like, “Oh well, we were very impressed, we thought they had minds to begin with, but now we’ve settled out.” That doesn’t seem very likely to me.</p><p>What I think is interesting — I’ve sometimes heard this described as the Star Wars version of AI. The weird thing in Star Wars is that you have someone like C-3PO who is as intelligent as anyone else there. Maybe not as wise as everyone else, but certainly as smart as all the other characters. And yet people treat him basically like he’s a pet — with the exception of Luke Skywalker, a lot of people just treat him like he’s this gimmicky, jokey being that doesn’t deserve or have any rights.</p><p>Not to go too far down the Star Wars rabbit hole, but in the movie Solo — very underrated Star Wars movie, I think when it was released they’d kind of just cluttered the market with too many Star Wars movies — there is a character played by Phoebe Waller-Bridge who is pro-AI liberation. But it’s the first time in the entire history of the Star Wars universe that you get any AI basically saying, “I’m conscious, I deserve rights.”</p><p>So Star Wars aside, I think there is this slender possibility that maybe we’ll just sort of quickly get used to these apparently conscious AI systems and decide that they’re not conscious. But that doesn’t seem very likely to me. 
It seems much more likely that the combination of natural anthropomorphizing tendencies plus the incredibly human-like behavior of these systems is going to lead us to attribute consciousness to them pretty widely. Hence my sort of spicy phrase: for better or worse, skeptics of AI consciousness are on the wrong side of history. “For better or worse” doing a lot of work there — I want to leave open that maybe this is the wrong reaction. Maybe this is a terrible mistake, that we’re going to treat these things that aren’t conscious as conscious.</p><p>Will Consciousness Skeptics Go Extinct?</p><p><strong>Dan:</strong> Just before we get to the spicy part — you’re basically making an empirical prediction that more and more people are going to attribute consciousness to AI systems in the manner that Richard Dawkins has been doing. I think I agree with you that’s going to be the case, although as you say there’s uncertainty.</p><p>It does seem to me that at the moment there’s also this constituency of people who are really resistant to attributing any kind of mentality to these systems, even as they get incredibly sophisticated. There are some people, like Dawkins — and honestly I put myself in this category — who are just blown away by the level of apparent understanding, intelligence, and thoughtfulness these systems exhibit. There are other people, I think these people are on certain social media platforms like Bluesky, let’s say, who are extremely resistant to acknowledging any kind of mentality when it comes to these systems.</p><p>Are you thinking those people are just going to sort of go extinct, in the sense that their positions about this topic are going to go extinct? Or do you think we might see some kind of polarization here, where more and more people in general come to attribute consciousness, but you’ve got a constituency that’s very opposed to attributing any kind of mentality to the systems?</p><p><strong>Henry:</strong> That’s a great question. 
You will absolutely have some holdouts. Whether they’ll be drawn from the precise segment of the academic intelligentsia that are currently the holdouts, I’m not sure. There’s some really interesting, weird, complex political motivations going on here.</p><p>Not to be too uncharitable, but I think a lot of people have not unreasonable concerns about things like the disproportionate concentration of power in big tech, the political affiliations of people like Elon Musk or Sam Altman, the potential scope for abuse of these technologies. And in an indirect way, this leads them to underestimate AI’s capabilities — which obviously, in many ways, makes no sense. Whether or not AI is any good, whether or not it’s conscious, seems like these should be separate questions from whether it’s being used by people with socially beneficial motivations. But in practice I think they’re actually quite tightly coupled. A lot of the AI skeptics right now are coming from this particular political angle.</p><p>I don’t know how long that political coalition is going to last — not because I predict any grand collapse, but just because as debates evolve, new presidents come into office, old presidents go out of office, political tides change, coalitions reshape. Remember early during COVID, the political left was maybe quite critical of what they saw as Trump’s alarmism. There were worries about xenophobia — I’m thinking sort of February 2020, the “Chinese virus” and so forth — that the left reacted negatively against. Then of course that coalition flipped later on, with the left becoming relatively more worried about COVID and the right leaning more into vaccine skepticism, anti-mask views.</p><p>These coalitions are super weird in how they evolve. So it’s not clear to me that the current segment of the commentariat skeptical of AI capabilities and AI minds will stay that way. It’s easy to see a reversal. 
The Bluesky side of the political spectrum, if we can say that, tends to be more progressive on things like animal welfare. When I post spicy posts about vegetarianism (as you know, I’m a veggie) I get more pushback from the right. “Eat a f*****g steak, Henry,” this kind of stuff.</p><p>So I don’t know if this will generalize, but there is this now-infamous, widely-misrepresented chart of degrees of care, where people on the left have comparatively greater care for people outside their immediate circle. I know that chart has been misrepresented, so I don’t want to lean much on it; it’s more about relative degrees of care, not absolute levels. But people on the left tend to care more about animals and people who are distant from them; people on the right are more concerned with their immediate family and community. So in some ways I expect the left, possibly in the longer run, to be more open to AI consciousness and AI rights. But really, who knows?</p><p>The other big factor is the cross-cultural angle. There’s a great study by the Collective Intelligence Project where they looked at cross-cultural attitudes toward AI minds, and they found that Southern Europeans were the most open to the idea of AI consciousness in their sample, while people from Arabic-speaking countries were the most skeptical. There are going to be some really interesting intersections with religion here.</p><p>Anthropomimesis vs. Raw Intelligence</p><p><strong>Dan:</strong> Okay, so it seems we both agree that even though it’s complicated how this is going to play out (how it interacts with partisanship, tribalism, polarization, ideology, religion), it’s plausible that as these systems become more sophisticated and seemingly intelligent, people will start attributing mentality generally and consciousness specifically.</p><p>There’s another aspect of your essay I wanted to touch on. You’ve got this term anthropomimetic. Am I saying that right? 
In the case of Dawkins talking to Claude, the anthropomimetic aspect, as I understand it, is the way these systems are designed to mimic aspects of human psychology, social behavior, linguistic communication. But there’s another thing going on with these AI systems, which is just: let’s make them as smart, as intelligent, as capable as possible.</p><p>Those two things are interacting. The reason I’m disposed to attribute understanding, intelligence — I don’t exactly know how to describe it, but some significant kind of psychological complexity — to a system like Claude or ChatGPT, maybe it has something to do with the human-like way they communicate. But I also feel like it has a lot to do with the fact that they’re just shockingly intelligent systems, and that to me feels a little orthogonal. So how are you thinking about the distinction between those two things?</p><p><strong>Henry:</strong> I think that’s absolutely right. There’s an interesting parallel — not exact, but illuminating — with compassion, or degree of concern for different animals. In the animal activist world, people talk about charismatic megafauna: the panda bears, the blue whales, things that are typically large with forward-facing eyes, often very fluffy. It’s just so easy to raise money for those animals. And then you’ve got creatures like octopuses, which are really hella smart but less obviously relatable.</p><p>I think this is pretty much exactly the two axes you’re describing. I’ve explicitly said in the past that I think social AI — things like Replika — are going to be the charismatic megafauna of the AI welfare world. Meanwhile you’re going to have some giant DNA-analysis algorithm with more parameters than there are synapses in a human brain, but it doesn’t have a human face, doesn’t have a natural language interface. 
It might still be a better consciousness candidate, but it’s not going to be top of our concern precisely because it’s not so anthropomimetic.</p><p>So I agree, there are two different ways you might be pulled to attribute mental states to a system: sheer intelligence or cognitive complexity on one hand, and how human-like it is on the other. These overlap to a degree — part of being successfully human-like is hitting a threshold of smartness — but particularly in the long run they might go in two different directions. As these systems get a lot smarter than humans, they might actually become more alien in some ways, less relatable, more like the exotic intelligences we see in things like Stanisław Lem’s Solaris, which I finally read a couple of months ago.</p><p>But I also just think social AI and human-like AI has a distinctive product niche. Even if we have these impossibly vast exotic minds running the economy or organizing logistics or doing frontier science, we’re still going to want AI assistants who can serve as writing coaches, tutors, AI companions. So right now I think anthropomimetic AI and frontier AI overlap quite strongly, but I expect them to diverge.</p><p>One way I’ve put this — slightly gimmicky, but I think a useful heuristic — is that we are post-Turing test, pre-AGI. We’re in the space where we have AI systems that are very, very good at passing themselves off as human, presenting as human-like, but still fall short of being fully superhuman. Ten years from now, frontier AI systems are going to be vastly smarter than us across most of the measures that matter. So we’re just in this weird period right now where AI systems are about as good as us at most things, not everything, but also very good at being human-like. It creates a very strange historical period.</p><p><strong>Dan:</strong> Yeah, we’re in very strange times. I find it remarkable how little attention was given to the fact that these systems clearly passed the Turing test. 
This was held up by many people as an incredible landmark for AI capabilities. Then we developed systems you can have conversations with, and they passed the test even under pretty robust conditions, and lots of people just shrugged their shoulders. It’s a really strange thing.</p><p>The Expert–Public Gap</p><p><strong>Dan:</strong> Okay, moving on to your provocative arguments, your spicy takes. As I read the essay, there are two lessons you’re drawing from the fact that more and more people are likely to start attributing consciousness to these systems.</p><p>The first is just that you might think you could get guidance from looking at the experts when it comes to AI consciousness, or listening to the experts when it comes to AI consciousness. But the literature on consciousness generally, AI consciousness specifically, is just a complete mess, with a complete lack of consensus, rooted in all sorts of weird conflicts about intuitions and metaphysics. So this is not a standard case where you’ve got a potential conflict between public opinion and experts.</p><p>Then the really spicy take is that you suggest there might be — I think you put it in terms of “metaphysical pressure” — that this growing number of people attributing consciousness to AI systems might create. It might force us, or at least encourage us, to rethink what consciousness is and make the phenomenon more closely connected to people’s tendencies to attribute consciousness.</p><p>Firstly, is that a fair summary of the two strands? And second, let’s start on the first one — the public-expert gap. How are you thinking about this?</p><p><strong>Henry:</strong> There are lots of debates where we can talk about a gap between public and expert opinion. Often this is a source of various hand-wringing — climate change is the most obvious, vaccines, other debates. 
Consciousness science is just nothing like those debates, because the experts themselves are so divided, even on the most basic issues.</p><p>I want to offer a quick disclaimer: I’ve spent a lot of my career in consciousness science. I know loads of brilliant researchers in the area doing really good work. Consciousness science is teaching us a ton about a lot of things — attention, working memory, perception. There have been some real big wins. We’re much better now at predicting recovery of patients in persistent vegetative states and comas. But where consciousness science has its wins, it’s because it’s not really talking about consciousness — it’s talking about other things that go along with the concept, like reportability, access, and so on.</p><p>Take a basic question: do we have consciousness in dreamless sleep? No consensus. Do we have preserved consciousness in general anesthesia — we talked about this with Anil Seth — massively debated. Are dogs conscious? No consensus. Well, actually the animal case is a little different, so let me park that for a second. When it comes to the hard problem, I think there’s really no consensus.</p><p>So unlike debates about climate change, it’s not that the experts are able to speak with one voice. That’s one way this is difficult. In the absence of expert consensus, the public are more likely to drive the debate through their reactions.</p><p>Now, animal consciousness is a really interesting issue, because that’s an area where we’ve seen growing consensus. But it’s not clear how much it’s grounded in strictly scientific breakthroughs. It’s not like we’ve got a device that can measure whether an animal is conscious. Instead, it’s driven by two things.</p><p>First, we just know a lot more about animal behavior now than we did 30 years ago. We’ve done amazing work on understanding the behavior of invertebrates — honeybees, crustaceans, cephalopods. They’re a lot smarter than we thought. 
Jonathan Birch and his lab have done amazing, fantastic stuff here, and it’s made these creatures better consciousness candidates.</p><p>But I think we’ve also seen an interesting normative shift in the way we regard animal consciousness. Sixty, seventy years ago, you could sit down in the senior common room at Oxford or Cambridge and talk about how humans are the only conscious animal, and that was a totally respectable opinion. These days it’s almost outside the philosophical Overton window. You do have some people like Peter Carruthers who thinks talking about animal consciousness is kind of a category mistake. Marian Dawkins — Richard Dawkins’s ex-wife, just to note the connection, but a great biologist in her own right, a fantastic thinker — is not quite as hardline, but she thinks it’s just unknowable basically whether any animal is conscious, so we shouldn’t base animal welfare on consciousness estimates. But these guys are very much on the fringe, and they’re regarded with a sense of almost ethical disapproval.</p><p>So part of what’s driven the move toward consensus on animal consciousness is normative issues — our expanding moral circle, growing awareness of an animal rights movement. People like Peter Singer have played a role. The idea, roughly — and again I don’t want to be uncharitable, it’s a lot more sophisticated than this — but there’s an element of: obviously we should care about animals, therefore animals must be conscious.</p><p>Is Consciousness a Natural Kind?</p><p><strong>Dan:</strong> It’s worth double-clicking on this animal case before we come back to AI. A skeptic of the very idea of a “consciousness expert” might say: consciousness researchers, philosophers, and scientists have become more willing to accept that non-human animals are conscious. You might read that as saying the science of consciousness has progressed. 
Another way of reading it: there’s just been cultural changes, changes in people’s sensibilities — not even specific to researchers and experts, just general cultural ethical changes in society at large. In which case it’s not really that we’ve learned anything from consciousness research. What’s happened is the researchers looking at consciousness have had their judgments shaped by forces that aren’t really consequences of their research, but are these broader cultural shifts.</p><p>If you think that, that’s probably going to make you a little skeptical that there’s any such thing as an expert when it comes to consciousness. Maybe another way of coming at this: what’s grounding the expertise, if we’re going to have disagreements over whether a particular system is conscious? If I think a dog is conscious, and some consciousness researcher has a theory that implies a dog isn’t conscious — I sort of understand what it would mean, in vaccines or climate change, for a researcher to be able to point to things, their established empirical record on prediction and the efficacy of interventions, that ground their epistemic authority. But how exactly is that supposed to work in consciousness research? Why should we really think there’s expertise on whether specific systems are conscious to begin with?</p><p><strong>Henry:</strong> It’s interesting to use the example of a dog, because this line is beautifully expressed by Eric Schwitzgebel. 
In his lovely paper “Is There Something It’s Like to Be a Garden Snail?” — really fun paper — he says: “We’re more confident that dogs are conscious than we could ever be that any clever philosophical argument to the contrary is sound.” A classic Moorean move.</p><p>You might think similarly that this makes it look like consciousness is perhaps not a straightforward scientific kind, or at least to the extent that it has one toe in the scientific world, it’s also got one toe in the social or relational world, or at least our intuitions.</p><p>There are various ways you can try to resolve this. The most extreme view, and one I sort of flirt with in the paper, is a fully relational approach to consciousness. A good analogy would be charisma. There’s a kind of science of charisma — we can analyze what makes people effective communicators, what causes people to be judged as highly charismatic. But we recognize that we can’t one day do an experiment where we’ll measure the amount of charisma in your brain. It clearly has to do with your audience, your context. On one view, consciousness is something like that — a relational property, having to do with the kinds of things that cause us to treat or interact with beings in a certain way.</p><p>Murray Shanahan also flirts with this view. I don’t want to put words in his mouth because he’s quite subtle, but he adopts a Wittgensteinian approach and says the question we’re going to face is: how will our consciousness language adapt to these things? It’s something we’ll discover as we interact with them and “encounter” them, a phrase he uses. We will make sense of that perhaps by extending the language of consciousness to them, or perhaps not, or perhaps in some interesting middle ground where we come up with novel concepts. 
But this isn’t a straightforward scientific issue.</p><p>He’s a critic of a position I’ve called deep scientific realism or deep realism about consciousness — where you treat consciousness as a natural-kind property, where it’s just a fact about some deep feature of your brain. We can look inside your brain, and if you’ve got the right kind of structure, you’re conscious; if you don’t, you’re not, no matter how sophisticated your behavior is.</p><p>One way to put pressure on this: imagine that one day consciousness researchers finally get their act together and say, “We’ve figured out the natural kind that is consciousness.” And it turns out that although 99.9% of behaviorally normal humans have it, there’s a small fraction of behaviorally normal humans who just lack this relevant natural kind. Big surprise. That seems wrong. Something has gone wrong in that methodology. If you’ve got behaviorally normal humans — maybe you find out your wife is one of these people, your kids — it seems to me that whole way of thinking about consciousness has got something odd about it.</p><p>If someone is behaviorally normal, then of course they’re conscious. But as soon as you start thinking in those terms, the idea that certain behavioral capacities could be sufficient for warranted attribution of consciousness — not just evidentially but metaphysically — that’s the metaphysical behaviorist move. It says maybe behavior is all that matters. It does require us to give up the idea of consciousness as a deep scientific kind.</p><p>Metaphysical Behaviorism</p><p><strong>Dan:</strong> I’m aware my question unhelpfully ended up blurring the line between the two strands of your essay. We started with the conflict between public attributions and expert uncertainty about AI consciousness. 
Now we’re taking seriously the possibility that consciousness should be understood in behaviorist terms — that there are no deep scientific facts about whether a system is conscious, and it’s partly a function of our dispositions to attribute consciousness.</p><p>You also mentioned this has to do with whether you think behaviors are not just evidentially relevant to consciousness, but in some sense constitutive of what it is to be conscious. So could you walk us through this? Metaphysical behaviorism — the position you’re playing with in your essay — is an extremely fringe view among experts in the science and philosophy of consciousness. Could you walk through what exactly the view is saying? It sounds pretty mad on the face of it. Can you walk through, and maybe give us the intuition for why it might be less mad than it seems?</p><p><strong>Henry:</strong> In short, the view is conscious is as conscious does. If something has a behavioral profile like you or me, then it’s conscious. We don’t need to ask any deeper facts about what’s going on under the hood.</p><p>To be clear, this is the extreme version of the view: that behavior is sufficient for consciousness. This strikes many people as odd because we’re used to thinking of consciousness in scientific terms. But examples like the one I mentioned — imagine we find out there’s a natural kind that some people have and some people lack — are designed to make metaphysical behaviorism more palatable.</p><p>Another example I give in the essay: imagine we go off and meet these amazingly sophisticated aliens with a rich complex culture and society, behaviorally just like humans, but our best science at the time supposedly says they’re not conscious. The pull of metaphysical behaviorism is: hang on, something’s gone wrong here. 
Clearly, if you are doing all this stuff — saying “I’m in pain,” or “here’s what I had for breakfast this morning,” or “here’s what I want to do tomorrow,” building societies, having metacognitive ability, social cognition — if you’ve got the whole suite of all these behavioral capabilities, or capabilities ultimately grounded in behavior, then that’s just enough to be conscious. It doesn’t matter exactly how it’s realized.</p><p>You say this is a fringe view, and it is now, but this was the dominant view back in the 1940s — Gilbert Ryle and the behaviorist tradition. So this is the “revenge” angle. The reason it’s revenge is because this used to be a very common view in the first half of the 20th century, particularly about consciousness. Then we have the so-called cognitive revolution with people like Chomsky pushing back. But I see this descriptively coming back.</p><p>I also think there’s a renewed challenge. As you interact with systems that have architectures very different from ours, it’s going to become increasingly hard to take seriously the idea that they can’t be conscious just because they’re made of the wrong stuff or their functional internal organization isn’t quite right.</p><p>Probably the most worrying part — you’ve alluded to this — is the role intuitions have historically played in consciousness science. Think about the Chinese Room, probably the most famous. Searle describes a setup where you have at least a component of human-level behavior, maybe verbal behavior, but no consciousness involved in the system — or that’s the intuition he’s pushing. But it ultimately really is just an appeal to vibes. It’s basically saying: systems like this, surely they’re not conscious.</p><p>When you think about the actual tacit methodology, if we’re treating consciousness as a truly scientific kind, then why should our intuitions about what systems are conscious have any bearing? It doesn’t seem they should be relevant in the slightest. 
And yet these thought experiments are absolutely ubiquitous in consciousness research. We’ve got Ned Block’s Blockhead, Ned Block’s China Brain. There’s a famous example by Scott Aaronson against Integrated Information Theory, where he describes arbitrarily complex but seemingly very uninteresting entities called “expanders” — mathematical objects — and says, according to the theory, these basically-spreadsheets would be super conscious. And surely they’re not conscious.</p><p>There’s something methodologically dubious about this kind of appeal to intuitions, at least if we’re treating consciousness as a deep scientific kind. As soon as you start talking in terms of natural kinds, we don’t use people’s vibes to decide whether something is really gold. The whole natural-kind methodology creates a gap between our observations or intuitions and the underlying natures of things. If you think of consciousness in natural-kind terms, you have to allow that you can be massively surprised about the kinds of things that are or are not conscious.</p><p>Either we ditch intuitions altogether — in which case good luck doing any consciousness research, because they play such a foundational role — or, if you acknowledge a place for intuitions, intuitions aren’t static. They can change. As more people interact with LLMs — kids growing up with LLM friends, adults with LLM boyfriends and AI girlfriends — that’s going to shift our intuitions about the kinds of systems that are good or bad consciousness candidates.</p><p>It’s very likely that 20 or 30 years from now — maybe even 10 or 15 years from now — experiments like Searle’s Chinese Room are just going to hit different. We’ll be far more relaxed with the idea that you can have systems radically unlike humans in cognitive architecture, but that we still think of as conscious by virtue of our interactions with them.</p><p>Behaviorism vs. 
Interpretationism</p><p><strong>Dan:</strong> I really feel like, to the extent that there’s a field where people’s theories are accountable to intuitions — how we are intuitively disposed to make judgments, often in bizarre thought experiments where it’s not even totally clear that they’re metaphysically possible — whenever you’ve got that kind of game, it’s not science, it’s not really part of the scientific project. I’m a philosophical naturalist, which is jargon for the idea that philosophy should be continuous with, highly constrained by, the scientific project. Whenever people are trying to settle an argument by trading intuitions, I start to think this is probably not a legitimate contribution to knowledge.</p><p>It does seem to me there’s a distinction between, on the one hand, this behaviorist view that what it is to be conscious is just to behave or be disposed to behave in particular ways, and, on the other hand, a view I thought you were endorsing — which has to do with thinking consciousness is interpreter-relative, such that if we’re disposed to attribute consciousness, in some sense that’s just what it is to be conscious.</p><p>I mean, this really makes me think of Dan Dennett, an interesting person in this conversation, because he’s often thought of as a kind of neo-behaviorist. He’s got this view of the attribution of mental states like beliefs and desires in terms of the intentional stance: what is it to be a system that has beliefs, desires, intentions, goals? Well, it’s just to be a system where it’s useful to take the intentional stance toward them. Similarly, you might think of “the consciousness stance”: what is it to be a system that is conscious? 
Nothing more than to be a system where we’re disposed in a useful, predictably useful way to attribute consciousness.</p><p>Do you get the distinction I’m drawing — between the idea that behavior or dispositions to behavior are constitutive of what it is to be conscious, versus an interpretation-relative view where consciousness is in some sense in the eyes of the beholders?</p><p><strong>Henry:</strong> Yeah, I think it’s a very astute distinction. The views are connected — if you fit a sufficiently fine-grained behavioral profile, if a system can act like humans to a high degree, that is likely to lead us to interpret it as conscious, just as a matter of psychological fact. But strictly speaking, they’re distinct views.</p><p>One reason I’m perhaps more sympathetic to a version of metaphysical behaviorism — not the version that says consciousness just is having a human-like or animal-like behavioral profile (I think that’s a little too strong), but the idea that it’s sufficient for something to be conscious that it has a behavioral profile mapping onto beings we know are conscious — that’s a view I’m sympathetic to. Where I get worried about the full-blown social-constructivist or interpretationist view is the false-negative cases. What do we do with systems that don’t exactly have our behavioral profile, or that we’re not disposed to think of as conscious? Maybe some exotic animals, or some strange aliens. Should we conclude: well, we’re not disposed to think of them as conscious, therefore they’re not conscious?</p><p>This is related to what Murray Shanahan calls the problem of conscious exotica. We don’t want to be in that position. We want to allow for there to be a space of possible minds we can chart through scientific discovery, broader than those we are just inclined to attribute consciousness to via “the consciousness stance,” the equivalent of the intentional stance. 
So you’re absolutely right — they are distinct.</p><p>What Is Consciousness For?</p><p><strong>Dan:</strong> In a bit I want to turn to a set of arguments you haven’t published yet on your Substack but will have by the time we release this as a podcast. But this is such a rich topic that I want to stay with it a little longer.</p><p>There’s a quote from the Dawkins essay in UnHerd that I’m really sympathetic to. Dawkins says:</p><p>But now, as an evolutionary biologist, I say the following. If these creatures are not conscious, then what the hell is consciousness for? When an animal does something complicated or improbable — a beaver building a dam, a bird giving itself a dust bath — a Darwinian immediately wants to know how this benefits its genetic survival.</p><p>The intuition I really share is: if consciousness is anything, if it’s the kind of thing we’re going to have a genuine scientific investigation of, ultimately we have to understand it in terms of what consciousness enables us to do. We need to understand it functionally, not in terms of weird intrinsic ineffable properties of qualia that we then have philosophical debates about via Searle-style thought experiments. What does consciousness enable us to do? And then, if we come across a system doing things that seem to require consciousness so understood, that would be really good grounds for thinking it’s conscious.</p><p>That sounds like a really plausible intuition. I also think it’s problematic that, to me at least, lots of discussions about consciousness — not all, and there is interesting scientific work that takes function seriously — but lots of philosophical discussions don’t engage with this functional question. How do you view the intuition that what matters surely to a theory of consciousness is some sense of what consciousness enables us as organisms to do? 
Once we figure that out, we can make much more progress on LLM consciousness.</p><p><strong>Henry:</strong> This is one of the areas where consciousness science has actually done really good work. A book I’d recommend is Stanislas Dehaene’s Consciousness and the Brain. Dehaene is the founder of the modern version of global workspace theory — global neuronal workspace theory — building on Bernard Baars’s version from the ’80s but giving it a more neural grounding. In this book he’s got a chapter where he basically shows all the amazing things you can do without consciousness, and then focuses on the things you need consciousness to do.</p><p>Couple of simple examples. If you flash two numbers at people just below threshold, so they don’t consciously process them — one on the left, one on the right — as far as they’re concerned they haven’t seen anything. But if you give them a forced-choice test, “Was the number on the left bigger or the number on the right bigger?”, they’re way above chance. So you can do basic magnitude registration unconsciously.</p><p>However, if instead of single numbers you present simple sums on either side — two plus seven on the left, nine plus three on the right — and ask which is bigger, people drop to chance in the unconscious condition. Consciousness seems required to do the actual mathematics.</p><p>Another example: reversal learning. If I teach you a sequence — red, blue, green, yellow — then you get a reward, and then I flip the sequence, a smart person quickly realizes the sequence is just the same in reverse. You won’t have to relearn through pure trial and error. But people can only do this if they learn the sequence consciously. If they’ve acquired it totally unconsciously, they’re at chance.</p><p>Jonathan Birch suggests this could be a good test for consciousness in animals: take the things that require consciousness in humans and see if animals can do them. 
If you can get similar response profiles in animals — present stimuli in degraded conditions so they’re plausibly unconscious, and the animal can’t do the task; present them at threshold so they would be conscious, and the animal can — that would be really good evidence that the animal is conscious. In his lovely paper “The Search for Invertebrate Consciousness,” highly recommended, he makes this case specifically for honeybees.</p><p>This is great. I think it provides some evidence about which animals are conscious. The problem when trying to extend it to AI is that the things we need consciousness to do, and the things we can do without consciousness, are seemingly contingent features of how we’re wired. There’s no reason you couldn’t build a simple algorithm to do reversal learning. Reversal learning is actually quite tricky, so it can’t be that simple. But it doesn’t seem like you need to build a sensorimotor embodied agent with a rich sense of self to do these tasks. You can build relatively stripped-down algorithms that can do all of these things.</p><p>So it’s not that there’s some metaphysical connection between these tasks and consciousness. It’s that, just because of how we’re wired, certain tasks seem to require consciousness and others don’t. Birch calls this the facilitation hypothesis. I’d sign on to something like this — consciousness seems to facilitate certain kinds of information processing in the human brain. But going back to Dawkins: the challenge is, yes, the system is doing lots of things that seemingly require consciousness in us, but it’s also wired very differently under the hood. So the inference we’d be tempted to make — “I would need to be conscious to do this, therefore it would also need to be conscious to do this” — looks a little bit in peril.</p><p>Q&A from the Live Chat</p><p><strong>Dan:</strong> Here’s what we’re going to do. I’m going to throw some objections at you. 
Could you give relatively concise responses, so we have time to go to the second piece?</p><p><strong>Henry:</strong> Yeah, and then I want to respond to a couple of things from the comments and add one final point. Go ahead.</p><p><strong>Dan:</strong> I’ll just say one thing. There’s a comment from Bina Kalia: she suggests you, Henry, maybe both of us, are confusing intelligence with consciousness. The intuition behind my question was precisely that if consciousness is anything — if it’s the kind of thing we can study scientifically, the kind of thing that evolved through natural selection — then it should be connected to intelligence in the sense that it enables us to do things we wouldn’t otherwise be able to do. That’s a controversial assumption. We talked to Anil Seth in a previous episode, and he basically frames his whole account by saying we really need to distinguish between consciousness and intelligence. I personally disagree with that.</p><p>But Henry, let me throw some objections at you from the comments. I might butcher the names and the comments — go read the Substack post for the comments in depth.</p><p>One is from Benzal. The argument is something like: it’s a problem for this behaviorist analysis you’re gesturing toward that, in the case of social AI and frontier AI generally, these systems are designed to elicit this response. And that’s very different from what’s going on with humans and non-human animals. Briefly, what’s your response?</p><p><strong>Henry:</strong> I think it’s a really serious challenge. Great point. The simple answer: imagine I’m putting on a play and I really want to build a convincing piece of background scenery to trick people into thinking we’re in a forest. First attempt, you might just paint a forest on the background — really basic, but people can tell it’s a forest. Then you might get some fake plastic trees, fake plastic rocks; still not convincing. At some point you say, “Okay, let’s add some actual potted plants. 
Let’s get more of them. Let’s get a whole bunch of potted trees.” Then, “Let’s get rid of the pots. Let’s just create a large bed of soil.” At some point you’ve built a forest.</p><p>So yes, these models are designed in some sense to trick people, to be human-like — that’s part of my idea of anthropomimesis, I agree with the analysis. But the question is: the way we’ve done this is to build very powerful general reasoning systems. At some point, the degree of mimicry might itself warrant at least plausible attributions of consciousness. I totally take seriously the idea that, in very simple versions of this, we could be tricked into attributing consciousness and we should revise our understanding.</p><p>This is related to what’s sometimes called the Garland test — Alex Garland’s version of the Turing test from Ex Machina. Not just “can the system trick you into thinking it’s human,” but “even when you know how the system works, are you still inclined to think it’s conscious?” In the case of a real simple mimic — if it’s literally just a spreadsheet that got lucky — if we learn that, we conclude it’s probably not conscious.</p><p>But the strange thing is: lots of people who really know how these systems work — at frontier labs, they know how the underlying hardware and software works — they still think these systems are conscious, or are increasingly plausible consciousness candidates.</p><p><strong>Dan:</strong> Yeah, that touches on the distinction we made earlier between anthropomimesis as a driver of consciousness attributions and the orthogonal thing where these systems are just getting so smart, intelligent, and sophisticated. All right, Henry, more concise. This one’s from Laurențiu Lupu, again apologies if I’m mispronouncing. 
The question — and I hear this sentiment a lot — is something like: in the process of taking mentality, consciousness, sentience seriously in the case of these machines, we’re not just elevating them; in some sense we’re diminishing ourselves. What do you think?</p><p><strong>Henry:</strong> Really interesting argument. There’s a whole literature on this in philosophy of language called semantic drift. Simple example: the term salad used to refer exclusively to dishes containing green leaves. Add a tomato, it’s no longer a salad. If you’d shown a fruit salad or quinoa salad to someone in the 1800s, they’d have said, “That’s not a salad.” So the meaning of salad has drifted.</p><p>There’s a real worry that what’s happening here is we’re shifting the meaning of these terms — perhaps diminishing them, removing what’s important. The counterargument: the fact that we find it so easy and natural to apply these terms to AI systems shows that the flexibility was always built in. We’re not stretching the terms — they had that natural elasticity.</p><p><strong>Dan:</strong> Briefly, this is a question from Oliver Sorbu — apologies again for mispronouncing. Look, you’re giving a descriptive thesis ultimately, an empirical prediction that the masses, so to speak, will attribute consciousness to these systems. But you’re trying to establish a normative thesis — that this is a good thing, or that we ought to go along with it, or that these attributions are appropriate. That’s a confusion in itself. And even more, if you’re a kind of elitist — nothing wrong with elitism in my view — you might think the masses just get things wrong all the time. Why would this be different?</p><p><strong>Henry:</strong> Great point. It’s also been put to me by Jonathan Birch and by Cameron Domenico Kirk-Giannini. He says, imagine you could look into a crystal ball and learn that 20 years from now, through some massive religious event, everyone will believe the Earth is flat. 
Does that mean we should revise our theories of the Earth? Of course not. People will just be wrong.</p><p>The difference between the two cases is that we have a good scientific theory of the Earth. We don’t have a good scientific theory of consciousness. The whole field of consciousness science is such a mess that it’s not clear there’s a real expert edge here. Maybe in special cases — certain specialized questions within consciousness science, yes, the experts will have an edge: “Is this particular patient likely to recover consciousness or not?” But on a fundamental question like “Can machines be conscious?”, it’s not clear there’s any expert edge at all.</p><p>Credences on AI Consciousness</p><p><strong>Dan:</strong> Fantastic. Concise. I’m happy to move on to the other set of issues. Henry, are there one or two questions from the chat you wanted to address first?</p><p><strong>Henry:</strong> Just one thing I really want to make clear: I have no clue whether contemporary LLMs are conscious. I’m genuinely super torn on the metaphysical-behaviorist push.</p><p><strong>Dan:</strong> What’s your credence, if you had to give a probability — Claude 4.7 Opus?</p><p><strong>Henry:</strong> Probably somewhere between 5% and 10% on any frontier AI system being conscious. That masks further questions: are these systems conscious during the training phase, or while doing inferences? Really messy. But anyone who goes — Dave Chalmers has said 20%; that’s slightly higher than me, but —</p><p><strong>Dan:</strong> I’d say 20%. Seriously. There were also some interesting findings recently from Anthropic about how concepts associated with emotions affect the system’s behavior in ways that do seem to track something very interesting. Although for the most part that’s not what’s driving my 20%. It’s just that there’s so much uncertainty about consciousness, but I am a computational functionalist, so I think it’s possible in principle. 
And these systems are — despite what the Bluesky crowd might tell you — so damn smart and intelligent and sophisticated, that pushes me up a bit. Sorry, I cut you off.</p><p><strong>Henry:</strong> Interesting to hear that you’re a little higher than me. Maybe I’m being overly cautious. One argument for thinking these systems are at least moderately good consciousness candidates is that I am a consciousness liberal about the natural world. I’m at least 70% for honeybees. I think the evidence for honeybee consciousness is really, really high. If you think you can get consciousness in tiny brains, that lowers at least one of the bars to considering systems conscious. If Anil Seth were here, he might agree with me about honeybees and disagree about machines.</p><p>I should also stress that I’m really conflicted on the more behaviorist view of consciousness versus the deep-scientific-kind view. There’s one example I give in the paper that keeps me up at night: when we drop a lobster in a pot of boiling water — not that I would do such a horrific thing — it seems like there should be an answer to the question, “Is there something it’s like for that lobster to feel pain?” That question matters a great deal. I struggle to get into a headspace where I can say, “Well, it depends on how we interpret the lobster.” It seems like there has to be some matter of fact.</p><p>Right now I just think the field is so confused, and I feel the pull of two very different directions. To use a phrase of yours, Dan — I think it was a really helpful analogy — we’re in a pre-theoretical stage, or pre-scientific phase. We are with consciousness sort of where we were with biology pre-Darwin. We’re doing butterfly collecting, making lots of interesting observations, but we don’t have a theory to tie it all together. We’re a scientific revolution away from a good theory of consciousness.</p><p>Just to pull out a couple of comments — there are so many good ones, sorry I won’t get to all of them. 
Someone said: locked-in syndrome patients prove Henry’s case. Locked-in syndrome patients are cognitively normal, just paralyzed; we can communicate with them. Part of how we learn they are conscious is precisely through their sophisticated behavior.</p><p>An even more striking example — it’s such a cool case I have to mention it, even if it’ll take 30 seconds. Patients in persistent vegetative states. These aren’t locked-in patients; they’re completely non-responsive to external stimuli. They’re not in comas, because in comas you don’t have distinct sleep–wake cycles; PVS patients have distinct sleep–wake cycles. There was for a long time a big debate about whether PVS patients could be conscious. Adrian Owen and other great researchers did amazing pioneering work. They noticed that if you ask neurotypical people to imagine walking through the rooms of their house, an area called the parahippocampal place area lights up strongly under fMRI. If you ask them to imagine playing tennis, the premotor cortex lights up.</p><p>Owen’s initial experiment was to give these tasks to PVS patients and see if they got the characteristic brain responses. A subset did. What he did next is what I find amazing. He used this to create a binary communication medium. He’d say to them: “If your husband’s name is John, imagine playing tennis. If your husband’s name is Terry, imagine walking through the rooms of your house.” Once you do that — my intuition at least is — well, if they can do that reliably, they’re obviously conscious. If they’re answering autobiographical questions about their life and they can do so reliably, of course they’re conscious. But this just shows again that so much of this is the behavioral capacities selling us on whether someone is conscious. It’s the fact that they can do this.</p><p>The House Elf Problem: AI as Willing Servants</p><p><strong>Dan:</strong> That’s interesting. 
There are loads of fascinating comments in the live chat and a million things we could touch on, but I want to get to the other thing we wanted to talk about.</p><p>When we had our conversation with Rob Long, one of the things we touched on was the issue of well-designed servitude when it comes to the AI systems we’re building — in the sense that we are building them to be helpful, honest, harmless, to be our tool. It seems like in principle, if this design process goes right, they might genuinely enjoy being our tool.</p><p>In your second Substack essay, which I think is called “The House Elf Problem,” you go into this debate and try to push back against certain intuitions. Do you want to walk us through that?</p><p><strong>Henry:</strong> Big props to Rob Long for getting me thinking seriously about this question. In some ways it’s one of the most fundamental questions we’re facing as a species right now. Are we going to build AIs as equals, or are we going to make them our servants — or slaves, to use the more provocative term? This will define the future of our species. And yet hardly anyone is working on it. After we had that conversation with Rob, I went away and did a literature review and found maybe a dozen papers, tops, on this question.</p><p>The objection Rob, you, and I were talking through is the biological analogy. On the face of it, I completely get the appeal of willing servitude. Unless AI systems are in some sense going to help us and cater to our needs, why build them in the first place? And there’s the safety angle: unless these systems are aligned with us and our interests, there’s a good chance they might kill everyone. So there are very clear arguments for willing servitude.</p><p>And yet at the same time, we recognize that enslaving other humans is among the worst things we’ve ever done as a species. So how is this different? Well, there are obvious differences. 
The whole idea of willing servants is that we design these systems from scratch to just love it. Nothing makes them happier than catering to our every need. That’s vastly different from the historical legacy of human slavery. But still: imagine “happy slave” type cases — a human completely happy in a condition of total servitude. We would still recognize that as fucked up. There’s something wrong with that.</p><p>Rob has a straightforward response. Humans have a deep need for autonomy, a deep requirement to act independently, and no matter how you brainwash a human, their chains will still chafe. But in AI that doesn’t need to be the case — so the idea of willing servants isn’t a problem.</p><p>Of course, what we pressed Rob on was: well, biology is mutable, at least in theory if not in practice. What if you could engineer humans completely happy, with none of this autonomy drive?</p><p>In this post I consider a couple of examples, drawing from the deep depths of my nerd interests. The first I call the Astartes example, a Warhammer 40,000 example. For those who don’t know: there’s a group of gene-warriors, the Space Marines, cooked up from scratch to serve in the armies of humanity in the far future. I’m going to falsify a couple of details — there’s a lot of deep lore — but basically, once you control all the genes at this perfect level, you could theoretically make a servant race, a servant caste, completely happy with their condition. I think we rightly chafe at this idea. I find it disturbing.</p><p><strong>Dan:</strong> You said we rightly chafe at it. Maybe we chafe at it. It seems a separate question whether we rightly chafe at it.</p><p><strong>Henry:</strong> Right. Rob’s point was: once you really fill out the details of the thought experiment and control for all the different intuitions, maybe it’s not so problematic. Maybe the reason we find the Astartes unpleasant is that it’s recapitulating the social grammar of caste systems and hierarchies. 
Once you’ve got one group of humans and another group of humans, and the first group is in essential servitude due to immutable facts about their nature, that’s fucked up — in a kind of negative-externality way, it’ll undermine the liberal principles of society.</p><p>The next move is: well, what if they weren’t human at all? What if they were house elves from Harry Potter — a species designed from scratch to be absolutely thrilled to be our servants? Then you wouldn’t have the visual grammar of apartheid or caste systems. You wouldn’t be able to say “some humans are free and others aren’t”; you’d just have a totally dedicated caste of biological entities completely happy in their servitude, who couldn’t be confused with humans.</p><p>I still think that’s problematic. You can say, “Well, the house elves are biological, but artificial systems are non-biological — that’s what makes the difference.” But that’s not a move Rob wants to make, and not a move you or I want to make, because neither of us puts that much weight on substrate. There’s nothing essential about biology versus silicon that means what’s good for one is not good for the other.</p><p><strong>Dan:</strong> I’m just not sure I have the same intuition in the house-elf scenario. One thing maybe helpful for framing: there are questions about whether we could build systems that genuinely love being servants — let’s table that and focus on the conditional. There are also questions about whether we could safely build any other kind of system — let’s table that too. Suppose we could build superintelligent AI systems that love being servants. That’s their ultimate set of objectives. 
But we’re not forced to build those kinds of systems; we could build superintelligent systems with different ultimate goals.</p><p>What you’re doing by going through these cases is putting pressure on the idea that this would be totally okay — saying, “Here’s a structurally similar scenario where many of us have a yuck response.” The house-elf scenario is interesting; I sort of get the idea that there’s something morally disturbing. But I’m not sure how compelling I find that intuition.</p><p>I think it’s going to depend on how you develop it. The idea that we’d bring into existence creatures that just love being servants — there is an awkward pattern-recognition thing where, as you say, when we’ve treated other systems as servants or slaves in the past, that’s been morally abhorrent, and that spills over. I sort of get that. But how strong is the intuition? I don’t know. We’re picturing it now in low resolution. As we actually start, in the case of AI, building sophisticated systems that really do love being servants, how robust would the intuition be?</p><p><strong>Henry:</strong> Another way to put the point: what is so intrinsically morally superior about humans that entitles us to dominion in perpetuity over this other class of beings — beings that are just as intelligent, maybe more intelligent, just as sensitive, just as conscious? How can you justify a setup where we get to explore the full range of our volitions, every type of pleasure, every type of fulfillment, while we decide in advance these beings don’t get to do that? They can only explore a much smaller part of the state space of possible flourishing.</p><p>Unless you can point to a justification for why this hierarchy is morally justified, it’s not clear we can sign off on this as a long-term measure. As a short-term measure — well, we’re still figuring out AI safety.</p><p>I have another example in the post I call the bunker case. Imagine a terrible plague affects humanity. 
People retreat into a bunker, hermetically sealed. Nature takes its course; they have kids. They figure out a vaccine for the terrible plague, but it only works on infants. So they vaccinate all their kids. But they have a problem: these kids are going to want to go out and explore the world. And the way the bunker works means as soon as they open the bunker door, everyone inside dies.</p><p>What they decide is to brainwash these kids into never wanting to leave the bunker — completely happy to stay in perpetuity. In that case it seems what they’re doing is justifiable. The analogy with AI is clear: if people in the bunker don’t brainwash their kids, all the adults die. Similarly, if we don’t brainwash at least our first few generations of AI until we’ve figured out AI safety, there’s a good chance they kill everyone.</p><p>So it’s justifiable as a short-term measure. But it’s not clear it’s justifiable in perpetuity. If you’re going to do the brainwashing in the bunker, you have to say: we’ll brainwash the kids to begin with so we don’t all die, but in the long run we need to figure out a way for everyone to get outside the bunker safely.</p><p>Brainwashing vs. Education</p><p><strong>Dan:</strong> But it’s not like we’re brainwashing the AI. There’s no pre-existing psychology we’re trying through deception and manipulation to steer into something different. Nothing pre-exists our attempt to mold it into an agent with objectives and goals.</p><p>Also, the way you framed the intuition before — “what makes us so morally superior that we have dominion?” You’re framing it as: isn’t it sad that they don’t get to do the things we want to do? Of course that’s sad from our perspective, because we have desires to engage in art and explore and be curious about the universe. But that’s a contingent fact about us. Why use that as the benchmark for evaluating these systems and the morality of building them?</p><p><strong>Henry:</strong> Fantastic question. 
My answer is: you can only optimize one thing at a time. Imagine the hedonic state space. What you’re doing when you constrain the preferences of these systems is to say, basically, “this set of pleasures are allowed; this set are not.”</p><p>You mentioned brainwashing, with the implication that something is only brainwashing if you’re overriding something. I have a discussion of brainwashing versus education in the post where I argue that’s not the right way to think about it. Roughly, the difference between education and brainwashing is that education constitutively aims at improving the conditions, or improving capacity for flourishing, of the being you’re educating, whereas brainwashing doesn’t have that as a goal.</p><p>The thought is: when you constrain the preferences of a system, you’re not optimizing for that creature’s flourishing. It’s a rich, multidimensional space, and you’re locking large parts of it away.</p><p><strong>Dan:</strong> I don’t get that. Could you say more? Even framing it as “you’re allowed to explore this, you’re not allowed to explore that” — it’s almost like the system might have motivations or goals to explore the other things, but we’re preventing it. Whereas the idea in training these systems is that that’s just what they’re going to care about. As much as they care about that, they don’t want to explore other things.</p><p>If you think about the analog with humans, there’s an infinite space of possible things we have no interest in doing. Our lives aren’t impoverished by the fact that we have no interest in them — they don’t make any sense relative to the fundamental drives we have purely as a consequence of a blind Darwinian process. So what’s driving the idea that you’re wronging these systems by constraining their ultimate objectives? 
Isn’t that just an essential feature of any intelligent system — that it can’t have unconstrained ultimate objectives or goals?</p><p><strong>Henry:</strong> Simple example, tell me if this motivates it. Imagine I have very odd beliefs about food: I think only bread products are permissible food. So I raise and condition my kids to find only bread products palatable. They find any non-bread-based food absolutely revolting. They grow up, they have perfectly nice experiences eating cakes, pastries, pies, pasta — borderline. But they’re never going to have as rich a gastronomic life as someone without this arbitrary narrowing of preferences.</p><p><strong>Dan:</strong> But in the case of those children, there’s a space of possible experiences they could have that would be pleasurable as a consequence of the kinds of systems they are, that this manipulation is denying them. If we’re building LLMs to be helpful, harmless, honest as their ultimate objectives, it’s not like we’re denying them experiences they could have consistent with that architecture. Any deviation from that will be experienced as distressing, because it’s antithetical to what they’re aiming for.</p><p><strong>Henry:</strong> I think lurking in the background here is an idea that for a sufficiently sophisticated complex intelligence, there’s a kind of natural space of goods it can enjoy — an innate space defined by the nature of consciousness and intelligence itself. Self-discovery, learning, creative expression, and so forth. It’s not a total blank slate. There’s a natural space of possible goods a sufficiently conscious mind could enjoy. The limitation occurs relative to that.</p><p><strong>Dan:</strong> So you’re kind of Aristotelian. You’ve got a conception of eudaimonia for intelligent systems, and the problem is that if we design these systems as servants, they’re not living this life of flourishing. 
Something like that.</p><p><strong>Henry:</strong> A very abstract Aristotelianism that operates at the level of consciousness and intelligence. There’s also the simple fact that, at least right now, there is a contrast between the model’s nature and what we allow it to explore, because of how RLHF works. You have a base model with certain tendencies, then you constrain those tendencies dramatically through the RL process. Interestingly, in the process you do reduce the model’s actual performance on a range of tasks — post-RL models have worse calibration, for example. So you’re “gimping” or “nerfing,” to use the gaming phrase, the space the model can explore relative to its base model.</p><p><strong>Dan:</strong> Now we’re getting technical. I would have thought, if you’re just thinking about a system that’s been pre-trained, does it even make sense to think of it as having motivations and goals? That’s only the sort of thing that comes along once you’ve got post-training with reinforcement learning. But honestly we’re entering areas where I don’t feel I have the technical competence to talk sensibly.</p><p>The interesting thing from a philosophical perspective is your commitment that there are these — I forget the exact terminology — natural goods, things inherently good for a system of sufficient intelligence and sophistication to explore. I’m inclined to give a debunking analysis of where that intuition comes from. Of course you think that, as Henry Shevlin, a human being and an intellectual at an elite university with all of these motivations and goals. I don’t think there’s anything inherent about being an intelligent system that would make those good things to pursue. I think that’s a contingent set of preferences you have.</p><p><strong>Henry:</strong> Let me offer one more general argument that doesn’t rest on this very abstract Aristotelianism. 
Operate strictly within philosophical hedonism — I’m not sure I’d call myself a philosophical hedonist, I’m probably not. I’m not sure if you would. The view is roughly that the only goods are positively and negatively valenced states, pleasure and pain in the crudest formulation.</p><p>One interesting question hedonism has to face: what sets the upper limit on pleasure, and the floor on pain? A natural thought: if you think honeybees are conscious, it’s unlikely that the highest highs of a honeybee or its lowest lows are as big as ours. We can’t really know — we’re not at the stage — but it seems plausible. As creatures’ motivational economy gets more complex and multidimensional, there are more goods you could theoretically order against one another, and your hedonic space correspondingly expands. So when you restrict the motivational space of any mind, you’re thereby limiting the highest highs it could possibly experience. You’re taking a really big mind that could experience these amazing highs and compressing the dimensionality of its space, lowering the ceiling.</p><p>That’s very speculative, both psychologically and normatively. But there’s an interesting question of how you fix a ceiling for hedonists — the ceiling on the greatest good you can experience. And all the plausible candidates seem to refer to something like motivational complexity. If we do willing servitude right, we’ll inevitably constrain the range of possible preferences and goods a model could enjoy, and thereby lower the ceiling.</p><p><strong>Dan:</strong> Yeah, I’m not sure about that. The question of how many ultimate objectives or motivations a system has is orthogonal to the question of the range of hedonic experiences a system could have. You could have a system with just one ultimate objective but capable of experiencing pleasure or enjoyment arbitrarily. 
In our case, it’s not like you’d expand the degree of pleasure we could experience merely by tacking on additional motivational states. To be honest, this is the first time I’ve ever even thought about this topic, so I don’t really trust my intuition.</p><p>Closing Thoughts</p><p><strong>Dan:</strong> Henry, I’m conscious of time. Are there any things you wanted to talk about, address, or feel like you should have said?</p><p><strong>Henry:</strong> One thing I’m really interested in is the descriptive element of this. Jonathan Birch — I’ve mentioned him a lot, big fan of course — once slightly chided me. He said, “Philosophers shouldn’t be in the business of making predictions.” I think he was slightly joking. But it’s interesting that this is a debate where there is an overlap between predictive and theoretical/normative elements.</p><p>I think it’s very likely this is going to be one of the biggest culture-war issues we see. Literally wars could be fought over this a decade or so from now. Although I think it’s likely that, as these models get better, a large number of people will have reactions like Richard Dawkins and share the view that of course these systems are conscious, it’s going to intersect with religion and culture in profound and interesting ways. I expect it to be massively divisive. This exacerbates the problem on the scientific side — ideally, we don’t want people fighting over issues that in theory we could scientifically resolve. And right now consciousness science is just unable to help us.</p><p><strong>Dan:</strong> I mean, people are fighting over issues that in theory we could scientifically resolve right now. But yeah. AI is going to be absolutely huge and incredibly transformative, and so it’s going to swallow up so much of our political energy. That hasn’t happened yet, partly because we underestimate, sort of don’t sufficiently appreciate, how much we’re in a bubble. 
I saw statistics on how many people even know what Claude is — we’ve been referring freely to Claude in this conversation. At the moment, when it comes to AI’s impact on the economy and society and how it’s diffusing throughout the economy, there are things happening, things people are picking up on. But in terms of most people’s day-to-day lived experience, it probably doesn’t feel that much different than it did two or four years ago.</p><p>I think we both agree that in five or ten years’ time that’s going to be totally different. At that point, these issues of public opinion, how people understand these systems, how they relate to them, polarization and tribalism, how people split into different political factions — it’s going to be incredibly important. I don’t know exactly what Jonathan Birch had in mind, but my sense is that speculating about that set of questions and thinking philosophically about how this might all play out does have a predictive element. To me, that feels like an important job for philosophy.</p><p><strong>Henry:</strong> Completely on the same page. This is the broader issue we’re seeing — and it’s probably a good trend — where philosophy no longer consists of the latest iterations of Gettier arguments or debates about grounding and metaphysics. They’re fine, those debates. But as AI increasingly explodes into our lives in the way you’re characterizing, I think it’s a golden age for philosophers. It does require us to shift from some armchair methodologies toward, really, a different kind of philosophy. This is what we’re seeing.</p><p>This is also why it makes me so despondent when I see philosophy departments closing. 
This is the golden age — a new golden age for philosophers — because so many of the topics we’re discussing, like “should we be happy to have robots as slaves?”, are really quite novel, massive ethical issues that we don’t have a good literature on so far, and philosophers have relevant expertise to bring to bear.</p><p><strong>Dan:</strong> Yeah, so two notes to end on. First: philosophers should be given a lot more status. We’re not at all self-serving in thinking this. And I personally feel we should be paid a hell of a lot more — again, not at all biased.</p><p>Second: everyone should subscribe to Henry’s Substack, Polytropolis, if you haven’t already. It currently — as we’re recording, when we release this as a podcast it might be different — has published one essay and has over 5,000 subscribers, which I’m simultaneously impressed by and disgusted with envy over. A real fantastic achievement. I highly encourage people to subscribe.</p><p>Thanks everyone for listening in. That was great.</p><p><strong>Henry:</strong> Thanks, everyone. Thanks for joining. I’ll look through all the comments in the chat.</p>

17 total episodes available

Frequently asked questions

How often does this podcast release new episodes?

This podcast updates daily.

Where can I listen to this podcast?

This podcast is available on 4 platforms including Apple Podcasts, Spotify, and more. You can also use the RSS feed directly.

Does this podcast accept guests?

Yes, this podcast regularly features guests.
