Human Compatible is an interesting book. Stuart Russell does a decent job of explaining the basics of artificial intelligence (AI) and artificial general intelligence (AGI) to the average layman (of which I am one). It covers what AI is; the difference between AI and AGI; the development of AI in both its historical and political aspects; the debates over AI; possible future developments in the field; and more. As Russell is a leading expert in the field (or so the blurb and a Google search tell me), he is well equipped to deal with these topics. If you are interested in learning about this topic, I can heartily recommend this book; Russell writes in a very clear style, with ideas appropriately dumbed down for non-experts.
But alas, I am not here to review the technical content of the book. Oh no, I am not suited for that, and it would be no fun anyway. No, I am here to take a little look at the assumptions and philosophical content of the text.
Dear reader, I feel it is my duty to inform you that Russell has served as “vice-chair of the World Economic Forum’s Council on AI and Robotics and as an adviser to the United Nations on arms control.” That should make it clear that his opinions and values are probably not too out of alignment with the powers that be.
Now that we’ve got the preliminaries out of the way, let's jump straight into it.
The Utilitarianism Problem
Dear reader, AI has a problem: utilitarianism. Well, it is not a problem for AI itself; it seems unavoidable that AI will be guided by some utility function. Rather, the problem is that its researchers are utilitarians not only about AI but about humans as well. Stuart Russell, I am sad to say, is such a one. Alas, not only is he a utilitarian, but a utilitarian of the most naïve and base kind.
Russell has a specific section in his book where he argues for utilitarian beliefs. Let's take a look at his arguments.
His first argument for utilitarianism amounts to (I am paraphrasing liberally here): “When we act, we care about the consequences we bring about, and we compare different consequences resulting from different actions by their utility to us” and therefore utilitarianism is true. This seems a decent enough model of human behaviour (God knows they stuff this down your throat in economics), but I fail to see why we should take it seriously as a theory of ethics: of what actions one ought to take and what consequences one ought to pursue.
The second argument is almost too painful to bear repeating, but suffering builds character, or so I hear. “Consequentialism is a hard principle to argue against… because it’s incoherent to object to consequentialism on the grounds that it would have undesirable consequences.” Russell is of course correct. If consequentialism is just bringing about good outcomes, it is impossible, by definition, to object that it might bring about bad outcomes. But then, is not every ethical theory concerned with good outcomes? Does virtue ethics not intend the consequence that humans be virtuous? Do deontologists not intend the consequence that humans obey the moral law? The consequence of an act is always relevant to a system of ethics. Even the utilitarian R.M. Hare considered consequentialism trivially true of all ethical systems; what most differentiates ethical systems is which facts they consider morally relevant and how they interpret those facts.
(A virtue ethicist or a deontologist might object to my reasoning and say that in many, or even all cases, the action itself is evil or good; not its consequences. That might well be true. Personally, I even agree with it. But I am trying to beat the utilitarian at his own game, not convince him of an entire metaphysics.)
Let’s take an example to try and prove my point. Say you shoot a gun and, consequently, no one is hurt; now say you shoot a gun and, consequently, an innocent man is hurt. No action has a complete description without its consequences. Not all the morally relevant facts are there unless we include at least some of the consequences. One always runs into consequences, both those intended and those foreseen.
So consequentialism, aiming for good outcomes, is trivially true. But is utilitarianism trivially true? Is it not also true by definition? Obviously not, and Russell’s conflation of the two is worse than sophomoric. Different formulations of utilitarianism can, of course, be objected to on consequential grounds. For example, we can object to any formulation of utilitarianism that says an innocent person may be murdered if (for example) this murder quells a riot and so spares many other innocent lives. How can we do this on consequential grounds? Simple: to murder an innocent would be a violation of God’s law, and violating God’s law is a consequence we ought always to avoid.
The only way to make utilitarianism trivial (though not true) is to make it a descriptive model rather than an ethical theory. In principle, we can describe human behaviour by a utility function. We could then take a utility function to describe how an ethical human ought to act (e.g. violating God’s law would carry infinite negative utility, feeding the poor gets you +1 good boy points, things like that). But this is no longer a theory of how one ought to act; it takes a theory of ethical action as given and then describes how a human following that theory would behave.
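To make the point concrete, here is a toy sketch of such a descriptive utility function. The action names and scores are entirely my own illustrative assumptions; the point is that the function presupposes an ethical theory rather than supplying one.

```python
# A toy "descriptive" utility function: it does not tell you what is
# ethical; it presupposes an ethical theory and merely scores actions
# according to it. All names and numbers below are illustrative.

def utility(action: str) -> float:
    scores = {
        "feed_the_poor": 1.0,              # +1 good boy points
        "ignore_the_poor": 0.0,
        "murder_innocent": float("-inf"),  # violating God's law: never worth it
    }
    return scores.get(action, 0.0)

def best_action(actions: list[str]) -> str:
    # An agent "following" the theory just maximizes the function.
    return max(actions, key=utility)

print(best_action(["murder_innocent", "ignore_the_poor", "feed_the_poor"]))
# feed_the_poor
```

The ethics here lives entirely in the score table; the maximization step is morally empty, which is exactly why this is a description of ethical behaviour and not a theory of it.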
The above, of course, is not what Russell takes utilitarianism to mean. He takes utilitarianism to require direct interpersonal utility comparisons. What is utility? How could we possibly measure it on the same scale to make comparisons? Well, we could measure “dopamine levels or the degree of electrical excitation of neurons related to pleasure and pain, happiness and misery. If Alice’s and Bob’s chemical and neural response to lollipop are pretty much identical, as well as their behavioral responses… it seems odd to insist, nevertheless, their subjective degrees of enjoyment differ by a factor of a thousand or a million.”
We are so fucked. AI is 100% going to entomb us in a titanium sarcophagus, stimulating every pleasure center in the brain, genitals stimulated at appropriate intervals for maximum satisfaction, VR headset nailed into our skull with Avengers Endgame playing on repeat, forever and ever.
Russell’s idea of utility comparisons is especially off-putting as he seems to endorse John Harsanyi’s version of utilitarianism called “preference autonomy.” To quote Harsanyi: “In deciding what is good and what is bad for a given individual, the ultimate criterion can only be his own wants and preferences.” Clinically retarded, but we must press onward.
If Russell’s utilitarianism is decided by what people prefer, as the Harsanyi quote earlier stated, then it is hardly appropriate to measure utility by dopamine levels. Carthusian monks obviously preferred to become monks, given their choices, and surely they could have found other lifestyles that yield more dopamine. Forcibly strapping a Carthusian monk into the UltraAutoMasturbator6000 might increase his dopamine levels, but if we measure utility by preferences, it would clearly reduce his utility.
Russell does not seem to notice the above problem or how he contradicts himself. But he does notice a separate problem, namely Derek Parfit’s Repugnant Conclusion: the idea that for “any situation with N people, there is a preferable situation with 2N people who are ever so slightly less happy” (btw, N stands for number, you racist). This is straightforwardly true for utilitarians like Parfit and Russell; unfortunately, it is also a slippery slope. It eventually leads to the maximum possible population, all of whom have lives barely worth living.
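The arithmetic of the slope is easy to check: under total utilitarianism, doubling the population while shaving off only a little happiness always raises total utility, so iterating the step drives average happiness toward zero. A quick sketch (the starting numbers and the 10% "slight" drop are my own assumptions):

```python
def total_utility(people: int, happiness: float) -> float:
    # Total utilitarianism: just sum happiness over everyone.
    return people * happiness

n, h = 1_000, 100.0  # some starting population and per-person happiness
for _ in range(10):
    n2, h2 = 2 * n, 0.9 * h  # twice the people, "ever so slightly" less happy
    assert total_utility(n2, h2) > total_utility(n, h)  # total always goes up
    n, h = n2, h2

print(n, round(h, 1))  # 1024x the people, each at ~35% of the original happiness
```

Each doubling multiplies total utility by 2 × 0.9 = 1.8, so the "preferable" situation keeps winning no matter how many times you repeat the move; that is the slope down to the Repugnant Conclusion.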
Russell, and to his credit he admits it, has absolutely no clue what the hell to do with this. His only response is “I suspect we are missing some fundamental axioms.” LOL, LMAO.
The Machines are going to put us in breeding pits whilst attached to the MegaUltraTurboAutoMasturbator600000000000000. And there is nothing we can do about it. It is even funnier once you realize this is how he ends his defense of utilitarianism.
Oh, do I recommend the book? Yeah. Sure. Good introduction to AI for the layman. Check it out, definitely.