Russell on AI in technocracy and surveillance

In chapter 4 of his book “Human Compatible”, Stuart Russell discusses various harmful applications of Artificial Intelligence (AI) technology, including the ways AI increasingly makes surveillance more effective, to the point where the “Stasi will look like amateurs by comparison.” We should point out that surveillance is not just “observing,” but is a method for controlling the behavior of people. Okay, so what’s new? Well, AI technology introduces new ways of controlling people. Because we live such a big part of our lives online nowadays (including our consumption of facts, news, our communication with peers etc.), “AI systems can track an individual’s online reading habits, preferences, and likely state of knowledge, they can tailor specific messages to maximize impact on that individual while minimizing the risk that the information will be disbelieved.” Forms of coercion can be made more effective by using personalized strategies (e.g. Russell mentions blackmail bots that use your online profile). A more subtle way to alter people’s behavior is to “modify their information environment so that they believe different things and make different decisions” (my emphasis). In addition to more accurate personalized profiles, AI systems can constantly adjust themselves to be more effective, based on feedback mechanisms such as mouse clicks or time spent reading.

After discussing new phenomena such as deepfakes, Russell makes a bit of a leap and contemplates the various ways in which governments (on the less friendly side of the spectrum) may use AI-supported surveillance to directly control their citizens by implementing “rewards and punishments based on behavior”. In a European context this section may sound slightly unrealistic, but it’s not that unrealistic if you consider a certain… Asian country using surveillance to experiment with evaluating civilians using point systems. If you are not worried about governments in particular, you may think of similar but less severe and less obvious strategies used by big internet corporations, or the upcoming era of health-related apps. One example I’m thinking of is insurance companies giving you a lower health insurance premium when you use a smartwatch to show you are moving enough each day (certainly important, but a very specific, limited, one-sided concept of health).

In any case, what these examples share is that “such a system treats people as reinforcement learning algorithms, training them to optimize the objective set by the state” (my emphasis, and again: you could perhaps substitute “state” by your favorite big evil corporation). So not only does the technology build up profiles based on behavioral feedback, the humans themselves, Russell suggests, will be evaluated with a scoring function, as if they themselves work like these algorithms.
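To make the phrase “treats people as reinforcement learning algorithms” a bit more concrete, here is a minimal toy sketch of my own (it is not from Russell’s book). It assumes a hypothetical scoring table `STATE_SCORES` and a hypothetical “citizen” who tries each action once and then simply repeats whatever has been rewarded most. The action names, the scores, and the learning rate are all made up for illustration.

```python
from collections import Counter

# Hypothetical scoring table set by the state (illustrative values only).
STATE_SCORES = {
    "visit_ailing_friend": 5.0,
    "attend_patriotic_rally": 8.0,
    "criticize_policy": -10.0,
}


def train_citizen(scores, episodes=50, learning_rate=0.5):
    """Toy reward-driven learner: explore each action once, then exploit."""
    actions = list(scores)
    value = {action: 0.0 for action in actions}  # learned value estimates
    history = []
    for episode in range(episodes):
        if episode < len(actions):
            action = actions[episode]  # try every action exactly once
        else:
            action = max(actions, key=value.get)  # repeat the best-scoring one
        reward = scores[action]  # feedback comes only from the state's score
        value[action] += learning_rate * (reward - value[action])
        history.append(action)
    return value, Counter(history)


if __name__ == "__main__":
    value, counts = train_citizen(STATE_SCORES)
    for action in STATE_SCORES:
        print(f"{action}: chosen {counts[action]} times, value {value[action]:.2f}")
```

In this sketch the learner ends up choosing whatever the table rewards most and abandons everything else, including the hospital visit, which only ever mattered to it as a number.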

What I appreciate about the upcoming argument is that Russell specifically targets those states/companies with a “top-down, engineering mind-set”, which is at first sight quite reasonable and which I suspect to be pervasive. It can also be the mindset of techno-optimists with good intentions. But then again, however extreme the surveillance, it’s always for “the greater good”, so good intentions only go so far. Technocracy comes into the picture when we ask exactly who decides what is good, and how we measure that.

Russell reconstructs this engineering-style reasoning as follows, with respect to governments:

  • “it would be better if everyone behaved well, had a patriotic attitude, and contributed to the progress of the country”
  • “technology enables measurement of individual behavior, attitudes, and contributions”
  • “therefore, everyone will be better off if we set up a technology-based system of monitoring and control based on rewards and punishments.”

Or you can come up with some alternative version, like: it is healthier to not drink alcohol and better to have less alcohol-related violence; we have studied and understood the effects of alcohol; therefore it is better if we monitor everyone, punish those who drink alcohol, and reward superfood-munching yoga hipsters (is that a thing?).

Then Russell provides three arguments against this top-down engineering technocracy mindset:

“First, it ignores the psychic cost of living under a system of intrusive monitoring and coercion; outward harmony masking inner misery is hardly an ideal state. Every act of kindness ceases to be an act of kindness and becomes instead an act of personal score maximization and is perceived as such by the recipient. Or worse, the very concept of a voluntary act of kindness gradually becomes just a fading memory of something people used to do. Visiting an ailing friend in hospital will, under such a system, have no more moral significance and emotional value than stopping at a red light.

To stick with Russell’s example: I think the issue here is not that there would no longer be acts of kindness; but I’m wondering how one would recognize them as such in this (hypothetical?) state. Under this regime, the only way to “prove” to its recipient that something is in fact an act of kindness under the all-seeing eye of a scoring metric would be to show 1) that you understand how your actions are evaluated and 2) that you then act against the maxim of score maximization. It would not be sufficient to be altruistic and selfless; a kind act would have to be self-destructive (or one might say, extremely altruistic). Perhaps only then would the receiver see rationally, without relying on empathy and good faith, that some intrinsic moral and human value is the best explanation for the shown behavior, rather than score optimization. That’s of course a hypothetical reflection under the assumption of all-pervasive surveillance; otherwise another option would be to communicate covertly.

The paradox of this extreme example is that by trying to optimize desirable human behavior (e.g. visiting that ailing friend) and by quantifying the human values that they promote (e.g. kindness), you lose the quality of what is sought after, similar to the capitalist perversion of friendship or love when they become part of an economy of (monetary) exchange. Friendship or love cannot be part of a contract, because that would imply you can demand some utility from the other and enforce this demand. (I am not denying that friendships and love relationships can have and in fact do have utility. I am rather saying that only the cynical and the sociopathic would argue they are about utility and, as a consequence, can be quantified. But hey, perhaps I’m too romantic.)

In other words, the deeper issue that Russell addresses with this extreme example is that the true objective would not be captured by any explicitly stated objective. This is a fundamental problem of what Russell calls the “standard model” of AI, where “intelligent” machines are optimizing a “purpose put in the machine”:

Second, the scheme falls victim to the same failure mode as the standard model of AI, in that it assumes that the stated objective is in fact the true, underlying objective. Inevitably, Goodhart’s law will take over, whereby individuals optimize the official measure of outward behavior, just as universities have learned to optimize the “objective” measures of “quality” used by university ranking systems instead of improving their real (but unmeasured) quality.

I was unfamiliar with Goodhart’s law, but Russell’s explanation is exceptionally clear (and the university example is awfully accurate). Marilyn Strathern’s paraphrase is elucidating: “When a measure becomes a target, it ceases to be a good measure.” Russell points out that this problem applies when humans design intelligent machines and algorithms to minimize some loss function (the standard model of AI), and he provides many examples where AI optimizing towards some objective has adverse effects. This problem is just as bad when we give humans the algorithmic treatment, so to speak. In economics and game theory the issue is people gaming the measure, which may in fact be to the detriment of what you are trying to promote.
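Goodhart’s law is easy to reproduce in a toy model. The sketch below is my own illustration, not an example from the book, and every number in it is an assumption: someone has a fixed effort budget to split between real work and gaming the metric, the official measure cannot tell the two apart, and gaming is assumed to yield three times as many measured points per unit of effort (the hypothetical `gaming_gain`).

```python
def true_quality(real_effort):
    """The underlying objective: only real effort counts."""
    return float(real_effort)


def measured_score(real_effort, gaming_effort, gaming_gain=3.0):
    """The official measure: it also rewards gaming, and more cheaply."""
    return real_effort + gaming_gain * gaming_effort


def allocations(budget):
    """All ways to split an integer effort budget into (real, gaming)."""
    return [(real, budget - real) for real in range(budget + 1)]


def optimize(budget, objective):
    """Pick the split that maximizes the given objective."""
    return max(allocations(budget), key=objective)


if __name__ == "__main__":
    budget = 10
    honest = optimize(budget, lambda split: true_quality(split[0]))
    gamed = optimize(budget, lambda split: measured_score(*split))
    for label, split in (("optimizing true quality", honest),
                         ("optimizing the measure", gamed)):
        print(f"{label}: real={split[0]}, gaming={split[1]}, "
              f"measured={measured_score(*split)}, true={true_quality(split[0])}")
```

Under these assumptions, whoever optimizes the measure puts the entire budget into gaming: the measured score triples while the true quality drops to zero. That is the university-ranking example in miniature.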

Finally, the imposition of a uniform measure of behavioral virtue misses the point that a successful society may comprise a wide variety of individuals, each contributing in their own way.”

The latter point is relatively self-evident and adds to the earlier reasoning: can you capture the quality you are trying to promote by subjugating people to a relatively uniform quantitative measure?

Russell’s book in particular deals with the problem of specifying objectives in AI technology designed to optimize towards a given goal. But you could perhaps generalize his argument to technocracy in general: technocracy firstly assumes you have an (engineer) class of people that know what is best, which is both optimistic and paternalistic, but secondly, even when they do know what is best, it is quite a fundamental issue whether you can also translate that into procedures that truly achieve the intended result.

N.B. I’m reading this book in EPUB format so I can’t quite add page references. All citations can be found in Chapter 4, Surveillance, Persuasion, and Control, section “Controlling your behavior”.

Victor on Saturday, May 9, 2020:

Interesting article Edwin. It’s been a while since I’ve done any reading of (primary) materials in the field of jurisprudence (which I have done too little of anyway to speak with any authority on the subject), but your post made me think about some central questions in relation to “rule of law” which could also be raised in respect of “rule by AI”.

Why do we obey the law? Is there necessarily a link between law and morality? Are laws just orders backed by threats (in the case of regulation of behaviour by AI, probably yes)? Would laws developed and enforced by AI satisfy Hart’s “rule of recognition”?

I would be interested to hear your views on how legitimacy could be conferred on law developed and enforced through AI. Some first thoughts I had are that such AI-law is based on human behaviour as its “input” in the same way that law derives its legitimacy from a democratic legislative process (at least in our society, for now).

P.S. I would like to compliment you on your clear and entertaining writing style.

Edwin on Saturday, May 9, 2020
In reply to Victor

I think that a link between law and morality is desirable but not necessarily the case. If I understand that “rule of recognition” correctly, it amounts to saying something like: any rule/law is valid within a legal system if it is recognized by the (legal) community as law (thereby heavily relying on common social practice and convention?). I’m probably missing some nuance there (or it’s just completely off), so please correct me if I’m wrong, but then the moral value of a law would depend on the moral quality of the legal community recognizing it as a law. In that sense, I don’t see why “rule by AI” would be incompatible with the “rule of recognition”, especially if we assume that what those AI technologies do is optimize for the objectives that we provide to them. The question then is rather whether that objective is legally and/or morally acceptable.

But in any case, when it comes to providing those objectives, I don’t think looking at the legal side of things is sufficient. First of all, especially in the case of the many technological innovations in AI, law is lagging behind moral issues *, but secondly, the morally relevant question I tried to point out in the post is whether manipulating human behavior by optimizing for any explicit objective measure using AI is a desirable approach, regardless of whether the “purpose put into the machine” is legal or illegal.

I didn’t have the scenario in mind where AI is used to make law by taking human behavior as input. I’m not aware of people proposing this, but of course it’s an interesting thought experiment to consider. You could argue that those laws would encode the factual mores of a population - ipso facto it should be recognized - but I would immediately counter that both morality and law concern what you ought to do, not what people in fact do. E.g. if the majority suddenly starts knifing people, will this AI allow knifing by law, simply because most people seem to think it’s okay? You could argue this rule could perhaps pass the “rule of recognition”, but this again shows the potential divergence between legal legitimacy and morality (assuming that you surely would not call “knifing people” morally acceptable).

I indeed had the view in mind where AI becomes a tool for enforcing behavior, rather than the issue of legitimacy. Uses of AI may not be illegal and may be considered legitimate (people got used to it?); but its use for manipulation is morally disconcerting even when there may not yet be relevant laws to address that concern in all scenarios.

* To use a “stokpaardje” (Dutch for a hobbyhorse, i.e. a favorite example), think about how self-driving cars challenge the current legal framework.

Thanks for the compliment!