Definition of Agency: SEPic Fail

So I was trying to figure out what agency is. Let’s try the usual starting point first:


agency is the capacity of an agent (a person or other entity, human or any living being in general, or soul-consciousness in religion) to act in a world

What does it mean “to act in a world”? Wikipedia, again:

Basic action theory typically describes action as behavior caused by an agent in a particular situation.

…And we are back to “what is agency?” (and what is behavior).

OK, let’s try SEP:

Donald Davidson [1980, essay 3] asserted that an action, in some basic sense, is something an agent does that was ‘intentional under some description,’ and many other philosophers have agreed with him that there is a conceptual tie between genuine action, on the one hand, and intention, on the other.

…So action is, again, something that an agent does. (SEP does not have a separate entry for agency.) Under Causation and Manipulability it admits to the potential circularity:

von Wright responds as follows:

The connection between an action and its result is intrinsic, logical and not causal (extrinsic). If the result does not materialize, the action simply has not been performed. The result is an essential “part” of the action. It is a bad mistake to think of the act(ion) itself as a cause of its result. (pp. 67–8)

Here we see a very explicit attempt to rebut the charge that an account of causation based on agency is circular by contending that the relation between an action (or a human manipulation) and its result is not an ordinary causal relation.

Next section is A More Recent Version of an Agency Theory (of causation). It muses about the difference between “free agency” and causation in the usual philosophical ways, with no clear definition ever given. Here is a typical sentence:

The idea is thus that the agent probability of B conditional on A is the probability that B would have conditional on the assumption that A has a special sort of status or history—in particular, on the assumption that A is realized by a free act.

Further about this “free act”:

(What “free act” might mean in this context will be explored below, but I take it that what is intended—as opposed to what Price and Menzies actually say—is that the manipulation of X should satisfy the conditions we would associate with an ideal experiment designed to determine whether X causes Y—thus, for example, the experimenter should manipulate the position of the barometer dial in a way that is independent of the atmospheric pressure Z, perhaps by setting its value after consulting the output of some randomizing device.)

So a free act is an act of an “experimenter”, who is by definition an agent. I don’t know how to read this section charitably. It all seems very circular to me.

OK, let’s try the next section: Causation and Free Action:

It seems clear, however, that whether (as soft determinists would have it) a free action is understood as an action that is uncoerced or unconstrained or due to voluntary choices of the agent, or whether, as libertarians would have it, a free action is an action that is uncaused or not deterministically caused, the persistence of a correlation between A and B when A is realized as a “free act” is not sufficient for A to cause B.

OK, so what’s a “free act”? Alas, there is no definition anywhere in there, not that I can find. There is a related notion of “intervention”, which is somehow different from “free action”:

The simplest sort of intervention in which some variable Xi is set to some particular value xi amounts, in Pearl’s words, to “lifting Xi from the influence of the old functional mechanism Xi = Fi(Pai, Ui) and placing it under the influence of a new mechanism that sets the value xi while keeping all other mechanisms undisturbed.” (Pearl, 2000, p. 70; I have altered the notation slightly). In other words, the intervention disrupts completely the relationship between Xi and its parents so that the value of Xi is determined entirely by the intervention. Furthermore, the intervention is surgical in the sense that no other causal relationships in the system are changed. Formally, this amounts to replacing the equation governing Xi with a new equation Xi = xi, substituting for this new value of Xi in all the equations in which Xi occurs but leaving the other equations themselves unaltered. Pearl’s assumption is that the other variables that change in value under this intervention will do so only if they are effects of Xi.
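To make the “replace one equation, leave the rest alone” recipe concrete, here is a minimal sketch in Python. It is not from SEP or from Pearl: the three-variable barometer model, the helper names (sample, do, storm_rate_given_x) and the 50/50 pressure distribution are all my own illustrative assumptions.

```python
import random

def sample(mechanisms, order, rng):
    """Evaluate every variable's mechanism once, parents before children."""
    values = {}
    for name in order:
        values[name] = mechanisms[name](values, rng)
    return values

def do(mechanisms, name, value):
    """Copy the model, replacing only the equation for `name` with name = value."""
    replaced = dict(mechanisms)
    replaced[name] = lambda values, rng: value
    return replaced

# Hypothetical toy model: Z = atmospheric pressure, X = barometer dial, Y = storm.
mechanisms = {
    "Z": lambda v, rng: rng.choice(["low", "high"]),
    "X": lambda v, rng: v["Z"],           # the dial simply tracks the pressure
    "Y": lambda v, rng: v["Z"] == "low",  # a storm happens iff pressure is low
}
order = ["Z", "X", "Y"]

def storm_rate_given_x(model, x_value, n=10_000, seed=0):
    """Fraction of sampled worlds with a storm, among those where X == x_value."""
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(n):
        v = sample(model, order, rng)
        if v["X"] == x_value:
            total += 1
            hits += v["Y"]
    return hits / total if total else float("nan")

# Observation: the dial reading is perfectly correlated with storms.
print("observed  P(storm | X=low): ", storm_rate_given_x(mechanisms, "low"))
print("observed  P(storm | X=high):", storm_rate_given_x(mechanisms, "high"))

# Intervention: pin the dial to "low"; the equations for Z and Y are untouched.
forced_low = do(mechanisms, "X", "low")
print("after do(X=low), P(storm):  ", storm_rate_given_x(forced_low, "low"))
```

In this toy model the correlation between the dial and the storm is perfect under observation and disappears once the dial is pinned, because the storm is not an effect of the dial. Note that the sketch still needs someone or something to call do(), which is exactly the complaint that follows.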

Bummer, so an intervention is also defined in terms of some external “lifter” who “disrupts completely the relationship” and “replaces equations”. And that external lifter is presumably an agent. So we are, again, back to square one: what is an agent? Appropriately, the next section is called Is Circularity a Problem?:

Suppose that we agree that any plausible version of a manipulability theory must make use of the notion of an intervention and that this must be characterized in causal terms. Does this sort of “circularity” make any such theory trivial and unilluminating?

No, of course not, says the article, but how it justifies this conclusion is not at all clear to me. It muses about non-reductionism and how (emphasis mine)

 Whether one regards the verdicts about these cases [something about a “failure of a gardener to water his plants”] reached by causal process accounts or by interventionist accounts as more defensible, the very fact that the accounts lead to inconsistent judgments shows that interventionist approaches are not trivial or vacuous, despite their “circular”, non-reductive character.

I don’t understand how they came to this conclusion, but fine, let’s see if there is anything at all that I can salvage from this entry. Another section talks about Interventions That Do Not Involve Human Action (emphasis mine):

a purely “natural” process involving no animate beings at all can qualify as an intervention as long as it has the right sort of causal history—indeed, this sort of possibility is often described by scientists as a natural experiment. Moreover, even when manipulations are carried out by human beings, it is the causal features of those manipulations and not the fact that they are carried out by human beings or are free or are attended by a special experience of agency that matters for recognizing and characterizing causal relationships. Thus, by giving up any attempt at reduction and characterizing the notion of an intervention in causal terms, an “interventionist” approach of the sort described under §§5 and 6 avoids the second classical problem besetting manipulability theories—that of anthropocentrism and commitment to a privileged status for human action.

So, it’s the “causal features” that matter and “not the fact that they are carried out by human beings”, yet they still have to be caused by human beings at some point? Yep, back to the missing definition of agency, yet again.

So, my attempt to understand the intuitively obvious concept of agency using the accessible philosophical resources was a complete flop.


Conditions for the Skynet Takeover, or Hostile Intelligence Explosion


Eliezer Yudkowsky has posted a report titled Intelligence Explosion Microeconomics, where he outlines his approach to the need for and directions of research into recursive self-improvement of machine intelligence. Below are my impressions of it.


Basically, the Skynet-type outcome, with humans marginalized and possibly extinct, is almost certain to happen if all of the following conditions come true:


  1. A general enough intelligence, likely some computer software, but not necessarily so, gains the ability to make a smarter version of itself, which can then make an even faster version of itself, and so on, leading to the Technological Singularity.
  2. The ethics guiding this super-intelligence will not be anything like the human ethics (Nick Bostrom’s fancily named Orthogonality Thesis), so it won’t care for human values and welfare one bit (yes, it’s a pun).
  3. This superintelligence could be interested in acquiring the same resources people find useful (Nick Bostrom’s Instrumental Convergence Thesis). Another fancy name.


Now, what is this “general enough intelligence”? Yudkowsky’s definition is that it is an optimizer in various unrelated domains, like humans are, not just something narrow, like a pocket calculator, a Roomba vacuum or a chess program.


So what happens if someone very smart but not very ethical finds something of yours very useful? Yeah, you don’t have a chance in hell. Just look at the animals on Earth: they are at the rather questionable mercy of humans. So, there you go, Skynet.


Now, this is not a new scenario, by any stretch, though it is nice to have the Skynet takeover conditions listed explicitly. How to avoid this fate? Well, you can imagine a situation where one of the three conditions does not hold or can be evaded.


For example, Asimov dealt with condition 2 by introducing (and occasionally demolishing) his Three Laws of Robotics, which are supposed to make machine intelligence safe for humans. Yudkowsky’s answer to this is the Complexity of Value: if you try to define morality with a few simple rules, the literal genie, trying to obey them most efficiently, will do something completely unexpected and likely awful. In other words, morality is way too complicated to formalize, even for a group of people who tend to agree on what’s moral and what’s not. The benevolent genie is easy to imagine, but hard to design.


Well, maybe we don’t need to worry, and the Orthogonality Thesis (condition 2) is simply wrong? That’s the position taken by David Pearce, who promotes the Hedonistic Imperative, the idea that “genetic engineering and nanotechnology will abolish suffering in all sentient life”. In essence, even if it appears that intelligence is independent of morality, a superintelligence is necessarily moral. This description might be a bit cartoonish, but probably not too far from his actual position. Yudkowsky does not have much patience for this, calling it wishful thinking and a belief in the just universe.


It might be a happy accident if condition 3 is false. For example, a super-intelligence might invent FTL travel and leave the Galaxy, or invent baby universes and leave the Universe, or decide to miniaturize to Planck scale, or something. But there is no way to know, so there is no reason to count on it.


Condition 1 is probably the easiest of the three to analyze (though still very hard), and that’s the subject of the Intelligence Explosion Microeconomics report. The goal is to figure out if “investment in intelligence” pays enough dividends for a chain reaction, or if the return remains linear or maybe flattens out completely at some point. Different options lead to different dates for the “intelligence explosion to go critical”, and the range varies from 20 years from now to centuries to never. The last option might happen if, somehow, human-level intelligence is the limit of what a self-improving AI can do without human input.
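To get a feel for how different the “chain reaction”, “linear” and “flattens out” options are, here is a toy reinvestment loop in Python. To be clear, this is my own cartoon and not a model taken from the report: the update rule, the exponent alpha, the rate k and the optional cap are all made-up assumptions chosen only to show the qualitative shapes.

```python
def trajectory(alpha, k=0.1, start=1.0, steps=30, cap=None):
    """Capability over time when each step's gain is k * level**alpha.

    alpha = 0  -> constant gain per step (linear growth)
    alpha = 1  -> gain proportional to the current level (exponential growth)
    alpha > 1  -> increasing returns (faster than exponential)
    cap        -> optional hard limit; gains shrink to zero as level nears it
    All of these knobs are hypothetical and for illustration only.
    """
    level = start
    history = [level]
    for _ in range(steps):
        gain = k * level ** alpha
        if cap is not None:
            # Diminishing returns close to the cap.
            gain *= max(0.0, 1.0 - level / cap)
        level += gain
        history.append(level)
    return history

linear = trajectory(alpha=0.0)
exponential = trajectory(alpha=1.0)
chain_reaction = trajectory(alpha=1.2)
flattening = trajectory(alpha=1.0, cap=10.0)

for name, run in [
    ("linear", linear),
    ("exponential", exponential),
    ("chain reaction", chain_reaction),
    ("flattening", flattening),
]:
    print(f"{name:>15}: final level {run[-1]:.2f}")
```

The point of the sketch is only that small changes in how returns scale with the current level lead to wildly different end states over the same number of steps, which is why the shape of that return curve is the thing worth arguing about.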

The majority of the report on the “microeconomics of intelligence explosion” goes through the variety of reasons why it may (or may not) occur sooner or later, and eventually suggests that there are enough examples of exponential increase in power with linear increase in effort to be very worried. The examples are chess programs, Moore’s law, economic output and scientific output (this last one measured in the number of papers produced, admittedly a lousy metric). The report goes on to discuss the effects of the preliminary research on intelligence explosion on the policies of MIRI (Machine Intelligence Research Institute), a part that is not overly interesting from the pure research perspective.

This seems to be the clearest outline yet of MIRI’s priorities and research directions related to Intelligence Explosion. They also apparently do a lot of research related to condition 2, making a potential AGI more humane, but this is not addressed in this report.