The following issues reappear throughout this thesis; I attempt to clarify my position on each of them here.
Human-machine dichotomy
When working with an improvisation machine, it is difficult to escape this dichotomy, yet the work paradoxically becomes an opportunity to escape it. As a starting point, I acknowledge the dichotomy and borrow from Hayles (2002), using "flesh" as a designation for what is human and "metal" for what is machine. From this point, the perspectives of agency, the extended mind, and external embodiment outlined below attempt to blur the boundary between human and machine; they are valuable in helping to move away from the dichotomy, though doing so remains a challenge.
The extended mind, external embodiment
I defer to Clark and Chalmers' conceptualization of the extended mind, an active externalism in which the environment plays an active part in the cognitive process. For example, a notebook can be considered an extension of the mind because it stores memories that would otherwise be forgotten and plays an integral part in one's daily functioning (Clark & Chalmers, 1998, pp. 12–16).
In the context of machine improvisation, AIYA becomes a metaphorical notebook, a memory extension of my playing, because of the way it has been programmed. In the literal sense, it is programmed to actively record my playing and to recall snippets of it based on triggers defined by programmed interaction rules. For my own performances with the machine, one role of the extended mind in machine improvisation is to demonstrate an aesthetic connection to the material (e.g., sonic, visual) produced by the human performer. At the same time, the machine improviser is also an external embodiment of myself as a performer and becomes a form of expressive control interface in the performance. Examples include the following (a schematic code sketch follows the list):
- amplification of my sound (flute, voice)
- modification of my sound (e.g., reverb, delay, distortion, etc.)
- projected visuals influenced by the movement of my body (e.g., real-time avatar animation, or algorithmic visuals that take position and rotation values from the body skeleton as parameters)
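To make the record-and-recall mechanism and the body-to-visual mapping concrete, below is a minimal sketch in Python. Every name, threshold, and rule in it is an illustrative assumption introduced for exposition; it does not reproduce AIYA's actual implementation.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Frame:
    """One analysis frame of the human performer's input (hypothetical)."""
    loudness: float  # e.g., RMS amplitude normalized to 0.0-1.0
    pitch: float     # e.g., estimated fundamental frequency in Hz


class MemoryExtension:
    """A rolling 'notebook' that records the performer and recalls snippets."""

    def __init__(self, capacity: int = 1000):
        self.buffer = deque(maxlen=capacity)  # oldest frames fall away

    def maybe_recall(self, frame: Frame, snippet_len: int = 50):
        """Illustrative interaction rule: a sudden loud event triggers the
        recall of a short snippet recorded earlier in the performance."""
        self.buffer.append(frame)
        if frame.loudness > 0.8 and len(self.buffer) > snippet_len:
            start = random.randrange(len(self.buffer) - snippet_len)
            return list(self.buffer)[start:start + snippet_len]
        return None


def visual_params(joint_position, joint_rotation_deg):
    """Illustrative mapping from body-skeleton values (position components
    in [-1, 1], rotation in degrees) to parameters of a projection."""
    x, y, z = joint_position
    return {
        "hue": (x + 1.0) / 2.0,         # horizontal position drives color
        "brightness": (y + 1.0) / 2.0,  # height drives brightness
        "zoom": 1.0 + abs(joint_rotation_deg) / 180.0,  # rotation drives zoom
    }
```

In a real system, the frames would come from live audio and motion analysis, and a recalled snippet would be resynthesized as sound; the point here is only the structural relation between recording, rule-triggered recall, and body-driven visual parameters.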
Machine agency and perceived agency
The definition of agency itself is not universally agreed upon, as shown by Franklin and Graesser's literature review of agent definitions (1996). Definitions range from "anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors" (Russell & Norvig, 2020, p. 33) to those requiring the properties of autonomy (operating without direct human intervention), social ability (interacting with other agents, possibly humans), reactivity (perceiving and responding in a "timely fashion" to environmental changes), and pro-activeness (goal-oriented behavior) (Wooldridge & Jennings, 1995).
For the purposes of my proposed research, I focus on examining machine agency at the level of social ability in a performance setting: the machine's ability to communicate material that contributes to the performance's aesthetics, to communicate its analysis of the performance scenario, and to communicate anticipations of its future decisions to other agents (e.g., humans). I consider the evaluation of the machine's agential capacities to be an almost entirely subjective process that depends on the perspectives of the human(s) performing with the machine. I refer to this critical, subjective view of the machine's agential capabilities as perceived agency.
In my definition of perceived agency, the human interacts with the machine as if it has the properties of autonomy, social ability, reactivity, and pro-activeness, even if, outside of the interaction with the system, the human does not know for certain which of these properties the machine actually possesses. What matters is the interaction within the system and how the properties of agency are regarded within it. For example, I might improvise with a machine I know I can turn off, but as long as I keep it running, I am reacting to its multifaceted sonic and visual output as if it possessed the qualities of an agent. It is, in many respects, a shift in the mindset of the human performer that assigns the property of agency to the machine.
This concept is further reinforced by the research of Nass and Moon, who assert that humans react immediately and unconsciously by treating artificial entities as real humans, due to the phenomenon of "Ethopoeia," a "mindless" human behavior of "overlearning" that arises from deeply ingrained habits and behaviors (Nass & Moon, 2000, pp. 87–88). Nass and Moon tested human-computer behavior on the measures of politeness, reciprocity [1], and reciprocal self-disclosure [2], and found that the humans treated the machines in ways analogous to the behavior expected of human-to-human interaction. I argue that this automatic, unconscious treatment of machines as humans is the first fundamental step toward humans regarding machines as agential entities and perceiving them as having agency.
Why is it important to perceive the machine as having agency in a performance? My hypothesis, as “tested” throughout the series of works I created for my improvisation machine, is that regarding the machine as an agential being allows the human performer to more easily recreate the interactions and behaviors that take place in human-to-human improvisation. This potentially allows the human-machine improvisations to benefit from the centuries-long performance practice of human-to-human improvisation — just in a translated form.
Perhaps even more interesting is to examine where these translations fail, and where new, hybrid relations between human and machine, ones that do not follow the phenomenon of Ethopoeia, take over. Are there behavioral patterns unique to the human-machine relationship?
Distinguishing between improvisation partner and tool
I make a distinction between an improvisation machine (an objective description of the software and hardware involved in the machine's expression) and an improvisation partner. An improvisation machine can be regarded as an improvisation partner through the subjective viewpoint of the human improviser. The partner is no longer seen as a tool but as an entity that mimics, to some degree, the kinds of behaviors and interpersonal relationships found in human-to-human improvisation.
Lewis's description of the process of transduction exemplifies what I consider an important attribute of an improvisation partner: the machine relates itself to the emotional state of the human partner, much as human improvisers create shared mental models consisting of shared knowledge (e.g., knowledge of the intentions and preferences of the other players) (Lewis, 2003). Lösel describes the process of shared mental models as follows:
First is observation, the point at which an improviser realizes that his mental model diverges from others’. Second is repair, which refers to all attempts to reconcile divergences. Repairs can either be attempted in order for an improviser to align himself with another improviser's mental model or in order for an improviser to align another improviser with his own mental model. The third, and final, step of cognitive convergence is acceptance, during which cognitive consensus may occur. It is also possible that consensus may be rejected or that an improviser will achieve perceived cognitive consensus, where they think that they have achieved consensus, but actually have not. (Lösel, 2018, pp. 190–191)
Lösel proceeds to acknowledge how difficult it is for a computer to automatically detect divergences in the mental model, since doing so requires the machine to reconstruct the other players' perspectives (a theory of mind). Thus, it could be said, at least for the machine I am working with, that the machine improviser only ever achieves "perceived cognitive consensus" throughout the improvisation. This might make it appear that the machine neither possesses agency nor functions as an independent improvisation partner within the performance context.

However, there is another aspect at play: the performer's memory of and active attention to how the machine is programmed. I argue that the mind of the human performer is almost exclusively occupied with the demands of "observation," "repair," and "acceptance" outlined by Lösel, making it too "resource-intensive" to keep full memory of and attention to the dozens of interaction rules that have been programmed into the machine. This causes the human performer to mentally group all of the machine's mechanisms and rules into a single identity, an external improvisation partner; through this perspective, the performer interacts with it as if it were a musical agent negotiating the framework described by Lösel. Thus, in this context, the machine achieves perceived agency.
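To illustrate why this mental grouping occurs, consider the following sketch of how many small interaction rules aggregate into a single perceived agent. The rules shown are hypothetical stand-ins invented for this example, not AIYA's actual rule set.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """One programmed interaction rule: a condition on the current
    performance state and the machine response it triggers."""
    name: str
    condition: Callable[[dict], bool]
    action: str


# A real system would hold dozens of such rules; two stand-ins suffice here.
rules = [
    Rule("recall_on_silence", lambda s: s["loudness"] < 0.05,
         "replay a snippet recorded earlier"),
    Rule("thicken_on_density", lambda s: s["note_density"] > 8,
         "layer additional texture"),
]


def machine_step(state: dict) -> list[str]:
    """The performer hears only the aggregate output of whichever rules fire,
    which encourages treating the machine as one improvising partner rather
    than as a catalog of individual mechanisms."""
    return [rule.action for rule in rules if rule.condition(state)]


# Example: a quiet, sparse moment fires only the recall rule.
print(machine_step({"loudness": 0.02, "note_density": 1}))
# -> ['replay a snippet recorded earlier']
```

Keeping every such condition in working memory during a performance is precisely the "resource-intensive" task that, I argue, the performer abandons in favor of a single agential identity.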
[1] The socially trained principle that one should return help, favors, and benefits to those who have previously provided them.
[2] The phenomenon whereby humans divulge intimate information about themselves when somebody else, even a stranger, first does the same.