Research Topic

motivation

Creating an improvisation machine is nothing new; prominent examples include those by:
> IRCAM (OMax) (late 1990s)
> George Lewis (Voyager) (first version 1986-1988)
> Benjamin Carey (_derivations) (~2016)
> Hoffman and Weinberg (Shimon) (~2011)
> Francois et al (Mimi) (~2007)

...but improvisation machines rarely take non-audio sources as inputs for sound creation. I wanted the improvisation machine to take not only my sounds as inputs, but also my movements (e.g. via the Chaosflöte motion sensor, cameras, or other motion capture systems such as those from OptiTrack).

Making an improvisation machine is of great interest to me because, through its creation, I learn more about my own biases as a musician and about how to dissect the complex interactions and customs at play during an improvisation. Through the creation of this machine, I attempt to translate the metaphorical black box of spontaneous human interaction into concrete algorithms for the machine’s behavior. In doing so, I view my own position within an improvisation (both with humans and with machines) from a new perspective, which gives me some distance from my normal immersion as a performer playing the flute and allows me to examine and strategically arrange the underlying principles of improvisation that drive the performance.

Performing with an improvisation machine also causes me to examine its potential to be a musical agent. During the actual performance, I become immersed in the process of creating the music and do not actively devote my attention to remembering all of the interaction rules I have programmed for the machine. In this experience, the machine is not only triggered by my behavior on stage, but also triggers behavior in my own playing in ways that I perceive as mimicking the spontaneous creation processes at work in human-to-human music improvisation. From this experience emerges the impression of machine agency, which becomes an important facet of investigation in this thesis project.

research question

How can agency (and perceived agency) be given to improvisation machines, and how does working with such a machine impact my performance practice as an improvising musician?

> How is it possible to emancipate the machine from myself during the performance (which is in itself a kind of contradiction, but one which makes the question even more interesting)? How does this agency affect my behavior during the performance?

> How has the conceptualization of the work, technical preparation, and artistic preparation evolved/changed?

> During a performance, what mechanisms shift my view of the improvisation machine between a separate entity (i.e. chamber music partner) and an extension of my body/mind/mindbody (in reference to Hayles and to Clark and Chalmers' concept of the extended mind)? (1998; 2002)

> By reflecting upon the impact of the improvisation machine on improvisation performance practice, what creative potential does performing with an improvising machine yield?

> How are the intricate human-machine relations conveyed to the audience, and how does this affect the audience’s perception of the performance? (authenticity? interest? reflection?)

position

The following key terms related to the research question take on different forms and definitions depending on the source consulted. My position on each of these terms is outlined below:

[human-machine dichotomy]

When working with an improvisation machine, it is paradoxically difficult to escape the dichotomy, yet the work also becomes an opportunity to escape it. As a starting point, I acknowledge the dichotomy and borrow from Hayles, using "flesh" as a designation for what is human and "metal" for what is machine (1998). From this point, the aspects of agency, extended mind, and external embodiment outlined below attempt to blur the boundaries between human and machine and are valuable perspectives in helping to move away from the dichotomy, though this remains a challenge.

[machine agency]

The definition of agency itself is not universally agreed upon, as shown by Franklin and Graesser’s (2015) extensive literature review of agent definitions. Definitions range from “anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors” (Russell & Norvig, 2020, p. 33) to those requiring the properties of autonomy (operating without direct human intervention), social ability (interaction with other agents, possibly humans), reactivity (perceiving and responding in a “timely fashion” to environmental changes), and pro-activeness (goal-oriented behavior) (Wooldridge & Jennings, 1995).

For the purposes of my proposed research, I would like to focus on examining machine agency on the level of social ability in a performance setting: its ability to communicate material that contributes to the performance’s aesthetics, to communicate its analysis of the performance scenario, and to communicate anticipations of future decisions with other agents (e.g. humans).

[the extended mind, external embodiment]

I defer to Clark and Chalmers' conceptualization of the extended mind, an active externalism in which the environment plays an active part in the cognitive process. For example, a notebook could be considered an extension of the mind because it stores memories that the mind might otherwise forget and serves as an integral part of one's daily functioning (1998, pp. 12-16).

In the context of machine improvisation, AIYA becomes a metaphorical notebook, a memory extension of my playing, because of the way it has been programmed. In the literal sense, it is programmed to actively record my playing and to recall snippets of it based on triggers given by the programmed interaction rules. For my own performances with the machine, one of the roles of the extended mind in machine improvisation is to demonstrate an aesthetic connection to the material (e.g. sonic, visual) produced by the human performer.
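
The record-and-recall behavior described above can be illustrated with a minimal sketch. This is not AIYA's actual implementation: the class names (`Snippet`, `Rule`, `SnippetMemory`), the `loudness` feature, and the "answer silence" rule are all hypothetical and chosen only to show how a bounded memory and trigger-based interaction rules might fit together.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List, Optional


@dataclass
class Snippet:
    """A recorded fragment of the performer's playing (illustrative only)."""
    start_s: float         # when the fragment was recorded, in seconds
    samples: List[float]   # stand-in for the audio data
    loudness: float        # hypothetical descriptor in the range 0..1


@dataclass
class Rule:
    """An interaction rule: a trigger condition plus a way to choose a snippet."""
    name: str
    trigger: Callable[[Dict[str, float]], bool]
    select: Callable[[List[Snippet]], Optional[Snippet]]


class SnippetMemory:
    """Bounded memory: the oldest snippets are dropped once capacity is reached."""

    def __init__(self, capacity: int = 64) -> None:
        self._snippets: Deque[Snippet] = deque(maxlen=capacity)

    def record(self, snippet: Snippet) -> None:
        self._snippets.append(snippet)

    def recall(self, features: Dict[str, float], rules: List[Rule]) -> Optional[Snippet]:
        """Return the snippet chosen by the first rule whose trigger fires."""
        stored = list(self._snippets)
        if not stored:
            return None
        for rule in rules:
            if rule.trigger(features):
                return rule.select(stored)
        return None


# Hypothetical rule: when the performer goes quiet, answer with the loudest memory.
answer_silence = Rule(
    name="answer_silence",
    trigger=lambda f: f.get("loudness", 0.0) < 0.05,
    select=lambda stored: max(stored, key=lambda s: s.loudness),
)

memory = SnippetMemory(capacity=3)
for i, level in enumerate([0.2, 0.9, 0.4, 0.6]):
    memory.record(Snippet(start_s=float(i), samples=[], loudness=level))

# The performer is silent, so the rule fires and a stored snippet is recalled.
recalled = memory.recall({"loudness": 0.0}, [answer_silence])
```

In this sketch the memory forgets its oldest material once it is full, which is one simple way to model a notebook with a limited number of pages; a real system could just as well keep everything or weight recent material more heavily.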

At the same time, the machine improviser is also an external embodiment of myself as a performer and becomes a form of expressive control interface in the performance. Examples include:
> amplification of my sound (flute, voice)
> modification of sound (e.g. reverb, delay, distortion, etc.)
> projected visuals influenced by the movement of my body (e.g. real-time avatar animation, algorithmic visuals taking position and rotation values from the body skeleton as parameters)
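
As a rough illustration of the last example, the sketch below maps position and rotation values from a body skeleton to parameters for an algorithmic visual. The joint names, units, and the particular mappings (hand height to brightness, head yaw to hue, hand-to-head distance to spread) are assumptions made for illustration and do not describe the actual system.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Joint:
    position: Tuple[float, float, float]      # x, y, z in metres (assumed units)
    rotation_deg: Tuple[float, float, float]  # Euler angles in degrees (assumed)


def clamp01(value: float) -> float:
    """Limit a value to the range 0..1."""
    return max(0.0, min(1.0, value))


def visual_params(skeleton: Dict[str, Joint], max_height_m: float = 2.0) -> Dict[str, float]:
    """Derive normalized visual parameters from two hypothetical joints."""
    hand = skeleton["right_hand"]
    head = skeleton["head"]
    return {
        # Raising the hand brightens the image.
        "brightness": clamp01(hand.position[1] / max_height_m),
        # Turning the head (rotation about the vertical axis) shifts the hue.
        "hue": (head.rotation_deg[1] % 360.0) / 360.0,
        # Moving the hand away from the head widens the visual, capped at 1 m.
        "spread": clamp01(math.dist(hand.position, head.position)),
    }


skeleton = {
    "right_hand": Joint(position=(0.0, 1.0, 0.0), rotation_deg=(0.0, 0.0, 0.0)),
    "head": Joint(position=(0.0, 1.5, 0.0), rotation_deg=(0.0, 90.0, 0.0)),
}
params = visual_params(skeleton)
```

Normalizing every output to the range 0..1 keeps the mapping independent of whichever rendering environment consumes the values.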

[improvisation partner]

In a process that Lewis (2003) describes as transduction, the machine relates itself to the emotional state of the human partner, similar to how human improvisers create shared mental models that consist of shared knowledge (e.g. knowledge of the intentions and preferences of the other players). Lösel (2018) describes the process of forming shared mental models as follows:

First is observation, the point at which an improviser realizes that his mental model diverges from others’.

Second is repair, which refers to all attempts to reconcile divergences. Repairs can either be attempted in order for an improviser to align himself with another improviser's mental model or in order for an improviser to align another improviser with his own mental model.

The third, and final, step of cognitive convergence is acceptance, during which cognitive consensus may occur. It is also possible that consensus may be rejected or that an improviser will achieve perceived cognitive consensus, where they think that they have achieved consensus, but actually have not.

(pp. 190-191)

Lösel continues with an acknowledgement that it is incredibly difficult for the computer to automatically detect divergences in the mental model, since this requires the machine to be able to reconstruct the other players' perspectives (theory of mind). Thus, it can be said, at least for the machine I am working with, that the machine improviser is achieving "perceived cognitive consensus" throughout the entire improvisation, and I am in constant adaptation to the consequences of this behavior.

However, there is another aspect at play: the performer's active memory of how the machine is programmed, which is rather limited. The performer is, indeed, not in full control of the machine, as the previous paragraph might have suggested. Many of the machine's behaviors are unpredictable simply because it is too 'resource-intensive' for the human improviser to remember and track all of the interaction rules that she has programmed (at least, this is the case for me). The human improviser is forced to group all of the complex mechanisms of the machine into a general identification as improvisation partner and, in the mind of the human improviser, the scene recreates the improvisation framework described by Lösel.

method

As the questions posed above are grounded in my own artistic practice, I approach them with the following practice-based method:
1. Create a new work with the improvisation machine.
2. Perform this work.
3. Reflect upon this work in the form of conversations with audience members and feedback sessions with mentors and teachers (e.g. Patrick Müller, Matthias Ziegler). Discussion topics include the following:
     a. Perceived interaction between human performer and generated sounds and visuals
     b. Perceived narrative, composition, and form
     c. Perceived agency of improvisation machine
4. Take the feedback I receive into account and revise/alter the improvisation machine.
5. Repeat.