From the Body with the Body: Performing with a Genome-Based Musical Instrument

INTRODUCTION: In this paper we present Silico, a new Digital Musical Instrument which ideally represents the performer themselves. This instrument is composed of two parts: an interface (a sensor glove), which relies on the movements of the performer’s hand, and a computational engine (a set of patches developed in Max 7), which generates sound events based on the genomic data of the performer. OBJECTIVES: We propose a new reflection on the relation between the body and musical instruments. Moreover, we aim to investigate the voluntary and involuntary aspects of our body, intended as a starting point for a musical performance. As a metaphor for these two layers, we use the hand and the genome of the performer. METHODS: We investigated our objectives through the whole design process of a Digital Musical Instrument, using a practice-based approach. RESULTS: Our system is a multilayered composed instrument which maps its computational part and its interface onto the performer’s body. Silico can be used as a standalone musical instrument to generate music in real time. CONCLUSION: Our work shows a new path for the musical use of genomic data, as a new perspective on human-computer interaction in performative contexts.


Introduction
The human genome is the genetic material in each cell of a human body. Since it began to be sequenced in the 90s, musicians and sonic artists have been attracted to using it as a means to generate sound [1,2,3]. Different authors have developed different methods to sonify the genome, but they all tend to rely on a direct mapping between some genome data and the sonic output.
In this paper we propose a fresh look at using the genome for musical purposes by presenting Silico, a Digital Musical Instrument (DMI) whose computational engine is based on the genome of the performer, while the interface is based on the hand of the performer. Our proposal is to use the genome as a basis to design a new instrument rather than mapping it directly to sound parameters. To do this, we rely on the typical computational engine/interface distinction that characterises DMIs [4]. In particular, we propose to use the genome to define limits in the computational engine of a DMI that can then be played through an interface. The two elements of the instrument constitute a multilayered representation of the dichotomy between the body as what we inherited, which we cannot change, and the body as the tool that allows us to act: the engine represents the genome, the inherited material that constitutes the body, and the interface is controlled by a hand, the part of the body we use to manipulate tools.
Silico contributes to the electronic music debate by presenting a new way of using the genome for musical expression. Along with this, it offers a new reflection on the role of the body in music performance.
The rest of this paper is organised as follows: in section 2 we present related work in the fields of DMI design, genome sonification, and the use of the body for musical expression; in section 3 we describe the design of Silico, pinpointing the distinction between the computational engine, with its relation to the genome, and the interface, based on hand ergonomics; we then discuss our work against the literature described in section 2, and conclude by outlining possible future directions.

Background
In this section we present 1) literature related to the design and compositional strategies for DMIs, 2) related works on body-inspired design of musical interfaces, and 3) the genome structure and how it has been used for sonification purposes.

Designing digital instruments
DMIs are normally composed of two main distinct elements: a computing part that comprises sound synthesis and automation algorithms, and an interface that usually comprises hardware and software technology. As opposed to an acoustic instrument, where the gestural interface is also part of the sound production unit, in a DMI the interface is usually completely separated [5]. With DMIs, a musician has the possibility to design an arbitrary mapping between his/her gestures and the music parameters [6].
Magnusson introduced the concept of ergomimesis to describe how new digital instruments emerge from a process of transport (transduction) from one domain to another [7]. If we think of an acoustic instrument, in fact, we can establish a direct relationship between gesture and produced sound, or combine a certain type of action with a sound result consistent with it (for example, the correlation between picking a string and the sound the string produces). However, considering the arbitrariness of its mapping, in a DMI it is not possible to draw such direct relationships.
Wanderley also recognised that mapping is a crucial issue in a DMI, as the relationship between gesture and sound can be completely arbitrary according to the idiosyncratic needs of any specific situation [5,8].
Therefore, the intrinsic characteristic of DMIs offers the possibility to compose by inventing the interaction strategies. Many works [9,10,11] have investigated the inherent meaning of the digital instrument, from which we have extrapolated some main ideas which we will briefly illustrate here. In 2002, Schnell and Battier introduced the concept of composed instrument [9]. A composed instrument is a tool that shares features of an instrument (it can be played), a machine (it is composed of algorithms), and of notation (it represents the function of a score). Similarly, Cook claims that the music created by the new instruments is strongly influenced by the choices made during the design process, which in turn are influenced by various artistic, human and technological factors [10]. Building upon these characteristics, Magnusson proposed the idea that DMIs are epistemic tools, as they represent both the function of an instrument and carry a notion of how the music can be thought, composed, and performed. In fact, we can actually say that the process of design and mapping strategies has to be considered as part of the compositional process [11].
DMIs as design objects have also been analysed by borrowing tools from HCI [12]. In this sense, the concepts of affordances and constraints have been widely explored [13,14,15,16].
Particularly relevant is the proposal by Magnusson, who, starting from a phrase by Boden ("Far from being the antithesis of creativity, constraints on thinking are what make it possible" [17]), defines expressive limitations that face the thinking, creative, performing human as "subjective constraints" [15].

Body and sound
Research on the relation between the human body and music has a long tradition [18], and has developed further alongside the growth of DMI studies.
Studies on performative gestures contributed to the investigation about the relation between body and movement, defining the gesture itself as an element that contributes to the formation of the musical meaning as well as the auditory features [5,19,20], and contributes to increase the strength of the performance itself in the audience's perception [21].
With this in mind, Iazzetta, for example, defines the body as "instrument through which the gesture becomes actual" [22]. Moreover, the body has taken on an important role in the study and design of DMIs, also thanks to the proliferation of low-cost sensors that have favoured their spread [5,4].
As an example, we cite here the works by Tanaka, who used electromyography and muscle sensors to control electronic musical instruments [23], and extended the gestural boundaries through the use of sensor-based instruments [24]. The performances by Donnarumma, "Music for Flesh I & II", offer another important contribution to the development of the debate about music and body [25,26]. In these performances the author relied on the "Xth Sense", a wearable hardware sensor device for capturing biological body sounds. From a holistic perspective, we mention as examples the works of Donato et al., which have extended gesture control systems to the manipulation of light projection [27], and the Cyber Composer by Ip et al., a system which generates music according to the hand gestures of the user [28].
Particularly relevant for the research about gestures and music are those instruments that relied on gloved hands. The hand has always received special attention thanks to its intrinsic characteristics not only in the field of research but also in pop culture (the famous Cyberglove, for example [29]). It is not difficult to come across gloves equipped with various types of sensors, from the most common flex sensors, accelerometers and gyroscopes to more elaborate position recognition systems [30]. Already in 1984 at STEIM (Studio for Electro-Instrumental Music), Waisvisz had developed and presented his instrument, "The Hands", equipped with ultrasonic sensors, buttons, switches, and accelerometers [31]. A more recent example is presented by Laetitia Sonami, who subsequently developed a complex control system with strong gestural features [32], comparable to the modern MiMuGlove.

Genome and Sonification
The genome is the genetic material in each cell of an organism. It is encoded in deoxyribonucleic acid (DNA), a molecule composed of two chains that coil around each other and that carries the genetic instructions for the development of our body. The human genome is present in every single cell of a human being, and is composed of over 3 billion nitrogenous bases (Adenine (A), Cytosine (C), Guanine (G) and Thymine (T)) divided over 22 pairs of autosomes and 2 sex chromosomes. Regions of this DNA contain sequences which, when read by the cell's molecular machinery, are capable of producing proteins ("coding regions"). The nitrogenous bases in these regions are read in triplets called codons, each of which identifies a specific amino acid of the future protein (see [33,34] for an extensive description of the genome).
Researchers in genetics and bioinformatics have developed various methods to analyse and organise human genome resources for identifying features of DNA sequence [35,36]. Moreover, since the launch of the Human Genome Project [34], a vast amount of data about the human genome has been made available on the internet, attracting attention from various fields [37].
Many experiments with genome and music have been made, particularly using sonification.
Sonification is a non-speech audio rendering process used to convey information about data and/or interactions [38,39].
Sonification is primarily applied in the area of Auditory Display [40]. In this context three main techniques emerged: 1) Audification, that is, direct playback of data samples, 2) Parameter Mapping Sonification, or PMSon, which associates multidimensional/multivariate information with auditory parameters, and 3) Model-Based Sonification, or MBS, the creation of processes that support interaction by involving the data in a systematic way [41].
Gena and colleagues pioneered the sonification of DNA in the mid-90s, with a system that converts the genome information to MIDI events [42,1]. Other examples are represented by Won, who sonified chromosome 21 [3], and by Temple, who developed six algorithms to sonify human DNA and compared them according to how informative they are [2]. Moreover, Grond et al. developed an interactive sonification technique to explore ribonucleic acid using a combined auditory and visual interface [43]. The commonality among these scholars is the exploration of a single specific element of the genome.
The aforementioned works mainly focused on sonification algorithms, whereas the idea of using a human genome to define the algorithm of a musical instrument has so far been overlooked.

Silico
Silico aims at representing the body both at the computational level and at the interface level. The genome of the performer is used to shape the sound synthesis algorithms, while the physical body (a hand) is used for the controller. Not only are these two levels shaped on the actual body of the performer, but they also represent two different constituents of the body itself: the genome, an inherited element that is intrinsically unchangeable by the individual, and the hand, which embodies the human capability to manipulate objects and operate tools. As we have seen in section 2.1, a DMI is constituted of two main elements: a computational engine (that combines a number of musical algorithms) and a controller interface. Neither of the two, if analysed in isolation, represents the DMI. In Silico, we designed both of these parts based on constituents of the body of its main creator and performer (first author of this manuscript). The computing element is a real-time algorithm that pilots four sound synthesis engines. The genome of the performer is used to model the range of each sound engine. Compared to traditional sonification systems, Silico maps genomic data at a meta layer, determining the performative possibilities and not directly the sound. The control interface is a glove that relies upon the ergonomic ability of the performer.
Silico is composed of two main elements: 1) a computational engine that mainly relies on genomic data to define sound synthesis constraints, and 2) an interface consisting of an augmented glove. The actual music generated by Silico is the result of the combination of the genomic instrument constraints and the performative parameters. The value range of each sound engine is determined by the weight of the corresponding group of amino acids and is multiplied, moment by moment, by the current state of the corresponding sensor on the control interface. In this way, we create a double level of control: an involuntary/structural one that determines the extension of our instrument (the genome), and a voluntary/formal one that moves within these extensions (the interface). We summarise the whole structure of Silico in fig. 1. The resulting instrument is a multilayer non-deterministic representation of the body of its performer. The engine represents the genome, which is not controllable, while the interface is a representation of the actions, which can be controlled. Every sound event is thus a combination of both the uncontrollable element of the body (the sonification of the genome) and the controllable element (the movement of the hand). As we focus primarily on the relations between the performance and the machine, we try to keep the audio system as simple as possible.
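The double level of control described above can be illustrated with a minimal Python sketch. The text states only that the value range depends on the group weight and is multiplied by the sensor state; the linear form, the function name `scaled_parameter` and the example ranges below are assumptions made for illustration, not the actual Max patch.

```python
def scaled_parameter(weight: float, sensor: float, low: float, high: float) -> float:
    """Return a synthesis parameter inside [low, high].

    weight: normalised genomic weight of the amino acid group (structural layer).
    sensor: normalised flex reading of the assigned finger (voluntary layer).
    Assumption: the weight linearly shrinks the reachable part of the range.
    """
    if not (0.0 <= weight <= 1.0 and 0.0 <= sensor <= 1.0):
        raise ValueError("weight and sensor must both lie in [0, 1]")
    return low + (high - low) * weight * sensor
```

Under this reading, a group with weight 0.5 can never reach the upper half of its parameter range, however far the finger is bent, which is one way to picture the genome acting as a fixed constraint.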

Computational engine
Genome data mining
In section 2.3, we described how many authors have relied on single elements of the genome in their sonification work [42,3,43]. In this study we followed a similar approach, trying to find a greater data specificity to be used as a starting point. The data mining step focused on the four groups of ordinary amino acids: apolar, polar, basic, acid [33]. The 20 ordinary amino acids, in fact, can be thus grouped according to their biochemical behaviour and their effect in the final polypeptide chain. Below is a simplified subdivision of codons into groups:
Since about 99% of the human genome is common to all individuals, we decided to investigate the remainder, which represents personal differences and defines the uniqueness of each one of us. Our attention has therefore shifted to the so-called mutations, which are the alterations of the nucleotide sequence compared to a physiological standard. A reference genome, representative of the whole human species [34], is employed in genomics as a standard to analyse the differences (variants) between different genetic heritages and individuals. The most recent version (GRCh38), released in 2013 by the Genome Reference Consortium, has been used in this paper.
In our design we proceeded to the sequencing and analysis of a personal genome, obtaining a collection of DNA mutations. After the sequencing process with Illumina SBS Technology [44], the data (reads) were aligned by using BWA [45] and base variations were annotated in an Annovar text file [46].
In a file of this type, each string is formed by a numerical value, which represents the position of the variant in the reference protein, and two letters placed before and after that number, which respectively represent the amino acid encoded by the codon of the standard genome and the amino acid encoded by the mutated codon. For example, the string "p.L815T" means that in position 815 a codon for L (Leucine) has been replaced by a codon for T (Threonine). The length of the file obviously depends on the number of substitutions present in the analysed genome, and it generally ranges from a few dozen to a few hundred.
The Annovar file is then imported into a simple system of list operators and counters, developed in Max, ver. 7.1.0. The counter system first calculates the number of codons actually replaced, and then groups them by type, calculating their incidence (weight) in relation to the variant type. The values thus obtained are normalised between 0. and 1., assigning by default the value 1. to the greatest and rescaling the others accordingly.
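As an illustration of the string format just described, the following Python sketch parses a single substitution such as "p.L815T". It is not part of the Max patch; the names `Substitution` and `parse_substitution` are hypothetical, and the sketch assumes one-letter codes and single-residue substitutions only.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str        # one-letter code found in the reference genome
    position: int   # position of the variant in the reference protein
    alt: str        # one-letter code found in the analysed genome


# Assumed pattern: "p." + reference letter + position + mutated letter.
_PATTERN = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def parse_substitution(text: str) -> Substitution:
    """Parse a string such as 'p.L815T' into its three components."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {text!r}")
    ref, position, alt = match.groups()
    return Substitution(ref, int(position), alt)
```

Strings that describe other kinds of variants are rejected with a `ValueError` rather than guessed at, since the text only documents the substitution form.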
Finally, the patch automatically generates five text files in the main folder of the application, labelled "Weights", "Apolar", "Polar", "Basic" and "Acid". "Weights" contains the incidence of each group of amino acids, while "Apolar", "Polar", "Basic" and "Acid" contain the number of individual codons belonging to the reference group.
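The counting and normalisation step can be sketched in Python as follows. Two points are assumptions rather than details taken from the patch: the assignment of one-letter codes to the four groups follows a common textbook grouping (the paper's own subdivision may differ), and the caller decides which residue of each substitution is counted.

```python
from collections import Counter

# Assumed textbook grouping of the 20 ordinary amino acids (one-letter codes).
GROUPS = {
    "Apolar": set("GAVLIMPFW"),
    "Polar": set("STCYNQ"),
    "Basic": set("KRH"),
    "Acid": set("DE"),
}


def group_of(residue: str) -> str:
    """Return the name of the group a one-letter amino acid code belongs to."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def group_weights(residues) -> dict:
    """Count residues per group and rescale so that the largest group gets 1.0."""
    counts = Counter(group_of(r) for r in residues)
    largest = max(counts.values(), default=0)
    if largest == 0:
        return {name: 0.0 for name in GROUPS}
    return {name: counts[name] / largest for name in GROUPS}
```

For example, eight substitutions falling 4/2/1/1 into the Apolar, Polar, Basic and Acid groups would yield weights of 1.0, 0.5, 0.25 and 0.25, matching the rule that the greatest value is set to 1. and the others are rescaled accordingly.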

Sound synthesis engines
Silico's audio engine is made up of four synthesis engines, one for each group of amino acids. The choice of certain types of audio synthesis was aesthetically determined by the characteristics of each group of amino acids. The motivation behind this choice is not grounded in the semantic content of the sound but rather in its acoustic features. Indeed, the main purpose that motivated the choice of the four sounds was to obtain sound materials that were simple enough to be clearly differentiated, and at the same time sufficiently complex to be somehow musical. Moreover, each sound should have a sufficient number of parameters, whose range could be mapped to the difference between the specific genome of the performer and the reference genome. It follows that four types of sounds and four types of synthesis coexist in our system, each representing one specific group of amino acids.
Codon management is also differentiated: each group of amino acids, as already said, contains a different number of codons, which determines a different number of choices within the synthesis engine itself:
• The Apolar group is represented by an engine which operates multiband subtractive synthesis on concrete samples of voices and applies timed delay on the signal.
• The Polar engine generates clouds of grains from concrete samples of noises.
• The Basic group operates additive synthesis with selectable waveform using 5 harmonic oscillators.
• Finally, the Acid group is represented by a simple/complex frequency modulation (FM) synth.
Each engine is capable of holding up to 20 voices in polyphony. At every trigger received, the system probabilistically selects a different amino acid by reading the corresponding pre-loaded files. In the Apolar and Polar cases, different codons are matched to different 2-second samples, stored in the main folder, to be processed. In the Basic and Acid cases, different codons determine respectively the oscillator's waveform and the FM complexity. These matchings are summarised in table 1. From a poetic point of view, a system of this type allows us not only to use our instrument to build a performance in real time, but also to determine, through our inherent characteristics, the number and types of instruments of our small synthetic orchestra.

Trigger automations
The choice of each sound event is entirely determined by a probabilistic system that refers to the files derived from genomic analysis. At each event trigger, the system reads the probabilities from the "Weights" file and chooses the respective synthesis engine to be activated. A message is then generated and sent to the corresponding polyphonic synth; it contains a series of values in a specific order, which depends on the corresponding type of synthesis.
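The probabilistic choice of an engine at each trigger can be approximated by a weighted random draw. The Python sketch below is only an analogy for the Max implementation: the function name `choose_engine` is hypothetical, and the weights are passed in as a dictionary because the layout of the "Weights" file is not specified here.

```python
import random


def choose_engine(weights: dict, rng: random.Random) -> str:
    """Pick one synthesis engine with probability proportional to its weight."""
    names = list(weights)
    # random.Random.choices raises ValueError if every weight is zero.
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Example call with the approximate distribution reported for the first author's genome.
engine = choose_engine(
    {"Apolar": 0.20, "Polar": 0.25, "Basic": 0.29, "Acid": 0.24},
    random.Random(),
)
```

Passing the random generator explicitly keeps the draw reproducible when a seed is given, which can help when comparing performances generated from different genomic data.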
To start a performance, we just need to specify a global triggering speed and activate the "Karl" toggle. Then the system starts playing immediately thanks to Karl, a very simple automatic triggering subpatch.
Moreover, as it is basically generative, Silico is able to play itself without the need for external control by randomly setting the parameter values, in the absence of a significant variation of the input signal (> 0.05) for more than 1 minute. It is therefore possible to disconnect the interface at any time and let the system ring, or reconnect it and start interfacing with it again.
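The fallback to self-playing can be pictured as an idle detector on the incoming sensor values. The following Python sketch assumes a single normalised input and compares each reading with the last value that counted as a significant change; both choices, like the class name `IdleDetector`, are assumptions and not a description of the actual subpatch.

```python
class IdleDetector:
    """Report when the input has not varied by more than `threshold` for `timeout_s`."""

    def __init__(self, threshold: float = 0.05, timeout_s: float = 60.0):
        self.threshold = threshold
        self.timeout_s = timeout_s
        self._last_value = None   # last reading that counted as a change
        self._last_change = None  # timestamp of that reading, in seconds

    def update(self, value: float, now: float) -> bool:
        """Feed one reading; return True when random autoplay should take over."""
        if self._last_value is None or abs(value - self._last_value) > self.threshold:
            self._last_value = value
            self._last_change = now
        return now - self._last_change > self.timeout_s
```

With five sensors, one detector per finger could be used, letting the system play itself only when all of them report idleness.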

Additional sound treatments
Finally, we use gigaverb~ to add some reverb by default, and send every synthesis engine to a different speaker in a quadraphonic environment. Alternatively, it is possible to configure the system to automatically switch to a stereo environment with three different presets: 1) Stable-Unstable configuration: grouping Polar and Acid engines on the left (L) channel and Apolar and Basic engines on the right (R) channel; 2) Mixed configuration: grouping Apolar and Polar engines on the L channel and Basic and Acid engines on the R channel; 3) Dual-Mono configuration: all engines to L-R.

Cyberglove
We proceeded to breadboard a prototype of our glove interface. We used five 2.2" flex sensors, five 10kΩ resistors, electric cable, heat shrink tube and an Arduino Nano equipped with an ATmega328P processor, which is powered by USB cable.
While the electric circuit is actually very simple and does not require further explanation, we need to discuss the code a little more in depth.
In fact, we noticed that the resistance value at 0 and 90 degrees was not always the same. Flex sensors are notoriously imprecise, so we first needed to fix this problem. After a few tries, we used a constant voltage value of 4.98V and limited the resistance values between 37.3kΩ at 0 degrees and 90kΩ at 90 degrees. These resistance values were obtained empirically: we recorded a collection of 50 full-flat and full-flex values, and took an approximate mean of the 5 highest values in flat position and of the 5 lowest values in flex position. In this way we ensure that we can always track and approximate precisely at least the limit positions.
What is in the middle does not need to be so precise: the use of this kind of controller inevitably introduces a certain degree of inaccuracy, so we compute a further mean every 10 values in order to avoid glitches and smooth the movements.
After these calculations, the final value is rescaled between 0. and 1., appended to an index representing the correspondent finger and sent out at every cycle (10ms each).
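The processing chain of the glove firmware can be summarised in a Python sketch (the firmware itself runs on the Arduino and is not reproduced here). Several details are assumptions: a 10-bit ADC, a voltage divider with the flex sensor on the supply side of the 10kΩ resistor, a moving average over the last 10 readings, and one possible reading of the calibration rule described above.

```python
from collections import deque
from statistics import mean

VCC = 4.98          # supply voltage in volts (from the text)
R_DIV = 10_000.0    # divider resistor in ohms (assumed wiring)
R_FLAT = 37_300.0   # resistance limit at 0 degrees (from the text)
R_BENT = 90_000.0   # resistance limit at 90 degrees (from the text)
ADC_MAX = 1023      # assumed 10-bit analogue-to-digital converter


def adc_to_resistance(adc: int) -> float:
    """Convert a raw ADC reading to the flex sensor resistance in ohms."""
    if adc <= 0:
        return float("inf")
    voltage = adc * VCC / ADC_MAX
    return R_DIV * (VCC / voltage - 1.0)


def normalise(resistance: float) -> float:
    """Rescale a resistance to 0..1, clamping at the two calibrated limits."""
    span = (resistance - R_FLAT) / (R_BENT - R_FLAT)
    return min(1.0, max(0.0, span))


def limits_from_samples(flat, bent, k: int = 5):
    """Assumed calibration: mean of the k highest flat and k lowest bent readings."""
    return mean(sorted(flat)[-k:]), mean(sorted(bent)[:k])


class MovingAverage:
    """Average of the most recent `size` readings, used to smooth glitches."""

    def __init__(self, size: int = 10):
        self._window = deque(maxlen=size)

    def update(self, value: float) -> float:
        self._window.append(value)
        return sum(self._window) / len(self._window)
```

Choosing the highest flat readings and the lowest bent readings narrows the usable span slightly, so that a fully flat or fully bent finger always clamps to 0. or 1. despite sensor drift.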
After the circuit was made, we sewed some little pieces of cloth in the upper part of the glove fingers, where the sensors need to be lodged, and fixed the board on a velcro strap. Notice that we used a left-handed glove because the guitar-background of the first author guaranteed him greater finger independence on the left hand. The complete prototype is shown in fig. 2.

Mapping
In the first phase of this work, we started by isolating interesting parameters for each synth. Apart from dynamic (volume) and density of events, which can be considered as high-level controls (we refer to them as "global parameters"), every kind of digital synthesis has by its nature a different number and a different type of parameters, which also depend on its complexity.
Despite the overall simplicity and neatness of each synthesis engine, we figured out that we still had to deal with a large number of parameters. So, in order to make the system easily controllable, we made sure that not only every synth had the same number of parameters, but also that these parameters were comparable to the same domain categories. We therefore chose three domains, in which we grouped the characteristic parameters of each engine (we refer to them as "specific parameters").
A comprehensive summary of all the five domains and the corresponding parameters for each synthesis engine is shown in table 2. We used a one-to-one mapping strategy [43] in order to enhance the clearness of use, and assigned each domain to a different finger, considering the ergonomics of the glove. The objective of this process was to grant a sharper control over the parameters requiring more precise manipulation. After a few trials, we identified the following mapping as the best solution:
• 1st finger: dynamic;
• 2nd finger: frequency;
• 3rd finger: time;
• 4th finger: spectrum;
• 5th finger: density.
Moreover, in this mapping the global parameters (dynamic and density) are controlled by the external fingers, allowing a more precise control over the other three domains.
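The one-to-one mapping above can be written as a small lookup table. In the Python sketch below, the finger indices 1 to 5 follow the list in the text, while the function name `route` and the tuple it returns are hypothetical choices made for illustration.

```python
# One-to-one mapping between finger index and control domain (from the text).
FINGER_TO_DOMAIN = {
    1: "dynamic",
    2: "frequency",
    3: "time",
    4: "spectrum",
    5: "density",
}

# Dynamic and density are the high-level "global parameters".
GLOBAL_DOMAINS = {"dynamic", "density"}


def route(finger: int, value: float):
    """Return (domain, is_global, value) for one normalised glove message."""
    if finger not in FINGER_TO_DOMAIN:
        raise ValueError(f"unknown finger index: {finger}")
    if not 0.0 <= value <= 1.0:
        raise ValueError("sensor value must lie in [0, 1]")
    domain = FINGER_TO_DOMAIN[finger]
    return domain, domain in GLOBAL_DOMAINS, value
```

Keeping the table separate from the routing logic makes it easy to try alternative finger assignments, as was done during the trials mentioned above.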

Graphical User Interface
In this work, we can consider the Graphical User Interface (GUI) as a simple visual feedback tool to help the performer interpret what is going on during the performance. Every synth engine is visually represented by a big coloured button, which blinks whenever a trigger is received. An approximate percentage, calculated by the system after loading the "Weights" file, is also shown over these buttons and indicates the probability for every synth to be activated.
Finally, we inserted five sliders representing the real-time status of each sensor on the glove controller, a clock to control the overall duration of the performance and two pop-up windows for loading boot files and configuring the system (fig. 3).

Practical use
We present a practical use of the system in a solo performance context. This section is also accompanied by example audio files, which are freely available at this link: https://soundcloud.com/silico_1_0
The genome we used (the genome of the first author of this paper) contains a fairly even distribution of variants for each group of amino acids, respectively 20%, 25%, 29%, 24% (the percentages are rounded to whole numbers).
The first four audio files in the folder (Apolar, Polar, Basic, Acid) are a representation of the range of each synthesis engine individually. We obtain this by muting the other three engines, for a demonstration purpose. This is not a real performative possibility, as the choice of each engine is assigned by Karl. Each of the four files explores the full extension range of the specific parameters (in the frequency, time and spectrum domains), for each of the four synthesis engines. As described in the Mapping section, these parameters are manipulated with the glove, in the following order: frequency min-max, spectrum min-max, time min-max. In order to make them clearer, we have chosen only one amino acid per group, and the files have been converted to mono. We also highlighted the different cues with comments directly on the SoundCloud page.
The fifth file (Silico_1) contains an example of a performance lasting about 7 minutes. This performance follows an extremely simple and linear development, which reaches a climax around two thirds of the way through.
The sixth file (Silico_alt) contains a performance version created to highlight the different expressive possibilities and clarify the constraints set by the genome. We then manually entered different boot files into the system, "inventing" an example-genome with more extreme features. In this case, we made sure that two groups of amino acids clearly prevailed over the others, assigning a percentage of 40% to the Polar and Basic groups and a percentage of 10% to the Apolar and Acid groups. A similar degree of imbalance has also been maintained within each group by changing the variant content on the individual codon.
To allow a direct comparison, the hand gesture performance used in the previous file (Silico_1) was recorded and reused to generate this file.
The hand performance is identical in both files; the only element that changes is the distribution of genomic data. It is therefore possible to appreciate how much different genomes determine the overall result.

Discussion
Every cell in our body is built from the information stored in our genome; our genetic material is thus the quintessential determinant of many bodily characteristics and specifications. This notwithstanding, our growth and development are strongly impacted and, to a certain extent, controlled by external environmental features and by our will. Our existence is continuously confronted by both the inner specifics of our "physical instrument", dictated by the genome, and the external determinants trying to morph it.
The challenge we tried to face in this work is to constitute a new metaphoric representation of ourselves through different layers of a DMI. In Silico the genome is not simply "performed" or sonified: similarly to what happens in living bodies, it is used to constitute the building blocks of the instrument. Our genetic alterations are the determinants of the different instrument features, so that the genome is an expressive tool rather than the object of the musical performance.
The final composed instrument that the musician can use in a performance is based on the genomic data, created from the body, and therefore not modifiable or adaptable to each performance. This represents the constraints which the performer must deal with; this is also a metaphor of human existence, where a human being cannot modify his/her own body. He/she can hence express degrees of freedom by operating the DMI with what, by definition, expresses human evolution and wilful action: the hand. Through manipulation the artist can subsequently determine the composition and partially control what happens on stage.
Discussing Silico against existing literature on genome sonification, we introduce the idea of a meta layer between the genomic data and the sound. Indeed, as we have seen in section 2.3, [42,3,43] have created a direct mapping between data and sound, while we developed a different approach, using the genome to design a new instrument. We argue that this approach was fundamental to obtain the musical expressivity discussed in the previous paragraph. Our approach also presents a new metaphor for using the body of a performer in an electronic music performance. As described in section 2.2, [23,24,25,26,31] successfully used the actual body as part of the performance, and as the element that mainly leads the design of the musical interface. Our approach can complete these strategies by offering the possibility to use the genome, the inherent and hidden element of the body, as part of the musical interfaces.

Conclusion and future work
In this work we presented a new DMI in which we wanted to renegotiate the hierarchy between the user and the machine and the relationship of mutual responsibility in creating a musical performance. The use of the genome to determine the constraints of the system has allowed us to delineate a new control layer which is fixed and dependent on the user himself, which we basically define as "structural". Controlling through the voluntary gestures of the tangible body an algorithmic system whose properties / possibilities are closely linked to the nontangible characteristics of the body itself, represents on the poetic level the same patterns of everyday life, that is, acting consciously with and through a predetermined body, according to its inner properties.
This perspective introduces interesting developments to be addressed in the future, which we can ideally divide into two parallel lines of investigation: the user's point of view and the performance implications.
In the first case, we first want to implement additional control systems using the vast amount of gestures that the body makes available to us, through the use of more complex sensors such as the muscle sensors we already mentioned in section 2.2. Furthermore, it is possible to consider the inclusion of a non-voluntary control level in real time, which can be effectively provided through the use of new generation EEG sensors. The computational engine can be further developed, both in terms of general settings, including in the first instance the possibility of choosing between various types of synthesis, and in terms of their complexity, in order to make Silico even more customisable and immersive. The data mining process on genomic data will also be further enhanced through the addition of other biological layers, such as frameshifts, premature stop gain of a protein or the disruptive event represented by chromosomal rearrangements.
From the performative point of view, we want to introduce a visual support for the audience, in order to clarify the interpretation of the gesture by showing the system status in real time, and to enhance the global experience.
The aforementioned tasks will ideally be conducted through case studies and specific tests in the presence of a selected audience, whose impressions will be collected through questionnaires. Similarly, we intend to proceed by inserting the system into instrumental contexts, from soloists to small ensembles, in order to evaluate the collaborative potential.
Finally, a concluding evaluation will take place by testing the instrument with users of different musical backgrounds, inserting different genomic data from time to time, to determine and evaluate the actual possibilities of interaction that arise from the differences, voluntary and involuntary, between different users.