Learnings from an Iterative Design Process for Technology-Mediated Audience Participation (TMAP) using Smartphones

We discuss a setup for technology-mediated audience participation (TMAP) in live music using smartphones and high-frequency sound IDs in a playful setting. The audience needs to install a smartphone app. Using high-frequency sound IDs, music samples and colors can be triggered on the audience’s smartphones without the need for an internet connection. The resulting soundscape is determined by the samples and parameters selected by the artist as well as by the location audience members choose in the performance space. We present the technical basis and iterative explorative design process of such a system for TMAP. The learnings from the perspective of musicians were technical requirements such as low latency and reliability, as well as a larger number of possible sound samples and better sound quality. We further present learnings on creating systems for TMAP from technical and creative perspectives.


Introduction
This article presents a specific method for technology-mediated audience participation (TMAP) using smartphones, together with design learnings from an iterative process of designing TMAP with a musician. We discuss five iterations of an explorative design process of a system for TMAP called Poème Numérique over a period of ten months. The system was designed and developed in a closed loop with a musician. The goal of this article is first to present the resulting technology and second to discuss the learnings from an iterative and explorative design process for TMAP.
With the discussed system, audience members can use their own smartphones to join in a performance. Music samples and different color schemes can be triggered by the performing artist on all participating smartphones. The resulting soundscape consists of shifted and overlapping samples, which create new rhythmic and melodic patterns depending on how participants group themselves in the performance space. The presented approach does not require the phones to have an internet connection, as control signals are sent from the artist using high-frequency Sound IDs. The music for the proposed demo has been composed by Austrian electronic music artist Electric Indigo [1]. The presented technology allows an audience to participate seamlessly using their own smartphones. A lot of control remains with the artist, who is able to trigger the samples played back on the smartphones and the colors of their screens. The audience can shape the resulting soundscape and their own experience by moving around in the performance space.
The performance that will build on the described technology is part of the art-based research project Breaking The Wall [2], which discusses audience participation from the perspective of the involved creative processes. The presented technical development was part of a master's thesis at the Vienna University of Technology [3].

Related Work
Audience participation goes back as far as Mozart (1756-1791), who allegedly composed the parts of the "Musikalisches Würfelspiel" [4] (musical dice game minuet). He made a quite conscious game design decision. He recognized chamber music as a participatory musical form in need of an interactive diversion for the audience. Thus he introduced two dice, thrown to determine one of many possible combinations of musical segments of waltz music played afterwards. It is a minuet with 16 measures, each with a choice of one of eleven possible variations selected by a roll of two dice, which yields 11^16 (roughly 4.6 × 10^16) possible combinations. One of the core challenges in designing musical gameplay for entertainment was, also for marketing reasons, to make music accessible to people who do not necessarily play an instrument or are not literate in musical notation. This gaming approach seemed to represent the very antithesis of compositional strategies [5]. In Mozart's case, he succeeded in making music more varied and introduced a participative mechanic. While this game mechanic is purely based on luck, it still involves the audience and makes the musical result feel more personal and unique. For this purpose Mozart abstracted waltz music from continuous pieces of music to smaller segments, which can be rearranged freely. The common denominator of many works in the field of sound art and music-based games [6] is that they make aspects of playing music and composition accessible to the audience by abstracting from its original complexity. In the case of technology-mediated audience participation the process of abstraction is even more delicate. On the one hand, there is a need to reduce and abstract complexity to make music easily accessible to the audience; on the other hand, the complexities and intricacies of musical play must not be lost. Mazzini also presents metrics to describe and evaluate the characteristics of participatory performances [7].
Smartphone performances have been previously documented by the authors in [8], where we describe a larger event using a branched version of the software described in this paper. Audience reception was positive, but we outlined the intricate balance of offering freedom of expression for the audience while retaining enough control over the musical result for artists. In [9] Reichl et al. discuss the use of a smartphone application for providing information when audience members leave the performance space for short intervals of time. Wu et al. [10] discuss a smartphone app that provides a higher degree of freedom for audience members, letting them draft musical scores in real time, but audience members still wished for more control.

Explorative Design
The development of Poème Numérique is embedded in a larger project context (see 2.2 below) where design directions are also based on user research and expert focus groups. In this contribution we focus on the iterative and explorative design part of Poème Numérique. Explorative design is a term coined in recent years to describe an approach where design practices are utilized to facilitate research. In the case of Poème Numérique, explorative design has been used first to explore the possibility space for TMAP with smartphones and then to narrow down on a specific approach. The concept of explorative design goes back to John Dewey's Theory of Inquiry [11], where he introduces the concept of "doing for the sake of knowing". Donald Schön built on the work of Dewey when he observed that much of the knowledge needed and used in the design process is not known a priori, but is acquired during the design process as a result of interacting with the object to be designed [12]. This process, commonly referred to as "analysis through synthesis", is at the core of explorative design approaches. [13] introduced the notion of "Research Through Design", advocating an understanding of design practice as a relevant approach for research. This also underlines why we use explorative design not only to create a system like Poème Numérique but also to derive academic insights into the design process for TMAP. Given the importance of player experience in digital games, it becomes quite obvious that explorative design approaches are of high value to game development. While many aspects of games can be tried out using low-tech means, the player experience often has to be designed in a tightly coupled loop of explorative design and evaluative inspection.

Project Context: Breaking The Wall
The field of audience participation has a rich history of custom-built instruments and devices, and ways to facilitate collaborative performances. The artistic potential of audience participation both for musicians and for their audiences is very high. Recent advancements in sensor and interface technology have further increased this potential. While research on audience participation shows both practical and theoretical perspectives, a structured, creative and evaluated approach to fully explore the artistic potential is missing so far. Thus the art-based research project Breaking The Wall addresses the central research question "Which new ways of artistic expression emerge in a popular form of music performance when using playful interfaces for audience participation to facilitate interactivity among everybody involved?" To answer this question and to shed light on the artists' creative practice, we develop, document and evaluate a series of interfaces and musical performances together with popular music artists. The focus is on providing playful game-like interaction, facilitating collaborative improvisation and giving clear feedback as well as traceable results. The interfaces will be deployed in three popular music live performances at one event. The art-based research approach uses mixed methods, including a focus group and surveys as well as quantitative data logging and video analysis, to identify parameters of acceptance, new ways of artistic expression, composition and musical experience. The evaluation will allow us to present structured guidelines for designing and applying systems for audience participation.
The project team is composed of popular music artists and researchers covering diverse areas such as media arts, computer science, human-computer interaction, game design, musicology, ethnomusicology, technology and interface design.
The results of the project will be situated at the interdisciplinary intersection of art, music and technology. The project will present structured and evaluated insights into the unique relation between performers and audience, leading to tested and documented new artistic ways of musical expression that future performances can build on. It will further deliver a tool-set with new interfaces and collaborative digital instruments.

Implementation
The technical basis of Poème Numérique is the use of high-frequency sound IDs to trigger events on the audience's smartphones. The use of high-frequency sound or "Ultra Sound Communication" for audience participation was first documented in [14]. In this approach frequencies above the average human hearing spectrum are transmitted by dedicated speakers and are used to quasi-silently trigger events. An app that has to be downloaded before the performance listens for these sound IDs using the smartphone's microphone. Figure 1 shows the full setup with a computer used to send the sound triggers to a sound system and the audience's smartphones, which listen for these triggers using a cross-platform Android / iOS app. The cross-platform app has been implemented using Xamarin Forms [15].
Each Sound ID is composed of two distinct frequencies between 18 kHz and 20.7 kHz. To recognize the specific frequencies on the smartphones, a fast Fourier transform (FFT) is used. In our case we take the frequency spectrum from 0 to 22050 Hz and divide it into 256 pieces. Each piece covers a frequency interval of 86.13 Hz. To do this, 512 time-based values have to be used per FFT. The information provided by the array of frequency-based values is crucial for recognizing the intensity of a specific frequency interval and therefore crucial for identifying high-frequency sound IDs. Two frequencies are recognized as an ID if both of them are recognized in at least eight out of the last ten FFTs. That means that for both frequencies a peak has to be recognized eight out of ten times. For a peak to be recognized, the following two conditions have to be fulfilled:
• The value in the respective array entry has to be higher than a threshold which is platform and frequency specific.
• The value in the respective array entry has to be higher than the surrounding array entries.
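The recognition rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the app's actual code: the 44.1 kHz sample rate is inferred from the 22050 Hz spectrum, "surrounding array entries" is read as the two immediate neighbours, a single threshold stands in for the platform- and frequency-specific thresholds, and the names `bin_index`, `is_peak` and `SoundIdDetector` are hypothetical. The FFT magnitudes themselves are assumed to be computed elsewhere.

```python
from collections import deque

SAMPLE_RATE = 44100                        # Hz; assumption inferred from the 22050 Hz spectrum
FFT_SIZE = 512                             # time-based values per FFT
NUM_BINS = FFT_SIZE // 2                   # 256 frequency pieces
BIN_WIDTH = (SAMPLE_RATE / 2) / NUM_BINS   # about 86.13 Hz per piece


def bin_index(freq_hz):
    """Index of the array entry closest to freq_hz (assumed mapping)."""
    return round(freq_hz / BIN_WIDTH)


def is_peak(magnitudes, k, threshold):
    """Both conditions: above the threshold and above both neighbours."""
    if magnitudes[k] <= threshold:
        return False
    left = magnitudes[k - 1] if k > 0 else float("-inf")
    right = magnitudes[k + 1] if k + 1 < len(magnitudes) else float("-inf")
    return magnitudes[k] > left and magnitudes[k] > right


class SoundIdDetector:
    """Reports an ID once both frequencies peaked in 8 of the last 10 FFTs."""

    def __init__(self, freq_a, freq_b, threshold, window=10, required=8):
        self.bins = (bin_index(freq_a), bin_index(freq_b))
        self.threshold = threshold          # hypothetical single threshold
        self.required = required
        self.history = [deque(maxlen=window), deque(maxlen=window)]

    def update(self, magnitudes):
        """Feed one FFT magnitude array; return True if the ID is recognized."""
        for hist, k in zip(self.history, self.bins):
            hist.append(is_peak(magnitudes, k, self.threshold))
        return all(sum(hist) >= self.required for hist in self.history)
```

Keeping a separate ten-entry history per frequency mirrors the requirement that both frequencies must independently reach eight peaks before the ID counts as recognized.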
Two speakers are used to transmit the two frequencies simultaneously. The IDs are always played back for three seconds. Much smaller playback timeframes are theoretically possible, but our application does not need to allow for fast sequences of triggers. Within the above frequency range we managed to implement 15 unique IDs. To reduce false positives and faulty recognition we used one of these for a Sync ID sent before an actual Sound ID. This Sync ID prompts the phone to listen for a Sound ID for nine seconds. After the Sync ID we introduced the option of sending what we called a Change ID, used to allow a second bank of triggers. After that the actual Sound ID is transmitted. By this means the system at present supports 26 unique Sound IDs. A PD (Pure Data) [16] patch is used to play back the high-frequency Sound IDs and thus is the central hub for controlling the distributed performance. The PD patch can itself be controlled through any network protocol including MIDI or OSC.
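One reading that is consistent with these numbers is that two of the 15 IDs are reserved for Sync and Change, which leaves 13 IDs in each of two banks, i.e. 26 Sound IDs. The following Python sketch illustrates that reading as a small state machine. The numbering of the IDs, the class name `TriggerDecoder` and the exact reset behaviour are assumptions made for illustration and are not taken from the implementation.

```python
TOTAL_IDS = 15            # unique frequency pairs reported in the text
SYNC_ID = 0               # hypothetical numbering: ID reserved as Sync ID
CHANGE_ID = 1             # hypothetical numbering: ID reserved as Change ID
LISTEN_WINDOW_S = 9.0     # listening window opened by a Sync ID

PAYLOAD_IDS = TOTAL_IDS - 2            # 13 IDs remain for sounds
ADDRESSABLE_SOUNDS = PAYLOAD_IDS * 2   # two banks give 26 Sound IDs


class TriggerDecoder:
    """Turns recognized raw IDs into sound numbers 0..25 (assumed protocol)."""

    def __init__(self):
        self.deadline = None   # end of the current listening window, if any
        self.bank = 0          # 0 = first bank, 1 = bank selected by Change ID

    def feed(self, raw_id, now):
        """Process one recognized ID at time `now` (seconds).

        Returns a sound number, or None if no sound is triggered.
        """
        if self.deadline is not None and now > self.deadline:
            self.deadline = None           # listening window has expired
        if raw_id == SYNC_ID:
            self.deadline = now + LISTEN_WINDOW_S
            self.bank = 0
            return None
        if self.deadline is None:
            return None                    # ignore IDs without a preceding Sync ID
        if raw_id == CHANGE_ID:
            self.bank = 1                  # switch to the second bank
            return None
        self.deadline = None               # one Sound ID per Sync ID
        return self.bank * PAYLOAD_IDS + (raw_id - 2)
```

Ignoring every ID that does not follow a Sync ID is what reduces false positives in this sketch: a single spurious detection cannot trigger a sample on its own.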

Formative Evaluation
In the following, we discuss five steps spanning ten months of the iterative and explorative design process of Poème Numérique. For each of the presented prototypes we describe the setting it was tested in, the improvements made based on previous iterations, the musician's feedback and reflective insights.

Spatial Sound Distribution using Smartphones (first iteration)
The first prototype successfully realised the use of high-frequency sound IDs, following Hirabayashi and Eshima [14]. A basic version of the app allowed sounds to be triggered on multiple smartphones in a university meeting room of 42.57 m², tested on 28 October 2015. The sound recognition worked, but had trouble when the phones were moved. The musician emphasised that the system has to be 100% reliable in recognising trigger signals. The recognition problems were caused by a Doppler effect. This technical problem was solved with the next iteration, as explained later.
From a creative perspective the musician appreciated the spatial distribution of sounds in the room [2]. This test also allowed the musician to get a feeling for the technical setup and its creative capabilities. This firstly prompted her decision to make an original composition instead of using an existing piece and secondly allowed her to brainstorm different ways to compose for a distributed smartphone setup. An open question after this first test was how to give the audience more agency beyond moving with their smartphones.

Second Iteration
The second prototype (see figure 3) mostly included technical improvements in frequency recognition and was again tested in the university meeting room setting. From a design perspective we discussed the need for moderation and feedback of audience interaction. Later versions of the app use distinct colors for each sample to make clear what is played back on each phone. We also added capabilities for an introductory message and to display text with each sample that is triggered. From a creative perspective we discussed that some sounds could be played on a normal sound system with only a part of the music coming from smartphones. To increase the diversity of sounds and to make the resulting soundscape more interesting we also introduced the use of weighted random selection to determine which sounds are played on a particular phone. Figure 4 shows a test performance using the system at a lecture at the Vienna University of Technology with informatics students. The lecture hall, one of the biggest in this university, had 404.51 m². The app was distributed through the Android platform store. The test performance showed that the transmission of high-frequency sound IDs is mostly robust, but that recognition problems might occur with untested smartphones and with increasing distance from the sound source. Also, some Android mods (e.g. Cyanogen) block microphone access due to privacy settings. While the large-scale use of the app was impressive visually, the musician was disappointed regarding the loudness of the joint output of the phones. We expected hundreds of phones would increase loudness drastically compared to the meeting room setting, but this was not the case. This may also be due to the special architecture of this auditorium. We consequently made the decision to amplify the output of some of the audience's phones. The test further underlined the necessity of feedback through colours (each sound is associated with a specific steady or flickering color displayed on the phone when the sound is played back), which not only made the interaction more transparent for the audience but also added feedback for the musician.
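The weighted random selection mentioned above can be illustrated with a short Python sketch. It assumes that each trigger carries a small table of candidate samples with artist-chosen weights and that every phone draws independently from that table; the sample names, the weights and the function `pick_sample` are hypothetical and only serve as an example.

```python
import random


def pick_sample(weights, rng=random):
    """Draw one sample name; higher weights are chosen more often.

    `weights` maps sample names to non-negative numbers (hypothetical format).
    """
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical table for one trigger: the bell sample is given three times
# the weight of each of the other two samples.
EXAMPLE_WEIGHTS = {"bell": 3, "pad": 1, "click": 1}

if __name__ == "__main__":
    # Simulate 17 phones reacting to the same trigger.
    phones = [pick_sample(EXAMPLE_WEIGHTS) for _ in range(17)]
    print(phones)
```

Because each phone draws on its own, the same trigger produces a different mixture of samples across the room, while the weights let the artist keep control over which sounds dominate.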

Fourth Iteration
The fourth prototype implemented the visual feedback changes discussed in the previous iteration. The musician also provided a new set of audio samples, and it was the first prototype that was tested in the location planned for the performance (see figure 6). The test first used electromagnetic pick-up coil microphones (see figure 5) to amplify some of the smartphones in addition to the speaker sounds emitted by all phones. Only electromagnetic waves (in this case from the speaker inside the phone) are amplified, which prevents the amplification of any ambient noise. The overall results were appreciated. To ensure a clearer distribution of sounds in the performance space we decided to use a center stage for the performer and four stations, one in each corner, where smartphones can be amplified by the audience.

Fifth Iteration
This iteration marks the first use of the four stations with one speaker each. A new set of sound samples considering this setup was used. This was also the first time that the musician's composition adapted to the sound of mobile phones by using high-pass filters and generally making use of the frequencies that work well with phones (e.g. bell sounds). We used 17 phones overall. Combined with the four stations this led to an encompassing spatial soundscape where individual sounds are not localizable anymore. Requirements by the musician were less delay when triggering sounds and the ability to trigger more sounds. Both changes involve a trade-off with the reliability and response time of the system. We were able to drastically reduce the delay in triggering samples from four seconds to 250 milliseconds with the only requirement of playing back the high-frequency sound IDs louder. We could not add a larger number of sounds because that would either make the system less reliable, introduce more latency again, or make control frequencies audible. Overall this iteration was considered to be the first realistic setting for the performance. Future iterations will not change this setup but only refine details.

Discussion
In the following, we first draw together artist reflections from the formative evaluation steps (see chapters 4.1 through 4.5), resulting in a list of learnings for design and conceptualisation of TMAP. These issues are mainly derived from the presented case study using the audience's distributed smartphones. However, we discuss their implications in context of other studies around TMAP.
Second, we present a series of learnings concerning the underlying technology from applying TMAP with smartphones using high-frequency sound IDs thereby reflecting and extending the insights presented in [14].

Learnings for the design and conceptualisation of TMAP
Having a low latency and an immediate reaction of the devices when they are remotely triggered by the DJ was an issue that concerned the general concept throughout all development steps. This low latency is not only a requirement of our particular artist and her musical intentions, but has also been reported as an audience concern in other work in the field of TMAP [17,18].
During development we had to spend a considerable amount of effort on testing and improving the reliability of the whole system. Especially at an early phase, there were several dropouts of smartphones which did not react when being remotely triggered. Given that every smartphone is held by an individual during a performance, every dropout causes one disappointed participant who is prevented from interacting. As spectator interaction is central to TMAP, we consider a reliable system (without dropouts in our case) as crucial for the whole concept.
Iterative testing is necessary for the involved musicians to get an idea of and a feeling for the anticipated system. As in our case, a system for TMAP starts with a basic concept. Although this might include musical and technical details even in an early development phase, it needs much more imagination at the beginning than later, when the system can actually be tested to some extent.
Considering the actual music or musical piece, we identified the difficulty of using an existing piece of music. In fact, it was the decision of the artist we collaborated with to write a new piece instead of using an existing one as originally intended. This issue of using or changing existing pieces or creating a new composition specifically for TMAP has been discussed in other work as well [19,20].
Another issue concerned the musical quality and the artist's flexibility. In particular, this concerned the low audio quality of smartphone speakers (e.g. little loudness and bass). The sound quality was not satisfying at first, but the artist was willing to adapt to the circumstances. Moreover, the artist suggested adapting the composition to the given smartphone speaker constraints. In practice, she used sounds within a higher frequency range more appropriate for this kind of speaker.
Important for the artist we worked with was a maximum of control over the musical result. This concerned the random distribution of a set of different sounds and colours among all participating smartphones. With the early prototypes, the distribution happened completely at random, which caused some quiet sounds to be covered by louder sounds although they were equally distributed. Thus, the artist wished to have a certain control over the random distribution to give quiet sounds more weight.
During the development, we discussed the amount of interaction the spectators have using the application. Our most recent prototype does not allow much interaction beyond starting and stopping the application or moving the phone while sound is triggered remotely. In the literature, this requirement has already been discussed, as the audience wishes for a particular amount of interaction [20,21].

Learnings concerning the underlying technology
We developed Poème Numérique based on a particular technology for audience participation using high-frequency sound IDs presented in Sense of Space [14]. In relation to the original technology and findings presented by the authors of Sense of Space, we discuss identified issues around this particular technology based on our experience.
In Sense of Space, the authors used iPhones only for their application and performance. We tried to increase the number of spectators being able to participate by developing an application for iOS and Android. To reduce the effort in building the same application for two platforms, we used Xamarin as described earlier. This approach turned out to work from a technical perspective, as we could realise the application for both platforms. However, we observed a problem with privacy settings with the Android modification CyanogenMod.
We can confirm the spatialization effect reported by Sense of Space [14, p. 60]. At the same time we can confirm the issues with the smartphone speakers being too weak to hear. An issue we observed concerns the distance of smartphones and the power of the PA speakers used to distribute the high-frequency IDs. In our case, some speakers we used to trigger the sound IDs were too weak if the smartphone to be triggered was further away. Nothing is reported about this issue in the original paper. Further improvements addressed an occurrence of the Doppler effect. The Doppler effect occurs when a smartphone is moved towards or away from the speakers. As a result, the frequency recognized by the respective smartphone is higher or lower than the frequency played by the speakers, and the ID is therefore not recognized. To minimize the chance of a Doppler effect disturbing the recognition process, the frequency interval used for recognition is chosen very generously.
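A rough calculation shows why a generous interval helps. The Python sketch below applies the standard Doppler formula for a moving listener and a stationary source, f' = f (c + v) / c, and converts the shift into FFT pieces of 86.13 Hz. The speed of sound of 343 m/s, the example hand speed of 2 m/s and the helper names are assumptions for illustration; the interval width actually used in the app is not stated here.

```python
SPEED_OF_SOUND = 343.0     # m/s in air at about 20 °C (assumed)
BIN_WIDTH = 22050 / 256    # about 86.13 Hz per FFT piece


def observed_frequency(source_hz, listener_speed):
    """Frequency heard by a listener moving towards (+) or away (-) from the source."""
    return source_hz * (SPEED_OF_SOUND + listener_speed) / SPEED_OF_SOUND


def bins_shifted(source_hz, listener_speed):
    """Size of the Doppler shift, measured in FFT pieces."""
    return abs(observed_frequency(source_hz, listener_speed) - source_hz) / BIN_WIDTH


def tolerant_bins(source_hz, max_speed):
    """Array entries to accept so that speeds up to max_speed are still recognized."""
    low = observed_frequency(source_hz, -max_speed)
    high = observed_frequency(source_hz, max_speed)
    return range(round(low / BIN_WIDTH), round(high / BIN_WIDTH) + 1)


if __name__ == "__main__":
    # Hypothetical case: a 19 kHz tone and a phone moved at 2 m/s.
    print(observed_frequency(19000, 2.0) - 19000)   # shift in Hz
    print(bins_shifted(19000, 2.0))                 # shift in FFT pieces
    print(list(tolerant_bins(19000, 2.0)))          # entries to accept
```

Under these assumptions a 19 kHz tone shifts by about 111 Hz, which is more than one 86.13 Hz piece, so a detector that checks only a single array entry would miss the ID while the phone is moving.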

Conclusions and Outlook
This contribution presented the technical basis and iterative explorative design process of a system for TMAP using smartphones and high-frequency sound IDs. The learnings from the perspective of musicians were technical requirements such as low latency and reliability, as well as a larger number of possible sound samples and better sound quality. Musicians accept constraints, and a broad range of functions is not necessary. Constraints even seem to further creativity. Further, we learned that such a system for TMAP necessitates original compositions. We also identified balancing the musician's control over the musical result with freedom for audience interaction as one of the core challenges in designing such a system. The iterative design and development process not only helped to produce the results presented in this paper; iterative testing also proved invaluable in helping the musician familiarize herself with the system, quite similarly to learning a new musical instrument. Building on [14], from a technical perspective we could confirm spatialization effects and issues with the low volume of smartphone speakers. We also observed issues with decreasing reliability when increasing the distance between the smartphone and the sound source of the high-frequency sound IDs or when using speakers that are too weak. In contrast to [14], we did not observe disadvantages from a creative perspective with regard to sounds not being localizable.
Poème Numérique will be refined and finalised in this iterative manner. Further tests in a live music performance setting will determine the technical and creative possibilities of such a system from an artist and an audience perspective.

Figure 1. The technical setup of using high-frequency sound IDs.

Figure 2. Triggering sounds of spatially distributed smartphones.

Figure 3. Second prototype version of the app displaying recognised control signals.

Figure 4. A test of the system with students during a lecture.

Figure 5. Using a pick-up coil to amplify the audio output of a smartphone.

Figure 6. Center stage setting in the final location with four stations, one in each corner.