Sound affects us both positively and negatively. The urban sound environment can stimulate stress or comfort, boredom or interest, engagement or disinterest, a desire to stay or a desire to leave. For both inhabitants and visitors, the sound environment forms societal rules that we may adhere to or break. I appreciate silence in my life, yet I know some people cannot sleep without the hum of traffic outside their window. After decades of being sidelined, noise abatement and better city planning have gained traction on the agendas of governmental and local authorities. The type of sound, and not only its absolute energy level, now also has some bearing. For the majority of city inhabitants, when the noise doesn’t actively annoy us, we try to ignore it as we go about our daily activities.
Yet, intriguing features exist within the noise that may tempt a different listening strategy. This information is hidden from us either by acoustic masking or because it is an artifact of a more salient action that holds a more prominent position in our perception, or simply because we have become so familiar with it that we no longer care to listen. If we become interested in these features, we may overcome some of the negative experiences of urban noise and feel a sense of space, place, and presence. In the urban sound landscape, knowing what to preserve, what to improve, or simply feeling more present and positive implies that we need to maneuver our listening to be interested or at least engaged. However, maneuvering our listening in this way is not easy. This is where my work with 3D sound art and the urban landscape comes in.
The significance of our perception of the sound landscape has been discussed and explored since the 1960s, when Murray Schafer first coined the terms "lo-fi" and "hi-fi" soundscapes (Schafer 1993). Yet the complexity of the urban soundscape makes it particularly challenging to quantify and study compared to archetypal "hi-fi" sound landscapes such as forests and natural habitats located away from human activity. Our understanding of the importance of spatial perception in the reception of all kinds of sound and music has also improved in recent years. However, although we understand the significance of sound localization in our general interaction with the world around us (for example, the cocktail party effect), we do not understand the impact of the spatiality of sound and of sound propagation, particularly in urban environments. Past studies on city noise fail to address the importance of the interaction between space and the timing and nature of sound events, and how this interaction influences our perception of time, cycles, and rhythms, as well as our psychology and habitual well-being. What we do know is that when sounds are spatially separated, we can mentally process more information by virtue of our spatial hearing. We also know that listening to recorded sound out of context gives the creative artist time to explore its features, and that this is one of the fundamental approaches to electroacoustic composition.
Recently, the hypothesis that urban soundscapes can be ‘improved’ through sound installations was tested by Fraisse (2019). His study was of a specially made sound installation created from distributed audio sources, located in a city park. The work included detailed listener studies of temporality, loops, rhythmic and non-rhythmic sounds and showed that the sound installation benefited users’ soundscape assessment, especially when nearby construction work was actively sounding. Yet, despite the artwork being 'spatial' in itself, with sounds moving from one place to another, this aspect of the research was only lightly assessed.
Although artistic research alone cannot answer every question about urban noise, we can begin with a hypothesis for evoking and provoking a new awareness of our urban sound environment, and then test this hypothesis in practical work. The hypothesis is as follows: if we can draw our attention to interesting features and enhance what is most curious, beautiful, or relevant, we can maneuver our listening to be excited by the urban soundscape.
I have tested this hypothesis in six artistic case studies: four were outdoor site-specific sound art installations, one was intended for an outdoor show but for practical reasons was realized as concert pieces, and the final one was an indoor audio-visual museum installation. They all share the following workflow, which developed over the course of the project:
- 3D site-specific recording made with ambisonics microphones.
- 3D impulse response recordings made with ambisonics microphones.
- 3D sound analysis and soundfield decomposition.
- Composition to draw out perceptual features that listeners may find interesting.
- Layering of the composed 3D soundfield back into the site from which the sources were taken (four of the case studies). Three of the installations used the same audio playback technology, with the volume adjusted to be heard just above the site's background sound level.
- Listening: the work is best experienced when you pause, listen, and give the composition time to unfold. You may then also begin to experience a connection with the sound and your own presence in the space.
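The level-matching step in the workflow above (playback heard just above the site's background level) can be illustrated numerically. The following is only a minimal sketch, not the procedure used in the installations: it assumes that both the background and the composition are available as digital sample lists on the same scale, and the function names `rms_db` and `playback_gain`, the 3 dB margin, and the test tones are all hypothetical. On site, such an adjustment would more plausibly be made by ear or with a sound level meter.

```python
import math

def rms_db(samples):
    """Return the RMS level of a signal in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    # Floor the value so that digital silence does not produce log10(0).
    return 10.0 * math.log10(max(mean_square, 1e-12))

def playback_gain(background, composition, margin_db=3.0):
    """Linear gain that places the composition margin_db above the background."""
    gain_db = rms_db(background) + margin_db - rms_db(composition)
    return 10.0 ** (gain_db / 20.0)

# Hypothetical test signals: a quiet "site" tone and a louder "composition" tone.
rate = 48000
background = [0.1 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
composition = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]

gain = playback_gain(background, composition, margin_db=3.0)
scaled = [gain * s for s in composition]

# Difference in level between the scaled composition and the background, in dB.
print(round(rms_db(scaled) - rms_db(background), 2))
```

A broadband RMS comparison like this ignores spectral masking and the time variation of the site, which is one reason a margin chosen by listening is likely to be more reliable than a fixed number.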
Extended durations of sound were recorded to capture the temporal variation of the soundscape, including intermittent sounds, repeating sounds, key sounds (strongly recognizable sounds), and the continuously varying background. The approach to source collection was inspired by the ecological theory of Eleanor and James Gibson, which has been more recently elaborated by Tim Ingold and others. Ingold questions why, in soundscape composition, we artificially slice up the environment based on the unique location of our microphone or along a soundwalk that we may take. Instead, he argues that "the world we perceive is the same world, whatever path we take" and that "place" is not a space occupied by things but rather a woven habitat that we understand through motion: existence unfolds, affected by encounters with the surroundings where knowledge is gained through movement that unravels what is called "the meshwork." There are many interesting things to say about Gibson, Ingold, and the wave of perceptual psychologists and anthropologists who have followed. A more in-depth discussion of this topic, with references, can be found in my peer-reviewed publication “New Directions in Soundscape-based Sound-Art: Hybridizing Autoethnography with Computational Analysis” (Barrett, 2021).
My theory is that we may overcome the problems of transient participation by recording for a duration beyond our normal period of stay and from many spatial locations; in other words, by achieving in recording what is impossible for a composer-visitor. Also, as ambisonic microphones record uniformly from all directions, we already capture the scene without directional bias.
In the early phases of the project, I devised a method to separate multiple simultaneous sources and their motion features from the 3D recording. This process outputs individual streams of mono sounds and spatial data. In addition to isolating clear features, I can alter the spatial-frequency spectrum to reveal information usually masked in noise. In composition, I decide when and how to enhance interesting elements and musically reflect the natural sound environment’s changing spatial and temporal event fingerprints. (Since then, the VST plugin Compass Tracker (McCormack et al. 2021) has been released; it can achieve a similar result but outputs only a subset of the information I am interested in.) The first iteration of this work can be read in my peer-reviewed article “Deepening Presence: Probing the hidden artifacts of everyday soundscapes” (Barrett, 2020).
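To give a sense of what "spatial data" extracted from a 3D recording can mean, here is a deliberately simple sketch. It is not the separation method described above, nor the algorithm in Compass Tracker; it is the common pseudo-intensity-vector estimate of a single source direction from first-order ambisonics. The channel layout (W, X, Y, Z with unit-gain W), the function names, and the synthetic test tone are assumptions made for illustration.

```python
import math

def encode_plane_wave(signal, azimuth_deg, elevation_deg):
    """Encode a mono signal as first-order ambisonic frames (W, X, Y, Z).

    Assumed convention: W carries the signal at unit gain, and X, Y, Z
    carry it weighted by the direction cosines of the source.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gx = math.cos(az) * math.cos(el)
    gy = math.sin(az) * math.cos(el)
    gz = math.sin(el)
    return [(s, s * gx, s * gy, s * gz) for s in signal]

def estimate_direction(frames):
    """Estimate (azimuth, elevation) in degrees from W, X, Y, Z frames.

    Sums the products W*X, W*Y, W*Z over time (a pseudo-intensity vector),
    which points towards the dominant source.
    """
    ix = sum(w * x for w, x, y, z in frames)
    iy = sum(w * y for w, x, y, z in frames)
    iz = sum(w * z for w, x, y, z in frames)
    azimuth = math.degrees(math.atan2(iy, ix))
    elevation = math.degrees(math.atan2(iz, math.hypot(ix, iy)))
    return azimuth, elevation

# Hypothetical source: a 440 Hz tone arriving from 60 degrees left, 20 degrees up.
rate = 48000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(4800)]
frames = encode_plane_wave(tone, azimuth_deg=60.0, elevation_deg=20.0)
azimuth, elevation = estimate_direction(frames)
print(round(azimuth, 1), round(elevation, 1))
```

This estimate only works cleanly for one dominant source at a time. Separating several simultaneous sources and following their motion requires analysis per time window and per frequency band, with higher-order recordings where available, which is where the real difficulty lies.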
The recordings were analyzed using two methods: computer-based feature recognition, also known as 'computer listening,' and traditional 'composer listening.' While the workings of computer listening can become quite technical, simply put, we either instruct the computer to listen for specific information in the spatial, spectral, or temporal domain, or we ask it to 'learn' about the sound through machine learning and neural networks. In contrast, composers traditionally follow an autoethnographic approach to their sound, that is, a process guided by listening and decision-making based on experiential self-reflection. When considering ecological theory and working with extended recordings, we are limited by the ear’s and body’s variable awareness. After an hour or more of listening, a decision I make one day may differ from the one I would make if I carried out the same procedure on another day. This implies that the autoethnographic approach is prone to failure. With computer listening, we can achieve greater consistency. However, this, in turn, may come at the expense of potentially interesting details that the computer ignores. Combining both methods can result in a balanced outcome, revealing both one-off features and main archetypes, and is explained in more detail in “New Directions in Soundscape-based Sound-Art” referenced above.
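As a small illustration of instructing the computer to listen for specific information in the temporal domain, the sketch below flags frames whose energy rises well above the median level of a recording, which is one crude way to find intermittent events against a continuous background. It is an assumed example rather than the analysis used in the project; the function names, the frame size, and the 6 dB threshold are hypothetical choices.

```python
import math
import statistics

def frame_levels_db(samples, frame_size):
    """Return the energy level in dB of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        mean_square = sum(s * s for s in frame) / frame_size
        # Floor the value so that silent frames do not produce log10(0).
        levels.append(10.0 * math.log10(max(mean_square, 1e-12)))
    return levels

def detect_events(samples, frame_size=1024, threshold_db=6.0):
    """Return indices of frames that exceed the median level by threshold_db."""
    levels = frame_levels_db(samples, frame_size)
    background = statistics.median(levels)
    return [i for i, level in enumerate(levels) if level - background > threshold_db]

# Hypothetical recording: ten quiet frames with one loud event in frame 4.
frame_size = 100
recording = []
for frame_index in range(10):
    amplitude = 0.5 if frame_index == 4 else 0.01
    recording.extend(amplitude * (1 if n % 2 == 0 else -1) for n in range(frame_size))

print(detect_events(recording, frame_size=frame_size))
```

The detector is perfectly consistent from one day to the next, but it also shows the trade-off described above: anything quieter than the threshold, however interesting to a listener, is ignored.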
Inspired by how the IKO loudspeaker creates sound images by reflecting beams of sound off the surrounding surfaces, I wanted to make a similar technology that was cheaper and more portable. Project member Franz Zotter of the IEM (Institute of Electronic Music and Acoustics) had been involved in the invention of the first IKO prototype some years prior (Zotter et al. 2017). In collaboration with Zotter, we designed and prototyped an 8-channel version, which we called the 170-loudspeaker. More about this technology can be read in Drack et al. (2020). This speaker was normally used to reveal features of the acoustic fingerprint and was played in conjunction with an 8-channel array of small loudspeakers.