Aural Expectations of Home: An Autoethnography of the Amazon Echo Smart Speaker

Stephen J. Neville

The paper conducts an autoethnographic case study of the Amazon Echo smart speaker to explore problems of acoustical privacy at home. A gap in surveillance research is addressed by treating smart speakers as private social media platforms that connect users to sound, devices, and home environments. The concept of aural expectations of home is developed to express how dwellers live with and through sound. The paper argues that problems of acoustical boundary control in the domestic sphere not only help to explain the communicative and socially connective aspects of smart speakers, but also illuminate surveillance issues as one’s sonic habits at home are rendered as data under the purview of corporations.

1. Introduction

2. A Home Away From Home

3. Aural Expectations of Home: Methodology and Theoretical Framework

4. Sonic Conflict

5. Overhearing and Mishearing at Home


Due to the porous nature of walls and the inherent leakiness of sound, I realized that my own media consumption was certainly audible to my next-door neighbors. I frequently played my own music before the noise curfew and would, on occasion, raise the volume on the Echo to feel the bass emanating from the device’s small subwoofer. The audio quality of the smart speaker is described in marketing materials as “room-filling sound,” and indeed I was quite impressed with the acoustic presence of the device when playing music and in generating Alexa’s voice. Unlike mobile VAPAs, such as Siri on iPhone, the sound quality of the Echo gives Alexa’s voice a powerful presence and sonority. As a result, I am certain that my habitual interaction with Alexa was overheard by my two immediate neighbors and anyone passing in the hall. In light of my status as a man in a single-occupancy room within this international student residence, I wondered: What did people think of me speaking frequently to a feminine-sounding VAPA? The literature has addressed the characteristically feminine-sounding voices of VAPAs as an aspect of their pre-domestication (Humphry and Chesher 2020) – a design feature that “hinges on the patriarchal, profit-driven implementation of symbolic femininity” (Bergen 2016: 95). Further, Heather Woods (2018) argues that as “gracious hosts” (337), the personas of VAPAs such as Siri and Alexa are meant to articulate gendered stereotypes of domestic femininity and to dissociate the technology from nefarious surveillance affordances. Thus, VAPAs’ perkiness, warmth, and obedience in a role of subservience are presumably designed to resonate with sonic imaginaries of a white 1950s housewife (Strengers and Nicholls 2018: 75). In my case, gender dynamics between myself and the VAPA raised a set of interpersonal concerns: If neighbors overheard my interaction with the device, would they perceive me as antisocial or perhaps lonely? My curt voice-commands to Alexa would certainly seem rude according to accepted interpersonal norms, and I wondered whether my neighbors would have contemplated the gender dynamics of a single man ordering, commanding, and seeking to control Alexa.

The prospect of my neighbors overhearing and judging my interaction with Alexa was unsettling. It took me over a month to feel comfortable with the acoustic reality of inescapable overhearing. One embarrassing moment did occur when I tried to cue up a song by the artist LP that my partner back home had recently recommended. It was 6pm, and most of the tenants on the first floor had returned home for the day when I called out to the device, “Alexa, play ‘Girls Go Wild,’” without realizing that my device was set to nearly maximum volume. Alexa responded with its clear enunciation and resounding voice, “I didn’t find any enabled video skills to play ‘Girls Gone Wild,’” and I felt my face quickly becoming flushed. Alexa had evidently misheard my voice-command, interpreting it as a request for pornography based on a similar-sounding title and then broadcasting this “request” for content to anyone within earshot of the device. This misfire invocation was not only a failed everyday speech act (Austin 1962: 16) and a simple technical breakdown of the device (Humphry and Chesher 2020: 11); by involving bystanders within earshot, it also betrayed my expectations of boundary control within the acoustic environment.

I hoped that no one had heard this embarrassing response from Alexa or that the words were not linguistically discernible. I tried the same command again later, this time with a lower volume and including the name of the artist in my request. Success! I listened to the song with delight and began to dance with the familiar and comforting feeling of being invisible to the world.

The misfire by the device violated my AEH and left me feeling exposed to public scrutiny. This reveals one way in which the design of smart speakers may affect experiences of the public/private boundary at home. Unlike screen-based and touchscreen devices, the Echo requires users to interact through vocalization, effectively making an otherwise private act public. On a computer or smartphone, “Girls Go Wild” autocorrecting to “Girls Gone Wild” would only lead to embarrassment if someone were to look directly over your shoulder at your screen, but in my case, the fact that it was spoken opened the possibility for public scrutiny of my private discourse. Moreover, the opportunity for other cases of overhearing heightened my awareness of the gender dynamics in my sonic interaction with Alexa.

This raises further questions about how users sonically manage the ways smart speakers affect privacy values and practices at home. For instance, the primary privacy affordance of the device – besides unplugging it – consists of a microphone-off button which, when enabled, lights up the top of the device in red to visually signal a state of privacy to the user. However, this contradicts the design principle of VAPAs as invisible IoT personas that are intended to be ignored and forgettable when inactive, leaving me skeptical that users regularly use this feature when carrying out sensitive conversations. Is this feature simply a superficial gesture made by Amazon to create the illusion of respect for mores of privacy at home? In my case, the domestication of the technology was accompanied by complete surrender to its eavesmining (eavesdropping + data mining) function (Neville 2020a). Even so, I remained aware of the interpersonal dynamics of overhearing and eavesdropping. Researchers have interpreted displays of typical human-to-human verbal social mannerisms towards VAPAs (e.g., saying “thank you” and “please”) as evidence of users’ tendencies toward the personification of their devices (Lopatovska and Williams 2018). However, such (in)politeness is most likely informed by conditions of overhearing. For example, children have also been found to mirror the way their parents interact with VAPAs, “which can be brisk, bossy, and harassing” (Kudina 2019). Adaptations that arise when the user is sensitive to the power relations between gendered voices – a single man with Alexa, for example – may also shape the user’s sonic habits, especially when the public/private boundary is challenged through leaky acts of vocalization. Finally, volume of speech is another way users may adapt to the technology while seeking to maintain acoustic boundary control. Whispering has always been a tactic to limit the number of recipients of a spoken message, and perhaps since Alexa has been updated with a “whisper mode,” this tactic will be adopted as users seek to avoid embarrassment regarding their media requests: for example, softly vocalizing a request for an ASMR (autonomous sensory meridian response) recording to Alexa to preserve a sense of sonic intimacy.

6. A Sonic Vector of Surveillance

7. Threshold Sounds

8. A Body’s Invisible Tether to Home

9. Conclusion

References
