The Issue with Voice Chat in Social VR
Papers Worth Reading: "Body, Avatar, and Me"
A few of my posts have touched upon what it means to interact with an avatar, how to select one, and other thoughts about engaging with a world through Social VR. Guo Freeman and Divine Maloney of Clemson University, in their paper “Body, Avatar, and Me: The Presentation and Perception of Self in Social Virtual Reality”, explore these identities through a series of user interviews. It’s an insightful study centered on the relationship between VR users, their avatars, and the experience of interacting with others (and their avatars).
When we navigate these social environments, we often “evaluate” avatars along the lines of aesthetics, gender, race, and age. Running through each of these modes, however, is the unifying through-line of the human voice. The presence of a user's real voice can easily have a detrimental effect on the social experience, especially for those with marginalized identities. There is unexplored potential, however, in tools such as voice modulators, which may enable users to avoid those effects.
You don’t sound like a—
For all of the customization options that creating a virtual self can offer, voice chat is sorely lacking. While this can be circumvented through digital routing, ‘voice changers’, and more, it’s a cumbersome and difficult solution for any but the already-knowledgeable to implement. Freeman and Maloney highlighted how each of their ‘categories’ faced its own form of backlash, along the lines of age, gender (both in gender expression / cross-gender play and in non-conforming voices), and race/language. This resistance, often going so far as outright discrimination, arose when others’ expectations of what the user behind the avatar ‘should sound like’ weren’t met on the various social platforms.
In the case of age, while there are many understandable reasons for adults to avoid trying to socialize with children, a specific point left undiscussed by the authors is the mutual-respect-and-validation that youth often seek on online platforms. I remember being a 13-year-old, playing World of Warcraft, and enjoying the fact that behind my character, I was just as capable and valid as anyone else. Young people’s desire not to be dismissed on account of their age can often leave them vulnerable to manipulation, whether from the archetypal far-too-old creep who hits on high school girls by complimenting their ‘maturity’, or from alt-right recruiters who exploit that same desire for validation. While of course broader safety concerns cannot be ignored, providing younger users an opportunity to mask their age could help them fulfill that need through a less predatory outlet.
As far as gender, race, and ethnicity are concerned, the multiplicity of negative interactions that arise calls for some form of voice anonymization. Be it a user whose femme voice immediately makes them a target for unwanted attention or harassment, a trans or gender non-conforming user who is outed or ‘clocked’, or simply a user who is harassed or accused of “lying” because their voice doesn’t match others’ expectations of identity, users should be given the choice to not be bound by their speaking voice. This same issue arises along ethnic and/or racial lines, with voice chat revealing language barriers, dialects, or accents that users may wish to keep undisclosed. Though not a perfect solution, speech recognition paired with a text-to-speech system, à la Apple’s Siri, could serve as a means to power an anonymized voice with numerous ranges, dialects, and more. At the very least, it would give users an option to engage in “voice chat” in a way that feels comfortable.
A Quick Note on the ‘Listening’ Side of the Equation
In a recent Twitter thread by user @AutisticSciencePerson regarding a game with voice chat they were testing:
The hearing side of voice chat is just as essential as the speaking side. A lack of listener-side features is a disservice to anyone with hearing hypersensitivity (especially at a time when “screaming into mic to make it distort” is perceived as humor).
Moving Forward
All in all, users of social VR platforms deserve the opportunity to participate in the fantasy that the medium claims to provide. The current array of voice features is dreadfully lacking in that department. Whether it’s to provide an avenue for humorous character exploration or a profoundly meaningful first step into gender euphoria, as audio designers we have a responsibility to create safe and meaningful ways of interacting across all modes of communication if this virtual-social fantasy is to be equitably accessible to all.