September 24 is World Interaction Design Day, an annual event where people who are interested in or work in design fields come together as a united global community to show how interaction design improves the human condition. This year, the Ladies that UX of Durham hosted an event at McKinney, in Durham, and invited a panel to discuss trust and responsibility, focused on gender and voice interaction. The event was co-sponsored by Adobe, IxDA, and McKinney. The panel was moderated by Gretchen McNeely, a UX manager at Accenture Digital, and featured four speakers from various fields: Scott McCall, a UX Design Lead at IBM; Cait Vlastakis Smith, founder and principal consultant at Everyday UX; Colin Dwan, a Creative Technologist at McKinney; and Nupoor Ranade, a Sweetland DRC fellow and PhD candidate at NC State University. An audience of 30-35 attendees took part in the discussion, which explored topics ranging from user expectations and biases, to stepping away from gender completely, to empathy for the machine. Questions within each of these broader topics, combined with the insights and experiences shared by everyone in the room, enriched the discussion.
User expectations/biases
The discussion kicked off with a question posed by Gretchen about biases when interacting with voice interfaces: in what ways do users approach or personify voice assistants? Generally, people fall into a few perspectives. Many tend to mirror the personality of the interface, which means they see themselves in the assistant. Others take a commanding perspective, assuming they are powerful enough to boss the voice assistant around. Finally, there is the companionship perspective, in which people consider the AI device a peer or friend. Answers to these well-articulated questions were diverse. Scott highlighted the idea of looking at the voice assistant as merely an object that helps solve problems.
The key is understanding what users want and providing the required support. Nupoor described how gender roles build into users' biases through repetition and correlation: if most companies give their AI assistants feminine roles, users are trained to associate that gender with voice assistants. Most AIs today, like Siri, Alexa, and Cortana, are set to feminine voices by default. This leaves less opportunity to bring about change by understanding what users actually want, as opposed to what they have simply come to prefer. Colin brought the idea of context to the conversation, explaining that at times the personality of a voice assistant can be peer-like, as in applications meant to serve tasks like hospitality, while at other times it can act as an assistant or co-captain helping with tasks. But does this mean that function is the only factor affecting the personality of voice assistants? To explore that, the conversation was steered toward developing likability. How important is it for us to like a bot? Would we like it if it only served its purpose by completing assigned tasks? The panel discussed how responsive actions by a bot help users gain confidence in their ability to interact with it, and how receiving the right response triggers positive, rewarding feelings.
Voice assistants like Alexa are found in professional spaces such as the workplace as well as in personal spaces such as homes, where the assistant is shared by everyone in the family. Research shows that feminine voices are perceived as more comforting, which is why they are popular in voice interfaces. However, would everyone in a family agree with this statement? To find another reason for the use of feminine voices, we need to look back at the history of AI. One of AI's early appearances was in the movies, where the AI's job of saving the world, doing good, and being compassionate reflected characteristics believed to belong to women. What does it mean to have various demographic groups use a single device? As AI devices become more and more popular, the line between human and machine is blurring. Looking back at other examples of assistants drew attention to Clippy. The room wondered whether Clippy was assumed to personify a neutral gender. Although most had assumed Clippy to be male, others had never thought about this aspect when they were introduced to what remains one of the oldest conversational assistants, introduced by Microsoft.
Stepping away from gender
The discussion and confusion about Clippy's gender led to the idea of a gender-neutral chatbot. Would stepping away from the binary that we see in most places today cause auditory discomfort for users? How do we perceive the idea of a genderless voice? What are the natural challenges of moving away from the binary? At this point, Scott took the opportunity to introduce the audience to Alexa's Brief Mode, a setting Amazon recently added for Alexa. Once enabled, it reduces the amount of verbal feedback from Alexa by replacing it with a short beep. The beep, accompanied by the visual cue of the ring lighting up, confirms that Alexa has acknowledged your request, without the verbal chatter. Such simple chimes that replace longer, more descriptive verbal feedback are called 'earcons'. Earcons eliminate the need for gendered voice feedback altogether.

The idea of haptic feedback is similar. An attendee shared her experience with the haptic feedback provided by her Apple Watch. Apple Watch defines haptics for specific purposes; each haptic type conveys a specific meaning used to notify users of updates. Haptics can also be configured by users themselves, giving them more control and sparing them from jumping out of their chair every time a new notification arrives. The panel discussed this new approach of designing interfaces that users can control, helping to break traditional boundaries.

Moving away from voice interaction helps us move away from gender and its related implications. Yet despite those implications, voice plays a major role in making AI humanistic. There are products on the market today that help the elderly find companionship through AI functions. And it is not just the elderly: along with getting tasks completed, most of us like to be treated well when we get something right. A humanistic voice affirming our action is a treat that is taken away if the voice is missing. Earcons shift AI from humanistic to purely machinic in an instant. And while the machine still needs to learn to adapt to each speaker's words, false identities can become a problem.
Empathy for the machine
As we delve into these ideas of moving away from the familiar hinges and elements of day-to-day interfaces, it is important to consider how we feel about the machine. What is our empathy for a machine? Do we have an equivalent bill of rights for machines now that they literally have a voice? Since machines are able to respond to humans, they should have the ability to stand up for themselves; if algorithms don't help them do that, who will? Cait brought in the idea of collaborative design, arguing that tech giants around the world should establish such standards even before products like these take hold in the market. We also need to worry about technology lock-in if we decide to take this direction. Technology lock-in happens when the market settles on a technological standard and, because of several factors including network effects, gets locked in or stuck with that standard even though market participants may need an alternative. Nupoor argued that one of the challenges of setting up standards is the fast-paced world of technology, where standards become outdated in minutes and users find ways to exploit systems in seconds. This makes it difficult for designers to disentangle issues like gender and AI.
Takeaways
Designers face an ethical challenge when defining the role of gender in voice interfaces. It is an instantiation of the trust and responsibility given to us as designers and strategists when we design interaction systems. It is important to understand what we owe our audiences, how we embrace that trust, and how we respond to audience requests. We need to handle these issues with responsible design that constantly questions the diverse perspectives surrounding them. The more voices and diversity we have, the more comfortable users will feel communicating with their gadgets. Earcons and haptics seem to be viable options, but they still do not completely meet our goals for empathetic AI product design. Another way of structuring this problem would be to give users the power to select their preferences, just as they do for language options. However, how much power is too much, and what agency does the machine have to combat unfair treatment? The conversation came full circle when the panel concluded that we need more research on algorithms, syntax, pleasurable sounds, and societal problems to deal with these design issues, leaving plenty of food for thought for all the AI researchers out there.
Writing this report gave me a chance to reflect in depth on the panel and the conversations that took place. Being the only academic on the panel strengthened some of my notions about the work taking place in academia and industry. Although academic work is crucial, the fast pace of AI invention in industry makes it difficult for academics to catch up and suggest approaches that add humanistic value to the research. Industry practitioners, on the other hand, attempt to rely on academic participation that is sometimes out of reach or out of date. More collaboration between industry and academia on AI research will benefit both communities and lead to sounder, more ethical AI inventions.