College Thesis: The Power of Conversation
November 1st, 2019
My concentration is The Power of Conversation, specifically the power of conversation in humans’ relationships with one another and with technology. The courses I have taken are a nice combination of technical classes in departments like Computer Science, Linguistics, and Design, paired with Humanities courses, namely Philosophy of Technology and History, and exposure to foreign languages - especially during my semester abroad in Berlin. I have also been working at a software startup called Troops.ai for the last two years, which closely harmonizes with my concentration; the company uses conversational technology to automate redundant tasks for sales teams. There, I have been able to see real-life applications of the material I am learning in my courses, helping me build my own opinions on the future landscape of technology and work. Gallatin has provided me with the perfect, well-rounded, interdisciplinary perspective on the burgeoning field of conversational technology.
The questions central to my concentration focus on the role of conversation in computer interfaces, both throughout history and looking forward. In particular, the overlap between my academic work in linguistics and computer science and my experience working for a chatbot company (a “chatbot” is software that interacts with users through conversation; in Troops’ case, through alerts containing insights to which users respond with actions) showed me just how complicated mimicking “natural” language is. According to Vijay Saraswat, IBM Chief Scientist for Compliance Solutions, “language is a very tough nut to crack because it allows us in a succinct way, without using a whole lot of symbols, to say an extraordinarily diverse set of things.” So, why do some engineers and designers strive to create conversational technology, an absurdly difficult task, when the currently existing interfaces, namely point-and-click graphical interfaces, work just fine?
Answering this question led me to ask several larger, interrelated questions:
- What role does conversation play in our lives?
- How do we feel when technology simulates human qualities? And how do we feel when it succeeds or fails at this task?
- What does the future of human-computer conversation mean for humans and computers?
Conversation in its purest form is the ultimate human act. To converse is to exchange thoughts and ideas and learn directly from one another in the process. It demands personality, vulnerability, empathy, and absolute attention. It demands listening and talking in equal degrees, allowing the bliss of being heard and of being understood. Conversation humanizes interaction.
The original interaction between a computer and its user lived in the non-ironic dark mode of the command line. The command line quite literally existed as a space in which users could give commands to the computer, confined to the limitations of text in command languages like that of DOS. With the introduction of Xerox PARC’s Alto, the first personal computer to have a graphical user interface, the relationship between users and computers transcended the console to become what we are familiar with today as state-of-the-art, point-and-click graphical interfaces. Lately, however, there has been another significant shift in user interfaces: from those rooted in standardization - filled with forms, tabs, and links - to new, more natural interfaces that allow the relationship between humans and computers to be far more intuitive. Incorporating some of the attributes of conversation that make it so meaningful to humans, conversational interfaces require back and forth between human and computer, whether in the form of chat, voice, or action buttons - a term we use at work to describe buttons on our product that perform actions in response to messages from the bot. Such back and forth creates a conversation between the user and the computer, even if the user is still in control. While the command line similarly involved a back-and-forth dialogue, the use of natural language instead of code requires less effort from the end user to execute their desired tasks.
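The contrast above can be sketched in code. This is a minimal, illustrative toy - not any real product’s implementation, and the commands, phrases, and button labels are invented - showing the same task expressed as a rigid command versus a natural-language request answered with a message and action buttons:

```python
def command_line(command: str) -> str:
    # Command-line style: the user must know the exact syntax in advance.
    if command == "report --deals --stage=closed":
        return "3 deals closed this week"
    return "error: unknown command"

def conversational(utterance: str) -> dict:
    # Conversational style: loose keyword matching stands in here for real
    # natural-language understanding; the bot replies with a message plus
    # buttons that perform follow-up actions.
    if "deal" in utterance.lower() and "closed" in utterance.lower():
        return {
            "message": "3 deals closed this week",
            "action_buttons": ["Show details", "Notify the team"],
        }
    return {"message": "Sorry, I didn't catch that. Could you rephrase?"}

print(command_line("show me closed deals"))   # fails: not the exact syntax
print(conversational("How many deals closed this week?"))  # succeeds
```

The same request that the command line rejects outright is understood conversationally, which is the sense in which natural language demands less effort from the end user.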
But how do people feel when they interact with conversational technology? Masahiro Mori, in his 1970 essay “The Uncanny Valley,” posits that as technologies more closely simulate humans, our empathy towards them grows - but only until a certain threshold. Once a program fails to achieve truly life-like features, it enters an uncanny valley in which the viewer is repulsed. Furthermore, Alan Turing’s 1950 paper “Computing Machinery and Intelligence” - which proposes a test of machine intelligence in which a machine must out-compete a human at convincing a judge that it is the human and the human is the machine - sets the precedent of humans expecting conversational technology to be accurate to the degree of indistinguishability. But, as Mori notes, when conversational technology comes too close to successfully echoing human qualities, we, as humans, are unnerved. The feelings of discomfort associated with such human-technology interactions are strongly linked to our inability to categorize where such entities belong in our minds - are they like us, or are they different?
Historically, humans have always attempted to categorize entities ontologically. Dating back to ancient Greek philosophy, Aristotle distinguishes between matter and essence when observing the nature of living things in On the Parts of Animals. He claims matter consists of the necessary parts and materials something needs to exist, while the soul constitutes the essential character of an animal. This distinction immensely impacted the way humans viewed nature, creating hierarchies of order. Lisa Zunshine, in her book Strange Concepts and the Stories They Make Possible, argues that such ontological categorizations are among the most compelling parts of consuming science-fiction stories. Readers grapple with tales in which entities don’t clearly fall within the lines of mere function or true essence, eventually forcing them to resolve the ambiguity of classifying counter-ontological beings. Such tales include the Frankenstein complex, derived from Mary Shelley’s 1818 novel Frankenstein, in which a master’s super-intelligent creation rebels against him, ultimately destroying him. Furthermore, Spike Jonze’s Her, a film about a lonely man who falls in love with his operating system, situates viewers in a constant state of unsettlement - does Samantha love him, or is she just programmed to make users feel good? Is their love genuine, or is Theodore Twombly incapable of real human connection? When ambiguity remains within the constructs of fiction, we are enticed by the unsettling feeling - namely Mori’s uncanny valley - associated with such a phenomenon, yet our brains are hard-wired to fear such stories coming to life. Frankenstein teaches us that playing god is wrong, while Theodore exemplifies humans’ declining connection to one another due to technology.
It makes sense that we fear technology that responds to us in the same languages and intonations that we give it; we remain torn over whether these programs are, in fact, like us or whether they belong to a different category. We imagine that, with exponential technological advancement, such a future will render us obsolete, at risk of domination or destruction. But this is only a future within the current mindset of human vs. not human.
Looking at the relationship between conversational technologies and their makers, we interestingly see a resemblance to the relationship with an end user. Just as a user attempts to resolve ontological ambiguity about whether the program they are conversing with is more human or machine-like, engineers and designers are very intentional about where their programs fall on this spectrum in order to avoid entering Mori’s uncanny valley. Reflecting on the digital products we use daily, the functions they serve are important - they soothe our pains. Less obviously, the feel of these products majorly influences our perceptions of them. Exceptional products feel enjoyable to use, a feeling created by the structure and features of the experience, be it the colors, form, or tone of content. So, how can a designer leave an impression on a user without a clean, user-friendly graphical user interface? Don Norman in The Design of Everyday Things claims, “good design is actually a lot harder to notice than poor design, in part because good designs fit our needs so well that the design is invisible.” This idea of “invisible design” is vital when considering the transition from visual interfaces to conversational ones. Good design in the case of conversational interfaces requires character to feel good. This is observed in products like Duolingo’s chatbot personalities, such as Chef Roberto from Spain, or Amazon Alexa’s ability to make jokes. Human-like attributes such as humor, sass, and the ability to go on minor tangents all enhance interactions with technology - one could even argue that such products are more human than monotone phone operators reading from a script.
Yet there is a fine balance between too functional and machine-like versus too human and unproductive that designers must be wary of when making decisions. The goal is not to convince users that they are interacting with a human, but instead to make the interaction with the product more natural while still serving a function. When designers attempt to fully replicate the nonlinearity of human conversation, the messiness of an unmoderated interaction muddles the intention of the program. An example of this line being crossed is Ikea’s chatbot, Anna. One of the first companies to use chatbots in customer service, Ikea created Anna to guide customers around the Ikea website in an interactive and conversational way, adding a personal touch. Unfortunately, in trying too hard to make her seem human - namely, giving her the capability of going off on excessive tangents - Anna’s designers failed to make her real purpose obvious to the end user: giving the right answer as fast as possible. Ikea had to retire Anna early because users were rude and overly sexual with her. Such a sexist reality is not surprising when it comes to chatbots, though. Examining the most popular chatbots today, Anna, Siri, Alexa, and Samantha from Her are all female. The feminine personification of these programs, combined with the functional purposes they serve, often leads to inhumane treatment of such programs. To this I posit: even if we recognize these programs are non-human, are we not still obligated to treat them more humanely than ordinary inanimate technology?
Examining ELIZA - created in 1966, widely considered the first chatbot, and designed to imitate a therapist who asks open-ended questions and even responds with follow-ups - we see that treating such conversational technology humanely improved users’ overall experiences. Her creator, Joseph Weizenbaum, was surprised by the number of individuals who attributed human-like feelings to the computer program; many of his coworkers wanted alone time with ELIZA in his office. The reality of ELIZA reveals that some individuals in fact feel more comfortable sharing personal information with conversational programs than with friends and family - but, as the failure of Anna shows, clear goals and intentions are important for designers to incorporate when building products.
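ELIZA’s therapist persona was produced by simple pattern matching and pronoun reflection rather than any understanding of language. The following toy sketch works in that spirit - the patterns and responses here are invented for illustration, not Weizenbaum’s original DOCTOR script:

```python
import re

# Pronoun reflections turn the user's "I/my" statements into "you/your"
# follow-up questions, which is what makes the echo feel attentive.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

# Each rule pairs a pattern with an open-ended question template.
RULES = [
    (re.compile(r"i feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.+)", re.I), "How long have you been {0}?"),
    (re.compile(r"my (.+)", re.I), "Tell me more about your {0}."),
]

def reflect(fragment: str) -> str:
    # Swap first-person words for second-person ones, word by word.
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in fragment.split())

def respond(statement: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(statement)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please, go on."  # default open-ended prompt when nothing matches

print(respond("I feel anxious about my work"))
# -> Why do you feel anxious about your work?
```

That a handful of rules like these could elicit genuine emotional attachment is precisely what surprised Weizenbaum: the sense of being heard came from the user, not the machine.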
As conversational interfaces become more popular in business contexts, a new ontological category emerges. For many businesses, introducing conversational interfaces to older, clunky systems reduces friction and makes unpleasant experiences more natural. Services like Drift, a conversational marketing tool, help businesses quickly convert leads into conversations in real time, while products like Lemonade reduce the process of buying insurance to under five minutes. The point of a conversational interface in these cases is not to fool the end user into believing the product they are interacting with is human, but to be functional and serve a purpose in the traditional sense of technology while creating an intuitive experience by pulling on the familiarity of human conversation. If humans can come to recognize conversational technology as a product that is intentional in its function but exudes natural essence, the interplay between humans and the products they use shifts toward prosperity, since there is no ontological ambiguity. The product is non-human but easy to communicate with because it imitates the most human of behaviors: conversation. The removal of ontological ambiguity opens an entirely new spectrum of ways in which humans and computers can interact, resulting in new social norms and user expectations, as well as new problems for designers and builders to solve.
While I believe there are many advantages to the shift from standard, inanimate graphical interfaces to more natural, conversational interfaces, the future of voice carries many dangers. As artificial intelligence and conversational technology focused on spoken language become more powerful, the choice we make to opt into conversing with such products must be backed by proper education on what participation entails. While many of the use cases I mention are confined to the realm of individual experiences like sales transactions, therapy, and the automation of redundant tasks, we must remain cognizant of the implications of technology that can understand our conversations - even when we are offline. With voice assistants like Google Home, Amazon Alexa, and Siri reaching over one third of the population of the United States, surveillance and data mining by the tech giants that deploy them are the real adversaries. The danger of the power of conversation lies not in the invention itself, but in the corporations that deploy the technology.