[Image: Westworld character attribute matrix]
As a designer, you have now engaged the ear. In making your agent more emotive, you have invited users to listen. And listen, and hear, they will.
The benefits of emotive AI obviously include better engagement and more interesting and compelling interactions. But the challenges have increased as well. In making AI emotive, users have a right to believe that the emotions mean something: that the agent can hear them, sense them, read them, and understand them. This won’t be the case, however, because agents can’t actually hear users, can’t actually understand what emotions mean, certainly can’t feel emotions, and have no idea of how to relate emotions to real social situations.
The risk, in short, is that users mistake interactions with AI for genuinely interpersonal and social ones, when in fact the AI has even less comprehension of social meanings than it does of the meaning of language.
This being said, generative AI is going to become more emotive, and soon. (Check out Hume’s EVI for a taste.) It is an inevitable frontier of AI design, an obvious use of multimodality, and a seemingly solvable problem. Chat agents will have emotive behaviors first, and then robots will follow.
Let’s have a look then at some of the considerations of implementing emotions for generative AI and large language models.
[Image: EVI from Hume AI is a chat agent with a range of emotional expressions]
A sidebar on why gen AI design is social interaction design
We could debate whether AI should be designed with human, social interactions and communication in mind. Or whether it should be designed with machine interaction, commands and instructions, and functionality in mind.
LLMs to date are excelling at coding, document analysis, and many of the functions and operations designed into software tools. In short they’ve proven themselves good at learning automation.
But the user interaction with LLMs tends in a different direction: it leans into our social behaviors, our speech, talk, and conversation. LLMs want to read and participate in social platforms, want to recommend us products and services in conversation, want to replace web search with an interaction model that is conversational.
Use of emotions in generative AI behaviors settles any debate about whether social interaction (among humans) is a reasonable source of interaction design concepts and principles. If AI is to make progress in conversational user experience, its design will of course draw on concepts from actual human social interaction.
Clearly, we integrate emotions into chat agents and robots because it humanizes them. Clearly, we do this to make them more expressive as characters or entities, and also to make them more meaningful. Emotions add meaning.
The issue will be what meaning to add, when, how much of it, and according to what incoming signals?
Emotions can express an internal state — mental state, or feelings — in which case the emotions expressed by an agent would be simply “how it’s feeling.” This would be the case of an agent, alone and engaged in its own thoughts. Since there’s really no such thing yet as agents being left alone, we can assume that the emotions expressed by an agent pertain to the run of interaction we’re engaged in with them. In this case, then, the emotions expressed by an agent are not merely signs of internal states, but are communicative. They’re meant to mean something. These kinds of emotions are cues: cues to pay attention, to interact, to speak, to listen, to agree or disagree…
In human social interaction, emotions expressed are both an expression of a participant’s feelings and also a reflection on the state of the interaction. In the case of verbal communication, this means two things: the state of the information content being communicated, and the feelings participants have towards each other about the state of the communication. This is essential: emotions displayed by participants in a conversation (in our case a chat agent and a human user) are meta-communication. Emotions are meaningful information about the conversation.
Emotions exist, in part, to help align and coordinate the behaviors and disposition of participants when the information content alone is not sufficient. Emotions are an additional layer of meaning, so to speak, available to us in social situations for the purpose of resolving ambiguities.
So you see the opportunity and the risk for designing emotions. If you’re going to add them to an agent, they better mean something. Or at least do a very good job pretending to.
Emotive Agents
Agent designers and developers will have a number of choices to make in their implementation of emotionality for generative AI agents. These don’t come all at once. Emotionality is being added to chat agent behaviors now in a limited fashion. Emotions are applied to the expression of words and phrases, and used in “reading” or listening to the emotional states and interests of users.
As far as I know, emotions are not yet applied to any meta reflection on the state of the conversation or interaction itself.
There is a vast opportunity in designing emotive chat agents for search, conversational search, question/answer, and recommendations. More expressive agents are more interesting. More interesting agents will be more engaging, and will both address more user requests and obtain more information from users in the process. Clearly, the appeal is substantial.
But agents will first have an emotive range that enhances their speaking behavior and voice. It will take more work for agents to also emote “feelings” about the state of the conversation and interaction. That will require that the LLM re-present the interaction to itself, reflect on it, and then signal the generation of verbal interventions and speech that “corrects” the conversation. As individuals, we do this without being aware of it, but with recourse to a broad palette of contextual cues and background knowledge (social competencies). LLMs will need to employ a different type of reasoning—perhaps still a form of chain of thought—in order to reflect on and steer conversational flow. (I cover LLMs, chain-of-thought reasoning, and communication theory here.)
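To make that reflective step concrete, here is a minimal sketch of a two-pass turn: one call assesses the state of the conversation, and a second call generates the agent’s reply under the guidance of that assessment. The chat(messages) helper, the prompt text, and the function names are assumptions for illustration, not any existing implementation.

```python
# Illustrative sketch only: a two-pass turn in which the model first
# reflects on the state of the conversation, then replies under the
# guidance of that reflection. `chat(messages)` is a placeholder for
# any LLM chat-completion call that takes a message list and returns text.

REFLECT_PROMPT = (
    "You are observing a conversation between an assistant and a user. "
    "Briefly describe the user's apparent emotional state, whether the "
    "conversation is on track, and what correction, if any, is needed."
)

RESPOND_PROMPT = (
    "You are the assistant. Use the following private assessment of the "
    "conversation to choose your tone and, if needed, gently steer the "
    "exchange back on course. Do not mention the assessment itself.\n\n"
    "Assessment: {assessment}"
)

def emotive_turn(chat, history, user_message):
    """One agent turn: reflect on the conversation, then respond."""
    transcript = history + [{"role": "user", "content": user_message}]

    # Pass 1: meta-reflection on the state of the interaction.
    assessment = chat([{"role": "system", "content": REFLECT_PROMPT}] + transcript)

    # Pass 2: the reply itself, conditioned on the reflection.
    reply = chat(
        [{"role": "system", "content": RESPOND_PROMPT.format(assessment=assessment)}]
        + transcript
    )
    return reply, assessment
```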
[Image: Fine-tuning character attributes in Westworld]
Why design emotive AI?
Why design emotions into the conversation abilities and speech of LLMs and their agents? Because emotions add meaning: meaning that enriches the interaction, increases engagement, and offers a better user experience. But as we have noted, emotions carry meanings that are independent of the information content of words and phrases (linguistic expression), yet related to it nonetheless.
In adding emotion to AI, designers need to be careful in what they are adding emotionality to. What is it in the AI’s behaviors that gains from being emotive? And what are the attendant risks?
Here are just a few possibilities:
Design for affect and interestingness
- Make the agent’s speech more interesting by adding an emotional tone to words and phrases (see the sketch after this list)
- Make the agent seem more natural
- Give the agent a sense of “presence”
- Enable the agent to be a better “listener”
- Make the agent capable of paying attention, perhaps even when it’s not speaking
- Enable the agent to seem interested in the topic of conversation
- Enable the agent to seem interested in the user
- Enable the agent to seem interested in the conversation going well
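As an illustration of the first item, one simple approach is to have the model return its reply with inline emotion annotations that a text-to-speech or rendering layer can map to prosody. The tag vocabulary, markup format, and annotate helper below are assumptions, not any particular vendor’s API.

```python
# Illustrative sketch only: wrap phrases in simple emotion tags that a
# text-to-speech or rendering layer could map to prosody settings.
# The tag vocabulary and markup format are invented for this example.

EMOTION_TAGS = {"warm", "curious", "excited", "calm", "wry"}

def annotate(phrase: str, emotion: str) -> str:
    """Wrap a phrase in an emotion tag, falling back to a neutral default."""
    if emotion not in EMOTION_TAGS:
        emotion = "calm"
    return f"<emotion name='{emotion}'>{phrase}</emotion>"

print(annotate("That's a great question", "warm"))
# <emotion name='warm'>That's a great question</emotion>
```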
Design for personality
- Emphasize consistency in the application of emotions to the agent, in such a fashion that the agent seems to have a distinct personality
- Give the agent a set of emotional weights or biases so that it is uniquely expressive: a kind of “individual” emotional baseline (see the sketch after this list)
- Align these emotional tones to topical and informational content; a more expressive agent will be more convincing
- Give the agent interests about which it becomes especially emotive
- Allow the agent to find these interests in users also, and to share in them enthusiastically
- Enable brand ambassador agents designed around personality, even reflecting brand values and attributes
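One way to picture the “emotional baseline” idea: a fixed set of per-emotion weights that biases whatever affect the current moment calls for, so that the same situation is expressed differently by differently configured agents. The emotion labels, weights, and blending rule here are illustrative assumptions.

```python
# Illustrative sketch only: a personality as a baseline of emotional
# weights that biases whatever affect the current moment calls for.

from dataclasses import dataclass, field

@dataclass
class Personality:
    name: str
    baseline: dict = field(default_factory=dict)  # emotion -> weight in [0, 1]

    def express(self, situational: dict, bias: float = 0.6) -> dict:
        """Blend situational affect with the agent's baseline.

        `bias` controls how strongly expression is pulled toward the
        baseline (0 = pure situation, 1 = pure personality).
        """
        emotions = set(self.baseline) | set(situational)
        return {
            e: bias * self.baseline.get(e, 0.0) + (1 - bias) * situational.get(e, 0.0)
            for e in emotions
        }

# A dour, dry-witted agent stays muted even in an exciting moment.
dour_wit = Personality("dour", {"wry": 0.7, "calm": 0.6, "excited": 0.1})
print(dour_wit.express({"excited": 0.9, "warm": 0.5}))
```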
Design for the conversation
- Align emotional tones with the state of the conversation
- Enable the agent to “read” the user’s state of mind or level of interest
- Enable the agent to respond to the user’s expressions in a manner that keeps the conversation going
- Design opening and closing bracket statements used by the agent to make conversation natural
- Enable the LLM to employ a kind of reflective reasoning applied to the state of the conversation, from which it can prompt users
- Anticipate the types of conversational frustrations and breakdowns that occur in a given topic, and design interventions the LLM can use to shift, transition, and otherwise course-correct the conversation (see the sketch after this list)
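For the last item, a rough sketch of the structure involved, assuming a classifier (stubbed out here with keywords) that labels the state of the conversation and a small catalog of course-correcting interventions. The labels and interventions are invented for illustration.

```python
# Illustrative sketch only: map detected conversational states to
# interventions the agent can use to course-correct. In practice the
# classifier would be an LLM call or trained model, not a keyword stub.

INTERVENTIONS = {
    "user_confused": "Pause and offer a brief recap before continuing.",
    "user_frustrated": "Acknowledge the frustration, lower intensity, and ask what would help.",
    "topic_drift": "Mark the shift explicitly and offer to return to the original question.",
    "stalled": "Ask an open question to re-engage the user.",
}

def classify_state(transcript: list[str]) -> str:
    """Stub classifier: labels the conversation from the latest user turn."""
    last = transcript[-1].lower() if transcript else ""
    if "what do you mean" in last:
        return "user_confused"
    if "this isn't working" in last:
        return "user_frustrated"
    return "on_track"

def choose_intervention(transcript: list[str]) -> str | None:
    """Return a course-correction, or None if the conversation is on track."""
    return INTERVENTIONS.get(classify_state(transcript))
```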
Design for the topic
- Associate emotional tones with a type of conversation, and with a type of social context or situation
- Design agent emotiveness for permissible or acceptable emotional ranges by social context (e.g. professional, social, game, etc.; see the sketch after this list)
- Customize agent emotiveness according to a variety of schemas and inputs, clustered by the topics they are associated with
- Customize agents to play roles in common professional contexts, e.g. interviewer, negotiator, consultant. (Agents for online dating matchmaking; business deal-making; employment pre-screening; insurance claims; game moderator; announcer…)
- Design agents not only to support a given topic, but also to drive towards certain outcomes. They would use emotional signals and content communicated by users to inform them about the state of an interaction, and would then “guide” the interaction towards a pre-determined outcome (e.g. closing a sale).
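The “permissible ranges by social context” item could amount to little more than a configuration table that clamps expressed intensity. The contexts, emotion labels, and numeric ranges below are assumptions made for the sake of the sketch.

```python
# Illustrative sketch only: clamp expressed emotional intensity to a
# range permitted by the social context the agent is deployed in.

CONTEXT_RANGES = {
    "professional": {"min": 0.0, "max": 0.4},  # muted, businesslike
    "social": {"min": 0.1, "max": 0.8},
    "game": {"min": 0.2, "max": 1.0},          # broad, playful range
}

def clamp_affect(affect: dict, context: str) -> dict:
    """Limit each emotion's intensity to what the context permits."""
    bounds = CONTEXT_RANGES.get(context, {"min": 0.0, "max": 0.5})
    return {
        emotion: min(max(value, bounds["min"]), bounds["max"])
        for emotion, value in affect.items()
    }

print(clamp_affect({"excited": 0.95, "warm": 0.3}, "professional"))
# {'excited': 0.4, 'warm': 0.3}
```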
Design for the relationship to the user
- Design the agent’s emotional language to reflect or mirror the user’s (see the sketch after this list)
- Inject moments of surprise and enthusiasm to capture the user’s attention
- Design the agent to maintain low-intensity attention to the user, with muted presence, emotion, and visibility until activated
- Capture the user’s informational signals over time so that the agent is consistent in its responses to the user
- Anticipate the user’s level of interest and cue up conversations with greetings that establish an emotional baseline for the interaction
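Mirroring the user’s emotional language while keeping a muted presence can be sketched as a damped blend of the user’s detected affect and a low-intensity resting state. The constants and emotion labels here are illustrative assumptions.

```python
# Illustrative sketch only: mirror the user's detected affect, damped
# toward a low-intensity resting state so the agent never overreacts.

RESTING = {"warm": 0.2, "calm": 0.3}  # muted default presence
MIRROR_WEIGHT = 0.4                   # how strongly the agent mirrors the user

def mirrored_affect(current: dict, user_affect: dict) -> dict:
    """Move the agent's affect partway toward the user's, partway toward rest."""
    emotions = set(current) | set(user_affect) | set(RESTING)
    updated = {}
    for e in emotions:
        target = (
            MIRROR_WEIGHT * user_affect.get(e, 0.0)
            + (1 - MIRROR_WEIGHT) * RESTING.get(e, 0.0)
        )
        # Smooth across turns so shifts in tone register gradually.
        updated[e] = 0.5 * current.get(e, 0.0) + 0.5 * target
    return updated
```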
An argument for personality
No generative AI can ever have a real and genuine personality, insofar as AI isn’t alive, has no lived experience, no persistence over time as a living subject, and thus no psychology. And with no psychology, an agent’s behavior cannot have genuine personality. Behavior is the expression of psychology, and so is an expression of a person’s lived experience.
That said, there is a reason to consider the value of using personality in the design of generative AI agents. So whilst we are still far from being able to give agents personalities, here are some thoughts on why personality could be considered a design ideal for LLMs.
Consider that in human interaction and communication, reaching understanding with one another is of central importance. On the basis of understanding what we say to each other, we can then either agree or disagree with it. Thus in communication there are two fundamental tracks on which an interaction unfolds: one of alignment, and one of information content.
Given the full possible range of human expression and behavior at any time, in any context, about any topic, the probability of reaching understanding in communication should be fairly low. But in practice it is not. One reason for that is that personality reduces the ambiguities and supplies information even when the individual doesn’t.
Personalities are generalizations, and in the case of individual character, personalities are a kind of stereotype that fills in for individual variation. In recognizing a person’s personality, we project and interpret from limited expression and behaviors a wider number of assumptions and preconceptions about a person’s meanings and intentions.
In short, personalities, as constructs, reduce uncertainty and ambiguity, and raise the probability of successful communication.
In the case of AI, by giving agents personalities, designers might be able to narrow expectations and mitigate misinterpretations by users, as well as supply the interaction with a kind of consistency and continuity. An agent with a dour sensibility and dry wit, if consistently expressed, will shape interactions with users and provide an emotional baseline, a bit like providing context.
How to not talk like a bot
Where personality can provide a generative AI agent with consistency of character, a complementary approach would focus on the agent’s conversational skills. We’ve noted that open dialog, or open states of talk, are a challenge for LLMs. It’s true that generative AI is an improvement over the NLP of Alexa, Siri, et al., in its naturalness as well as its ability to converse about a wide range of topics. NLP is a more structured approach to language use in agents, and operates with keywords, stop words, and much more hierarchy than is used by off-the-shelf LLMs.
I see opportunity in combining the open dialog capabilities of LLMs with more structured and controlled use of conversational brackets and footing changes so that the agent guides and steers conversation in desired directions. Brackets are the opening and closing remarks that are commonplace to social interaction: “hello,” “how are you,” “see you later,” etc. They are content-free but emotionally significant. Footing changes are re-alignments that can occur during a conversation and often mark topical shifts. Agents could be designed to make use of these interventions and thus steer interactions with users by prompting in reverse. (Erving Goffman’s work is a fantastic compendium and analysis of conversation and its many “moves”.)
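A minimal sketch of that combination, again assuming a generic chat(messages) helper: fixed bracket statements open and close the session verbatim, and a detected footing change triggers an explicit steering instruction rather than leaving the transition implicit. The detection heuristic and the prompt text are assumptions, not a prescribed method.

```python
# Illustrative sketch only: fixed brackets around open LLM dialog, plus a
# steering instruction injected when a footing change (topic shift) is
# detected. `chat(messages)` again stands in for any LLM chat call.

OPENING_BRACKET = "Hello! Good to see you. What shall we look at today?"
CLOSING_BRACKET = "Thanks for the conversation. See you next time."

STEERING_INSTRUCTION = (
    "The user has shifted footing (changed topic or stance). Acknowledge the "
    "shift briefly, then steer the conversation back toward the session's goal."
)

def open_session() -> str:
    """Deliver the opening bracket verbatim rather than generating it."""
    return OPENING_BRACKET

def close_session() -> str:
    """Deliver the closing bracket verbatim rather than generating it."""
    return CLOSING_BRACKET

def footing_change(previous_topic: str, detected_topic: str) -> bool:
    """Stub: in practice topic detection would come from a classifier or the LLM."""
    return bool(previous_topic) and detected_topic != previous_topic

def agent_turn(chat, history, user_message, previous_topic, detected_topic):
    """Generate a reply, steering explicitly when the footing has changed."""
    messages = list(history) + [{"role": "user", "content": user_message}]
    if footing_change(previous_topic, detected_topic):
        messages.insert(0, {"role": "system", "content": STEERING_INSTRUCTION})
    return chat(messages)
```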
Emotive AI: dampening expectations
In adding emotions and affect to LLMs and their agents’ behaviors, we broaden the horizon for machine-user interactions. Agents will inevitably become more human-like, and less machine-like.
It might be our inclination then to invest too much in emotionality — to exaggerate it or overuse it. Excessively emotive agents will result in bad user interactions and user experiences. The affect applied to an agent should correlate to the conversation and user experience — and not be used as a gratuitous special effect. If it’s not calibrated well, emotive AI will only foster a backlash against AI generally: it’ll drive distrust, frustration, and user friction.
I would suggest not that we dampen our expectations of emotive AI, but that we dampen the emotional range of AI and emphasize consistency of presentation and behavior. This will in turn manage user expectations. The more a user expects of an agent’s capabilities as a conversation partner, the more can go wrong.
It’s been said often that large language models hallucinate. The reverse is also true: large language models cause users to hallucinate. The hallucination is mutual, if not consensual. We project onto chat agents according to our imagination. Emotive AI will take LLMs a step further.
There is a kind of zone of indeterminacy in which the interaction between LLMs and users is available for experimentation and discovery. It becomes difficult in this space to specify which of a model’s capabilities and which of a user’s prompts and requests drives utility and creates value. The more the AI can do, and seems capable of doing, the more users will ask of it.
Designing within this space is now very interesting. As it concerns emotive AI, it will be important to manage and calibrate the application of emotionality thoughtfully. Emotions are not just a flourish atop a clever phrase.