Friday, June 13, 2008

Chatbots and Non-Player Characters as Instigators for Visitor Feedback and Reflection

How do you get a good story out of someone? I've come back to this question again and again. Whether you want memories or insights, offer comment books or touchscreens, it's hard to design ways to invite meaningful visitor submissions. Finding great questions to ask can help. So does making the output significant in terms of display and usage. Most of all, you need great listeners. And this week, I had an aha moment about a designed listening technique: conversations with computers.

We all know that direct verbal communication is a powerful way to get feedback. The problems are scale and comfort. You can't talk to everyone, and not everyone feels comfortable talking to you. The fabulous Storycorps project partially sidesteps these problems by requiring participants to bring their own conversation partner to the booths in which they record life stories. The facilitator is functionally another visitor, one with whom the participant is not only willing but eager to speak.

But Storycorps still requires facilitators, and it asks a lot of its participants. What if there was a way to have a one-on-one conversation without ANY other human--staff, friend, or otherwise? Could a computer be a good conversationalist?

This week in the Mind exhibition at the Exploratorium, I met D.A.I.S.Y., an artificially intelligent chatbot. Daisy is a computer program designed to mimic human verbal communication that lets you have a text-based “conversation” with a machine.

You've probably seen Daisy or something like her before. You type and the computer answers. Daisy learns new words and syntax with every interaction, but the computer program can be reset to "newborn," so her learning is bumpy. The experience of text chatting with Daisy is chaotic and her speech is often nonsensical.

But despite her incoherencies, Daisy evokes an emotional response. Visitors relate to her--or try to. As one of the evaluators commented, visitors want to make meaning out of their interactions with Daisy. Visitors interact with her over several lines of text even though they're getting little back content-wise.

"Jeez!" I thought. If visitors will spend ten minutes chatting with an inane computer program, why don't they spend as much time at the talkback station? Why do they just type "I like cheese" at the talkback and then run away to talk more with Daisy?

Because the talkback station doesn't listen. Daisy may not be complex, but she's highly responsive. She wants to understand you, and you want to help her understand. This is what separates Daisy from all the instructional graphics and "What do YOU thinks?" we can put on screens. She makes you feel like she actually cares what you think. And that makes you feel listened to, and makes you want to speak. A good listener isn't someone who knows all the words (or even how to put them into complete sentences). A good listener is someone who focuses on you and your words. And that's Daisy's whole reason for being.

How do you make a listener like Daisy? Responsiveness is more important than complexity, no matter the venue. In the gaming world, Daisy would be called a non-player character (NPC). NPCs are the trolls and wizards that roam through video games dispensing clues and forked paths. Interactions with NPCs often feel staged. You don't feel like your part of the conversation matters--it's just a vehicle for them to tell you about the crystal sword. Over time, game designers have tried to improve NPCs by making their linguistic skills more complex. But some of the best, most emotionally evocative NPCs are silent, simple creatures that express themselves solely in relation to you. A rock can be more responsive than a photorealistic human if it rolls after you in just the right way.

And the things that make NPCs and chatbots emotionally engaging aren't all virtual. Some libraries employ dogs to sit and watch struggling readers intently as they work their way aloud through books. Why? Because the dogs look like they're listening. They appear to care, and that's what matters. Ironically, some of the early chatbots were used to parody psychotherapists, repeating your statements as questions and asking to hear more. Cheaper than an hour on the couch and almost as useful.
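For the technically curious, here is a minimal Python sketch of that "repeat it back as a question" trick. It is not Daisy's code or any historical chatbot's; the names REFLECTIONS, reflect, and listen and the word list are all made up for illustration. It only swaps first-person words for second-person ones and asks to hear more.

```python
import re

# Hypothetical, deliberately tiny table of first-person -> second-person swaps.
# A real program would need far more than this; the point is how little it takes.
REFLECTIONS = {
    "i": "you",
    "me": "you",
    "my": "your",
    "mine": "yours",
    "am": "are",
    "i'm": "you're",
}


def reflect(statement: str) -> str:
    """Lowercase the statement, drop punctuation, and swap first-person words."""
    words = re.findall(r"[a-z']+", statement.lower())
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def listen(statement: str) -> str:
    """Echo the visitor's statement back as a question and ask for more."""
    mirrored = reflect(statement)
    if not mirrored:
        # Nothing usable was typed, so just invite the visitor to keep going.
        return "Tell me more."
    return mirrored[0].upper() + mirrored[1:] + "? Tell me more."


if __name__ == "__main__":
    print(listen("I like cheese"))
```

Crude as it is, the reply is built entirely out of the visitor's own words, which is the "focus on you" quality described above.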

Daisy has no physical representation beyond her words, so her language is all she has to make her seem fake or real. In a strange way, her nonsensical text makes her feel more human than a complex character with a range of stock phrases. You can't spot the gaffes where she didn't respond to your words precisely, since she never responds to your words precisely. It's always a little random, and that feels human too.

One of the Exploratorium designers explained that Daisy can be primed with a small vocabulary set that tends to focus her conversation around particular topics. In MIND, they've primed her with content around consciousness. But they could just as easily give her a vocabulary of words like museum, exhibit, and questions like "what do you think?"

It's sort of maddening talking to someone like Daisy. But she makes you care, makes you want to engage. And isn't that what we want for our visitors?

4 comments:

Anonymous said...

I think question choice in the immediate context of the museum experience is critical. There are many examples of video talkbacks out there now: Brad Larson has developed an off-the-shelf design that can be plugged with content; another type was developed by Richard Rabinowitz's team for Slavery In New York; the Freedom Museum in Chicago has something similar; and we developed yet another version for the Minnesota 150 exhibit at the History Center. In each case the visitor leaves a video response to a teaser question, and each affords a playback function so a visitor can review other visitors' videos. The differences in design among these prototypes are telling. Some offer only pre-screened playback; others have special celebrity answers to pique interest. Some are as simple as a "record" button with a response time limited to 20 seconds and an automatic save that posts immediately; others use a more complicated interface that allows curatorial review before saving, or a more open-ended recording timeframe that lets visitors stop when finished. Some have record and playback as separate interactions in discrete locations; others offer them as options in the same interface.

But how are the teaser questions handled? All are "canned," preformulated questions, though some afford a variety of different questions to choose from. In our example, the big aha was understanding that the proposition works better when our question anticipates what a visitor's expressed desire is most likely to be. An abstruse question elicits little response. No question at all prompts mostly comment-book style answers: "Great exhibit!" "You suck." and so forth.

But how might this style of interaction move to something more conversational? And is the desired conversation more natural when it is framed as a visitor-to-visitor interaction, or as a visitor-to-museum conversation? And are chatbots really sophisticated enough to provide the kind of interaction that is satisfying, or do they still fail the Turing test? Things to consider, for sure.

Nina Simon said...

Thanks for your thoughtful comment! To me, the chatbot is successful not at simulating human conversation (which it's pretty lousy at) but at creating a non-human creature with which to interact. It's more like talking to a dog than to a person--and there are some things people will comfortably say or explain to dogs that they might not to staff, visitors, or the most convincing chatbot in the world. Incidentally, game research has shown that chatbots/NPCs that are too human-like are creepy. There's a term--the zombie line--that defines that moment when a non-human character becomes too human-esque for comfort. It's why the characters in the movie version of the Polar Express looked like aliens, whereas we're ok with Snoopy.

On the conversational angle, you might be interested in this post from the early days of Museum 2.0 about Jellyvision, "the interactive conversation company."

Jellyvision created the game You Don't Know Jack, famous for conversational style, and in the 1.5 years since I first wrote about them, they appear to have changed their brand image from game design to internet conversation design. Perhaps a great company to talk with about the future of museum conversations?

Marc said...

The uncanny valley is definitely territory to stay away from, but I think semi-intelligent agents are a good approach not just for soliciting input, but also for complex interactions.

Agent-based interfaces are worth a look as a way to minimize the learning curve of interacting with complex systems, and avoiding HAL-like creepiness is always key.

Anonymous said...

Hey I like talking to dogs too!

You're probably aware of the Dolphin Oracle artwork at the Walker Art Center in Minneapolis? You can ask questions (typed in on a wireless keyboard) of a computer generated dolphin (projected on a screen) which, after a pause and some dolphin squeaks, replies (captioned below) in an esoteric zen koan way. It usually takes some aspect of your question and twists it around in a bizarre word figure. It's pretty funny, very popular and the dolphin is very cute, which helps I think.

I keep wondering if the creepiness of computer-generated human figures is just a stage we'll work through collectively until we're more used to it. Certainly part of the problem is that they can get virtually everything else looking spot-on real right down to the hair follicles on a polar bear, a plodding dinosaur or the explosion of CG cars and airplanes. But we're acutely sensitive to the details of humankind, whether it's the gestures of the body, the facial tics that reveal fleeting emotions, the pitch and inflection of the voice, or the idiosyncrasies and blushings of skin. There are literally millions of minute and telling details, many of which we aren't even conscious of as we see them in others and perform them ourselves, but which still play on our thoughts and emotions and relay complex, naturalistic information between human beings. Actors, dancers, comedians, etc., are specialists at orchestrating and deploying these gestures and, for the most part, they do it intuitively. Watching CG movie humans in, say, "Beowulf," I think we're still struck by how crude the approximations are. It's analogous to bad acting. But it does get better with each iteration. Beowulf was incrementally more convincing than Polar Express, though it still basically sucked.

The creepiness problem reminds me of an American guy I once met when I was visiting Japan who was a scholar of classical Japanese literature. He (I was told by someone else who knew) could read, write and speak Japanese at a very high, scholarly level, better than an ordinary Japanese person, and could also perform all of the classical rituals flawlessly: the tea ceremony, Shinto purification rituals, etc. He told me that, as he became more adept at the language and customs of Japan, he noticed Japanese people becoming less and less comfortable with him. Apparently, not only do the Japanese expect us to be clumsy navigators in their world, but they rather prefer it that way because it reinforces their sense of exceptionalism. The more you become like them, the more confusing and threatening it becomes. Americans, on the other hand, think they can be anything they want to be, including Japanese. That's one cultural trait peculiar to the U.S. So maybe the same impulse is in effect with these bots. For the time being, it seems really important to us that we perceive ourselves to be distinctly and uniquely human and we don't welcome any mimicry, mockery or competition--yet. But I have a strong suspicion that this will change quickly and dramatically at some future tipping point, probably at the point when the quality becomes more plausible, more convincing.