Interacting with
Virtual Pets, Digital Alter-Egos
and other Software Agents

PATTIE MAES

Some of you may have noticed that I've recently become interested in building real or naturally intelligent agents (points proudly at heavily pregnant belly). My talk is about something that I've been doing for a much longer time, namely, building artificially intelligent agents and artificial creatures.

S o f t w a r e . A g e n t s

By now, I'm sure most of you are convinced that the home of the future will have a physical or real component as well as digital, virtual components. But a point that hasn't been stressed very much is that this virtual half of our home will be inhabited by agents or creatures.

So the virtual half of our home won't just be a passive data landscape waiting to be explored by us. There will be active entities there that can sense the environment--the digital world--and perform actions in the digital world and interact with us. We call these entities software agents.

A software agent has a very broad definition. It's a process that lives in the world of computers and computer networks and can operate autonomously to fulfil one or more tasks.

I've portrayed the software agents in this environment as little creatures like ants and ticks and so on. I chose this metaphor deliberately because a software agent actually has a lot in common with real creatures. A software agent also senses its environment (the digital world, in this case) and can act upon it and make changes.

A software agent is also semi-intelligent. It's not as intelligent as a human being, but it may have the intelligence of an ant or a tick and, hopefully, later, of more complex kinds of animals.

Another parallel is that these agents are mobile. Most of these software agents can move around in the world that they live in. A lot of them will have a home where they reside most of the time: one of the nodes in the network, one of the client computers or one of the end consumers' computers, for example.

Just like real creatures, some agents will act as pets and others will be more like free agents. Some agents will belong to a user, will be maintained by a user and will live mostly in that user's computer. Others will be free agents that don't really belong to anyone.

And just like real creatures, the agents will be born, die and reproduce. They will also form complete ecologies and may discover certain niches they can occupy while they perform useful tasks.

So why do we need these software agents in the digital world or in cyberspace?

I'm convinced that we need them because the digital world is too overwhelming for people to deal with, no matter how good the interfaces we design. There is just too much information. Too many different things are going on, too many new services are being offered, and so on. I believe we need some intelligent processes to help us with all of these computer-based tasks. Once we have this notion of autonomous processes that live in computer networks, we can also implement a whole range of new functionality. We won't just be helping people with existing tasks--we'll also be able to do some new things.

Basically, we're trying to change the nature of human-computer interaction. The current dominant metaphor is that of direct manipulation. The user manipulates representations of data and information to make things happen. I believe that that metaphor will have to be supplanted by a very different one which Alan Kay has termed indirect management. Users will not only be manipulating things personally, but will also manage some agents that work on their behalf.

D i f f e r e n t . T y p e s . o f . A g e n t s

The term agent has been mentioned a lot in the media recently, but there is still a great deal of confusion as to exactly what it means. This is partly because there are so many different types of agents in development.

The first way to distinguish different kinds of agents is by the nature of the task they perform. Some agents could be termed user agents, user robots or user softbots. Others may be called task agents.

A user agent is an agent that belongs to a particular user and works for that user. It knows the user's habits, preferences, intentions and goals. It performs tasks on the user's behalf. For example, I have some personal news editor agents that I interact with on a daily basis. These systems know what kind of things I'm interested in and recommend newspaper and electronic news articles.

Similarly, one could imagine having a personal electronic shopper agent to look for particular objects for you, a VCR, for example. It would go out onto the network looking for nodes where electronic equipment is being offered. It would know your criteria for buying a VCR. And it could actually buy a VCR on your behalf if it found a good deal.

Another example of a user agent would be a virtual pet. I'll give some examples of these later.

Not all agents are necessarily useful or designed to perform useful tasks for you. Some agents entertain you.

In contrast to the userbots, there will be a lot of taskbots or agents that don't belong to a particular user, but perform tasks useful to the whole community, for example, indexing services or load balancing on the Net.

Agents can also be distinguished according to the role they perform. People are trying to build many different kinds of agents to do many different things. There are agents that guide the user through complex information spaces. Others act as a personal memory or reminder. Others have the sole task of continually watching and remembering what the user is doing, so that if users afterwards try to figure out where they left a particular document, or the name of a person they interacted with on a particular Saturday afternoon that month, the personal memory agent will be able to help.

Some agents abstract away the complexity of services and can receive and implement high-level commands. Some agents may act as monitors, for example, to watch the stocks I own and warn me whenever they fluctuate sharply in value.

Yet other agents may act as teachers. Certainly, many agents will act on behalf of the user, for example, agents that perform transactions on your behalf or schedule meetings when you're not available.

Agents can be entertainers, as well as collaborators and assistants. There will be a large need for agents that act as filters to help us deal with the wealth of available information by making recommendations as to what deserves our attention.

A third distinction is based on the nature of the agent's intelligence. For example, one of the user-programmed agents commercially available today is called Beyond Mail, an electronic mail system that allows you to create agents that watch your incoming mail and perform certain actions when a particular type of message arrives.
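To make that concrete, here is a minimal sketch of the kind of rule such a system lets you express: each rule pairs a condition on an incoming message with an action. The field names and actions below are my own illustration, not Beyond Mail's actual rule language.

```python
# A minimal sketch of a rule-based mail agent: each rule pairs a condition
# on the incoming message with an action. Field names and actions are
# illustrative only, not Beyond Mail's actual syntax.

def file_in(folder):
    return lambda msg: print(f"filing '{msg['subject']}' in {folder}")

def mark_urgent(msg):
    print(f"marking '{msg['subject']}' as urgent")

rules = [
    # (condition on the message, action to take)
    (lambda m: "mailing-list" in m["to"], file_in("lists")),
    (lambda m: m["from"].endswith("@media.mit.edu"), mark_urgent),
]

def handle(msg):
    for condition, action in rules:
        if condition(msg):
            action(msg)
            return
    print(f"leaving '{msg['subject']}' in the inbox")

handle({"from": "student@media.mit.edu", "to": "maes", "subject": "thesis draft"})
```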

People have also been trying to build much smarter artificial intelligence agents that have more common-sense knowledge about the user, the application and the world in general. This kind of agent is not yet commercially available.

A fourth type of agent learns. These agents program themselves by noticing, for example, that the user usually performs certain tasks in a particular way. They then offer to automate those tasks for the user in the same way.
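A toy version of such a learning agent might do no more than count repeated sequences of actions and offer to automate the frequent ones. The action names, and the assumption that a "pattern" is simply a frequently repeated pair of consecutive actions, are mine, purely for illustration:

```python
from collections import Counter

# Watch a log of user actions, count consecutive pairs, and offer to
# automate any pair that recurs often. A deliberately simple stand-in
# for the learning agents described above.

def observe(action_log, threshold=3):
    pairs = Counter(zip(action_log, action_log[1:]))
    for (first, then), count in pairs.most_common():
        if count >= threshold:
            print(f"You did '{then}' right after '{first}' {count} times. "
                  f"Shall I automate that for you?")

observe(["open mail", "file in lists", "open mail", "file in lists",
         "open mail", "file in lists", "read news"])
```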

A . L . I . V . E .

For the past four years at the Media Lab, my students and I have been building a whole range of software agents. We've been looking at the problem from two different points of view. One is the interface to agents: how will users communicate with agents, and what will agents look like? The other is the problem of how to make agents do useful things. I want to give you some examples of the work we've been doing in these two areas.

I first want to talk about a novel kind of system we developed to allow a user to interact with agents. This system is a novel solution to the virtual reality problem--an environment that allows a user to interact with a virtual world that includes computer-animated software agents. It is called the Artificial Life Interactive Video Environment, or ALIVE.

One of the creatures we've been building is a virtual dog. For me, home is where the dog is. At least, that was my experience growing up. When you come home, your dog is there and meets you and starts playing with you. You can then make it perform all sorts of tricks. The user just walks into the virtual environment we built. You don't have to wear any special equipment; it's completely wireless.

When you walk into the environment, you actually see yourself on a big, eight by ten foot screen wall. As you see, the real world is augmented by some computer-animated objects, many of which are virtual creatures. Here, the only virtual object is this virtual dog, but sometimes there is a whole virtual world and the only real thing is the image of the user. Everything else is artificial or computer-generated. So you can give commands to the dog. You can tell it to go away; it will come back to you. You can make it sit by putting your hand sideways. It gets sad when you leave. We put a little toy in its world. Here, it doesn't exactly know what to do: it wants to shake hands with the toy, it wants to play with it. And eventually it gets bored.

The user is standing in front of this big screen and there is actually a camera on the top of the screen. That video camera is looking at the user and figuring out where the user's silhouette is. It is using that information to figure out the location of the user's hands, feet, centre of mass, head (all of this information at about ten hertz). Once we know where your hands, head and feet are, we can look for certain gestures such as waving, for example, which means that your hand goes back and forth horizontally. Or pointing, which means that your hand is very far away from your centre of mass and is kept still for a moment.
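For the curious, those two gesture tests can be approximated in a few lines from the tracked positions. The thresholds and the data format below are my own assumptions, not the actual ALIVE vision code:

```python
# Rough sketches of the two gesture tests described above, given hand and
# body positions sampled at roughly ten hertz. All thresholds are assumed
# values for illustration, not those of the real ALIVE system.

def is_waving(hand_xs, min_direction_changes=3):
    """Waving: the hand's horizontal position goes back and forth."""
    deltas = [b - a for a, b in zip(hand_xs, hand_xs[1:]) if b != a]
    changes = sum(1 for a, b in zip(deltas, deltas[1:]) if (a > 0) != (b > 0))
    return changes >= min_direction_changes

def is_pointing(hand, centre_of_mass, recent_hands,
                min_distance=0.5, max_jitter=0.05):
    """Pointing: the hand is far from the centre of mass and held still."""
    dx, dy = hand[0] - centre_of_mass[0], hand[1] - centre_of_mass[1]
    far = (dx * dx + dy * dy) ** 0.5 > min_distance
    still = all(abs(h[0] - hand[0]) + abs(h[1] - hand[1]) < max_jitter
                for h in recent_hands)
    return far and still

print(is_waving([0.0, 0.2, 0.0, 0.2, 0.0, 0.2]))              # True
print(is_pointing((1.0, 1.0), (0.2, 0.9), [(1.0, 1.0)] * 8))  # True
```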

That is how it works. We compose your image with this computer-generated world. We can actually do this in two different ways. This is an example of what you could call enhanced reality: the table is real, but the bricklike food objects on it are virtual. These computer-graphic objects are not there in the real world. The poster on the back wall is real but the buttons on the back wall are virtual. The boundary between the real world and the virtual world is completely blurred.

We can also use the system in a complete virtual reality mode in which the only real thing in the scene is basically your image, while the whole scene around you, the objects and so on, are all computer-generated. We can even go further than that. We can replace your own video image with something else--a bear, for example. As you walk around, this computer-animated bear walks around and performs the same kind of gestures as you do.

You can switch between different worlds by pushing the middle button. In this next world, there is a little playmate who follows you around. If you tickle him, he jumps up. You can send him away, but he is rather stubborn and will shake his head no at first. Eventually, he'll go away. Oops--the user accidentally touched the button and is now cycling through the different worlds to bring himself back to the little playmate. Now he's trying to send him away again. Eventually, he'll go. This is a user who walked in without any knowledge of how the whole system worked. That's one reason why the interaction is slower. But the system is pretty natural to use, actually. You don't have to be wired up in any way because the only sensor we use is the camera on top of the screen. The playmate gets upset if you step on its toes. You can't see it on the video, but it shows a surprised face.

The only technology we need is a video camera. This means that five years from now, you can have one of these in your home. Using the television as the big screen, all you'll need is a computer and a video camera to interact with these virtual creatures. One of the ones that we're currently trying to build is a virtual Jane Fonda who actually teaches you aerobics. The big difference between a video Jane Fonda and this Jane Fonda is that the virtual one can actually give you feedback. She knows (or the computer knows) where your hands and feet are and what you are doing, so she can tell you: okay lift your legs a little bit higher now. She can create a personalised training scenario for you.

This is the environment we've been exploring to allow users to interact with software agents. As you can see, a lot of entertainment is involved. One of the reasons we chose this kind of interface rather than a small screen where agents aren't personified is that we're trying to build interfaces for the end consumer. We share Mr. Grosso's view that interaction with the digital world has to be made more fun and engaging. You have to combine information and entertainment. That is why we chose to immerse the user and have life-like characters embody the software agents.

P r a c t i c a l . U s e s . f o r . A g e n t s . a s . A l t e r . E g o s

We've also been working on the problem of how agents can do useful things for us and be more than just virtual playmates. We've built a whole range of agents that perhaps aren't as interesting to look at or aren't fully computer-animated, but that do very useful things for people. All of the applications we have been looking at are motivated by our own frustration with the way we currently deal with certain tasks.

Tasks such as electronic mail. I'm not sure how many messages you get every day, but I get at least a hundred. And in a typical electronic mail system, they're just presented chronologically. So I decided it would be a good idea to try and build an agent that helps me with my electronic mail by sorting and prioritising messages, and even marking certain messages to be deleted before I look at them (this is actually the most useful function).

Another such problem is calendar management and meeting scheduling. All of you know what it's like to schedule a meeting that involves more than two people. It takes forever to find a time when everybody can make it. I don't believe that current software like Meeting Maker solves the problem, because it takes away privacy. Other people can examine your agenda and say: you have a block of free time there, so I'll schedule a meeting over there! I don't find that an acceptable solution. We've been building a calendar agent which is rather like a user alter ego that can negotiate about good times to schedule meetings.

N U U T

I'll go into more detail about two agents that everyone would find really useful. NUUT is a personalised newspaper agent that looks at news feeds, in particular Internet news, and recommends articles to be read. RINGO is a music recommendation system. We're also building an agent for the World Wide Web that will recommend documents to the user.

Apart from the World Wide Web agents, all of these agents are being used by people today. If you're interested in using some and getting the software, come to talk to me or send me electronic mail. My agent will delete it (only joking!).

The NUUT system is actually a collection of agents which help the user decide what Net news articles to read.

In the NUUT system, you can actually create a set of agents. I created four agents here. You can even make a little visual representation of them to remind you what kind of news they represent for you. There was a politics agent, a computer news agent and a couple of others. Each of these agents makes recommendations to me on a daily basis, or even more often, for example every couple of hours, by scanning the new articles from the Net news feeds (about four hundred megabytes of new articles every week).

Each of these agents makes recommendations to me based on what I've shown an interest in before. These are actually examples of learning agents. They continually watch my behaviour and notice patterns. For example, they may notice that I read Michael Schrage's column in the LA Times every week. And once they have picked up a pattern like that, they can automate it on my behalf and offer me that column every week, so I don't have to search for it.

This is a set of articles one of my agents has retrieved: the articles suggested by the politics agent. The agent has ordered them according to how important it thinks they are for me. I can look at an article by clicking on its title and then give either positive or negative feedback to my agent, indicating that I want to receive more or less of that kind of news article in the future. I can also highlight an area of the article and say: this is what I want to know more about. Or I can highlight the author's name and say: don't give me any more articles by this author. You interact with this agent by giving it positive and negative examples of things that you want it to retrieve. NUUT uses a technique called content filtering: it notices correlations in the kinds of things I like and dislike.
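As a rough illustration of the content-filtering idea, a profile can be as simple as a weight per word, nudged up or down by the user's feedback. This is a deliberate over-simplification for illustration, not the actual NUUT algorithm:

```python
from collections import defaultdict

# Keep a weight per word, adjust the weights with positive or negative
# feedback, and score new articles by summing the weights of their words.
# A simplified illustration of content filtering, not NUUT itself.

profile = defaultdict(float)

def feedback(article_text, liked, step=1.0):
    """The user marks an article as a positive or negative example."""
    for word in set(article_text.lower().split()):
        profile[word] += step if liked else -step

def score(article_text):
    """Higher score = more likely to interest this user."""
    return sum(profile[w] for w in set(article_text.lower().split()))

feedback("election debate in congress over the budget", liked=True)
feedback("pop star releases new album", liked=False)

for article in sorted(["congress passes budget bill",
                       "new album tops the charts"], key=score, reverse=True):
    print(f"{score(article):+.1f}  {article}")
```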

R I N G O

RINGO uses a complementary technique called social filtering. Rather than finding correlations among the types of things that I like or dislike, this system actually tries to find correlations between the tastes of different users dealing with the same type of information. In this case, the system recommends music. There is a World Wide Web interface for which you currently need to have an MIT address, but which will soon be more widely available. Right now, if any of you want to try it, you can access the system via electronic mail. The address is ringo@media.mit.edu.

So you basically tell this system a little bit about what kind of music you like and dislike. For example, a user could say they like the Beatles a lot. The scale goes from one to seven: one means I seriously dislike this music; seven means I really love this music. This user has indicated liking the Beatles and not liking Madonna, among other preferences. Rather than trying to find correlations among all these different music albums, the system tries to find correlations between my data and that of other users who have conveyed to the system what they like or dislike. For example, this particular user's taste in music seems similar to mine, because that user also indicated liking the Beatles and disliking Madonna. Once RINGO has discovered which users have similar tastes, it will actually recommend that I listen to music that other people like me have liked (Eric Clapton here, for example). It may also use low values to tell users that they will probably not like a particular type of music.
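A bare-bones sketch of that social-filtering idea, using made-up ratings on the one-to-seven scale (a simplification for illustration, not RINGO's actual algorithm):

```python
# Find the user whose ratings (on the 1-7 scale) agree most closely with
# mine, then recommend what that user rated highly and I haven't rated yet.
# Ratings and the similarity measure are illustrative, not RINGO's own.

ratings = {
    "me":    {"The Beatles": 7, "Madonna": 2},
    "user1": {"The Beatles": 7, "Madonna": 1, "Eric Clapton": 7},
    "user2": {"The Beatles": 1, "Madonna": 7, "Eric Clapton": 2},
}

def similarity(a, b):
    """Negative mean absolute rating difference over artists both rated."""
    shared = set(a) & set(b)
    if not shared:
        return float("-inf")
    return -sum(abs(a[x] - b[x]) for x in shared) / len(shared)

def recommend(user, min_rating=6):
    mine = ratings[user]
    nearest = max((u for u in ratings if u != user),
                  key=lambda u: similarity(mine, ratings[u]))
    return [artist for artist, r in ratings[nearest].items()
            if r >= min_rating and artist not in mine]

print(recommend("me"))   # ['Eric Clapton']
```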

There are a lot of other neat features to the system. One of the neatest is that it improves by itself. We started this system with twenty users and 575 artists in the database. Now, after two months without doing any advertising (this is the first time I've ever advertised it to a large audience), there are more than 3,000 users in the database and more than 9,000 albums. All of those have been added by the users, so RINGO's recommendations are continuously improving: the more users there are in the system, the higher the probability that some of them will have tastes similar to your own.

These are two examples of agents that perform useful tasks on behalf of the user. I've shown you some examples of how we envision interaction with these agents in the future. And the main message I hope you have taken away from this lecture is that the vision we are working towards is one where the computer is almost a window or door to a virtual world populated by agents that assist you, entertain you, and may even train you in a very personalised, interactive and natural way.


