[CEDEC2021] Development of dialogue character AI in the game industry (Part 2)

This article is the second part of a report on the session "Development of Dialogue Character Artificial Intelligence Technology in the Game Industry" held at CEDEC2021. Now that dialogue agents are attracting attention outside the game field, systematizing dialogue character AI technology is highly significant. This part traces the history of artificial intelligence technology centered on dialogue models, introducing "Wonder Project J1" and "J2", whose materials were discovered by SQUARE ENIX in "SAVE", its project for preserving past assets, along with examples from other companies.

[CEDEC2021] Development of dialogue character AI in the game industry (Part 1)

Mr. Yoichiro Miyake took the stage at the session "Development of Dialogue Character Artificial Intelligence Technology in the Game Industry" held at CEDEC2021 and explained the difference between dialogue agents in game AI and dialogue agent research utilized outside the game industry.

Evolution of the AI design seen in "Wonder Project J / J2"

Growing out of the computer RPGs (CRPGs) of the 1970s, character dialogue games appeared in the 1980s and separated the character from the game system (the narrator). In the latter half of the 1980s, games appeared in which the game system itself does not tell the story. Instead, the characters convey it, and the interaction with the player is built from there.

"Dragon Quest 4", released in 1990, was equipped with artificial intelligence for battle. It combined several AI techniques: an AI that selects the tactics of party members according to the battle policy ("Sakusen") chosen by the player, and a learning function that controls the use of magic based on battle experience so far. By shaping varied battles through the characters' artificial intelligence, the game system indirectly tells the player, "This is the kind of experience I want you to have."

Then, in 1994, "Wonder Project J Machinery Boy Pino" (Enix) was released. Released for the Super Nintendo, it belonged to a new genre called Communication Adventure. The large volume of materials found by the "SAVE project" (Square Enix's project for excavating game development materials) shows how the AI design progressed from the initial plan to completion.

Because this material dates from an early stage, it differs from what was implemented in the actual game. Even so, it shows that the original plan already adopted a system in which the AI acts spontaneously in response to an item (a stimulus) and grows according to the player's education in four modes: "praise", "ignore", "stop", and "scold".

In the next stage, education is simplified, and it changes conditioned reflex values. In addition, parameters such as intellect, sensibility, and mental strength are prepared, and the character grows as the values of these parameters change.

In the final stage, education is classified into several types of behavior, and the AI takes a simple form in which those behaviors change the parameters.

As the schematic diagram of the game in Fig. 6 shows, the AI character carries very great weight. The player, meanwhile, is in a dual structure: interacting with the AI character and, at the same time, interacting with the game system.

Next is "Wonder Project J2 Corro Forest Joset" (Enix, 1996). This sequel, developed in response to J1's popularity, tells the story of the growth of a heroine named Joset. It was released for the Nintendo 64. Two specification documents remain for it as well, and the conceptual flow of Joset's AI is shown on the left in Fig. 7. It did not change much from J1, but it was gradually refined during development.

First, the status (HP / MP) is checked at the starting point. Then the emotion is checked, and from there the AI either transitions to the idling state or is guided toward items and events.

What the AI does when it has nothing to do is actually very important: the character must have something with which to engage the player spontaneously. The AI also learns from the player's evaluations, "scolding" and "praising". In addition, J2 has a detailed design for memory, called understanding information. The brain develops as this memory information changes, as described later.

Various emotional parameters are also available. An emotion is not simply expressed in proportion to its parameter. It passes through the stages of "activation period", "normal period", "expression period", and "explosion period". In the normal period, which is the incubation period of emotions, no emotion is expressed. In the expression period and the explosion period, the emotion with the highest parameter value is displayed.

Designing a learning model for characters as seen in "Wonder Project J / J2"

As learning methods, J1 uses the "Korekore method," the "event method," and the "item method." In the Korekore method, the player guides the character by calling "korekore", and the character learns by being praised or scolded ("no, that's not it"). "Black & White" (Lionhead Studios, 2001) later adopted the same approach, but J1, released in 1994, can be said to be the earliest example.

The event method is a method of learning in which internal parameters change when the character takes part in an event or is shown one. The item method changes the character's state when it is actually handed an item, such as a battery. This may be better described as growth than as learning.

J2 follows this basic form and adds the "reasoning method" and the "thinking scenario method." The reasoning method infers from what has been memorized: "I think this can be done by combining this with that." The thinking scenario method, in turn, learns from the player's Yes / No (praise / scolding). In other words, what the reasoning method finds is corrected by the thinking scenario method.

In AI, this kind of structure is called knowledge representation. Essentially, it is the form that memory takes. In the item method, for example, actions such as "carry", "throw", "eat", and "make" are linked to the item "cooking" as possible inputs. If the character throws the item and is scolded, and this feedback is repeated several times, the character learns that throwing is the wrong action. In the event method, for example, the character starts out not knowing whether a customer is a good person or a bad person and classifies the customer after being shown events. The representation can therefore be described as both a behavioral representation and a classification representation.
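The praise-and-scold loop for the item method can be sketched as a small table of learned preferences. This is only an illustrative sketch and not the design found in the J1/J2 materials: the class `ItemKnowledge`, the method names, the value range, and the step size `STEP` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical step size: how far one round of praise or scolding shifts a preference.
STEP = 0.25


class ItemKnowledge:
    """Tracks how appropriate the character believes each action is for each item."""

    def __init__(self):
        # (item, action) -> preference in [-1.0, 1.0]; 0.0 means "no opinion yet".
        self.preference = defaultdict(float)

    def feedback(self, item, action, praised):
        """Shift the preference up when praised and down when scolded."""
        delta = STEP if praised else -STEP
        value = self.preference[(item, action)] + delta
        self.preference[(item, action)] = max(-1.0, min(1.0, value))

    def best_action(self, item, actions):
        """Pick the action with the highest learned preference for this item."""
        return max(actions, key=lambda a: self.preference[(item, a)])


knowledge = ItemKnowledge()
actions = ["carry", "throw", "eat", "make"]
for _ in range(3):
    # The character throws the item and is scolded several times.
    knowledge.feedback("cooking", "throw", praised=False)
knowledge.feedback("cooking", "eat", praised=True)
```

After the repeated scolding, `throw` has the lowest preference for this item, so `best_action` no longer selects it.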

In the thinking scenario method, which understands higher-order information by combining several pieces of information, a sequence is defined in which each piece of knowledge is marked YES / True or NO / False.

In J2, actions and properties are learned for various items, and information inferred from these multiple pieces of information is stored in table form. This is called the "Understanding Information Box". Understanding information is what Joset must understand in order to clear an event and the like. It is built in two ways: from "Yes / No" answers in the Korekore method, or by having the player's evaluation settle a value found by the thinking scenario method.

As the understanding information box fills, the memory and knowledge in Joset's head increase and she grows. The event or thing she is judged to be most interested in influences how she talks and reacts while idling.

In this way, the dialogue agent shapes a very large part of the experience in Wonder Project: the dialogue agent itself creates the player's experience, with the game system layered on top of it.

It is worth noting that Wonder Project J1 and J2 presented a character's inner model and learning model very early on. The Sims unveiled a utility-based internal model five years later, but Wonder Project was the first to present a complex internal model.

Learning with an 8,000-node neural network

"Creatures" is a game devoted entirely to training characters. As in Wonder Project, the characters learn how to use things. For example, throwing a ball is the correct way to play with it, so they learn by associating things with actions.

"Creatures" is a game for Windows 3.1, yet it has a neural network of about 8,000 nodes. Groups of neurons are packaged into units called lobes, and the lobes are connected to form a large architecture that represents things and the actions taken on them. It can be said that this was the largest neural network until around 2010.

Resources were limited at that time (games in the 80's and 90's had almost no capacity to spare once an elaborate AI was implemented), so the game system was very simple: agents that learn on a neural network move around and keep in contact with the player while interacting through simple words.

"Astronoka" applying a genetic algorithm

"Astronoka" (Enix, 1998) is a game in which enemy characters called Babuu learn the characteristics of traps through a genetic algorithm and evolve to avoid the traps set by the player. A genetic algorithm evolves a population rather than a single individual, and in "Astronoka" 20 individuals evolve as one unit. Each new generation is produced by taking the genes of two characters and crossing them to create a new gene. Parents are chosen by the roulette method, under which individuals that performed well in the previous generation have a relatively high probability of becoming parents.

The following explanation is based on an article by the developer, Yukihito Morikawa, published by the Japanese Society for Artificial Intelligence (Yukihito Morikawa, "Use of Artificial Intelligence Technology for Video Games", Journal of the Japanese Society for Artificial Intelligence vol.14 No.2 1999 3).

The flow of the game is as follows. The game proceeds day by day, and one Babuu arrives each day. As a space farmer, the player sets various traps for Babuu to protect the vegetables being grown. Babuu is caught in the traps at first, but evolves through the genetic algorithm to avoid them, becoming smarter and smarter as genes with excellent parameters survive.

Parameters include weight, height, strength, leg strength, tolerance, and solution finding. Only one Babuu faces the player on the surface of the game, but behind the scenes 20 individuals are evolving through successive generations. Every time the player sets a trap, the opponent gets smarter, because this evolution is running behind the scenes. "Astronoka" is also a very early example of a genetic algorithm used in a game, and it can be said to be highly polished.
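The generational loop described here (a population of 20, two parents crossed to form a child, roulette selection favoring fitter parents) can be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm from "Astronoka" or from Morikawa's article: the gene encoding, the single-point crossover, the fitness function, and all function names are hypothetical, and mutation is omitted for brevity.

```python
import random

POPULATION_SIZE = 20  # the article states that 20 individuals evolve as one unit
GENE_LENGTH = 6       # hypothetical: one value per trait


def roulette_select(population, fitnesses, rng):
    """Pick one parent with probability proportional to its (non-negative) fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0.0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick <= running:
            return individual
    return population[-1]  # guard against floating-point round-off


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def next_generation(population, fitness_fn, rng):
    """Breed a new population of the same size from roulette-selected parents."""
    fitnesses = [fitness_fn(individual) for individual in population]
    children = []
    for _ in range(len(population)):
        parent_a = roulette_select(population, fitnesses, rng)
        parent_b = roulette_select(population, fitnesses, rng)
        children.append(crossover(parent_a, parent_b, rng))
    return children


rng = random.Random(0)
population = [[rng.random() for _ in range(GENE_LENGTH)] for _ in range(POPULATION_SIZE)]
# Hypothetical fitness: the sum of the gene values. A game would instead
# measure something like how well an individual coped with the player's traps.
new_population = next_generation(population, sum, rng)
```

Because selection is proportional to fitness, weaker individuals can still become parents occasionally, which keeps some diversity in the population.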

Most of "Astronoka" consists of confrontations with the AI, and the game is about enjoying interaction with an AI that keeps growing.

Melody language seen in "Seaman"

As mentioned in the first part, dialogue agents (conversational agents) are drawing growing interest in the Japanese Society for Artificial Intelligence because of increasing social needs. However, even the dialogue agents being researched academically have not yet achieved natural conversation. Even when trained on large corpora (conversation data) with deep learning and similar methods, they have not produced natural conversation.

However, the game "Seaman" (SEGA, 1999), released about 20 years ago, became a hot topic within the Japanese Society for Artificial Intelligence because it had achieved natural conversation. Miyake and other members of the society's journal editorial board therefore visited the developer, Yoot Saito, and published an interview in the journal. Its contents are summarized here.

Seaman's system is built from a huge number of branches and is characterized by the concept of "melody language" (a term coined by Mr. Saito). Most conversation research does not focus on sound data, but in Japanese conversation the same word often has different meanings depending on intonation and accent. In other words, even when the same words are used, a conversation cannot be established unless the sound is also controlled.

Also, grammatically accurate wording is not always the best spoken language. For example, replying "Please say it again" when the other party cannot be heard may be a good response for a service robot, but it is not natural as conversation. Simply saying "I can't hear you" is more natural. Melody language is one of the elements extracted in this way from the interplay of sounds and words in everyday conversation.

In the interview, Mr. Saito points out that recent conversational agents are built like search engines. That is, they ignore the rhythm of conversation and always try to return only accurate information in response to the input. "Seaman" was thus an epoch-making game of the 90's, and its achievements continue to attract attention in the Japanese Society for Artificial Intelligence to this day.

"Dokodemoissho-Toro and Shooting Star" (Sony Computer Entertainment, 2004) is another title in which players enjoy conversations with an AI-driven character. In the Dokodemoissho series, Toro, a cat character who dreams of becoming a human being, talks using the words that the player has taught him.

Presenting an AI-driven character as the main content within a very simple game system is the defining feature of games built around conversation with characters, such as "Seaman" and the Dokodemoissho series.

Introducing autonomous agents

In 2000, "The Sims" (Maxis) was released (in Japan, under the title SimPeople). Like "Apple Town Monogatari" and "Little Computer People", it is a game about observing characters, and the characters perform a wide variety of actions that entertain the player.

In the lecture, Mr. Miyake presented material on the AI of "The Sims" that is available on overseas university sites and the GDC site. The Sims AI is utility-based and consists of "Meta," "Peer," and "Sub." Meta is the overall AI mechanism, and Peer is the character.

The Sims holds information about its various characters and objects, and interactions, together with their graphics, animations, and sounds, are embedded in the objects themselves.

Characters use a personality model called the "motive engine". There are eight parameters in total: the physical (physiological) parameters Hunger, Comfort, Hygiene, and Bladder, and the mental parameters Energy, Fun, Social, and Room. These eight parameters change according to the character's actions. For example, going to the bathroom relieves the physiological parameters, and talking to a person restores the Social parameter.

In addition, a weight curve is defined for each parameter, and the overall happiness (Mood) is obtained by multiplying each parameter's value by its weight and adding up the results. This makes it possible to select the optimal (= maximum-utility) action.

For example, when hunger recovers from "-80" to "60", the overall happiness changes by the weighted difference between "-80" and "60". Suppose it then changes further from "60" to "90". Comparing the increase in happiness from -80 to 60 with the increase from 60 to 90, the former is overwhelmingly larger. In other words, going from "extremely hungry" to "a little full" increases happiness more than going from "a little full" to "more full". The phenomenon that "the first glass of beer tastes best, and the second is only a little tastier" is called "the law of diminishing marginal utility" in economics, and this mechanism of utility has been built in.
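The weighted-happiness calculation and the diminishing return in the hunger example can be illustrated with a short sketch. The linear weight curve, the value range of -100 to 100, the action effects, and the function names are all assumptions chosen for illustration. The actual curves used in The Sims are not given in this article.

```python
def weight(value):
    """Hypothetical weight curve for a need value in [-100, 100].

    It falls linearly from 0.8 to 0.4, so a need matters more when it is low.
    """
    return 0.6 - 0.002 * value


def mood(needs):
    """Overall happiness: the sum of weight(value) * value over all needs."""
    return sum(weight(v) * v for v in needs.values())


def apply_action(needs, effects):
    """Return a copy of the needs after an action, clamped to [-100, 100]."""
    updated = dict(needs)
    for name, delta in effects.items():
        updated[name] = max(-100, min(100, updated[name] + delta))
    return updated


def best_action(needs, actions):
    """Choose the action whose outcome gives the highest mood (maximum utility)."""
    return max(actions, key=lambda name: mood(apply_action(needs, actions[name])))


# Gain in mood for the two hunger changes discussed in the text.
gain_starving = mood({"Hunger": 60}) - mood({"Hunger": -80})
gain_sated = mood({"Hunger": 90}) - mood({"Hunger": 60})

needs = {"Hunger": -80, "Social": 60}
actions = {
    "eat": {"Hunger": 30},   # hypothetical effect sizes
    "chat": {"Social": 30},
}
choice = best_action(needs, actions)
```

With this curve, `gain_starving` is far larger than `gain_sated`, and the hungry character chooses to eat rather than chat even though both actions raise a need by the same amount.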

Research on a coordinated placement system at CCP

CCP Games (CCP), the developer of "EVE Online", is studying natural agent behavior in collaboration with Reykjavik University. For example, when four characters gather, they are arranged in a circle around a shared center. This applies knowledge from psychology, which uses terms such as "O space" and "P space": the standing positions that form when people face each other are analyzed and used to design the characters' social behavior. As another example, the back is a vulnerable spot for humans, so people standing near a wall unconsciously turn their backs to it and feel uncomfortable facing it. That knowledge is applied as well. In addition, F-formation, a technique based on psychological knowledge, has been implemented and is used to decide how space is used when several characters are arranged.

This is not direct conversation with the player, but when, for example, the player joins such a group as its fourth member, the three existing AI characters re-form the circle to include the player, which looks more natural and human. In this way, agents in a city crowd can be made to appear to behave more naturally.
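The circle arrangement can be sketched with simple geometry. This is only an illustration of placing characters evenly around a shared center and is not CCP's implementation. The function name, the fixed radius, and the even spacing are assumptions.

```python
import math


def circle_formation(center, radius, count):
    """Place `count` characters evenly on a circle, each facing the center.

    Returns a list of (x, y, facing_radians) tuples.
    """
    cx, cy = center
    slots = []
    for i in range(count):
        angle = 2.0 * math.pi * i / count
        x = cx + radius * math.cos(angle)
        y = cy + radius * math.sin(angle)
        facing = math.atan2(cy - y, cx - x)  # look toward the shared center
        slots.append((x, y, facing))
    return slots


three = circle_formation((0.0, 0.0), 1.0, 3)
# When the player joins as the fourth member, recompute the slots so that
# the three existing characters make room.
four = circle_formation((0.0, 0.0), 1.0, 4)
```

A fuller version would also assign each existing character to its nearest new slot so that nobody crosses the circle when the group re-forms.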

Using environmental awareness for dialogue

The pattern matching method used in "LEFT 4 DEAD" (Valve Software, 2012) selects a specific line in response to what the character perceives, expressed as a group of conditions. For example, to produce a simple exchange with an enemy or an ally in the game, candidate lines are generated from the visual information, and the candidate with the highest evaluation value is spoken.
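Selecting a line by matching condition groups against what the character perceives can be sketched as below. This is a simplified illustration, not Valve's implementation: the rule format, the sample lines, and the choice to use the number of matched conditions as the evaluation value are all assumptions.

```python
def score(rule, facts):
    """Return the number of matched conditions, or None if any condition fails."""
    for key, expected in rule["conditions"].items():
        if facts.get(key) != expected:
            return None
    return len(rule["conditions"])


def pick_line(rules, facts):
    """Choose the line of the best-scoring rule, or None when nothing matches."""
    best_line, best_score = None, -1
    for rule in rules:
        s = score(rule, facts)
        if s is not None and s > best_score:
            best_line, best_score = rule["line"], s
    return best_line


# Hypothetical rules and facts, for illustration only.
rules = [
    {"conditions": {"sees": "enemy"}, "line": "Enemy ahead!"},
    {"conditions": {"sees": "enemy", "health": "low"}, "line": "I can't take another hit!"},
    {"conditions": {"sees": "ally"}, "line": "Good to see you."},
]
facts = {"sees": "enemy", "health": "low", "location": "street"}
line = pick_line(rules, facts)
```

Scoring by the number of matched conditions means the most specific rule that fits the situation wins, so general lines act as fallbacks.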

The Last of Us (Naughty Dog, 2014) uses a similar approach. In The Last of Us, the next line is determined by the condition group of the previous line. Candidates for the next line are scored against the various conditions that have been set, and the selected candidate is chained on as the following line. Having the characters hold short conversations using the context created in this way provides an indirect narrative that explains the situation to the player.

MCS-AI dynamic cooperation model in "FINAL FANTASY XV"

The last topic Miyake introduced was the MCS-AI dynamic cooperation model implemented in "FINAL FANTASY XV". In this model, meta AI, character AI, and spatial AI work together, and it supports features such as peer cooperation, a dialogue system, and moving conversation.

This, too, is an indirect narrative technique that gives the player various experiences by having the characters speak. For example, to tell the player how tense the current situation is, it can be better to have a character say or do something that suggests the situation than to deliver scenario lines that state it outright.

For example, without Meta AI, the companion agents would all keep gathering in one place. With it, because two of the party members are busy in battle, characters can be made to exchange lines such as "I'll leave it to you" and "I understand". Adjusting through Meta AI in this way produces a natural flow in the scene. In this case, the character AI asks Meta AI whether it should help the player and receives information such as path search results from Meta AI in order to do so. In addition, a script system is provided for conversation. From the script candidates, it decides which character speaks which line on each occasion.

The "Face to Face system" calculates an appropriate position for a character to hold a conversation and runs a path search toward that location. It also handles positioning for the conversation by conducting a dialogue at the level of behavior. Here too, the AIs work together to determine positions and overall control. "Moving conversation", meanwhile, guesses the destination the player wants to reach, finds a position between that destination and the player, and moves the character toward it.

With these functions, the feeling of being supported by companions can be conveyed to the player not only through dialogue but also through body orientation and positioning.

The future of dialogue agents

As we have seen, dialogue agents in the game industry have evolved in their own way. Dialogue agent research in academia, on the other hand, has evolved around conversation based on natural language processing. A third trend is now emerging: dialogue agents in industries other than games.

Mr. Miyake points out that, as the game industry's next step, it will be important to consider next-generation AI systems that combine these three trends. A characteristic of the game industry is that the game system acts as the narrator and drives interaction with the user. Studying dialogue agents together with game systems that engage users in various ways should offer clues to further evolution without losing sight of the essence.

Writer: Takako Ouchi
