What are the application markets of the generative AI revolution in games


Original authors: James Gwertzman and Jack Soslow; translated and edited by DeFi Zhidao.

To understand how generative AI will revolutionize games, just look at this recent Twitter post from @emmanuel_2m, in which he discusses using Stable Diffusion + Dreambooth (popular 2D generative AI models) to generate potion images for a hypothetical game.

What makes this work revolutionary is not only that it saves time and money, but that it delivers quality as well - breaking the classic triangle of "cost, quality, speed: pick any two." Artists can now create high-quality images in a matter of hours that would take weeks to produce by hand. What is truly transformative is that:

Now, anyone who can learn some simple tools can gain this creativity.

These tools can create countless variations in a highly iterative manner.

Once trained, the process is real-time - the results are almost instantly available.

Not since real-time 3D has there been a technology this revolutionary for games. Spend any time talking with game creators and the sense of excitement and wonder is obvious. So where is this technology going? How will it change games? First, though, let's review: what is generative AI?

Image source: generated with the Boundless Layout AI tool

What is generative AI

Generative AI is a kind of machine learning in which a computer can generate original new content in response to a user's prompts. Today, text and images are the most mature applications of this technology, but work is under way in almost every creative field, from animation to sound effects to music, and even to creating virtual characters with fully fleshed-out personalities.

Of course, AI in games is nothing new. Even early games, such as Atari's Pong, had computer-controlled opponents to challenge players. But those virtual foes were not running AI as we know it today; they were just scripts written by game designers. They simulated an AI opponent, but they could not learn, and they could only ever be as good as the programmers who built them.

What is different now is the amount of computing power available, thanks to faster microprocessors and the cloud. With this power, it is possible to build large neural networks that can identify patterns and representations in highly complex domains.

This blog post is divided into two parts:

  • The first part covers our observations and predictions about generative AI in games.
  • The second part is our market map of the space, outlining each market segment and identifying the key companies in each.

Part I - Observations and predictions


First, let's explore some assumptions in the rest of this blog post:

1. The amount of research on generative AI will continue to grow, creating ever more effective techniques

Consider the chart of the number of academic papers on machine learning or artificial intelligence published each month in the arXiv archive:

As you can see, the number of papers is growing exponentially, with no sign of slowing down. And this includes only published papers - much of the work is never published at all, going straight into open-source models or product development. The result is an explosion of interest and innovation.

2. Of all forms of entertainment, games will be the most affected by generative AI

Games are the most complex form of entertainment in terms of the sheer number of asset types involved (2D art, 3D art, sound effects, music, dialogue, etc.). Games are also the most interactive, with a heavy emphasis on real-time experiences. This creates a steep barrier to entry for new game developers, and makes producing a modern, chart-topping game enormously expensive. It also creates a huge opportunity for generative AI to disrupt.

Think of a game like Red Dead Redemption 2, one of the most expensive games ever made, with a production cost of nearly $500 million. It is easy to see why: it has one of the most beautiful, fully realized virtual worlds of any game on the market. It also took nearly eight years to build, features more than 1,000 non-playable characters (each with its own personality, artwork, and voice actor), a world of nearly 30 square miles, more than 100 missions across six chapters, and almost 60 hours of music created by over 100 musicians. Everything about this game is big.

Now compare Red Dead Redemption 2 with Microsoft Flight Simulator, which is not just big but truly enormous. Microsoft Flight Simulator lets players fly around the entire Earth, all 197 million square miles of it. How did Microsoft create such a huge game? By letting AI do it. Microsoft partnered with Blackshark.ai to train an AI to generate a photorealistic 3D world from 2D satellite images.

This is an example of a game that would have been practically impossible to build without AI, and one that also benefits from the fact that these models can be improved over time. For example, they can improve their "highway cloverleaf interchange" model, rerun the entire build process, and suddenly every highway interchange on the planet is improved.

3. Every asset type involved in game production will have a generative AI model

So far, 2D image generators like Stable Diffusion or Midjourney have captured most of the popular excitement around generative AI because of the eye-catching images they can produce. But there are already generative AI models for virtually every asset type in games, from 3D models to character animation to dialogue and music. The second half of this post includes a market map highlighting the companies focused on each type of content.

4. The price of content will drop dramatically, and in some cases effectively fall to zero

When we talk with game developers who are integrating generative AI into their production pipelines, the thing they are most excited about is the dramatic reduction in time and cost. One developer told us that their time to generate concept art for a single image, from start to finish, has dropped from three weeks to a single hour. We believe similar savings will be possible across the entire production pipeline.

To be clear, this does not mean artists are in danger of being replaced. It does mean that artists no longer need to do all the work themselves: they can now set the initial creative direction and hand much of the time-consuming, technical execution off to an AI. In this respect they are like the cel painters of the early days of hand-drawn animation, in which skilled "inkers" drew the animation outlines and lower-cost "painters" did the time-consuming work of painting the cels and filling in the lines. It is "autocomplete" for game creation.

5. We are still in the infancy of this revolution, and many practices still need to be worked out

As excited as we are, we are still at the starting line. There is an enormous amount of work to be done figuring out how to apply this new technology to games, and that will create huge opportunities for the companies that move into this new space quickly.


Given these assumptions, here are some predictions about how the game industry may change:

1. Learning how to effectively use generative AI will become a marketable skill

We are already seeing some experimenters use generative AI far more effectively than others. Taking full advantage of this new technology requires a range of tools and techniques and an understanding of how to combine them flexibly. We predict this will become a marketable skill in its own right, combining an artist's creative vision with a programmer's technical skills.

Chris Anderson famously said, "Every abundance creates a new scarcity." As content becomes abundant, we believe the scarce resource will be the artists who know how to work most effectively and collaboratively with AI tools.

For example, using generative AI to make production artwork involves special challenges, including:

  • Coherence. For any production asset, you need to be able to change or edit it later. With AI tools, that means being able to reproduce the asset from the same prompt so you can then make your changes. This can be tricky, because the same prompt can produce very different results.
  • Style. It is important that all the art in a given game have a consistent style - which means your tools need to be trained on, or otherwise constrained to, your given style.
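The coherence challenge comes down to reproducibility: a diffusion pipeline is deterministic once the prompt, seed, and sampler settings are all pinned, which is what lets an asset be regenerated later for edits. Here is a minimal sketch of that principle, using a hypothetical stand-in function rather than a real model:

```python
import random

def generate_asset(prompt: str, seed: int) -> str:
    """Stand-in for a diffusion model: deterministically derives an
    'asset' from the prompt and seed. A real pipeline (e.g. Stable
    Diffusion) is likewise reproducible when prompt, seed, and
    sampler settings are all pinned."""
    rng = random.Random(f"{prompt}|{seed}")
    # Pretend these are latent choices the model makes.
    palette = rng.choice(["amber", "crimson", "viridian"])
    variation = rng.randint(0, 9999)
    return f"{prompt}::{palette}::{variation:04d}"

# Same prompt + same seed -> identical asset, so it can be regenerated later.
a = generate_asset("glowing health potion", seed=42)
b = generate_asset("glowing health potion", seed=42)
assert a == b

# A different seed yields a variation on the same prompt.
c = generate_asset("glowing health potion", seed=43)
assert c != a
```

In real tools this is why pipelines expose an explicit seed parameter: recording the seed alongside the prompt is what turns a one-off generation into an editable production asset.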

2. Lower barriers to entry will bring more risk-taking and creative exploration

We may soon enter a new "golden age" of game development, in which a lower barrier to entry leads to an explosion of more innovative and creative games. Not only will lower production costs mean lower risk; these tools also unlock the ability to create high-quality content for a much broader audience. Which leads to the next prediction...

3. The rise of AI-assisted "micro game studios"

Armed with generative AI tools and services, we will start to see more commercially viable games produced by tiny "micro studios" of just one or two employees. The idea of a small indie game studio is not new - the hit game Among Us was created by the studio Innersloth, which had just five employees at the time - but the scale and scope of the games these small studios can create will grow. This will lead to...

4. An increase in the number of games released each year

The success of Unity and Roblox shows that providing powerful creation tools leads to more games being created. Generative AI will lower the bar further still and create even more games. The industry already faces a discoverability challenge - more than 10,000 games were added to Steam last year alone - and this will put even more pressure on discovery. However, we will also see...

5. New game types that were impossible before generative AI

We will see the invention of entirely new game genres that simply could not exist without generative AI. We have already talked about Microsoft's Flight Simulator, but new genres will be invented that depend on new content being generated in real time.

Consider Spellbrush's Arrowmancer, a role-playing game featuring AI-created characters that allow for nearly unlimited new gameplay.

We also know of another game developer using AI to let players create their own in-game avatars. Previously they had a library of hand-drawn avatar images that players could mix and match to create an avatar - now they have scrapped that entirely and simply generate the avatar image from the player's description. Letting players generate content through AI is also safer than letting them upload their own content from scratch, because the AI can be trained to avoid creating objectionable content, while still giving players a greater sense of ownership.
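One way such a developer might gate player-driven avatar generation is to screen prompts before they ever reach the model. The sketch below is purely illustrative - the blocklist, function names, and fallback asset are all hypothetical, and real systems use trained safety classifiers rather than keyword lists:

```python
# Hypothetical prompt gate in front of an avatar generator.
BLOCKED_TERMS = {"gore", "hate symbol"}  # illustrative only

def is_allowed(prompt: str) -> bool:
    """Crude keyword screen; production systems use trained classifiers."""
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

def request_avatar(prompt: str) -> str:
    """Gate the prompt, then (in a real pipeline) call the image model."""
    if not is_allowed(prompt):
        return "default_avatar.png"  # fall back instead of generating
    return f"generated:{prompt}"     # stand-in for the model call

assert request_avatar("friendly robot knight") == "generated:friendly robot knight"
assert request_avatar("gore covered villain") == "default_avatar.png"
```

The design point is that moderation sits in front of generation, so the model itself never sees a disallowed prompt and the player still gets a valid (if generic) avatar.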

6. Value will accrue to industry-specific AI tools, not just the foundation models

The excitement and hype around foundation models like Stable Diffusion and Midjourney are generating eye-popping valuations, but the constant flood of new research guarantees that models will come and go as techniques improve. Consider three popular generative AI image models - Dall-E, Midjourney, and Stable Diffusion - each of which has taken its turn in the spotlight.

A more durable approach may be to build an industry-specific suite of tools that focuses on the generative AI needs of a particular industry, develops a deep understanding of a specific audience, and integrates tightly into existing production pipelines (such as Unity or Unreal for games).

A good example is Runway, which targets the needs of video creators with AI-assisted tools for video editing, green-screen removal, inpainting, motion tracking, and more. Tools like this can build and monetize a specific audience, adding new models over time. We have yet to see a Runway-like suite emerge for games, but we know this is a space under active development.

7. Legal challenges are coming

What all of these AI models have in common is that they are trained on massive datasets of content, often created by scraping the Internet itself. Stable Diffusion, for example, was trained on more than 5 billion image/caption pairs scraped from the web.

For now, these models claim to operate under the "fair use" doctrine of copyright law, but that argument has not yet been definitively tested in court. Legal challenges are clearly coming, and they may reshape the generative AI landscape.

Large studios may seek competitive advantage by building proprietary models based on internal content to which they have clear rights and ownership. Microsoft, for example, is particularly well positioned here, with 23 first-party studios today and another seven pending its acquisition of Activision.

8. Programming will not be disrupted as deeply as artistic content - at least not yet

Software engineering is the other major cost of game development, but as our colleagues on the a16z Enterprise team shared in their recent blog post "Art Isn't Dead, It's Just Machine-Generated," code generated by AI models requires more testing and verification, so the productivity gains are smaller than for creative assets. Coding tools like Copilot may give engineers a modest performance boost, but they won't have the same impact... at least not anytime soon.


Based on these predictions, we offer the following advice:

1. Start exploring generative AI now

It will take time to figure out how to fully harness the power of the coming generative AI revolution, and the companies that start now will have an advantage later. We know of several studios already running internal experimental projects to explore how these techniques can affect production.

2. Look for opportunities in the market map

Some parts of our market map are already very crowded, such as animation or voice and dialogue, but other areas are wide open. We encourage entrepreneurs interested in this space to focus on the less-explored areas, such as a "Runway for games."

Part II - Market Map

Market status

We created a market map to capture a list of the companies we have found in each category where we see generative AI impacting games. The rest of this post walks through each category, explaining it in more detail and highlighting the most exciting companies in each.

2D Image

Generating 2D images from text prompts has become one of the most widely deployed applications of generative AI. Tools such as Midjourney, Stable Diffusion, and Dall-E 2 can generate high-quality 2D images from text, and have already found their way into game production at multiple stages of the game life cycle.

Concept art

Generative AI tools excel at "ideation," helping non-artists (such as game designers) quickly explore concepts and ideas and generate concept art, a key part of the production process. For example, one studio (which has asked to remain anonymous) is using several of these tools together to radically accelerate its concept art process, creating in a single day an image that used to take up to three weeks:

  • First, their game designers use Midjourney to explore different ideas and generate images they find inspiring.
  • These are handed to a professional concept artist, who assembles them and paints over the result to create a single coherent image, which is then fed into Stable Diffusion to create a series of variations.
  • They discuss the variations, pick one, hand-paint some edits, and then repeat the process until they are happy with the result.
  • At that point, the image is sent through Stable Diffusion one last time to upscale it and create the final artwork.
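The workflow above is essentially an iterate-until-satisfied loop around generation, curation, and manual editing. As a sketch only - every function here is a hypothetical stub standing in for a Midjourney or Stable Diffusion step and for the artist's judgment:

```python
import random

def generate_variations(image: str, n: int, rng: random.Random) -> list:
    """Stand-in for Stable Diffusion img2img: n variations of a base image."""
    return [f"{image}-var{rng.randint(0, 999):03d}" for _ in range(n)]

def pick_and_edit(variations: list) -> str:
    """Stand-in for the artist choosing one variation and hand-painting edits."""
    return variations[0] + "-edited"

def upscale(image: str) -> str:
    """Stand-in for the final upscaling pass to production resolution."""
    return image + "-4k"

rng = random.Random(7)
image = "concept-sketch"           # the coherent image assembled by the artist
for _ in range(3):                 # "until satisfied"; fixed at 3 rounds here
    variations = generate_variations(image, n=4, rng=rng)
    image = pick_and_edit(variations)
final = upscale(image)
assert final.endswith("-4k")
```

The structural point is that the human stays in the loop at every iteration: the model proposes, the artist disposes, and only the final selection is upscaled to production quality.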

2D production art

Some studios are already experimenting with using the same tools to create in-game production art. For example, here is a great tutorial from Albert Bozesan on using Stable Diffusion to create in-game 2D assets.

3D artwork

3D assets are the cornerstone of all modern games, as well as the coming metaverse. A virtual world or game level is essentially just a collection of 3D assets, placed and modified to populate the environment. Creating a 3D asset is more complex than creating a 2D image, however, and involves multiple steps: creating a 3D model, then adding textures and effects. For animated characters, it also involves building an internal "skeleton" and then layering animations on top of that skeleton.

We are seeing several startups attacking each stage of the 3D asset creation process, including model creation, character animation, and world building. This is not yet a solved problem, however - no solution is ready to be fully integrated into production.

3D Assets

Startups trying to solve 3D model creation include Kaedim, Mirage, and Hypothetic. Larger companies are also paying attention, including Nvidia with GET3D and Autodesk with ClipForge. Kaedim and GET3D focus on image-to-3D; ClipForge and Mirage focus on text-to-3D; and Hypothetic is interested in both text-to-3D search and image-to-3D search.

3D texture

The fidelity of a 3D model depends on the textures, or materials, applied to its mesh. Deciding which mossy, weathered stone texture to apply to a medieval castle model can completely change the look and feel of a scene. Textures contain metadata about how light reacts to the material (e.g., roughness, glossiness, etc.). Letting artists easily generate textures from text or image prompts would be hugely valuable for speeding up iteration in the creative process. Several teams are pursuing this opportunity, including BariumAI, Ponzu, and ArmorLab.
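To make the "metadata about how light reacts" concrete, here is a minimal material record of the kind a texture generator might emit alongside its image maps. The field names follow common PBR (physically based rendering) conventions and are illustrative, not any specific tool's format:

```python
from dataclasses import dataclass

@dataclass
class PBRMaterial:
    """Minimal physically-based material record. Scalar values are in
    [0, 1], per common PBR conventions."""
    name: str
    roughness: float   # 0 = mirror-smooth, 1 = fully diffuse
    metallic: float    # 0 = dielectric, 1 = metal
    albedo_map: str    # path to the generated color texture

    def __post_init__(self):
        for attr in ("roughness", "metallic"):
            value = getattr(self, attr)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{attr} must be in [0, 1], got {value}")

# The mossy castle stone from the example: rough, non-metallic.
mossy_stone = PBRMaterial("mossy_castle_stone", roughness=0.85,
                          metallic=0.0, albedo_map="mossy_stone_albedo.png")
assert mossy_stone.roughness > 0.5
```

A text-to-texture tool that outputs this kind of structured record, rather than a bare image, is what lets the result drop straight into an engine's material system.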


Animation

Creating great animation is one of the most time-consuming, expensive, and skilled parts of the game creation process. One way to reduce cost and create more realistic animation is motion capture, in which actors or dancers wear motion-capture suits and have their movements recorded on a specially instrumented motion-capture stage.

We are now seeing generative AI models that can capture animation directly from video. This is far more efficient, both because it eliminates the need for expensive motion-capture rigs and because it means animation can be captured from existing video. Another exciting aspect of these models is that they can also apply filters to existing animations, such as making them look drunk, or old, or happy. Companies in this space include Kinetix, DeepMotion, RADiCAL, Move.ai, and Plask.

Level design and world building

One of the most time-consuming aspects of game creation is building the game world, a task generative AI should be well suited to. Games such as Minecraft, No Man's Sky, and Diablo are already famous for using procedural techniques to generate their levels, which are created randomly, different every time, yet follow rules laid down by the level designer. A big selling point of the new Unreal Engine 5 is its procedural toolset for open-world design, such as vegetation placement.
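As a tiny illustration of what "random every time, yet following the designer's rules" means in practice, here is a classic procedural technique: a "drunkard's walk" cave carver. The rules (stay inside a solid border, start from the center, carve a connected path) are hand-written; generative models aim to learn this kind of structure from data instead:

```python
import random

def carve_cave(width: int, height: int, steps: int, seed: int) -> list:
    """Toy 'drunkard's walk' level generator. Each seed yields a
    different cave, but every cave obeys the designer's rules:
    connected floor, solid one-tile border."""
    rng = random.Random(seed)
    grid = [["#"] * width for _ in range(height)]
    x, y = width // 2, height // 2
    for _ in range(steps):
        grid[y][x] = "."  # carve floor at the current position
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x = min(max(x + dx, 1), width - 2)   # clamp to keep the border solid
        y = min(max(y + dy, 1), height - 2)
    return ["".join(row) for row in grid]

level = carve_cave(width=20, height=10, steps=150, seed=3)
assert any("." in row for row in level)                        # floor was carved
assert all(row[0] == "#" and row[-1] == "#" for row in level)  # border intact
```

Changing the seed gives a fresh level for free, which is exactly the property Minecraft-style procedural generation exploits; the bet described here is that learned models will eventually produce richer worlds than such hand-written rules can.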

We are seeing some initiatives in this space, such as Promethean, MLXAR, or Meta's Builder Bot, and believe it is only a matter of time before generative techniques largely replace procedural ones. Academic research on the topic has been under way for some time, including generative techniques for Minecraft levels and level design for Doom.

Another compelling reason to expect generative AI tools for level design is the ability to create levels and worlds in different styles. You could imagine asking a tool for a world in the style of 1920s New York, versus a Blade Runner-style dystopian future, versus a Tolkien fantasy world.

The following concepts were generated by Midjourney using the prompt "a game level... in the style of...".

Audio

Sound and music are a huge part of the game experience. We are starting to see companies use generative AI to generate audio, complementing the work already happening on the graphics side.

Sound effects

Sound effects are a fascinating open area for AI. There have been academic papers exploring the use of AI to generate "foley" (e.g., footsteps) in film, but so far there are few commercial products in games.

We think this is just a matter of time, since the interactivity of games makes them an obvious application for generative sound effects: both static sound effects created during production ("laser gun blast, Star Wars style") and real-time interactive sound effects generated at runtime.

Consider something as simple as generating footsteps for the player's character. Most games solve this by including a small number of pre-recorded footstep sounds: walking on grass, walking on gravel, running on grass, running on gravel, and so on. These sounds are tedious to produce and manage, and sound repetitive and unrealistic at runtime.

A better approach would be a generative AI model for foley that runs in real time, dynamically producing appropriate sound effects that are subtly different every time.
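The runtime idea can be sketched even without a neural model: keep base parameters per surface and jitter them on every step, so no two footsteps are identical. A generative audio model would synthesize the waveform itself; this toy only varies synthesis parameters, and all names and values are illustrative:

```python
import random

def footstep_params(surface: str, rng: random.Random) -> dict:
    """Return per-step synthesis parameters for a footstep sound.
    Base values per surface are fixed by the designer; each call
    jitters them slightly so footsteps never repeat exactly."""
    base = {
        "grass":  {"pitch": 0.9, "duration": 0.25, "noise": 0.6},
        "gravel": {"pitch": 1.1, "duration": 0.20, "noise": 0.9},
    }[surface]
    return {
        "pitch":    base["pitch"] * rng.uniform(0.95, 1.05),
        "duration": base["duration"] * rng.uniform(0.9, 1.1),
        "noise":    min(1.0, base["noise"] * rng.uniform(0.9, 1.1)),
    }

rng = random.Random(0)
steps = [footstep_params("gravel", rng) for _ in range(4)]
# Every step is slightly different - no audible looping.
assert len({round(s["pitch"], 6) for s in steps}) > 1
```

A learned model replaces the hand-tuned `base` table and the jitter heuristic with audio generated directly from context (surface, speed, character weight), but the runtime contract is the same: a fresh, slightly different sound on every step.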


Music

Music has always been a challenge for games. It matters because, just as in film or TV, it helps set the emotional tone, but because a game can last hundreds or even thousands of hours, it quickly becomes repetitive or annoying. And because games are interactive, it can be hard for the music to precisely match what is happening on screen at any given moment.

Adaptive music has been a topic in game audio for more than 20 years, going back to Microsoft's "DirectMusic" system for creating interactive music. DirectMusic was never widely adopted, largely because composing in the format was so difficult. Only a few games, such as Monolith's No One Lives Forever, ever shipped truly interactive scores.

Now we are seeing a number of companies working on AI-generated music, including Soundful, Musico, Harmonai, Infinite Album, and Aiva. And while some of today's tools, such as OpenAI's Jukebox, are computationally intensive and cannot run in real time, most can run in real time once the initial model has been built.

Voice and conversation

A large number of companies are trying to create realistic voices for in-game characters. This is not surprising, given the long history of speech synthesis aimed at giving computers a voice. These companies include Sonantic, Coqui, Replica Studios, Resemble.ai, ReadSpeaker, and more.

There are many advantages to using generative AI for speech, which partly explains how crowded this space is:

  • Generating dialogue on the fly. Voice in games is usually pre-recorded by voice actors, which limits characters to those pre-recorded lines. With AI-generated dialogue, characters can say anything - meaning they can respond fully to what players do. Combined with smarter AI models for the NPCs themselves (beyond the scope of this post, but an equally exciting area of innovation right now), the promise of games that fully respond to the player is on its way.

  • Role-playing. Many players want to play fantasy characters who bear little resemblance to their real-world identity, but the fantasy is shattered the moment the player speaks in their own voice. Using a generated voice that matches the player's avatar preserves the illusion.
  • Control. When voice is generated, you can control its subtle qualities, such as timbre, intonation, emotional resonance, phoneme length, and stress.
  • Localization. Dialogue can be translated into any language and spoken in the same voice. Companies like Deepdub specialize in this niche.

NPCs and player characters

Many startups are looking at using generative AI to create believable characters you can interact with, in part because this is a market with such broad applicability beyond games, such as virtual assistants or receptionists.

Efforts to create believable characters go back to the very beginnings of AI research. In fact, the classic "Turing test" for AI is defined by whether a human can tell the difference between chatting with an AI and chatting with another human.

Hundreds of companies are now building general-purpose chatbots, many of them powered by GPT-3-like language models. A handful are specifically trying to build chatbots for entertainment, such as Replika and Anima, which are building virtual friends. As explored in the movie Her, the idea of dating a virtual companion may be closer than you think.

We are now seeing the next iteration of these chatbot platforms, such as Charisma.ai, Convai.com, and Inworld.ai, which are designed to power fully rendered 3D characters with emotion and agency, and to give creators the tools to give those characters goals. This matters if the characters are to fit into a game and play a narrative role in advancing the plot, rather than being purely decorative.

All-in-one platforms

Runway (runwayml.com) is one of the most successful generative AI tools so far, because it brings a broad suite of creator tools together in a single package. No such platform exists yet for video games, and we think that is an overlooked opportunity. We would love to invest in a solution with the following features:

  • A full suite of generative AI tools covering the entire production pipeline (code, asset generation, textures, audio, descriptions, etc.).
  • Tight integration with popular game engines such as Unreal and Unity.
  • A design that fits the typical game production workflow.


This is an incredible time to be a game creator! Thanks in part to the tools described in this post, it has never been easier to generate the content needed to build a game - even a game as big as the entire planet!

One day you may even be able to imagine a fully personalized game, built for the player exactly to the player's tastes. This has existed in science fiction for a long time - think of the AI-built "Mind Game" in Ender's Game, or the holodeck in Star Trek. But with the tools described in this post developing as quickly as they are, it is not hard to imagine this reality arriving sooner than you might think.
