One month after launch it is nearly a unicorn, and tens of thousands of people have lined up to register. Is AI Art the next NFT?

Published on 9/30/2022

Author | Liu Yujie, Wang Yutong

If efficiency-oriented AI means autonomous driving, data analysis, unmanned factories, and the like, where does the boundary of creative AI lie?

In writing, AI services such as Caiyun Xiaomeng, Jasper, and Tsinghua's Jiuge have already delivered startling answers. They have sparked ideas for countless online writers and have drawn criticism even as they lowered the barrier to writing. In art creation, Disco Diffusion took off in the first half of this year and set off heated discussion.

In just a few months, AI Art has become a track that VCs worldwide are chasing: it fits the currently hottest PLG (product-led growth) model, it rests on AI technology with real underlying barriers, it sits at the intersection of technology and the humanities, and it has drawn a flood of users eager to try it.

Image source: Internet

The figure above shows the first explainer on Disco Diffusion published on UISDC, China's largest graphic design community. Designers are among the groups most sensitive to image creation tools. At that time, most consumer (C-end) users did not know this "black technology" existed; even those who did were put off from trying it by its complicated setup.

Yet within five months, Disco Diffusion's popularity set off a revolution of shared experimentation that grew out of Google Colab notebooks. According to insiders, its successor Stable Diffusion, which solved Disco Diffusion's technical pain points and launched on August 22, 2022, is now raising funds at a valuation of 500 million to 1 billion dollars. Just one month after launch it is expected to join the ranks of global unicorns, which shows how optimistic the primary market is about the future of AI Art.

Images of the 24 solar terms that netizens generated with Disco Diffusion on October 17.

AI Art, or AI-generated art, is a branch of AIGC. AIGC (Artificial Intelligence Generated Content) is "a new mode of production that uses AI technology to generate content automatically, following Professional Generated Content (PGC) and User Generated Content (UGC)". The equivalent international term is "AI-generated media" or "synthetic media", defined as "a general term for the production, manipulation and modification of data or media through AI algorithms".

In fact, from both a technical and a commercial perspective, AI has a long history of generating images and paintings. It is not an emerging field but one of continuous innovation and iteration.

As early as 2015, Google released and open-sourced DeepDream, which uses algorithms to generate psychedelic, surreal images. Over the past decade, everything from digital compositing to once-viral image effects such as "one-click Makoto Shinkai-style photos" and "childhood photo generation" has reflected the steady improvement and maturing of AI capabilities and of the computing power beneath them.

The look of images generated by Google DeepDream is highly distinctive

This year's surge in AI Art stems from a new interaction mode, text-to-image, which signals to the public that AI Art is entering an era of "democratization". With a text description, whether of the picture's subject and story or of an artist's style, composition, color, perspective, and other professional terms, anyone can generate a complete painting in tens of seconds. Art creation becomes like running: everyone can run, but professionals run faster.

At the underlying level, this is Diffusion's complete overhaul of GAN.

Traditional AI Art relied on generative adversarial networks (GANs), VAEs, and similar models. As the mainstream image generation model behind the previous generation of AI Art tools and platforms, GAN achieved major breakthroughs in model training but still has serious structural problems in practical application.

As Diffusion heats up, GAN may be replaced by it. A denoising diffusion model is a score-based generative model and a very powerful new class of generative model. It works by corrupting the training data through repeated addition of Gaussian noise, then learning to recover the data by reversing the noising process. Diffusion models also offer high sample diversity and accurate mode coverage of the learned data distribution, which makes them well suited to learning from large, varied, and complex data and addresses a weakness of GAN. The forward transformation gradually perturbs the input data until it is mapped to noise; generation is carried out by a learned, parameterized reverse process that starts from random noise and cleans it up step by step.
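The noising and denoising described above can be sketched in a few lines. This is a minimal illustration, not the implementation of any tool named in this article: it assumes a linear noise schedule and the commonly used closed-form forward step x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * noise, where a_t is the cumulative product of (1 - beta). All function names are hypothetical. A real model trains a neural network to predict the noise; here the true noise is passed back in, purely to show that the reverse step undoes the forward one.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, rising linearly (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for all s up to t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process: jump from clean data x0 directly to the noisy x_t."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
          for x, n in zip(x0, noise)]
    return xt, noise


def remove_noise(xt, predicted_noise, alpha_bar):
    """Invert the forward step given a noise estimate.

    A trained network would supply predicted_noise; this sketch passes
    the true noise, so the recovery is exact up to rounding.
    """
    return [(x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
            for x, n in zip(xt, predicted_noise)]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [0.5, -1.0, 0.25, 0.8]           # stand-in for pixel values
xt, noise = add_noise(x0, alpha_bars[499], rng)
x0_recovered = remove_noise(xt, noise, alpha_bars[499])
```

In practice the network's noise estimate is imperfect, which is one reason generation proceeds over many small steps rather than a single jump.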

Image source: Internet

Diffusion has markedly improved image quality and effectively reduced the telltale traces of digital generation. Users can choose how many steps to run; the more steps, the more detailed the image, which has also given rise to more "hardcore" demands.
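One common way for a sampler to honor a user-chosen step count is to visit only an evenly spaced subset of the timesteps the model was trained on. The sketch below illustrates that idea under stated assumptions: the 1,000-step training length and the function name are hypothetical, and the tools discussed here may space their steps differently.

```python
def sampling_timesteps(train_steps, sample_steps):
    """Pick sample_steps evenly spaced timesteps, noisiest first."""
    if not 1 <= sample_steps <= train_steps:
        raise ValueError("sample_steps must be between 1 and train_steps")
    stride = train_steps / sample_steps
    # Values are at least 1 apart, so flooring them keeps them distinct.
    chosen = [int(i * stride) for i in range(sample_steps)]
    return chosen[::-1]


fast = sampling_timesteps(1000, 50)       # coarse: 20 timesteps per jump
detailed = sampling_timesteps(1000, 250)  # fine: 4 timesteps per jump
```

More steps mean smaller jumps between neighboring noise levels, at the cost of proportionally more model evaluations.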

Image source: Internet

This is why, although AI Art tools have existed for a long time, their images often suffered from problems such as looking "too fake" or incomplete. One might as well have used Photoshop for stylization, so in today's Diffusion era those works have lost their value as artworks to collect and share.

Generators such as Disco Diffusion, Stable Diffusion, DALL-E 2, and Midjourney, through an exponential explosion of posts and works, have become the vanguard bringing AI generation to consumers and to the broader metaverse.

The figure above records a landmark event that made people truly pay attention to the AI Art field: an AI-generated artwork won first place in the art competition at the Colorado State Fair.

AI Art tools are developing rapidly in Europe and the United States. China started slightly later, and its participants are mainly large companies with deep experience in AI or image editing, such as Baidu and Meitu.

In this article, 36Kr surveys the AI Art tools that are popular around the world and analyzes their differences and commonalities, as a reference for domestic startups with similar ideas or capabilities and for investors looking at AIGC.

A considerable number of the AI Art tools and services on the market still use DeepDream or GAN as their underlying algorithm, while the recent wave of popularity comes mainly from Diffusion, so we divide them into two categories.

1. Diffusion

Stable Diffusion

Stable Diffusion is the most cutting-edge and popular machine learning model for AI painting. It was launched on August 22, 2022, and developed by StabilityAI. The Web demo is hosted on the open-source AI community Hugging Face. StabilityAI is an AI startup founded in 2019 and headquartered in London, committed to building solutions with AI as the technology carrier. Stable Diffusion's commercial version, DreamStudio, is currently in testing; it generates images faster and is about to launch an API. According to insiders, well-known VC firms such as Coatue and Lightspeed are considering investing in StabilityAI at a valuation of 500 million to 1 billion dollars.

  • Open source and free to use
  • Supports two modes: text-to-image and image-to-image
  • The Web demo is fast: generating an image is estimated to take 1 to 15 minutes, depending on the queue

Source: Stable Diffusion

Disco Diffusion

Disco Diffusion is a powerful open-source CLIP-guided diffusion model. Built on Google's technical architecture, it can create detailed and realistic images. Launched on October 29, 2021, it was developed by Accomplice, a company founded in 2016 that helps teams and individuals find an AI-driven image workflow that suits them.

  • Open source and free
  • Must be run through Google Colab. There is no friendlier user interface, so there is a barrier to use
  • Users can customize advanced options such as the number of steps

Source: Disco Diffusion


DALL-E 2

DALL-E 2 can create realistic images and art from natural language descriptions. It was launched on April 6, 2022, and developed by OpenAI. OpenAI was founded in 2015 by Silicon Valley technology figures including Elon Musk; Sam Altman, then president of the American startup incubator Y Combinator; and Peter Thiel, co-founder of the global online payment platform PayPal. Before DALL-E 2 was generally available, access was opened to only 1,000 users each week. On September 29, OpenAI removed the waiting list for its text-to-image system, DALL-E 2, so anyone can now register and use it immediately. According to OpenAI, about 1.5 million DALL-E users generate more than 2 million images every day.

  • Text-to-image generation takes only a few minutes and supports multiple iterations of the generated image
  • Editing and retouching functions, with support for customizing multi-layer images
  • For facial rendering, DALL-E 2 deliberately generates crooked eyes or distorted lips to prevent fake photos
  • Anyone who registers for DALL-E receives 50 free credits, then 15 more each month. Each credit generates one picture, and additional credits can be purchased: 115 credits cost 15 dollars
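Using only the figures quoted in the list above (50 free generation credits, or points, at sign-up, and 115 more for 15 dollars, one per picture), the rough cost of a batch of images can be worked out as follows. This is a back-of-the-envelope sketch; the function names are hypothetical and actual pricing may differ.

```python
import math

SIGNUP_CREDITS = 50     # free credits granted at registration (from the article)
PACK_CREDITS = 115      # credits in one paid pack (from the article)
PACK_PRICE_USD = 15.0   # price of one paid pack (from the article)


def price_per_credit():
    """Dollars per paid credit, i.e. per generated picture."""
    return PACK_PRICE_USD / PACK_CREDITS


def first_month_cost(images_wanted):
    """Dollars spent in the first month once the free sign-up credits run out."""
    paid_credits = max(0, images_wanted - SIGNUP_CREDITS)
    packs = math.ceil(paid_credits / PACK_CREDITS)  # packs are bought whole
    return packs * PACK_PRICE_USD


cost_for_300 = first_month_cost(300)  # 250 paid credits, so 3 packs
```

At these figures a paid picture costs roughly 13 cents, and a new user who wants 300 pictures in the first month would need three packs.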

Source: DALL-E 2

Midjourney

Midjourney is a popular AI art generator that is not yet open to everyone. Midjourney is an independent research lab that explores new mediums of thought and expands human imagination. It is a small self-funded team focused on design, human infrastructure, and artificial intelligence. Its AI text-to-image diffusion model is hosted on a Discord server and currently has 1.5 million users.

  • The demo version is easy to use and needs only a short text input
  • Built on Discord, with a healthy community ecosystem
  • Detailed documentation, developer-friendly

Source: Midjourney


TIAMAT

TIAMAT is an AI painting tool developed by a Chinese team. It was launched on July 22, 2022, and the company is headquartered in Shanghai. It is still in closed beta.

  • Supports Chinese input
  • Aimed mainly at Chinese users, with a better grasp of East Asian art styles
  • Closed-beta applications are handled through a Feishu community


Photosonic AI

Photosonic was developed by Writesonic, an AIGC company based in San Francisco, USA, that previously focused on AI text creation. Photosonic has generated more than one million images so far. It went online a week after the launch of Stable Diffusion; according to the founder of Stable Diffusion, Photosonic AI copied the open-source version of Stable Diffusion.

Source: Photosonic AI

2. Non-Diffusion


DeepDream

DeepDream, one of the most popular AI art generators on the market, was launched in June 2015. It is a computer vision program created by Google engineer Alexander Mordvintsev that lets users explore different AI algorithms. Many of the art-effect applications on the market today are built on this open-source technology.


NightCafe

NightCafe was launched in November 2019 and developed by Reddit. Reddit is an entertainment, social, and news website founded on February 3, 2005, and headquartered in San Francisco, committed to bringing community and a sense of belonging to everyone in the world. Through NightCafe, users own the artworks they generate and can buy printed versions of them.


Artbreeder

Artbreeder was launched in May 2019 and created by Joel and Studio Morphogen. Artbreeder aims to be a new kind of creative tool that empowers users' creativity by making collaboration and exploration easier. It uses the BigGAN and StyleGAN models, including a minimal open-source version of BigGAN.

Big Sleep

Big Sleep is a Python-based AI art generator. It combines BigGAN, developed by Google, with OpenAI's CLIP to generate images from text, and it grew out of Google Colab notebooks by the Twitter user advadnoun. It has to be run with Python. Image generation takes time and a large amount of memory, so it may not be possible to run the scripts on an ordinary computer.


StarryAI

StarryAI is an AI art generator app with more than 500,000 downloads on Google Play. It is a mobile application available for both iOS and Android. It supports creating NFTs and offers advanced options such as the number of generation steps.

WOMBO Dream

Wombo is a synthetic media company headquartered in Toronto. In March 2021, Wombo launched an AI-driven lip-sync app that lets users upload any still portrait and animate it to sing a song of their choice. The product went viral. The WOMBO Dream algorithm currently uses a CLIP-guided approach, with CLIP developed by OpenAI.


DeepAI

DeepAI was founded in San Francisco in 2017 and received seed-round financing in 2019. Its website uses 12 technology products and services, including HTML5, Google Analytics, and jQuery, as well as Viewport Meta, iPhone/Mobile Compatible, and the Google Font API. Its initial function was automatically colorizing black-and-white photos.

3. Progress at the big tech companies



Google Imagen

In May 2022, Google Research released Imagen. Imagen abandons the conventional approach of mapping text features to image features and then generating images with a GAN or diffusion model. Instead, it uses a pure language model solely to encode text features and leaves the text-to-image conversion to the image generation model, which here is still diffusion-based: a series of diffusion models. The reasoning is that text-only data is easier to obtain and more comprehensive than paired image-text data, so a model trained on it understands text better than one trained on image-text pairs.


Google Parti

In June 2022, Google released Parti, a text-to-image model that renders highly realistic images using tens of billions of parameters. Parti stands for "Pathways Autoregressive Text-to-Image". As the number of parameters increases, the output images become more realistic; the model draws on 20 billion parameters to generate the final image.

Parti differs from Imagen. Imagen is a text-to-image generator that Google built on diffusion learning. That process trains the model by adding "noise" to an image until it is obscured; the model then learns to decode the static and recreate the original image. As the model improves, it can turn what looks like a field of random dots into a picture.

At present, Google has not released Parti or Imagen to the public.

Facebook/Meta Make-A-Scene

Meta officially announced Make-A-Scene in July 2022. The team is currently testing it and collecting feedback from Meta employees, and access is being opened within Meta. Make-A-Scene accepts a preset scene layout, with a sketch as part of the input, and the user then fills in the frame through text input. The model can also create its own layout from text alone, but the user then gives up some control.

Microsoft NUWA

In March 2022, Microsoft Research Asia launched its latest multimodal model, NUWA. NUWA supports eight visual generation and editing tasks. The four image tasks are text-to-image, sketch-to-image, image completion, and image editing; the four video tasks are text-to-video, video-sketch-to-video, video prediction, and video editing. In July, Microsoft Research Asia published new research: NUWA-Infinity, an upgraded "infinite visual generation" model that can generate high-resolution images of any size or long videos.

Wenxin Yige

Wenxin Yige is a product built on a text-to-image system based on the Wenxin large model. It was launched on August 19, 2022, and is the first "AI painting" product from Baidu, relying on the technology of PaddlePaddle and the Wenxin large model. Baidu's PaddlePaddle Wenxin model is an industrial-grade knowledge-enhanced large model. Its text-to-image service accepts a text description along with a choice of style and resolution, and the model automatically creates an image that matches the input.

  • Covers guochao (China-chic), traditional Chinese, and other styles
  • Backed by Baidu's computing power, it generates images quickly and with a high degree of finish
  • Easy to operate, with advanced customization

Meitu AI Open Platform

Meitu AI Open Platform is an AI service platform launched by Meitu. It focuses on core areas such as face technology, human body technology, image recognition, image processing, and image generation, and provides customers with market-proven professional AI algorithm services and solutions.

  • Meitu has strengths in face technology, image segmentation, image enhancement, image generation, and related areas
  • Long-accumulated aesthetic experience lets it track beauty trends and combine art with technology
  • Cutting-edge technology is quickly integrated into products; with more than 100 million calls a day, it is both stable and practical

Heated discussion of AI generation on social media has always been colored by questions and biases about the ethics of science, while discussion of image generation technology has been driven by art lovers, designers, artists, and similar groups. The extension and protection services that AI Art brings around design productivity, intellectual property, and image data reuse may therefore be the next market trend.

It is also worth noting that a marketplace for trading AI Art works recently appeared on Product Hunt, the world's largest product community. This may be another emerging vertical for copyright trading since NFTs became popular.

AI Art Trading Market

If high-quality AI Art can sell for a good price, an era in which "everyone is an artist" will undoubtedly arrive.

Of course, any emerging technology goes through a period of intense attention at first and is then inevitably swamped by the market's "disappointment". AI Art is now in its early boom, and there are difficulties still to overcome.

The most important problem is that, compared with other AIGC tracks, AI Art today is somewhat sexier but seems to have less "practical value".

First, as the technology moves down from algorithm models into users' hands, how can it find the right customer groups for commercialization? As a "black technology" it looks attractive now, but its actual uses may remain limited to an inspiration tool for artists, a source of material for designers, and a novelty for the general public. How many individual users and business (B-end) customers will pay for art images in different styles? No one knows yet.

That said, in 36Kr's view, the business scenarios where AI Art can foreseeably land fall mainly into the following categories:

  1. Most directly, it can be embedded in consumer-grade image editing apps such as Meitu Xiuxiu and in designer-facing production tools such as Instant Design, enriching their use cases and improving user stickiness. Such vendors are reportedly already making moves in this direction;

Instant Design has launched an AI design plug-in

  2. It changes how professional creators produce work, for example as an effective tool that extends the abilities of illustrators, animators, and filmmakers and frees up productivity. In the future, the core professional skill in much creative work will lie in producing and stitching together digital material rather than in traditional handcraft;
  3. AI Art is backed by vast room for UGC and user personalization. It already fits well with the trend toward universal self-media (We Media) and low-barrier content production, and it will have more room to play in the metaverse market in the future. For this reason, major domestic content production and distribution platforms, e-commerce platforms, and Internet giants are likely to incubate AI Art features within their own product ecosystems, helping users quickly produce art content that matches the platform's tone while serving both their own users and enterprise customers;
  4. Since AIGC fits the no-code trend, AI Art is also likely to have high potential value as an enterprise service. The most direct target customers are advertising agencies, film and television production companies, architectural firms, and other businesses with heavy demand for artistic renderings; these alone represent a high market ceiling. The advertising and creative departments of brands are also a strong audience.

Brand ads generated with Midjourney

However, different user groups have different needs, and subsequent product iterations will be adjusted accordingly. AI Art today still exists mainly as algorithms, beta generation tools, and platform communities, and these will likely diverge into different value propositions and service types. After all, innovation in the underlying technology and the emergence of the track are only the first step in the long march of "AI replacing human beings".

Beyond commercialization, another point where opportunity and threat coexist is language. Most current products are built on English natural language understanding, while other major languages such as Chinese, Spanish, French, German, and Japanese undoubtedly represent considerable unmet demand. Serving different languages brings further difficulties. For example, Chinese is far harder than English for AI to learn, which may be one reason China temporarily lags behind Europe and the United States.

But challenges also point to blue oceans and opportunities. For example, TIAMAT, the first AI Art company to make Chinese natural language understanding its selling point, has emerged in China. Likewise, in Japan, where the ACG (anime, comics, and games) industry is well developed, the first enterprise-grade AI Art vendor to support Japanese input is bound to find a large market.

Despite the difficulties, European and American VCs are still willing to pay for risky future opportunities.

First, AI Art fits the PLG/CLG (product-led and community-led growth) model recognized at home and abroad. The product can effectively improve productivity and can spread gradually from individual users to teams and even enterprises. China also has leading PLG/CLG companies, such as Blue Lake (Lanhu) and PingCAP.

Second, AI has been regarded as the direction of the future in recent years, but where it can actually land is still being explored. AI Art, and AIGC more broadly, are scenarios with clear demand as AI matures, and they are worth being optimistic about. Culture and community are already taking shape: Midjourney's Discord-based service, for example, has made it the second-largest community on Discord, and TIAMAT, a domestic newcomer still in closed beta, receives hundreds of applications with detailed reasons every day. An open, sharing community is crucial to the AI Art field and is one of the key criteria for evaluating an AI Art company. Its contribution also shows in an Internet-native way of thinking that can quickly turn AI Art into a "digital skill".

Netizens launched the Disco Diffusion thesaurus sharing program

To create better AI Art, netizens launched a Disco Diffusion thesaurus (prompt vocabulary) sharing project. (The test version of Disco Diffusion on Google Colab notebooks still has a learning curve for understanding and generation, though many tools have since added guidance and filters for artistic style.)

2022 can be called the first year of AI Art, ushered in by Diffusion. Over the next three to five years, AI Art will develop in a freer direction, with tighter coupling to the user and more room for customization. That is, it will move closer to a process of "subjective creation", in which ever more detailed user ideas are differentiated and reflected in the artwork. DreamBooth, recently introduced by Google, already shows this trait.

DreamBooth AI

Meanwhile, the worldwide popularity of the metaverse and Web3 concepts also offers AI Art ideas for combination. Riding the twin hot concepts of AI and Web3, AI Art is likely to win over a group of investors who believe in the future.

There is nothing wrong with following a trend. Looking at China, however, the more cautious the investment climate becomes, the more rigorously investors may assess the underlying capabilities of AI Art startups. The era of raising money on new concepts alone has passed. AI Art startups that hope to make waves in China must at least have deeper reserves of AI algorithm technology, a more open community with substantial barriers in training data, product scenarios better suited to East Asian usage habits, and founders with longer-term ideals.

*Intern analyst Gu Zhenxing also contributed to this article
