Text-to-image generator Stable Diffusion is now available for anyone to put to the test.
Developed by Stability AI, Stable Diffusion was initially released to researchers earlier this month. The image generator is claimed to deliver a breakthrough in speed and quality while being able to run on consumer GPUs.
The model is based on the latent diffusion model created by CompVis and Runway, enhanced with insights from conditional diffusion models by Stable Diffusion’s lead generative AI developer Katherine Crowson, OpenAI, Google Brain, and others.
“This model builds on the work of many excellent researchers and we look forward to the positive effect of this and similar models on society and science in the coming years as they are used by billions worldwide,” said Emad Mostaque, CEO of Stability AI.
The core model was trained on LAION-Aesthetics, a dataset that filters the 5.85 billion images in the LAION-5B dataset based on how “beautiful” an image is, building on ratings from the alpha testers of Stable Diffusion.
Stable Diffusion runs on GPUs with under 10GB of VRAM and generates 512×512-pixel images in just a few seconds.
“We’re excited that state-of-the-art text-to-image models are being built openly and we are happy to collaborate with CompVis and Stability.ai towards safely and ethically releasing the models to the public and help democratise ML capabilities with the whole community,” commented Apolinário, ML Art Engineer at AI community Hugging Face.
Stable Diffusion goes head-to-head against other text-to-image models including Midjourney, DALL-E 2, and Imagen.
An interactive space to test Stable Diffusion has been created here.
(Image Credit: Fabian Stelzer)
Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London.
Explore other upcoming enterprise technology events and webinars powered by TechForge here.