After OpenAI more or less released its DALL-E tool to the public, AI text-to-image generation is on everyone’s lips.
I don’t subscribe to the notion of the creative industry’s demise at this point. But: being a creative with a technical background, I definitely want to get to grips with the new possibilities these tools offer. What I’m particularly interested in, however, is quite specific:
- How comparable are the results of popular AI tools when fed with the exact same prompts?
- How do the systems cope with prompts that don’t refer to something representational/figurative? This question interests me in particular: the examples on the vendors’ websites mostly show images created from prompts in the style of “A clown made of cheese is sitting on a baby elephant in space.” As absurd and grotesque as such inputs are, they still depict something tangible and imaginable. But how do the tools handle requests that reference abstract concepts that can only be captured by artistic means?
Having just worked as an artist on an exhibition about the Heavenly Jerusalem, I fed the following prompts to each of the AI tools:
- Prompt 1: uptopian heavenly paradise atmospheric mythos
- Prompt 2: imagined ideal space as utopian environment
- Prompt 3: afterlife world environment ideal space 4k cinematic dreamy atmospheric wallpaper
The following tools were used for comparison:
| Tool | License | Price |
| --- | --- | --- |
| OpenAI DALL-E 2 | commercial | approx. $0.13 per generation |
| Midjourney | commercial | unclear (in beta phase) |
| Stable Diffusion | open source / commercial | free |
While the first two tools run on the providers’ servers and requests are made via a web interface (DALL-E 2) or via Discord (Midjourney), Stable Diffusion uses the computing power of the user’s own PC. This calls for the most powerful graphics card you can get and plenty of VRAM. For the results presented here, an NVIDIA RTX 3080 Ti was used, whose resources were fully utilized. Although this hardware is fairly potent, it ran out of memory whenever images larger than 512 x 512 pixels were to be created. The biggest advantage here is the independence from any commercial provider and that it is (apart from the electricity) free, while Midjourney and OpenAI charge a small fee for each generation.
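For readers who want to reproduce such a local setup, the sketch below shows roughly how a 512 x 512 generation can be run with the Hugging Face diffusers library. The checkpoint name and settings are assumptions on my part; the images in this post were actually produced with the GRisk GUI build linked at the end of the article.

```python
# Minimal sketch: generating one 512 x 512 image locally with Stable Diffusion
# via the Hugging Face "diffusers" library. Checkpoint and settings are
# assumptions, not the exact configuration used for the images in this post.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",  # assumed v1.4 checkpoint (current in September 2022)
    torch_dtype=torch.float16,        # half precision to keep VRAM usage down
).to("cuda")                          # needs an NVIDIA GPU; 512 x 512 fits on a RTX 3080 Ti

prompt = "uptopian heavenly paradise atmospheric mythos"  # prompt 1 from this post
image = pipe(prompt, height=512, width=512).images[0]
image.save("prompt1_stable_diffusion.png")
```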
Results from prompt 1
uptopian heavenly paradise atmospheric mythos
DALL-E 2

Midjourney

Stable Diffusion
Results from prompt 2
imagined ideal space as utopian environment
DALL-E 2

Midjourney

Stable Diffusion
Results from prompt 3
afterlife world environment ideal space 4k cinematic dreamy atmospheric wallpaper
DALL-E 2

Midjourney

Stable Diffusion
Please note: I am not affiliated in any way with the developers or distributors of the presented tools. These images are NOT published under a Creative Commons license. All systems are under heavy development, so outputs may change significantly in the future. The images displayed here were created on September 7th and 8th, 2022.
Useful links & tools
- Max token length and prompt tips in Stable Diffusion
- Token calculator (OpenAI, Stable Diffusion uses a slightly different method, but nice to get a quick estimate; see the small token-counting sketch after this list)
- Stable Diffusion GRisk GUI 0.1
- Max length for prompts in DALL-E 2
- Alternative GUI for Stable Diffusion (see also discussion here)
- Using Stable Diffusion as “render engine” in Blender
- Alternative GUI Stable Diffusion (needs Anaconda)
- Nice Airtable list with current text-to-image tools and frameworks
- A search engine for 5.8 billion images from the “LAION-5B” dataset. (Alternative with more filter possibilities but rather unpolished UX: here)
- How the same Stable Diffusion prompt is interpreted with different artists/art styles: link
- Render interior designs: link
- How Stable Diffusion works from a technical perspective: The Illustrated Stable Diffusion
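As a small illustration of the token limits mentioned in the links above, here is a rough sketch of how a prompt can be tokenized and counted with the CLIP tokenizer that Stable Diffusion v1.x builds on (the tokenizer name below is an assumption; DALL-E 2 and Midjourney count tokens differently):

```python
# Rough sketch: counting CLIP tokens for a prompt with the Hugging Face
# "transformers" library. Stable Diffusion v1.x truncates prompts at 77 tokens
# (including start/end tokens), so longer prompts are silently cut off.
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

prompt = "afterlife world environment ideal space 4k cinematic dreamy atmospheric wallpaper"
token_ids = tokenizer(prompt)["input_ids"]
print(len(token_ids), "tokens:", tokenizer.convert_ids_to_tokens(token_ids))
```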
last update of this post: October 5th, 2022