If there’s one thing AI image generation has been especially bad at, it’s rendering legible text. OpenAI is looking to make that a thing of the past with the launch of ChatGPT Images 2.0, its latest image generation model, which appears to be remarkably good at rendering text within images. It also introduces some other niceties, like additional aspect ratios and a better understanding of languages that don’t use the Latin alphabet.
“Images 2.0 brings an unprecedented level of specificity and fidelity to image creation,” OpenAI said in its Images 2.0 introduction post. “It can not only conceptualize more sophisticated images, it actually brings that vision to life effectively, able to follow instructions, preserve requested details, and render the fine-grained elements that often break image models: small text, iconography, UI elements, dense compositions, and subtle stylistic constraints, all at up to 2K resolution.”
Like other AI models, ChatGPT’s image generation previously struggled with text, even in English, but the images OpenAI has shown off suggest that Images 2.0 is surprisingly good at creating legible, coherent text related to its prompts. The model has also made strides in rendering Japanese, Korean, Chinese, Hindi and Bengali, OpenAI says.
The company also claims Images 2.0 is much better at following instructions and capturing small details from your prompts, allowing it to produce the particular characteristics that make certain visuals more realistic. As a result, it should be much better at creating pixel art, manga, comic book pages, and cinematic stills in a way that is more consistent with what you’d expect from those visual media.
Finally, it’s much more flexible with different aspect ratios. OpenAI says Images 2.0 supports ratios as wide as 3:1 and as tall as 1:3. That’s in addition to generating images at up to 2K resolution.
It’ll be interesting to see what people can generate with this new image model. The changes OpenAI has made should make it particularly well suited to things like marketing materials or visualizing a project during the planning stages. The launch also comes as OpenAI is making a more pronounced pivot toward business and productivity applications.
Images 2.0 is available to try now through the “Images” option in ChatGPT.