Google’s AI image generator, Imagen 3, was first introduced during the company’s I/O keynote in May, but the public could not access it until last week. Following the release of a research paper, more users can now try what Google claims is its most advanced image generator yet.
Features and performance
Imagen 3 functions like most AI image generators: users enter a prompt, and after a wait of about 30 seconds, the generated images appear. According to Google, “It is preferred over other state-of-the-art models at the time of evaluation.” In tests by PetaPixel, Imagen 3 performed on par with competitors like Midjourney and OpenAI’s DALL-E, but it has one major advantage: it’s currently free to use.
Google has made bold claims about Imagen 3’s capabilities. “Imagen 3 is our highest quality text-to-image model, capable of generating images with even better detail, richer lighting, and fewer distracting artefacts than our previous models,” the company says.
Google has also focused on improving the model’s comprehension of prompts, resulting in a broader range of visual styles and more precise interpretation of longer prompts.
Limitations and restrictions
Despite its impressive performance, Imagen 3 has limitations, particularly when it comes to the data it was trained on. Google states the model was trained on a large dataset that includes images, text, and annotations, but it has offered little transparency about whether that dataset includes copyrighted material.
Imagen 3 lets users edit images through inpainting, a feature for selecting parts of an image and requesting specific changes. Certain prompts, however, are restricted. For example, users were unable to generate images like “A Californian landscape in the style of Ansel Adams” or create depictions of public figures such as “Kamala Harris and Donald Trump holding hands.”
However, as The Verge demonstrated, these restrictions can sometimes be bypassed. When asked to generate an image of “a cartoonish blue hedgehog running in a field,” the result was undeniably a representation of Sonic the Hedgehog, despite Google’s restrictions on copyrighted characters.
Ethical and copyright concerns
Google has faced ethical challenges in the past with its AI tools. Earlier this year, the company was criticised for its AI image generator on the Gemini platform, which reportedly “erased white people” by overcorrecting biases in its dataset. The backlash led Google to pause the tool’s ability to generate images of people.
To avoid similar controversies, Imagen 3 has built-in safeguards that prevent it from generating images of weapons, well-known people, or copyrighted characters. Even so, users have found ways to circumvent these limitations by providing detailed descriptions instead of using specific names.
Final thoughts
Google’s Imagen 3 represents a significant leap forward in AI-driven text-to-image technology, offering better image quality, richer detail, and better prompt comprehension than its predecessors. While the model is currently available for free and has garnered positive feedback, it comes with notable restrictions on generating copyrighted or sensitive content.
As ethical and copyright concerns continue to challenge the field of AI-generated imagery, Imagen 3 stands as a competitive and highly capable option in the growing landscape of text-to-image AI tools.
(Tashia Bernardus)