Wednesday, May 29, 2024

OpenAI Has Just Killed Prompt Engineering With DALL-E 3

You can now get high-quality images of complex scenes automatically.

Change of plans. I was about to publish the third installment of the OpenAI series when the company abruptly released DALL-E 3.

It matters for three reasons.

First, OpenAI evidently has more than enough resources to build incredible image models while still competing with Google and its other rivals in the language domain.

Second, image generation technology has advanced even further. If not in quality (we’ll have to see how it compares to Midjourney), then at least in the systems’ basic ability: what they can do at any given level of quality.

Third, prompt engineering for AI art is finished if OpenAI’s account of the model’s capabilities is accurate. I will briefly comment on the implications of this one after going over what OpenAI has to say about its latest release.

What DALL-E 3 can do

“DALL-E 3 understands significantly more nuance and detail than our previous systems, allowing you to easily translate your ideas into incredibly accurate images,” the company claims.

One of the main drawbacks of models like Midjourney and Stable Diffusion (and the earlier DALL-E versions, too) was how hard it was to write a prompt that faithfully conveyed your mental image of a scene to the model. With DALL-E 3, OpenAI appears to have found the answer.

Here’s an example from the blog post:

“DALL-E 3 can faithfully depict a scene with particular objects and their relationships.”

Isolated people and objects are easy to generate at good quality, but what about scenes where several elements must respect the relationships specified in the prompt? Neither Midjourney nor Stable Diffusion lets you do this. That problem had remained unsolved.

Some time ago, Sam Altman predicted that prompt engineering would be a transient stage of generative AI.

