Midjourney recently released a new feature that lets you use a style reference in its image generation AI, and with it, it has arguably killed prompt engineering. Let’s take a look at what this new feature is and understand how it will impact prompt engineering, at least for image AI.
The new parameter is --sref, short for style reference: it transfers the style of an existing image (either uploaded or generated by Midjourney) onto whatever you generate next. The beauty here is that the style is simply another picture (for now). You can also adjust how much impact it has on your image generation with a companion parameter:
--sw 0-1000, the style strength.
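To make the parameter mechanics concrete, here is a minimal Python sketch that assembles such a prompt string. The helper name, subject, and URL are invented for illustration; only --sref, --sw, and --v come from the article.

```python
# Minimal sketch: assemble a Midjourney-style prompt with a style reference.
# build_prompt is a hypothetical helper; the URL below is a placeholder.

def build_prompt(subject: str, sref_url: str,
                 style_weight: int = 100, version: str = "6.0") -> str:
    """Compose a prompt string with --sref and --sw parameters."""
    if not 0 <= style_weight <= 1000:
        raise ValueError("--sw must be between 0 and 1000")
    return f"{subject} --v {version} --sref {sref_url} --sw {style_weight}"

prompt = build_prompt(
    "corporate beach party",
    "https://example.com/reference.png",  # placeholder reference image
    style_weight=250,
)
print(prompt)
# corporate beach party --v 6.0 --sref https://example.com/reference.png --sw 250
```

The string produced is what you would paste into the Discord chat box; the range check simply mirrors the 0-1000 bounds of --sw.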
You start with whatever you want to generate; I started like this:
You also need to check your settings to allow for
For the picture above, click the Variations button like so:
Now, focus on the reference you want. For the purpose of this example, I chose a collage-type picture that looks like this:
The prompt now becomes this, with the bolded part being the reference picture. You obtain this link by either uploading a picture inside the chat box and pressing ENTER or using one of your older generated pictures. Click on the picture, and once it pops out, right-click and select “Copy link.”
painting in the style of kristoffer zetterstrand, with mixed media – photography, acrylic, graphite – with the theme of corporate beach party --v 6.0 --sref https://media.discordapp.net/attachments/1090299970738987021/1204348332319973396/p6framvglvgc1.png?ex=65d467ca&is=65c1f2ca&hm=583512e5255aa755cd6292cdc67e7d340d92524f1c0f2f887e0c9d7767319e97&=&format=webp&quality=lossless&width=905&height=905
This is the result:
We then kept adding; we used these next pictures as references as well:
These are some of the results that we got:
What are the prompts for these? They look like this:
We also played with the same pictures but with a different starting prompt; it ended up like this:
Illustration in the style of kristoffer zetterstrand, with mixed media – photography, acrylic, graphite. – with the theme of corporate beach party https://s.mj.run/m3kpdlkxHTc --sref https://s.mj.run/HmsHQOTmWy8 ::3 https://s.mj.run/4FZRdketRhI ::4 https://s.mj.run/NLiLzN1uFak ::3 --v 6.0
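The multi-reference prompt above follows a pattern: each reference URL is followed by a ::N weight, and the whole group sits after --sref. As a rough sketch of that assembly in Python (the helper name and URLs are invented for illustration; the syntax is taken from the example above):

```python
# Sketch: combine several style references, each with its own ::N weight,
# mirroring the multi-reference prompt format shown in the article.
# build_multi_ref_prompt is a hypothetical helper; URLs are placeholders.

def build_multi_ref_prompt(subject: str,
                           refs: list[tuple[str, int]],
                           version: str = "6.0") -> str:
    """refs is a list of (url, weight) pairs; each weight follows its URL as ::N."""
    ref_part = " ".join(f"{url} ::{weight}" for url, weight in refs)
    return f"{subject} --sref {ref_part} --v {version}"

print(build_multi_ref_prompt(
    "corporate beach party",
    [("https://example.com/a.png", 3),   # placeholder reference, weight 3
     ("https://example.com/b.png", 4)],  # placeholder reference, weight 4
))
```

The weights let one reference dominate the blend, which is exactly the knob being turned in the experiments above.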
Now here lies the beauty and the problem with prompt engineering.
Why did prompt engineering die?
There are two things that form this concept of “prompt engineering”:
- Transferability – the ability of this prompt to be taught or replicated by various people to achieve consistent results. If they get different results, then the prompt is ineffective.
- In-depth knowledge – the knowledge to build a prompt comes from memorizing strings of parameters or phrases that consistently give you very similar results across multiple experiments.
In one year, prompt engineering as a career went from being scientific – learning how the tool thinks and replies so you can get what you want – to forming a mental concept of what you want and translating it, via your knowledge repository, into the end result.
Up until now, you would find thousands of pictures on the internet, and the main question would be – where’s the prompt? – so other people could generate similar images. That question loses its meaning now that results depend on a multitude of combinations, which in turn depend on:
- Your ability to recognize artistic styles, camera techniques, artists, and art techniques, and to identify why they are unique, special, or interesting
- Your ability to project a specific outcome and reverse-engineer it into its individual components to understand how to generate it
In our case, we couldn’t generate this picture without knowing a few specific artists and then producing the end picture with whatever combination we wanted (thousands of variations between setting --sw from 1 to 1000 and using one or three references). If someone asks me for the prompt, it’s literally just --sref, because without the extra steps in between and the exact images I used, it’s impossible to get to the same result.
There is no prompt anymore, as being able to utilize existing art knowledge will ultimately help you much more than being able to type text into Discord (shout-out to all the art students or teachers out there!).
We started with a generic Mickey Mouse picture, as it’s now in the public domain.
These are the references we used in order:
And this is the result:
These are the reference pictures:
This was the first result with just one reference image:
The impact is visually staggering, as Midjourney generates an image that seamlessly blends the specified style reference with the original prompt. Multiple variations can be explored by incorporating different reference images, offering a diverse array of results.
However, the beauty of this feature also marks the downfall of prompt engineering as we know it.
Prompt engineering, once considered a scientific approach to understanding and manipulating AI responses, has evolved into a more intuitive process. The emphasis now shifts from typing specific text prompts to leveraging existing art knowledge for a more tailored and sophisticated image generation.