How to use an AI image generator...

...to do this, if it can be done.

Can any of the current AI image generators be used to effectively generate a picture of the mental image that may be conjured up by the lyrics of a song? The song was a favourite of a very close relative who has since passed, and I want to capture the song in a single image which can be printed, framed, and take pride of place on the wall.

I would like to create an image which captures all of the elements of the lyrics of "What a Wonderful World" by Louis Armstrong. For those unfamiliar, it goes:
I see trees of green
Red roses too
I see them bloom
For me and for you
And I think to myself
What a wonderful world

I see skies of blue
And clouds of white
The bright blessed day
The dark sacred night
And I think to myself
What a wonderful world

The colors of the rainbow
So pretty in the sky
Are also on the faces
Of people going by

I see friends shaking hands
Saying, "How do you do?"
They're really saying
"I love you"

I hear babies cry
I watch them grow
They'll learn much more
Than I'll ever know
And I think to myself
What a wonderful world

Yes, I think to myself
What a wonderful world
Ooh yeah

I have tried pasting the lyrics as they are into Craiyon, but it can't resolve the detail and generally shows a picture of a multicoloured tree or a single rose. Is there a technique which may make this endeavour more successful? I'm wondering if the word structure can be optimised, or if maybe multiple images will have to be generated and stacked. I may also want to add other details specific to the person. Interested to hear how others would approach this problem.

Many thanks!
 

Yes, it's possible. You just need to know how to be a good prompt smith with Midjourney or Stable Diffusion, and then use something like Runway. If you can't "do" MJ or SD (by "do" I mean generate decent images from prompts), then you won't be able to do Runway either. Craiyon is pointless for anything decent. It's not even Early Learning Centre level AI, more like "I found my first crayon" level.
 
I don't think it's at a stage where you can just enter something and it'll bring it up.

In Stable Diffusion, you're going to need to give it a hand through the prompt by describing what the verses say, so it's not quite your original intent of just pasting the lyrics in and getting the result out. This is especially important because some of the verses will simply overwrite each other: none has more importance than another to the AI, so (in short) it picks one and sticks with it instead of mixing the two. With conflicts such as the skies being green and also being blue, you'll need to help Stable Diffusion build that picture up, especially if you want one picture to encompass all the verses.

Again, it's not quite at the stage where you can copy and paste the lyrics and have it work things out on its own. It needs help.
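
As a rough illustration of the point about competing verses, here is a minimal Python sketch. The descriptor wording, the `slot` labels and the `resolve_conflicts` helper are all assumptions made up for this example, not part of any generator's API. It only shows the idea of turning lyric lines into literal descriptions and keeping a single descriptor where two would fight over the same part of the scene.

```python
# Hypothetical mapping from lyric lines to literal visual descriptors.
# "slot" names the part of the scene a descriptor controls; two descriptors
# in the same slot compete with each other.
LYRIC_DESCRIPTORS = [
    ("I see trees of green", "trees", "many leafy green trees"),
    ("Red roses too", "flowers", "red roses in full bloom"),
    ("I see skies of blue", "sky", "a blue sky with white clouds"),
    ("The bright blessed day", "time_of_day", "a bright sunny day"),
    ("The dark sacred night", "time_of_day", "a dark starry night"),
    ("I see friends shaking hands", "people", "two people shaking hands"),
]


def resolve_conflicts(entries):
    """Keep the first descriptor seen for each slot; return (kept, dropped)."""
    kept = {}
    dropped = []
    for lyric, slot, descriptor in entries:
        if slot in kept:
            # A descriptor already owns this slot, so record the loser
            # instead of letting the generator pick one at random.
            dropped.append((lyric, descriptor))
        else:
            kept[slot] = descriptor
    return kept, dropped


kept, dropped = resolve_conflicts(LYRIC_DESCRIPTORS)
prompt = ", ".join(kept.values())
print(prompt)
for lyric, descriptor in dropped:
    print(f"dropped: {lyric!r} -> {descriptor!r}")
```

Whatever ends up in the dropped list is a deliberate choice about what the picture leaves out, rather than something the generator decides silently.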
 
Out of curiosity, I put the lyrics into a text-to-photo AI program and it just created a mash-up of people in a park with blue sky and multi-coloured trees, like an alien landscape LOL, which was pretty much what I thought it would do.
 
I've never used one of these before. I'm impressed/terrified with the result.

I give you: "Donald Trump arguing with a bear on a beach in Jamaica"

2awLPmE.jpeg
 
So I tried it with Stable Diffusion and Realistic Vision, and broke down some of the verses so they fit in together more easily as descriptors that it can understand, providing as much information as possible on what to generate and include.

So something like this for the positive prompt:
In the background during a bright sunny day, there is a sky of blue with white clouds stretching into the distance. One rainbow breaks between the clouds in the sky in the background. Below the sky in the foreground, there are many leafy green coloured trees and below them are many red roses in full bloom. Standing within the roses in the foreground, we see two people shaking their hands greeting each other.

With a negative prompt of:
white roses, white flowers
I had to exclude those, as too few red roses were being generated otherwise.
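
One way to keep a prompt like this organised is to hold the scene in layers and join them at the end. The sketch below is plain Python. `ScenePrompt` and its fields are hypothetical names, and the descriptor wording is a shortened paraphrase of the prompt above rather than the exact text used. The two resulting strings would then be pasted into the positive and negative prompt boxes of whichever Stable Diffusion front end is in use.

```python
from dataclasses import dataclass, field


@dataclass
class ScenePrompt:
    """Hypothetical container for a layered scene description."""
    background: list[str] = field(default_factory=list)
    foreground: list[str] = field(default_factory=list)
    avoid: list[str] = field(default_factory=list)

    def positive(self) -> str:
        # Describe the background first, then the foreground, so each
        # element is tied to a place in the picture.
        parts = []
        if self.background:
            parts.append("In the background, " + ", ".join(self.background) + ".")
        if self.foreground:
            parts.append("In the foreground, " + ", ".join(self.foreground) + ".")
        return " ".join(parts)

    def negative(self) -> str:
        # Things the generator should steer away from.
        return ", ".join(self.avoid)


scene = ScenePrompt(
    background=[
        "a bright sunny day",
        "a blue sky with white clouds",
        "one rainbow between the clouds",
    ],
    foreground=[
        "many leafy green trees",
        "many red roses in full bloom",
        "two people shaking hands",
    ],
    avoid=["white roses", "white flowers"],
)

print("positive:", scene.positive())
print("negative:", scene.negative())
```

Keeping the layers separate makes it easier to swap one element (say, the people) without rewriting the whole sentence.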

But as you can see from the positive prompt, I did not ask the generator for both a bright and a dark aspect of the sky, as you lose any chance of a somewhat realistic rainbow that way. So the "dark sacred night" part of the verse had to be ditched, considering the rest of the verses were full of descriptors for a bright, somewhat hopeful-for-the-future kind of scene. Overall, things needed to be cut out to make it work if you wanted any semblance of realism.

And got something sort of acceptable (I guess?)

For OP to consider: this is 1920x1080 only, as I'm using a 12GB 3060 here, so it takes time to generate large images that can be printed and still retain some level of detail. This was something like the 90th image generated (in batches of 6 images each go, each one lasting 20 minutes). Many of the others failed to achieve the required look and the reason for it (a meaningful image to remember someone who has passed, based on these lyrics), so I rejected those.

qRSepUT.png
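
To put the size and time figures above in perspective, a quick calculation helps. This sketch assumes 300 DPI as the print target (a common rule of thumb, not something stated in the post) and reads the 20 minutes as the time per batch of 6, which is only one interpretation of the figures given. The function names are made up for the example.

```python
def print_size_inches(width_px, height_px, dpi):
    """Physical print size in inches for a given pixel size and print density."""
    return width_px / dpi, height_px / dpi


def total_minutes(images, batch_size, minutes_per_batch):
    """Total generation time, rounding up to whole batches."""
    batches = -(-images // batch_size)  # ceiling division
    return batches * minutes_per_batch


# Assumption: 300 DPI print target.
width_in, height_in = print_size_inches(1920, 1080, 300)
print(f"{width_in:.1f} x {height_in:.1f} inches at 300 DPI")  # 6.4 x 3.6 inches

# Assumption: 20 minutes per batch of 6 images.
minutes = total_minutes(90, 6, 20)
print(f"{minutes} minutes, or {minutes / 60:.0f} hours")  # 300 minutes, or 5 hours
```

Under those assumptions, a 1920x1080 image prints at roughly postcard size before it needs upscaling, and reaching the 90th image took about five hours of generation.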

:: edit ::
It has the bright (somewhat blue) skies with some white clouds (unfortunately you need dark clouds for the rain that creates the rainbow), and some rainbows visible in the sky. There are green trees with a sea of red roses (and some other coloured flowers) on the ground in full bloom. For the rest of the verses, you'll just need to imagine that all those other people in the background near the trees are shaking hands, etc., with the old and young there to complete the picture. The two walking away from you on the long road ahead are non-specific, but they are holding hands and helping one another along. The entire scene is bright enough that you know there are those leaving, on a sad note, but not a dark one either, given the brightness and hopefulness for the future that comes from its bloom, greens, bright colours and rainbows. And the road is clear, so you can still see them after they have left.
:: /edit ::

:: edit 2 ::

Managed to get another image that's close enough (I feel).

Wsv1HlM.png

I still feel the first one "feels" better and has more meaning. This one is just more hopeful, but without being specific about the hope (done through the brighter colours and bloom).

Used the following prompt:
In the background during a bright sunny day, there is a sky of blue with whispy white clouds stretching into the distance. Rainbows breaks between the clouds in the sky in the background. Below the sky in the foreground, there are many leafy green coloured trees and below them are many red roses in full bloom. Standing within the red roses in the foreground, we see two people shaking their hands greeting each other.

:: /edit 2 ::

If you don't need realism, then you can try different AI generators, but I tend to stick with the realistic approaches.
 
Wow, thank you very much! That is definitely moving in the right direction and gives me a good basis from which I can tweak. Seeing how you worded it has given me some ideas of how to approach this rather than just pasting in lyrics like I have been doing. I think I’m fine without realism, as in my mind’s eye it is quite a surrealistic, almost psychedelic image, so I will give some others a try too.

Have you tried Midjourney? Not free anymore though.
Not yet but if it is worth paying for I’ll give it a try. Out of all the available options, paid and free, what is the best for the job?

I’ll have a play this evening after work and I’ll post my results.
 
OK I thought it must be Arthur Dent but don't remember emotion from final scenes

Yes, I don't think it can be summarised by AI. It would do something stupid and literal. Moreover, it needs a human's emotions to understand what the lyrics convey.

I think the best summary would be scenes from famous films (which, once again, AI couldn't understand) like It's a Wonderful Life, Casablanca, Leon, Manhattan.
 