Midjourney - AI art

Nice. Yeah, that's way closer than I thought you'd be able to get from what I'd seen before. Interesting, and thanks for entertaining my curiosity. Would you mind sharing your prompts, @Mobius 1 and @Ayahuasca ?
 
Completed it mate?

xIaHJGu.png
 
Sure. After the sketch of the glass I asked for a realistic version; here's the prompt:

“Can you make it realistic style - with the wine almost overflowing”

And the second attempt

“Again but to the brim”
 
Hence why I said "full wine glass" rather than full glass of wine.

There isn't an obvious distinction there (see my post again, I asked for a "full glass of wine"). It knows how full a wine glass ought to be - see the answer when I ask it. Then see my third pic: you can quite easily get it to draw a glass of wine filled all the way up, but that requires some additional clarification for the model.
 

Interesting, very straightforward then. If anyone cares, the reason I asked is because of this video:


I guess that it's got a lot better at it (or you're using a better model).
 

Well yes, that video was uploaded a month ago but 4o was updated a few days ago.

At the time of that video, if you prompted it for an image it would've called another model (DALL-E, a diffusion model) to generate the image and return it to you. Now, with the update, image generation is native: the core 4o model has its own autoregressive image generation capabilities.

Essentially you're no longer playing a short game of Chinese whispers to get an image, and you can give it an example image to reference too. Previously the image model it would call was just text-to-image, with some capabilities for inpainting etc., but now you can give it an image and ask it to turn it into, say, a cartoon of a particular style - hence the Studio Ghibli style stuff going viral a few days ago.
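
If it helps, here's a rough toy sketch in Python of the difference between the two setups. It's purely illustrative: the function names, the dictionary fields and the "sketch.png" reference are all made up for the example, and it says nothing about how OpenAI actually implements either pipeline. The only point is what information reaches the thing that draws the picture.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Conversation:
    """Toy stand-in for a chat: user messages plus an optional reference image."""
    messages: List[str] = field(default_factory=list)
    reference_image: Optional[str] = None  # hypothetical placeholder for image data


def handoff_pipeline(conv: Conversation) -> dict:
    """Old-style setup (assumed): the chat model hands a text prompt to a separate image model."""
    # Only text crosses the boundary. Here the last message stands in for
    # whatever prompt the chat model would write for the image model.
    text_prompt = conv.messages[-1]
    return {
        "generator": "separate text-to-image model",
        "sees_text": text_prompt,
        "sees_image": None,  # a text-only handoff cannot pass the reference image
    }


def native_pipeline(conv: Conversation) -> dict:
    """New-style setup (assumed): the same model that reads the chat produces the image."""
    return {
        "generator": "same model as the chat",
        "sees_text": " | ".join(conv.messages),  # the whole conversation is in context
        "sees_image": conv.reference_image,
    }


conv = Conversation(
    messages=["Can you draw a full glass of wine please", "Again but to the brim"],
    reference_image="sketch.png",
)
old = handoff_pipeline(conv)
new = native_pipeline(conv)
print(old["generator"], "->", old["sees_text"], "/", old["sees_image"])
print(new["generator"], "->", new["sees_text"], "/", new["sees_image"])
```

In the toy version the handoff loses both the earlier message and the reference image, which is the "Chinese whispers" effect; the native version keeps both in view.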
 

There is a distinction because one will give you things like:

LrKxFKu.jpeg
 

Nope, there's something else at play there - share the link to the prompting for that image:

See here for the screenshots I showed previously - draws it partially full as wine should be, corrects to almost full when requested then on the third prompt gets it all the way to the top.


Now tweak it to "can you draw a full wine glass please" and you can see there's no meaningful distinction between that and the prompt in my previous conversation, "can you draw a full glass of wine please". As I pointed out, that's simply how it understands a glass of wine should properly be filled:

j1mf4xZ.png


link to the prompt here:

There's probably something else in your prompts to result in that glass bubbling and overflowing.
 

You aren't understanding what I'm saying in regard to how the model does and doesn't understand the distinction between a full wine glass (abstract - full of what?) and a full glass of wine.

EDIT: I've been trying to fudge it by working around the understanding of a full glass of wine as per what you posted.
 

You've clearly added various things to the prompt for the bubbles image you posted, and as you're now avoiding sharing the prompt we can dismiss that.

You can see from both of my examples that there is no meaningful distinction between "can you draw a full wine glass please" and "can you draw a full glass of wine please". With the first prompt the user most likely still wants wine, so that's what it draws. If you didn't want wine you could specify that, but the model needs to draw something, and if you've been ambiguous it's just going to infer what's needed.*

As you can see from the prompts I shared, they both result in similar images: wine glasses filled as they ought to normally be. To get them full to the top just requires additional clarification like "fill to the brim", "almost overflowing" etc.

*Note: that doesn't imply it didn't know there's an ambiguity, and no one is misunderstanding that you didn't specify the glass should be full of wine. This should be obvious, but to illustrate the point see here:

WYb4K2O.png

 

The prompt used is completely irrelevant to the point I was making; you still aren't understanding what I'm actually saying. (You are mostly repeating my original point but missing a few nuances.)
 

You claimed there's a distinction because one prompt would give you images like [example you shared], but when asked to share the prompt you're using (as I suspect you've clearly added more to it) you won't do so - ergo we can clearly ignore that example.

Then I've tested both phrasings, shown my results and linked to the full conversation with GPT 4o in both cases... and I can explain why it's done that - leaving an ambiguity in the prompt just leads to it inferring the obvious - that a full wine glass would likely be full of wine. And I've already shown in the first set of prompts that a glass full of wine is sufficient to get it to fill up to the top - the issue was with an older model.
 

You are still missing the point of what I was saying, the distinction allows for certain differences but doesn't necessarily give you.
 

Are you able to elaborate? I mean lots of this stuff is testable - you shared an image as an example of some claimed distinction but then became very shy when asked to share the prompt. It's no good saying the point is being missed when you're unable to articulate or clarify what it is in the first place.

Like Mr Jack made a claim about not being able to fill a wine glass, a few of us tested it and we found it wasn't true - then it became apparent that he wasn't imagining things, it was quite a real phenomenon but it related to an older diffusion model - the current autoregressive model is able to be more precise.

It does of course default to filling a wine glass part of the way up, but that's because that's how wine is supposed to come. If you ask it to draw a full glass of beer (or indeed a full beer glass) it will draw it full to the top, as that's how beer is supposed to come.

See here for full glass of beer: https://chatgpt.com/share/67ed9b69-70d4-800f-aa36-2464f6955e6d

or here for full beer glass: https://chatgpt.com/share/67ed9e63-2bec-800f-9179-64857683b54f

Very similar results in both cases.

But the wine thing (if requested with a brief prompt) can be resolved easily with a follow-up prompt as shown earlier; the idea that you need to drop the reference to wine isn't true. In fact, if you want a bubbly glass of pink wine then that can be done in one go with a lengthier prompt too - there's no special distinction achieved there by not mentioning wine.

9m3kpgJ.png


And there we go: a glass of pink wine with bubbles, filled to the brim. These models are a lot more capable than people seem to realise, so I do think claims about basic limitations ought to be treated with a bit of skepticism, as often it's more an issue with the prompt than with the model itself.
 

As you do, you've latched onto a sub-part of what I've said and keep worrying away at it while missing the overall picture, and nothing I can say will change that.

Most of what you are saying was covered or implied over several comments between myself and Mr Jack.

That in some models you can work around the limitation doesn't currently change the limitation of how these models do and don't understand things.
 
Do most people here use the paid version of ChatGPT? Signed up last night to play with the image generation, only 2 or 3 images allowed per day on the free sub :(
 