That man singing doesn't exist

Views: 309,084

DALL-E Mini, Midjourney, DALL-E 2. I pit these 3 AI image generators against each other to see which reigns supreme. Spoiler alert: DALL-E 2 wins.
Try DALL-E Mini here: www.craiyon.com/
0:00 - 3 tools tested
2:20 - Stylised
4:54 - Realistic
5:49 - Martian base
7:16 - Text
8:12 - The vertical mattress challenge
9:27 - Kitchens
10:22 - Cities
11:30 - Dog sniffing lamp post
12:14 - Variants on existing images
12:42 - Upscaling limitations
14:03 - The future

Comments: 803

@asdasdasd9320 2 years ago

Philip has such an abstract idea of a bed, that even DALL-E 2 can't handle it

@BonJoviBeatlesLedZep 2 years ago

I'm still in doubt that that's his actual bedroom and he comfortably sleeps on that. He lives with his girlfriend right? He must surely be sleeping in an actual sane bed with her most nights.

@YTnamesAreBS 2 years ago

This comment made me exhale quickly from my nose, which is the highest honor I can bestow on you.

@Blox117 2 years ago

i tried inputting "honest woman" but dalle 2 told me to ask for something reasonable

@iR0CKt 2 years ago

@@Blox117 Need to be stylized or something :D

@ChrisD__ 2 years ago

@@iR0CKt That's right, you can only get yourself a furry or anime GF.

@cortster12 2 years ago

You know the crazy part? We're not reaching the endgame, these tools are still in their infancy. We're like people looking at room sized computers in the 1950s and being amazed. Because it was amazing, the future just ended up being unfathomable.

@vadiks20032 2 years ago

Aren't they written in Python? Imagine the speed if they were written in C.

@fueledbycoffee583 1 year ago

@@vadiks20032 Written in Python calling C++ code, so it would be about the same.

@4rumani 1 year ago

this is the end lol, the ai boom is over and we're headed towards a long long ai winter

@cortster12 1 year ago

@@4rumani I will remember this comment when it's completely wrong.

@BlackParade01 4 months ago

@@4rumani oh boy how wrong you were

@supremesurvivor 2 years ago

This is certainly one of my favorite videos on youtube, but it's so scary, unnerving, that I can't even describe what I'm feeling at the end. The feeling that we cannot predict what this might imply for art and politics without being pessimistic. Please Philip, keep it up!

@user-lh7mt7zo7l 2 years ago

It just means we'd return to a time before photo and video evidence.

@danisob3633 2 years ago

ye, lie detection needs to get better

@user-lh7mt7zo7l 2 years ago

@Lucas Carvalho I wonder what happens when we make AI generated images of people who don't exist but then someone is born who looks like that haha

@pygmalion8952 2 years ago

@@user-lh7mt7zo7l every ai service would be regulated to indicate it is an ai image in the photo's information. tho it is a bit shaky given the fact that you can spoof identification codes sometimes.

@user-lh7mt7zo7l 2 years ago

@@pygmalion8952 yeah regulation wouldn't work because with enough money and power you could make your own A.I.

@vankata69exe45 2 years ago

philip is an ai with very good text to speech at this point

@existentialselkath1264 2 years ago

New York in unreal engine is genuinely really impressive. It doesn't just look like a game, it's got that distinct unreal engine 4 look I can never explain, but it's done it perfectly

@arrowtongue 2 years ago

AI generated images are so great at capturing the feel or vibe of something, because the nature of neural networks is stuff we can't quite describe, it's as scary as it is weirdly comforting we can turn these more abstract feelings into things

@ChrisD__ 2 years ago

I think it's the orange sunlight paired with blue everything else, the artist's color grading goals leaking into the actual world lighting. Along with missing shadows here and there and repeated objects and textures. Notice all the fire escapes all over the place. Also the general blurriness of the bounce lighting.

@Strelokos666 2 years ago

"distinct unreal engine 4 look" what the hell is that supposed to mean?

@ChrisD__ 2 years ago

@@Strelokos666 Ya know... that UE4 look. Orange and teal, TAA, dithering, and every post processing effect under the sun.

@eldarlrd 1 year ago

@@Strelokos666 You haven't played any UE4 game?

@arrowtongue 2 years ago

8:40 your disappointment with the mattresses and stubborn valve please fix made me genuinely burst out laughing, love your sense of humor

@Zoo-Wee-Mama-Sq 2 years ago

It's been a joy watching your channel branch out from CSGO mapping topics to technology in general, while still bringing the same top notch production.

@distortedjams 2 years ago

An interesting test would be to take a real life image, put that into an AI that can transcribe images (Instagram automatically does this). Feed that transcription to one of these AIs and compare the results.

@xouthpaw 2 years ago

And then Instagram won't need any content creators anymore, because you'll be able to log in and receive a 100% AI generated image feed based on your assumed preferences

@IronKurone 2 years ago

@@xouthpaw A few years from today, perhaps my favorite instagram celebrity might not even be human. And that's... kinda scary.

@treudden 2 years ago

You can use an init image in Disco Diffusion, which works really well

@Erveon 2 years ago

@@IronKurone Knowing people have a favorite instagram celebrity is by itself scary enough

@IronKurone 2 years ago

@@Erveon It's the future, who knows...

@Snowdrift72 2 years ago

8:08 is the body the AI has created for itself and chosen to inhabit

@olegmoki 1 year ago

If you use DALL-E at 3 am and then turn around...

@TilW 2 years ago

I am quite impressed with DALL-E 2, but when closely compared to DALL-E Mini and Midjourney, it still falls behind when it comes to one thing: the inability to generate Boris Johnson in a bath of beans without violating the TOS.

@mkontent 2 years ago

This.

@mkontent 2 years ago

I was honestly shocked that Dall-e mini even knows the faces of popular people. Considering how fuzzy the images are, it will still try its best to recreate Boris Johnson, Ryan Gosling, etc. Literally blew my mind. Like, recognizing faces and the names behind them is something a baby human can do. Little AI that knows Ryan Gosling...

@3n3j0t4 2 years ago

@@mkontent dalle mini is literally the only one that allows faces

@3n3j0t4 2 years ago

@@mkontent I know you read this you sassy dumbfuck

@explosu 2 years ago

@@mkontent TBF, Boris Johnson already has a face that looks like it was generated by Dall-E Mini so it probably doesn't struggle as much.

@1000_Gibibit 2 years ago

Really glad that you managed to get (direct or indirect) access to DALL-E 2. These comparisons are wonderful! And of course you came up with some great prompts for the AI as always. The rate at which AI research advances is actually insane. And the conditions required for this pace, like rapidly improving hardware, are starting to feel like they are straight out of a sci fi story if you think about it. How long before someone accidentally creates an AI that can operate on real life systems that we lose control over? I always thought AI doomsday thinkers were too optimistic about AI. Now I don't know anymore if it's possible for a story like Hyperion to become reality. All bets are off. Oh and all the shorter term consequences relating to reliability of image validity are getting a bit concerning as well of course...

@oldm9228 2 years ago

GitHub copilot is probably an example of a currently active AI that operates on real life systems. It generates context aware code for applications based on requests. The quality of that code is questionable and it could potentially include "hidden intentions" (security risks) just like human written code can.

@HighWarlordJC 2 years ago

There's a very real reason many of our brightest minds constantly warn about the dangers of AI.

@amp4105 2 years ago

imagine ai generated movies

@amunak_ 2 years ago

@@oldm9228 Copilot and similar are amazing for generating boilerplate and small chunks of code that you can actually verify yourself. But I have doubts about usage beyond that.

@McDonaldsCalifornia 2 years ago

I mean dall-e and gpt and stuff are genuinely impressive but they are far from what we would expect a true AI (or AGI or Super AI or whatever) to look like.

@juliann149 1 year ago

It's crazy watching this video only 9 months later, seeing how much the generators improved already. Would be interesting to re-run the comparison on the current versions, especially Midjourney advanced a lot iirc.

@Tofuey 3 months ago

Even further a year from this comment

@nixel1324 2 years ago

Yes, Dall-e mini (now Craiyon) has a very ai-y feel to it, but I like that. It's like the charm of a retro console. From a technical standpoint it's inferior in every way, but that makes it recognizable, gives it character and makes it endearing. And once people grow up in a world where the higher end stuff is the norm, people like me will probably largely be considered old-fashioned. I don't really care much about modern consoles, and cannot tell apart PS5 and Xbox X footage. But I'll instantly recognize a Wii game, even if running in 4k with texture replacements. Even when you upscale Craiyon results (like with Dall-e Flow), it still has that charm for me. When photo-real ai images become mainstream, I hope people will still appreciate the weirder, less fine-tuned options. I think I will, at least.

@KVVUZRSCHK 2 years ago

Dall-E Mini is on the left side of the uncanny valley. Astonishingly lifelike yet easily distinguishable as fake.

@IndieLambda 2 years ago

That's when you add "AI generated" at the end of your prompt.

@RaptorShadow 2 years ago

Someone pointed out that the surreal and disposable quality of the Craiyon images makes it perfect for memes. You can quickly and cheaply get a rendering of whatever stupid idea you come up with (like Boris Johnson's Bean Bath Surprise) and get some output. The jank becomes part of the charm.

@simian.friends 2 years ago

your writing and presentation is particularly great in this video, really enjoyed this, already can tell that I will be rewatching this many times over the coming months

@devindykstra 2 years ago

Of all the things to defeat Dalle 2, I would never have expected a mattress leaning against a wall.

@mauricepouly 2 years ago

i adore your videos and the flair you bring into them. i enjoyed this one a lot too and it made me laugh which is a feat on its own. thank you phillip keep on doing what you do

@hisshame 2 years ago

Thank you for sharing the process with us!

@kruchji 2 years ago

I love how you immediately answer any question that I think of while watching. Great video!

@DeepWeeb 2 years ago

Petition to rename the channel to *"3klikspiphlipk"*

@llave8662 2 years ago

DALL-E can't generate words to avoid falsifications, similarly to the reason why it does not allow faces. Great video!

@LazyBuddyBan 2 years ago

that explains it. but also won't matter, since likely in 5 years we will get it without restrictions.

@manfail7469 2 years ago

@@LazyBuddyBan christ, can you imagine how much the world will change when stuff like dall-e 2 goes unrestricted?

@JustSayin24 2 years ago

Actually the original research paper for DALL-E 2 states that text rendering is a known limitation of the model. Specifically, the "embedding does not precisely encode spelling information of rendered text" - in other words, the model isn't trained at a high-enough precision to properly represent the intricacies of character shapes and grammatical rules.

@tissuepaper9962 2 years ago

@@JustSayin24 I imagine they are already working on making the next model recognize text in the training data, transcribe it, and run it through a separate NLP model so that the image generator can understand grammar and spelling and stuff.

@cem_kaya 2 years ago

@@tissuepaper9962 there is no need to do such convoluted stuff to get photorealistic text generation. scaling up the model works fine

@digitalrockets9702 2 years ago

That's so wild how I stumbled across your channel. While I was generating my own thing on the midjourney discord, I distinctly remember seeing your Martian base castle iterations pass by in the queue as well. So interesting how everyone can just watch each other's image generation take place at the same time. like watching parallel worlds unfold.

@emperorpalpatine1469 2 years ago

Mate I'm so glad you're still making videos like this, you're probably my favorite chap on YouTube. You taught me how to play counter strike and you got me into technology from a young age, thanks a lot Mr Phillip :)

@vekst 2 years ago

Some of those film imitation ones are amazing, and convinced me to join the DALL-E 2 waitlist to try some for myself. Great vid as always Philip!

@MattVidPro 2 years ago

great video! I've been making a plethora of videos discussing and testing this technology lately, and man is it moving FAST. Every few days I hear something new....

@iulic9833 2 years ago

I know, can't wait for DALLE 2 to get released to the public, if it ever will. Also got some good results when upscaling the images too, they have some artifacts but it's still mind boggling how an AI can create stuff like this. kzfaq.info/get/bejne/l5d6Zregq7uliGw.html&ab_channel=69fff

@Lulzalex 2 years ago

First the silent zoom in on the horse shaped entity followed by doing the same to the generated Chucky-doll all completely threw me off LMAO. I kept checking my back for the remainder of the video and could not relax like I usually do...

@mixchief 2 years ago

1:00 Hahahahaha! Beautiful piece. Like the subtle touch with the blue wig covering half his hair. And 1:05 gets even better. That a turd floating around in the bean soup?

@mgetommy 2 years ago

So cool, well done Philip. I love your tendency to see something interesting and tinker with it and show us the results

@HELLF1RE9 2 years ago

8:06 that is unbelievably unnerving

@Revan-kq7ih 2 years ago

Some of these pictures made me laugh really hard. Great video, we need more of these!

@yom35 2 years ago

Amazing video as always!!

@raminasadollahzadeh328 1 year ago

I am starting to get back to your channels. Been gone for about 2 years and now that I'm back I have to say there is a style in your videos which is very rare and unique. People say CSGO YouTubers are dying but you are proving them wrong. I am happy for you and for myself to find my new old fav channel.

@BigCheeseEvie 2 years ago

Splendid content, really entertaining. Keep em coming!

@maximiliankegley-oyola928 2 years ago

These are all fascinating. I love seeing the differences between the AIs. Always love your videos!

@ozmog6458 2 years ago

Hey, thanks for making these videos.

@artemisDev 2 years ago

the new Turing test: "Draw a mattress leaning up against a wall".

@Gheno 2 years ago

Perfect timing, just got done draining my quest's battery and I need something to watch while eating. Thanks, unc phil.

@akkkarinn 2 years ago

I want MORE, we need a second part to this video!

@luigimaster111 2 years ago

This already has the potential to be an immense asset for artists, I for example dabble at making and animating 3d models and on a frequent basis struggle to find/make good textures to use on my models, now it's pretty much at a point where I could just ask an AI for what I need, do a bit of tweaks, then plop it in. Now we just need an AI for generating things like normal maps, as the existing automatic tools I've found to be a bit lackluster. I also struggle to visualize things, that is to say I can't visualize things at all to the point where even my dreams are an entirely auditory experience, so when making anything I have to make heavy use of references to get anywhere. Tools like this of course make getting specific reference material easier. With so much progress happening in such a relatively short time span though... Well I think stuff like this will soon threaten the jobs of a lot of people. I for example am suggesting to use them to fill the role of texture and concept artist for my small scale personal projects, at what point will large studios decide to do something similar?

@TheKrzysiek 2 years ago

While others are worried about using this for more malicious stuff, I'm more excited about how much cool new content we can get from this. Want a specific image for a video, wallpaper, or a meme? Put it in AI. I especially wonder if it will ever be used for things like concept art, book covers, character portraits etc.

@arcadianpunk 2 years ago

Let the battle begin

@dominikrohel2546 2 years ago

I appreciate your passion in making these interesting videos. I wonder what the technology will be like in 5 or 10 years. I guess I'll have to wait for future philip to cover it

@artizard 2 years ago

i was hoping you were going to make a video about dall E 2, nice video! You should do even more ai related videos, i find them really fascinating

@lukasg4807 2 years ago

TBH I'm more impressed with the ability to understand what you're asking for than the image generation itself

@JohnDoe-sw2nc 2 years ago

DALL-E 2 is scary good

@HSE_VO 2 years ago

I adore your AI videos. Please keep them going!

@loetwiek 2 years ago

i love the ai things on your channel keep em coming

@med3262 3 months ago

It's hilarious seeing AI tools from just 2 years ago and how fast AI generated content has evolved. Looking back at the now primitive DALL-E 2 is fascinating.

@h930hec 2 years ago

Fantastic video Philip. Very much enjoying the AI content!

@deKxiAU 2 years ago

Just a tip for 'photorealism' with these models: put camera / photography specs like F-stop, ISO and lens length - works best with DALL-E 2.
Edit: ah I see you've done that in the later prompts, ignore the above then :P
Also worth noting none of the generation methods actually merge images from Google, they just had watermarked images in the dataset. I realise you probably don't think that it actually does just google some images given what you said, and it might seem pedantic - but it's a proper distinction (and the 'just slapping images from googling together' myth is a very common one for all AI generative art right now). For those who don't know: the model's learned that that particular image is likely to have a watermark from what it's seen in its training dataset and so it's synthesised it. It's not actually searching anything on any search engine or anything like that, it's just a matter of the dataset not being cleaned of any watermarked content.
Great video though Philip :)
Edit: also for Midjourney specifically, there are some additional background style modifiers you can disable that would be somewhat influencing your result out of the box for the cartoon ones and making them less accurate to the prompt. I forgot what the arg is as I'm on mobile watching this but it's somewhere in the FAQ I believe - but this is why you always get a vignette, a similar colour palette, among other behaviours, across every Midjourney prompt.

@huttyblue 2 years ago

What was a watermarked image doing in its data set if it wasn't sourced from scraping the web though? It may not be specifically from google but the concept of it just learning from what was able to be searched up on the internet is the same.

@deKxiAU 2 years ago

@@huttyblue I didn't say it wasn't due to web scraping, it absolutely is. I'm saying it's not googling/searching online at inference stage (the stage where people can actually interact with the AI in the way you see in the video), and in fact the AI never touches the internet (outside of it getting hosted online to be accessed by yourself, or it being trained using a GPU cloud farm somewhere in the first place). It's a fundamentally different procedure with very different outcomes. Since most people here probably aren't familiar with neural net training I'll elaborate a bit: CLIP was trained on web scraped images as outlined above (CLIP being the model under the hood of most AI generative apps/notebooks and of Midjourney too), but it's nowhere near the same as a program searching up your prompt for close images online that match and then splicing those together - it's not a glorified Pinterest board. The dataset is static from the date of when it was scraped and published. It's then used as training material for the basis of the AI's generative ability - you won't find things posted after the date of the dataset in its generative vocabulary for example. Naturally, a poor dataset can lead to poor results and watermarks are an obvious poor side effect of web scraping, but to conflate it with 'searching online' gives the impression that it's simply reverse searching for your images and slapping them together which leads to people believing that it is actively searching and essentially 'cheating' - like someone looking up results for a test right? Whereas in actuality it genuinely generates the images based on what it learned from 'studying' the dataset and associating different labels with what it thinks is relevant, as if it spent its time studying wikipedia articles instead of the sources wikipedia lists, etc.
CLIP has stupidly learned that stock image watermarks are common enough across its whole dataset that they are worth adding to some images sometimes even when not directly prompted for it, because it had enough images to train on that had watermarks that it associated the concept of watermarks with that sort of image in its latent space. But it's the concepts themselves that it has learned, not direct image portions and mashing them together. DALL-E 2 has this same issue but the dataset was far more curated, it's fairly difficult to get a blatant example. DALL-E Mini (now CrAIyon) also suffers from this but the quality is bad enough that you'll be hard pressed to even recognise it's a watermark and not just random gibberish text. Most models at the moment are trained with the LAION dataset (among a few more) which has a whole host of web scraped content (including graphic porn and all sorts of NSFW images - these usually get taken out manually by the big companies' models), but until there are open sourced datasets that don't have to rely on web scraping to get the sheer number of images training a model requires (several hundred million to billions), stuff like watermarks and weird quirks are just part and parcel - that said, web scraping is also why it can make such hilarious memes because the highly curated datasets (like the one in DALL-E 2) remove large chunks of the image base and sort of gut the model's ability to accurately reach a prompt in the process. TLDR: It's the difference between studying for a test before the day, or actively searching online during your test. Hope that helped illuminate the differences! Enjoy your day :)

@tissuepaper9962 2 years ago

@@deKxiAU I disagree that there's much of a difference. The claim that it's "just slapping images together" is basically pointing out that the system doesn't know anything about *why* certain features exist in images, it just knows that they *do*. AI at this point are still just advanced statistical aggregators, most lack the kind of logic that would allow them to generate images with details that make sense as opposed to just looking right at a glance. Philip isn't saying that the system literally merges images from Google at the time of inference, it seems to me like a subtle statement about "learning" vs. "regurgitating" and what should actually be called "intelligence".

@deKxiAU 2 years ago

@@tissuepaper9962 there is a significant difference. If Philip meant that 'it doesn't know why features exist' he should have just said that. Learning the 'wrong' details doesn't make it 'regurgitation' any more than learning the right details would, and it falling apart under scrutiny is largely due to the limited resolution of the training data (typically 256x256 or 512x512, DALL-E 2 starts at 64x64 with additional diffusion networks trained on upscaling it incrementally) combined with the limited number of parameters the model contains which leads to it having to combine concepts into the same latent dimensions and differentiate between them poorly as a result. I'm not sure what you could disagree with really, like I said at the bottom - it's the difference between studying for a test or looking up the answers during it, entirely different implications can be drawn from systems that do either of those. The former relies on prelearning concepts and identifying key relationships between them, the latter can pick new images as they pop up on the internet and doesn't have any understanding of the relationship between concepts at all. One is learning conceptual relationships, the other is a pinterest board with a fancy text input. I'm not saying it's not statistical aggregation, I'm just saying it's not ripping images off the internet and splicing them together like some Frankenstein creation, and that it *has* learned within the weights of its millions to billions of parameters that there is an association between watermarks and those types of images - which is actually true, in the dataset it was trained on there were enough watermarks for it to recognise the concept across them and learn about it the same way it has for every other concept it recognised; like trees and bushes belonging in a forest, stock watermarks belong on stock-looking images.
Removing the watermarks from the dataset would solve that specific issue, but wouldn't change anything about how it's fundamentally working, it would just give better results because it's an algorithm that aims to be able to create images that *could* have been from its dataset without actually recreating any image from it (that would be what's called overfitting, which we don't see in these models), its task is quite literally to map the entire range of possible images in its dataset and to abstract whatever relationships it can to condense it into its embedded parameter weights, and so it would be a failure if it didn't have watermarks when there are so many in the dataset. Make sense? Intelligence has nothing to do with it, different conversation entirely. Not arguing it's sentient or that it understands in a way that human brains do, (obviously, the way it understands and learns isn't as complex as humans and it doesn't have an understanding of *why* these things exist together, just that they do, because the why wasn't part of the training data - it's simply condensing image concept relationships to an extremely large matrix of numbers), just that people shouldn't propagate a myth because "it's close enough" when it actually gives a false impression of what these models can do and how they work, and what it means for the world; different behaviour, different results, different legal implications, different world outcomes and use-cases.
I'm only hoping to help correct the record as I'm a huge fan of Philip's content - not wanting to knock the video, overall it's very good and knowledgeable and at the incredibly high standard Philip always provides for his videos - just that particular line (which he said twice) suggests he's either a bit misinformed on the topic (which is fine, everyone's misinformed on something and it shouldn't be taken personally if it's corrected) or that he wasn't quite clear on what he meant (also fine as he possibly wasn't aware of how what he said could be interpreted)

@tissuepaper9962 2 years ago

@@deKxiAU You have your interpretation, I have mine. You can carefully defend the model by explaining the limitations, that doesn't change my opinion. I think it's a perfectly acceptable simplification made for brevity, something you appear to hold in little regard. PS: You say "intelligence" is a different discussion, did you forget what "AI" stands for?

@Lohmeier54 2 years ago

I remember your old videos on ai face improvement. I never expected ai to get here until i was at least 30. I'm not even 20, this is incredible

@broomguy7 2 years ago

Another great video from 36PKL!

@flyinggoatman 2 years ago

I'm so glad I got access to both.

@seto007 2 years ago

Hey Philip, I recently got access to both DALL-E 2 and Midjourney, and so I wanted to share a bit of my perspective on the strengths and weaknesses of both. While DALL-E 2 is certainly better at generating the initial image at a higher fidelity and with more stylization based on the description provided, I actually think that Midjourney succeeds far more at creating a "final image" than DALL-E 2 does. The reason for this is that the subsequent variations of an image that you can generate with DALL-E 2 often deviate significantly from the original description, to the point where it often feels as though the AI is trying to guess at what the original description you used was based on the image it's making variations from, and because of this it often feels like the AI subsequently gets confused and creates more abstract renditions than what you might have intended. Midjourney doesn't seem to have this issue. Subsequent variations seem to stick to both the original description and intent behind the image it's creating variations of, and because of this it feels as though subsequent generations look much closer to the original intent of the person describing the image to be generated. Beyond this, it feels as though DALL-E 2 has some issues with understanding things like perspective in all but the most simple of circumstances. If you were to ask it to generate an image viewed from the side, for example, it will often give you an image viewed from a diagonal downwards angle, as opposed to a true sideshot like what you would see in something like a Shutterstock photo. Midjourney does not have this issue in most circumstances; it seems to understand that you want to view the object being described from a side-facing angle. 
I think both models have their strengths and weaknesses, depending on the use case; since I am primarily interested in using these AIs to speed up the art process for a cyberpunk video game I am working on, I like using DALL-E 2 to generate stylized concept art that gets across the themes I am going for, whilst I prefer using Midjourney to generate more technical images of hypothetical in-game objects to use as reference.

@bjk0norway0bjk 2 years ago

really enjoyed this video :D

@CrazyKosai 2 years ago

more shenanigans with DALL-E 2 plz

@sanderbos4243 2 years ago

Loved this video!

@Sgt_Recka 2 years ago

I know you said that people don't watch this kind of content from you. But I just wanted to say that I love it! AI is so interesting, and not many people on YouTube are showing what you are showing. I'm here for all your content, from all 3 channels

@kriterer 2 years ago

Seeing Boris emerge from the beans at 0:55 is immaculate

@Diie89 2 years ago

That final image with the men in purple jumpsuits singing into a microphone is quite horrifyingly realistic. Even when zooming in and trying to spot things that might be off, I still struggle to find stuff.

@KrynexYT 2 years ago

As Károly from Two-Minute Papers always says, imagine the improvement two papers down the line. DALL-E 4 will probably make graphic designers etc. largely redundant.

@bluebell2334 2 years ago

I love Karoly's style of presenting something. Each video exceeds my expectations.

@s-zz 2 years ago

The irony of it all, is the fact that a lot of the same AI designers are also working on AI that can code. And will eventually cause them to become obsolete too. Seriously, look up coding with AI, there's a lot of info on it already.

@rene-of3sc 2 years ago

@@s-zz Meh, Copilot for example is useful to generate easy or repetitive functions but no matter what, a human would need to say what needs to be generated and see if the generated code is correct. I would assume in the future AI will be used as a productivity tool by programmers, but not replace them.

@AlphaGarg 2 years ago

@@rene-of3sc This. I hate this whole "[job] will become redundant!" falsity that people have for some reason hung onto. Did Photoshop make photographers' jobs redundant? No! Did node-based programming like Unreal's blueprints make programmers' jobs redundant? No! Neural networks like DALL-E, Jukebox, etc. are tools that'll be used by the people that know the most about these things - artists. Sure, any old schmuck might be able to generate an image based on a prompt. But they aren't going to be able to do it the same way an artist will. Artbreeder has existed for a while now, yet outside of artist circles, I haven't seen that much use of it. Same will happen to these once they get normalised and become accessible.

@trallakid 2 years ago

i don't think it will make all graphic designers redundant, just the ones stuck in the past. with all professions, the technology is constantly changing so any good graphic designers would ideally use these types of technology as another tool in the toolbelt. As a graphic designer myself I can 100% see this technology being great for idea generation and coming up with some ideas from prompts, but I don't think it will ever be able to fully replace a human (although mark my words I might regret going down this career path in a few years lol)

@lazz4205 2 years ago

Dall-e mini is an amazing tool to fiddle around with, i find it really excels at abstract depictions of things - the style "album cover" can come up with some pretty cool stuff when paired up with good prompts

@pastfuturizednow7907 1 year ago

thank you for your work

@mennonis 2 years ago

Actually watching this to keep up with the progress, as I feel it will be important

@kasplay7275 2 years ago

the pics with the text in them are just like when u try to read in your dream

@jetex1911 2 years ago

Dall-e might not be able to do full images well, but I've definitely been having fun getting it to generate ideas for art thumbnails I could bring to life

@adan7949 2 years ago

It's actually a bit scary how convincing some of the images are, I'm glad those things are hard to get your hands on

@declanlambert1089 2 years ago

not for long

@BombBird11 2 years ago

@@happygofishing Dangerous little man, now aren't we? lol

@gigabooga 2 years ago

@@happygofishing Yeah but no one asked

@spiderjerusalem8505 2 years ago

@@happygofishing, true

@ShawnFumo 2 years ago

MidJourney has been letting many people in lately. Even DALL-E 2 said they let in 10k people in a week recently and had a survey on pricing models. It won’t be long.

@MarkSulekTalk 2 years ago

For your interest, I've seen an article where a photographer put a blurry, out-of-focus image he took into DALL-E 2 and wrote "Ladybug on a leaf, focus stacked high resolution macro photograph". The image recovered details and focus and became tack sharp, which was impressive! You should try that!

@brownehawk7744 2 years ago

So basically some DALL-E # will eventually replace artists. You guys had a good run.

@arandompotat0 2 years ago

Love the AI image generation videos. Hope you can continue making these videos, since the tech will constantly improve. As a digital artist, I find your experimental videos testing the limits of AI so captivating. Will I have a job in a few years? Probably not in concept art. Not anymore, that's for sure 😂😅

@PCubiles 2 years ago

Dall-E 3 (or at least something published at around the same time) will most likely be generating videos. There are already small examples that can do that from 1 initial image, but if you can connect that to a generated first image we could get it in a few years, or even in just one

@tedstriker2000 2 years ago

looking forward to these doing animations '')

@uglycoal 2 years ago

Could someone tell me the name of the music starting at 12:33? Can't seem to find it P.S. Found it in Deep in Thought album at 36:08 named as 03 - Wistful Winds

@QuestioningYourSanity 2 years ago

This is beyond fascinating. If I were making a movie or video game, I would use this to expand my idea of what's possible.

@boofkoosher2631 2 years ago

I was very shocked with dall-e 2 results. They were scaring-ly accurate and very detailed.

@Periwinkleaccount 2 years ago

Scarily.

@boofkoosher2631 2 years ago

Thank you dear sir for providing an appropriate word to fixate my lingua franca

@MrRobotrax 2 years ago

I'd really love to see dall e 2 make images of the backrooms. It feels like the perfect prompt for AI, since the images it generates already have a somewhat dream-like aura to them.

@CozMyN 2 years ago

I lost it when you said "Valve, please fix" :)))))

@mattd1466 2 years ago

I'm not sure you're aware of how good you are at presenting and making topics interesting, like I still watch your csgo videos even though I haven't been playing the game for years because they're still enjoyable to watch.

@mattd1466 2 years ago

@@2kliksphilip oh totally! at the end of the day I prefer my Philip kliked twice over thrice.

@keenban 2 years ago

I just got access to Dall-E 2 the other day, and I have been playing around with it. Honestly, it is quite crazy what it can do. I wonder how it would be if it were unrestricted.

@BombBird11 2 years ago

*C H A O S .* Just pure, utter chaos....💀

@godofzombi 2 years ago

I've found Dall E mini is really good at Art Deco posters, especially if you stick to natural landscapes. H.R. Giger also gives decent results, although not the best quality. And mini's tendency to mangle faces makes some drawings almost look like the works of Francis Bacon.

@Failzz8 1 year ago

Dude the mattress part fucking killed me, why is this so funny holy shit lol

@amnesicpachyderm 2 years ago

I'm loving these AI videos. It's an exciting and worrying time, and it feels like we're on the precipice of some historic developments. Hopefully good ones. But I guess we'll see either way.

@FredMoin 2 years ago

I thought some of your upscaling videos were interesting, but this is just amazing. Thanks for the work you put in to show us what AI is capable of. Do you know if DALL-E 2 is made for that resolution and could be adjusted to put out more realistic images with more time/processing? Anyway, I still have a hard time accepting that a "program" can interpret text into related images at all.

@AlessandroBluesBreaker 2 years ago

The vertical mattress thing is crazy, I went through a whole phase where I disassembled my bed for more space

@vectornine 2 years ago

The faces at the end are so good it's crazy

@MikeKleinsteuber 2 years ago

Nicely done.

@beanbandit495 2 years ago

The ending completely blew my mind!

@GamerReality 1 year ago

What music do you use in your videos? So enjoyable to listen to while you're talking!

@BrainSlugs83 2 years ago

I don't think the Midjourney watermark is a smoking gun for saying that they grabbed existing photos and jiggered them around -- it could have just had a lot of watermarked photos in the training set for pictures of horses (or mountains), and learned that writing in the background is part of what that looks like. I really feel like the holy grail of this stuff is a full 3D representation of your prompt (potentially animated or simulated; as there are realtime AI physics simulations that are coming along nicely -- like simulating the wind, water, large scale physics -- much more performant than PhysX or other modern approaches, especially for fluids and wind, etc.)... I think it could be really cool for VR or environment modeling or other types of asset modeling.

@gtPacheko 2 years ago

Great videl from 26PKL!

@AHAuwuOK 2 years ago

I had to pause the video because tears of laughter from Midjourney's attempt at a dog sniffing a lamp post were blurring my vision

@Nyllsor 2 years ago

Very informative!

@christophernoneya4635 2 years ago

I think my favourite use for Dall-e mini is generating abstract art. It does this really well as it becomes easy to ignore say the smudging on the sides of an eye. It really does seem to struggle specifically with the human form, as one part bleeds into another.

@harrymalm 2 years ago

I was expecting this to be on about the same level as the competitions you made between the CS bots, but I was wrong... Some of the images generated by DALL-E 2 could definitely fool me, and I think you should make a video where the viewer guesses which images are real and which ones have been generated.

@Friek555 2 years ago

"Go to the kitchen and make me a kitchen" sounds like the kind of sentence a text generating AI would have spit out a few years ago

@chadcrypto2675 2 years ago

Awesome video, thanks.

@trillshii 2 years ago

I've been loving these A.I videos recently. I had no clue A.I technology like DALLE 2 even existed; I thought we were at least 10 years away from something that effective. It's amazing, but also terrifying.

@Batznblkcatz 2 years ago

I tried using descriptions of dreams I've had. Whoa, I mean it's on the free dall e mini site but I'm questioning why I thought it was a good idea when the dream consisted of a flooded crawlspace under a house, with floating doll heads, those vintage ones, and omg I'm gonna have a hard time sleeping now.

@Senkiowa 2 years ago

8:13 The mattress thing is interesting, as "clutter" is generally one thing I always thought made realistic renders distinguishable from real life. There is so much unique stuff in the world that people just leave lying around that someone working on an appealing-looking 3D scene will not be able or want to include, or will only include in a way that makes it identifiable. Whereas in reality, if you look at a photograph, you're probably going to see blobs of things that can't be identified.