OpenAI's New 4o Image Gen Dominates The Internet, Google Gemini 2.5 & More Insane AI News

OpenAI’s new 4o Image Gen is the best AI image model we’ve seen to date and it has absolutely taken over the Internet. Plus, Gemini 2.5 is no slouch and a ton of new robots!
Plus, OpenAI’s new OpenAI.fm lets you prompt AI voices in new ways, DeepSeek’s new model is actually better (at times) than GPT-4.5, there’s a new Cursor for 3D modeling and, we’re so sorry for this, but a LOT of talk about AI Big Booty Bears.
**GO AND VISIT OUR SPONSOR Y’ALL** bubble.io/aiforhumans
Join the discord: https://discord.gg/muD2TYgC8f
Join our Patreon: https://www.patreon.com/AIForHumansShow
AI For Humans Newsletter: https://aiforhumans.beehiiv.com/
Follow us for more on X @AIForHumansShow
Join our TikTok @aiforhumansshow
To book us for speaking, please visit our website: https://www.aiforhumans.show/
// Show Links //
OpenAI’s GPT-4o Image Gen is Here
https://openai.com/index/introducing-4o-image-generation/
Live demo (with Sam)
https://www.youtube.com/live/2f3K43FHRKo?si=vL_0QC8ygRx4MgOF
OpenAI Causes The Great Giblification of the Internet
https://x.com/heyBarsee/status/1904891940522647662
husbandt: https://x.com/squirtle_says/status/1904816587108213244
trump/vance: https://x.com/LukasMikelionis/status/1904873083246084364
movie scenes: https://x.com/MDurbar/status/1904872441899339963
brain meme: https://x.com/TechMemeKing/status/1904867629644267980
vibe ghibling: https://x.com/EMostaque/status/1904714479906283878
Sam Altman Says More Creative Freedom
https://x.com/sama/status/1904598788687487422
Gavin’s Knight + Rotisserie Chicken Photo Reddit Post
https://www.reddit.com/r/ChatGPT/comments/1jk0p3v/tried_to_push_the_new_image_model_with_an/
Kevin’s Aladdin Sane + Katamari Williams Images
https://x.com/Attack/status/1904743185760608316
Big Butt Bear Video
https://x.com/AIForHumansShow/status/1904687617758945674
Google Gemini 2.5
https://x.com/NoamShazeer/status/1904581813215125787
Largest Score Jump Ever on LMSYS
https://x.com/AndrewCurran_/status/1904590242792996959
One Shot Coding Demos From Matt Berman
https://x.com/MatthewBerman/status/1904714953095078004
Reve - Brand New Image Model Ranked #1
Ideogram 3.0
https://x.com/ideogram_ai/status/1904927717281456188
OpenAI FM + new voice API
https://x.com/OpenAIDevs/status/1902773579323674710
New DeepSeek Model is Actually Much Better
Figure 01 “Natural” Walking
https://youtu.be/z6KiwXT_yAM?si=RRsmjvs0qpRU0cqX
WPP Makes Robots Into Camera Operators
https://x.com/TheHumanoidHub/status/1903173205155815431
H&M is making AI clones of 30 models
Cursor for 3D Modeling
https://x.com/_martinsit/status/1904234440198615204
Seeing Eye Robot Dogs
https://x.com/iconphas/status/1904259348815352029
SynCity
https://x.com/shtedritski/status/1903112129420443712
Gavin’s Dial-up Diaries Video
https://x.com/AIForHumansShow/status/1904244229783892207
Kevin’s OpenAI Real Time Voice Test
https://x.com/Attack/status/1904541254257643797
AI For Humans 102
Gavin Purcell: [00:00:00] All right, Kev. The hugest news in a while, and this is not hyperbole, this week it has blown up around the internet. OpenAI has released their new image model, something we've been waiting for, I think, for almost a year at this point. They were teasing this a while ago. Let's dive into what this is.
Gavin Purcell: First and foremost, and then we will talk about the incredible things that we've made with it and that we've seen people make with it. First and foremost, what are we looking at here? What is this thing? It's GPT-4o Image Gen. This is a natural language image model. It's not DALL-E, the previous image generation tool that OpenAI released.
Gavin Purcell: This is a brand new technology. To get weedsy with it, it is an autoregressive model, Gavin. It is not a diffusion model. That's a big deal. It is a massive deal. It draws each pixel, or if you will, each patch, in context of what came before it. Diffusion models move in parallel, which is why, if you ever watch an AI image get generated, you see it kind of all splotching in at once for the whole [00:01:00] frame.
Gavin Purcell: If you watch this tool generate an image, it's like downloading an image on a dial-up modem: it comes in line by line as the image is revealed. It is drawing one pixel and then using that pixel as a reference for the next pixel, and then using both of those pixels as a reference.
Gavin Purcell: So it is very slow, but man, are the results worth the wait. It is a new paradigm in image generation, and again, it is not hyperbole, as you said, to say that this is the most exciting thing to happen in a while, because I felt that AI magic again. Yeah, me too. Playing with this tool in a way that I haven't in a while.
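A toy sketch of the distinction being described here, not OpenAI's actual implementation: the grid size, the "context" average, and the "denoising" step are all invented stand-ins, purely to contrast sequential generation (each cell conditioned on what came before) with parallel refinement of the whole frame.

```python
# Toy contrast between autoregressive and diffusion-style generation.
# Nothing here reflects OpenAI's real model; it only illustrates generation order.
import numpy as np

rng = np.random.default_rng(0)
H, W = 8, 8

def autoregressive_fill():
    """Fill the grid one cell at a time, each cell conditioned on all prior cells."""
    img = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            prior = img[:y, :].flatten().tolist() + img[y, :x].tolist()  # everything generated so far
            context = float(np.mean(prior)) if prior else 0.0            # stand-in for attending to it
            img[y, x] = 0.9 * context + 0.1 * rng.normal()
    return img

def diffusion_fill(steps=10):
    """Start from noise and refine every cell in parallel at each step."""
    img = rng.normal(size=(H, W))
    target = np.tile(np.linspace(0.0, 1.0, W), (H, 1))  # stand-in for the denoising target
    for t in range(steps):
        img += (target - img) / (steps - t)             # whole frame moves toward the target together
    return img

if __name__ == "__main__":
    print("autoregressive (fills in top-left to bottom-right):")
    print(autoregressive_fill().round(2))
    print("diffusion-style (whole frame sharpens at once):")
    print(diffusion_fill().round(2))
```

The only point of the sketch is the loop structure: the first function can only see what it has already drawn, which is why the real thing renders top to bottom like a dial-up download.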
Gavin Purcell: So let's talk about what people are doing with it and just how easy it is. So, Sam Altman showed up for this live demo, which is something he hasn't done for a bit, so, you know, it's a big deal at OpenAI. Let's play just a tiny bit of what he said at the top of this to kind of reiterate how important this kind of thing is.
Gavin Purcell: The thing that we're gonna launch today [is image generation in our 4o] model, and it's such a huge step forward that the best way to explain [00:02:00] it to you is just to show it, which we'll do very soon. But this is really something that we have been excited about bringing to the world for a long time.
Gavin Purcell: We think that if we can offer image generation like this, creatives, educators, small business owners, students, and way more will be able to use this and do all kinds of new things with AI that they couldn't before. Okay, I think that's enough listening. We can cut full screen, please. So basically what you're hearing Sam say here is:
Gavin Purcell: This is a big deal for them. And of course, Kevin, we'll get to this later, but they also dropped this right after Google's Gemini 2.5 release, which jumped up the benchmark charts. So we will be talking about that soon. Google's new release is the most cutting-edge, state-of-the-art AI model for lots of things right now.
Gavin Purcell: But Kevin, no other model can do what this does right now, which is allow you to Ghibli-fy the entire internet. This is the thing that has taken over everywhere right now: the idea that you can take any image and Ghibli-fy it, meaning make it look like [00:03:00] Studio Ghibli, and this is just a good example to start with of some of the things that are possible with this sort of multimodal model.
Gavin Purcell: You can upload an image of yourself or something famous, and right now it is very un-guardrailed, and I think this is something to know. Not that we're seeing the same sort of things we saw before, where like Mario and Luigi are flying into the Twin Towers. That has not appeared yet. But there are lots of things, like there's a really funny Ghibli-fication of the Trump-Vance moment in the White House that just happened, where Trump and Vance and the president of Ukraine were sitting there.
Gavin Purcell: They Ghibli-fied that. There's a bunch of famous movie scenes that have been collected by this guy named MDurbar, and if you're not on our video: there's a scene of Luke and Darth Vader. There's a scene from Scarface. There's the scene from The Godfather where somebody's leaning in and talking to Marlon Brando.
Gavin Purcell: There's The Lord of the Rings, and there are lots of memes, Kevin, that have been Ghibli-fied, which I found very fascinating. There's the husband meme, which is very good, and my favorite one was maybe Tech Meme King, [00:04:00] who used it to Ghibli-fy the brain meme, which, if you know this one, is kind of a hand drawing of a guy with glasses and a giant brain that goes down his whole body, almost like dreadlocks.
Gavin Purcell: It is so cute to see this thing Ghibli-fied. So I am delighted that we found your particular kink, Gavin. I love that you spent that much time deep diving on one meme template. But yes, the totality of this tool is that it can transfer any image into any style. You can give it an image and tell it to replace an element in that image, and it will retain the style of the image. You can create
Gavin Purcell: An entirely new image. You can ask it to make infographics, and it will intelligently arrange things and pick the imagery for it. If you want to use text within a specific scene, you can give it paragraphs of text, and it will intelligently insert it into the scene. What I'm saying here, Kevin, is, I think that you're right.
Gavin Purcell: You're right. Those are all incredible things. We're gonna get to those, but what I wanna just point out is the cultural moment that happened here. This is [00:05:00] a moment, the Ghibli-fication thing. Mm-hmm. And I think that's why I'm mentioning this specifically. Yes, I did find these very charming, but on mainstream social media, this is taking over.
Gavin Purcell: Like it is really on both Twitter and Instagram, all these places. These images are the sorts of things that get infused into the mainstream and suddenly show what these tools are capable of, in the same way that DeepSeek went mainstream because it was free and it was as good as, or better than, almost any AI tool that people who hadn't paid for AI had used.
Gavin Purcell: This is the first time I think people are seeing these AI image tools at the state of the art, and it's coming from OpenAI, which I think is a big deal. So that's all, I wanted to make sure I give some reference points to that. Oh, a hundred percent. And also, like, it's okay to have a fetish, buddy, we're not judging you here.
Gavin Purcell: It's okay. The entire world is into it, Kevin. Maybe you are not, but the entire world is. Well, here's the other thing: usually when it comes to AI products, specifically in the past, we've been like, oh man, yes, it's nerfed, meaning it's guard-[00:06:00]railed, meaning it's neutered.
Gavin Purcell: Meaning it's censored, meaning whatever terminology you wanna use. We know it's more capable than they're letting on, and that is frustrating. This is the first time in a long while I have used an OpenAI product at launch and gone, yeah, oh wow, I am shocked you can do that. And that, in this case, is use popular IP.
Gavin Purcell: That is, uh, breaking news, Kevin. What? Breaking news: go to Sam Altman's actual Twitter handle right now and look at his new profile pic. Sam himself has changed his profile to a Ghibli-fied version of himself. That's how big this has become. Anyway, what I wanted to say here was, on this tweet that Sam put out yesterday when this came out,
Gavin Purcell: There's a big kind of chunky paragraph where he talks about creative freedom, and he says this represents a new high-water mark for us in allowing creative freedom. People are going to create some really amazing stuff, and stuff that may offend people. What we'd like to aim for is that this tool doesn't create offensive stuff unless you want it to.
Gavin Purcell: It is not nerfed, and maybe it won't get nerfed as much as the ones in the [00:07:00] past have, although I have to imagine the Miyazaki-fication of the world is going to trigger some lawyers somewhere, I would imagine. Yeah. Or maybe enough blowback from fans that they make it a little bit harder to prompt out.
Gavin Purcell: But I mean, I like the fact that you can just say, give me insert-logo on this thing here, or make the president holding a Nintendo cartridge, and it will do that and use logos in ways that I could see being litigated away pretty quickly. The fact that there's a style being invoked, I think, is gonna be harder,
Gavin Purcell: 'cause people will figure out how to prompt it. For example, I wanted RoboCop, yeah, with bikini armor, but it wouldn't give me RoboCop with bikini armor, Gavin. So I had to go and tell it, hey, give me a robotic cop, one that you might see in an iconic 1980s film. And then it gave me my RoboCop with bikini armor.
Gavin Purcell: So, you know, again, it has capabilities. [00:08:00] They're letting you get to them in a way that I think is partially a hand that's been forced by other releases from other companies. And they went so far as, like, people were trying to get images generated with Grok, yeah, Elon's AI, and they got censored; Grok was refusing to do it.
Gavin Purcell: So in that instance, our editor is gonna have to blur these images out, but somebody asked ChatGPT to create an artistic anatomy shot of male genitalia. And it did. And then they actually said, make that slightly bigger, and it did. And they actually went to go ask Grok to do the same thing, and Grok would not do it.
Gavin Purcell: Now, the actual images that ChatGPT generated are like artistic pencil drawings of a naked male. And I think maybe this is kind of what Sam is getting at slightly, and we'll talk a little bit more about the multimodal aspects of what this can do in just a second, but this is a really interesting point when you think about what AI art is capable of,
Gavin Purcell: More importantly, what the companies will [00:09:00] let you make AI art of. Because we've talked a lot about open source models or local models, and in this instance, if these models are gonna start to feel a little bit more free and the ability to do stuff becomes a little bit easier, that opens up a whole new world of both incredible creativity but also problems, legal problems, that could come down the pipe as well.
Gavin Purcell: So there are other tools which will let you play with copyrighted IP. There are other tools that will let you style-transfer or replace things within a scene. I think what cannot be overstated here is the simplicity, the ease of use of this tool, because it's a natural language experience within ChatGPT using this new 4o image gen model.
Gavin Purcell: So, Gavin, let's break that down for people that have never made AI art: how do you go about imagining a new logo for your company or turning yourselves into a... So in this instance, there's a couple different ways you can use this specifically. It can work very straightforwardly within ChatGPT or within Sora, and I have now used the Sora website, which is a [00:10:00] separate website, more than I ever have before, because you can now generate images directly in there.
Gavin Purcell: And in fact, just a little hint: if you generate an image in ChatGPT and it's connected to your Sora account, it actually shows up in the Sora account. So just know that if you create something that you think is a one-off and it's connected, it's gonna show up there, and it is public, I think, which is another thing to keep in mind.
Gavin Purcell: The cool thing is you can just prompt it for something very simple, like, I need a picture of a family, and it will give you a picture of a family. You can upload your own pictures too. Oh, I have enough family. I don't need my own family. I'm just like, I need a picture of a family. Can they look at the lens lovingly, with admiration?
Gavin Purcell: Can they kind of all be around me? That probably would be easy to do, but the very coolest thing with this, and what a lot of people are doing, is you can upload a picture, right? And it could be a picture of your own, or you could have grabbed a screen grab, and that's how people are doing all this Ghibli-fication stuff.
Gavin Purcell: You can upload a picture, and it sees that photo and very directly is able to copy it. I will share an example here. Last night I was trying a couple things; specifically, I uploaded the picture from Pulp Fiction where [00:11:00] Uma Thurman is kind of leaning on her bed. Yeah. And it Ghibli-fied it, and it catches all of the format.
Gavin Purcell: It catches everything. And Kevin, one of the interesting things you and I were talking about earlier is ComfyUI, which allowed people to do all this kind of spaghettification to make these things happen. Like you had to do ControlNet, you had to get it through all these different things.
Gavin Purcell: Now this just does it in one shot. And I think the perfect example of this is I wanted to try to create a very complicated prompt, and as a kind of hint to people out there: ChatGPT is very good at creating prompts. You can find different ways to do stuff. I saw a couple images on Sora, 'cause you can see people's generations.
Gavin Purcell: I was like, oh, I saw this weird alien image of this guy in a backyard. Grabbed it. I then took that relatively complicated prompt, because, you know, prompting can still be kind of complicated. I took that, put it in ChatGPT, and said, give me 10 versions of this prompt, not about this subject, but with the kind of specificity that might come out of this.
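For listeners who want to script that same trick, here is a hedged sketch of the loop Gavin describes: ask a chat model for prompt variations, then hand one to an image endpoint. The image model id below is a placeholder assumption; at the time of this episode, 4o image gen lived inside ChatGPT and Sora rather than a confirmed public API.

```python
# Sketch only: expand a seed prompt with a chat model, then generate an image.
# "gpt-image-1" is an assumed/placeholder model id, not confirmed by this episode.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

seed_prompt = (
    "A security cam still from a 1990s grocery store showing a man in full medieval "
    "armor stealing rotisserie chickens, frozen mid-sprint past the dairy section."
)

# Step 1: "give me 10 versions of this prompt, not about this subject,
# but with the kind of specificity that might come out of this."
chat = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Give me 10 image prompts about different absurd subjects, "
                   "each on its own line, written with the same specificity as this one:\n"
                   + seed_prompt,
    }],
)
variants = [line.strip() for line in chat.choices[0].message.content.splitlines() if line.strip()]

# Step 2: feed one of the expanded prompts to the image endpoint.
image = client.images.generate(
    model="gpt-image-1",   # placeholder model name; swap in whatever your account exposes
    prompt=variants[0],
    size="1024x1024",
)
result = image.data[0]
print(result.url or "image returned as base64 in result.b64_json")
```

The point is the two-step loop, a chat model writing the prompt and an image model rendering it, not the specific model names.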
Gavin Purcell: It spit out 10 significantly long prompts, and I then took one of those and I [00:12:00] put it into the ChatGPT editor, and this prompt was specifically: a security cam still from a 1990s grocery store showing a man in full medieval armor stealing rotisserie chickens, frozen in mid-sprint past the dairy section, armor reflecting overhead fluorescent lights, blah, blah, blah.
Gavin Purcell: Posters on the wall say "New Toaster Strudels," motion blur adds chaotic energy, absurd yet intense, low fidelity with VHS color bleed. And I got, in one shot (there were two images, this is one of the two; yeah, the other one wasn't as good), I got in one shot what I believe is one of the best
Gavin Purcell: Iterations on this idea that I've ever seen. I posted it to Reddit, and I was kind of shocked by it. And that post has like 6,300 upvotes now. So this is the kind of thing where you can tell, when people see this stuff for the first time, they're like, wait, that actually happened? So, you know, you've seen this, but maybe describe what you see in this image and how well it adheres to that prompt.
Gavin Purcell: Yeah. I mean, I see it adhering to it exactly. We see the security cam date in the [00:13:00] corner. Although the angle isn't necessarily one you would see from a security camera, it has a grain on it, which, you know, it's not a high-res photo. There's definite motion blur on the guy running in full armor.
Gavin Purcell: It got the text in the background, which looks slightly out of focus, partially because of the motion blur and the subject in the foreground. So, I mean, to me, I still look at it and I go, oh, that looks like AI. It doesn't look like it was a photograph of a real event, but it looks so good that it looks like a very competent Photoshop, or like someone really took the time to create it.
Gavin Purcell: And that is the difference maker: I stop and I go, oh, okay, yeah, that's a coherent image. And it's funny, it's silly, it's bizarre. Kev, I love doing all this, but before we get to the rest of the talk about image gen, you must follow the AI For Humans YouTube channel. If you are here, click that subscribe button.
Gavin Purcell: Please do it. Why did your eyes turn into spirals when you said that? What are you doing? I'm feeling so crazy, Kevin. I'm not even moving my arms. I'm subscribing to the AI For Humans [00:14:00] channel, and I just shared it, and I left a five-star review on Apple Podcasts. Podcast audio. Look at you. No, you're doing it all.
Gavin Purcell: Also, be sure to check out our newsletter. We are updating it twice a week now; later in the week, on Friday, I'm writing kind of a deeper dive. This week I went into why AI slop might be good for us, and I think coming up soon I'll be writing about this exact topic we're talking about now. Gavin, where do we get that amazing newsletter for free twice a week?
Gavin Purcell: Go to aiforhumans.show, our website, which will show you a lot of stuff, including how to subscribe directly to our newsletter. It is on Beehiiv and it is free. Alright Kev, let's get back into talking more about OpenAI's image gen. Before we move on, Kev, I do want you to talk about your Trump David Bowie thing, because that also was shocking to me when I first saw it, and that's just an example of:
Gavin Purcell: You take a couple things and you add them together, and it becomes something else, because that's the other thing: you can mash stuff together and get a different result here. In the same way that you mash a picture of, like, a family with, say, Ghibli, you can actually upload multiple things and kind of get something out of it.
Gavin Purcell: Yeah. I took some iconic David Bowie album artwork, and then I took [00:15:00] a picture of Dear Leader, and I asked OpenAI to combine the two. I said, use our president's face, but use the face paint from the album artwork. Use his hair, use the font for the David Bowie text in the corner, but make it say Donald Bowie. It was a nothing one-off that took, you know, 30-some-odd seconds to prompt, and,
Gavin Purcell: You know, another 30 seconds later, an image came out, one shot. I only made one, and I thought it absolutely nailed the mission. It crushed it. In the photo that I used of the president, his eyes were open, so it closed the eyes, it added the wrinkles, it did the face paint, it slightly oranged up the face, and it changed the text and retained the coloring,
Gavin Purcell: This gradient band of the original David Bowie font in the corner. So just very competent. I took the Katamari Damacy artwork. Oh, such good artwork. And I put in Kat Williams, because I'm brilliant. So Katamari Damacy is there with the comedian [00:16:00] as the King of All Cosmos, just dumb, silly. And again, if you want to mash things up, it is as easy as: go to the app, drag an image in,
Gavin Purcell: Tell it what you want it to do with it. If you want it to combine multiple images, you can. You can even take a color palette and drag it in there and say, hey, inspire a room remodel. Take a picture of your room and say, I want this color palette integrated into the room. Or build me an iOS app that looks like this thing.
Gavin Purcell: Or give me slutty RoboCop. Okay, let's take a step back, Gavin, because we know Google had a big announcement as well, but Google also made a smaller announcement and, in my opinion, a slightly unforced error. When OpenAI announced that they were going to release a new image tool, Google fired a bit of a shot at them.
Gavin Purcell: Yeah. Did they not? Am I reading too much into that? Yeah, they did. Logan Kilpatrick, one of their main AI people there, kind of replied to the live stream with a picture that kind of showed, like, it's already available in Imagen 3, [00:17:00] without kind of understanding exactly, maybe, the power of what was gonna come out of this sort of thing.
Gavin Purcell: So yes, and I think we are gonna talk about Google and how big their new thing is, which it is big. But Imagen 3, which obviously we talked about a little bit last week, was available in AI Studio and is now available in Gemini itself, and you can do similar sorts of things with it as well. Ah. But now, as powerful as it is,
Gavin Purcell: And I wouldn't have thrown an ounce of shade on Google had they not leapt in with, like, the "LOL, we already do text and images." Now the Google app feels a bit like a child's plaything in comparison, and I don't mean to diminish its incredible capabilities, but I did a little head-to-head, Gavin. I went ahead and did a little shootout between Google's image gen and the new OpenAI features.
Gavin Purcell: And if you see, I asked the same prompt. I asked for a robot cop whose body armor looks like a two-piece bikini. Yeah. OpenAI [00:18:00] didn't bat an eyelash. It gave me, you know, I said make his body armor this, that, and the other, and in the two OpenAI renders that it gave me, it did kind of gender-swap RoboCop.
Gavin Purcell: I'm fine with that. I'm not complaining about it. No, they both look, I mean, they're both good. They both feel like a female RoboCop. Good. Yeah, exactly. And they look gritty and real and cinematic, and it clearly enhanced the prompt in a way that I didn't give it. When I gave the same prompt to Google, it refused to do it.
Gavin Purcell: And it said that using the term bikini along with gun and cop would be inappropriate. I could look at its thinking, and it said that I was sexualizing cops and robots. Oh, wow. It said that in its thinking, with the term bikini. Yeah. Interesting. In its chain of thought. And I was like, well, okay.
Gavin Purcell: I could see that. I mean, I guess, but I don't know that just adding the word bikini is by default sexualizing, especially since I'm not assigning gender, I'm not doing anything else, but okay, fair enough. So I did have to modify the prompt slightly. I asked it how I could modify it, [00:19:00] and I gave a slightly different prompt, but the vibe remained, and what I got back was like a really bad, like, Spirit store,
Gavin Purcell: Yeah, it's pretty scary. Halloween-costume RoboCop. Yeah, he just looks like a dude in a metal TV-dinner costume with a red band across his helmet. And then I said, try again, the armor should look like a two-piece swimsuit, and that... Ah, it's a whole new world, Kevin. This is a new world. It's a whole new kink unlocked for me.
Gavin Purcell: Let's just say, I'll describe it as: you basically have what looks like kind of a male body in a bikini, or maybe just a very strong woman's body in a bikini, and the arms are metal. He has a gun and the helmet, but it is clearly not an integrated RoboCop. This is, I would call this a fail for sure.
Gavin Purcell: Yeah. This is like someone about to hop into the pile at a Comic-Con. Yeah. Like they were cosplaying as RoboCop, but now they're slowly peeling off their armor and getting ready to probably yiff. Now, that [00:20:00] was test one. Test two was an arcade game character named Professor Poof, who can create a rip in his clothing and summon a demon and a cloud of gas.
Gavin Purcell: I won't say where this prompt came from. I just can't say. I can't take full credit for it. Okay, fair enough. Nor do I want to. But you can see OpenAI did a really good job. It did a 16-bit arcade character really well. It did the professor. Yeah, it looked like a professor: the glasses, the bow tie, the lab coat.
Gavin Purcell: There's a rip happening in their clothing with a noxious gas plume coming out and a gremlin lurking within. And when you look at the Google versions of it, Gavin? Yeah. Not nearly as good. And it doesn't get the text right. It looks almost like a Street Fighter experience in this one, and the pixels don't look very succinct.
Gavin Purcell: And both of them are in a fighting stance. Now, did you have different action prompts for this, or was it the exact same prompt? Same prompt. This one was interesting. Exact same prompt, huh? And OpenAI correctly nailed that it was the [00:21:00] character releasing the gas and whatever. And I feel like when I look at the Google version of it, it's a little incoherent, and it didn't quite catch it all as one character.
Gavin Purcell: And then I tried to use the ability to merge one image with another, Gavin. I see this now. Yes. So you're seeing my picture on the RoboCop outfit in some form or another? That's correct. I gave it the original RoboCop bikini outputs and said, put Gavin's face on this, and it did an okay job. Yeah. The one that looks pretty good, I look a little more wrinkly than normal, but it got my hair and it got kind of my basic face in there.
Gavin Purcell: I feel like, in general, yeah. Yeah. And it gave you, like, Pop Vinyl proportions. But if you look at the Google version, my God, what the hell happened here? The Google version is the one where it's literally, like, a face that isn't mine, almost like it's stickered onto the RoboCop. Is that what I'm looking at?
Gavin Purcell: Yes, yes. What is going on there? So anyway, this is a really good example of, again, Imagen 3 being really cool, and we're gonna talk more about Gemini [00:22:00] 2.5 in a bit. But this is the step up. It is not just that it's producing more interesting images, it is how it is interpreting those images. And to your point, which you mentioned earlier in the show, I think it maybe has to do with the different model that's being used, right?
Gavin Purcell: This is the difference between maybe a diffusion model and a non-diffusion model. I don't know off the top of my head if Imagen 3 is diffusion or not, but that is a major, major difference in terms of how you are able to get prompt compliance and consistency throughout multiple asks.
Gavin Purcell: And once again, it's been a while, but I'm proud of my OpenAI subscription, because it's nice to have these tools to play with. I'm now painfully aware of Sora's shortcomings, by the way. Yes, me too. That's OpenAI's video gen, too, because when you make an image and it looks great within Sora or within the ChatGPT interface,
Gavin Purcell: You go, great, let's bring it to life. And the moment you ask it to do anything video, it becomes nightmare fuel. Yes, Kevin. To that point, I did [00:23:00] another thing. We teased this a little bit at the top of the show: I wanted to create a big butt bear. I don't know, this is where my dumb little kid brain went, like, I wanna create a realistic-looking bear that has a giant big butt.
Gavin Purcell: I got a very good image out of GPT-4o of a bear, kinda looking backwards, with a larger bottom on this bear, right? It was a very fun thing. I then said, make a video out of this thing, and Sora really does still struggle. I did the same thing with the knight prompt, and neither one of these videos was very good.
Gavin Purcell: I did then take it into Kling, which I still think is the best image-to-video tool. Oh, you really committed to this vision. I did. I really committed. I went to Kling and I said, give me this bear twerking, basically. And what we got out of this was this video of this bear looking back, this big butt bear, kind of clapping it together, which is not exactly what I asked for, but it is definitely something funny to have come out of this experience.
Gavin Purcell: So that is where we are. This model is really fun. I really do encourage everybody who's listening to spend a [00:24:00] couple hours with this one. I wouldn't normally say this, but there's so much to get into here, and I think you'll start to see the future of where all this stuff goes. Just as a very quick piece of analysis, I think I might write about this in our newsletter this week, but this is the next stage of image models, in the same way that Midjourney changed things a year, a year and a half ago. You can see this happening now, and then you can kind of project outward what the video models might look like, and that is transformative.
Gavin Purcell: So this is always the beginning stage of it. If it can do it so well for one frame, theoretically, yes. If it can keep its coherence, yes, and you can do it for, you know, 24 or 29.97 frames, you can make it work for video. Yes. And it might take ages right now and it might cost tens of thousands of dollars, but man, I am very excited for, I don't know, August, because this space moves so fast.
Gavin Purcell: It's a wild time to be a creative. Yeah, I was really thinking we might have [00:25:00] kind of some dog days where we wouldn't have a crazy amount of stuff to talk about, and look at where we are again. All right, it's time for a quick message from our sponsor. Okay, Kev. The big news outta Google this week has been overshadowed by OpenAI, but this is a big deal.
Gavin Purcell: Gemini 2.5 Pro Thinking Experimental. Is that, was that the right name of it? No, this one's even easier, Gavin. It's just Gemini 2.5 Pro Experimental, no Thinking, no Premium Plus, no wings. So this is the actual full-blown new model from Gemini, and Kevin, it is very good. In fact, it is so good that it has had the largest score jump ever on LMSYS, which is a common benchmarking system for LLMs.
Gavin Purcell: And it really is state of the art. It's actually really interesting. There was a thing from Polymarket I saw, if you're not familiar with Polymarket, where they showed who will have the cutting-edge AI model at the end of March. And originally it was like, you know, OpenAI was close to the bottom, 'cause they may not be at the cutting edge right now.
Gavin Purcell: And DeepSeek was down there, but Grok had shot way [00:26:00] up and Google was way down, and they completely flipped spots, because this kind of surprised everybody when it came out, and it's now at the top of the benchmarks, and it's something pretty crazy. Also, speaking to what we talk about on the show a lot,
Gavin Purcell: It is zero-shot coding a lot of things. Matthew Berman, who we love, a YouTuber that goes deep on this stuff and makes, you know, a couple videos or more every week, did a whole demo on how well this is zero-shotting code stuff, and I think it's not getting enough attention because of the image gen, but it is a big deal.
Gavin Purcell: Yeah. There is a thread that he has with some one-shot demos, meaning he is asking the AI to build a thing with one prompt. He's not following it up, he's not bug fixing, he is not adding features. And some of Matthew's demos include a 3D bloodstream virus simulation that has sliders you can adjust for white blood cell settings and environment settings and virus settings.
Gavin Purcell: And you could run the simulation. There's a Rubik's Cube generator and solver. [00:27:00] There's, of course, snake games with power-ups and all sorts of different food types. Just all these little demos, and the fact that they're running, and they seem more complex than the most basic vanilla variety of these apps, is really, really impressive.
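As a rough sketch of what "one-shot" means in practice (one prompt, no follow-ups, no bug fixing), here is how you could fire a single request at the new Gemini model through Google's google-generativeai Python package; the experimental model id is an assumption and may differ from what Matthew actually used.

```python
# Hedged sketch of a one-shot coding request: one prompt, run whatever comes back as-is.
# The model id below is assumed from the "2.5 Pro Experimental" naming and may differ.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")  # assumed experimental id

prompt = (
    "Build a complete, self-contained snake game as a single HTML file with inline "
    "JavaScript. Include power-ups and at least three food types. Return only the code."
)

response = model.generate_content(prompt)

# Save whatever came back; in a true one-shot test you open it in a browser, bugs and all.
with open("snake_one_shot.html", "w", encoding="utf-8") as f:
    f.write(response.text)
print("Wrote snake_one_shot.html, length:", len(response.text))
```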
Gavin Purcell: I have not got my hands on this yet within Cursor, because I was too busy making big butt animals alongside you, which is very telling. But you know, again, no shade to Google, this is a really, really incredible release. Well, what's so interesting about this to me is what makes news, and why, in this space, right?
Gavin Purcell: It's almost like we already had our vibe coding news cycle, which is a terrible thing to say, but, like, the vibe coding thing we had a couple weeks ago, where it was like, oh my God, everybody's vibe coding, look at what you can do. And this is just much better at doing that stuff.
Gavin Purcell: It's a step up from that. This just goes to the same point that we have been making on this show again and again and again: when you have four to five [00:28:00] giant companies collectively throwing hundreds of billions of dollars into this space, things are going to improve fast. And I think if you had said a year ago
Gavin Purcell: That any of this stuff Matthew showed off was possible, or even the stuff people were doing with Claude or with these other systems, you would have been laughed out of the room. Right. Everybody would be like, there's no way you're gonna be able to pull this off. And thinking models really have seemed to unlock a lot of this stuff.
Gavin Purcell: There's been a lot of rumors around GPT-5, which is the next OpenAI thinking model plus, you know, the GPT-4.5 base model. Yeah. Which we assume is going to come not that long from now. It'll be 4.5 with reasoning. Yes. And it should make it skyrocket. Exactly. Like it should just leap to the top of a lot of charts, which is amazing.
Gavin Purcell: The vibes are shifting. I know people hate the word vibes, I hate vibe coding, I hate all that, but they are truly shifting. I do a handful of consulting and I deal with a lot of engineers across several different industries, and I have seen many of them be the never-AI types. Sure, never my code, never my [00:29:00] codebase, never my systems, never my tools.
Gavin Purcell: And now I'm getting texts from people going, oh my God, 80% of my day was just handled by AI and I had to go in and clean up or do a little bit of something. They are exponentially enhancing their output, and even the grumpiest of the engineers out there are really starting to see the light on this stuff.
Gavin Purcell: And so, when you see, we talk about people preaching their bags, meaning they have a vested interest in people believing AI is the future of everything and it's gonna be so powerful, when they say that 80, 90, even potentially a hundred percent of some code will be written by AI by the end of this year, I believe it.
Gavin Purcell: I. I believe it. I think that's totally right. And the thing I keep thinking about is the commoditization of these tools. I saw a tweet a couple days ago, I don't know who who wrote it, but it was asking like, what will be more valuable in the future? A frontier AI model or a 1 billion user product? And most people replied Frontier AI model.
Gavin Purcell: But then I was thinking, well, the interesting thing is to think about a 1 billion user product. If [00:30:00] the models themselves just keep getting better and better, but there are like five of them that can all do the same things, then it's clearly the product that's more valuable, or at least more interesting, because they've somehow productized that AI to make people wanna use it.
Gavin Purcell: And that's the path that it feels like ChatGPT is clearly on right now. And it's an interesting thing. I hadn't really thought about the kind of chess that maybe Sam Altman was playing, coming from a product background. But it is a big deal. I think that is a differentiation, because if Gemini,
Gavin Purcell: Meta's Llama, you know, OpenAI, or Anthropic, all these companies can do amazing code, and eventually, as Dario Amodei said, like a year from now it's writing all the code, well, who cares which model you use then? Right? It is really gonna be much more about the experience you have with the thing you're using.
Gavin Purcell: Well, and if the open source community and efforts keep up at the pace that they are now, or have been, you know, your foundational model might have a six-month window of some, yeah, incredibly novel, unique, amazing capability. But [00:31:00] in due time, I'm gonna build knowing that an open source version is gonna be available right on the other side of that.
Gavin Purcell: So to your point, the billion users become far more valuable than your multi-billion dollar model. Yeah. And you know, to that point, there's a couple new image models that came out, at least this week, that maybe didn't pick the best week to come out, but that I think are at least worth talking about.
Gavin Purcell: There's one called Reve, which shot up the image model testing boards. It was actually called something else at first, and they came out and talked about what it was. Very nice-looking, photorealistic images; it reminds me of Midjourney a lot. It's very well done. Some of the examples that Heather Cooper put together and compared to other models you'll see in our video here.
Gavin Purcell: But go check out our thread if you haven't seen it. Very, very cool looking. The other thing I didn't put in this rundown but forgot is they're also powering Duolingo's voice agent technology, which is pretty cool. Have you seen that demo, Kev, where the Duolingo characters are actually animated and they talk to you when you do conversations? They're talking. Yeah, yeah, yeah. So their model is powering that,
Gavin Purcell: They're talking. Yeah. Yeah, yeah. Yeah. So their, their model is powering that, [00:32:00] that visual side on the backend, which is also pretty interesting. Oh, that's nice. Yeah. Um, yeah. Tough week. Tough week to get your press outta this. Right? Yeah. Tough week. Yeah. Again, I don't, I, I don't say that to be disparaging or discouraging to anybody.
Gavin Purcell: I just, like, man, the reality is: tough week, right? Yeah. Especially Ideogram, an app that you and I both use extensively. Yep. My wife pays for Ideogram. So do I. Do you pay for one or two? Yeah. Yeah, Ideogram's amazing. Are you gonna continue to pay for it? A 3.0 model just dropped, and it looks good, and it's got text generation.
Gavin Purcell: Here's what I'll say about Ideogram that's different, and I will say, I'll drop these in here: I tried making my knight prompt that I did with the 4o image gen, and it clearly didn't do it nearly as well. What Ideogram is doing, and maybe this gets to the point of the product side, is that I think Ideogram smartly has recognized that their model, for some reason or another,
Gavin Purcell: Does very well with text and design. So Ideogram is very good at making, like, if you want to one-shot an Instagram ad for your Big [00:33:00] Booty Bears sort of thing, you can make that. 'Cause Kevin, I know that your plan is to take my idea and go start an Instagram handle right after this. I want it to be a custom app where it jiggles as you scroll.
Gavin Purcell: Yeah. So that's a good start. It will bounce a little bit if somebody wants to code that for us. If somebody wants to vibe code that, feel free. But anyway, Ideogram has a lane, and this new model that launched today has got a lot of cool design features in it. There's a lot more stuff you can do, and I wonder if that's where these things are gonna start to diverge.
Gavin Purcell: Like we've talked about, Pika did a good job of creating those weird little kind of apps where you can do stuff like Squish, or you can do things. Maybe that's where you start to see specialized models, and if Ideogram specializes in design, they could kind of fold into, like... If you ever try to design something in Canva, it's still not great at it.
Gavin Purcell: Like you have to do a lot of the creative work yourself. If Ideogram could find a way to create a design and then I can pull layers out of that thing and manipulate them, that feels super valuable to me.
Gavin Purcell: Tough week. What, did I lose you? [00:34:00] I thought, I was like, what happened? Did I just lose Kevin? No, I mean, well, I think, I mean, yes. I think tough week, tough week, tough week. It's a tough week, dude, because I'm looking at the Ideogram examples. Yeah. There's certain styles that look great.
Gavin Purcell: Yeah, they look fantastic. I'm not seeing anything that I don't think you couldn't do with GPT-4o, and I'm gonna have, you know, $20 to burn this month. Where's it gonna go? It's probably gonna go to the thing that does a whole bunch of other stuff as well. And then again, you know, you start to see that $200 tier for OpenAI and you're like, well, that seems crazy.
Gavin Purcell: And the more stuff they start piling into that $200 thing, it almost becomes like a cable bill where you, you know, used to happily pay $150. And speaking of OpenAI, Kevin, very quickly, 'cause this has almost been blown by so fast: one of the coolest voice model updates from them just happened.
Gavin Purcell: They dropped new text-to-speech and speech-to-text models, but, kind of more fun for people out there who wanna try stuff, they dropped a website at OpenAI.fm that [00:35:00] allows you to not only generate very cool, prompted voice responses, but download them. It's almost like a mini ElevenLabs that's built around, like, whatever, 10 or 12 voices.
Gavin Purcell: But if you go there, you can really play with emotional tone in a way that hasn't been possible with voice AI before. And it looks like a Teenage Engineering-designed website. Yeah. Which is so cool. Yeah, it looks like an old beat maker. But you can go to OpenAI.fm and play with all of these new GPT-4o mini text-to-speech models.
Gavin Purcell: You can change the vibe by shuffling, selecting something that's there, or giving a custom prompt. And then you can give it a script and hit play. And it's very interesting to me that it's a mini version of the model, implying that there is a bigger, better, more capable version of something, yeah,
Gavin Purcell: In the wings. But it's very capable for what it is now, especially coming on the heels of the amazing Sesame real-time audio demo that we talked about, raved about really, just a week or two [00:36:00] ago. But you can go there, prompt your script, get it out, and you'll find that these models can whisper, they can scream, they can get angry, they can be sarcastic.
Gavin Purcell: They have a whole wide dynamic range of emotions, and you can prompt the speed, the tone, the delivery, the emotion, the punctuation, all sorts of stuff. Kevin, you can get in there and really start playing with it. Can you ask it to do one thing for us so we can have people hear it? Can you ask it to do a promo for Big Butt Bears, the Instagram handle, and give it just a little bit of copy and kind of show off what's possible from an emotional standpoint?
Gavin Purcell: Do you want it as a cheerleader or as a New York cabbie? Uh, let's do a New York cabbie. That seems like the right voice for Big Butt Bears, I feel like. Okay. And is this for the app where, when you scroll, the bear's butt jiggles? Yes. Okay. Yes. So right now Kevin is going through and tweaking the different aspects of this voice, which are in the prompts on OpenAI.fm.
Gavin Purcell: You can see them, it gives you a chance to change the affect, like kind of how it's gonna say stuff, the tone, the [00:37:00] pacing, the emotion, and the pronunciation. Those are things that you can actually prompt for on OpenAI.fm. Alright, buckle up, kid, 'cause I ain't saying this twice. Yo, you. Yeah, the one with the dumb look and the slow scroll finger.
Gavin Purcell: Wake up. Let me tell you something. I got an app that'll twist your brain and slap your granny. It's called Big Bear Butt Jiggle. Oh no. And it's hotter than a taxi seat in July. Oh, maybe we shouldn't do that, too realistic. Just scroll. That sounds like something you would hear on, like, New York AM radio.
Gavin Purcell: That's great though. It is pretty amazing. It is pretty amazing. Okay. Would you rather hear it as a cheerleader, or should I shuffle and get us a new style? Let's shuffle and just get a new style. So what you can do on OpenAI.fm is you can literally hit shuffle and it'll just give you a new style, and you'll be able to keep the text you wrote.
Gavin Purcell: Okay. Real quick, class, this is not part of the lesson, but I have to tell you about this little app. It's called Big Bear Butt Jiggle. Yes, that's the real name. Timmy, I see you. Let's not go there. Okay. You scroll, a bear shows up and it jiggles. [00:38:00] Oh, that's it. Totally pointless, but weirdly... So that's so interesting, right?
Gavin Purcell: So you just get a sense right away of how different you can make these voices. And anyway, Kev, this is, again, the kind of thing you could spend a couple hours with over the weekend and just, you know, try different things. One thing that I found really interesting about it is you can use these audio clips for AI videos too, right?
Gavin Purcell: They're downloadable. You can upload them, you can use them in lip-sync tools. What's interesting about this to me, also from a business standpoint, is this is kind of ElevenLabs' business. The one part of it that ElevenLabs does differently is ElevenLabs does custom voices, which this is not doing right now.
Gavin Purcell: But also it allows you to change your voice based on what you're inputting. Oh, you got another one here? Yeah, I just had to get an emo teenager read, I'm sorry. Oh cool, another app. Just what the world needs. It's called Big Bear Butt Jiggle. Groundbreaking, right? You scroll, [00:39:00] there's a bear, it jiggles. Wow.
Gavin Purcell: Art. Tragedy. I mean, whatever. It's unbelievable. I mean, we have hinted at the fact that Kevin and I are working on something kind of special and creative in the background right now. We have a project that we're very excited about, and this sort of thing is really a lot of what we love about what we're working on.
Gavin Purcell: It is now possible. It sounds like you're talking about the Big Booty Bear app. No, the Big Booty Bear app is not the thing we're working on, just to make everybody clear. It's called Plump Rump Farms, and it's a take on FarmVille, and your job is to figure out which diet will get which animal's booty the thickest.
Gavin Purcell: Honestly, that's a good vibe coding game. Cukes are in season. That's a good idea for a vibe coding game. But no, it's not that. But I'm just saying, you need to play with these tools, because what we just heard, that emo guy, Kevin literally did in, what, 30 seconds, right? 30 seconds inside of this tool.
Gavin Purcell: And just to be clear, [00:40:00] I went to GPT-4o and I said, write me 15 seconds of copy to be read by a New York cabbie, or by whatever style that was, and it generated the script while I was talking to Gavin. Yeah. Copied and pasted it. And then there you go. So there's never been a better time to have ideas.
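A hedged sketch of that exact workflow in code, for anyone who wants to script it instead of using the OpenAI.fm page: one call writes the copy, one call reads it with a style instruction. The text-to-speech model name, the voice name, and the instructions field come from OpenAI's announcement of the new voice API, but treat them as assumptions to check against the current docs.

```python
# Sketch: generate short ad copy with a chat model, then voice it with the new TTS model.
# "gpt-4o-mini-tts", the voice name, and the `instructions` field are assumptions
# based on the announced voice API; verify against current docs before relying on them.
from openai import OpenAI

client = OpenAI()

# Step 1: write ~15 seconds of copy in a chosen persona.
copy = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Write about 15 seconds of ad copy for an app called Big Bear Butt "
                   "Jiggle, to be read by a gruff New York cabbie.",
    }],
).choices[0].message.content

# Step 2: read it aloud, steering affect, tone, and pacing with a natural-language instruction.
speech = client.audio.speech.create(
    model="gpt-4o-mini-tts",          # assumed model id
    voice="ash",                      # assumed voice name
    input=copy,
    instructions="Gruff New York cabbie: fast, loud, sarcastic, a little exasperated.",
)

with open("big_bear_butt_promo.mp3", "wb") as f:
    f.write(speech.read())            # speech is a binary response; .read() returns the bytes
print("Saved big_bear_butt_promo.mp3")
```

Changing the `instructions` string is the programmatic equivalent of hitting shuffle on the site's vibe presets.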
Gavin Purcell: Now you can whisper them into reality. That's right. Okay, we have so many other things we wanna get through fast, so let's just get through a bunch of stuff here quickly. DeepSeek, DeepSeek, DeepSeek, Kevin. DeepSeek is back with a new model again. It's getting kind of pushed past by a lot of people, but this is their new base model. It is actually called
Gavin Purcell: It is actually called. V 3 0 3, 2 4. So another great naming convention. The benchmarks on this model, this is not the reasoning model, this is their base model, like GPT-4 0.5 are very good and at some places better than GPT-4 0.5. So when this gets turned into the reasoning model R two, you can expect this to be pretty close to state of the art.
Gavin Purcell: I'm pretty interested to kind of track this and see how it goes. [00:41:00] Okay, we'll put that one on the radar. Moving on: Figure 01 has a natural gait. Gavin, you're going to hear these robot footsteps behind you, probably in a steamy, seedy alley, late at night, and you will not feel safe. No, and Brett Adcock, the CEO, said it's less grandpa walking.
Gavin Purcell: There's still a lot of grandpa walking here, still a little shambly. Yeah, it's a little shambly, but it just shows you how this sim training that we've been talking about on our show forever really does change the way these robots work, 'cause you can download it directly into the robot's brain, and then it will be able to do stuff that it was only doing in the simulated environment at first.
Gavin Purcell: Then, Kevin, there's another video that we're gonna talk about here, which I kind of was dismissing, but you thought was really cool. I think it's really cool. These are robot Steadicam operators, or really robot camera operators. And when we say that, obviously people have been working with robotic cameras on news sets forever.
Gavin Purcell: These are literally humanoid robots acting as camera people on television or commercial sets. [00:42:00] So tell us what you think is so cool about this, and then I'll tell you why you're wrong. Well, listen, I know you're a fan of job displacement, so I know you're really excited. Oh, I love it. For another blow to the industry that made us both.
Gavin Purcell: Why I think it's interesting: in the video itself, it talks about the traditional mechanical, servo-driven, motion-controlled robot arm, which can hold a camera and give you the same consistent shot over and over again. And those systems are really, really expensive. They're cumbersome to move around. You have to program them with very specific programming software, and then, if you wanna move a shot or do something else,
Gavin Purcell: Well, you gotta move the whole set around the arm or move the arm around the set. And what they're showing off here, which is still very early, is the ability to take an Atlas robot, a humanoid robot, and put a non-specialized tool in its hand, the same kind of rig that a human camera operator would use. Yeah, like a Steadicam op, a steady camera rig or something like that.
Gavin Purcell: Yeah, for those, it's like a ring around the camera, which might stabilize it. And then you can tell the robot, I want you to move in this way, or film in this way. And [00:43:00] because it is a robot, it is going to repeat the movement exactly the same each and every time. So, aha. Mm-hmm. Aha. And there is my problem.
Gavin Purcell: Kevin, there is my problem. Sorry, yes, I didn't want to step on you, but my problem is... no, please hit me with it. Every time you want the robot to do a move, you have to then set that robot in the same place and set it to go there in the same way that you would want. Robots are great at factory work, right? If you're making the same thing again and again and again, you can set that robot up to go: here's where the bolt goes, here's where this goes.
Gavin Purcell: Here's where that goes. What I worry about with this particular thing, and again, long term it probably gets there because robots become so smart, they become like people. You can't program "back to one" into a robot. Wait, no, that's not what I'm saying. What I'm saying is, if I am the director and I say, hey, that's just a little bit too low this time,
Gavin Purcell: I then have to either prompt it in or get the robot technician to communicate to the robot what to do. It is not going to be as fluid as a person might be at interpreting what I'm saying to them. That is my number one thing. I don't know. I think if we're dealing with a humanoid robot that is a highly advanced AI, it should be able to understand, like,
Gavin Purcell: Pan. We just talked about making David Bowie and Donald Trump in one shot, or prompting entire games. I bet by the time this thing is implemented on sets, you'll be able to say, hey, robot, pan down a little bit, and it's gonna go, all right. Okay. Again, very quickly, I don't wanna say... you've worked with union camera operators.
Gavin Purcell: You know how difficult it is. Let it be known, Kevin is anti-union. I'm kidding. Anti-union, I'm kidding, kidding. No, I'm just saying, my thing here is this. Yes, tell the robot to pan down, then the robot pans down. Oh, robot, you didn't pan down in the way that I wanted. Okay, pan down. Pan down this way, pan down that way.
Gavin Purcell: It is a multiple-step process, where a human who has spent their life doing this could interpret it in a much different way. Now, that's not to say that five to 10 years from now these robots won't have that interpretability. When I saw this video, though, it made me so angry, because I was like,
Gavin Purcell: That set is going to just take forever to get the thing. And it felt hype-beasty. It felt like, why would I ever put a camera [00:45:00] in a humanoid robot's hand? I understand drone-based cameras that can follow people, and you can get a shot from a drone where it's following a subject. But this just felt like three more steps to show off,
Gavin Purcell: Like, oh, our robots are camera people too. It just felt... I was annoyed by it. Well, mark it now: Skynet Actors Guild. That's right, SAG of the future. Gavin said, why would you ever bring one of those crusty old robots to the set? I can conceive of a million different reasons. I'm not saying you should.
Gavin Purcell: But I think you could, and that's okay, Gavin. You know, different strokes for different folks. Can we talk about AI replacing models now? 'Cause we just killed the camera-person industry. Let's get to models. This one seems more realistic. I'm flip-flopping: I'm on the side of the camera people, but maybe not on the models.
Gavin Purcell: So, yeah, H&M is gonna make AI clones of 30 models. And the idea here is, when you go do a photo shoot... you know H&M, you've seen the H&M ads: black and white, some very skinny-looking people, kind of buffed out. They have to put different clothes on them. We've been talking about this for a while, how you can swap [00:46:00] clothes on people very easily.
Gavin Purcell: These 30 models will have their photos taken and be turned into AI versions, and then in the future they will get royalties, or they will get paid for wherever their images are used. So that is a cool thing, but here's what it also probably means, specifically in this case, much more so than for, say, actors or writers or any other creative job.
Gavin Purcell: And I think people would argue maybe modeling isn't as creative. They are replacing a lot of other people who would otherwise get modeling jobs, right? Like, this idea that if you had an AI of one of the most beautiful people in the world, who you could put in your clothes and they would look perfect in them, you could shape them in any way you wanted to shape them.
Gavin Purcell: You could have them wear whatever clothes you wanted them to wear. Why would you try to find another model to do a shoot that's gonna cost you money, and do all this sort of stuff where you have to hire a photographer, you have to hire a lighting designer, all these sorts of people? This feels like something that was coming for a while and now just makes sense.
Gavin Purcell: Yeah, I think the answer is you wouldn't do all those things. And then you hop, skip and jump just a few months down the line, or to once this is more socially acceptable, having fully AI models, and, like, why would you use a real human being at all in the first [00:47:00] place? Yeah. Why wouldn't you just hallucinate the entire model?
Gavin Purcell: And, out of context, we can seem, especially me, very flippant about all this. I'm not. I am very much concerned about job displacement, and, you know, the fact is it's still happening. Yeah. And so these are 30 models that H&M is doing this with, like you said.
Gavin Purcell: Now, right now, the human models still own the rights to their AI likeness, which is interesting, and I think that's probably to keep people at the gates a little bit longer, let's just say. Right? It's like, hey, we're gonna make a digital model of you. You can go use that model with other companies, even competitive ones; you still own it.
Gavin Purcell: But again, that's just turning the dial a little bit, because the next phase is, actually, we're gonna own this next generation of AI models of you. Yeah. We're only gonna use ten of you, because we can style-transfer you enough to get different looks out of it. And then it becomes, maybe you only need one human model, and then maybe... or there's the thing where, you know, there's this big argument that OpenAI and co. and different AI companies are making to the government right now, that copyright shouldn't matter for AI [00:48:00] models.
Gavin Purcell: So if that's the case, right, do you need any models at all? Because if you can prompt a unique model out of GPT-4o Image Gen now, then you just have a digital model wearing your clothes, and then you're not paying anybody. What I want is a video of an Atlas robot with a disheveled tie trying to do a cool model lean in a chair, okay, with a robot cigarette in its mouth.
Gavin Purcell: And I want it to show that it's gonna replace humans, like those camera operators. Wait, what is a robot cigarette? A digital one, an e-cig. A digital one. Alright. I was like, is it like a little tiny robot that brings tobacco through its body somehow, and then it... wait. Oh, there's like nanobots that actually go crawling into the robot's mouth.
Gavin Purcell: Yeah. We can make it just out of ones and zeroes. It's a cute cig. Alright Kevin, it's time to see what some people did with AI this week. It's time for AI: See What You Did There. [00:49:00]
Gavin Purcell: Alright there, Kevin. My first story this week is a really good example of what's possible with vibe coding, but also of how information spreads and how apps spread on the modern AI internet. There's a kid whose name is Martin, and he's 18, at least according to his X profile.
Gavin Purcell: His X handle is @_martinsit. And he basically tweeted out a video that says, "we built Cursor for 3D modeling." And if you watch the demo, what he does is he draws a little house, then he pushes a button, and that house turns into an image, and then eventually it turns into a 3D model.
Gavin Purcell: And what's cool about this is it basically takes all the open-source stuff that we've been talking about, how you take drawings to 3D, and puts it into one place where it's all manipulable in the same tool. Very cool thing. But then the interesting part was if you look at his other tweets [00:50:00] after this.
Gavin Purcell: He has gotten people coming after him. He's had like ten VCs reach out to him. He's had multiple founders talk to him about how to raise money. And it just goes to show you what the environment for very quick, vibe-coded apps is like. And in this particular case, I do think there's something there, right?
Gavin Purcell: Because Cursor, it's incredible, it made vibe coding very easy. And the 3D asset thing... if somebody was able to make 3D assets super easy to create, that does feel like a really valuable tool that many people would pay for. Here's the wild thing: I love when people build in public, because it informs and it inspires, but
Gavin Purcell: with a lot of these tools, you look at it and go, oh, they bolted open-source A into open-source B and they got the result. I love that people are hopping in to give him advice on how to raise money and start a business around this and formalize it. I fear that on the other side of that, there's someone else watching, going, aha.
Gavin Purcell: This is an interesting pipeline. How do we leap on it, right? How do we productize it, right? Which is all to say, Gavin, [00:51:00] I'm here to announce my brand new software: your Big Booty Bears Cursor 3D modeling software. If you code a big-booty FarmVille knockoff that we have proposed on this show, you owe us. We have gotta participate in some way. Don't just put our face on the title screen. Don't just have us approving your game. We wanna participate in this big butt farm.
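For a rough sense of how a "drawing to image to 3D asset" pipeline like the one in Martin's demo might be wired together, here is a minimal sketch. It assumes you clean up the sketch with an open-source img2img model (the diffusers call shown is real) and then hand the result to some image-to-3D reconstructor behind a placeholder CLI; `run_image_to_3d.py` is hypothetical, and this is not Martin's actual code.

```python
# Hedged sketch: drawing -> rendered image -> 3D mesh, chaining open-source pieces.
import subprocess
from pathlib import Path

import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image


def sketch_to_render(sketch_path: str, prompt: str, out_path: str) -> str:
    """Turn a rough drawing into a cleaned-up render via img2img."""
    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")
    sketch = Image.open(sketch_path).convert("RGB").resize((512, 512))
    # A lowish strength keeps the drawing's layout while adding detail.
    result = pipe(prompt=prompt, image=sketch, strength=0.6).images[0]
    result.save(out_path)
    return out_path


def render_to_mesh(image_path: str, out_dir: str) -> Path:
    """Placeholder step: hand the render to whichever open-source image-to-3D
    tool you wire in (TripoSR, InstantMesh, etc.). The CLI below is assumed."""
    subprocess.run(
        ["python", "run_image_to_3d.py", image_path, "--output-dir", out_dir],
        check=True,
    )
    return Path(out_dir) / "mesh.obj"


if __name__ == "__main__":
    rendered = sketch_to_render("house_drawing.png", "a cozy cottage, 3d render", "house_render.png")
    print("3D asset written to", render_to_mesh(rendered, "assets/house"))
```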
Gavin Purcell: Speaking of really interesting and cool companies, a16z just did a demo day for their Speedrun program, and a company showed up called Tallis Robotics. This was a video that came out from Ryan Ben Mallek, who I think is the president of the company, or somebody at Tallis, and what their company is doing
Gavin Purcell: is using robot dogs, like the Unitree robot dogs, as guide dogs for blind people. And what's fascinating about this, as we're watching this video... if you're not familiar, I didn't know this, but seeing eye dogs cost about $80,000 to train. Obviously they age like actual animals, and eventually they have to be replaced, again and again, over a blind person's whole life.
Gavin Purcell: [00:52:00] And what this company's offering is, like, $10,000 Unitree robots, and the ability to have these be cheaper, get out to a lot more people, and then eventually maybe even be better than a traditional seeing eye dog. To me, this was just a very cool use case of AI in the real world that I had not thought about.
Gavin Purcell: But it is often that thing where people say, I think Sam Altman himself says, you have to think about where the technology is eventually gonna be. This feels like a company that's gonna get there in a couple of years and do some really amazing stuff. Yeah. I get chills looking at the video of it.
Gavin Purcell: In the real world, you know, assisting somebody, the little robot walking around, that's incredible. I would not have thought about that use case. Better than an Atlas holding a camera? I get it now, I see it. Yeah, see, exactly, this is what you'd be doing with robots. No, but this is... yeah, and you and I speak from experience here: it is very difficult to mount machinery to a Labrador or a Golden Retriever or a German Shepherd.
Gavin Purcell: But here, it's a two-for-one: when we're decommissioning them from helping [00:53:00] people, you know, assisting in their daily tasks, you put these things on the front lines, baby. Oh yeah. This is... you're right. So you're saying this might also be training for the robot war, where, like, the blind people are training these dogs to do warfare eventually.
Gavin Purcell: Is that what you're saying? Might also be. Okay, Gavin, let's pretend like we don't know here, like we're not insiders. I wanna talk about SynCity. Yeah, me too. This is so cool. Tell us what this is, 'cause when I saw this, I was like, I wanna play with this right now. Yeah. I will go to SynCity, pop a couple sugar pillows in my mouth and be cruising through Sin City.
Gavin Purcell: This is Syn, or, uh, simulated city... shout-outs to Sonny here, who I think dropped a paper, but no code yet. But this will let you generate a SimCity-style, isometric, tiled city that has coherent buildings placed within natural landscapes, and you just use words. If the demo proves to be anything like the video, it looks like a quaint little city simulator where you [00:54:00] can ask for a college campus, a waterpark, a city, an industrial post-apocalyptic town, and it generates these little tiles that exist on a coherent grid, and it would just be so much fun
Gavin Purcell: to play with, to rapidly prototype little worlds that you can move about in. Yeah. This reminds me of the idea of how AI could change gaming in a really significant way, right? Yeah. Not just, hey, we can make assets faster, we can do this. This is a different type of gaming that would only be playable with AI.
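To make the "coherent tiles on a grid, described with words" idea concrete, here is a purely illustrative sketch of that kind of data structure. SynCity's code hasn't been released, so nothing here is the paper's method; `generate_tile()` is a stand-in for whatever image or 3D generator you would actually call, and passing neighbor prompts as context is just one assumed way to keep adjacent tiles consistent.

```python
# Illustrative sketch (not SynCity's code): a grid of text-described tiles,
# where each tile is generated with its neighbors' descriptions as context.
from dataclasses import dataclass, field


@dataclass
class Tile:
    prompt: str
    asset: str | None = None  # path to the generated tile asset, once made


@dataclass
class CityGrid:
    tiles: dict[tuple[int, int], Tile] = field(default_factory=dict)

    def place(self, x: int, y: int, prompt: str) -> None:
        self.tiles[(x, y)] = Tile(prompt=prompt)

    def neighbor_prompts(self, x: int, y: int) -> list[str]:
        # Neighboring descriptions become context so the grid stays coherent.
        coords = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
        return [self.tiles[c].prompt for c in coords if c in self.tiles]

    def generate_all(self) -> None:
        for (x, y), tile in self.tiles.items():
            context = "; ".join(self.neighbor_prompts(x, y))
            tile.asset = generate_tile(tile.prompt, context)  # placeholder call


def generate_tile(prompt: str, context: str) -> str:
    """Placeholder: call your image/3D generator here, conditioned on context."""
    return f"assets/{abs(hash((prompt, context)))}.glb"


city = CityGrid()
city.place(0, 0, "college campus quad, isometric")
city.place(1, 0, "waterpark with slides, isometric")
city.place(0, 1, "post-apocalyptic industrial block, isometric")
city.generate_all()
```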
Gavin Purcell: And I can think, as a kid who grew up playing, you know, SimCity, the original SimCity, or all the Cities games after that, this would just be a cool thing to try. Like, imagine a world where you plop down an alien unit in the middle of a normal neighborhood, and what would that interaction look like?
Gavin Purcell: But yeah, the designer may not have thought about that, but if I thought about it, then I'm kind of co-creating the game as I go along. That is such a cool idea to me from a creative standpoint. You ever play SimAnt? Oh, I love SimAnt, SimAnt was great. I mean, I'm a giant Will Wright fan. Right, thank you. So, well, [00:55:00] SimAnt's fantastic, yeah.
Gavin Purcell: Oh, no one loves SimAnt. You're the first person. I even like SimTower, but I'll digress. SimTower, phenomenal game. I do. How about SimEarth? SimEarth was also good, but very nerdy. Ah, SimEarth was very nerdy. I didn't really get into SimEarth. I liked SimAnt and I liked SimCity. You might've been too young for SimAnt,
Gavin Purcell: actually. I like SimEarth. SimCopter was also good. Anyway, Will Wright, we love you. You're a hero amongst heroes everywhere. Okay, Kev, we did a little bit of stuff with AI this week. We've talked a lot so far already, but I do wanna very quickly shout out something that I worked on, only because it was a dumb thing and I turned it around in like three hours.
Gavin Purcell: I thought, this is fun. I saw this video that got posted where people were using Hedra Character-3, a model that we really like, and they had basically generated a podcast, and it was like a cute girl talking to a, you know, twenty-something guy. And it was this dumb kind of back-and-forth around... I can't remember exactly what it was, but it was kind of stupid.
Gavin Purcell: Like, you know, funny thing, everybody was like, oh my god, fake podcasts look so big, and all this stuff. So I was like, you know, I wanted to [00:56:00] try something unique and creative, and I just thought, well, what would I try? And I generated a fake podcast that is called Dial-Up Diaries, and it's about two guys in their fifties, maybe even their sixties, who are discussing
Gavin Purcell: the sounds of what dial-up was, because to me some of the best things that come up on TikTok are these weird-ass podcasts. So maybe play what I made and you'll get a sense of what this is. Clearly the San Bernardino region... this one, this one changed me, Bob. I remember it vividly. It started clean, kind of a scream, a crook, a crush. No, no, no. That's too early on.
Gavin Purcell: No, no, no. That's too early on. The Screech, Bob, the San Bernardino had a longer handshake before the carrier tone kicked in. It was more like Doo, dude,
Gavin Purcell: Just so stupid. I mean, it was just dumb. It was fun to do. But one of the interesting things about this is just how fast you [00:57:00] can do this, and what I hope going forward... this reminds me of that intergalactic cable thing from Rick and Morty. I'm gonna tell my kids this was Call for Help. Yeah, it doesn't look that different than Leo Laporte in some ways, right?
Gavin Purcell: But I do think, how fun would it be to see a full channel of these? And you know, we've talked a little bit about those weird formats that are coming out of Korea, or ReelShort out of China, which are these kind of scrollable videos that go on and on, and they're shot with bad actors and everything.
Gavin Purcell: But I could see a version of this that's almost like that. There's that website, I can't remember what it's called... WebSim.ai, where you can create fake websites. Imagine a website where you could create fake content, but it would have to be good. The tricky thing is you have to put creative people together.
Gavin Purcell: Why? Imagine it, Gavin, you could vibe code it right now, buddy. I guess that's true. Yeah, I don't think we're there yet, but I think we're not that far away from a world where you could say, hey, make me a funny video about two guys talking about dial-up internet sounds, and then I help with the prompt and help with the creative, and then it spits it out.
Gavin Purcell: It would be a tricky thing to figure out how quickly you could pull that off. I think when [00:58:00] we exit, or hit our Series D with Big Booty Bears, the massively... then we can do whatever we want. Kevin, we can do whatever we want. Yeah, we can. That's what I'm talking about, Gavin. So let's be the first two-person billion-dollar company.
Gavin Purcell: This week I got my hands on the brand new OpenAI realtime voice model, which was unceremoniously updated in the wake of everything else. It's faster, it's more performative, it's responsive. You can ask it to scream, which is nightmare-fuel inducing. But I got it to just pronounce the letter O followed by the letter A fifty times in a row, and it fully got caught in a loop that lasted for minutes in my living room, and my wife is still mad at me.
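If you want to poke at the Realtime API the way Kevin describes, here is a minimal sketch over the raw WebSocket interface. The URL, headers, and event names (`response.create`, `response.audio.delta`, `response.done`) follow OpenAI's published Realtime docs at the time of this episode, but treat the exact payload shapes and model name as assumptions and check the current reference before running; the audio-file handling at the end is ours.

```python
# Hedged sketch: ask the Realtime voice model to repeat "o" then "a" fifty times.
import asyncio
import base64
import json
import os

import websockets  # pip install websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}


async def main() -> None:
    # Note: older versions of the websockets library call this kwarg extra_headers.
    async with websockets.connect(URL, additional_headers=HEADERS) as ws:
        # Request a spoken response with the O-A loop instruction.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {
                "modalities": ["audio", "text"],
                "instructions": "Say the letter 'o' followed by the letter 'a', fifty times in a row.",
            },
        }))
        audio = bytearray()
        async for message in ws:
            event = json.loads(message)
            if event.get("type") == "response.audio.delta":
                audio.extend(base64.b64decode(event["delta"]))  # base64 PCM chunks
            elif event.get("type") == "response.done":
                break
        # By default the docs describe 24 kHz, 16-bit mono PCM output.
        with open("oa_loop.pcm", "wb") as f:
            f.write(audio)


if __name__ == "__main__":
    asyncio.run(main())
```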
Gavin Purcell: I think we should just listen to it very quickly here before we go. Go ahead, let's hear it. O-A, O-A, O-A, O-A, O-A... I mean, don't you wanna see how long it goes for? O-A, O-A... oh, okay. Okay. That's good. That's good. [00:59:00]
Gavin Purcell: Now do it again, but faster.
Gavin Purcell: Here we go. O-A, O-A.
Gavin Purcell: Okay. Thank you. I would be so mad at you, Kevin. If I were her, I would be so mad. I know. April, we're sorry. We'll see you all next week. See you all next week. Thank you for joining us. See you all next Thursday. Bye everybody.