OpenAI's New o3 & o4-mini Are Better, Cheaper & Faster, New AI Video Models & More AI News

OpenAI’s o3 and o4-mini are here—and they’re multimodal, cheaper, and scary good. These models can see, code, plan, and use tools all on their own. Yeah. It’s a big deal.
We break down everything from tool use to image reasoning to why o3 might be the start of something actually autonomous. Plus, our favorite cursed (and adorable) 4o Image Generation prompts, ChatGPT as a social network, the old (Monday) news about GPT-4.1 including free Windsurf coding for a week!
Also, Kling 2.0 and Veo 2 drop as new AI video models, Google DeepMind is using AI to talk to dolphins, NVIDIA faces new chip restrictions, and Eric Schmidt says the computers… don’t have to listen to us anymore. Uh-oh.
THE COMPUTERS HAVE EYES. AND THEY MIGHT NOT NEED US. STILL A GOOD SHOW.
Join the discord: https://discord.gg/muD2TYgC8f
Join our Patreon: https://www.patreon.com/AIForHumansShow
AI For Humans Newsletter: https://aiforhumans.beehiiv.com/
Follow us for more on X @AIForHumansShow
Join our TikTok @aiforhumansshow
To book us for speaking, please visit our website: https://www.aiforhumans.show/
// Show Links //
O3 + o4-MINI ARE HERE
LIVE STREAM: https://www.youtube.com/live/sq8GBPUb3rk?si=qQMFAvm8UmvyGaWv
OpenAI Blog Post: https://openai.com/index/introducing-o3-and-o4-mini/
“Thinking With Images”
https://openai.com/index/thinking-with-images/
Codex CLI
https://x.com/OpenAIDevs/status/1912556874211422572
Professor & Biomedical Scientist Reaction to o3
https://x.com/DeryaTR_/status/1912558350794961168
Linda McMahon’s A1 vs AI
https://www.usatoday.com/story/news/politics/2025/04/12/linda-mcmahon-a1-instead-of-ai/83059797007/
GPT-4.1 in the API
https://openai.com/index/gpt-4-1/
GPT-4.1 Reduces The Need to Read Unnecessary Files
OpenAI Might Acquire Windsurf for $3 Billion
ChatGPT: The Social Network
https://x.com/kyliebytes/status/1912171286039793932
New ChatGPT Image Library
4o Image Gen Prompts We Love
Little Golden Books
https://x.com/AIForHumansShow/status/1912321209297191151
Make your pets people
https://x.com/gavinpurcell/status/1911243562928447721
Barbie
https://x.com/AIForHumansShow/status/1910514568595726414
Coachella Port-a-potty
https://x.com/AIForHumansShow/status/1911604534713192938
Ex-Google CEO Says The Computers Are Improving Fast
https://www.reddit.com/r/artificial/comments/1jzw6bd/eric_schmidt_says_the_computers_are_now/
Kling 2.0
https://x.com/Kling_ai/status/1912040247023788459
Rotisserie Chicken Knight Prompt in Kling 2.0:
https://x.com/AIForHumansShow/status/1912170034761531817
Kling example that didn’t work that well:
https://x.com/AIForHumansShow/status/1912298707955097842
Veo 2 Launched in AI Studio
https://aistudio.google.com/generate-video
https://blog.google/products/gemini/video-generation/
James Cameron on “Humans as a Model”
https://x.com/dreamingtulpa/status/1910676179918397526
Nvidia Restricting More Chip Sales To China https://www.nytimes.com/2025/04/15/technology/nvidia-h20-chip-china-restrictions.html
$500 Billion for US Chip Manufacturing https://www.cnbc.com/2025/04/14/nvidia-to-mass-produce-ai-supercomputers-in-texas.html
DolphinGemma: AI That Will Understand Dolphins https://x.com/GoogleDeepMind/status/1911767367534735832
Jason Zada’s Very Cool Veo 2 Movie
https://x.com/jasonzada/status/1911812014059733041
Robot Fire Extinguisher
https://x.com/CyberRobooo/status/1911665518765027788
===
Gavin Purcell: [00:00:00] All right, Kev, the day we've been waiting for is here. The o3 models are coming out. Uh, we've been waiting for this for a bit. Last December is when these models were announced, right?
Kevin Pereira: So I am so full from the chocolate, from my OpenAI advent calendar, Gavin. Every day I wake up, I peel it back, rip the foil off my Sam.
Gavin Purcell: Where did you find that? The Dollar Store had those still, like, in February of this year?
Kevin Pereira: Yeah, I'm a Dollar General shopper, and surprisingly, they are super into large language models there. Um, no. What did we get, Gavin? We got o3. We got o4-mini. Yeah, it's pretty cool. And we got GPT-4.1, but that feels like it was, like, years ago. I know. It was on Monday.
Gavin Purcell: It was Monday. Well, let's start with the o3 models, because the o3 model update is the biggest deal. o3 is the state-of-the-art reasoning model from OpenAI. Uh, this is going to come out to Pro and Plus users today. There's a couple big things going on here.
The first and foremost thing to know is that this is just a much better model, like [00:01:00] across the board. We joked about the charts in the beginning, but the benchmarks are all out there. We'll put some of them up here. Just know that this is a significant step towards better performance, especially in math and science.
As somebody who hasn't done a lot of vibe coding myself, and I know you have, I think the coding side of this could be really interesting. I also think the other thing to be aware of here, and the thing that's gonna make the most news probably, is the fact that they do multimodal reasoning.
So Kevin, uh, shall we describe and try to, uh, translate that for the normal people in our audience to understand what that means?
Kevin Pereira: So listen, they announced it, and the blog posts and the news articles ran with the headline that these models can think with images. So that is both input and output. And if you're saying to yourself, well, haven't I seen that before? Sure, maybe you have, because there are other models that can do it, but this one does it better. You saw the charts of the charts, right? We just talked about that. I found the charts of the charts.
Gavin Purcell: Yes.
Kevin Pereira: So this means you can feed it a, uh, a [00:02:00] chalkboard sketch, or a PDF that is upside down and blurry, by the way; something they point out. Not a problem for this, it can sort it out. Um, just as well, you could feed it a screenshot from your desktop, or computer code, or whatever. It can take the imagery in, it can intelligently sort it, even if it's blurry or low quality, and then it can spit out a result. And that result, by the way, can also be image-based.
So you can get, uh, your results in the form of, like, a paper that is synthesizing a problem you've asked it to think about, and it can also provide you some charts, yeah, or some illustrations to go along with it.
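(For the curious, here's roughly what that looks like in code. This is a minimal sketch, not an official OpenAI example: it assumes the openai Python SDK, the chat-completions image-input format, and "o3" as the API model name; check OpenAI's docs before leaning on any of it.)

import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# Encode a local photo, e.g. that upside-down, blurry notebook shot.
with open("notebook.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Send text and image in one message; the model reasons over both.
response = client.chat.completions.create(
    model="o3",  # assumption: the reasoning model's API name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "This notebook photo is upside down and blurry. What does it say?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)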
Gavin Purcell: So the demo they showed today was very technical, and there's a blog post, if you're interested in diving in on that, that points out what Kevin was talking about. But this idea of being able to flip over and read copy: they had a very interesting thing in there where basically they're almost, like, spying on somebody else's notebook from afar. Like, it looks like they snapped a picture from the other side of the table. The notebook's upside down, you can see the writing is upside down, and they say, can you tell me what this says? And it goes through the process. In this [00:03:00] blog post, you can see the whole steps of it: flipping it around, learning about it, and then it produces an image back that shows what the words say.
That is, like, a real big advance. And you can start to think and extrapolate out, like, oh, I can do things now where it's not just putting a picture in and getting a Studio Ghibli image back, which is still fun; you could do all sorts of real-world stuff. I will also say, Kevin, part of me thinks that 4o Image Gen might have a little bit of this baked into it for that exact reason, right?
Kevin Pereira: Or this might have a little bit of 4o Image Gen baked into it. Either way, we know that the stated goal is to unify all of this under the banner of GPT-5 in the coming months, where I, and everybody listening to this, will no longer have to hunt on a dropdown to find the right model for the best use case. Please. It'll just do it. Yeah. And Sam has even expressed, please make fun of me for that. So yes, Sam, we are here to do exactly that. But this gets me excited for so many reasons. I love a near future where I can snap a photo and say, like, what is John Carmack thinking, with me and the blinds being in [00:04:00] the way, you know, only getting a little bit of his face through that.
Gavin Purcell: The answer is he's not thinking about you at all, Kevin. He's thinking about, he's thinking about intelligence.
Kevin Pereira: We'll wait for OpenAI to weigh in on that, Gavin, thank you. But number two: the future of augmented reality, for these spectacles that we will all be walking around with. The notion that the universe is always going to present whatever you're looking at in a clean, easy-to-read, right-side-up way, not tilted, not blurry, not with a glare on it? Like, that's not the real world. Yeah. So models becoming more capable at deciphering, uh, problematic imagery, that becomes really, really interesting.
Gavin Purcell: Yeah. You know, a couple other quick things about this: it is gonna completely replace the o1 models, which is interesting. Mm-hmm. Although they did say, if you're an o1 Pro user, they don't have an o3 Pro yet, so that'll come a little bit later.
Um, there's also a pretty big coding update. This is a very technical thing, but they are basically dropping an in-terminal coding agent. Yes. Yes. That allows you to directly use it within coding terminals. Is [00:05:00] that right?
Kevin Pereira: That's right. This is a competitor, I think, to, like, Claude Code and some other offerings out there, but this is OpenAI dipping their toes directly into the "hey, developers, work with us on your codebase" waters.
First of all, it's called Codex. Uh, it is in fact a command-line, uh, agent. It is open source, which is different than others out there. So you can download it, you can modify it, you can help improve it with a community. And they showed off some interesting examples of it. It can run in a full-auto mode. So if you wanna develop something, maybe you wanna make a website for your business, or maybe you want to copy Photo Booth, Apple's Photo Booth, in one shot, which is something that they did. They gave it a photo of Photo Booth. If you're not familiar, Photo Booth fires up your webcam and then lets you pick from all these different filters and effects, which run in real time on your video feed, and you can do some cute photos. This was all the rage back in, like, what, the early aughts? Yeah, this was pre-Snapchat-filter. But the point is, [00:06:00] they take a screenshot of what Photo Booth looks like, along with a few sentences of description, and they say, recreate this in the browser. And in one shot, it does it. It fires up the camera, you have all the effects, there's your Photo Booth. So when we talk about vibe coding and dipping toes into those waters, doing it through a command line is, uh, nerdier. Yeah, sure. It's more efficient, but the barrier to entry is a little bit higher than some of the other tools we've seen, which are arguably still a high barrier to entry. So, um, we'll talk a little bit later about stuff OpenAI might be doing, uh, to get everybody in the fold, but this was a cool announcement. It's a little nerdy, yes, I geek out over it, but I'm glad that they're moving into this. But to round out the o3 stuff, you know, "at or near genius level," Gavin, was something that you pointed out to me. Like, Sam Altman retweeted this.
Gavin Purcell: Yeah, so this is from a doctor who has been part of this conversation and has been using o3 for a bit. In fact, I think when o3 was announced, he'd already had access to it. This is Derya Unutmaz, MD, and I [00:07:00] apologize, Derya, if I'm destroying your name there. But one of the things I thought was interesting from his tweet, and this is what Sam retweeted, is this. He said: when I throw challenging clinical or medical questions at o3, its responses sound like they're coming directly from a top subspecialist physician: precise, thorough, confidently evidence-based, and remarkably professional.
He also says it never hallucinates, and its new agent-style tools effortlessly handle multi-step tasks. Now, this sounds like a hypey guy, but this is a doctor who is actually spending time with this tool. And I do think the takeaway from this, you know, there was a great story that came out in The Information a couple days ago, which kind of leaked a little tiny bit of some of this stuff.
One of the very first things Greg Brockman said in the live stream today was that this is going to produce, quote, "good and useful novel ideas." I do wanna zero in on the word novel, because novel in this instance means new, right? Things that people haven't figured out before. And that's the [00:08:00] thing that reasoning conceivably can get us to.
Now, in the past, lots of people have said, look, AI is never gonna come up with anything new. There are people out there now, in the real world, saying they are using this and it is actually driving novel ideas. In fact, Tyler Cowen, the very famous economist, just yesterday said that when the next models come out, he believes they will be smarter than him.
So that is something interesting. I think we have to keep in mind, from a real-world perspective, that until we get these in our hands and we try them and do stuff that isn't hardcore coding- or math-based, we won't know for sure. But Kev, I'm really excited to try, like, weird creative stuff with this too, and see what it can do. Like, Ethan Mollick has played with this, and he has some good examples of stuff he's tried, including, like, documenting a battle in a space sci-fi epic. But that feels to me like where we're gonna get our hands on it and just see: what can o3 really figure out, from a novel standpoint?
Kevin Pereira: It's hard to even wrap the little brain around, but, like, have you thought about what happens when we wake [00:09:00] up one morning and these models are literally smarter than you and I, again?
Gavin Purcell: I think they're already smarter than you and I, Kevin.
Kevin Pereira: I think we're... and just to level-set: yes, this is a comedy show. Gavin is just goofing. But yeah, it's hard to think about. It's hard to think about, but maybe by 2037 we'll be there. The also-ran for the day, which should be exciting, uh, is o4-mini.
Gavin Purcell: Hey, it's not an also-ran; it's still very cool, right? It's the first o4 model.
Kevin Pereira: It's the first o4 model. It's efficient, it's cost-effective, uh, designed for speed and scalability. But what does this mean in the context of the o3 release? Now, Gavin, which one am I supposed to love? I can only pick one favorite.
Gavin Purcell: Well, it really depends on your use case, right? o4-mini is probably gonna operate a lot like o3-mini did before, and a lot of people who were coding found that very useful. Some people did. It's also cheap, right? And it can run stuff in a much quicker way. I think we're not gonna know what it's gonna be great for until we really dive in on it ourselves. But that feels like the next step. What's so weird to me, Kev, is that they released this [00:10:00] thing, o4-mini, and yet they didn't talk about an o4 model at all.
So it does feel like a weird naming thing again. It's like, is this really o4-mini, or are we seeing, like, o3-mini-plus? You know what I mean? That's a hard thing to know.
Kevin Pereira: I'm struggling here because I wanted to go, where's o-Yeah? And that's the Kool-Aid model.
Gavin Purcell: That's a pretty good idea. We should make that.
Kevin Pereira: I wanna go to ChatGPT and create the "Oh Yeah!" model. Sam Altman just released an "Oh Yeah!" mini, and it's just a tiny Kool-Aid Man bursting through a single brick.
Gavin Purcell: Speaking of that, if you haven't been watching The Studio on Apple TV, it's great, and there's a very funny Kool-Aid subplot to it.
Um, let's talk quickly about GPT-4.1, Kevin, before we move on. Super quick: this came out on Monday, but what was this? It's mostly in the API, right?
Kevin Pereira: God, Monday was, like, a year away, but yes, this came out on Monday. It is API access, meaning if you go to ChatGPT and try to interface with it, you can't, although they did say that a lot of the stuff from 4.1 is actually making its [00:11:00] way into the currently available 4o model there.
So OpenAI is doing a lot with their models and their naming and folding in features. But basically, this is a model that is very capable but specializes in code generation. And look, we love to talk about benchmarks, 'cause line go up, and that's an easy way to show, a lot of the time, that these things are getting better.
But one thing that's not exactly benchmarked right now is the vibes, as in the real use cases of these things and the way they feel to interact with. And during their, uh, unveiling of 4.1, they had one of the Windsurf creators on. Windsurf is a program that lets you create code using the power of AI and agents.
And he had this to say; I'm just gonna play this real quick for you.
OpenAI: What we actually found was GPT-4.1 has substantially fewer cases of degenerate behavior. Maybe a couple examples here: we found that GPT-4.1 reduces, uh, kind of, the number of times that it needs to read unnecessary files [00:12:00] by 40% compared to the other leading models, and it also modifies unnecessary files 70% less than the other leading models.
Kevin Pereira: It got through puberty; it's no longer degenerate behavior. Um, and those numbers, yes, they're bigger numbers, but these are the important ones. Because if you start to, uh, dabble with these tools, and again, we encourage everybody to, if you start making, uh, a program, making an app, making a cell phone game, whatever it is you wanna imagine in this world: the more times it calls files unnecessarily, the more cost, the more time it takes, and the more error-prone it can be by modifying a file that doesn't need to be modified.
So these quick little statements cannot, I think, be overstated; these are a big deal. And anecdotally, I took to 4.1 immediately and threw a problem at it that both, uh, Claude 3.7 and Google's Gemini 2.5 Pro, the state-of-the-art models from the competitors, had struggled with: a tiny little error in an app [00:13:00] that you and I are working on. And 4.1 immediately and very quickly recognized the solution, wrote the code, and it worked. So that's amazing. This is purely anecdotal, but that was a nice thing to have happen. These things are getting better every day.
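(Since 4.1 is API-only, trying it means a few lines of code rather than the ChatGPT model picker. A minimal sketch, assuming the standard openai Python SDK and the "gpt-4.1" model name from OpenAI's announcement; not an official example.)

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# 4.1 specializes in code generation, so hand it a small coding task.
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user",
         "content": "Fix this loop so it prints every item, including the last: "
                    "for i in range(len(items) - 1): print(items[i])"},
    ],
)
print(response.choices[0].message.content)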
Gavin Purcell: Little breaking news, Kevin. Speaking of Windsurf, uh, Bloomberg is now reporting that OpenAI might be looking to buy Windsurf for a cool $3 billion. Kevin, 3 billion.
Kevin Pereira: That's like the screws on the Stargate. Exactly. That's nothing to them.
Gavin Purcell: It's so funny to me that, like, that's it. So, interestingly, this literally broke just a few minutes ago. Um, you know, Windsurf is one of the, I'd say, three or four big vibe coding platforms, really, like, hardcore AI coding platforms, alongside Cursor, or, um, Lovable, or these others; I think Bolt is another one. But if they brought Windsurf internally, that's a really interesting thing to think about: what you would be able to do with a tool like o3 or o4-mini, being able to integrate that [00:14:00] directly into that tool and call it ChatGPT Code or something like that.
So that's a big thing that just happened as well. Uh, it just keeps getting crazier and crazier, and so it looks like 3 billion of that $40 billion raise might be going to Windsurf.
Kevin Pereira: Very quickly: if you want to try vibe coding, both Windsurf and Cursor are programs you can download. Uh, watch some YouTubes on how to get started with them, but they're all offering free 4.1 access, uh, for a week, basically.
So you can get in there, jam as many ideas as you want into the machine, learn how it works, uh, you know, fail epically, and then fail again, and then eventually grind your way to a solution.
Gavin Purcell: Kevin.
Kevin Pereira: Oh yeah.
Gavin Purcell: So I'm super excited to learn more about these tools myself. I'm gonna be diving into them, and, you know, when it comes to, like, learning about them, it's clear that America is on the forefront of helping kids understand this stuff. Just recently, uh, someone from the American government was at an educational conference and said this:
Linda McMahon: ...letter or report that I heard this morning, I wish I could remember the source, but that there is a school system that's gonna start, um, [00:15:00] making sure that first graders, or even pre-Ks, have A1 teaching, you know, every year, starting, you know, that far down in the grades. And that's...
Kevin Pereira: Hold on. I'm, I'm sorry, Gavin. I'm sure that was just... that was Linda McMahon, who's our Secretary of Education.
Linda McMahon: Yes.
Kevin Pereira: That was... I'm sure there's more to this. I'm sure that was just a, um, a, you know, a...
Linda McMahon: ...just a, that's a wonderful thing. Kids are sponges; they just absorb everything. And so it wasn't all that long ago that it was, we're gonna have internet in our schools. Woo! Now, okay, let's see A1.
Kevin Pereira: And how can someone please get Linda McMahon a serif font? We can't have this happening in the future.
Gavin Purcell: That would make life so much easier. Let's just decide the font that we write the word AI in going forward. And Kevin, you know, another font that I really love is the AI For Humans font, because you need to be able to look at that font across the YouTube channel, across the podcast, across our website, and across our newsletter. Please subscribe to our YouTube channel.
If you're not watching on YouTube, go there. If you are watching this [00:16:00] on YouTube, check us out in audio. If you're listening and on our YouTube, you should definitely be on our website, and also reading our newsletter, which comes out twice a week. Kevin, we do it all for the people at home, and we actually couldn't do it without you. So please share, when you get a second, one of the things you love that we make.
Kevin Pereira: Hey Gav. Every week I see somebody in the YouTube comments going, like, "This is the best part of my week. I love what you guys do. I love this show. How does this channel not have a hundred billion views?"
Gavin Purcell: Kevin, that's me. That's me saying that.
Kevin Pereira: Oh, okay. Well, then you're the blank, 'cause the answer is you have to tell people about it.
Gavin Purcell: That's right. We are too lazy and resource-constrained.
Kevin Pereira: That's right. Exactly. Please and thank you.
Gavin Purcell: Yes. And if you do wanna help us, there's a tip jar in our Patreon that you can drop a few bucks into. But as always, we thank you for watching us. We have a ton of fun making this, and thank you so much.
All right, Kev, we should move on to the next part of our OpenAI discussion today. There's a couple stories here that kind of [00:17:00] tie in together, and then we're gonna get into some really cool 4o Image Gen prompts we wanna share with everybody. First and foremost, there was a story from The Verge that ChatGPT is considering launching a social network, which, you know, would really be around images.
So what's your first thought on this? What do you think about this?
Kevin Pereira: The tic-tac-toe game between multi-billionaires is the most fun, uh, to watch when it doesn't affect the air I breathe or the water I drink. I mean, there's a bunch of scenarios where it's not fun, but this is a great shot across the, uh, Elon Musk, or, uh, Twitter bow, 'cause now I'm back to calling it Twitter, Gavin. That was a quick, quick return. Listen, Twitter and Meta have these realtime firehose streams. Now, OpenAI has that, but only in the context of people plugging their thoughts into ChatGPT. They don't have the conversations, the realtime reactions of human to human. They're not the breaking-news [00:18:00] place, because, you know, their models are a little dated unless they can search the web. So I think their company needs access to realtime conversations, and they need everybody doing these things on their platform, so that they can train from it the way that Meta and X can.
Gavin Purcell: You know, it's interesting you say that, 'cause I never thought of it that way, because I think what they're talking about, for now, is that it's gonna be image-based. But that is a good point. The other side of this that I've been thinking a lot about, as somebody who's been doing a ton of 4o image generation, is: I would like to follow the feeds of people who make things, right? Yeah. And I know people are out there saying, like, what a stupid idea, we don't need another social network. But this is an actual organizational idea that might make sense, because, and we're gonna talk about it in a second, there are a couple really fun prompts that I found from users on Sora.
I would like to be able to follow those people. I can like an image on Sora right now, and if you have an image you've created, you'll see how many likes it got, but there's no sort of, like, public display. And actually, somebody smartly out there said maybe this is a way for OpenAI to start paying creators, because, as we know, every social [00:19:00] platform, whether it's TikTok or Twitter, has a way to share in content-creation payments. They're not giant, but that might be a way to open the door to sharing some of the money they make from these products with the creators themselves.
Kevin Pereira: Yeah. I'll caution that, you know, um, I was only able to buy three Cybertrucks with the income that I got from my custom GPTs. Yeah. Which they told us all would make us very wealthy. Yeah. Uh, when we gave them all the data and made these custom chat experiences. I think images are the tip of the spear. Um, I think this eventually grows to be, if they go this direction, a fully fledged social network. But yeah, um, I similarly want to follow creators, and I want to remix their prompts and do all of those things.
And right now, the experience of, uh, Sora almost feels like that on the main page. Yeah. But once you dive any deeper, that falls apart. There's no comments, there's no rankings, there's no whatever. But on the traditional-social-network side of it, Gavin, like, I don't know if you feel this way, but I just assume now, by default, that the conversations I see on X, and [00:20:00] even sometimes on Threads, though probably less so, are bots. I assume it's a bot by default, right? And you don't feel that way... I know you don't feel that way.
Gavin Purcell: It's interesting. Like, it depends on, I guess, what your interactions are and how big your handle is, because on X you have a pretty big handle, so you're probably getting quite a few bot engagements, and maybe for me it's not the same. But I'd be really curious to kind of dive in on what you see as a bot response and what you don't, 'cause I just don't really have a sense yet of what they feel like and how real they are, because I feel like I should be able to tell what a bot is. Um, but that said, either way, I think social networks have that problem going forward. To me, the biggest thing is about discovery, because, like I said, we're gonna get into these prompts and talk about what made them interesting, and a lot of that is discovery, right?
So, to dive into this: the Sora homepage, if you click on the images tab or the explore tab, is a place right now where I have discovered all sorts of fascinating [00:21:00] prompts, right? And the biggest one, like, we talked about one last week that I'd found there and shared, but we're gonna go through a few more now. You know, this is how creativity gets discovered, right? It sometimes is remixing somebody else's thing and then, like, being able to bring it to the next level. On Sora, you can go onto that homepage and find these things.
And I want to talk about the pets-to-people one first, probably, yeah, because this one went fully mainstream, right? Like, so mainstream that it ended up in the New York Post, which is hilarious to me. If you haven't seen this, this is where you take a picture of your dog and you can make them into a human.
I shared on X a very funny picture of my dog, Ollie, who turned into what looked like the bassist from Phish, which made me laugh; it's, like, this kind of guy with a goatee. What's interesting is, like, it kept the same guy, 'cause I took a picture of him looking out, and then one looking down at him when he was up by the table at my dining table, and it's just the guy's head right there. So this went fully mainstream. But then you were able to flip this in an [00:22:00] interesting way.
Kevin Pereira: Yeah. And that's why we talk about being inspired by something and then remixing it. I took that same prompt, I mean, I transformed my little Doctor Wesley into a human, and it made me like him a lot less, so I stopped that practice. Yeah, it was, it was nightmarish. Um, but I flipped the prompt, basically, and said: transform this human into an animal character, enhancing all of their human features with animal traits, like fur, ears, whiskers, muzzle, tail, et cetera; preserve the personality, the recognizable traits. It was the same prompt that everybody was sharing to turn their pets into people, and I started turning my friends into animals. Then I started turning famous, like, iconic movie scenes into animals. I did the Goodfellas poster, uh, which chose different animal breeds for everyone. I did Uma Thurman; it turned her into a, a smoking, smoldering black cat. Our favorite Guy Fieri gets turned into, like, a clearly excitable pup who, you know, has issues with marking; I can see it in his little doggo eyes. And then I turned the Queen cover into [00:23:00] cats. Like, again, taking the same basic prompt and just flipping it slightly. And I had so many friends saying, well, how do you, how do you do this, though? What, what is the path? And usually that means downloading some models into Pinokio, yeah, or connecting nodes in ComfyUI, or signing up for a bizarre service. And the reality with this is that you can just go right to chat.com, which takes you to ChatGPT, and you can copy and paste the prompt, which we'll put in our show notes. You can just put that prompt in there and click and drag in a picture of your pet, or of your best friend that you wanna turn into your best pet friend. Like, it's that easy.
Like it's that easy. Now
Gavin Purcell: I want to create a talk about these images of these little golden books. So if you grew up in the eighties or nineties, you probably have these little books that were like kids books. I saw a very funny, uh, Sora Post from a, a guy named OSU fans 77. So like clearly a Ohio State guy there.
He had created a prompt for Freddy's Treehouse, and the prompt is this. We'll share it on the screen, but if you're just listening, create a cover for a little golden book from 1979. It's [00:24:00] used and worn. The title is Freddy's Treehouse and shows Freddy Kruger from Uh, nightmare on Elm Street. And in this case it was like in the triage, having fun.
I basically took that prompt and just used it, but changed a few things in it. I changed the title, I changed the character, I changed the movie name, and then I said what different things it does. The thing that this added, which was kind of fun at the end, is he also says. At some funny exposition at the bottom.
So what I then created was a series of other movie, uh, villains in these same scenarios. And I added Pinhead Makes a Friend where it says at the bottom, he's a troublemaker from another dimension, but he sure is swell. And it's like Pinhead kind of looking sad with a nice guy next to him. Um, leather face got left out, which is like the character from Texas.
Chainsaw. Massacre. Yeah. And then Jack's Bad Day from The Shining. Basically, what's cool about this is like you can still find your way to do really creative stuff with known ip. If I think you're doing it in a transformative way, now who knows what the rules will be like on this going forward, but this is like [00:25:00] satire, right?
And we talk about IP rules. You and I both know these a lot. There's an argument you can make here that this is satire. You're taking a, of course horror, a horror movie, and you're putting it in a very childish situation. So there are ways to use this
Kevin Pereira: ...creatively, to use an IP? If you're a billionaire, if you're a Jack Dorsey or an Elon Musk, you would say that IP shouldn't even exist, Gavin.
Gavin Purcell: That's true. It was Jack Dorsey who said there should be no IP, and Elon echoed it, right? Which is hilarious to me. Yeah. Two billionaires should not be talking about, uh, creative people and how they make their money, I feel like. By the way, we do have a split audience on this, and there are people in our audience who believe that IP shouldn't exist, which I understand.
But as somebody who's creative, it is a tricky thing to sell creative people on the idea that what they might create and own does not have some sort of value, because that is how creative people might be able to make money. You don't wanna live a gig-economy life your whole life if you're creative in some form.
Kevin Pereira: But the only way to wash away the inevitable onslaught of lawsuits that are going to happen in the next few years is to [00:26:00] just make it a totally forgivable sin.
Gavin Purcell: True, true.
Kevin Pereira: There's no IP! But anyway, I digress. Let's talk about Barbie, 'cause you did a fun exploration.
Gavin Purcell: Another one that I did, which was really fun: on the Sora homepage, I found somebody trying to recreate a Barbie image. So the prompt is: a photo of a colorful 1970s full-page ad for a Mattel toy, quote, "Barbie Dream...", and then you put whatever the dream is in the next part, "...realistic, with sales text and logo in the bottom corner, in-use photo." And so I created Barbie's Dream Las Vegas Casino, and I created Barbie's Dream Senate Hearing, and my favorite one was Barbie Dream Wall Street Boiler Room, where she's, like, trading, and there's another little kid that I put in there where the kid has a phone and Barbie's got a phone, and the tagline is "Now she can make it to the top in the fast-paced world of high finance." I didn't write that; the AI wrote that. And it's just a very cool way of seeing what the AI makes possible. And then, like last week, the whole thread became everybody else using it in fascinating ways, right?
Like, somebody created [00:27:00] the, uh, Luke Skywalker's tauntaun carcass, a Star Wars toy where you see the actual carcass and Luke inside of it. But, like, just dumb stuff like this adds so much joy to the world of creative stuff.
Kevin Pereira: I, I stumbled across an "unremarkable iPhone selfie" prompt that was yielding these blurry, almost disposable-camera-esque images.
And so I used it to imagine what celebrities look like at Coachella while they're waiting in line for the porta-potties, as well as, uh, hallucinating a Guy Fieri trip to Tokyo. I mean, just super fun. And again, now it's as easy as taking the prompt, adding your little special sauce to it, remixing it, and there you go.
And you can combine prompts, too. So if you want a pet selfie where your pet is a human in Tokyo or at Coachella, have at it. Mash it up.
Gavin Purcell: Or you can take Kevin Pereira and put him on his first day of school. Kevin, look at this. Oh, it's Kevin's first day of school!
Kevin Pereira: Don't do that. Don't do that. Actually, you can't. You can't. You can't do that. Don't do that.
Gavin Purcell: Kevin, let's do a hard [00:28:00] swap, from very fun little kids to the former head of Google telling us that we might all be cooked. You are the one who put this in here. Tell me what this is and why we're talking about it.
Kevin Pereira: So this is Eric Schmidt, um, as you said, former CEO of Google.
He says that within six years, these computer minds will be smarter than the sum of humans. And this is something that you and I have talked about. People debate the timelines, but the timelines are sort of narrowing in on the predictions that people had a year or two ago. There's a little flexibility, but basically, he says the computers are now self-improving.
Mm-hmm. And I think DeepMind kind of showed this: they used AI to come up with a new reinforcement learning algorithm that was more optimal for reinforcing its own learning. Yeah. So there's this flywheel happening, where we know that at some point the machines are gonna get better at improving themselves, at a rate faster than the humans. Once that happens, the collective consciousness, the intelligence, of these machines is going to far surpass all of [00:29:00] humanity's. Here's what he has to say about this future, not only being inevitable, but how we are going to struggle to comprehend it.
Eric Schmidt: What happens when every single one of us has the equivalent of the smartest human on every problem in our pocket? But the reason I wanna, I wanna make the point here is that in the next year or two, this foundation is being locked in, and we're not gonna stop it. Okay?
Kevin Pereira: So that sounded like a threat, but also a promise: we're not going to stop it. So what happens when this does happen?
Eric Schmidt: This path is not understood in our society. There's no language for what happens with the arrival of this. That's why it's underhyped. People do not understand what happens when you have intelligence at this level, which is largely free.
Kevin Pereira: Did you hear that, Gavin? It's underhyped. We haven't been hyping enough.
Gavin Purcell: This is something that we've talked a lot about on this show, but it's also something I've been thinking a lot about: this idea that we are barreling fast into [00:30:00] a world that we do not recognize or do not see.
I have two teenage daughters, and there is some world where, by the time they're 30, this timeline will have passed. Now, if that timeline happens, which, you know, Eric believes, and a lot of the AI people believe, we are gonna be looking at a very different world. And again, as we talked about with the AI 2027 paper last week, the kind of science-fiction projection of where we could go.
Most important, I think, for everybody listening to this, is to just have an awareness that this is possible; that it is not impossible that in five years, ten years, we are living in a world where these machines are significantly smarter than us, and we really do have to start thinking, as humans, about how we fit into a world where that exists.
Okay. Kevin: Kling 2.0. We are talking state-of-the-art AI video. It's out. I have played with it. It is very cool. This is the [00:31:00] Chinese model Kling. That was a little over the top, I think, but just to be clear, it is very good. Kling 2.0, I would say, is the best image-to-video model that I've used to date.
There's some very cool stuff; some really big, um, AI video creators have used it. My experience with it is, it is the best, uh, in terms of motion. They also launched some pretty interesting things with it: an update to Kolors, uh, their imaging model, which allows you to, you know, create images within Kling itself.
It is not cheap, though, Kevin, I will say. To use Kling 2.0, a five-second output is about a hundred credits. Now, that's only about 81 cents if you have the most expensive premium plan, which is, like, I guess, 70 bucks a month. Not cheap, but you get 8,000 credits for those 70 bucks.
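(A quick back-of-envelope on those credits, as a sketch. The plan price, credit allotment, and per-clip cost are the figures quoted above, not confirmed Kling pricing; straight division actually lands closer to 88 cents per clip than 81, so treat the exact tier details as assumptions.)

PLAN_PRICE_USD = 70        # assumed: most expensive premium plan, per month
PLAN_CREDITS = 8_000       # assumed: credits included in that plan
CREDITS_PER_CLIP = 100     # assumed: one five-second Kling 2.0 generation

cost_per_credit = PLAN_PRICE_USD / PLAN_CREDITS      # $0.00875 per credit
cost_per_clip = cost_per_credit * CREDITS_PER_CLIP   # ~$0.88 per clip
clips_per_month = PLAN_CREDITS // CREDITS_PER_CLIP   # 80 five-second clips

print(f"~${cost_per_clip:.2f} per clip, {clips_per_month} clips per month")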
Kevin Pereira: How many rollover credits do I get? And then do I have to pay for roaming credits, Gavin? I do hate a Credit Platinum Plus environment, because it does feel like we're in this world. Uh, the credits thing is kind of nauseating, especially when it's credits on top of a subscription tier. But I digress on all of that. The point is, like, it is a powerful tool, even though it may be expensive. And did you see Taylor Swift in the Severance scene? Did you see it?
Gavin Purcell: Yes, yes, yes, I did see that. Tell us what we're looking at here, for the people that are just listening to this.
Kevin Pereira: Yeah. This is, you know, the ability for AI to replace a character in a scene when it's given a basic image.
So, there's an iconic scene in the new season of Severance where Adam Scott's character is looking about, and the camera's whizzing around, going ahead and, uh, circling him as he's running down a hallway. And they just done swapped in Taylor Swift, and it looks pretty good.
Gavin Purcell: Yeah, I was pretty impressed.
Um, overall. I did run my knight made of rotisserie chickens through it, and the version of that actually looked pretty good. If you take a look, what's funny is, it's a ten-second clip: he runs through a grocery store, kind of gets dissuaded by one guy, and he actually runs out of frame and then comes back into frame and continues running through the [00:33:00] grocery store, which is very cool.
So that was, I would say, a huge win. The kind of downside: I created a thing from an image of dogs and cats fighting what was, in some way, a war, and this is a very complicated image. And I said, like, animate this image and try to have the dogs kind of come over the cats. Clearly it messed up a few things, but that's asking a lot of this image.
One really interesting use case of this: I, uh, created a still in Sora of a Lego character walking across a tightrope. What it was really good at, and I think Kling really is successful at, is, like, a single character animated in a specific place. And if you watch this video I've just sent you, it's, like, good. It gets the kind of stop-motion thing of it walking, and then I had it fall off. But, like, you can see there's one little part where the foot looks weird; the rest of it feels like it almost could be a stop-motion animation still, right? You could cut around this.
Kevin Pereira: Yeah, no, you could cut around this believably. And if you're on the audio version of [00:34:00] this, I highly encourage you to look at this example on our YouTube. If you look at the building on the left, you see the reflection of the Lego hand, and a little bit of the pole, in one of the bricks. Yeah. Like, that was a phenomenal nail. The subtle sway of the buildings as the character walks along the rope; like, that's a physical modeling of the world that this Lego character exists in. So forgive the wonkiness of the foot and the pole as he blurs and falls.
It's impressive.
Gavin Purcell: I was shocked. I was really impressed by it. So the other thing that happened this week in AI video, Kev, is that Veo 2 is out for the public. Mm-hmm. So you can try it. It is limited in how many generations you actually get, but if you're a Gemini subscriber, supposedly it's in the app. I haven't been able to see it yet, but somebody told me you get, I think, a hundred gens a month, if that sounds right, which is actually not bad. And again, Veo 2 is very good. I would put Veo 2 up against this model, Kling 2.0, and Runway Gen-4. I would probably put Kling 2.0 at the top, but Veo 2 is very good. The other hard part about Veo [00:35:00] 2 for me, Kev, is that I asked it to create some stuff; I tried to get it to create the image of the war between cats and dogs, and it refused. It refused to create, uh, like, four other things for me. So I wonder if they're very much putting a filter on what it will make and what it won't make, because I think Google's a little more sensitive than, say, OpenAI would be about what people are doing with their tools.
Kevin Pereira: Well, OpenAI did just double their, uh, user base, basically in the course of a week, by taking the guardrails off of image gen. So maybe Google would wanna look to that: they've got really capable models, let people have fun with them. Obviously we have thoughts about using these various tools to make AI video, but what about someone known for traditional video who has thoughts on AI tools? Gavin, have you heard of James Cameron?
Gavin Purcell: Yes, I have, Kevin. And James Cameron, the director of a little movie called Avatar, has come out and is talking again about AI. And specifically, this quote is really interesting: he talks about the idea [00:36:00] of what the training source for art is. Uh, let's play this so people can listen to it.
James Cameron: The thing that I've been thinking about lately, 'cause a lot of the, a lot of the hesitation in Hollywood, and entertainment in general, are issues of, of, you know, the source material for the, for the training data, and who deserves what in copyright protection, and all, all that sort of thing. And I think people are looking at it all wrong.
Personally, I think we are all... I'm a, you know, I'm an artist. Anybody that's an artist, anybody that's a human being, is a model. You're a model already. Yeah. You know, you got a three-and-a-half-pound meat computer. You're not carrying all the training data with you; you're creating a model as you go through life, to process quickly, through that model, every new situation that comes up. And as a screenwriter, you have a kind of built-in ethical filter that says: I know my sources, I know what I liked, I know what I'm emulating. I also know that I have to move it far enough away that [00:37:00] it's my own independent creation. Right? So I think the whole thing needs to be managed, from a legal perspective, at: what's the output?
Gavin Purcell: Yeah. So that's the interesting thing that James is saying here, which I think is a slight shift in how we look at this. Meaning, he is saying the inputs, training on something, training on all this material, is not really that controllable. And also, it's what makes us us: we train on material, so it's weird not to let the computer do that. I mean, obviously that's a simplification of that argument, and there are much bigger legal issues. The thing that he's saying, though, is that the output, if the output looks like something that is too close to something else, that's where you start to think about where the violation happens. Now, one thing that does is it takes the burden off of companies like OpenAI and other AI companies, and puts the burden on the individual user. Maybe.
Kevin Pereira: Which, James Cameron is involved with one of those AI companies, just to be clear.
Gavin Purcell: Very good point. And also, this is kind of the basis of YouTube's big [00:38:00] copyright situation as well. So with YouTube, when you actually do things where you get a copyright strike, that's my fault, not the company's that allowed me to put that thing together, right? So this is transferring some of the burden onto the user, which, actually, I don't think is a bad idea. I'm still not sure how that will get enforced, or what the line is. Is the Ghibli-fication of images too close to Studio Ghibli? Or if you just use the style, or you just use characters from Ghibli, is that the violation? That's what we're not sure of yet. But I did think, and you're right, he is an AI investor now as well, that is an interesting point that Cameron is making.
Kevin Pereira: And in our extended-play four-hour podcast, we'll begin to scratch the surface of the tip of the...
Gavin Purcell: Somebody will want that. Somebody will want that.
Kevin Pereira: ...of those arguments. But, uh, somebody does want something from AI, and that's China. Yes, they want Nvidia chips. And it turns out, Gavin, they're gonna get their hands on fewer of them, because there are now new restrictions on some powerful new technology.
Gavin Purcell: So, you know, we [00:39:00] are not really a political podcast, but there are some big things happening in the world, and these are very important things for you to be aware of.
And I think it's worth diving in a little bit further. Both involve Nvidia, which really is one of the most important companies in the world right now, specifically because they provide the chips that power this AI revolution we talk about all the time. Two big stories happened, uh, with Nvidia. First and foremost, there is a new restriction on Nvidia selling specific chips to China.
Now, if you've been following the news, you know there's a whole lot of conversation going on about tariffs, uh, in, in a lot of bad ways; they have messed up a lot of things for a lot of people. In this instance, the conversation is really less about tariffs and more about whether or not America will sell these chips to, essentially, its largest rival, and if there's a way to, like, stop them from getting access to these things. Now, why would America want to do that? Well, there's a couple big reasons. One, geopolitically: if you control something, you might not [00:40:00] want to allow your geopolitical rival to get that same thing.
But two, I think, in part, this allows at least a possibility that we might be able to control, say, a rogue AI, or another AI from China that would do much better than us. Kevin, this did not do great things for Nvidia's stock. It knocked Nvidia's stock down about 5% yesterday when it was announced, because, essentially, from a business standpoint, this means that Nvidia might have fewer customers.
Kevin Pereira: Are they hurting for customers, though, by the way?
Gavin Purcell: Well, I think the big thing is, Nvidia can conceivably scale up to sell to as many people as will buy chips, right? And if you suddenly say that the biggest country in the world, from a population standpoint, is less able to buy those things, and might even start to say, "We don't want you, because you're restricting us," that does lose a lot of potential customers. The other thing that Nvidia did, though, which I think is important to realize, is that they also said, you know, hey, we're gonna make $500 billion worth of chips in [00:41:00] America.
So this is a big story, because people have talked about Taiwan being the main place where many AI chips are made, and when you think about China, and how close they are to Taiwan, and the kind of political backstory there, it is very possible that sometime in the future, and this is just a possibility, there might be conflict in that area. Onshoring a lot of chip manufacturing in America is probably a smart thing. So these two stories kind of popped up, and because of the o3 and o4-mini stories, they're gonna get blown over. But this is vitally important to how AI moves forward as well.
Kevin Pereira: It's just not as fun as turning your pet into a person.
Gavin Purcell: It's true.
Kevin Pereira: But that's okay. We gotta cover it all, including the things, Gavin, that you and I come across in our social media bubbles each and every day that stop us dead in our tracks and make us say, hey, I see what you did there.
Singer: Sometimes, yes! Rolling without a care, then suddenly you stop [00:42:00] and shout...
Gavin Purcell: Kevin, we have three very great things today for I See What You Did There. First and foremost, one of my favorite things I've seen in AI in a long time:
Kevin Pereira: DolphinGemma!
Gavin Purcell: DolphinGemma. DolphinGemma. I wanted to say it first.
Kevin Pereira: You said it. DolphinGemma.
Gavin Purcell: DolphinGemma is a new AI that is being, uh, created by Google for dolphin researchers, to help us understand what dolphins are saying.
We've talked about this idea for a while, of how language can be interpreted in different ways, so this makes a lot of sense. And in fact, the funny thing is...
Kevin Pereira: Dude, they got divers going down into the water with frigging Speak & Spells that are circuit-bent on their wrists, and they can push buttons and talk to dolphins and try to decode what they're saying.
That is just cool.
Gavin Purcell: Uh, yeah. You know, do you remember Star Trek IV? The movie, Star Trek IV, at all?
Kevin Pereira: I do not.
Gavin Purcell: Okay. So, [00:43:00] in Star Trek IV, one of the big storylines is they have to come back in time to talk to the whales of this era, to figure out what had happened, because somehow the whales, uh, are gone in the future, and they needed to be able to communicate.
This is a version of that; like, we are building that thing in the real world. And the other side of this, Kevin, that made me laugh: we are really not that far away from a pet translator. If this happens, you know, pretty soon you're gonna be hearing from your dog or your cat. Yes, little Wesley will be able to talk to you directly and say all the things that he's wanted to say forever, and probably put you in your place, I would assume.
Kevin Pereira: I just... like, I like Shark Tank a lot, and I already hate the next four seasons, when everybody is there with some sort of wrapper that lets you talk to your plants, talk to your pet, and chat with your Prius. Like, I just don't need that. But talking to dolphins is, in fact, cool. So I'm okay with that.
Gavin Purcell: What do you expect the dolphins to say? Like, if we finally understand them, what do you think they're gonna say first?
Kevin Pereira: Ooh. Um, "We love doing [00:44:00] tricks for salmon. If you could please do more of that."
Gavin Purcell: Dolphins don't eat salmon, Kevin.
Kevin Pereira: What do they... sardines? Doesn't matter.
Gavin Purcell: What do they eat? They like little... salmon are in cold water. I don't think dolphins are up there. They might be, thank you. Now, I might be wrong. I might completely be wrong; I fully admit it. I'm a human being, not an AI. Maybe dolphins do eat salmon.
Kevin Pereira: Or maybe that's what we'll find out: that they hate the fish selection we've been giving 'em. Maybe they really want PopCorners, Gavin. That's not the point. The point is, they don't wanna be in fish prisons. How about that? That's what it was saying.
Gavin Purcell: PopCorners, if you wanna send us a bunch of free stuff, we'll take it. I eat PopCorners. You eat PopCorners sometimes.
Kevin Pereira: You can keep me in captivity as long as you feed me all-natural white cheddar PopCorners.
Gavin Purcell: I like the kettle corn ones better. But okay, we're moving on. Next thing up: at TED this week, uh, Jason Zada, friend of the show, somebody who I know well, is the CEO of a company called Secret Level. If you remember, they made the Coca-Cola AI ad. He did a very cool thing where he took requests from the audience and then, [00:45:00] two days later, turned them into a really compelling short movie featuring a beaver, uh, an anthropomorphic beaver, that lives underwater and is waiting to kind of see his wife, and has, and this is gonna sound weird when I say it out loud, you have to go watch the video, has a relationship with a sock. It's very charming, if you're seeing it on our video right now.
Kevin Pereira: Don't we all, Gavin? Don't judge our love. I want a translator so I can talk to that sock and hear what it has to say.
Gavin Purcell: You definitely don't wanna talk to Kevin's sock. But in this case, the sock is very cute; it's got googly eyes. And you just see the power of what Veo 2 can actually do when it's in a storyteller's hands. And again, this is a two-day project, and I know people will think this is hyperbole, but, like, it kind of looks like those early Pixar, maybe even mid-era, shorts.
Kevin Pereira: I was gonna say, it could be a Pixar short, especially when you think about the early days of their mostly-tech-demo stories. Yes, yes. But the fact that you could feel something from this, the fact that it felt like a narrative; um, it was playing to the strengths of AI-generated [00:46:00] video, and it was all text-to-video. Yeah. They weren't even generating images and then transferring those into motion. So this was incredibly impressive. It was the kind of thing that would've taken a team months, yeah, and tens of thousands of dollars to render. So, I mean, uh, Jason and his team are definitely on top of it. It was a fun watch.
Yeah, everybody should watch it. Yep. And last but certainly not least: if you decide to render AI video, and everybody in the audience does, there may be fires at a server farm. But don't worry, we no longer need to send humans into the fray to extinguish them, because we've got shoulder-mounted robot fire extinguishers!
Kevin Pereira: Oh, please say that in, like, the eighties action-figure way it would be presented if you could buy it.
Gavin Purcell: Oh, wow! Shoulder-mounted robot fire extinguishers! I can't believe it!
Kevin Pereira: Whoa, these shoulder-mounted robot fire extinguishers are the coolest.
Gavin Purcell: This is pretty cool, though.
So this is, basically, you see a robot walking through a field, and on its shoulders it has, like, uh, fire-extinguisher [00:47:00] cannons, essentially, pushing out the fire-suppressant stuff. Kevin, though, the thing, when I watched this video, that it made me think about more than anything else is: what else could be mounted on those shoulders?
Well, missiles, for God's sakes. Rocket launchers and missiles. This is The Terminator; I've seen that movie, yeah, a billion times.
Kevin Pereira: This is how they're going to quell the future protests: as the humans are out on the street, they send one Unitree G1 with some, like, neurotoxin or sleeping gas or something, to just walk on through and spray the crowd down.
Like, yeah. Look, if this were Elon, he probably would've put flamethrowers on the shoulders and been like, "Ha ha, cool." Yeah. But here, I mean, this is helpful, but it does evoke the imagery of, yeah, they could shoulder-mount anything, uh, projectiles included. So, yeah, it's cool what they're doing, but...
Gavin Purcell: It's one of those things. I mean, listen, robots are fascinating, and eventually we're gonna get there; as we saw with the reasoning models at the top of the show, we're gonna get there much faster than maybe we originally thought. Kevin, this weekend I had a very interesting experience. We talk about some of the things we do with AI, and I wanted to quickly go through this to [00:48:00] help people understand.
We talked about Gemini 2.5 Pro last week, and if you go to aistudio.google.com, that is a place where you can actually play with a bunch of models. And I know this is another thing about Google: the Gemini app is their main app, like, that is their ChatGPT, but then aistudio.google.com is this place where they throw a bunch of tools together, right?
And, like, I don't know why they have two separate things. I don't understand it. Maybe it's more the kind of place where you can try things. Something that we talked about a long time ago, and I had never really dug into until this weekend, is the Stream tab on that page. And what Stream is, is it watches along while you're doing something; you can basically share a tab of your browser with Google while you're working.
I was working on something this weekend in Premiere Pro, and I do some editing, not a crazy amount, but I needed to learn how to make a specific type of mask, and I was getting frustrated. And I said, oh, I'm gonna try this; I'll just see if it works. I spun it up, [00:49:00] and it steps you, step by step, through everything. It sees your screen, you can highlight specific things, and it told me how to make this mask in real time. And I found it transformative. Like, I really think people need to try this particular thing. If you have Gemini, you can do it.
If you have a question about doing anything technical that can be seen on a website, it can tell you what to do, and then you can ask a question. Like, I had a thing with the mask where I said, hey, I pushed this button here; is that the right one? And it was like, no, that's the wrong one; you need to go to this thing.
There were a few moments of frustration, but it really did feel like the future of how AI can help you do stuff.
Kevin Pereira: Yeah. A lot of people have approximated this behavior, myself included, by grabbing screenshots and feeding them to ChatGPT or Cursor or whatever and saying, hey, here's this thing; how do I do it? And you get the step-by-step handholding. But this is like pair programming, where you've got someone in real time; like your R2-D2 is riding shotgun and bleep-blooping at you.
Gavin Purcell: Yeah, that's exactly what it felt like, 'cause I've done the same thing. I've uploaded pictures, like, hey, [00:50:00] tell me how to do this. In fact, I was doing it with ChatGPT prior, and that was what frustrated me, 'cause I was like, why am I doing all these separate steps? I went to this, and it really becomes way easier. So I really do recommend everybody go try it. Just go to aistudio.google.com, check it out. It's super fun. Um, Kev, did you try anything with AI this week?
Kevin Pereira: Nope. Bye, everyone. Bye.