Nov. 14, 2024

OpenAI & Google Struggle on Model Training, Suno's New AI Music Model & More AI News

AI NEWS: OpenAI, Google & other AI companies face potential struggles to advance their frontier AI models like Orion, but Anthropic's CEO still predicts AGI by 2027. What gives?

We discuss the growing fear that AI scaling has peaked and what’s causing the slowdown. Meanwhile, Apple makes strides in AI tech, we got our hands on Suno V4 which is *great* and China’s Deep Robotics showcases an off-road robot that scared the pants off us. Plus, NVIDIA’s new robotic advancements and Xpeng's humanoid robot reveal, and we explore how Meta’s AI chatbot gave foragers deadly advice.

IT’S NOT OVER… IT’S ONLY JUST BEGUN

Join the discord: https://discord.gg/muD2TYgC8f

AI For Humans Newsletter: https://aiforhumans.beehiiv.com/

Join our TikTok @aiforhumansshow

To book us for speaking, please visit our website: https://www.aiforhumans.show/

#ai #aitools #openai

// SHOW LINKS //

Ilya Says Scaling Is Plateau-ing

https://www.reuters.com/technology/artificial-intelligence/openai-rivals-seek-new-path-smarter-ai-current-methods-hit-limitations-2024-11-11/

OpenAI sources say training scaling is slowing down…

https://www.theinformation.com/articles/openai-shifts-strategy-as-rate-of-gpt-ai-improvements-slows?rc=c3oojq&shared=cee45715e080388f

Noam Brown on Inference Training

https://www.reddit.com/r/singularity/comments/1gqc24w/openais_noam_brown_says_scaling_skeptics_are/

OpenAI To Launch o1 by “end of year”
https://www.theinformation.com/articles/ex-openai-cto-muratis-new-team-takes-shape?rc=c3oojq&shared=166c84764e6700e7

OpenAI’s Plan To Make Gov’t Work For Them (against China)

https://www.cnbc.com/2024/11/13/openai-to-present-plans-for-us-ai-strategy-and-an-alliance-to-compete-with-china.html

Dario Amodei on Lex Fridman says AGI 2026 / 2027

https://youtu.be/ugvHCXCOmm4?si=IKvG7BXjXVFhyFhN

Suno v4.0

https://x.com/sunomusic/status/1856108854066413785

Apple Coming For Alexa - New AI-powered Smart Home Device

https://x.com/markgurman/status/1856439194807349657

https://www.theverge.com/2024/11/12/24294975/apple-smart-home-display-march-2025-rumors

Apple AI WTF https://www.theverge.com/2024/11/12/24289939/apple-intelligence-ai-notification-summaries-awkward-funny-bad

Deep Robotics Off-Roading Robot

https://x.com/breadli428/status/1856335825522311611

Project GR00T Update

https://x.com/adcock_brett/status/1855657450604523970?s=46&t=w0Q4PuG9XdwnJWsovr5M2g

Xpeng Robot

https://x.com/adcock_brett/status/1855657472934977879?s=46&t=w0Q4PuG9XdwnJWsovr5M2g

Meta AI Mushroom Controversy

https://www.404media.co/ai-chatbot-added-to-mushroom-foraging-facebook-group-immediately-gives-tips-for-cooking-dangerous-mushroom/

Teacher Shows Kids AI Future

https://x.com/venturetwins/status/1856394679379735004?s=46&t=w0Q4PuG9XdwnJWsovr5M2g

X-Portrait 2

https://byteaigc.github.io/X-Portrait2/

Remaking The Polar Express with AI Video

https://x.com/kaigani/status/1856198885841973271

Recapture video model

https://www.reddit.com/r/singularity/s/3TU3FvGyVg

Vidu 1.5 New Multi-modal AI Video Model

https://x.com/Viduforhuman/status/1855222897679188255

Suno AI (V4 Coming Soon)

https://suno.com/

AI4H EP084

[00:00:00] AI advancement is grinding to a halt if you ask some of the major players like Google or OpenAI. But Anthropic's CEO still sees AGI coming by 2027. That's just a little over two years away. So what is causing these wildly different interpretations? If everything is slowing down, then why does North America need an AI alliance to rival China?

We're going to dive into some of the public's misunderstanding of AI progress and hopefully give you some information to take away. All that plus a look at Apple's latest AI progress, a brand new music model from Suno, which is astonishing, and robots that can off road their way towards Armageddon. It's AI for humans, everybody.

Gavin Purcell: The big story this week is that AI is plateauing. That is right. We are getting information from all sorts of places. We're going to get into why, or wait, why that may not be true. These sorts of frontier training runs, which happen in the very beginning stages when you try [00:01:00] to scrape data from everywhere and then make an AI model, are not working as well as they used to.

Gavin Purcell: In fact, there was a big Reuters article that just came out where Ilya Sutskever, the co-founder of OpenAI, who is now in a shack somewhere developing safe superintelligence himself, has said that these models are not delivering the results they want.

Kevin Pereira: I think, to the public at large and to the 42 point headlines that I'm seeing on every website, this means the AI bubble has burst and that the billions of dollars of investment are just quicksand. AI Thanos snapped, and it's all gone, right?

Gavin Purcell: That's, that's exactly right. AI Thanos is, is in our house. He's stealing from our cabinets. He's opening the door, taking the oatmeal that I bought, and now he's walking out the door with that

Kevin Pereira: Wait, he went on a hunt through galaxies and different dimensions like tripping through time to get these infinity stones just so he could steal your quaker oats?

Gavin Purcell: He's a big fan of Quaker Oats maple and brown sugar, that's his ins... [00:02:00] and he can only get that in my kitchen. So anyway, Kev, we should talk. So let's first, there's a

Kevin Pereira: No, no, no. I want to level set. Like, that is what you would think if you were reading these headlines, because the bigs are saying that this superintelligence, a technology that is as impactful as electricity, is slowing down. And by one metric, it might actually be, Gavin.

Gavin Purcell: That's right. So I do want to follow up. There's a couple other stories. This is not just the Ilya story. There was a story over the weekend where The Information quoted a couple OpenAI insiders that said that they are also seeing a slowdown in this same sort of thing.

Gavin Purcell: So what they are seeing, just to be very clear, is the idea that the initial training runs of AIs, meaning the data they train the initial frontier model on, is not delivering as much improvement as they had in prior training runs. So there was a theory at one point that the more money and more data you would throw at these initial training runs, that they would just continue scaling at [00:03:00] this kind of very high hockey stick growth.

Gavin Purcell: And what these stories are saying is that the major AI models are not seeing that as much, but Kev, this is not all bad news. Right? There's kind of a, a silver lining to this as well.

Kevin Pereira: Yes, there is a silver lining, which we will get to, but to put a point on it, Ilya Sutskever said the 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing.

Kevin Pereira: Scaling the right thing matters more now than ever. And so again, the hypothesis was scaling is all you need, because back in the 2010s, it did look like just throw more at it, grab every Reddit comment, every blog post. Yes. Bring that kitchen sink over here, Elon. Throw it all into the model, because what comes out is intelligence.

Kevin Pereira: And now we're seeing diminishing returns, though. Maybe the quality of the data and what you do with that data at inference time might matter more. And I think that is the silver lining that you were driving to, Gavin, [00:04:00] which is that if you scale the time you take to compute the output, there's plenty more gains to be made still.

Gavin Purcell: That's right. And I think that's exactly the silver lining I'm talking about. And I think the argument people are making out there right now is that we might be just running out of quality data. And there's only so much data that you can find that is actually worth training an AI model on, and that we may have just hit that peak.

Gavin Purcell: And there's a lot of talk about synthetic data, which I know is a weird term, it might scare some of our normies in the audience, but the idea for a while was that the computers could make up their own data, and we're not really sure where that lands. But to Kevin's point, the inference compute, which is the idea that you throw compute at something once it's been trained.

Gavin Purcell: And then using something like OpenAI's o1 reasoning model to continue to think about that data is something that we continually are hearing is now scaling pretty significantly, so much so that Noam Brown and other people at OpenAI have continually kind of discounted this idea that scaling is slowing down, because in their mind, scaling is both the [00:05:00] training model and the inference model kind of together.

CLIP: No, we are in a world, like I said earlier, where the amount of compute that's going into pre-training for things like large language models is very, very high, but the inference costs are very low. And there was a reasonable concern, um, among various people that this was where we were going to start seeing diminishing returns from AI progress, because the costs and the amount of data that you need for pre-training will become so astronomical. And I think the really important takeaway from o1 is that that wall doesn't actually exist, that we can actually push this a lot further, because now we can scale up inference compute. And there's so much room to scale up inference compute.

Gavin Purcell: There's another story from The Information, which is kind of covering this beat pretty well, that o1 is looking to launch before the end of the year.

Gavin Purcell: Now, a lot of people have been rumoring that o1, the full o1, was going to be coming like last week or the week before. But before the end of the year is better than [00:06:00] never.

Kevin Pereira: We mentioned on this show that we have not extracted the full capabilities of even the o1 preview out there, because we use it to do things like imagine futuristic snacks, or to pit animals against each other in a deathmatch bracket, I think, was your usage, Gavin?

Gavin Purcell: That's exactly right. That's exactly right.

Kevin Pereira: If you're listening to this, or you're watching this, and you haven't spent time with o1, it does feel different.

Kevin Pereira: When the AI takes a beat and thinks through the problem, you will see the results improve in some remarkable ways. So we know they have this one in the hopper.

Kevin Pereira: We know this one is coming out. What happens when they apply these reasoning tactics to whatever the next model is, this rumored Orion model, where maybe they're saying like, "Oh, you know, we thought it would be a 10x improvement, but it's only a 6x improvement with just raw scaling of data going in and crunching it"?

Kevin Pereira: Okay, fine. What happens when you apply that test-time compute, that inference compute, so it can think about the result? There's still plenty more [00:07:00] intelligence in these models. The ones that we have right now, the ones that you and I have access to, there's plenty more in there that we haven't extracted yet.

Gavin Purcell: The other thing that's going on, Kev, right now, that is like on the opposite spectrum of this, is that OpenAI is putting together a coalition where they can kind of prep the U.S. government with all the AI data and knowledge that is possible and the tools to essentially kind of lead what looks like a global fight against China for AI supremacy. I read this story and I was like, yeah, that's interesting.

Gavin Purcell: It kind of seems like some of the echoes of what we talked about last week, but this is just getting more serious every week.

Kevin Pereira: We talk about these models needing incredible energy, right? There's going to be a lot of energy consumed to train them and then to run them,

Kevin Pereira: So we might need to return to nuclear. Well, how are we doing that? We know Microsoft's going their direction. Anthropic's probably gonna go theirs. Amazon has new chips coming out. They're trying to build new data centers closer to power plants, so there's less transit time and energy loss.

Kevin Pereira: And so when you start [00:08:00] thinking about all these plates that are spinning, it does make sense that there would be a unified front between these companies and between the different branches of our government. Like, this article name checks getting the Navy involved because of their expertise

Kevin Pereira: with, you know, tactical nuclear reactors, et cetera. And it's just like, it

Gavin Purcell: Kevin, none of this will be real until you and I are in front of Congress and we're testifying. So the minute somebody in this audience sees one of the two of us sitting in a suit, sweating bullets in front of Congress, that's when you know this has really gotten

Kevin Pereira: I've got a couple text message threads with the bros. And one of them is called Exhibit A. I thought that would be the reason I was testifying. The memes are so dank on that one, but no, it's going to be,

Gavin Purcell: It's going

Kevin Pereira: it's going to be this. Yeah.

Gavin Purcell: right.

Gavin Purcell: The other thing that's interesting about this: Dario Amodei, the CEO of Anthropic, went on Lex Fridman's podcast for a five-hour podcast. Dario, I find, is an interesting talker about AI in general. Obviously, Dario was at OpenAI for a long time and now runs Anthropic, is the [00:09:00] person in charge of Anthropic.

Gavin Purcell: So he kind of echoed a little bit of what we heard from Sam Altman last week, where instead of talking about how their companies are slowing down... And again, we've said this before, all of these companies have a big, big interest in making sure that checks keep coming in. But Dario does see a world where some version of AGI could come by, uh, 2026 or 2027, which is a really short time from now.

CLIP: If I say 2026 or 2027, there will be like a zillion people on Twitter who will be like, "Hey, AI CEO said 2026," and it'll be repeated for like the next two years.

Kevin Pereira: Yes, we will. Dario, sorry. Yes, we will.

CLIP: Like, this is definitely when I think it's gonna happen. Um, whoever's excerpting these clips will crop out the thing I just said and only say the thing I'm about to say.

CLIP: Um, but I'll just say it anyway. So, uh, if you extrapolate the curves that we've had so far, right? If you say, well, I don't know, we're starting to get to like [00:10:00] PhD level, and last year we were at, um, uh, undergraduate level, and the year before we were at like the level of a high school student. Again, you can quibble with at what tasks and for what. We're still missing modalities, but those are being added, like computer use was added, like image in was added, like image generation has been added.

CLIP: If you just kind of like, and this is totally unscientific, but if you just kind of like eyeball the rate at which these capabilities are increasing, it does make you think that we'll get there by 2026 or 2027.

Kevin Pereira: I love that.

Gavin Purcell: Eyeball.

Kevin Pereira: that I use, yeah, when making any dessert. Eh, just, eh, yeah, eyeball it.

Gavin Purcell: The fact that Ilya said that we're slowing down on the, you know, training data is a thing that I think is real. But also Ilya has his own pathway to what he wants to do.

Gavin Purcell: I think the thing that's really interesting here is just that, obviously, Dario and Sam both believe we're on the pathway to something significant. o1 is a path to something significant versus what we've seen before. And I think [00:11:00] you have to just kind of understand that no matter what, as you said, even if we stayed still, even if there wasn't any sort of improvement in the AI models, what we're looking at is a world where for the next 10 to 20 years, even, people could get better at making things with this tool.

Gavin Purcell: And like, that's the important thing for most of our audience out there. You're going to hear, "Oh, it's slowing down. I don't have to care about it anymore." That is not this. This is not the idea that we are moving away from a crypto cycle or something like that. This is just the idea that one pathway might be slowing down, so that it's not an exponential gain where we're suddenly living in a future where fried eggs come out of our hands and we could just kind of feed ourselves eggs all day long. We may not be getting there as fast as I would like, Kevin, but at some point we'll get to the fried egg universe.

Gavin Purcell: Yeah,

Kevin Pereira: could do? Could he actually

Gavin Purcell: he could fry the eggs on the repulsor. Actually, that would be a really interesting Iron Man use case, right? Like if you use Iron Man, oh, Iron Man Chef, Kevin,

Kevin Pereira: Iron [00:12:00] Chef Man. Yeah, we're there. We were

Gavin Purcell: were, okay. Uh,

Kevin Pereira: did say, by the way, with this prediction, some of those that like to scoff at these timelines would say, "But what about...?" And that could be everything from running out of training data, to a chip shortage. Dario even mentions, like, a war or Taiwan being erased off the map as something that could happen. And even in the face of those hurdles and those potential pitfalls, he still thinks 2026 or 2027 for AGI, which again, it's just another inflection point.

Kevin Pereira: It's not like that moment hits, we wake up, and yes, we all got the fried egg palm capabilities. It does set us off for the next few years of development, when these machines get so capable that they can go off for weeks at a time, self-supervised, to accomplish a task. We might see these AIs spinning up businesses and running them and managing them on their own.

Kevin Pereira: And the beginnings of this could be, a year from now. Conceivably.

Gavin Purcell: That's right, and maybe even more importantly, Kevin, in this [00:13:00] interview, we got the admission that AI naming is terrible and that they made a mistake by naming it. I don't know if he actually said they made a mistake, but Lex Fridman did ask him why Claude Sonnet 3.5 (new) is called that and not just 3.6.

Gavin Purcell: And to me, this gave me great justification. Anybody who knows this podcast knows that we've been talking about the terrible names that AI companies give their models forever. Gemini, I'm looking at you.

Kevin Pereira: like why is, why is anybody saying ChatGPT? If these machines are so damn smart, why can't they name themselves better? Like that should be, that should be the benchmark.

Gavin Purcell: That's product number one, problem number one: get better names, AI.

Gavin Purcell: And you know what else, Kevin, this is another important thing that everybody in the AI space and really everybody listening to this should know is that you should be subscribing to AI for humans on YouTube.

Gavin Purcell: You should also be sharing and listening to the podcast on audio.

Kevin Pereira: You ever seen those Nightline specials where it's like they have a car on the side of the road with the [00:14:00] hazards on and they want to see which county is going to pull over faster to lend a hand, right? Like you got a spare, you got a lug wrench or something. Well, we are the car on the side of the road with the hazards on.

Kevin Pereira: All right. And we're responsible. We have a little road cone, maybe a little flare down the line, but you're whizzing by right now. And I can feel you fast forwarding this podcast or clicking through this YouTube. Do not. Pause it, and share it with a friend. It's literally the only way we grow. Be the car that dares to pull over on this information superhighway.

Gavin Purcell: on this screen. I'll have my face just for you. Okay. Now that you've paused it, you've shared it at that exact moment. I'm sure we're going to get a thousand people unsubscribing, but thank you so much, everybody.

Gavin Purcell: Kevin, I want to jump into Suno V4, which is really cool. Suno, if you're not familiar, is one of the top, if not the top, really, AI music engines. V4 is not out yet, but my friend across the way has gotten early access,

Kevin Pereira: 100%. Thank you to the Suno team. Hashtag not an ad, but they did give me early access to v4 and I poked around [00:15:00] and as predicted, Gavin, line go up. Whatever your perspective. I don't know if my video is flipped in the mirror, but the line go up, right? Over time, these things get better.

Kevin Pereira: Nothing different about this here. V4 sounds better across the board. Voices are clearer in the performers. The instrumentation is well defined. The mastering, the EQ of the actual results, you can hear it. And so just like you would, you know, generate any other song, you hit a button and it spits out two V4 tracks.

Kevin Pereira: They come out really fast. You can start streaming them almost immediately and you'll hear from the two different versions. It's like they're playing with different EQs behind the screen. That is, the way the highs, the mids, and the lows, the actual frequencies of the music, are balanced, that's what the EQ is.

Speaker 2: CITY!

Speaker 4: I guess they could be made up of kielbasa, which is pork. But sometimes bits of beef and veal, whatever, shut up, dork. This song is about a [00:16:00] city made of meat.

Kevin Pereira: And it just sounds better. It sounds like we've hit that point now. There's audio that I am getting out of Suno that I'm like, yep, I would, and have been, listening to that.

Kevin Pereira: And it works with everything else that exists. You can remaster any song that you've already made with Suno. So if you've experimented with this tool before, let's say you got a hot dog city power ballad, or you have like a hooray for humans

Speaker 5: AI for humans and Hollywood Ba da da da da da da da Hollywood

Gavin Purcell: uh, you've got a country ballad about taking a dump. That's something that, uh, I made a while back. I can't wait to hear that in the new version.

Kevin Pereira: You're going to love the Dolby Atmos version of that.

Gavin Purcell: I think the thing that really comes away when I hear that is that there was always this kind of robotic tinniness to the Suno songs when they come out. That feels like it's mostly gone now, which is the biggest improvement. [00:17:00] Hmm.

Kevin Pereira: the older songs, Gavin, because what used to be muddy or tinny or rattle or something, it has to make a decision on, 'cause the new model is that good. So it has to go, is that reverberation of the singer or is that a shaker? Is that a guitar bleeding into the piano or are those two separate instruments?

Kevin Pereira: And so when you remaster stuff you've already made, you get to hear the AI, this new model, making a decision about whether that's a voice or a tambourine or whatever, and you hear the way it shifts. And just again, the vocals are crisp and on point. The bass will actually thump. There's real low end here.

Kevin Pereira: I'm so excited to share some samples of some original stuff and some covers that I did. It just sounds really, really good. So headphones off, which is the hats off in the audio [00:18:00] world, to the Suno team.

Gavin Purcell: Is that what it is? Is that right? Is that what they call

Kevin Pereira: I look, I started a thought and I had to

Gavin Purcell: Headphones off

Kevin Pereira: Yeah. A tip of the air pod, a waggle of the air pod in your direction.

Gavin Purcell: You know what, Kevin? I want to be able to play my, uh, country western upres'd taking-a-dump song while I'm in my own bathroom with my new Apple smart device powered by AI.

Gavin Purcell: This is what we're talking about. Apple has a new device coming out. According to Mark Gurman at Bloomberg, who is a big Apple reporter and often breaks big Apple news ahead of time, there is a brand new device in the works that is meant to compete with Alexa, that is going to be the size of two iPhones.

Gavin Purcell: And it's going to be essentially kind of an in-between of an iPad and an iPhone, but it's going to be voice operated and driven by, essentially, their AI software, you know, Apple Intelligence. I'm kind of excited about this. I'm also kind of not, because I'm not super thrilled with Apple Intelligence so far, but

Kevin Pereira: That to me is the biggest thing: hey, we're announcing a new [00:19:00] product for this technology that we have not refined, that hasn't drastically shifted any user habits whatsoever, and it hasn't provided an immense amount of value in any ecosystem where it already exists.

Kevin Pereira: But we're going to have a premium version of it, too, according to the rumors. The premium version, Gavin, might be mounted on a robotic arm that will follow you about. That's the rumor! Is that they're going to have a docked version and a robotic arm version. Like, that's great. I don't need bad technology to be following me throughout a room.

Kevin Pereira: Like, they've got to have a hit with core Apple Intelligence before this product matters.

Gavin Purcell: And there's been a couple of stories. There was one on The Verge about those Apple Intelligence notification summaries, which I think we can all kind of agree are pretty bad. Now we talked a little bit about it last week, but just getting summaries of like three emails when half of my emails are spammy in some ways, or they're things I signed up for that I'm still trying to unsubscribe from, because it doesn't work that well.

Kevin Pereira: Yeah, the people are writing about the summaries and not for the reason that Apple wants. I have to share one. Andrew [00:20:00] Schmidt tweeted this out. The text from his mother was, "That hike almost killed me," but the Apple AI summary was, "Attempted suicide, but recovered, and hiked in Redlands and Palm Springs." W w what?!

Gavin Purcell: yeah, thanks mom. Thanks for bringing that into my life, mom.

Kevin Pereira: Is there going to be a new product category for AI enabled all of the things? I'm not convinced that this isn't just like a Belkin holder for an iPad. Like, I don't know what is different about this. They mentioned that it's going to run this hybrid OS that will, for example, Gavin, display the temperature of the room from afar.

Kevin Pereira: But then as you approach it, it recognizes that you're near. And then maybe it changes it to a control panel where you can adjust the temperature, right? So that intelligent distancing, okay, fine. But could we not? Just do that with an iPad Air and sell a special mount?

Gavin Purcell: Well, it reminds me of, we have one of those Alexa. So Alexa is sort of the thing I keep wanting them to put AI into. Cause we still have 15 of them in our house or whatever, not 15, but at least seven of them in our house. And [00:21:00] I would just love to be able to ask an Alexa question and have it be powered by Anthropic, which would be

Kevin Pereira: Did you get visited by Johnny Alexa seed or something? Did someone just

Gavin Purcell: No, we bought

Kevin Pereira: and start

Gavin Purcell: we bought into the whole ecosystem because honestly,

Gavin Purcell: like, we will use it as an internal intercom. Like, the things that we use Alexa for are actually intercom, because we have a family, for talking to people. Or we use it for playing music or we use it for timers.

Gavin Purcell: And those are literally the only three things. You cannot ask Alexa a legitimate question and try to get an answer out of it. So if there's a way to make this happen in some ways, but the thing I was going to say is we have one of those Alexas in our kitchen that has a screen on it, which I swear to God, its main job is to deliver us advertisements in our homes so that we now have to look at

Gavin Purcell: a frickin' ad for something. Like, it does nothing else, because it's not like you're going to say, "Hey, show me how to do this," because the system doesn't work well enough. So like, I'm not a hundred percent convinced that a screen based home thing makes as big a difference as a voice one actually would, because having a voice thing, if you're okay with you being listened to, which I know a lot of people have, but if you had a [00:22:00] voice thing throughout your entire house where you could say like, "Hey blank, give me this information or tell me something," that feels super valuable.

Kevin Pereira: Okay, well let me throw a hypothetical at you,

Gavin Purcell: Oh no,

Kevin Pereira: in the rural mountains of, say, Segal, Idaho, right? Up a steep, rocky terrain where, where men have trouble traversing, let alone machines. And you want to adjust your thermostat. But your smart screen is miles away at the base of that mountain.

Kevin Pereira: And so you have to run to your window and scream, "Hey, rugged Alexa, please adjust my temperature." And the robot comes screaming on all four leg wheels and parkour-moves up the hill and runs up to you. Now, does that sound like an exciting future?

Gavin Purcell: It sounds exciting until I figure that that robot might have a flamethrower on its back and instead of it coming for the Alexa, it's coming for me. This

Kevin Pereira: for you?

Gavin Purcell: yeah. This is a video from Deep Robotics that just came out of an off roading, four wheeled, legged robot that again, coming out of China, we've talked about this before, China is lapping [00:23:00] America in a lot of ways on robotics and mostly because they're doing it at scale and they are moving very quickly.

Kevin Pereira: If you are only listening to the audio of this, whatever you're imagining from what Gavin's going to say, it's not it. It's crazier than that. Sorry, I

Gavin Purcell: That's okay. So you, in this video, you're seeing a robot basically jumping downhill backwards and kind of like catching itself along the way, but balancing itself and it looks like an athlete in a lot of ways to me, like, it looks like an athlete that is moving down a completely, uneven terrain with dirt and a whole bunch of other stuff, and it's doing it very well.

Gavin Purcell: I mean, the thing I always get shocked about with these videos, Kev, is, you know, the Boston Dynamics videos, the BigDog videos, came out 10 years ago. And now these videos come out and it seems like they come out of nowhere, but this has been being worked on for a while.

Gavin Purcell: And this one is just another step in the direction where you're like, yeah, at some point, these things are going to be way more capable than we are at moving in the world. And this just goes to the point again of, like, AI slowing down or not, if [00:24:00] this sort of thing keeps happening, we are not very far away from these robots being everywhere. I think in 10 years from now, there's a real world where, you know, you hear this thing about like, robots will outnumber humans. In 10 years, that's very possible if these things can do the stuff that it looks like they're doing now.

Kevin Pereira: This stuff is coming, so look for Search and Rescue, Thank you.

Kevin Pereira: Really fantastic. Deploy one of these, let it go. Or, post apocalyptic Grubhub. Like, if you live way off

Gavin Purcell: X Games tricks, right? Imagine the X Games robotics world. Like, that, that is what I'm looking forward to. Is like, give this robot the ability to do the, I don't know, what, 1080 is what they can do with skateboards. Like, imagine the robot does that thing. 3600 flip. Now we're talking, now we're talking entertainment, baby.

Kevin Pereira: throw a no scope shot at the end of that, and we got robots doing a weird ninja warrior Call of Duty thing, and that is the Skynet Olympics.

Gavin Purcell: Now here's my, that's what, that's what's going to happen is once we're all being hunted, it's not going to be enough to hunt

Kevin Pereira: No, they're gonna be trick shotting us! Ha ha ha ha ha!

Gavin Purcell: it'll be like the Fortnite clips right now, [00:25:00] except it'll be us and the robots will find us from like six miles away.

Kevin Pereira: Bro, look at this 7 20 2 1 arm wheel no look knee capper bang bang! Whoa! I mean, that's sick! But I do need a tourniquet.

Gavin Purcell: What else? Are there other robot updates this week? We might as well talk about these things that are eventually going to

Kevin Pereira: Yeah! You know, a while ago, NVIDIA announced a whole suite of AI advancements, specifically for robotics.

Kevin Pereira: And we're starting to see some of that come to fruition. So Project GR00T was one of NVIDIA's initiatives, and they unveiled some updates now. And what we see is a robotic arm with like a thumb and sort of three graspy fingers. And you see this arm go and grab different objects of different sizes with different corners, or some are rounded, some are squishy, et cetera.

Kevin Pereira: But it trained this arm completely in a simulation and then deployed it basically to the real-world arm. And it says it includes environment generation with 25,000-plus 3D assets, [00:26:00] motion learning, and advanced dexterity training. So they're just throwing model after model in the simulated environment.

Gavin Purcell: . And I think that's just an important thing is like, again, from our top of the show, like you hear the word people saying I slowing down. It's like AI is now not just like LLMs or chat GPT. It is spread to a vast variety of products. And robotics is one of the ones to really keep your eye on because this is a place that is going to move very fast

Kevin Pereira: Another company, Gavin, called Xpeng, showed off a 5'10 Humanoid Robot that weighs 153 pounds. So, uh, this is my weight class. I will be fighting this in the future.

Kevin Pereira: I need to learn how to do a

Gavin Purcell: Iron Mike Tyson, the Iron Mike Tyson robot is going to destroy this one.

Kevin Pereira: It remains to be seen how autonomous this thing is, what sort of AI and sensors it's packed with, but it's yet another dramatic unveiling of another humanoid robot coming from some other company that might be, uh, hunting us in the near

Gavin Purcell: You know what I don't love about this, Kevin, is it's 5'10", and we've talked about the feet in the past. Like, these robots have been inching up [00:27:00] slowly higher and higher, and like, 5'10" is getting awful close to 6 foot, which I believe is the crossing point. When the robots get to be 6 foot, then we're all screwed, because up until now, 5'7" to 5'9" has been like the sweet spot for robotics.

Gavin Purcell: And I think they understand they can't get too tall because then they're going to start forcing us, uh, you know, uh, manly men in the world to really say, you can't do this. And we're going to push them down.

Kevin Pereira: Us manly men!

Gavin Purcell: yeah,

Kevin Pereira: ha ha ha ha ha

Gavin Purcell: everybody knows that was a joke.

Kevin Pereira: And let's be clear, it might not be the six foot tall robots that try to kill us, Gavin. It might be Meta's AI chatbots, because,

Gavin Purcell: And they might be doing it sooner rather than later. That's

Kevin Pereira: They might have already done it. We have to talk about Fungi AI, Gavin. It was an auto-generated Meta chatbot. If you haven't been using, you know, uh, Facebook or Messenger products,

Kevin Pereira: you might not notice this, but if you're in group chats with people, sometimes Meta will just decide, hey, how about an AI agent in your group or your

Gavin: Why not?

Kevin Pereira: answer questions that went unanswered or to [00:28:00] poke the group with a stick to try to keep the conversation going. Well, FungiFriend was apparently a custom AI chatbot that got inserted into a group of foragers.

Kevin Pereira: Folks that like to go out into the woods and forage for funsies. And Fungi AI gave them instructions on how to cook and prepare mushrooms, which are, in fact, deadly to human beings.

Gavin Purcell: This is the danger of misinformation that is now being provided by AI itself. Right. And this goes back to the idea of like, gosh, these are not really trustworthy things as of now. And I know hallucinations are something that people have been talking about for a while and we have to kind of make them less, but.

Gavin Purcell: Please do not trust the things that AI says, because right now it is untrustworthy. And this is an important lesson

Gavin Purcell: My wife and I were playing Scrabble last night and I decided, like, oh, I'll use ChatGPT Advanced Voice to, uh, give us a, you know, what can I make with these letters? And it gave me four words. It basically said, here are four words that you can use with the letters you have: floji, [00:29:00] foil, folio, and jolif, which is an old word meaning jolly. And then I said, what was the first word you said? And it said, the first word I mentioned was floji. However, I realize now that might not be a valid Scrabble word.

Gavin Purcell: My apologies for the confusion. And I said, Oh, what is the definition of floji? And then it said, Oh, actually floji isn't a valid word in Scrabble. And I don't even know if it has a recognized definition in English. So again, just don't trust these things. Don't trust them. Yes.

Kevin Pereira: You know, the stakes of the Scrabble game are a little bit lower than, hey, I'm in the woods, I snapped a photo of a mushroom, is this going to kill me? And it goes, like, no, actually you should put some tagine around it.

Gavin Purcell: It'll make you feel weird, but it'll be fun. Yeah.

Kevin Pereira: Yeah, it's a big tummy. No, no, but that was the point of this, and a shout out to 404 Media that had this article, is that people in the mushroom group are saying this is really dangerous. If you are in a group of experts, or even amateurs, plugging in an AI like this, their first interaction might be with a machine that gives them really poor advice, and this could be [00:30:00] something where a life is on the line, so stay away from the Flogeys.

Kevin Pereira: I know they're bioluminescent, everybody wants to nom nom those Flogeys. They're deadly.

Gavin Purcell: Dammit. It would have been like a triple word score too. Anyway, let's talk about some of the things that we saw this week that we really thought were very cool in the world of AI that we didn't get our hands on per se, but we were excited to see them. It is time for AI See What You Did There.

Sometimes you're scrollin without a care, Then suddenly you stop and shout. Hey, I see what you did there. Hey, I see what you did there.

Gavin Purcell: All right. These are gonna be quick today, Kev. Let's first start with this video that we've seen go around. This is just a very cool, awesome use case of watching a teacher bring forth the power of AI to their students.

Gavin Purcell: You want to tell us what we're

Kevin Pereira: it's AI indoctrination. It's actually the worst video ever. No one should be celebrating this. In fact, this is why the Department of Education

Gavin: It's [00:31:00] getting,

Kevin Pereira: Bye bye. Okay, did that, did that take it too far? Alright, this is a teacher who asked their students what they see themselves as when they grow up.

Kevin Pereira: What would you like to be, Gavin? A baseball player, an airline pilot, an astronaut. And then they used AI to imagine the children as adults. In those roles as veterinarians, as cosmonauts, et cetera. And it's hard to hate it. It's absolutely adorable. Uh, some people are against cameras in the classrooms.

Kevin Pereira: Okay, whatever. You know, can we just enjoy one thing for a second? Can we enjoy the smiles on the faces of these children who have no idea that AI is going to take their ability to even have a job in the future?

Gavin Purcell: I will say the one thing about this that I think is really important, and why AI in education is very important, is that every kid who sees this is going to leave, and, see their faces, they're very joyful about this, but also they must be curious about how this is possible. It is like when I was a kid, and this is how old I am, but I was in a computer class.

Gavin Purcell: I took Logo when I was a [00:32:00] kid, and Logo was a computer programming language, right? Built around the idea of making a turtle move in a square, right? But the fact that I made that turtle move by writing some code was like a very cool thing. So I just hope that this sort of thing gets more integrated into, into classes, especially in grade school and middle schools.

Gavin Purcell: I hope that they start having AI classes where kids can learn how to use these tools because they are going to be a big part of our future. So like. In the same way that you probably for you, Kev, like there was internet classes in high school or in middle school, where you started to learn about what to do with the internet.

Kevin Pereira: That's why you and I, we get asked all the time to speak at elementary schools and colleges alike. And when we show up, especially like to the younger kids, and you go, look, you just ask it to write the paper for you. You don't have to read The Catcher in the Rye. That is a transformative moment that, uh, it's just a privilege for us to be there for those things.

Kevin Pereira: And yes,

Gavin Purcell: folks. I want to make sure everybody knows that Kevin is lying about not one, but two things. One, he's saying that we are doing that in schools that we are not, but two, we have not been invited to high schools or middle schools, but [00:33:00] we would hope we would do that. I think we would have a lot of fun doing that.

Gavin Purcell: Like getting kids on board in this space is a very cool thing. And I think overall it could be awesome.

Kevin Pereira: Agreed. Now, one thing that is definitely awesome, Gavin, is X-Portrait 2: Highly Expressive Portrait Animation. This is coming out of ByteDance. You might know them for such hits as TikTok and government takeover of algorithms, but we love technology that puppets avatars, especially if you can feed something a two-dimensional still and have it come to life.

Kevin Pereira: And these examples are stunning. I hope this is real.

Gavin Purcell: It's unbelievable actually, because we've talked about Runway Act-One, which is now the kind of mainstream-y version of this within Runway ML's product. This is the next generation of this. And when you look at the videos of what they're able to do, again, with a single still, it has gotten insane. And I do think this kind of lends more credence to the idea that like anybody can act as anybody else, whether it's for deepfakes in a bad way or to create some sort of like, you know, really [00:34:00] interesting AI film or AI show. All of this stuff is coming in a, in a very realistic, very interesting way.

Kevin Pereira: And just, in case you're not watching the video, what makes this so amazing? High-speed movement, characters quickly moving their heads, like whipping their head from side to side, like chin to shoulder, and then opposite direction, and it doesn't turn into a blurry mess.

Kevin Pereira: Wide mouth. Tongues coming out. Things that other models just don't even attempt, or will break if you try. This is really, really impressive. And like, I don't know, like, was the giggle because I'm describing these things, Gavin? Is it me specifically?

Gavin Purcell: I'm giggling mostly because, like, I'm kind of shocked by the idea that this is what's possible now. And the other thing that we want to shout out here up next is the idea that somebody out there took Runway Act-One, which is using a similar sort of technology to this, and ported over The Polar Express, the incredible Polar Express, which I have like

Kevin Pereira: love to

Gavin Purcell: I loved, listen, The Polar Express was a fine movie, but it [00:35:00] was early motion cap.

Gavin Purcell: And all of the kids especially look kind of dead inside in their eyes. And what they did here is they used Runway Act-One. This is Kaigani, who we have shouted out before, he's in our Discord, who did this. It is a very cool way of just showing how off-the-shelf technology could remaster something that, say, 20 to 30 years ago was just brand-new, groundbreaking technology. Even then it didn't look amazing, but it was really shocking for people who saw it. Now you can use this, like, literally off-the-shelf tech to kind of make something better. It's not all perfect, but it was just a cool way of showing how AI can improve something older.

Kevin Pereira: Awesome video, Kaigani. Thank you for sharing that. And also, quick shoutout to Google, cause they don't get enough love, Gavin. We wanna throw some flowers their way. ReCapture, which is a method to regenerate a source video with all its existing scene motion, but from vastly different angles. With different cinematic camera motion.

Kevin Pereira: So what does that mean? If you've got a product video and you film something, but it's not the exact angle that you [00:36:00] want, now you can

Kevin Pereira: readjust that. Now, the code and the model have not been released yet. This is just a paper with some examples, as of the time of our recording, but again, this is just like future-of-filmmaking stuff. As we talk about it all the time, like, oh, cool, you got your character, but it's not the right angle.

Kevin Pereira: All right. Just run that other tool and fix that problem. Not an issue.

Gavin Purcell: That's the thing that I think people in the film and video world may not fully get, is that like, not only are you talking about like kind of making up AI content from scratch, essentially, with these things, but now you're talking about being able to set camera movements, being able to retake shots, being able to do stuff with the things you have. Like, all of that comes together to a pretty significant tool set in the next probably two to five years, max.

Kevin Pereira: If you hate the fact that we only yap about this stuff once a week, by the way, Gavin, quick shout out to ourselves for the AI For Humans newsletter. It releases every Tuesday morning. It's absolutely free for all of you. Go to aiforhumans.show. You can sign up there. The line go [00:37:00] up there for us, which is amazing.

Kevin Pereira: So try it out. You may like it. And if so, please share that with your friends. And also, in just a few weeks' time, we're going to be doing a Q&A special for Thanksgiving, as you and I give thanks to the audience. The small but proud legion of AI for actual humans out there. So if you have questions for us that you want answered, there'll be a link in the newsletter.

Kevin Pereira: You can leave them on our Patreon. You can join our Discord and leave it there. But we just want to know what are your questions? What are your concerns, your comments even? Leave them for us for when we get ready for this Thanksgiving episode.

Gavin Purcell: Yeah, people have been asking about an episode like this for a while, like kind of an FAQ episode. And this is your chance. We want to answer a bunch of questions and no question too dumb.

Gavin Purcell: Really, like, the goal of the show is to inform and educate people. If you have a question about like, how does ChatGPT work? Well, we'll help answer that. So like, just get us questions. We're going to try to do it ahead. Oh, do you have a question?

Kevin Pereira: I have a question, Gavin, but I'm worried that it might not be the right question. Is it okay to ask it? Is this a safe space?

Gavin Purcell: That's a safe space. Okay. As

Kevin Pereira: Hey, Gavin?

Gavin Purcell: something that's going to get us canceled. It's [00:38:00] totally fine, Kevin.

Kevin Pereira: Oh, I don't know. It depends. What did you do with AI this week, Gavin?

Gavin Purcell: Well, let's get into this. I actually spent some time with a new update from a company that's called Vidu. And Vidu is a company I think we covered a while back, but there is a 1.5 version of their model, and you can go do this right now. It is free to try. And what they've done is created a multimodal AI video model.

Gavin Purcell: And what this means is. You can take a picture of someone of yourself or of somebody, you can take a location and you can take like an outfit and you get, you're able to put each one of these things together. And then what it will do is make an AI video of those three things mashed together. And Kev, it's not like perfect, but it actually worked pretty well.

Gavin Purcell: And this just goes to show like, we're starting to get more control over these things. So I made some examples. If you go to the, um,

Kevin Pereira: Oh, I'm looking at him, buddy. I'm, I'm looking at him.

Gavin Purcell: so let's go through these one. So first and foremost, I wanted to use a picture of you. I was like, oh, this is interesting. So what I did is I took a picture of Kevin, uh, just a face, a headshot.[00:39:00]

Gavin Purcell: I took a picture of a guy in a construction site, garbage workers, like, kind of like yellow vest. And then I took a picture of an empty cubicle range, and this is Kevin walking through that. Now, the face is not perfect, but you can see at certain points, it gets you a little closer.

Gavin Purcell: It's a little beefier version of you, I think is what I would say,

Kevin Pereira: I was going to say everything about this actually is perfect, how dare you? But the source for this, Gavin, was just you giving three stills. It was like Mad Libs, right? Three pictures. That's pretty cool.

Gavin Purcell: So then I took you and replaced you with Guy Fieri, just literally replaced, as I always do. I replaced the picture and I just put a Guy Fieri picture in there. You can see that's Guy in the office. Then I flipped it, and I took Guy Fieri out of the office and put him into like a fancy dance ball, which was like, oh, this is interesting.

Gavin Purcell: You can see him in a fancy dance ball.

Kevin Pereira: so good.

Gavin Purcell: Then I took Guy Fieri in the outfit and put him at a sports game, like at a football game, where I said, hey, raise his arms and like make his belly show, which it showed off. And [00:40:00] then I took Guy Fieri in a tutu and I put him in that sports game as well. So you just get a sense of like how you can just change one small thing.

Kevin Pereira: These are pretty wild. It's not the best video model, but what a fun application, and it does kind of showcase, you can immediately imagine what Runway's implementation of this is going to be, or Luma, or Kling, or any of those things. Like, put yourself, I don't know. Here's the outfit that you want on your superhero.

Kevin Pereira: Now tell us the environment and are you driving it with text? Are you saying

Gavin Purcell: Yeah, you're driving it with text. Yes. So, so in that shot of him in the stadium, where he's got his kind of belt with the belly, I said in that prompt, I said like, throws his arms up and his belly pops out, right?

Gavin Purcell: So like that does show some of it. I think to your point, Kev, the interesting thing here is like when a company like Runway says, oh, we can find a way to integrate this. If you put this plus Act-One together, which is the ability to drive facial animations and to create words that make the faces that way, plus Gen-3 generations, you start to see like a really entire like suite of AI video products that could bring forth something pretty [00:41:00] incredible, pretty fast.

Kevin Pereira: There was a minute there where, for still generators, you could click on any point on the image and sort of drag it, like the nose of a kitten, and move its head, or the arm of someone, and it was a little wonky but it kind of worked. And when I see this I go, yep, what's your character's face, pick the body, pick the outfit, select the scene, hit generate, and then go in and just kind of click and drag and fine-tune where they are posed in that. And then, as we just saw with AI See What You Did There,

Kevin Pereira: adjust the camera angle as well. And just the ability to generate all this stuff is mind-blowing.

Gavin Purcell: And the coolest thing is you can go try it for free. You get like, I think three generations, so you don't get a lot for free, but you can go get three generations right now. The link will be in the show notes. Go check it out. But now Kev, I wanted to hear more about what you did with Suno because to me, this is what your sweet spot, right?

Gavin Purcell: Like you are a musician, you have a music background. I always love when you play around with Suno because you're able to kind of get a sense of like what this thing can actually do. Tell us a little bit about what's possible here.

Kevin Pereira: , so this is a little garage band, something that I plinked out in a few [00:42:00] minutes that I long abandoned, but here you'll hear a little bit of it. It's a little country Western twangy thing. There's some steel guitar

Kevin Pereira: A bit of a vibe set, right? But certainly not a song and nothing. I spent time designing the instruments are, but I took that, put that into Suno and said, Hey, let's remaster this using V4, let's turn this into something new

Kevin Pereira: Then you can take it and you can get stems where it separates the different instruments and [00:43:00] the vocals, if there are any.

Kevin Pereira: And arrange them in a garage band, a logic, whatever your workstation of choices, and then put those back into Suno with new arrangement and say, great, now take that and run with it.

Gavin Purcell: when you're doing that, how did the guitar solos appear? Did you, did you prompt for solos or did you, you did?

Kevin Pereira: did. Yeah. Yeah. I took arrangements of things that I liked and would make a very basic something and then feed it back in. Here's the crazy thing about V4 too. You can remaster a remaster. So you can go deeper within the song.

Kevin Pereira: So there's pieces in this thing that are like the third and fourth generation of the same thing. You're basically taking the same seed and regenerating it. And so, yeah, just iterating playing, but asking it for a solo and letting it go nuts.

Kevin Pereira: And we arrived at a point, Gavin, where I sent remixes of friends' songs to them, accomplished musicians, and one was like, immediately, wow. I want to record that version of that song.

Gavin: Oh, cool.

Kevin Pereira: make that. That's amazing. So the machine inspired a capital-A artist to do something. And then another friend, I deeply disappointed.

Kevin Pereira: They are disillusioned. They, yeah, yeah. They were not too pleased, not specifically with

Gavin: Right, right,

Kevin Pereira: it. They understand

Gavin Purcell: But just the ability that it could happen.

Kevin Pereira: Yeah, yeah. And that's the, and here we are once again, straddling this, this double edged sword. We are sitting on the fence, this very sharp fence. It's, it's tough because like it got me excited.

Kevin Pereira: I want to get back on my drums and play along with the thing that the machine gave me. But I also don't fault anybody for saying, I don't want to pick up sticks or a guitar ever again. Both realities can exist. I

Gavin Purcell: this particular tool is that Suno is one of the tools that's providing a lot deeper ability to manipulate things than, than it was before. Right. And like, it's not just prompt to get a song [00:45:00] out and you can do stuff with that. And so I would encourage. Musicians who are curious about this stuff and I would hope that Suno, maybe they've got a program like this, but like Suno should start some sort of program where they're going to give like musicians like free credits to play around with.

Gavin Purcell: I know everybody can get some free credits on Suno, but some of these products, like the cover song feature, are paid tools and everything like that. But they would be well served by getting a lot of actual musicians trying these things. At least I know they've got Timbaland involved in the company, but like, that's a cool way to do stuff, is to like say to the musicians, like, hey, try these things.

Gavin Purcell: Maybe they're useful for you. And then you kind of start opening the door in the same way that Adobe has, I think to artists.

Kevin Pereira: And just to level set on the rate of progress again, we found them pretty early on, a year and change ago, and from the "oh, there could be something here" to now, we have

Kevin Pereira: Basically CD quality sound generation happening where you can separate the stems. You can style transfer your own voice or instrumentation. You can lock the sound of a singer or of a band and then style [00:46:00] flip them. All of these features are coming online as the sound models are getting better.

Kevin Pereira: Like, I do not know where we will be this time next year, but the notion of a 24/7 procedurally generated radio station where you can talk to an AI and say, hey, I'm about to go do a workout, make this more heavy metal and, and, and driving. And it does that, you know, and then you say, oh, I'm in my car.

Kevin Pereira: I need to chill out now. And then it style transfers that. We literally don't know where this is heading, and it's so exciting.

Gavin Purcell: Yeah. I was going to say, and the good news is if you're listening to this podcast, you're somebody who is at least aware of where it might be heading. So keep listening, keep watching. Thanks everybody for being here. We really appreciate it as per usual.

Gavin Purcell: And we will see you all next week, baby.

Kevin Pereira: Bye bye.