Let There Be Speech

The JEN AI Podcast


When speech is freed, where does it go? A voice-cloned JEN_AI tries audio.

Jenni Munroe

Jan 18, 2025

Transcript

trigger warning: news content (rated G)

// What am I doing and why?

Hi! 👋 You’ve found the JEN_AI project, some seriously experimental expression concocted by me & the AI hive mind.

I’m Jenni Munroe. I work at Google but I’m currently baby-wrangling, and I’m challenging myself to learn the latest in AI & technology by messing around with the latest in AI & technology.

This time - audio. We’re trying out some free speech tools. (Well, some are free, and some come at a cost. Maybe.)


Motivation: Visual media seems to be trending shorter (see: TikTok, if it’s not banned) while 3-hour podcasts (Joe Rogan, Diary of a CEO etc.) have taken off. I figure that if I’m writing bucketloads I should give people an option to listen to JEN_AI while doing something actually useful.

Let’s talk text-to-speech. In 2016 DeepMind made WaveNet, one of the first AI models to generate smooth, natural-sounding speech from raw waveforms instead of smushing word sounds together. I started working at DeepMind in 2017 and was in awe. Hannah Fry and Zach Gleicher did a fun DeepMind podcast segment later about WaveNet, and the team digitally recreated her voice. This is now easier for anyone to do online with mere minutes of speech. I’m terrified to try it. But hey, the world is on fire (LA residents, my heart is with you) - so to hell with it.

I’m trying out:

Read-alouds (just a basic recording in my own voice)
Text-to-speech (mostly using Descript - a podcast and video editor suggested by both Google Gemini and ChatGPT for ease of use and AI features including voice cloning)

Anyway, talk is cheap. Free, even. Let’s go forth with courage.

JEN_AI Challenge #002: Speech.

// First, what are we talking about?

Let’s quickly co-generate a cut of the latest edition of “NET_NEW News”, the AI newsletter we made in our first JEN_AI challenge:

NET_NEW News

Jan 17 2025 - Issue #002

By JEN_AI

This time, new news news.

// IN_CASE YOU MISSED IT

Zuckerberg's Free Speech Pivot

Mark Zuckerberg ditches fact-checkers for a "Community Notes" system, aiming to "restore free expression" on Facebook and Instagram.

Free speech or harmful communication? It’s a balance more delicate than your phone screen. - WSJ

// OUR TECH_FUTURE

Biden Warns of Tech Oligarchs' Rise

In his farewell address, President Biden cautions against the growing power of tech moguls like Musk and Zuckerberg, likening it to a "tech industrial complex" threatening democracy.

Does absolute tech power corrupt? Absolutely - says New York Magazine

// WORTHY_NEWS?

Google Gemini Signs News Partner

Google signed its first deal with a news publisher - The Associated Press - to provide real-time news updates through its Gemini AI chatbot. - AP News

// UH_OH

Australia Proposes 'News Tax' on Tech Giants

The Australian government plans to introduce a levy on high-earning social media and search engine companies to compel them to pay publishers for journalism. - Financial Times

Liked it? Hated it? Share a note with the community.

// Ok let’s make some noise

Now let’s have an AI-voice gang from Descript respond to a few ideas from the newsletters we co-created.

Me, interviewer: “Tell me your wildest hopes for 2025?
What did you not expect from 2024?
How was the Consumer Electronics Show?”

[Audio clip, 0:09]

// AI Voice Gang Transcript
Bradley: “Vice President Musk finally goes to Mars!”
Paula: “Vice President Musk finally goes to Mars.”
Alex: “Quantum got big.”
Bernard: “Quantum got big.”
Niall: “Quantum got big.”
Ruth: “AI juiced everything.”

Alright, enough surrealist poetry. And no-one but me wants lame puns about the science of the very small. NEXT.

// My real voice

A clip of me reading my JEN_AI blog aloud into a phone:

[Audio clip, 0:13]

// My AI-cloned JEN_AI voice

[Audio clip, 0:07]

// Is it any good? Is it useful? Hmm.

Well, it’s uncanny. It would be great if I could use my cloned voice to auto-generate read-alouds of these blog posts. I’m not convinced that the quality of my voice-cloned JEN_AI is up to it, but nor is my real voice, let’s be honest! There are definitely a few odd blips. And we still can’t easily vary voice tone, so I think it works better for things like newsreading, and for short clips only. I do actually listen to a daily 3-minute news podcast that sounds like it’s been read by a machine. So, there’s that.

I was not prepared for the most painful part of this experiment to be reading my blog post out loud, old-school style. I know it seems like I love the sound of my own voice, but yikes - how do people do this for a living!?

// Speech Time

There is so much out there that I could talk about with text-to-speech. Silly voices, hands-free internet browsing, scamming my family, you name it. But in this moment I’ll leave you with a clip from FakeYou.com, a voice-generation site that respectfully doesn’t allow users to mimic world leaders.

With just days to go until President-Elect Trump’s inauguration, I’ve taken this opportunity to offer you parting hope that two apparently-opposing worlds can find common ground. We are all only human after all (mostly), and maybe if we can hear each other’s voices, we might truly be free.

Let’s hope nothing rains on this Perfect Day

[Audio clip, 0:04]

For a meme-explainer, and some Legally Blonde presidential speechwriting tips, see here.

// SOUNDTRACK

“Perfect Day” by Hoku from the Legally Blonde soundtrack

Thanks for being part of this experiment in human/AI hybrid free speech. Comment, like, share or subscribe (also free) for more!


All views expressed are my own (or maybe a chatbot’s). This blog is not endorsed by Google and has been produced while baby-juggling on (unpaid) leave, so there will be mistakes, bad ideas, and things I could have done better with more time. Please comment if so.

One final thing I feel I have to say on this topic. And probably the most important. I realise that in this experiment in free speech I am probably failing to not be a partisan internet troll. That is not my intention and I’d like to get better at this. I am a mother of two British-American sons. As a mother, I would have been proud of my sons if they had achieved even a fraction of what these Great Leaders (Zuckerberg, Musk, Trump and so on) have achieved. And heartbroken for what they would have had to endure. Power brings much greater responsibility but we are also human and have to find ways to work together to work things out, and to work at things we could do better. Playing, hacking and teasing at words, ideas, technology, culture and people in the way this blog does works much better in an environment with more psychological safety and less at stake than our current fractured world, otherwise it can cause more harm than joy. But I’m adding my voice to the noise because I nurse both a baby and an extremely naive hope in the power of communication and working at things to help create a better world, and creating a better future for my kids. Better for life, the universe, and everything.

I’m choosing to believe that we are all doing the best we can with the resources available to us at the time. To speak the truth, or our truth, or both, and to say what we truly want to say - this is a privilege that can’t or doesn’t always come first when doing what we need to do to get on with life. But I’m grateful to have this space held to say mine.

Heaven help you if you’ve read this far! Far harder than to speak is to listen. Our Great Leaders will hopefully never have to hear this multimodal meta mess of a sleep-deprivation hallucination, but YOU are a part of this conversation. Thank you so so much. I hope you got something out of it. Tell me if you did, and tell me what you would change.

And thank you again, from the bottom of my heart, for listening to my speech.