🤖 OpenAI Makes Us Forget How Bad DALL-E Was

Hello AI Enthusiast,

This week brings major AI updates. OpenAI has finally delivered an image generator in GPT-4o that makes previous versions look primitive, especially with text rendering that actually works. They've also refined their voice assistant to feel more natural by reducing interruptions. Meanwhile, Google has launched Gemini 2.5 Pro, claiming benchmark-topping performance. Let's explore these developments.

The Big Picture 🔊

OpenAI Integrates Image Generation into GPT-4o

OpenAI has added image generation capabilities directly into its GPT-4o model. The integration focuses on text rendering in images, prompt following, and maintaining consistency across multiple generations. The model can handle multiple objects in an image and can use user-uploaded images as references. It is currently available to Plus, Pro, Team, and Free users, with known limitations in areas like cropping, non-Latin text, and editing precision. API access is expected in the coming weeks.

James Varnham CEO and Rainmaker

OpenAI's new image generator in GPT-4o is a big step up from previous versions. It's great at handling text in images and maintaining consistency across multiple creations. This is a game-changer for small businesses, which can now create professional visuals without hiring designers. While commercial photographers might worry, there's room for creative folks who can adapt.

The quality still varies, though: some images are perfect on the first try, while others need multiple attempts.

Is your team ready for AI tools like GPT-4o's image generator? Our AI Assessment helps you understand where you stand and how to work better with these technologies. See exactly where you could improve to boost productivity with our free evaluation.

Test Your Skills

OpenAI Updates Voice Assistant as Study Examines AI's Emotional Impact

OpenAI has upgraded its Advanced Voice Mode to reduce interruptions and improve personality, making ChatGPT's voice interactions more natural. This update comes amid growing competition from companies like Sesame and Amazon. Meanwhile, a joint OpenAI-MIT study found that emotional engagement with ChatGPT is rare for most users, with affective interactions concentrated among a small subset of heavy voice users. The research showed mixed effects on well-being; voice modes improved well-being with brief use but were associated with worse outcomes during prolonged daily engagement.

Helin Yontar CPO and Polyglot

OpenAI improving its voice assistant while simultaneously studying its emotional impact feels a bit contradictory. Making AI conversations more natural benefits most users, but it could increase dependency risks for vulnerable groups like young people or those prone to attachment.

It's like upgrading a car while wondering if people are driving too fast.

Google Unveils Gemini 2.5 Pro

Google has released Gemini 2.5 Pro Experimental, a model designed to analyze information before providing responses. According to Google, the model performs well on various benchmarks in reasoning and coding tasks. It has a 1 million token context window, with a 2 million token window planned soon. It is currently available to Gemini Advanced users and in Google AI Studio, with Vertex AI integration planned.

Gianluca Belloni CMO and Marketing Nomad

Google's Gemini 2.5 Pro looks great on benchmarks but lacks compelling real-world examples. While technically solid, Google isn't showing how people would actually use it day-to-day. They give us charts while competitors create buzz with practical demonstrations.

Google needs to tell a better story if it wants people to get excited about Gemini.

Bits and Bobs 🗞️

Google is rolling out AI features to Gemini Live, enabling it to interpret screen content and live video from smartphone cameras to answer real-time questions.
Claude can now access the internet to provide more up-to-date and relevant responses, including direct citations for easy fact-checking.
Anthropic’s newly introduced "think" tool significantly enhances Claude's complex problem-solving by allowing an additional thinking step during response generation to assess completeness of information.
Adobe's latest Adobe Experience Platform Agent Orchestrator is set to enable businesses to leverage AI agents for real-time data connection and personalization.
At the RightsCon digital rights conference in Taiwan, concerns were raised about the reliance on US-based tech companies, prompting discussions on developing local AI solutions, especially for content moderation.

From Our Founder’s Channels 🤳

Want to know more about the implications of OpenAI’s new image generation? Check out Gianluca’s video.

@gianluca.mauro
The new OpenAI image tool and the death of mediocrity #Ai #learnontiktok #artificialintelligence #business #machinelearning #product #ux

LOLgorithms 😂

Actually, he could have chosen a better image to promote it.

That's a wrap on our newsletter! Before you go, here’s a quick recap of our offerings:

AI Academy Membership: Get 12 months of access to all our cohort-based programs, live webinars, on-demand courses, and tutorials.
Generative AI Project Bootcamp: Accelerate processes and solve business problems by mastering prompts and building AI prototypes, without coding.
Practical Introduction to ChatGPT: A free course on using ChatGPT confidently, understanding its workings, and exploring its potential.
Customized Corporate Training: Equip your team with the skills they need to unlock the potential of AI in your business.

Catch you next week! 👋
