🤖 OpenAI Makes Us Forget How Bad DALL-E Was
Plus: Voice Assistant Improvements and Google's Latest AI Model
Hello AI Enthusiast,
This week brings major AI updates. OpenAI has finally delivered an image generator in GPT-4o that makes previous versions look primitive, especially with text rendering that actually works. They've also refined their voice assistant to feel more natural by reducing interruptions. Meanwhile, Google has launched Gemini 2.5 Pro, claiming benchmark-topping performance. Let's explore these developments.
The Big Picture 🔊
OpenAI Integrates Image Generation into GPT-4o
OpenAI has added image generation directly to its GPT-4o model. The integration focuses on rendering text in images, following prompts closely, and keeping results consistent across multiple generations. The model can handle multiple objects in an image and can use user-uploaded images as references. It is currently available to Plus, Pro, Team, and Free users, and it has known limitations in areas such as cropping, non-Latin text, and editing precision. API access is expected in the coming weeks.
Is your team ready for AI tools like GPT-4o's image generator? Our AI Assessment helps you understand where you stand and how to work better with these technologies. See exactly where you could improve to boost productivity with our free evaluation.
OpenAI Updates Voice Assistant as Study Examines AI's Emotional Impact
OpenAI has upgraded its Advanced Voice Mode to reduce interruptions and improve personality, making ChatGPT's voice interactions more natural. This update comes amid growing competition from companies like Sesame and Amazon. Meanwhile, a joint OpenAI-MIT study found that emotional engagement with ChatGPT is rare for most users, with affective interactions concentrated among a small subset of heavy voice users. The research showed mixed effects on well-being; voice modes improved well-being with brief use but were associated with worse outcomes during prolonged daily engagement.
Google Unveils Gemini 2.5 Pro
Google has released Gemini 2.5 Pro Experimental, a model designed to analyze information before providing responses. According to Google, the model performs well on various reasoning and coding benchmarks. It keeps a 1-million-token context window, with a 2-million-token window planned soon. It is currently available to Gemini Advanced users and in Google AI Studio, with Vertex AI integration planned.
Bits and Bobs 🗞️
Google is rolling out AI features to Gemini Live, enabling it to interpret screen content and live video from smartphone cameras to answer real-time questions.
Claude can now access the internet to provide more up-to-date and relevant responses, including direct citations for easy fact-checking.
Anthropic’s newly introduced "think" tool significantly enhances Claude's complex problem-solving by allowing an additional thinking step during response generation to assess completeness of information.
Adobe's latest Adobe Experience Platform Agent Orchestrator is set to enable businesses to leverage AI agents for real-time data connection and personalization.
At the RightsCon digital rights conference in Taiwan, concerns were raised about the reliance on US-based tech companies, prompting discussions on developing local AI solutions, especially for content moderation.
From Our Founder’s Channels 🤳
Want to know more about the implications of OpenAI’s new image generation? Check out Gianluca’s video.
@gianluca.mauro: The new OpenAI image tool and the death of mediocrity
LOLgorithms 😂
Actually, he could have chosen a better image to promote it.
That's a wrap on our newsletter! Before you go, here’s a quick recap of our offerings:
AI Academy Membership: Get 12 months of access to all our cohort-based programs, live webinars, on-demand courses, and tutorials.
Generative AI Project Bootcamp: Accelerate processes and solve business problems by mastering prompts and building AI prototypes, without coding.
Practical Introduction to ChatGPT: A free course on using ChatGPT confidently, understanding its workings, and exploring its potential.
Customized Corporate Training: Equip your team with the skills they need to unlock the potential of AI in your business.
Catch you next week! 👋