🤖 OpenAI's Model Maze Keeps Growing
Plus: Google's Video Generation and Thinking Controls
Hello AI Enthusiast,
This week we're looking at the different approaches companies are taking to advance AI: more powerful models, new "thinking controls" that let you decide how much reasoning a task gets, and impressive video generation. Let’s see where AI tools are headed.
The Big Picture 🔊
OpenAI's New Reasoning Models
OpenAI released o3 and o4-mini, their latest reasoning models, which show impressive skills in coding, math, and image analysis. These models can "think with images," run Python code in ChatGPT, and score significantly higher on technical benchmarks. However, according to OpenAI's own internal tests and third-party researchers, they hallucinate more often than their predecessors: o3 makes up facts on 33% of people-related questions (roughly double the rate of earlier models), while o4-mini reaches 48%.
Google Launches Gemini 2.5 Flash with "Thinking" Controls
Google has released Gemini 2.5 Flash with adjustable "thinking" capabilities that developers can toggle on or off. Users can also set a specific "thinking budget" to balance quality, speed, and cost; within that budget, the model decides how much reasoning each task needs. Google claims it offers the best price-to-performance ratio while maintaining strong results on complex problems. It is available now through Google AI Studio, Vertex AI, and the Gemini web and mobile apps.
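Curious what a "thinking budget" looks like in practice? Here is a minimal sketch that only builds the request body and does not call Google's servers. The field names (generationConfig, thinkingConfig, thinkingBudget) and the 0 to 24,576 token range reflect our understanding of the Gemini API at launch and should be treated as assumptions; build_request is a hypothetical helper of ours, not part of any Google SDK. Check the current API docs before relying on it.

```python
import json

# Assumption: the launch material describes budgets from 0 (thinking off)
# up to 24,576 tokens. Verify against the current Gemini API docs.
MAX_THINKING_BUDGET = 24576


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a request body with a clamped thinking budget (hypothetical helper)."""
    # Clamp the budget into the assumed valid range.
    budget = max(0, min(thinking_budget, MAX_THINKING_BUDGET))
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        # Assumed field names for the thinking controls.
        "generationConfig": {"thinkingConfig": {"thinkingBudget": budget}},
    }


# A quick lookup: spend nothing on reasoning (fast and cheap).
quick = build_request("Translate 'hello' to Spanish.", 0)

# A harder task: allow the model a larger reasoning budget.
deep = build_request("Plan a three-step database migration.", 8192)

print(json.dumps(deep, indent=2))
```

The idea is simply that you pay for reasoning only where it helps: a budget of zero for simple lookups, a larger one for multi-step problems.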
As models get smarter but less reliable, the real power lies in building AI agents tailored to your workflow. Join us on April 30 at 6pm CEST for a free webinar: Build Your Own AI Agents – Automation That Actually Works. We will unveil our revamped AI Agent Bootcamp and show you how to create decision-making systems that integrate with your tools.
Google Launches Veo 2 for Video Generation
Google has released Veo 2, their new video generation model, to Gemini Advanced users and developers. The system creates 8-second high-resolution videos from text descriptions or animates existing images through their Whisk tool. All content is watermarked with SynthID to indicate AI creation. The service is available to Google One AI Premium subscribers with monthly generation limits, while developers can access it through Google AI Studio.
Bits and Bobs 🗞️
Gemma 3 is now available in quantization-aware training (QAT) versions, which let models like the 27B variant run on consumer GPUs like the RTX 3090, bringing advanced AI to everyday users.
Microsoft is gradually rolling out a preview of Recall, a feature that captures screenshots for later retrieval, to Windows Insiders.
OpenAI is reportedly developing its own social network, potentially integrating it within ChatGPT.
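Why does quantization matter for the Gemma 3 item above? A back-of-the-envelope calculation shows it. The sketch below estimates memory for the model weights only, ignoring activations and the KV cache. It assumes the QAT versions use 4-bit weights, which the item above does not state, and it uses the RTX 3090's 24 GB of VRAM. weight_memory_gb is a hypothetical helper written for this illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


RTX_3090_VRAM_GB = 24  # VRAM on an RTX 3090

# Compare 16-bit weights with an assumed 4-bit quantized format.
for label, bits in [("16-bit", 16), ("4-bit (assumed QAT format)", 4)]:
    gb = weight_memory_gb(27, bits)
    verdict = "fits" if gb < RTX_3090_VRAM_GB else "does not fit"
    print(f"27B at {label}: ~{gb:.1f} GB of weights, {verdict} in {RTX_3090_VRAM_GB} GB")
```

Real-world usage is higher than the weights alone, so treat these numbers as a lower bound.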
Tribal News 🫂
Last week, the 11th edition of our Gen-AI Project Bootcamp came to a close (a revamped version, now called the AI Agent Bootcamp, is about to start). Once again, we were amazed by the projects our students built! 🫶
LOLgorithms 😂
Picking the right OpenAI model. The struggle is real.
That's a wrap on our newsletter! Before you go, here’s a quick recap of our offerings:
AI Academy Membership: Get 12 months of access to all our cohort-based programs, live webinars, on-demand courses, and tutorials.
AI Agent Bootcamp: Accelerate processes and solve business problems by mastering prompts and building AI Agents, without coding.
Practical Introduction to ChatGPT: A free course on using ChatGPT confidently, understanding its workings, and exploring its potential.
Customized Corporate Training: Equip your team with the skills they need to unlock the potential of AI in your business.
Catch you next week! 👋