
🤖 OpenAI and Google Battle Over Smartest AI Assistant

Plus, Claude’s European availability and IBM’s open-sourced model

Hello AI Enthusiast,

Wow, what a week for AI! OpenAI and Google just made some big announcements that are really shaking things up. It seems like they timed their news to one-up each other, and it's not the first time OpenAI has tried to outshine Google's big moments.

First up, OpenAI rolled out its new GPT-4o model. Then, right after that, Google took the stage at their I/O 2024 event, showing their upgraded Gemini AI integrated across their products.

Both companies are racing to build an AI that can see, talk, translate, and even help with your homework or guide you through a new city. They're transforming how we interact with technology on a daily basis. It's exciting (and maybe a bit scary) to think about.

The details of these updates are big news, and there's a lot to unpack. Let’s dive in!

The Big Picture 🔊

OpenAI Launches GPT-4o Model

OpenAI has introduced GPT-4o, a new model that can handle text, audio, image, and video inputs at the same time. GPT-4o is faster and 50% cheaper than its predecessor, GPT-4 Turbo, and supports around 50 languages. This upgrade makes ChatGPT respond quicker, recognize voice tones, and interact more naturally. During the launch and afterward, OpenAI demonstrated several use cases for this new way of interacting. Starting today, text and image features are available in ChatGPT for both Plus subscribers and free users, with Plus users getting up to 5 times more messages. Developers can now use GPT-4o via the API, and audio and video features will soon be available to trusted partners. Here’s a video showing one of the many uses of the new model.
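For the developers among you: since GPT-4o is exposed through the same Chat Completions API as earlier models, here’s a minimal sketch of what a multimodal request looks like. This only builds the JSON request body (the prompt and image URL are placeholders; actually sending it requires an API key):

```python
import json

# Official Chat Completions endpoint; sending a request requires an API key.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_gpt4o_payload(prompt: str, image_url: str) -> dict:
    """Build a request body mixing text and an image in one user message."""
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                # A single message can carry multiple content parts:
                # here, a text prompt plus an image to look at.
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# Placeholder prompt and image URL, for illustration only.
payload = build_gpt4o_payload("What's in this image?", "https://example.com/photo.jpg")
print(json.dumps(payload, indent=2))
```

The key point is that text and images travel together in one message, rather than through separate endpoints.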

Google I/O 2024 Highlights

At Google I/O 2024, Google's CEO talked about big improvements in Gemini. The AI model is now part of key Google services like Search, Photos, and Workspace. The new version, Gemini 1.5 Pro, can now handle up to two million tokens of information, making it the model with the longest context window. Google also introduced Gemini 1.5 Flash, a faster, more cost-effective version, and Gemini Nano, a smaller multimodal model that runs directly on Android devices, keeping your data private on-device.

Another announcement worth mentioning is Project Astra, a responsive agent that can see and talk, showcasing real-time conversational abilities and a deep understanding of different data types. This is a video of Project Astra in action. Any similarities?

💡Our Take: OpenAI's launch of GPT-4o is a big step forward in AI. The "o" in GPT-4o stands for "omni," meaning it can handle text, audio, image, and video at the same time, responding faster and in a very natural way, almost like talking to a human. This makes interactions smoother and opens up many new possibilities, especially through its API, which will expand further once video and audio features become available. The new desktop app is also exciting, offering us a virtual assistant that can see our screen and talk with us. Interestingly, OpenAI is making these features available to free users, aiming for mass adoption rather than quick profits.

Just one day later, Google responded with major news at their I/O 2024 event. Google's reach, with billions of devices using their systems and apps, means they have a big advantage, even if their AI isn't always the best. Google also showed off Project Astra, an AI assistant that can talk and understand different types of data in real time, similar to what OpenAI has been demonstrating.

The competition between OpenAI and Google is intense, and the battle will come down to who can nail the best user experience and integrate AI into products in the smartest way. Google has the products and OpenAI doesn't, but we wonder whether the rumored partnership between OpenAI and Apple is aiming to change that. 👀

How often do you use ChatGPT or similar AI tools?

After voting, share with us whether the latest enhancements will make you use it even more.


Bits and Bobs 🗞️

Educational Pill 💊

Understanding GPT-4o

GPT-4o, or "omni," seamlessly handles text, audio, images, and video, responding in any of these formats. This model mimics human interaction speeds, answering audio queries almost as quickly as we respond in conversations. Imagine chatting with AI that understands not just your words, but the full context of your questions, including tone and visuals.

Previously, interacting with ChatGPT involved slower responses because it converted spoken words to text and back to speech, losing nuances like tone and background sounds. GPT-4o changes this by using a single model that processes everything together, capturing subtle details like emotions and distinguishing between multiple voices. This makes the AI interaction feel much more fluid and intuitive.

From Our Channels 🤳

Check out this TikTok video of Gianluca getting ready to teach a Harvard class how to add your own data to an AI model. It's probably one of the simplest RAG explanations out there.
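If you can't watch the video, the core RAG idea fits in a few lines: retrieve the document most relevant to a question, then prepend it to the prompt before sending it to the model. Here's a toy sketch where simple keyword overlap stands in for real embedding-based retrieval (all documents and names below are made up for illustration):

```python
import re

def tokens(text: str) -> set:
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))

def score(question: str, doc: str) -> int:
    """Toy relevance score: number of words the question and document share."""
    return len(tokens(question) & tokens(doc))

def retrieve(question: str, docs: list) -> str:
    """Return the document that overlaps most with the question."""
    return max(docs, key=lambda d: score(question, d))

def build_prompt(question: str, docs: list) -> str:
    """Augment the question with the retrieved context before calling a model."""
    context = retrieve(question, docs)
    return f"Context: {context}\n\nQuestion: {question}"

# Your "own data" — in a real system these would be chunks of your documents.
docs = [
    "Our refund policy allows returns within 30 days.",
    "Support is available by email on weekdays.",
]
print(build_prompt("What is the refund policy?", docs))
```

A production setup swaps the keyword overlap for vector embeddings and a vector database, but the retrieve-then-augment loop is the same.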


As we've seen, AI models are getting smarter fast. However, simply subscribing to the latest technology isn't enough to ensure your teams can use AI effectively. Access alone doesn't guarantee efficiency; understanding and skill do.

That’s where our corporate training programs come in. If you want your team to get the best out of AI, chat with our partnerships lead, Helin Yontar. She can fill you in on how we can help!

From the Tribe 🫂

Just when you think you've got all your automations perfected, OpenAI rolls out a new model! This week, our students couldn't help but laugh (and maybe groan a little) about having to update their automations yet again. While this means going back to the drawing board, it's a reminder of the fast-paced world of AI where staying updated is just part of the journey.

LOLgorithms 😂

Sundar did the heavy lifting for us.

That's a wrap on our newsletter!

Catch you next week! 👋