📽️ OpenAI's Plot Twist in Video Making
Plus Google's unveiling of Gemini 1.5 and NVIDIA's personalized GPT chatbot
Hello AI Enthusiast,
In our last edition, we introduced a new twist to our newsletter: alongside the most significant AI news, we shared our thoughts, offering a personal touch. We were eager to see if this addition resonated with you, and the response was overwhelmingly positive, with 86% of our readers appreciating this deeper dive. That makes us happy! 😃
SPOILER: The first news piece we're dissecting is about OpenAI's Sora, arguably the most discussed recent AI release. If you've been keeping up with AI trends (and we bet you have), you've likely heard the buzz.
Before we dive into the latest AI advancements, a quick reminder: there are still a few spots open for the next edition of the Master in Prompt Engineering, starting next month.
This time around, we're enriching the program with innovative features, including an AI tutoring bot to provide immediate answers to your questions about the educational content. Plus, we're bringing alumni from previous editions into the mix, making the experience more inclusive and relatable, fostering a sense of community and shared knowledge.
Now let’s see what happened over the past few days.
News Bytes 🗞️
OpenAI has introduced Sora, an AI model that generates high-quality videos from text prompts. Following detailed user instructions, Sora can create scenes up to a minute long, drawing on elements of the real world to produce credible interpretations of the prompt. The model is currently available for testing to experts in areas like misinformation and to creative professionals such as visual artists, designers, and filmmakers. It still has limitations, however: it can struggle to accurately simulate the physics of complex scenes and to model cause and effect in specific situations.
💡 Our take: Sora will probably help low-budget movie producers and marketers create pretty good videos just by typing what they want. This could change the game by making professional-looking video much cheaper to produce, even though OpenAI hasn't shared how much it costs to run Sora (we expect it to be quite expensive). Still, it's a big deal: it raises the bar for AI-generated video. It's also important that OpenAI is adding watermarking technology to show which videos are made by AI and which by humans, keeping things honest. That said, OpenAI chose to address safety and regulation only after building the model, an approach that may raise ethical dilemmas.
What's your take on OpenAI's Sora?
Google has unveiled Gemini 1.5, an AI model that showcases impressive improvements in performance, including an exciting breakthrough in long-context understanding. The new model can consistently process a staggering 1 million tokens of information, letting it reason efficiently over vast, complex inputs from diverse sources like text, video, and audio and opening new possibilities for developers and enterprises building AI applications.
💡 Our take: Google's Gemini 1.5 showcases remarkable abilities in understanding long-context information, marking a significant advancement in AI technology. With the capability to process up to 1 million tokens at a time, their "Needle in a Haystack" experiment demonstrates exceptional precision in extracting specific details from an extensive context. However, the absence of publicly available research on their methodologies leaves us curious about the inner workings of this technology. In contrast to OpenAI's approach of making AI technologies accessible to a broader audience, Google seems to focus more on catering to enterprise-level needs with its AI developments.
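For context, a "Needle in a Haystack" test hides a single specific fact (the needle) at a random depth inside a very long stretch of unrelated text (the haystack) and then asks the model to retrieve it. Below is a minimal sketch of the idea; `query_model` is a hypothetical placeholder for whatever long-context model you call, not an actual Gemini API.

```python
import random

def build_haystack(filler_sentences, needle, depth_fraction):
    """Insert the needle at a given relative depth inside the filler text."""
    position = int(len(filler_sentences) * depth_fraction)
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    return " ".join(sentences)

def run_trial(query_model, filler_sentences, needle, question, expected_answer):
    """One trial: hide the needle at a random depth and check if the model finds it."""
    depth = random.random()  # 0.0 = start of context, 1.0 = end of context
    context = build_haystack(filler_sentences, needle, depth)
    prompt = f"{context}\n\nQuestion: {question}\nAnswer concisely."
    answer = query_model(prompt)  # placeholder call to your long-context model
    return expected_answer.lower() in answer.lower(), depth
```

Running many such trials at different depths and context lengths gives the retrieval-accuracy grid that Google reports for Gemini 1.5.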
NVIDIA has introduced 'Chat with RTX', a free-to-download tech demo that equips RTX GPU-powered Windows PCs with a personalized GPT chatbot. Users can connect local data on their PC to a large language model for quick and relevant responses, and the tool processes sensitive data locally, ensuring user data stays on-device rather than in the cloud.
💡 Our take: Many companies have been cautious about adopting AI technologies, primarily due to concerns about data privacy and protection. NVIDIA's 'Chat with RTX' introduces a potential game-changer by processing sensitive data locally. This approach could significantly reassure businesses that have been hesitant to use such technology for fear of data breaches.
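NVIDIA hasn't published the internals of Chat with RTX, but the general pattern of connecting local files to an on-device model usually amounts to a small retrieval step followed by a locally run generation step. Here is a deliberately naive sketch of that pattern, assuming a `local_llm` callable that stands in for whatever model runs on your GPU; none of this is NVIDIA's actual code.

```python
from pathlib import Path

def load_documents(folder):
    """Read plain-text files from a local folder; nothing leaves the machine."""
    return {p.name: p.read_text(errors="ignore") for p in Path(folder).glob("*.txt")}

def retrieve(documents, query, top_k=3):
    """Very naive relevance scoring by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents.items(),
        key=lambda item: len(query_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

def answer_locally(local_llm, documents, question):
    """Build a prompt from locally retrieved context and call the on-device model."""
    context = "\n\n".join(retrieve(documents, question))
    prompt = f"Use only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
    return local_llm(prompt)  # the model runs on the local GPU, so data stays on-device
```

The privacy benefit comes entirely from the fact that both the retrieval and the generation happen on the user's machine; real tools replace the word-overlap step with proper embeddings.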
Anthropic is testing a technology called Prompt Shield in its GenAI chatbot, Claude, designed to identify when users ask political or voting-related questions and redirect them to trusted sources of voting information. The feature aims to counteract misinformation ahead of the upcoming U.S. presidential elections and addresses the fact that Claude cannot provide reliable real-time political information, since its training data is not updated frequently.
Google has rolled out Gemma, a new family of four open models grouped into two sizes: small (2B parameters) and large (7B parameters). The models are designed to run efficiently on common devices such as laptops and phones. Each size comes in a base version and a variant tuned for instruction-following tasks. Through a collaboration with Hugging Face, a popular AI community platform, Gemma is accessible to a wide array of users.
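For readers who want to try Gemma, the Hugging Face transformers library is the most direct route. A minimal example might look like the sketch below; the repository name used here (google/gemma-2b-it) is our assumption and should be checked on the Hugging Face Hub, where access may also require accepting Google's terms on the model page.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # assumed repo name; verify on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Ask the instruction-tuned 2B model a short question.
inputs = tokenizer("Explain mixture-of-experts in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```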
Google launched the AI Opportunity Initiative for Europe, aiming to make AI training and skills more accessible to all, with special attention to vulnerable communities. The initiative includes AI training programs in 18 languages, a €25 million fund to support AI training across Europe and a special support framework for startups using AI to combat societal challenges. We’re happy to see other companies joining our effort to make AI more accessible.
OpenAI has launched an invitation-only community forum to involve individuals in responsible AI efforts, offering paid opportunities to support research projects. The platform aims to gather diverse perspectives to shape the development and deployment of AI technologies, emphasizing the importance of inclusive participation in creating beneficial AI for humanity.
OpenAI terminated accounts of state-affiliated threat actors who tried to exploit their tools for malicious cyber activities, collaborating with Microsoft to disrupt them. While their current AI models have limited capabilities for malicious tasks, OpenAI is showing its commitment to fighting threats and maintaining platform integrity.
Reddit has signed a lucrative AI data licensing deal worth around $60 million annually, potentially setting a trend for similar agreements in the future. This move could significantly boost Reddit's IPO valuation by tapping into the growing enthusiasm for AI technologies in the corporate world.
Former Salesforce co-CEO Bret Taylor's new AI company, Sierra, aims to provide advanced customer service through AI agents that can take actions beyond answering questions. The platform is already being used by major consumer brands like SiriusXM and WeightWatchers, showing potential for transforming customer interactions via AI.
The USPTO has clarified that while AI systems cannot be listed as inventors on patents, humans can qualify as inventors even when assisted by AI. Significant human input is necessary for a patentable invention, with oversight of an AI system not equating to inventorship.
Canadian airline Air Canada had to compensate a traveler after its chatbot gave wrong advice about refund rules, in one of the first cases where a company was held legally responsible for incorrect information provided by its chatbot. The story highlights how important it is to test and configure AI tools properly, and for companies to accept responsibility when those tools fail.
Educational Pill 💊
How Google’s new AI model works
Google's new Gemini 1.5 model uses a Mixture of Experts (MoE) architecture to deliver enhanced performance, particularly in understanding long contexts across various types of data.
Think of MoE like a team of specialists, where each member is really good at a specific task. Here, the “experts” are smaller neural networks inside the model, each of which performs best on its own kind of input. When Google's Gemini 1.5 model gets a job (like understanding a sentence or recognizing something in a picture), a routing mechanism decides which expert (or small set of experts) is best suited for that particular job. Instead of asking the whole team to work on it, only the chosen experts do the work, making the process faster and more efficient.
Gemini 1.5 uses this approach to tackle huge amounts of data or complex questions without consuming too much compute. It can look at a vast amount of information at once (up to 1 million tokens — pieces of data like words or parts of images) and understand it deeply, thanks to its “team of experts”.
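To make the routing idea concrete, here is a toy sketch of a single MoE layer with top-1 routing. Gemini 1.5's real architecture is not public, so every size and expert below is made up purely for illustration.

```python
import numpy as np

# Toy Mixture-of-Experts layer: a router picks one expert per token,
# so only a fraction of the parameters is used for each input.
rng = np.random.default_rng(0)
HIDDEN = 16
NUM_EXPERTS = 4

# Each "expert" is just a small feed-forward transform here.
expert_weights = [rng.standard_normal((HIDDEN, HIDDEN)) * 0.1 for _ in range(NUM_EXPERTS)]
router_weights = rng.standard_normal((HIDDEN, NUM_EXPERTS)) * 0.1

def moe_layer(token_vector):
    """Route one token to its single best expert (top-1 routing)."""
    scores = token_vector @ router_weights          # one router score per expert
    probs = np.exp(scores) / np.exp(scores).sum()   # softmax over experts
    best = int(np.argmax(probs))                    # pick the top expert
    output = np.tanh(token_vector @ expert_weights[best])
    return probs[best] * output, best               # only that expert's weights ran

token = rng.standard_normal(HIDDEN)
out, chosen = moe_layer(token)
print(f"token routed to expert {chosen}")
```

In real MoE models the experts sit inside each transformer layer and the router is trained jointly with them, but the core trick is the same: only a fraction of the model's parameters runs for any given token.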
LOLgorithms 😂
Never understood why they made Will Smith devour spaghetti in the first place.
we went from this to Sora in a year
— Haroon Choudery (@haroonchoudery)
10:07 PM • Feb 15, 2024
From our community 🤝
We recently shared a story about an incredible project created by one of our students during our course. As a service designer, he developed a "profile generator" to streamline client workshop preparation, significantly reducing time and simplifying the process. This project is a perfect example of our students taking what they learn and turning it into something awesome. We're extremely proud of how cleverly he applied his new skills! Check out our Instagram to dive deeper into this project.
That is the end of our newsletter.
Remember, if your company is looking to implement AI technologies, we also offer customized corporate training.
See you next week. 👋