🤖 The End of Free AI Training Data?

Plus: Publishers Sue Google and Anthropic Writes AI Rulebook

Hello AI Enthusiast,

This week EU publishers are suing Google over AI Overviews that steal their traffic, Cloudflare launched a marketplace where publishers can charge AI bots for scraping their content, and Anthropic is proposing its own transparency framework. It's a messy three-way fight over who controls AI's future - and who gets paid for it.

The Big Picture 🔊

Publishers Sue Google Over AI Overviews in EU

Independent publishers filed an EU antitrust complaint against Google's AI Overviews, claiming the feature uses their content without permission to generate summaries that appear above search results. Publishers say this reduces website traffic and revenue since users get answers directly from AI summaries. Google says the feature creates new opportunities and drives billions of clicks to websites.

James Varnham, CEO and Rainmaker

The publishers have a point about lost traffic, but they're fighting yesterday's war. AI Overviews improve the user experience: for simple questions, they're faster and more convenient than scrolling through multiple links. Proving that AI Overviews directly caused traffic drops is also difficult, since website performance depends on many factors.

Traditional search is evolving. Publishers should adapt and focus on being among the sources that AI cites.

Cloudflare Launches Marketplace for AI Bot Scraping Fees

Cloudflare launched "Pay per Crawl," letting website owners charge AI companies micropayments for scraping their content. Publishers can set rates, offer free access, or block bots entirely. The move addresses a major imbalance: OpenAI scrapes sites 1,700 times for every referral it sends back, versus 14 times for Google. Major publishers like Condé Nast and TIME have joined. Both AI companies and publishers must use Cloudflare for the marketplace to work.

Gioele Mottarlini, COO and Image Addict

Cloudflare's marketplace addresses a real problem - AI companies have been freeloading off publishers' content. The scraping ratios are brutal: OpenAI scrapes 1,700 times for every referral it sends back, versus 14 for Google.

The real challenge is the network effect: this only works if most publishers join, but many will hesitate to lock themselves into potentially lower earnings.
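
For readers who want to see how a crawl-to-referral ratio like the ones quoted above is worked out, here is a minimal sketch. The function name and the raw counts are hypothetical, picked only so the arithmetic reproduces the 1,700 and 14 figures; they are not real traffic data and this is not how Cloudflare necessarily measures it.

```python
def crawl_to_referral_ratio(crawls: int, referrals: int) -> float:
    """Return how many crawl requests a bot makes per visitor it refers back."""
    if referrals <= 0:
        raise ValueError("referrals must be positive")
    return crawls / referrals


# Hypothetical counts, chosen only to reproduce the ratios quoted in the text.
openai_ratio = crawl_to_referral_ratio(crawls=17_000, referrals=10)
google_ratio = crawl_to_referral_ratio(crawls=140, referrals=10)

print(f"OpenAI: {openai_ratio:.0f} crawls per referral")
print(f"Google: {google_ratio:.0f} crawls per referral")
print(f"Gap: about {openai_ratio / google_ratio:.0f}x")
```

A publisher could run the same calculation on its own server logs, for any bot, to decide whether to charge, allow, or block it.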

Anthropic Proposes AI Transparency Framework

Anthropic proposed requiring large AI developers ($100M+ revenue or $1B+ R&D) to publicly disclose safety practices. Companies would publish frameworks detailing risk assessments for biological/nuclear harms and testing procedures. The proposal includes whistleblower protections and aims to standardize existing voluntary practices from OpenAI and Google without hindering innovation.

Andrea Mattiello, CM and Board Lover

Anthropic's intentions might be good, and input from AI companies is valuable, but independent third parties with users' safety at heart should have the final say on these frameworks. The easily gamed revenue thresholds and self-certification probably don't help.

With the EU, US, and China moving in different directions, we're heading toward regional AI blocs rather than unified global standards.

While Anthropic writes transparency frameworks and publishers fight back against free scraping, the question you should ask yourself is whether your team knows how to work with AI effectively. Our Corporate AI Training cuts through the hype to show you what actually works in practice.

Bits and Bobs 🗞️

  • AI is revolutionizing fertility care with Columbia University Fertility Center's STAR method, which uses AI to detect and recover hidden sperm in cases of male infertility.

  • Grammarly has acquired Superhuman, aiming to enhance email communication by integrating AI-powered features.

  • Perplexity is launching a $200-per-month subscription plan called Perplexity Max, offering unlimited access for power users.

  • Anysphere has launched a new web app for Cursor, allowing users to manage AI coding agents directly from their browser.

  • Amazon has now deployed 1 million robots in its warehouses and introduced a new AI model called DeepFleet, which enhances robot coordination and boosts their speed by 10%.

  • Ford CEO Jim Farley warns about AI's potential to replace half of white-collar jobs in the U.S., emphasizing the growing need for skilled trade workers.

On the Podcast 🎧

We just published a new podcast episode with Andy Lancaster about what happens when a 30-year learning veteran meets AI. It's a refreshingly honest conversation about creativity, struggle, and why your aunt's handwritten letter hits different than anything AI produces. A real talk about staying human in an AI world. Listen now.

That's a wrap on our newsletter! Before you go, here’s a quick recap of our offerings:

  • AI Academy Membership: Get 12 months of access to all our cohort-based programs, live webinars, on-demand courses, and tutorials.

  • AI Agent Bootcamp: Accelerate processes and solve business problems by mastering prompts and building AI Agents, without coding.

  • Practical Introduction to ChatGPT: A free course on using ChatGPT confidently, understanding its workings, and exploring its potential.

  • Customized Corporate Training: Equip your team with the skills they need to unlock the potential of AI in your business.

Catch you next week! 👋