From the very beginning of the AI boom, one debate has been ever-present: content ownership. To develop their services, AI companies ingest massive amounts of data from the open internet, yet many publishers feel unfairly disadvantaged because they receive no compensation for it. The debate flared up again when Google’s stance on paying for AI training data came under scrutiny during a hearing with the UK’s Lords Communications and Digital Committee.
Roxanne Carter, a public affairs executive at Google, told the committee that the company does not believe it should pay for “freely available” content used to train its AI models.
Google defends its position of not paying publishers for AI training on public data
Google’s case rests on a specific characterization of how AI works. According to Carter, models like Gemini are not databases or information-retrieval systems. Instead, they analyze huge amounts of data to find statistical links and patterns between words and ideas. The end goal, Google says, is to use those patterns to generate “wholly new content,” not simply to copy what publishers and creators have already made.
Google won’t pay for training on the open web, but it does distinguish between general web scraping and specialized access. The company is actively striking deals for archival content and specialized datasets that aren’t publicly available. In short, the firm is willing to pay for “access” to data it cannot freely reach, but not for training its AI on what it considers the public domain of the internet.
Google’s AI Overviews: The opt-out dilemma
For publishers, the situation is more complex. Google highlights a tool called “Google Extended,” which lets website owners stay in Google Search while opting out of having their content used to train AI models like Gemini. On paper, this sounds like a fair compromise. But a significant grey area remains around “AI Overviews,” the AI-generated summaries that appear at the very top of search results.
When asked whether publishers could opt out of AI Overviews specifically, Google’s representatives were vague. Certain markup tags currently appear to keep content out of these summaries, but adding them can also reduce a site’s visibility in regular search results. This puts smaller publishers in a tough spot: let AI summarize their work (and likely lose direct clicks), or risk losing their search ranking altogether.
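For site owners weighing this trade-off, the controls Google points to take the form of crawler and snippet directives. A minimal sketch is below; the `Google-Extended` token and the `nosnippet` rule are documented by Google, but whether snippet controls reliably exclude a page from AI Overviews without hurting regular search presentation is exactly the grey area the committee probed:

```text
# robots.txt: block the Google-Extended token so content is not used
# for Gemini training. Regular Googlebot indexing is unaffected.
User-agent: Google-Extended
Disallow: /

# Per-page alternative (in the HTML <head>): forbid snippets entirely,
# which also keeps the content out of AI Overviews, at the cost of
# losing snippets in ordinary search results too.
<meta name="robots" content="nosnippet">
```

Note that neither directive is a targeted “opt out of AI Overviews only” switch, which is the gap publishers raised at the hearing.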
The smaller players could be the most affected
Government officials worry that while big media companies can negotiate lucrative deals with tech giants, smaller publishers often miss out. The concern is that AI summaries compete directly with the articles they summarize, using a creator’s own work to keep users on the search page instead of sending them to the original source.
Regulatory bodies continue to consult on these issues, and the definition of “fair use” in the age of AI remains the ultimate question. For now, Google maintains that the open web is a free classroom for its AI, even as the creators of that content argue that their “free” information is what makes the AI valuable in the first place.