Google’s updated privacy policy says the company reserves the right to scrape and use nearly all the content users post publicly online for its AI projects. The policy states that Google uses publicly available information to train its AI models and develop products like Google Translate, Bard, and Cloud AI capabilities. This expansion of data usage extends beyond the company’s own services, encompassing the entire public web.
This inclusion of external content in Google’s AI endeavors raises significant privacy concerns. While people generally understand that public posts are visible to others, the new policy shifts the question from who can see their writing to how it can be used. Even long-forgotten blog posts or old restaurant reviews might be ingested by Google’s chatbots and resurface in ways that are hard to predict or understand.
Where data-hungry chatbots get their data is also an open question in the post-ChatGPT world. Google and other companies have scraped extensive portions of the internet to fuel their AI systems, a practice whose legality is far from clear. Copyright questions surrounding web scraping are likely to be debated in the coming years, and the consequences are already being felt by consumers in unexpected ways.
Twitter and Reddit, responding to concerns about AI companies scraping their content, have made controversial changes to their platforms. Both companies have restricted access to their APIs, preventing bulk downloads of posts. While this move is aimed at safeguarding intellectual property, it has disrupted third-party tools and services. Twitter even attempted to impose fees on public entities for tweeting, but quickly backtracked due to intense criticism.
Elon Musk has recently highlighted web scraping as a significant issue, attributing various Twitter incidents to the need to prevent data extraction from his site. However, many experts believe that the limitations imposed by Twitter were more likely a response to technical problems stemming from mismanagement. Twitter has not provided further clarification on the matter.