Copyright concerns create need for a fair alternative in AI sector

News Room

When future generations look back at the rise of artificial intelligence technologies, the year 2025 may be remembered as a major turning point, when the industry took concrete steps towards greater inclusion, and embraced decentralised frameworks that recognise and fairly compensate every stakeholder.

The growth of AI has already transformed multiple industries, but the pace of uptake has also raised concerns around data ownership, privacy and copyright infringement. Because AI development is centralised, with the most powerful models controlled by a handful of corporations, content creators have largely been sidelined.

OpenAI, the world’s most prominent AI company, has already admitted that’s the case. In January 2024, it told the UK’s House of Lords Communications and Digital Select Committee that it would not have been able to create its iconic chatbot, ChatGPT, without training it on copyrighted material.

OpenAI trained ChatGPT on everything posted to the public internet prior to 2023, but the people who created that content – much of which is copyrighted – have received no compensation, a major source of contention.

There’s an opportunity for decentralised AI projects, like the one proposed by the ASI Alliance, to offer an alternative approach to AI model development. The Alliance is building a framework that lets content creators retain control over their data, along with mechanisms for fair rewards should they choose to share their material with AI model makers. It’s a more ethical basis for AI development, and 2025 could be the year it gains wider attention.

AI’s copyright conundrum

OpenAI isn’t the only AI company that’s been accused of copyright infringement. The vast majority of AI models, including those that purport to be open-source, like Meta Platforms’ Llama 3 model, are guilty of scraping the public internet for training data.

AI developers routinely help themselves to whatever content they find online, ignoring the fact that much of the material is copyrighted. Copyright laws are designed to protect the creators of original works, like books, articles, songs, software, artworks and photos, from being exploited, and make unauthorised use of such materials illegal.

The likes of OpenAI, Meta, Anthropic, StabilityAI, Perplexity AI, Cohere, and AI21 Labs get around the law by claiming ‘fair use,’ a reference to an ambiguous doctrine in copyright law that allows limited use of protected content without permission from the creator. But there’s no clear definition of what actually constitutes ‘fair use,’ and many authors claim that AI threatens their livelihoods.

Many content creators have resorted to legal action, the most prominent example being the lawsuit filed by the New York Times against OpenAI. In the suit, the Times alleges that OpenAI committed copyright infringement when it ingested thousands of articles to train its large language models. The media organisation claims the practice is unlawful, as ChatGPT is a competing product that aims to ‘steal audience’ from the Times website.

The lawsuit has led to a debate – should AI companies be allowed to keep consuming any content on the internet, or should they be compelled to ask for permission first, and compensate those who create training data?

Consensus appears to be shifting toward the latter. The late former OpenAI researcher Suchir Balaji, for instance, told the Times in an interview that he was tasked with leading the collection of data to train ChatGPT’s models. He said his job involved scraping content from every possible source, including user-generated posts on social media, pirated book archives and articles behind paywalls – all without permission being sought.

Balaji said he initially accepted OpenAI’s argument that if information was posted online and freely available, scraping it constituted fair use. He later began to question that stance, however, after realising that products like ChatGPT could harm content creators. Ultimately, he said, he could no longer justify the practice, and he resigned from the company in the summer of 2024.

A growing case for decentralised AI

Balaji’s departure from OpenAI appears to coincide with a realisation among AI companies that the practice of helping themselves to any content found online is unsustainable, and that content creators need legal protection.

Evidence of this comes from the spate of content licensing deals announced over the last year. OpenAI has agreed deals with a number of high-profile content publishers, including the Financial Times, NewsCorp, Conde Nast, Axel Springer, Associated Press, and Reddit, which hosts millions of pages of user-generated content on its forums. Other AI developers, like Google, Microsoft and Meta, have forged similar partnerships.

But it remains to be seen whether these arrangements will prove satisfactory, especially as AI firms generate billions of dollars in revenue. While the terms of the content licensing deals haven’t been made public, The Information reports they are worth a few million dollars per year at most. Considering that OpenAI’s former chief scientist Ilya Sutskever was paid a salary of $1.9 million back in 2016, the money offered to publishers may fall short of what the content is really worth.

There’s also the fact that millions of smaller content creators – bloggers, social media influencers and the like – continue to be excluded from such deals.

The arguments around AI’s infringement of copyright are likely to drag on for years without resolution. Meanwhile, the legal ambiguity around data scraping, along with the growing recognition among practitioners that such practices are unethical, is helping to strengthen the case for decentralised frameworks.

Decentralised AI frameworks provide developers with a more principled model for AI training where the rights of content creators are respected, and where every contributor can be rewarded fairly.

Sitting at the heart of decentralised AI is blockchain, which enables the development, training, deployment, and governance of AI models across distributed, global networks owned by everyone. This means everyone can participate in building AI systems that are transparent, as opposed to centralised, corporate-owned AI models that are often described as “black boxes.”

Even as the arguments around AI copyright infringement intensify, decentralised AI projects are making inroads, and this year promises to be an important one in the shift towards more transparent and ethical AI development.

Decentralised AI in action

Late in 2024, three blockchain-based AI startups formed the Artificial Superintelligence (ASI) Alliance, an organisation working towards the creation of a “decentralised superintelligence” to power advanced AI systems anyone can use.

The ASI Alliance says it’s the largest open-source, independent player in AI research and development. It was created by SingularityNET, which has developed a decentralised AI network and compute layer; Fetch.ai, focused on building autonomous AI agents that can perform complex tasks without human assistance; and Ocean Protocol, the creator of a transparent exchange for AI training data.

The ASI Alliance’s mission is to provide an alternative to centralised AI systems, emphasising open-source and decentralised platforms, including data and compute resources.

To protect content creators, the ASI Alliance is building an exchange framework based on Ocean Protocol’s technology, where anyone can contribute data to be used for AI training. Users will be able to upload data to the blockchain-based system and retain ownership of it, earning rewards whenever it’s accessed by AI models or developers. Others will be able to contribute by helping to label and annotate data to make it more accessible to AI models, and earn rewards for performing this work. In this way, the ASI Alliance promotes a more ethical way for developers to obtain the training data they need to create AI models.
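The contribution-and-reward flow described above can be sketched as a toy model. This is purely illustrative: the class and parameter names (DataExchange, access_fee) and the 20% annotator share are assumptions for the sake of the example, not the ASI Alliance’s or Ocean Protocol’s actual implementation, which runs on-chain via smart contracts rather than in Python.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    owner: str         # contributor keeps ownership of the record
    access_fee: float  # reward owed each time a model accesses the data
    annotators: list = field(default_factory=list)

class DataExchange:
    """Toy pay-per-access data registry: owners and annotators share fees."""
    def __init__(self):
        self.datasets = {}   # dataset_id -> Dataset
        self.balances = {}   # participant -> accrued rewards

    def contribute(self, dataset_id, owner, access_fee):
        self.datasets[dataset_id] = Dataset(owner, access_fee)

    def annotate(self, dataset_id, annotator):
        # labellers register their work and qualify for a share of fees
        self.datasets[dataset_id].annotators.append(annotator)

    def _credit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0.0) + amount

    def access(self, dataset_id):
        # an AI developer pays the fee; a 20% cut (assumed) goes to annotators
        ds = self.datasets[dataset_id]
        annotator_cut = 0.2 * ds.access_fee if ds.annotators else 0.0
        self._credit(ds.owner, ds.access_fee - annotator_cut)
        for a in ds.annotators:
            self._credit(a, annotator_cut / len(ds.annotators))

ex = DataExchange()
ex.contribute("articles-2024", owner="alice", access_fee=1.0)
ex.annotate("articles-2024", "bob")
ex.access("articles-2024")
print(ex.balances)  # {'alice': 0.8, 'bob': 0.2}
```

The point of the sketch is the incentive structure: ownership stays with the contributor, and every access event triggers a payout rather than a one-off sale.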

Shortly after forming, the Alliance launched the ASI initiative, focused on the development of more transparent and ethical “domain-specific models” specialising in areas like robotics, science, and medicine. Its first model is Cortex, which is said to be modelled on the human brain and designed to power autonomous robots in real-world environments.

The specialised models differ from general-purpose LLMs, which are great at answering questions and creating content and images, but less useful when asked to solve more complex problems that require significant expertise. But creating specialised models will be a community effort: the ASI Alliance needs industry experts to provide the necessary data to train models.

Fetch.ai’s CEO Humayun Sheikh said the ASI Alliance’s decentralised ownership model creates an ecosystem “where individuals support groundbreaking technology and share in value creation.”

Users without specific knowledge can buy and “stake” FET tokens to become part-owners of decentralised AI models and earn a share of the revenue they generate when they’re used by AI applications.
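The staking arrangement amounts to a pro-rata revenue share, which can be illustrated with a minimal sketch. The numbers and class are hypothetical; the actual FET staking mechanics are defined by the Alliance’s own contracts.

```python
class StakingPool:
    """Toy pro-rata revenue share for token stakers (illustrative only)."""
    def __init__(self):
        self.stakes = {}   # user -> tokens staked
        self.rewards = {}  # user -> accrued revenue share

    def stake(self, user, amount):
        self.stakes[user] = self.stakes.get(user, 0.0) + amount

    def distribute(self, revenue):
        # split model revenue in proportion to each user's stake
        total = sum(self.stakes.values())
        for user, amount in self.stakes.items():
            self.rewards[user] = self.rewards.get(user, 0.0) + revenue * amount / total

pool = StakingPool()
pool.stake("alice", 300)
pool.stake("bob", 100)
pool.distribute(40.0)  # alice holds 75% of the stake, bob 25%
print(pool.rewards)    # {'alice': 30.0, 'bob': 10.0}
```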

For content creators, the benefits of a decentralised approach to AI are clear. ASI’s framework lets them keep control of their data and track when it’s used by AI models. It integrates mechanisms encoded in smart contracts to ensure that everyone is fairly compensated. Participants earn rewards for contributing computational resources, data, and expertise, or by supporting the ecosystem through staking.

The ASI Alliance operates a model of decentralised governance, where token holders can vote on key decisions to ensure the project evolves to benefit stakeholders, rather than the shareholders of corporations.

AI for everyone is a necessity

The progress made by decentralised AI is exciting, and it comes at a time when it’s sorely needed. AI is evolving quickly, and centralised AI companies are currently at the forefront of adoption – which, for many, is a major cause of concern.

Given the transformative potential of AI and the risks it poses to individual livelihoods, it’s important that the industry shifts to more responsible models. AI systems should be developed for the benefit of everyone, which means rewarding every contributor for their participation. Only decentralised AI systems have shown they can do this.

Decentralised AI is not just a nice-to-have but a necessity, representing the only viable alternative capable of breaking big tech’s stranglehold on creativity.

Tags: ai, artificial intelligence, machine learning
