
ChatGPT and Google’s Gemini have emerged as leading forces in the race for superior large language models. It’s evident that these platforms have transformed the AI industry. Yet, how they acquire information and manage datasets has been a continuous ethical concern.
BeInCrypto talked to emerging AI projects in Web3, including ChainGPT, Space ID, Sapien.io, Vanar Chain, O.XYZ, AR.IO, and Kindred, to discuss the contemporary concerns of intellectual property rights, copyright, and ownership. A key takeaway was the potential of decentralized artificial intelligence (deAI) as a worthy alternative.
The Rise of LLMs and the Data Acquisition Dilemma
Since their creation, large language models (LLMs) have quickly gained widespread use. In many ways, platforms like OpenAI’s ChatGPT and Google’s Gemini were the public’s first real contact with artificial intelligence (AI) capabilities and their seemingly limitless range of uses.
Yet, these companies have also come under scrutiny for their operations. To remain competitive, AI models need access to vast datasets. LLMs can only generate human-like responses and understand complex queries by processing massive amounts of text.
To make this happen, leading tech giants like OpenAI, Google, Meta, Microsoft, Anthropic, and Nvidia funnel much of the data available on the internet into training their AI models. This approach has raised serious questions about who owns the input these platforms ingest and later regurgitate as output.
Despite AI’s disruptive potential, concerns over intellectual property rights have ended up in highly contested legal battles.
Are AI Companies Building Empires on Stolen Content?
Rapid AI adoption has raised concerns regarding data ownership, privacy, and potential copyright infringement. A key point of contention is using copyrighted material to train centralized AI models that large corporations exclusively control.
“AI companies are building empires on the backs of creators without asking for permission or sharing the spoils. Authors, artists, and musicians have spent years perfecting their craft, only to find their work ingested by AI models that generate knockoff versions in seconds,” Jawad Ashraf, CEO of Vanar Chain, told BeInCrypto.
This issue has indeed caused widespread dissatisfaction. The Vanar Chain CEO added that OpenAI and others have openly admitted to scraping copyrighted material, sparking lawsuits and a broader reckoning over data ethics.
“The crux of the issue is compensation—AI firms argue that scraping publicly available data is fair game, while creators see it as daylight robbery,” Ashraf stated.
Defining the Boundaries of AI-Generated Work
The New York Times filed a lawsuit against OpenAI and Microsoft in December 2023, alleging copyright violations and the unauthorized use of its intellectual property.
The Times accused Microsoft and OpenAI of creating a business model based on the “unlawful copying and use of The Times’s uniquely valuable works.” The newspaper also argued that these models “exploit and, in many cases, retain large portions of the copyrightable expression contained in those works.”
Four months later, eight more news publishers operating in six different US states sued Microsoft and OpenAI over copyright infringement.
The Chicago Tribune, The Denver Post, The Mercury News in California, the New York Daily News, The Orange County Register in California, the Orlando Sentinel, the Pioneer Press of Minnesota, and the Sun Sentinel in Florida all alleged that the two technology companies used their articles without authorization in AI products and misattributed inaccurate information to them.
“Courts are now being forced to answer questions that didn’t exist a few years ago: Does AI-generated content constitute derivative work? Can copyright holders claim damages when their data is used without consent?” Trevor Koverko, co-founder of Sapien.io, told BeInCrypto.
In addition to journalism organizations, publishers, authors, musicians, and other content creators have initiated legal action against these tech companies over copyrighted information.
Legal Battles Across Industries
Just last week, three trade groups announced that they would sue Meta in a Paris court, alleging Meta “massively used copyrighted works without authorization” to train its generative AI-powered chatbot assistants, which are used across Facebook, Instagram, and WhatsApp.
Meanwhile, visual artists Sarah Andersen, Kelly McKernan, and Karla Ortiz sued AI art generators Stability AI, DeviantArt, and Midjourney for using their work to train their AI models.
“There is no end to concerns when it comes to the unregulated use of data and creative material by centralized AI companies. Currently, any artist, author, or musician with publicly available material can have their work crawled by AI algorithms that learn to create nearly identical content—and profit from it while the artist gets nothing,” argued Phil Mataras, founder of AR.IO.
OpenAI and Google, in particular, argue that if legislation limits their access to copyrighted material, the United States would lose the AI race against China. According to them, companies in China operate with fewer regulatory constraints, giving those rivals a key advantage.
These powerhouses are aggressively lobbying the US government to classify AI training on copyrighted data as “fair use.” They maintain that AI’s processing of copyrighted content results in novel outputs fundamentally different from the source material.
However, as generative AI tools increasingly produce text, images, and voices, many industries are pursuing legal challenges against these corporations.
“Content creators—whether they’re authors, musicians, or software developers—often say their [intellectual property] is being used in ways that go beyond fair use, especially when AI systems copy or replicate aspects of their original work,” said Ahmad Shadid, founder and CEO of O.XYZ.
Meanwhile, in Web3, players are advocating for an alternative to traditional corporations’ approach to LLM development.
DeAI Surfaces as the Web3 Alternative
Decentralized AI (deAI) is an emerging field in Web3 that explores using blockchain and distributed ledger technology to create more democratic and transparent AI systems.
“DeAI, leveraging blockchain and distributed ledger technology, aims to address data ownership and copyright concerns by creating more transparent AI systems. It distributes the development and control of AI models across a global network, establishing fairer models for AI training that respect content creators’ rights. DeAI also aims to provide mechanisms for equitable compensation to creators whose work is used in AI training, potentially resolving many of the issues associated with centralized AI models,” explained Max Giammario, CEO and founder of Kindred.
With AI’s growing global prominence, its fusion with blockchain promises to transform both sectors, creating novel avenues for crypto innovation and investment.
In response, builders in the industry have already begun to develop successful projects that merge AI and Web3 technologies.
Unlike the centralized AI models that corporations produce, deAI aims to be fully open-source.
OpenAI has previously argued that it complies with the US fair use doctrine despite using copyrighted material to train its AI models. Moreover, ChatGPT, its most popular application, offers a free tier.
Harrison Seletsky, Director of Business Development at Space ID, highlighted a contradiction in OpenAI’s argument.
“The clear ethical issue is that materials are being used without the explicit permission of their creators. If they are copyrighted, permission must be granted, and typically a fee paid. But beyond that, even if LLMs like ChatGPT use open-source data, OpenAI’s models are not open-source. They make use of publicly available material without fully ‘giving back’ to the sources they pull from.
There’s an overarching question here about whether AI should be open-source. OpenAI’s ChatGPT isn’t, while models like China’s DeepSeek are, as well as decentralized AI. From the perspective of ethics and intellectual property rights, the latter is certainly a better choice,” Seletsky said.
These technological powerhouses’ centralized control also prompts other concerns regarding the implementation and oversight of AI models.
Centralized vs. Decentralized: Ethical and Operational Differences
In contrast to the community-driven nature of deAI, centralized AI models are built by a small number of people, leading to potential biases.
“Centralized AI usually operates under a single corporate umbrella, where decisions are driven by a top-down profit motive. It’s essentially a black box owned and managed by one entity. In contrast, DeAI relies on a community-driven approach. The AI is designed to analyze community feedback and optimize for collective interests instead of just corporate ones,” explained Shadid.
Meanwhile, blockchain technology provides a clear path for monetization.
“Creators can tokenize their creative assets—like articles, music, or even ideas—and set their own prices. This creates a fairer environment for both creators and users of intellectual property, essentially forming a free market for IP. It also makes ownership easy to prove, as everything on the blockchain is transparent and immutable, making it much harder for others to exploit someone’s work without properly aligning incentives,” Seletsky told BeInCrypto.
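To make the mechanism Seletsky describes concrete, here is a deliberately simplified, in-memory sketch, not any real platform’s API: all class and function names are hypothetical. It illustrates how a content hash can double as a token identifier and a proof-of-ownership record, with a creator-set licensing price enforced before access is granted. A real deAI platform would implement this logic as smart contracts on a blockchain rather than a Python dictionary.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class IPRecord:
    creator: str        # address or name of the registered creator
    price: int          # creator-set licensing fee (arbitrary units)
    licensees: list     # who has paid for a license

class ToyIPLedger:
    """Toy stand-in for an on-chain IP registry (illustrative only)."""

    def __init__(self):
        self.records = {}   # content hash -> IPRecord

    def register(self, creator: str, content: bytes, price: int) -> str:
        """Register a work by its hash; the hash serves as the token ID."""
        token_id = hashlib.sha256(content).hexdigest()
        if token_id in self.records:
            raise ValueError("work already registered")
        self.records[token_id] = IPRecord(creator, price, [])
        return token_id

    def license(self, token_id: str, user: str, payment: int) -> bool:
        """Grant a license only if the creator's asking price is met."""
        record = self.records[token_id]
        if payment < record.price:
            return False
        record.licensees.append(user)
        return True

    def owner(self, token_id: str) -> str:
        """Ownership is trivially provable: look up the registered hash."""
        return self.records[token_id].creator
```

The key design point is the one Seletsky raises: because the work’s hash is recorded against its creator, ownership is cheap to prove, and the price is set by the creator rather than negotiated after the fact by an AI firm.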
Different Web3 builders have already developed projects that decentralize content used for generative AI. Platforms like Story, Inflectiv, and Arweave leverage various aspects of blockchain technology to ensure that datasets used for AI models are ethically curated.
Ilan Rakhmanov, founder of ChainGPT, views deAI as a vital counterforce to centralized AI. He asserts that addressing the unethical practices of existing AI monopolies will be essential in cultivating a healthier industry in the future.
“This sets a dangerous precedent where AI companies can freely use copyrighted content without proper attribution or payment. Legally, this invites regulatory scrutiny; ethically, it deprives creators of control. ChainGPT believes in on-chain attribution and monetization, ensuring a fair value exchange between AI users, contributors, and model trainers,” Rakhmanov said.
But for deAI to take center stage, it must first overcome several obstacles.
What Obstacles Does deAI Face?
Though deAI holds significant promise, it is still in its nascent stages. In that respect, companies like OpenAI and Google have the upper hand in economic prowess and infrastructure. They command the vast resources needed to acquire and process such large amounts of data.
“Centralized AI companies have access to massive compute power, while deAI needs efficient, distributed networks to scale. Then there’s data—centralized models thrive on hoarded datasets, while deAI must build reliable pipelines for sourcing, verifying, and compensating contributors fairly,” Koverko told BeInCrypto.
To that point, Ahmad Shadid added:
“Building and running AI systems on distributed ledgers can be complicated, especially if you’re trying to handle massive amounts of data at scale. It also requires careful oversight to keep the AI’s learning processes aligned with community ethics and goals.”
These technological powerhouses can also use their resources and connections to lobby hard against competitors like deAI.
“They might do so by advocating for regulations that favor centralized models, leveraging their market dominance to limit competition, or controlling key resources necessary for AI development,” Giammario said.
For Ashraf, it should be taken as a given that this will happen.
“When your entire business model is built on hoarding data and monetizing it in secret, the last thing you want is an open, transparent alternative. Expect AI giants to lobby against DeAI, push for restrictive regulations, and use their vast resources to discredit decentralized alternatives. But the internet itself started as a decentralized system before corporations took over, and people are waking up to the downsides of centralized control. The fight for open AI is just getting started,” Ashraf anticipated.
However, to further its mission, deAI needs to raise public awareness, reaching both Web3 users and those outside the space.
Bridging the Knowledge Gap
When asked about the main hurdles that deAI currently faces, Seletsky from Space ID said that people need to be aware of the problem of copyright infringement in AI models to solve it.
“The main hurdle is a lack of education. Most users don’t know where the data comes from, how it’s being analyzed and who’s controlling it. Many don’t even realize that AI has biases, just like humans. There’s a need to educate the average person on this before they can understand the advantages of decentralized AI models,” he said.
Once the public understands the copyright issues within centralized AI models, advocates must actively demonstrate deAI’s merits as a strong alternative. Yet even with increased awareness, deAI still faces adoption challenges.
“Adoption is another challenge. Enterprises are used to turnkey AI solutions, and deAI needs to match that level of accessibility while proving its advantages in security, transparency, and innovation,” Koverko said.
The Path Forward: Regulatory Clarity and Public Trust
Beyond education and accessibility, wider deAI adoption hinges on establishing regulatory clarity and building public trust. Koverko added that deAI needs this clarity to reach its goals.
“Without clear frameworks, deAI projects risk being sidelined by legal uncertainty while centralized players push for policies that benefit their dominance. Overcoming these challenges means refining our tech, proving real-world value, and building a movement that pushes for open, democratized AI,” he asserted.
Shadid concurred with the need for greater institutional backing, adding that it should be coupled with building greater public trust.
“Transparency can be unsettling if you’ve spent decades perfecting proprietary methods, so DeAI must prove its superiority in terms of trust and innovation. Another hurdle is building enough user trust and regulatory clarity so that people—and even governments—feel comfortable with how data is handled. The best way to gain traction is to demonstrate real-world use cases where decentralized AI clearly outperforms its centralized counterparts or at least proves it can match them in speed, cost, and quality while being much more open and fair,” Shadid explained.
Ultimately, the copyright concerns surrounding AI models call for a paradigm shift, one that respects intellectual property and promotes a more democratic AI ecosystem, irrespective of deAI’s final impact.
Disclaimer
Following the Trust Project guidelines, this feature article presents opinions and perspectives from industry experts or individuals. BeInCrypto is dedicated to transparent reporting, but the views expressed in this article do not necessarily reflect those of BeInCrypto or its staff. Readers should verify information independently and consult with a professional before making decisions based on this content. Please note that our Terms and Conditions, Privacy Policy, and Disclaimers have been updated.