Oxylabs CEO Julius Černiauskas discusses the ethics of knowledge access and copyright protection in artificial intelligence, and the need for a fair balance between commercial interests and open access.
Recent developments in generative artificial intelligence (Gen AI) tools have added a new dimension to the old problem of acquiring knowledge. On the one hand, information is a fundamental right, and locking it behind paywalls is detrimental to individual and collective human progress. On the other hand, creating and publishing knowledge carries real costs, and the people who do that work should be compensated.
Under the current business model, however, compensation does not always flow to the right people. This has triggered renewed efforts toward a model of fair and open access to information.
Artificial intelligence between knowledge and profit
Last year, we saw a wave of lawsuits against major developers of artificial intelligence tools (such as Google, Meta, and OpenAI) and their investor Microsoft. Some of the lawsuits involve the unauthorized use of copyrighted works to train artificial intelligence models.
American comedian Sarah Silverman is one of the authors suing artificial intelligence companies after her memoir was allegedly used to train artificial intelligence models. According to the AP, she claimed that the book was used “without consent, credit and compensation,” which sums up the main pain point for authors whose work is used to build a commercial end product, e.g. ChatGPT.
Some argue that such practices may qualify as fair use of copyrighted material, depending, among other things, on whether they sufficiently transform the original work and on their effect on its market value. If these practices are banned, AI developers may struggle to find high-quality data to continue their work.
Risks of an AI knowledge bubble
As AI and data companies await court rulings on copyright law issues, the practical problems that arise from withholding data from further training of AI models should be addressed. Artificial intelligence tools are becoming an increasingly important source of information.
However, AI tools often answer with false information, so often that “hallucinate” was the Cambridge Dictionary’s word of the year for 2023. Unfortunately, these hallucinations will increase if data is withheld from AI models.
The lack of diverse and up-to-date information created by humans is one of the reasons why artificial intelligence hallucinates. Without access to multifaceted and heterogeneous datasets, developers must train their models on data synthesized by other large language models (LLMs). This creates an artificial intelligence echo chamber, which some researchers call model collapse. Models keep receiving data that describe the same likely realities, overestimating the probability of those outcomes and underestimating unlikely ones. Gradually, a model forgets about outliers entirely, narrowing its understanding of what is possible.
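The narrowing effect described above can be illustrated with a toy simulation (a sketch for intuition only, not an analysis from the article): each “generation” of a model is fitted to samples produced by the previous generation, with no fresh human-created data. Because the fitted spread is systematically a little smaller than the true spread, the distribution contracts over generations and the outliers disappear.

```python
import random
import statistics

def fit_and_sample(data, n):
    """Fit a Gaussian to the data, then draw n synthetic samples from the fit.

    This stands in for 'training the next model generation only on the
    previous generation's output'.
    """
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    return [random.gauss(mu, sigma) for _ in range(n)]

random.seed(0)
n = 20

# Generation 0: samples of "real", human-created data.
data = [random.gauss(0.0, 1.0) for _ in range(n)]
initial_std = statistics.stdev(data)

# Each later generation sees only synthetic data from its predecessor.
for generation in range(2000):
    data = fit_and_sample(data, n)

final_std = statistics.stdev(data)
print(f"spread of generation 0:    {initial_std:.4f}")
print(f"spread after 2000 rounds:  {final_std:.6f}")
```

The spread shrinks dramatically across generations: rare values stop being sampled, so later fits see an ever narrower world. Real model collapse in LLMs is far more complex, but the feedback loop has the same shape.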
The more AI models are trained this way, the greater the risks. Poorly trained tools will produce low-quality answers, spread misinformation, reiterate stereotypes, and render parts of human knowledge nearly obsolete. Therefore, we need to discuss how copyright owners can fairly exercise their rights without hindering further innovation in the artificial intelligence industry.
Paying for knowledge
A fair compensation mechanism that allows copyright holders to profit from commercial AI development would be a perfect solution. Alternatively, AI developers might use such materials only for non-commercial purposes, advancing knowledge in ways that benefit society as a whole. Unfortunately, even the most important knowledge scientists create for the benefit of humanity is difficult to access.
Digital distribution of information should reduce the cost of publishing and thus the cost of acquiring knowledge. However, commercial academic publishing giants charge high subscription fees and, according to academic publishing statistics from June 2023, enjoy profit margins as high as 40%, while keeping the costs of their internal procedures confidential.
None of these fees go to the authors. Researchers are paid to conduct research, not to publish its results. They publish academic papers for academic credit, which advances their careers and makes their work known and useful to society.
They must pay article processing charges (APCs) to publish in the most prestigious journals with the highest impact factors. According to a University of Oxford open access publishing report, the global average cost of publishing in such journals ranges from £2,000 to £3,500 but can reach £10,000.
Authors may also publish in journals that do not charge APCs. However, these publications are considered less authoritative and therefore carry less weight in the development of a scientific field. As a result, even good research is often ignored when scientists and their institutions cannot afford to publish it in major journals. At the same time, wider audiences cannot afford the high subscription fees publishers set for access to the knowledge behind paywalls.
Blocking public access to scholarly knowledge can also hinder the development of artificial intelligence if developers consider lower-impact journals insufficiently authoritative yet are unwilling to pay the high prices of prestigious ones.
In this rather dystopian scenario, artificial and human intelligence alike would develop within the constraints of echo chambers, hemmed in by costs and paywalls.
Open access goals
The problems with current knowledge-sharing systems call for a more urgent shift to an open access model. For scientific research, this means publishing only in open access journals. These journals transparently report their article processing costs, which can be as low as a few hundred dollars and can be borne by government-funded research institutions. Published articles can then be freely accessed by everyone around the world, including AI developers.
Open access, particularly to academic resources, will enhance the quality of the output that an LLM can provide. Artificial intelligence tools that are better at digesting and disseminating information will in turn make this information more accessible and useful to a wider audience.
However, to facilitate (and support) this shift toward open access, AI companies must avoid the pitfalls of the current knowledge industry. Unfortunately, until we know what boundaries case law will set, it is hard to say how this will look in practice.
Ideally, the new system would compensate copyright holders whose works are used to train algorithms. Where monetary compensation is unavailable, they could benefit in other ways, much as researchers benefit from exposure and reputation building when they publish in top journals. Additionally, it is only fair to let authors opt out of having their work used this way.
To be recognized as reputable purveyors of knowledge, AI companies should do better than the current academic publishing giants. This will require transparency about their operations, data acquisition methods, and costs. Distributing the value extracted from AI models more fairly, rather than maximizing profits, will help build more trust in them.
Ultimately, the ability of AI developers to promote open access to knowledge for universal benefit depends on how open AI itself is. The more freely available an AI model is, the stronger its case for accessing a variety of data sources.
Bridge knowledge gaps and build trust
The Internet was developed to help researchers easily share ideas and information. Later, it became a tool for disseminating information to everyone in the world, built on the unwavering belief that open access to information drives human progress.
AI tools can promote open access by making knowledge easy to find and digest. Building a fair system that balances the free flow of information with commercial interests should start with building trust between companies, authors and consumers.
How does your workplace approach conversations around copyright protection and ethics in AI training? Let us know on Facebook, X, and LinkedIn. We’d love to hear from you!
Image source: Shutterstock