The race to pioneer artificial intelligence has become a desperate hunt for the digital data needed to advance the technology. To obtain this data, technology companies including OpenAI, Google and Meta have taken shortcuts, ignored company policies and discussed breaking laws, according to a New York Times investigation.
Managers, lawyers and engineers at Meta, which owns Facebook and Instagram, last year discussed buying the publishing house Simon & Schuster to acquire long-form works, according to minutes of internal meetings obtained by The Times. They also agreed to collect copyrighted data from across the web, even if it meant facing lawsuits. Negotiating licenses with publishers, artists, musicians and the press would take too long, they said.
Like OpenAI, Google transcribes YouTube videos to collect text for its artificial intelligence models, five people familiar with the company’s practices said. This may violate the copyrights on the videos, which belong to their creators.
Last year, Google also broadened its terms of service. One motivation for the change, according to members of the company’s privacy team and an internal message seen by The New York Times, was to allow Google to tap public Google Docs, restaurant reviews on Google Maps and other online material for more of its artificial intelligence products.
The actions of these companies illustrate how online information — news stories, fiction, message board posts, Wikipedia articles, computer programs, photos, podcasts and movie clips — is increasingly becoming the lifeblood of the booming artificial intelligence industry. Creating innovative systems depends on having enough data to teach technology to instantly produce words, images, sounds and videos that resemble those created by humans.