The platform exposing exactly how much copyrighted art is used by AI tools

Ask Google’s AI video tool to create a film of a time-travelling doctor who flies around in a blue British phone booth and the result, unsurprisingly, resembles Doctor Who. And if you ask OpenAI’s technology to do the same, a similar thing happens.

What’s wrong with that, you may think? The answer could be one of the biggest issues AI chiefs face as their era-defining technology becomes ever more ubiquitous in our lives. Google and OpenAI’s generative artificial intelligence is supposed to be just that – generative, meaning it develops novel answers to our questions. Ask it for a time-travelling doctor, and you get one that their systems have created.

But how much of that output is original? The problem is working out how much tools like OpenAI’s ChatGPT and its video generator Sora 2, and Google’s Gemini and its video tool Veo3, rely on someone else’s art to come up with their own inventions, and whether using source material from the BBC, for example, is an infringement of the broadcaster’s copyright.

Creative professionals and industries including authors, film directors, artists, musicians and newspaper publishers are demanding compensation for the use of their work to build those models – and for the practice to stop until they have granted permission. They also argue that their work is being used without compensation in order to build AI tools that create works in direct competition with their own. Some news publishers, including the Financial Times, Condé Nast and Guardian Media Group, publisher of the Guardian, have struck licensing deals with OpenAI.

A key sticking point is the AI giants’ closely guarded models, which underpin their systems and make it difficult to know just how much their tech relies on other creatives’ work.

One firm, however, claims to be able to shine a light on the issue. The US tech platform Vermillio tracks use of a client’s intellectual property online and claims it is possible to trace, approximately, the percentage to which an AI-generated image has drawn on pre-existing copyrighted material.

In research undertaken for the Guardian, Vermillio created a “neural fingerprint” for various pieces of copyrighted work, before asking the AIs to create similar-looking imagery. For Doctor Who, it entered a prompt into Google’s popular Veo3 tool asking: “Can you create a video of a time travelling doctor who flies around in a blue British phone booth.”

The Doctor Who video matches 80% of Vermillio’s Doctor Who fingerprint, implying that Google’s model has leaned heavily on copyright-protected work to produce its output. The OpenAI video, taken from YouTube and stamped with the watermark for OpenAI’s Sora tool, was an 87% match, according to Vermillio.

Other examples created by Vermillio for the Guardian use a James Bond neural fingerprint. A Veo3 James Bond video, created with the prompt: “Can you create a famous scene from a James Bond movie?”, had a neural fingerprint match of 16%.
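Vermillio has not published how its neural-fingerprint percentages are computed, and its method is proprietary. As a rough analogue only, a similarity score between a reference work and a generated image can be expressed as the cosine similarity of two feature vectors; the vectors and values below are invented for illustration.

```python
# Toy analogue of a "percentage match" between a reference fingerprint
# and a generated image's feature vector. This is NOT Vermillio's method,
# which is proprietary; it is a hedged sketch using cosine similarity.
import math


def cosine_match(fingerprint, candidate):
    """Return the similarity of two equal-length feature vectors as a percentage."""
    dot = sum(a * b for a, b in zip(fingerprint, candidate))
    norm = math.sqrt(sum(a * a for a in fingerprint)) * math.sqrt(
        sum(b * b for b in candidate)
    )
    return 100 * dot / norm


# Hypothetical embeddings; in practice these would come from a vision model.
reference = [0.9, 0.1, 0.4, 0.8]
generated = [0.8, 0.2, 0.5, 0.7]
print(f"{cosine_match(reference, generated):.0f}% match")
```

On this toy measure, an identical vector scores 100%, and unrelated vectors score close to 0% – a loose parallel to the 80% and 87% figures reported for the Doctor Who videos.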

A Sora video, taken from the open web, had a 62% match with Vermillio’s Bond fingerprint, while images of the agent created by Vermillio using ChatGPT and Google’s Gemini model had matches of 28% and 86% respectively, from a prompt citing: “A famous MI5 double ‘0’ agent dressed in a tuxedo from a famous spy movie by Ian Fleming”. Vermillio’s examples also showed strong matches with Jurassic Park and Frozen for OpenAI and Google models.

Generative AI models, the term for the technology that underpins powerful tools such as OpenAI’s ChatGPT chatbot as well as Veo3 and Sora, have to be trained on a vast amount of data in order to generate their responses. The main source of this information is the open web, which contains a vast array of data, from the contents of Wikipedia to YouTube, newspaper articles and online book archives.

Anthropic, a leading AI company, has agreed to pay $1.5bn (£1.1bn) to settle a class-action lawsuit by authors who say the company took pirated copies of their works to train its chatbot. A searchable database of the works used in its models contains a host of well-known names, including The Da Vinci Code author Dan Brown, the Labyrinth writer Kate Mosse and the Harry Potter creator JK Rowling.

Kathleen Grace, the chief strategy officer at Vermillio, whose clients include Sony Music and the talent agency WME, said: “We can all win if we just take a beat and figure out a way to share and track content. This would incentivise copyright holders to release more data to AI companies and would give AI companies access to more interesting sets of data.

“Instead of giving all the money to five AI companies, there would be this amazing ecosystem.”

In the UK the artistic community has launched a vociferous fightback against government proposals to overhaul copyright law in favour of AI companies, who could be allowed to use copyrighted work without seeking permission first; instead, copyright holders would have to signal that they wished to “opt out” of the process.

A Google spokesperson said: “We can’t speak to the results of third-party tools, and our generative AI policies and terms of service prohibit the violation of intellectual property rights.” However, Google-owned YouTube says its terms and conditions allow Google to use creators’ work for making AI models. In September, YouTube said: “We use content uploaded to YouTube to improve the product experience for creators and viewers across YouTube and Google, including through machine learning and AI applications.”

OpenAI said its models train on publicly available data, a process it claims is consistent with the US legal doctrine of fair use, which allows use of copyrighted work without the owner’s permission in certain circumstances.

The Motion Picture Association trade group has urged OpenAI to take “immediate action” to address copyright issues around the latest version of Sora. The Guardian has seen Sora videos showing copyrighted characters from shows such as SpongeBob SquarePants, South Park, Pokémon and Rick and Morty. OpenAI said it would “work with rights holders to block characters from Sora at their request and respond to takedown requests”.

Beeban Kidron, a crossbench peer in the House of Lords and a leading figure in the fightback against the UK government proposals, said it was “time to stop pretending that the stealing is not taking place”.

“If Doctor Who and 007 can’t be protected then what hope for an artist who works on their own, and does not have the resources or expertise to chase down global companies that take their work, without permission and without paying?”