The AI and Copyright Conundrum: how should the UK balance rights of copyright owners against AI innovation?

Hrishikesh Chitale explores whether the use of copyrighted datasets to train AI models potentially constitutes copyright infringement.

by Hrishikesh Chitale, first year law student

Every technological evolution inevitably creates further conditions in the crisis cycle. In light of the rapid advancement of Artificial Intelligence (AI), this blog examines whether the use of copyrighted datasets to train AI models potentially constitutes copyright infringement.

The UK’s Copyright, Designs and Patents Act 1988 (CDPA 1988) grants copyright holders moral and economic rights to preserve and protect their works (Ch.2 CDPA 1988). Moral rights include the right to be identified as the author of a work (i.e. the paternity right) (s.77 CDPA 1988) and the right to object to derogatory treatment of a copyrighted work (i.e. the integrity right) (s.80 CDPA 1988), whereas economic rights include the right of reproduction (s.17 CDPA 1988). Copyright owners are entitled to grant licenses to users to exploit their works, subject to the rights granted under the CDPA 1988. In January 2023, Getty Images (a visual media company) (Getty) brought legal proceedings against Stability AI before the High Court in London. This blog analyses that case and considers how the UK could strike a balance between incentivising AI innovation and protecting the rights of copyright owners under the CDPA 1988.

In Getty Images v Stability AI ([2023] EWHC 3090), Getty has alleged that: (i) Stability AI unlawfully exploited its copyrighted works by scraping millions of images available on the internet to train and develop the defendant’s ‘Stable Diffusion’ technology (system); (ii) the outputs subsequently generated by this system, which are synthetic images, substantially reproduce its copyright works and exhibit its brand mark; (iii) Stability AI infringed its other intellectual property rights (for example, database rights under regulation 16(1) of The Copyright and Rights in Databases Regulations 1997); and (iv) Stability AI committed secondary infringement of copyright (s.22 CDPA 1988), as Getty claims that the pre-trained system is an ‘article’ imported into the UK without Getty’s authorisation. The system typically creates images from text or image prompts.

In December 2023, Stability AI applied for reverse summary judgment against Getty’s various claims, in particular the claim for secondary infringement of copyright, on the grounds that software is not an ‘article’ under the CDPA 1988. Mrs Justice Joanna Smith found that the legal position on Getty’s claims that the training and development of Stability AI’s system infringed its intellectual property rights is contested, and that the evidence should be substantively examined at trial, which is anticipated to take place in 2025.

The dispute in Getty Images has the potential to reshape UK copyright law for the AI age, while upholding the UK’s ‘National AI Strategy’, which seeks to make the UK a favourable jurisdiction for AI innovation. The copyrighted data used to train generative AI models (such as the ‘Stable Diffusion’ technology) is typically obtained from the internet by the process of ‘data scraping’ (i.e. sifting through copyrighted databases and extracting information). Consequently, such substantial copying of unlicensed data arguably constitutes copyright infringement under the CDPA 1988.

In February 2024, in response to Getty’s claims, Stability AI raised its defences. First, Stability AI argues that any unlicensed copy of a work uploaded as an input is unequivocally the user’s responsibility, and that the system simply follows the user’s instructions to create images. Second, Stability AI contends that the images used in training the system are not memorised and that, as a result, the output generated is not a substantial copy of Getty’s images, but a mere collective imitation of the various sources exploited during the training process. However, in practice, the user is highly unlikely to know whether the AI developer has obtained authorisation from the copyright owner, and is simply utilising the AI’s capability. Therefore, users should not be held liable.

Stability AI has filed an alternative defence that the outputs triggered by either text or image prompts are fair dealing with the copyrighted work under the pastiche exception (s.30A CDPA 1988). ‘Fair dealing’ is a legal term used to determine whether the use of a copyrighted work is lawful or whether it infringes copyright. Section 30A permits fair dealing with a work for the purposes of caricature, parody or pastiche. However, the CDPA 1988 is silent on when copying amounts to a pastiche. In Shazam Productions Ltd v Only Fools the Dining Experience Ltd ([2022] EWHC 1379), the judge addressed the meaning of pastiche and held that the following two elements were crucial: (i) ‘it either imitates the style of another work or an assemblage of several pre-existing works, and (ii) it must be noticeably different from the original work’ (para 188). Stability AI contends that the output is a pastiche, as the system’s output is an imitation of varying sources. Consequently, it argues that the output is an insubstantial reproduction of Getty’s images. There is as yet no precedent that determines the nature of use that could satisfy the requirements for a work to be a pastiche. Stability AI is unlikely to pass the fairness test because it offers a commercial service (Dream Studio) to third-party platforms, which could be construed as undermining the rights of copyright owners by failing to acknowledge and remunerate them (Ch.2 CDPA 1988). Such unlawful exploitation of copyrighted works arguably disrupts the longstanding legitimacy of copyright law.

The issue at the heart of this case is whether the training and development of the system took place outside the UK. Copyright is a territorial right that confers protection on its holder only within the territory of the UK. If there is no evidence that the training of the system was carried out in the UK, Getty’s whole claim could potentially fail at trial, as the location issue could oust the court’s jurisdiction. Stability AI has argued that: (a) the alleged web scraping of Getty’s images to train and develop the system; (b) the processing and hosting of services using hardware and computing resources; and (c) the employees engaged in the development processes, were performed or located outside the UK. Nevertheless, the judge was not sufficiently convinced to decide the location issue without allowing Getty to rebut the evidence at trial (para 60).

This suggests that there may be a loophole in the current law that could enable AI developers to evade liability on the basis that the technology infrastructure was located outside the jurisdiction. Such jurisdictional constraints could undermine the rights of copyright owners, as their rights will not have extraterritorial effect. Consequently, it is essential for states to collaborate and harmonise laws that uphold AI innovation, while remunerating rights holders for the exploitation of their works to train AI models, irrespective of where the research and training occurred. This could restrict AI developers from raising such defences and uphold the rights of copyright holders.

Getty’s prospect of succeeding in the secondary infringement claim hinges on the interpretation of the word ‘article’ in ss. 22, 23 and 27 of the CDPA 1988. Section 22 provides: ‘The copyright in a work is infringed by a person who, without the licence of the copyright owner, imports into the United Kingdom, otherwise than for his private and domestic use, an article which is, and which he knows or has reason to believe is, an infringing copy of the work’. The provisions on secondary infringement do not expressly state whether an ‘article’ includes an intangible item (such as software). Stability AI argues that, on its true interpretation, the word ‘article’ in these sections covers only tangible items. The judge acknowledged that the interpretation of the word ‘article’ raises a novel issue, which must be determined at trial (para 94). The courts could consider departing from the traditional meaning of ‘article’ to include intangible items (such as software), enabling copyright owners to protect their works in the digital era, for example by preventing the illegal importation of digital copies of video games.

In 2023, the UK Minister for Science, Research and Innovation announced in Parliament that the UK Government had abandoned its plans to extend the UK’s text and data mining exception to commercial purposes. Section 29A of the CDPA 1988 permits the making of a copy to carry out text and data analysis (i.e. computational analysis) solely for non-commercial purposes. This provision has become highly relevant, as AI systems are typically trained by the process of text and data mining. Further, in 2024, the Department for Science, Innovation and Technology abandoned its plans for a copyright code of practice, a framework that sought to make licenses for data mining more readily available while remunerating rightsholders. It declared that a voluntary code of practice could not be agreed upon.

The UK should arguably consider adopting a statutory licensing scheme under the CDPA 1988, administered by the Intellectual Property Office (IPO), which would enable copyrighted works to be lawfully exploited for training AI models while rewarding copyright owners. The license should be limited to a fixed term. The use should be allowed for commercial and non-commercial purposes. The IPO could refuse to grant the license if, upon diligent investigation, it deems the intended use to be inappropriate. Any rights owner whose work has been licensed would be entitled to claim the license fee from the IPO, and copyright owners would retain the right to prevent their works from being licensed to AI developers. Copyright owners must be entitled to license the same works to multiple AI developers to avoid anti-competitive practices. This arguably ensures an adequate balance between the interests and rights of copyright owners and AI developers, as it potentially enables AI developers to lawfully exploit works while rewarding rightsholders. An alternative solution to the text and data mining issue could be to permit the use of copyrighted works for commercial purposes, which s.29A currently does not allow. However, this is likely to undermine the rights of owners, as they will be left unacknowledged and unrewarded. The statutory licensing proposal, for its part, will require significant appraisal and deliberation on procedural and administrative matters if it is to operate seamlessly.

It is imperative for the UK to strike a balance between protecting the rights of copyright owners and incentivising AI innovation, in order to uphold its ‘National AI Strategy’ and maintain the legitimacy of copyright in the UK. A statutory licensing scheme could provide a degree of governance and fairness by ensuring copyright owners are remunerated while incentivising AI innovation. The dispute in Getty Images is likely to have wide implications for copyright law, particularly in the area of exceptions and licensing. The impending litigation could establish whether the UK is a favourable jurisdiction for AI innovation. It will be interesting to see how the court decides the location issue, and whether the UK Parliament will extend the scope of the CDPA 1988 to prevent AI developers from evading potential liability, regardless of where the training process occurred.
