The extended period set by the Canadian Government (through Innovation, Science and Economic Development Canada, ISED) for responses to its consultation paper on Artificial Intelligence (AI) and Copyright closed on January 15. We will start to see a flurry of submissions released by participants while ISED digests and assesses the input it has received. One of the first is the submission from the Coalition for the Diversity of Cultural Expression (CDCE), which represents over 360,000 creators and nearly 3,000 cultural businesses in both French and English-speaking parts of Canada. CDCE’s membership includes organizations representing authors, film producers, actors, musicians, publishers, songwriters, screenwriters, artists, directors, poets, music publishers—just about every profession you can think of that depends on creativity, and protection for creative output. The CDCE submission highlights three key recommendations, summarized as follows: first, that the government refrain from creating any new copyright exception for text and data mining; second, that copyright protection be reserved for works created by humans; and third, that transparency obligations be imposed on AI developers regarding the content used to train their models.
While none of these recommendations is surprising, and from my perspective all are eminently reasonable, I am sure we will also see a number of submissions arguing that, “in the interests of innovation”, access to copyrighted works is not only essential but should be freely available without permission or payment. OpenAI, the motive force behind ChatGPT and the defendant in the most recent high-profile copyright infringement case involving AI (When Giants Wrestle, the Earth Moves: NYT v OpenAI/Microsoft), has already staked out part of this position. In its brief to the UK House of Lords Select Committee looking into Large Language Models (LLMs), a key technology that drives AI development, the company says:
“Because copyright today covers virtually every sort of human expression–including blog posts, photographs, forum posts, scraps of software code, and government documents–it would be impossible to train today’s leading AI models without using copyrighted materials (emphasis added). Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.”
OpenAI claims that it respects content creators and owners and looks forward to continuing to work with them, citing, among other things, the content licensing agreement it has signed with the Associated Press. But the failure to reach a licensing deal with the New York Times is really the crux of the lawsuit that the media giant has brought against OpenAI and its key investor Microsoft. If reports are true that OpenAI’s licensing deals top out at $5 million annually, it is not surprising that licensing negotiations between the Times and OpenAI broke down over such lowball offerings.
As for the CDCE submission to ISED, it recommends that the government refrain from creating any new exceptions for text and data mining (TDM), since this would interfere with the ability of users and rightsholders to set the boundaries of the emerging market in licensing. This aligns with the position the British government has just confirmed, declining to create a copyright exemption for AI after playing footsie with the concept for over a year. Apart from the licensing deals that OpenAI has struck with the Associated Press and German multimedia giant Axel Springer, the CDCE paper notes a range of other recent examples of content owners offering access to their product through licensing arrangements, including Getty Images, Universal Music Group, and educational and scientific publishers like Elsevier. The paper also urges the government to avoid interfering in the market when it comes to setting appropriate compensation, leaving that to market players or, where the players can’t reach agreement, to the quasi-judicial Copyright Board.
In my view, licensing is going to be the solution that will eventually level the playing field, but to get there it will require that major content players lock out the AI web-crawlers while pursuing legal redress, as the NYT is doing. This will help to open the licensing path to smaller players and individual creators who don’t have the resources available to employ either technical or legal remedies. (The issue of what has already been ingested without authorization still needs to be settled.) As for the tech industry’s suggestion that creators can opt out of content ingestion if they wish, CDCE rightly points out that this stands the world on its head and would be contrary to longstanding copyright practice. Not only is it impractical in a world where what goes into an AI model is a black box (thus the imperative for transparency), but it is like saying a homeowner has to request not to be burgled, or else can expect to become a target.
On the question of whether AI-generated works should be granted copyright protection, CDCE points out the double standard of proposing an exception to copyright for TDM on inputs while claiming copyright protection for AI-generated outputs. The need for human creativity is a line that has been firmly held by the US Copyright Office, which has pushed back on various attempts to register AI-generated (as opposed to AI-assisted) works. Canada has not been quite so clear-cut in its position, owing to the way in which copyright is registered (almost by default, without examination) in Canada, as I pointed out in this blog post (A Tale of Two Copyrights). While AI-generated works have received copyright protection in Canada (Canadian Copyright Registration for my 100 Percent AI-Generated Work), this is more by oversight than design, given the way the Canadian copyright registration system works.
Thirdly, we turn to transparency, a sine qua non if licensing solutions are to be implemented. If authors don’t know whether their works are being used to train AI algorithms, or can’t easily prove it, licensing will fall flat. CDCE calls for publication of all content ingested into training models, disclosure when outputs have been generated by AI, and the design of AI models to prevent generation of illegal or infringing content. This is similar to requirements already under consideration in the EU.
CDCE also makes the important point that it is not just copyright legislation that defends individual and collective rights against the incursions of AI and big AI platforms. While the Copyright Act offers some protection to creators, privacy legislation is important for all citizens. As the UK Information Commissioner has pointed out in a recent report, the legal basis for web-scraping is dependent on (a) not breaching any laws, such as intellectual property or contract laws and (b) conformity with UK privacy laws (the GDPR, or General Data Protection Regulation), where the privacy rights of the individual may override the interests of AI developers, even if data scraping meets other legitimate interest tests.
Finally, there is the question of the moral rights of creators, which can be threatened by misapplication of AI, whether through infringement of a performer’s personality or publicity right, distortion of their performance or creative output, misuse of their works for commercial or political purposes, or any of the other reasons why copyright gives the creator the right to authorize use of their work.
Quite apart from the question of AI, there are of course other outstanding copyright questions that need to be resolved urgently, including the longstanding issue of the ill-conceived education “fair dealing” exception that has undermined, if not permanently damaged, the educational publishing industry in Canada. This exception needs to be narrowed to allow users continued unlicensed access to copyrighted materials under fair dealing guidelines for study, research and educational purposes, but to limit institutional use to situations where a work is not commercially available under a license from a rightsholder or collective society. While this issue requires looking back and fixing something that is already broken, policy making with respect to AI and copyright needs to anticipate the future and “do no harm”, while requiring AI developers to open up their black boxes and respect existing rights. This should be achieved by maintaining and protecting the rights of creators in ways that will facilitate market-based licensing solutions for use of copyrighted content by AI developers, while ensuring that creative output remains the domain of human beings, and not machines.
This article was first published on Hugh Stephens Blog