The use of generative AI in software development is on the rise but not widely understood. Danny Jeffrey and Paul Armstrong from Womble Bond Dickinson explain how these new tools are being used, as well as some of the risks associated with their application.

What is generative AI?

Generative AI is a branch of artificial intelligence able to produce novel and realistic content, such as text, images, or audio, in response to data inputted by users of the platform. Unlike traditional AI systems that perform specific tasks based on pre-defined rules or supervised learning, generative AI systems can learn from unlabelled data and generate new data that follows the same distribution, outputting data in a variety of formats. Generative AI systems rely on advanced techniques, such as foundation models, which are large-scale neural networks (acting in a manner not dissimilar to the human brain) that can be pre-trained on large amounts of data and fine-tuned for various downstream tasks, producing contextually appropriate responses. Examples of foundation models include generative pre-trained transformers (GPT), which can generate natural language text; variational autoencoders (VAE), which can generate images or sounds; and generative adversarial networks (GAN), which can generate realistic images or videos by pitting two neural networks against each other.

Generative AI has enormous potential for innovation and value creation across various industries and domains, such as marketing, healthcare, finance, entertainment, education, and art. Whilst it may seem from recent headlines that this technology is brand new, similar tools and models date back some years, albeit they were not publicly accessible. There is a range of generative AI tools freely available on the market already, including ChatGPT, a chatbot that can have natural and engaging conversations; DALL-E, an image generator that can create realistic images from text prompts; Stable Diffusion, an art generator that can create high-quality paintings from sketches; and GitHub Copilot, a code generator that can write software code from natural language descriptions.

How is generative AI used in software development?

Generative AI has the potential to unleash developer productivity, enhance the customer experience, and foster innovation. According to research conducted by McKinsey, generative AI-assisted tools can deliver impressive time (and cost) savings in many common developer tasks, such as documenting code functionality (50% faster), writing new code (47% faster), and optimising existing code (63% faster). One of the most prominent examples, GitHub Copilot, is a cloud-based artificial intelligence tool which assists users by auto-completing and suggesting code. Not only that, but when provided with a programming problem described in natural language, Copilot is capable of generating code that solves the problem. Whilst Copilot is not the only AI-assisted tool that can perform some of these functions, it does have some unique features which make it both particularly helpful and particularly risky to use – we will cover these in detail later.
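To make that workflow concrete, here is a minimal sketch of the kind of completion such a tool might offer: the developer writes only the comment and the function signature, and the assistant proposes the body. The example is illustrative rather than an actual Copilot output.

```python
# Developer's prompt: "check whether a string is a valid 10-digit ISBN".
# Everything inside the function body is the sort of completion an
# AI assistant might suggest from the comment and signature alone.

def is_valid_isbn10(isbn: str) -> bool:
    """Return True if `isbn` is a valid ISBN-10 (hyphens/spaces allowed)."""
    isbn = isbn.replace("-", "").replace(" ", "")
    if len(isbn) != 10:
        return False
    total = 0
    for position, char in enumerate(isbn):
        if char == "X" and position == 9:  # 'X' stands for 10 as a check digit
            digit = 10
        elif char.isdigit():
            digit = int(char)
        else:
            return False
        total += digit * (10 - position)
    return total % 11 == 0

print(is_valid_isbn10("0-306-40615-2"))  # True
```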

Generative AI can also enable developers to create novel and diverse content, such as images, video, music, and speech. This ability to create novel outputs has been put to a variety of uses, but Ubisoft's R&D department, La Forge, has widely touted its utility in video game development for particularly mundane world-building tasks. Ghostwriter, a tool developed in-house, relieves developers of one of video game production's most laborious tasks: writing NPC dialogue. Whilst the tool isn't used to write or orchestrate the wider lore or questlines, it can generate conversations between background NPCs based only on a short subject input by the developer. This unique tool allows scriptwriters and developers more time to focus on the narrative of the main storylines.
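Ubisoft has not published Ghostwriter's internals, but the general pattern (a short subject in, a few lines of throwaway background dialogue out) can be sketched with any large language model API. The snippet below uses the OpenAI Python client purely as a stand-in; the model name and prompts are assumptions for illustration only.

```python
from openai import OpenAI  # stand-in only: Ghostwriter is a proprietary in-house tool

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def background_chatter(subject: str, lines: int = 4) -> str:
    """Generate throwaway small talk between two background NPCs."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name for this sketch
        messages=[
            {"role": "system",
             "content": "Write terse background dialogue between two "
                        "market-stall NPCs in a medieval fantasy city."},
            {"role": "user",
             "content": f"Subject: {subject}. Write {lines} short lines."},
        ],
    )
    return response.choices[0].message.content

print(background_chatter("rumours about the new harbour tax"))
```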

Legal considerations

Generative AI poses some unique challenges and risks in software development that must be addressed by developers and their clients. Generative AI can raise some complex questions regarding the ownership, authorship, liability, and accountability of the generated code and content. For instance, who owns the intellectual property rights of the generated code and content? Who is responsible for any errors or damages caused by the generated code and content? How can developers ensure that they are not infringing on any existing copyrights or patents?

The primary concern with a generative AI-assisted software writing application is that its code suggestions can be lifted from open-source software and, by failing to identify or attribute the original work, may violate open-source licences. The implication, therefore, is that developers who use this sort of tool are exposed to those same risks.

Whilst open-source software licences generally refer to a set of terms and conditions which stipulate end-user obligations when a component of open-source code is used in software, there are differing types which impose different requirements as to how code may be used and redistributed:

  • Permissive licences: these licences allow use with very few restrictions. Developers are able to modify and distribute the code on the basis that they provide attribution of the original code to the original developers (an illustrative attribution header follows this list).
  • Copyleft licences: these include a reciprocity obligation stating that derivative works based on original code provided under a copyleft licence must be released under the same terms and conditions as the original code, and that the source code containing changes must be made available or provided upon request. It is these licences that commercial entities are particularly wary of, as compliance may require making in-house code publicly available.
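As a sketch of what permissive-licence attribution looks like in practice, the header below carries the upstream copyright notice alongside a reused utility; the project name and snippet are hypothetical.

```python
# Reused from the (hypothetical) "example-utils" project under the MIT licence.
# The original copyright notice must travel with the code:
#
#   Copyright (c) 2021 Example Utils Contributors
#
#   Permission is hereby granted, free of charge, to any person obtaining
#   a copy of this software... (remainder of the MIT licence text retained)

def clamp(value: float, low: float, high: float) -> float:
    """Constrain `value` to the range [low, high]."""
    return max(low, min(high, value))
```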

Ultimately, what matters most in relation to open-source software licences is that they require developers to provide attribution by including the original copyright notice. Given that some of these generative AI applications strip code of its licence information, developers who use them may be completely unaware that they are violating licence terms. The question, then, is whether AI-assisted software development tools are inadvertently creating derivative works of copyleft-licensed code. This remains a question for the courts, and indeed there are ongoing cases in both the UK and US to this effect. Whilst the precise answer remains to be confirmed, it seems reasonable that a lot would depend on the length and comprehensiveness of the software's suggestions: the more complex and specific the suggestion, the more likely it is to be a derivative work of copyleft-licensed code.

While there are a few players in this space, some tools take an approach to licences that makes them particularly risky compared with alternatives trained only on permissively licensed code. The latter kind also generally stick to standardised suggestions, and so are less likely to suggest code which can be traced back to copyleft-licensed code.

Other challenges

Another problem that may arise from using these tools is that the application could copy code containing a security vulnerability and introduce it into code you are going to use. Without properly reviewing the code suggested by the assistive software, developers risk meaningfully compromising the integrity of their codebase.
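A classic illustration of the kind of vulnerability that circulates widely in public code, and which an assistant could plausibly echo, is SQL built by string interpolation. This is a hypothetical sketch, not a documented Copilot suggestion:

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # Vulnerable pattern common in public code: interpolating user input
    # straight into the query allows SQL injection (e.g. username = "' OR '1'='1").
    query = f"SELECT * FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    # The reviewed fix: a parameterised query lets the driver escape the input.
    return conn.execute("SELECT * FROM users WHERE name = ?", (username,)).fetchall()
```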

What’s more, the code may not work or even make sense. By design, the application predictively suggests commonly used code, but in doing so may reproduce common mistakes. This can result in code which is either unnecessarily imprecise or which operates improperly.
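One well-known "common mistake" that appears throughout public Python code, and could therefore be reproduced verbatim by a predictive tool, is the mutable default argument:

```python
def add_tag(tag, tags=[]):  # bug: the default list is created once and shared
    tags.append(tag)
    return tags

print(add_tag("a"))  # ['a']
print(add_tag("b"))  # ['a', 'b'] -- state leaks between unrelated calls

def add_tag_fixed(tag, tags=None):  # the idiomatic correction
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```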

Software developers can overcome some of these issues by treating the code in the same way they would treat code produced by a junior developer. The code should be labelled as AI-generated and treated as though it was produced by an employee who requires supervision.
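There is no industry-standard convention for this labelling, but even a simple comment banner (the format below is entirely hypothetical) makes AI-assisted sections easy to find during review and licence audits:

```python
# --- AI-GENERATED: accepted from an AI coding assistant, 2023-08-14 ---
# Reviewed-by: <pending>   Licence-checked: <pending>
# Treat as junior-developer code until both fields are completed.
def normalise_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())
# --- END AI-GENERATED ---
```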

Solutions

Some of the solutions to the problems outlined above include configuring your application so it blocks public code suggestions and sticks to using an internal source code bank only. It’s also clear that thoroughly testing and analysing all code, as if it were produced by a junior developer, is necessary. You can also run projects through licence-checking tools that analyse code for plagiarism. Incidentally, the way to combat exuberant AI-assisted software may well be more AI.
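To give a flavour of how such licence-checking might work, the sketch below fingerprints short spans of a source file against an index of known open-source fragments. The index, hash choice, and window size are all placeholders; a real audit would rely on a dedicated scanning tool with a comprehensive corpus.

```python
import hashlib
from pathlib import Path

# Placeholder index mapping fragment hashes to their upstream origin.
# In practice this would be built by a dedicated licence-scanning tool.
KNOWN_OSS_FRAGMENTS = {
    "00000000000000000000000000000000": "example-project (GPL-3.0)",
}

def fingerprint(fragment: str) -> str:
    """Hash a whitespace-normalised code fragment."""
    normalised = " ".join(fragment.split())
    return hashlib.md5(normalised.encode()).hexdigest()

def scan_file(path: Path, window: int = 5) -> None:
    """Flag any `window`-line span whose fingerprint matches the index."""
    lines = path.read_text().splitlines()
    for start in range(max(len(lines) - window + 1, 1)):
        span = "\n".join(lines[start:start + window])
        origin = KNOWN_OSS_FRAGMENTS.get(fingerprint(span))
        if origin:
            print(f"{path}:{start + 1}: possible match with {origin}")

scan_file(Path("src/main.py"))  # hypothetical path
```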

As in all areas of business where you’re trying something new, use common sense and develop a good understanding of how it works, as well as its applications and risks, before you use it in a commercial operation. It’s worth remembering that any piece of suggested code which is very clearly from another source, or even has comments still attached, shouldn’t be used.