Pipeline Security and Supply Chain Protection

Cybersecurity professionals need to consider the software development pipeline – the lifecycle of the development process, from inception to production – as well as the impact that software supply chain failures and vulnerabilities can have on it.

Think of the word “pipeline” in the context of cybersecurity and the first thought of many in the industry goes to physical pipelines – usually the Colonial Pipeline shutdown that we have referred to before. But this is not that. Here, we are interested in the software development pipeline – the lifecycle of our development process, from inception to production.

Anyone developing in the cloud today is likely to take the continuous integration/continuous deployment (or, to some, continuous delivery) approach, or CI/CD.

Evolving the Software Development Process

Let us look at how our approach to creating software has changed over the years. Put simply, the days of developing code for days or even weeks and then doing weekly or, more commonly, monthly deployments are long gone. Since the cloud provides us with such flexibility when it comes to provisioning the compute and storage resources to run what we write, why would we not use a fast, dynamic approach to producing and deploying code, getting fixes and new features in as quickly as humanly possible? Doing so means those fixes and features can start generating revenue, improving the user experience or correcting a problem that someone reported.

So why did it take so long to get here? The reason is simple: the ability to write code – or, more specifically, the speed at which developers could write it – was never much of an impediment to the development process. A software engineer of the 2020s is not much faster than one from the 1990s or 2000s when it comes to writing lines of code. While integrated development environments (IDEs) have evolved and improved over the years, they have not leapt forward by an order of magnitude or given some miraculous boost to how fast a coder can code.

The key impediment to producing working software quickly was, in fact, testing. The testing process has traditionally been long, intricate and monotonous: devising vast numbers of test scenarios; documenting the pre-conditions, the steps and the expected outcomes; and having all of them executed by a team of testers. A test cycle for a small application could take days; the same for a large application could go on for many weeks. Wind the clock forward and automation makes it fast and easy to pull together the code segments, compile them and run the test suite with little or no human intervention – even to execute security tests on the code to detect potential vulnerabilities and to compile a list of errors.
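To make that concrete, here is a minimal sketch of an unattended build-and-test step, written in Python. The stage names, paths and tools (the standard-library compileall module and the pytest test runner) are illustrative assumptions, not a prescription; the point is simply that every stage runs without human intervention and any failure halts the pipeline.

# ci_check.py – a minimal sketch of an unattended CI step.
# Assumptions: source under src/, tests under tests/, pytest installed.
import subprocess
import sys

PIPELINE = [
    ("Compile", [sys.executable, "-m", "compileall", "-q", "src"]),
    ("Test suite", [sys.executable, "-m", "pytest", "tests"]),
]

for name, command in PIPELINE:
    print(f"--- {name} ---")
    if subprocess.run(command).returncode != 0:
        # A failure at any stage stops the pipeline and surfaces the errors.
        print(f"Pipeline stopped: {name} failed")
        sys.exit(1)

print("All stages passed: this build is a deployment candidate")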

Of course, since the early 2020s we also have the option of asking an AI model to tell us how to fix the errors, but this is a much larger subject that we shall cover another time. This process of automating the end-to-end generation of something we can consider deploying is what we now call CI.

The CD element takes the outputs from all the above and pushes them through the deployment process. This begins in a staging environment and goes through one or more stages of approval. User acceptance testing (UAT) is the obvious one: although you can define limits for performance and conduct realistic tests to ensure that the solution meets them and scales as specified, it is impossible to automate a test that tells you entirely whether the user will be content. So, there will generally be an element of UAT. Other approvals might include the company’s change control regime, sign-off that the support teams understand what is new and are able to support the revised product, and in some industries the approval of one or more regulators. The final deployment may well be highly automated – in fact this is often to be recommended, as we have seen many, many instances of deployment steps being missed in manual deployment processes, leading to various levels of disaster – but going from “ready to go” to “deployed” will necessarily have some manual decision points.
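As an illustration of that last point, the sketch below models a release that automates everything it can but refuses to promote to production until each manual gate has been explicitly recorded. The gate names echo the approvals described above but are otherwise invented for the example.

# release_gate.py – an illustrative sketch, not a real deployment tool.
from dataclasses import dataclass, field

# Hypothetical manual decision points that must be cleared before deployment.
REQUIRED_GATES = {"uat_signoff", "change_control", "support_readiness"}

@dataclass
class ReleaseCandidate:
    version: str
    approvals: set = field(default_factory=set)

    def record_approval(self, gate: str, approver: str) -> None:
        # A human decision, captured explicitly so it is auditable.
        print(f"{gate} approved by {approver}")
        self.approvals.add(gate)

    def promote_to_production(self) -> None:
        # Deployment itself can be fully automated, but only once
        # every manual gate has been cleared.
        missing = REQUIRED_GATES - self.approvals
        if missing:
            raise PermissionError(
                f"Release {self.version} blocked; missing: {sorted(missing)}")
        print(f"Deploying release {self.version} to production...")

rc = ReleaseCandidate("2.4.1")
rc.record_approval("uat_signoff", "product owner")
rc.record_approval("change_control", "change advisory board")
rc.record_approval("support_readiness", "support lead")
rc.promote_to_production()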

Shared Code, Supply Chain Risk

Automated testing is, in fact, only the first of two prime factors in rapid development and deployment; the second is code libraries. We noted earlier that writing code has not really become much faster in 30 or so years, but there is an alternative that the cloud has brought us: not to write the code at all, but to use code that somebody else wrote. Readily available, inexpensive cloud services made it possible for the likes of GitHub to come into existence (in this case, in 2008). At the time of writing, there are approximately 630 million code repositories on GitHub, with an estimated 180 million developers using it. The sheer volume of code sharing that goes on globally thanks to such repositories is hard to imagine, but there is a tangible downside: the risk of inheriting vulnerabilities, either from maliciously written code or from apparently innocent libraries that have been infiltrated by attackers.
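One basic defence against a tampered or substituted library – we return to mitigations below – is to verify every third-party artifact against a checksum published by its maintainers before it enters the build. A minimal sketch, assuming such a trusted SHA-256 digest is available (the filename and digest in the usage comment are placeholders):

# verify_artifact.py – a minimal sketch of dependency checksum verification.
import hashlib

def sha256_of(path: str) -> str:
    # Read in chunks so large archives do not need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path: str, expected_sha256: str) -> None:
    if sha256_of(path) != expected_sha256:
        raise ValueError(f"{path}: checksum mismatch; refusing to use this artifact")
    print(f"{path}: checksum verified")

# Usage (placeholder values, not a real artifact or digest):
# verify("somelib-1.2.3.tar.gz", "<digest published by the maintainers>")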

The example that is probably best known is the Log4j exploit of 2021. What is interesting is that the issue – believed to have been involved in more than 800,000 attack attempts within 72 hours of being exploited in the wild – was a vulnerability that was not due to any kind of malicious activity: it was simply a bug in a block of code that happened to introduce a significant security flaw. On the other side of the coin we have the deliberate infiltration of packages in the npm registry that caused the spread of the Shai-Hulud worm.

Proceed With Caution

CI/CD and code reuse are the fundamental components of cloud development, but caution is needed if we are to develop securely using these techniques. We should look at three distinct areas.

First, the code repositories we are using. Because we need to distrust any code that we did not write, or that is lodged somewhere that might be compromised, such as in a third-party cloud repository, we should treat the sources of those code libraries just as we would any other supplier. Third-party risk management (TPRM) is not a concept familiar to the typical developer, and it is essential either to gain an understanding of it or to engage someone who is already well versed in it. Do not pull down updated versions without verifying them. Ensure you have a comprehensive software bill of materials (another concept that merits discussion of its own, as it is a big area), as this will allow you quickly to understand the implications should one of the libraries you have used be shown to have a vulnerability.
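As a small example of the value of the SBOM, the sketch below assumes a CycloneDX-style JSON SBOM (the sbom.json path is hypothetical) and answers the question every vulnerability announcement immediately raises: do any of our builds ship the affected component?

# sbom_lookup.py – a minimal sketch assuming a CycloneDX-style JSON SBOM.
import json
import sys

def find_component(sbom_path: str, name: str) -> list:
    with open(sbom_path) as f:
        sbom = json.load(f)
    # CycloneDX records the software's ingredients under "components".
    return [c for c in sbom.get("components", []) if c.get("name") == name]

if __name__ == "__main__":
    component = sys.argv[1]  # e.g. "log4j-core"
    hits = find_component("sbom.json", component)  # hypothetical SBOM path
    if hits:
        for c in hits:
            print(f"Affected: {c['name']} {c.get('version', '?')}")
    else:
        print(f"{component} not found in this build's SBOM")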

Second, the approach we take to the development process. Embed the concept of DevSecOps, to ensure that the security team are involved throughout the full development lifecycle, not just at the very last step before deployment. Automate testing, both functional and security, and apply it throughout the process – from static testing of source code to dynamic testing of the release candidate, and even of the live version once promoted to production. Be firm with the principle of least privilege: developers should not access production systems, and the gateways through which code has to pass to move from development, through testing and into production must be firm, clear and effective.
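What automating security testing throughout the process can look like in practice is sketched below, using two open-source tools as illustrative (not mandated) choices: bandit for static analysis of the source and pip-audit for flagging dependencies with known vulnerabilities. The same pattern extends to dynamic scans of the release candidate.

# security_stage.py – a minimal sketch of an automated security stage.
# bandit and pip-audit are illustrative tool choices; both must be
# installed for this to run.
import subprocess
import sys

CHECKS = [
    ("Static analysis of source", ["bandit", "-r", "src"]),
    ("Known-vulnerability audit", ["pip-audit", "-r", "requirements.txt"]),
]

failed = [name for name, command in CHECKS
          if subprocess.run(command).returncode != 0]

if failed:
    print("Security stage failed:", ", ".join(failed))
    sys.exit(1)
print("Security checks passed")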

The third step relates to the wider production environment rather than the code acquisition and development/build process: be prepared for the unexpected and ensure that you enhance your business continuity plans to encompass the potential impacts of a security incident in the pipeline and/or its supply chain. The likelihood of an incident will always be non-zero, so it is important to be ready, and to be ready with a plan that was written specifically for pipeline incident scenarios.
