Building a Hardened Container Infrastructure—In and Outside of the Cloud
By MATT GILLESPIE
Bank vaults, mainframes and mountain fortresses are desirable for their lack of subtlety. Protection of their contents is ensured by sheer heft, so proprietors can focus elsewhere.
That calculus changes when low overhead is paramount. For instance, Linux containers epitomize lightweight, ephemeral infrastructure. And workloads that by design exist with only fleeting ties to physical systems must rely elsewhere for protection.
Containerization is a key enabling technology for cloud-native services and architectures, creating a mechanism to bundle and cordon application-development assets at the operating-system level. Many security organizations are working overtime to come to grips with the architectural changes that come with container adoption.
Hosts still need to be hardened and guarded, but the root of trust must extend to the pedigree of the container itself. In fact, both containers and the hardware they run on must be treated like cattle, not pets: any copy that deviates from its known good state is replaced immediately rather than repaired.
At the host level, visibility and scanning monitor ongoing health of the physical system, the virtualization layer (if any) and the base layer of the container infrastructure. By running containers only on infrastructure that is verified to be in a known good state, you can preclude uncontrolled code beneath the level of the containers themselves.
Likewise, the only containers allowed to run must be those verified to be in a known good state. Because they are built to be spun up and down in seconds, very little overhead comes from destroying and replacing an imperfect container.
Define the emerging present with an immutable, gold-standard container image
Maintaining a trusted operating state for containers requires the perspective of the container’s full lifecycle. The trustworthiness of every container in the environment is based on comparison to its corresponding certified container image.
Base images are built to meet gating criteria, such as being limited to specific versions of specific software packages, with measures such as security scans and vulnerability analysis also performed at build time. Once the container image has been verified to be safe and meet the applicable criteria, it is certified as trusted and then cryptographically signed.
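As a rough illustration of a build-time gate, the sketch below checks the packages found in a candidate image against an allowlist of approved versions. The package names, versions and the `gate_violations` helper are all hypothetical; a real pipeline would take this data from its own scanner and policy store.

```python
# Hypothetical allowlist: package name -> set of approved versions.
ALLOWED_PACKAGES = {
    "openssl": {"3.0.13"},
    "python": {"3.11.9", "3.12.3"},
}

def gate_violations(installed, allowed=ALLOWED_PACKAGES):
    """Return a list of reasons the image fails the gate (empty if it passes).

    `installed` maps package name -> version found in the candidate image.
    """
    problems = []
    for name, version in sorted(installed.items()):
        if name not in allowed:
            problems.append(f"{name}: package not allowed")
        elif version not in allowed[name]:
            problems.append(f"{name} {version}: version not allowed")
    return problems
```

An image would move on to certification and signing only when the returned list is empty.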
Signed images serve as the baseline against which containers in production are measured. At deployment time, mechanisms such as The Update Framework (TUF) and Notary can authenticate and verify the image behind each individual container.
Travis Jeppson, engineering site lead at Kasten, which provides application backup and recovery for Kubernetes, notes how that approach protects the environment. “Once you get to your production system, that image has been scanned, it’s been signed, and you can actually tell your servers to only accept signed images. That enables you to only run software that you trust.”
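The "only accept signed images" idea can be sketched in a few lines. This is a simplified stand-in, not how TUF or Notary work internally: it uses an HMAC with a shared secret from the Python standard library, whereas real content-trust tooling relies on asymmetric keys and signed metadata. The key and function names are assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; real systems use managed
# asymmetric keys, never a secret embedded in code.
SIGNING_KEY = b"example-key-not-for-production"

def image_digest(image_bytes: bytes) -> str:
    """Content digest that identifies exactly one image."""
    return hashlib.sha256(image_bytes).hexdigest()

def sign_digest(digest: str, key: bytes = SIGNING_KEY) -> str:
    """Produce a signature over the digest at certification time."""
    return hmac.new(key, digest.encode("utf-8"), hashlib.sha256).hexdigest()

def is_trusted(image_bytes: bytes, signature: str, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the signature at deployment time and compare in constant time."""
    expected = sign_digest(image_digest(image_bytes), key)
    return hmac.compare_digest(expected, signature)
```

Because the signature covers a digest of the image content, changing even a single byte after certification causes the check to fail.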
Assurance that containers in deployment continue to conform to the corresponding trusted images must be passed to the runtime environment. Tools such as Falco enable runtime threat detection by monitoring container operating state and detecting unauthorized changes.
Jeppson explains that, “If that state does change, you can remove that container and replace it with a new container from the same image that you’ve signed and verified, to remove that risk out of your infrastructure.”
This model represents a shift away from the traditional focus on protecting workloads by identifying malware or other attacks. Instead, it identifies abnormalities as quickly as possible and eliminates them by replacing the affected containers with trusted equivalents based on verified images.
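The replace-rather-than-repair loop described here might look like the following sketch. It assumes a runtime monitor can reduce each container’s observed state to a fingerprint and that a trusted fingerprint is recorded per signed image; the `Container` type and `reconcile` function are hypothetical and do not correspond to the API of Falco or any other tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Container:
    name: str
    image: str
    fingerprint: str  # assumed summary of observed runtime state

def reconcile(running, trusted):
    """Compare each running container with the fingerprint of its signed image.

    `trusted` maps image name -> expected fingerprint.
    Returns (fleet, replaced, removed).
    """
    fleet, replaced, removed = [], [], []
    for c in running:
        expected = trusted.get(c.image)
        if expected is None:
            # No signed image on record: nothing trustworthy to replace it with.
            removed.append(c.name)
        elif c.fingerprint == expected:
            fleet.append(c)
        else:
            # State has drifted: discard it and start fresh from the signed image.
            fleet.append(Container(c.name, c.image, expected))
            replaced.append(c.name)
    return fleet, replaced, removed
```

Note that the sketch never tries to diagnose what changed inside a drifted container; it only restores a known good copy.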
In environments where code commits and container deployments occur continually (perhaps as frequently as hourly), registry visibility and hygiene are also critical. Security organizations must be able to track the contents of their registries, as well as the age of each image and the vulnerabilities associated with it.
For example, if a particular container image hasn’t been updated in six months, the security team can work with application owners and development teams to determine whether the image should be removed from the registry or perhaps updated in the next sprint cycle.
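A registry hygiene check of this kind can be very small. The sketch below flags images whose last update is older than a threshold; the 182-day figure is an assumed approximation of six months, and the registry contents are invented for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: roughly six months.
STALE_AFTER = timedelta(days=182)

def stale_images(last_updated, now, max_age=STALE_AFTER):
    """Return the names of images not updated within `max_age`, sorted.

    `last_updated` maps image name -> timezone-aware datetime of last push.
    """
    return sorted(name for name, ts in last_updated.items() if now - ts > max_age)

# Example with hypothetical registry contents.
now = datetime(2024, 7, 1, tzinfo=timezone.utc)
registry = {
    "api:2.1": datetime(2024, 6, 1, tzinfo=timezone.utc),
    "legacy:0.9": datetime(2023, 11, 1, tzinfo=timezone.utc),
}
flagged = stale_images(registry, now)
```

The flagged list is a starting point for the conversation with application owners, not an automatic deletion list.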
Notably, these same principles apply to both on-prem and public cloud infrastructure. For example, cloud service providers can provide trusted infrastructure and services, including verification for regulatory frameworks such as PCI, but customers are responsible for protecting the software layers they run on top of that.
Thus, content trust based on signed images is equally important, regardless of where the containers are deployed.
Tailor the architecture to the degree of control needed
Many organizations’ early container deployments run inside hypervisor-based virtual machines. That approach is familiar and well adapted to the public cloud environment, where it bolsters data isolation beyond what’s possible with containers alone.
Jeppson explains the appeal of this approach to many customers in the public cloud, “where the physical hardware is still going to be shared, but you can leverage the properties of the virtual machine to [help] prevent break-ins and break-outs.”
Notwithstanding the advantages of that protection, the quest for cost efficiency motivates many organizations away from the overhead of hypervisor-based virtualization. Security solutions architect Sean Nicholson reports that as customers develop further along their containerization journeys, “they’re removing that hypervisor layer altogether, and they’re running [containers] directly on bare metal in their data centers or a managed environment such as Amazon ECS or Azure Container Service or Google Container Service.”
In particular, managed container orchestration services let customers abstract away management of the container environment, at the cost of ceding control of where specific workloads run. Retaining that control can allow an organization, for example, to run sensitive or regulated workloads only on a specified, segregated set of servers.
Nicholson likens the adoption of managed Docker and Kubernetes services to what we have seen over the past decade or so with public cloud. Organizations were originally cautious about moving production to the cloud, but “five years later, it’s like, ‘Oh, I don’t need a data center; let them worry about the hardware.’ I think the same thing is going to happen with [managed] containers.”
Managed container environments vary in terms of what they allow the customer to retain responsibility for, but typically, the control plane is the province of the service provider, giving the customer limited or no control over the container environment as a whole.
Conversely, managing your own container infrastructure enables nodes to restrict which containers can be scheduled to run on them. A container can also specify that it will only run on a specific sub-population of nodes.
Together with runtime scanning that allows only containers based on trusted images to run, these measures provide significant protection against malicious containers.
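In Kubernetes, the two placement controls described above correspond to taints and tolerations (the node decides what it will accept) and node selectors (the workload decides where it will run). The sketch below models that two-sided check in plain Python. It is deliberately simplified: taints are reduced to bare strings, and the dictionary layout and function names are assumptions for illustration, not the scheduler’s actual data model.

```python
def can_schedule(pod, node):
    """Return True only if both the node and the pod accept the placement."""
    # Node side: every taint on the node must be tolerated by the pod.
    untolerated = set(node.get("taints", ())) - set(pod.get("tolerations", ()))
    if untolerated:
        return False
    # Pod side: every label the pod selects on must match the node's labels.
    labels = node.get("labels", {})
    return all(labels.get(k) == v for k, v in pod.get("node_selector", {}).items())

def eligible_nodes(pod, nodes):
    """Names of the nodes on which the pod may be placed."""
    return [n["name"] for n in nodes if can_schedule(pod, n)]

# Hypothetical cluster: one general-purpose node, one segregated node.
nodes = [
    {"name": "general-1", "labels": {"zone": "general"}, "taints": []},
    {"name": "pci-1", "labels": {"zone": "pci"}, "taints": ["regulated"]},
]
payments = {"tolerations": ["regulated"], "node_selector": {"zone": "pci"}}
blog = {"tolerations": [], "node_selector": {}}
```

With this setup the regulated workload can land only on the segregated node, and ordinary workloads are kept off it.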
Independent of core architecture considerations such as hypervisor-based virtualization and managed container services, tactical considerations at the per-container level play key roles in protecting workloads. A few representative issues include the following:
- Embrace automation to protect containers in action. Changes in container environments happen at superhuman speeds—humans can’t keep up with them. Pinpointing problems automatically enables pulling containers or hosts out of service and replacing them almost instantaneously.
- Run containers with least privilege. The common practice of running as root inside of a container can expose not only the container itself, but potentially the host as well, if an attacker breaks out of the container. Mechanisms such as the USER instruction in a Dockerfile and the runAsUser field in a Kubernetes security context help mitigate this danger.
- Use protection mechanisms built into the container platform. For example, the Kubernetes API server authenticates and authorizes every incoming request, and admission controllers can then reject requests that violate policy. Anything that fails those checks can be blocked, whether it’s a nosy neighbor or a cavalcade of crypto-miners.
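The least-privilege and platform-protection items above can be combined into an automated policy check. The sketch below shows only the decision logic, applied to a dictionary shaped like a Kubernetes pod spec; a real admission controller would typically run as a webhook and receive richer request objects. Treating an unset `runAsUser` as a violation is a policy assumption made here (the image’s default user may be root), and the `admit` function name is hypothetical.

```python
def effective_user(pod_spec, container):
    """Container-level runAsUser wins over the pod-level setting."""
    container_ctx = container.get("securityContext") or {}
    pod_ctx = pod_spec.get("securityContext") or {}
    if "runAsUser" in container_ctx:
        return container_ctx["runAsUser"]
    return pod_ctx.get("runAsUser")

def admit(pod_spec):
    """Return (allowed, violations) for a pod spec under a non-root policy."""
    violations = []
    for c in pod_spec.get("containers", []):
        uid = effective_user(pod_spec, c)
        if uid is None:
            # Assumed policy: an unspecified user is not acceptable.
            violations.append(f"{c['name']}: runAsUser not set")
        elif uid == 0:
            violations.append(f"{c['name']}: runs as root")
    return (not violations, violations)
```

Returning the list of violations alongside the verdict gives developers a concrete reason when a deployment is refused, which keeps the gate from feeling arbitrary.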
Establish a collaborative vision across security, development and operations
As ever, security teams must navigate carefully to avoid being perceived as roadblocks.
“Developers can tend not to interface with security teams until an application is ready to go into production, and by that time, it’s too late. We need to foster an atmosphere of working together, starting as early in the process as possible,” Nicholson notes.
Full visibility into development, staging and orchestration pipelines by all concerned parties helps build that spirit of collaboration. It matters all the more as security tasks shift left in the development and deployment process with the emergence of DevOps and DevSecOps approaches in mainstream organizations.
Operationally, the gating factors for what’s allowed to run within containers are based on standards established jointly by development and security teams. Those gates provide critical guardrails for what’s allowed in the container environment.
To meet business and technical needs, these standards must optimize flexibility, and their enforcement needs to be automated with an eye toward minimizing overhead. Such factors are vital to positioning security teams as enabling container adoption, rather than interfering with it.
Security organizations are well advised to foster partnership with the development organizations and business units responsible for containers and the workloads that operate in them. By building security into every phase of software development and deployment, the organization can be guided by mutual interests at the operational level.
It’s possible for everyone to win. Nicholson calls for “agreement between the dev team building images and the security organization that’s giving them the thumbs-up to go as fast as they want, as long as they do so within the bounds of these established gates.”
Matt Gillespie is a technology writer based in Chicago. He can be found at www.linkedin.com/in/mgillespie1. A version of this article will appear in the May/June InfoSecurity Professional magazine, which is focused on cloud security.