opensource.google.com

Introducing Ephemeral Containers

Friday, January 21, 2022

Ephemeral containers in Kubernetes started with a simple question: is it feasible to run a service on Kubernetes without bundling a Linux distribution userland with every binary?

It was early 2016. Kubernetes had just released version 1.2, and my SRE team was evaluating Google Kubernetes Engine for internal workloads. Docker and Kubernetes examples always seemed to build images on top of Linux distributions like Debian or CentOS, but our build system produced a binary with its minimum set of library dependencies, so that's what I wanted to deploy as a container image.

This minimal container image worked fine, but only if I never made a mistake. Since the container image had no shell to use with kubectl exec, I had to log into the node with administrator privileges to interactively troubleshoot any problems. This produced an unfortunate debugging experience and was unacceptable from a security perspective.

What's more, kubectl exec had changed little from docker exec even though Kubernetes introduced new abstractions such as the Pod, where multiple containers share resources. How should Kubernetes-native troubleshooting work?

Debugging on Borg

Providing userspace utilities for cluster applications wasn't a new problem for Google. Google's existing cluster orchestration system, Borg, provides a common userland for processes. Rather than including system and debugging utilities with the application binary, Borg provides a basic set of userland utilities that applications can expect in their runtime environment. Another team maintains and updates these utilities independently of application binaries.

There are downsides to this approach: Updates to the common utilities can take weeks or months to roll out, application owners can't specify which utilities they need, and the utilities needed at run time may be completely different from the ones needed at debug time. We could do better for Kubernetes.

Extensibility for Kubernetes

I wanted a solution that felt native for Kubernetes and gave users the freedom to customize to their use case, but I was still new to Kubernetes. I reached out to SIG Node and discovered a welcoming, helpful open source community.

Together we considered different ways of deploying tools to a Pod at debugging time. Implementing the feature entirely on the client side would be easiest, but solutions such as copying binaries into the running container image didn't make debugging feel like a feature of the platform. Kubernetes deploys binaries using containers, so it's natural to use containers for troubleshooting as well.

Existing container types were tied to the Pod lifecycle, though. Containers and Init Containers run when a Pod starts, and neither may be added after a Pod is created. For administrative actions we needed a lifecycle more like kubectl exec. We needed a new type of container: the Ephemeral Container.

What are Ephemeral Containers?

Ephemeral containers are a new type of container in the Kubernetes core API. An ephemeral container may be added to an existing Pod for administrative actions like debugging. It runs until it exits and is not restarted. It runs within the Pod's existing resource allocation and shares common container namespaces.
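To make "added to an existing Pod" a little more concrete, here is a minimal sketch of the kind of request body that could be sent to a Pod's ephemeralcontainers subresource. The helper name ephemeral_container_patch and the container name "debugger" are hypothetical, chosen only for illustration. The field names follow the core v1 API as I understand it, and the code only builds a dictionary: it does not contact a cluster and is not a description of how kubectl debug is implemented.

```python
import json


def ephemeral_container_patch(name, image, command=None, target=None):
    """Build a patch body that adds one ephemeral container to a Pod.

    Hypothetical helper for illustration only. It builds a plain dict and
    never talks to a cluster.
    """
    container = {
        "name": name,
        "image": image,
        # Interactive debugging usually wants stdin and a terminal attached.
        "stdin": True,
        "tty": True,
    }
    if command:
        container["command"] = list(command)
    if target:
        # Optional: name of an existing container in the Pod whose
        # namespaces the ephemeral container should join.
        container["targetContainerName"] = target
    # No ports, probes, or resource requests are set here: the ephemeral
    # container runs inside the Pod's existing resource allocation.
    return {"spec": {"ephemeralContainers": [container]}}


patch = ephemeral_container_patch("debugger", "busybox", command=["sh"])
print(json.dumps(patch, indent=2))
```

Because the Pod already exists, the body only describes the new container. Everything else about the Pod stays as it was when it was created.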

How are Ephemeral Containers used?

Here are some debugging scenarios that are made easier using ephemeral containers.

Troubleshooting Clusters

I run a service named "apples" that consists of a Go binary running in a distroless container image. One of its pods is suddenly having trouble connecting to a backend service, but since it's a distroless image I can't use kubectl exec to troubleshoot:

% kubectl exec -it apples-57bcf49487-ddmpn -- sh
OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "sh": executable file not found in $PATH: unknown

We can use kubectl debug to add an ephemeral container and test the backend service:

% kubectl debug -it --image=busybox apples-57bcf49487-ddmpn -- sh
Defaulting debug container name to debugger-5wvgc.
/ # ps ax
PID   USER     TIME  COMMAND
    1 65535     0:00 /pause
    7 root      0:00 /app
   19 root      0:00 sh
   26 root      0:00 ps ax
/ # wget -S -O - http://bananas:8080
Connecting to bananas:8080 (10.0.0.237:8080)
  HTTP/1.1 500 Internal Server Error
wget: server returned error: HTTP/1.1 500 Internal Server Error

Technical Support

To make this easier to detect next time, I'll add this check to my operations team's autodiagnose script. The ops team doesn't have access to attach to production pods, but they do have access to run the autodiagnose image, which attaches its logs to a bug report:

% kubectl debug --image=gcr.io/apples/autodiagnose apples-57bcf49487-ddmpn -- --bug=1234
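If the ops team's tooling wraps this call, the wrapper might look something like the sketch below. The function name autodiagnose_command is hypothetical. The image, pod name, and --bug argument are taken from the command above, and the sketch only builds and prints the argument list rather than running kubectl.

```python
import shlex


def autodiagnose_command(pod, bug, image="gcr.io/apples/autodiagnose"):
    """Return the argv for a non-interactive kubectl debug run.

    Hypothetical helper: everything after "--" is passed to the
    autodiagnose image as its own arguments.
    """
    return ["kubectl", "debug", f"--image={image}", pod, "--", f"--bug={bug}"]


argv = autodiagnose_command("apples-57bcf49487-ddmpn", 1234)
# Print the command as a shell-safe string for logging or review.
print(shlex.join(argv))
```

Keeping the arguments as a list avoids shell quoting problems. A caller could hand argv to subprocess.run(argv, check=True) to execute it.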

What's Next for Ephemeral Containers?

Ephemeral Containers are available as a beta feature in Kubernetes 1.23, but we still have lots of work to polish the rough edges and improve kubectl debug to support more debugging journeys, such as configuring the container security context to allow attaching a debugger.

Try out Ephemeral Containers and let us know how they work for you in the Ephemeral Containers and kubectl debug enhancement issues.

Contributor Experience

Working with the Kubernetes community has been incredibly rewarding. When I started I didn't know nearly enough to contribute something like this, but I discovered a community that works hard to welcome contributions at all levels.

I want to thank the community, and especially Dawn Chen, Yu-Ju Hong, Jordan Liggitt, Clayton Coleman, Maciej Szulik, and Tim Hockin, for providing the support and guidance that made this feature possible.

Kubernetes will welcome your contribution as well! See kubernetes.dev for how to get started.


By Lee Verberne, Site Reliability Engineer – Google Cloud Platform
