Unleashing autonomous AI agents: Why Kubernetes needs a new standard for agent execution

Joan Kallogjeri & Tomer Glottmann, GCP Google Kubernetes Engine

The arrival of autonomous AI agents, capable of reasoning, planning, and executing actions by generating their own code and interacting with their runtime environment, marks a paradigm shift in how applications are built and operated. However, these new capabilities also introduce a fundamental security gap: how to safely let agents run untrusted, unverified generated code, perform actions, and access data in runtime environments, especially those that involve mission-critical infrastructure or proprietary data.

We are excited to announce a major initiative within the Kubernetes community to address this exact challenge: we are launching Agent Sandbox as a formal subproject of Kubernetes SIG Apps, hosted under kubernetes-sigs/agent-sandbox.

This is more than just a tool; it is designed to standardize agent execution and evolve Kubernetes into the most secure and scalable platform for agentic workloads.

The Latency Crisis for Interactive AI

Agent behavior often involves quick, iterative tool calls — checking a file, running a calculation, or querying an API. For security reasons, each of these calls requires its own isolated sandbox.

The challenge is that these sandboxes must be created from scratch, extremely quickly, to ensure isolated environments between executions. Because security and isolation are non-negotiable, the "spin-up" time becomes the critical bottleneck. If the secure execution environment takes too long to spin up, the entire agent application stalls, killing the interactive experience.

The Bottleneck of Massive Throughput

Enterprise platforms require infrastructure that can handle overwhelming scale. Users running complex AI agent workloads demand support for up to tens of thousands of parallel sandboxes, processing thousands of queries per second. To meet this challenge, we are extending Kubernetes' proven capabilities for managing high-capacity, low-latency applications, models, and infrastructure to fit a growing class of single-instance workloads, such as AI agent runtimes or dev environments, that require a lightweight, VM-like abstraction. A standardized, controller-based Sandbox API provides a Kubernetes-native solution for these use cases, avoiding the workarounds required today and paving the way for the next generation of cloud-native AI applications.
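As a rough back-of-the-envelope sketch, the two pressures described above can be put into numbers. The first function applies Little's law (average concurrency equals arrival rate times average lifetime); the second estimates how much of a tool call is spent waiting for the sandbox to start. The function names and all input values are illustrative assumptions, not measurements from the project.

```python
def concurrent_sandboxes(arrivals_per_second: float, mean_lifetime_seconds: float) -> float:
    """Little's law: average number of sandboxes alive at the same time."""
    return arrivals_per_second * mean_lifetime_seconds


def startup_share(startup_seconds: float, work_seconds: float) -> float:
    """Fraction of one tool call's wall-clock time spent waiting for startup."""
    return startup_seconds / (startup_seconds + work_seconds)


if __name__ == "__main__":
    # Illustrative inputs only: 2,000 new sandboxes per second, each living 10 seconds.
    print(concurrent_sandboxes(2000, 10))  # 20000 sandboxes alive on average

    # Illustrative inputs only: a slow cold start versus a sub-second warm start
    # for a tool call that does half a second of real work.
    print(f"cold: {startup_share(5.0, 0.5):.0%} of the call is startup")
    print(f"warm: {startup_share(0.5, 0.5):.0%} of the call is startup")
```

Under these assumed inputs, a few thousand short-lived requests per second are already enough to keep tens of thousands of sandboxes alive at once, which is why startup latency and throughput have to be addressed together.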

The Agent Sandbox: A new Agent Standard for Kubernetes

To solve these problems, we are introducing a new, declarative resource focused strictly on the Sandbox primitive, designed from the ground up to be backend-agnostic.
The goal is to provide a persistent, isolated instance for single-container, stateful, singleton workloads, managed entirely through familiar Kubernetes constructs. The core APIs include:

Sandbox: The core resource, which defines an agent sandbox workload that runs an isolated instance of the agent's environment.
SandboxTemplate: Defines the secure blueprint of a sandbox archetype, including resource limits, base image, and initial security policies.
SandboxClaim: A transactional resource that lets users or higher-level frameworks (like ADK or LangChain) request an execution environment, abstracting away the complex provisioning logic.
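To make the relationship between the three resources concrete, here is a minimal sketch that builds one manifest of each kind as a plain Python dictionary. Only the kind names come from the list above. The API group/version strings, the `podTemplate` and `sandboxTemplateRef` field names, the container name, and the image are assumptions made for illustration; consult the CRD definitions in the repository for the actual schema.

```python
import json

# ASSUMPTION: group/version strings are illustrative; verify against the real CRDs.
CORE_API = "agents.x-k8s.io/v1alpha1"
EXT_API = "extensions.agents.x-k8s.io/v1alpha1"


def pod_template(image, cpu, memory):
    """Single-container pod template with resource limits."""
    return {
        "spec": {
            "containers": [
                {
                    "name": "agent-runtime",  # hypothetical container name
                    "image": image,
                    "resources": {"limits": {"cpu": cpu, "memory": memory}},
                }
            ]
        }
    }


def sandbox(name, image, cpu="1", memory="1Gi"):
    """One isolated, single-container instance (field names assumed)."""
    return {
        "apiVersion": CORE_API,
        "kind": "Sandbox",
        "metadata": {"name": name},
        "spec": {"podTemplate": pod_template(image, cpu, memory)},
    }


def sandbox_template(name, image, cpu="1", memory="1Gi"):
    """Reusable blueprint: base image plus resource limits (field names assumed)."""
    return {
        "apiVersion": EXT_API,
        "kind": "SandboxTemplate",
        "metadata": {"name": name},
        "spec": {"podTemplate": pod_template(image, cpu, memory)},
    }


def sandbox_claim(name, template_name):
    """Request for an environment built from a named template (field names assumed)."""
    return {
        "apiVersion": EXT_API,
        "kind": "SandboxClaim",
        "metadata": {"name": name},
        "spec": {"sandboxTemplateRef": {"name": template_name}},
    }


if __name__ == "__main__":
    template = sandbox_template("python-runtime", "example.com/agent-runtime:latest")
    claim = sandbox_claim("task-1234", template["metadata"]["name"])
    print(json.dumps(claim, indent=2))
```

The point of the split is that a framework only needs to create the small claim object; the template carries the security-sensitive details and can be owned by a platform team.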

In addition to the Sandbox primitive, we are launching with extra features that improve the overall user experience:

WarmPools: Fast instance startup is an important part of the usability of agentic sandboxes, so we introduced the Warm Pool extension. The Sandbox Warm Pool Orchestrator uses a dedicated CRD to maintain a pool of pre-warmed pods, allowing the Sandbox Controller to claim a ready instance upon creation and reduce cold startup latency to less than one second.
Shutdown Time: Since agentic behavior can be unpredictable, this feature supports clean termination and cleanup of sandboxes. It automates deletion by letting you set an absolute time at which the sandbox terminates.
Python API/SDK: For better usability and a developer-friendly way to interact with these CRDs programmatically, we provide an example SDK that abstracts away Kubernetes complexities behind simple Pythonic functions.
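The warm pool and shutdown-time ideas can be illustrated with a small in-memory model. This is a toy sketch and not the project's controller or SDK: the `WarmPool` and `Instance` classes, the `claim` and `replenish` methods, and the `expired` helper are all hypothetical names invented for this example. It shows a claim being served from pre-warmed instances when one is available, falling back to a cold start otherwise, and every instance carrying an absolute shutdown time.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Instance:
    name: str
    warm: bool  # True if it came from the pre-warmed pool
    shutdown_time: Optional[datetime] = None


class WarmPool:
    """Toy in-memory stand-in for a pool of pre-warmed pods (hypothetical)."""

    def __init__(self, size):
        self.size = size
        self._counter = 0
        self._ready = deque()
        self.replenish()

    def _new_name(self):
        self._counter += 1
        return f"sandbox-{self._counter}"

    def replenish(self):
        # Top the pool back up to its target size.
        while len(self._ready) < self.size:
            self._ready.append(Instance(self._new_name(), warm=True))

    def claim(self, ttl, now=None):
        """Hand out a ready instance if possible, otherwise cold-start one."""
        now = now or datetime.now(timezone.utc)
        if self._ready:
            inst = self._ready.popleft()
        else:
            inst = Instance(self._new_name(), warm=False)  # cold-start fallback
        inst.shutdown_time = now + ttl  # absolute time, not a relative timer
        return inst


def expired(instances, now):
    """Instances whose absolute shutdown time has passed and should be deleted."""
    return [i for i in instances if i.shutdown_time is not None and i.shutdown_time <= now]


if __name__ == "__main__":
    pool = WarmPool(size=2)
    claimed = [pool.claim(timedelta(minutes=5)) for _ in range(3)]
    for inst in claimed:
        print(inst.name, "warm" if inst.warm else "cold", inst.shutdown_time.isoformat())
```

Storing the shutdown time as an absolute timestamp means a cleanup loop only has to compare it against the current time, regardless of when the sandbox was created or restarted.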

The standard is designed to seamlessly support multiple isolation backends, like gVisor and Kata Containers, allowing developers to choose the technology that best fits their security and performance trade-offs.
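In Kubernetes, a pod selects its container runtime through the `runtimeClassName` field of the pod spec, which refers to a RuntimeClass object configured on the cluster. The sketch below shows how a backend choice could be applied to a pod spec held as a Python dictionary. The `with_isolation` helper is hypothetical, and the RuntimeClass names `gvisor` and `kata` are assumptions: the actual names depend on how each cluster is set up.

```python
# ASSUMPTION: RuntimeClass names are cluster-specific; these are placeholders.
RUNTIME_CLASSES = {
    "gvisor": "gvisor",
    "kata": "kata",
}


def with_isolation(pod_spec, backend):
    """Return a copy of pod_spec that requests the given isolation backend."""
    if backend not in RUNTIME_CLASSES:
        raise ValueError(f"unknown isolation backend: {backend!r}")
    updated = dict(pod_spec)  # shallow copy so the caller's spec is left unchanged
    updated["runtimeClassName"] = RUNTIME_CLASSES[backend]
    return updated


if __name__ == "__main__":
    base = {"containers": [{"name": "agent-runtime", "image": "example.com/agent-runtime:latest"}]}
    print(with_isolation(base, "gvisor")["runtimeClassName"])
    print(with_isolation(base, "kata")["runtimeClassName"])
```

Because the backend is a single field on the pod spec, the same sandbox definition can be pointed at a different isolation technology without changing the workload itself.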

The new Agent Sandbox features and implementations are available now in the GitHub repository kubernetes-sigs/agent-sandbox and on our website, agent-sandbox.sigs.k8s.io. We invite all developers, partners, and experts to join this critical community effort to define the secure, scalable future of autonomous AI on Kubernetes.

We will present a technical deep dive and officially launch the project at KubeCon Atlanta in November 2025. We hope to see you there!

opensource.google.com

Google Open Source Blog