Google Open Source Blog: 2026

Posts from 2026

Google Cloud: PostgreSQL community contribution updates

Friday, July 10, 2026

by Dilip Kumar, Cloud SQL for PostgreSQL & Matt Cornillon, Sales EMEA

Group photo of the Google Cloud team smiling and standing at the Google Cloud booth during the PGConf India event.

Google Cloud is deeply committed to the long-term success of the PostgreSQL ecosystem. Our involvement goes beyond providing managed PostgreSQL services; it also includes active participation in the open source communities through technical contributions, leadership in conference committees, and sharing architectural insights that benefit all users. The following is a recap of recent events in which Google Cloud participated.

PGConf.dev 2026

Serving as a vital developer-centric hub, PGConf.dev provides a unique opportunity for collaboration with the full assembly of senior PostgreSQL committers. This gathering is essential for aligning technical efforts and shaping the future project roadmap.

Key Highlights

Participation focused on strategic coordination with PostgreSQL committers regarding logical replication development, and a consultation on global index architecture.
High community interest confirms the Global Index feature solves a vital architectural requirement for enterprises.
Established community consensus to pursue a deparsing-based architectural approach for DDL replication.

Google Cloud Sessions

Dilip Kumar, a PostgreSQL contributor from Google Cloud, presenting "Experimenting with a Global Index in PostgreSQL" at pgconf.dev 2026 in Vancouver. He is speaking at a podium next to a presentation slide detailing the Global Index storage architecture and PartitionIdentifier management.

Session Title	Session Type	Speakers/Led by
Experimenting with a Global Index in PostgreSQL: Design, Implementation, and Challenges	Technical Talk	Dilip Kumar
Unconference: Global Indexes	Unconference Session	Dilip Kumar
Unconference: Logical Replication: Warts and Missing Pieces	Unconference Session	Hannu Krosing

PGConf India 2026

Key Highlights

The three-day conference comprised a training day followed by two days of sessions, and drew more than 580 participants.
The conference sessions included a mix of keynotes, breakout technical sessions, sponsor sessions, and booth interactions.

Google Cloud Sessions

Session Title	Session Type	Speakers
Database And GenAI	Keynote	Paresh Rathod
Experimenting with a Global Index in PostgreSQL	Technical Talk	Dilip Kumar
GCP - Best home to run PostgreSQL	Sponsor Session	Trusar Borse, Abhijeet Rajkur
Beyond shared_buffers: On-Demand Memory PostgreSQL	Technical Talk	Rajeev Rastogi, Vaibhav Popat
Where is my Memory	Technical Talk	Pushkar Kalidkar
Agentic AI Applications with GCP Databases	Keynote	Abhijeet Rajkur, Rishi Kapoor, Saurabh Gupta

PGDay Paris & PGDay France 2026

France hosts two distinct flagship PostgreSQL events, and Google Cloud is deeply embedded in both as organizers and technical contributors. PGDay Paris serves as an international, English-language hub for the European community, while PGDay France is a community-driven, traveling event that focuses on the francophone ecosystem and took place in Toulouse in 2026.

Key Highlights

Matt Cornillon served on the organization committee for PGDay France, while Yves Colin contributed as a member of the program committee.

Google Cloud Sessions

Session Title	Session Type	Speakers
Creating a "Dungeon Master" with Postgres and MCP	Technical Talk	Matt Cornillon
Create your first AI agent with PostgreSQL	Workshop	Matt Cornillon, Yves Colin

FOSDEM PGDay 2026

FOSDEM PGDay is a prominent open source gathering that brings together developers from across the globe to discuss the latest PostgreSQL advancements. It serves as an essential platform for exploring emerging paradigms in database development.

Key Highlights

Exploration of how AI-assisted workflows are redefining development beyond standard autocomplete for SQL queries.

Google Cloud Sessions

Session Title	Session Type	Speakers
Vibe-coding with Postgres: really?	Technical Talk	Matt Cornillon

PGConf Belgium 2026

PGConf Belgium 2026 took place at the UCLL Campus Proximus in Haasrode, Belgium, serving as an outstanding learning and networking platform for the local PostgreSQL community and students.

Key Highlights

The session was selected by faculty as supportive material for a database exam following deep student engagement.

Google Cloud Sessions

Session Title	Session Type	Speakers
Creating a "Dungeon Master" with Postgres and MCP	Technical Talk	Matt Cornillon

Nordic PG Day 2026

Nordic PG Day is the largest PostgreSQL event in the Scandinavian countries. The 2026 edition took place in Helsinki, gathering more than 130 PostgreSQL enthusiasts for a day of deep dives.

Key Highlights

Google joined as an official Partner-level sponsor for the first time, including a dedicated table booth.

Google Cloud Sessions

Session Title	Session Type	Speakers
Unlock AI Agents with PostgreSQL	Technical Talk	Mats Berglin, Miguel Toscano

Swiss PGDay 2026

Swiss PGDay is the annual event organized by the Swiss PostgreSQL User Group in Rapperswil, Switzerland. The ninth edition featured sessions in both English and German.

Key Highlights

Demonstration of the physical impact of pushing millions of vectors to PostgreSQL based on a real-world use case.

Google Cloud Sessions

Session Title	Session Type	Speakers
Surviving pgvector in production: a reality check	Technical Talk	Miguel Toscano

Postgres Conference: 2026 San Jose

Since its inception in 2007, the Postgres Conference has served as a cornerstone for advancement, fostering a rich environment for learning and professional networking.

Key Highlights

Google proudly served as a sponsor for the event.
Adapting PostgreSQL for the artificial intelligence era demands a transformation in operational approaches. With the rise of natural language tools and vibe coding speeding up development, Agentic AI places advanced demands on production databases. In his presentation, Vikas examines how Google Cloud managed services have evolved to handle these workloads, providing architectural strategies and best practices for contemporary AI deployment.

Google Cloud Sessions

Session Title	Session Type	Speakers
Postgres and AI - Stronger Together!	Technical Talk	Vikas Arora

Community Leadership and Committees

Googlers play a vital role in shaping the direction of the most prestigious PostgreSQL developer events. Our leadership in these committees helps ensure that enterprise-grade requirements—such as those needed for large-scale migrations—are part of the global conversation.

PGConf.dev 2026: Dilip Kumar served on the Program Committee.
PGConf India 2026: Dilip Kumar was a member of the Paper Selection Committee.
PGDay France: Matt Cornillon was a member of the organization committee and Yves Colin served as a member of the Program Committee.

Looking Forward

Our commitment remains firm: to turn feedback from these global events into code, reviews, and active community partnerships. We thank the wider PostgreSQL community and the project's committers for their continued collaboration in making PostgreSQL better for everyone.

Acknowledgement

We extend our heartfelt appreciation to our open source community contributors for their outstanding dedication and active participation in making PostgreSQL conferences a great success.

Abhijeet Rajurkar, Darshan Nagarajappa, Dilip Kumar, Hannu Krosing, Mats Berglin, Matt Cornillon, Michael Bautin, Miguel Toscano, Niranjan Shivprasad, Paresh Rathod, Rajeev Rastogi, Vaibhav Popat, Vikas Arora, and Yves Colin

Furthermore, we are deeply grateful to the broader PostgreSQL open-source communities, especially the dedicated conference organizers, committee members, and all supporting sponsors.

Community feedback: How can corporations improve support for open source maintainers?

Tuesday, June 30, 2026

by Sophia Vargas, Google Open Source

We know that AI is actively transforming the sustainability and socio-technical dynamics of OSS communities. Google Open Source is committed to partnering with open source communities and ecosystems to learn together how we should update our own models for engagement and support.

During an open meetup for GitHub Maintainer Month, I led a session to gather community feedback on how corporations can more effectively support open source maintainers.

Paying maintainers takes creativity

Many maintainers would appreciate consistent financial support. However, facilitating payments to individuals without established contractual relationships remains a complex challenge, particularly across diverse international jurisdictions. Fiscal hosts and programs such as Open Collective, GitHub Sponsors, and the LFX Mentorship Program can simplify components of this process, but they do not resolve the underlying issues of funding sustainability and predictability. While initiatives like the Open Source Endowment are working toward long-term funding sustainability, individual maintainers also had a few ideas:

Pay per meaningful contribution vs. gameable metrics: Avoid payment models based on easily manipulated units like pull request counts or review volume. A proposed alternative is "pay per report," which encourages maintainers to document their achievements and upcoming roadmaps.
Commitment-based purchasing: Corporate policies might make procurement simpler (or more complex) than sponsorships, so maintainers could benefit from offering structured support services alongside traditional sponsorship opportunities.
Fund conference attendance: In-person networking can be a boon for solo maintainers but it's often cost-prohibitive. For some corporations, travel sponsorship may be a simpler alternative to direct payments.

Challenge for Corporations and Fiscal Hosts: How can we assist maintainers in understanding any and all prerequisites and documentation necessary to participate in monetary programs?

Advice from Maintainers: Consult a tax professional to understand the implications of various funding methods.

Manage and respect expectations

Beyond financial support, our discussion returned to the importance of respect and etiquette. In particular, how can we manage expectations among a heterogeneous mix of creators, contributors, and users? Are maintainers clearly communicating their preferences, and are corporations actively respecting them? Some suggestions include:

Adherence to community norms: Maintainers should share their preferred communication channels, and contributors, both human and agentic, must review and follow them.
Consistency with documentation: Discrepancies between documented procedures and actual practices create friction for all participants. This standard should be upheld by both individual maintainers and corporate-managed projects.
Clarity of intent: Many maintainers would like to understand the motivation behind a contribution and reserve the right to ask questions.

To improve specific program experiences, maintainers suggested:

Consistent communication: Recipients of funding programs expect clearly communicated expectations regarding the timing and amount of disbursements.
Transparency and discoverability: Maintainers would appreciate easily discoverable records that track program participation, active agreements, and verify the status of Contributor License Agreements (CLAs).

Let's keep learning as a community

While we cannot make any promises, we want to continue to learn and challenge ourselves to consider novel ways to support OSS communities and maintainers. You are a member of our community, and we value your opinion. We've created a Google form to collect any thoughts you might have and to gauge interest in another open meeting. We plan to share what we learn with the community.

Documenting the manual: how curiosity and robotic arms led to a career in open source

Monday, June 22, 2026

by Daryl Ducharme, Google Open Source

When you think of "innovation" in open source, your mind probably jumps to the latest AI model or a revolutionary new framework. You might not immediately think of manual pages. Even Alejandro "Alex" Colomar, who spends his days maintaining Linux Kernel documentation, jokingly admits that some might find the work "boring" because it focuses on fixing existing issues and documenting new features rather than flashy inventions.

But as any developer knows, the most powerful code is only as good as the documentation behind it. At Google, we believe that investing in the success of projects we don't own is a core part of being a good open source citizen. That is why we are proud to sponsor Alejandro's work on the Linux Kernel man-pages project—supporting the critical infrastructure that many of our own systems rely on every day.

Documentation is the gift you give to your future self and your whole community.

The precision of a robot

Alejandro's journey into the world of essential documentation started at university. He was working with robotic arms that used a proprietary scripting language. Wanting more control, he decided to write a C library to communicate with the robots over the network by sniffing packets with Wireshark. It worked, but it was slow—he had to wait seconds between commands to ensure the robot had finished moving.

To make the movements smooth, he needed to understand the messages the robot was sending back in real-time. This required high-precision timing. He found SO_TIMESTAMP, which provided microsecond precision, but he noticed a macro called SO_TIMESTAMPNS in the header files that promised nanosecond resolution. The problem? It wasn't documented in the manual page.

The first patch

After figuring out how to use the undocumented feature by looking at the kernel source code, Alejandro decided to ensure the next person wouldn't have to struggle. He cloned the man-pages repository, wrote a new paragraph based on existing features, and figured out how to send a plain-text patch via email.

"As it was my first patch, I was a bit intimidated by the procedure," Alejandro recalls. That intimidation led to a commit message he is still proud of today: roughly 120 lines of explanation for just 25 lines of new documentation. He wanted to prove that he had done his homework. The welcoming response from the maintainer encouraged him to keep going, leading to more patches and, eventually, a career-long dedication to clarity in open source communities.

Sustaining the commons

Google understands that open source is a "small community built on trust." By supporting maintainers like Alejandro, we help ensure that critical infrastructure—like the documentation that powers the Linux ecosystem—remains accurate and accessible for everyone. We believe that using open source comes with a responsibility to contribute and sustain it, which is why we partner with developers to maintain and grow critical projects.

Alejandro's work doesn't just help himself; it helps thousands of other programmers who rely on correct documentation to build the next generation of technology. As he puts it: "I couldn't program without correct documentation, so whenever I find an issue in documentation, I try to fix it."

A garden that needs tending

We often say that a community is a garden, not a building—it requires constant tending, not just initial construction. By sponsoring Alejandro, we are helping to tend that garden, ensuring the "manual" remains a living, breathing resource for the global developer ecosystem. Whether it is fixing a typo or documenting a high-precision networking macro, every contribution makes the "eyes" on the code that much sharper.

In-place pod restarts: Boosting efficiency and workload reliability in Kubernetes v1.35

Thursday, June 18, 2026

by Duncan Campbell & Giuseppe Tinti Tomio, Kubernetes

Operational efficiency and system resilience are critical when running platforms at scale. Yet, in Kubernetes, recovering from software crashes has been a headache: you could not trigger a clean restart of a Pod's containers without recreating the entire Pod object, which wastes resources.
To address this, Restart All Containers on Container Exits graduated to beta and is enabled by default in Kubernetes v1.36. Developed in close collaboration with the CNCF community, this capability represents Google's commitment to investing in the success of foundation-led open source projects. By sharing best practices from running large distributed systems internally, we are helping build a more resilient and efficient ecosystem. Letting containers restart while keeping the Pod's runtime identity provides a built-in way to perform in-place Pod recovery, boosting application reliability and saving resource costs.

The Problem: The High Cost of Pod Re-creation

Historically, Kubernetes managed failures using Pod-level restart policies. While sufficient for simple services, this model falls short for modern multi-container Pods, which often have complex dependencies. When a failure required a full environment reset, the only option was to delete and recreate the entire Pod.
This introduces massive control plane churn, causing latency and pressure on the etcd backend during large failures:

Initialization Dependencies: If a main container corrupts a local environment, for example, single-use secrets that must be re-requested, restarting just that container is insufficient; the setup must run again.
Watcher Interoperability: If a watcher sidecar detects a fatal error, it must trigger a full recreate of the entire pod and its infrastructure, including the sandbox.
Stale States: If a database sidecar proxy restarts, the main application can get stuck attempting to use stale, broken connections.
Resource Race Conditions: When a large job finds a proper set of nodes, recreating Pods can lead to other pending Pods taking over those resources. In-place restarts eliminate this race condition risk.

Previously, resolving these failures required destroying the entire Pod. For large batch or AI/ML workloads, where thousands of Pods might fail simultaneously, this can lead to "Thundering Herd" scheduling requests, delaying recovery and wasting expensive GPU/TPU compute time.

Introducing In-Place Restarts: The RestartAllContainers Action

Kubernetes v1.35 introduces the RestartAllContainers action, enabled by the RestartAllContainersOnContainerExits feature gate, which graduated to beta in 1.36 alongside its dependencies ContainerRestartRules and NodeDeclaredFeatures. This lets a container's exit behavior trigger a fast, in-place restart of the entire Pod on its existing node.
The Kubelet halts all containers while keeping the Pod sandbox intact, preserving critical infrastructure:

Network Identity: Keeps the same IP, network namespace, and UID, completely bypassing IP reassignment.
Hardware and Devices: Keeps GPUs/TPUs bound, eliminating scheduling and re-allocation delays.
Storage Mounts: Volumes, including emptyDir and PVCs, remain fully mounted; their content is not cleared during restarts.

Once all containers have terminated, the Kubelet re-runs init containers (including sidecars, which are part of the init sequence) in order, guaranteeing a clean setup in a known-good environment.

A Native Pod Specification Example

You can implement this under the container's restartPolicyRules field. Here is a quick example of how a watcher sidecar can trigger an in-place restart of the entire Pod by exiting with code 88:
Note: Image names and paths in the YAML below are for illustrative purposes.

apiVersion: v1
kind: Pod
metadata:
  name: ml-worker-pod
spec:
  restartPolicy: Never
  initContainers:
    - name: setup-environment
      image: registry.k8s.io/ml-tools/setup-worker:v1.0
    - name: watcher-sidecar
      image: registry.k8s.io/ml-tools/watcher:v1.0
      restartPolicy: Always
      restartPolicyRules:
        - action: RestartAllContainers
          exitCodes:
            operator: In
            values: [88]
  containers:
    - name: main-application
      image: registry.k8s.io/ml-tools/training-app:v1.0
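
The watcher's side of this contract is simply to exit with the configured code. Here is a minimal Python sketch, assuming a hypothetical `fatal_error_detected` probe and a simple polling loop. Only the exit code 88 comes from the manifest above; every other name is illustrative.

```python
import time

# Must match the value listed under restartPolicyRules in the Pod spec above.
RESTART_ALL_EXIT_CODE = 88


def watch(check_fatal, poll_seconds=5.0, max_polls=None, sleep=time.sleep):
    """Poll check_fatal() until it reports a fatal error.

    Returns RESTART_ALL_EXIT_CODE when a fatal error is seen, or 0 if
    max_polls is reached first (max_polls=None polls forever).
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if check_fatal():
            return RESTART_ALL_EXIT_CODE
        polls += 1
        sleep(poll_seconds)
    return 0


def fatal_error_detected():
    """Hypothetical probe; replace with a real check of the main application."""
    return False


# In the container entrypoint, the return value becomes the process exit code:
#     sys.exit(watch(fatal_error_detected))
```

Returning the code from a function, rather than calling `sys.exit` deep inside the loop, keeps the logic easy to test outside a cluster.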

The Operational Impact of In-Place Restarts

For organizations running distributed workloads, RestartAllContainers provides serious operational advantages:

No Control Plane Overhead: By preserving identity, clusters avoid scheduling latency and DNS propagation. This was a key factor for JobSet using this feature to reduce recovery from minutes to seconds.
Node Locality Preservation: Since the Pod stays anchored to the same node, restarted containers can instantly access local, warm storage caches.
Maximized Hardware Efficiency: In distributed AI training, losing a single node halts the entire job. Keeping accelerators like GPUs/TPUs bound lets workloads resume training significantly faster, directly reducing compute costs.

Observability and SRE Best Practices

To support monitoring, Kubernetes v1.35 introduces the AllContainersRestarting Pod condition. It is set to True while a restart is in progress, which tells SREs and autoscalers that the Pod is recovering and helps prevent false-positive alerts. Container restart counts also increment, so Prometheus can track recovery events.
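
External tooling can check for this condition before raising an alert. The sketch below assumes the Pod has already been fetched and decoded from JSON (for example, from `kubectl get pod <name> -o json`). The helper name is hypothetical, and the condition is read from the standard `status.conditions` list.

```python
def is_restarting_in_place(pod):
    """Return True if the Pod reports AllContainersRestarting=True.

    `pod` is a Pod object decoded from JSON into a dict.
    """
    conditions = pod.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "AllContainersRestarting" and c.get("status") == "True"
        for c in conditions
    )
```

An alerting rule could use this to suppress a "Pod not ready" page while the in-place restart is still running.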
To use in-place restarts successfully, shift your mental model to "persistent sandboxes" and follow three best practices:

Ensure Reentrancy: Kubelet only guarantees "at least once" execution for init containers. Reentrancy is now a standard requirement, so your code must be fully idempotent.
Plan for Termination Handling: Graceful termination (preStop hooks) is not supported for in-place restarts. SIGKILL is almost immediate, so applications must handle sudden exits gracefully.
Prepare External Tooling: CD and observability tools should expect re-running init containers without interpreting them as new deployments.
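
The reentrancy requirement can be illustrated with a small init step. This is a sketch under stated assumptions: the directory layout, the `.lock` file convention, and the `worker.conf` contents are all hypothetical. The point is only that each operation gives the same result whether it runs once or many times against a volume that survived the restart.

```python
import os
import tempfile


def ensure_workspace(root):
    """Prepare a scratch workspace; safe to run any number of times."""
    cache_dir = os.path.join(root, "cache")
    os.makedirs(cache_dir, exist_ok=True)  # no error if it already exists

    # Volumes survive in-place restarts, so clear leftovers from the last run.
    for name in os.listdir(cache_dir):
        if name.endswith(".lock"):
            os.remove(os.path.join(cache_dir, name))

    # Write the config atomically so a crash mid-write never leaves a
    # partial file behind for the next run to trip over.
    config_path = os.path.join(root, "worker.conf")
    fd, tmp_path = tempfile.mkstemp(dir=root)
    with os.fdopen(fd, "w") as f:
        f.write("mode=training\n")
    os.replace(tmp_path, config_path)
    return config_path
```

Because termination is abrupt (no preStop hook), the atomic rename matters: the next init run sees either the old complete file or the new complete file.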

What's Next?

This beta capability is a major step toward fluid workload management and serves as a building block for advanced community features like JobSet in-place restarts (KEP-467).
Our work on KEP-5532 reflects our commitment to transparent open source governance. Developed collaboratively within SIG Node, this feature shows how we hold ourselves to high citizenship standards: making our design, goals, and intentions transparent while building shared best practices that benefit everyone. We encourage you to experiment with Kubernetes v1.35 and share your feedback with the community!

Learn More

Read the Kubernetes Pod Lifecycle Documentation.
Explore KEP-5532: Restart All Containers on Container Exits.
Review the Kubernetes v1.35: Restart All Containers Blog Post.
Join the SIG Node Community on Slack (#sig-node).

Open rails for agentic commerce at Open Source Summit North America 2026

Tuesday, June 16, 2026

by Anurag Sinha, Universal Commerce Protocol (UCP)

At Open Source Summit North America 2026, I shared why agentic commerce needs open rails.

As AI agents become more capable, the shopping journey is shifting from "show me" to "help me." Instead of browsing, comparing, clicking, and checking out step by step, people can increasingly ask an agent to help them decide what to buy and, in some cases, complete the purchase. Industry forecasts suggest agentic shopping could account for roughly 10% to 25% of U.S. e-commerce by 2030 (Bain), which points to a meaningful shift in how digital commerce will work. Watch the full keynote here.

Why shared rules matter

That shift also exposes a challenge. Commerce is still highly fragmented. Different businesses, payment providers, and platforms operate with their own rules, workflows, and business logic. Every new surface adds more integration work. Every bespoke connection creates more complexity. And that fragmentation makes it harder for AI systems to understand and perform commerce actions consistently across businesses. A shared language lowers that barrier for everyone.

A common language for agentic commerce

That is the problem Universal Commerce Protocol (UCP) is designed to solve.

We launched the Universal Commerce Protocol, or UCP, with industry leaders to establish an open standard for agentic commerce, built to work across the shopping journey. UCP creates a common language for agents and systems to operate together across consumer surfaces, businesses, and payment providers, so the ecosystem does not need a different bespoke integration for every new agent or platform.

Just as importantly, UCP is designed for the real world. Every business has its own way of selling. Checkout, fulfillment, loyalty, policy logic, shipping, and post-purchase flows can vary widely between a local shop, a marketplace, and a large retailer. UCP is built to support that reality.

A diagram of the Universal Commerce Protocol (UCP), subtitled 'The common language for platforms, agents and businesses.' It illustrates a central UCP framework containing modules for 'Shopping' and 'Common' services, flanked by 'Consumer platforms' on the left and 'Business platforms' on the right, with bidirectional arrows showing how they connect and communicate through the central protocol.

A layered architecture for a shared commerce language

UCP uses a layered model to create a reusable shared language for commerce. Services organize domains like shopping and common. Capabilities define core actions such as checkout, catalog, cart, orders, and shared functions like identity linking. Extensions keep those capabilities configurable, so features like fulfillment can be modeled once and reused across multiple flows instead of being hardwired each time. At the transport layer, UCP stays agnostic, supporting bindings like REST, Model Context Protocol, and Agent2Agent.

Together with capability discovery and payment handling, these layers help consumer platforms, agents, and businesses interoperate more consistently over time. They also let different participants advertise what they support, compose new behaviors, and communicate over the transport that works best for them.
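
To make the idea of advertising and discovering capabilities concrete, here is a purely illustrative Python sketch. It is not the UCP schema: the `Participant` class, its fields, and the `negotiate` function are hypothetical names chosen for this example. Only the vocabulary (capabilities, extensions, transports) comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical advertisement of supported capabilities and transports."""
    name: str
    # Maps a capability name (e.g. "checkout") to the extensions it supports.
    capabilities: dict = field(default_factory=dict)
    transports: set = field(default_factory=set)


def negotiate(agent, business):
    """Return the capabilities, extensions, and transports both sides support."""
    shared = {
        cap: agent.capabilities[cap] & business.capabilities[cap]
        for cap in agent.capabilities.keys() & business.capabilities.keys()
    }
    return shared, agent.transports & business.transports


agent = Participant("agent", {"checkout": {"fulfillment"}, "catalog": set()}, {"rest", "mcp"})
shop = Participant("shop", {"checkout": {"fulfillment", "loyalty"}, "orders": set()}, {"rest"})
caps, transports = negotiate(agent, shop)
```

In this toy model the two sides can only proceed with checkout (with the fulfillment extension) over REST, which is the kind of outcome a shared vocabulary lets them compute without a bespoke integration.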

Built in the open

A standard for everyone should be shaped by everyone. Because UCP is open, merchants, developers, and community contributors can pressure-test real-world gaps, propose new capabilities and extensions, and help make sure the protocol reflects more than the needs of the largest players. That kind of participation is what keeps an ecosystem moving.

Since launch, UCP has continued to evolve through new capabilities, an expanded Tech Council, and new consumer experiences built on top of the protocol. That momentum matters because standards only work when the ecosystem uses them.

Watch the full keynote

Agentic commerce is still evolving, and UCP is a foundational building block to support what's next in this new era.

If you want the full architecture walkthrough and the complete story from Open Source Summit North America, watch the session here. And if you want to go deeper, you can explore the UCP documentation, join the community conversation, and contribute to the public repository.

CEL finds a new home at github.com/cel-expr!

by Olena Huang, CEL (Common Expression Language) team

We're excited to announce that the official Common Expression Language (CEL) repositories have moved to a dedicated GitHub organization. Visit the new cel-expr organization now!

Why the move?

This move is a key step in strengthening the CEL ecosystem. By centralizing our projects under the cel-expr organization, including the language specification and the Go, C++, C, Java, and Python implementations, we aim to:

Enhance Branding: Create a clear and unified brand identity for CEL.
Improve Discoverability: Make it easier for users and contributors to find all official CEL resources in one place.
Ensure Consistency: Foster consistency across all CEL projects.
Streamline Development: Simplify our development and release processes.

What's Changing?

The following repositories now reside in the cel-expr organization:

google/cel-spec is now cel-expr/cel-spec
google/cel-cpp is now cel-expr/cel-cpp
google/cel-go is now cel-expr/cel-go
google/cel-java is now cel-expr/cel-java
cel-expr/cel-python and cel-expr/cel-c were already in the cel-expr namespace

All future development, issues, and pull requests for these projects will take place in their new homes within the cel-expr organization. This is a non-breaking change, due to automatic redirects, but you should update your URLs where possible.

What Stays the Same?

We've worked to make this transition as seamless as possible:

Automatic Redirects: GitHub will automatically redirect all web traffic and git operations from the old google/cel-* URLs to the new cel-expr/cel-* locations. Your existing links and git remote configurations pointing to the old URLs should continue to work for cloning and fetching.
Preserved History: The full commit history, issues, and pull requests for each repository have been migrated and are available in the new locations.

Action Required: Update Your Dependencies

While existing links and git remote configurations pointing to the old URLs should continue to work thanks to GitHub's redirects, we recommend updating your dependency management configurations (e.g., go.mod, pom.xml, requirements.txt, etc.) to point directly to the new repository URLs under https://github.com/cel-expr. This ensures you are fetching the latest code and releases from the canonical source.
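
For projects with many references to update, a text-level rewrite can help. The following is a minimal sketch, not an official migration tool: the function name is hypothetical, and it only rewrites repository URLs for the four repositories listed as moved. Review the result before committing, since package coordinates in your build files may not map one-to-one to repository URLs.

```python
import re

# Repositories the post lists as moved out of the google organization.
MOVED_REPOS = ("cel-spec", "cel-cpp", "cel-go", "cel-java")

# Matches both HTTPS ("github.com/google/...") and SSH ("github.com:google/...")
# forms; the lookahead avoids touching longer names such as "cel-go-extras".
_OLD_URL = re.compile(
    r"github\.com([:/])google/(" + "|".join(MOVED_REPOS) + r")(?![\w-])"
)


def rewrite_cel_urls(text):
    """Point github.com/google/cel-* references at github.com/cel-expr/cel-*."""
    return _OLD_URL.sub(r"github.com\g<1>cel-expr/\g<2>", text)
```

For a single checkout, `git remote set-url origin <new URL>` achieves the same thing for the remote itself.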

We're thrilled about this new chapter for CEL, bringing all our core components under one roof. We believe this will foster a stronger CEL community and accelerate the development and adoption of CEL.

Explore the new organization at https://github.com/cel-expr!

A new pkg.go.dev API for Go

Friday, June 12, 2026

by Ethan Lee, Jonathan Amsterdam & Hana Kim, Go Team

Access to Go metadata has been an ever-present need for the Go community. Since its launch, pkg.go.dev has served as a central hub for Go package documentation and discovery. While we initially prioritized providing this comprehensive access via a web interface, the need for streamlined programmatic access has become increasingly clear.

Structured API access has been one of the most highly requested features for pkg.go.dev for a while now. Developers building tools, IDE integrations, automated workflows, and other systems have had to rely on inconsistent and fragile scraping methods. By providing a formal API, we can provide fast and efficient access to required data. This foundation also sets Go up for the future of AI-assisted coding. Large language models and agents can access the context necessary to reason about the Go ecosystem with greater precision and accuracy.

Empowering Tool Builders

Our goal with this API is to reduce the technical churn for builders and innovators. By offering structured JSON metadata, we address the following use cases:

Search and Discovery: The API enables fast and efficient search across the entire Go module ecosystem.
Driving AI Innovation: As AI-assisted coding evolves, LLMs and agents need precise context. This API provides the data required for agents and models to reason deterministically about Go packages.

The Service Interface

Built for stability and efficient caching, the API uses a stateless, GET-only architecture. Primary endpoints are currently hosted under the v1beta path. Following a period of feedback from the Go community and confirmed stability, we intend to transition toward a formal v1 release.

For a complete interactive reference of all endpoints, query parameters, and response shapes, see pkg.go.dev/api. The machine-readable API contract is also published directly at pkg.go.dev/v1beta/openapi.yaml.

Endpoint	Description
`/v1beta/imported-by/{path}`	Paths of packages importing the package at `{path}`.
`/v1beta/module/{path}`	Information about the module at `{path}`.
`/v1beta/package/{path}`	Information about the package at `{path}`.
`/v1beta/packages/{path}`	Information about packages of the module at `{path}`.
`/v1beta/search/search?q={query}`	Search results for a given query.
`/v1beta/symbols/{path}`	List of symbols declared by the package at `{path}`.
`/v1beta/versions/{path}`	Versions of the module at `{path}`.
`/v1beta/vulns/{path}`	Vulnerabilities of the module or package at `{path}`.

An example of retrieving package information is shown below:

curl https://pkg.go.dev/v1beta/package/github.com/google/go-cmp/cmp | jq
{
  "modulePath": "github.com/google/go-cmp",
  "version": "v0.7.0",
  "isLatest": true,
  "isStandardLibrary": false,
  "goos": "all",
  "goarch": "all",
  "path": "github.com/google/go-cmp/cmp",
  "name": "cmp",
  "synopsis": "Package cmp determines equality of values.",
  "isRedistributable": true
}
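
As a rough illustration of consuming this response from a script, the sketch below builds the endpoint URL from the table above and decodes a trimmed copy of the sample JSON. It is written in Python using only the standard library. The helper names `endpoint_url` and `summarize_package` are hypothetical, and the field names are taken from the sample output only, so they may change while the API is in beta.

```python
import json
from urllib.parse import quote

BASE_URL = "https://pkg.go.dev/v1beta"  # beta path from the post; may change


def endpoint_url(kind: str, path: str) -> str:
    """Build a URL such as /v1beta/package/{path} (hypothetical helper)."""
    # Keep "/" unescaped so import paths stay readable in the URL.
    return f"{BASE_URL}/{kind}/{quote(path, safe='/')}"


def summarize_package(payload: str) -> str:
    """Turn a /package response body into a one-line summary."""
    info = json.loads(payload)
    latest = " (latest)" if info.get("isLatest") else ""
    return f'{info["path"]} {info["version"]}{latest}: {info["synopsis"]}'


# Trimmed copy of the body from the curl example above (no network call here).
SAMPLE = """{
  "modulePath": "github.com/google/go-cmp",
  "version": "v0.7.0",
  "isLatest": true,
  "path": "github.com/google/go-cmp/cmp",
  "name": "cmp",
  "synopsis": "Package cmp determines equality of values."
}"""

print(endpoint_url("package", "github.com/google/go-cmp/cmp"))
print(summarize_package(SAMPLE))
```

To query the live service, the URL returned by `endpoint_url` could be passed to `urllib.request.urlopen` and the response body handed to `summarize_package`.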

A Reference Implementation

To demonstrate how to interact with our API, we are providing a reference CLI implementation: pkgsite-cli. This implementation serves as a practical example for developers looking to build their own integrations, showing how to handle the data directly from the terminal. Note that as the API continues to evolve, the interface and behavior of this CLI may change.

You can use it to search for packages or inspect symbols without leaving your shell:

go install golang.org/x/pkgsite/cmd/internal/pkgsite-cli@latest

pkgsite-cli search "uuid"
github.com/google/uuid
  Module:   github.com/google/uuid@v1.6.0
  Synopsis: Package uuid generates and inspects UUIDs.
... more


pkgsite-cli package github.com/google/go-cmp/cmp
github.com/google/go-cmp/cmp
  Name:      cmp
  Module:    github.com/google/go-cmp
  Version:   v0.7.0 (latest)
  Synopsis:  Package cmp determines equality of values.

pkgsite-cli package --symbols github.com/google/go-cmp/cmp
github.com/google/go-cmp/cmp
  Name:     cmp
  Module:   github.com/google/go-cmp
  Version:  v0.7.0 (latest)
  Synopsis: Package cmp determines equality of values.

Symbols:
  type Indirect struct{}
  type MapIndex struct{}
  type Option interface{}
  ... more

Looking Ahead

While we prioritize stability for our new /v1beta endpoints, we are eager to hear how open source communities use these resources to solve real-world problems.

We look forward to your feedback via our issue tracker and to seeing the tools you’ll build next.

Introducing OpenRL: A self-hosted post-training API for fine-tuning LLMs

Thursday, June 11, 2026

by Sunil Arora, Shuby Mishra & Chuang Wang, GKE

We are pleased to share a research preview of OpenRL, a new open-source project coming out of GKE Labs. OpenRL is a self-hosted training API for fine-tuning LLMs on your own Kubernetes cluster.

Why we built it

If you look at agentic RL on LLMs, it is incredibly easy to get bogged down in system complexity. To run a single RL loop, you have to coordinate a dozen different things: selecting and cleaning datasets, choosing RL environments, debugging training loops, managing reward signals, handling inference mismatches, allocating hardware, and managing infrastructure. The picture looks something like this:

Figure shows an AI researcher and an infrastructure engineer staring at the hurdles in post training along the way to the summit.

Each of these is a hard problem. But what makes it more complex is how tightly AI research and infrastructure concerns are mixed together in today's tooling and frameworks.

We believe decoupling the infrastructure from AI research can make these problems more tractable so that infrastructure engineers and AI researchers can independently tackle them. We have seen this pattern with Kubernetes, which abstracted out the infrastructure and made life easier for application developers and SREs.

So, can you abstract out post-training infrastructure? We believe so, and drew huge inspiration/validation from Tinker (from Thinking Machines). The Tinker APIs for post-training hit that Goldilocks zone: they hide all the post-training infrastructure behind four key APIs:

Figure shows high-level components and their interaction in an OpenRL-based RL workflow

So the end result of this abstraction is that AI Researchers get full flexibility on their RL loop and infrastructure engineers can focus on scaling, orchestration, and reliability. OpenRL allows you to run the same training APIs but on your own infrastructure. And this decoupling has other interesting benefits.

Sharing GPUs

Traditional RL loops are strictly sequential. The trainer waits for the sampler to finish rollouts, the sampler waits for the environment to score rewards (which is often bound by slow CPU/network tasks), and the whole loop sits blocked. Your expensive GPUs spend a lot of time doing nothing. The abstraction allows running multiple RL jobs and allows infrastructure engineers to pack the training/sampling steps to utilize more of their GPUs. The graph below shows the GPU consumption in OpenRL for running one, two, and three RL jobs concurrently.

The figure shows the trainer/sampler duty cycle in OpenRL for scenarios with 1 RL job, 2 RL jobs, and 3 RL jobs respectively.
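
To see why packing helps, here is a back-of-the-envelope model. It is not a description of OpenRL's scheduler. It assumes each job alternates between a GPU-bound phase (sampling plus training) and a CPU/network-bound phase (environment scoring), and that phases from different jobs interleave perfectly on one shared GPU pool. The phase durations are made-up illustrative numbers, not measurements.

```python
def gpu_duty_cycle(num_jobs: int, gpu_seconds: float, wait_seconds: float) -> float:
    """Upper bound on the fraction of time a shared GPU pool is busy.

    Each job needs the GPU for `gpu_seconds` per iteration and then waits
    `wait_seconds` on non-GPU work. With ideal interleaving, other jobs fill
    the waiting time, and utilization is capped at 100%.
    """
    per_job = gpu_seconds / (gpu_seconds + wait_seconds)
    return min(1.0, num_jobs * per_job)


# Hypothetical timings: 3 s of GPU work, then 6 s waiting on reward scoring.
for jobs in (1, 2, 3):
    print(f"{jobs} job(s): {gpu_duty_cycle(jobs, 3.0, 6.0):.0%} GPU duty cycle")
```

Real workloads will fall short of this bound because phases rarely line up perfectly, but the model captures the basic reason concurrent jobs raise GPU utilization.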

Better UX

Once you separate out the infrastructure behind the APIs, you start to see gains in the user experience of developing the RL loop, because AI researchers no longer have to wrangle complex Python dependencies like CUDA. When you are doing R&D, you do not have to run the RL loop directly on the machines with GPUs; you can simply run your RL loop on your Mac, pointing to the training APIs running on a Kubernetes cluster or VMs.

Autoresearch

We believe that frontier AI research will get more and more automated in the future and abstracting out infrastructure as a building block is key to that. To demonstrate that, we added an autoresearch recipe inspired heavily by Karpathy's work. The recipe demonstrates how to run parallel experiments for a parameter sweep and improve the reward signal for our text-to-SQL recipe for Gemma models.

What OpenRL is not

A managed service. OpenRL is self-hosted and not a managed service. We aim to make it easy for users to deploy and operate it on their Kubernetes clusters.
An RL framework. OpenRL gives AI researchers full control over their RL loop.

Get started

We have made it easy to run OpenRL on your Mac, Nvidia GPUs, or on GKE. This allows you to test your RL loop on Mac and when you are ready to scale, you can point the RL loop to the OpenRL endpoint running in the GKE cluster.

Try out our text-to-SQL example for teaching the latest Gemma model SQL here: guides.

One of the benefits of a Tinker-compatible endpoint is that you can use Tinker-Cookbook with OpenRL. Tinker-Cookbook is one of the best resources for post-training infrastructure for RL.

Future steps

We have started with a simple architecture focussing on LoRA fine-tuning and plan to evolve the project in the coming months, so please give it a try and share your feedback. A few things we are very excited to work on:

Full parameter fine-tuning
Multitenancy (simultaneous RL on different types of base models)

Acknowledgement

We have been inspired by the work done by various open source projects in AI communities, so huge thank you to Thinking Machines, vLLM, PyTorch, prime-rl, verl, SkyRL, and llm-d.

Google joins the Eclipse Foundation as a strategic member to accelerate AI-integrated developer tools

Wednesday, June 10, 2026

by amanda casari & Mike Bufano, Google Open Source

Collaboration with the Eclipse Foundation will support open infrastructure for AI-integrated developer platforms like Google Antigravity, while advancing broader open source security and regulatory compliance initiatives

As of April 2026, Google has joined the Eclipse Foundation as a Strategic Member, reflecting the company's continued investment in open source technologies and modern developer infrastructure.

As part of this collaboration, Google will additionally sponsor Open VSX and is among the first adopters of the recently announced Open VSX Managed Registry service. Open VSX is the open source, vendor-neutral extension registry for tools built on the VS Code™ extension API. It powers a rapidly growing ecosystem of AI-integrated IDEs, cloud development environments, and developer platforms, including Google Antigravity, AWS's Kiro, Cursor, and Windsurf, among many others.

As a Strategic Member, Google will participate in the Eclipse Foundation's Board of Directors and Technical Advisory Council, helping guide the technical and strategic direction of one of the world's leading open source software foundations.

"The industry is feeling the massive turning point as AI continues to change how developers write, deploy, and maintain software," said amanda casari of Google's Open Source Programs Office and new Eclipse Board member. "Joining The Eclipse Foundation as a Strategic Member ensures that the next generation of AI-integrated developer experiences—including platforms like Google Antigravity—are built in partnership with transparent, vendor-neutral foundations. Open registries, like Open VSX, are critical infrastructure which keep the global developer ecosystem open to everyone."

Google and the Eclipse Foundation share a deep history, having collaborated across numerous initiatives since 2006. This Strategic Membership elevates the relationship and provides support critical to modern initiatives like Open VSX, Open Regulatory Compliance (ORC), and Adoptium.

"Google has played a pivotal role in open source innovation for two decades," said Mike Milinkovich, Executive Director of the Eclipse Foundation. "Their decision to join as a Strategic Member reflects the growing importance of open collaboration in supporting global regulatory compliance efforts, strengthening open source infrastructure, securing supply chains, and advancing the next generation of AI-integrated developer platforms."

The Eclipse Foundation continues to see explosive growth as adoption accelerates across AI-integrated developer tooling and cloud development environments. The Open VSX registry now scales to meet massive global demand:

300 million+ downloads per month
200 million requests during peak daily traffic
12,000+ hosted extensions from over 8,000 publishers.

Unlocking TPU performance: Deep kernel profiling with XProf

Monday, June 8, 2026

by Yogesh SY, AI Infra Google

As machine learning workloads scale to unprecedented heights, developers are increasingly writing highly specialized Tensor Processing Unit (TPU) kernels using frameworks like Pallas, Mosaic, and Triton to maximize hardware performance.

However, customizing high-performance kernels has historically introduced a major engineering challenge: optimization blind spots. To legacy performance profilers, custom compilation paths appear as opaque execution paths. Developers are left with single, massive execution blocks in their trace captures, lacking granular visibility into what is actually occurring inside the chip's internal components. Did a vector processing instruction stall? Was matrix math idle due to data loading bottlenecks?

Traditional profiling relies heavily on compile-time static cost models to estimate kernel efficiency. While helpful for standard operations, these models cannot capture dynamic runtime realities like instruction execution stalls, memory subsystem congestion, or hardware scheduling conflicts.

To open this opaque execution path, we are excited to introduce the Kernel Profiling suite in XProf—a low-level hardware debugging suite engineered specifically for Pallas kernel authoring and optimization on Google TPUs. By combining static compilation tracking with dynamic, sub-microsecond hardware telemetry, XProf Kernel provides the deep transparency required to optimize high-scale ML workloads.

Deep visibility: HLO Graphs & MLIR Inspection

The first step in debugging any custom kernel is understanding how your high-level code is translated by the compiler. When compiling a JAX or PyTorch model, the compiler generates a High-Level Optimizer (HLO) graph. Previously, custom calls inside these graphs remained completely obscured.

XProf's updated Graph Viewer resolves this by exposing the internal compilation logic of these custom regions directly. To unlock this deep visibility, developers must pass the appropriate debug flags to the XLA compilation environment.
--xla_enable_custom_call_region_trace=true
--xla_xprof_register_llo_debug_info=true

Once these flags are active, any trace captured via XProf includes comprehensive compiler metadata. In the XProf Graph Viewer, clicking on a custom-call block reveals an interactive panel titled "Custom Call Text." This displays the raw, lowered MLIR (Multi-Level Intermediate Representation) code generated by the compiler.

A screenshot of the TensorBoard XProf interface displaying an HLO graph, with a Custom Call Text panel open to reveal raw MLIR code — Figure 1: XProf interface displaying an HLO graph, with a "Custom Call Text" panel to reveal raw MLIR code

By displaying the MLIR text side-by-side with high-level source-code representations, developers can immediately verify whether the compiler is correctly fusing operations and structuring memory tiles as intended.

Tracing Instrumented Low-Level Operations (LLO) Analysis

To provide cycle-level execution visibility, XProf exposes Low-Level Operations (LLO) bundle data directly inside the Trace Viewer. An LLO bundle represents the actual machine instructions issued to the TPU core's functional units during every clock cycle.

Through dynamic instrumentation, XProf inserts hardware markers exactly when an LLO bundle region executes. Within the Trace Viewer, this manifests as dedicated, time-aligned execution tracks representing the TPU bundle's slot utilization metrics from static analysis:

MXU (Matrix Multiply Unit): Tracks active, busy cycles of high-throughput matrix-multiplication pipelines.
Scalar and Vector ALUs: Displays the execution profile of mathematical operations, letting you spot pipeline imbalances.
Vector Fills, Loads, Spills, and Stores: Exposes HBM-to-register data movement, critical for identifying bandwidth-throttling bottlenecks.
XLU (Cross-Lane Unit): Monitors collective communications and data shuffling across physical TPU cores.

Figure 2: XProf Capture Profile trace viewer interface showing dynamic hardware execution tracks

Runtime Performance Counter Sampling

While static analysis effectively verifies instruction counts or vector store logic, it remains detached from the dynamic realities of runtime execution. To bridge this gap, XProf introduces fine-grained, periodic performance counter sampling—available starting with TPU v7 (Ironwood). This capability empowers developers to move beyond static estimation and measure precisely how hardware blocks are utilized in real-time, providing the empirical ground truth needed to identify whether compute units are truly active or stalled by memory subsystems.

Consider the optimization of a tiled matrix multiplication (Matmul) kernel. While a static trace might indicate a logically perfect sequence of operations, real-world performance often falters if the Matrix Multiply Unit (MXU) sits idle while awaiting data from High-Bandwidth Memory (HBM). To diagnose and resolve such bottlenecks, developers can utilize a structured three-step profiling workflow:

Set up the Profiling Environment: Configure the TPU v7 (Ironwood) runtime by defining specific hardware counters—such as scalar issues or synchronization waits.
Capture a Kernel Profile: Use the XProf request interface to capture fine-grained performance counters, which can then be visualized as a time-series within the Trace Viewer.
Interpret the Data: Analyze the resulting counters to distinguish between a Memory-Bound Scenario (characterized by massive spikes in sync_wait) and an Optimized Scenario. For instance, implementing triple buffering to overlap memory loads with MXU compute can reduce runtime from 125.5µs to 88µs—a ~30% performance gain validated by a drastic reduction in synchronization events.
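
The quoted gain can be checked with a couple of lines of arithmetic. The snippet below uses only the two runtimes from the example above and shows that the ~30% figure is the reduction in runtime; expressed as a speedup ratio, the same numbers come to roughly 1.4x.

```python
baseline_us = 125.5   # runtime before triple buffering (from the example above)
optimized_us = 88.0   # runtime after triple buffering

# Share of the original runtime that was eliminated.
runtime_reduction = (baseline_us - optimized_us) / baseline_us
# How many times faster the optimized kernel runs.
speedup = baseline_us / optimized_us

print(f"runtime reduction: {runtime_reduction:.1%}")
print(f"speedup: {speedup:.2f}x")
```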

By shifting from static code inspection to empirical runtime telemetry, hardware behavior explicitly validates optimization strategies, ensuring every cycle on the silicon is spent productively. For a hands-on example to check out these techniques, please explore our Pallas Matmul w/ Perf Counters demo.

XProf timeline highlighting a comparison between a detailed Runtime Perf Counter section sampling at a 1-microsecond frequency and a Static LLO Region track below it — Figure 3: XProf timeline highlighting a comparison between a detailed "Runtime Perf Counter" section sampling at a 1-microsecond frequency and a "Static LLO Region" track below it

Visualizing the "Utilization Gap"

This dynamic tracking exposes the significant gap left by traditional static analysis tools. A static tool analyzes instructions linearly, completely ignoring time. It might flag an MXU instruction block as "100% Utilized."

In contrast, XProf plots actual hardware execution over time. You might discover that a long-running Scalar ALU operation is stalling the entire execution pipeline, leaving the powerful MXU completely idle. By visualizing these temporal idle gaps, developers can adjust data shapes, memory alignments, and instruction sequencing to maximize compute density.

STATIC ESTIMATION:
[========== Block Execution: MXU Flagged 100% Utilized ==========]

XPROF REAL-WORLD TIMELINE:
├─ [Scalar ALU (Active)] ─┼─ [MXU (Active)] ─┼── [MXU (Idle / Memory Stall)] ──┤
│ Stalling pipeline...     │ Compute phase     │ Starved; waiting for HBM Load    │

Figure 4: The UI shows the active TPU Core functional unit tracks (MXU, Scalar ALU, Vector ALU, and memory data pipelines) aligned side-by-side with the active framework Ops, exposing exact execution times and real-time idle cycles.

Overall Utilization from Performance Counters

Navigating profiling metrics can be daunting. Relying on metrics calculated via compile-time cost models often misrepresents performance when applied to custom compilation paths. To solve this, XProf establishes a clear Hierarchy of Trust:

                  ┌───────────────────────────────┐
                  │     Absolute Ground Truth     │
                  │  (HBM, Hardware Registers,    │ (100% Trustworthy)
                  │       TPO Metrics, CSRs)      │
                  └───────────────┬───────────────┘
                                  ▼
                  ┌───────────────────────────────┐
                  │       Estimated Metrics       │
                  │   (Program Optimal FLOPs,     │ (Requires caution with
                  │      Goodput Efficiency)      │  custom compiling paths)
                  └───────────────────────────────┘

Figure 5: Hierarchy of Metrics

The Absolute Ground Truth (100% Trustworthy): Metrics derived directly from physical hardware registers (HBM utilization, TPO metrics, unprivileged hardware stats). When profiling custom kernels, these represent physical reality and should be your primary optimization anchors.
Estimated Metrics (Use with Caution): Metrics like "Compared to program optimal FLOPS" or "Goodput efficiency" rely on XLA cost models. Because custom compilation paths bypass standard passes, these metrics can be highly skewed or outright non-functional.

For the unvarnished truth, XProf exposes the Perf Counters View, providing direct, tabular access to over 16,000 raw hardware counters read straight from the TPU silicon.

A screenshot of the XProf Perf Counters tabular view, displaying a list of unprivileged hardware counters alongside their corresponding raw decimal and hexadecimal values — Figure 6: XProf Perf Counters Tabular View

Understanding Trace Tracks: The height of a trace track does not represent a normalized 0-100% percentage. It represents the maximum raw counter value observed in that interval. For example, if a counter increments by 100 cycles over a 500-nanosecond trace window (roughly 1,000 clock cycles on a 2.0 GHz core), it indicates roughly 10% physical utilization of that unit.
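
The conversion in that example can be written out explicitly. The sketch below is plain arithmetic, assuming the counter counts busy cycles and the clock frequency is known. The function name and the 2.0 GHz figure are illustrative; the actual clock rate depends on the TPU generation.

```python
def utilization(counter_delta: int, window_ns: float, clock_ghz: float) -> float:
    """Fraction of cycles in the window during which the counter advanced."""
    # GHz is cycles per nanosecond, so this is the cycle budget of the window.
    total_cycles = window_ns * clock_ghz
    return counter_delta / total_cycles


# Numbers from the paragraph above: 100 busy cycles in a 500 ns window at 2.0 GHz.
print(f"{utilization(100, 500, 2.0):.0%}")
```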

To configure and profile the runtime performance counters sampling method, please follow the instructions from OpenXLA Kernel Profiling Instructions.

Advanced Sampling: Event-Triggered Profiling

Previously, dynamic capturing was limited to Periodic Sampling Mode—polling counters based on a host-level timer, which hit a physical resolution floor of 1 microsecond.

           CORE 0           CORE 1           CORE 2           CORE 3
      ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
      │  28 Counters │ │  28 Counters │ │  28 Counters │ │  28 Counters │
      └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
      └─────────────────────────────────────────────────────────────────┘
                            4 x 28 Sparse Matrix

Figure 7: Sparse Matrix Configuration

To capture lightning-fast hardware cycles, XProf now supports External Event-Triggered Mode. The dynamic sampler intercepts physical TPU trace instructions and boundary triggers (such as entering/exiting custom call scopes), allowing for sub-microsecond capture latency and precise attribution.

Developers can configure up to 28 hardware counters per core, distributed across up to four active SparseCores, creating a 4 x 28 profiling matrix that maximizes data variety while protecting workload performance.

Activating this is straightforward via standard JAX JIT profilers:

import jax

options = jax.profiler.ProfileOptions()

# Example request for externally triggered collection
options.advanced_configuration = {
    "tpu_enable_periodic_counter_sampling": True,
    "tpu_tc_perf_counter_sampling_options": (
        "is_external_trigger:true scaling:0 counter_size_bits:1 "
        "indices:10 indices:11 indices:56 indices:57 indices:58"
    ),
}

# For periodic sampling, please use interval_us instead of is_external_trigger.
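
Because the sampling options are passed as a single space-separated string, it can be convenient to assemble it programmatically. The helper below is a hypothetical convenience wrapper, not part of JAX or XProf: it only reproduces the `key:value` layout of the example above and enforces the 28-counter limit mentioned earlier. The `interval_us:<n>` form used for periodic sampling is an assumption based on the comment above, so check the OpenXLA instructions for the exact syntax.

```python
MAX_COUNTERS_PER_CORE = 28  # limit stated in the post


def sampling_options(indices, external_trigger=True, interval_us=1):
    """Build a value for "tpu_tc_perf_counter_sampling_options" (hypothetical helper)."""
    if len(indices) > MAX_COUNTERS_PER_CORE:
        raise ValueError(f"at most {MAX_COUNTERS_PER_CORE} counters per core")
    # Trigger mode: external events, or a periodic timer (assumed key:value form).
    mode = "is_external_trigger:true" if external_trigger else f"interval_us:{interval_us}"
    # scaling and counter_size_bits are copied verbatim from the example above.
    fields = [mode, "scaling:0", "counter_size_bits:1"]
    fields += [f"indices:{i}" for i in indices]
    return " ".join(fields)


print(sampling_options([10, 11, 56, 57, 58]))
```

Called with the five counter indices from the example, the helper reproduces the option string shown above, which can then be assigned to the `advanced_configuration` dictionary.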

Getting Started

Ready to transition from guessing performance to measuring and optimizing the physical limits of your ML silicon? Explore these open-source resources to get started with XProf Kernel today:

XProf GitHub Repository: github.com/openxla/xprof
Official XProf Documentation: openxla.org/xprof
JAX Profiling Guide: jax.readthedocs.io/en/latest/profiling.html

Journey to JPEG XL: How open source experiments shaped the future of image coding

Wednesday, June 3, 2026

by Jyrki Alakuijala, Zoltán Szabadka & Luca Versari, Paradigms of Intelligence, Google Technology & Society

Building the Next Generation Image Standard

The internet runs on images. Since the early days of the web, there has been a relentless tension between visual fidelity and bandwidth. For decades, the industry relied on the venerable JPEG standard to load images fast. It served us remarkably well, but as displays moved to High Dynamic Range (HDR) and Wide Color Gamut (WCG), the format began to show its limits.

The road to JPEG XL (JXL) wasn't a straight line. It was a decade-long exploration, creating a series of milestone projects testing radical ideas in psychovisual modeling, entropy coding, and optimization. Today, as JPEG XL sees rapid adoption across operating systems and professional standards, we’re looking back at the experiments that made it possible.

The Early Foundation: 2011–2017

Our study began with a focus on understanding the limits of existing technology. We didn't start by trying to write a new standard; we started by trying to make the current ones better, and learning their limitations. This allowed us to make the new formalism more flexible and efficient in the right places.

WebP Lossless and Brotli: While lossy WebP drew its lineage from video technology, WebP Lossless (2011) represented an architectural and scoping departure. We debuted the entropy image concept, an innovative method utilizing a secondary image to orchestrate the selection of static entropy codes for the primary visual data. We reapplied this approach later with data-driven context modeling in the Brotli compression format, enabling rich context modeling without slowing decoding.
Butteraugli: Around 2014, we realized that raw mathematical compression (PSNR) wasn't enough, and simple psychovisual approximations (SSIM and similar) failed in color-rich environments. We built Butteraugli and the XYB color space to mimic the human visual system's edge detection and opponent-color processes in varying scale, allowing us to compress images more effectively.
We pushed the legacy JPEG 1 standard (ISO/IEC 10918, introduced in 1992) to its absolute limits through two key projects: Guetzli and Brunsli. These initiatives provided invaluable insights into the strengths and limitations of traditional JPEG compression methods. Guetzli (2016) is a slow, high-density perceptual encoder that uses Butteraugli to find the optimal quantization tables, pushing legacy JPEGs to be 20-30% smaller. Brunsli (2015), meanwhile, focuses on lossless recompression, allowing users to repack existing JPEGs into a smaller footprint without losing a single bit of original data. After finishing JPEG XL standardization, we returned to Guetzli's scope in 2024 and made the encoding much faster and HDR-compatible in Jpegli.

The feedback from these launches, ranging from the technical details of WebP Lossless to the psychovisual audits of Guetzli, proved indispensable. While we already targeted the highest visual fidelity, feedback from detail-critical e-commerce helped us to refine the requirements.

The Convergence: 2017–2019 PIK Era and the 2019 FUIF Integration

By 2017 we had powerful separate tools and it was time to fuse them. In open sourcing PIK we combined the efficiency of Brunsli with the psychovisual optimizations of Guetzli. Further, PIK introduced a real adaptive quantization field and other optimizations. PIK formed our proposal to the ISO standardization body. The committee's final call for proposals pushed toward extreme density, requiring bit rates as low as 0.06 BPP, equivalent to 35 times the compression of internet-quality images and 80 times that of camera output. This expansion of scope necessitated a significant complexification of the format and the encoder, leading to the Variable-block-size Discrete Cosine Transform (VarDCT) architecture that remains central to JPEG XL today.
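
To put those ratios in concrete terms, the arithmetic below works backwards from the numbers in the paragraph above. The implied bit rates follow directly from the stated 35x and 80x factors; the 12-megapixel image is a hypothetical example chosen only to show the resulting file sizes.

```python
TARGET_BPP = 0.06  # lowest bit rate named in the call for proposals


def file_size_kb(megapixels: float, bits_per_pixel: float) -> float:
    """Compressed size in kilobytes (1 kB = 1000 bytes)."""
    return megapixels * 1_000_000 * bits_per_pixel / 8 / 1000


internet_bpp = TARGET_BPP * 35  # implied "internet-quality" bit rate
camera_bpp = TARGET_BPP * 80    # implied camera-output bit rate

for label, bpp in [("target", TARGET_BPP), ("internet", internet_bpp), ("camera", camera_bpp)]:
    print(f"{label:>8}: {bpp:.2f} BPP -> {file_size_kb(12, bpp):,.0f} kB for 12 MP")
```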

We proposed to merge our PIK proposal with the FUIF (Free Universal Image Format) proposal from Cloudinary. PIK used Brotli-style distribution selection at encoding time, while FUIF refined codes incrementally during decoding. The final JPEG XL standard became a best-of-both-worlds compromise: we used PIK's faster-to-decode distribution selection with FUIF's sophisticated context trees. The merger represented a departure from conventional single-platform-driven standardization and prioritized technical synergy and collaboration.

A flowchart titled 'Building Blocks of the JPEG XL Standard' showing a left-to-right progression across three periods. The first period, 'Early Building Blocks (2011-2017)', contains four boxes: WebP Lossless & Brotli, Butteraugli & XYB, Guetzli, and Brunsli. Arrows point from these early technologies into the second period, 'The Convergence (2017-2019)', which consists of two main boxes: PIK and FUIF. Finally, multiple lines flow from both PIK and FUIF, converging into the third period, 'Final Standard'. This final section features a large orange box labeled 'JXL: JPEG XL Standard', which is described as merging PIK's distribution selection with FUIF's context trees.

JPEG XL Today: An Ecosystem Takes Root

JPEG XL's efficiency, psychovisually optimized quality, file size, and coding speed are being noticed. We are seeing bottom-up adoption in various industries, with the most demanding fields leading the way. Because of its ability to handle high bit-depth, high quality and even lossless data efficiently and robustly, JPEG XL has become foundational in several fields:

Photography: Used in Digital Negative (DNG 1.7), Apple's ProRAW, and others.
Medical: Adopted by DICOM, the international standard for medical images.
Publishing: Integration into future versions of the PDF and EPUB standards.

The ecosystem has been maturing rapidly. Adobe's photography software, Apple's iOS, macOS, and visionOS have native support, as do Linux distributions like Ubuntu and Microsoft's JPEG XL Image Extension for Windows. Our libjxl-tiny inspired Shikino High-Tech, Inc. and CAST to release the first commercial JPEG XL encoder IP core for ASIC and FPGA designs, aimed at real-time, low-power image capture. Safari (2023) led among major browsers, while Firefox and Chrome currently maintain experimental support.

Two men in a bright office collaborating at a whiteboard. The board contains a hand-drawn flowchart titled 'VARDCT BLOCK JOINING STRATEGY'. The diagram illustrates small square blocks combining into larger patterned rectangles, connected by arrows. Text labels in the flowchart include 'Decision Logic: Rate-Distortion Cost', 'Merging Criteria', 'Entropy Coding Efficiency', 'Neighboring Blocks', and 'Variable Block Sizes'. The man on the left is pointing to the bottom left of the diagram, while the man on the right, who has long hair and a beard, is writing a mathematical equation on the board with a marker. — JPEG XL design was not only countless hours of optimization, experimentation and eye-balling the results, but also creative discussions at a whiteboard. In this Gemini-reconstructed scene, Luca Versari and Jyrki Alakuijala (left-to-right) debate VarDCT block selection heuristics.

Looking Forward

The story of JPEG XL stands as a testament to the efficacy of long-horizon planning validated by intermediate functional milestones—with minimum-viable prototypes like Guetzli and practical tools like Brunsli and Brotli—that invite feedback from the open-source community. A small research team can innovate by crystallizing solutions through quick iterations, with thousands, if not tens of thousands, of experiments in psychovisual modeling, entropy, coding speed and complexity, and the entire industry can eventually navigate toward a more efficient, beautiful future.

We started by trying to squeeze a few more bytes out of a 1992 JPEG 1 standard; with JPEG XL we hope to have established a foundation for digital imaging that can last for the next three decades.

Announcing Apache Iceberg 1.11.0

Wednesday, May 27, 2026

by Alex Stephen & Talat Uyarer, Lakehouse

The Apache Iceberg project has just launched version 1.11.0! A lot has happened since the last version.

Iceberg 1.11.0 adds support for Apache Spark 4.1 and Apache Flink 2.1, the latest releases of the two engines, and makes both the default build targets.

The rest are more structural. The REST catalog learns to plan scans server-side, shifting metadata work off the query engine. A new partition statistics scan API gives optimizers a clean, supported way to read a table's shape. Built-in table encryption arrives with envelope encryption and Google KMS support. And Google Storage Analytics library integration makes your Iceberg workloads faster than before.

Let's take a look at some of the biggest changes.

Spark & Flink Updates

As Spark and Flink move forward, the 1.11.0 release adds support for the new versions of both.

Spark 4.1 & DSv2 Migration: What Spark 4.1 unlocks is MERGE INTO with automatic schema evolution: Spark's newer MERGE syntax accepts a WITH SCHEMA EVOLUTION clause, so a MERGE whose source carries columns the target table lacks can add those columns to the table within the same statement, with no separate ALTER TABLE round trip. Beyond the version bump, the 1.11 Spark connector also modernizes against Spark's newer DataSource V2 APIs and adds an asynchronous micro-batch planner that speeds up Structured Streaming.
Flink Ecosystem Updates: Initial work for Flink 2.1 support has landed in the core repository, continuing Iceberg's promise of providing first-class, low-latency streaming sink capabilities. The centerpiece of the Flink work is the DynamicIcebergSink, an experimental sink that breaks the old one-sink-per-table model: a single sink routes each record to a table chosen at runtime, creating tables on demand and evolving their schemas and partition specs on the fly as the input changes including dropping columns once you opt in with dropUnusedColumns. In addition to DynamicIcebergSInk work Flink started supporting nanosecond, variant and unknown types from V3 Spec.
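
To make the dynamic routing idea concrete, here is a minimal in-memory sketch in Python. It is not the Flink connector's API: ToyTable, ToyDynamicSink, and the drop_unused_columns flag are hypothetical names invented for illustration, and the column-dropping rule shown is only one plausible reading of what dropUnusedColumns does.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """In-memory stand-in for a table: a set of column names plus appended rows."""
    columns: set = field(default_factory=set)
    rows: list = field(default_factory=list)


class ToyDynamicSink:
    """Hypothetical model of a dynamic sink; NOT the real DynamicIcebergSink API."""

    def __init__(self, drop_unused_columns: bool = False):
        self.drop_unused_columns = drop_unused_columns
        self.tables = {}

    def write(self, table_name: str, record: dict) -> None:
        # Create the target table on demand the first time a record names it.
        table = self.tables.setdefault(table_name, ToyTable())
        incoming = set(record)
        # Evolve the schema: add any column the table has not seen yet.
        table.columns |= incoming
        # Dropping columns is opt-in (assumed behavior, see the lead-in).
        if self.drop_unused_columns:
            table.columns &= incoming
        table.rows.append(record)


# One sink instance serves several tables, chosen per record at runtime.
sink = ToyDynamicSink()
sink.write("orders", {"id": 1, "total": 9.5})
sink.write("clicks", {"id": 7, "url": "/home"})
sink.write("orders", {"id": 2, "total": 3.0, "coupon": "SPRING"})

print(sorted(sink.tables))                     # ['clicks', 'orders']
print(sorted(sink.tables["orders"].columns))   # ['coupon', 'id', 'total']
```

The point of the sketch is the shape of the contract: the record itself carries the routing decision, so adding a new destination table needs no change to the job topology.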

Server-side scan planning

In previous versions of Iceberg, the client handled the heavy lifting of scan orchestration. The engine's driver would traverse the table's metadata tree, retrieving manifest lists and manifest files from object storage to filter data against specific partition requirements. Iceberg 1.11.0 shifts this computational burden into the catalog through server-side scan planning.

Instead of manually traversing manifests, the engine submits a single POST …/plan request detailing the scan, allowing the REST catalog to return optimized FileScanTasks.

The API is designed to handle data at any scale: smaller scans return immediate results, extensive operations return a plan-id for polling, and massive datasets are retrieved via parallel plan-tasks through POST …/tasks.

Figure: Planning moves off the query engine and into the catalog; the driver no longer touches metadata in object storage.
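
The request flow described above can be sketched from the client's point of view. The following Python example uses an in-memory stand-in instead of a real REST catalog, so nothing here is an actual endpoint or client library: ToyCatalog, its method names, and the response field names ("status", "plan-id", "plan-tasks", "file-scan-tasks") are assumptions chosen for illustration. For brevity the toy folds the polling case and the plan-tasks case into one path.

```python
import itertools


class ToyCatalog:
    """In-memory stand-in for a REST catalog; all names here are hypothetical."""

    def __init__(self, files, inline_limit=2, task_size=2):
        self.files = list(files)
        self.inline_limit = inline_limit
        self.task_size = task_size
        self._plans = {}
        self._ids = itertools.count(1)

    def plan(self, scan_request):
        # Small scans: answer right away with the file scan tasks.
        if len(self.files) <= self.inline_limit:
            return {"status": "completed", "file-scan-tasks": list(self.files)}
        # Larger scans: hand back a plan-id that the client polls.
        plan_id = f"plan-{next(self._ids)}"
        self._plans[plan_id] = {"polls_left": 1}
        return {"status": "submitted", "plan-id": plan_id}

    def poll(self, plan_id):
        state = self._plans[plan_id]
        if state["polls_left"] > 0:
            state["polls_left"] -= 1
            return {"status": "submitted"}
        # A finished plan is split into plan-tasks that can be fetched separately.
        starts = range(0, len(self.files), self.task_size)
        return {"status": "completed",
                "plan-tasks": [f"{plan_id}:{i}" for i in starts]}

    def fetch_tasks(self, plan_task):
        start = int(plan_task.rsplit(":", 1)[1])
        return {"file-scan-tasks": self.files[start:start + self.task_size]}


def plan_scan(catalog, scan_request):
    """Client side: one plan request, then poll and fetch tasks only if needed."""
    response = catalog.plan(scan_request)
    plan_id = response.get("plan-id")
    while response["status"] == "submitted":
        response = catalog.poll(plan_id)  # a real client would back off here
    tasks = list(response.get("file-scan-tasks", []))
    # Fetched sequentially here; a real engine could fan these out in parallel.
    for plan_task in response.get("plan-tasks", []):
        tasks.extend(catalog.fetch_tasks(plan_task)["file-scan-tasks"])
    return tasks


small = ToyCatalog(["a.parquet"])
large = ToyCatalog([f"f{i}.parquet" for i in range(5)])
print(plan_scan(small, {"filter": "day = '2026-05-27'"}))
print(len(plan_scan(large, {"filter": "day = '2026-05-27'"})))
```

Note that the client never reads a manifest: its only inputs are catalog responses, which is the property the section describes.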

Built-in table encryption

As data lakes increasingly serve as the central hub for sensitive PII and financial data, relying solely on bucket-level storage encryption is no longer enough. Iceberg 1.11.0 introduces built-in table encryption, bringing fine-grained, KMS-backed security directly to the table level.

This provides data platform teams with robust capabilities for security and compliance:

Zero-Trust Storage Security: Even if a malicious actor gains direct access to your underlying object storage bucket, the data remains completely unreadable.
Total Index Protection: It isn't just the raw data that is protected; Iceberg encrypts the manifest lists as well, preventing attackers from inferring sensitive information from table statistics.
Tamper-Proof Data: Built-in authentication tags guard against unauthorized modifications, ensuring data integrity.
Effortless Key Rotation: Keys are rotated automatically as they age, satisfying strict compliance mandates without requiring you to rewrite massive datasets.

Iceberg achieves this using envelope encryption with a three-tier key hierarchy. A table master key lives securely in your KMS and never touches Iceberg storage. This master key wraps key-encryption keys (KEKs), which are stored safely inside the table metadata. Finally, each KEK wraps a unique, per-file data-encryption key (DEK).

Every data file and manifest list is then encrypted with AES-GCM under its own unique DEK. This decoupled architecture ensures maximum security while maintaining the high performance expected of Iceberg workloads.
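
The three-tier hierarchy is easier to follow in code. The Python sketch below is illustrative only and is not Iceberg's implementation: to stay runnable with just the standard library it swaps AES-GCM for a toy SHA-256 keystream plus an HMAC tag, which is not secure and must not be used for real data. The names seal, open_sealed, write_file, and read_file are hypothetical, and in a real deployment the master key would stay inside the KMS, which would perform the KEK unwrap itself.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy authenticated encryption (stand-in for AES-GCM, NOT secure)."""
    nonce = secrets.token_bytes(12)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def open_sealed(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:12], blob[12:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication tag mismatch")
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


# Tier 1: master key (would live in the KMS and never be stored with the table).
master_key = secrets.token_bytes(32)
# Tier 2: key-encryption key, stored in table metadata only in wrapped form.
kek = secrets.token_bytes(32)
table_metadata = {"wrapped_kek": seal(master_key, kek)}


def write_file(plaintext: bytes) -> dict:
    # Tier 3: a fresh data-encryption key for every file, wrapped by the KEK.
    dek = secrets.token_bytes(32)
    return {"wrapped_dek": seal(kek, dek), "ciphertext": seal(dek, plaintext)}


def read_file(stored: dict) -> bytes:
    kek_plain = open_sealed(master_key, table_metadata["wrapped_kek"])
    dek = open_sealed(kek_plain, stored["wrapped_dek"])
    return open_sealed(dek, stored["ciphertext"])


stored = write_file(b"sensitive rows")
print(read_file(stored))  # b'sensitive rows'
```

Because each layer carries an authentication tag, flipping a byte in the stored ciphertext makes read_file raise instead of returning corrupted data, which mirrors the tamper-detection property listed above.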

File Format API

Historically, Iceberg's format-handling code was tightly coupled, growing organically around Parquet, Avro, and ORC. Adding a new format or enforcing consistent feature support (like V3 default values or new column types) across all formats meant duplicating complex engine-specific switch/case code paths.

Iceberg 1.11.0 introduces the finalized File Format API, bringing a consistent API to reading and writing all of these file formats.

Instead of hardcoded engines handling binary extraction, the architecture introduces:

FormatModel: A standardized implementation defining how a file format handles reader/writer construction and its specific capabilities.
FormatModelRegistry: A central directory where query engines fetch appropriate read and write builders.
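
As a rough illustration of the registry pattern these two pieces describe, here is a small Python sketch. It does not reproduce Iceberg's Java interfaces: ToyFormatModel and ToyFormatRegistry are hypothetical names, and JSON Lines and CSV stand in for Parquet, Avro, and ORC so the example runs with only the standard library.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToyFormatModel:
    """Hypothetical stand-in: bundles how one file format is written and read."""
    name: str
    write: Callable[[list], bytes]
    read: Callable[[bytes], list]


class ToyFormatRegistry:
    """Hypothetical stand-in: the single place engines look up format handlers."""

    def __init__(self):
        self._models = {}

    def register(self, model: ToyFormatModel) -> None:
        if model.name in self._models:
            raise ValueError(f"format already registered: {model.name}")
        self._models[model.name] = model

    def model_for(self, name: str) -> ToyFormatModel:
        return self._models[name]


def write_jsonl(rows):
    return "\n".join(json.dumps(r) for r in rows).encode()


def read_jsonl(data):
    return [json.loads(line) for line in data.decode().splitlines()]


def write_csv(rows):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode()


def read_csv(data):
    return list(csv.DictReader(io.StringIO(data.decode())))


registry = ToyFormatRegistry()
registry.register(ToyFormatModel("jsonl", write_jsonl, read_jsonl))
registry.register(ToyFormatModel("csv", write_csv, read_csv))


def roundtrip(fmt: str, rows: list) -> list:
    # Engine code has no per-format branching: it asks the registry instead.
    model = registry.model_for(fmt)
    return model.read(model.write(rows))


rows = [{"id": "1", "name": "a"}, {"id": "2", "name": "b"}]
print(roundtrip("jsonl", rows) == rows, roundtrip("csv", rows) == rows)
```

Adding a format in this toy means registering one more model; the roundtrip function, which plays the role of the engine, does not change.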

This API (which is already seeing adoption across other Apache Iceberg implementations) provides a significant code cleanup for the future of the project. It also opens the door to more file formats as time goes on.

Moreover, this new interface facilitates the implementation of Column Families, enabling vertical partitioning of storage. This advancement lets teams perform targeted updates or rewrites on isolated columns—such as recalculating vector embeddings—while leaving the remaining table data undisturbed.

SQL UDF Specification

Iceberg 1.11.0 includes the SQL UDF specification, which adds a brand new metadata format for both scalar and table functions:

Immutable Versioning and Rollback: UDF metadata is written as self-contained, versioned JSON files stored right in the object store. If a data engineer deploys a buggy UDF update, administrators can execute an atomic rollback to a previous version.
Standardized Schema Typings: Parameters and return types map cleanly to Iceberg Type JSON representations, directly accommodating complex nested maps, structs, and the upcoming Iceberg V3 variant type.
Engine Specific Execution: Each SQL UDF has a function implementation for each engine, allowing users to leverage engine-specific functionality in their UDFs.
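
The versioning and rollback behavior in the list above can be modeled in a few lines. This Python sketch is a toy, not the specification's actual layout: ToyUdfStore, the path scheme, and every JSON field name ("parameters", "return-type", "implementations") are assumptions made for illustration.

```python
import json


class ToyUdfStore:
    """Hypothetical model of versioned UDF metadata in an object store."""

    def __init__(self):
        self.objects = {}   # path -> JSON text, written once and never modified
        self.current = {}   # function name -> path of the active version

    def commit(self, name: str, definition: dict) -> int:
        version = sum(1 for p in self.objects if p.startswith(f"{name}/")) + 1
        path = f"{name}/v{version}.json"
        self.objects[path] = json.dumps({"version": version, **definition},
                                        sort_keys=True)
        self.current[name] = path
        return version

    def rollback(self, name: str, version: int) -> None:
        path = f"{name}/v{version}.json"
        if path not in self.objects:
            raise KeyError(path)
        # Rollback is a single pointer swap; no metadata file is rewritten.
        self.current[name] = path

    def load(self, name: str) -> dict:
        return json.loads(self.objects[self.current[name]])


store = ToyUdfStore()
store.commit("add_one", {
    "parameters": [{"name": "x", "type": "int"}],
    "return-type": "int",
    # One implementation per engine, so each can use its own SQL dialect.
    "implementations": {"spark": "x + 1", "trino": "x + 1"},
})
store.commit("add_one", {
    "parameters": [{"name": "x", "type": "int"}],
    "return-type": "int",
    "implementations": {"spark": "x + 2", "trino": "x + 2"},  # buggy update
})
store.rollback("add_one", 1)
print(store.load("add_one")["implementations"]["spark"])  # x + 1
```

Because old versions are never overwritten, the buggy version 2 file is still present after the rollback and remains available for inspection.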

GCS Analytics Core Library Integration

For Google Cloud customers, version 1.11.0 delivers substantial throughput gains by embedding the GCS Analytics Core library into GCSFileIO (Issue #14326, PR #14333).

This integration introduces Footer Prefetching, which optimizes Parquet length checks by caching object suffixes to remove network overhead. Combined with threaded VectoredIO for concurrent multi-range operations and specialized small object caching for sub-1MB files, these enhancements eliminate persistent I/O bottlenecks. Initial benchmarks indicate that these architectural improvements can reduce Parquet metadata parsing latency and boost total record processing speeds, empowering high-scale Spark, Flink, and Trino workloads to run with improved efficiency on Google Cloud Storage.
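
To show why prefetching the tail of an object saves a round trip, here is a small Python sketch. It relies on one fact about the Parquet layout: a file ends with a 4-byte little-endian footer length followed by the magic bytes PAR1. Everything else is an assumption for illustration, including the CountingObject class, the function names, and the 64 KiB prefetch size; this is not the GCS Analytics Core library's API, and the footer here is placeholder bytes, not real Thrift metadata.

```python
import struct

MAGIC = b"PAR1"


class CountingObject:
    """Toy object-store blob; each read_range call models one network request."""

    def __init__(self, data: bytes):
        self.data = data
        self.size = len(data)  # a real client learns the size from object metadata
        self.reads = 0

    def read_range(self, start: int, length: int) -> bytes:
        self.reads += 1
        return self.data[start:start + length]


def read_footer_naive(obj: CountingObject) -> bytes:
    tail = obj.read_range(obj.size - 8, 8)              # request 1: length + magic
    if tail[4:] != MAGIC:
        raise ValueError("not a Parquet file")
    (footer_len,) = struct.unpack("<I", tail[:4])
    return obj.read_range(obj.size - 8 - footer_len, footer_len)  # request 2


def read_footer_prefetched(obj: CountingObject, suffix_bytes: int = 64 * 1024) -> bytes:
    start = max(0, obj.size - suffix_bytes)
    suffix = obj.read_range(start, obj.size - start)    # one request for the tail
    if suffix[-4:] != MAGIC:
        raise ValueError("not a Parquet file")
    (footer_len,) = struct.unpack("<I", suffix[-8:-4])
    if footer_len + 8 <= len(suffix):
        return suffix[-8 - footer_len:-8]                # served from the cached tail
    # Footer larger than the prefetched suffix: fall back to a second request.
    return obj.read_range(obj.size - 8 - footer_len, footer_len)


footer = b"placeholder-footer-bytes"
blob = MAGIC + b"column-chunk" * 100 + footer + struct.pack("<I", len(footer)) + MAGIC

naive, prefetched = CountingObject(blob), CountingObject(blob)
assert read_footer_naive(naive) == read_footer_prefetched(prefetched) == footer
print(naive.reads, prefetched.reads)  # 2 1
```

The saving per file is a single request, but it applies to every Parquet file a query opens, which is where the latency reduction would come from.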

Getting Started with 1.11.0

We are excited to be part of the Apache Iceberg community and to keep innovating together. As a compliant Iceberg REST Catalog, Lakehouse for Apache Iceberg (formerly BigLake) already has support for version 1.11.0.

To upgrade your environment, update your build dependencies to version 1.11.0. Remember to review your deployment runtimes to ensure compatibility with the new JDK 17 baseline, and test your workloads if you are transitioning from Spark 3.4.

For a full breakdown of every bug fix, contributor attribution, and dependency bump, check out the official Apache Iceberg Releases Page.