
Accelerate JAX models on Intel GPUs via PJRT

Thursday, June 1, 2023

We are excited to announce the first PJRT plugin implementation in Intel Extension for TensorFlow, which seamlessly runs JAX models on Intel® GPU. The PJRT API simplified the integration, which allowed the Intel GPU plugin to be developed separately and quickly integrated into JAX. This same PJRT implementation also enables initial Intel GPU support for TensorFlow and PyTorch models with XLA acceleration.

Figure 1. Intel Data Center GPU Max Series

With the shared vision that modular interfaces make integration easier and enable faster, independent development, Intel and Google collaborated in developing the TensorFlow PluggableDevice mechanism. This is the supported way to extend TensorFlow to new devices and allows hardware vendors to release separate plugin binaries. Intel has continued to work with Google to build modular interfaces for the XLA compiler and to develop the PJRT plugin to run JAX workloads on Intel GPUs.

JAX

JAX is an open source Python library designed for complex numerical computations on high-performance computing devices like GPUs and TPUs. It supports NumPy functions and provides automatic differentiation as well as a composable function transformation system to build and train neural networks.
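These capabilities can be seen in a few lines. The following is a minimal sketch (assuming JAX is installed; it runs on CPU when no accelerator plugin is present):

```python
import jax
import jax.numpy as jnp

# A NumPy-style function: f(x) = sum(x**2)
def f(x):
    return jnp.sum(x ** 2)

# Automatic differentiation: grad(f)(x) = 2 * x
grad_f = jax.grad(f)
print(grad_f(jnp.array([1.0, 2.0, 3.0])))  # [2. 4. 6.]

# Composable transformations: JIT-compile the gradient through XLA
fast_grad_f = jax.jit(grad_f)
print(fast_grad_f(jnp.array([1.0, 2.0, 3.0])))  # [2. 4. 6.]
```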

JAX uses XLA as its compilation and execution backend to optimize and parallelize computations, particularly on AI hardware accelerators. When a JAX program is executed, the Python code is transformed into OpenXLA’s StableHLO operations, which are then passed to PJRT for compilation and execution. Underneath, the StableHLO operations are compiled into machine code by the XLA compiler, which can then be executed on the target hardware accelerator.
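This lowering pipeline can be observed directly from Python. The sketch below assumes a recent JAX release that exposes the `lower()` / `compile()` staging API on jitted functions:

```python
import jax
import jax.numpy as jnp

@jax.jit
def f(x):
    return jnp.tanh(x) + 1.0

# Trace the Python function and lower it to StableHLO (not yet compiled)
lowered = f.lower(jnp.ones((2, 2), dtype=jnp.float32))
print(lowered.as_text())  # textual StableHLO module

# Hand the StableHLO to PJRT/XLA for compilation to device code
compiled = lowered.compile()
print(compiled(jnp.ones((2, 2), dtype=jnp.float32)))
```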

PJRT

PJRT (used in conjunction with OpenXLA’s StableHLO) provides a hardware- and framework-independent interface for compilers and runtimes (recent announcement). The PJRT interface supports plugins for new device backends, providing a straightforward way to integrate JAX with Intel's systems and enabling JAX workloads on Intel GPUs. Through PJRT integration with various AI frameworks, Intel’s GPU plugin can deliver hardware acceleration and oneAPI optimizations to a wider range of developers using Intel GPUs.

The PJRT API is framework-independent: it allows upper-level AI frameworks to compile and execute numeric computations represented in StableHLO on an AI hardware accelerator. It has been integrated with popular AI frameworks including JAX, TensorFlow (via TF-XLA), and PyTorch (via PyTorch-XLA), which means hardware vendors can provide a single plugin for their new AI hardware and all of these frameworks will support it. It also provides low-level primitives, such as zero-copy buffer donation and lightweight dependency management, that enable efficient interaction with upper-level AI frameworks so they can best utilize hardware resources and achieve high-performance execution.

Figure 2. PJRT simplifies the integration of oneAPI on Intel GPU into AI Frameworks

PJRT Plugin for Intel GPU

The Intel GPU plugin implements the PJRT API by compiling StableHLO and dispatching the executable to Intel GPUs. The compilation is based on the XLA implementation, adding target-specific passes for Intel GPUs and leveraging oneAPI performance libraries for acceleration. Device execution is supported through the SYCL runtime. The Intel GPU plugin also implements device registration, enumeration, and the SPMD execution mode.

PJRT’s high-level runtime abstraction allows the plugin to implement its own low-level device management modules and take advantage of the advanced runtime features offered by the new device. For example, the Intel GPU plugin uses the out-of-order queue feature provided by the SYCL runtime. Compared to fitting the plugin implementation to a low-level runtime interface, such as the stream executor C API used in PluggableDevice, implementing the PJRT runtime interface is straightforward and efficient.

It’s simple to get started using the Intel GPU plugin to run a JAX program, including JAX-based frameworks like Flax and T5X. Just build the plugin (example documentation), then set the environment variable and dependent library paths. JAX automatically looks for the plugin library and loads it into the current process.

Below are example code snippets of running JAX on an Intel GPU.

$ export PJRT_NAMES_AND_LIBRARY_PATHS='xpu:Your_itex_library/libitex_xla_extension.so'
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:Your_Python_site-packages/jaxlib
$ python
>>> import numpy as np
>>> import jax
>>> jax.local_devices()  # PJRT Intel GPU plugin loaded
[IntelXpuDevice(id=0, process_index=0), IntelXpuDevice(id=1, process_index=0)]
>>> x = np.random.rand(2,2).astype(np.float32)
>>> y = np.random.rand(2,2).astype(np.float32)
>>> z = jax.numpy.add(x, y)  # Runs on Intel XPU
This is the latest example of Intel AI tools and frameworks leveraging oneAPI software libraries to provide high performance on Intel GPU.

Future Work

This PJRT plugin for Intel GPUs has also been integrated into TensorFlow to run XLA supported ops in TensorFlow models. However, XLA has a smaller op set than TensorFlow. For many TensorFlow models in production, some parts of the model graph are executed with PJRT (XLA compatible) while other parts are executed with the classic TensorFlow runtime using TensorFlow OpKernel. This mixed execution model requires PJRT and TensorFlow OpKernel to work seamlessly with each other. The TensorFlow team has introduced the NextPluggableDevice API to enable this.

When using the NextPluggableDevice API, PJRT manages all critical hardware states (e.g., allocator, stream manager, driver), and the NextPluggableDevice API allows hardware vendors to build new TensorFlow OpKernels that can access those hardware states via PJRT. Together, PJRT and the NextPluggableDevice API enable interoperability between the classic TensorFlow runtime and XLA, allowing an XLA subgraph to produce a PJRT buffer and feed it to TensorFlow, and vice versa.

As a next step, Intel will continue working with Google to adopt the NextPluggableDevice API to implement non-XLA ops on Intel GPUs, extending support to all TensorFlow models.

Written in collaboration with Jianhui Li, Zhoulong Jiang, and Yiqiang Li from Intel.

By Jieying Luo, Chuanhao Zhuge, and Xiao Yu – Google

Google Open Source Peer Bonus program announces first group of winners for 2023

Wednesday, May 24, 2023



We are excited to announce the first group of winners for the 2023 Google Open Source Peer Bonus Program! This program recognizes external open source contributors who have been nominated by Googlers for their exceptional contributions to open source projects.

The Google Open Source Peer Bonus Program is a key part of Google's ongoing commitment to open source software. By supporting the development and growth of open source projects, Google is fostering a more collaborative and innovative software ecosystem that benefits everyone.

This cycle's Open Source Peer Bonus Program received a record-breaking 255 nominations, marking a 49% increase from the previous cycle. This growth speaks to the popularity of the program both within Google and the wider open source community. It's truly inspiring to see so many individuals dedicated to contributing their time and expertise to open source projects. We are proud to support and recognize their efforts through the Open Source Peer Bonus Program.

The winners of this year's Open Source Peer Bonus Program come from 35 different countries around the world, reflecting the program's global reach and the immense impact of open source software. Community collaboration is a key driver of innovation and progress, and we are honored to be able to support and celebrate the contributions of these talented individuals from around the world through this program.

In total, 203 winners were selected based on the impact of their contributions to the open source project, the quality of their work, and their dedication to open source. These winners represent around 150 unique open source projects, demonstrating a diverse range of domains, technologies, and communities. There are open source projects related to web development such as Angular, PostCSS, and the 2022 Web Almanac, and tools and libraries for programming languages such as Rust, Python, and Dart. Other notable projects include cloud computing frameworks like Apache Beam and Kubernetes, and graphics libraries like Mesa 3D and HarfBuzz. The projects also cover various topics such as security (CSP), testing (AFLPlusplus), and documentation (The Good Docs Project). Overall, it's an impressive list of open source projects from different areas of software development.

We would like to extend our congratulations to the winners! Included below are those who have agreed to be named publicly.

Winner – Open Source Project

Bram Stein – 2022 Web Almanac
Saptak Sengupta – 2022 Web Almanac
Thibaud Colas – 2022 Web Almanac
Andrea Fioraldi – AFLPlusplus
Marc Heuse – AFLplusplus
Joel Ostblom – Altair
Chris Dalton – ANGLE
Matthieu Riegler – Angular
Ryan Carniato – Angular
Johanna Öjeling – Apache Beam
Rickard Zwahlen – Apache Beam
Seunghwan Hong – Apache Beam
Claire McGinty – Apache Beam & Scio
Kellen Dye – Apache Beam & Scio
Michel Davit – Apache Beam & Scio
Stamatis Zampetakis – Apache Hive
Matt Casters – Apache Hop
Kevin Mihelich – Arch Linux ARM
Sergio Castaño Arteaga – Artifact Hub
Vincent Mayers – Atlanta Java Users Group
Xavier Bonaventura – Bazel
Jelle Zijlstra – Black
Clément Contet – Blockly
Yutaka Yamada – Blockly
Luiz Von Dentz – Bluez
Kate Gregory – Carbon Language
Ruth Ikegah – Chaoss
Dax Fohl – Cirq
Chad Killingsworth – closure-compiler
Yuan Li – Cloud Dataproc Initialization Actions
Manu Garg – Cloudprober
Kévin Petit – CLVK
Dimitris Koutsogiorgas – CocoaPods
Axel Obermeier – Conda Forge
Roman Dodin – Containerlab
Denis Pushkarev – core-js
Chris O'Haver – CoreDNS
Justine Tunney – cosmopolitan
Jakob Kogler – cp-algorithms
Joshua Hemphill – CSP (Content-Security-Policy)
Romain Menke – CSSTools’ PostCSS Plugins and Packages
Michael Sweet – CUPS
Daniel Stenberg – curl
Pokey Rule – Cursorless
Ahmed Ashour – Dart
Zhiguang Chen – Dart Markdown Package
Dmitry Zhifarsky – DCM
Mark Pearson – Debian
Salvatore Bonaccorso – Debian
Felix Palmer – deck.gl
Xiaoji Chen – deck.gl
Andreas Deininger – Docsy
Simon Binder – Drift
Hajime Hoshi – Ebitengine
Protesilaos Stavrou – Emacs modus-themes
Raven Black – envoy
Péter Szilágyi – ethereum
Sudheer Hebbale – evlua
Morten Bek Ditlevsen – Firebase SDK for Apple App Development
Victor Zigdon – Flashing Detection
Ashita Prasad – Flutter
Callum Moffat – Flutter
Greg Price – Flutter
Jami Couch – Flutter
Reuben Turner – Flutter
Heather Turner – FORWARDS
Donatas Abraitis – FRRouting/frr
Guillaume Melquiond – Gappa
Sam James – Gentoo
James Blair – Gerrit Code Review
Martin Paljak – GlobalPlatformPro
Jeremy Bicha – GNOME
Austen Novis – Goblet
Ka-Hing Cheung – goofys
Nicholas Junge – Google Benchmark
Robert Teller – Google Cloud VPC Firewall Rules
Nora Söderlund – Google Maps Platform Discord community and GitHub repositories
Aiden Grossman – google/ml-compiler-opt
Giles Knap – gphotos-sync
Behdad Esfahbod – HarfBuzz
Juan Font Alonso – headscale
Blaž Hrastnik – Helix
Paulus Schoutsen – home-assistant
Pietro Albini – Infrastructure team - Rust Programming Language
Eric Van Norman – Istio
Zhonghu Xu – Istio
Pierre Lalet – Ivre Rocks
Ayaka Mikazuki – JAX
Kyle Zhao – JGit | The Eclipse Foundation
Yuya Nishihara – jj (Jujutsu VCS)
Oscar Dowson – JuMP-dev
Mikhail Yakshin – Kaitai Struct
Daniel Seemaier – KaMinPar
Abheesht Sharma – KerasNLP
Jason Hall – ko
Jonas Mende – Kubeflow Pipelines Operator
Paolo Ambrosio – Kubeflow Pipelines Operator
Arnaud Meukam – Kubernetes
Patrick Ohly – Kubernetes
Ricardo Katz – Kubernetes
Akihiro Suda – Lima
Jan Dubois – Lima
Dongliang Mu – Linux Kernel
Johannes Berg – Linux Kernel
Mauricio Faria de Oliveira – Linux Kernel
Nathan Chancellor – Linux Kernel
Ondřej Jirman – Linux Kernel
Pavel Begunkov – Linux Kernel
Pavel Skripkin – Linux Kernel
Tetsuo Handa – Linux Kernel
Vincent Mailhol – Linux Kernel
Hajime Tazaki – Linux Kernel Library
Jonatan Kłosko – Livebook
Jonas Bernoulli – Magit
Henry Lim – Malaysia Vaccine Tracker Twitter Bot
Thomas Caswell – matplotlib
Matt Godbolt – mattgodbolt
Matthew Holt – mholt
Ralf Jung – Miri and Stacked Borrows
Markus Böck – mlir
Matt DeVillier – MrChromebox.tech
Levi Burner – MuJoCo
Hamel Husain – nbdev
Justin Keyes – Neovim
Wim Henderickx – Nephio
Janne Heß – nixpkgs
Martin Weinelt – nixpkgs
Brian Carlson – node-postgres
Erik Doernenburg – OCMock
Aaron Brethorst – OneBusAway for iOS, written in Swift
Onur Mutlu – Onur Mutlu Lectures - YouTube
Alexander Alekhin – OpenCV
Alexander Smorkalov – OpenCV
Stafford Horne – OpenRISC
Peter Gadfort – OpenROAD
Christopher "CRob" Robinson – OpenSSF Best Practices WG
Arnaud Le Hors – OpenSSF Scorecard
Nate Wert – OpenSSF Scorecard
Kevin Thomas Abraham – Oppia
Praneeth Gangavarapu – Oppia
Mohit Gupta – Oppia Android
Jaewoong Eum – Orbital
Carsten Dominik – Org mode
Guido Vranken – oss-fuzz
Daniel Anderson – parlaylib
Richard Davey – Phaser
Juliette Reinders Folmer – PHP_CodeSniffer
Hassan Kibirige – plotnine
Andrey Sitnik – PostCSS
Dominik Czarnota – pwndbg
Ee Durbin – PyPI
Adam Turner – Python PEPs
Peter Odding – python-rotate-backups
Christopher Courtney – QMK
Jay Berkenbilt – qpdf
Tim Everett – RTKLIB
James Higgins – Rust
Tony Arcieri – rustsec
Natsuki Natsume – Sass
Mohab Mohie – SHAFT
Cory LaViska – Shoelace
Catherine 'whitequark' – smoltcp
Kumar Shivendu – Software Heritage
Eriol Fox – SustainOSS
Richard Littauer – SustainOSS
Billy Lynch – Tekton
Trevor Morris – TensorFlow
Jiajia Qin – TensorFlow.js
Patty O'Callaghan – TensorFlow.js Training and Ecosystem
Luiz Carvalho – TEP-0084: End-to-end provenance in Tekton Chains
Hannes Hapke – TFX-Addons
Sakae Kotaro – The 2021 Web Almanac
Frédéric Wang – The Chromium Projects
Raphael Kubo da Costa – The Chromium Projects
Mengmeng Tang – The Good Docs Project
Ophy Boamah Ampoh – The Good Docs Project
Gábor Horváth – The LLVM Project
Shafik Yaghmour – The LLVM Project
Dave Airlie – The Mesa 3D Graphics Library
Faith Ekstrand – The Mesa 3D Graphics Library
Aivar Annamaa – Thonny
Lawrence Kesteloot – trs80
Josh Goldberg – TypeScript
Linus Seelinger – UM-Bridge
Joseph Kato – Vale.sh
Abdelrahman Awad – vee-validate
Maxi Ferreira – View Transitions API
Tim Pope – vim-fugitive
Michelle O'Connor – Web Almanac
Jan Grulich – WebRTC
Wez Furlong – WezTerm
Yao Wang – World Federation of Advertisers - Virtual People Core Serving
Duncan Ogilvie – x64dbg

We are incredibly proud of all of the nominees for their outstanding contributions to open source, and we look forward to seeing even more amazing contributions in the years to come.

By Maria Tabak, Google Open Source Peer Bonus Program Lead
