Google Open Source Report Card

Friday, October 14, 2016

Open source software enables Google to build things quickly and efficiently without reinventing the wheel, allowing us to focus on solving new problems. We stand on the shoulders of giants and we know it. This is why we support open source and make it easy for Googlers to release the projects they’re working on internally as open source.

Today we’re sharing our first Open Source Report Card, highlighting our most popular projects, sharing a few statistics and detailing some of the projects we’ve released in 2016.

We’ve open sourced over 20 million lines of code to date, and you can find a listing of some of our best-known project releases on our website. Here are some of our most popular projects:
  • Android - a software stack for mobile devices that includes an operating system, middleware and key applications.
  • Chromium - a project encompassing Chromium, the software behind Google Chrome, and Chromium OS, the software behind Google Chrome OS devices.
  • Angular - a web application framework for JavaScript and Dart focused on developer productivity, speed and testability.
  • TensorFlow - a library for numerical computation using data flow graphs with support for scalable machine learning across platforms from data centers to embedded devices.
  • Go - a statically typed and compiled programming language that is expressive, concise, clean and efficient.
  • Kubernetes - a system for automating deployment, operations and scaling of containerized applications.
  • Polymer - a lightweight library built on top of Web Components APIs for building encapsulated re-usable elements in web applications.
  • Protobuf - an extensible, language-neutral and platform-neutral mechanism for serializing structured data.
  • Guava - a set of Java core libraries that includes new collection types (such as multimap and multiset), immutable collections, a graph library, functional types, an in-memory cache, and APIs/utilities for concurrency, I/O, hashing, primitives, reflection, string processing and much more.
  • Yeoman - a robust and opinionated set of scaffolding tools including libraries and a workflow that can help developers quickly build beautiful and compelling web applications.
While it’s difficult to measure the full scope of open source at Google, we can use the subset of projects that are on GitHub to gather some interesting data. Today our GitHub footprint includes over 84 organizations and 3,499 repositories, 773 of which were created this year.

Googlers use countless languages from Assembly to XSLT, but what are their favorites? GitHub flags the most heavily used language in a repository and we can use that to find out. A survey of GitHub repositories shows us these are some of the languages Googlers use most often:
  • JavaScript
  • Java
  • C/C++
  • Go
  • Python
  • TypeScript
  • Dart
  • PHP
  • Objective-C
  • C#
Many things can be gleaned using the open source GitHub dataset on BigQuery, like usage of tabs versus spaces and the most popular Go packages. What about how many times Googlers have committed to open source projects on GitHub? We can search for email addresses to get a baseline number of Googler commits. Here’s our query:

SELECT count(*) as n
FROM [bigquery-public-data:github_repos.commits]
WHERE > '2016-01-01 00:00'
AND REGEXP_EXTRACT(, r'.*@(.*)') = ''

With this, we learn that Googlers have made 142,527 commits to open source projects on GitHub since the start of the year. This dataset goes back to 2011 and we can tweak this query to find out that Googlers have made 719,012 commits since then. Again, this is just a baseline number as it doesn’t count commits made with other email addresses.
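
The filter in the query above boils down to two checks per commit: a date cutoff and an email-domain match. Here is a minimal Python sketch of that logic, run against made-up records rather than the BigQuery dataset. The field names (`email`, `date`), the sample commits and the `example.com` domain are all assumptions for illustration; they are not the columns or the domain used in our query.

```python
import re
from datetime import datetime

# Same pattern as the query: capture everything after the last "@".
DOMAIN_RE = re.compile(r".*@(.*)")


def email_domain(email):
    """Return the domain part of an email address, or None if there is no "@"."""
    match = DOMAIN_RE.match(email)
    return match.group(1) if match else None


def count_commits(commits, domain, since):
    """Count commits made after `since` whose email belongs to `domain`."""
    return sum(
        1
        for commit in commits
        if commit["date"] > since and email_domain(commit["email"]) == domain
    )


# Hypothetical sample data; not taken from the GitHub dataset.
SAMPLE_COMMITS = [
    {"email": "alice@example.com", "date": datetime(2016, 3, 1)},
    {"email": "bob@example.com", "date": datetime(2015, 12, 31)},
    {"email": "carol@example.org", "date": datetime(2016, 5, 20)},
    {"email": "no-at-sign", "date": datetime(2016, 6, 1)},
]

if __name__ == "__main__":
    print(count_commits(SAMPLE_COMMITS, "example.com", datetime(2016, 1, 1)))
```

With the sample data, only one record passes both checks. As with the query, a sketch like this undercounts anyone who commits with a different email address.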

Looking back at the projects we’ve open sourced in 2016, there’s a lot to be excited about. We have released open source software, hardware and datasets. Let’s take a look at some of this year’s releases.

Seesaw
Seesaw is a Linux Virtual Server (LVS) based load balancing platform developed in Go by our Site Reliability Engineers. Seesaw, like many projects, was built to scratch our own itch.

From our blog post announcing its release: “We needed the ability to handle traffic for unicast and anycast VIPs, perform load balancing with NAT and DSR (also known as DR), and perform adequate health checks against the backends. Above all we wanted a platform that allowed for ease of management, including automated deployment of configuration changes.”

Vendor Security Assessment Questionnaire (VSAQ)
We assess the security of hundreds of vendors every year and have developed a process to automate much of the initial information gathering with VSAQ. Many vendors found our questionnaires intuitive and flexible, so we decided to share them. The VSAQ Framework includes four extensible questionnaire templates covering web applications, privacy programs, infrastructure as well as physical and data center security. You can learn more about it in our announcement blog post.

OpenThread
OpenThread, released by Nest, is a complete implementation of the Thread protocol for connected devices in the home. This is especially important because of the fragmentation we’re seeing in this space. Development of OpenThread is supported by ARM, Microsoft, Qualcomm, Texas Instruments and other major vendors.

Magenta
Can we use machine learning to create compelling art and music? That’s the question that animates Magenta, a project from the Google Brain team based on TensorFlow. The aim is to advance the state of the art in machine intelligence for music and art generation and build a collaborative community of artists, coders and machine learning researchers. Read the release announcement for more information.

Omnitone
Virtual reality (VR) isn’t nearly as immersive without spatial audio, and much of VR development is taking place on proprietary platforms. Omnitone is an open library built by members of the Chrome Team that brings spatial audio to the browser. Omnitone builds on standard Web Audio APIs to deliver an immersive experience and can be used alongside projects like WebVR. Find out more in our blog post announcing the project’s release.

Science Journal
Today’s smartphones are packed with sensors that can tell us interesting things about the world around us. We launched Science Journal to help educators, students and citizen scientists tap into those sensors. You can learn more about the project in our announcement blog post.

Cartographer
Cartographer is a library for real-time simultaneous localization and mapping (SLAM) in 2D and 3D with Robot Operating System (ROS) support. Combining data from a variety of sensors, this library computes positioning and maps surroundings. This is a key element of self-driving cars, UAVs and robotics as well as efforts to map the insides of famous buildings. More information on Cartographer can be found in our blog post announcing its release.

This is just a small sampling of what we’ve released this year. Follow the Google Open Source Blog to stay apprised of Google’s open source software, hardware and data releases.

By Josh Simmons, Open Source Programs Office