opensource.google.com

Google Code-in is breaking records

Friday, December 15, 2017

It’s been an incredible (and incredibly busy!) three weeks for the 25 mentor organizations participating in Google Code-in (GCI) 2017, our seven-week global contest designed to introduce teens to open source software development. Participants complete bite-sized “tasks” in topics that include coding, documentation, UI/UX, quality assurance and more. Volunteer mentors from each open source project help participants along the way.

The number of registered students has already surpassed the 2016 total, and we are less than halfway to the finish! We’re thrilled that high school students are embracing GCI like never before.

Check out some of the statistics below (current as of Thursday, December 14):
  • Total registered students: 6,146
  • Number of students who have completed at least one task: 1,573 (51% of those students have completed more than 3 tasks, earning them a GCI t-shirt)
  • Total number of tasks completed: 5,499
  • Most tasks completed by one student: 39

[Figure: Top 5 Countries by Tasks Completed]

[Figure: Countries Represented by Mentors and Students]

Of course, GCI wouldn’t be possible without the effort of the more than 725 mentors and organization administrators. Based in 65 countries, mentors answer questions, review submissions, and approve tasks for students at all hours of the day -- and sometimes night! They work tirelessly to help encourage and guide the next generation of open source contributors.

Every year we express our gratitude to the mentors and organization administrators. We are particularly grateful for them given how many more students are participating in GCI this year. Thank you all, and hang in there!

By Mary Radomile, Google Open Source

TFGAN: A Lightweight Library for Generative Adversarial Networks

Tuesday, December 12, 2017

Crossposted on the Google Research Blog

Training a neural network usually involves defining a loss function, which tells the network how close or far it is from its objective. For example, image classification networks are often given a loss function that penalizes them for giving wrong classifications; a network that mislabels a dog picture as a cat will get a high loss. However, not all problems have easily defined loss functions, especially if they involve human perception, such as image compression or text-to-speech systems. Generative Adversarial Networks (GANs) offer a solution. This machine learning technique has led to improvements in a wide range of applications, including generating images from text, super-resolution, and helping robots learn to grasp. However, GANs introduce new theoretical and software engineering challenges, and it can be difficult to keep up with the rapid pace of GAN research.
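To make the dog-versus-cat example concrete, here is a minimal sketch of one common classification loss, cross-entropy, in plain Python. The probabilities and the function name are made up for illustration; they do not come from any particular model or library.

```python
import math


def cross_entropy(predicted_probs, true_label):
    """Negative log of the probability assigned to the correct label."""
    return -math.log(predicted_probs[true_label])


# Hypothetical outputs of two classifiers for the same picture of a dog.
confident_and_right = {"dog": 0.9, "cat": 0.1}
confident_and_wrong = {"dog": 0.1, "cat": 0.9}

low_loss = cross_entropy(confident_and_right, "dog")
high_loss = cross_entropy(confident_and_wrong, "dog")

# The classifier that calls the dog a cat receives the larger loss.
print(f"correct: {low_loss:.3f}, mislabeled: {high_loss:.3f}")
```

A loss like this works because the right answer is known for every training image. For tasks judged by human perception there is no such lookup table, which is the gap GANs aim to fill.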

A video of a generator improving over time. It begins by producing random noise, and eventually learns to generate MNIST digits.
To make GANs easier to experiment with, we’ve open sourced TFGAN, a lightweight library for training and evaluating GANs. It provides the infrastructure to train a GAN, well-tested losses and evaluation metrics, and easy-to-use examples that highlight the expressiveness and flexibility of TFGAN. We’ve also released a tutorial that includes a high-level API to quickly get a model trained on your data.
The effect of an adversarial loss on image compression. The top row shows image patches from the ImageNet dataset. The middle row shows the results of compressing and decompressing an image through an image compression neural network trained on a traditional loss. The bottom row shows the results from a network trained with a traditional loss and an adversarial loss. The GAN-loss images are sharper and more detailed, even if they are less like the original.
TFGAN supports experiments in a few important ways. It provides simple function calls that cover the majority of GAN use cases, so you can get a model running on your data in just a few lines of code, but it is built in a modular way to cover more exotic GAN designs as well. You can use just the modules you want: loss, evaluation, features, training, and so on are all independent. TFGAN’s lightweight design also means you can use it alongside other frameworks, or with native TensorFlow code. GAN models written using TFGAN will easily benefit from future infrastructure improvements, and you can select from a large number of already-implemented losses and features without having to rewrite your own. Lastly, the code is well-tested, so you don’t have to worry about numerical or statistical mistakes that are easy to make with GAN libraries.
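For readers new to GANs, the sketch below shows roughly what an adversarial loss computes. It is plain Python and does not use TFGAN or reproduce its API; the function names, the non-saturating generator loss, and the `adversarial_weight` value are all assumptions chosen for illustration. Here `d_real` and `d_fake` stand for the probability a discriminator assigns to "this sample is real" for a real and a generated sample.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and generated samples near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating form: low when generated samples fool the discriminator."""
    return -math.log(d_fake)


def combined_loss(reconstruction_error, d_fake, adversarial_weight=0.1):
    """A traditional loss plus a weighted adversarial term (weight is hypothetical)."""
    return reconstruction_error + adversarial_weight * generator_loss(d_fake)


# A generator whose output the discriminator finds convincing (0.9) is
# penalized less than one whose output is easily spotted as fake (0.1).
print(generator_loss(0.9), generator_loss(0.1))
print(combined_loss(reconstruction_error=0.5, d_fake=0.9))
```

The combined form mirrors the setup described in the image compression caption above, where a network is trained with both a traditional loss and an adversarial loss.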
Most neural text-to-speech (TTS) systems produce over-smoothed spectrograms. When applied to the Tacotron TTS system, a GAN can recreate some of the realistic texture, which reduces artifacts in the resulting audio.
When you use TFGAN, you’ll be using the same infrastructure that many Google researchers use, and you’ll have access to the cutting-edge improvements that we develop with the library. Anyone can contribute to the GitHub repositories, which we hope will facilitate code-sharing among ML researchers and users.

By Joel Shor, Senior Software Engineer, Machine Perception

Announcing the S2 Library: Geometry on the Sphere

Tuesday, December 5, 2017

Google has always embraced new approaches to organizing all the world's information, and this includes all the world's geography. Today we are announcing the open source release of Google's S2 library, the core geometric library on which Google's global geographic database is built.

A unique feature of the S2 library is that unlike traditional geographic information systems, which represent data as flat two-dimensional projections (similar to an atlas), the S2 library represents all data on a three-dimensional sphere (similar to a globe). This makes it possible to build a worldwide geographic database with no seams or singularities, using a single coordinate system, and with low distortion everywhere compared to the true shape of the Earth. While the Earth is not quite spherical, it is much closer to being a sphere than it is to being flat!
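The idea of working on the sphere rather than on a flat projection can be sketched in a few lines of plain Python. This is not the S2 API: the function names and the mean Earth radius constant are assumptions for illustration. Each latitude/longitude pair becomes a unit vector in three dimensions, and distance is the angle between two vectors.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Assumed mean radius, for illustration only.


def to_unit_vector(lat_deg, lng_deg):
    """Convert latitude/longitude in degrees to a point on the unit sphere."""
    lat, lng = math.radians(lat_deg), math.radians(lng_deg)
    return (math.cos(lat) * math.cos(lng),
            math.cos(lat) * math.sin(lng),
            math.sin(lat))


def angle_between(a, b):
    """Angle in radians between two unit vectors, stable for small angles."""
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    dot = sum(x * y for x, y in zip(a, b))
    return math.atan2(math.sqrt(sum(c * c for c in cross)), dot)


# Two equatorial points on opposite sides of the 180th meridian. On a flat
# map their longitudes differ by 358 degrees; on the sphere they are 2 apart.
west = to_unit_vector(0.0, 179.0)
east = to_unit_vector(0.0, -179.0)
print(angle_between(west, east) * EARTH_RADIUS_KM)
```

Because every point is an ordinary 3D vector, the same code handles the poles and the antimeridian without special cases, which is the "no seams or singularities" property in miniature.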

Notable features of the library include:
  • Flexible support for spatial indexing, including the ability to approximate arbitrary regions as collections of discrete S2 cells. This feature makes it easy to build large distributed spatial indexes. (The image above illustrates the S2 space-filling curve, an important tool used for spatial indexing.)
  • Fast in-memory spatial indexing of collections of points, polylines, and polygons.
  • Robust constructive operations (such as intersection, union, and simplification) and boolean predicates (such as testing for containment).
  • Efficient query operations for finding nearby objects, measuring distances, computing centroids, etc.
  • A flexible and robust implementation of snap rounding (a geometric technique that allows operations to be implemented 100% robustly while using small and fast coordinate representations).
  • A collection of efficient yet exact mathematical predicates for testing relationships among geometric primitives.
  • Extensive testing on Google's vast collection of geographic data.
  • Flexible Apache 2.0 license.
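
To give a feel for why a space-filling curve helps with spatial indexing, here is a minimal sketch of the classic Hilbert curve on a single flat square grid. It is not S2's cell ID encoding, which works on the sphere, and the function name is hypothetical. The curve assigns each grid cell a one-dimensional index such that consecutive indexes are always neighboring cells.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve over an n x n grid.

    n must be a power of two, and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or reflect so the next level sees a standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


n = 8
cells = sorted(((x, y) for x in range(n) for y in range(n)),
               key=lambda cell: hilbert_index(n, *cell))

# Manhattan distance between each pair of consecutive cells along the curve.
steps = [abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in zip(cells, cells[1:])]
print(max(steps))
```

Since nearby index values correspond to nearby cells, a contiguous range of indexes covers a compact patch of the grid. That lets an ordinary sorted key-value store answer many spatial queries with a handful of range scans.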
The reference implementation of the S2 library is written in C++, and subsets have been ported to Go, Java, and Python. An early version of the code was released in 2011, but today's announcement represents a major update along with a commitment to maintain the library going forward. The code is under active development and new features will be released regularly. (The Java port is based on the 2011 code and does not have the same robustness, performance, or features as the current C++ version.)

Our C++ code repository is here: https://github.com/google/s2geometry
And check out our documentation here: https://s2geometry.io

To learn more, start by reading the overview and quick start documents, then explore the documentation site. The library also has extensive documentation in the header files, which is where the most authoritative information can be found. More introductions and tutorials will be added over time - contributions are welcome!

The S2 library was written primarily by Eric Veach. Other significant contributors include Jesse Rosenstock, Eric Engle (Java port lead), Robert Snedegar (Go port lead), Julien Basch, and Tom Manshreck.

By Eric Veach, Software Engineer