opensource.google.com

Posts from 2019

Hey! Ho! Ten Years of Go!

Wednesday, November 13, 2019



Ten years ago, we announced the Go release here on this blog. This weekend we marked Go's 10th birthday as an open-source programming language and ecosystem for building modern networked software.

Go's original target was networked system infrastructure, anticipating what we now call the cloud. Go has become the language of the cloud, but more than that, Go has become the language of the open-source cloud, including Containerd, CoreDNS, Docker, Envoy, Etcd, Istio, Kubernetes, Prometheus, Terraform, and Vitess.

From our earliest days working on Go, we planned for Go to be open source. We knew that bootstrapping a new language and ecosystem was too large a project for one team or even one company to do alone. Go needed a thriving open-source community to curate and grow the ecosystem, to write books and tutorials, to teach courses to developers of all skill levels, and of course to find bugs and work on code improvements and new features. And of course we also wanted to share what we had created with everyone.

Open source at its best is about people working together to accomplish far more than any of them could have done alone. We are incredibly grateful to the thousands of people who have built up Go, its ecosystem, and its community with us over the past decade.

There are over a million Go developers worldwide, and companies all over the globe are looking to hire more. In fact, people often tell us that learning Go helped them get their first jobs in the tech industry. In the end, what we're most proud of about Go is not a well-designed feature or a clever bit of code but the positive impact Go has had in so many people's lives. We aimed to create a language that would help us be better developers, and we are thrilled that Go has helped so many others. Today we launched go.dev to be a hub for all Go developers to learn more and find ways to connect with each other.

As a thank you from us on the Go team at Google to Go contributors and developers worldwide for joining us on Go's journey, we are distributing a commemorative 10th anniversary pin at this month's Go Developer Network meetups. Renee French, who created the Go gopher for the release back in 2009, designed this special pin and also painted the mission control gopher scene at the top of this post. We thank Renee for giving Go so much of her time and a mascot that continues to delight and inspire a decade on.

As #GoTurns10, we hope everyone will take a moment to celebrate the Go community and all we have achieved together. On behalf of the entire Go team at Google, thank you to everyone who has joined us over the past decade. Let's make the next one even more incredible!



By Russ Cox, for the Go team

Building Skills, Building Community

Tuesday, November 12, 2019

Year after year, we hear from conference attendees that it's not just the content they came for, it's the connections. Meeting new people, getting new perspectives, making new friends (and sometimes hiring them!) is a big part of KubeCon life. We want to make sure that the KubeCon community is welcoming to people from diverse backgrounds, but just being welcoming is not enough: we have to actually do the work to help people get through the door.

The easiest way to help people get through the door is through diversity scholarships. One of the biggest blockers to full participation in our community is just having the resources to get to the room where it happens, and a diversity scholarship—not just a ticket, but travel assistance too—helps increase participation.

1: Going Swagless

This KubeCon we want you to take away the really important things from the conference: new knowledge and new connections... not just another pen or plastic doodad. (Although to be fair, we will also have plenty of stickers... stickers aren't swag, they're an essential part of KubeCon!)

Google prides itself on being a data-driven company, so when we need to decide where we can spend our dollars to make the most impact and do the most good for the KubeCon community, we turn to the data. We know there is an issue: the CNCF report on KubeCon Seattle 2018 reported 11% women (and that’s not even a complete diversity metric). Looking at the things conference attendees have told us they value about KubeCon, we put together this handy chart to help guide our decision-making:
Travel + Conf Ticket Scholarship | Branded Pen
Face to face learning
Career development
OSS community building
Writing tools

We also need to consider externalities when we make our decisions, and going #swagless and dedicating those resources to improving the conversation and community at KubeCon has some positive externalities: less plastic (and lighter luggage going home) is better for the planet, too!

If our work to support diversity and inclusion at KubeCon has inspired you and you want to know what your org can do to participate, there is plenty of room in the #swagless tent for everyone: redirect your swag budget to D&I efforts. Shoutouts to conference organizers like SpringOne that went totally swagless this year!

2: Diversity Lunch + Hack

Our commitment to a welcoming environment and a diverse community doesn't stop at getting people in the door: we also need to work on inclusion. Our diversity lunch and hack is a place where people can:
  • Build their skills through pair programming
  • Get installation help
  • Do deep-dives on k8s topics
  • Connect with others in the community
Our diversity lunch isn't just talking about diversity: it's about working towards diversity through skill-building and creating stronger community bonds. Register here!

We welcomed 220 friends and allies in Barcelona and expect to continue the sold-out streak in San Diego (get your ticket now)!

3: Redirecting Even More

But wait, there's more! We're not just going #swagless, we're also redirecting all the hands-on workshop registration fees ($50) from Anthos Day, Anthos & GKE Lab, OSS: Agones, Knative, and Kubeflow to the diversity scholarship fund. You can build a stronger, more diverse community while you build your skills: a total twofer. (And our workshops are also walking the walk of inclusion by being accessible themselves: if you need support to attend a workshop, whether financial or physical, send us a note.)

4: Hiring

Also, one of the best things any company can do to drive D&I is to hire people who will help your company become more diverse, whether as a consultant to help you build your program, or as a team member who will help you bring a wider perspective to your product! Come meet a Googler at any of the activities we are doing during the week to discuss jobs at Google Cloud: g.co/Kubecon.

By: Paris Pittman, Google Open Source

Paving the way for a more diverse open source landscape: The First OSS Contributor Summit in Mexico

Wednesday, November 6, 2019

“I was able to make my first contribution yesterday, and today it was merged. I'm so excited about my first steps in open source,” a participant said about the First Summit for Open Source Contributors, which took place this September in Guadalajara, México.
How do you involve others in open source? How can we make this space more inclusive for groups with low representation in the field?

With these questions in mind and the call to contribute to software that is powering the world's favorite products, Google partnered with Software Guru magazine, Wizeline Academy, OSOM (a consortium started by Googler Griselda Cuevas to engage more Mexican developers in open source), IBM, Intel, Salesforce and Indeed to organize the First Summit for Open Source Contributors in Mexico. The Apache Software Foundation and the CNCF were some of the organizations that sponsored the conference. The event consisted of two days of training and presentations on a selection of open source projects, including Apache Beam, GNOME, Node.js, Istio, Kubernetes, Firefox, Drupal, and others. Through 19 workshops, participants were able to learn about the state of open source in Latin America, and also get dedicated coaching and hands-on practice to become active contributors in OSS. While unpaid, these collaborations represent the most popular way of learning to code and building a portfolio for young professionals, or for people looking to make a career shift towards tech.


As reported by many advocacy groups in the past few years, diversity remains a big debt in the tech industry. Only an average of 8.4% of employees in ten of the leading tech companies are Latinx(1). The gap is even bigger in open source software, where only 2.6% of committers to Apache projects are Latinx(2). Diversity in tech is not just the right thing to do, it is also good business: bringing more diverse participation into software development will result in more inclusive and successful products that serve a more comprehensive set of use cases and needs in any given population.


While representation numbers in the creation of software are still looking grim, the use of OSS is growing fast: it is estimated that cloud and big-data OSS technologies will grow fivefold by 2025 in Latin America. The main barrier to contributing? Language.

The First Summit for Open Source Contributors set out to close this fundamental gap between technology's users and its makers. To tackle this problem, we created, in partnership with other companies, 135 hours of content in Spanish for 481 participants, which produced over 200 new contributors across 19 open source projects. When asked why contributions from the region are so low, 41% of participants said it was due to lack of awareness, and 34% said they thought their contributions were not valuable. After the event, 47% of participants reported that the workshops and presentations provided them with information or guidance on how to contribute to specific projects, and 39% said the event helped them lose their fear and contribute. Almost 100% of participants stated that they plan to continue contributing to open source in the near future… and if they do, they would raise representation of Latinx in open source to 10%.
Organizing Team
This event left us with a lot of hope for the future of diversity and inclusion in open source. Going forward, we hope to continue supporting this summit in Latin America, and look for ways of reproducing this model in other regions of the world, as well as designing proactive outreach campaigns in other formats.

View more pictures of the event here.
View some of the recorded presentations here.


By: María Cruz for Google Open Source

(1) Aggregate data from TechCrunch: https://techcrunch.com/2019/06/17/the-future-of-diversity-and-inclusion-in-tech/
(2) Data from the last Apache Software Foundation Committer Survey, conducted in 2016, 765 respondents (13% of committers)

OpenTitan – Open sourcing transparent, trustworthy, and secure silicon

Tuesday, November 5, 2019

Security begins with secure infrastructure. To have higher confidence in the security and integrity of the infrastructure, we need to anchor our trust at the foundation—in a special-purpose chip.

Today, along with our partners, we are excited to announce OpenTitan—the first open source silicon root of trust (RoT) project. OpenTitan will deliver a high-quality RoT design and integration guidelines for use in data center servers, storage, peripherals, and more. Open sourcing the silicon design makes it more transparent, trustworthy, and ultimately, secure.
The OpenTitan logo

Anchoring trust in silicon

Silicon RoT can help ensure that the hardware infrastructure and the software that runs on it remain in their intended, trustworthy state by verifying that the critical system components boot securely using authorized and verifiable code. Silicon RoT can provide many security benefits by helping to:
  • Ensure that a server or a device boots with the correct firmware and hasn't been infected by a low-level malware.
  • Provide a cryptographically unique machine identity, so an operator can verify that a server or a device is legitimate.
  • Protect secrets like encryption keys in a tamper-resistant way even for people with physical access (e.g., while a server or a device is being shipped).
  • Provide authoritative, tamper-evident audit records and other runtime security services.
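To make the first bullet concrete, here is a minimal Python sketch of the idea of checking firmware before trusting it. It is an illustration only and not OpenTitan's design: a real silicon RoT typically verifies cryptographic signatures in hardware, whereas this sketch assumes a simpler stand-in, an allowlist of known-good SHA-256 digests. The names `firmware_digest` and `is_trusted` are hypothetical.

```python
import hashlib
import hmac
from typing import Iterable


def firmware_digest(image: bytes) -> str:
    """Return the SHA-256 digest of a firmware image as a hex string."""
    return hashlib.sha256(image).hexdigest()


def is_trusted(image: bytes, allowed_digests: Iterable[str]) -> bool:
    """Return True only if the image's digest matches a known-good digest.

    Assumption: the allowlist stands in for the signature check a real
    root of trust would perform. compare_digest avoids leaking, through
    timing, how much of a digest matched.
    """
    digest = firmware_digest(image)
    return any(hmac.compare_digest(digest, known) for known in allowed_digests)


if __name__ == "__main__":
    released = b"example firmware image"
    allowlist = {firmware_digest(released)}
    print(is_trusted(released, allowlist))            # unmodified image
    print(is_trusted(released + b"\x00", allowlist))  # image with one extra byte
```

Any change to the image, even a single byte, produces a different digest, so a modified image no longer matches the allowlist.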
The silicon RoT technology can be used in server motherboards, network cards, client devices (e.g., laptops, phones), consumer routers, IoT devices, and more. For example, Google has relied on a custom-made RoT chip, Titan, to help ensure that machines in Google’s data centers boot from a known trustworthy state with verified code; it is our system root of trust. Recognizing the importance of anchoring the trust in silicon, together with our partners we want to spread the benefits of reliable silicon RoT chips to our customers and the rest of the industry. We believe that the best way to accomplish that is through open source silicon.

Raising the transparency and security bar

Similar to open source software, open source silicon can:
  1. Enhance trust and security through design and implementation transparency. Issues can be discovered early, and the need for blind trust is reduced.
  2. Enable and encourage innovation through contributions to the open source design.
  3. Provide implementation choice and preserve a set of common interfaces and software compatibility guarantees through a common, open reference design.
The OpenTitan project is managed by the lowRISC CIC, an independent not-for-profit company with a full-stack engineering team based in Cambridge, UK, and is supported by a coalition of like-minded partners, including ETH Zurich, G+D Mobile Security, Google, Nuvoton Technology, and Western Digital.

The founding partners of the OpenTitan project

OpenTitan is an active engineering project staffed by a team of engineers representing a coalition of partners who bring ideas and expertise from many perspectives. We are transparently building the logical design of a silicon RoT, including an open source microprocessor (the lowRISC Ibex, a RISC-V-based design), cryptographic coprocessors, a hardware random number generator, a sophisticated key hierarchy, memory hierarchies for volatile and non-volatile storage, defensive mechanisms, IO peripherals, secure boot, and more. With OpenTitan, a coalition of partners has come together to deliver a more open, transparent, and high-quality RoT.
A comparison of the major design components of a traditional RoT and an OpenTitan RoT
The OpenTitan project is rooted in three key principles:
  • Transparency – anyone can inspect, evaluate, and contribute to OpenTitan’s design and documentation to help build more transparent, trustworthy silicon RoT for all.
  • High quality – we are building a high-quality logically-secure silicon design, including reference firmware, verification collateral, and technical documentation.
  • Flexibility – adopters can reduce costs and reach more customers by using a vendor- and platform-agnostic silicon RoT design that can be integrated into data center servers, storage, peripheral and other devices.

Participating in the OpenTitan project

OpenTitan will be helpful for chip manufacturers, platform providers, and security-conscious enterprise organizations that want to enhance their infrastructure with silicon-based security. Visit our GitHub repository today.

If you are interested in actively collaborating on OpenTitan to help make secure open source silicon a reality, we encourage you to contact the OpenTitan team. If you would like your product to be considered for a pilot OpenTitan RoT integration, the team would be excited to hear from you.

By Royal Hansen, Vice President, Google and Dominic Rizzo, OpenTitan Lead, Google Cloud

From "let's try" to "woah, this is awesome!": Three years of GSoC for InterMine

Friday, November 1, 2019

GSoC Experience Series

InterMine is an open source data warehouse for biological data. In 2017, we decided at short-ish notice to participate in a call from Open Genome Informatics for Google Summer of Code (GSoC) mentoring organisations. InterMine had never participated in a program like this before, and we weren’t entirely sure if the time investment was actually going to be worth it. We nervously said “no more than two projects”, but we had so many great applications, we ended up taking on five brilliant students.
Fast forward to 2019: GSoC is so firmly embedded in our organisation that it’s hard to imagine this is only our third time participating. The benefits to us (and hopefully the students as well!) were immeasurable, allowing us to explore open-ended projects we thought might be fun and implement concrete ideas that we’ve been wanting to do for years, all while interacting with a really smart bunch of talented students.

From the 2017 cohort of students, we ended up with one of our students, Konstantinos Krytsis, authoring a scientific paper about the work they did: InterMineR: an R package for InterMine databases. Another student, Nadia Yudina, returned to our org as a mentor the next year.
In 2018, student engagement got even better: of six students, Adrián Rodríguez-Bazaga applied for an internal vacancy and joined us full time, Nupur Gunwant spent her next summer break working on an internship in our office, and two students returned as mentors the next year (Aman Dwivedi and Arunan Sugunakumar).

By this point, any questions we might have had about whether or not GSoC was “worth it” were firmly answered: GSoC had become an integral part of our team’s operations. There were still things we needed to improve, though—we ran a student debrief after GSoC 2018, and one student expressed that despite having worked with our API and data for three months, they still didn’t have a firm idea of why or how someone might wish to use InterMine. 😱 whoops! This definitely had never been our intent, and I felt mortified that we’d overlooked something so basic.

In 2019, we set out to provide our students with a firm grounding by running cohort calls. All students were invited, giving them the chance to meet one another and interact—not quite face to face, but video calls still give a great sense of “group” compared to just text chat. We structured the calls to run over several months, liberally borrowing from the Mozilla Open Leaders curriculum to teach students about open source good practices, presentation skills, code review, providing effective and kind feedback (an essential part of code review), and of course—talking about what InterMine is, how it was founded, and what type of people might use it. We made heavy use of Zoom’s breakout room feature, to allow small sub-groups of students and mentors to have private discussions about topics, before re-convening to report their experiences to the group.

Feedback from students was very positive about the calls, so we expect to continue this in later years. I think my favourite comment after our very first call was “Are there going to be more of these group calls? This was awesome!” We also repeatedly had the group calls mentioned positively in free-text feedback from student evaluations.

With this in mind, we’d like to share our call agenda templates with other organisations so others can run the same student cohort calls if they wish, and remix or modify them as needed. As part of our GSoC site repo, all content, including our call templates, GSoC grading criteria and advice, is Apache licensed and open for reuse. You can see all of our call templates on our GSoC repo site, or fork our GSoC GitHub repo; and I’m happy to discuss ideas (email: yo@intermine.org, twitter: @yoyehudi or @intermineorg) or help others get similar group call programs off the ground if you’d like advice.

The 2019 GCI Organizations!

Tuesday, October 29, 2019

We are excited to welcome 29 open source organizations to mentor students as part of Google Code-in 2019. The contest, now in its tenth year, offers students ages 13-17 from around the world an opportunity to learn and practice their coding skills while contributing to open source projects, all virtually!
Google Code-in starts for students on December 2nd this year! Students are encouraged to research and learn about the participating organizations ahead of time. You can get started by clicking on the links below:

Apertium – A free/open-source machine translation platform.

Australian Open Source Software Innovation and Education – Australian umbrella organization for open-source projects.

BRL-CAD – Computer graphics, 3D modeling, 3D printing, and rendering!

CCExtractor Development – Accessibility tools with a focus on subtitles.

CircuitVerse.org – Have fun exploring logic circuits right from your browser!

CloudCV – Make AI research more reproducible.

Copyleft Games – Tools and engines for making games.

Drupal – Content management software used to make many of the websites and applications you use every day.

Fedora Project – Advance Free/Open Source Software and content.

FOSSASIA – Developing open source software applications and open hardware together with a global developer community from its base in Asia, improving people’s lives and creating a sustainable future.

Haiku – Operating system that specifically targets personal computing.

JBoss Community – Community of open source projects primarily written in Java.

Liquid Galaxy project – A remarkable panoramic system and visualization tool.

MetaBrainz Foundation – Crowd sourced open data projects: MusicBrainz, BookBrainz, ListenBrainz, AcousticBrainz, CritiqueBrainz and Cover Art Archive.

Open Roberta – Online IDE introducing kids to the world of coding by teaching them how to program robots with NEPO®.

OpenMRS – Write Code, Save Lives — Open source medical records platform improving health-care in resource-constrained environments.

OpenWISP – Network management system aimed at low cost networks: from public wifi, to university wifi, mesh networks and IoT.

OSGeo – An umbrella organization for the Open Source Geospatial community.

Public Lab – Open hardware and software to help communities measure and analyze pollution.

R Project for Statistical Computing – R is a free software environment for statistical computing and graphics.

SCoRe Lab – Research lab that seeks sustainable solutions for various problems in developing countries.

Sugar Labs – Learning platform and activities for elementary education.

Systers, an AnitaB.org community – Helping women find their potential in code. You are not alone.

TensorFlow – An open-source machine learning framework for everyone.

The Julia Programming Language – A fresh approach to Technical Computing.

The Mifos Initiative – FinTech non-profit leveraging the cloud, mobile, and open source community to deliver digital financial services to the world’s 3 billion poor and underbanked.

The ns-3 Network Simulator Project – A discrete event network simulator for Internet systems, research, and education.

The Terasology Foundation – An open source voxel world - imagine the possibilities! Makers of video games and a small slew of libraries & frameworks for game development.

Wikimedia – The non-profit foundation dedicated to bringing free content to the world, operating Wikipedia and maintaining the MediaWiki software.

These 29 organizations are working diligently to create thousands of tasks for students to work on, including code, documentation, design, quality assurance, outreach, research and training tasks. The contest starts for students on December 2nd.

You can learn more about GCI on the contest site where you’ll find Frequently Asked Questions, Important Dates and other helpful information, including the Getting Started Guide.

Want to chat with other students, mentors, and organization administrators about the contest? Check out our discussion mailing list. We can’t wait to get started!

By Radha Jhatakia, Google Open Source

Why Diversity is Important in Open Source: Google's Sponsorship of OSSEU

Monday, October 28, 2019

The Google Open Source Programs Office is sponsoring the Open Source Summit + Embedded Linux Conference, which is taking place in Lyon, France. The Linux Foundation supports shared technology through open source, while the conference provides a space for developers and technologists in open source to meet, network, and share knowledge with one another in order to advance the community. Why is this of utmost importance to Google OSS? Google has been rooted in the open source community for many years, supporting programs, projects, and organizations to help advance open source software and technology. We understand the necessity of sustaining open source and the developer community in order to advance technology as a whole.

Sponsoring OSSEU is about more than just providing funds; it is about really pushing the diversity initiative in open source. We need diversity across all levels in open source, whether it’s contributors, maintainers, doc writers, or anyone supporting the project. As said recently by the Open Source Initiative, “Many perspectives makes better software.” Having previously funded diversity initiatives such as scholarships or lunches at OSS conferences, Google continues to support this cause by sponsoring the diversity lunch at OSSEU.
In particular, sessions and events that Google will be hosting while at OSSEU include a keynote on Documentation by Megan Byrd-Sanicki and the Women in Open Source Lunch, both on Tuesday, October 29, 2019. The keynote on Docs highlights the importance of doc stars and why their contributions are essential to the growth of the open source community. Our support of the women in open source lunch is especially important as we look to increase the diversity of the open source community by supporting women and non-binary persons to get more involved and have the opportunity to connect with each other at an event of this scale.

If you’re attending OSSEU, stop by the keynote, and we hope to see you at the lunch as well. If you aren’t attending this year, and are interested in getting more involved in the open source community, the summits hosted by the Linux Foundation are one of the best ways to learn more about OSS and meet passionate people involved in different OSS projects and organizations.

By Radha Jhatakia, Google OSPO

Google Summer of Code: Being Happy While Working is Possible

Thursday, October 24, 2019

GSoC Experience Series

I am proud to have been part of GSoC 2019, which was, without a doubt, a motivating experience that gives me the strength to continue improving and working in open source. I participated with the project New rules for the Topology Framework in gvSIG Desktop, and received mentoring from the OSGeo organization and the gvSIG association. Being a part of this project has been one of the best experiences I have had, both from a professional point of view and because the freedom the mentors gave me and the interaction with the community allowed me to enjoy the environment while learning. Achieving the objectives was a challenge, but thanks to the motivation and support it was possible.

With the project it was possible to implement a new set of topology rules for the validation and correction of vector data sets, which improve and extend the characteristics of previously existing tools in gvSIG. These tools allow browsing, searching and correcting validation errors. Implementing the rules automates tasks, which reduces errors and eliminates repetitive work. For more information, you can read the final report or visit the repository with all of the project's documentation, which is available in English, Spanish and Italian.

What I love about this project is working on time optimization, since time is perhaps the most precious and scarce resource. The user can focus on the logic to be solved, leaving aside repetitive tasks and optimizing the use of time.

Defining rule implementation: “Must be Coincident with”


Rule “Must be Coincident with” working to find the topological errors.

Beyond the technical contribution, what gave me the most value is the spirit of the program: it allows you to work professionally, and its motivating context really allows you to enjoy the process, which enhances the results. It was essential that, as the project progressed, the mentors were transparent and allowed me to work with more freedom; their trust and the community interaction were of great importance.

It has been a great experience and I appreciate the opportunity to participate in a project with these characteristics, which also helps optimize the use of time. I encourage anyone who is interested in adding value in any area of open source to participate in GSoC. Don’t hesitate due to your age.

By Mauro Carlevaro

It Really is a Great Learning Experience

Tuesday, October 22, 2019

GSoC Experience Series

Nearly a month ago the official results for Google Summer of Code 2019 were announced, and I am happy to say I successfully completed the program with OpenStreetMap working on the 3D renderer OSM2World.

Before even applying, when I was searching for information on it most of the resources I was able to find included the same phrase: "It is a great learning experience!"

As the almost-graduate Computer Science student I was, I had the inaccurate impression that I already had enough skills, and I doubted what the program could really offer me in terms of expanding my knowledge, since I had decided to apply to a Java project, a language I already knew.

Long story short, here is what "it is a great learning experience" translated into for me when it came to programming practices:

  • Always think about cases besides the "happy path": CS students/learners may agree with me here: practice projects do not always require making the application tolerant of wrong input. That is not the case for a large-scale application, though. One unexpected NullPointerException caused by one tiny part of the input file (in my case) can leave a user scratching their head for hours, or unable to find the root of the problem at all, since in many cases it is not where the error log indicates; on top of that, their work does not get done because of the crash. Which leads me to the second point, which I learned the hard way.
  • Make unit testing an integral part of coding routine: Yes, this, as well as other points listed here, might seem obvious to most, but until recently it was not to me. And being known as one of the less interesting tasks to perform when coding definitely doesn't help unit tests place high on programmers' "favorite things to do" lists. However, tests can most of the time detect unintended "features" other than just wrong method output, like the unexpected crashes mentioned above. So it is pretty much always better to create them soon after writing your new method rather than waiting until just before a deadline.
  • Add elements of Functional Programming to object-oriented thinking, with the most important elements to me being immutable types and side-effect-free methods (i.e. methods that do not modify state outside their local environment). I only understood the importance of this when I was suddenly able to reuse, in later tasks, methods I had written for earlier ones. And that was because I had been instructed to write them that way, without knowing beforehand that they would come in handy again.
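The three practices above can be sketched together in a few lines. This is a hypothetical illustration in Python, not code from OSM2World (which is a Java project): the `Point` type and the `parse_point` and `translate` functions are invented for this example. It shows an immutable type, a parser that rejects bad input with a clear message instead of crashing later, and a side-effect-free function that is easy to reuse and to unit test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    """Immutable 2D point: attributes cannot be reassigned after creation."""
    x: float
    y: float


def parse_point(text: Optional[str]) -> Point:
    """Parse "x,y" into a Point, covering more than the happy path."""
    if text is None:
        raise ValueError("point is missing")
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'x,y', got {text!r}")
    try:
        return Point(float(parts[0]), float(parts[1]))
    except ValueError:
        # Re-raise with a message that names the offending input.
        raise ValueError(f"non-numeric coordinate in {text!r}") from None


def translate(p: Point, dx: float, dy: float) -> Point:
    """Side-effect free: returns a new Point and leaves p untouched."""
    return Point(p.x + dx, p.y + dy)


if __name__ == "__main__":
    origin = parse_point("0,0")
    moved = translate(origin, 2.5, -1.0)
    print(origin, moved)
```

Because `translate` depends only on its arguments and changes nothing else, a unit test for it needs no setup: call it, compare the result, and check that the input is unchanged.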

This list could probably have a few more points added (it was a 3-month long program after all), but for me those are the ones that definitely deserve their spot here. And of course the above would not have been possible without my mentor's... mentoring! Instructing someone on what to do and allowing them to discover the benefits of the advice on their own, in addition to providing any necessary explanations, is definitely a way to help someone adopt practices for a lifetime! It is safe to say that the whole GSoC experience would have been different, should things have been different.

For anyone that might be interested, here is the application document I submitted as well as the final project!

By Jason Manoloudis, OpenStreetMap GSoC Student

Video Architecture Search

Monday, October 21, 2019

Video understanding is a challenging problem. Because a video contains spatio-temporal data, its feature representation is required to abstract both appearance and motion information. This is not only essential for automated understanding of the semantic content of videos, such as web-video classification or sport activity recognition, but is also crucial for robot perception and learning. Just like humans, an input from a robot’s camera is seldom a static snapshot of the world, but takes the form of a continuous video.

The abilities of today’s deep learning models are greatly dependent on their neural architectures. Convolutional neural networks (CNNs) for videos are normally built by manually extending known 2D architectures such as Inception and ResNet to 3D or by carefully designing two-stream CNN architectures that fuse together both appearance and motion information. However, designing an optimal video architecture to best take advantage of spatio-temporal information in videos still remains an open problem. Although neural architecture search (e.g., Zoph et al, Real et al) to discover good architectures has been widely explored for images, machine-optimized neural architectures for videos have not yet been developed. Video CNNs are typically computation- and memory-intensive, and designing an approach to efficiently search for them while capturing their unique properties has been difficult.

In response to these challenges, we have conducted a series of studies into automatic searches for more optimal network architectures for video understanding. We showcase three different neural architecture evolution algorithms: learning layers and their module configuration (EvaNet); learning multi-stream connectivity (AssembleNet); and building computationally efficient and compact networks (TinyVideoNet). The video architectures we developed outperform existing hand-made models on multiple public datasets by a significant margin, and demonstrate a 10x~100x improvement in network runtime.

EvaNet: The First Evolved Video Architectures

EvaNet, which we introduce in “Evolving Space-Time Neural Architectures for Videos” at ICCV 2019, is the very first attempt to design neural architecture search for video architectures. EvaNet is a module-level architecture search that focuses on finding types of spatio-temporal convolutional layers as well as their optimal sequential or parallel configurations. An evolutionary algorithm with mutation operators is used for the search, iteratively updating a population of architectures. This allows for parallel and more efficient exploration of the search space, which is necessary for video architecture search to consider diverse spatio-temporal layers and their combinations. EvaNet evolves multiple modules (at different locations within the network) to generate different architectures.
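The search loop described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the tournament-style selection, the module sizes, and especially `toy_fitness` (a placeholder, not a real accuracy measurement) are assumptions made for the example and are not the published EvaNet procedure.

```python
import random

# Layer types borrowed from the figure caption; the search space itself is a toy.
LAYER_TYPES = ["3d_conv", "2plus1d_conv", "itgm", "max_pool", "avg_pool", "conv_1x1"]


def random_architecture(rng, num_modules=4):
    """An architecture is a list of modules; each module is a list of parallel layers."""
    return [[rng.choice(LAYER_TYPES) for _ in range(rng.randint(1, 3))]
            for _ in range(num_modules)]


def mutate(arch, rng):
    """Return a copy of arch with one randomly chosen layer replaced by a random type."""
    child = [list(module) for module in arch]
    module = rng.choice(child)
    module[rng.randrange(len(module))] = rng.choice(LAYER_TYPES)
    return child


def toy_fitness(arch):
    """Placeholder score (NOT a real accuracy): counts distinct layer types per module."""
    return sum(len(set(module)) for module in arch)


def evolve(rounds=200, population_size=20, sample_size=5, seed=0):
    """Iteratively update a population of architectures by mutation."""
    rng = random.Random(seed)
    population = [random_architecture(rng) for _ in range(population_size)]
    for _ in range(rounds):
        # Assumed selection scheme: mutate the best architecture of a random sample...
        sample = rng.sample(population, sample_size)
        child = mutate(max(sample, key=toy_fitness), rng)
        # ...and let the child replace the weakest architecture of that sample.
        population.remove(min(sample, key=toy_fitness))
        population.append(child)
    return max(population, key=toy_fitness)
```

In a real search, `toy_fitness` would be replaced by training and evaluating each candidate network, which is why evaluating many candidates in parallel matters so much.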

Our experimental results confirm the benefits of such video CNN architectures obtained by evolving heterogeneous modules. The approach often finds that non-trivial modules composed of multiple parallel layers are most effective as they are faster and exhibit superior performance to hand-designed modules. Another interesting aspect is that we obtain a number of similarly well-performing, but diverse architectures as a result of the evolution, without extra computation. Forming an ensemble with them further improves performance. Due to their parallel nature, even an ensemble of models is computationally more efficient than the other standard video networks, such as (2+1)D ResNet. We have open sourced the code.


Examples of various EvaNet architectures. Each colored box (large or small) represents a layer with the color of the box indicating its type: 3D conv. (blue), (2+1)D conv. (orange), iTGM (green), max pooling (grey), averaging (purple), and 1x1 conv. (pink). Layers are often grouped to form modules (large boxes). Digits within each box indicate the filter size.

AssembleNet: Building Stronger and Better (Multi-stream) models

In “AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures”, we look into a new method of fusing different sub-networks with different input modalities (e.g., RGB and optical flow) and temporal resolutions. AssembleNet is a “family” of learnable architectures that provide a generic approach to learn the “connectivity” among feature representations across input modalities, while being optimized for the target task. We introduce a general formulation that allows representation of various forms of multi-stream CNNs as directed graphs, coupled with an efficient evolutionary algorithm to explore the high-level network connectivity. The objective is to learn better feature representations across appearance and motion visual clues in videos. Unlike previous hand-designed two-stream models that use late fusion or fixed intermediate fusion, AssembleNet evolves a population of overly-connected, multi-stream, multi-resolution architectures while guiding their mutations by connection weight learning. We are looking at four-stream architectures with various intermediate connections for the first time — 2 streams per RGB and optical flow, each one at different temporal resolutions.
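As a rough illustration of the directed-graph view, the sketch below stores connections as weighted edges between blocks and applies one mutation step that keeps strongly weighted connections and proposes a new one. The node names, the threshold, the initial weight of a new edge, and the exact mutation rule are all assumptions made for this example rather than details from the paper.

```python
import random


def mutate_connectivity(edges, levels, rng, keep_threshold=0.5):
    """Return a mutated copy of a multi-stream connectivity graph.

    edges:  dict mapping (src, dst) -> learned connection weight in [0, 1]
    levels: dict mapping block name -> depth level (edges go shallow -> deep)
    """
    # Keep only the connections whose learned weight is strong enough.
    kept = {edge: w for edge, w in edges.items() if w >= keep_threshold}
    # Propose one new connection that was not present in the parent graph.
    candidates = [(s, d) for s in levels for d in levels
                  if levels[s] < levels[d] and (s, d) not in edges]
    if candidates:
        kept[rng.choice(candidates)] = 0.5  # hypothetical neutral starting weight
    return kept
```

Usage, with hypothetical block names:

```python
levels = {"rgb_a": 1, "flow_a": 1, "mid": 2, "out": 3}
edges = {("rgb_a", "mid"): 0.9, ("flow_a", "mid"): 0.1, ("mid", "out"): 0.8}
child = mutate_connectivity(edges, levels, random.Random(0))
```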

The figure below shows an example of an AssembleNet architecture, found by evolving a pool of random initial multi-stream architectures over 50~150 rounds. We tested AssembleNet on two very popular video recognition datasets: Charades and Moments-in-Time (MiT). Its performance on MiT is the first above 34%. Its performance on Charades is even more impressive at 58.6% mean Average Precision (mAP), whereas the previous best known results are 42.5 and 45.2.



The representative AssembleNet model evolved using the Moments-in-Time dataset. A node corresponds to a block of spatio-temporal convolutional layers, and each edge specifies their connectivity. Darker edges mean stronger connections. AssembleNet is a family of learnable multi-stream architectures, optimized for the target task.


A figure comparing AssembleNet with state-of-the-art, hand-designed models on Charades (left) and Moments-in-Time (right) datasets. AssembleNet-50 or AssembleNet-101 has an equivalent number of parameters to a two-stream ResNet-50 or ResNet-101.

Tiny Video Networks: The fastest video understanding networks

For a video CNN model to be useful on devices operating in a real-world environment, such as robots, real-time, efficient computation is necessary. However, achieving state-of-the-art results on video recognition tasks currently requires extremely large networks, often with tens to hundreds of convolutional layers, that are applied to many input frames. As a result, these networks often suffer from very slow runtimes, requiring 500+ ms per 1-second video snippet on a contemporary GPU and 2000+ ms on a CPU. In Tiny Video Networks, we address this by automatically designing networks that provide comparable performance at a fraction of the computational cost. Our Tiny Video Networks (TinyVideoNets) achieve competitive accuracy and run efficiently, at real-time or better speeds, within 37 to 100 ms on a CPU and 10 ms on a GPU per ~1-second video clip, hundreds of times faster than other contemporary human-designed models.

These performance gains are achieved by explicitly considering model runtime during the architecture evolution and by forcing the algorithm to include spatial or temporal resolution and channel size in the search space in order to reduce computation. The figure below illustrates two simple but very effective architectures found by TinyVideoNet. Interestingly, the learned model architectures have fewer convolutional layers than typical video architectures: Tiny Video Networks prefer lightweight elements, such as 2D pooling, gating layers, and squeeze-and-excitation layers. Further, TinyVideoNet is able to jointly optimize parameters and runtime to provide efficient networks that can be used by future network exploration.
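One simple way to picture "considering the runtime during evolution" is a score that rejects any candidate exceeding a runtime budget. The sketch below is an assumption about how such a constraint could be expressed, not the scoring actually used for TinyVideoNet, and the candidate numbers in the usage example are made up.

```python
def constrained_score(accuracy, runtime_ms, budget_ms):
    """Return the accuracy for candidates within budget; reject slower ones outright."""
    return accuracy if runtime_ms <= budget_ms else float("-inf")


def best_within_budget(candidates, budget_ms):
    """candidates: list of (name, accuracy, runtime_ms). Returns a name, or None."""
    if not candidates:
        return None
    scored = [(constrained_score(acc, ms, budget_ms), name)
              for name, acc, ms in candidates]
    score, name = max(scored)
    return name if score != float("-inf") else None
```

Usage, with illustrative values only:

```python
candidates = [("large", 0.80, 900.0), ("medium", 0.72, 60.0), ("small", 0.65, 30.0)]
best_within_budget(candidates, budget_ms=100.0)  # picks "medium"
```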






TinyVideoNet (TVN) architectures evolved to maximize recognition performance while keeping their computation time within the desired limit. For instance, TVN-1 (top) runs at 37 ms on a CPU and 10 ms on a GPU. TVN-2 (bottom) runs at 65 ms on a CPU and 13 ms on a GPU.


CPU runtime of TinyVideoNet models compared to prior models (left) and runtime vs. model accuracy of TinyVideoNets compared to (2+1)D ResNet models (right). Note that TinyVideoNets take a part of this time-accuracy space where no other models exist, i.e., extremely fast but still accurate.

Conclusion

To our knowledge, this is the very first work on neural architecture search for video understanding. The video architectures we generate with our new evolutionary algorithms outperform the best known hand-designed CNN architectures on public datasets, by a significant margin. We also show that learning computationally efficient video models, TinyVideoNets, is possible with architecture evolution. This research opens new directions and demonstrates the promise of machine-evolved CNNs for video understanding.

Acknowledgements

This research was conducted by Michael S. Ryoo, AJ Piergiovanni, and Anelia Angelova. Alex Toshev and Mingxing Tan also contributed to this work. We thank Vincent Vanhoucke, Juhana Kangaspunta, Esteban Real, Ping Yu, Sarah Sirajuddin, and the Robotics at Google team for discussion and support.

Bazel Reaches 1.0 Milestone!

Thursday, October 17, 2019

We're excited to announce the first General Availability release of Bazel, an open source build system designed to support a wide variety of programming languages and platforms.

Bazel was born of Google's own needs for highly scalable builds. When we open sourced Bazel back in 2015, we hoped that Bazel could fulfill similar needs in the software development industry. A growing list of Bazel users attests to the widespread demand for scalable, reproducible, and multi-lingual builds. Bazel helps Google be more open too: several large Google open source projects, such as Angular and TensorFlow, use Bazel. Users have reported 3x test time reductions and 10x faster build speeds after switching to Bazel.
With the 1.0 release we’re continuing to implement Bazel's vision:
  • Bazel builds are fast and correct. Every build and test run is incremental, on your developers’ machines and on your CI test system.
  • Bazel supports multi-language, multi-platform builds and tests. You can run a single command to build and test your entire source tree, no matter which combination of languages and platforms you target.
  • Bazel provides a uniform extension language, Starlark, to define builds for any language or platform.
  • Bazel works across all major development platforms (Linux, macOS, and Windows).
  • Bazel allows your builds to scale—it connects to distributed remote execution and caching services.
The key features of the 1.0 GA release are:
  • Semantic Versioning: Starting with Bazel 1.0, we will use semantic versioning for all Bazel releases. For example, all 1.x releases will be backwards-compatible with Bazel 1.0. We will have a window of at least three months between major (breaking) releases. We'll continue to publish minor releases of Bazel every month, cutting from GitHub HEAD.
  • Long-Term Support: Long-Term Support (LTS) releases give users confidence that the Bazel team has the capacity and the process to quickly and safely deliver fixes for critical bugs, including vulnerabilities.
  • Well-rounded features for Angular, Android, Java, and C++: The new features include end-to-end support for remote execution and caching, and support for standard package managers and third-party dependencies.
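The semantic versioning promise above ("all 1.x releases will be backwards-compatible with Bazel 1.0") boils down to a simple rule that can be sketched as follows. The helper names and the plain `MAJOR.MINOR.PATCH` string format are assumptions for illustration; this is not a Bazel API.

```python
def parse_version(text):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_backwards_compatible(installed, required):
    """True if a build written for `required` should still work with `installed`.

    Under semantic versioning this holds when the major version matches and
    the installed release is at least as new as the required one.
    """
    inst, req = parse_version(installed), parse_version(required)
    return inst[0] == req[0] and inst >= req
```

For example, `is_backwards_compatible("1.2.0", "1.0.0")` is true, while a hypothetical `"2.0.0"` release would not be assumed compatible with `"1.0.0"`.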
New to Bazel? Try the tutorial for your favorite language to get started.

With the 1.0 release we still have many exciting developments ahead of us. Follow our blog or Twitter account for regular updates. Feel free to contact us with questions or feedback on the mailing list, submit feature requests (and report bugs) in our GitHub issue tracker, and join our Slack channel. Finally, join us at the largest-ever BazelCon conference in December 2019 for an opportunity to meet other Bazel users and the Bazel team at Google, see demos and tech talks, and learn more about fast, correct, and large-scale builds.

Last but not least, we wouldn't have gotten here without the continued trust, support, encouragement, and feedback from the community of Bazel users and contributors. Heartfelt thanks to all of you from the Bazel team!

By Dmitry Lomov, Bazel Team

Google Code-in 2019 Org Applications are Open!

Thursday, October 10, 2019

We are now accepting applications from open source organizations interested in participating in Google Code-in 2019, the tenth year of the contest. Google Code-in (GCI) invites pre-university students ages 13-17 to learn hands-on by contributing to open source software.

Each year we have heard inspiring stories from the participating mentors about their commitment to working with young students. We only select organizations that have participated in Google Summer of Code because they have gained experience in mentorship and know how to provide a support system for these new, young contributors.

Organization applications are now open and all interested open source organizations must apply before Monday, October 28, 2019 at 17:00 UTC.

In 2018, 27 organizations were accepted—9 of which were participating in GCI for the first time! Over the last 9 years, 11,232 students from 108 countries have completed more than 40,000 tasks for participating open source projects. Tasks fall into 5 categories:
  • Code: writing or refactoring.
  • Documentation/Training: creating/editing documents and helping others learn more.
  • Outreach/Research: community management, outreach/marketing, or studying problems and recommending solutions.
  • Quality Assurance: testing and ensuring code is of high quality.
  • Design: graphic design or user interface design.
Once an organization is selected for Google Code-in 2019, it will define these tasks and recruit mentors from its community who are interested in providing online support for students during the seven-week contest.

You can find a timeline, FAQ and other information about Google Code-in on our website. If you’re an educator interested in sharing Google Code-in with your students, please see the resources here.

By Radha Jhatakia, Google Open Source

Understanding Scheduling Behavior with SchedViz

Wednesday, October 9, 2019

Linux kernel scheduling behavior can be a key factor in application responsiveness and system utilization. Today, we’re announcing SchedViz, a new tool for visualizing Linux kernel scheduling behavior. We’ve used it inside Google to discover many opportunities for better scheduling choices and to root-cause many latency issues.

Thread Scheduling

Modern OSs execute multiple processes concurrently, by running each for a brief burst, then switching to the next: a feature called multiprogramming. Modern processors include multiple cores, each of which can run its own thread, known as multiprocessing. When these two features are combined, a new engineering challenge emerges: when should a thread run? How long should it run, and on what processor? This thread scheduling strategy is a complex problem, and can have a significant effect on performance. In particular, threads that don't get scheduled to run can suffer starvation, which can adversely affect user-visible latencies.

In an ideal system, a simple strategy of assigning chunks of CPU-time to threads in a round-robin manner would maximize fairness by ensuring all threads are equally starved. But, of course, real systems are far from ideal, and this view of fairness may not be an appropriate performance goal. Here are a few factors that make scheduling tricky:
  • Not all threads are equally important. Each thread has a priority that specifies its importance relative to other threads. Thread priorities must be selected carefully, and the scheduler must honor those selections.
  • Not all cores are equal. The structure of the memory hierarchy can make it costly to shift a thread from one core to another, especially if that shift moves it to a new NUMA node. Users can explicitly pin threads to a CPU or a set of CPUs, or can exclude threads from specific CPUs, using features like sched_setaffinity or cgroups. But such restrictions also make scheduling even tougher.
  • Not all threads want to run all the time. Threads may sleep waiting for some event, yielding their core to other execution. When the event occurs, pending threads should be scheduled quickly.
SchedViz permits you to observe real scheduling behavior. Comparing this with the expected or desired behavior can point to specific problems and possible solutions.
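As a small illustration of the pinning mentioned above, Python exposes the Linux affinity calls as `os.sched_getaffinity` and `os.sched_setaffinity`. The sketch below assumes a Linux host (these functions are not available on every platform), and the helper name `pin_current_process` is hypothetical.

```python
import os


def pin_current_process(cpus):
    """Restrict the calling process to the CPUs in `cpus` (Linux only).

    Returns the previous affinity set so that the caller can restore it.
    """
    previous = os.sched_getaffinity(0)  # pid 0 refers to the caller
    os.sched_setaffinity(0, cpus)
    return previous
```

A caller would typically keep the returned set and pass it back to `os.sched_setaffinity(0, previous)` once the pinned work is finished.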

Tracepoints and Kernel Tracing

The Linux kernel is instrumented with hooks called tracepoints; when certain actions occur, any code hooked to the relevant tracepoint is called with arguments that describe the action. The kernel also provides a debug feature that can trace this data and stream it to a buffer for later analysis.

Hundreds of different tracepoints exist, arranged into families of related function. The sched family includes tracepoints that can reconstruct thread scheduling behavior—when threads switched in, blocked on some event, or migrated between cores. These sched tracepoints provide fine-grained and comprehensive detail about thread scheduling behavior over a short period of traced execution.
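To give a feel for how switch events reconstruct scheduling behavior, here is a minimal sketch that turns a list of context-switch records into per-thread running intervals. The record layout `(timestamp, cpu, prev_tid, next_tid)` is a simplification assumed for this example; it is loosely modeled on the kernel's sched_switch event and is not SchedViz's internal format.

```python
from collections import defaultdict


def running_intervals(switch_events, trace_end):
    """Rebuild running intervals from context-switch records.

    switch_events: iterable of (timestamp, cpu, prev_tid, next_tid), sorted by time
    trace_end:     timestamp at which the trace stops
    Returns {tid: [(cpu, start, end), ...]}.
    """
    current = {}  # cpu -> (tid, start) for the thread now running on that CPU
    intervals = defaultdict(list)
    for timestamp, cpu, _prev_tid, next_tid in switch_events:
        if cpu in current:
            tid, start = current[cpu]
            intervals[tid].append((cpu, start, timestamp))
        current[cpu] = (next_tid, timestamp)
    # Threads still running when the trace ends are closed at trace_end.
    for cpu, (tid, start) in current.items():
        intervals[tid].append((cpu, start, trace_end))
    return dict(intervals)
```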

SchedViz: Visualize Thread Scheduling Over Time

SchedViz provides an easy way to gather kernel scheduling traces from hosts, and visualize those traces over time. Tracing is simple:
$ sudo ./trace.sh -capture_seconds 5 -out ~/traces
Then, importing the resulting collection into SchedViz takes just one click.


Once imported, a collection will always be available for later viewing, until you delete it.

The SchedViz UI displays collections in several ways. A zoomable and pannable heatmap shows system cores on the y-axis and the trace duration on the x-axis. Each core in the system has a swim-lane, and each swim-lane shows CPU utilization (when that CPU is being kept busy) and wait-queue depth (how many threads are waiting to run on that CPU). The UI also includes a thread list that displays which threads were active in the heatmap, along with how long they ran, waited to run, and blocked on some event, and how many times they woke up or migrated between cores. Individual threads can be selected to show their behavior over time, or expanded to see their details.

Using SchedViz to Identify Antagonisms: Not all threads are equally important

Antagonism describes the situation in which a victim thread is ready to run on some CPU, while another antagonist thread runs on that same CPU. Long antagonisms, or high cumulative duration of antagonisms, can degrade user experience or system efficiency by making a critical process unavailable at critical times.

Antagonist analysis is useful when threads are meant to have exclusive access to some core but don’t get it. In SchedViz, such antagonisms are listed in each thread’s summary, as well as being immediately visible as breaks in the victim thread's running bar. Zooming in reveals exactly what work is interfering.
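The definition of an antagonism can be expressed as an interval overlap: a victim's waiting interval on a CPU intersected with another thread's running interval on that same CPU. The following sketch assumes intervals are given as `(cpu, start, end)` tuples; the function and parameter names are hypothetical and do not reflect how SchedViz computes its summaries.

```python
def antagonisms(victim_waiting, others_running):
    """Find periods where a victim waited while another thread ran on the same CPU.

    victim_waiting: [(cpu, start, end)] intervals where the victim was ready to run
    others_running: {tid: [(cpu, start, end)]} running intervals of other threads
    Returns [(antagonist_tid, cpu, start, end)] sorted by start time.
    """
    found = []
    for cpu, wait_start, wait_end in victim_waiting:
        for tid, runs in others_running.items():
            for run_cpu, run_start, run_end in runs:
                if run_cpu != cpu:
                    continue
                start, end = max(wait_start, run_start), min(wait_end, run_end)
                if start < end:  # the two intervals actually overlap
                    found.append((tid, cpu, start, end))
    return sorted(found, key=lambda item: item[2])


def total_antagonized_time(found):
    """Cumulative duration of a list of antagonisms."""
    return sum(end - start for _tid, _cpu, start, end in found)
```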

Several antagonisms affect a thread that wants its CPU exclusively.
Root-causing an antagonism via zooming in.

Round-robin queueing, in which two or more threads, each wanting to run most or all of the time, occupy a single CPU for a period of time, also yields antagonisms. In this case, the scheduler attempts to avert starvation by giving multiple threads short time-slices to run in a round-robin manner. This reduces the throughput of affected threads while introducing often-significant, repeating, latencies. It is a sign that some portion of the system is overloaded.

In SchedViz, round-robin scheduling appears as a sequence of fixed-size intervals in which the running thread, and the set of waiting threads, changes with each interval. The SchedViz UI makes it easy to better understand what caused this phenomenon.
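A tiny single-CPU simulation shows why round-robin queueing produces the repeating fixed-size intervals described above. This is a deliberately simplified model with a fixed time slice and no priorities, assumed for illustration; it is not how the Linux scheduler is implemented.

```python
from collections import deque


def round_robin(demands, time_slice):
    """Simulate round-robin scheduling on one CPU.

    demands: {thread_name: cpu_time_needed}
    Returns the schedule as [(thread_name, start, end)].
    """
    queue = deque(demands.items())
    now = 0
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)
        schedule.append((name, now, now + run))
        now += run
        if remaining > run:
            # Not finished yet: go to the back of the queue and wait again.
            queue.append((name, remaining - run))
    return schedule
```

With two threads that each need 6 units of CPU time and a slice of 3, `round_robin({"a": 6, "b": 6}, 3)` alternates `a`, `b`, `a`, `b`, so `a` finishes at time 9 and `b` at time 12, even though each would have finished at time 6 on a CPU of its own.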

An overloaded CPU with two threads engaged in round-robin queueing. Running intervals are shown as ovals at top; waiting intervals as rectangles at bottom.
Zooming out and viewing more CPUs reveals that round-robin queueing started when a thread migrated into the overtaxed CPU.

Using SchedViz to Identify NUMA Issues: Not all cores are equal

Larger servers often have several NUMA nodes; a CPU can access a subset of memory (the DRAM local to its NUMA node) more quickly than other memory (other nodes' DRAMs). This non-uniformity is a practical consequence of growing core count, but it brings challenges.

On the one hand, a thread migrated away from the DRAM that holds most of its state will suffer, since it will then have to pay an extra tax for each DRAM access. SchedViz can help identify cases like this, making it clear when a thread has had to migrate across NUMA boundaries.

On the other hand, it is important to ensure that all NUMA nodes in a system are well-balanced, lest part of the machine be overloaded while another part sits idle.
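Spotting cross-node migrations is straightforward once a thread's running intervals and the CPU-to-node layout are known. The sketch below assumes both are available as plain Python data; the `cpu_to_node` mapping in the usage example is hypothetical and would come from the machine's actual topology.

```python
def cross_node_migrations(thread_runs, cpu_to_node):
    """List the migrations of one thread that cross a NUMA boundary.

    thread_runs: [(cpu, start, end)] running intervals, sorted by start
    cpu_to_node: {cpu: numa_node}
    Returns [(time, from_node, to_node)].
    """
    crossings = []
    for (prev_cpu, _, _), (cpu, start, _) in zip(thread_runs, thread_runs[1:]):
        src, dst = cpu_to_node[prev_cpu], cpu_to_node[cpu]
        if src != dst:
            crossings.append((start, src, dst))
    return crossings
```

Usage, on an assumed four-CPU, two-node machine:

```python
cpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1}
runs = [(0, 0, 5), (1, 5, 9), (2, 9, 12), (3, 12, 20), (0, 20, 25)]
cross_node_migrations(runs, cpu_to_node)  # two crossings, at times 9 and 20
```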

A thread (in yellow) risks higher-latency memory accesses as it migrates across NUMA nodes.
A system risks both under-utilization and increased latency due to NUMA imbalance.

Beyond Scheduling

Many issues can be identified and explored using only sched tracepoints. But, there are many tracepoints, reflecting a wide variety of phenomena. Many of these tracepoints go well with scheduling data. For example, irq events can reveal when thread running time is spent handling interrupts; sys events can help reveal when execution moves into the kernel, and what it’s doing there; and workqueue events can show when kernel work is underway, and what work is being done. SchedViz presently offers limited support for visualizing these non-sched tracepoint families, but improving that support is an active area of development for us.

Google Summer of Code 2019 (Statistics Part 2)

Monday, September 30, 2019

2019 has been an epic year for Google Summer of Code as we celebrated 15 years of connecting university students from around the globe with 201 open source organizations big and small.

We want to congratulate the 1,134 students who completed GSoC 2019. Great work everyone!

Now that GSoC 2019 is over we would like to wrap up the program with some more statistics to round out the year.

Student Registrations

We had 30,922 students from 148 countries register for GSoC 2019 (that’s a 19.5% increase in registrations over last year, the previous record). Interest in GSoC clearly continues to grow and we’re excited to see it growing in all parts of the world.

For the first time ever we had students register from Bhutan, Fiji, Grenada, Papua New Guinea, South Sudan, and Swaziland.

Universities

The 1,276 students accepted into the GSoC 2019 program hailed from 6586 universities, 164 of which have students participating in GSoC for the first time.

Schools with the most accepted students for GSoC 2019:

University | # of Accepted Students
Indian Institute of Technology, Roorkee | 48
International Institute of Information Technology - Hyderabad | 29
Birla Institute of Technology and Science, Pilani (BITS Pilani) | 27
Guru Gobind Singh Indraprastha University (GGSIPU Dwarka) | 20
Indian Institute of Technology, Kanpur | 19
Indian Institute of Technology, Kharagpur | 19
Amrita University / Amrita Vishwa Vidyapeetham | 14
Delhi Technological University | 11
Indian Institute of Technology, Bombay | 11
Indraprastha Institute of Information and Technology, New Delhi | 11

Mentors

Each year we pore over gobs of data to extract some interesting statistics about the GSoC mentors. Here’s a quick synopsis of our 2019 crew:
  • Registered mentors: 2,815
  • Mentors with assigned student projects: 2,066
  • Mentors who have participated in GSoC for 10 or more years: 70
  • Mentors who have been a part of GSoC for 5 years or more: 307
  • Mentors that are former GSoC students: 691
  • Mentors that have also been involved in the Google Code-in program: 498
  • Percentage of new mentors: 35.84%
GSoC 2019 mentors are from all parts of the world, representing 81 countries!

Every year thousands of GSoC mentors help introduce the next generation to the world of open source software development, and for that we are forever grateful. We cannot stress enough that without our invaluable mentors the GSoC program would not exist. Mentorship is why GSoC has remained strong for 15 years: the relationships built between students and mentors have helped sustain the program and many of these communities. Sharing their passion for open source, our mentors have paved the road for generations of contributors to enter open source development.

Thank you to all of our mentors, organization administrators, and all of the “unofficial” mentors who help in our open source organizations’ communities. Google Summer of Code is a community effort and we appreciate each and every one of you.

By Stephanie Taylor, Google Open Source

Unleashing Open Source Silicon

Friday, September 27, 2019

Open Source Silicon

We all know that open source software has changed the fundamental nature of the software industry and that Google generously adds fuel to this culture of openness and community through Google Summer of Code. What few people realize is that there is another major industry that is ripe for an open source overhaul—the silicon industry. And, this summer, a Google Summer of Code student helped open the floodgates.

If you search social media for “open source silicon,” you’ll find a few dozen names that pop up with some frequency. These folks are fanatically driving forward with open source circuit models and software for creating them. You’ll also find people clambering to jump aboard the RISC-V bandwagon. RISC-V, like x86, MIPS, and others before it, is a CPU “instruction set architecture,” and the mere fact that it is free of proprietary licenses has inspired countless open source implementations and an industry shake-up that has ARM quaking in its boots.

While this open source silicon community is a hotbed of enthusiasm, it is several decades behind the world of open source software. In this post, I’ll reveal the three reasons this movement has, thus far, not been able to take off like open source software, and I’ll explain why these three obstacles are all coming to a very sudden and dramatic end that will unleash a tidal wave and catch the silicon industry by surprise. And you’ll see that Google Summer of Code, this year, played a pivotal role.

What’s Standing in the Way

So, why is coding and sharing circuit models any different from sharing software? Three reasons:
  1. Implementation Details: There’s more to worry about with hardware than software. Correct functionality is far from the only concern. Particular care must be given to physical implementation. And this detail must be modified for specific silicon technology and design constraints. As a result, leveraging open source logic can involve a substantial amount of rework.
  2. Access to software: While compilers for software tend to be open source, electronic design automation (EDA) tools for compiling hardware are traditionally proprietary and prohibitively expensive.
  3. Access to hardware: Unlike software, circuit models must be turned into silicon to be useful. Fabricating custom silicon is out of the question for a hobbyist, but field-programmable gate arrays (FPGAs) provide a more realistic option. These are chips that can be quickly reconfigured, or “programmed,” to implement any logic function. While FPGAs are within reach, they still cost money, and they are delivered by postal service, not a web browser. And, worst of all, it could take weeks to get an FPGA platform up and running and communicating with the open source logic.

Breaking Down the Barriers

Let’s look at what the open source community is doing to help:
  1. Implementation Details: There is a trend toward designing more abstractly, and leaving the details to tools. Open source tools can now compile C++ into silicon (with caveats). And several open source hardware description languages leverage modern software language innovations that make it easier to rework implementation details. The open source community has shown a greater willingness than industry to explore and adopt these languages. Though hardware remains fundamentally different from software, their differences are becoming less prominent.
  2. Access to software: Open source EDA software has marked some significant achievements in the past several years. Circuit designs have been implemented on FPGAs using 100% free and open source EDA tools. (Google Summer of Code has helped to fund a few open source EDA capabilities in projects under the Free and Open Source Silicon Foundation.) The US government has recognized the opportunity and is providing significant fuel to the fire through the Posh Open Source Hardware initiative. Being restricted to open source software can still be a bit limiting, but it is no longer prohibitive.
  3. Access to hardware: Hmmm. This is still a problem.
My personal contributions to this open source silicon movement stem from my startup, Redwood EDA. We directly target problem #1 by providing tools that support advanced (yet simpler) circuit modeling techniques. And, to address #2, we make all of our software freely available online for open source development. But neither open source EDA nor the efforts of my startup had been able to noticeably impact problem #3, access to hardware.

This is where bigger forces have stepped in. In the past few years, cloud providers have begun incorporating FPGAs into their datacenters. These are available to anyone with an internet connection and a credit card, bundled with industry-class EDA software, on a pay-per-use basis. Wow! This is the solution to hardware access! An open source developer can provide not only their hardware model but also the platform for which their model was designed. A user can download and go, just like they can with software! …in theory.

So here’s the rub. The learning curve for cloud FPGA platforms has been way too high for the open source community to latch on.

Our Project

With a bit of help from Politecnico di Milano’s NECST Lab and ThroughPuter Inc., I was able to get a project off the ground, and it attracted some attention for this year’s Google Summer of Code. I was happy to see an application from Ákos Hadnagy, who had done some other ground-breaking work with me in the last Summer of Code, and he was accepted into the program. Together, this summer, we built infrastructure, automated flows, and wrote documentation (or more to the point, eliminated documentation). Instead of taking a month to ramp up, it is now possible to develop for this platform in a matter of minutes!
We dubbed our framework “1st CLaaS,” where we have coined the term “CLaaS” for custom logic as a service. Very simply, 1st CLaaS wraps a developer’s custom FPGA logic as a microservice. Standard web protocols can be used to stream bits to and from this logic, and platform details are hidden by the framework.

Implications and Wrap-up

So there is no longer anything standing in the way! Hobbyists can build and share hardware, and open source silicon can thrive. Just imagine the disruption this will have on the industry, which is currently driven by corporate giants. And with easy web integration, the opportunity and demand for hardware acceleration should rise, and we could start to see some interesting new capabilities on the web that were not imaginable until now.

Google certainly didn’t have this specific industry transformation in mind when starting Google Summer of Code, but I suspect the whole point of the program was to inspire and enable the unexpected. And it did!

If you’d like to contribute to 1st CLaaS or collaborate on some of the world’s first FPGA-accelerated web applications, we’d be more than happy to have you involved. I look forward to next year's applications.

By Steve Hoover, Redwood EDA, Google Summer of Code mentor