opensource.google.com

Menu

Posts from 2019

Announcing Docsy: A Website Theme for Technical Documentation

Wednesday, July 10, 2019

Have you ever struggled with the process of creating documentation for an open source project? Do you have an open source project that's outgrown its README? Open source projects need great docs to succeed, but great open source doc sites aren't always easy to produce and share.

Google supports over 2000 open source projects, and there has been growing demand from these projects for tooling and guidance to help them write and publish their documentation. To meet this need we created Docsy: a documentation website with templates and guidance for documentation, which we’re open sourcing to the public to use and help improve the tool.

Docsy builds on existing open source tools, like Hugo, and our experience with open source docs, providing a fast and easy way to stand up an OSS documentation website with features specifically designed to support technical documentation. Special features include everything from site navigation to multi-language support – with easy site deployment options provided by Hugo. We also created guidance on how to add additional pages, structure your documentation, and accept community contributions, all with the goal of letting you focus on creating great content.

Who’s using it?

The Kubeflow, Knative, and Agones websites were built using the Docsy theme, with more projects in the pipeline. We’ve also created an example site that uses lots of Docsy features for you to explore and copy.

Ready to get started?

Visit the Docsy site to find out how to create your first site with Docsy! You can either use Docsy like a regular Hugo theme, or clone our example site. Docsy is an open source project—of course—and we welcome your issues, suggestions, and contributions!

Come Meet the Google Open Source Team at OSCON!

Tuesday, July 9, 2019

Google Cloud is proud to be a Diamond Sponsor at OSCON, and we’re excited for another year of connecting, learning, and sharing with the open source community! Google is deeply grateful to all of your amazing open source efforts, so to celebrate, our booth will have an Open Gratitude wall where we will acknowledge your contributions, and where we encourage you to express your gratitude for those who have helped you in open source!

Once you’ve recognized your open source heroes on the Open Gratitude wall, stick around at the Google Open Source booth to learn about topics such as open source governance, documentation, open source in ML and gaming, encouraging non-code contributions, and about Google’s open source outreach programs in general. At our booth sessions you can also explore open source projects such as Kubernetes, Istio, Go, and Beam (as well as other Apache projects). Booth office hours run from 10:15am to 7pm Wednesday, July 17, and from 10:15am to 4:10pm on Thursday, July 18. The full schedule will be posted at the booth—please come by and check it out!

In addition to the events at the booth, the Google open source team has two workshops on Tuesday, July 16:
This half-day workshop kicks off with an overview of research-backed documentation best practices. Andrew Chen, Erin McKean, and Aizhamal Nurmamat kyzy lead you through a hands-on exercise in which you'll create the skeleton of a ready-to-deploy documentation website for your open source project.
Paris Pittman takes you through the ins and outs of the Kubernetes contributor community so you can land your first PR. You'll learn about SIGs, the GitHub workflow, its automation and continuous integration (CI), setting up your dev environment, and much more. Stick around until the end, and you'll have time to work on your first PR with the help of current contributors.
We also hope you attend the main conference sessions presented by Googlers, especially the keynotes on Wednesday (Built to last: What Google and Microsoft have learned growing open source communities) and Thursday (Be a Docs Star), and the sessions on Wednesday:
And Thursday:
As part of our commitment to creating a diverse and inclusive community, we’ve redirected our conference swag budget into diversity scholarships. (We believe you’d prefer to have more interesting conversations with a wider range of people over another pair of socks!) But if you are looking for a souvenir of your time in Portland there will be a special Portland-themed sticker featuring Pancakes, the (extremely adorable) gRPC mascot, and we encourage projects to take and leave stickers in our sticker-swap space!

OSCON is one of the highlights of the year for those of us who love open source—we’re thrilled to be able to share what we’ve learned with you, and to learn what you’re interested in and excited about (and also what you think could improve). See you in Portland!

Truth 1.0: Fluent Assertions for Java and Android Tests

Monday, July 8, 2019

Software testing is important—and sometimes frustrating. The frustration can come from working on innately hard domains, like concurrency, but too often it comes from a thousand small cuts:
assertEquals("Message has been sent", getString(notification, EXTRA_BIG_TEXT));
assertTrue(
    getString(notification, EXTRA_TEXT)
        .contains("Kurt Kluever <kak@google.com>"));
The two assertions above test almost the same thing, but they are structured differently. The difference in structure makes it hard to identify the difference in what's being tested.
A better way to structure these assertions is to use a fluent API:
assertThat(getString(notification, EXTRA_BIG_TEXT))
    .isEqualTo("Message has been sent");
assertThat(getString(notification, EXTRA_TEXT))
    .contains("Kurt Kluever <kak@google.com>");
A fluent API naturally leads to other advantages:
  • IDE autocompletion can suggest assertions that fit the value under test, including rich operations like containsExactly(permission.SEND_SMS, permission.READ_SMS).
  • Failure messages can include the value under test and the expected result. Contrast this with the assertTrue call above, which lacks a failure message entirely.
Google's fluent assertion library for Java and Android is Truth. We're happy to announce that we've released Truth 1.0, which stabilizes our API after years of fine-tuning.



Truth started in 2011 as a Googler's personal open source project. Later, it was donated back to Google and cultivated by the Java Core Libraries team, the people who bring you Guava.
You might already be familiar with assertion libraries like Hamcrest and AssertJ, which provide similar features. We've designed Truth to have a simpler API and more readable failure messages. For example, here's a failure message from AssertJ:
java.lang.AssertionError:
Expecting:
  <[year: 2019
month: 7
day: 15
]>
to contain exactly in any order:
  <[year: 2019
month: 6
day: 30
]>
elements not found:
  <[year: 2019
month: 6
day: 30
]>
and elements not expected:
  <[year: 2019
month: 7
day: 15
]>
And here's the equivalent message from Truth:
value of:
    iterable.onlyElement()
expected:
    year: 2019
    month: 6
    day: 30

but was:
    year: 2019
    month: 7
    day: 15
For more details, read our comparison of the libraries, and try Truth for yourself.

Also, if you're developing for Android, try AndroidX Test. It includes Truth extensions that make assertions even easier to write and failure messages even clearer:
assertThat(notification).extras().string(EXTRA_BIG_TEXT)
    .isEqualTo("Message has been sent");
assertThat(notification).extras().string(EXTRA_TEXT)
    .contains("Kurt Kluever <kak@google.com>");
Coming soon: Kotlin users of Truth can look forward to Kotlin-specific enhancements.
By Chris Povirk, Java Core Libraries

Google's robots.txt Parser is Now Open Source

Monday, July 1, 2019

Originally posted on the Google Webmaster Central Blog

For 25 years, the Robots Exclusion Protocol (REP) was only a de-facto standard. This had frustrating implications sometimes. On one hand, for webmasters, it meant uncertainty in corner cases, like when their text editor included BOM characters in their robots.txt files. On the other hand, for crawler and tool developers, it also brought uncertainty; for example, how should they deal with robots.txt files that are hundreds of megabytes large?

Today, we announced that we're spearheading the effort to make the REP an internet standard. While this is an important step, it means extra work for developers who parse robots.txt files.

We're here to help: we open sourced the C++ library that our production systems use for parsing and matching rules in robots.txt files. This library has been around for 20 years and it contains pieces of code that were written in the 90's. Since then, the library evolved; we learned a lot about how webmasters write robots.txt files and corner cases that we had to cover for, and added what we learned over the years also to the internet draft when it made sense.

We also included a testing tool in the open source package to help you test a few rules. Once built, the usage is very straightforward:

robots_main <robots.txt content> <user_agent> <url>

If you want to check out the library, head over to our GitHub repository for the robots.txt parser. We'd love to see what you can build using it! If you built something using the library, drop us a comment on Twitter, and if you have comments or questions about the library, find us on GitHub.

Posted by Edu Pereda, Lode Vandevenne, and Gary, Search Open Sourcing team

Security Crawl Maze: An Open Source Tool to Test Web Security Crawlers

Friday, June 21, 2019

Scanning modern web applications for security vulnerabilities can be a difficult task, especially if they are built with Javascript frameworks, which is why crawlers have to use a multi-stage crawling approach to discover all the resources on modern websites.

Living in the times of dynamically changing specifications and the constant appearance of new frameworks, we often have to adjust our crawlers so that they are able to discover new ways in which developers can link resources from their applications. The issue we face in such situations is measuring if changes to crawling logic improve the effectiveness. While working on replacing a crawler for a web security scanner that has been in use for a number of years, we found we needed a universal test bed, both to test our current capabilities and to discover cases that are currently missed. Inspired by Firing Range, today we’re announcing the open-source release of Security Crawl Maze – a universal test bed for web security crawlers.

Security Crawl Maze is a simple Python application built with the Flask framework that contains a wide variety of cases for ways in which a web based application can link other resources on the Web. We also provide a Dockerfile which allows you to build a docker image and deploy it to an environment of your choice. While the initial release is covering the most important cases for HTTP crawling, it’s a subset of what we want to achieve in the near future. You’ll soon be able to test whether your crawler is able to discover known files (robots.txt, sitemap.xml, etc…) or crawl modern single page applications written with the most popular JS frameworks (Angular, Polymer, etc.).

Security crawlers are mostly interested in code coverage, not in content coverage, which means the deduplication logic has to be different. This is why we plan to add cases which allow for testing if your crawler deduplicates URLs correctly (e.g. blog posts, e-commerce). If you believe there is something else, feel free to add a test case for it – it’s super simple! Code is available on GitHub and through a public deployed version.

We hope that others will find it helpful in evaluating the capabilities of their crawlers, and we certainly welcome any contributions and feedback from the broader security research community.

By Maciej Trzos, Information Security Engineer

Introducing TensorNetwork, an Open Source Library for Efficient Tensor Calculations

Friday, June 7, 2019

Originally posted on the Google AI Blog.

Many of the world's toughest scientific challenges, like developing high-temperature superconductors and understanding the true nature of space and time, involve dealing with the complexity of quantum systems. What makes these challenges difficult is that the number of quantum states in these systems is exponentially large, making brute-force computation infeasible. To deal with this, data structures called tensor networks are used. Tensor networks let one focus on the quantum states that are most relevant for real-world problems—the states of low energy, say—while ignoring other states that aren't relevant. Tensor networks are also increasingly finding applications in machine learning (ML). However, there remain difficulties that prohibit them from widespread use in the ML community: 1) a production-level tensor network library for accelerated hardware has not been available to run tensor network algorithms at scale, and 2) most of the tensor network literature is geared toward physics applications and creates the false impression that expertise in quantum mechanics is required to understand the algorithms.

In order to address these issues, we are releasing TensorNetwork, a brand new open source library to improve the efficiency of tensor calculations, developed in collaboration with the Perimeter Institute for Theoretical Physics and X. TensorNetwork uses TensorFlow as a backend and is optimized for GPU processing, which can enable speedups of up to 100x when compared to work on a CPU. We introduce TensorNetwork in a series of papers, the first of which presents the new library and its API, and provides an overview of tensor networks for a non-physics audience. In our second paper we focus on a particular use case in physics, demonstrating the speedup that one gets using GPUs.

How are Tensor Networks Useful?

Tensors are multidimensional arrays, categorized in a hierarchy according to their order: e.g., an ordinary number is a tensor of order zero (also known as a scalar), a vector is an order-one tensor, a matrix is an order-two tensorDiagrammatic notation for tensors. and so on. While low-order tensors can easily be represented by an explicit array of numbers or with a mathematical symbol such as Tijnklm (where the number of indices represents the order of the tensor), that notation becomes very cumbersome once we start talking about high-order tensors. At that point it's useful to start using diagrammatic notation, where one simply draws a circle (or some other shape) with a number of lines, or legs, coming out of it—the number of legs being the same as the order of the tensor. In this notation, a scalar is just a circle, a vector has a single leg, a matrix has two legs, etc. Each leg of the tensor also has a dimension, which is the size of that leg. For example, a vector representing an object's velocity through space would be a three-dimensional, order-one tensor.
Diagrammatic notation for tensors.
The benefit of representing tensors in this way is to succinctly encode mathematical operations, e.g., multiplying a matrix by a vector to produce another vector, or multiplying two vectors to make a scalar. These are all examples of a more general concept called tensor contraction.
Diagrammatic notation for tensor contraction. Vector and matrix multiplication, as well as the matrix trace (i.e., the sum of the diagonal elements of a matrix), are all examples.
These are also simple examples of tensor networks, which are graphical ways of encoding the pattern of tensor contractions of several constituent tensors to form a new one. Each constituent tensor has an order determined by its own number of legs. Legs that are connected, forming an edge in the diagram, represent contraction, while the number of remaining dangling legs determines the order of the resultant tensor.
Left: The trace of the product of four matrices, tr(ABCD), which is a scalar. You can see that it has no dangling legs. Right: Three order-three tensors being contracted with three legs dangling, resulting in a new order-three tensor.
While these examples are very simple, the tensor networks of interest often represent hundreds of tensors contracted in a variety of ways. Describing such a thing would be very obscure using traditional notation, which is why the diagrammatic notation was invented by Roger Penrose in 1971.

Tensor Networks in Practice

Consider a collection of black-and-white images, each of which can be thought of as a list of N pixel values. A single pixel of a single image can be one-hot-encoded into a two-dimensional vector, and by combining these pixel encodings together we can make a 2^N-dimensional one-hot encoding of the entire image. We can reshape that high-dimensional vector into an order-N tensor, and then add up all of the tensors in our collection of images to get a total tensor Ti1,i2,...,iN encapsulating the collection.
This sounds like a very wasteful thing to do: encoding images with about 50 pixels in this way would already take petabytes of memory. That's where tensor networks come in. Rather than storing or manipulating the tensor T directly, we instead represent T as the contraction of many smaller constituent tensors in the shape of a tensor network. That turns out to be much more efficient. For instance, the popular matrix product state (MPS) network would write T in terms of N much smaller tensors, so that the total number of parameters is only linear in N, rather than exponential.
The high-order tensor T is represented in terms of many low-order tensors in a matrix product state tensor network.
It's not obvious that large tensor networks can be efficiently created or manipulated while consistently avoiding the need for a huge amount of memory. But it turns out that this is possible in many cases, which is why tensor networks have been used extensively in quantum physics and, now, in machine learning. Stoudenmire and Schwab used the encoding just described to make an image classification model, demonstrating a new use for tensor networks. The TensorNetwork library is designed to facilitate exactly that kind of work, and our first paper describes how the library functions for general tensor network manipulations.

Performance in Physics Use-Cases

TensorNetwork is a general-purpose library for tensor network algorithms, and so it should prove useful for physicists as well. Approximating quantum states is a typical use-case for tensor networks in physics, and is well-suited to illustrate the capabilities of the TensorNetwork library. In our second paper, we describe a tree tensor network (TTN) algorithm for approximating the ground state of either a periodic quantum spin chain (1D) or a lattice model on a thin torus (2D), and implement the algorithm using TensorNetwork. We compare the use of CPUs with GPUs and observe significant computational speed-ups, up to a factor of 100, when using a GPU and the TensorNetwork library.
Computational time as a function of the bond dimension, χ. The bond dimension determines the size of the constituent tensors of the tensor network. A larger bond dimension means the tensor network is more powerful, but requires more computational resources to manipulate.

Conclusion and Future Work

These are the first in a series of planned papers to illustrate the power of TensorNetwork in real-world applications. In our next paper we will use TensorNetwork to classify images in the MNIST and Fashion-MNIST datasets. Future plans include time series analysis on the ML side, and quantum circuit simulation on the physics side. With the open source community, we are also always adding new features to TensorNetwork itself. We hope that TensorNetwork will become a valuable tool for physicists and machine learning practitioners.

Acknowledgements

The TensorNetwork library was developed by Chase Roberts, Adam Zalcman, and Bruce Fontaine of Google AI; Ashley Milsted, Martin Ganahl, and Guifre Vidal of the Perimeter Institute; and Jack Hidary and Stefan Leichenauer of X. We'd also like to thank Stavros Efthymiou at X for valuable contributions.

by Chase Roberts, Research Engineer, Google AI and Stefan Leichenauer, Research Scientist, X 

Software Community Growth Through “first-timers-only” Issues

Wednesday, June 5, 2019

“first-timers-only issues are those which are written in a very engaging, welcoming way, far different than the usual ‘just report the bug’ type of GitHub issue. To read more about these, check out firsttimersonly.com, which really captures how and why this works and is beginning to be a movement in open source coding outreach! Beyond the extra welcome, this also includes getting such well-formatted issues out in front of lots of people who may be contributing to open source software for the very first time. 
It takes a LOT of work to make a good issue of this type, and we often walk through each step required to actually make the requested changes – the point is to help newcomers understand that a) they're welcome, and b) what the collaboration workflow looks like. Read more at https://publiclab.org/software-outreach!”
Since early 2016, we at Public Lab have been working to make our open source software projects more welcoming and inclusive and to grow our software contributor community in diversity and size. Our adoption of the strategy of posting first-timers-only (FTO) issues was also started at Public Lab near the end of 2016:https://publiclab.org/notes/warren/10-31-2016/create-a-welcoming-first-timers-only-issue-to-invite-new-software-contributors

During March and April, as GSoC, Outreachy, and other outreach programs were seeking proposals for the upcoming summer, we put a lot of extra time and work into welcoming newcomers into our community and making sure they are well-supported. We've seen a huge increase in newcomers and wanted to report in about how this process has scaled!

Through the end of March, nearly 409 FTO issues had been created across our projects, which shows how many people have been welcomed into Open Source 🌐 and in our community, by the collaborative efforts.

From March 9, 2019, we started maintaining a list of people who want to work on various projects of Public Lab - https://github.com/publiclab/plots2/issues/4963 - through first-timers-only issues. And, we are proud to announce that over 20 days at the end of March, the Public Lab community created 55 FTO issues i.e., 13% of total Public Lab FTO issues (for all time) were created during this 20 day period.

The idea of maintaining a list of FTO issue-seekers has been a big success and has helped coordinate and streamline the process. We were able to assign issues to nearly 50 contributors in just 20 days. And, each day the list is growing and we are opening more and more FTO issues for helping new contributors in taking their first-step in Open Source with Public Lab.

The credit for this tremendous growth goes to whole Public Lab reviewers team who ensured that every newcomer gets an FTO issue and also supported each newcomer in making their first contribution.

What makes this especially cool is that many of the FTO creators were people who had just recently completed their own FTO — then turned around and welcomed someone else into the community. This really highlights how newcomers have special insight into how important it is to welcome people properly, to support them in their first steps, and how making this process a core responsibility of our community has worked well.

Thanks to everyone, for the great work and cheers to this awesome community growth 🎉 🥂 💯

By Gaurav Sachdeva with input from Jeffrey Warren, Public Lab


Easier and More Powerful Observability with the OpenCensus Agent and Collector

Tuesday, June 4, 2019

The OpenCensus project has grown considerably over the past year, and we’ve had several significant announcements in the first half of 2019, including the project’s merger into OpenTelemetry. The features discussed in this post will move into OpenTelemetry over the coming months.

For those who aren’t already familiar with the project, OpenCensus provides a set of libraries that collect distributed traces and application metrics from your applications and send them to your backend of choice. Google announced OpenCensus one year ago, and the community has since grown to include large companies, major cloud providers, and APM vendors. OpenCensus includes integrations with popular web, RPC, and storage clients out of the box, along with exporters that allow it to send captured telemetry to your backend of choice.

We’ve recently enhanced OpenCensus with two new components: an optional agent that can manage exporters and collect system telemetry from VMs and containers, and an optional collector service that offers improved trace sampling and metrics aggregation. While we’ve already demonstrated these components on stage at Kubecon and Next, we’re now ready to show them more broadly and encourage their use.


The OpenCensus Agent

The OpenCensus agent is an optional component that can be deployed to each of your virtual machines or kubernetes pods. The agent receives traces and metrics from OpenCensus libraries, collects system telemetry, and can even capture data from a variety of different sources including Zipkin, Jaeger, and Prometheus. The agent has several benefits over exporting telemetry directly from the OpenCensus libraries:
  • The OpenCensus libraries export data to the OpenCensus agent by default, meaning that backend exporters can be reconfigured on the agent without having to rebuild or redeploy your application. This provides value for applications with high deployment costs and for PaaS-like platforms that have OpenCensus already integrated.
  • While the OpenCensus libraries collect application-level telemetry, the agent also captures system metrics like CPU and memory consumption and exports these to your backend of choice.
  • You can configure stats aggregations without redeploying your application.
  • The OpenCensus agent will host z-pages. While we originally made these a part of the libraries, we’ll be moving this functionality to the agent. This should result in a higher quality diagnostic page experience, as we’ll no longer have to reimplement the pages in each language.
  • The OpenCensus agent uses the same exporters already written for the Go OpenCensus library.
While directly exporting to a backend will remain a common use case for developers, we expect most larger deployments to start using the OpenCensus agent. The agent is currently in beta and we expect it to reach production ready quality and feature completeness later this year.

The OpenCensus Collector

The OpenCensus collector and agent share the same source code and are both optional – the key difference is how they’re deployed. While the agent runs on each host, the collector is deployed as a service and receives data from OpenCensus libraries and agents running across multiple hosts. The collector has several benefits over exporting telemetry directly from the OpenCensus libraries or agent:
  • Intelligent (tail based) trace sampling is one of the biggest benefits of the collector. By configuring your OpenCensus libraries to sample all traces and send all spans to the collector, you can have the collector perform sampling decisions after the fact! For example, you can configure the collector to sample the slowest one percent of traces at 30%, traces with errors at 100%, and all other traces at 1%!
  • The collector performs well and can be sharded across multiple instances. Current performance scales linearly across cores, allowing 10,000 spans to be collected per 1.2 cores.
  • The collector can be used as a proxy for other telemetry sources. In addition to receiving data from OpenCensus libraries and agents, Zipkin clients, Jaeger clients, and Prometheus clients, the service can be used to receive telemetry from client applications running on the web or on mobile devices.
  • The collector will soon host z-pages for your entire cluster. This is simply an expansion of the z-page functionality that we’ve added to the OpenCensus agent.
  • The collector can be used to apply telemetry policies across your entire application including adding span attributes such as region to all spans received, stripping personally identifiable information by removing or overwriting span attributes, or mapping span attributes to different names.

When to Use Each

As mentioned above, both the agent and collector are optional, and we expect that some developers will continue to export traces and metrics directly from the OpenCensus libraries. However, we expect both to become quite popular in the following scenarios:
  • Many organizations don’t want to have to rebuild and redeploy their apps when they change exporters. The agent provides the flexibility to change exporters without having to modify and redeploy your code.
  • With the OpenCensus agent you can capture system metrics via the same pipeline used to extract application metrics and distributed traces from your application.
  • If you want to make trace sampling decisions more intelligently, you’ll need to start using the collector.
  • With the OpenCensus collector you can minimize egress points and support features including batching, queuing and retry. These features are important when sending tracing and metric data to SaaS-based backends.
  • Platform providers can include the OpenCensus agent and collector into their services, making them available out of the box to customers.

Status

Both the agent and collector are currently in beta, though we know that several companies are already using them in their production services. We’re working towards the 1.0 release of each of these, and we expect this to occur by the end of Q2.

In the meantime, please join us on GitHub, Gitter, and in our monthly community meetings!

gVisor: One Year Later

Friday, May 31, 2019

Last year at KubeCon EU 2018, we open-sourced gVisor to help advance the container security field. Container isolation was -- and continues to be -- an important topic in containers, and we wanted to share how we’ve addressed this at Google with the broader community. Over the past year, we’ve listened to your feedback and made enhancements to the project, improving integration and support for Kubernetes, and growing our contributor community.

Extending Kubernetes Support

One of the most common requests we heard was for better Kubernetes support. When we launched, gVisor supported Docker and had only minimal (and experimental) support for Kubernetes. One year later we now support full integration with Kubernetes via containerd and MiniKube. This includes the ability to run multiple containers in a single pod, full terminal support with kubectl exec, and enforcement of pod cgroup policies. We've also tightened security by isolating our I/O proxies (known as "gofers") in seccomp sandboxes.
Our Docker support has also improved; gVisor now obeys CPU and memory limits when passed to Docker. We also support interactive terminals via docker run -it and docker exec -it, and we exposed gVisor's native save/restore functionality via the experimental docker checkpoint command.

Increasing Compatibility and Performance

Since launch, we’ve increased our compatibility with many popular workloads and released a suite of over 1,500 syscall tests. This test suite helps to prevent regressions and also makes it easier for contributors to get started developing and testing their changes to gVisor.
Along with compatibility, we've also increased our performance significantly, particularly around networking. Network-heavy workloads like webservers are an important use case for potential gVisor users, and we've made a lot of optimizations to our network stack, such as enabling Generic Segmentation Offloading (GSO). This has resulted in more than a 3x improvement in tcp_benchmark throughput. We've also implemented some RFCs in our TCP/IP stack like SACK which helps maintain throughput with lossy network connections. We've published many performance benchmarks on our website, and will update those as we continue to make progress.

Growing the gVisor Community

We’ve also made our development process more open by moving bugs to a GitHub issue tracker, holding monthly community meetings, and starting a new developer-focused mailing list. We've also published a new governance model and code of conduct for our community.

In the last year we focused on improving our documentation, including new user guides, architecture details, and contribution guides, so that it is easier for new users to learn about gVisor and start using and contributing to the project. You can view these docs on a new website we created, gvisor.dev. Both the website and documentation are open to contributions. The changes have brought in contributions from dozens of users all over the world.
We're very excited about the future of gVisor and the great community we are building. We'd love to hear more feedback and look forward to continuing working towards more open infrastructure and collaboration. To learn more and get involved, check out our new home at gvisor.dev, particularly the community page where you can join our mailing lists and find the next community meeting.

By Nicolas Lacasse and Ian Lewis, gVisor Team

Season of Docs Now Accepting Technical Writer Applications

Wednesday, May 29, 2019

Season of Docs is excited to announce that technical writer applications are now open!

In their applications, technical writers can submit project proposals based on the project ideas of participating organizations, or propose their own ideas. Refer to the guidelines on the website for how to create a technical writer application. The technical writer application form is located here: https://forms.gle/Fxr2nW4TCiyESHbo8.

The deadline for technical writer applications is June 28, 2019 at 18:00 UTC.

What is Season of Docs?

Documentation is essential to the adoption of open source projects as well as to the success of their communities. Season of Docs brings together technical writers and open source projects to foster collaboration and improve documentation in the open source space. You can find out more about the program on the introduction page of the website.

During the program, technical writers spend a few months working closely with an open source community. They bring their technical writing expertise to the project's documentation and, at the same time, learn about the open source project and new technologies.

Mentors from open source projects work with the technical writers to improve the project's documentation and processes. Together, they may choose to build a new documentation set, redesign the existing docs, or improve and document the project's contribution procedures and onboarding experience.

How do I take part in Season of Docs as a technical writer?

First, take a look at the technical writer guide on the website, which includes information on eligibility and the application process.

Explore the list of participating organizations and their project ideas. When you find one or more projects that interest you, you should approach the relevant open source organization directly to discuss project ideas.

Then, read create a technical writing application and submit your application using this form: https://forms.gle/Fxr2nW4TCiyESHbo8. The deadline for technical writer applications is June 28, 2019 at 18:00 UTC.

Is there a stipend for participating technical writers?

Yes. There is an optional stipend available to the accepted technical writers. The stipend amount is calculated based on the technical writer's home location. See the technical writer stipends page for more information.

What kind of mentor will I be working with?

Season of Docs mentors are not necessarily technical writers, and they may have little experience in technical communication. They're members of an open source organization who know the value of good documentation and who are experienced in open source processes and tools.

The relationship between you and your mentors is a collaboration. You bring documentation experience and skills to the open source organization. Your mentors contribute their knowledge of open source and code. Together, you can develop technical documentation and improve the open source project's processes.

What if I have a full time job and don't have many hours per week to devote to Season of Docs?

In the technical writer application, there is an option to apply for a long-running project, which allows technical writers to complete their project in five months instead of the standard three months. This must be agreed upon with the open source organization before work commences.

If you have any questions about the program, please email us at season-of-docs-support@googlegroups.com.

General Timeline

May 29 - June 28Technical writers submit their proposals to Season of Docs.
July 30Google announces the accepted technical writer projects.
August 1 - September 1Community bonding: Technical writers get to know mentors and the open source community, and refine their projects in collaboration with their mentors.
September 2 - November 29Technical writers work with open source mentors on the accepted projects, and submit their work at the end of the period.
December 10Google publishes the list of successfully-completed projects.

Join Us

Explore the Season of Docs website at g.co/seasonofdocs to learn more about participating in the program. Use our logo and other promotional resources to spread the word. Examine the timeline, check out the FAQ, and apply now!

By Andrew Chen, Google Open Source and Sarah Maddox, Google Technical Writer

Google Summer of Code 2019 (Statistics Part 1)

Thursday, May 23, 2019

Since 2005, Google Summer of Code (GSoC) has been bringing new developers into the open source community every year. This year, we accepted 1,276 students from 63 countries into the 2019 GSoC program to work with 201 open source organizations over the summer.

Students are currently wrapping up the Community Bonding phase where they become familiar with the open source projects they will be working with by spending time learning the codebase, the community’s best practices, and integrating into the community. Students will start their 12-week coding projects on May 27th.

Each year we like to share program statistics about the GSoC program and the accepted students and mentors involved in the program.

Accepted Students

  • 89.2% are participating in their first GSoC
  • 75% are first time applicants

Degrees

  • 77.5% are undergraduates, 16.6% are masters students, and 5.9% are in PhD programs
  • 72.8% are Computer Science majors, 3.5% are Mathematics majors, 16.8% are other Engineering majors (Electrical, Mechanical, Aerospace, etc.)
  • Students are in a variety of majors including Atmospheric Science, Neuroscience, Economics, Linguistics, Geology, and Pharmacy.

Proposals

There were a record number of students submitting proposals for the program this year: 5,606 students from 103 countries submitted 7,555 proposals.

In our next GSoC statistics post we will delve deeper into the schools, gender breakdown, mentors, and registration numbers for the 2019 program.

By Stephanie Taylor, Google Open Source

Google and Binomial Partner to Open-Source Basis Universal Texture Format

Wednesday, May 22, 2019

Today, Google and Binomial are excited to announce that we have partnered to open source the Basis Universal texture codec to improve the performance of transmitting images on the web and within desktop and mobile applications, while maintaining GPU efficiency. This release fills an important gap in the graphics compression ecosystem and complements earlier work in Draco geometry compression.

The Basis Universal texture format is 6-8 times smaller than JPEG on the GPU, yet is a similar storage size as JPEG – making it a great alternative to current GPU compression methods that are inefficient and don’t operate cross platform – and provides a more performant alternative to JPEG/PNG. It creates compressed textures that work well in a variety of use cases - games, virtual & augmented reality, maps, photos, small-videos, and more!

Without a universal texture format, developers are left with 2 options:

  • Use GPU formats and take the storage size hit.
  • Use other formats that have reduced storage size but couldn't compete with the GPU performance.

Maintaining so many different GPU formats is a burden on the whole ecosystem, from GPU manufacturers to software developers to the end user who can’t get a great cross platform experience. We’re streamlining this with one solution that has built-in flexibility (like optional higher quality modes) but is much easier on everyone to improve and maintain.

How does it all work? Compress your image using the encoder, choosing the quality settings that make sense for your project (you can also submit multiple images for small videos or optimization purposes, just know they’ll share the same color palette). Insert the transcoder code before rendering, which will turn the intermediary format into the GPU format your computer can read. The image stays compressed throughout this process, even on your GPU!  Instead of needing to decode and read the whole image, the GPU will read only the parts it needs. Enjoy the performance benefits!
Basis Universal can efficiently target the most common GPU formats
Google and Binomial will be working together to continue to support, maintain and add features, so check back frequently for the latest. This initial release of Basis Universal transcodes into the following GPU formats: PVRTC1 opaque, ETC1, ETC2 basic alpha, BC1-5, and BC7 opaque. Over the coming months more functionality will be added including BC7 transparent, ASTC opaque and alpha, PVRTC1 transparent, and higher quality BC7/ASTC.
Basis Universal reduces transmission size for texture while maintaining similar image quality.
See full benchmarking results
Basis Universal improves GPU memory usage over .jpeg and .png
With this partnership, we hope to see adoption of the transcoder in all major browsers to make performant cross-platform compressed textures accessible to everyone via the WebGL API, and the forthcoming WebGPU API. In addition to opening up the possibility of seamless integration into pipelines, everyone now has access to the state of the art compressor, which will also be open sourced.

We look forward to seeing what people do with Basis Universal now that it's open sourced. Check out the code and demo on GitHub, let us know what you think, and how you plan to use it! Currently, Basis Universal transcoders are available in C++ and WebAssembly.

By Stephanie Hurlburt, Binomial and Jamieson Brettle, Chrome Media

OpenTelemetry: The Merger of OpenCensus and OpenTracing

Tuesday, May 21, 2019

We’ve talked about OpenCensus a lot over the past few years, from the project’s initial announcement, roots at Google and partners (Microsoft, Dynatrace) joining the project, to new functionality that we’re continually adding. The project has grown beyond our expectations and now sports a mature ecosystem with Google, Microsoft, Omnition, Postmates, and Dynatrace making major investments, and a broad base of community contributors.

We recently announced that OpenCensus and OpenTracing are merging into a single project, now called OpenTelemetry, which brings together the best of both projects and has a frictionless migration experience. We’ve made a lot of progress so far: we’ve established a governance committee, a Java prototype API + implementation, workgroups for each language, and an aggressive implementation schedule.

Today we’re highlighting the combined project at the keynote of Kubecon and announcing that OpenTelemetry is now officially part of the Cloud Native Computing Foundation! Full details are available in the CNCF’s official blog post, which we’ve copied below:

A Brief History of OpenTelemetry (So Far)

After many months of planning, discussion, prototyping, more discussion, and more planning, OpenTracing and OpenCensus are merging to form OpenTelemetry, which is now a CNCF sandbox project. The seed governance committee is composed of representatives from Google, Lightstep, Microsoft, and Uber, and more organizations are getting involved every day.

And we couldn't be happier about it – here’s why.

Observability, Outputs, and High-Quality Telemetry

Observability is a fashionable word with some admirably nerdy and academic origins. In control theory, “observability” measures how well we can understand the internals of a given system using only its external outputs. If you’ve ever deployed or operated a modern, microservice-based software application, you have no doubt struggled to understand its performance and behavior, and that’s because those “outputs” are usually meager at best. We can’t understand a complex system if it’s a black box. And the only way to light up those black boxes is with high-quality telemetry: distributed traces, metrics, logs, and more.

So how can we get our hands – and our tools – on precise, low-overhead telemetry from the entirety of a modern software stack? One way would be to carefully instrument every microservice, piece by piece, and layer by layer. This would literally work, it’s also a complete non-starter – we’d spend as much time on the measurement as we would on the software itself! We need telemetry as a built-in feature of our services.

The OpenTelemetry project is designed to make this vision a reality for our industry, but before we describe it in more detail, we should first cover the history and context around OpenTracing and OpenCensus.

OpenTracing and OpenCensus

In practice, there are several flavors (or “verticals” in the diagram) of telemetry data, and then several integration points (or “layers” in the diagram) available for each. Broadly, the cloud-native telemetry landscape is dominated by distributed traces, timeseries metrics, and logs; and end-users typically integrate with a thin instrumentation API or via straightforward structured data formats that describe those traces, metrics, or logs.



For several years now, there has been a well-recognized need for industry-wide collaboration in order to amortize the shared cost of software instrumentation. OpenTracing and OpenCensus have led the way in that effort, and while each project made different architectural choices, the biggest problem with either project has been the fact that there were two of them. And, further, that the two projects weren’t working together and striving for mutual compatibility.

Having two similar-yet-not-identical projects out in the world created confusion and uncertainty for developers, and that made it harder for both efforts to realize their shared mission: built-in, high-quality telemetry for all.

Getting to One Project

If there’s a single thing to understand about OpenTelemetry, it’s that the leadership from OpenTracing and OpenCensus are co-committed to migrating their respective communities to this single and unified initiative. Although all of us have numerous ideas about how we could boil the ocean and start from scratch, we are resisting those impulses and focusing instead on preparing our communities for a successful transition; our priorities for the merger are clear:
  • Straightforward backwards compatibility with both OpenTracing and OpenCensus (via software bridges)
  • Minimizing the time where OpenTelemetry, OpenTracing, and OpenCensus are being co-developed: we plan to put OpenTracing and OpenCensus into “readonly mode” before the end of 2019.
  • And, again, to simplify and standardize the telemetry solutions available to developers.
In many ways, it’s most accurate to think of OpenTelemetry as the next major version of both OpenTracing and OpenCensus. Like any version upgrade, we will try to make it easy for both new and existing end-users, but we recognize that the main benefit to the ecosystem is the consolidation itself – not some specific and shiny new feature – and we are prioritizing our own efforts accordingly.

How you can help

OpenTelemetry’s timeline is an aggressive one. While we have many open-source and vendor-licensed observability solutions providing guidance, we will always want as many end-users involved as possible. The single most valuable thing any end-user can do is also one of the easiest: check out the actual work we’re doing and provide feedback. Via GitHub, Gitter, email, or whatever feels easiest.

Of course we also welcome code contributions to OpenTelemetry itself, code contributions that add OpenTelemetry support to existing software projects, documentation, blog posts, and the rest of it. If you’re interested, you can sign up to join the integration effort by filling in this form.

By Ben Sigelman, co-creator of OpenTracing and member of the OpenTelemetry governing committee, and Morgan McLean, Product Manager for OpenCensus at Google since the project’s inception
.