Posts from 2018

Introducing Git protocol version 2

Friday, May 18, 2018


Today we announce Git protocol version 2, a major update of Git's wire protocol (how clones, fetches and pushes are communicated between clients and servers). This update removes one of the most inefficient parts of the Git protocol and fixes an extensibility bottleneck, unblocking the path to more wire protocol improvements in the future.

The protocol version 2 spec can be found here.
The main motivation for the new protocol was to enable server-side filtering of references (branches and tags). Prior to protocol v2, servers responded to all fetch commands with an initial reference advertisement listing every reference in the repository. This complete listing is sent even when a client only cares about updating a single branch, e.g. `git fetch origin master`. For repositories that contain hundreds of thousands of references (the Chromium repository has over 500k branches and tags), the server can end up sending tens of megabytes of data that get ignored. This overhead typically dominates both time and bandwidth during a fetch, especially when you are updating a branch that's only a few commits behind the remote, or when you are only checking whether you are up to date, resulting in a no-op fetch.

We recently rolled out support for protocol version 2 at Google and have seen a 3x performance improvement for no-op fetches of a single branch on repositories containing 500k references. Protocol v2 has also enabled an 8x reduction in the overhead (non-packfile) bytes sent from googlesource.com servers. Most of this improvement comes from filtering the references advertised by the server down to the refs the client has expressed interest in.

Getting over the hurdles

The Git project has tried on a number of occasions over the years to either limit the initial ref advertisement or move to a new protocol altogether, but it kept running into two problems: (1) the initial request is rigid and does not include a field that could be used to request that new servers modify their response without breaking compatibility with existing servers, and (2) error handling is not well enough defined to allow safely trying a new protocol that existing servers do not understand, with a quick fallback to the old protocol. To migrate to a new protocol version, we needed to find a side channel that existing servers would ignore but that could be used to safely communicate with newer servers.

There are three main transports used to speak Git’s wire protocol (git://, ssh://, and https://), and the side channel used to request v2 has to be one that older servers will silently ignore rather than crash on. The http transport was the easiest: we can simply include an additional http header in the request (“Git-Protocol: version=2”). The ssh transport is a bit more difficult, as it requires the environment variable “GIT_PROTOCOL=version=2” to be set on the remote end. This is more challenging because it requires server administrators to configure sshd to accept the new environment variable on their server. The most difficult transport is the anonymous Git transport (git://).
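
As a concrete illustration of the http case, here is a minimal sketch of that initial request from the client's side, written in Python with the requests library (the repository URL is a made-up placeholder; the /info/refs endpoint and the "Git-Protocol" header are the ones Git's smart HTTP protocol actually uses):
import requests

# Ask the server to speak protocol v2 on the initial smart-HTTP request.
# An older server simply ignores the unknown header and replies with the
# usual full ref advertisement; a v2-aware server switches to the new protocol.
response = requests.get(
    'https://git.example.com/project.git/info/refs',  # placeholder URL
    params={'service': 'git-upload-pack'},
    headers={'Git-Protocol': 'version=2'},
)
print(response.status_code, response.headers.get('Content-Type'))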

Initial requests made to a server over the anonymous Git transport are a single packet-line that contains the requested service (git-upload-pack for fetches, git-receive-pack for pushes) and the repository, followed by a NUL byte. Later, virtual hosting support was added: a hostname parameter could be tacked on, also terminated by a NUL byte: `0033git-upload-pack /project.git\0host=myserver.com\0`. Ideally we’d be able to request v2 by adding a new parameter in the same manner as the hostname was added: `003dgit-upload-pack /project.git\0host=myserver.com\0version=2\0`. Unfortunately, due to a bug introduced in 2006, we aren’t able to place any extra NUL-separated arguments other than the host, because parsing them would send older servers into an infinite loop. When this bug was fixed in 2009, a check was put in place to disallow extra arguments so that new clients wouldn’t trigger it in older servers.

Fortunately, that check doesn’t notice additional request arguments hidden behind a second NUL byte, which was pointed out back in 2009. This allows requests structured like: `003egit-upload-pack /project.git\0host=myserver.com\0\0version=2\0`. By placing version information behind a second NUL byte we can skirt around both the infinite loop bug and the explicit check against extra arguments besides the hostname. Only newer servers know to look for additional information hidden behind two NUL bytes, and older servers won’t croak.
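
To make the framing concrete, here is a small Python sketch (not Git's own code) that builds both forms of the request; the hex lengths match the examples above:
def pkt_line(payload):
    # A pkt-line is a 4-hex-digit length (which counts the 4 length bytes
    # themselves) followed by the payload.
    return b'%04x' % (len(payload) + 4) + payload

# Classic request: service and repository path, then the host, each NUL-terminated.
old_style = pkt_line(b'git-upload-pack /project.git\x00host=myserver.com\x00')

# v2 request: extra arguments hidden behind a second NUL byte, where older
# servers never look but newer servers know to check.
v2_style = pkt_line(
    b'git-upload-pack /project.git\x00host=myserver.com\x00\x00version=2\x00')

print(old_style)  # b'0033git-upload-pack /project.git\x00host=myserver.com\x00'
print(v2_style)   # b'003egit-upload-pack /project.git\x00host=myserver.com\x00\x00version=2\x00'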

Now, in every case, a client can issue a request to use v2, using a transport-specific side channel, and v2 servers can respond using the new protocol while older servers will ignore the side channel and just respond with a ref advertisement.

Try it for yourself

To try out protocol version 2 for yourself you'll need an up-to-date version of Git (support for v2 was recently merged to Git's master branch and is expected to be part of Git 2.18) and a v2-enabled server (repositories on googlesource.com and Cloud Source Repositories are v2 enabled). If you enable packet tracing and run the `ls-remote` command querying for a single branch, you can see that the server sends a much smaller set of references when using protocol version 2:


# Using the original wire protocol
GIT_TRACE_PACKET=1 git -c protocol.version=0 ls-remote https://chromium.googlesource.com/chromium/src.git master


# Using protocol version 2
GIT_TRACE_PACKET=1 git -c protocol.version=2 ls-remote https://chromium.googlesource.com/chromium/src.git master




By Brandon Williams, Git-core Team

Google Summer of Code 2018 statistics part 1

Wednesday, May 9, 2018

Since 2005, Google Summer of Code (GSoC) has been bringing new developers into the open source community every year. This year we accepted 1,264 students from 62 countries into the 2018 GSoC program to work with a record 206 open source organizations this summer.

Students are currently participating in the Community Bonding phase of the program where they become familiar with the open source projects they will be working with. They also spend time learning the codebase and the community’s best practices so they can start their 12-week coding projects on May 14th.

Each year we like to share program statistics about the GSoC program and the accepted students and mentors involved in the program. Here are a few stats:
  • 88.2% of the accepted students are participating in their first GSoC
  • 74.4% of the students are first time applicants

Degrees

  • 76.18% of accepted students are undergraduates, 17.5% are masters students, and 6.3% are getting their PhDs.
  • 73% are Computer Science majors, 4.2% are mathematics majors, 17% are other engineering majors (electrical, mechanical, aerospace, etc.)
  • We have students in a variety of majors including neuroscience, linguistics, typography, and music technologies.

Countries

This year there are four students that are the first to be accepted into GSoC from their home countries of Kosovo (three students) and Senegal. A complete list of accepted students and their countries is below:
Country Students Country Students Country Students
Argentina 5 Hungary 7 Russian Federation 35
Australia 10 India 605 Senegal 1
Austria 14 Indonesia 3 Serbia 1
Bangladesh 3 Ireland 1 Singapore 8
Belarus 3 Israel 2 Slovak Republic 2
Belgium 3 Italy 24 South Africa 1
Brazil 19 Japan 7 South Korea 2
Bulgaria 2 Kosovo 3 Spain 21
Cameroon 14 Latvia 1 Sri Lanka 41
Canada 31 Lithuania 5 Sweden 6
China 52 Malaysia 2 Switzerland 5
Croatia 3 Mauritius 1 Taiwan 3
Czech Republic 4 Mexico 4 Trinidad and Tobago 1
Denmark 1 Morocco 2 Turkey 8
Ecuador 4 Nepal 1 Uganda 1
Egypt 12 Netherlands 6 Ukraine 6
Finland 3 Nigeria 6 United Kingdom 28
France 22 Pakistan 5 United States 104
Germany 53 Poland 3 Venezuela 1
Greece 16 Portugal 10 Vietnam 4
Hong Kong 3 Romania 10    
There were a record number of students submitting proposals for the program this year -- 5,199 students from 101 countries.

In our next GSoC statistics post we will delve deeper into the schools, gender breakdown, mentors, and registration numbers for the 2018 program.

By Stephanie Taylor, Google Open Source

OpenCensus’s journey ahead: platforms and languages

Monday, May 7, 2018

We recently blogged about the value of OpenCensus and how Google uses Census internally. Today, we want to share more about our long-term vision for OpenCensus.

The goal of OpenCensus is to be a ubiquitous observability framework that allows developers to automatically collect, aggregate, and export traces, metrics, and other telemetry from their applications. We plan on getting there by building easy-to-use libraries and automatically integrating with as many technologies and frameworks as possible.

Our roadmap has two themes: increased language, framework, and platform coverage, and the addition of more powerful features. Today, we’ll discuss the first theme: increased coverage.

Increasing Coverage

More Language Coverage

In January, we released OpenCensus for Java, Go, and C++ as well as tracing support for Python, PHP, and Ruby. We’re about to start development of OpenCensus for Node.js and .NET, and you’ll see activity on these repositories ramp up in the coming quarter.

Integration with more Frameworks, Platforms, and Clients

We want to provide a great out-of-the-box experience, so we need to automatically capture traces and metrics with as little developer effort as possible. To achieve this, we’ll be creating integrations for popular web frameworks, RPC frameworks, and storage clients. This will enable automatic context propagation, span creation, and trace annotations, without requiring extra work on behalf of developers.

As a basic example, OpenCensus already integrates with Go’s default gRPC and HTTP handlers to generate spans (with relevant annotations) and to pass context.
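
The same ideas are also exposed through a manual, vendor-neutral API in languages where tracing support already exists. Here's a rough sketch using the OpenCensus Python tracing API (the span names are invented and exporter configuration is omitted):
from opencensus.trace import tracer as tracer_module

# Create a tracer; with no exporter configured this simply records spans,
# which is enough to show the shape of the API.
tracer = tracer_module.Tracer()

# Spans nest via context managers and context flows to child spans automatically.
with tracer.span(name='fetch_user_profile'):    # hypothetical operation name
    with tracer.span(name='query_database'):    # hypothetical child operation
        pass  # do the actual work here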

More complex integrations will provide more information to developers. Here’s an example of a trace captured with our upcoming MongoDB instrumentation, shown on Stackdriver Trace and AWS X-Ray:
A MongoDB trace shown in Stackdriver Trace

The same trace captured in X-Ray

Istio

OpenCensus will soon have out-of-the-box tracing and metrics collection in Istio. We’re currently working through our initial designs and implementation for integrations with the Envoy sidecar and the Istio Mixer service. Our goal is to give Istio users a great tracing and metrics collection experience out of the box.

Kubernetes

We have two primary use cases in mind for Kubernetes deployments: providing cluster-wide visibility via z-pages, and better labeling of traces, stats, and metrics. Cluster-wide z-pages will allow developers to view telemetry in real time across an entire Kubernetes deployment, independently of their back-end. This is incredibly useful when debugging immediate high-impact issues like service outages.

Client Application Support

OpenCensus currently provides observability into back-end services; however, this doesn’t tell the whole story about end-to-end application performance. Throughout 2018, we plan to add instrumentation for client and front-end web applications, so developers can get traces that begin on customers’ devices and reflect actual perceived latency, along with metrics captured from client code.

We aim to add support for instrumenting Android, iOS, and front-end JavaScript, though this list may grow or change. Expect to hear more about this later in 2018.

Next Up

Next week we’ll discuss some of the new features that we’re looking to bring to OpenCensus, including notable enhancements to the trace sampling logic.

None of this is possible without the support and participation from the community. Please check out our repository and start contributing; we welcome contributions of any size -- however you want to take part. You can join other developers and users on the OpenCensus Gitter channel. We’d love to hear from you!

By Pritam Shah and Morgan McLean, Census team

Open sourcing Seurat: bringing high-fidelity scenes to mobile VR

Friday, May 4, 2018

Crossposted from the Google Developers Blog

Great VR experiences make you feel like you’re really somewhere else. To create deeply immersive experiences, there are a lot of factors that need to come together: amazing graphics, spatialized audio, and the ability to move around and feel like the world is responding to you.

Last year at I/O, we announced Seurat as a powerful tool to help developers and creators bring high-fidelity graphics to standalone VR headsets with full positional tracking, like the Lenovo Mirage Solo with Daydream. Seurat is a scene simplification technology designed to process very complex 3D scenes into a representation that renders efficiently on mobile hardware. Here’s how ILMxLAB was able to use Seurat to bring an incredibly detailed ‘Rogue One: A Star Wars Story’ scene to a standalone VR experience.

Today, we’re open sourcing Seurat to the developer community. You can now use Seurat to bring visually stunning scenes to your own VR applications and have the flexibility to customize the tool for your own workflows.

Behind the scenes: how Seurat works

Seurat works by taking advantage of the fact that VR scenes are typically viewed from within a limited viewing region, and leverages this to optimize the geometry and textures in your scene. It takes RGBD images (color and depth) as input and generates a textured mesh, targeting a configurable number of triangles, texture size, and fill rate, to simplify scenes beyond what traditional methods can achieve.


To demonstrate what Seurat can do, here’s a snippet from Blade Runner: Revelations, which launched today with the Lenovo Mirage Solo.

Blade Runner: Revelations by Alcon Interactive and Seismic Games
The Blade Runner universe is known for its stunning worlds, and in Revelations, you get to unravel a mystery around fugitive Replicants in the futuristic but gritty streets. To create the look and feel for Revelations, Seismic used Seurat to bring a scene of 46.6 million triangles down to only 307,000, improving performance by more than 100x with almost no loss in visual quality:

Original scene:

Seurat-processed scene: 

If you’re interested in learning more about Seurat or trying it out yourself, visit the Seurat GitHub page to access the documentation and source code. We’re looking forward to seeing what you build!

By Manfred Ernst, Software Engineer

Rolling out the red carpet for GSoC 2018 students!

Monday, April 23, 2018

Congratulations to our 2018 Google Summer of Code (GSoC) students and a big thank you to everyone who applied! Our 206 mentoring organizations have chosen the 1,264 students that they'll be working with during the 14th Google Summer of Code. This year’s students come from 64 different countries!

The next step for participating students is the Community Bonding period which runs from April 23rd through May 15th. During this time, students will get up to speed on the culture and code base of their new community. They’ll also get acquainted with their mentor(s) and learn more about the languages or tools they will need to complete their projects. Coding begins May 15th and will continue throughout the summer until August 14th.

To the more than 3,800 students who were not chosen this year - don’t be discouraged! Many students apply at least once to GSoC before being accepted. You can improve your odds for next time by contributing to the open source project of your choice directly; organizations are always eager for new contributors! Look around GitHub and elsewhere on the internet for a project that interests you and get started.

Happy coding, everyone!

By Stephanie Taylor, GSoC Program Lead

My first open source project and Google Code-in

Wednesday, March 28, 2018

This is a guest post from a mentor with coala, an open source tool for linting and fixing code in many different languages, which participated in Google Code-in 2017.

About two years ago, my friend Gyan and I built a small web app which checked whether or not a given username was available on a few popular social media websites. The idea was simple: judge availability of the username on the basis of an HTTP response. Here’s a simplified example of the idea in Python:
import requests

def form_website_url(website, username):
  # Eg: form_website_url('github', 'manu-chroma') returns 'https://github.com/manu-chroma'
  return 'https://{}.com/{}'.format(website, username)

response = requests.get(form_website_url('github', 'manu-chroma'))
if response.status_code == 404:
  print('username available')
else:
  print('username taken')
Much to our delight, it worked! Well, almost. It had a lot of bugs but we didn’t care much at the time. It was my first Python project and the first time I open sourced my work. I always look back on it as a cool idea, proud that I made it and learned a lot in the process.

But the project had been abandoned until John from coala approached me. John suggested we use it for Google Code-in because one of coala’s tasks for the students was to create accounts on a few common coding related websites. Students could use the username availability tool to find a good single username–people like their usernames to be consistent across websites–and coala could use it to verify that the accounts were created.

I had submitted a few patches to coala in the past, so this sounded good to me! The competition clashed with my vacation plans, but I wanted to get involved, so I took the opportunity to become a mentor.

Over the course of the program, students not only used the username availability tool but they also began making major improvements. We took the cue and began adding tasks specifically about the tool. Here are just a few of the things students added:
  • Regex to determine whether a given username was valid for any given website
  • More websites, bringing it to a total of 13
  • Tests (!)
The web app is online so you can check username availability too!
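
As an illustration of the first item in that list, a per-site validity check boils down to a regular expression for each website. Here's a hedged sketch (the GitHub-style rule below is an approximation; the patterns in the actual tool differ per site):
import re

# GitHub-style rule: alphanumeric characters and hyphens only, no leading,
# trailing, or consecutive hyphens, at most 39 characters in total.
GITHUB_USERNAME = re.compile(r'^[a-zA-Z\d](?:[a-zA-Z\d]|-(?=[a-zA-Z\d])){0,38}$')

print(bool(GITHUB_USERNAME.match('manu-chroma')))  # True
print(bool(GITHUB_USERNAME.match('-invalid-')))    # False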

I had such a fun time working with students in Google Code-in, their enthusiasm and energy was amazing. Special thanks to students Andrew, Nalin, Joshua, and biscuitsnake for all the time and effort you put into the project. You did really useful work and I hope you learned from the experience!

I want to thank John for approaching me in the first place and suggesting we use and improve the project. He was an unstoppable force throughout the competition, helping both students and fellow mentors. John even helped me with code reviews to really refine the work students submitted, and help them improve based on the feedback.

Kudos to the Google Open Source team for organizing it so well and lowering the barriers of entry to open source for high school students around the world.

By Manvendra Singh, coala mentor

A galactic experience in Google Code-in 2017

Monday, March 26, 2018

This is a guest post from Liquid Galaxy, one of the organizations that participated in both Google Summer of Code and Google Code-in 2017.

Liquid Galaxy, an open source project that powers panoramic views spanning multiple computers and displays, has been participating in Google Summer of Code (GSoC) since 2011. However, we never applied to participate in Google Code-in (GCI) because we heard stories from other projects about long hours and interrupted holidays in service of mentoring eager young students.

That changed in 2017! And, while the stories are true, we have to say it’s also an amazing and worthwhile experience.

It was hard for our small project to recruit the number of mentors needed. Thankfully, our GSoC mentors stepped up, as did many former GSoC students. We even had forward-thinking students who were interested in participating in GSoC 2018 volunteer to mentor! While it was challenging, our team of mentors helped us have a nearly flawless GCI experience.

The Google Open Source team only had to nudge us once, when a student’s task had been pending review for more than 36 hours. We’re pretty happy with that considering we had nearly 500 tasks completed over the 50 days of the contest.

More important than our experience, though, is the student experience. We learned a lot, seeing how they chose tasks, the attention to detail some of them put into their work, and the level of interaction between the students and the mentors. Considering these were young students, ranging in age from 13 to 17, they far exceeded our expectations.

There was one piece of advice the Google Open Source team gave us that we didn’t understand as GCI newbies: have a large number of tasks ready from day one, and leave some unpublished until the halfway point. That ended up being key: it ensured we had enough tasks for the initial flood of students and some in reserve for the second flood around the holidays. Our team of mentors worked hard, from the moment we were accepted into GCI to the moment the contest began, to create over 150 tasks in five different categories. Students seemed to think we did a good job and told us they enjoyed the variety of tasks and level of difficulty.

We’re glad we finally participated in Google Code-in and we’ll definitely be applying next time! You can learn more about the project and the students who worked with us on our blog.

By Andreu Ibáñez, Liquid Galaxy org admin

Cloudprober: open source black-box monitoring software

Friday, March 23, 2018

Ever wonder if users can actually access your microservices? Observing timeouts in your applications, and not sure if it's the network or if your servers are too busy? Curious about the 99th-percentile network latency between your on-premise data center and services running in the cloud?

Cloudprober, which we open sourced last year, answers questions like these and more. It’s black-box monitoring software that "probes" your systems and services and generates metrics based on probe results. This kind of monitoring strategy doesn’t make assumptions about how your service is implemented and it works at the same layer as your service’s users. You can make changes to your service’s implementation with peace of mind, knowing you’ll notice if a change prevents users from accessing the service.

A probe can be anything: a ping, an HTTP request, or even a custom program that mimics how your services are consumed (for example, creating and accessing a blog post). Cloudprober builds and exports standard metrics, and provides a way to easily integrate them with your existing monitoring stack, such as Prometheus-Grafana, Stackdriver and soon InfluxDB. Cloudprober is written in Go and works on all major platforms: Linux, Mac OS, and Windows. It's released as a static binary as well as a Docker image.

Here’s an example probe config that runs an HTTP probe against your forwarding rules and exports data to Stackdriver and Prometheus:
probe {
  name: "internal-web"
  type: HTTP
  # Probe all forwarding rules that contain web-fr in their name.
  targets {
    gce_targets {
      forwarding_rules {}
    }
    regex: "web-fr-.*"
  }
  interval_msec: 5000
  timeout_msec: 1000
  http_probe {
    port: 8080
  }
}

# Export data to Stackdriver
surfacer {
  type: STACKDRIVER
}

# Prometheus exporter
surfacer {
  type: PROMETHEUS
}

The probe config is run like this from the command-line:
./cloudprober --config_file $HOME/cloudprober/cloudprober.cfg

This example probe config highlights two major features of Cloudprober: automatic, continuous discovery of cloud targets, and data export over multiple channels (Stackdriver and Prometheus in this case). Cloud deployments are dynamic and constantly changing. Cloudprober's dynamic target discovery feature ensures you have one less thing to worry about when making minor infrastructure changes. Data export in various formats helps it integrate well with your existing monitoring setup.

Other features include:
  • Go text template-based configuration, which adds programming capability to configs, such as "for" loops and conditionals
  • Fast and efficient implementation of core probe types
  • Custom probes through the "external" probe type
  • The ability to read config through metadata
  • And cloud (Stackdriver) logging
Though most of the cloud support is specific to Google Cloud Platform (GCP), it’s easy to add support for other providers. Cloudprober has an extensible architecture so you can add new types of targets, probes and monitoring backends.

Cloudprober was built by the Cloud Networking Site Reliability Engineering (SRE) team at Google to monitor network availability and associated features. Today, it's used by several other Google Cloud SRE teams as well.

We’re excited to share Cloudprober with the wider devops community! You can find more examples in the GitHub repository and more information on the project website.

By Manu Garg, Cloud Networking Team

Open sourcing GTXiLib, an accessibility test automation framework for iOS

Wednesday, March 21, 2018

Google believes everyone should be able to access and enjoy the web. We share guidance on building accessible tech over at Google Accessibility and we recently launched a dedicated disability support team. Today, we’re excited to announce that we’ve open sourced GTXiLib, an accessibility test automation framework for iOS, under the Apache license.

We want our products to be accessible, and automation with frameworks like GTXiLib is one of the ways we scale our accessibility testing. GTXiLib can automate the process of checking for some kinds of issues, such as missing labels, hints, or low-contrast text.

GTXiLib is written in Objective-C and will integrate with your existing XCTests to perform all the registered accessibility checks before the test tearDown. When the checks fail, the existing test fails as well. Fixing your tests will thus lead to better accessibility and your tests can catch new accessibility issues as well.
  • Reuse your tests: GTXiLib integrates into your existing functional tests, enhancing the value of any tests that you have or any that you write.
  • Incremental accessibility testing: GTXiLib can be installed onto a single test case, test class or a specific subset of tests giving you the freedom to add accessibility testing incrementally. This helped drive GTXiLib adoption in large projects at Google.
  • Author your own checks: GTXiLib has a simple API to create custom checks based on the specific needs of your app. For example, you can ensure every button in your app has an accessibilityHint using a custom check.
Do you also care about accessibility? Help us sharpen GTXiLib by suggesting a check or better yet, writing one. You can add GTXiLib to your project using CocoaPods or by using its Xcode project file.

We hope you find this useful and look forward to feedback and contributions from the community! Please check out the README for more information.

By Siddartha Janga, Google Central Accessibility Team 

Celebrating open source mentorship with Joomla

Tuesday, March 20, 2018

Let’s marvel for a moment: as Google Summer of Code (GSoC) 2018 begins, 46 of the participating open source organizations are celebrating a decade or more with the program. There are 586 collective years of mentorship between them, and that’s just through GSoC.

Free and open source software projects have been doing outreach and community building since the beginning. The free software movement has been around for 35 years, and open source has been around for 20.

Bringing new people into open source is necessary for project health and sustainability, but it’s not easy. It takes time and effort to prepare onboarding materials and mentor people. It takes personal dedication, a welcoming culture, and a commitment to institutional knowledge. Sustained volunteerism at this scale is worthy of celebration!

Joomla is one open source project that exemplifies this and Puneet Kala is one such person. Joomla, a web content management system (CMS) that was first released in 2005, is now in its 11th year of GSoC. More than 80 students have participated over the years. Most are still actively contributing, and many have gone on to become mentors.

Puneet, now Joomla’s GSoC team lead, began with the project as a student five years ago. He sent along this article celebrating their 10th anniversary, which includes links to interviews with other students who have become mentors, and this panel discussion from Joomla World Conference.

It’s always great to hear from the people who have participated in Google Summer of Code. The stories are inspiring and educational. They know a thing or two about building open source communities, so we share what they have to say: you can find guest posts here.

We’d like to extend our heartfelt thanks to the 608 open source organizations and 12,000 organization administrators and mentors who have been a part of GSoC so far. We’d also like to applaud the 46 organizations that have 10+ years under their belts!

Your tireless investment in the future of people and open source is a testament to generosity.

By Josh Simmons, Google Open Source

Coding your way into cinemas

Monday, March 19, 2018

This is a guest post from apertus° and TimVideos.us, open source organizations that participated in Google Summer of Code last year and are back for 2018!

The apertus° AXIOM project is bringing the world’s first open hardware/free software digital motion picture production camera to life. The project has a rich history, exercises a steadfast adherence to the open source ethos, and all aspects of development have always revolved around supporting and utilising free technologies. The challenge of building a sophisticated digital cinema camera was perfect for Google Summer of Code 2017. But let’s start at the beginning: why did the team behind the project embark on their journey?

Modern Cinematography

For over a century film was dominated by analog cameras and celluloid, but in the late 2000’s things changed radically with the adoption of digital projection in cinemas. It was a natural next step, then, for filmmakers to shoot and produce films digitally. Certain applications in science, large format photography and fine arts still hold onto 35mm film processing, but the reduction in costs and improved workflows associated with digital image capture have revolutionised how we create and consume visual content.

The DSLR revolution

Photo by Matthew Pearce
licensed CC SA 2.0.
Filmmaking has long been considered an expensive discipline accessible only to a select few. This all changed with the adoption of movie recording capabilities in digital single-lens reflex (DSLR) cameras. For multinational corporations this “new” feature was a relatively straightforward addition to existing models as most compact digital photo cameras could already record video clips. This was the first time that a large diameter image sensor, a vital component for creating the typical shallow depth of field we consider cinematic, appeared in consumer cameras. In recent times, user groups have stepped up to contribute to the DSLR revolution first-hand, including groups like the Magic Lantern community.

Magic Lantern

Photo by Dave Dugdale licensed CC BY-SA 2.0.
Magic Lantern is a free and open source software add-on that runs from a camera’s SD/CF card. It adds a host of new features to Canon’s DSLRs that weren't included from the factory, such as allowing users to record high-dynamic range (HDR) video or 14-bit uncompressed RAW video. It’s a community project and many filmmakers simply wouldn’t have bought a Canon camera if it weren’t for the features that Magic Lantern pioneered. Because installing Magic Lantern doesn’t replace the stock Canon firmware or modify the read-only memory (ROM) but runs alongside it, it is both easy to remove and carries little risk. Originally developed for filmmaking, Magic Lantern’s feature base has expanded to include tools useful for still photography as well.

Starting the revolution for real 

Of course, Magic Lantern has been held back by the underlying proprietary hardware routines on existing camera models. So, in 2014 a team of developers and filmmakers around the apertus° project joined forces with the Magic Lantern team to lay the foundation for a totally independent, open hardware, free software, digital cinema camera. They ran a successful crowdfunding campaign for initial development, and they completed hardware development of the first developer kits in 2016. Unlike traditional cameras, the AXIOM is designed to be completely modular, and so continuously evolve, thereby preventing it from ever becoming obsolete. How the camera evolves is determined by its user community, with its design files and source code freely available and users encouraged to duplicate, modify and redistribute anything and everything related to the camera.

While the camera is primarily for use in motion picture production, there are many other applications where the AXIOM can be useful. Individuals in science, astronomy, medicine, aerial mapping, and industrial automation, as well as people who record events or talks at conferences, have expressed interest in the camera. A modular and open source device for digital imaging allows users to build a system that meets their unique requirements. One such company, Mavrx Inc, which uses aerial imagery to provide actionable insight for the agriculture industry, chose the camera because it let them not only process data more efficiently than comparable cameras, but also reconfigure its form factor so that it could be installed alongside existing equipment.

Google Summer of Code 2017

Continuing their journey, apertus° participated in Google Summer of Code for the first time in 2017. They received about 30 applications from interested students, from which they needed to select three. Projects ranged from field programmable gate array (FPGA) centered video applications to creating Linux kernel drivers for specific camera hardware. Similarly TimVideos.us, an open hardware project for live event streaming and conference recording, is working on FPGA projects around video interfaces and processing.

After some preliminary work, the students came to grips with the camera’s operating processes and all three dove in enthusiastically. One student failed the first evaluation and another failed the second, but one student successfully completed their work.

That student, Vlad Niculescu, worked on defining control loops for a voltage controller using VHSIC Hardware Description Language (VHDL) for a potential future AXIOM Beta Power Board, an FPGA-driven smart switching regulator for increasing the power efficiency and improving flexibility around voltage regulation.
Left: The printed circuit board (PCB) for testing the switching regulator FPGA logic. Right: After final improvements, the ripple in the output voltage was reduced to around 30mV at a 2V target voltage.
Vlad had this to say about his experience:

“The knowledge I acquired during my work with this project and apertus° was very satisfying. Besides the electrical skills gained I also managed to obtain other, important universal skills. One of the things I learned was that the key to solving complex problems can often be found by dividing them into small blocks so that the greater whole can be easily observed by others. Writing better code and managing the stages of building a complex project have become lessons that will no doubt become valuable in the future. I will always be grateful to my mentor as he had the patience to explain everything carefully and teach me new things step by step, and also to apertus° and Google’s Summer of Code program, without which I may not have gained the experience of working on a project like this one.”

We are grateful for Vlad’s work and congratulate him for successfully completing the program. If you find open hardware and video production interesting, we encourage you to reach out and join the community–both apertus° and TimVideos.us are back for Google Summer of Code 2018.

By Sebastian Pichelhofer, apertus°, and Tim 'mithro' Ansell, TimVideos.us

Googlers on the road: FOSSASIA Summit 2018

Friday, March 16, 2018

In a week’s time, free and open source enthusiasts of all kinds will gather in Singapore for FOSSASIA Summit 2018. Established in 2009, the annual event attracts more than 3,000 attendees, running from March 22nd to 25th this year.

FOSSASIA logo licensed LGPL-2.1.
FOSSASIA Summit is organized by FOSSASIA, a nonprofit organization that focuses on Asia and brings people together around open technology both in-person and online. The organization is also home to many open source projects and is a regular participant in the Google Open Source team’s student programs, Google Summer of Code and Google Code-in.

Our team is excited to be among those attending and speaking at the conference this year, and we’re proud that Google Cloud is a sponsor. If you’re around, please come say hello. The highlight of our travel is meeting the students and mentors who have participated in our programs!

Here are the Googlers who will be giving presentations:

Thursday, March 22nd

2:00pm Real-world Machine Learning with TensorFlow and Cloud ML by Kaz Sato

Friday, March 23rd

9:30am BigQuery codelab by Jan Peuker
10:30am Working with Cloud DataPrep by KC Ayyagari
1:00pm Extract, analyze & translate Text from Images with Cloud ML APIs by Sara Robinson
2:00pm Bitcoin in BigQuery: blockchain analytics on public data by Allen Day
2:40pm What can we learn from 1.1 billion GitHub events and 42 TB of code? by Felipe Hoffa
2:40pm Engaging IoT solutions with Machine Learning by Markku Lepisto
2:45pm CloudML Engine: Qwik Start by Kaz Sato
3:20pm Systems as choreographed behavior with Kubernetes by Jan Peuker
4:00pm The Assistant by Manikantan Krishnamurthy

Saturday, March 24th

10:30am Google Summer of Code and Google Code-in by Stephanie Taylor
10:30am Zero to ML on Google Cloud Platform by Sara Robinson
11:00am Building a Sustainable Open Tech Community through Coding Programs, Contests and Hackathons panel including Stephanie Taylor
11:05am Codifying Security and Modern Secrets Management by Seth Vargo
1:00pm Open Source Education panel including Cat Allman
5:00pm Everything as Code by Seth Vargo

We look forward to seeing you there!

By Josh Simmons, Google Open Source

Semantic Image Segmentation with DeepLab in TensorFlow

Thursday, March 15, 2018

Cross-posted on the Google Research Blog.

Semantic image segmentation, the task of assigning a semantic label such as “road”, “sky”, “person”, or “dog” to every pixel in an image, enables numerous new applications, such as the synthetic shallow depth-of-field effect shipped in the portrait mode of the Pixel 2 and Pixel 2 XL smartphones and mobile real-time video segmentation. Assigning these semantic labels requires pinpointing the outline of objects, and thus imposes much stricter localization accuracy requirements than other visual entity recognition tasks such as image-level classification or bounding box-level detection.


Today, we are excited to announce the open source release of our latest and best performing semantic image segmentation model, DeepLab-v3+ [1]*, implemented in TensorFlow. This release includes DeepLab-v3+ models built on top of a powerful convolutional neural network (CNN) backbone architecture [2, 3] for the most accurate results, intended for server-side deployment. As part of this release, we are additionally sharing our TensorFlow model training and evaluation code, as well as models already pre-trained on the Pascal VOC 2012 and Cityscapes benchmark semantic segmentation tasks.

Since the first incarnation of our DeepLab model [4] three years ago, improved CNN feature extractors, better object scale modeling, careful assimilation of contextual information, improved training procedures, and increasingly powerful hardware and software have led to improvements with DeepLab-v2 [5] and DeepLab-v3 [6]. With DeepLab-v3+, we extend DeepLab-v3 by adding a simple yet effective decoder module to refine the segmentation results especially along object boundaries. We further apply the depthwise separable convolution to both atrous spatial pyramid pooling [5, 6] and decoder modules, resulting in a faster and stronger encoder-decoder network for semantic segmentation.
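
As a rough illustration of why depthwise separable convolutions help, here is a generic TensorFlow sketch with made-up shapes (this is not the released DeepLab-v3+ code):
import tensorflow as tf

# A standard 3x3 convolution vs. a depthwise separable (and atrous) one.
# The input shape and dilation rate here are illustrative assumptions.
inputs = tf.keras.layers.Input(shape=(65, 65, 256))

standard = tf.keras.layers.Conv2D(256, 3, padding='same')(inputs)
separable = tf.keras.layers.SeparableConv2D(256, 3, padding='same',
                                            dilation_rate=2)(inputs)

# The separable version needs roughly 9x fewer parameters for this layer.
print(tf.keras.Model(inputs, standard).count_params())   # ~590k
print(tf.keras.Model(inputs, separable).count_params())  # ~68k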


Modern semantic image segmentation systems built on top of convolutional neural networks (CNNs) have reached accuracy levels that were hard to imagine even five years ago, thanks to advances in methods, hardware, and datasets. We hope that publicly sharing our system with the community will make it easier for other groups in academia and industry to reproduce and further improve upon state-of-art systems, train models on new datasets, and envision new applications for this technology.

By Liang-Chieh Chen and Yukun Zhu, Google Research

Acknowledgements
We would like to thank the support and valuable discussions with Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (co-authors of DeepLab-v1 and -v2), as well as Mark Sandler, Andrew Howard, Menglong Zhu, Chen Sun, Derek Chow, Andre Araujo, Haozhi Qi, Jifeng Dai, and the Google Mobile Vision team.

References
  1. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam, arXiv:1802.02611, 2018.
  2. Xception: Deep Learning with Depthwise Separable Convolutions, François Chollet, Proc. of CVPR, 2017.
  3. Deformable Convolutional Networks — COCO Detection and Segmentation Challenge 2017 Entry, Haozhi Qi, Zheng Zhang, Bin Xiao, Han Hu, Bowen Cheng, Yichen Wei, and Jifeng Dai, ICCV COCO Challenge Workshop, 2017.
  4. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille, Proc. of ICLR, 2015.
  5. Deeplab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille, TPAMI, 2017.
  6. Rethinking Atrous Convolution for Semantic Image Segmentation, Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam, arXiv:1706.05587, 2017.


* DeepLab-v3+ is not used to power Pixel 2's portrait mode or real-time video segmentation. These are mentioned in the post as examples of features this type of technology can enable.

Open sourcing Resonance Audio

Wednesday, March 14, 2018

Spatial audio adds to your sense of presence when you’re in VR or AR, making it feel and sound like you’re surrounded by a virtual or augmented world. And regardless of the display hardware you’re using, spatial audio makes it possible to hear sounds coming from all around you.

Resonance Audio, our spatial audio SDK launched last year, enables developers to create more realistic VR and AR experiences on mobile and desktop. We’ve seen a number of exciting experiences emerge across a variety of platforms using our SDK. Recent examples include apps like Pixar’s Coco VR for Gear VR, Disney’s Star Wars™: Jedi Challenges AR app for Android and iOS, and Runaway’s Flutter VR for Daydream, which all used Resonance Audio technology.

To accelerate adoption of immersive audio technology and strengthen the developer community around it, we’re opening Resonance Audio to a community-driven development model. By creating an open source spatial audio project optimized for mobile and desktop computing, any platform or software development tool provider can easily integrate with Resonance Audio. More cross-platform and tooling support means more distribution opportunities for content creators, without the worry of investing in costly porting projects.

What’s Included in the Open Source Project

As part of our open source project, we’re providing a reference implementation of YouTube’s Ambisonic-based spatial audio decoder, compatible with the same Ambisonics format (Ambix ACN/SN3D) used by others in the industry. Using our reference implementation, developers can easily render Ambisonic content in their VR media and other applications, while benefiting from Ambisonics’ open source, royalty-free model. The project also includes encoding, sound field manipulation and decoding techniques, as well as head-related transfer functions (HRTFs) that we’ve used to achieve rich spatial audio that scales across a wide spectrum of device types and platforms. Lastly, we’re making our entire library of highly optimized DSP classes and functions open to all. This includes resamplers, convolvers, filters, delay lines and other DSP capabilities. Additionally, developers can now use Resonance Audio’s brand new Spectral Reverb, an efficient, high-quality, constant-complexity reverb effect, in their own projects.

We’ve open sourced Resonance Audio as a standalone library and associated engine plugins, VST plugin, tutorials, and examples with the Apache 2.0 license. This means Resonance Audio is yours, so you’re free to use Resonance Audio in your projects, no matter where you work. And if you see something you’d like to improve, submit a GitHub pull request to be reviewed by the Resonance Audio project committers. While the engine plugins for Unity, Unreal, FMOD, and Wwise will remain open source, going forward they will be maintained by project committers from our partners, Unity, Epic, Firelight Technologies, and Audiokinetic, respectively.

If you’re interested in learning more about Resonance Audio, check out the documentation on our developer site. If you want to get more involved, visit our GitHub to access the source code, build the project, download the latest release, or even start contributing. We’re looking forward to building the future of immersive audio with all of you.

By Eric Mauskopf, Google AR/VR Team

The value of OpenCensus

Tuesday, March 13, 2018

This post is the second in a series about OpenCensus. You can find the first post here.

Early this year we open sourced OpenCensus, a distributed tracing and stats instrumentation framework. Today, we continue our journey by discussing the history and motivation behind the project here at Google, and what benefits OpenCensus has to offer. As OpenCensus continues to gain partners we’ll be shifting the focus away from Google, but we wanted to use this post as an opportunity to answer some of the questions that we’re most commonly asked at meetings and events.

Why did Google open source this? Why now?

Google open sources a lot of projects and we’ve begun documenting some of the reasons why on the Google Open Source website. What about OpenCensus specifically? There are many reasons it made sense for us to release this project and get others involved.

We had already released other related projects. The Census team had been eager to share their work with the public for a while. With projects like gRPC and Istio going open source, it made sense to release OpenCensus as well.

It helped us serve our customers better. Teams with performance-sensitive APIs like BigTable and Spanner needed more insight into their customers’ calling patterns while debugging issues, and wanted a way to connect customers’ traced requests to equivalent traces inside of Google.

Managing integrations ourselves is costly. The Stackdriver Trace engineering team had been investing considerable resources building their own instrumentation libraries across seven languages, and it became apparent that the cost of building and maintaining integrations into web and RPC frameworks would continue in perpetuity. Releasing these libraries might encourage framework providers to manage these integrations instead.

We have a vested interest in everyone else’s reliability and performance. Google is a web search and cloud services provider, and its users benefit as web services and applications become increasingly reliable and performant. Popularizing distributed tracing and app-level metrics is one way to achieve this. It is especially important given the rising popularity of microservices-based architectures, which are difficult to debug without distributed tracing.

This expands the market for other services. By making tracing and app-level metrics more accessible, we grow the overall market for monitoring and application performance management (APM) tools, which benefits Stackdriver Monitoring and Stackdriver Trace.

As these factors came into focus, the decision to open source the project became clear.

Benefits to Partners and the Community

Google’s reasons for developing and promoting OpenCensus apply to partners at all levels.

Service developers reap the benefits of having automatic traces and stats collection, along with vendor-neutral APIs for manually interacting with these. Developers who use open source backends like Prometheus or Zipkin benefit from having a single set of well-supported instrumentation libraries that can export to both services at once.

For APM vendors, being able to take advantage of already-provided language support and framework integrations is huge, and the exporter API allows traces and metrics to be sent to an ingestion API without much additional work. Developers who might have been working on instrumentation code can now focus on other more important tasks, and vendors get traces and metrics back from places they previously didn’t have coverage for.

Cloud and API providers have the added benefit of being able to include OpenCensus in client libraries, allowing customers to gain insight into performance characteristics and debug issues without having to contact support. In situations where customers were still not able to diagnose their own issues, customer traces can be matched with internal traces for faster root cause analysis, regardless of which tracing or APM product they use.

What’s Next

If you missed the first post in our series, you can read it now. In upcoming blog posts and videos we’ll discuss:
  • The current schedule for OpenCensus on a language-by-language basis
  • Guides on how to add custom instrumentation to your application
  • Techniques for adding more automatic integrations to OpenCensus
  • Our long-term vision for OpenCensus
Thanks for reading – we’ll see you on GitHub!

By Pritam Shah and Morgan McLean, Census team

Student applications open for Google Summer of Code 2018

Monday, March 12, 2018

Ready, set, go! Today we begin accepting applications from university students who want to participate in Google Summer of Code (GSoC) 2018. Are you a university student? Want to use your software development skills for good? Read on.

Now entering its 14th year, GSoC gives students from around the globe an opportunity to learn the ins and outs of open source software development while working from home. Students receive a stipend for successful contribution to allow them to focus on their project for the duration of the program. A passionate community of mentors help students navigate technical challenges and monitor their progress along the way.

Past participants say the real-world experience that GSoC provides sharpened their technical skills, boosted their confidence, expanded their professional network and enhanced their resume.

Interested students can submit proposals on the program site between now and Tuesday, March 27, 2018 at 16:00 UTC.

While many students began preparing in February when we announced the 212 participating open source organizations, it’s not too late to start! The first step is to browse the list of organizations and look for project ideas that appeal to you. Next, reach out to the organization to introduce yourself and determine if your skills and interests are a good fit. Since spots are limited, we recommend writing a strong proposal and submitting a draft early so you can get feedback from the organization and increase the odds of being selected.

You can learn more about how to prepare in the video below and in the Student Guide.


You can find more information on our website, including a full timeline of important dates. We also highly recommend perusing the FAQ and Program Rules, as well as joining the discussion mailing list.

Remember to submit your proposals early as you only have until Tuesday, March 27 at 16:00 UTC. Good luck to all who apply!

By Josh Simmons, Google Open Source

Open Sourcing the Hunt for Exoplanets

Friday, March 9, 2018



Recently, we discovered two exoplanets by training a neural network to analyze data from NASA’s Kepler space telescope and accurately identify the most promising planet signals. And while this was only an initial analysis of ~700 stars, we consider this a successful proof-of-concept for using machine learning to discover exoplanets, and more generally another example of using machine learning to make meaningful gains in a variety of scientific disciplines (e.g. healthcare, quantum chemistry, and fusion research).

Today, we’re excited to release our code for processing the Kepler data, training our neural network model, and making predictions about new candidate signals. We hope this release will prove a useful starting point for developing similar models for other NASA missions, like K2 (Kepler’s second mission) and the upcoming Transiting Exoplanet Survey Satellite mission. As well as announcing the release of our code, we’d also like to take this opportunity to dig a bit deeper into how our model works.

A Planet Hunting Primer

First, let’s consider how data collected by the Kepler telescope is used to detect the presence of a planet. The plot below is called a light curve, and it shows the brightness of the star (as measured by Kepler’s photometer) over time. When a planet passes in front of the star, it temporarily blocks some of the light, which causes the measured brightness to decrease and then increase again shortly thereafter, causing a “U-shaped” dip in the light curve.
A light curve from the Kepler space telescope with a “U-shaped” dip that indicates a transiting exoplanet.
However, other astronomical and instrumental phenomena can also cause the measured brightness of a star to decrease, including binary star systems, starspots, cosmic ray hits on Kepler’s photometer, and instrumental noise.
The first light curve has a “V-shaped” pattern that tells us that a very large object (i.e. another star) passed in front of the star that Kepler was observing. The second light curve contains two places where the brightness decreases, which indicates a binary system with one bright and one dim star: the larger dip is caused by the dimmer star passing in front of the brighter star, and vice versa. The third light curve is one example of the many other non-planet signals where the measured brightness of a star appears to decrease.
To search for planets in Kepler data, scientists use automated software (e.g. the Kepler data processing pipeline) to detect signals that might be caused by planets, and then manually follow up to decide whether each signal is a planet or a false positive. To avoid being overwhelmed with more signals than they can manage, the scientists apply a cutoff to the automated detections: those with signal-to-noise ratios above a fixed threshold are deemed worthy of follow-up analysis, while all detections below the threshold are discarded. Even with this cutoff, the number of detections is still formidable: to date, over 30,000 detected Kepler signals have been manually examined, and about 2,500 of those have been validated as actual planets!

Perhaps you’re wondering: does the signal-to-noise cutoff cause some real planet signals to be missed? The answer is, yes! However, if astronomers need to manually follow up on every detection, it’s not really worthwhile to lower the threshold, because as the threshold decreases the rate of false positive detections increases rapidly and actual planet detections become increasingly rare. However, there’s a tantalizing incentive: it’s possible that some potentially habitable planets like Earth, which are relatively small and orbit around relatively dim stars, might be hiding just below the traditional detection threshold — there might be hidden gems still undiscovered in the Kepler data!

A Machine Learning Approach

The Google Brain team applies machine learning to a diverse variety of data, from human genomes to sketches to formal mathematical logic. Considering the massive amount of data collected by the Kepler telescope, we wondered what we might find if we used machine learning to analyze some of the previously unexplored Kepler data. To find out, we teamed up with Andrew Vanderburg at UT Austin and developed a neural network to help search the low signal-to-noise detections for planets.
We trained a convolutional neural network (CNN) to predict the probability that a given Kepler signal is caused by a planet. We chose a CNN because they have been very successful in other problems with spatial and/or temporal structure, like audio generation and image classification.
Luckily, we had 30,000 Kepler signals that had already been manually examined and classified by humans. We used a subset of around 15,000 of these signals, of which around 3,500 were verified planets or strong planet candidates, to train our neural network to distinguish planets from false positives. The inputs to our network are two separate views of the same light curve: a wide view that allows the model to examine signals elsewhere on the light curve (e.g., a secondary signal caused by a binary star), and a zoomed-in view that enables the model to closely examine the shape of the detected signal (e.g., to distinguish “U-shaped” signals from “V-shaped” signals).
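
A heavily simplified sketch of that two-view architecture in TensorFlow might look like the following (the layer sizes and view lengths here are illustrative assumptions; see the released code for the real model):
import tensorflow as tf

def tower(view, filters=16):
    # One small 1-D convolutional tower per view of the folded light curve.
    x = tf.keras.layers.Conv1D(filters, 5, activation='relu')(view)
    x = tf.keras.layers.MaxPooling1D(5, strides=2)(x)
    return tf.keras.layers.Flatten()(x)

global_view = tf.keras.layers.Input(shape=(2001, 1))  # wide view
local_view = tf.keras.layers.Input(shape=(201, 1))    # zoomed-in view

merged = tf.keras.layers.concatenate([tower(global_view), tower(local_view)])
hidden = tf.keras.layers.Dense(512, activation='relu')(merged)
output = tf.keras.layers.Dense(1, activation='sigmoid')(hidden)  # P(planet)

model = tf.keras.Model(inputs=[global_view, local_view], outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy')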

Once we had trained our model, we investigated the features it learned about light curves to see if they matched with our expectations. One technique we used (originally suggested in this paper) was to systematically occlude small regions of the input light curves to see whether the model’s output changed. Regions that are particularly important to the model’s decision will change the output prediction if they are occluded, but occluding unimportant regions will not have a significant effect. Below is a light curve from a binary star that our model correctly predicts is not a planet. The points highlighted in green are the points that most change the model’s output prediction when occluded, and they correspond exactly to the secondary “dip” indicative of a binary system. When those points are occluded, the model’s output prediction changes from ~0% probability of being a planet to ~40% probability of being a planet. So, those points are part of the reason the model rejects this light curve, but the model uses other evidence as well - for example, zooming in on the centred primary dip shows that it's actually “V-shaped”, which is also indicative of a binary system.
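
In code, that occlusion test is straightforward to express. Here's a hedged sketch against the kind of two-input model described above (the window size and masking value are arbitrary choices, not the ones used in our analysis):
import numpy as np

def occlusion_scores(model, global_view, local_view, window=10):
    # Score each region of the zoomed-in view by how much masking it
    # shifts the model's predicted planet probability.
    baseline = model.predict([global_view[None], local_view[None]])[0, 0]
    scores = np.zeros(len(local_view))
    for start in range(0, len(local_view), window):
        masked = local_view.copy()
        masked[start:start + window] = np.median(local_view)
        p = model.predict([global_view[None], masked[None]])[0, 0]
        scores[start:start + window] = abs(baseline - p)
    return scores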

Searching for New Planets

Once we were confident with our model’s predictions, we tested its effectiveness by searching for new planets in a small set of 670 stars. We chose these stars because they were already known to have multiple orbiting planets, and we believed that some of them might host additional planets that had not yet been detected. Importantly, we allowed our search to include signals that were below the signal-to-noise threshold that astronomers had previously considered. As expected, our neural network rejected most of these signals as spurious detections, but a handful of promising candidates rose to the top, including our two newly discovered planets: Kepler-90 i and Kepler-80 g.

Find your own Planet(s)!

Let’s take a look at how the code released today can help (re-)discover the planet Kepler-90 i. The first step is to train a model by following the instructions on the code’s home page. It takes a while to download and process the data from the Kepler telescope, but once that’s done, it’s relatively fast to train a model and make predictions about new signals. One way to find new signals to show the model is to use an algorithm called Box Least Squares (BLS), which searches for periodic “box shaped” dips in brightness (see below). The BLS algorithm will detect “U-shaped” planet signals, “V-shaped” binary star signals and many other types of false positive signals to show the model. There are various freely available software implementations of the BLS algorithm, including VARTOOLS and LcTools. Alternatively, you can even look for candidate planet transits by eye, like the Planet Hunters.
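
If you want to experiment in Python, here is a rough sketch of a BLS search using astropy's BoxLeastSquares implementation (this assumes a recent astropy release and uses synthetic data; it is not part of our released code):
import numpy as np
from astropy.timeseries import BoxLeastSquares

# Synthetic example: 90 days of flat flux with a shallow 0.11-day dip
# repeating every ~14.45 days, roughly mimicking the detection below.
rng = np.random.RandomState(42)
time = np.arange(0, 90, 0.02)
flux = 1.0 + 1e-4 * rng.randn(time.size)
flux[(time % 14.44912) < 0.11267] -= 5e-4

bls = BoxLeastSquares(time, flux)
results = bls.autopower(0.11267)  # search a default period grid at this duration
best = np.argmax(results.power)
print('period=%.5f d, duration=%.5f d' % (results.period[best],
                                          results.duration[best]))
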
A low signal-to-noise detection in the light curve of the Kepler 90 star detected by the BLS algorithm. The detection has period 14.44912 days, duration 2.70408 hours (0.11267 days) beginning 2.2 days after 12:00 on 1/1/2009 (the year the Kepler telescope launched).
To run this detected signal through our trained model, we simply execute the following command:
python predict.py --kepler_id=11442793 --period=14.44912 --t0=2.2 \
  --duration=0.11267 --kepler_data_dir=$HOME/astronet/kepler \
  --output_image_file=$HOME/astronet/kepler-90i.png \
  --model_dir=$HOME/astronet/model
The output of the command is prediction = 0.94, which means the model is 94% certain that this signal is a real planet. Of course, this is only a small step in the overall process of discovering and validating an exoplanet: the model’s prediction is not proof one way or the other. The process of validating this signal as a real exoplanet requires significant follow-up work by an expert astronomer — see Sections 6.3 and 6.4 of our paper for the full details. In this particular case, our follow-up analysis validated this signal as a bona fide exoplanet, and it’s now called Kepler-90 i!
Our work here is far from done. We’ve only searched 670 stars out of 200,000 observed by Kepler — who knows what we might find when we turn our technique to the entire dataset. Before we do that, though, we have a few improvements we want to make to our model. As we discussed in our paper, our model is not yet as good at rejecting binary stars and instrumental false positives as some more mature computer heuristics. We’re hard at work improving our model, and now that it’s open sourced, we hope others will do the same!


By Chris Shallue, Senior Software Engineer, Google Brain Team

If you’d like to learn more, Chris is featured on the latest episode of This Week In Machine Learning & AI discussing his work.