opensource.google.com

Posts from 2020

Google Summer of Code 2020 mentoring orgs announced!

Thursday, February 20, 2020

We are delighted to announce the open source projects and organizations that have been accepted for Google Summer of Code (GSoC) 2020, the 16th year of the program!

After careful review, we have chosen 200 open source projects to be mentor organizations this year, 30 of which are new to the program. Please see the program website for a complete list of the accepted organizations.

Are you a student interested in participating in GSoC this year? We will begin accepting student applications on Monday, March 16, 2020 at 18:00 UTC and the deadline to apply is Tuesday, March 31, 2020 at 18:00 UTC.


The most successful applications come from students who start preparing now, before the application period begins.
You can find more information on our website which includes a full timeline of important dates. We also highly recommend perusing the FAQ and Program Rules and watching some of our other videos with more details about GSoC for students and mentors.

A hearty congratulations—and thank you—to all of our mentor organizations! We look forward to working with all of you during Google Summer of Code 2020.

By Stephanie Taylor, Google Open Source

AutoFlip: An Open Source Framework for Intelligent Video Reframing

Friday, February 14, 2020

Originally posted on the AI Blog

Videos filmed and edited for television and desktop are typically created and viewed in landscape aspect ratios (16:9 or 4:3). However, with an increasing number of users creating and consuming content on mobile devices, historical aspect ratios don’t always fit the display being used for viewing. Traditional approaches for reframing video to different aspect ratios usually involve static cropping, i.e., specifying a camera viewport, then cropping visual contents that are outside. Unfortunately, these static cropping approaches often lead to unsatisfactory results due to the variety of composition and camera motion styles. More bespoke approaches, however, typically require video curators to manually identify salient contents on each frame, track their transitions from frame-to-frame, and adjust crop regions accordingly throughout the video. This process is often tedious, time-consuming, and error-prone.

To address this problem, we are happy to announce AutoFlip, an open source framework for intelligent video reframing. AutoFlip is built on top of the MediaPipe framework that enables the development of pipelines for processing time-series multimodal data. Taking a video (casually shot or professionally edited) and a target dimension (landscape, square, portrait, etc.) as inputs, AutoFlip analyzes the video content, develops optimal tracking and cropping strategies, and produces an output video with the same duration in the desired aspect ratio.
Left: Original video (16:9). Middle: Reframed using a standard central crop (9:16). Right: Reframed with AutoFlip (9:16). By detecting the subjects of interest, AutoFlip is able to avoid cropping off important visual content.

AutoFlip Overview

AutoFlip provides a fully automatic solution to smart video reframing, making use of state-of-the-art ML-enabled object detection and tracking technologies to intelligently understand video content. AutoFlip detects changes in the composition that signify scene changes in order to isolate scenes for processing. Within each shot, video analysis is used to identify salient content before the scene is reframed by selecting a camera mode and path optimized for the contents.

Shot (Scene) Detection

A scene or shot is a continuous sequence of video without cuts (or jumps). To detect the occurrence of a shot change, AutoFlip computes the color histogram of each frame and compares this with prior frames. If the distribution of frame colors changes at a different rate than a sliding historical window, a shot change is signaled. AutoFlip buffers the video until the scene is complete before making reframing decisions, in order to optimize the reframing for the entire scene.
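The histogram comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not AutoFlip's actual implementation: frames are assumed to be plain lists of (r, g, b) tuples, and the bin count, window size, `factor`, and `min_jump` thresholds are hypothetical values chosen for the example.

```python
from collections import deque


def color_histogram(pixels, bins_per_channel=4):
    """Normalized coarse RGB histogram of (r, g, b) pixels with values in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shot_changes(frames, window=5, factor=3.0, min_jump=0.5):
    """Return the indices of frames that start a new shot.

    A cut is signaled when the frame-to-frame histogram distance is large
    (>= min_jump) and clearly exceeds the average distance seen over a
    sliding window of recent frames (> factor * average).
    """
    cuts = []
    recent = deque(maxlen=window)  # sliding window of recent distances
    prev = None
    for i, pixels in enumerate(frames):
        hist = color_histogram(pixels)
        if prev is not None:
            d = histogram_distance(prev, hist)
            baseline = sum(recent) / len(recent) if recent else 0.0
            if d >= min_jump and d > factor * baseline:
                cuts.append(i)
                recent.clear()  # start a fresh history for the new shot
            else:
                recent.append(d)
        prev = hist
    return cuts
```

Comparing each change against a sliding history, rather than a fixed threshold alone, keeps steady camera motion or gradual lighting changes from being reported as cuts.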

Video Content Analysis

We utilize deep learning-based object detection models to find interesting, salient content in the frame. This content typically includes people and animals, but other elements may be identified, depending on the application, including text overlays and logos for commercials, or motion and ball detection for sports.

The face and object detection models are integrated into AutoFlip through MediaPipe, which uses TensorFlow Lite on CPU. This structure allows AutoFlip to be extensible, so developers may conveniently add new detection algorithms for different use cases and video content. Each object type is associated with a weight value, which defines its relative importance — the higher the weight, the more influence the feature will have when computing the camera path.
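One way to picture the role of per-type weights is a weighted average of detection centers. The sketch below is only an illustration under that assumption: the labels, weight values, and the `weighted_focus_point` helper are hypothetical and are not taken from AutoFlip's configuration.

```python
def weighted_focus_point(detections, weights, default_weight=0.0):
    """Weighted mean of detection centers.

    detections: list of (label, cx, cy), with centers in normalized 0..1 coordinates.
    weights: dict mapping a label to its relative importance (hypothetical values).
    Returns (x, y), or None when no detection carries any weight.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for label, cx, cy in detections:
        w = weights.get(label, default_weight)
        total += w
        sum_x += w * cx
        sum_y += w * cy
    if total == 0.0:
        return None
    return (sum_x / total, sum_y / total)


# A face on the left outweighs a pet on the right, so the focus point
# lands much closer to the face.
focus = weighted_focus_point(
    [("face", 0.2, 0.5), ("pet", 0.8, 0.5)],
    {"face": 0.9, "pet": 0.1},
)
```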


Left: People detection on sports footage. Right: Two face boxes (‘core’ and ‘all’ face landmarks). In narrow portrait crop cases, often only the core landmark box can fit.

Reframing

After identifying the subjects of interest on each frame, logical decisions about how to reframe the content for a new view can be made. AutoFlip automatically chooses an optimal reframing strategy — stationary, panning or tracking — depending on the way objects behave during the scene (e.g., moving around or stationary). In stationary mode, the reframed camera viewport is fixed in a position where important content can be viewed throughout the majority of the scene. This mode can effectively mimic professional cinematography in which a camera is mounted on a stationary tripod or where post-processing stabilization is applied. In other cases, it is best to pan the camera, moving the viewport at a constant velocity. The tracking mode provides continuous and steady tracking of interesting objects as they move around within the frame.

Based on which of these three reframing strategies the algorithm selects, AutoFlip then determines an optimal cropping window for each frame, while best preserving the content of interest. While the bounding boxes track the objects of focus in the scene, they typically exhibit considerable jitter from frame-to-frame and, consequently, are not sufficient to define the cropping window. Instead, we adjust the viewport on each frame through the process of Euclidean-norm optimization, in which we minimize the residuals between a smooth (low-degree polynomial) camera path and the bounding boxes.
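To make the smoothing idea concrete, the sketch below fits a low-degree polynomial to a sequence of jittery bounding-box centers by ordinary least squares, which minimizes the Euclidean norm of the residuals. It is a simplified stand-in under stated assumptions, not AutoFlip's solver: it handles a single coordinate, and the function names and default degree are illustrative.

```python
def fit_polynomial(ts, ys, degree=2):
    """Least-squares coefficients c[0..degree] minimizing sum((p(t) - y) ** 2)."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][j] = ts[i] ** j.
    ata = [[sum(t ** (r + c) for t in ts) for c in range(n)] for r in range(n)]
    aty = [sum((t ** r) * y for t, y in zip(ts, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    m = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (m[r][n] - s) / m[r][r]
    return coeffs


def smooth_path(centers, degree=2):
    """Replace per-frame centers with points on the best-fit polynomial path."""
    if len(centers) <= degree:
        raise ValueError("need more frames than the polynomial degree")
    # Normalize time to 0..1 to keep the normal equations well conditioned.
    ts = [i / (len(centers) - 1) for i in range(len(centers))]
    coeffs = fit_polynomial(ts, centers, degree)
    return [sum(c * t ** k for k, c in enumerate(coeffs)) for t in ts]
```

In this picture, degree 0 corresponds to a stationary camera, degree 1 to a constant-velocity pan, and a higher degree to a smooth tracking path.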

Top: Camera paths resulting from following the bounding boxes from frame-to-frame. Bottom: Final smoothed camera paths generated using Euclidean-norm path formation. Left: Scene in which objects are moving around, requiring a tracking camera path. Right: Scene where objects stay close to the same position; a stationary camera covers the content for the full duration of the scene.

AutoFlip’s configuration graph provides settings for either best-effort or required reframing. If it becomes infeasible to cover all the required regions (for example, when they are too spread out on the frame), the pipeline will automatically switch to a less aggressive strategy by applying a letterbox effect, padding the image to fill the frame. For cases where the background is detected as being a solid color, this color is used to create seamless padding; otherwise a blurred version of the original frame is used.
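The arithmetic behind a letterbox fallback is simple: scale the whole source frame to fit inside the target, then pad the leftover space. The helper below is a hypothetical sketch of that calculation and is not part of AutoFlip's API.

```python
def letterbox_padding(src_w, src_h, target_w, target_h):
    """Fit the full source frame inside the target frame.

    Returns (scaled_w, scaled_h, pad_x, pad_y), where pad_x and pad_y are the
    padding on the left and top edges; any odd leftover pixel goes to the
    opposite edge.
    """
    scale = min(target_w / src_w, target_h / src_h)
    scaled_w = round(src_w * scale)
    scaled_h = round(src_h * scale)
    pad_x = (target_w - scaled_w) // 2
    pad_y = (target_h - scaled_h) // 2
    return scaled_w, scaled_h, pad_x, pad_y


# A 1280x720 landscape frame shown in full inside a 720x1280 portrait frame.
result = letterbox_padding(1280, 720, 720, 1280)
```

The padded bands would then be filled with either the detected solid background color or a blurred copy of the frame, as described above.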

AutoFlip Use Cases

We are excited to release this tool directly to developers and filmmakers, reducing the barriers to their design creativity and reach through the automation of video editing. The ability to adapt any video format to various aspect ratios is becoming increasingly important as the diversity of devices for video content consumption continues to rapidly increase. Whether your use case is portrait to landscape, landscape to portrait, or even small adjustments like 4:3 to 16:9, AutoFlip provides a solution for intelligent, automated and adaptive video reframing.


What’s Next?

Like any machine learning algorithm, AutoFlip can benefit from an improved ability to detect objects relevant to the intent of the video, such as speaker detection for interviews or animated face detection on cartoons. Additionally, a common issue arises when input video has important overlays on the edges of the screen (such as text or logos) as they will often be cropped from the view. By combining text/logo detection and image inpainting technology, we hope that future versions of AutoFlip can reposition foreground objects to better fit the new aspect ratios. Lastly, in situations where padding is required, deep uncrop technology could provide improved ability to expand beyond the original viewable area.

While we work to improve AutoFlip internally at Google, we encourage contributions from developers and filmmakers in the open source communities.

Acknowledgments

We would like to thank our colleagues who contributed to AutoFlip: Alexander Panagopoulos, Jenny Jin, Brian Mulford, Yuan Zhang, Alex Chen, Xue Yang, Mickey Wang, Justin Parra, Hartwig Adam, Jingbin Wang, and Weilong Yang; and the MediaPipe team who helped with open sourcing: Jiuqiang Tang, Tyler Mullen, Mogan Shieh, Ming Guang Yong, and Chuo-Ling Chang.

By Nathan Frey, Senior Software Engineer, Google Research, Los Angeles and Zheng Sun, Senior Software Engineer, Google Research, Mountain View

HarbourBridge: From PostgreSQL to Cloud Spanner

Wednesday, February 12, 2020

Would you like to try out Cloud Spanner with data from an existing PostgreSQL database? Maybe you’ve wanted to ‘kick the tires’ on Spanner, but have been discouraged by the effort involved?

Today, we’re announcing a tool that makes trying out Cloud Spanner using PostgreSQL data simple and easy.

HarbourBridge is a tool that loads Spanner with the contents of an existing PostgreSQL database. It requires zero configuration—no manifests or data maps to write. Instead, it ingests pg_dump output, automatically builds a Spanner schema, and creates a new Spanner database populated with data from pg_dump.

HarbourBridge is part of the Cloud Spanner Ecosystem, a collection of public, open source repositories contributed to, owned, and maintained by the Cloud Spanner user community. None of these repositories are officially supported by Google as part of Cloud Spanner.

Get up and running fast

HarbourBridge is designed to simplify Spanner evaluation, and in particular to bootstrap the process by getting moderate-size PostgreSQL datasets into Spanner (up to a few GB). Many features of PostgreSQL, especially those that don't map directly to Spanner features, are ignored, e.g. (non-primary) indexes, functions and sequences.

View HarbourBridge as a way to get up and running fast, so you can focus on critical things like tuning performance and getting the most out of Spanner. Expect that you'll need to tweak and enhance what HarbourBridge produces; more on this later.

Quick-start guide

The HarbourBridge README contains a step-by-step quick-start guide. We’ll quickly review the main steps. Before you begin, you'll need a Cloud Spanner instance, Cloud Spanner API enabled for your Google Cloud project, authentication credentials configured to use the Cloud API, and Go installed on your development machine.

To download HarbourBridge and install it, run
go get -u github.com/cloudspannerecosystem/harbourbridge
The tool should now be installed as $GOPATH/bin/harbourbridge. To use HarbourBridge on a PostgreSQL database called mydb, run
pg_dump mydb | $GOPATH/bin/harbourbridge
The tool will use the cloud project specified by the GCLOUD_PROJECT environment variable, automatically determine the Cloud Spanner instance associated with this project, convert the PostgreSQL schema for mydb to a Spanner schema, create a new Cloud Spanner database with this schema, and finally, populate this new database with the data from mydb. HarbourBridge also generates several files when it runs: a schema file, a report file (with details of the conversion), and a bad data file (if any data is dropped). See Files Generated by HarbourBridge.

Take care with ACLs

Note that PostgreSQL table-level and row-level ACLs are dropped during conversion since they are not supported by Spanner (Spanner manages access control at the database level). All data written to Spanner will be visible to anyone who can access the database created by HarbourBridge (which inherits default permissions from your Cloud Spanner instance).

Next steps

The tables created by HarbourBridge provide a starting point for evaluation of Spanner. While they preserve much of the core structure of your PostgreSQL schema and data, many important PostgreSQL features have been dropped.

In particular, HarbourBridge preserves primary keys but drops all other indexes. This means that the out-of-the-box performance you get from the tables created by HarbourBridge can be significantly slower than PostgreSQL performance. If HarbourBridge has dropped indexes that are important to the performance of your SQL queries, consider adding Secondary Indexes to the tables created by HarbourBridge. Use the existing PostgreSQL indexes as a guide. In addition, Spanner's Interleaved Tables can provide a significant performance boost.

Other dropped features include functions, sequences, procedures, triggers, and views. In addition, types have been mapped based on the types supported by Spanner. Types such as integers, floats, char/text, bools, timestamps and (some) array types map fairly directly to Spanner, but many other types do not and instead are mapped to Spanner's STRING(MAX). See Schema Conversion for details of the type conversions and their tradeoffs.
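The shape of such a type mapping can be sketched as a lookup table with a STRING(MAX) fallback. The entries below are illustrative assumptions only and are not HarbourBridge's actual conversion table; the Schema Conversion documentation remains the authoritative source.

```python
# Illustrative subset of PostgreSQL types with a fairly direct Spanner
# counterpart (an assumption for this sketch, not the tool's real table).
DIRECT_TYPE_MAP = {
    "smallint": "INT64",
    "integer": "INT64",
    "bigint": "INT64",
    "real": "FLOAT64",
    "double precision": "FLOAT64",
    "boolean": "BOOL",
    "text": "STRING(MAX)",
    "timestamp with time zone": "TIMESTAMP",
}


def to_spanner_type(pg_type):
    """Map a PostgreSQL type name to a Spanner type, defaulting to STRING(MAX)."""
    pg = pg_type.strip().lower()
    if pg.endswith("[]"):
        # Only one-dimensional arrays of directly mapped element types
        # are kept as arrays in this sketch.
        element = DIRECT_TYPE_MAP.get(pg[:-2].strip())
        return f"ARRAY<{element}>" if element else "STRING(MAX)"
    return DIRECT_TYPE_MAP.get(pg, "STRING(MAX)")
```

Falling back to STRING(MAX) keeps the data loadable, at the cost of losing type-specific validation and operators on those columns.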

Recap

HarbourBridge automates much of the manual work of trying out Cloud Spanner using PostgreSQL data. The goal is to bootstrap your evaluation and help get you to the meaty issues as quickly as possible. The tables generated by HarbourBridge provide a starting point, but they will likely need to be tweaked and enhanced to support a full evaluation.

We encourage you to try out the tool, send feedback, file issues, fork and modify the codebase, and send PRs for fixes and new functionality. Our plans and aspirations for developing HarbourBridge further are outlined in the HarbourBridge Whitepaper. HarbourBridge is part of the Cloud Spanner Ecosystem, owned and maintained by the Cloud Spanner user community. It is not officially supported by Google as part of Cloud Spanner.

By Nevin Heintze, Cloud Spanner

Importing SA360 WebQuery reports to BigQuery

Tuesday, February 11, 2020

Context

Search Ads 360 (SA360) is an enterprise-class search campaign management platform used by marketers to manage global ad campaigns across multiple engines. It offers powerful reporting capabilities through WebQuery reports, the API, and the BigQuery and Datastudio connectors.

Effective Ad campaign management requires multi-dimensional analysis of campaign data along with customers’ first-party data by building custom reports with dimensions combined from paid-search reports and business data.

Customers’ business data resides in a data warehouse, which is designed for analysis, insights and reporting. To integrate ads data into the data warehouse, the usual approach is to load the campaign data into the warehouse. To achieve this, SA360 offers various options for retrieving paid-search data, each of which provides unique capabilities.
Comparison Area | WebQuery | BQ Connector | Datastudio Connector | API
Technical complexity | Low | Medium | Medium | High
Ease of report customization | High | Medium | Low | High
Reporting Details | Complete | Limited: reports not supported on API are not available, e.g., location targets, remarketing targets, audience reports
Possible Data Warehouse | Any (the report is generic and needs to be loaded into the data-warehouse using the DW's custom loading methods) | BigQuery ONLY | None | Any

Comparing these approaches in terms of the technical knowledge required and the data warehousing solutions supported, the easiest is the WebQuery report, which a marketer can build by choosing the dimensions and metrics they want in the SA360 user interface.

The BigQuery data-transfer service is limited to importing data into BigQuery, and the Datastudio connector does not allow retrieving data.

WebQuery offers a simpler and more customizable method than the alternatives and also offers more options for the kind of data (vs. the BQ transfer service, which does not bring Business Data from SA360 to BigQuery). It was originally designed for Microsoft Excel to provide an updatable view of a report. In the era of cloud computing, a tool was needed to consume the report and make it available on an analytical platform or a cloud data warehouse like BigQuery.

Solution Approach

This tool showcases how to bridge this gap and bring SA360 data into a data warehouse in a generic fashion: the report from SA360 is fetched in XML format and converted into a CSV file using SAX parsers. This CSV file is then transferred to staging storage and finally ETLed into the data warehouse.
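The XML-to-CSV step can be illustrated with Python's standard SAX parser, which streams the document instead of loading it into memory. This is a sketch under assumptions, not the tool's own code: the `tr`, `th`, and `td` element names describe a hypothetical table-style report layout, and `RowsToCsv` and `xml_report_to_csv` are names invented for the example.

```python
import csv
import io
import xml.sax


class RowsToCsv(xml.sax.ContentHandler):
    """Stream table-like XML rows into CSV without building a full tree."""

    def __init__(self, out, row_tag="tr", cell_tags=("th", "td")):
        super().__init__()
        self._writer = csv.writer(out, lineterminator="\n")
        self._row_tag = row_tag
        self._cell_tags = set(cell_tags)
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text chunks of the cell being read, or None

    def startElement(self, name, attrs):
        if name == self._row_tag:
            self._row = []
        elif name in self._cell_tags and self._row is not None:
            self._cell = []

    def characters(self, content):
        # SAX may deliver one cell's text in several chunks.
        if self._cell is not None:
            self._cell.append(content)

    def endElement(self, name):
        if name in self._cell_tags and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif name == self._row_tag and self._row is not None:
            self._writer.writerow(self._row)
            self._row = None


def xml_report_to_csv(xml_source, csv_out):
    """Parse xml_source (a path or binary file object) and write CSV rows to csv_out."""
    xml.sax.parse(xml_source, RowsToCsv(csv_out))


sample = io.BytesIO(
    b"<table><tr><th>campaign</th><th>clicks</th></tr>"
    b"<tr><td>Brand, EU</td><td>42</td></tr></table>"
)
output = io.StringIO()
xml_report_to_csv(sample, output)
```

Because each row is written as soon as its closing tag is seen, memory use stays flat regardless of report size, which is what makes an event-based parser a natural fit for arbitrarily large reports.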

As a concrete example, we chose to showcase a solution with BigQuery as the destination (cloud) data warehouse, though the solution architecture is flexible for any other system.

Conclusion

The tool helps marketers bring advertising data closer to their analytical systems, helping them derive better insights. If you use BigQuery as your data warehouse, you can use this tool as-is. You can also adapt it by adding components for the analytical or data-warehousing systems you use, and improve it for the larger community.

To get started, follow our step-by-step guide.
Notable features of the tool are as follows:
  • Modular Authorization module
  • Handle arbitrarily large web-query reports
  • Batch mode to process multiple reports in a single call
  • Can be used as part of ETL workflow (Airflow compatible)
By Anant Damle, Solutions Architect and Meera Youn, Technical Partnership Lead

Announcing our Google Code-in 2019 Winners!

Monday, February 10, 2020

Google Code-in (GCI) 2019 was epic in every regard. Not only did we celebrate 10 years of the Google Code-in program, but we also broke all of our previous records for the program. It was a very, very busy seven weeks for everyone—we had 3,566 students from 76 countries complete 20,840 tasks with a record 29 open source organizations!

We want to congratulate all of the students who took part in this year’s 10th anniversary of Google Code-in. Great job!

Today we are excited to announce the Grand Prize Winners, Runners Up, and Finalists with each organization.

The 58 Grand Prize Winners completed an impressive 2,158 tasks while also helping other students.

Each of the Grand Prize Winners will be awarded a four-day trip to Google’s campus in northern California to meet with Google engineers, one of the mentors they worked with during the contest, and enjoy some fun in California with the other winners. We look forward to seeing these winners in a few months!

Grand Prize Winners

The Grand Prize Winners hail from 21 countries and are listed alphabetically by full name below:

Name | Organization | Country
Aayushman Choudhary | JBoss Community | India
Abdur-Raheem Idowu | Haiku | Norway
Abhinav Kaushlya | The Julia Programming Language | India
Aditya Vardhan Singh | The ns-3 Network Simulator project | India
Anany Sachan | OpenWISP | India
Andrea Gonzales | Sugar Labs | Philippines
Anmol Jhamb | Fedora Project | India
Aria Vikram | Open Roberta | India
Artur Grochal | Drupal | Poland
Bartłomiej Pacia | Systers, An AnitaB.org Community | Poland
Ben Houghton | Wikimedia | United Kingdom
Benjamin Amos | The Terasology Foundation | United Kingdom
Chamindu Amarasinghe | SCoRe Lab | Sri Lanka
Danny Lin | CCExtractor Development | United States
Diogo Fernandes | Apertium | Luxembourg
Divyansh Agarwal | AOSSIE | India
Duc Minh Nguyen | Metabrainz Foundation | Vietnam
Dylan Iskandar | Liquid Galaxy | United States
Emilie Ma | Liquid Galaxy | Canada
Himanshu Sekhar Nayak | BRL-CAD | India
Jayaike Ndu | CloudCV | Nigeria
Jeffrey Liu | BRL-CAD | United States
Joseph Semrai | SCoRe Lab | United States
Josh Heng | Circuitverse.org | United Kingdom
Kartik Agarwala | The ns-3 Network Simulator project | India
Kartik Singhal | AOSSIE | India
Kaustubh Maske Patil | CloudCV | India
Kim Fung | The Julia Programming Language | United Kingdom
Kumudtiha Karunarathna | FOSSASIA | Sri Lanka
M.Anantha Vijay | Circuitverse.org | India
Maathavan Nithiyananthan | Apertium | Sri Lanka
Manuel Alcaraz Zambrano | Wikimedia | Spain
Naman Modani | Copyleft Games | India
Navya Garg | OSGeo | India
Neel Gopaul | Drupal | Mauritius
Nils André | CCExtractor Development | United Kingdom
Paraxor | Fedora Project | United Arab Emirates
Paweł Sadowski | OpenWISP | Poland
Pola Łabędzka | Systers, An AnitaB.org Community | Poland
Pranav Karthik | FOSSASIA | Canada
Pranay Joshi | OSGeo | India
Prathamesh Mutkure | OpenMRS | India
Pratish Rai | R Project for Statistical Computing | India
Pun Waiwitlikhit | The Mifos Initiative | Thailand
Rachit Gupta | The Mifos Initiative | India
Rafał Bernacki | Haiku | Poland
Ray Ma | OpenMRS | New Zealand
Rick Wierenga | TensorFlow | Netherlands
Sayam Sawai | JBoss Community | India
Sidaarth “Sid” Sabhnani | Copyleft Games | United States
Srevin Saju | Sugar Labs | Bahrain
Susan He | Open Roberta | Australia
Swapneel Singh | The Terasology Foundation | India
Sylvia Li | Metabrainz Foundation | New Zealand
Umang Majumder | R Project for Statistical Computing | India
Uzay Girit | Public Lab | France
Vladimir Mikulic | Public Lab | Bosnia and Herzegovina
William Zhang | TensorFlow | United States

Runners Up

And a big kudos to our 58 Runners Up from 20 countries. They will receive a GCI backpack, a jacket, and a GCI T-shirt. The Runners Up are listed alphabetically by first name below:

Name | Organization
Adev Saputra | Drupal
Adrian Serapio | R Project for Statistical Computing
Alberto Navalón Lillo | Apertium
Alvii_07 | Liquid Galaxy
Amar Fadil | OpenWISP
Ananya Gangavarapu | TensorFlow
Andrey Shcherbakov | Wikimedia
Antara Bhattacharya | Metabrainz Foundation
Anthony Zhou | Public Lab
Bartosz Dokurno | Circuitverse.org
Ching Lam Choi | The Julia Programming Language
Chirag Bhansali | AOSSIE
Chiranjiv Singh Malhi | BRL-CAD
Daksha Aeer | Systers, An AnitaB.org Community
Devansh Khetan | OpenMRS
Dhanus SL | OSGeo
Dhhyey Desai | AOSSIE
Eric Xue | Copyleft Games
Eryk Mikołajek | BRL-CAD
Hannah Guo | The Terasology Foundation
Harsh Khandeparkar | Public Lab
Hirochika Matsumoto | CloudCV
Ilya Maier | Systers, An AnitaB.org Community
Irvan Ayush Chengadu | Drupal
Jakub Niklas | The Terasology Foundation
Jun Rong Lam | Circuitverse.org
Karol Ołtarzewski | OpenWISP
Kripa Kini | Liquid Galaxy
Krzysztof Krysiński | CCExtractor Development
Kunal Bhatia | SCoRe Lab
Laxya Pahuja | The Mifos Initiative
Łukasz Zbrzeski | SCoRe Lab
Madhav Mehndiratta | Fedora Project
Marcus Chong | Sugar Labs
Mateusz Samkiewicz | JBoss Community
Maya Farber Brodsky | CCExtractor Development
Michał Piechowiak | Fedora Project
Moodhunt | Metabrainz Foundation
Muhammad Wasif | FOSSASIA
name not shown | Haiku
Nathan Taylor | Sugar Labs
Nishanth Thumma | Open Roberta
Panagiotis Vasilopoulos | Haiku
Rachin Kalakheti | TensorFlow
Regan Iwadha | JBoss Community
Ribhav Sharma | OpenMRS
Richard Botez | Open Roberta
Rishabh Verma | The Mifos Initiative
Rishank Kanaparti | Copyleft Games
Rishi R | R Project for Statistical Computing
Sai Putravu | The ns-3 Network Simulator project
Samuel Sloniker | Apertium
Shivam Rai | OSGeo
Siddharth Sinha | FOSSASIA
Soumitra Shewale | The Julia Programming Language
Stanisław Howard | The ns-3 Network Simulator project
Suryansh Pathak | CloudCV
Taavi Väänänen | Wikimedia

Finalists

And a hearty congratulations to our 58 Finalists from 20 countries. The Finalists will win a special GCI jacket and a GCI T-shirt. They are listed alphabetically by first name below:
Name | Organization
Abinav Chari | CloudCV
Andre Christoga Pramaditya | CloudCV
Anish Agnihotri | OSGeo
Aryan Gulati | FOSSASIA
Ayush Sharma | Fedora Project
Ayush Sharma | SCoRe Lab
Daniel Oluojomu | JBoss Community
Dhruv Baronia | TensorFlow
Diana Hernandez | Systers, An AnitaB.org Community
Gambali Seshasai Chaitanya | Apertium
Hao Liu | R Project for Statistical Computing
Hardik Jhalani | Systers, An AnitaB.org Community
Hrishikesh Patil | OpenMRS
Jackson Lewis | The ns-3 Network Simulator project
Jan Rosa | Wikimedia
Janiru Hettiarachchi | Liquid Galaxy
Janiru Wijekoon | Metabrainz Foundation
Joshua Yang | Apertium
Kevin Liu | Open Roberta
Krishna Rama Rao | AOSSIE
Li Chen | Fedora Project
Madhav Shekhar Sharma | The Julia Programming Language
Mbah Javis | TensorFlow
Merul Dhiman | Liquid Galaxy
Michelle (Wai Man) Lo | OpenMRS
Mihir Bhave | OpenWISP
Mohit S A | Circuitverse.org
Mokshit Jain | Drupal
Mudit Somani | The Julia Programming Language
Musab Kılıç | CCExtractor Development
Nail Anıl Örcün | The Terasology Foundation
Natalie Shapiro | Circuitverse.org
Nate Clark | The Terasology Foundation
Nicholas Gregory | Wikimedia
Nikita Ermishin | OpenWISP
Nishith P | FOSSASIA
Oliver Fogelin | R Project for Statistical Computing
Oussama Hassini | The Mifos Initiative
Param Nayar | Copyleft Games
Peter Terpstra | The ns-3 Network Simulator project
Piyush Sharma | The Mifos Initiative
Robert Chen | Public Lab
Rohan Cherivirala | Open Roberta
Ruixuan Tu | Haiku
Saptashwa Mandal | Drupal
Sashreek Magan | Sugar Labs
Sauhard Jain | AOSSIE
Sharman Maheshwari | SCoRe Lab
Sumagna Das | BRL-CAD
Tanvir Singh | OSGeo
Techno-Disaster | CCExtractor Development
Thusal Ranawaka | BRL-CAD
Vivek Mishra | Copyleft Games
Yu Fai Wong | JBoss Community
Yuqi Qiu | Metabrainz Foundation
Zakhar Vozmilov | Public Lab
Zakiyah Hasanah | Sugar Labs
Zoltán Szatmáry | Haiku

Our 794 mentors, the heart and soul of GCI, are the reason the contest thrives. Mentors volunteer their time to help these bright students become open source contributors. They spend hundreds of hours during their holiday breaks answering questions, reviewing submitted tasks, and welcoming the students to their communities. GCI would not be possible without their dedication, patience and tireless efforts.

We will post more numbers from GCI 2019 here on the Google Open Source Blog over the next few weeks, so please stay tuned.

Congratulations to our Grand Prize Winners, Runners Up, Finalists, and all of the students who spent the last couple of months learning about, and contributing to, open source. We hope they will continue their journey in open source!

By Stephanie Taylor, Google Open Source