Google Open Source Blog: September 2011

Introducing Google JS Test

Thursday, September 29, 2011

Google JS Test is a JavaScript unit testing framework that runs on the V8 JavaScript Engine, the same open source project that is responsible for Google Chrome’s super-fast JS execution speed. Google JS Test is used internally by several Google projects, and we’re pleased to announce that it has been released as an open source project.

Features of Google JS Test include:

Extremely fast startup and execution time, without needing to run a browser.

Clean, readable output in the case of both passing and failing tests.

An optional browser-based test runner that can simply be refreshed whenever JS is changed.

Style and semantics that resemble Google Test for C++.

A built-in mocking framework that requires minimal boilerplate code (e.g. no $tearDown or $verifyAll calls), with style and semantics based on the Google C++ Mocking Framework.

A system of matchers allowing for expressive tests and easy-to-read failure output, with many built-in matchers and the ability for users to add their own.

See the Google JS Test project home page for a quick introduction, and the getting started page for a tutorial that will teach you the basics in just a few minutes.

By Aaron Jacobs, Google Engineer

Custom Maps lets you take posted maps with you

Wednesday, September 28, 2011

Do you love to explore the outdoors with Google Maps but sometimes wish it had the details of a trail map or a tourist attractions map of a foreign city? Do you sometimes wish you could take one of those “You are here” maps with you to help you find places in an unfamiliar environment? Do you prefer maps on your phone rather than on paper?

If you answered “yes” to even some of these questions, you may want to take a closer look at a new Android app called Custom Maps -- recently released as open source at code.google.com.

Custom Maps showing a birdwatcher’s location overlaid on a photo of a posted park map.

The Custom Maps app allows for easy creation of digital maps from any map image. The image can be a photo of a paper map, a photo of a brochure map, or a picture of the map posted at a trailhead or at the entrance to an amusement park. It could also be a .jpeg or .png image hosted on the internet or a screenshot of a PDF map. All you have to do is choose two (or more) matching points that are common to both the map image and Google Maps, and Custom Maps can show your GPS location on the map. This makes it an excellent mapping option where a data signal is not available, such as in state parks or abroad, or when alternate map images show details that are not included in Google Maps.
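To illustrate why two matching points are enough, here is a minimal sketch that fits a similarity transform (uniform scale, rotation, and translation) from two point pairs and uses it to place a GPS fix on the image. It is not taken from the Custom Maps source. The function name is hypothetical, and the sketch assumes the mapped area is small enough to treat longitude and latitude as flat coordinates, which a real app would need to refine.

```python
def fit_two_point_transform(geo_a, pixel_a, geo_b, pixel_b):
    """Return a function mapping (lon, lat) to (x, y) image pixels.

    Hypothetical helper: fits w = a * z + b over complex numbers, which
    encodes a uniform scale, a rotation, and a translation.
    """
    def to_plane(lon, lat):
        # Negate latitude: image y grows downward, latitude grows upward.
        return complex(lon, -lat)

    za, zb = to_plane(*geo_a), to_plane(*geo_b)
    wa, wb = complex(*pixel_a), complex(*pixel_b)
    if za == zb:
        raise ValueError("the two geographic points must differ")

    a = (wb - wa) / (zb - za)  # scale and rotation
    b = wa - a * za            # translation

    def geo_to_pixel(lon, lat):
        w = a * to_plane(lon, lat) + b
        return (w.real, w.imag)

    return geo_to_pixel
```

Calling the returned function with the current GPS longitude and latitude gives the pixel at which to draw the location marker. With more than two point pairs, a least-squares fit would be the natural extension.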

Custom Maps showing a hiker’s location on Mist Trail in Yosemite National Park.

Custom Maps stores the geo-aligned map images in KMZ files, which are simply ZIP files containing the geolocation information in KML format and the map image file. This makes it possible to take the map image out of the KMZ file, add some personal markup to the map using an image editor, and put the image back into the KMZ file. As long as the image is not resized in the process, the marked-up map image can still display the user’s GPS location.
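Because a KMZ file is an ordinary ZIP archive, the round trip described above can be sketched with Python's standard zipfile module. The helper names below are hypothetical, and the sketch assumes the map image is stored as a .jpg, .jpeg, or .png member of the archive.

```python
import zipfile

IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")


def list_map_images(kmz_path):
    """Return the names of image members inside the KMZ archive."""
    with zipfile.ZipFile(kmz_path) as kmz:
        return [n for n in kmz.namelist() if n.lower().endswith(IMAGE_SUFFIXES)]


def read_member(kmz_path, name):
    """Return the raw bytes of one archive member (image or KML)."""
    with zipfile.ZipFile(kmz_path) as kmz:
        return kmz.read(name)


def replace_member(kmz_path, name, new_bytes, out_path):
    """Copy kmz_path to out_path, swapping the content of one member.

    The zipfile module cannot overwrite a member in place, so the
    archive is rewritten with every other member copied unchanged.
    """
    with zipfile.ZipFile(kmz_path) as src:
        if name not in src.namelist():
            raise KeyError(name)
        with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = new_bytes if info.filename == name else src.read(info.filename)
                dst.writestr(info.filename, data)
```

A typical flow would be to read the image with read_member, edit it in an image editor without changing its pixel dimensions, and write it back with replace_member.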

Custom Maps users can share the geo-aligned map images they create as email attachments or by using QR codes. When a Custom Maps-compatible QR code is scanned by a barcode scanner application, users can open the link directly in the Custom Maps app instead of a web browser.

Google has published the source code for Custom Maps under Apache License 2.0 at http://code.google.com/p/custom-maps/. The source code can be studied for examples of how to deal with the following topics in Android mobile apps:

dealing with large images in the memory-constrained environment of mobile devices
parsing XML (KML) documents using the XML Pull API
using the Google Maps Android API and displaying translucent overlays on the MapView widget
declaring an app as able to handle special URLs and file types so it can be launched by QR codes and so mail applications can direct attachments to it
triggering file sharing intents from an app
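As a rough illustration of the KML parsing topic, the sketch below reads image overlays incrementally with Python's xml.etree.ElementTree.iterparse. This is an analogous streaming parser, not the Android XML Pull API itself. It assumes the KML describes each map image as a standard GroundOverlay with an Icon/href and a LatLonBox; the files that Custom Maps writes may be structured differently.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def read_ground_overlays(kml_source):
    """Return one dict per GroundOverlay with its image name and bounds.

    kml_source is a filename or a binary file object. Elements are
    handled as soon as their closing tag is seen, then cleared, so
    finished overlays do not accumulate in memory.
    """
    overlays = []
    for _event, elem in ET.iterparse(kml_source, events=("end",)):
        if local_name(elem.tag) != "GroundOverlay":
            continue
        overlay = {}
        for child in elem.iter():
            name = local_name(child.tag)
            text = (child.text or "").strip()
            if name == "href":
                overlay["image"] = text
            elif name in ("north", "south", "east", "west"):
                overlay[name] = float(text)
        overlays.append(overlay)
        elem.clear()
    return overlays


# Illustrative sample document, not taken from a real Custom Maps file.
SAMPLE_KML = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <GroundOverlay>
    <Icon><href>map.jpg</href></Icon>
    <LatLonBox>
      <north>37.75</north><south>37.70</south>
      <east>-119.50</east><west>-119.60</west>
    </LatLonBox>
  </GroundOverlay>
</kml>"""

overlays = read_ground_overlays(io.BytesIO(SAMPLE_KML))
```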

But Custom Maps is not finished yet. Several new features are planned, including distance measurement, marking map locations with icons, making it possible to geolocate map images without Google Maps or a data connection, working around the app memory limit to load larger map images, and automatically switching between stored maps based on the user’s location and zoom level. Join the open source project to add these and more features to Custom Maps.

By Marko Teittinen, Google Geo Team

OpenIntents wraps up their first Google Summer of Code

Tuesday, September 27, 2011

This year was the first year OpenIntents participated in the Google Summer of Code. We are an open source organization which creates software for Android mobile phones and tablets, with special emphasis on interoperability with other software components.

As an organization we’ve found involvement in the Google Summer of Code extremely rewarding. The students have been able to improve their skills and gain practical experience in the stages of a software project, our organization has benefited from the interest generated from the students’ work, and the wider community will continue to benefit from the code the students have delivered.

We particularly enjoyed the international aspect of the program. All students, mentors, and co-mentors lived in different countries, which did not prevent us from having a great time discussing the projects through Skype and live chat sessions. We received a great number of excellent proposals, from which two very different projects were chosen for the program.

Elena Burceanu’s project aimed to enhance the Sensor Simulator. During the first weeks, the GUI was polished, both in appearance and through clever code restructuring. After the GUI was enhanced, the number of supported sensors was increased, and the simulator now includes the Android gyroscope and general rotation vector sensors. Finally, a scenario simulator was added, which creates sensor output from a set of initial states and lets the user change the time intervals between them. The sensor values are smoothly interpolated between the key frames. The final product was released as version 2.0. The source code and documentation for Elena's project are now available to view.
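The key-frame idea can be sketched in a few lines. The example below uses plain linear interpolation between timed key frames; this is an assumption made for illustration, since the post does not say which interpolation scheme the Sensor Simulator uses, and the function name is hypothetical.

```python
from bisect import bisect_right


def interpolate(keyframes, t):
    """Return the sensor vector at time t.

    keyframes is a list of (time, values) pairs sorted by strictly
    increasing time, where values is a tuple such as (x, y, z).
    Times outside the covered range are clamped to the nearest key frame.
    """
    times = [time for time, _ in keyframes]
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t)          # first key frame strictly after t
    (t0, v0), (t1, v1) = keyframes[i - 1], keyframes[i]
    w = (t - t0) / (t1 - t0)            # 0.0 at t0, 1.0 at t1
    return tuple(a + w * (b - a) for a, b in zip(v0, v1))
```

Changing the time stamps in the keyframes list corresponds to changing the intervals between states in a scenario.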

Andras Berke’s project consists of a new application called Historify which displays the user’s activities with others over a variety of communication methods (Voice, SMS, Facebook, etc.), and provides a method for third party applications to supply other activity events showing the interoperability of Android applications. During the summer Andras went through the whole application design process from the UI wireframes to a first beta release including documentation along the way. In addition, he provided demo applications to show how third party developers can interact with Historify. You can now view the source code and documentation from Andras' summer project.

We hope to be involved with the Google Summer of Code again and would recommend involvement to any organization which meets the Google Summer of Code organization criteria.

By Peli Oi, OpenIntents Organization Administrator

Sprinting for Open Source

Friday, September 23, 2011

For five days in October the Google Summer of Code Doc Summit, organized together with FLOSS Manuals, will bring together four documentation teams from open source projects, guest speakers, and free documentation 'free agents' to discuss everything and anything concerning the free documentation of free software. The event will feature a two-day unconference and a three-day Book Sprint. During the Book Sprint each project will produce a book ready for distribution in print and electronic book formats.

The event is an ambitious project. Not only are unconferences about free software documentation scarce, but never before has a Book Sprint been attempted with four projects working simultaneously on their own books. It’s going to be an extremely interesting and challenging event.

Free software documentation has often been a very low priority for free software projects. Often the documentation suffers from common flaws including:

no documentation existing at all
assumptions about the user's knowledge are set too high
poor navigation
unexplained jargon
there is no visual component
the documentation is proprietary or 'closed'
the format is unreadable
no translation workflow
operational steps are missing, unexplained, written 'from memory' or state how the software 'should' operate
the documentation is out of date, not easily re-usable or not easily modifiable.

The Google Summer of Code Doc Summit will attempt to discuss and address these problematic issues and look towards positive models for documentation production. We hope to shine light on the importance of the free software documentation 'sector' in the ecology of Free software. Free (libre) documentation is not simply an aid for learning how to use free software, it is a road into education and adoption in industry, a tool for demonstrating to clients how free software will meet their needs and expectations, and an important promotional tool for the advancement of free software. A healthy free documentation sector is both socially and economically empowering. We believe Free Documentation of Free Software efforts and ideals should be valued on the same level as free software itself and that is exactly what we plan to do at this Summit.

The Google Summer of Code Doc Summit is more than a think tank and an opportunity to discuss real world issues. Four projects, OpenMRS, KDE, Sahana, and OpenStreetMap, will have a chance to directly strengthen their documentation efforts. We look forward to working together with each of the selected teams and individuals to help them produce their own book by the end of the five day summit.

It’s going to be a great event.

By Adam Hyde, FLOSS Manuals

SHOGUN aims high with Google Summer of Code

Thursday, September 22, 2011

Google Summer of Code 2011 gave a big boost to the development of the SHOGUN machine learning toolbox. In case you have never heard of SHOGUN or machine learning: machine learning involves algorithms that do ‘intelligent’ and even automatic data processing, and it is currently used in many different settings. You will find machine learning detecting faces in your camera, compressing the speech in your mobile phone, and powering the recommendations in your favorite online shop, as well as predicting the solubility of molecules in water and the location of genes in humans, to name just a few examples. Interested? SHOGUN can help you give it a try.

SHOGUN is a machine learning toolbox designed for unified large-scale learning for a broad range of feature types and learning settings. It offers a considerable number of machine learning models such as support vector machines for classification and regression, hidden Markov models, multiple kernel learning, linear discriminant analysis, linear programming machines, and perceptrons. Most of the specific algorithms are able to deal with several different data classes, including dense and sparse vectors and sequences using floating point or discrete data types. We have used this toolbox in several applications from computational biology, some of them coming with no fewer than 10 million training examples and others with 7 billion test examples. With more than a thousand installations worldwide, SHOGUN is already widely adopted in the machine learning community and beyond.

Some very simple examples stemming from a sub-branch of machine learning called supervised learning illustrate how objects represented by two-dimensional vectors can be classified into good or bad by learning a support vector machine. I would suggest installing the python_modular interface of SHOGUN and running the example interactive_svm_demo.py, which is also included in the source tarball. Two images illustrating the training of a support vector machine follow:

This was our first year as a Google Summer of Code organization. Having received many student applications, we were very happy to hear that we were given 5 very talented students, but we had to reject about 60 students (only a 7% acceptance rate). Deciding which 5 students we would accept was an extremely tough decision for us. So in the end we raised the bar by requiring sample contributions even before the actual Google Summer of Code started. The quality of the contributions and the independence of each student guided our selection of the final five students.

At the end of the summer we now have a new core developer and various new features implemented in SHOGUN: interfaces to new languages like Java, C#, Ruby, and Lua, a model selection framework, many dimension reduction techniques, Gaussian Mixture Model estimation, and a full-fledged online learning framework. All of this work has already been integrated in the newly released shogun 1.0.0. To find out more about the newly implemented features, read below.

Interfaces to the Java, C#, Lua and Ruby Programming Languages

Baozeng

Baozeng implemented SWIG typemaps that enable the transfer of objects native to the language one wants to interface to. In his project he added support for Java, Ruby, C# and Lua. His knowledge of SWIG helped us to drastically simplify shogun's typemaps for existing languages like Octave and Python, resolving other corner-case type issues. In addition, the typemaps bring a high-performance and versatile machine learning toolbox to these languages. It should be noted that shogun objects trained in e.g. Python can be serialized to disk and then loaded from any other language like Lua or Java. We hope this helps users working in multiple-language environments. Note that the syntax is very similar across all languages used. Compare for yourself: various examples for all languages (Python, Octave, Java, Lua, Ruby, and C#) are available.

Cross-Validation Framework

Heiko Strathmann

Nearly every learning machine has parameters which have to be determined manually. Before Heiko started his project, one had to manually implement cross-validation using (nested) for-loops. In his highly involved project Heiko extended shogun's core to register parameters and ultimately made cross-validation possible. He implemented different model selection schemes (train, validation, test split, n-fold cross-validation, stratified cross-validation, etc.) and created some examples for illustration. Note that various performance measures are available to measure how “good” a model is. The figure below shows the area under the receiver operating characteristic curve as an example.
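For readers new to the idea, here is a minimal, library-free sketch of n-fold cross-validation, the kind of loop that previously had to be written by hand. It does not use shogun's API; the function names and the train_and_score callback are hypothetical.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) for each of n_folds folds.

    Folds are contiguous and differ in size by at most one sample.
    Shuffle the data beforehand if its order is not random.
    """
    if not 2 <= n_folds <= n_samples:
        raise ValueError("need 2 <= n_folds <= n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(train_and_score, n_samples, n_folds):
    """Average the score returned by train_and_score over all folds.

    train_and_score(train_indices, test_indices) is a caller-supplied
    function that fits a model on the training part and returns a
    performance measure on the held-out part.
    """
    scores = [train_and_score(train, test)
              for train, test in k_fold_indices(n_samples, n_folds)]
    return sum(scores) / len(scores)
```

Model selection then amounts to calling cross_validate once per candidate parameter value and keeping the value with the best average score.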

Dimension Reduction Techniques

Sergey Lisitsyn

Dimensionality reduction is the process of finding a low-dimensional representation of high-dimensional data while maintaining the core essence of the data. As one of the most important practical issues in applied machine learning, it is widely used for preprocessing real data. With a strong focus on memory requirements and speed, Sergey implemented the following dimension reduction techniques:

Locally Linear Embedding
Kernel Locally Linear Embedding
Local Tangent Space Alignment
Multidimensional scaling (with capability of landmark approximation)
Isomap
Hessian Locally Linear Embedding
Laplacian Eigenmaps

See below for some illustrations of dimension reduction/embedding techniques.

Expectation Maximization Algorithms for Gaussian Mixture Models

Alesis Novik

The Expectation-Maximization algorithm is well known in the machine learning community. The goal of this project was the robust implementation of the Expectation-Maximization algorithm for Gaussian Mixture Models. Several computational tricks have been applied to address numerical and stability issues, like:

Representing covariance matrices as their SVD

Doing operations in log domain to avoid overflow/underflow

Setting minimum variances to avoid singular Gaussians

Merging/splitting of Gaussians.
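Two of these tricks, working in the log domain and enforcing a minimum variance, can be shown on the E-step of a one-dimensional mixture. This is a minimal sketch for illustration, not the shogun implementation; the function names and the min_variance default are assumptions, and the mixture weights are assumed to be strictly positive.

```python
import math


def log_sum_exp(values):
    """Compute log(sum(exp(v))) without overflow or underflow."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gaussian_1d(x, mean, variance, min_variance=1e-6):
    """Log density of a 1-D Gaussian, with the variance floored."""
    variance = max(variance, min_variance)  # avoid singular Gaussians
    return -0.5 * (math.log(2 * math.pi * variance)
                   + (x - mean) ** 2 / variance)


def responsibilities(x, weights, means, variances):
    """E-step for one point: posterior probability of each component."""
    log_terms = [math.log(w) + log_gaussian_1d(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
    total = log_sum_exp(log_terms)
    return [math.exp(t - total) for t in log_terms]
```

Computed naively, every component density underflows to zero for a point far from all means and the normalization divides by zero. In the log domain the result stays well defined.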

An illustrative example of estimating one- and two-dimensional Gaussians follows below.

Large Scale Learning Framework and Integration of Vowpal Wabbit

Shashwat Lal Das

Shashwat introduced support for 'streaming' features into shogun. That is, instead of shogun's traditional way of requiring all data to be in memory, features can now be streamed from disk, enabling the use of very large data sets. He implemented support for dense and sparse vector based input streams as well as strings, and converted existing online learning methods to use this framework. He was particularly careful and even made it possible to emulate streaming from in-memory features. Finally, he integrated (parts of) Vowpal Wabbit, which is a very fast large-scale online learning algorithm based on SGD.
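The streaming idea can be sketched without any library: examples are consumed one at a time, and the model is updated by stochastic gradient descent after each one, so the full data set never has to fit in memory. This is an illustration only, unrelated to the shogun or Vowpal Wabbit APIs. The function names and the 'label f1 f2 ...' line format are assumptions.

```python
def stream_examples(lines):
    """Yield (label, features) from an iterable of 'label f1 f2 ...' lines.

    Passing an open file streams from disk; passing a list of strings
    emulates streaming from in-memory data.
    """
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        yield float(parts[0]), [float(p) for p in parts[1:]]


def sgd_linear_regression(examples, n_features, learning_rate=0.1):
    """Fit linear weights with one SGD step per streamed example.

    Minimizes squared error; only the current example and the weight
    vector are held in memory at any time.
    """
    weights = [0.0] * n_features
    for label, features in examples:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - label
        weights = [w - learning_rate * error * x
                   for w, x in zip(weights, features)]
    return weights
```

With a file on disk, the call would look like `with open(path) as f: sgd_linear_regression(stream_examples(f), n_features)`, where path and n_features are supplied by the caller.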

By Sören Sonnenburg, Shogun Machine Learning Toolbox Organization Administrator and Co-mentor for Google Summer of Code

This is cross posted from Dr. Sonnenburg’s blog, where you can find more details on his personal experience with Google Summer of Code.

Open source code meets law at Hack4Transparency

Monday, September 19, 2011

What do you call a group of talented hackers in the European Parliament for a 24-hour window, enjoying free food while improving data transparency? We call it Hack4Transparency, and it’s not your everyday hackathon. Google is proud to be one of the sponsors of this upcoming event, a code sprint this November 8-9 that, literally, brings code to law. This is the first ever hacking event within the premises of European government, taking place in the heart of Brussels and giving dedicated hackers an opportunity to bring the power of good code to the place where it can matter most.

Over the course of 24 hours, hackers will work to make data more accessible and intelligible to consumers, to government, and to anyone who’s interested in the state of Internet access and information availability around the world.

Hackers will work along two tracks. The Internet Quality track focuses on making broadband performance data meaningful to the average consumer by improving the user interfaces of existing broadband measurement tools. The Global Transparency track asks hackers to take data from existing sources, including Google’s Transparency Report, the Open Net Initiative, and Herdict, and use these sources to create compelling visualizations showing what type of Internet content is available or unavailable to users.

There will be free food, free WiFi, and the opportunity to win prizes while working with a lot of cool people dedicated to making big improvements.

Applicants who are selected to attend will have their travel and accommodations covered, and winning hackers on each track will receive €3.000.

If you're an EU-based hacker and you want fun, food, a free vacation, and the opportunity to make a big impact, we invite you to apply.

The deadline for applications is Monday, October 10, noon CET.

By Marco Pancini, Google Sr. Policy Counsel, Brussels

Hedgewars bets on Google Summer of Code

Friday, September 16, 2011

What an exciting summer for the Hedgewars team! This was our first year participating in the Google Summer of Code program and we managed to survive to the end!

The fun began when we were accepted as a participating organization and prepared the ideas list to be discussed with the students. As an interesting statistic, I do remember that the number of people joining our IRC channel increased by 20% during that time. We didn't expect so much interest, and my best guess is that it was because it's always fun to code for a game, and Hedgewars in particular offered a lot of uncommon initiatives, like programming with FreePascal or experimenting with SDL-1.3.

The only negative part of our experience lies in the fact that, of the 2 student slots that we were given, one project wasn't carried out at all, which meant that the AI implementation had to be called off. Perhaps a more experienced organization might have noticed subtle hints from the student and avoided this situation, but what counts is that we learned our lesson and will be able to evaluate our students better in the future.

On the other hand, we've had an outstanding success in our second project, Hedgewars on Android: the student was able to fulfill the task requirements and implement a few optional features in the time allocated. He also documented his code (a rarity!) and interacted with developers of our dependencies. What I particularly liked was the fact that all of our active developers helped in this project with testing and suggestions.

Overall I feel like we've all had our share of fun gathering ideas from the community, sorting out students' proposals, managing students’ work, and achieving results. Our student is sticking around to finish and maintain his project and is about to join our development team, so my guess is that he also enjoyed his time with the Hedgewars team during Google Summer of Code.

Finally, with our repo full of new code and our bag full of experience, we look forward to next year’s Google Summer of Code where we plan to have just as much fun and success (and hopefully even more).

Thanks to all the people involved, mentors, students and admins.

By Vittorio Giovara, Hedgewars Organization Administrator for Google Summer of Code

This post is cross posted from the Hedgewars blog, where you can find additional details about their experiences with the Google Summer of Code.

Omaha 3: Modernizing Automatic Software Updating

Wednesday, September 14, 2011

In 2007, Google built the Google Update engine to provide a common background automatic updater for all of its products on Windows, and released it to open source as the Omaha project. Since its release, multiple developers have successfully reused the Omaha code base to provide updating for their own products.

Over the past few years Google has continued to improve Omaha, and we're happy to announce that the third major release of Omaha is now available as open source. The Omaha 3 code base can be used to replicate Google Update binaries, or it can be reused and tweaked by developers to create their own automatic updater for their software.

Major improvements over Omaha 2 include:

* A new API: Omaha 3 has a new state model that can be exposed via COM, allowing for finer control of the updating process and more detailed feedback from it. (It is backwards-compatible with the Omaha 2 'OnDemand' API as well.)

* Improved debugging: Omaha now generates far more detailed logs, which are available either as text files or as ETW events that can be collected using Sawbuck.

* Improved data collection and delivery: Pings for both successful and failed installs and updates are finer grained and easier to process. Omaha 3 can gracefully handle intermittent network connections as well; the results of an automatic update can be queued and re-sent at later dates.

* Machine-preferred installs: Omaha can now do elevation-optional installs, where it can attempt to install an application either per-machine or per-user and notify the installer of which mode to use. This allows you to deploy products that ship in both flavors using a single installer.

Omaha 3 is already being used by Google Chrome, Google Talk, and other Windows applications.

Developers who have used prior releases of Omaha should definitely check out the new release, which can be found at http://code.google.com/p/omaha/ under the Apache License, Version 2.0.

We welcome feedback/questions at omaha-discuss@googlegroups.com as well.

By Ryan Myers, Software Engineer at Google

August Penguin 10th annual meeting in Israel

Friday, September 9, 2011

Last month, 300 open source developers gathered in Tel Aviv for the 10th anniversary of August Penguin, the annual open source event in Israel. The event started with a review of activities of Hamakor, an Israeli open source organization that strives to drive open source software in every aspect of our lives: in the government, in small and medium businesses, and on the web. The day continued with other open source initiatives and with prizes for local open source development and innovation, and ended with technical tracks.

Developers of all ages attended the event:

Google provided the event organizers with HTML5-shaped cupcakes. It was a blast!

Speaking of HTML5, there was a very interesting session at the event about the cross-browser incompatibilities of many government websites. It was mentioned that the open source community, together with Google, will further engage top sites in Israel and promote an open and standardized web.

By Amir Shevat, Google Developer Relations

7th Year of Google Summer of Code comes to an end

Thursday, September 8, 2011

We are pleased to announce the end of another successful Google Summer of Code, our program designed to introduce university students from around the world to open source development.

Back in late May, 1,115 university students from 68 countries began writing code for 175 open source organizations with the help of over 2,000 mentors from 55 countries. We are excited to announce that just over 88% of the students passed their final evaluations, just shy of 2010’s record success rate of 89%. We’ll be posting more tasty stats about the 2011 program here soon. Meanwhile you can view a variety of statistics on the previous six years of the program. Mentoring organizations will also be posting wrap-up reports over the coming weeks, so stay tuned!

Now that the program has ended for the summer, the students are busy preparing their code samples for all eyes to see. Soon you will be able to go to the program site, where organizations will have links to the students’ code repositories on code.google.com.

Thank you to all of the students, mentors and organization administrators who have helped to make this 7th year of the Google Summer of Code a roaring success!

By Stephanie Taylor, Open Source Programs