Posts from January 2009

Updates from Googlers Contributing to Open Source Projects

Wednesday, January 28, 2009

You may recall some of our previous posts about Google employee contributions to Open Source during their 20% time. While many engineers spend their 20% time on releasing code created internally at Google, many more spend their time contributing to external projects just to scratch their own itch. We're pleased to bring you some updates about what our engineers have been doing over the past few months:

For all you version control geeks out there, you'll be interested to know that Ben Collins-Sussman has been working on rewriting Subversion's HTTP protocol. While the rewrite will still be WebDAV compatible, he's busy removing all of the DeltaV formalities that cause numerous extra requests. Once complete, users should see much faster network traffic when speaking to an Apache server. For more details, check out the write up on Ben's blog.

Continuing his work on WHOPR, a scalable whole program optimizer for GCC, Diego Novillo reports that the compiler can now build several Google applications with link-time optimizations enabled. Diego and the rest of the WHOPR team are handling code generation bugs and performance problems with generated code. They expect to be showing steady performance improvements over the next couple of months.

Frank Mayhar recently submitted a patch to remove the journaling dependency from ext4. The patch has been accepted and should be merged very soon. You can find more details and the actual patch submitted in the list archives.

Shiki Okasaka continued development on the ES Operating System, releasing some new source tarballs. Included in the latest release were a number of contributions made by Google Summer of Code students. The switchover to the new Web IDL standard from OMG IDL for the system interface definitions is in preliminary stages and some progress has also been made on the implementation of the project's TCP/IP stack. The latest release also contains a preliminary Web IDL based RPC runtime for x86 Linux. It allows multiple ES software components to be tested and combined while running in separate processes on Linux, as a substitute for the ES kernel running natively on a PC.

Tim Hockin contributed to a number of different systems software projects. Work continued on the recently released prettyprint project. Tim has been exploring languages to use for prettyprint's runtime, Javascript being the top contender for now. He has also worked on improvements to ACPID, including a lot of code clean up and a minor release. Tim also has a patch in his review queue to add netlink support to this completely flexible, totally extensible daemon for delivering ACPI events. He's also done some more work to whip the MCE Daemon into shape and is looking at promoting it for various distros.

These are just a few of the many contributions that have taken place over the past few months, and we'll be bringing you regular news about what Googlers are up to in the wide world of Open Source. Happy Hacking!

OpenGSE Released

Tuesday, January 27, 2009

Something that may be of interest to Open Source enthusiasts is the recent release of the Open Google Servlet Engine (OpenGSE). Once upon a time engineers within Google developed their own fast, lightweight servlet engine which we inside Google refer to as simply the "Google Servlet Engine" or GSE. Many of Google's products, both internal and public facing, use the GSE, including GMail and Google Calendar. The OpenGSE code base conceptually consists of an inner core which focuses on raw HTTP processing and a shell which wraps that inner core and implements the so-called "servlet container" abstraction. This approach was taken originally just because it made the coding easier, but it was found that this servlet container layer can wrap our internal corporate version, the Google Servlet Engine, and enable it to provide the same servlet container support as OpenGSE! While we weren't anticipating this outcome, it was an excellent unintended side effect and will help improve the functionality of our internal systems as well.

While opening the code and removing all Google specific dependencies, it was necessary to ensure that the code base still behaved like a servlet engine: for example, invalidated sessions must be unable to store objects. We started with Watchdog, a then dormant test suite that only checks servlet spec 2.3 (to some extent). We decided to investigate how difficult it would be to add a few extra tests to test the features in the 2.5 spec. The tests were added and the code base was tweaked further to pass those tests. The client side of the original Watchdog tests consisted of a custom ANT task which writes results to System.out. All of these client-side custom ANT task invocations were transformed into individual JUnit test methods, which makes IDE development much, much easier and more pleasurable.

During a casual conversation with some of my fellow engineers, they suggested that perhaps the best way to think of OpenGSE is as a suite of servlet engine tests with a "reference servlet engine" that passes these tests. The "toy" servlet engine supplied with the test suites would have the same core HTTP processing code (as far as possible) as the servlet engine which powers GMail and other products. For folks outside of Google, there's really no compelling argument to drop Apache Tomcat, Jetty, or similar engines in favor of OpenGSE's reference servlet engine, but anyone interested in servlet engines would have a fantastic learning resource available to them. For Googlers, using a servlet-layer wrapped internal GSE for their project means that future Open Sourcing of their project becomes that much easier.

Special thanks go to Mike Jennings, who took the initiative, led this Open Source project throughout 2007-2008 and managed to get the first version of OpenGSE out in 2008, as well as to Spencer Kimball and Peter Mattis, the two original authors of GSE.

We hope that you will find OpenGSE as useful as we have. Check out the code and send us your feedback in our Discussion Group.

Ed. Note: Updated post to clarify the terminology used.

Surf the Earth

Thursday, January 22, 2009

For those of you not following the Google Mac Blog, you may be interested to check out one of our newest releases, Earth Surfer. The code for Earth Surfer, an application that allows you to use a Nintendo Wii Balance Board to travel over Google Earth in a milk truck, is based on Thatcher Ulrich's terrific Open Source Javascript Monster Milktruck demo, and runs in a webpage.

We always look forward to your feedback, so check out the code and send us your comments in our discussion group.

Open Sourcing Google Desktop Gadgets

Wednesday, January 21, 2009

The Google Desktop team has been steadily releasing our Desktop gadget (widget) creations as Open Source for the past few years. If you check out this list, you can see most of the official Google created gadgets are actively maintained by the Google Desktop developer community. We had many good reasons for opening this code for the community:

  • Source code can be a valuable learning tool. The gadgets not only show you how to develop Desktop gadgets and integrate with Google APIs, but also provide other tidbits of knowledge such as how to calculate phases of the moon or StarDates.

  • The images and graphics are also Open Sourced. Being an engineer, I know how frustrating it is to work hard on an application only to have it dismissed because of hand-drawn stick figures and shapes. We hope people can take advantage of our graphic designers' talents. If you're a fan of clocks, I have something right up your alley.

  • We get warm fuzzy feelings by simply supporting the cause. It fosters a spirit of openness and collaboration between the team and developer community.

To summarize, I'd encourage community members to consider opening your projects, even smaller ones. These fancy search engines nowadays are quite good about picking up code, and your algorithms and graphics can help someone in need.

In the past few weeks, we released some pretty cool gadgets for YouTube, Calendar and Google Docs. While developing these gadgets, we tried something different and almost exclusively used Project Hosting on Google Code rather than internal Google repositories, as our goal from the outset was to open this code base to all who might benefit from it.

Our experience has been great. We had to work closely with the Google Gadgets for Linux team, who typically work entirely in public Subversion repositories; using Project Hosting on Google Code helped us better integrate with their development process. We also asked expert users of Google products to consult with us on usability and design concerns. It was a great pleasure to add them as full-fledged project members during the development phase instead of waiting for this feedback post launch. We also worked with many contributors who are not Google employees, who helped provide development and graphic resources. These were the most enjoyable projects I've been a part of during my time at Google, primarily because of the cooperative atmosphere among many different people and teams.

CSS Selector Shell Released

Tuesday, January 20, 2009

Most web developers have felt the pain of discovering or remembering ways in which different browsers interpret and/or render their Cascading Style Sheets (CSS). Our newly released CSS Selector Shell is a simple Javascript tool for testing how a given browser parses CSS text by inserting a style element into the document and then reading the cssText back programmatically. You can already check out the CSS Selector Shell code base in action on Google AppEngine.

So far this tool has proven quite useful by showing what happens when developers try to make use of CSS selector syntax that may not be supported in a particular browser. Specifically, it can show how and perhaps why a particular CSS hack works in some browsers and how it is ignored in others. It can also demonstrate the potential for harm when using unsupported CSS syntax in some browsers. For instance, when testing a combined selector ".class1.class2" in Internet Explorer 6, it becomes active as ".class2", which may not at all be the goal of the CSS author. Another detail the CSS Selector Shell demonstrates is that shorthand property/values sometimes expand much further than you may have suspected. On the test page itself, there are two visible test elements that you can use to target and experiment. The code itself makes use of the Dojo Toolkit as well as the Django framework.
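To make the Internet Explorer 6 example concrete, here is a small Python model of the difference. It is only an illustration of the behavior described above, not code from the CSS Selector Shell (which is a Javascript tool), and the function names are made up for this sketch. It handles chained class selectors only.

```python
def required_classes(selector):
    """Split a chained class selector such as '.class1.class2' into class names."""
    return [name for name in selector.split(".") if name]


def matches_compliant(selector, element_classes):
    """A compliant browser requires every class in the chain to be present."""
    return all(name in element_classes for name in required_classes(selector))


def matches_ie6_style(selector, element_classes):
    """Model of the behavior described above: only the last class is honored."""
    return required_classes(selector)[-1] in element_classes


# An element that carries only 'class2' should not be styled by '.class1.class2',
# but under the IE6-style interpretation it is.
only_class2 = {"class2"}
print(matches_compliant(".class1.class2", only_class2))
print(matches_ie6_style(".class1.class2", only_class2))
```

The second call is where the potential for harm shows up: a rule meant for a narrow set of elements ends up applying to every element with the last class.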

We hope this tool will be useful to you in diagnosing problems or understanding the differences between browsers when interacting with your Cascading Style Sheets. We always welcome your feedback, so check out the code and let us know what you think in our discussion group.

London OS Jam 11: Security

Thursday, January 15, 2009

The usual mixture of free pizza, beer and talks recently made for another smashing London Google Open Source Jam. The topic was Security, and the talks showed just how wide-ranging the subject is:

  • Ben Laurie — Caja, a framework for running untrusted Javascript

  • Ben Smyth — Security protocols and how to express them, and why online voting could work

  • Jon Cowie — The Sysadmin view of Security: Virtual Machines, Ruby on Rails, packages and the complexities of auditing it all

  • Sam Mbale — Trust between users of applications on social networking sites

  • Douglas Squirrel — Real world PKI. A play in 3 acts. With props!

  • Glyn Wintle — Your personal data isn’t safe, and online voting can’t work

  • Gervase Markham — Phishing for profit and more profit

  • Joe Walnes — Cross site scripting and injection attacks

Each talk was limited to 5 minutes, with some trying to run longer — but we have a gong for such cases — and some running shorter. Questions are encouraged at the end of the talk, and in many cases lead to a discussion later. Of course, the talks were only part of the evening. Many impromptu discussions, not limited to Security, created debate on everything from Australian politics to mobile phones to automated testing.

We hope our guests found the evening as fun and informative as we did. If you are in or around London, you are welcome to join us for the next Open Source Jam. Keep your eye on our London OS Jam site for an announcement of the next Jam.

The Globus Alliance's First Google Summer of Code

Wednesday, January 14, 2009

The Globus Alliance is a community of organizations and individuals developing fundamental technologies behind the "Grid," which lets people share computing power, databases, instruments, and other on-line tools securely across corporate, institutional, and geographic boundaries without sacrificing local autonomy. Globus currently hosts more than 20 projects, actively developed by a community of more than 100 committers, and spanning a variety of technology concerns on grid systems.

This was the first year that we have participated in Google Summer of Code™ and, despite being total newbies, we were fortunate to be given ten students to mentor. Overall, we couldn't be happier with how the summer turned out. Eight of our students made it through the program and, three months after the end of Summer of Code, most of the code produced by these students has either made it into the official Globus code repository or is in the process of being added. Most of the mentors feel that their students have become a part of the Globus community thanks to their participation in the program. One of our students has already been voted in as a Globus committer (Globus uses an open and meritocratic governance model — similar to Apache Jakarta's — where new committers are voted into projects based on their work).

Through their projects, our students addressed a variety of specific issues and needs across multiple fields, ranging from grid security to virtual machines. More specifically:

The Portal-based User Registration Service (PURSe) is a set of tools and Java APIs, developed for constructing portal-based systems that automate user registration, the creation of PKI credentials, and subsequent credential management. A typical PURSe-based portal allows users to register via a Web page and then use a username and password to obtain X.509 proxy certificates. Mehran Ahsant, mentored by Rachana Ananthakrishnan, developed a standalone Credential Translation Service (CTS), integrated with PURSe, to provide grid users with other formats of security credentials such as SAML assertions and X.509 certificates. The CTS is a standalone WS-Trust security token web service, capable of issuing security tokens as defined by the WS-Trust specification and translating tokens into another format when a token is not in a format or syntax understandable by the recipient.

The Virtual Workspace Service, one of several services that make up the Globus Nimbus cloud toolkit, was only capable of using Xen as a virtualization backend. Michael Fenn, mentored by Kate Keahey, set things straight by refactoring the existing code to allow multiple backends, and implementing a KVM backend.

Globus GridFTP is a high-performance, secure, reliable data transfer protocol which generally assumes the existence of a high-performance parallel file system, a relatively expensive resource. On the other hand, FreeLoader is a storage system that aggregates idle storage space from workstations connected within a local area network to build a low-cost, yet high-performance data store. Is this a match made in heaven? Hesam Ghasemi, mentored by Raj Kettimuthu, thought so, and modified GridFTP so it could use FreeLoader as a backend, potentially reducing the cost and increasing the performance of GridFTP deployments.

AliEn is the Grid infrastructure which is used by scientists participating in the ALICE experiment at CERN. Artem Harutyunyan, mentored by Tim Freeman, developed a set of scripts on top of Globus Nimbus to dynamically deploy an entire AliEn Grid site, enabling 'one-click' deployment of all the services necessary for ALICE job retrieval and their execution. Artem is still actively working on this project and has even submitted a paper on his work to the CHEP 2009 conference. Screenshots of ALICE jobs running on the University of Chicago's Nimbus science cloud can be seen here.

Globus GridFTP can help you move data fast. However, Mattias Lidman, mentored by John Bresnahan, thought this wasn't fast enough, so he developed a compression driver for the Globus XIO input/output library (which GridFTP depends on) to compress/uncompress data as it passes through it. He even wrote a performance study (PDF) showing that his driver is, in fact, totally awesome.
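The idea of compressing data as it passes through an I/O layer can be sketched in a few lines of Python. This is not Mattias's driver and does not use the Globus XIO API; it is a minimal sketch built on Python's standard zlib module, and the function names are hypothetical.

```python
import zlib


def compress_stream(chunks, level=6):
    """Compress an iterable of byte chunks incrementally, yielding output as it is ready."""
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # zlib may buffer input and return nothing for a while
            yield out
    # Flush whatever is still buffered so the stream is complete.
    yield compressor.flush()


def decompress_stream(chunks):
    """Reverse compress_stream, again one chunk at a time."""
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail


if __name__ == "__main__":
    original = [b"grid " * 1000, b"ftp " * 1000]
    packed = list(compress_stream(original))
    restored = b"".join(decompress_stream(packed))
    print(len(b"".join(original)), "bytes in,", sum(len(p) for p in packed), "bytes on the wire")
    print("round trip ok:", restored == b"".join(original))
```

Working chunk by chunk means the sender never has to hold the whole file in memory, which is the property a transfer-path driver needs. Whether compression actually speeds up a transfer depends on how compressible the data is and on how fast the network is relative to the CPU.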

The Globus GridShib project integrates a federated authorization infrastructure (Shibboleth) with Grid technology to provide attribute-based authorization for distributed scientific communities. Joana Matos Fonseca da Trindade, mentored by Tom Scavo, contributed to GridShib by implementing a Holder-of-Key Single Sign-On profile handler for the Shibboleth Identity Provider. Her contribution was completely integrated into the GridShib development and distribution framework and Joana did such a great job that she was asked to join the GridShib project as a committer. More details on Joana's work can be found on the GridShib website and on the Globus wiki.

Swift is a system for the rapid and reliable specification, execution, and management of large-scale science and engineering workflows. One of its main components is SwiftScript, a simple scripting language that can be used to specify complex parallel computations. Milena Nikolic, mentored by Ben Clifford, improved the SwiftScript compiler by adding stronger type checking and type inference.

The OpenNebula virtual infrastructure engine, developed by collaborators of the Globus Alliance, can be used to dynamically deploy and re-allocate virtual machines on a pool of physical resources but lacks a "cloud-like" interface, like the one provided by Globus Nimbus. Nimbus, in turn, lacks the advanced resource management features provided by OpenNebula. William Voorsluys, mentored by yours truly, tackled this particular issue by working on integrating OpenNebula and Nimbus.

Many congratulations to all of our mentors and students for their tremendous success in our first Summer of Code!

Google Blog Converters 1.0 Released

Friday, January 9, 2009

Blog authors around the world, Google would like to remind you that it's your blog, your data. Now that Blogger allows users the ability to export all contents of their blog, the Data Liberation team would like to announce the Google Blog Converters project. This new Open Source project provides the ability to easily move blog posts and comments from service to service. This initial release provides Python libraries and runnable scripts that convert between the export formats of Blogger, LiveJournal, MovableType, and WordPress.

In addition, the source code includes templates for hosting these conversions on Google App Engine. Future additions to the project will include support for BlogML and synchronization tools between various services that do not provide an import/export feature but do provide APIs for accessing and modifying blog contents.
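To give a feel for what converting between export formats involves, here is a minimal Python sketch. It is not taken from the Google Blog Converters code. It assumes a simplified subset of the MovableType text export (metadata lines, a BODY section, sections closed by a line of five hyphens and entries closed by a line of eight) and writes a bare-bones Atom-style feed. The function names are hypothetical, and real exports carry more fields, comments and dates than this handles.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def _store_section(post, lines):
    """File one section of an entry into the post dict."""
    if not lines:
        return
    if lines[0] == "BODY:":
        post["body"] = "\n".join(lines[1:])
    elif lines[0].endswith(":"):
        pass  # other multi-line sections (e.g. COMMENT:) are ignored in this sketch
    else:
        # Metadata section: one "KEY: value" pair per line.
        for line in lines:
            key, _, value = line.partition(": ")
            post[key.lower()] = value


def parse_mt_export(text):
    """Parse the simplified MovableType-style export into a list of dicts."""
    posts, post, section = [], {}, []
    for line in text.splitlines():
        if line == "--------":      # end of entry
            if post:
                posts.append(post)
            post, section = {}, []
        elif line == "-----":       # end of section
            _store_section(post, section)
            section = []
        else:
            section.append(line)
    if post:                        # tolerate a missing final separator
        posts.append(post)
    return posts


def to_atom(posts):
    """Serialize posts as a minimal Atom-style feed (not a complete, valid feed)."""
    feed = ET.Element("feed", xmlns=ATOM_NS)
    for post in posts:
        entry = ET.SubElement(feed, "entry")
        ET.SubElement(entry, "title").text = post.get("title", "")
        author = ET.SubElement(entry, "author")
        ET.SubElement(author, "name").text = post.get("author", "")
        ET.SubElement(entry, "content", type="html").text = post.get("body", "")
    return ET.tostring(feed, encoding="unicode")


SAMPLE = """AUTHOR: Jane Doe
TITLE: Hello
DATE: 01/09/2009 10:00:00 AM
-----
BODY:
<p>First post.</p>
-----
--------
"""

if __name__ == "__main__":
    print(to_atom(parse_mt_export(SAMPLE)))
```

The split into a parser that produces plain dictionaries and a separate writer is a design choice for the sketch: with a neutral in-memory form in the middle, each additional format needs only one reader and one writer rather than a converter for every pair of services.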

We're excited to provide this level of control for your personal blog data. Contributions to the project are always welcomed and encouraged, so check out our source code (download, 2.7 MB zipped) and let us know what you think. We look forward to your feedback in our discussion group.

The NUI Group's First Google Summer of Code

Thursday, January 8, 2009

The Natural User Interface Group (NUI Group) is an interactive media group focused on research and creation of open source machine sensing techniques, such as voice/handwriting/gesture recognition and touch computing, to benefit artistic and educational applications. Additionally, the NUI Group is a world wide community offering a collaborative environment for developers that are interested in learning and sharing new Human Computer Interaction methods and concepts. Last year, we were chosen to participate in Google Summer of Code™ 2008 and we worked with 7 students, 6 of whom successfully completed their projects. It was a great opportunity to bring students into the world of Open Source human computer interaction and we were very excited by the results.

Stanislaw Zabramski worked on the multi-physics project. His main goal was to create a multi‐touch sensitive application for two‐dimensional graphic visualizations of a few basic concepts of physics, especially mechanics. His work is meant to be used by primary school pupils as a simple educational entertainment tool, thus making them familiar with physics in a more creative environment. Young users can actively participate in the learning process by designing and testing their own simulations in a visually catchy, cartoon‐style environment. The basic multi-touch enabled prototype application has been developed using Flash and ActionScript, and you can take a closer look at the interface in this screenshot:

We are looking forward to Stanislaw's release of the final version later this year.

Ashish Kumar Rai wrote a QMTSim application, a multi-touch input Tangible User Interface Object (TUIO) Simulator. Ashish developed this new simulator to allow fast development and debugging of multi-touch applications. TUIO is a versatile protocol, designed specifically to meet the requirements of table-top tangible user interfaces. While there is a Java based TUIO simulator, it does not help in utilizing the full capabilities of the protocol and only rudimentary applications can be developed using it. Ashish's implementation of QMTSim has many advantages over the Java TUIO simulator, including things like user defined touch point movement paths, an animation timeline, and support for simulations of pinching and zooming. Further, QMTSim provides opacity control to make the simulator transparent and to keep it above the application, thus giving an impression of touching the application itself. Ashish has recorded three videos on his project, including an introduction and two screencasts.
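As a rough illustration of what a simulated pinch looks like at the protocol level, the Python sketch below generates a path for two touch points moving toward each other and wraps each frame in messages shaped like the TUIO 1.x 2D cursor profile (alive, set, fseq). This is not QMTSim code. The function names are hypothetical, the velocity and acceleration fields are simply set to zero, and a real simulator would encode these messages as OSC bundles and send them over UDP.

```python
def pinch_path(center, start_gap, end_gap, steps):
    """Yield one frame per step; each frame lists (session_id, x, y) for two touches.

    Coordinates are normalized to the 0..1 range, and the two points move
    horizontally toward (or away from) the center as the gap changes.
    """
    cx, cy = center
    for i in range(steps + 1):
        t = i / steps
        half_gap = (start_gap + (end_gap - start_gap) * t) / 2
        yield [(1, cx - half_gap, cy), (2, cx + half_gap, cy)]


def tuio_messages(frame_id, touches):
    """Build the message list for one frame in the shape of the /tuio/2Dcur profile."""
    messages = [["/tuio/2Dcur", "alive"] + [sid for sid, _, _ in touches]]
    for sid, x, y in touches:
        # set: session id, position, then velocity (X, Y) and acceleration (m),
        # which this sketch leaves at zero.
        messages.append(["/tuio/2Dcur", "set", sid, x, y, 0.0, 0.0, 0.0])
    messages.append(["/tuio/2Dcur", "fseq", frame_id])
    return messages


if __name__ == "__main__":
    for frame_id, touches in enumerate(pinch_path((0.5, 0.5), 0.4, 0.1, steps=3)):
        for message in tuio_messages(frame_id, touches):
            print(message)
```

An application under test that listens for these messages would see two cursors closing in on the center of the surface, which is the kind of input a pinch-to-zoom handler needs in order to be exercised without touch hardware.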

Alessandro De Nardi worked on the Grafiti project, a general infrastructure for table-top application multi-touch and multi-user gesture recognition management. Grafiti is a C# framework built on top of the C# TUIO client designed to support the use of third party modules for specialized gesture recognition algorithms. A set of modules for the recognition of some basic gestures is included. You may want to check out Alessandro's demo video:

Thomas Hansen developed Graphics Processor Unit (GPU) accelerated blob tracking for multi-touch user interfaces (and other blob tracking needs for that matter) as part of his gpuTracker project. Video signals are processed by the GPU to provide real time tracking of blobs. gpuTracker is aimed specifically at tracking blobs such as those created by displays using Frustrated Total Internal Reflection (FTIR) or Diffused Illumination (DI). Check out the image of video input from a GPU enabled blob tracker:

Seth Sandler worked on the tbeta project, a blob tracking application for multi-touch screens built using image processing based techniques like FTIR and DI. The application is written in C++ and uses OpenFrameworks. Some of the most interesting features include an input video image filter chain, a quick camera switcher, dynamic mesh calibration for fish eye lenses, image reflection, and a GPU mode that allows for integration with the aforementioned gpuTracker code developed by Thomas Hansen. Most importantly, tbeta is cross-platform and already works on Mac, Linux and Windows.
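For readers new to blob tracking, the core detection step can be sketched on the CPU in plain Python. This is not how gpuTracker or tbeta are implemented (they work on live video, and gpuTracker runs on the GPU). It is a minimal sketch under simple assumptions: a grayscale frame given as a list of rows, a fixed brightness threshold, and 4-connected regions. The function name and parameters are hypothetical.

```python
from collections import deque


def find_blobs(frame, threshold=128, min_area=2):
    """Return the centroid and area of each bright connected region in a frame."""
    rows, cols = len(frame), len(frame[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or frame[r][c] < threshold:
                continue
            # Breadth-first flood fill collects every pixel of this blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and frame[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:  # drop specks that are probably noise
                blobs.append({
                    "x": sum(p[1] for p in pixels) / len(pixels),
                    "y": sum(p[0] for p in pixels) / len(pixels),
                    "area": len(pixels),
                })
    return blobs


if __name__ == "__main__":
    frame = [
        [0, 0, 0, 0, 0, 0],
        [0, 200, 200, 0, 0, 0],
        [0, 200, 200, 0, 0, 0],
        [0, 0, 0, 0, 0, 255],
        [0, 0, 0, 0, 255, 255],
    ]
    for blob in find_blobs(frame):
        print(blob)
```

Tracking then amounts to matching each frame's centroids to the nearest centroids from the previous frame so that every finger keeps a stable identity. Because each pixel can be thresholded and filtered independently, the early stages of this pipeline are a natural fit for GPU acceleration.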

Daniel Lelis Baggio wrote EHCI (Enhanced Human Computer Interface), a webcam image processing library built on top of OpenCV, which generates events from the user's head, hand and body movements. This library is also intended to track objects in order to support augmented reality applications. In order to enhance human computer interaction, the application uses a single webcam and does not require the use of either FTIR or DI techniques. Besides tracking positions, this library is also able to provide higher level events such as fetching the 3D position of the user's hand or head. You can get a better feeling for Daniel's work by watching his EHCI videos on YouTube.

Many congratulations to our students and many thanks to our mentors for making our first Summer of Code such a wonderful experience!

google-perftools 1.0 Released

Wednesday, January 7, 2009

Nearly four years ago, we released our first major Open Source codebase, google-perftools, a set of tools to help developers create applications with better performance. We've just released version 1.0 of the software, which includes the fastest malloc implementation we've seen, as well as a thread-friendly heap-checker, heap-profiler, and cpu-profiler. We're now working on improving performance even more in multi-threaded, multi-processor environments, including adding support for Non-Uniform Memory Access. Work continues unabated to enhance portability and performance on a wide range of systems.

Many thanks to the community for testing and bug reports on various operating systems and processors! Your help has made perftools a useful system for a truly wide range of developers. We look forward to your feedback in the project's discussion group.