Posts from August 2008

Summer of Coders at Google Krakow

Saturday, August 30, 2008

I was thrilled when I found out there would be a Google Summer of Code meetup in Krakow, not far from where I live. Not only did I want to visit Google's offices and see behind the scenes, I was also excited to meet fellow Summer of Code students and some of Google's engineers. The office visit, which was held on August 4th, turned out to be just as fun as I anticipated.

After we arrived, our hosts showed us around the office and offered "java" and snacks. Following refreshments, a few of the employees informally introduced the projects they are working on and told us about life at Google. It was very interesting to see how many different projects are done in this single office.

Each of us 20+ students then got a short time to introduce what we were working on for our Summer of Code projects. Once again it was interesting to see the wide variety of projects. After we finished talking about our projects, we were given some cool Google swag and there was time for discussion, more food and some fun (table tennis, Wii tennis, RockBand etc.). Time went by very quickly and too soon it was time to go, so we took some pictures and said goodbye to our hosts.

Mozilla and Eclipse Licenses Now Available for Hosting Users

Wednesday, August 27, 2008

You might remember that we recently removed the MPL from the list of licenses available to projects hosted on Google Code. We did this because we have been trying as a company to make a statement against open source license proliferation. You see, we feel it is damaging to the larger world of open source development if there are too many duplicative licenses. So why are we changing our minds about the MPL and EPL now?

Since we started hosting projects, we've been petitioned by the Eclipse Foundation (of which Google is a member) and its community of developers to include the EPL as an option for new projects. We've resisted until now as we felt that the features of the EPL were not unique enough to justify its inclusion. This hasn't changed, but how we think about licenses is getting a bit more nuanced.

Eclipse is an important, lively and healthy project with an enormous plug-in and developer community that uses an otherwise duplicative license. They aren't interested in using the BSD or other open source licenses that are readily combinable with EPL code. We have decided, after nearly 2 years of operation, that it is time to add the EPL and serve these open source developers.

We also want to show our solidarity with our friends at the Eclipse project through this action.

Considering the user base, and not just the popularity, of an otherwise non-majority license isn't unprecedented for us. For instance, we considered this when we first opened the site, in accepting the Artistic/GPLv2 combination, which sees little substantive use outside of Perl.

In that light, our removal of the MPL from the site seemed a little absurd. So, our bad. We're putting that option back up for new projects. The groups that want to use the MPL to enable their additions, extensions and more for Firefox and other Mozilla projects are legion and, considering their recent summit, represent a very healthy global collection of developers.

Let us know what you think in the comments and we look forward to seeing the new projects that we'll be able to serve here on Google Code.

Uzaygezen: Multi-Dimensional Indexing with Hilbert Curves

Monday, August 25, 2008

I'm pleased to announce the initial release of the Open Source project Uzaygezen, a Java library specialised in multi-dimensional indexing based on Hilbert curves. For those who may be wondering about the origins of the project's name, a fellow engineer in Dublin suggested this word from his native Turkish, as Uzaygezen means "space wanderer." The library supports mapping from a multi-dimensional space into one dimension via the Compact Hilbert Index. Additionally, Uzaygezen allows query building for databases with range query functionality, e.g. relational databases and, more generally, B-trees. To find out more and to check out the code, please visit the project homepage.
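To give a feel for the idea, here is a minimal Python sketch of the classic two-dimensional Hilbert curve mapping. It is only an illustration: it is not Uzaygezen's API (the library is written in Java), and it computes the plain Hilbert index on a square power-of-two grid rather than the Compact Hilbert Index. The names hilbert_index and box_to_ranges are made up for this example. The second function shows, by brute force, how a rectangular query can be turned into a few one-dimensional key ranges of the kind a B-tree can scan.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along the Hilbert curve, in the range 0 .. n*n - 1."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is in standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def box_to_ranges(n, x_lo, x_hi, y_lo, y_hi):
    """Brute-force illustration: list the Hilbert keys covered by an
    inclusive query box and merge them into contiguous (lo, hi) ranges."""
    keys = sorted(
        hilbert_index(n, x, y)
        for x in range(x_lo, x_hi + 1)
        for y in range(y_lo, y_hi + 1)
    )
    ranges = []
    for k in keys:
        if ranges and k == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], k)  # extend the current range
        else:
            ranges.append((k, k))            # start a new range
    return ranges


if __name__ == "__main__":
    print(hilbert_index(4, 3, 0))
    print(box_to_ranges(4, 0, 1, 0, 1))
```

Because cells that are close on the curve are also close in space, a query box tends to collapse into a small number of key ranges, each of which can be served by an ordinary one-dimensional range scan. A real implementation would compute those ranges without enumerating every cell.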

We always love to hear what you think. Please join the Uzaygezen discussion group to share your thoughts with us.

Git and Google Summer of Code

Friday, August 22, 2008

Many of you may remember our most recent episode of the Google Summer of Code™ podcast, Getting Giddy with Git. Now that we're heading into the program home stretch, we're back with an update on the successes (and failures) of Summer of Code within the Git community.

This year Git was fortunate enough to get 6 very hard-working students, focusing on a number of different projects to make Git more portable and run even faster. Among our high-profile projects for 2008, Miklos Vajna's native C port of git-merge entered the main tree on July 8th, just one day after the mid-term evaluation period began. Miklos' work shipped as a key part of Git 1.6.0, released on August 17th. We were also excited to see Marek Zawirski's push implementation for EGit (the Eclipse plugin) enter the main EGit tree on June 28th, weeks before the program mid-term. Difficulties working with SWT delayed Marek's GUI implementation, but it finally showed up at the last minute as a 31-patch series. We are looking forward to seeing this in the fall EGit release.

To read even more about all of our great student projects, see the mailing list thread Jakub Narebski started during the 'pencils down' period. Jakub offers an excellent summary of every student project, and many of the students provide more detailed updates later in the thread.

Sojourning in Szeged? Stop by DrupalCon 2008

Thursday, August 21, 2008

If you're passionate about web development and will be in or around Szeged, Hungary next week, be sure to stop by DrupalCon 2008. Nearly 500 developers will be descending on the city to discuss all things Drupal, from improving user experience to further growing this CMS' vibrant community. Our very own Leslie Hawthorn, geek herder extraordinaire, will be presenting on Open Source at Google, with highlights from Drupal's participation in the Google Summer of Code™ program and the Google Highly Open Participation Contest™.

If you can make it, be sure to swing by to say hello to Leslie!

Zurich Open Source Jam 4

Friday, August 15, 2008

In mid-July 2008 we had the fourth instance of the Open Source Jam in Zurich, an event for Open Source developers and users to meet, collaborate and network. This time it was a combined event with Google Summer of Code™ participants.

We had more than 55 people over, some of them giving talks on various projects. These projects included GDAL2Tiles, Tech Drawing Abilities for Inkscape, Libarchive, Mercurial, Mono, OLAT and Osmarender Frontend for OpenStreetMap.

When not listening to the lightning talks given in two blocks, people were talking to each other in small groups, discussing ideas and projects. As usual, you could hear various languages, from English and German to French. These jams are great not only for meeting Free Software people like Bram Moolenaar, creator of Vim, but also for meeting people of different cultural backgrounds. To make sure nobody was thirsty or hungry, Google provided free beer and food.

The Zurich Open Source Jams are semi-regular events. To stay informed about the details of the next one, or to catch up on discussions about previous ones, join the Open Source Jam Zurich Google Group. To get more information on Summer of Code, visit the program website.

SciFoo: 200 of the World’s Top Scientists Meet at Google’s Annual Meeting of Really, Really Smart People

Organized in collaboration with Nature Publishing Group and O’Reilly Media (“FOO” stands for “Friends of O’Reilly”), and hosted at the Googleplex, the third annual Science Foo Camp (SciFoo) unconference boasts no predefined agenda. Rather, participants are invited to propose their session topics on a giant white board, in various time slots with eight sessions running concurrently.

Most academic conferences are highly specialized and attended time and again by the same people. Here, to promote fruitful cross-pollination, participants hail from dozens of science and technology disciplines, from biology and astrophysics to CS and nano-technology. Attendance is invitation only; in the interest of mixing things up, many of the 200+ participants are not invited twice. “SciFoo allows people at different institutions and from different disciplines to interact with each other,” says Open Source Programs Manager Chris DiBona, who spearheads SciFoo. “It gives them a rare chance to talk freely with each other in a private setting.”

This year, the conference was attended by Eric, Sergey, Larry Page, and Larry Brilliant of, along with a bevy of Google organizers and volunteers. The list of "campers" boasted four Nobel Prize winners (Sydney Brenner, Walter Gilbert, Andy Fire, and Frank Wilczek) and a laundry list of champions in the scientific community. Here are just a few: George Dyson (scientific historian), Brian Cox (physics popularizer, spokesman at CERN), Aubrey de Grey (biomedical gerontologist who studies "living young longer"), Eugenie Scott (director, National Center for Science Education), Brother Guy Consolmagno SJ (astronomer at the Vatican), Neal Stephenson (science fiction writer), Nick Bostrom (transhumanist philosopher), Dan Tani (NASA astronaut, who has spent 131 days in space), Stewart Brand (creator of The Whole Earth Catalog), Jill Bolte Taylor (neuroanatomist, author of the recent bestseller My Stroke of Insight [see TED talk]), notable theoretical physicists Lord Martin Rees (England's Astronomer Royal), Max Tegmark, Paul Davies, Lee Smolin, and renowned oceanographer Sylvia Earle. To give the conference some oomph, rocket scientist Carl Dietrich brought along a model of his Terrafugia "roadable aircraft," also known as a flying car, and Ian Wright parked his X1 all-electric performance car, capable of 0-60 MPH in 3.07 seconds, by the dining tent.

Certain themes recurred. One was the need to do a better job of open sourcing data within the science community, including negative results; such sharing would enable collaboration and prevent scientists from "reinventing the wheel." A number of seminars also addressed the more quotidian concerns of studying science, from navigating office politics in academia to finding ways of making the discipline more exciting to young people. Many talks were also informed by specific social and humanitarian concerns, such as how Google can help detect emerging global pandemics, how genomic testing can help people prevent diseases, and, in a nutshell, what we can all do to ensure the long-term survival of the human race.

“A scientifically literate world is one that’s good for everyone,” DiBona says, summarizing the intent behind the conference. “People who are better educated will better understand what’s possible on the Internet. As Googlers, I think it’s incumbent on us to try to support basic science research and education around the world.”

You can learn more about SciFoo by checking out the blog buzz and news coverage aggregated at

Opportunities for Students at the Linux Plumbers Conference

Thursday, August 14, 2008

We love helping our colleagues come together in the spirit of collaborative learning, and we're even more delighted to do so when it gives us the chance to help students learn more about Free and Open Source Software. Google's Open Source Team is a proud sponsor of the upcoming Linux Plumbers Conference, which will be held in Portland, Oregon, USA in mid-September. This first-time conference has added a student-specific mini-conference focused on providing hands-on tutorials and development coaching; full-time students are welcome to attend the tutorial day along with the additional three days of the conference for only 50 USD. Tutorials will be taught by some of the most well-known members of the Linux community, including several of the conference speakers.

Early registration is open through Monday, 18 August. We certainly hope to see you there!

Summer of Coders at Google's Bangalore R&D Center

Wednesday, August 13, 2008

At the end of July, several Google Summer of Code™ students got a chance to visit Google Bangalore. This being my first Summer of Code, I was very excited to visit the Google office and get a feel for what life is like when you work at Google. We had a bunch of students visit from all across India, and getting to meet them in person after days of talking on IRC added to the excitement!

Our hosts enlightened us about what Google is all about, the office itself, the company's India operations, and the "Google 15" factor that comes with all the awesome food! We were also treated to a talk on the History of Google. Mr. Rahul Roy-Chowdhury’s presentation showcasing Google’s Vision & DNA gave us insights into how “different” Google’s business has been right from the start. We also got to know how new and innovative things are conceived of at Google and how projects end up in Google Labs, specifically the Google Indic Transliteration engine and Google News.

We all had a fantastic time and left feeling enriched and inspired. Many thanks to Google for hosting us!

Keyczar: A New Crypto Toolkit

Monday, August 11, 2008

We are pleased to announce the Open Source release of Keyczar, a toolkit that makes cryptography safer and easier to use. For more information, please visit the Keyczar homepage or read more on the Google Online Security Blog.

Linux Disk Scheduler Benchmarking

Friday, August 8, 2008

Over the last six months, Google has sponsored Gelato@UNSW to take a close look at the disk schedulers in Linux, particularly when combined with RAID.

We benchmarked the four standard Linux disk schedulers using several different tools (see our wiki for full details) and lots of different workloads, on single SCSI and SATA disks, and on hardware and software RAID arrays from two to eight spindles (hardware RAID) and up to twenty spindles (software RAID), trying RAID levels 0 through 6.

We had to fix some of the benchmarking tools (the fixes are now upstream), and we developed a new one: a Markov Chain based replay tool, which allows a workload to be characterised and then a similar workload generated.
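For readers unfamiliar with the technique, the following Python sketch shows the general shape of a Markov-chain workload model: count how often one kind of operation follows another in a recorded trace, then walk those probabilities to produce a statistically similar trace. This is an illustration only, not the replay tool described above. The state labels ("read", "write") and the function names characterise and generate are assumptions made for the example, and a real tool would model far more (offsets, sizes, timing).

```python
import random
from collections import Counter, defaultdict


def characterise(trace):
    """Count transitions between consecutive operations in a trace.
    Returns {state: Counter({next_state: count})}."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(trace, trace[1:]):
        counts[cur][nxt] += 1
    return counts


def generate(counts, start, length, seed=0):
    """Generate a synthetic trace by a random walk over the model,
    choosing each successor in proportion to how often it was observed."""
    rng = random.Random(seed)
    state = start
    out = [state]
    while len(out) < length:
        successors = counts.get(state)
        if not successors:
            break  # dead end: this state was never followed by anything
        states = list(successors)
        weights = [successors[s] for s in states]
        state = rng.choices(states, weights=weights)[0]
        out.append(state)
    return out


if __name__ == "__main__":
    # Hypothetical recorded trace of operation types.
    recorded = ["read", "read", "write", "read", "read", "write"]
    model = characterise(recorded)
    print(generate(model, "read", 10))
```

The generated trace is not a copy of the recording; it only shares its transition statistics, which is what makes it useful for producing longer or repeated runs of a "similar" workload.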

We found bugs in all the schedulers; the ones in the deadline and anticipatory schedulers we fixed, and the current kernel has our fixes in it. CFQ's problems are harder to fix; we are continuing to work on them.

The work was presented at the Linux Storage and Filesystem Workshop, and in January at the 2008 Kernel Mini-Conference. See our Talks wiki page for links to slides and video.

Our major finding is that the best I/O scheduler to use is very dependent on your workload. The deadline scheduler seems to give a good compromise between bandwidth and bounded latency, but for particular workloads on small numbers of disks the anticipatory scheduler (AS) and CFQ can outperform it by a long way. In our measurements on hardware RAID the benefits of anticipation are negligible with more than three or four spindles, and CFQ's worst-case performance (which seems to be very easy to trigger) is orders of magnitude worse than that of any other scheduler.
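If you want to try a different scheduler for your own workload, Linux exposes the choice per block device in sysfs at /sys/block/<device>/queue/scheduler. Reading the file lists the available schedulers with the active one in square brackets, and writing a name to it (as root) switches scheduler. The Python sketch below only reads and parses that file. It is not part of the benchmark suite, and the helper names and the device name "sda" are assumptions for the example.

```python
from pathlib import Path


def scheduler_path(device):
    """sysfs file that lists (and selects) the I/O scheduler for a device."""
    return Path("/sys/block") / device / "queue" / "scheduler"


def parse_schedulers(text):
    """Parse a line such as 'noop anticipatory deadline [cfq]'.
    Returns (available_names, active_name)."""
    names = text.split()
    active = next((n.strip("[]") for n in names if n.startswith("[")), None)
    available = [n.strip("[]") for n in names]
    return available, active


if __name__ == "__main__":
    path = scheduler_path("sda")  # hypothetical device name
    if path.exists():
        available, active = parse_schedulers(path.read_text())
        print("available:", available, "active:", active)
    else:
        print(path, "not found on this system")
```

Since the results depend so heavily on workload, the practical approach is to switch schedulers this way and measure with your own access pattern rather than rely on a single default.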

The most interesting results are outlined on our wiki; full results will be published later this year.

distcc's pump mode: A New Design for Distributed C/C++ Compilation

Thursday, August 7, 2008

For a while now, Google has been using distcc, a distributed C/C++ compilation system, to speed up building software made of millions of lines of code. With distcc, we can build code an order of magnitude faster than we could if everyone had to compile on their own workstation. But even with distcc, compiles could take a long time: compiling the Google Webserver might take 20 minutes. We started looking at distcc to see if we could make it even faster.

We're proud to report that we've succeeded: we've developed an algorithm we call "pump mode", which can be added to distcc to speed it up by a factor of 3. Pump mode works by pushing even more processing onto the servers. Based on an incremental static analysis of the source code, pump mode is able to quickly identify the sets of files needed for the preprocessing phase of compiling C/C++ programs and send them to the compilation servers for preprocessing. This achieves a dramatic decrease in the CPU load of the workstation and of course much better build speed. We have tested pump mode on some open source software and seen improvements in build speed between 50% (the Linux kernel) and 200% (Samba). With simple changes to the project Makefiles, most projects we have looked at would be even faster!
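The core idea, working out which files a compile will need before shipping them to a server, can be illustrated with a much simplified Python sketch. This is not pump mode's actual algorithm: it handles only quoted #include lines, ignores conditional compilation, macros, include search paths and system headers, and is not incremental. The function name include_closure and the in-memory sources dictionary are assumptions made to keep the example self-contained.

```python
import re

# Matches lines of the form:  #include "file.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def include_closure(root, sources):
    """Return the set of files reachable from `root` by following
    quoted #include directives. `sources` maps file name -> file text."""
    needed = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in needed or name not in sources:
            continue  # already visited, or not a file we know about
        needed.add(name)
        stack.extend(INCLUDE_RE.findall(sources[name]))
    return needed


if __name__ == "__main__":
    # Hypothetical miniature project.
    sources = {
        "main.c": '#include "a.h"\n#include <stdio.h>\nint main(void) { return 0; }\n',
        "a.h": '#include "b.h"\n',
        "b.h": '#include "a.h"\n',  # include cycle, handled by the visited set
        "unused.h": "",
    }
    print(sorted(include_closure("main.c", sources)))
```

With the set of needed files known up front, the workstation can send the unpreprocessed source plus headers to a server and let the server run the preprocessor, which is where the reduction in local CPU load comes from.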

The pump mode extension has been Google's main C/C++ build system for over a year now.

Distcc's pump mode was developed by a small team at Google that included myself, Manos Renieris, Fergus Henderson, and Craig Silverstein. The pump mode extension complements the recently released open source gold linker, which addresses the other basic bottleneck for fast building of C/C++ software.

Distcc's pump mode is included in release 3.0 of distcc. This is the first release since 2004, when Martin Pool, the original author of the code base, released version 2.18.3. Distcc 3.0 contains many other contributions from a variety of contributors, including Avahi Zeroconf support by Lennart Poettering, "lsdistcc" by Dan Kegel, and bug fixes and portability improvements by Nadim Khemir, Maks Verver, Niklaus Giger, Sascha Demetrio, Alex Besogonov, Ben Skeggs, Lisa Seelye, Lei Zhang, Michael Moss, Dongmin Zhang, and others. Distcc is now maintained by Fergus Henderson.