Posts from August 2013

Who is New in Google Summer of Code - Part 11

Friday, August 30, 2013

Over the past 11 weeks we have highlighted 29 open source organizations participating in their first Google Summer of Code. For our final post in this annual summer series, we have organization administrators from Funf Open Sensing Framework and the Open Source Robotics Foundation describing their students’ projects below.

The Funf Open Sensing Framework is an extensible sensing and data processing framework for mobile phones. The Funf project aims to help developers as well as non-technical researchers and individuals. The Funf-in-a-Box service lets users configure and build a custom data collection app in less than five minutes, with zero programming. Funf Journal, a mobile app available on the Android Play Store, allows users to collect and explore data about their lives (quantified-self), and gives developers/researchers a chance to evaluate the capabilities under the hood. All of these are built on top of the Funf SDK, which can be used by developers to incorporate sensing functionality into their apps, and can be extended to provide new sensing capabilities.
This is Funf’s first year in Google Summer of Code and we formed a small, tight working group consisting of our two organization mentors and our two amazing students, Swetank Kumar Saha and Pararth Shah. We have been working on two fronts: adding core functionality to the Funf SDK and enhancing Funf-in-a-Box (FIAB), which has become very popular with researchers and data collection enthusiasts. Pararth has spearheaded work on the core library and added support for high bandwidth probes, including raw audio, video, and timelapse. We’re now diving into advanced triggers and scheduling that will allow for dynamic sensing configurations. On the FIAB side, Swetank is adding support for configuring and deploying custom surveys and capturing additional user input. Amazingly, he also managed to fit in porting the FIAB architecture from a single Amazon server to Google Cloud Services, which is going to greatly reduce our costs and increase our performance. The summer isn’t over yet, so stay tuned for more new features and updates!
By Nadav Aharony and Alan Gardner, Funf Organization Administrators 

The Open Source Robotics Foundation (OSRF) has a clear mission: "To support the development, distribution, and adoption of open source software for use in robotics research, education, and product development." We have three exciting Google Summer of Code projects contributing to CloudSim, Gazebo, and ROS, which currently represent three of our biggest open source efforts.  
Esteve Fernández has been adding support for OpenStack to CloudSim, a project developed by OSRF to run robotics simulations in the cloud. CloudSim was used to support the DARPA Virtual Robotics Challenge and currently supports the Amazon and SoftLayer clouds. Esteve added support for private clouds to CloudSim, enabling organizations to run simulations on their own networks. Furthermore, Esteve is contributing to CloudSim by fixing bugs, improving the code base and helping with any task that comes up. In the following weeks, he will be making CloudSim constellations accessible to users in an OpenStack cloud provided by OSRF as a public service. 
Andrei Haidu is developing a fluid dynamics physics engine for the Gazebo robot simulator that will enable the use of aerial and underwater vehicles in simulation. 
Gonzalo Abella is developing a new parameter server prototype for ROS. He is exploring making all parameters dynamic, and integrating the capabilities of the dynamic_reconfigure package into the core API. 
By Carlos Agüero, Open Source Robotics Foundation Organization Administrator

We hope you have enjoyed reading about many of the new organizations participating in Google Summer of Code this year over these past 3 months. Students have a little over two weeks to wrap up their summer projects before the soft pencils down date on September 16th. For more details on important dates you can visit our program timeline, and you can read about all of the 177 open source projects participating in this year’s Google Summer of Code on the program site.

By Stephanie Taylor, Open Source Programs

Nine years of Google Summer of Code and KDE: Still Going Strong

Wednesday, August 21, 2013

Bringing innovation and teaching together, the KDE community takes the spirit of Google Summer of Code and keeps building on it to reach even more students. For the ninth year, the KDE community is participating in the Google Summer of Code with 50 student projects. KDE also has three women participating this summer in the Outreach Program for Women run by the GNOME Foundation. As in previous years, the KDE community has organized the 'Season of KDE' for motivated students whose proposals didn't make it in either Google Summer of Code or the Outreach Program for Women.

With so many students and a request for them to blog regularly about their progress, Planet KDE has many Google Summer of Code related posts written by this year’s students. This allows the wider KDE community to follow the work done by the students and comment on it. But this is not the only way the students share what they are doing. At the Akademy conference in Bilbao, many students were present, presenting their work in their own session or as part of the Student Programs Presentation. Students also update the KDE wiki with the status of their projects.

The Google Summer of Code students are working on a wide variety of KDE projects, from components of the basic shell (network management) to the core of that shell itself (dynamic switching between shells based on form factor changes) to end user applications. Some projects move out of the desktop sphere, with a web shop for the popular Krita painting application, as well as a project report tool showing development statistics on KDE sub-projects. 

Projects are pushing the boundaries of technology, bringing in OpenGL and collaborative text editing in KDE applications, and exploring unique interfaces for features like the human-friendly query parser for the Semantic Search technology in KDE.

Since the first Google Summer of Code in 2005, the KDE community has continued to push the boundaries of technology. Students discover how the process of collaboration and open innovation results in a great experience, and the IT world gains valuable new participants. And the students in turn get a chance to shine while making a valuable contribution to society!

By Jos Poortvliet, KDE Marketing team

GNU Tools Cauldron 2013

Monday, August 19, 2013

Recently Google hosted just over 100 GCC (GNU Compiler Collection) developers at our Mountain View, CA headquarters for the 3rd annual GNU Tools Cauldron. The purpose of this 2+ day workshop was to gather GNU tools developers together to coordinate work, exchange reports on ongoing efforts, and discuss development plans for the next 12 months.

For me, the most interesting result was the final realization that GCC is once again in need of a major code base reorganization.  For several quarters we have been working on modernizing the codebase. We switched it to C++, we started converting core data structures and are beating the refactoring drum. All those efforts were not wasted and there are two major efforts in progress now that will fundamentally alter the structure of the compiler.

The one I'm most hopeful about is Red Hat's proposal to extend the modularization effort we had started last year. This will see the compiler split into hermetic modules that will only be able to communicate via well-defined interfaces (if you've ever hacked on GCC, you'll know how much this simplifies life).

The other, related, effort is the removal of all global state from the compiler.  The final goal is to allow the compiler to be turned into a shared library and used for JIT purposes.

Among my favorites of the videos of the presentations from the Cauldron are those dealing with this overhaul of the GCC codebase.  This is going to fundamentally transform GCC internals over the next couple of years. GCC 5.0 will be unrecognizable. Watch the videos on the removal of global state and GCC re-architecture.  Exciting stuff!

Another especially interesting moment at the Cauldron was when Dehao Chen presented his work on AutoFDO.  This work will significantly simplify the usability of FDO technologies.  I can't wait to see this submitted to trunk.

We also had the usual collection of presentations about optimizations, runtime BOFs, new features and “lively discussion” over meals.  Good, geeky times were had by all.

By Diego Novillo, Compiler team

Who is New in Google Summer of Code - Part 9

Friday, August 16, 2013

This week we have three more new Google Summer of Code organizations, MLton, Buildroot and Privly, explaining their projects and what their students are currently working on.

MLton is an open source, whole program, optimizing compiler for the Standard ML programming language. Standard ML is a strict, statically typed, functional programming language with type inference, abstract data types, a sophisticated module system, garbage collection and many other features. As a high-level language with advanced programming language features, Standard ML is a challenge to implement efficiently. MLton uses whole program compilation to provide both advanced programming language features and superior performance.  
This is the first year that MLton is participating in Google Summer of Code and we are excited to be mentoring two students. Tucker DiNapoli is working on adding a rich collection of SIMD primitives to the compiler and developing an SML library that exposes these primitives to the programmer. Nate Burgers is working on tooling support that will allow MLton to target RTEMS (Real-Time Executive for Multiprocessor Systems) and runtime system improvements to make MLton generated code better suited for real-time embedded systems. 
By Matthew Fluet, MLton Organization Administrator 
In the context of embedded Linux systems, one often needs to create highly-customized Linux systems, comprising a Linux kernel, a bootloader and a root file system with multiple libraries and applications. Buildroot is a tool that allows one to build from source all the components of an embedded Linux system, using cross-compilation. It supports a wide range of CPU architectures (x86, x86-64, ARM, PowerPC, MIPS, Blackfin, ARC, Xtensa and more) and more than 1000 userspace packages (including Gtk, Qt, Gstreamer and many more). Developed by an active community, Buildroot is used by many embedded CPU vendors as the base for their development kit, and is also used by a number of embedded system makers for their products.
The focus of our Buildroot Google Summer of Code project by Spenser Gilliland is to improve the support of the multimedia features of various ARM processors in Buildroot. This involves creating Buildroot packages for the various OpenGL libraries or hardware-accelerated video encoding/decoding libraries that are needed on ARM processors. So far, thanks to the Google Summer of Code, improvements to video decoding on Raspberry Pi have been integrated, OpenGL support for OMAP3 and AM335x platforms (BeagleBoard, BeagleBone) has been integrated, as well as OpenGL support for Allwinner SOCs (Cubieboard and other similar platforms). 
By Thomas Petazzoni, Buildroot Google Summer of Code Organization Administrator 

The Privly Project is developing browser extensions to layer stronger security and privacy properties onto the web. The system works by injecting security applications into the context of potentially untrusted websites. Since the injected content is contained within its own application, Privly can support all relevant cryptographic protocols without downloading server-stored JavaScript on every request. This fixes a fundamental issue with security in the JavaScriptable web. 
Since mobile browsers are not commonly extendable, Privly needs a different approach for Android and iOS. After a design process led by Google Summer of Code applicants, two students were selected to create a novel way of integrating Privly's security applications with mobile platforms. The mobile apps will be able to post content securely through other mobile applications, as well as pull encrypted content from various sources and display it in-app. 
By Sean McGregor, Privly Organization Administrator

Over the last 2 months we have highlighted 26 of the 40 new organizations participating in Google Summer of Code this year. To view a complete list of the projects the students are working on this summer you can visit the Google Summer of Code program site.

By Stephanie Taylor, Open Source Programs

Learning the meaning behind words

Wednesday, August 14, 2013

Today computers aren't very good at understanding human language, and that forces people to do a lot of the heavy lifting—for example, speaking "searchese" to find information online, or slogging through lengthy forms to book a trip. Computers should understand natural language better, so people can interact with them more easily and get on with the interesting parts of life.

While state-of-the-art technology is still a ways from this goal, we’re making significant progress using the latest machine learning and natural language processing techniques. Deep learning has markedly improved speech recognition and image classification. For example, we’ve shown that computers can learn to recognize cats (and many other objects) just by observing large numbers of images, without being trained explicitly on what a cat looks like. Now we apply neural networks to understanding words by having them “read” vast quantities of text on the web. We’re scaling this approach to datasets thousands of times larger than what has been possible before, and we’ve seen a dramatic improvement of performance -- but we think it could be even better. To promote research on how machine learning can apply to natural language problems, we’re publishing an open source toolkit called word2vec that aims to learn the meaning behind words.

Word2vec uses distributed representations of text to capture similarities among concepts. For example, it understands that Paris and France are related the same way Berlin and Germany are (capital and country), and not the same way Madrid and Italy are. This chart shows how well it can learn the concept of capital cities, just by reading lots of news articles -- with no human supervision:

The model not only places similar countries next to each other, but also arranges their capital cities in parallel. The most interesting part is that we didn’t provide any supervised information before or during training. Many more patterns like this arise automatically in training.
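The parallel arrangement described above is what makes simple vector arithmetic work on word representations. The following is a minimal sketch, not the word2vec toolkit itself: the two-dimensional vectors are hand-picked for illustration (real models learn vectors with hundreds of dimensions from text), and the helper names `cosine` and `analogy` are hypothetical.

```python
import math

# Toy 2-D vectors, hand-picked for illustration only (NOT word2vec output).
# Axis 0 loosely encodes "which country", axis 1 encodes "is a capital".
VECTORS = {
    "France":  (1.0, 0.0),
    "Paris":   (1.0, 1.0),
    "Germany": (2.0, 0.0),
    "Berlin":  (2.0, 1.0),
    "Italy":   (3.0, 0.0),
    "Rome":    (3.0, 1.0),
    "Spain":   (4.0, 0.0),
    "Madrid":  (4.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def analogy(a, b, c, vectors=VECTORS):
    """Answer "a is to b as c is to ?" using the offset b - a + c."""
    target = tuple(
        vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    # The three query words are excluded from the candidate answers.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("France", "Paris", "Germany"))
```

With these toy vectors, the offset from France to Paris is the same as the offset from Germany to Berlin, so the query "France is to Paris as Germany is to ?" lands on Berlin.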

This has a very broad range of potential applications: knowledge representation and extraction; machine translation; question answering; conversational systems; and many others. We’re open sourcing the code for computing these text representations efficiently (on even a single machine) so the research community can take these models further.

We hope this helps connect researchers on machine learning, artificial intelligence, and natural language so they can create amazing real-world applications.

By Tomas Mikolov, Ilya Sutskever, and Quoc Le, Google Knowledge

Gumbo: A C library for parsing HTML

Tuesday, August 13, 2013

We're pleased to announce the open source release of the Gumbo HTML parser, a C implementation of the HTML5 parsing algorithm.

One of the big accomplishments of the HTML5 standard was to standardize the HTML parsing algorithm, so that all browsers see the same HTML document in the same way. So far, most implementations of this algorithm have either been tied to specific browsers or rendering engines, or they've been written in specific scripting languages. This makes it hard to write quick one-off tools to manipulate and clean up HTML if you don't happen to be working in a language that already has an HTML5-compatible parsing library.

Gumbo seeks to provide a simple library that can serve as a basic building block for linters, refactoring tools, templating languages, page analysis, and other small programs that need to manipulate HTML. It's written in pure C for ease of interfacing with other languages, and has no outside dependencies. Gumbo was built from the start to support source locations and correlating nodes in the parse tree with positions in the original text.
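To illustrate what correlating parsed elements with source positions looks like in practice, here is a small sketch. It does not use Gumbo: it swaps in the `html.parser` module from the Python standard library, which is a simple event-based parser and not an implementation of the HTML5 parsing algorithm. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TagLocator(HTMLParser):
    """Records where each start tag appears in the original text."""

    def __init__(self):
        super().__init__()
        self.locations = []

    def handle_starttag(self, tag, attrs):
        # getpos() returns (line, column): line is 1-based, column is 0-based.
        line, column = self.getpos()
        self.locations.append((tag, line, column))


def locate_tags(html):
    """Return a list of (tag, line, column) for every start tag in html."""
    parser = TagLocator()
    parser.feed(html)
    parser.close()
    return parser.locations


for tag, line, column in locate_tags("<ul>\n  <li>a</li>\n</ul>"):
    print(f"<{tag}> at line {line}, column {column}")
```

A linter or refactoring tool can use positions like these to report problems against the exact place in the file where an element was written.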

For more information including download, installation, and usage instructions, please visit the Gumbo project page.

By Jonathan Tang, Search Features team

Who is New in Google Summer of Code - Part 8

Friday, August 9, 2013

For the 8th post in our Google Summer of Code series highlighting the new open source organizations we are welcoming into this year’s program, we have organization administrators from LTTng, Constellation and PLASMA describing their students’ projects below.

The LTTng project (Linux Trace Toolkit - next generation) aims at providing highly efficient tracing tools for Linux. Its tracers help track down performance issues and debug problems involving multiple concurrent processes and threads. Tracing across multiple systems is also possible. Apart from LTTng's kernel tracer and userspace tracer, viewing and analysis tools are also part of the project. LTTng's performance relies on techniques such as Userspace RCU, lockless algorithms, per-cpu data structures and cache impact minimization.
During Google Summer of Code 2013, our two students will work on the following projects:

  • Zifei Tong will work on dynamic instrumentation support for the userspace tracer (UST). The current UST tracer relies on static tracepoint probes manually inserted in the traced application’s source code. This project aims at providing dynamic instrumentation capabilities in arbitrary applications. 

  • Xiaona Han will work on improvements to the Babeltrace Python bindings. Most of the public Babeltrace API is currently mapped using SWIG. However, a more “Pythonic” set of wrapper classes will make trace reading and writing more accessible than using the current native API.
By Christian Babeux, LTTng Organization Administrator 

Constellation is a young academic group at the University of Stuttgart, Germany. Our goal is to provide the creative environment for realization of different aerospace projects. As one branch the group offers a distributed supercomputing platform for solving aerospace related numerical problems. The massive computing power is provided by volunteers donating their idle computing time at home by forming a virtual super-computer via the internet. For this citizen space science method we are using the open source software called BOINC (Berkeley Open Interface for Network Computing). We are currently supported by 7,000 volunteers providing 20,000 host PCs to our computing grid where they help us optimize the thrust curve of a hybrid-engine sounding rocket by the student group HyEnD, and find a trajectory between Earth and the Earth-Moon-libration point EML-4 for the communication relay satellite mission TYCHO.

We are really honored to be part of this year's Google Summer of Code, and we have three students working on diverse ideas. We try to find the optimum interplanetary trajectory for one probe to all eight planets in our solar system and Pluto with "Solar System Grand Tour", which includes n-body simulation and implementing optimizing algorithms. We want to support flying observatories like SOFIA by NASA and DLR to maximize their observation times by optimizing flight routes with "airborne observatory". And lastly, we want to excite children about space with an educational and fun "space trumps" card game for Android mobile devices, so that they will be involved in STEM professions when they grow up. This is the power of citizen science and open-source!
By Andreas Hornig, Constellation Organization Administrator


PLASMA is a research group at the University of Massachusetts that works on a diverse array of projects that span the space of programming languages and systems. We are very excited to take part in Google Summer of Code for the first time! This summer we are fortunate to have three excellent students on board working on two of our projects, Doppio and CheckCell. 
Doppio is a Java Virtual Machine written in CoffeeScript that can run unmodified JVM programs in the browser -- no plugins or recompilation required! Giles Lavelle is working on adding AWT and Swing support to Doppio, which will enable us to run GUI programs. Braden McDorman is adding networking support to Doppio using WebSockets, which will let JVM programs make use of the network through standard Java socket APIs. 
CheckCell is a tool for finding input data errors in spreadsheets. CheckCell combines statistical and program analysis techniques to find errors: values that have an unusually high influence on formulas or charts. These values are either extremely important, or are wrong. Our Google Summer of Code student, Alexandru Toader, is porting CheckCell from Microsoft Excel to Google Spreadsheet, allowing anyone with a modern web browser to use CheckCell online. 
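The idea of a value having an unusually high influence on a formula can be sketched with a simple leave-one-out comparison. This is only an illustration under stated assumptions, not CheckCell's actual analysis: the functions `influence` and `most_influential` are hypothetical, and the formula is any Python function that takes a list of numbers.

```python
from statistics import mean


def influence(values, formula):
    """Score each input by how much the formula result changes without it."""
    baseline = formula(values)
    scores = []
    for i in range(len(values)):
        without = values[:i] + values[i + 1:]
        scores.append(abs(formula(without) - baseline))
    return scores


def most_influential(values, formula):
    """Return the index of the input with the largest leave-one-out impact."""
    scores = influence(values, formula)
    return max(range(len(values)), key=scores.__getitem__)


# A column where one entry was probably typed with extra zeros.
column = [10, 12, 11, 1200, 9]
suspect = most_influential(column, mean)
print(f"Check cell {suspect}: value {column[suspect]}")
```

The flagged value is either genuinely important or a data entry error, which is why a tool of this kind asks a person to review it instead of correcting it automatically. The sketch needs at least two values when used with `mean`.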
By John Vilk, Daniel Barowy, and Emery Berger, PLASMA Organization Administrators
These are only a few of the new organizations participating in Google Summer of Code 2013. Stay tuned for our final posts over the next couple of weeks highlighting the remaining 40+ new organizations. A complete list of the 177 organizations mentoring students this year and the program timeline are available on the Google Summer of Code program site.

By Stephanie Taylor, Open Source Programs

More patents in the service of open source

Thursday, August 8, 2013

Open-source software has accelerated the pace of innovation in computing, leading to better products and services at lower cost. But as the impact of open-source software has grown, so too has the number of patent attacks against it.

In March, we announced an Open Patent Non-Assertion (OPN) Pledge—committing not to sue any user, distributor or developer of open-source software on specified patents, unless first attacked. Our goal was to encourage pro-competitive, defensive uses of patents to support open-source innovation.

Today we are pleased to pledge an additional 79 patents under the OPN. These patents cover software used to efficiently operate data centers, including middleware, distributed storage management, distributed database management, and alarm monitoring.

We acquired these patents from IBM and CA Technologies, companies that in 2005 were among the first to make open-source patent pledges. The goal of the patent system is to foster innovation, and we aim to use patents, whether acquired or developed internally, in support of that goal.

You can learn more about this second group of patents and the Pledge itself on our site, which we’ve also updated to make it easier to browse and download data on pledged patents.

To date, the patents we’ve included in the Pledge have generally related to “back-end” technologies: servers, data centers, and the like. But open-source software is also transforming the development of consumer products that people use every day—so stay tuned for additional extensions to patents covering those sorts of technologies.

By Duane Valz, Senior Patent Counsel

Google Summer of Code Full of Stats - Part 3, Countries

Monday, August 5, 2013

In the third and final post in our series of statistics posts for Google Summer of Code 2013, and back by popular demand, we have a list of all countries represented in the program this year. Get ready to scroll! There are students from 71 countries this year, including two newcomers: Cameroon and Tunisia.

2013 Student Participants by Country 
Algeria = 4
Argentina = 10
Australia = 10
Austria = 24
Azerbaijan = 1
Bangladesh = 2
Belarus = 5
Belgium = 9
Bosnia-Herzegovina = 2
Brazil = 13
Bulgaria = 6
Cameroon = 2
Canada = 31
Chile = 3
China = 65
Colombia = 1
Croatia = 5
Cyprus = 1
Czech Republic = 11
Denmark = 4
Ecuador = 1
Egypt = 11
Estonia = 1
Finland = 6
France = 35
Germany = 68
Ghana = 1
Greece = 17
Hungary = 17
India = 271
Indonesia = 2
Ireland = 5
Israel = 1
Italy = 20
Japan = 6
Kazakhstan = 1
Kyrgyz Republic = 1
Latvia = 1
Lithuania = 3
Macedonia = 3
Malaysia = 1
Mexico = 3
Moldavia = 2
Netherlands = 7
New Zealand = 3
Pakistan = 1
Peru = 3
Poland = 31
Portugal = 7
Romania = 42
Russian Federation = 37
Serbia = 2
Singapore = 17
Slovak Republic = 9
Slovenia = 6
South Africa = 2
South Korea = 4
Spain = 35
Sri Lanka = 56
Sweden = 17
Switzerland = 6
Taiwan = 5
Tunisia = 1
Turkey = 13
Ukraine = 20
United Kingdom = 35
United States = 143
USA Minor Outlying Islands = 1
Uzbekistan = 1
Vietnam = 2

By Mary Radomile, Open Source Programs

Who is New in Google Summer of Code - Part 7

Friday, August 2, 2013

We are halfway through Google Summer of Code 2013 and with the three projects below, we have now highlighted 20 of this year’s 40 new open source organizations in our weekly blog series.

The Maryland Institute for Technology in the Humanities (MITH) is a leading digital humanities center that pursues disciplinary innovation and institutional transformation through applied research, public programming, and educational opportunities. Jointly supported by the University of Maryland College of Arts and Humanities and the University of Maryland Libraries, MITH engages in collaborative, interdisciplinary work at the intersection of technology and humanistic inquiry. MITH specializes in text and image analytics for cultural heritage collections, data curation, digital preservation, linked data applications, and data publishing. 
We have two students working with us on Google Summer of Code projects. One student is working on a JavaScript library to engrave MEI-encoded music notation using VexFlow. Part of the work will be dedicated specifically to supporting variant handling, a distinctive feature of the MEI data model and an essential component of critical editions of musical works. Our second student is creating a set of demonstrations and coding infrastructure for MITHgrid, a JavaScript library we’ve been developing to support graph-based JavaScript applications in the browser, such as a video annotation toolkit and a Shared Canvas viewer.
By James Smith, Organization Administrator for MITH 
Motion planning is a key area in robotics that finds feasible paths for a robot from some initial state to some desired goal. Over the last couple of years we have developed a standard library for sampling-based motion planning algorithms, a class of algorithms that has been shown to work well on a large variety of systems, ranging from car-like robots to humanoid robots with many degrees of freedom. The Open Motion Planning Library (OMPL) is designed to be very general; the library makes no assumptions about the type of robot or how the environment is represented. This allows it to be integrated into a larger robotics software system such as ROS.
Manipulation Planning - An example of using OMPL on the PR2 from Willow Garage. The robot is asked to move and manipulate the objects on the table. The demo is using ROS. 
We are excited to have two very talented students working with us this summer. Caleb Voss is developing a plugin for Blender (a 3D modeling program) that allows one to plan motions for robots in environments drawn within Blender. The project integrates many different components: it relies on the Blender Game Engine to simulate physically realistic robot motion, on MORSE for robot models and high-level controllers, and on OMPL for planning.
Luis Torres is working on a core feature within OMPL - the representation of costs and the way planners optimize costs. There already exists some functionality in OMPL to optimize path length and some other common path properties, but in the redesign that Luis is working on this will be done in an abstract way so that the user can specify almost any kind of cost function.
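The general idea of letting the user supply the cost function can be sketched in a few lines. This is not OMPL's API or the design Luis is working on: the names `path_cost`, `length_cost`, `climb_cost` and `best_path` are hypothetical, and paths are assumed to be lists of 2-D points.

```python
import math


def path_cost(path, segment_cost):
    """Total cost of a path, given any user-supplied per-segment cost."""
    return sum(segment_cost(a, b) for a, b in zip(path, path[1:]))


def length_cost(a, b):
    """Classic objective: Euclidean length of the segment."""
    return math.dist(a, b)


def climb_cost(a, b):
    """Alternative objective: only upward motion (in y) is penalized."""
    return max(0.0, b[1] - a[1])


def best_path(paths, segment_cost):
    """Pick the candidate path that is cheapest under the chosen cost."""
    return min(paths, key=lambda p: path_cost(p, segment_cost))


direct = [(0, 0), (3, 4), (6, 0)]   # shorter, but climbs over a hump
flat = [(0, 0), (3, -4), (6, 0)]    # same length, dips before rising
detour = [(0, 0), (10, 0), (6, 0)]  # longer, never climbs

print(best_path([direct, flat, detour], length_cost))
print(best_path([direct, flat, detour], climb_cost))
```

The same selection code works for both objectives because it only depends on the cost function passed in, which is the kind of abstraction the paragraph above describes.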
By Mark Moll, OMPL Organization Administrator
Today, August 2nd, is the deadline for midterm evaluations for students and mentors for Google Summer of Code 2013. To view a complete list of the 177 open source organizations that the 1192 students are working with this summer you can visit the Google Summer of Code program site.

By Stephanie Taylor, Open Source Programs