Google Summer of Code wrap-up: STE||AR Group

Friday, October 30, 2015

Today we are featuring the STE||AR Group, another Google Summer of Code veteran organization. Adrian Serio gives an overview of their five students' summer projects below.


The STE||AR Group is an international team of researchers who aim to improve application scalability by more efficiently utilizing hardware resources available to developers. This summer has been an exciting time for the STE||AR Group’s Google Summer of Code (GSoC) mentors and students alike! We were very pleased with the dedication and effort of all five of our participants.

Our students made contributions to three of our software products:
  • HPX: a distributed C++ runtime system which comes with a standards-compliant API and allows users to scale their applications across thousands of machines
  • LibGeoDecomp: an auto-parallelizing library for petascale computer simulations which is able to take advantage of HPX to better adapt fluctuating workloads to the system
  • LibFlatArray: a highly efficient multidimensional array library which provides an object-oriented interface but stores data in a vectorization-friendly Struct-of-Arrays format
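
To make the Struct-of-Arrays idea concrete, here is a minimal sketch in plain C++. It does not use LibFlatArray's actual API; the names ParticleSoA, Ref, pos, vel and advance are hypothetical and chosen only for illustration. Each member lives in its own contiguous array, while a small proxy object keeps the familiar object-style access.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; this is NOT LibFlatArray's API.
// Each member is stored in its own contiguous array (Struct-of-Arrays).
class ParticleSoA {
public:
    explicit ParticleSoA(std::size_t n) : x_(n, 0.0f), v_(n, 0.0f) {}

    // Lightweight proxy that gives object-like access to one element.
    class Ref {
    public:
        Ref(ParticleSoA& s, std::size_t i) : s_(s), i_(i) {}
        float& pos() { return s_.x_[i_]; }
        float& vel() { return s_.v_[i_]; }
    private:
        ParticleSoA& s_;
        std::size_t i_;
    };

    Ref operator[](std::size_t i) {
        assert(i < x_.size());
        return Ref(*this, i);
    }

    std::size_t size() const { return x_.size(); }

    // A loop over contiguous member arrays is a pattern that compilers
    // can typically auto-vectorize.
    void advance(float dt) {
        for (std::size_t i = 0; i < x_.size(); ++i) {
            x_[i] += v_[i] * dt;
        }
    }

private:
    std::vector<float> x_;  // all positions, back to back
    std::vector<float> v_;  // all velocities, back to back
};
```

A caller would write p[1].vel() = 2.0f as if working with an array of objects, while advance() still iterates over densely packed floats.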

Just as these three products can work together as a tightly integrated stack, our goal with the GSoC projects was to create synergy between them and steer our development towards increasing the adaptivity and efficiency of our software. Below are summaries of our students' projects.

Implementation of a New Resource Manager in HPX: Nidhi Makhijani
This project set out to properly assign hardware resources to executors: C++ objects that dictate the way a thread should be executed. Nidhi was able to allocate resources to an executor when it is created and return those resources when it stops. Additionally, Nidhi laid the groundwork for dynamic allocation, where the resource manager can monitor and share resources amongst all of the running executors.
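
The allocate-on-creation, return-on-stop pattern described above can be sketched with ordinary C++ scoping rules. This is only an illustration under simplifying assumptions: the ResourceManager and Executor classes below are hypothetical, single-threaded stand-ins and do not reflect HPX's real interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical, single-threaded stand-in for a resource manager.
// It only counts cores; it is not HPX's implementation.
class ResourceManager {
public:
    explicit ResourceManager(std::size_t cores) : total_(cores), free_(cores) {}

    void acquire(std::size_t n) {
        if (n > free_) throw std::runtime_error("not enough free cores");
        free_ -= n;
    }

    void release(std::size_t n) {
        assert(free_ + n <= total_);  // never return more than was handed out
        free_ += n;
    }

    std::size_t free_cores() const { return free_; }

private:
    std::size_t total_;
    std::size_t free_;
};

// Hypothetical executor: takes its cores when constructed and hands
// them back when destroyed.
class Executor {
public:
    Executor(ResourceManager& rm, std::size_t cores) : rm_(rm), cores_(cores) {
        rm_.acquire(cores_);
    }
    ~Executor() { rm_.release(cores_); }

    Executor(const Executor&) = delete;
    Executor& operator=(const Executor&) = delete;

    std::size_t cores() const { return cores_; }

private:
    ResourceManager& rm_;
    std::size_t cores_;
};
```

Tying the allocation to the executor's lifetime means resources are returned even if the code using the executor exits early. Dynamic rebalancing between running executors would need additional bookkeeping that this sketch leaves out.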

SIMD Wrapper for ARM NEON, Intel AVX512 & KNC in LibFlatArray: Larry Xiao
Vectorization is imperative for writing highly efficient numerical kernels. The goal of this project was to extend the already existing SIMD wrappers in LibFlatArray to more architectures (e.g. ARM NEON, Intel AVX512, etc.) and to extend the capabilities of these wrappers. Larry set out to study the different ISAs (Instruction Set Architectures), and make the library run efficiently on these architectures.
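
As a rough picture of what a SIMD wrapper offers, the sketch below defines a fixed-width vector type with a scalar fallback. The names ShortVec and saxpy are hypothetical and this is not LibFlatArray's code; an architecture-specific version would keep the same interface but replace the lane loops with the corresponding intrinsics.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portable fallback: N float lanes processed with plain loops.
// A NEON or AVX-512 specialization would swap these loops for intrinsics.
template <std::size_t N>
class ShortVec {
public:
    ShortVec() : lanes_{} {}

    // Broadcast one scalar to all lanes.
    explicit ShortVec(float v) {
        for (std::size_t i = 0; i < N; ++i) lanes_[i] = v;
    }

    static ShortVec load(const float* p) {
        ShortVec r;
        for (std::size_t i = 0; i < N; ++i) r.lanes_[i] = p[i];
        return r;
    }

    void store(float* p) const {
        for (std::size_t i = 0; i < N; ++i) p[i] = lanes_[i];
    }

    friend ShortVec operator+(ShortVec a, const ShortVec& b) {
        for (std::size_t i = 0; i < N; ++i) a.lanes_[i] += b.lanes_[i];
        return a;
    }

    friend ShortVec operator*(ShortVec a, const ShortVec& b) {
        for (std::size_t i = 0; i < N; ++i) a.lanes_[i] *= b.lanes_[i];
        return a;
    }

private:
    float lanes_[N];
};

// Kernel written once against the wrapper: y = a * x + y.
// Assumes n is a multiple of the vector width N.
template <std::size_t N>
void saxpy(float a, const float* x, float* y, std::size_t n) {
    assert(n % N == 0);
    const ShortVec<N> va(a);
    for (std::size_t i = 0; i < n; i += N) {
        ShortVec<N> r = va * ShortVec<N>::load(x + i) + ShortVec<N>::load(y + i);
        r.store(y + i);
    }
}
```

The kernel itself never mentions an instruction set, which is the point of such wrappers: porting to a new architecture means adding one more implementation of the vector type.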

CSV Formatted Performance Counters for HPX: Devang Bacharwar
HPX provides users with a uniform interface to access arbitrary system information from anywhere in the system. Devang’s project allows users to request these performance counters in CSV format. Additionally, he made it possible to record a timestamp with each value. These features will make it easier for HPX users to analyze the performance data gathered from an application.
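
To show why CSV output with timestamps is convenient for later analysis, here is a small self-contained sketch that formats sampled counter values as CSV rows. The column layout, the Sample struct, the to_csv function and the counter names used below are assumptions made for illustration; they are not taken from HPX and may differ from what HPX actually prints.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sample: one timestamp plus one value per counter.
struct Sample {
    double timestamp;
    std::vector<double> values;
};

// Format samples as CSV: a header row with the counter names, then one
// row per sample. The timestamp column is optional.
std::string to_csv(const std::vector<std::string>& names,
                   const std::vector<Sample>& samples,
                   bool with_timestamps) {
    std::ostringstream out;

    if (with_timestamps) out << "timestamp";
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i > 0 || with_timestamps) out << ',';
        out << names[i];
    }
    out << '\n';

    for (const Sample& s : samples) {
        assert(s.values.size() == names.size());
        if (with_timestamps) out << s.timestamp;
        for (std::size_t i = 0; i < s.values.size(); ++i) {
            if (i > 0 || with_timestamps) out << ',';
            out << s.values[i];
        }
        out << '\n';
    }
    return out.str();
}
```

Output in this shape can be loaded directly into a spreadsheet or a plotting script, and the timestamp column lets values from different runs or counters be aligned in time.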

Integrate a C++ AMP Kernel with HPX: Marcin Copik
The HPX runtime system can coordinate the execution and synchronization of OpenCL kernels on arbitrary OpenCL devices, such as GPUs, in a system. In his GSoC project, Marcin used a C++ AMP compiler to produce an OpenCL kernel from a parallel algorithm implemented by HPX. Marcin integrated the Kalmar AMP compiler into the HPX build system, transformed a parallel for_each algorithm into an OpenCL kernel, dispatched the kernel to a GPU, and synchronized the result with a concurrently running HPX application.

A Flexible IO Infrastructure for LibGeoDecomp: Konstantin Kronfeldner
In LibGeoDecomp, users are able to read from and write to arbitrary regions of the simulation space. These operations are carried out by objects which we call Steerers and Writers. Over the summer, Konstantin added the ability for these Steerers and Writers to be dynamically created and destroyed. LibGeoDecomp is typically used on supercomputers, where jobs are executed non-interactively via a batch system. Konstantin's extensions enable users to interact with the application at runtime. They can view and modify the simulation model dynamically. The benefit of this is a significantly lower turnaround time for domain scientists who need to carry out many computational experiments.

By Adrian Serio, Scientific Program Coordinator, STE||AR Group