opensource.google.com

Menu

Google Code-in 2019 Org Applications are Open!

Thursday, October 10, 2019

We are now accepting applications for open source organizations interested in participating in the tenth Google Code-in 2019. Google Code-in (GCI) has invited pre-university students ages 13-17 to learn hands-on by contributing to open source software.

Each year we have heard inspiring stories from the participating mentors about their commitment to working with young students. We only select organizations that have participated in Google Summer of Code because they have gained experience in mentorship and know how to provide a support system for these new, young contributors.

Organization applications are now open and all interested open source organizations must apply before Monday, October 28, 2019 at 17:00 UTC.

In 2018, 27 organizations were accepted—9 of which were participating in GCI for the first time! Over the last 9 years, 11,232 students from 108 countries have completed more than 40,000 tasks for participating open source projects. Tasks fall into 5 categories:
  • Code: writing or refactoring.
  • Documentation/Training: creating/editing documents and helping others learn more.
  • Outreach/Research: community management, outreach/marketing, or studying problems and recommending solutions.
  • Quality Assurance: testing and ensuring code is of high quality.
  • Design: graphic design or user interface design.
Once an organization is selected for Google Code-in 2019 they will define these tasks and recruit mentors from their communities who are interested in providing online support for students during the seven week contest.

You can find a timeline, FAQ and other information about Google Code-in on our website. If you’re an educator interested in sharing Google Code-in with your students, please see the resources here.

By Radha Jhatakia, Google Open Source

Understanding Scheduling Behavior with SchedViz

Wednesday, October 9, 2019

Linux kernel scheduling behavior can be a key factor in application responsiveness and system utilization. Today, we’re announcing SchedViz, a new tool for visualizing Linux kernel scheduling behavior. We’ve used it inside Google to discover many opportunities for better scheduling choices and to root-cause many latency issues.

Thread Scheduling

Modern OSs execute multiple processes concurrently, by running each for a brief burst, then switching to the next: a feature called multiprogramming. Modern processors include multiple cores, each of which can run its own thread, known as multiprocessing. When these two features are combined, a new engineering challenge emerges: when should a thread run? How long should it run, and on what processor? This thread scheduling strategy is a complex problem, and can have a significant effect on performance. In particular, threads that don't get scheduled to run can suffer starvation, which can adversely affect user-visible latencies.

In an ideal system, a simple strategy of assigning chunks of CPU-time to threads in a round-robin manner would maximize fairness by ensuring all threads are equally starved. But, of course, real systems are far from ideal, and this view of fairness may not be an appropriate performance goal. Here are a few factors that make scheduling tricky:
  • Not all threads are equally important. Each thread has a priority that specifies its importance relative to other threads. Thread priorities must be selected carefully, and the scheduler must honor those selections.
  • Not all cores are equal. The structure of the memory hierarchy can make it costly to shift a thread from one core to another, especially if that shift moves it to a new NUMA node. Users can explicitly pin threads to a CPU or a set of CPUs, or can exclude threads from specific CPUs, using features like sched_setaffinity or cgroups. But such restrictions also make scheduling even tougher.
  • Not all threads want to run all the time. Threads may sleep waiting for some event, yielding their core to other execution. When the event occurs, pending threads should be scheduled quickly.
SchedViz permits you to observe real scheduling behavior. Comparing this with the expected or desired behavior can point to specific problems and possible solutions.

Tracepoints and Kernel Tracing

The Linux kernel is instrumented with hooks called tracepoints; when certain actions occur, any code hooked to the relevant tracepoint is called with arguments that describe the action. The kernel also provides a debug feature that can trace this data and stream it to a buffer for later analysis.

Hundreds of different tracepoints exist, arranged into families of related function. The sched family includes tracepoints that can reconstruct thread scheduling behavior—when threads switched in, blocked on some event, or migrated between cores. These sched tracepoints provide fine-grained and comprehensive detail about thread scheduling behavior over a short period of traced execution.

SchedViz: Visualize Thread Scheduling Over Time

SchedViz provides an easy way to gather kernel scheduling traces from hosts, and visualize those traces over time. Tracing is simple:
$ sudo ./trace.sh -capture_seconds 5 -out ~/traces
Then, importing the resulting collection into SchedViz takes just one click.


Once imported, a collection will always be available for later viewing, until you delete it.

The SchedViz UI displays collections in several ways. A zoomable and pannable heatmap shows system cores on the y-axis, and the trace duration on the x-axis. Each core in the system has a swim-lane, and each swim-lane shows CPU utilization (when that CPU is being kept busy) and wait-queue depth (how many threads are waiting to run on that CPU.) The UI also includes a thread list that displays which threads were active in the heatmap, along with how long they ran, waited to run, and blocked on some event, and how many times they woke up or migrated between cores. Individual threads can be selected to show their behavior over time, or expanded to see their details.

Using SchedViz to Identify Antagonisms: Not all threads are equally important

Antagonism describes the situation in which a victim thread is ready to run on some CPU, while another antagonist thread runs on that same CPU. Long antagonisms, or high cumulative duration of antagonisms, can degrade user experience or system efficiency by making a critical process unavailable at critical times.

Antagonist analysis is useful when threads are meant to have exclusive access to some core but don’t get it. In SchedViz, such antagonisms are listed in each thread’s summary, as well as being immediately visible as breaks in the victim thread's running bar. Zooming in reveals exactly what work is interfering.

Several antagonisms affect a thread that wants its CPU exclusively.
Root-causing an antagonism via zooming in.

Round-robin queueing, in which two or more threads, each wanting to run most or all of the time, occupy a single CPU for a period of time, also yields antagonisms. In this case, the scheduler attempts to avert starvation by giving multiple threads short time-slices to run in a round-robin manner. This reduces the throughput of affected threads while introducing often-significant, repeating, latencies. It is a sign that some portion of the system is overloaded.

In SchedViz, round-robin scheduling appears as a sequence of fixed-size intervals in which the running thread, and the set of waiting threads, changes with each interval. The SchedViz UI makes it easy to better understand what caused this phenomenon.

An overloaded CPU with two threads engaged in round-robin queueing. Running intervals are shown as ovals at top; waiting intervals as rectangles at bottom.
Zooming out and viewing more CPUs reveals that round-robin queueing started when a thread migrated into the overtaxed CPU.

Using SchedViz to Identify NUMA Issues: Not all cores are equal

Larger servers often have several NUMA nodes; a CPU can access a subset of memory (the DRAM local to its NUMA node) more quickly than other memory (other nodes' DRAMs). This non-uniformity is a practical consequence of growing core count, but it brings challenges.

On the one hand, a thread migrated away from the DRAM that holds most of its state will suffer, since it will then have to pay an extra tax for each DRAM access. SchedViz can help identify cases like this, making it clear when a thread has had to migrate across NUMA boundaries.

On the other hand, it is important to ensure that all NUMA nodes in a system are well-balanced, lest part of the machine is overloaded while another part of the machine sits idle.

A thread (in yellow) risks higher-latency memory accesses as it migrates across NUMA nodes.
A system risks both under-utilization and increased latency due to NUMA imbalance.

Beyond Scheduling

Many issues can be identified and explored using only sched tracepoints. But, there are many tracepoints, reflecting a wide variety of phenomena. Many of these tracepoints go well with scheduling data. For example, irq events can reveal when thread running time is spent handling interrupts; sys events can help reveal when execution moves into the kernel, and what it’s doing there; and workqueue events can show when kernel work is underway, and what work is being done. SchedViz presently offers limited support for visualizing these non-sched tracepoint families, but improving that support is an active area of development for us.

Google Summer of Code 2019 (Statistics Part 2)

Monday, September 30, 2019

2019 has been an epic year for Google Summer of Code as we celebrated 15 years of connecting university students from around the globe with 201 open source organizations big and small.

We want to congratulate our 1,134 students that complete GSoC 2019. Great work everyone!

Now that GSoC 2019 is over we would like to wrap up the program with some more statistics to round out the year.

Student Registrations

We had 30,922 students from 148 countries register for GSoC 2019 (that’s a 19.5% increase in registrations over last year, the previous record). Interest in GSoC clearly continues to grow and we’re excited to see it growing in all parts of the world.

For the first time ever we had students register from Bhutan, Fiji, Grenada, Papua New Guinea, South Sudan, and Swaziland.

Universities

The 1,276 students accepted into the GSoC 2019 program hailed from 6586 universities, of which, 164 have students participating for the first time in GSoC.

Schools with the most accepted students for GSoC 2019:

University # of Accepted Students
Indian Institute of Technology, Roorkee48
International Institute of Information Technology - Hyderabad29
Birla Institute of Technology and Science, Pilani (BITS Pilani)27
Guru Gobind Singh Indraprastha University (GGSIPU Dwarka)20
Indian Institute of Technology, Kanpur19
Indian Institute of Technology, Kharagpur19
Amrita University / Amrita Vishwa Vidyapeetham14
Delhi Technological University11
Indian Institute of Technology, Bombay11
Indraprastha Institute of Information and Technology, New Delhi11

Mentors

Each year we pore over gobs of data to extract some interesting statistics about the GSoC mentors. Here’s a quick synopsis of our 2019 crew:
  • Registered mentors: 2,815
  • Mentors with assigned student projects: 2,066
  • Mentors who have participated in GSoC for 10 or more years: 70
  • Mentors who have been a part of GSoC for 5 years or more: 307
  • Mentors that are former GSoC students: 691
  • Mentors that have also been involved in the Google Code-in program: 498
  • Percentage of new mentors: 35.84%
GSoC 2019 mentors are from all parts of the world, representing 81 countries!

Every year thousands of GSoC mentors help introduce the next generation to the world of open source software development—for that we are forever grateful. We can not stress enough that without our invaluable mentors the GSoC program would not exist. Mentorship is why GSoC has remained strong for 15 years, the relationships built between students and mentors have helped sustain the program and many of these communities. Sharing their passion for open source, our mentors have paved the road for generations of contributors to enter open source development.

Thank you to all of our mentors, organization administrators, and all of the “unofficial” mentors that help in our open source organization’s communities. Google Summer of Code is a community effort and we appreciate each and every one of you.

By Stephanie Taylor, Google Open Source
.