opensource.google.com

Open sourcing the attention center model

Thursday, December 1, 2022

When you look at an image, which parts do you pay attention to first? Would a machine be able to learn this? We provide a machine learning model that can be used to do just that. Why is it useful? The latest generation image format (JPEG XL) supports serving the parts that you pay attention to first, which results in an improved user experience: images will appear to load faster. The model is not limited to encoding JPEG XL images, though; it can be used whenever we need to know where a human would look first.

An open source attention center model

What regions in an image will attract the majority of human visual attention first? We trained a model, called the attention center model, to predict such a region for a given image, and it is now open sourced. In addition to the model, we provide a script to use it in combination with the JPEG XL encoder: google/attention-center.

Some example predictions of our attention center model are shown in the following figure, where the green dot is the predicted attention center point for the image. Note that in the “two parrots” image both parrots’ heads are visually important, so the attention center point will be in the middle.

Four images in quadrants: a red door with a brass doorknob (top left); a headshot of a brown-skinned girl wearing a colorful sweater, ribbons in her hair, and face paint, smiling at the camera (top right); a teal shuttered cathedral-style window against a sand-colored stucco wall with pink and red hibiscus in the foreground (bottom left); and a blue and yellow macaw next to a red and green macaw (bottom right).
Images are from the Kodak image data set: http://r0k.us/graphics/kodak/

The model is 2MB and in the TensorFlow Lite format. It takes an RGB image as input and outputs a 2D point, which is the predicted center of human attention on the image. That predicted center is the place where we should start with operations (decoding and displaying, in the JPEG XL case). This allows the most visually salient/important regions to be processed as early as possible. Check out the code and continue to build upon it!

Attention center ground-truth data

To train a model to predict the attention center, we first need some ground-truth data for the attention center. Given an image, attention points can either be collected by eye trackers [1] or be approximated by mouse clicks on a blurry version of the image [2]. We first apply temporal filtering to those attention points and keep only the initial ones, and then apply spatial filtering to remove noise (e.g., random gazes). We then compute the center of the remaining attention points as the attention center ground-truth. The process of obtaining the ground-truth is illustrated in the figure below.
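The filtering steps described above can be sketched roughly as follows. This is a minimal illustration, not the procedure used to build the actual training data: the `(t, x, y)` sample layout, the `max_time` and `max_dist` thresholds, and the median-based outlier rule are all assumptions made for the example.

```python
from math import hypot
from statistics import median


def attention_center(points, max_time=1.0, max_dist=100.0):
    """Estimate an attention center from (t, x, y) gaze or click samples.

    max_time and max_dist are hypothetical thresholds chosen for this sketch.
    """
    # Temporal filtering: keep only the initial attention points.
    early = [(x, y) for t, x, y in points if t <= max_time]
    if not early:
        raise ValueError("no attention points inside the time window")

    # Spatial filtering (assumed rule): drop points that lie far from the
    # median location, treating them as noise such as random gazes.
    mx = median(x for x, _ in early)
    my = median(y for _, y in early)
    kept = [(x, y) for x, y in early if hypot(x - mx, y - my) <= max_dist]
    if not kept:
        kept = early  # fall back if the filter rejected every point

    # The ground-truth center is the mean of the remaining points.
    cx = sum(x for x, _ in kept) / len(kept)
    cy = sum(y for _, y in kept) / len(kept)
    return cx, cy
```

For instance, three early samples clustered around (100, 100), one early outlier at (500, 500), and one late sample would yield a center at the cluster, since the outlier and the late sample are filtered out.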

Five images in a row of a person standing on a rock by the ocean: the first is the original image, the second shows gaze/attention points, the third shows temporal filtering, the fourth spatial filtering, and the fifth the attention center

Attention center model architecture

The attention center model is a deep neural net that takes an image as input and uses a pre-trained classification network (e.g., ResNet or MobileNet) as the backbone. The outputs of several intermediate layers of the backbone network are used as input for the attention center prediction module. These intermediate layers contain different information: shallow layers often contain low-level information like intensity/color/texture, while deeper layers usually contain higher-level, more semantic information like shape/object. All are useful for the attention prediction. The attention center prediction module applies convolution, deconvolution, and/or resizing operators together with aggregation and a sigmoid function to generate a weighting map for the attention center. An operator (the Einstein summation operator in our case) can then be applied to compute the (gravity) center from the weighting map. The L2 distance between the predicted attention center and the ground-truth attention center can be computed as the training loss.
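To make the last two steps concrete, here is a small NumPy sketch of computing a gravity center from a weighting map with Einstein summation and measuring the L2 distance to a ground-truth point. The normalized [0, 1] coordinate convention, the function names, and the use of NumPy rather than the model's training framework are assumptions for illustration only.

```python
import numpy as np


def center_of_gravity(weights):
    """Return the (x, y) gravity center of a non-negative (H, W) weighting map.

    Coordinates are normalized to [0, 1]; this convention is an assumption.
    """
    h, w = weights.shape
    ys = np.linspace(0.0, 1.0, h)  # row coordinates
    xs = np.linspace(0.0, 1.0, w)  # column coordinates
    total = weights.sum()
    # Einstein summation contracts the map with each coordinate vector.
    cy = np.einsum("hw,h->", weights, ys) / total
    cx = np.einsum("hw,w->", weights, xs) / total
    return np.array([cx, cy])


def l2_loss(pred, target):
    """L2 distance between predicted and ground-truth centers."""
    return float(np.linalg.norm(np.asarray(pred) - np.asarray(target)))
```

Because the center is a weighted average of coordinates, it is differentiable with respect to the weighting map, which is what lets a distance on the final 2D point serve as a training loss.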

Attention center model architecture

Progressive JPEG XL images with attention center model

JPEG XL is a new image format that allows the user to encode images so that the more interesting parts come first. This has the advantage that, when viewing images that are transferred over the web, we can already display the attention-grabbing part of the image, i.e., the part where the user looks first. Ideally, by the time the user looks elsewhere, the rest of the image has already arrived and been decoded. The Google Open Source Blog post "Using Saliency in progressive JPEG XL images" illustrates how this works in principle. In short, in JPEG XL, the image is divided into square groups (typically of size 256 x 256), and the JPEG XL encoder will choose a starting group in the image and then grow concentric squares around that group. It was this need for figuring out where the attention center of an image is that led us to open source the attention center model, together with a script to use it in combination with the JPEG XL encoder. Progressive decoding of JPEG XL images has recently been added to Chrome, starting from version 107. At the moment, JPEG XL is behind an experimental flag, which can be enabled by going to chrome://flags and searching for “jxl”.
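The idea of growing concentric squares around a starting group can be sketched as below. This only illustrates the ordering principle and is not the JPEG XL encoder's actual implementation: the function name is hypothetical, and the row-major order of groups within each ring is an arbitrary choice made for the example.

```python
def group_order(width, height, center_x, center_y, group_size=256):
    """Order (row, col) groups in concentric squares around the attention center.

    center_x and center_y are pixel coordinates of the attention center.
    """
    cols = -(-width // group_size)   # ceiling division
    rows = -(-height // group_size)
    # The starting group is the one containing the attention center.
    start_col = min(int(center_x) // group_size, cols - 1)
    start_row = min(int(center_y) // group_size, rows - 1)

    groups = [(r, c) for r in range(rows) for c in range(cols)]
    # Chebyshev distance to the starting group gives the square ring index;
    # the stable sort keeps row-major order inside each ring.
    return sorted(
        groups,
        key=lambda g: max(abs(g[0] - start_row), abs(g[1] - start_col)),
    )
```

For a 768 x 512 image with an attention center at pixel (700, 100), the starting group is row 0, column 2, followed by its neighbors and then the two groups in the leftmost column.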

To try out how partially loaded progressive JPEG XL images look, you can go to https://google.github.io/attention-center/.

By Moritz Firsching, Junfeng He, and Zoltan Szabadka – Google Research

References

[1] Valliappan, Nachiappan, Na Dai, Ethan Steinberg, Junfeng He, Kantwon Rogers, Venky Ramachandran, Pingmei Xu et al. "Accelerating eye movement research via accurate and affordable smartphone eye tracking." Nature communications 11, no. 1 (2020): 1-12.

[2] Jiang, Ming, Shengsheng Huang, Juanyong Duan, and Qi Zhao. "Salicon: Saliency in context." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1072-1080. 2015.

Explore the new Learn Kubernetes with Google website!

Thursday, November 17, 2022

As Kubernetes has become a mainstream global technology, with 96% of organizations surveyed by the CNCF¹ using or evaluating Kubernetes for production use, it is now estimated that 31%² of backend developers worldwide are Kubernetes developers. To add to the growing popularity, the 2021 annual report¹ also listed close to 60 enhancements by special interest and working groups to the Kubernetes project. With so much information in the ecosystem, how can Kubernetes developers stay on top of the latest developments and learn what to prioritize to best support their infrastructure?

The new website Learn Kubernetes with Google brings together under one roof the guidance of Kubernetes experts—both from Google and across the industry—to communicate the latest trends in building your Kubernetes infrastructure. You can access knowledge in two formats.

One option is to participate in scheduled live events, which consist of virtual panels that allow you to ask experts questions via a Q&A forum. Virtual panels last for an hour and take place once a quarter. So far, we’ve hosted panels on building a multi-cluster infrastructure, the Dockershim deprecation, bringing High Performance Computing (HPC) to Kubernetes, and securing your services with Istio on Kubernetes. The other option is to pick one of the multiple on-demand series available. Series are made up of several 5-10 minute episodes and you can go through them at your own leisure. They cover different topics, including the Kubernetes Gateway API, the MCS API, Batch workloads, and Getting started with Kubernetes. You can use the search bar on the top right side of the website to look up specific topics.
As the cloud native ecosystem becomes increasingly complex, this website will continue to offer evergreen content for Kubernetes developers and users. We recently launched a new content category for ecosystem projects, which started by covering how to run Istio on Kubernetes. Soon, we will also launch a content category for developer tools, starting with Minikube.

Join hundreds of developers that are already part of the Learn Kubernetes with Google community! Bookmark the website, sign up for an event today, and be sure to check back regularly for new content.

By María Cruz, Program Manager – Google Open Source Programs Office

Get ready for Google Summer of Code 2023!

Thursday, November 10, 2022

We are thrilled to announce the 2023 Google Summer of Code (GSoC) program and share the timeline with you to get involved! 2023 will be our 19th consecutive year of hosting GSoC and we could not be more excited to welcome more organizations, mentors, and new contributors into the program.

With just three weeks left in the 2022 program, we have had an exciting year, with 958 GSoC contributors completing their projects with 198 open source organizations.

Our 2022 contributors and mentors have given us extensive feedback and we are keeping the big changes we made this year, with one adjustment around eligibility described below.
  • Increased flexibility in project lengths (10-22 weeks, not a set 12 weeks for everyone) allowed many people to participate and to not feel rushed as they wrapped up their projects. We have 109 GSoC contributors wrapping up their projects over the next three weeks.
  • Choice of project time commitment: there are now two options, medium at ~175 hours or large at ~350 hours, chosen by 47% and 53% of GSoC contributors, respectively.
  • Our most talked about change was GSoC being open to contributors new to open source software development (and not just to students anymore). For 2023, we are expanding the program to be open to students and to beginners in open source software development.
We are excited to launch the 2023 GSoC program and to continue to help grow the open source community. GSoC’s mission of bringing new contributors into open source communities is centered around mentorship and collaboration. We are so grateful for all the folks that continue to contribute, mentor, and get involved in open source communities year after year.

Interested in applying to the Google Summer of Code Program?

Open Source Organizations
Check out our website to learn what it means to be a participating organization. Watch our new GSoC Org Highlight videos and get inspired about projects that contributors have worked on in the past.

Think you have what it takes to participate as a mentor organization? Take a look through our mentor guide to learn about what it means to be part of Google Summer of Code, how to prepare your community, gather excited mentors, create achievable project ideas, and tips for applying. We welcome all types of open source organizations and encourage you to apply—it is especially exciting for us to welcome new orgs into the program and we hope you are inspired to get involved with our growing community.

Want to be a GSoC Contributor?
Are you new to open source development or a student? Are you eager to gain experience on real-world software development projects that will be used by thousands or millions of people? It is never too early to start thinking about what kind of open source organization you’d like to learn more about and how the application process works!

Watch our new ‘Introduction to GSoC’ video to see a quick overview of the program. Read through our contributor guide for important tips from past participants on preparing your proposal, what to think about if you wish to apply for the program, and everything you wanted to know about the program. We also hope you’re inspired by checking out the nearly 200 organizations that participated in 2022 and the 1,000+ projects that have been completed so far!

We encourage you to explore our website for other resources and continue to check for more information about the 2023 program.

You are welcome and encouraged to share information about the 2023 GSoC program with your friends, family, colleagues, and anyone you think may be interested in joining our community. We are excited to welcome many more contributors and mentoring organizations in the new year!

By Stephanie Taylor, Program Manager, and Perry Burnham, Associate Program Manager for the Google Open Source Programs Office