From MLPerf to MLCommons: moving machine learning forward

Today, the community of machine learning researchers and engineers behind the MLPerf benchmark is launching an open engineering consortium called MLCommons. For us, this is the next step in a journey that started almost three years ago.

Early in 2018, we gathered a group of industry researchers and academics who had published work on benchmarking machine learning (ML), in a conference room to propose the creation of an industry standard benchmark to measure ML performance. Everyone had doubts: creating an industry standard is challenging under the best conditions and ML was (and is) a poorly understood stochastic process running on extremely diverse software and hardware. Yet, we all agreed to try.

Together, along with a growing community of researchers and academics, we created a new benchmark called MLPerf. The effort took off. MLPerf is now an industry standard with over 2,000 submitted results and multiple benchmarks suites that span systems from smartphones to supercomputers. Over that time, the fastest result submitted to MLPerf for training the classic ML network ResNet improved by over 13x.

We created MLPerf because we believed in three principles:

Machine learning has tremendous potential: Already, machine learning helps billions of people find and understand information through tools like Google’s search engine and translation service. Active research in machine learning could one day save millions of lives through improvements in healthcare and automotive safety.

Transforming machine learning from promising research into wide-spread industrial practice requires investment in common infrastructure -- especially metrics: Much like computing in the ‘80s, real innovation is mixed with hype and adopting new ideas is slow and cumbersome. We need good metrics to identify the best ideas, and good infrastructure to make adoption of new techniques fast and easy.

Developing common infrastructure is best done by an open, fast-moving collaboration: We need the vision of academics and the resources of industry. We need the agility of startups and the scale of leading tech companies. Working together, a diverse community can develop new ideas, launch experiments, and rapidly iterate to arrive at shared solutions.

Our belief in the principles behind MLPerf has only gotten stronger, and we are excited to be part of the next step for the MLPerf community with the launch of MLCommons.

MLCommons aims to accelerate machine learning to benefit everyone. MLCommons will build a a common set of tools for ML practitioners including:

Benchmarks to measure progress: MLCommons will leverage MLPerf to measure speed, but also expand benchmarking other aspects of ML such as accuracy and algorithmic efficiency. ML models continue to increase in size and consequently cost. Sustaining growth in capability will require learning how to do more (accuracy) with less (efficiency).

Public datasets to fuel research: MLCommons new People’s Speech project seeks to develop a public dataset that, in addition to being larger than any other public speech dataset by more than an order of magnitude, better reflects diverse languages and accents. Public datasets drive machine learning like nothing else; consider ImageNet’s impact on the field of computer vision.

Best practices to accelerate development: MLCommons will make it easier to develop and deploy machine learning solutions by fostering consistent best practices. For instance, MLCommons’ MLCube project provides a common container interface for machine learning models to make them easier to share, experiment with (including benchmark), develop, and ultimately deploy.

Google believes in the potential of machine learning, the importance of common infrastructure, and the power of open, collaborative development. Our leadership in co-founding, and deep support in sustaining, MLPerf and MLCommons has echoed our involvement in other efforts like TensorFlow and NNAPI. Together with the MLCommons community, we can improve machine learning to benefit everyone.

Want to get involved? Learn more at mlcommons.org.

By Peter Mattson – ML Metrics, Naveen Kumar – ML Performance, and Cliff Young – Google Brain

opensource.google.com

Google Open Source Blog

From MLPerf to MLCommons: moving machine learning forward

Thursday, December 3, 2020