Lyra V2 - a better, faster, and more versatile speech codec

Friday, September 30, 2022

Since we open sourced the first version of Lyra on GitHub last year, we are delighted to see a vibrant community growing around it, with thousands of stars, hundreds of forks, and many comments and pull requests. There are people who fixed and formatted our code, built continuous integration for the project, and even added support for Web Assembly.

We are incredibly grateful for all these contributions, and we also heard the community's feedback, asking us to improve Lyra. Some examples of what developers wanted were to run Lyra on more platforms, develop applications in more languages; and for a model that computes faster with more bitrate options and lower latency, and better audio quality with fewer artifacts.

That's why we are now releasing Lyra V2, with a new architecture that enjoys a wider platform support, provides scalable bitrate capabilities, has better performance, and generates higher quality audio. With this release, we hope to continue to evolve with the community, and with its collective creativity, see new applications being developed and new directions emerging.

New Architecture

Lyra V2 is based on an end-to-end neural audio codec called SoundStream. The architecture has a residual vector quantizer (RVQ) sitting before and after the transmission channel, which quantizes the encoded information into a bitstream and reconstructs it on the decoder side.

Lyra V2's SoundStream architecture
The integration of RVQ into the architecture allows changing the bitrate of Lyra V2 at any time by selecting the number of quantizers to use. When more quantizers are used, higher quality audio is generated (at a cost of a higher bitrate). In Lyra V2, we support three different bitrates: 3.2 kps, 6 kbps, and 9.2 kbps. This enables developers to choose a bitrate most suitable for their network condition and quality requirements.

Lyra V2's model is exported in TensorFlow Lite, TensorFlow's lightweight cross-platform solution for mobile and embedded devices, which supports various platforms and hardware accelerations. The code is tested on Android phones and Linux, with experimental Mac and Windows support. Operation on iOS and other embedded platforms is not currently supported, although we expect it is possible with additional effort. Moreover, this paradigm opens Lyra to any future platform supported by TensorFlow Lite.

Better Performance

With the new architecture, the delay is reduced from 100 ms with the previous version to 20 ms. In this regard, Lyra V2 is comparable to the most widely used audio codec Opus for WebRTC, which has a typical delay of 26.5 ms, 46.5 ms, and 66.5 ms.

Lyra V2 also encodes and decodes five times faster than the previous version. On a Pixel 6 Pro phone, Lyra V2 takes 0.57 ms to encode and decode a 20 ms audio frame, which is 35 times faster than real time. The reduced complexity means that more phones can run Lyra V2 in real time than V1, and that the overall battery consumption is lowered.

Higher Quality

Driven by the advance of machine learning research over the years, the quality of the generated audio is also improved. Our listening tests show that the audio quality (measured in MUSHRA score, an indication of subjective quality) of Lyra V2 at 3.2 kbps, 6 kbps, and
9.2 kbps measures up to Opus at 10 kbps, 13 kbps, and 14 kbps respectively.

Lyra vs. Opus at various bitrates

Sample 1

Sample 2


Opus       @6kbps


Opus     @10kbps

LyraV2 @3.2kbps

Opus           @13k

LyraV2    @6kbps

Opus     @14kbps

LyraV2 @9.2kbps

This makes Lyra V2 a competitive alternative to other state-of-the-art telephony codecs. While Lyra V1 already compares favorably to the Adaptive Multi-Rate (AMR-NB) codec, Lyra V2 further outperforms Enhanced Voice Services (EVS) and Adaptive Multi-Rate Wideband (AMR-WB), and is on par with Opus, all the while using only 50% - 60% of their bandwidth.

Lyra vs. state-of-the-art codecs

Sample 1

Sample 2






Opus           @13kbps

LyraV2    @6kbps

This means more devices can be connected in bandwidth-constrained environments, or that additional information can be sent over the network to reduce voice choppiness through forward error correction and packet loss concealment.

Open Source Release

Lyra V2 continues to provide what is already in Lyra V1 (the build tools, the testing frameworks, the C++ encoding and decoding API, the signal processing toolchain, and the example Android app). Developers who have experience with the Lyra V1 API will find that the V2 API looks familiar, but with a few changes. For example, now it's possible to change bitrates during encoding (more information is available in the release notes). In addition, the model definitions and weights are included as .tflite files. As with V1, this release is a beta version and the API and bitstream are expected to change. The code for running Lyra is open sourced under the Apache license. We can’t wait to see what innovative applications people will create with the new and improved Lyra!

By Hengchin Yeh - Chrome


The following people helped make the open source release possible: from Chrome: Alejandro Luebs, Michael Chinen, Andrew Storus, Tom Denton, Felicia Lim, Bastiaan Kleijn, Jan Skoglund, Yaowu Xu, Jamieson Brettle, Omer Osman, Matt Frost, Jim Bankoski; and from Google Research: Neil Zeghidour, Marco Tagliasacchi

GSoC 2022: The first phase of completed projects

Tuesday, September 20, 2022

Google Summer of Code (GSoC) is a global, online program focused on bringing new contributors into open source software development. GSoC contributors work with an open source organization on a 12+ week programming project under the guidance of mentors. The updates to the program beginning in 2022 included a rolling timeline, which allowed considerable flexibility for contributors to either finish their projects in 12 weeks or extend their deadline up to 22 weeks. We are happy to announce that 755 contributors from 51 countries have successfully completed this year’s program thus far. Congratulations!

No GSoC is complete without our dedicated mentors and organization administrators from all around the globe. In its 18th year of the program, GSoC continues to thrive due to its robust mentor community. There are 198 open source organizations—and over 2,000 mentors from 76 countries—that are participating in the 2022 program. A sincere thank you to our mentors and organization administrators for guiding and supporting our contributors this year.

Throughout the course of the program, evaluations provide us a window to the GSoC experience that only a contributor or a mentor have. Evaluations have also influenced a number of GSoC program changes and events. At the suggestion of last year’s contributors, we held this year’s Contributor Summit earlier with several talks on how to have a successful Google Summer of Code. Mentor and contributor comments helped validate the changes to both projects and contributors that expanded the reach and flexibility of GSoC in 2022. We also held feedback sessions with our mentors to talk through their questions, suggestions and opinions on the changes we implemented this year. Our contributors and mentors inspire GSoC administrators too, and for that, we are truly grateful.

We’ll be back in a couple of months to give a final update on the GSoC projects that will conclude in October and November. Google Summer of Code 2022 hasn’t fully ended just yet, so please stay tuned!

By Romina Vicente – Google Open Source Programs Office

Co-simulating ML with Springbok using Renode

Wednesday, September 14, 2022

The landscape of Machine Learning software libraries and models is evolving rapidly, and to satisfy the ever-increasing demand for memory and compute while managing latency, power and security considerations, hardware must be developed in an iterative process alongside the workloads it is meant to run.

With its open architecture, custom instructions support and flexible vector extensions, the RISC-V ISA offers an unprecedented capacity for such co-design. And by energizing the open hardware ecosystem, RISC-V has supercharged research and innovation into how to improve chipmaking itself to better leverage the methods and suit the needs of software. Initiatives such as Google’s OpenMPW Shuttle show how a more open and software-focused approach to building hardware, are key to enabling a new wave of more powerful and transparent ML-focused solutions.

A RISC-V-based ML accelerator with a HW/SW co-design flow

In the past months, Google Research has joined efforts with Antmicro to work on a silicon project that can serve as a template for efficient hardware-software co-design. For their secure ML solution, the Google Research team supported by Antmicro has been developing a completely open source, rapid pre-silicon ML development flow using Renode, Antmicro’s open source simulation framework.

This builds on the result of cooperation from last year in which Antmicro implemented Renode support for RISC-V Vector extensions, which are used in the Google team’s RISC-V based ML accelerator codenamed Springbok. To allow a more well-rounded developer experience, as part of the project Antmicro is also working on improving the support for the underlying SoC and a large number of user oriented features such as OS-aware debugging, performance optimizations, payload profiling and performance measurement capabilities.

Springbok is part of Google’s AmbiML project that aims to create an open source ML development ecosystem centered on privacy and security. By using the RISC-V Vector extensions, the Google Research team has a standard but flexible way to parallelize the matrix multiply and accumulate operations that are universal in ML payloads. And thanks to Renode, the team can make informed choices as to how exactly to leverage RISC-V’s flexibility by analyzing tradeoffs between speed, complexity and specialization in a practical, iterative fashion using data generated by Renode and the text-based configuration capabilities that let them play around with hardware composition and functionality in a matter of minutes, not days.

Diagram of A RISC-V-based ML accelerator with a HW/SW co-design flow

On the ML software side, the ecosystem revolves around IREE—Google’s research project developing an open source ML compiler and runtime for constrained devices, based on LLVM MLIR.

IREE allows you to load models from typical ML frameworks such as TensorFlow or TensorFlow Lite and then convert them to Intermediate Representation (MLIR), which later goes through optimizations on graph level and then through an LLVM compilation flow to get the best-fitted runtime for a specific target. When it comes to deploying models on target devices, IREE provides APIs for both the C and Python programming languages as well as a TFLite C API which provides the same convention as TFLite for model loading, tensor management and inference invoking.

Using these runtimes, the model can be deployed and tested, debugged, benchmarked and executed on the target device or in a simulation environment like Renode.

Demoing the flow at Spring 2022 RISC-V Week

In the build up to the Spring 2022 RISC-V Week in Paris, the first such large open hardware meeting in years, an initial version of the AmbiML bare metal ML flow was released as open source. This includes both the ability to run interactively and an example CI using Antmicro’s GitHub Renode Action showing how such a workflow can be tested automatically on each commit. As a Google Cloud partner, Antmicro is currently working with Google Cloud to make Renode available for massive scale CI testing and deployments for scenarios similar to this one.

In a joint talk at the Paris event, Antmicro and Google presented the software co-development flow, together with a demo of a heterogeneous multi-core solution, with one core running the AmbiML Springbok payload and another core running Zephyr.

In the presented scenario the Springbok core, acting as a ML compute offload unit to the main CPU, executed inference on the MobileNetv1 network and reported the work done to the application core via a RISC-V custom instruction. Adding and modifying custom instructions is trivial in Renode, either via a single line of Python, C#, or even co-simulated in RTL.

Renode helps ML developers and silicon designers not only to run and test their solutions, but also to learn more about what their software is actually doing. As part of the Paris demonstration, Antmicro and Google showed how you can count executed instructions and how often specific opcodes are used to measure how well your solution is performing. These features, accompanied by execution metrics analysis, executed functions logging, and recently developed execution trace generation, give you great insight into every detail of your emulated ML environment.

These capabilities join the wide arsenal of hardware/software co-development solutions in Renode, such as RTL co-simulation which Antmicro has been developing with Microchip and support for verilated custom instructions developed with another ML-focused Google team responsible for RISC-V Custom Function Units and also used in the EU-funded VEDLIoT project.

Future plans

This is just the beginning of a wider activity from the Google Research team Antmicro is working with to release software and hardware components as well as tools supporting a collaborative co-design ecosystem for secure ML development. If you think Renode, RISC-V and co-development could help in building your next ML-focused product, go ahead and try the AmbiML flow yourself!

Visit the iree-rv32-springbok repository on GitHub, clone it locally and follow the instructions from

Renode Repository

You can also grab Renode from the official repository and start playing with the available demos, or head to the Renode documentation to read up on features helpful for ML acceleration development such as Verilator co-simulation.

By Peter Zierhoffer – Antmicro