opensource.google.com

Menu

Bringing Live Transcribe's Speech Engine to Everyone

Friday, August 16, 2019

Earlier this year, Google launched Live Transcribe, an Android application that provides real-time automated captions for people who are deaf or hard of hearing. Through many months of user testing, we've learned that robustly delivering good captions for long-form conversations isn't so easy, and we want to make it easier for developers to build upon what we've learned. Live Transcribe's speech recognition is provided by Google's state-of-the-art Cloud Speech API, which under most conditions delivers pretty impressive transcript accuracy. However, relying on the cloud introduces several complications—most notably robustness to ever-changing network connections, data costs, and latency. Today, we are sharing our transcription engine with the world so that developers everywhere can build applications with robust transcription.

Those who have worked with our Cloud Speech API know that sending infinitely long streams of audio is currently unsupported. To help solve this challenge, we take measures to close and restart streaming requests prior to hitting the timeout, including restarting the session during long periods of silence and closing whenever there is a detected pause in the speech. Otherwise, this would result in a truncated sentence or word. In between sessions, we buffer audio locally and send it upon reconnection. This reduces the amount of text lost mid-conversation—either due to restarting speech requests or switching between wireless networks.



Endlessly streaming audio comes with its own challenges. In many countries, network data is quite expensive and in spots with poor internet, bandwidth may be limited. After much experimentation with audio codecs (in particular, we evaluated the FLAC, AMR-WB, and Opus codecs), we were able to achieve a 10x reduction in data usage without compromising accuracy. FLAC, a lossless codec, preserves accuracy completely, but doesn't save much data. It also has noticeable codec latency. AMR-WB, on the other hand, saves a lot of data, but delivers much worse accuracy in noisy environments. Opus was a clear winner, allowing data rates many times lower than most music streaming services while still preserving the important details of the audio signal—even in noisy environments. Beyond relying on codecs to keep data usage to a minimum, we also support using speech detection to close the network connection during extended periods of silence. That means if you accidentally leave your phone on and running Live Transcribe when nobody is around, it stops using your data.

Finally, we know that if you are relying on captions, you want them immediately, so we've worked hard to keep latency to a minimum. Though most of the credit for speed goes to the Cloud Speech API, Live Transcribe's final trick lies in our custom Opus encoder. At the cost of only a minor increase in bitrate, we see latency that is visually indistinguishable to sending uncompressed audio.

Today, we are excited to make all of this available to developers everywhere. We hope you'll join us in trying to build a world that is more accessible for everyone.

By Chet Gnegy, Alex Huang, and Ausmus Chang from the Live Transcribe Team

OpenCensus Web: Unlocking Full End-to-End Observability for Your Entire Stack

Friday, August 9, 2019

OpenCensus Web is a tool to trace and monitor the user-perceived performance of your web pages. It can help determine whether or not your web pages are experiencing performance issues that you might otherwise not know how to diagnose.

Web application owners want to monitor the operational health of their applications so that they can better understand actual user performance; however, capturing relevant telemetry from your web applications is often very difficult. Today, we are introducing OpenCensus Web (OC Web) to make instrumenting and exporting metrics and distributed traces from web applications simple and automatic.

Background

The OpenCensus project provides a set of language-specific instrumentation libraries that collect traces and metrics from applications and export them to tracing and monitoring backends like Prometheus, Zipkin, Jaeger, Stackdriver, and others.

The OpenCensus Web library is an implementation of OpenCensus that focuses on frontend web application code that executes in the browser. OC Web instruments web pages and collects user-side performance data, including latency and distributed traces, which gives developers the information to diagnose frontend issues and monitor overall application health.

Overshadowing the work on OC Web, the wider OpenCensus family of projects is merging with OpenTracing into OpenTelemetry. OpenCensus Web’s functionality will be migrated into OpenTelemetry JS once this project is ready, although OC Web will continue working as an alpha release in the meantime.

Architecture

OC Web interacts with three application components:
  • Frontend web server: renders the initial HTML to the browser including the OC Web library code and configuration. This would typically be instrumented with an OpenCensus server-side library (Go, Java, etc.). We also suggest that you create an endpoint in the server that receives HTTP/JSON traces and proxies to the OpenCensus Agent.
  • Browser JS: the OC Web library code that runs in the browser. This measures user interactions and collects browser data and writes them to the OpenCensus Agent as spans via HTTP/JSON.
  • OpenCensus Agent: receives traces from the frontend web server proxy endpoint or directly from the browser JS, and exports them to a trace backend (e.g. Stackdriver, Zipkin).
OC Web requires the OpenCensus Agent, which will proxy and re-export telemetry to your backend of choice. For more details see the documentation.


Features

Initial page load tracing

You can use OC Web to capture traces of initial page loads, which will even capture events that take place before the OC Web library was loaded by the browser! Initial page load traces show you which resources may be causing poor website performance, and contain data that you can’t typically capture from a distributed tracing system.

To measure the time of the overall initial page load interaction, OC Web waits until after the document load event and generates spans from the initial load performance timings via the browser's Navigation Timing and Resource Timing APIs. Below is a sample trace from OC Web that has been exported to Zipkin and captured from the initial load example app. Notice that there is an overall ‘nav./’ span for the user navigation experience until the browser load event fires.

This example also includes ‘/’ spans for the client and server side measurements of the initial HTML load. These spans are connected by the server sending back a ‘window.traceparent’ variable in the W3C Trace Context format, which is necessary because the browser does not send a trace context header for the initial page load. The server side spans also indicate how much time was spent parsing and rendering the template:

Notice the long js task span in the previous image, which indicates a CPU-bound JavaScript event loop that took 80.095ms, as measured by the Long Tasks browser API.

Span annotations for DOM and network events

Spans captured by OC Web also include detailed annotations for DOM events like `domInteractive` and `first-paint`, as well as network events like domainLookupStart and secureConnectionStart. Here is a similar trace exported to Stackdriver Trace with the annotations expanded:


User Interactions

For single page applications there are often subsequent interactions after the initial load (e.g. user clicks a button or navigates to a different section of the page). Measuring end-user interactions within a browser application adds useful data for your application:
  • Ability to relate an initial page render with subsequent on-page interactions
  • Visibility into slowness as perceived by the end user, for example, an unresponsive page after clicking
Currently, OC Web tracks clicks and route transitions by monkey-patching the Angular Zone.js library. OC Web tracks the subsequent synchronous and asynchronous tasks (e.g. setTimeouts, XHRs, etc.) caused by the interaction even if there are several concurrent interactions.

Automatic tracing for click events

All browser click events are traced as long as the click is done in a DOM element (e.g. button) and the clicked element is not disabled. When the user clicks the element, a new Zone is created to measure this interaction and determine the total time.

To name this root span, we provide developers with the option of adding the attribute data-ocweb-id to elements and give a custom name to the interaction. For the next example, the resulting name will be ‘Save edit user info’:
<button type="submit" data-ocweb-id="Save edit user info">       Save changes </button>
This helps you to identify the traces related to a specific element. Also, this may avoid ambiguity when there are similar interaction. If you don’t add this attribute, OC Web will use the DOM element ID, the tag name plus the event involved in the interaction. For example, clicking this button:
<button id="save_changes"> Save changes </button>
will generate a span named : “button#save_changes click”.

Automatic tracing for route transitions

OC Web traces route transitions between the different sections of your page by monkey-patching the History API. OC Web will name these interactions with the pattern ‘Navigation /path/to/page’. The following screenshot of a trace exported to Stackdriver from the user interaction example shows a Navigation trace which includes several network calls before the route transition is complete:

Creating your own custom spans

OC Web allows you to instrument your web application with custom spans for tasks or code involved in a user interaction. Here is a code snippet that shows how to do this:

import { tracing } from '@opencensus/web-instrumentation-zone';

function handleClick() {

  // Start child span of the current root span on the current interaction.
  // This must run in in code that the button is running.
  const childSpan = tracing.tracer.startChildSpan({
    name: 'name of your child span'
  });
  // Do some operation...
  // Finish the child span at the end of it's operation
  childSpan.end();
}

See the OC Web documentation for more details.

Automatic spans for HTTP requests and Browser performance data

OC Web automatically intercepts and generates spans for HTTP requests generated by user interactions. Additionally, OC Web attaches Trace Context Headers to each intercepted HTTP request, using the W3C Trace Context format. This is only done for same-origin requests or requests that match a provided regex.

If your servers are also instrumented with OpenCensus, these requests will continue to be traced throughout your backend services! This lets you know if the issues are related to either the front-end or the server-side.

OC Web also includes Performance API data to make annotations like domainLookupStart and responseEnd and generates spans for any CORS preflight requests.

The next screenshot shows a trace exported to Stackdriver as result of the user interaction example. There, you can see the several network calls with the automatic generated spans (e.g. ‘Sent./sleep’) with annotations, the server-side spans (e.g. ‘/sleep’ and ‘ocweb.handlerequest’) and CORS Preflight related spans:

Relate user interactions back to the initial page load tracing

OC Web attaches the initial page load trace id to the user interactions as an attribute and a span link. This enables you to do a trace search by attribute to find the initial load trace and its interactions traces via a single attribute query as well as letting you understand the whole navigation of a user through the application for a given page load.

The next screenshot shows a search by initial_load_trace_id attribute containing all user interaction traces after the initial page loaded:


Making it Real

With OC Web and a few lines of instrumentation, you can now export distributed traces from your web application. Start exploring the initial load and user interaction examples and you're welcome to poke around the source code and send us feedback via either Gitter or contributing with Pull Requests!

By Cristian González – OpenCensus Team – Software Engineering intern at Google Summer 2019 and student of Computer and Systems Engineering at Universidad Nacional de Colombia.

Special thanks to Dave Raffensperger for being initial creator of OC Web and guiding me in the process to develop i
t.

Season of Docs Announces Technical Writing Projects

Tuesday, August 6, 2019

Season of Docs has announced the technical writers participating in the program and their technical writing projects! You can view a list of organizations and technical writing projects on the website.

The program received nearly 450 technical writer applications, and with them, over 700 technical writing project proposals. The enthusiasm from the technical writing and open source communities has been amazing!

What is next?

During the community bonding period from August 7 to September 1, mentors must work with the technical writers to prepare them for the doc development phase. By the end of community bonding, the technical writer should be familiar with the open source project and community, understand of the product as a whole, establish communication channels with the mentoring organization, and set clear goals and expectations for the project. These are critical to the successful completion of the technical writing project.

Documentation development begins on September 2, 2019.

What is Season of Docs?

Documentation is essential to the adoption of open source projects as well as to the success of their communities. Season of Docs brings together technical writers and open source projects to foster collaboration and improve documentation in the open source space. You can find out more about the program on the introduction page of the website.

During the program, technical writers spend a few months working closely with an open source community. They bring their technical writing expertise to the project's documentation and, at the same time, learn about the open source project and new technologies.

The open source projects work with the technical writers to improve the project's documentation and processes. Together, they may choose to build a new documentation set, redesign the existing docs, or improve and document the project's contribution procedures and onboarding experience.

General Timeline

August 6Google announces the accepted technical writer projects
August 7 - September 1Community bonding: Technical writers get to know mentors and the open source community, and refine their projects in collaboration with their mentors
September 2 - November 29Technical writers work with open source mentors on the accepted projects, and submit their work at the end of the period.
December 10Google publishes the list of successfully-completed projects.

See the full timeline for details, including the provision for projects that run longer than three months.

Find out more

Explore the Season of Docs website at g.co/seasonofdocs to learn more about the program. Use our logo and other promotional resources to spread the word. Check out the FAQ for further questions!

By Andrew Chen, Google Open Source and Sarah Maddox, Google Technical Writer
.