Code Search for Google open source projects

Wednesday, April 1, 2020

We are pleased to launch Code Search for Google open source projects. Code Search is one of Google’s most popular internal tools, and now we have a version (same binary, different flags) targeted to open source communities.

Googlers use Code Search every day to help understand the codebase: they search for half-remembered functions and usages; jump through the codebase to figure out what calls the function they are viewing; and try to identify when and why a particular line of code changed.

The Code Search tool gives a rich code browsing experience. For example, the blame button shows which user last changed each line and you can display history on the same page as the file contents. In addition, it supports a powerful search language and, for some repositories, cross-references.

Suggest-as-you-type in any search box annotates suggestions with the type of code object, the repository and the path, helping users find what they want faster.

The search language supports regular expressions and a number of helpful search atoms. For a user looking for a function foo in a Go file, instead of sifting through thousands of results containing foo, the user can search for lang:go function:foo to limit search results to Go files where foo is a function and not a struct or a word in a comment.

One example is finding a file using only part of the name. The query goes directly to the file, since there is only one such file.

See the quick reference for more information.

In addition to text search, some of the open source repositories have cross-references powered by Kythe. Kythe is a Google open source project that includes tools to help understand code. Project owners instrument a build of their repository to output compilation information for Kythe. Kythe tools convert this data to a graph. This graph connects definitions to declarations and code references to the abstract objects they represent (described by a graph schema). Google then runs an internal pipeline that combines these graphs for the different languages, prunes unnecessary pieces, and optimizes it for serving cross-references. The whole process runs several times per day to keep the data fresh.

Open source communities use a broader set of build systems than Google. In order to support cross-references, Kythe added drop-in support for Bazel, CMake, Maven, and Go. Projects using other build systems can use Kythe-provided wrappers for clang and javac to instrument their builds; these are used by Chromium and Android AOSP to provide compilation information for Kythe.

Because Kythe is based on the build, Kythe cross-references include links to files generated as part of the build process, such as Java files generated for AutoValues (example here) or protos. For repositories where cross-references are enabled, clicking on a symbol will take you to a definition of that symbol.

Clicking on the definition of a symbol will open a cross-reference panel, showing all the places where that symbol is referenced. For example, clicking on toVName below, we can see the places that reference this method. One of the callers is parseVName, and clicking on that shows the callers of that method.

At this time, we only provide search on the repositories listed below, but we plan to add more over time:
We are also investing in making the application keyboard-navigable and usable with a screen reader.

We hope you find this tool useful!

By Kris Hildrum, Code Search Team