opensource.google.com

Menu

Life after Season of Docs

Tuesday, January 18, 2022


My journey to technical writing involved a long, windy, and non-linear career path. Before I became a technical writer, I spent years working in finance jobs with a stint of teaching in between. Seeking a career change, I went back to university where I came across the Technical Writing certificate program. It was the perfect fit for my skills and interests.

One of Google’s technical writers, Nicole Yap, visited my class to talk about her career, and introduce Season of Docs—a program that brings technical writers and open source projects together to work on open source documentation. My interest was piqued as it seemed like a great opportunity for a new graduate. With no real world experience of writing documentation, I applied and was accepted into Season of Docs. I worked with Oppia, an online learning platform for a 3-month project, where I created a user guide with video tutorials.

During that time, I had to quickly become familiar with many new concepts:
  • Open source philosophy
  • Writing docs-as-code
  • Command-line basics
  • Submitting and amending pull requests on GitHub, and much more!
In the course of the Season of Docs program, I got my first full time job as a technical writer at a software company in Toronto. Juggling the demands of the project and my new job was challenging, but I was grateful for the experience as I could transfer the skills I learned to the new role. 

Opening doors to new experiences

I had such a positive experience working with my mentors1 at Oppia that we mutually agreed to extend our relationship. Over the next year, I continued to work with Oppia in different capacities—copywriting, editing, helping write math lessons—while getting to know the network of international volunteers who contribute to this incredible organization.

I also had the opportunity to present a talk at a Write the Docs Toronto meetup which was a great way to plug Season of Docs, and demonstrate what I had learnt during the program. There was quite a bit of interest from the audience as many hadn’t even heard of the program before.

My Season of Docs experience also helped me with my day job as a technical writer. After experiencing the steep learning curve with Oppia, I was able to hit the ground running with learning the new job processes at the software company. I was also able to fall back on my Season of Docs experience as I created marketing and technical videos in my new job as well.

A new opportunity

At the start of 2021, I had the opportunity to apply for a technical writing position at Google. I had the notion that a company like Google would require years of tech writing experience before they would even consider my application, but that turned out not to be true. I’ve been a technical writer at Google for four months now, and it still feels a bit surreal!

As a newcomer in the tech world, I find that everything I learned during the Season of Docs program has come in handy in helping me understand my job a little better. Getting into Season of Docs as a new entrant to the field of technical writing was a confidence-booster for me, and the path it led to has been challenging yet gratifying. I’m excited to continue learning every single day from the sea of talent around me.


By Audrey Tavares – Google Cloud


  1. The current Season of Docs program format does not have a defined mentor role, but technical writers in the program work closely with project contributors to learn open source skills.  




DeepNull: an open-source method to improve the discovery power of genetic association studies

Friday, January 14, 2022

In our paper “DeepNull models non-linear covariate effects to improve phenotypic prediction and association power,” we proposed a new method, DeepNull, to model the complex relationship between covariate effects on phenotypes to improve Genome-wide association studies (GWAS) results. We have released DeepNull as open source software, with a Colab notebook tutorial for its use.

Human Genetics 101

Each individual’s genetic data carries health information such as why certain individuals have a lower risk of developing skin cancer compared to others or why certain drugs differ in effectiveness between individuals. Genetic data is encoded in the human genome—a DNA sequence—composed of a 3 billion long chain built from four possible nucleotides (A, C, G, and T). Only a small subset of the genome (~4-5 million positions) varies between two individuals. One of the goals of genetic studies is to detect variants that are associated with different phenotypes (e.g., risk of diseases such as Glaucoma or observed phenotypic values such as high-density lipoprotein (HDL), low-density lipoproteins (LDL), height, etc).

Genome-wide association studies

GWAS are used to associate genetic variants with complex traits and diseases. To more accurately determine an association strength between genotype and phenotype, the interactions between phenotypes (such as age and sex) and principal components (PCs) of genotypes, must be adjusted for as covariates. Covariate adjustment in GWAS can increase precision and correct for confounding. In the linear model setting, adjustment for a covariate will improve precision (i.e., statistical power) if the distribution of the phenotype differs across levels of the covariate. For example, when performing GWAS on height, males and females have different means. All state of the art methods (e.g., BOLT-LMM, regenie) perform GWAS assuming that the effect of genotypes and covariates to phenotype is linear and additive. However, we know that the assumption of linear and additive contributions of covariates often does not reflect underlying biology, so we sought a method to more comprehensively model and adjust for the interactions between phenotypes for GWAS.

DeepNull method overview

We proposed a new method, DeepNull, to relax the linear assumption of covariate effects on phenotypes. DeepNull trains a deep neural network (DNN) to predict phenotype using all covariates in a 5-fold cross-validation. After training the DeepNull model, we make phenotype predictions for all individuals and add this prediction as one additional covariate in the association test. Major advantages of DeepNull are its simplicity to use and that it requires only a minimal change to existing GWAS pipeline implementations. In other words, to use DeepNull, we just need to add one additional covariate, which is computed by DeepNull, to the existing pipeline to perform GWAS.

DeepNull improves statistical power

We simulated data under different genetic architectures (genetic conditions) to first check that DeepNull controls type I error and then compare DeepNull statistical power with current state of the art methods (hereafter referred to as “Baseline”). First, we simulated data under genetic architectures where covariates have a linear effect on phenotype and observed that both Baseline and DeepNull have tight control of type I error. It is interesting that DeepNull power does not decrease compared to Baseline under a setting in which covariates have only a linear effect on phenotype. Next, we simulated data under genetic architectures where covariates have non-linear effects on phenotype. Both Baseline and DeepNull have tight control of type I error while DeepNull increases the statistical power depending on the genetic architecture. We observed that for certain genetic architectures, DeepNull increases the statistical power up to 20%. Below, we compare the -log p-value of test statistics computed from DeepNull versus Baseline for Apolipoprotein B (ApoB) levels obtained from UK Biobank:
Figure 1. Significance level comparison of DeepNull vs Baseline. X-axis is the -log p-value of Baseline and Y-axis is the -log p-value of DeepNull. The orange dots indicate variants that are significant for Baseline but not significant for DeepNull and green dots indicate variants that are significant for DeepNull but not significant for Baseline.

DeepNull improves phenotype prediction

We applied DeepNull to predict phenotypes by utilizing polygenic risk score (PRS) and existing covariates such as age and sex. We considered 10 phenotypes obtained from UK Biobank. We observed that DeepNull on average increased the phenotype prediction (R2 where R is Pearson correlation) by 23%. More strikingly, in the case of Glaucoma, referral probability that is computed from the fundus images (Phene et al. Ophthalmology 2019, Alipanahi et al AJHG 2021), DeepNull improves the phenotype prediction by 83.4% and in the case of LDL, DeepNull improves the phenotype prediction by 40.3%. The summary of DeepNull results versus Baseline are shown in figure 2 below:

 
 

Figure 2. DeepNull improves phenotype prediction compared to Baseline. The Y-axis is the R2 where R is the Pearson’s correlation between true and predicted value of phenotypes. Phenotypic abbreviations: alkaline phosphatase (ALP), alanine aminotransferase (ALT), aspartate aminotransferase (AST), apolipoprotein B (ApoB), glaucoma referral probability (GRP), LDLcholesterol (LDL), sex hormone-binding globulin (SHBG), and triglycerides (TG).

Conclusion

We proposed a new framework, DeepNull, that can model the nonlinear effect of covariates on phenotypes when such nonlinearity exists. We show that DeepNull can substantially improve phenotype prediction. In addition, we show that DeepNull achieves results similar to a standard GWAS when the effect of covariate on the phenotype is linear and can significantly outperform a standard GWAS when the covariate effects are nonlinear. DeepNull is open source and is available for download from GitHub or installation via PyPI.

By Farhad Hormozdiari and Andrew Carroll – Genomics team in HealthAI

Acknowledgments

This blog summarizes the work of the following Google contributors, who we would like to thank: Zachary R. McCaw, Thomas Colthurst, Ted Yun, Nick Furlotte, Babak Alipanahi, and Cory Y. McLean. In addition, we would like to thank Alkes Price, Babak Behsaz, and Justin Cosentino for their invaluable comments and suggestions.

Season of Docs announces results of 2021 program

Tuesday, December 14, 2021


Season of Docs has announced the 2021 program results for all projects. You can view a list of successfully completed projects on the website along with their case studies.

In 2021, the Season of Docs program allowed open source organizations to apply for a grant based on their documentation needs. Selected open source organizations then used their grant to hire a technical writer directly to complete their desired documentation project. Organizations then had six months to complete their documentation project. (In previous years, Google matched technical writers to projects and paid the technical writers directly.)

The 2021 Season of Docs documentation development phase began on April 16 and ended November 16, 2021 for all projects:
  • 30 open source organizations finished their projects (100% completion)
  • 93% of organizations had a positive experience
  • 96% of the technical writers had a positive experience
Take a look at the list of completed projects to see the wide range of subjects covered!

What is next?

Stay tuned for information about Season of Docs 2022—watch for posts on this blog and sign up for the announcements email list. We’ll also be sharing information about best practices in open source technical writing derived from the Season of Docs case studies.

If you were excited about participating, please do write social media posts. See the promotion and press page for images and other promotional materials you can include, and be sure to use the tag #SeasonOfDocs when promoting your project on social media. To include the tech writing and open source communities, add #WriteTheDocs, #techcomm, #TechnicalWriting, and #OpenSource to your posts.


By Kassandra Dhillon and Erin McKean, Google Open Source Programs Office
.