Using TensorFlow and JupyterHub in Classrooms

Monday, October 31, 2016

We’ve published a new solution and a companion GitHub repository that guide you through setting up a Google Container Engine cluster running JupyterHub, which automatically provisions a secure Jupyter container for each user in a classroom or team. Don’t let the title of this article mislead you: the solution uses TensorFlow and JupyterHub, but it’s really an open source and cloud smorgasbord built on the Jupyter and Kubernetes platforms.

Jupyter is a powerful open source technology that gives you a platform to write and execute code to analyze, visualize and share the discoveries you find in your big data set. You can download a number of different Docker images preconfigured with many different notebook extensions and software packages to help you on any kind of data-science quest.

If you’re exploring on your own, and really want to get started quickly, you can get this all running on your local computer, but what if you want to take your expertise and lead a classroom of people along the same path? You have to either configure everything for them or walk them through configuring their own machines with all the required software.

This is where JupyterHub comes in. It acts as a management layer in front of Jupyter instances, letting you configure users with custom authentication and giving you a Python interface to spawn a new Jupyter instance for each user. Even with JupyterHub, you still need a way to provision physical and virtual hardware for the students.

Enter Kubernetes, an open source system for automating the deployment, scaling and management of containerized applications. Google Container Engine is a fully managed service based on Kubernetes that lets you create clusters easily on Google Cloud Platform.

This solution comes with a JupyterHub Spawner class that creates a Kubernetes Pod for each user, with each Pod running a container started from a Jupyter Docker image. It also comes with all the automation scripts required to create a Container Engine cluster and to let you easily customize your setup.
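To make the per-user Pod idea concrete, here is a minimal sketch of the kind of Pod manifest a spawner might build for one user. It is not the Spawner class from the companion repository: the function name build_user_pod_manifest, the "jupyter-" name prefix, the labels, and the default port are all assumptions chosen for illustration.

```python
import re


def build_user_pod_manifest(username, image, namespace="default", port=8888):
    """Return a Kubernetes Pod manifest (as a plain dict) for one user's Jupyter server.

    Hypothetical helper for illustration only; the names and defaults here
    are assumptions, not taken from the published solution.
    """
    # Conservatively reduce the username to lowercase letters, digits and hyphens
    # so it is safe to use in a Pod name and as a label value.
    safe_name = re.sub(r"[^a-z0-9-]", "-", username.lower()).strip("-")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"jupyter-{safe_name}",
            "namespace": namespace,
            "labels": {"app": "jupyter", "user": safe_name},
        },
        "spec": {
            # One container per Pod, started from the Jupyter image the user chose.
            "containers": [
                {
                    "name": "notebook",
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
            ],
            "restartPolicy": "Never",
        },
    }


manifest = build_user_pod_manifest(
    "Ada.Lovelace@example.com", "jupyter/datascience-notebook"
)
```

A real spawner would submit a manifest like this to the Kubernetes API and then report the Pod's address back to JupyterHub; this sketch stops at building the dictionary so that it runs without a cluster.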

When your students log into JupyterHub using Google OAuth2, they can choose from a list of several pre-built Jupyter images. These include a newly updated “datalab-jupyter” image, which comes with the Google Datalab open source notebook extension for integration with BigQuery, Google Cloud ML and Stackdriver, and which also has TensorFlow and the Apache Beam Python SDK for Google Cloud Dataflow installed. Users can also choose to run any of the pre-configured Jupyter docker-stacks images, or you can build your own Docker images to run any special libraries or Jupyter configurations you want.

We hope that this solution allows you to get your classroom or team environment running quickly so you can focus on learning rather than configuring machines.

By Brad Svee, Cloud Solutions Architect