Installation guide for setting up JupyterHub on Azure

The following are instructions for Linux users who want to set up and run JupyterHub on Microsoft Azure cloud services. Most of the instructions translate well to Unix/Mac users; only a few are Linux specific. There is more than one way to install JupyterHub, and the following instructions demonstrate a preference for a command line interface. Prerequisites include the installation of `az`, `kubectl`, and `helm` on a local machine.

JupyterHub’s architecture runs on Linux/Unix OS, including a ready-to-go Docker image. Installation on Windows OS is not supported. Deployment on cloud services leverages the container orchestration software Kubernetes, minimizing dependencies on a specific cloud service provider and improving portability. Helm is a package manager for Kubernetes which manages installation and updating of Kubernetes applications in coordination with the Tiller service.

Instructions are grouped in three parts, each with its own ‘Quick Start’ section. Further details and explanations are provided after each Quick Start section for those wanting more than an amalgamation of sequential commands. The Quick Start sections and subsequent instructions make assumptions about naming directories, namespaces, accounts and releases.

Prerequisites

1. Azure Cloud Pay-As-You-Go Account

Microsoft offers a free-trial account https://azure.microsoft.com/en-us/free/ but JupyterHub installation using a Free trial subscription is bound to fail so long as the VM resources required by JupyterHub exceed the limitations of the subscription (Fig 1). At the time of writing Free Trial subscriptions are limited to VMs with four cores and are not eligible for limit or quota increases.


Figure 1: Error message received during installation indicating limitations with the Free Trial subscription account

JupyterHub installation requires at least six cores: the cluster created below uses three Standard_D2s_v3 nodes with two vCPUs each, for a total of six, which exceeds the four-core Free Trial limit. A Free Trial subscription therefore prevents the installation of JupyterHub. Pay-As-You-Go accounts can be created at the following URL: https://azure.microsoft.com/en-gb/pricing/purchase-options/pay-as-you-go/

2. Local dependencies

The cluster can be managed from a local machine. Installing the Azure CLI locally lets you do this from a terminal, bypassing the need to visit the web portal every time. Install the following local dependencies, `az`, `kubectl`, and `helm`, by copying and pasting the commands into a terminal.

Azure CLI (`az`)

curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

Helm (`helm`)

curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get | bash

Kubernetes Cluster Manager (`kubectl`)

sudo snap install kubectl --classic

OR

sudo apt-get update && sudo apt-get install -y apt-transport-https

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list

sudo apt-get update

sudo apt-get install -y kubectl

OR

(macOS only) brew install kubectl
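Before moving on to Part 1, it can be worth confirming that all three tools are actually on your PATH. The helper below is a small sketch, not part of the official instructions:

```shell
# Check that each required CLI is on PATH; report any that are missing.
# (Helper function is an illustration, not an official Azure/Helm command.)
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

check_tools az kubectl helm || echo "install the missing tools before Part 1"
```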

Part 1: Kubernetes Cluster Setup

Quick Start

If you want to skip the explanations given in Part 1, Sections 1-8, all of the relevant commands are listed in sequence for your convenience. The commands assume the local directory and cluster are named ‘az-jupyterhub’, the resource group ‘comp689_jupyter_hub’, and the data centre location ‘Canada Central’.

DESCRIPTION COMMAND
Login to Azure
az login
Verify subscription
az account list --refresh --output table
Create Resource Group
az group create --name=comp689_jupyter_hub --location="Canada Central" --output=table
Make local directory
mkdir az-jupyterhub && cd az-jupyterhub
Generate key pair
ssh-keygen -f ssh-key-az-jupyterhub
Create Kubernetes Cluster
az aks create --name az-jupyterhub --resource-group comp689_jupyter_hub --ssh-key-value ssh-key-az-jupyterhub.pub --node-count 3 --node-vm-size Standard_D2s_v3 --output table
Download Credentials
az aks get-credentials --name az-jupyterhub --resource-group comp689_jupyter_hub --output table
Verify Cluster
kubectl get node

1. Login to Azure

Once the Azure command `az` is installed locally, the following will prompt you to login to the service through a browser interface. Only after completing this login step will you be able to install a Kubernetes cluster and communicate with the Azure portal.

az login

If this is a new account, you may be prompted that you “have no storage mounted” and that the shell feature requires an Azure file share to persist files, including saved login credentials. Further prompts may warn that creating a storage account will incur a small monthly cost; accepting this is required in order to continue.

2. Verify Azure Subscription

Since you can scale up and manage many subscriptions from the command line, it is important to associate the following commands with the right account. If this is the first and only Azure subscription you have, the following command will list only one subscription:

az account list --refresh --output table

3. Create a Resource Group

In Azure, computational resources are allocated to one application and distinguished from other applications through a resource group, which requires a unique name and a location. The data centre location chosen in the following example is ‘Canada Central’ and the unique name is ‘comp689_jupyter_hub’.

az group create --name=comp689_jupyter_hub --location="Canada Central" --output=table

Figure 2: terminal output for creating a resource group in Azure

4. Cluster Name

Some files need to be kept on your local machine. Choose a name for your cluster and create a local directory with the same name. This name should also be used, in part, to identify the ssh key pair associated with the cluster. In the example below, ‘az-jupyterhub’ was chosen.

mkdir az-jupyterhub
cd az-jupyterhub

5. Authorization

Authentication between your local machine and the Kubernetes cluster is facilitated by a public/private ssh key. Interacting with and configuring your cluster will rely on the files created in this next step. Run the following command which generates a public/private key pair with a similar name ‘ssh-key-az-jupyterhub’.

ssh-keygen -f ssh-key-az-jupyterhub

Note: You can replace the name ‘az-jupyterhub’ with your own name for your cluster.
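If you want to script this step, the sketch below generates the pair non-interactively in a scratch directory (the `-N ""` flag sets an empty passphrase and `-q` suppresses output; for a real cluster you would run it in your ‘az-jupyterhub’ directory and may prefer a passphrase):

```shell
# Generate the key pair non-interactively in a scratch directory and
# confirm both halves were written. Names follow this guide's convention.
keydir=$(mktemp -d)
ssh-keygen -t rsa -b 2048 -f "$keydir/ssh-key-az-jupyterhub" -N "" -q
[ -f "$keydir/ssh-key-az-jupyterhub" ] && [ -f "$keydir/ssh-key-az-jupyterhub.pub" ] \
  && echo "key pair created"
```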

6. Create an Azure Kubernetes Cluster

A request can now be made to create a Kubernetes cluster with the following details: Authentication (‘ssh-key-az-jupyterhub.pub’), resource group (‘comp689_jupyter_hub’), and a cluster name (‘az-jupyterhub’).

az aks create --name az-jupyterhub \
  --resource-group comp689_jupyter_hub \
  --ssh-key-value ssh-key-az-jupyterhub.pub \
  --node-count 3 \
  --node-vm-size Standard_D2s_v3 \
  --output table

7. Download Kubernetes Credentials

If the cluster creation is successful, configuration details will also be created (tokens, certificates, etc.) which link your account to the cluster. Download these credentials to your local machine with the following command:

az aks get-credentials \
  --name az-jupyterhub \
  --resource-group comp689_jupyter_hub \
  --output table

This will allow you to interact with your newly created Kubernetes cluster using the `kubectl` command on your local machine.

8. Check Kubernetes Cluster Functionality

If successful, the following command should list three running nodes:

kubectl get node

Figure 3: Confirming Kubernetes cluster is functional
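If you prefer a scripted check, the sketch below counts the Ready nodes. It runs against sample output here (the node names are illustrative, not from a real cluster); with a live cluster you would pipe `kubectl get node` into the same filter:

```shell
# Sample 'kubectl get node' output; node names and versions are
# illustrative. The awk filter counts nodes whose STATUS column is Ready.
sample_nodes='NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-12345678-0   Ready    agent   5m    v1.12.8
aks-nodepool1-12345678-1   Ready    agent   5m    v1.12.8
aks-nodepool1-12345678-2   Ready    agent   5m    v1.12.8'

ready=$(echo "$sample_nodes" | tail -n +2 | awk '$2 == "Ready"' | wc -l)
[ "$ready" -eq 3 ] && echo "three nodes Ready"
```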

Part 2: Helm / Tiller setup

Quick Start

If you want to skip the explanations given in Part 2, Sections 1-3, all of the relevant commands are listed for your convenience:

DESCRIPTION COMMAND
Tiller setup
kubectl --namespace kube-system create serviceaccount tiller
Tiller permissions
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
Helm/Tiller start
helm init --service-account tiller --wait
Secure Tiller
kubectl patch deployment tiller-deploy --namespace=kube-system --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'
Verify Helm
helm version


As a package manager for Kubernetes applications, Helm and Tiller work together to describe and deploy resources within a cluster. Tiller acts as a service on the cloud which interacts with the cluster. Helm is the client for that service. Helm charts describe deployment instructions that are sent to the Tiller service which then interacts with the Kubernetes cluster.

1. Tiller

Setup a service account for Tiller with the following command:

kubectl --namespace kube-system create serviceaccount tiller

Give that service account permission to manage the Kubernetes cluster:

kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller

2. Helm

The following command will set up the Helm client locally and start the Tiller service in the cluster.

helm init --service-account tiller --wait

It only has to be run once and then future changes can be deployed with the Helm client which will tell Tiller what instructions to execute within the cluster.

3. Secure Tiller and Verify Helm

Since the Tiller service runs inside the cluster with elevated permissions to control it, Tiller must be configured to listen only on localhost rather than accept connections from elsewhere in the cluster. Leaving Tiller’s port open to probing would allow pods in the cluster to exploit its elevated permissions. Secure Tiller with the following command:

kubectl patch deployment tiller-deploy --namespace=kube-system --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'

Then you can verify (Fig. 4) that Helm and Tiller are installed properly by checking that the Helm and Tiller versions match:

helm version

Figure 4: Verification of proper Helm and Tiller installation
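Helm 2 prints one line each for the client and the Tiller server. The sketch below extracts and compares the two SemVer fields; it runs against sample output here (version and commit values are illustrative), but the same parsing works on live `helm version` output:

```shell
# Sample Helm 2 'helm version' output; version strings are illustrative.
sample_helm='Client: &version.Version{SemVer:"v2.11.0", GitCommit:"abc123", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.11.0", GitCommit:"abc123", GitTreeState:"clean"}'

# Pull the SemVer field out of each line and compare.
client=$(echo "$sample_helm" | grep '^Client' | sed 's/.*SemVer:"\([^"]*\)".*/\1/')
server=$(echo "$sample_helm" | grep '^Server' | sed 's/.*SemVer:"\([^"]*\)".*/\1/')
[ "$client" = "$server" ] && echo "Helm and Tiller versions match: $client"
```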

Part 3: JupyterHub Setup

Quick Start

If you want to skip the explanations given in Part 3, Sections 1-5, all of the relevant commands are listed for your convenience. Assumptions are made to name both the namespace and the release ‘jhub’.

DESCRIPTION COMMAND
Create config file
{ echo "proxy:"; echo "  secretToken: \"$(openssl rand -hex 32)\""; } | tee config.yaml
Add Helm repo
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
Update repo
helm repo update
Install JupyterHub
helm upgrade --install jhub jupyterhub/jupyterhub --namespace jhub --version=0.8.2 --values config.yaml
Validate Installation
kubectl get pod --namespace jhub
Get External IP
kubectl get service --namespace jhub

1. Config file

Working from the local directory created earlier (the one containing the ssh keys), generate a security token and add it to a `config.yaml` file. The following will generate a random string:

openssl rand -hex 32

Copy and paste the random string generated by the previous command into the `secretToken` field in the `config.yaml` file, formatted in the following way:

proxy:
  secretToken: "<random_hex_value_here>"

Note that `secretToken` must be indented two spaces under `proxy`, with a space after the colon, for the YAML to be valid.

Note: you can combine both steps above with:

{ echo "proxy:"; echo "  secretToken: \"$(openssl rand -hex 32)\""; } | tee config.yaml
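To sanity-check the result, the self-contained sketch below writes the file to a scratch directory (so it won’t clobber an existing `config.yaml`) and verifies the token is a 64-character hex string properly indented under `proxy`:

```shell
# Generate config.yaml in a scratch directory and check its shape:
# 'proxy:' at top level, 'secretToken' indented two spaces beneath it,
# and a 64-character hex token (32 random bytes, hex-encoded).
workdir=$(mktemp -d)
{ echo "proxy:"; echo "  secretToken: \"$(openssl rand -hex 32)\""; } > "$workdir/config.yaml"

grep -q '^proxy:$' "$workdir/config.yaml"
grep -Eq '^  secretToken: "[0-9a-f]{64}"$' "$workdir/config.yaml" \
  && echo "config.yaml looks well-formed"
```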

2. Helm repo

Next, make the Helm client aware of the JupyterHub Helm chart repository so that it knows where to find the latest Helm charts created by JupyterHub. The second command will update the repository.

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update

3. Install JupyterHub

Now you’re ready to install JupyterHub! Two variables, RELEASE and NAMESPACE, can be given the same value; `jhub` is used for both in the following example.

helm upgrade --install jhub jupyterhub/jupyterhub \
  --namespace jhub \
  --version=0.8.2 \
  --values config.yaml

4. Validation

Make sure that the pods are in the Running state:

kubectl get pod --namespace jhub

Figure 5: Successful Kubernetes deployment shows STATUS Running
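To script this check, the sketch below counts pods whose STATUS is anything other than Running. It runs against sample output here (pod names are illustrative, not from a real cluster); with a live cluster you would pipe `kubectl get pod --namespace jhub` into the same filter:

```shell
# Sample 'kubectl get pod' output; pod names are illustrative.
sample_pods='NAME                     READY   STATUS    RESTARTS   AGE
hub-77f44fdb46-pqh2w     1/1     Running   0          5m
proxy-65f4c7b5c5-tvrsh   1/1     Running   0          5m'

# Skip the header row, then count rows whose STATUS column is not Running.
not_running=$(echo "$sample_pods" | tail -n +2 | awk '$3 != "Running"' | wc -l)
[ "$not_running" -eq 0 ] && echo "all pods Running"
```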

5. External Access

Get the external IP to access JupyterHub from a browser:

kubectl get service --namespace jhub

Figure 6: Screen capture of the external IP (40.85.229.236) for JupyterHub
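The EXTERNAL-IP column can also be extracted with a short filter. The sketch below runs against sample output (the IP matches Figure 6; the service name, CLUSTER-IP, and ports are illustrative); against a live cluster you would replace the echo with `kubectl get service --namespace jhub`:

```shell
# Sample 'kubectl get service' output; all values except the external IP
# from Figure 6 are illustrative.
sample_svc='NAME           TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)
proxy-public   LoadBalancer   10.0.123.45   40.85.229.236   80:31234/TCP'

# The proxy-public service carries JupyterHub's external address.
external_ip=$(echo "$sample_svc" | awk '$1 == "proxy-public" { print $4 }')
echo "open http://$external_ip in a browser"
```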

Enter the IP in a browser (Fig. 7):


Figure 7: Successful installation of JupyterHub in AKS

 

Brad

A few of my favourite things: Agile software development with the potential for significant social impact combined with responsible and appropriate use of data, machine learning algorithms and systems that support research and evidence based decision making.
