This lesson is in the early stages of development (Alpha version)

Running a notebook in a JupyterHub

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How do I run a notebook in a JupyterHub?

  • How do I load specific software in a JupyterHub notebook

Objectives
  • Be able to run a notebook in a JupyterHub

Jupyter Notebooks

It’s quite likely that you have used Jupyter notebooks in the past. They can be a convenient way to test ideas and construct new code. While Jupyter notebooks aren’t the preferred choice for executing long-running code on a cluster (mostly because they are interactive in nature), they can be an important part of your development pipeline.

JupyterHub

A JupyterHub is a way to allow multiple users to access Jupyter notebooks (and other applications) in a way so that each user gets their own isolated environment. This has some similarity to Google’s cloud-based Colab service.

Your instructor will give you the URL for the JupyterHub login page (usually putting the address of the training cluster into the URL bar of your browser will get you there).

Production Alliance clusters will have ‘jupyterhub’ in the front of the cluster name, e.g.,

Once you have arrived at the JupyterHub page, you can log in with the same username and password used to log into the cluster via SSH.

Jupyter server options page

You are now given a page with some options to select some resources, much like a Slurm submission script.

Jupyter server options page

For the most part, we can keep the defaults to get a single core and some memory for an hour. Of particular note is the “JupyterLab” user interface option: Jupyter is a powerful way to run one-or-more notebooks or other applications.

Press “Start”. It may take a few moments for an interface to show up.

JupyterLab

On the left side, there is a vertical stack of icons for some general activities:

You can start a new launcher with the + tab button

Most operations make the launcher disappear, but you can open a new one with the + button

You can start a terminal …

This gives you a terminal session, just like as if you used SSH to access to cluster. The drawback here is that you are doing this through the scheduler, and this way will deplete your priority for running jobs (using SSH to access the cluster will never deplete your priority.

Picking a specific Python version

Select Python 3 (ipykernel) from a launcher.

In the notebook, you can find the current python version a couple of ways:

import sys
sys.version_info

or

!python --version

You can use a specific python version by visiting the software screen (hexagon icon) and loading the module ipython-kernel/3.11 (for example).

Now on the launcher, the notebook icon says Python 3.11.

In your running notebook, you can now switch python versions through the Kernel menu (Change Kernel). This will wipe clean any program you are currently running in the notebook.

Rerun the version code again.

Python scripts as modules

We can load (and run) our prime calculating code …

import primes_cpu
primes_cpu.main()

Another example:

!pip install --no-index pandas scikit-learn numpy

(You may need to restart the kernel first … You may also need to change directories with cd).

import titanic
titanic.main('random_forest')

Some other programs you can access

Some of these don’t work great on our test cluster, but you have access to some programs by loading other software modules … After loading the module, press the + on the tab bar to get a new launcher, you will new icons.

Quit

Go to File menu and select Logout.

Key Points

  • Most Alliance clusters have a JupyterHub