Jupyter Notebooks

In [1]:
# Execute this cell to update the styling of the Notebook
from IPython.core.display import HTML
css_file = 'https://raw.githubusercontent.com/ngcm/training-public/master/ipython_notebook_styles/ngcmstyle.css'
HTML(url=css_file)
Out[1]:

Material created by:

  • James Bailey
  • Marian Daogaru

In this workshop we will cover:

  • Introduction to Jupyter Notebooks
  • nbconvert
  • nbdime
  • nbval

1. Preparation

1.1 Pre-requisites

  • Jupyter Notebook
  • nbconvert
  • nbdime
  • nbval (version >= 0.5.0)

1.2 Installation Instructions

If you already have the required packages installed or are using the pre-installed virtual machine you can skip these steps.

!!! The majority of the steps below are for a Unix environment that does not have any advanced Python IDE installed.

pip

  • pip - the Python package installer, used below to install the Jupyter packages
    • Install by running in command line:
    • sudo apt-get upgrade python
      • install GRUB
    • sudo wget -O /home/feeg6003/Desktop/get-pip.py "https://bootstrap.pypa.io/get-pip.py"
    • sudo apt-get install python3-pip
    • sudo python /home/feeg6003/Desktop/get-pip.py
    • OR by installing Miniconda:
    • go to Miniconda website and download python 3.6 32-bit version
    • cd Downloads/
    • bash Miniconda3-latest-Linux-x86.sh and go through installation
    • restart terminal

Jupyter Notebook

  • Anaconda - Jupyter Notebook comes as part of the Anaconda distribution
  • pip - there is a metapackage called Jupyter which will install all of the necessary Jupyter packages
    pip install jupyter

nbconvert

  • in command line run:
    • pip install nbconvert

Optional step

  • for the full capabilities of nbconvert, additional packages are required; install them by running the lines below in a terminal. If further output formats are needed, install the corresponding packages as appropriate. PDF conversion requires LaTeX, so TeX Live is installed:

    • sudo apt-get install texlive-base
    • sudo apt-get install texlive-xetex
    • sudo apt-get install texlive-fonts-recommended
    • sudo apt-get -y install texlive-latex-recommended texlive-pictures texlive-latex-extra

If the above produces errors, you have to point to where TeX is installed:

- sudo nano ~/.bashrc

Then add the following lines:

PATH=$PATH:/usr/local/texlive/2013/bin/x86_64-linux
export PATH
MANPATH=$MANPATH:/usr/local/texlive/2013/texmf/doc/man
export MANPATH
INFOPATH=$INFOPATH:/usr/local/texlive/2013/texmf/doc/info
export INFOPATH
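
After editing the PATH and restarting the terminal, it can help to confirm that the LaTeX executables are actually visible. The sketch below is one way to check this from Python; the list of tool names is an assumption chosen for illustration, not a list taken from the nbconvert documentation.

```python
import shutil

# Executables assumed to be needed for PDF export (illustrative list).
LATEX_TOOLS = ("xelatex", "bibtex")


def find_latex_tools(tools=LATEX_TOOLS):
    """Return a mapping of tool name -> full path, or None when not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_latex_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

Any tool reported as not found suggests that the PATH line above does not yet point at the TeX Live binaries.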

nbdime

  • run the following lines in command line.

    pip install --upgrade nbdime

nbval

  • pip - nbval can be installed through pip:
    pip install nbval

1.3 Checking you're ready

The following steps allow you to check that you have Jupyter Notebook installed correctly and have the additional packages we are going to show you.

Opening Jupyter Notebook

In a terminal window:

jupyter notebook

This should start an instance of Jupyter Notebook, which usually opens automatically in your browser. The Jupyter Notebook instance has the default address http://localhost:8888/, so by typing this into your browser you should also be able to access a running Jupyter Notebook.

nbconvert

If you installed nbconvert using command line, run the following lines to download and run the official tests.

pip install nbconvert[test]
py.test --pyargs nbconvert

Most of the tests should pass, with a few skipped and several warnings. The warning messages indicate that the tests are deprecated and will be removed in the next version.

nbdime

If you installed nbdime using command line, run the following lines to download and run the official tests.

pip install nbdime[test]
py.test --pyargs nbdime

Most of the tests should pass, with some failing due to some libraries not being installed properly. Ignore this for now, as they do not affect the functionality.

nbval

For the functionality included in this workshop we need version 0.5.0 or greater. You can check your current version using:

pip show nbval

If you have version 0.5.0 or greater then you have what you need.
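
The same check can be done from Python. This is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper names parse_version and meets_minimum are made up for this example, and the parsing only handles plain dotted version numbers.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text, width=3):
    """Turn '0.9.6' into (0, 9, 6); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    parts += [0] * (width - len(parts))  # pad '0.5' to (0, 5, 0)
    return tuple(parts)


def meets_minimum(installed, minimum="0.5.0"):
    """True when the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        installed = version("nbval")
    except PackageNotFoundError:
        print("nbval is not installed")
    else:
        print(installed, "OK" if meets_minimum(installed) else "too old")
```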

2. Introduction to Jupyter Notebook

The following sections are designed as an introduction to Jupyter Notebooks. If you are already familiar with Jupyter Notebook then you can skip this section.

  • opening Jupyter Notebook
  • creating a new notebook
  • renaming a notebook
  • executing code
  • executing Markdown

2.1 Opening Jupyter Notebook

To start Jupyter Notebook, in a terminal window type:

jupyter notebook

This will open a menu containing the files currently in that directory.

2.2 Creating a new Notebook

To create a new Notebook go to the top right and press new, selecting Python from the drop down list. This will open a new Notebook.

2.3 Renaming a Notebook

You can rename the Notebook by clicking where it currently says Untitled and entering your name for the file.

2.4 Executing Code

From the toolbar click the plus to add a new cell. By default this is a code cell and we can write code in this cell to be executed. Try typing:

print("Hello World")

and press Ctrl+Enter. The cell below is set up for you to try this out.

In [1]:
# Use this cell to print "Hello World"

2.5 Executing Markdown

The type of content in a cell can be changed using the toolbar at the top. With a code cell selected, click Code in the toolbar and select Markdown. This makes the cell a Markdown cell, and we can enter text which is rendered on screen when the cell is executed. The cell below is set up for you to try writing some text and executing it.

Try writing some text in here.

Hopefully you can now complete basic tasks in Jupyter Notebook, like writing and executing code. The next sections present some of the additional tools available for Jupyter Notebook.

3. nbconvert

In this section, we will explore the capabilities of nbconvert. This section uses several example notebooks located in Example_notebook.

3.1 Functionality

  • converts Jupyter Notebooks to different formats

nbconvert is a tool for the Jupyter Notebook that enables the user to export their notebooks in different formats. The default formats are:

  • python (.py)
  • HTML (.html)
  • markdown (.md)
  • restructured text (.rst)
  • PDF (using LaTeX)

3.2 Use cases

  • portability

Using nbconvert, a notebook can be presented in a range of different formats. The initial coding and formatting can be done in a Jupyter Notebook. The majority of workplaces will not have Jupyter installed; however, they will have at least one piece of software that can open one of the formats listed above. In addition, it is sometimes easier to write a large piece of code directly in a Python IDE or on a remote machine rather than in the Notebook.

3.3 Using nbconvert

There are two main ways to use nbconvert: either directly from the notebook, or from the command line.

3.3.1 nbconvert from notebook

The easiest way to use nbconvert is directly from the notebook. This can be done by clicking on File in the top-left corner, then clicking on Download as, followed by the format you want.

The picture above shows the menu of a typical installation of nbconvert, and what you should expect to see.

3.3.2 nbconvert from command line

To run nbconvert from the command line, use the jupyter command:

jupyter nbconvert --to FORMAT --template=NAME notebook.ipynb

--to FORMAT argument represents the type of file you want. FORMAT takes the values:

  • python
  • html
  • latex
  • pdf
  • markdown
  • rst
  • slides

Additional formats can be installed.

--template=NAME is an optional argument. Several templates exist for latex and html. Custom templates can be created.
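
When several notebooks or formats have to be converted, the command can be assembled from Python. The following is a minimal sketch rather than the nbconvert Python API: the helper names nbconvert_command and convert are invented for this example, and the file name notebook.ipynb is a placeholder. It only builds the command line shown above and, if asked, hands it to subprocess.

```python
import subprocess


def nbconvert_command(notebook, to="html", template=None):
    """Build the argument list for `jupyter nbconvert` as described above."""
    command = ["jupyter", "nbconvert", "--to", to]
    if template is not None:
        command.append(f"--template={template}")
    command.append(notebook)
    return command


def convert(notebook, to="html", template=None):
    """Run the conversion; raises CalledProcessError if nbconvert fails."""
    subprocess.run(nbconvert_command(notebook, to, template), check=True)


# Example: print the commands for three formats without running them.
for fmt in ("html", "python", "pdf"):
    print(" ".join(nbconvert_command("notebook.ipynb", to=fmt)))
```

Calling convert("notebook.ipynb", to="pdf") would run the conversion, provided Jupyter and the LaTeX packages from the installation section are available.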

Save from command

4. nbdime

In this section, nbdime shall be presented. In addition, examples can be found in the Example_notebooks folder.

4.1 Functionality

  • diffing between 2 notebooks
  • merging of 3 notebooks

nbdime allows users to compare two notebooks by displaying the differences between them, both in the input (what is written in the cells) and in the output. In addition, it allows users to merge three notebooks. The first notebook is called base; the other two are known as local and remote. A comparison is made between the three, allowing users to choose which features are included in the merge.

4.2 Use cases

  • display the difference between versions of a notebook
  • allows the implementation of different functions to obtain the same results, and compare the methods
  • merge different versions of the same notebook
  • enables merging of different notebooks for expanding capabilities

nbdime can be used extensively in versioning and collaborative projects. Diffing can display the differences and similarities between two notebooks. This can be useful when comparing versions of the same notebook, or different implementations which converge to the same solution. The merge option is important for adopting consistent methods and outputs. In addition, new scripts can be created by merging different notebooks together. With several functionalities saved in different notebooks, this feature is a very convenient way of expanding the capabilities of your code.

4.3 Operating modes

  • terminal
  • browser

4.3.1 Terminal

Command in terminal

  • nbdiff
  • nbmerge

In terminal, nbdime will display all information directly in the terminal without the need for a browser, or graphics.

4.3.2 Browser

Command in terminal:

  • nbdiff-web
  • nbmerge-web

Browser mode provides a different visualisation of the data, and a possibly easier way of interacting directly with nbdime. The browser offers the same features as the command line, through the provided GUI rather than commands.

4.4 Using nbdime

4.4.1 nbdiff / nbdiff-web

The command has the structure:

  • nbdiff Notebook1.ipynb Notebook2.ipynb
  • nbdiff-web Notebook1.ipynb Notebook2.ipynb

or

  • nbdime diff Notebook1.ipynb Notebook2.ipynb
  • nbdime diff-web Notebook1.ipynb Notebook2.ipynb
In [1]:
!nbdiff Class_examples/nbdime_ce_1.ipynb Class_examples/nbdime_ce_2.ipynb
nbdiff Class_examples/nbdime_ce_1.ipynb Class_examples/nbdime_ce_2.ipynb
--- Class_examples/nbdime_ce_1.ipynb  2017-03-02 20:07:41
+++ Class_examples/nbdime_ce_2.ipynb  2017-03-02 20:07:38
## modified /cells/1/outputs/0/data/text/plain:
-  42
+  5

## modified /cells/1/source:
-  40+2
+  2+3

## inserted before /cells/2:
+  markdown cell:
+    source:
+      This is a markdown text.
+      
+      Not the same markdown.

## deleted /cells/2:
-  markdown cell:
-    source:
-      This is a markdown text.

## deleted /metadata/anaconda-cloud:

Expected result:

nbdiff 1

In [8]:
!nbdiff-web Class_examples/nbdime_ce_1.ipynb Class_examples/nbdime_ce_2.ipynb
[I nbdimeserver:293] Listening on 127.0.0.1, port 51596
[I webutil:29] URL: http://127.0.0.1:51596/diff?base=Class_examples%2Fnbdime_ce_1.ipynb&remote=Class_examples%2Fnbdime_ce_2.ipynb
[I web:1946] 200 GET /diff?base=Class_examples%2Fnbdime_ce_1.ipynb&remote=Class_examples%2Fnbdime_ce_2.ipynb (127.0.0.1) 13.64ms
[I web:1946] 200 GET /static/nbdime.js?v=f7e67850c1e3eaa996a947b1eac70cbe (127.0.0.1) 71.26ms
[I web:1946] 200 POST /api/diff (127.0.0.1) 35.89ms
[W web:1946] 404 GET /favicon.ico (127.0.0.1) 0.91ms
[I nbdimeserver:232] Closing server on remote request
[I web:1946] 200 POST /api/closetool (127.0.0.1) 1.11ms

Expected result:

nbdiff web

The example above shows what changed between the 2 notebooks. It can be observed that nbdiff reports changes to the metadata, to the outputs and to the cell sources, as well as which cells have been inserted or deleted.

The web application allows the user to easily change between different notebooks, by modifying the path in the top left.
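
To get a feel for what a cell-level diff involves, the sketch below compares the cell sources of two notebooks using Python's standard difflib module. It is an illustration only and not how nbdime works internally: it ignores outputs and metadata, and the two small notebooks are hand-written dictionaries that mimic the class example.

```python
import difflib


def cell_sources(notebook):
    """Return each cell's source as one string (a string or a list of lines is accepted)."""
    sources = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        sources.append("".join(source) if isinstance(source, list) else source)
    return sources


def diff_sources(old, new):
    """Yield (tag, old_cells, new_cells) for every run of cells that differs."""
    a, b = cell_sources(old), cell_sources(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            yield tag, a[i1:i2], b[j1:j2]


# Two tiny hand-written notebooks mirroring the class example.
nb1 = {"cells": [{"cell_type": "code", "source": "import math"},
                 {"cell_type": "code", "source": "40+2"}]}
nb2 = {"cells": [{"cell_type": "code", "source": "import math"},
                 {"cell_type": "code", "source": "2+3"}]}

for change in diff_sources(nb1, nb2):
    print(change)  # ('replace', ['40+2'], ['2+3'])
```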

4.4.2 nbmerge / nbmerge-web

Commands similar to nbdiff:

  • nbmerge base.ipynb local.ipynb remote.ipynb -o OUTPUT.ipynb
  • nbmerge-web base.ipynb local.ipynb remote.ipynb

or

  • nbdime merge base.ipynb local.ipynb remote.ipynb -o OUTPUT.ipynb
  • nbdime merge-web base.ipynb local.ipynb remote.ipynb
In [12]:
!nbmerge Class_examples/nbdime_ce_1.ipynb Class_examples/nbdime_ce_2.ipynb Class_examples/nbdime_ce_3.ipynb -o Class_examples/nbdime_ce_4.ipynb
[W nbmergeapp:54] Conflicts occured during merge operation.
[I nbmergeapp:67] Merge result written to Class_examples/nbdime_ce_4.ipynb
In [11]:
!nbmerge-web Class_examples/nbdime_ce_1.ipynb Class_examples/nbdime_ce_2.ipynb Class_examples/nbdime_ce_3.ipynb
[I nbdimeserver:293] Listening on 127.0.0.1, port 51332
[I webutil:29] URL: http://127.0.0.1:51332/merge?local=Class_examples%2Fnbdime_ce_2.ipynb&remote=Class_examples%2Fnbdime_ce_3.ipynb&base=Class_examples%2Fnbdime_ce_1.ipynb
[I web:1946] 200 GET /merge?local=Class_examples%2Fnbdime_ce_2.ipynb&remote=Class_examples%2Fnbdime_ce_3.ipynb&base=Class_examples%2Fnbdime_ce_1.ipynb (127.0.0.1) 15.22ms
[I web:1946] 200 GET /static/nbdime.js?v=f7e67850c1e3eaa996a947b1eac70cbe (127.0.0.1) 67.51ms
[I web:1946] 200 POST /api/merge (127.0.0.1) 52.03ms
[W web:1946] 404 GET /favicon.ico (127.0.0.1) 1.38ms
[I web:1946] 200 POST /api/merge (127.0.0.1) 30.48ms
[I web:1946] 200 POST /api/merge (127.0.0.1) 27.83ms
[I nbdimeserver:232] Closing server on remote request
[I web:1946] 200 POST /api/closetool (127.0.0.1) 0.85ms

Expected output:

nbmerge

Additional parameters can be given to the nbmerge to solve conflicts.

  • -m [inline, use-base, use-local, use-remote]
  • --merge-strategy [inline, use-base, use-local, use-remote]
    • Specify the merge strategy to use
  • --input-strategy [inline, use-base, use-local, use-remote]
    • Specify the merge strategy to use for inputs; overrides 'merge-strategy' for inputs
  • --output-strategy [inline, use-base, use-local, use-remote, remove, clear-all]
    • Specify the merge strategy to use for outputs; overrides 'merge-strategy' for outputs
  • --no-ignore-transients
    • Disallow deletion of transient data such as outputs and execution counts in order to resolve conflicts.
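
The idea behind a three-way merge and the use-base, use-local and use-remote strategies can be sketched for a single value, such as one cell's source. This is a simplified illustration under the assumption that values are compared as a whole; nbdime's real merge is considerably more fine-grained, and the function name merge_value is invented for this example.

```python
def merge_value(base, local, remote, strategy=None):
    """Three-way merge of a single value (for example one cell's source).

    Returns (merged_value, had_conflict).
    """
    if local == remote:          # both sides agree (same change, or no change)
        return local, False
    if local == base:            # only remote changed
        return remote, False
    if remote == base:           # only local changed
        return local, False
    # Both sides changed in different ways: a genuine conflict.
    choices = {"use-base": base, "use-local": local, "use-remote": remote}
    if strategy in choices:
        return choices[strategy], True
    raise ValueError("conflict needs a strategy: " + ", ".join(choices))


print(merge_value("40+2", "40+2", "2+3"))               # ('2+3', False)
print(merge_value("40+2", "1+1", "2+3", "use-local"))   # ('1+1', True)
```

Only the last case, where local and remote both differ from base and from each other, needs a strategy to be resolved automatically.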

5. nbval

The following sections guide you through the functionality, use cases and examples of the use of nbval. Example notebooks to demonstrate passing and failing tests are given in the folder 'Example_Notebooks'.

5.1 Functionality

  • validation of outputs from notebook

nbval is a plugin for pytest and validates the output of notebooks by executing the notebook and comparing to the stored outputs. The outputs from a notebook are saved with the inputs. nbval uses this in testing by executing each cell and comparing the output in the test to the stored output in the notebook. Any differences will cause a failing test.

Stages

  • stored notebook output
  • test:
    • execute each cell
    • compare test output to stored output
    • differences cause failing test

A note on cell execution

Cell Execution:

  • manual execution as reference
  • execution during testing with nbval

The stored output in the notebook is the reference used when the notebook is tested. Manual execution of the notebook updates the references used in testing. During testing with nbval, the notebook's outputs are generated separately from the stored output cells, allowing comparison with the previous output.
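
The execute-and-compare idea can be sketched in a few lines of plain Python. This is a toy model rather than nbval's implementation: cells are simple dictionaries with hypothetical keys source and stored_output, only printed text is captured, and the cells run in the current process instead of a Jupyter kernel.

```python
import contextlib
import io


def run_cell(source, namespace):
    """Execute one code cell and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


def validate(cells):
    """Re-run every cell in order and compare fresh output with the stored reference."""
    namespace = {}  # shared between cells, like a notebook kernel
    results = []
    for cell in cells:
        fresh = run_cell(cell["source"], namespace)
        results.append(fresh == cell["stored_output"])
    return results


# Hypothetical notebook: the second stored output is stale, so that cell fails.
cells = [
    {"source": "total = 40 + 2\nprint(total)", "stored_output": "42\n"},
    {"source": "print(total + 1)", "stored_output": "44\n"},
]
print(validate(cells))  # [True, False]
```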

5.2 Use Cases

  • validate documentation

Notebooks provide a useful way of creating documentation. By combining text and cell execution documentation can include description and examples of execution. nbval complements this by allowing the execution of the notebook to be validated.

5.3 Modes

  • py.test --nbval
  • py.test --nbval-lax

Testing with nbval

With this flag all cells in the notebook are tested by default, with markers available to skip the execution of individual cells.

Testing with nbval-lax

With this flag all cells are still executed, but their output is not compared to the stored output by default; markers are available to request that the output of a cell is checked.

5.4 Basic Cell Execution

  • no output
  • deterministic output

In the folder 'Example_Notebooks' is nbval_example_11.ipynb. This notebook contains a cell importing some modules to the notebook and a cell calculating a sum. We can run the tests and check that the notebook is behaving as expected:

py.test --nbval path/to/nbval_example_11.ipynb -v

The second notebook is nbval_example_12.ipynb. Here we have changed the module being imported to one which does not exist and changed the result of the sum. The output cells are maintained as before to use as reference. When we execute the tests we get two failures:

py.test --nbval path/to/nbval_example_12.ipynb -v

5.5 Adding Comments to Control Testing

Comments can be added to cells to control the execution of the tests. The example below allows us to skip the execution of a cell. The random number outputted would cause a failing test.

# NBVAL_SKIP
import random
random.random()

Open nbval_example_21.ipynb and you will see a cell with the code above. When we execute the tests we get a skipped test.

py.test --nbval path/to/nbval_example_21.ipynb -v

5.7 Ignoring Output

For some cells we may want to execute the cell but we don't want the output to be checked. We can specify that we want the cell output ignored. Here we may want to use the value of the random number in another cell but do not want a failing test from the outputting of this number.

# NBVAL_IGNORE_OUTPUT
import random
start = random.random()
print(start)

Open nbval_example_31.ipynb; when we execute the tests we get a pass.

py.test --nbval path/to/nbval_example_31.ipynb -v

5.8 Checking Exceptions

If an exception occurs when a cell is run during testing then the test returns a fail explaining an exception was raised. We may expect that an exception is raised but want to check that the correct exception has been raised. We can do that by specifying that a cell raises an exception:

# NBVAL_RAISES_EXCEPTION
raise(ValueError)

Open nbval_example_41.ipynb. We see a cell which will raise a ValueError; we can execute the tests to see what the result will be:

py.test --nbval path/to/nbval_example_41.ipynb -v

Open nbval_example_42.ipynb. We now see a cell which tells nbval that an exception is raised. nbval checks the exception raised in testing against the reference exception stored in the output.

py.test --nbval path/to/nbval_example_42.ipynb -v

Open nbval_example_43.ipynb. We now see that the raised exception has changed. We can again execute the tests:

py.test --nbval path/to/nbval_example_43.ipynb -v

5.9 Checking Output with lax

When the --nbval-lax flag is used, cell outputs are not checked by default. We can specify that the output of a cell should be checked using the check output comment:

# NBVAL_CHECK_OUTPUT
2 + 2

Open nbval_example_51.ipynb. We can see that the output of some cells will be tested and others will not. We can again execute the tests:

py.test --nbval-lax -v path/to/nbval_example_51.ipynb
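
The way the marker comments interact with the two modes can be summarised in a small decision function. This is a sketch of the behaviour described in this section, not nbval's code; it assumes that lax mode still runs unmarked cells without comparing their output, and the function name plan_cell and the wording of the returned labels are made up for this example.

```python
MARKERS = {
    "NBVAL_SKIP": "skip",
    "NBVAL_IGNORE_OUTPUT": "run, do not compare output",
    "NBVAL_RAISES_EXCEPTION": "run, expect an exception",
    "NBVAL_CHECK_OUTPUT": "run, compare output",
}


def plan_cell(source, lax=False):
    """Decide how a cell is treated from its marker comment and the chosen mode."""
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            marker = stripped.lstrip("#").strip()
            if marker in MARKERS:
                return MARKERS[marker]
    # No marker: the default depends on the mode.
    return "run, do not compare output" if lax else "run, compare output"


print(plan_cell("# NBVAL_SKIP\nimport random\nrandom.random()"))  # skip
print(plan_cell("2 + 2", lax=True))   # run, do not compare output
print(plan_cell("# NBVAL_CHECK_OUTPUT\n2 + 2", lax=True))  # run, compare output
```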

5.10 Other Features

  • Cell Tagging
  • Output Sanitising
  • Figures

The features above are not covered as part of this workshop but we mention them here as further parts of nbval. More information about these features is contained in the package's documentation.

Cell Tagging

For languages which do not fit the format of lines starting with # or if the additional comment is undesired in the cell then the cell can be tagged. The metadata for that cell can be altered to contain the desired tag.
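
A notebook file is JSON, so a tag can also be added programmatically. The sketch below works on an in-memory dictionary with the usual notebook layout; the helper name tag_cell is invented for this example, and the tag name nbval-skip is assumed to be the tag equivalent of the NBVAL_SKIP comment, so check it against the nbval documentation before relying on it.

```python
import json


def tag_cell(notebook, index, tag):
    """Add `tag` to the metadata of the cell at `index` (no duplicates)."""
    metadata = notebook["cells"][index].setdefault("metadata", {})
    tags = metadata.setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)
    return notebook


# Hypothetical one-cell notebook held in memory instead of read from disk.
notebook = {"cells": [{"cell_type": "code", "source": "import random\nrandom.random()",
                       "metadata": {}, "outputs": [], "execution_count": None}],
            "metadata": {}, "nbformat": 4, "nbformat_minor": 2}

tag_cell(notebook, 0, "nbval-skip")  # assumed tag name
print(json.dumps(notebook["cells"][0]["metadata"]))  # {"tags": ["nbval-skip"]}
```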

Output Sanitising

For some cells the expected output will change. For cells containing references to the date or time, or containing a random element, the output will change on each run, but this should not cause a failing test. Regular expressions can be used to search for these outputs and sanitise them into something which can be checked by nbval.
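
The principle of sanitising can be shown with Python's re module. This sketch does not use nbval's own sanitising configuration; the three patterns and the placeholder words DATE, TIME and FLOAT are assumptions chosen for illustration.

```python
import re

# Hypothetical rules: (pattern, replacement) pairs applied in order.
RULES = [
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "DATE"),
    (re.compile(r"\d{2}:\d{2}:\d{2}"), "TIME"),
    (re.compile(r"\d+\.\d+"), "FLOAT"),
]


def sanitise(text, rules=RULES):
    """Replace every volatile fragment with a fixed placeholder."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


stored = "run at 2017-03-02 20:07:41, value 0.8444218515250481"
fresh = "run at 2017-03-03 09:15:02, value 0.7579544029403025"
print(sanitise(stored))                      # run at DATE TIME, value FLOAT
print(sanitise(stored) == sanitise(fresh))   # True
```

After sanitising, the two outputs compare as equal even though the date, time and random value differ between runs.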

Figures

For figures, the text output referencing the figure can be checked as part of the tests.

Exercises

Introduction to Jupyter Notebooks

If you are familiar with Jupyter Notebooks then feel free to skip this section and move on to nbconvert.

  • 1. Open Jupyter Notebook
  • 2. Create a new notebook
  • 3. Rename the notebook
  • 4. Add some new cells
  • 5. Execute some Python code in the cells
  • 6. Write some Markdown, execute the cells and see the output
  • 7. Remove some of the cells

nbconvert

For the workshop, this notebook can be used as a test notebook for nbconvert functionalities. However, additional notebooks can be found in the folder "Example_Notebooks".

Using the notebook

In [2]:
import numpy

print(2+2) 
f = lambda x: numpy.exp(x) * 4 - x*x
print(f(4))

x = numpy.linspace(-5, 5, 11)
y = f(x)
y
4
202.392600133
Out[2]:
array([ -2.49730482e+01,  -1.59267374e+01,  -8.80085173e+00,
        -3.45865887e+00,   4.71517765e-01,   4.00000000e+00,
         9.87312731e+00,   2.55562244e+01,   7.13421477e+01,
         2.02392600e+02,   5.68652636e+02])

Task 1

  • save this notebook as .html, .py and .pdf files
  • open the outputs and see if they match the notebook

Task 2

  • create a notebook that will have the same output as : nbconvert_example_1.python
  • create a notebook that will have the same output as : nbconvert_example_2.pdf
  • create a notebook that will have the same output as : nbconvert_example_3.html

nbdime

We shall use notebooks found in "Example_Notebooks" for these tasks.

Task 1 nbdiff

  • using command line, explore the differences between nbdime_example_1 and nbdime_example_2 using "nbdiff"
  • using command line, observe the differences between the above notebooks in a browser environment
  • while in the browser, change the notebooks you want to investigate and see the differences
  • do the results match the class description?

Task 2 nbdiff - Additional

  • create your own notebooks
  • check the differences between the 2 notebooks using nbdiff & nbdiff-web

Task 3 nbmerge

  • using the command line, merge the 3 nbdime_example notebooks.
  • use both the nbmerge & nbmerge-web
  • save the new merge in both cases
  • for an additional challenge, try to solve merge conflicts by specifying merge rules

Task 4 nbmerge - Additional

  • create your own notebooks
  • merge them
  • can you answer the questions:
    • What happens when base, local & remote are different?
    • What happens when base & local are the same, but remote is different?
    • What if base & remote are the same, yet local is different?
    • lastly, if base is different, and local & remote are the same?

nbval

  • If there were any examples shown which you are unsure about, see the folder 'Example_Notebooks' for the examples shown

Task 1

  • Create an example piece of documentation which you will test. This should include:
    • a basic function which you are documenting (Create something which has example uses and requires exceptions to be raised. We used factorial.)
    • Write some documentation for your function in the notebook
    • Show some example use cases, including cases when an exception should be raised

Task 2

  • Test your notebook to show the tests passing

Task 3

  • Add some bugs to your function to change the output and show the tests failing

Task 4

  • For an additional challenge try tagging rather than commenting the cells of your notebook and getting the same test results