Introduction to Jupyter notebooks
Prerequisites:
There is no prerequisite learning required, as this document is designed for a novice user of the Jupyter environment.
Keywords: beginner’s guide, Jupyter notebook, Markdown cell, Raw cell, Code cell
Background
Access to implementations of the Open Data Cube such as Digital Earth Africa and Digital Earth Australia is achieved through the use of Python code and Jupyter Notebooks. The Jupyter Notebook (also termed notebook from here onwards) is an interactive web application that allows for the viewing, creation and documentation of live code. Notebook applications include data transformation, visualisation, modelling and machine learning. The default web interface to access notebooks when using Digital Earth Africa is JupyterLab.
Description
This notebook is designed to introduce users to the basics of using Python code in Jupyter Notebooks via JupyterLab.
Topics covered include:
How to run (execute) a Jupyter Notebook cell
The different types of Jupyter Notebook cells
Stopping a process or restarting a Jupyter Notebook
Saving and exporting your work
Starting a new Jupyter Notebook
Getting started
Running (executing) a cell
Jupyter Notebooks allow code to be separated into sections that can be executed independently of one another. These sections are called “cells”.
Python code is written into individual cells that can be executed by placing the cursor in the cell and pressing Shift-Enter on the keyboard, or by selecting the ► “run” button in the ribbon at the top of the notebook. Either option runs a single cell at a time.
If you wish to auto-run all cells in a notebook, navigate to the “Run” tab of the menu bar at the top of JupyterLab and select “Run All Cells” (or the option that best suits your needs). When you run a cell, you are executing that cell’s content. Any output produced from running the cell will appear directly below it.
Run the cell below:
[1]:
print("I ran a cell!")
I ran a cell!
Cell status
The [ ]: symbol to the left of each Code cell describes the state of the cell:
[ ]: means that the cell has not been run yet.
[*]: means that the cell is currently running.
[1]: means that the cell has finished running and was the first cell run.
The number indicates the order that the cells were run in.
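The numbering described above can be imitated with a small toy model. The sketch below is only an illustration: ToyKernel is a hypothetical class invented for this example, not part of Jupyter, and a real kernel does far more. It shows that every run increases the counter, and that cells share their variables, so the order in which cells are run affects the result.

```python
class ToyKernel:
    """Hypothetical, greatly simplified stand-in for a notebook kernel."""

    def __init__(self):
        self.namespace = {}        # variables shared by every cell
        self.execution_count = 0   # the number shown inside [ ]:

    def run(self, source):
        """Execute one cell's source and return its prompt label."""
        exec(source, self.namespace)
        self.execution_count += 1
        return f"[{self.execution_count}]:"


kernel = ToyKernel()
first = kernel.run("total = 10")
second = kernel.run("total = total * 2")
third = kernel.run("total = total * 2")  # running the same cell again

print(first, second, third)        # [1]: [2]: [3]:
print(kernel.namespace["total"])   # 40
```

Running the doubling cell twice gives 40 rather than 20, which is why the numbers beside the cells are worth checking when a result looks unexpected.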
Note: You can also tell whether a cell is currently executing in a Jupyter notebook by inspecting the small circle in the top-right of the window. The circle will turn grey (“Kernel busy”) when the cell is running, and return to empty (“Kernel idle”) when the process is complete.
Jupyter notebook cell types
Cells are identified as either Code, Markdown, or Raw. This designation can be changed using the ribbon at the top of the notebook.
Code cells
All code operations are performed in Code cells. Code cells can be used to edit and write new code, and perform tasks like loading data, plotting data and running analyses.
Click on the cell below. Note that the ribbon at the top of the notebook describes it as a Code cell.
[2]:
print("This is a code cell")
This is a code cell
Markdown cells
Place the cursor in this cell by double clicking.
The cell format has changed to allow for editing. Note that the ribbon at the top of the notebook describes this as a Markdown cell.
Run this cell to return the formatted version.
Markdown cells provide the narrative to a notebook. They are used for text and are useful to describe the code operations in the following cells. To see some of the formatting options for text in a Markdown cell, navigate to the “Help” tab of the menu bar at the top of JupyterLab and select “Markdown Reference”. Here you will see a wide range of text formatting options including headings, dot points, italics, hyperlinking and creating tables.
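As a rough illustration of the formatting options mentioned above, the sketch below builds a short piece of Markdown source as a Python string and prints it. The content is made up for this example. Pasting the printed text into a Markdown cell and running that cell should render a heading, dot points, italic and bold text, a hyperlink and a small table.

```python
# Example Markdown source (illustrative content only).
markdown_source = "\n".join([
    "# A heading",
    "",
    "- a dot point",
    "- *italic text* and **bold text**",
    "- [a hyperlink](https://jupyter.org)",
    "",
    "| Column A | Column B |",
    "| --- | --- |",
    "| 1 | 2 |",
])

# Print the raw Markdown so it can be copied into a Markdown cell.
print(markdown_source)
```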
Raw cells
Information in Raw cells is stored in the notebook metadata and can be used to render different code formats into HTML or LaTeX. A range of Raw cell formats is available, differing in how they are to be rendered. Raw cells are rarely used in this beginner’s guide and are not required by most notebook users.
Stopping a process or restarting a Jupyter Notebook
Sometimes it can be useful to stop a cell execution before it finishes (e.g. if a process is taking too long to complete, or if you realise you need to modify some code before running the cell). To interrupt a cell execution, you can click the ■ “stop” button in the ribbon above the notebook, or select “Interrupt Kernel” from the “Kernel” menu.
To test this, run the following code cell. This will run a piece of code that will take 20 seconds to complete. To interrupt this code, press the ■ “stop” button. The notebook should stop executing the cell.
[3]:
import time
time.sleep(20)
If the approach above does not work (e.g. if the notebook has frozen or refuses to respond), you can also try restarting the entire notebook. To do this, navigate to the “Kernel” tab of the menu bar, then select “Restart Kernel”. Alternatively, click the ↻ “Restart the kernel” button in the ribbon above the notebook.
Restarting a notebook can also be useful for testing whether your code will work correctly the first time a new user tries to run the notebook. To restart and then run every cell in a notebook, navigate to the “Kernel” tab, then select “Restart and Run All Cells”.
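Returning to interrupting a cell: with the standard Python kernel, an interrupt normally reaches the running code as a KeyboardInterrupt exception. The sketch below assumes that behaviour. The function interruptible_sleep is a hypothetical helper written for this example. It sleeps in short steps so that an interrupt can be caught and reported instead of ending in an error traceback.

```python
import time


def interruptible_sleep(total_seconds, step=0.5):
    """Sleep in short steps and return the time that actually elapsed."""
    elapsed = 0.0
    try:
        while elapsed < total_seconds:
            time.sleep(step)
            elapsed += step
        print(f"Finished after {elapsed:.1f} seconds")
    except KeyboardInterrupt:
        # Raised when the kernel is interrupted while this code is running.
        print(f"Interrupted after about {elapsed:.1f} seconds")
    return elapsed


interruptible_sleep(1)
```

To try it in a notebook, call the function with a larger value such as 20 and press the ■ “stop” button part-way through.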
Saving and exporting your work
Modifications to Jupyter Notebooks are automatically saved every few minutes. However, you can force a notebook to save by navigating to “File” in the menu bar, then selecting “Save Notebook”. Alternatively, click the 💾 “save” icon on the left of the ribbon above the notebook.
Exporting Jupyter Notebooks to Python scripts
The standard file extension for a Jupyter Notebook is .ipynb. There are a range of export options that allow you to save your work for access outside of the Jupyter environment. For example, Python code can easily be saved as a .py Python script by navigating to the “File” tab of the menu bar in JupyterLab and selecting “Export Notebook As” followed by “Export Notebook To Executable Script”.
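To give a feel for what such an export involves, the sketch below uses only the Python standard library. It relies on the fact that an .ipynb file is a JSON document containing a list of cells, and it keeps only the Code cells. The function name code_cells_to_script and the example notebook are assumptions made for this illustration. This is a simplified stand-in, not how JupyterLab itself performs the export.

```python
def code_cells_to_script(notebook):
    """Join the source of every code cell into one Python script string."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and Raw cells
        source = cell.get("source", "")
        # The notebook format allows source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


# A tiny in-memory notebook with one Markdown cell and two code cells.
example_notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": 1, "source": ['print("I ran a cell!")']},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": 2, "source": ["import time\n", "time.sleep(20)"]},
    ],
}

script = code_cells_to_script(example_notebook)
print(script)
```

For a notebook saved on disk, the same function could be applied to the dictionary returned by json.load on the opened .ipynb file.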
Starting a new notebook
To create a new notebook of your own, first use JupyterLab’s file browser to navigate to the directory you would like the notebook to be created in (if the file browser is not visible, re-open it by clicking on the 📁 “File browser” icon at the top-left of the screen).
Once you have navigated to your desired location, press the ✚ “New Launcher” button above the browser. This will bring up JupyterLab’s “Launcher” page which allows you to launch a range of new files or utilities. Below the heading “Notebook”, click the large “Python 3” button. This will create a new notebook entitled “Untitled.ipynb” in your directory.
To rename this notebook to something more useful, right-click on it in the file browser and select “Rename”.
Recommended next steps
For more advanced information about working with Jupyter Notebooks or JupyterLab, you can explore the JupyterLab documentation page.
To continue working through this beginner’s guide, the notebooks are designed to be worked through in the following order:
Jupyter Notebooks (this notebook)
Once you have completed the above six tutorials, join advanced users in exploring:
The “Datasets” directory in the repository, where you can explore DE Africa products in depth.
The “Frequently used code” directory, which contains a recipe book of common techniques and methods for analysing DE Africa data.
The “Real-world examples” directory, which provides more complex workflows and analysis case studies.
Additional information
License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.
Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on GitHub.
Last Tested:
[4]:
from datetime import datetime
datetime.today().strftime('%Y-%m-%d')
[4]:
'2023-08-11'