During the last EAA meeting (2016, Maastricht) I was asked to give a short talk at the PhD Forum on using a tool called the Jupyter Notebook to increase the replicability and transparency of our research.
The slides are available here:
Understand the Jupyter Notebook
There are three components to Project Jupyter:
- The Jupyter Notebook which is accessed and used through your browser
- The Jupyter Server that is run on a computer or server
- The different kernels that perform the actual execution of code
There are several things to note:
- You can run the Jupyter Server on your own computer and connect to it locally in your browser (this works even without an internet connection). However, it is also possible to run the Jupyter Server on a different computer, for example a high-performance computing server in the cloud, and connect to it over the internet. For the Jupyter Notebook itself you only need a modern web browser like Chrome or Firefox.
- The Jupyter Server requires the Python language to work and the Python Kernel is always included by default. Other kernels, such as the R kernel, need to be added manually after the installation.
For new users I highly recommend installing a Python distribution like [Anaconda](https://www.continuum.io/downloads). This will automatically install Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science. You can choose between Python 2.7 and Python 3.5; I personally recommend going for Python 3.5.
If you want to know more about Python 3 vs Python 2 you can check out my other blog post on this topic:
Python 2 vs. Python 3 | My view
Notes on the installation:
- The default installation directory (in the user directory) is in most cases fine.
- Click yes if asked to add the path to your environment (this is desirable in most cases).
After installing the Anaconda distribution you have everything ready to start using the Jupyter Notebook with the Python programming language.
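If you want to double-check that the installation worked, a small script along the following lines can help. This is only a minimal sketch: the function name `check_jupyter_setup` is my own, and it assumes that the installer added the `jupyter` command to your path (see the installation notes above). It uses nothing but the Python standard library.

```python
import shutil
import sys


def check_jupyter_setup():
    """Return a small report on the Python version and the jupyter command."""
    return {
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "is_python3": sys.version_info[0] >= 3,
        # shutil.which returns the full path of the executable,
        # or None if the command is not on the PATH.
        "jupyter_path": shutil.which("jupyter"),
    }


report = check_jupyter_setup()
print("Python version:", report["python_version"])
if report["jupyter_path"] is None:
    print("The 'jupyter' command was not found on the PATH.")
else:
    print("Found jupyter at:", report["jupyter_path"])
```

If the script reports that `jupyter` was not found, the most likely cause is that the path was not added to your environment during installation.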
Adding additional kernels
There are kernels available for a large number of programs and programming languages. The installation instructions are different for each kernel but are usually well explained in the corresponding repository.
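Before adding a kernel it can be useful to see which kernels are already installed. The sketch below wraps the `jupyter kernelspec list` command in a small Python function; the function name `list_installed_kernels` is hypothetical, and the sketch assumes the `jupyter` command is on your path. If it is not, or if the command fails, the function simply returns `None`.

```python
import shutil
import subprocess


def list_installed_kernels():
    """Return the output of `jupyter kernelspec list`, or None if unavailable."""
    jupyter = shutil.which("jupyter")
    if jupyter is None:
        # The jupyter command is not on the PATH.
        return None
    try:
        result = subprocess.run(
            [jupyter, "kernelspec", "list"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,  # decode the output as text
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout


kernels = list_installed_kernels()
if kernels is None:
    print("Could not list the installed kernels.")
else:
    print(kernels)
```

After installing a new kernel you can run this again; the new kernel should then show up in the list.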
A selection of kernels:
- Stata: https://github.com/TiesdeKok/ipystata
- SAS: https://github.com/sassoftware/sas_kernel
- R (rpy2): http://rpy2.bitbucket.org/ or http://irkernel.github.io/
- MATLAB: https://github.com/calysto/matlab_kernel
- Julia: https://github.com/JuliaLang/IJulia.jl
For more kernels use Google or check this list:
Starting a Jupyter Server
You can only connect to a live Jupyter Notebook if a corresponding Jupyter Server is running. There are multiple ways to start a Jupyter Server but I will highlight two:
Start from the command line:
- Open your command prompt (if you are on Windows I recommend using the Anaconda Command Prompt)
- cd to the desired starting directory
e.g. cd "C:\Files\Work\Project_1"
- Start the Jupyter Notebook server by typing: jupyter notebook
This should automatically open up the corresponding Jupyter Notebook in the browser.
You can also manually go to the Jupyter Notebook by going to localhost:8888 with your browser.
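If the browser does not open, or the page does not load, it helps to check whether a server is actually listening on that address. The following is a minimal sketch using only the standard library; the function name `server_is_reachable` is my own, and it assumes the server uses port 8888 (if that port is already taken, Jupyter picks another one and shows the address in the command prompt window).

```python
import socket


def server_is_reachable(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises an OSError subclass if the
        # connection is refused or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if server_is_reachable():
    print("A server is listening on localhost:8888.")
else:
    print("Nothing is listening on localhost:8888.")
```

Note that this only tells you that something is listening on the port, not that it is a Jupyter Server; the command prompt window that runs the server remains the best place to look for details.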
Closing down the Jupyter Server
To close down the Jupyter Server, open the command prompt window that runs the server and press CTRL + C twice. Make sure that you have saved any open Jupyter Notebooks first!
Run into problems?
Feel free to comment below or ask a question on the forum using the tag “Jupyter Notebook” and I will try to help!