13.8 billion years after the Big Bang we arrive at the Big Split or: what happened to my ipython notebooks?

Idego Idego • Sep 01

If your experience with programming in Python is longer than 15 minutes, chances are you know that it is an interpreted language, and that apart from writing scripts and invoking them almost exactly like you would any other program, you can use it interactively in a REPL (read-eval-print loop).

The interactive interpreter bundled with Python is a bit crude though – it does not offer auto-completion, output coloring, history search, quick lookup of code or documentation, multi-line copy-paste support or a convenient drop-into-editor feature, and worst of all, you have to type in the parentheses to exit it. Luckily, the awesome Python ecosystem provides a bunch of alternatives, including ipython, bpython, ptpython and possibly many others (including the ones written by insomnia-struck students wanting to appear on Hacker News). If you haven’t used an alternative Python shell before, you’re definitely missing out. Even though IDEs for Python have improved significantly in recent years (I even know some folks who abandoned vim in favor of PyCharm), knowledge of the Python shell and of projects such as ipython remains an essential tool in every developer’s toolbox: apart from helping in everyday programming tasks, many tools provide interactive modes, which can be used to troubleshoot errors in production environments.

One of these projects – ipython – offers a somewhat lesser-known feature: a combination of a traditional command line interface and a web UI, up until now known as the ipython notebook. This evolution of the command line has recently gained popularity, up to the point where a decision was made to split the language-agnostic parts out of the ipython project and form a new one – Jupyter. The latest release, ipython 4.0 (aka “The Big Split”), is the first that removes the notebook feature from the core package and moves it to the new place. So, without further ado, let’s take a look at what jupyter/ipython offers and how those tools can aid us in our never-ending programming endeavor.

Hitchhiker’s guide to Jupyter

Before ipython 4.0, starting a notebook was relatively easy: pip install ipython, then ipython notebook, check the command line for a missing-package exception (like tornado or jinja2), install the missing package, and repeat until no more errors appeared (those packages were not listed as ipython dependencies, since the notebook was not required for all the other stuff to work).

If you try getting it up and running the same way with ipython 4.0, it won’t work, because the notebook is no longer part of the ipython package. Instead we need to install jupyter, which, luckily, is still a PyPI package (pip install jupyter). After that we can start the notebook with the jupyter notebook command. If everything was set up well, the command should open a tab in our browser pointed at the notebook; if not, the terminal should contain information about the address the notebook is listening on.
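For scripted setups, the launch step can also be driven from Python. This is only a minimal sketch, assuming that installing jupyter with pip placed a jupyter executable on the PATH; the helper name notebook_command is made up for this example.

```python
import shutil
import subprocess


def notebook_command():
    """Return the argv that starts the notebook server, or None if jupyter is missing."""
    # Assumption: `pip install jupyter` put a `jupyter` executable on PATH.
    jupyter = shutil.which("jupyter")
    if jupyter is None:
        return None
    return [jupyter, "notebook"]


command = notebook_command()
if command is None:
    print("jupyter not found; try: pip install jupyter")
else:
    print("would run:", " ".join(command))
    # Uncomment to actually start the server (blocks until it is stopped):
    # subprocess.run(command, check=True)
```

The call that starts the server is left commented out on purpose, so the snippet can be run safely just to check whether jupyter is reachable.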

My God, it’s full of… cells

The initial jupyter page you see will look like some form of web-based file browser; this is not the notebook yet. To see a notebook you need to open an existing one (a file with the .ipynb extension) or start a new one. You’ll then be taken to the notebook page. Now, this looks like some kind of recording of a terminal session, and that’s exactly what it is, except you can go back, edit any of the previously executed commands and rerun them. For multi-line commands, that’s certainly more convenient than a terminal (even though ipython has nice capabilities for this, with the %paste and %edit magic commands).

The basic blocks you type code into are called cells. Most of the time they will contain code, but there are other cell types as well, like markdown, which can be useful if you’re writing up something that’s meant to be shared with other developers. Since we are already in a browser we can take advantage of that, so output can contain images, videos, plots and equations. As notebooks are fairly popular in academia, plotting features are important and often used. Debugging and interrupting (similar to keyboard interrupts in terminal sessions) are also possible, so we don’t really lose any of the basic capabilities of more traditional terminals.

What lies under the hood is a REPL decoupled into two independent processes: the client, in the form of the notebook (other clients are also possible), and the evaluating process, called the kernel, an abstraction of the eval part of the REPL, which can additionally accept multiple client connections. Python was the first kernel to be supported, and still is an integral part of jupyter, but the abstraction has grown enough that kernels for other languages can be created.
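The client/kernel separation can be illustrated with a toy model. This is not how jupyter is implemented: the real kernel is a separate process that talks to its clients over a messaging protocol, while this sketch uses a thread and in-memory queues to stay short. ToyKernel and ToyClient are hypothetical names invented for the example.

```python
import queue
import threading


class ToyKernel(threading.Thread):
    """Hypothetical stand-in for a kernel: evaluates code sent by clients."""

    def __init__(self):
        super().__init__(daemon=True)
        self.requests = queue.Queue()
        self.namespace = {}  # state shared by every connected client

    def run(self):
        while True:
            code, reply_queue = self.requests.get()
            if code is None:  # shutdown signal
                break
            try:
                try:
                    # Expressions produce a value, like a cell's output.
                    result = eval(code, self.namespace)
                except SyntaxError:
                    # Statements (assignments, imports, ...) produce none.
                    exec(code, self.namespace)
                    result = None
                reply_queue.put(("ok", result))
            except Exception as exc:
                reply_queue.put(("error", repr(exc)))


class ToyClient:
    """Hypothetical stand-in for a notebook: sends code, waits for the reply."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.replies = queue.Queue()

    def execute(self, code):
        self.kernel.requests.put((code, self.replies))
        return self.replies.get(timeout=5)


kernel = ToyKernel()
kernel.start()

# Two clients attached to the same kernel see the same state.
first, second = ToyClient(kernel), ToyClient(kernel)
first.execute("x = 20")
status, value = second.execute("x + 1")
error_status, _ = first.execute("1 / 0")  # errors are reported, not fatal

kernel.requests.put((None, None))  # ask the kernel loop to stop
print(status, value, error_status)
```

Because the namespace lives in the kernel and not in the client, a variable defined through one client is visible to the other, which is the property that makes multiple frontends on one kernel possible.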

These are the kernels you’re looking for

In fact, a wide selection of kernels already exists. The jupyter project has been around for a while (announced at SciPy 2014 in July, with a separate project page and development gaining momentum throughout 2015), the big split being just the final cut in removing it from ipython releases. You can browse a list of available kernels online, and those include other languages (not limited to scripting ones) as well as backend technologies such as redis or scientific tools such as GNU Octave. Of course not every kernel is as polished as ipython, and some are clearly just a showcase of the technology, but given how long the technology has been available, it’s a very good selection.

The installation process for other kernels varies greatly depending on the underlying technology, so it’s best to consult the documentation of the kernel itself. For example, the JavaScript (node.js) kernel is distributed as an npm package, the ruby kernel is a gem (and like most ruby packages, does not install on Windows, at least not without a considerable amount of gimmicks I am clearly not aware of), and the redis kernel, being a Python package that communicates with redis over TCP, can be installed with pip. You can also play around with some other kernels at try.jupyter.org.

Traveling at warp speeds

A good application targeted at programmers has almost no chance of gaining popularity if it’s lacking keyboard shortcuts – especially if this application is a terminal. Jupyter notebooks are no different: while it is possible to do practically everything (obviously except typing the actual code we want to execute) by pointing and clicking, which is great at the beginning, it will get tedious fast. Luckily enough, jupyter comes with a variety of shortcuts, from basic ones, like executing the current cell, to more specific ones, like restarting the kernel. The list of shortcuts is available in the notebook itself, either by clicking “Help -> Keyboard Shortcuts” or by pressing h while in command mode. If you feel the provided shortcuts are not enough, you can always customize them, as described in the docs.

A notebook can never injure a human being

Just like every web application, notebook servers might be subject to a wide spectrum of malicious attacks, and until IPython 2.0 it was not advised to run a publicly available notebook server, nor to run a notebook from an untrusted source locally without examining its contents offline. For quite some time now, however, the security aspect of the project has been treated as a top priority, and a security model has been implemented. The basis of this model is trusting a notebook (which under the hood signs it with a key that should be kept private). If a notebook is untrusted, any HTML or JavaScript it contains (which can lie in cells or in their output) will not be executed automatically. It can be run, however, if the user wishes so (by executing the cell directly), in which case the content becomes trusted. The server also has other obvious security features like SSL support, which needs to be set up and configured first though. Some of the security docs cross-link to the ipython documentation, so it is not perfectly clear to me how this is separated; for now ipython is a required dependency for jupyter, so this is likely a temporary state. There is also a side project called jupyterhub which is meant to be a multi-user facade for the jupyter notebook; it is in the early stages of development though, and limited to Unix at the moment.
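The idea of signing a notebook with a private key can be sketched with a keyed hash. The snippet below illustrates the general technique (an HMAC over the notebook content) and is not jupyter’s actual code; the function names, the simplified notebook layout and the way trusted signatures are stored are all assumptions made for the example.

```python
import hashlib
import hmac
import json
import secrets


def sign_notebook(notebook, key):
    """Return a hex HMAC-SHA256 digest of the notebook's canonical JSON form."""
    payload = json.dumps(notebook, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_trusted(notebook, key, known_signatures):
    """A notebook is trusted only if its current signature was recorded earlier."""
    return sign_notebook(notebook, key) in known_signatures


# Hypothetical data: a private key and a notebook with one HTML output.
key = secrets.token_bytes(32)
notebook = {"cells": [{"source": "show()", "outputs": ["<b>hi</b>"]}]}

known_signatures = {sign_notebook(notebook, key)}  # the user trusts it once
trusted_before = is_trusted(notebook, key, known_signatures)

# Someone edits the stored output; the signature no longer matches.
notebook["cells"][0]["outputs"] = ["<script>alert(1)</script>"]
trusted_after = is_trusted(notebook, key, known_signatures)

print(trusted_before, trusted_after)
```

Without the key nobody can produce a matching signature for modified content, which is why the key has to stay private.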

The work of foundation has just begun.

There’s also a plethora of other features and side projects I haven’t mentioned yet: nbconvert, nbviewer, jupyter terminal, GitHub integration, to name a few. Still, you might ask: what is this notebook feature good for? After all, in the beginning… was the command line. Here we are, over a decade into the 21st century, and the terminal is still among the essential tools for practically every developer. Can the notebook metaphor somehow find a place in between shells and GUI apps, or is it a niche product that will not gain an audience outside educational areas? It does have some of the merits of shells, mainly that it is very much text-based, thus easily copied, pasted, documented, stored in version control, diffed, you name it. It is also better suited for multi-line editing (not everything has to be a one-liner), offers good history search, is available via a browser, which might be even more ubiquitous than the command line nowadays (thanks to smartphones and tablets), and, if needed, can embed non-textual elements like equations, images and videos.
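To make the text-based point concrete: an .ipynb file is a JSON document, so ordinary text tools work on it. The sketch below builds two tiny notebooks by hand and diffs them. The field names follow version 4 of the notebook format as far as I know it; real notebooks usually carry more metadata, and the helper names are made up for the example.

```python
import difflib
import json


def code_cell(source):
    """Build a minimal code cell (no outputs recorded yet)."""
    return {"cell_type": "code", "execution_count": None,
            "metadata": {}, "outputs": [], "source": source}


def markdown_cell(source):
    """Build a minimal markdown cell."""
    return {"cell_type": "markdown", "metadata": {}, "source": source}


def make_notebook(cells):
    """Wrap cells in the top-level notebook structure."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 0}


def to_text(notebook):
    """Serialize with stable key order so diffs stay small."""
    return json.dumps(notebook, indent=1, sort_keys=True)


before = make_notebook([markdown_cell("# Demo"), code_cell("print('hello')")])
after = make_notebook([markdown_cell("# Demo"), code_cell("print('hello, world')")])

# A plain line-based diff, like the one a version control system would show.
diff = list(difflib.unified_diff(
    to_text(before).splitlines(), to_text(after).splitlines(),
    "before.ipynb", "after.ipynb", lineterm=""))
print("\n".join(diff))
```

Only the line holding the changed cell source shows up as modified, which is what makes small edits to a notebook reviewable in version control.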

The main usage right now is educational: you can already find a good collection of notebook-based tutorials, and it’s a great aid in conference talks, trainings and workshops. Another use case is as a replacement for the traditional shell: the easy multi-line support and the ease of correcting mistakes might just convince me to move from traditional console-based ipython to jupyter. Kernels for bash and other shells are already being developed. Providing interactive documentation for libraries also seems like a great idea that has not taken root yet. And if I were to stretch my imagination just a little bit, I’d say one more possibility can emerge: interactive terminals of different kinds used for interacting with production environments (an alternative to SSHing into servers). This of course requires work on the security front, and careful consideration, but in my opinion it is possible. Time will tell. Live long and prosper.

