Jupyter notebooks are one of the most powerful tools in a data scientist's arsenal and one of the most commonly misused. They are excellent for exploration, communication, and teaching. They are catastrophically bad as production code, version-controlled libraries, or complex multi-step pipelines. The key to using notebooks effectively is understanding when they are the right tool and applying a set of practices that prevent the common failure modes.
The Core Problems with Notebooks
Cell execution order bugs. In a notebook, cells can be run in any order. This means a notebook can appear to work correctly (all outputs are present and look right) while failing when its cells are run top-to-bottom. A classic example: cell 5 creates a column and cell 3 uses it. During exploration you ran cell 5 first, then went back and ran cell 3, so everything worked. Someone else opens the notebook and runs cells 1 through 7 in order; it crashes on cell 3 because cell 5 has not run yet. Until someone actually re-runs it, though, the notebook still displays cell 3's saved output from your session, so the bug is invisible.
Hidden state. The Python kernel keeps state between cell executions. Variables defined in a cell that you later edit or delete still exist in memory until you restart the kernel. This means a notebook can depend on variables that were defined in cells you deleted an hour ago.
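The hidden-state problem can be reproduced outside Jupyter. The sketch below is a simplified model, not how the kernel is implemented: each string stands in for a cell, a plain dict stands in for the kernel's namespace, and the names `run_cells`, `threshold`, and `saved_cells` are made up for illustration.

```python
def run_cells(cells, namespace):
    """Execute cell sources in order against one shared namespace, as a kernel would."""
    for source in cells:
        exec(source, namespace)
    return namespace


# The session as it actually happened: a scratch cell defined `threshold`
# and was later deleted from the notebook, but the kernel kept the variable.
kernel = {}
run_cells(["threshold = 0.8"], kernel)  # scratch cell, no longer in the saved file

saved_cells = [
    "scores = [0.5, 0.9, 0.95]",
    "passed = [s for s in scores if s > threshold]",
]
run_cells(saved_cells, kernel)  # works, because `threshold` is still in memory
live_result = kernel["passed"]

# A fresh kernel running the same saved cells exposes the hidden dependency.
try:
    run_cells(saved_cells, {})
    fresh_error = None
except NameError as exc:
    fresh_error = str(exc)

print(live_result)
print(fresh_error)
```

The saved cells are identical in both runs; only the leftover state differs, which is why the problem cannot be seen by reading the notebook.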
Version control for outputs. A notebook file is JSON, and Git stores it whole, including all cell outputs (plots as base64-encoded images, tables as HTML) and execution counts. Re-running a notebook with identical code still changes those fields, along with any timestamps in the output, so the resulting diff is noisy and unreadable.
Refactoring resistance. Notebooks discourage extracting reusable code into functions and modules. Logic accumulates in cells, grows intertwined, and becomes impossible to test or reuse.
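No testing out of the box. pytest does not collect or run notebook cells by default. Plugins such as nbmake and nbval can execute whole notebooks under pytest, but they check that a notebook runs (or that its stored outputs are reproduced), not that individual functions behave correctly. Logic left in cells stays effectively untested.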
Non-Negotiable Best Practices
Restart and Run All Before Sharing
Before sharing any notebook -- before committing it, before sending it to a colleague, before presenting it -- restart the kernel and run all cells from top to bottom. This is the single most important practice.
Kernel > Restart Kernel and Run All Cells...
If the notebook fails when run top-to-bottom, it is broken, regardless of what the cached outputs show.
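A quick heuristic for whether a saved notebook was last run this way is to look at the execution counts stored in the file: after a restart and a full run, non-empty code cells are numbered 1, 2, 3, ... in document order. The sketch below uses only the standard library and assumes the nbformat 4 layout (`cells`, `cell_type`, `source`, `execution_count`); the function name `ran_top_to_bottom` is made up for illustration.

```python
import json


def ran_top_to_bottom(notebook_json: str) -> bool:
    """Return True if non-empty code cells are numbered 1, 2, 3, ... in document order."""
    notebook = json.loads(notebook_json)
    counts = [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        # Empty code cells are skipped by the kernel and never get a count.
        if cell.get("cell_type") == "code" and cell.get("source")
    ]
    return counts == list(range(1, len(counts) + 1))


clean = json.dumps({"cells": [
    {"cell_type": "code", "source": "x = 1", "execution_count": 1, "outputs": []},
    {"cell_type": "markdown", "source": "notes"},
    {"cell_type": "code", "source": "x + 1", "execution_count": 2, "outputs": []},
]})
out_of_order = json.dumps({"cells": [
    {"cell_type": "code", "source": "x + 1", "execution_count": 7, "outputs": []},
    {"cell_type": "code", "source": "x = 1", "execution_count": 3, "outputs": []},
]})

print(ran_top_to_bottom(clean))         # True
print(ran_top_to_bottom(out_of_order))  # False
```

A clean sequence is consistent with a fresh top-to-bottom run but does not prove it, so this is a cheap pre-commit warning rather than a substitute for actually re-running the notebook.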
Use nbstripout to Remove Outputs from Git
Install nbstripout as a git filter. It automatically strips cell outputs before committing, keeping your diffs clean and reviewable.
pip install nbstripout
nbstripout --install # Sets up the git filter for this repo
After setup, git diff shows only code changes, not base64-encoded plot images. Code review becomes meaningful.
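To make concrete what "stripping outputs" means, here is a minimal standard-library sketch of the idea. It is not nbstripout's implementation, and the real tool handles more than this; the function name `strip_outputs` and the sample notebook are assumptions for illustration, based on the nbformat 4 layout.

```python
import json


def strip_outputs(notebook_json: str) -> str:
    """Return the notebook JSON with code-cell outputs and execution counts cleared."""
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drops plots, tables, and printed text
            cell["execution_count"] = None  # drops the run counter that changes on every run
    return json.dumps(notebook, indent=1, sort_keys=True)


# A tiny hand-written notebook with one plot output (the base64 payload is a placeholder).
sample = json.dumps({
    "cells": [{
        "cell_type": "code",
        "source": "df.plot()",
        "execution_count": 12,
        "outputs": [{"output_type": "display_data",
                     "data": {"image/png": "iVBORw0KGgo="},
                     "metadata": {}}],
    }],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
})

stripped = strip_outputs(sample)
print("image/png" in sample)    # True
print("image/png" in stripped)  # False
```

What remains after stripping is the source of each cell, which is the part a reviewer can meaningfully read in a diff.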
Move Reusable Code to .py Modules Early
The moment you find yourself copying a function between notebooks, or redefining it in a second cell, move it to a .py file and import it. The rule of thumb: if a function is longer than 10 lines, it belongs in a module.
# Instead of redefining this in every notebook:
# def preprocess_features(df): ...
# Create: src/preprocessing.py
# Import in the notebook:
from src.preprocessing import preprocess_features
This makes the code testable, version-controllable as plain Python, and reusable across notebooks.
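As a sketch of what this buys you, the example below shows a small helper as it might live in a module, next to a pytest-style test for it. The helper `normalize_column_names`, its behavior, and the file names in the comments are hypothetical; they are not the contents of the `preprocess_features` function mentioned above. Both parts sit in one block here so the example runs as a single file.

```python
# --- src/preprocessing.py (hypothetical contents) ---
def normalize_column_names(columns):
    """Trim whitespace, lower-case, and replace spaces with underscores."""
    return [name.strip().lower().replace(" ", "_") for name in columns]


# --- tests/test_preprocessing.py (hypothetical) ---
# In a real layout this file would start with:
#   from src.preprocessing import normalize_column_names
def test_normalize_column_names():
    raw = [" Order ID", "Customer Name ", "total"]
    assert normalize_column_names(raw) == ["order_id", "customer_name", "total"]


if __name__ == "__main__":
    test_normalize_column_names()
    print("ok")
```

With the two parts split into their own files, running pytest from the project root would pick up the test without any notebook involved.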
Name Notebooks Semantically
Prefer 01_data_exploration_orders.ipynb over Untitled.ipynb. A numeric prefix signals the intended execution order, if there is one, and keeps the files sorted in that order; it does not enforce it. Include the date for analytical notebooks that you will want to revisit: 2026_05_15_q2_cohort_analysis.ipynb.
Structure Notebooks Like Documents
A well-structured notebook reads like a document with a clear narrative:
- Title and description cell (markdown) explaining what this notebook does
- Imports cell
- Configuration cell (file paths, parameters -- things you might want to change)
- Data loading section
- Analysis sections with markdown headers explaining what each section does and what you found
- Conclusions section
This structure makes it possible for a new reader to understand the notebook without running it.
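One way to make the structure above the default is to start new notebooks from a template. The sketch below builds such a skeleton as nbformat 4 JSON using only the standard library; the section wording, the `build_skeleton` name, and the output file name are illustrative assumptions, and the bare `metadata` means Jupyter will ask which kernel to use on first open.

```python
import json

# (cell type, source) pairs mirroring the outline above.
SECTIONS = [
    ("markdown", "# Title\n\nOne-paragraph description of what this notebook does."),
    ("code", "# Imports"),
    ("code", "# Configuration: file paths and parameters you might want to change"),
    ("markdown", "## Data loading"),
    ("code", ""),
    ("markdown", "## Analysis\n\nWhat this section does and what you found."),
    ("code", ""),
    ("markdown", "## Conclusions"),
]


def build_skeleton(sections=SECTIONS) -> dict:
    """Return a minimal nbformat 4 notebook dict with one cell per section."""
    cells = []
    for cell_type, source in sections:
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            # Code cells carry these two extra fields; both start out empty.
            cell["execution_count"] = None
            cell["outputs"] = []
        cells.append(cell)
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


skeleton = build_skeleton()
print(json.dumps(skeleton, indent=1)[:80])

# To save it as a template (file name is hypothetical):
# with open("00_template.ipynb", "w", encoding="utf-8") as f:
#     json.dump(skeleton, f, indent=1)
```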