Jupyter notebooks are one of the most powerful tools in a data scientist's arsenal and one of the most commonly misused. They are excellent for exploration, communication, and teaching. They are catastrophically bad as production code, version-controlled libraries, or complex multi-step pipelines. The key to using notebooks effectively is understanding when they are the right tool and applying a set of practices that prevent the common failure modes.
The Core Problems with Notebooks
Cell execution order bugs. In a notebook, cells can be run in any order. This means your notebook can appear to work correctly (all outputs are present and look right) while being completely broken when its cells are run top to bottom. A classic example: cell 5 creates a column and cell 3 uses that column. You ran cell 5 earlier in the session, then went back and re-ran cell 3, so everything works for you. Someone else opens the notebook and runs the cells in order 1-2-3-4-5-6-7, and it crashes on cell 3 because cell 5 has not run yet. Until they do that, nothing looks wrong: cell 3's saved output from your session is still displayed, so the bug is invisible to anyone who only reads the notebook.
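The out-of-order failure can be reproduced outside a notebook. The following is a minimal sketch that treats each cell as a function over a shared dictionary standing in for the kernel; the cell names, column names, and data are hypothetical.

```python
# Each "cell" reads and writes a shared namespace (the stand-in for kernel state).

def load_cell(ns):
    ns["rows"] = [{"revenue": 120.0, "users": 4}, {"revenue": 90.0, "users": 3}]

def summary_cell(ns):
    # Appears EARLY in the notebook but needs a column made by a LATER cell.
    values = [row["revenue_per_user"] for row in ns["rows"]]
    ns["mean_rpu"] = sum(values) / len(values)

def derive_cell(ns):
    # Appears LATE in the notebook and creates the column.
    for row in ns["rows"]:
        row["revenue_per_user"] = row["revenue"] / row["users"]

def run(cells):
    ns = {}  # a fresh namespace is the equivalent of a restarted kernel
    for cell in cells:
        cell(ns)
    return ns

# The author's interactive order: the later cell happened to run first.
author_ns = run([load_cell, derive_cell, summary_cell])
print(author_ns["mean_rpu"])

# Top-to-bottom order on a fresh kernel: the early cell fails.
try:
    run([load_cell, summary_cell, derive_cell])
    top_to_bottom_error = None
except KeyError as exc:
    top_to_bottom_error = exc
print(repr(top_to_bottom_error))
```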
Hidden state. The Python kernel maintains state between cells. Variables defined in a cell that you later delete from the notebook still exist in memory until you restart the kernel. This means a notebook can depend on variables that were defined in cells you deleted an hour ago.
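A small sketch of hidden state, using `exec` on a dictionary as a stand-in for the kernel's global namespace. The variable names are hypothetical; the point is only that deleting a cell's source does not unbind what it defined.

```python
kernel = {}  # stands in for the running kernel's globals

# A cell that was run once and then deleted from the notebook.
exec("threshold = 0.8", kernel)

# The deleted cell's variable is still bound, so a surviving cell keeps working.
surviving_cell = "kept = [x for x in (0.5, 0.9) if x > threshold]"
exec(surviving_cell, kernel)
print(kernel["kept"])

# After a restart the namespace is empty and the same cell fails.
fresh_kernel = {}
try:
    exec(surviving_cell, fresh_kernel)
    restart_error = None
except NameError as exc:
    restart_error = exc
print(repr(restart_error))
```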
Version control for outputs. Git stores the JSON representation of a notebook, including all cell outputs (plots as base64-encoded images, tables as HTML). Re-running a notebook with identical code still changes execution counts and any output that contains timestamps or re-rendered plots, which creates a noisy, unreadable diff.
Refactoring resistance. Notebooks discourage extracting reusable code into functions and modules. Logic accumulates in cells, grows intertwined, and becomes impossible to test or reuse.
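No testing out of the box. pytest does not collect notebook cells by default, so the standard Python testing ecosystem does not apply to them directly. Plugins such as nbmake and nbval can execute notebooks under pytest, but they check that a notebook runs (or that its outputs match a saved run), not that individual functions are correct.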
Non-Negotiable Best Practices
Restart and Run All Before Sharing
Before sharing any notebook -- before committing it, before sending it to a colleague, before presenting it -- restart the kernel and run all cells from top to bottom. This is the single most important practice.
Kernel > Restart Kernel and Run All Cells...
If the notebook fails when run top-to-bottom, it is broken, regardless of what the cached outputs show.
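The same check can be scripted, for example as a pre-commit step. The sketch below shells out to `jupyter nbconvert --execute`, which runs the notebook in a fresh kernel and exits with a non-zero status if a cell raises. It assumes the `jupyter` command is on the PATH; the function names are hypothetical.

```python
import subprocess
import tempfile

def build_check_command(notebook_path, output_dir):
    """Command line that executes a notebook top to bottom in a fresh kernel."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--output-dir", output_dir,  # write the executed copy to a scratch directory
        notebook_path,
    ]

def runs_top_to_bottom(notebook_path):
    """Return True if every cell executes without error on a fresh kernel."""
    with tempfile.TemporaryDirectory() as scratch:
        result = subprocess.run(
            build_check_command(notebook_path, scratch),
            capture_output=True,
            text=True,
        )
    return result.returncode == 0
```

Calling `runs_top_to_bottom("01_data_exploration_orders.ipynb")` before committing gives the same answer as the menu command, without relying on anyone remembering to click it.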
Use nbstripout to Remove Outputs from Git
Install nbstripout as a git filter. It automatically strips cell outputs before committing, keeping your diffs clean and reviewable.
pip install nbstripout
nbstripout --install # Sets up the git filter for this repo
After setup, git diff shows only code changes, not base64-encoded plot images. Code review becomes meaningful.
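To make "stripping outputs" concrete, here is a rough sketch of the idea on the notebook's JSON structure (version 4 format: a `cells` list where code cells carry `outputs` and `execution_count`). This illustrates the concept only and is not nbstripout's actual implementation, which may handle additional metadata.

```python
import json

def strip_outputs(notebook):
    """Clear outputs and execution counts from every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook

# A tiny hypothetical notebook with one executed code cell.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["1 + 1"],
            "execution_count": 7,
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

stripped = strip_outputs(example)
print(json.dumps(stripped["cells"][1], indent=1))
```

What remains after stripping is only the source of each cell, which is why the resulting diffs track code changes.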
Move Reusable Code to .py Modules Early
The moment you find yourself copying a function from one notebook or cell to another, move it to a .py file and import it. A rough rule of thumb: if a function is longer than about 10 lines, it belongs in a module.
# Instead of redefining this in every notebook:
# def preprocess_features(df): ...
# Create: src/preprocessing.py
# Import in the notebook:
from src.preprocessing import preprocess_features
This makes the code testable, version-controllable as plain Python, and reusable across notebooks.
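As an illustration of what "testable" buys you, the sketch below shows a hypothetical body for `preprocess_features` together with a pytest-style test. Both the function's behavior and the test file name are assumptions made for the example, and plain dictionaries are used in place of a DataFrame to keep it self-contained.

```python
# Hypothetical contents of src/preprocessing.py
def preprocess_features(records):
    """Drop records with a missing amount and add a boolean 'is_large' feature."""
    cleaned = []
    for record in records:
        if record.get("amount") is None:
            continue  # skip rows that cannot be featurized
        cleaned.append({**record, "is_large": record["amount"] >= 100})
    return cleaned


# Hypothetical contents of tests/test_preprocessing.py, collected by pytest
def test_preprocess_features_drops_missing_and_flags_large():
    records = [{"amount": 150}, {"amount": None}, {"amount": 20}]
    result = preprocess_features(records)
    assert result == [
        {"amount": 150, "is_large": True},
        {"amount": 20, "is_large": False},
    ]
```

Once the function lives in a module, the test runs in CI with no kernel involved, and every notebook that imports it gets the same verified behavior.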
Name Notebooks Semantically
Prefer 01_data_exploration_orders.ipynb over Untitled.ipynb. A numeric prefix communicates the intended execution order if there is one (it does not enforce it). For analytical notebooks that you will want to revisit, include the date: 2026_05_15_q2_cohort_analysis.ipynb.
Structure Notebooks Like Documents
A well-structured notebook reads like a document with a clear narrative:
- Title and description cell (markdown) explaining what this notebook does
- Imports cell
- Configuration cell (file paths, parameters -- things you might want to change)
- Data loading section
- Analysis sections with markdown headers explaining what each section does and what you found
- Conclusions section
This structure makes it possible for a new reader to understand the notebook without running it.
Papermill: Parameterized Notebook Execution
Papermill lets you execute notebooks programmatically with different parameters, turning a notebook into a batch job.
import papermill as pm

# Execute the notebook with different date parameters
pm.execute_notebook(
    "analysis_template.ipynb",     # input notebook (the template)
    "analysis_2026_05_18.ipynb",   # output notebook (executed copy)
    parameters={
        "start_date": "2026-05-01",
        "end_date": "2026-05-18",
        "cohort": "enterprise",
    },
)
In the notebook, put default values for these variables in a single cell and mark that cell with a parameters tag in the cell metadata. Papermill injects a new cell directly after it that assigns the provided parameter values, overriding the defaults.
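A sketch of how this looks from inside the executed notebook, written as one script with comments marking the cell boundaries. The default values are hypothetical; the second block mimics the cell papermill would inject for the call above.

```python
# --- Cell tagged "parameters": defaults used when running interactively ---
start_date = "2026-04-01"
end_date = "2026-04-30"
cohort = "all"

# --- Cell injected by papermill right after it (shown for illustration) ---
start_date = "2026-05-01"
end_date = "2026-05-18"
cohort = "enterprise"

# --- Downstream cells see the injected values, not the defaults ---
report_title = f"{cohort} cohort, {start_date} to {end_date}"
print(report_title)
```

Because the injected cell runs after the defaults, the notebook still works when opened and run by hand, and the executed copy records exactly which values were used.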
This pattern is useful for scheduled report generation, running the same analysis for different segments, and A/B test result notebooks.
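For the "same analysis, different segments" case, one possible shape is to plan the runs as data and then hand each one to `pm.execute_notebook`. The helper names, cohort list, and output naming scheme below are assumptions for illustration; papermill is imported lazily so the planning step works without it installed.

```python
def build_jobs(template, start_date, end_date, cohorts):
    """Return one (input, output, parameters) tuple per cohort."""
    jobs = []
    for cohort in cohorts:
        output = f"analysis_{cohort}_{end_date}.ipynb"  # hypothetical naming scheme
        params = {"start_date": start_date, "end_date": end_date, "cohort": cohort}
        jobs.append((template, output, params))
    return jobs

def run_jobs(jobs):
    """Execute each planned job with papermill."""
    import papermill as pm  # requires papermill to be installed

    for template, output, params in jobs:
        pm.execute_notebook(template, output, parameters=params)

jobs = build_jobs(
    "analysis_template.ipynb", "2026-05-01", "2026-05-18", ["enterprise", "smb"]
)
for _, output, params in jobs:
    print(output, params["cohort"])
```

Calling `run_jobs(jobs)` then produces one executed notebook per cohort, each a self-contained record of its own parameters.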
nbdev: Library Development in Notebooks
nbdev (from fast.ai) is a framework that lets you develop Python libraries using Jupyter notebooks as the source of truth. Code, tests, and documentation all live in the notebook; nbdev exports them to .py files and documentation sites.
This is a niche but powerful pattern for libraries where exploration and documentation are as important as the code itself. It is not appropriate for most production systems.
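To give a feel for the workflow, here is a sketch of three notebook cells as they might appear in an nbdev (version 2) project, flattened into one script with comments marking cell boundaries. The `#| default_exp` and `#| export` comments are nbdev directives; the module name `core` and the `clamp` function are hypothetical.

```python
# --- cell 1: names the module that exported code is written to ---
#| default_exp core

# --- cell 2: exported to the library's core.py ---
#| export
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))

# --- cell 3: no export directive, so it stays in the notebook ---
# Plain assertions like these serve as the tests and as worked examples in the docs.
assert clamp(5, 0, 3) == 3
assert clamp(-1, 0, 3) == 0
assert clamp(2, 0, 3) == 2
```

Since the directives are ordinary comments, the cells run unchanged in a regular kernel; only nbdev's export step gives them meaning.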
Marimo: The Reactive Alternative
Marimo is a new notebook environment with a fundamentally different execution model. In Marimo, notebooks are reactive: when you change a cell, all cells that depend on its outputs automatically re-run. You cannot have a stale cached output from a previous run.
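The reactive model can be pictured as a dependency graph: each cell defines some variables and reads others, and a change invalidates everything downstream. The sketch below illustrates that idea only; it is not Marimo's implementation, and the cell names and data structure are hypothetical.

```python
from collections import deque

def cells_to_rerun(cells, changed):
    """Return the changed cell plus every cell that depends on it, in declaration order.

    cells maps a cell name to a (defines, reads) pair of variable-name sets.
    """
    stale = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        defined = cells[current][0]
        for name, (_, reads) in cells.items():
            # A cell is stale if it reads anything a stale cell defines.
            if name not in stale and reads & defined:
                stale.add(name)
                queue.append(name)
    return [name for name in cells if name in stale]

cells = {
    "load_data": ({"df"}, set()),
    "compute_stats": ({"stats"}, {"df"}),
    "plot": ({"fig"}, {"stats"}),
    "notes": ({"text"}, set()),
}

print(cells_to_rerun(cells, "load_data"))
print(cells_to_rerun(cells, "notes"))
```

Editing `load_data` marks `compute_stats` and `plot` for re-execution, while the unrelated `notes` cell is left alone.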
Additional advantages: Marimo notebooks are stored as pure Python files (not JSON), making git diffs clean and meaningful without nbstripout. They can be deployed as interactive web applications. They can be run as scripts.
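# Marimo notebook: pure Python, version-controllable
import marimo

app = marimo.App()

@app.cell
def load_data():
    import pandas as pd
    df = pd.read_csv("data.csv")
    return (df,)  # variables this cell makes available to other cells

@app.cell
def compute_stats(df):  # the parameter declares the dependency on load_data
    df.describe()  # the last expression is the cell's displayed output
    return

if __name__ == "__main__":
    app.run()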
Marimo is worth trying for any new notebook work. The main limitation is ecosystem maturity -- some Jupyter-specific extensions do not work with Marimo.
When Notebooks Are Right vs When to Move to Scripts
Use notebooks for: initial data exploration and EDA (you are figuring out the structure of new data), one-off analyses (answering a specific business question that will not be repeated), communicating findings (the notebook IS the deliverable, showing methodology and results together), and teaching (interleaving code with explanation).
Move to scripts when: the code will run in production (scheduled pipeline, model serving), the same code needs to run repeatedly in CI/CD, you need unit tests, you are building a library, or multiple people need to contribute to and review the code as a codebase (not a document).
The transition point is usually when someone asks "can we automate this?" If the answer is yes, the notebook has served its exploratory purpose and the logic should be extracted to Python modules.
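What the extracted logic might look like as a script: a small command-line entry point whose arguments replace the notebook's configuration cell. This is a sketch under assumptions; the argument names mirror the parameters used earlier, and `run_analysis` is a hypothetical placeholder for code imported from your modules.

```python
import argparse

def parse_args(argv=None):
    """Command-line arguments take the place of the notebook's configuration cell."""
    parser = argparse.ArgumentParser(description="Cohort analysis extracted from a notebook")
    parser.add_argument("--start-date", required=True)
    parser.add_argument("--end-date", required=True)
    parser.add_argument("--cohort", default="all")
    return parser.parse_args(argv)

def run_analysis(start_date, end_date, cohort):
    # Placeholder: real logic would be imported from tested modules.
    return f"{cohort}: {start_date} to {end_date}"

def main(argv=None):
    args = parse_args(argv)
    summary = run_analysis(args.start_date, args.end_date, args.cohort)
    print(summary)
    return summary

# Example invocation with explicit arguments; a real script would instead
# call main() under an `if __name__ == "__main__":` guard.
result = main(["--start-date", "2026-05-01", "--end-date", "2026-05-18"])
```

Each function here can be unit tested and scheduled, which is exactly what the notebook version could not offer.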