Monitoring Your Application with Prometheus and Grafana

Prometheus scrapes metrics from your app; Grafana visualizes them. Here is how to instrument a Node.js app, build dashboards, and set up alerts that matter.

Mahmudul Haque Qudrati

CEO & ML Engineer

May 18, 2026

10 min read

Prometheus collects metrics from your application by scraping a /metrics HTTP endpoint. Grafana visualizes those metrics in dashboards and fires alerts when thresholds are crossed. Together they are the most common open-source monitoring stack for production applications, and understanding them will make you a better operator of any backend system.

What You Are Trying to Monitor

Before setting up tools, know what you are measuring. The RED method defines the three signals that matter most for every service:

Rate: how many requests per second is your service handling? A sudden drop is as alarming as a sudden spike.

Errors: what percentage of requests are failing? Track 4xx and 5xx separately: 4xx responses are usually client mistakes, while 5xx responses are failures on your side.

Duration: how long are requests taking? Track percentiles, not averages. P50 (the median) tells you what a typical user experiences; P95 and P99 show the slow tail that an average hides.
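As a rough illustration of the three RED signals, the sketch below computes them from a batch of raw request records. The `RequestSample` shape and the `summarizeRed` name are hypothetical, and the percentile uses the simple nearest-rank method; Prometheus itself estimates percentiles from histogram buckets rather than raw samples.

```typescript
// One observed request: HTTP status code and duration in seconds.
interface RequestSample {
  status: number
  durationSeconds: number
}

interface RedSummary {
  ratePerSecond: number
  clientErrorRatio: number // share of 4xx responses
  serverErrorRatio: number // share of 5xx responses
  p50: number
  p95: number
  p99: number
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) return NaN
  const rank = Math.ceil((p * sorted.length) / 100)
  const index = Math.min(sorted.length, Math.max(1, rank)) - 1
  return sorted[index] ?? NaN
}

// Summarize all requests seen during a window of `windowSeconds`.
function summarizeRed(samples: RequestSample[], windowSeconds: number): RedSummary {
  const total = samples.length
  const durations = samples.map((s) => s.durationSeconds).sort((a, b) => a - b)
  const count4xx = samples.filter((s) => s.status >= 400 && s.status < 500).length
  const count5xx = samples.filter((s) => s.status >= 500).length
  return {
    ratePerSecond: total / windowSeconds,
    clientErrorRatio: total === 0 ? 0 : count4xx / total,
    serverErrorRatio: total === 0 ? 0 : count5xx / total,
    p50: percentile(durations, 50),
    p95: percentile(durations, 95),
    p99: percentile(durations, 99),
  }
}
```

Keeping the 4xx and 5xx ratios as separate fields mirrors the advice above: a spike in one calls for a different response than a spike in the other.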

For infrastructure-level monitoring (not covered by RED), track CPU utilization, memory usage, disk I/O, and network throughput.

What Prometheus Is

Prometheus is a time-series database and metric collection system. It works on a pull model: you configure Prometheus with a list of targets (your app instances' /metrics endpoints), and Prometheus scrapes those endpoints on a regular interval (typically every 15-30 seconds) and stores the metrics.

This pull model has an important implication: your application does not need to know about your monitoring system. You expose a /metrics endpoint, and Prometheus scrapes it once the instance is listed as a target (statically or through service discovery). Adding a new metric to your app does not require any coordination with the Prometheus server.
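What Prometheus pulls from /metrics is plain text. The sketch below renders a counter in the Prometheus text exposition format by hand, only to show what travels over the wire. The `CounterFamily` type and `renderCounter` function are hypothetical names; in a real application prom-client's `register.metrics()` produces this output for you.

```typescript
interface CounterSample {
  labels: Record<string, string>
  value: number
}

interface CounterFamily {
  name: string
  help: string
  samples: CounterSample[]
}

// Label values must escape backslashes, double quotes, and newlines.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n')
}

// Render one counter family: a HELP line, a TYPE line, then one line per series.
function renderCounter(family: CounterFamily): string {
  const lines = [
    `# HELP ${family.name} ${family.help}`,
    `# TYPE ${family.name} counter`,
  ]
  for (const sample of family.samples) {
    const pairs = Object.entries(sample.labels)
      .map(([key, value]) => `${key}="${escapeLabelValue(value)}"`)
      .join(',')
    const labelPart = pairs.length > 0 ? `{${pairs}}` : ''
    lines.push(`${family.name}${labelPart} ${sample.value}`)
  }
  return lines.join('\n') + '\n'
}
```

Because the format is just text over HTTP, you can inspect any target with curl or a browser before pointing Prometheus at it.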

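Prometheus stores data in its own time-series database on disk. It is designed for multi-dimensional time-series data (each metric is identified by a name plus a set of labels) and is optimized for fast aggregation queries over time ranges. Every unique combination of label values becomes its own time series, so keep label cardinality bounded: labels with unbounded values such as user IDs or raw URLs inflate memory use and slow queries.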
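Since each unique combination of label values is stored as a separate time series, the number of series a metric can produce is bounded by the product of the distinct values of its labels. A minimal sketch of that arithmetic follows; `maxSeriesCount` is a hypothetical helper, not part of any library.

```typescript
// Upper bound on the number of time series for one metric:
// the product of the distinct value counts of each label.
function maxSeriesCount(labelValues: Record<string, string[]>): number {
  return Object.values(labelValues).reduce(
    (product, values) => product * new Set(values).size,
    1,
  )
}

const boundedLabels = {
  method: ['GET', 'POST'],
  route: ['/users', '/users/:id', '/orders'],
  status_code: ['200', '404', '500'],
}

// 2 methods x 3 routes x 3 status codes
console.log(maxSeriesCount(boundedLabels))

// Adding a label with 1000 distinct values multiplies the bound by 1000.
const userIds = Array.from({ length: 1000 }, (_, i) => `user-${i}`)
console.log(maxSeriesCount({ ...boundedLabels, user_id: userIds }))
```

This is why a label such as a user ID or a raw request path is a poor fit for metrics, while a small fixed set such as method or status code is fine.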

Instrumenting a Node.js Application

The prom-client library is the standard Prometheus client for Node.js:

pnpm add prom-client

Set up default metrics (Node.js process metrics: CPU, memory, event loop lag, garbage collection) and two custom HTTP metrics, a request duration histogram and a request counter:

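import express from 'express'
import { collectDefaultMetrics, Counter, Histogram, register } from 'prom-client'

// Assumes an Express app; reuse your existing instance if you already have one
const app = express()

// Node.js process metrics: CPU, memory, event loop lag, garbage collection
collectDefaultMetrics()

// Request latency; each bucket value is an upper bound in seconds
export const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
})

// Monotonic request count, used for rate and error-ratio queries
export const httpRequestTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'route', 'status_code'],
})

// Expose the /metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType)
  res.send(await register.metrics())
})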

Instrument each request in middleware:

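// Register this middleware before your application routes so it sees every request
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer()
  res.on('finish', () => {
    const labels = {
      method: req.method,
      // Use the matched route pattern; raw paths would create one time series per URL
      route: req.route?.path ?? 'unmatched',
      status_code: res.statusCode,
    }
    end(labels) // records the elapsed seconds in the histogram
    httpRequestTotal.inc(labels)
  })
  next()
})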
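To make the `buckets` option less abstract, here is a small sketch of what a histogram keeps for each label combination: cumulative bucket counts, a running sum, and a total count. It is an illustration with hypothetical names (`HistogramState`, `createHistogram`, `observe`), not prom-client's internal implementation.

```typescript
interface HistogramState {
  upperBounds: number[] // ascending bucket upper bounds, in seconds
  bucketCounts: number[] // cumulative: observations <= upperBounds[i]
  sum: number // sum of all observed values
  count: number // total observations (the implicit +Inf bucket)
}

function createHistogram(upperBounds: number[]): HistogramState {
  const sorted = [...upperBounds].sort((a, b) => a - b)
  return { upperBounds: sorted, bucketCounts: sorted.map(() => 0), sum: 0, count: 0 }
}

// Record one value: every bucket whose upper bound is >= value is incremented.
function observe(state: HistogramState, value: number): void {
  state.sum += value
  state.count += 1
  state.bucketCounts = state.upperBounds.map(
    (bound, i) => (state.bucketCounts[i] ?? 0) + (value <= bound ? 1 : 0),
  )
}

const latency = createHistogram([0.1, 0.5, 1])
for (const seconds of [0.05, 0.3, 0.3, 2]) {
  observe(latency, seconds)
}
console.log(latency.bucketCounts, latency.count)
```

The 2-second request lands only in the implicit +Inf bucket, which is why bucket bounds should cover the latencies you actually expect: anything beyond the largest bound loses all resolution.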

Instrumenting a Next.js Application

Next.js does not have a traditional Express middleware layer, but you can add Prometheus metrics to Next.js API routes using the same prom-client library. Create a /api/metrics route that returns the Prometheus exposition format, and add timing logic to individual routes or to a shared wrapper function.

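For App Router, the instrumentation.ts file (Next.js's built-in instrumentation hook) is the right place to initialize Prometheus collectors: its exported register function runs once when the server starts, so collectors are not registered again on every request.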
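A shared wrapper function for route handlers might look like the sketch below. It is written against minimal generic types so it stands alone; `withMetrics`, `Observe`, and the injected `now` clock are hypothetical names, and the handler signature is an assumption rather than the exact Next.js type. In a real route you would pass a callback that forwards to the prom-client histogram's `observe(labels, value)` method.

```typescript
// Callback that records one request; in practice it would forward to a histogram.
type Observe = (labels: { route: string; status_code: number }, seconds: number) => void

// Wrap a handler so every call reports its duration and status code.
function withMetrics<Req, Res extends { status: number }>(
  route: string,
  handler: (req: Req) => Promise<Res>,
  observe: Observe,
  now: () => number = Date.now, // milliseconds; injectable for testing
): (req: Req) => Promise<Res> {
  return async (req) => {
    const start = now()
    let status = 500 // reported if the handler throws before returning a response
    try {
      const res = await handler(req)
      status = res.status
      return res
    } finally {
      observe({ route, status_code: status }, (now() - start) / 1000)
    }
  }
}
```

Passing the route as a fixed string keeps the label set bounded, and recording in `finally` means failed requests are measured too.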

What Grafana Is

Grafana is a visualization platform that connects to data sources (Prometheus, Loki for logs, Tempo for traces, and many others) and lets you build dashboards. Dashboards consist of panels: graphs, stat displays, gauges, tables, and more.

Grafana queries Prometheus using PromQL, Prometheus's own query language. Example: the per-second request rate, averaged over the last 5 minutes:

rate(http_requests_total[5m])
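To build intuition for what rate() returns, the sketch below computes a per-second rate from raw counter samples and treats any decrease as a counter reset (for example after a process restart). It is a simplification with hypothetical names: the real rate() also extrapolates to the edges of the time window, which this version omits.

```typescript
interface CounterPoint {
  timestampSeconds: number
  value: number
}

// Per-second increase across the samples, tolerating counter resets.
function simpleRate(samples: CounterPoint[]): number {
  let increase = 0
  let previous: CounterPoint | undefined
  for (const current of samples) {
    if (previous !== undefined) {
      const delta = current.value - previous.value
      // A drop means the counter restarted from zero, so count from zero.
      increase += delta >= 0 ? delta : current.value
    }
    previous = current
  }
  const first = samples[0]
  if (first === undefined || previous === undefined) return NaN
  const elapsed = previous.timestampSeconds - first.timestampSeconds
  return elapsed > 0 ? increase / elapsed : NaN
}
```

This is also why counters should only ever go up in your application code: the reset handling assumes any decrease is a restart.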

P95 latency by route:

histogram_quantile(0.95, sum by (le, route) (rate(http_request_duration_seconds_bucket[5m])))
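histogram_quantile does not see individual request durations, only cumulative bucket counts, so the result is an estimate. The sketch below shows the general idea under simplifying assumptions: find the bucket that contains the target rank, then interpolate linearly inside it. The `Bucket` type and `estimateQuantile` function are hypothetical names, and the lowest bucket is assumed to start at 0.

```typescript
interface Bucket {
  le: number // upper bound; the last bucket uses Infinity
  cumulativeCount: number // observations <= le
}

function estimateQuantile(q: number, buckets: Bucket[]): number {
  const sorted = [...buckets].sort((a, b) => a.le - b.le)
  const last = sorted[sorted.length - 1]
  if (last === undefined || last.cumulativeCount === 0) return NaN
  const rank = q * last.cumulativeCount
  let lowerBound = 0 // assumption: the first bucket starts at zero
  let countBelow = 0
  for (const bucket of sorted) {
    if (bucket.cumulativeCount >= rank) {
      // Beyond the largest finite bound there is nothing to interpolate against.
      if (bucket.le === Infinity) return lowerBound
      const inBucket = bucket.cumulativeCount - countBelow
      if (inBucket === 0) return lowerBound
      const fraction = (rank - countBelow) / inBucket
      return lowerBound + (bucket.le - lowerBound) * fraction
    }
    lowerBound = bucket.le
    countBelow = bucket.cumulativeCount
  }
  return NaN
}
```

Because of the interpolation, the estimate can only be as precise as the bucket that contains it, so it helps to place bucket boundaries near the latencies you care about, such as an SLO threshold.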

Error rate (the share of all requests that return a 5xx status):

sum(rate(http_requests_total{status_code=~"5.."}[5m]))
/ sum(rate(http_requests_total[5m]))

Grafana's dashboard builder is visual: you write PromQL queries in the panel editor and see the graph render live. Dashboards can be exported as JSON and committed to version control.

Alerting

Grafana Alerting lets you define rules that fire when a metric crosses a threshold. The rule evaluates a PromQL query on a schedule and sends notifications via Slack, email, PagerDuty, or webhooks.

Principles for good alerting:

Alert on symptoms, not causes. Alert when error rate is high (symptom), not when CPU is high (cause). High CPU does not always mean users are affected; a sustained high error rate does.

Set meaningful thresholds. "Error rate > 1% for 5 minutes" is a meaningful alert. "Any error ever" is noise. "CPU > 80%" by itself is noise.

Alert on what you would wake up for. If an alert fires and you look at it and decide nothing needs to be done, the alert should not exist. Alert fatigue kills monitoring programs.
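The "for 5 minutes" part of a rule like "Error rate > 1% for 5 minutes" is what filters out brief blips. The sketch below models that behaviour as a small state machine over successive evaluations. The state names and the `alertStates` function are illustrative, not Grafana's API.

```typescript
type AlertState = 'inactive' | 'pending' | 'firing'

interface Evaluation {
  timestampSeconds: number
  errorRatio: number // for example 0.02 for 2%
}

// Return the alert state after each evaluation. The alert fires only once the
// threshold has been breached continuously for at least `forSeconds`.
function alertStates(
  evaluations: Evaluation[],
  threshold: number,
  forSeconds: number,
): AlertState[] {
  let breachedSince: number | undefined
  return evaluations.map(({ timestampSeconds, errorRatio }): AlertState => {
    if (errorRatio <= threshold) {
      breachedSince = undefined // any healthy evaluation resets the timer
      return 'inactive'
    }
    if (breachedSince === undefined) breachedSince = timestampSeconds
    return timestampSeconds - breachedSince >= forSeconds ? 'firing' : 'pending'
  })
}
```

A longer duration means fewer false alarms but slower notification, so it is worth choosing per alert rather than using one value everywhere.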

Hosted Monitoring Alternatives

Grafana Cloud: hosted Prometheus + Grafana, free tier (10,000 metric series, 50GB logs, 14 days retention). The easiest way to run this stack without self-hosting.

Datadog: the most comprehensive commercial monitoring platform, with APM, metrics, logs, traces, synthetics, and security in one product. Expensive ($15+/host/month), but a best-in-class experience for organizations that can afford it.

New Relic: similar to Datadog, competitive on pricing for certain tiers.

Better Uptime / UptimeRobot: simpler uptime monitoring (HTTP ping checks, status pages). Not a replacement for Prometheus, but their free tiers solve the "is my site up?" problem for $0.

When Self-Hosted Monitoring Makes Sense

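Self-hosted Prometheus + Grafana on the same VPS as your application costs nothing extra and gives you full control over retention and data privacy. For small teams on a budget, this is the pragmatic choice. The trade-off is that your monitoring goes down with the host it monitors, so pair it with an external uptime check.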

Managed monitoring (Grafana Cloud, Datadog) makes sense when: your team does not want to manage infrastructure, you need long-term metric retention, or the time saved on operations is worth the monthly cost.

Keep Reading

Sentry Error Tracking Guide — application-level error tracking that complements Prometheus metrics
Coolify vs Fly.io vs Render — deploying your Prometheus and Grafana instances
We Replaced 6 SaaS Tools With One: What Happened — reducing operational overhead in your toolchain

