Performance testing for application developers is about answering three questions: is this API route fast enough under normal load, does it degrade gracefully under high load, and where exactly is it spending time when it's slow? You don't need a dedicated performance team or enterprise APM tooling to answer these questions. You need k6 for load testing, Lighthouse for frontend performance, and Node.js's built-in profiler or clinic for finding where your server is spending time.
The Three Questions and the Tools That Answer Them
Is my API fast enough? Benchmark it: send N requests and measure p50, p95, and p99 latency. k6 does this in 10 lines of JavaScript.
Does it hold up under load? Load test it: ramp up to N concurrent users and watch what happens to error rates and latency. k6 does this too, with configurable ramp patterns.
Where is it slow? Profile it: collect a CPU profile or a flame graph showing where your Node.js process is spending time. clinic makes this accessible.
k6: Load Testing for Developers
k6 is an open-source load testing tool whose tests you write in JavaScript (or TypeScript). It runs from your terminal, produces clear output, and has a free cloud option for distributed load.
Install:
brew install k6
Basic load test:
// load-test.js
import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  vus: 50, // 50 virtual users
  duration: "30s", // Run for 30 seconds
};

export default function () {
  const res = http.get("https://api.myapp.com/projects");
  check(res, {
    "status is 200": (r) => r.status === 200,
    "response time < 500ms": (r) => r.timings.duration < 500,
  });
  sleep(1);
}
k6 run load-test.js
The output shows:
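- http_req_duration: request latency. By default k6 prints avg, min, med (p50), max, p(90), and p(95); add p(99) through the summaryTrendStats option.
- http_req_failed: error rate
- iterations: total completed runs of the default function (one request each in this script)
- vus: number of active virtual users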
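As a rough sanity check on how much load the basic script generates, the request rate can be estimated from the number of virtual users and the time each iteration takes. The sketch below is plain JavaScript, not a k6 API. The function name is hypothetical and the 200ms and 1s response times are assumed figures for illustration.

```javascript
// Rough throughput estimate for a script where each virtual user sends one
// request, waits for the response, sleeps, and repeats.
function estimateRequestsPerSecond(vus, sleepSeconds, avgResponseSeconds) {
  if (vus <= 0) return 0;
  const iterationSeconds = sleepSeconds + avgResponseSeconds;
  if (iterationSeconds <= 0) {
    throw new RangeError("iteration time must be positive");
  }
  return vus / iterationSeconds;
}

// 50 VUs and sleep(1) come from the basic load test; the response times
// are assumptions, not measurements.
const healthy = estimateRequestsPerSecond(50, 1, 0.2);
const degraded = estimateRequestsPerSecond(50, 1, 1.0);

console.log(healthy.toFixed(1)); // 41.7
console.log(degraded.toFixed(1)); // 25.0
```

Note that the request rate falls as the server slows down, because each virtual user waits for its response before sending the next request. A fixed number of VUs therefore does not mean a fixed request rate.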
Ramp-up load test (more realistic):
export const options = {
  stages: [
    { duration: "30s", target: 10 }, // Ramp up to 10 users
    { duration: "1m", target: 50 }, // Ramp up to 50 users
    { duration: "30s", target: 100 }, // Ramp up to 100 users
    { duration: "30s", target: 0 }, // Ramp down
  ],
  thresholds: {
    http_req_duration: ["p(95)<500"], // Fail if p95 is 500ms or more
    http_req_failed: ["rate<0.01"], // Fail if error rate is 1% or more
  },
};
The thresholds section makes k6 exit with a non-zero code when performance degrades — which lets you fail CI builds when performance regressions are introduced.
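To make the stages in the ramp-up configuration concrete, here is a small sketch that estimates how many virtual users are targeted at a given moment, assuming the ramp between stages is linear. It is plain JavaScript for illustration, not k6's implementation. The helper names and the startVUs parameter are assumptions of this sketch, and k6 applies its own starting value.

```javascript
// Parse the duration strings used in the stages ("30s", "1m").
// Only single-unit values are handled in this sketch.
function parseDuration(text) {
  const match = /^(\d+)(s|m|h)$/.exec(text);
  if (!match) throw new Error(`unsupported duration: ${text}`);
  const unitSeconds = { s: 1, m: 60, h: 3600 };
  return Number(match[1]) * unitSeconds[match[2]];
}

// Target VU count after `elapsedSeconds`, interpolating linearly within
// each stage. `startVUs` is an assumption of this sketch.
function vusAt(stages, elapsedSeconds, startVUs = 0) {
  let stageStart = 0;
  let previousTarget = startVUs;
  for (const stage of stages) {
    const length = parseDuration(stage.duration);
    if (length > 0 && elapsedSeconds <= stageStart + length) {
      const progress = (elapsedSeconds - stageStart) / length;
      return Math.round(previousTarget + (stage.target - previousTarget) * progress);
    }
    stageStart += length;
    previousTarget = stage.target;
  }
  return previousTarget; // past the end of the last stage
}

const stages = [
  { duration: "30s", target: 10 },
  { duration: "1m", target: 50 },
  { duration: "30s", target: 100 },
  { duration: "30s", target: 0 },
];

console.log(vusAt(stages, 15)); // 5
console.log(vusAt(stages, 60)); // 30
console.log(vusAt(stages, 120)); // 100
console.log(vusAt(stages, 135)); // 50
```

The whole profile lasts 150 seconds and peaks at 100 users at the 120-second mark, so latency or error spikes late in the run line up with the highest load.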
Testing authenticated endpoints:
import http from "k6/http";

// setup() runs once before the test; its return value is passed to every VU.
export function setup() {
  const res = http.post(
    "https://api.myapp.com/auth/login",
    JSON.stringify({ email: "test@example.com", password: "test-password" }),
    { headers: { "Content-Type": "application/json" } }
  );
  return { token: res.json("token") };
}

export default function (data) {
  http.get("https://api.myapp.com/projects", {
    headers: { Authorization: `Bearer ${data.token}` },
  });
}
Reading k6 Output: What the Numbers Mean
p50 (median): Half of all requests completed faster than this. A healthy API typically has p50 under 100ms for simple reads.
p95: 95% of requests completed faster than this. This is the standard SLO threshold — "our API responds in under 500ms for 95% of requests."
p99: 99% of requests completed faster than this. This catches the "long tail" — the slow requests that happen due to garbage collection, cold starts, or lock contention. A p99 significantly higher than p95 (say, p95 is 200ms and p99 is 2000ms) indicates occasional severe slowdowns worth investigating.
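To see how these percentiles relate to raw measurements, the following sketch computes them from a small set of latency samples using the nearest-rank method. The sample values are made up for illustration, and k6 may interpolate between samples, so its figures can differ slightly on small data sets.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b); // numeric, not lexicographic
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Twenty hypothetical request latencies in milliseconds, with two outliers.
const latenciesMs = [
  82, 95, 110, 88, 120, 97, 105, 91, 2000, 99,
  101, 93, 87, 115, 108, 96, 102, 90, 200, 94,
];

const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
const p99 = percentile(latenciesMs, 99);

console.log(`p50=${p50}ms p95=${p95}ms p99=${p99}ms`);
// p50=97ms p95=200ms p99=2000ms
console.log(`p99 is ${p99 / p95}x p95`); // p99 is 10x p95
```

A single 2000ms request out of twenty leaves the median untouched but dominates p99, which is why the tail percentiles are the ones to watch for occasional severe slowdowns.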
What "slow" means in context: Google's Core Web Vitals define LCP (Largest Contentful Paint) under 2.5 seconds as good, 2.5-4 seconds as needs improvement, and above 4 seconds as poor. For API responses, 200ms p95 is excellent, 500ms p95 is acceptable, above 1 second p95 needs attention.
Lighthouse for Frontend Performance
Lighthouse audits web page performance, accessibility, and SEO. It's built into Chrome DevTools and available as a CLI.
Install Lighthouse CLI:
npm install -g lighthouse
Run an audit:
lighthouse https://myapp.com --output html --output-path ./lighthouse-report.html
Key metrics in the Lighthouse report:
- LCP (Largest Contentful Paint): How long until the main content is visible. Goal: under 2.5s.
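- INP (Interaction to Next Paint): How quickly the page responds to user input. Goal: under 200ms. INP replaced FID (First Input Delay) as a Core Web Vital in 2024. Both depend on real user interactions, so Lighthouse's lab report shows TBT (Total Blocking Time) as a proxy.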
- CLS (Cumulative Layout Shift): How much the layout shifts as the page loads. Goal: under 0.1.
- TTFB (Time to First Byte): How long the server takes to respond. Goal: under 800ms.
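The goals above can be turned into an automated check against a Lighthouse report saved as JSON (the CLI's --output json option). The sketch below is one possible approach. The audit IDs are the ones I understand recent Lighthouse versions to use, so verify them against your own report; the function name and sample report are hypothetical.

```javascript
// Budgets mirror the goals listed above. Keys are assumed Lighthouse audit
// IDs; check them against the "audits" object in your own report.
const budgets = {
  "largest-contentful-paint": 2500, // milliseconds
  "cumulative-layout-shift": 0.1, // unitless score
  "server-response-time": 800, // milliseconds
};

// Returns one message per budget that is exceeded or missing from the report.
function findBudgetViolations(report, limits) {
  const violations = [];
  for (const [auditId, limit] of Object.entries(limits)) {
    const audit = report.audits && report.audits[auditId];
    if (!audit || typeof audit.numericValue !== "number") {
      violations.push(`${auditId}: missing from report`);
    } else if (audit.numericValue > limit) {
      violations.push(`${auditId}: ${audit.numericValue} exceeds ${limit}`);
    }
  }
  return violations;
}

// Hand-written stand-in for a parsed Lighthouse JSON report.
const sampleReport = {
  audits: {
    "largest-contentful-paint": { numericValue: 3120 },
    "cumulative-layout-shift": { numericValue: 0.04 },
    "server-response-time": { numericValue: 310 },
  },
};

console.log(findBudgetViolations(sampleReport, budgets));
```

In a real pipeline you would load the report with JSON.parse(fs.readFileSync(path, "utf8")) and exit with a non-zero code when the returned list is not empty. Treating a missing audit as a violation makes the check fail loudly if the report format changes.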
Run Lighthouse in CI to catch regressions:
lighthouse https://myapp.com --output json --output-path ./lighthouse-results.json --chrome-flags="--headless"
# Extract the overall performance score and fail if it is below 80
node -e "
const result = require('./lighthouse-results.json');
const score = result.categories.performance.score * 100;
console.log('Performance score:', score);
if (score < 80) process.exit(1);
"
clinic: Node.js Profiling
clinic is a set of Node.js profiling tools from NearForm that make flame graphs and bottleneck analysis accessible.
Install:
npm install -g clinic
Find CPU bottlenecks with flame graphs:
clinic flame -- node server.js
Run a load test against your server while clinic is profiling, then press Ctrl+C to stop the server. clinic then generates an HTML flame graph showing which functions consume the most CPU time.
Find I/O bottlenecks:
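clinic bubbleprof -- node server.js
Bubbleprof visualizes where time goes in asynchronous operations, grouping related async activity into bubbles so you can see which I/O paths add the most latency. To check whether the event loop itself is being blocked, run clinic doctor -- node server.js, which charts event loop delay alongside CPU and memory usage. A blocked event loop means I/O callbacks stack up, which causes latency spikes.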
Catching Performance Regressions in CI
Integrate k6 into your CI pipeline to catch regressions before they reach production:
# .github/workflows/performance.yml
name: Performance Tests
on: [push]
jobs:
  performance:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Start app
        run: docker compose up -d
      - name: Wait for app
        run: sleep 10
      - name: Run k6 load test
        uses: grafana/k6-action@v0.3.1
        with:
          filename: tests/load-test.js
With thresholds set in your k6 test file, the CI step fails if p95 latency exceeds your threshold — surfacing performance regressions in the same PR that introduced them.
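A fixed threshold catches a route that crosses a hard limit, but not one that drifts from 150ms to 400ms while staying under 500ms. One complementary approach, sketched here as an assumption and not as a k6 feature, compares the current p95 with a baseline from an earlier run. How the two values are obtained is left out (k6's --summary-export flag is one possible source), and the function name and 10% tolerance are illustrative.

```javascript
// Compare the current p95 with a stored baseline, allowing some
// run-to-run noise before calling it a regression.
function checkRegression(baselineMs, currentMs, tolerance = 0.1) {
  if (baselineMs <= 0) throw new RangeError("baseline must be positive");
  const allowedMs = baselineMs * (1 + tolerance);
  return {
    allowedMs,
    regressed: currentMs > allowedMs,
    changePercent: ((currentMs - baselineMs) * 100) / baselineMs,
  };
}

// Hypothetical values: a 200ms baseline p95 and two candidate runs.
const withinNoise = checkRegression(200, 215);
const slower = checkRegression(200, 260);

console.log(withinNoise.regressed); // false
console.log(slower.regressed, slower.changePercent); // true 30

// In CI, a non-zero exit code fails the job.
if (slower.regressed) {
  process.exitCode = 1;
}
```

Setting process.exitCode instead of calling process.exit() lets any pending output flush before the process ends. The tolerance is a trade-off: too tight and noisy CI runners cause false failures, too loose and small regressions accumulate unnoticed.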