Prompting for Code Review: Getting Useful Feedback, Not Compliments

How to write prompts that produce actionable code review - specifying focus areas, severity tiers, concrete fix examples, adversarial review framing, and the difference between diff review and full-file review.

Mahmudul Haque Qudrati

CEO & ML Engineer

May 18, 2026

8 min read

// tags

#code-review #prompt-engineering #security #llm



Severity Tiers

Without severity labels, every observation looks equally important. A missing semicolon and a SQL injection vulnerability appear in the same list with the same presentation. Require explicit severity tiers:

For each issue you find, label it with one of:
- MUST FIX: Will cause a bug, security vulnerability, data loss, or crash in production
- CONSIDER: Not a bug today but will likely cause problems as the codebase grows
- OPTIONAL: Minor improvement; reasonable engineers could disagree on this

Format each finding as:
[SEVERITY] Issue description
Why: One sentence explanation of the actual risk
Fix: Specific code change, not general advice

The "Fix: specific code change" instruction is critical. Without it, you get "consider adding validation" rather than the actual validation code. Asking for specific fixes forces the model to commit to a concrete recommendation.

Ask for Specific Fix Examples

Explicit fix code is the difference between review that requires a second conversation and review you can act on immediately:

For every MUST FIX issue, provide:
1. The problematic code (quoted directly)
2. The corrected code
3. One sentence explaining why the fix works

Do not provide general guidance. Provide the actual corrected code.

This instruction makes the output far more usable. The model can usually work out a fix for the issues it reports - the prompt just forces it to show its work instead of gesturing at a solution.
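Because the prompt asks for the problematic code "quoted directly", it is possible to check that requirement after the fact. The sketch below is one hypothetical way to do so: `quote_is_verbatim` is an illustrative name, and the whitespace-insensitive comparison is an assumption about what counts as a direct quote.

```python
def _normalize(code: str) -> list:
    """Strip each line and drop blank lines so indentation differences do not matter."""
    return [line.strip() for line in code.splitlines() if line.strip()]


def quote_is_verbatim(source: str, quoted: str) -> bool:
    """Return True if the quoted snippet appears as consecutive lines in source."""
    src = _normalize(source)
    snippet = _normalize(quoted)
    if not snippet:
        return False  # an empty quote proves nothing
    width = len(snippet)
    # Slide a window of the snippet's length over the source lines.
    return any(src[i:i + width] == snippet for i in range(len(src) - width + 1))
```

A finding whose quoted code fails this check may describe code that is not actually in the file, which is a reasonable signal to re-run the review or inspect that finding by hand.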

Adversarial Review Framing

The most effective frame for security-focused review is adversarial. Ask the model to find every possible issue rather than to give a balanced assessment:

You are a security researcher performing a security review of this code before it handles production user data. Your job is to find every possible vulnerability, attack vector, or dangerous assumption. Be adversarial. Assume the caller is malicious. Assume the environment is hostile.

Do not soften findings. Do not say "consider" for things that are actual vulnerabilities. If something is exploitable, say it is exploitable and explain how.

Code:
[code here]

The framing shift - from "helpful reviewer" to "adversarial security researcher" - tends to produce more thorough security findings because it counteracts the model's default tendency toward balanced, non-alarming assessments.

For general correctness review, a similar frame works:

Your goal is to find every way this code could fail in production. Assume edge cases are common, not rare. Assume external services fail. Assume inputs are malformed. Assume concurrent access. What breaks?
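If both frames are used regularly, it can help to keep them as named templates so the wording stays consistent between reviews. This is a minimal sketch that reuses the prompt text above verbatim; the function name `build_adversarial_prompt` and the frame keys `"security"` and `"correctness"` are hypothetical.

```python
SECURITY_FRAME = (
    "You are a security researcher performing a security review of this code "
    "before it handles production user data. Your job is to find every possible "
    "vulnerability, attack vector, or dangerous assumption. Be adversarial. "
    "Assume the caller is malicious. Assume the environment is hostile.\n\n"
    'Do not soften findings. Do not say "consider" for things that are actual '
    "vulnerabilities. If something is exploitable, say it is exploitable and "
    "explain how."
)

CORRECTNESS_FRAME = (
    "Your goal is to find every way this code could fail in production. "
    "Assume edge cases are common, not rare. Assume external services fail. "
    "Assume inputs are malformed. Assume concurrent access. What breaks?"
)

FRAMES = {"security": SECURITY_FRAME, "correctness": CORRECTNESS_FRAME}


def build_adversarial_prompt(code: str, frame: str = "security") -> str:
    """Prepend the chosen adversarial frame to the code under review."""
    if frame not in FRAMES:
        raise ValueError(f"unknown frame: {frame!r}")
    return f"{FRAMES[frame]}\n\nCode:\n{code}"
```

Calling `build_adversarial_prompt(source, frame="correctness")` produces the general-correctness variant; the returned string is then sent to whichever model client you use.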

Diff Review vs Full-File Review

The right scope depends on what you are reviewing.

Diff review is appropriate for PRs and incremental changes. It limits the model's attention to what actually changed, which produces more relevant feedback:

Review this diff. Focus only on the changed lines (lines marked with + in the diff). Do not comment on unchanged code.

Specifically:
- Are the new changes correct?
- Do the new changes introduce any regressions?
- Are there edge cases in the changed logic that are not handled?

Diff:
[git diff output]

Full-file review is appropriate when you want to review an entire module or when the change is too large to isolate:

Review this entire file. Assume it will be deployed to production and must be production-ready.

Focus on:
- Correctness of the core logic
- Error handling completeness
- Any assumption that breaks under load or with malicious input

Do not mix the two. Pasting a whole file and asking whether "the changes" are good, without showing which lines changed, leaves the model unsure what to focus on.
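When the two versions of a file are available as strings rather than as a git checkout, the diff for a diff-review prompt can be produced with Python's standard `difflib` module. The sketch below assumes a single file; `make_diff` and `build_diff_prompt` are hypothetical helper names, and the prompt wording is copied from the diff-review prompt above.

```python
import difflib


def make_diff(old: str, new: str, filename: str = "file") -> str:
    """Return a unified diff between two versions of one file."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
        lineterm="",  # inputs have no trailing newlines, so emit none
    )
    return "\n".join(lines)


def build_diff_prompt(diff: str) -> str:
    """Wrap a diff in the diff-review instructions."""
    return (
        "Review this diff. Focus only on the changed lines (lines marked with + "
        "in the diff). Do not comment on unchanged code.\n\n"
        "Specifically:\n"
        "- Are the new changes correct?\n"
        "- Do the new changes introduce any regressions?\n"
        "- Are there edge cases in the changed logic that are not handled?\n\n"
        f"Diff:\n{diff}"
    )
```

In a normal pull-request workflow the output of `git diff` can be passed to `build_diff_prompt` directly; `make_diff` is only needed when no repository is at hand.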

Language and Framework-Specific Review

Generic review prompts miss language-specific footguns. Add explicit language context:

This is a Node.js async function using the MongoDB driver. Specifically check for:
- Unhandled promise rejections (missing try/catch or .catch())
- Callback vs promise API mixing (the MongoDB driver has both)
- Missing awaits before async operations
- Memory leaks from unclosed cursors or connections

For memory-unsafe languages like C and C++:

This is C code that processes untrusted input. Check specifically for:
- Buffer overflows (strcpy, sprintf without bounds)
- Integer overflow before allocation
- Use-after-free patterns
- Missing NULL checks after malloc

Language-specific checklists produce far more relevant findings than generic review prompts.
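The checklists above can be stored once and selected automatically. The following sketch picks a checklist by file extension; the mapping from `.js` to the Node.js/MongoDB checklist and from `.c`/`.h` to the C checklist is an assumption made for illustration, and `language_context` is a hypothetical name.

```python
from pathlib import Path

# Checklist text taken from the examples above. The extension mapping is an
# illustrative assumption: adjust it to the stacks your team actually uses.
CHECKLISTS = {
    ".js": (
        "This is a Node.js async function using the MongoDB driver. "
        "Specifically check for:",
        [
            "Unhandled promise rejections (missing try/catch or .catch())",
            "Callback vs promise API mixing (the MongoDB driver has both)",
            "Missing awaits before async operations",
            "Memory leaks from unclosed cursors or connections",
        ],
    ),
    ".c": (
        "This is C code that processes untrusted input. Check specifically for:",
        [
            "Buffer overflows (strcpy, sprintf without bounds)",
            "Integer overflow before allocation",
            "Use-after-free patterns",
            "Missing NULL checks after malloc",
        ],
    ),
}
CHECKLISTS[".h"] = CHECKLISTS[".c"]


def language_context(filename: str) -> str:
    """Return the language-specific checklist for a file, or '' if none is known."""
    entry = CHECKLISTS.get(Path(filename).suffix.lower())
    if entry is None:
        return ""
    intro, checks = entry
    return "\n".join([intro] + [f"- {check}" for check in checks])
```

The returned text is meant to be prepended to the review prompt; an empty string signals that the generic prompt should be used as-is.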

Handling Long Code Files

For files over ~500 lines, the model's attention tends to degrade toward the end of the file. Two approaches help:

Section-by-section review: Split the file into logical sections and review each separately. "Review only the authentication middleware (lines 45-130)."

Targeted review: If you know what concerns you, focus there. "I'm most concerned about the transaction logic in the processPayment function (lines 200-280). Review that section in depth."

For very large codebases, targeted review beats comprehensive review for finding real issues. The model's attention is a limited resource - spend it where it matters.
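Finding the line ranges for section-by-section or targeted review does not have to be manual. As a minimal sketch for Python source files only, the standard `ast` module reports where each top-level function or class starts and ends; `top_level_sections` and `section_prompt` are hypothetical names, and other languages would need their own parser.

```python
import ast


def top_level_sections(source: str) -> list:
    """Return (name, start_line, end_line) for each top-level function or class."""
    tree = ast.parse(source)
    sections = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-indexed (end_lineno needs Python 3.8+).
            sections.append((node.name, node.lineno, node.end_lineno))
    return sections


def section_prompt(source: str, name: str) -> str:
    """Build a targeted review prompt for one named top-level section."""
    for sec_name, start, end in top_level_sections(source):
        if sec_name == name:
            body = "\n".join(source.splitlines()[start - 1:end])
            return f"Review only {name} (lines {start}-{end}).\n\nCode:\n{body}"
    raise KeyError(name)
```

Looping over `top_level_sections` gives section-by-section review; calling `section_prompt` with a single name gives targeted review of the part you are worried about.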

Keep Reading

Prompt Testing Methodology Guide - tracking whether your review prompts find real bugs over time
System Prompt Guide with Examples - setting up a persistent code review persona
The Complete Prompt Engineering Guide (2026) - foundational techniques applied to technical tasks

