The difference between a useful AI response and a generic one is almost always the prompt. This guide covers 8 techniques that work across GPT, Claude, Gemini, and open-source models - with before/after examples for each.
Roughly ordered from easiest to most advanced. Start with the beginner techniques - they deliver the biggest improvement for the least effort.
| Technique | Level |
|---|---|
| Be Specific | Beginner |
| Few-Shot Examples | Beginner |
| Chain-of-Thought | Intermediate |
| Role Assignment | Beginner |
| Structured Output | Intermediate |
| Constraint Setting | Intermediate |
| Self-Critique | Advanced |
| Decomposition | Advanced |
Replace vague instructions with precise requirements. Specify format, length, audience, and constraints upfront.
**Before:** Write about machine learning.

**After:** Write a 200-word explanation of gradient descent for a software engineer who has never studied ML. Use a hiking analogy. No math notation.
Models generate text by predicting likely continuations. A specific prompt narrows the prediction space, so the model does not have to guess what you want.
Include 2-5 examples of input/output pairs before your actual request. The model learns the pattern from your examples.
**Before:** Classify this review as positive or negative: "The battery lasts forever but the screen is dim."

**After:**

Classify each review.

Review: "Amazing sound quality, worth every penny." Label: positive

Review: "Broke after two weeks, terrible build." Label: negative

Review: "The battery lasts forever but the screen is dim." Label:
Few-shot examples set a clear pattern the model can follow. This is more reliable than describing the task because you are showing rather than telling.
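The pattern above can be assembled programmatically. A minimal sketch in plain Python (the helper name and prompt wording are illustrative, not from any particular library):

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a few-shot classification prompt from labeled examples.

    task: one-line task description.
    examples: list of (input_text, label) pairs demonstrating the pattern.
    query: the new input the model should label.
    """
    parts = [task]
    for text, label in examples:
        parts.append(f'Review: "{text}"\nLabel: {label}')
    # Leave the final label blank so the model completes the pattern.
    parts.append(f'Review: "{query}"\nLabel:')
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    "Classify each review.",
    [("Amazing sound quality, worth every penny.", "positive"),
     ("Broke after two weeks, terrible build.", "negative")],
    "The battery lasts forever but the screen is dim.",
)
```

Ending the prompt with a bare `Label:` is the key detail: the model's most likely continuation is exactly the label you want.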
Ask the model to think step-by-step before giving its final answer. This forces it to show intermediate reasoning rather than jumping to conclusions.
**Before:** A store has 45 apples. They sell 3/5 of them, then receive 20 more. How many apples do they have?

**After:** A store has 45 apples. They sell 3/5 of them, then receive 20 more. How many apples do they have? Think through this step-by-step before giving your final answer.
Token-by-token generation means the model can use earlier tokens as "working memory." Without chain-of-thought, it must compute the answer in a single forward pass, which fails for multi-step problems.
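One practical wrinkle with chain-of-thought is separating the reasoning from the answer when you consume the response in code. A common pattern (the marker text here is an assumption, not a standard) is to ask the model to end with a `Final answer:` line and parse for it:

```python
COT_SUFFIX = (
    "\n\nThink through this step-by-step, then give your final answer "
    "on a line starting with 'Final answer:'."
)

def extract_final_answer(response: str) -> str:
    """Return the text after the last 'Final answer:' marker,
    or the whole response if the marker is missing."""
    marker = "Final answer:"
    idx = response.rfind(marker)
    if idx == -1:
        return response.strip()
    return response[idx + len(marker):].strip()

reply = "3/5 of 45 is 27 sold, so 18 remain. 18 + 20 = 38.\nFinal answer: 38"
answer = extract_final_answer(reply)  # → "38"
```

This lets the model keep its "working memory" in the visible reasoning while your code only acts on the final line.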
Tell the model to act as a specific expert. This primes it to use domain-appropriate vocabulary, reasoning patterns, and depth.
**Before:** Is this SQL query efficient?

**After:** You are a senior database engineer with 15 years of PostgreSQL experience. Review this SQL query for performance issues. Focus on indexing, joins, and query plan optimization.
Role assignment steers the model toward the vocabulary, reasoning patterns, and depth associated with that persona in its training data. A "database engineer" framing draws on text written by and for database experts, producing more specialized and accurate responses.
Request output in a specific format (JSON, markdown table, numbered list). Include the exact schema or template you want.
**Before:** Extract the key info from this job posting.

**After:** Extract information from this job posting and return it as JSON with these exact fields: `{ "title": string, "company": string, "salary_range": string | null, "remote": boolean, "required_skills": string[] }`
Models that score well on IFEval (instruction following) reliably produce structured output. Providing the exact schema removes ambiguity about field names, types, and nesting.
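Requesting an exact schema also makes the response machine-checkable. A minimal sketch of validating the model's reply against the fields above (the function and field map are illustrative, assuming the job-posting schema from the example):

```python
import json

# Expected field -> allowed type(s), mirroring the schema in the prompt.
EXPECTED_FIELDS = {
    "title": str,
    "company": str,
    "salary_range": (str, type(None)),  # string | null
    "remote": bool,
    "required_skills": list,
}

def parse_job_posting(raw: str) -> dict:
    """Parse the model's JSON reply and verify every expected field
    is present with an allowed type; raise ValueError otherwise."""
    data = json.loads(raw)
    for field, allowed in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], allowed):
            raise ValueError(f"wrong type for field: {field}")
    return data

reply = ('{"title": "Data Engineer", "company": "Acme", '
         '"salary_range": null, "remote": true, '
         '"required_skills": ["SQL", "Python"]}')
job = parse_job_posting(reply)
```

If validation fails, you can feed the error message back to the model and ask it to correct the output.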
Explicitly state what the model should NOT do. Constraints are as important as instructions for controlling output quality.
**Before:** Explain quantum computing.

**After:** Explain quantum computing for a curious 14-year-old. Constraints:

- No equations or mathematical notation
- No analogies involving cats (Schrödinger has been overdone)
- Under 300 words
- End with one question that makes them want to learn more
Without constraints, models default to their most common training patterns (which include overused analogies and verbose explanations). Constraints force creative alternatives.
Ask the model to generate a response, then review its own output for errors or improvements. Two passes produce better results than one.
**Before:** Write a function to validate email addresses.

**After:** Write a function to validate email addresses in Python. After writing it, review your own code for:

1. Edge cases it misses
2. RFC 5322 compliance issues
3. Performance with large inputs

Then provide an improved version addressing any issues found.
Generation and evaluation use different reasoning pathways. Models are often better at spotting errors in existing text than avoiding them during generation. The second pass catches mistakes the first pass introduced.
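The two-pass pattern also works as a wrapper in code. A minimal sketch, where `ask` stands in for any prompt-to-text model call (the function names and prompt wording are assumptions, not a real API):

```python
def self_critique(ask, task: str) -> str:
    """Two-pass pattern: generate a draft, then ask the model to
    review and improve its own output. `ask` is any prompt -> text
    callable (a placeholder for your model API call)."""
    draft = ask(task)
    review_prompt = (
        f"Task: {task}\n\nDraft response:\n{draft}\n\n"
        "Review the draft for errors, missed edge cases, and unclear "
        "wording, then provide an improved version."
    )
    return ask(review_prompt)

# Stub model for illustration only: answers deterministically.
def fake_model(prompt: str) -> str:
    return "improved" if "Review the draft" in prompt else "draft"

result = self_critique(fake_model, "Write a function to validate email addresses.")
```

The stub just demonstrates the control flow; with a real model call, the second pass receives both the task and the draft, which is what lets it catch first-pass mistakes.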
Break a large task into smaller subtasks. Handle each subtask separately, then combine results. Works better than asking for everything at once.
**Before:** Analyze this 50-page contract and summarize all risks, obligations, and deadlines.

**After:** I will send you a contract in sections. For each section:

1. List any obligations for our company
2. Flag any risks or unusual clauses
3. Extract any deadlines or dates

After all sections, I will ask you to compile a final summary.
Context windows have attention limits. Even models with 128K+ context perform worse on information in the middle of long inputs. Decomposition ensures each subtask gets focused attention.
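The section-by-section workflow can be automated with a simple map-then-reduce loop. A minimal sketch, where `ask` again stands in for any prompt-to-text model call and the chunking by character count is a simplification (real splitting would follow section boundaries):

```python
def analyze_contract(ask, contract: str, chunk_size: int = 2000) -> str:
    """Decomposition pattern: split a long document into chunks,
    analyze each chunk separately, then ask for a combined summary.
    `ask` is any prompt -> text callable (a model API placeholder)."""
    chunks = [contract[i:i + chunk_size]
              for i in range(0, len(contract), chunk_size)]
    section_notes = []
    for n, chunk in enumerate(chunks, 1):
        notes = ask(
            f"Section {n} of a contract:\n{chunk}\n\n"
            "1. List any obligations for our company.\n"
            "2. Flag any risks or unusual clauses.\n"
            "3. Extract any deadlines or dates."
        )
        section_notes.append(f"Section {n} notes:\n{notes}")
    return ask("Compile a final summary from these section notes:\n\n"
               + "\n\n".join(section_notes))

# Stub model for illustration: returns a fixed acknowledgement.
def fake_model(prompt: str) -> str:
    return "ok"

summary = analyze_contract(fake_model, "lorem ipsum " * 400)
```

Each per-section call sees only one chunk, so the model's attention stays focused; only the compact notes, not the full text, go into the final summarization call.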
The core techniques (specificity, few-shot, chain-of-thought) work across all major models. But each model has quirks. Claude responds well to detailed system prompts. GPT models handle structured output reliably. Open-source models may need more explicit formatting instructions. Always test your prompts on the specific model you are deploying.
More relevant than ever. As models become more capable, the gap between a mediocre prompt and an optimized one grows wider. A well-crafted prompt can get GPT-5.4 or Claude Opus to produce work that would require a much more expensive model with a generic prompt. Prompt engineering is now a recognized skill in job listings.
For most tasks, 2-5 examples hit the sweet spot. One example is often not enough to establish a pattern. More than 5 rarely improves results and wastes context window. For complex classification with many categories, you may need one example per category. Always include edge cases in your examples.
Use system prompts for persistent instructions (role, constraints, output format) and user prompts for the specific task. System prompts are given higher priority by most models and persist across conversation turns. Not all API providers support system prompts - check your model documentation.
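In chat-style APIs this split usually takes the form of a message list. A minimal sketch (field names follow the common `role`/`content` convention; exact names vary by provider, so check your API documentation):

```python
def build_messages(system_prompt: str, user_prompt: str) -> list:
    """Chat-style message list: persistent instructions (role,
    constraints, output format) go in the system message; the
    specific task goes in the user message."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You are a senior database engineer. Answer concisely in markdown.",
    "Review this SQL query for performance issues: SELECT * FROM orders;",
)
```

On subsequent turns you append new user and assistant messages while the system message stays in place, which is how its instructions persist across the conversation.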
Being vague. "Write something good about X" will always produce generic output. The most impactful improvement is being specific about format, audience, length, constraints, and success criteria. Think of it this way: if two reasonable people could interpret your prompt differently, it needs to be more specific.