How to Compare Prompts and LLM Responses
Find exactly what changed between prompt versions or model responses without losing formatting in a chat interface.
Problem
Prompt edits are often small but consequential. A changed instruction, missing delimiter, or reordered example can change the output, and chat UIs are poor diff tools.
Solution
Normalize the prompt or response text, preview Markdown if needed, and use Split & Diff to compare versions side by side.
Workflow
1. Keep versions separate
Copy the old prompt or response and the new one as separate blocks. Include system, developer, and user sections if they affect behavior.
2. Normalize whitespace intentionally
Use trim, remove empty lines, or formatting tools only when whitespace is not meaningful to the prompt. Otherwise preserve exact text.
3. Diff the two versions
Use Split & Diff to review line additions, removals, and edits. This is especially helpful for long few-shot examples.
4. Preview Markdown output
If the model response is Markdown, use Markdown Preview to check whether headings, lists, and code fences render correctly.
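For readers who want to reproduce the diff step locally, here is a minimal sketch using Python's standard `difflib` module. It is a stand-in for illustration, not how Split & Diff is implemented. The function names `normalize` and `diff_prompts`, the `trim` and `drop_blank` flags, and the `prompt_v1`/`prompt_v2` labels are all hypothetical. Both normalization options default to off, so the exact text is preserved unless you opt in.

```python
import difflib


def normalize(text: str, trim: bool = False, drop_blank: bool = False) -> list[str]:
    """Split text into lines, optionally normalizing whitespace.

    Both options are off by default, so the exact text is kept
    unless the caller decides whitespace is not meaningful.
    """
    lines = text.splitlines()
    if trim:
        lines = [line.strip() for line in lines]
    if drop_blank:
        lines = [line for line in lines if line.strip()]
    return lines


def diff_prompts(old: str, new: str, trim: bool = False, drop_blank: bool = False) -> list[str]:
    """Return unified-diff lines between two prompt versions."""
    return list(
        difflib.unified_diff(
            normalize(old, trim, drop_blank),
            normalize(new, trim, drop_blank),
            fromfile="prompt_v1",  # hypothetical labels for the two versions
            tofile="prompt_v2",
            lineterm="",
        )
    )


old_prompt = """SYSTEM:
You are a concise API documentation assistant.
USER:
Write examples for POST /orders.
"""

new_prompt = """SYSTEM:
You are a concise API documentation assistant.
USER:
Write examples for POST /orders.
CONSTRAINTS:
Return Markdown only.
"""

for line in diff_prompts(old_prompt, new_prompt):
    print(line)
```

Lines starting with `+` were added in the new version and lines starting with `-` were removed. Keeping normalization opt-in mirrors step 2: the caller decides whether whitespace matters for a given prompt.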
Examples
Prompt sections worth labeling
Clear labels make prompt diffs easier to understand later.
SYSTEM:
You are a concise API documentation assistant.
USER:
Write examples for POST /orders.
CONSTRAINTS:
Return Markdown only.
Checklist
- Keep role labels when comparing multi-message prompts.
- Preserve delimiters and code fences.
- Compare outputs from the same input when evaluating model changes.
- Use Markdown Preview for responses intended for documentation.
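One quick way to catch a broken code fence before previewing a response is to count the fence lines. The sketch below is a rough heuristic with a hypothetical helper name, `has_unclosed_fence`. It assumes fences are written with three backticks at the start of a line and ignores tilde fences and nested or longer fences, so it does not replace a real Markdown preview.

```python
# Built indirectly so this snippet can itself sit inside a fenced block.
FENCE = "`" * 3


def has_unclosed_fence(markdown: str) -> bool:
    """Return True when the text contains an odd number of fence lines.

    Heuristic only: every line whose first non-space characters are
    three backticks is treated as an opening or closing fence.
    """
    fence_lines = [
        line for line in markdown.splitlines() if line.lstrip().startswith(FENCE)
    ]
    return len(fence_lines) % 2 == 1


broken = FENCE + "python\nprint('hello')\n"
fixed = broken + FENCE + "\n"

print(has_unclosed_fence(broken))
print(has_unclosed_fence(fixed))
```

An odd count usually means the model opened a code block and never closed it, which is worth spotting before the response goes into documentation.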
Tools Used
- Split & Diff: Compare prompt or response versions.
- Markdown Preview: Preview Markdown responses.
- JSON Formatter: Normalize JSON outputs before diffing.
Frequently Asked Questions
Should I remove whitespace before comparing prompts?
Only if whitespace is not semantically important. For prompts with code, tables, YAML, or Markdown, preserve formatting.
Can I compare JSON responses from an LLM?
Yes. Format both JSON outputs first, then diff the formatted results for a cleaner comparison.
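The format-then-diff idea can be sketched with Python's standard `json` and `difflib` modules. This is an illustration rather than what JSON Formatter does internally. The names `format_json` and `diff_json` and the sample responses are made up for the example. The sketch also sorts keys, which assumes key order is not meaningful in your outputs; drop `sort_keys=True` if it is.

```python
import difflib
import json


def format_json(raw: str) -> list[str]:
    """Parse a JSON string and re-serialize it with stable formatting."""
    parsed = json.loads(raw)
    # sort_keys is an assumption: it hides differences in key order only.
    formatted = json.dumps(parsed, indent=2, sort_keys=True, ensure_ascii=False)
    return formatted.splitlines()


def diff_json(old_raw: str, new_raw: str) -> list[str]:
    """Return unified-diff lines between two formatted JSON documents."""
    return list(
        difflib.unified_diff(
            format_json(old_raw),
            format_json(new_raw),
            fromfile="response_a",  # hypothetical labels
            tofile="response_b",
            lineterm="",
        )
    )


# Made-up sample responses: same status, different items, different key order.
response_a = '{"status": "ok", "items": [1, 2]}'
response_b = '{"items": [1, 2, 3], "status": "ok"}'

for line in diff_json(response_a, response_b):
    print(line)
```

Because both documents are serialized the same way, the diff shows only the changed `items` values and not the reordered keys or the original spacing.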