How to Compare Prompts and LLM Responses

Find exactly what changed between prompt versions or model responses without losing formatting in a chat interface.

Problem

Prompt edits are often small but consequential. A changed instruction, a missing delimiter, or a reordered example can change the output, and chat UIs make it hard to see exactly what differs between two versions.

Solution

Normalize the prompt or response text, preview Markdown if needed, and use Split & Diff to compare versions side by side.

Workflow

1. Keep versions separate
Copy the old prompt or response and the new one as separate blocks. Include system, developer, and user sections if they affect behavior.
2. Normalize whitespace intentionally
Use trim, remove empty lines, or formatting tools only when whitespace is not meaningful to the prompt. Otherwise preserve exact text.
3. Diff the two versions
Use Split & Diff to review line additions, removals, and edits. This is especially helpful for long few-shot examples.
4. Preview Markdown output
If the model response is Markdown, use Markdown Preview to check whether headings, lists, and code fences render correctly.
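
If you want to reproduce the diff step in a script, a minimal sketch with Python's standard difflib module might look like the following. It is not how Split & Diff works internally; the function name diff_prompts, the file labels, and the sample prompts are assumptions made for illustration.

```python
import difflib


def diff_prompts(old: str, new: str) -> list[str]:
    """Return unified-diff lines for two prompt versions, leaving the text untouched."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="prompt_v1",  # hypothetical labels for the two versions
            tofile="prompt_v2",
            lineterm="",  # inputs have no trailing newlines, so add none to headers
        )
    )


# Hypothetical sample prompts: the new version adds one constraint line.
old_prompt = (
    "SYSTEM:\n"
    "You are a concise API documentation assistant.\n"
    "\n"
    "CONSTRAINTS:\n"
    "Return Markdown only."
)
new_prompt = old_prompt + "\nInclude one curl example."

for line in diff_prompts(old_prompt, new_prompt):
    print(line)
```

Lines starting with "+" were added and lines starting with "-" were removed; unchanged context lines start with a space.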

Examples

Prompt sections worth labeling

Clear labels make prompt diffs easier to understand later.

SYSTEM:
You are a concise API documentation assistant.

USER:
Write examples for POST /orders.

CONSTRAINTS:
Return Markdown only.
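
Labels like these also make it possible to report which section changed before reading the full diff. The sketch below assumes each label sits alone on a line in upper case followed by a colon, as in the example above; the helper names split_sections and changed_sections are hypothetical.

```python
import re

# Assumption: a label is an upper-case word (or words) alone on a line, ending in ":".
LABEL = re.compile(r"^([A-Z][A-Z_ ]*):\s*$")


def split_sections(prompt: str) -> dict[str, str]:
    """Map each label to the text below it. Text before the first label is ignored,
    and a repeated label overwrites the earlier section."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in prompt.splitlines():
        match = LABEL.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def changed_sections(old: str, new: str) -> list[str]:
    """Return the labels whose text differs, including added or removed sections."""
    a, b = split_sections(old), split_sections(new)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


old = "SYSTEM:\nYou are a concise API documentation assistant.\n\nCONSTRAINTS:\nReturn Markdown only."
new = "SYSTEM:\nYou are a concise API documentation assistant.\n\nCONSTRAINTS:\nReturn Markdown with one code fence."
print(changed_sections(old, new))
```

A body line that happens to look like a label would start a new section, so treat the result as a hint about where to look rather than a replacement for the line-level diff.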

Checklist

Keep role labels when comparing multi-message prompts.
Preserve delimiters and code fences.
Compare outputs from the same input when evaluating model changes.
Use Markdown Preview for responses intended for documentation.

Tools Used

Split & Diff: Compare prompt or response versions.
Markdown Preview: Preview Markdown responses.
JSON Formatter: Normalize JSON outputs before diffing.

Frequently Asked Questions

Should I remove whitespace before comparing prompts?

Only if whitespace is not semantically important. For prompts with code, tables, YAML, or Markdown, preserve formatting.
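
When whitespace is safe to normalize, one conservative approach is to strip only trailing whitespace and surrounding blank lines, and to make removal of empty lines opt-in. This is a sketch of that idea; the function name normalize_whitespace and its drop_empty_lines parameter are assumptions, not part of any tool mentioned here.

```python
def normalize_whitespace(text: str, drop_empty_lines: bool = False) -> str:
    """Strip trailing whitespace per line and blank lines at both ends.

    Leading indentation is kept, since it carries meaning in YAML, code, and
    nested Markdown lists. Interior empty lines are removed only on request.
    """
    lines = [line.rstrip() for line in text.splitlines()]
    if drop_empty_lines:
        lines = [line for line in lines if line]
    else:
        # Remove blank lines only at the start and end of the text.
        while lines and not lines[0]:
            lines.pop(0)
        while lines and not lines[-1]:
            lines.pop()
    return "\n".join(lines)


sample = "\n  key: value  \n\n    nested: 1\t\n\n"
print(repr(normalize_whitespace(sample)))
print(repr(normalize_whitespace(sample, drop_empty_lines=True)))
```

Apply the same normalization to both versions before diffing; normalizing only one side produces differences that are not real edits.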

Can I compare JSON responses from an LLM?

Yes. Format both JSON outputs first, then diff the formatted results for a cleaner comparison.
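
As a rough illustration of formatting before diffing, the sketch below parses each response with Python's json module, re-serializes it with fixed indentation, and then diffs the result. Sorting keys is an assumption that key order does not matter for your comparison; drop sort_keys if it does. The names format_json and diff_json are hypothetical.

```python
import difflib
import json


def format_json(raw: str) -> str:
    """Re-serialize JSON with stable indentation. Raises json.JSONDecodeError
    if the model output is not valid JSON."""
    # sort_keys=True treats key order as noise (an assumption for this sketch).
    return json.dumps(json.loads(raw), indent=2, sort_keys=True, ensure_ascii=False)


def diff_json(old_raw: str, new_raw: str) -> list[str]:
    """Return unified-diff lines between two formatted JSON documents."""
    return list(
        difflib.unified_diff(
            format_json(old_raw).splitlines(),
            format_json(new_raw).splitlines(),
            fromfile="response_a",  # hypothetical labels
            tofile="response_b",
            lineterm="",
        )
    )


# Same data, different spacing and key order: no diff lines are produced.
print(diff_json('{"status":"ok","items":[1,2]}', '{"items": [1, 2], "status": "ok"}'))

# A real value change shows up as one removed and one added line.
for line in diff_json('{"status":"ok"}', '{"status":"error"}'):
    print(line)
```

Without the formatting step, two responses that differ only in spacing or key order would appear as changed even though the data is identical.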