smartdevbox

How to Compare Prompts and LLM Responses

Find exactly what changed between two prompt versions or two model responses, without the formatting loss that comes from comparing them inside a chat interface.

Problem

Prompt edits are often small but consequential. A changed instruction, missing delimiter, or reordered example can change the output, and chat UIs are poor diff tools.

Solution

Normalize the prompt or response text, preview Markdown if needed, and use Split & Diff to compare versions side by side.

Workflow

  1. Keep versions separate
    Copy the old prompt or response and the new one as separate blocks. Include the system, developer, and user sections if they affect behavior.
  2. Normalize whitespace intentionally
    Trim, remove empty lines, or reformat only when whitespace is not meaningful to the prompt. Otherwise, preserve the exact text.
  3. Diff the two versions
    Use Split & Diff to review added, removed, and edited lines. This is especially helpful for long few-shot examples.
  4. Preview Markdown output
    If the model response is Markdown, use Markdown Preview to check that headings, lists, and code fences render correctly.
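
The diff step above can also be reproduced in a script when you want a record of what changed. The sketch below is not how Split & Diff works internally; it is a minimal stand-in that uses Python's standard `difflib` module. The function name `diff_versions` and the two sample prompts are assumptions made for illustration.

```python
import difflib


def diff_versions(old: str, new: str) -> list[str]:
    """Return a unified diff of two prompt versions, compared line by line."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",  # inputs have no trailing newlines, so emit none
        )
    )


# Hypothetical prompt versions: only the system instruction differs.
old_prompt = (
    "SYSTEM:\nYou are a concise assistant.\n\n"
    "USER:\nWrite examples for POST /orders."
)
new_prompt = (
    "SYSTEM:\nYou are a concise API documentation assistant.\n\n"
    "USER:\nWrite examples for POST /orders."
)

for line in diff_versions(old_prompt, new_prompt):
    print(line)
```

Lines prefixed with `-` exist only in the old version and lines prefixed with `+` only in the new one. Identical inputs produce an empty diff.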

Examples

Prompt sections worth labeling

Clear labels make prompt diffs easier to understand later.

SYSTEM:
You are a concise API documentation assistant.

USER:
Write examples for POST /orders.

CONSTRAINTS:
Return Markdown only.
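
Labels like these also make it possible to report which section changed before reading the full diff. The following is a rough sketch under the assumption that every label is an uppercase word on its own line ending in a colon, as in the example above. The helper names `split_sections` and `changed_sections` and the second prompt version are hypothetical.

```python
import re

# Assumption: a label is an uppercase word (or words) alone on a line, ending in ":".
LABEL = re.compile(r"^([A-Z][A-Z _]*):\s*$")


def split_sections(prompt: str) -> dict[str, str]:
    """Map each label to the text that follows it, up to the next label."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in prompt.splitlines():
        match = LABEL.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
        # Text before the first label is ignored in this sketch.
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def changed_sections(old: str, new: str) -> list[str]:
    """List the labels whose content differs, or that exist in only one version."""
    a, b = split_sections(old), split_sections(new)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


v1 = (
    "SYSTEM:\nYou are a concise API documentation assistant.\n\n"
    "USER:\nWrite examples for POST /orders.\n\n"
    "CONSTRAINTS:\nReturn Markdown only."
)
# Hypothetical second version with an extra constraint.
v2 = v1 + "\nInclude one curl example."

print(changed_sections(v1, v2))
```

A body line that happens to look like a label (for example `NOTE:` alone on a line) would be treated as a new section, so adjust the pattern to the labels you actually use.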

Checklist

  • Keep role labels when comparing multi-message prompts.
  • Preserve delimiters and code fences.
  • Compare outputs from the same input when evaluating model changes.
  • Use Markdown Preview for responses intended for documentation.
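
One way to act on the code-fence item in the checklist is a quick check for an unclosed fence, which often indicates a truncated response. This is a rough heuristic, not a Markdown parser: it assumes backtick fences only and ignores tilde fences and fences nested inside longer ones. The function names are illustrative.

```python
FENCE = "`" * 3  # built at runtime so this sample contains no literal fence


def fence_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines that open or close a backtick fence."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.lstrip().startswith(FENCE)
    ]


def has_unclosed_fence(text: str) -> bool:
    """An odd number of fence lines means one fence was never closed."""
    return len(fence_lines(text)) % 2 == 1


complete = "Intro\n" + FENCE + "json\n{}\n" + FENCE + "\nDone"
truncated = "Intro\n" + FENCE + "json\n{}"

print(fence_lines(complete), has_unclosed_fence(complete))
print(fence_lines(truncated), has_unclosed_fence(truncated))
```

Running the check on both the old and the new response shows whether a fence went missing between versions.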

Frequently Asked Questions

Should I remove whitespace before comparing prompts?

Only if whitespace is not semantically important. For prompts with code, tables, YAML, or Markdown, preserve formatting.
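
To illustrate the difference, the sketch below contrasts a conservative cleanup with an aggressive one on a small YAML fragment. The function names and the sample data are assumptions for illustration. Even the conservative version is not always safe: in Markdown, two trailing spaces at the end of a line create a hard line break.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing spaces and tabs per line; keep indentation and blank lines.

    The final newline of the input, if any, is dropped by splitlines/join.
    """
    return "\n".join(line.rstrip(" \t") for line in text.splitlines())


def collapse_all_whitespace(text: str) -> str:
    """Collapse every run of whitespace to one space (destroys structure)."""
    return " ".join(text.split())


# Hypothetical YAML fragment with stray trailing spaces.
yaml_snippet = "order:  \n  id: 42\n  items:\n    - sku: A1   \n"

print(strip_trailing_whitespace(yaml_snippet))
print(collapse_all_whitespace(yaml_snippet))
```

The first result keeps the nesting intact, so a diff against another version stays meaningful. The second flattens the fragment to a single line, which is no longer the same YAML.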

Can I compare JSON responses from an LLM?

Yes. Format both JSON outputs first, then diff the formatted results for a cleaner comparison.
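
As a minimal sketch of that approach, the example below formats both responses with Python's standard `json` module and then diffs them with `difflib`. Sorting keys is an extra choice made here so that key order alone does not show up as a change; skip it if order matters in your evaluation. The function names and sample payloads are hypothetical.

```python
import difflib
import json


def canonical_json(raw: str) -> list[str]:
    """Parse a JSON string and re-serialize it with fixed indentation and sorted keys."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True).splitlines()


def diff_json(old_raw: str, new_raw: str) -> list[str]:
    """Unified diff of two JSON documents after canonical formatting."""
    return list(
        difflib.unified_diff(
            canonical_json(old_raw),
            canonical_json(new_raw),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


# Hypothetical responses: keys are reordered and one item is added.
old_response = '{"status": "ok", "items": [1, 2]}'
new_response = '{"items": [1, 2, 3], "status": "ok"}'

for line in diff_json(old_response, new_response):
    print(line)
```

If the model returns invalid JSON, `json.loads` raises `json.JSONDecodeError`, which is itself a useful signal when comparing responses.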