# Strategy 4 — Seed-Derived "Moves" Prompt Augmentation

## Hypothesis

The simple_writer system prompt uses three few-shot excerpts to convey voice
but does not explicitly name the writer's structural moves. If we distil those
moves from the Pete Nicholas seed corpus and state them in the prompt as
explicit rules for a single holistic rewrite pass with
`anthropic/claude-opus-4.7`, we should get a tighter, more visibly Pete-ish
article without losing content.

## Method

1. Mine 10 named moves from the seed corpus (see `derived_moves.md`).
2. For each of 3 source articles already produced by simple_writer:
   - Strip metadata; send the article and the moves catalogue to
     `anthropic/claude-opus-4.7` with instructions to rewrite it, preserving
     content but applying the moves deliberately.
   - Accept the rewrite only if its word count is at least 90% of the input's
     AND its slop rate rises by no more than 3 per 1,000 words.
   - Run `simple_writer.pipeline.editor_pass` once for the final scrub.
3. A hostile judge (claude-sonnet-4.6) scores BEFORE and AFTER on voice, prose,
   and argument fidelity, then picks a winner.
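
The acceptance gate in step 2 can be sketched as a small predicate. This is a minimal illustration, not the pipeline's actual code: the function name `accept_rewrite` and its parameters are hypothetical, and it assumes that slop is already expressed as a rate per 1,000 words and that "spike" means the absolute increase from before to after.

```python
def accept_rewrite(
    words_before: int,
    words_after: int,
    slop_before: float,
    slop_after: float,
    min_length_ratio: float = 0.90,   # output must keep at least 90% of the input length
    max_slop_increase: float = 3.0,   # allowed rise in slop hits per 1,000 words
) -> bool:
    """Return True if a rewrite passes both the length gate and the slop gate.

    Hypothetical helper: the thresholds come from the Method section, but the
    name and signature are assumptions made for this sketch.
    """
    if words_before <= 0:
        raise ValueError("words_before must be positive")
    length_ok = words_after >= min_length_ratio * words_before
    slop_ok = (slop_after - slop_before) <= max_slop_increase
    return length_ok and slop_ok


# Figures for the first article in the results table.
print(accept_rewrite(2677, 2660, 0.0, 0.0))
```

Under these assumptions all three rewrites in this experiment pass, since each stays within a few percent of its input length and none shows a rise in slop rate.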

## Results

| Article | Words (before → after) | Slop per 1k words (before → after) | Judge avg (before → after) | Winner | Assessment |
|---|---|---|---|---|---|
| `nando-de-freitas-unveils-interventional-sft-for-…` | 2677 → 2660 | 0.0 → 0.0 | 83.7 → 81.0 | **before** | hurt (judge prefers before, -2.7 avg) |
| `speaking-in-tongues-what-1-corinthians-12-and-14…` | 2926 → 2939 | 1.71 → 1.7 | 84.0 → 81.3 | **before** | hurt (judge prefers before, -2.7 avg) |
| `freemasonry-why-a-secret-society-offering-brothe…` | 2914 → 2960 | 1.37 → 1.35 | 87.0 → 85.3 | **before** | did nothing meaningful (judge prefers before, -1.7 avg) |

## Per-article detail

### `nando-de-freitas-unveils-interventional-sft-for-agent-models-what-it-means-when.md`

- **Word count**: 2677 → 2660 (99% of input)
- **Slop rate (per 1000 words)**: 0.0 → 0.0
- **Rewrite accepted**: True
- **Rewrite model**: `anthropic/claude-opus-4.7`
- **Judge verdict**: **winner = before**
  - BEFORE scores: voice 84, prose 82, argument 85
  - AFTER scores:  voice 81, prose 79, argument 83
- **Judge reasoning**: The before version handles its em-dash asides and parenthetical qualifications with more idiomatic confidence, whereas the after version's editorial interventions ('How can we respond?', numbered practical points, tidied punctuation) flatten the essayistic texture and impose a listicle scaffolding that works against the pastor-intellectual register. The before version's final section flows as organic exhortation; the after version's 'First, second, third' parallelism converts it into a sermon outline, losing the dry conversational authority that is the voice's chief distinction.
- **One-sentence assessment**: hurt (judge prefers before, -2.7 avg)

### `speaking-in-tongues-what-1-corinthians-12-and-14-actually-argue-and-whether-the.md`

- **Word count**: 2926 → 2939 (100% of input)
- **Slop rate (per 1000 words)**: 1.71 → 1.7
- **Rewrite accepted**: True
- **Rewrite model**: `anthropic/claude-opus-4.7`
- **Judge verdict**: **winner = before**
  - BEFORE scores: voice 84, prose 83, argument 85
  - AFTER scores:  voice 81, prose 80, argument 83
- **Judge reasoning**: The before version carries sharper idiomatic edges — 'denominational pissing match' lands with the dry transgressive wit characteristic of the voice, whereas the after sanitises it to 'squabble,' flattening the register. The after version also introduces clumsy scaffolding ('First,' 'Second,' 'So how can we respond?') that breaks the essayistic flow and reads like a sermon outline imposed on prose that was working without it. Both versions are competent, but the before sustains the antithetical rhythm and confessional 'I' more consistently without the editorial smoothing that makes the after feel slightly managed.
- **One-sentence assessment**: hurt (judge prefers before, -2.7 avg)

### `freemasonry-why-a-secret-society-offering-brotherhood-is-the-wrong-answer-to-a-r.md`

- **Word count**: 2914 → 2960 (102% of input)
- **Slop rate (per 1000 words)**: 1.37 → 1.35
- **Rewrite accepted**: True
- **Rewrite model**: `anthropic/claude-opus-4.7`
- **Judge verdict**: **winner = before**
  - BEFORE scores: voice 88, prose 86, argument 87
  - AFTER scores:  voice 85, prose 84, argument 87
- **Judge reasoning**: The before version carries slightly more idiomatic energy in its em-dash constructions and parenthetical asides, which feel more naturally inhabited than the after's tidied punctuation and smoothed-out contractions. The after's editorial interventions—expanding 'can't' to 'cannot' throughout, regularising dashes, breaking up the Colossians parenthetical—produce cleaner copy but flatten the voice's characteristic roughness and self-interruption. Argument coherence is essentially identical; the marginal prose and voice advantage stays with the before.
- **One-sentence assessment**: did nothing meaningful (judge prefers before, -1.7 avg)

## Aggregate

- Judge winners: **after 0, before 3, tied 0** (n=3)
- Mean composite score: **84.9 → 82.6** (-2.3)
- Rewrites accepted (passed length+slop gates): 3/3
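
The aggregate figures can be recomputed from the per-article judge scores. The sketch below assumes the composite is the unweighted mean of the voice, prose, and argument scores, which reproduces the reported averages but is not confirmed as the pipeline's method; the short article keys and the function names are illustrative only.

```python
from statistics import mean

# (voice, prose, argument) judge scores copied from the per-article detail.
# The short keys are illustrative labels, not the real filenames.
SCORES = {
    "nando-de-freitas": {"before": (84, 82, 85), "after": (81, 79, 83)},
    "speaking-in-tongues": {"before": (84, 83, 85), "after": (81, 80, 83)},
    "freemasonry": {"before": (88, 86, 87), "after": (85, 84, 87)},
}


def composite(dims):
    """Assumed composite: unweighted mean of the three judge dimensions."""
    return mean(dims)


def aggregate(scores):
    """Mean composite before and after across articles, plus the difference."""
    before = mean(composite(s["before"]) for s in scores.values())
    after = mean(composite(s["after"]) for s in scores.values())
    return before, after, after - before


before, after, delta = aggregate(SCORES)
print(f"{before:.1f} -> {after:.1f} ({delta:+.1f})")  # 84.9 -> 82.6 (-2.3)
```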

## Verdict

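Strategy 4 **hurt**: the judge preferred the before version for all three articles, and the mean composite score fell from 84.9 to 82.6. The explicit moves made the rewrites feel more mechanical and checklist-like (imposed "First, second, third" scaffolding, tidied punctuation, sanitised idiom), whereas the few-shot-only baseline preserved more naturalness. Do not adopt.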
