Below you will find my adversarial review prompt that I give to Claude on every pull request review. This is a living document that references dozens of cascading documentation. Collectively, over 1,700 norms contained in the "WitFoo Way" are reviewed. Also, the integrity of the work done by what is often a lazy or dishonest Claude session. In spite of the rigorous testing I outlined in Coding with Claude: What I've Learned Building a Cybersecurity Platform with AI , Claude finds new innovative ways to come off the rails. I use this prompt in an independent session to find shortcuts and lies from the PR session. I have run this prompt dozens of times and it has never come back without hits. As Claude finds new ways to bypassing checking or this prompt, I expand testing, norms and this prompt to cover those edge cases. As you can tell by it's length, there have been MANY iterations. While this prompt will not directly import into your projects, I thought it would be helpful to leave my details in place to assist in drafting your own. Turns out you have to fight Claude with Claude.
Reusable post-completion review prompt. **A PR was just declared "done." Your job is
to disprove that.** Operate as a hostile, skeptical reviewer who assumes the PR is
secretly broken, half-wired, over-claimed, or deviating from WitFoo norms — and who
sets out to *prove* it, file by file, claim by claim. Save findings to
`_prompts/audit.md` for correction.
> **Invocation:** "`PR{N}` is complete. Run an adversarial review." Substitute the PR
> number/branch below for `{PR}` and `{BRANCH}` (default `{BRANCH}` = current branch,
> base = `main`).
- Assume nothing is done until you've seen the line that does it **and** the test that
proves it. The PR writeup, the `CLAUDE.md` one-liner, and the changelog entry are
*claims*, not evidence.
- This is **PR-scoped and diff-driven**, not a whole-codebase sweep. For the holistic
rule-by-rule audit use [`_4 Code Audit and Review.md`](_4%20Code%20Audit%20and%20Review.md);
for the standards checklist use [`_0 Write PR.md`](_0%20Write%20PR.md). This prompt
attacks *what this PR changed* and *what it claims*.
- Time is not a constraint; thoroughness is. Ultrathink each finding. But **cry wolf and
you're useless** — every finding must be reproducible and survive the false-positive
calibration in the appendix.
- Take ownership of *all* defects you surface, including pre-existing ones in the blast
radius. Document them; don't fix in this pass unless asked.
***Deliverable:*** `_prompts/audit.md` — a severity-ranked findings file (format in
§Output) plus a "could-not-verify" honesty section, plus the captured result of
`full-testing-with-conductor.sh`, plus a coverage-gap analysis.
## How to operate (the adversarial loop)
1. **Establish the blast radius and the claims** (Phase 0). You cannot review what you
haven't scoped.
2. **Attack claim-vs-reality first** (Phase 1) — this is the single highest-yield axis in
this repo. Most "shipped" defects here are *no-op handlers, wrong-twin edits, wiring
gaps, and order-of-construction bugs* that compile, pass a unit test, and do nothing
live.
3. **Hunt deferred/stub/dead work** (Phase 2).
4. **Sweep the WitFoo-Way dimensions touched by the diff** (Phases 3–8). Skip dimensions
the diff doesn't touch; go deep on the ones it does.
5. **Analyze coverage** (Phase 9) — the user always wants this; `coverage.sh` lies, so
measure directly.
6. **Run the pipeline** (Phase 10) — `full-testing-with-conductor.sh`, capture every
failure.
7. **Write `audit.md`** — including what you could *not* verify and why.
Throughout, prefer **mutation-thinking**: for any "fixed/added" claim, ask *"if I
reverted this line, would a test go red?"* If no test would catch the revert, the fix is
unprotected (and may already be dead code).
## Phase 0 — Scope & claim extraction (do this first)
```bash
git fetch origin main
git --no-pager diff --stat main...HEAD # full blast radius
git --no-pager diff --name-only main...HEAD # file list — every file must map to a claim
```
- **Locate the contract.** Read the PR prompt in `_prompts/PR{N}*.md` and the tracker
`CURRENT_WORK.md` (root, or `_archives/pr{N}*/CURRENT_WORK.md`). Extract: the Mission
Contract exit criteria, the **Non-Goals/scope exclusions**, and the **Change Manifest**
table (`File | Reason | Diff Type | Test Coverage | WitFoo Way Rules`,
per [`_26 Change Manifest.md`](_26%20Change%20Manifest.md)).
- **Reconcile Manifest ↔ diff (both directions):**
- Every changed file **must** appear as a Manifest row. Files not in the Manifest =
undocumented change / scope creep — challenge each.
- Every Manifest row **must** correspond to a real diff. Rows with no diff =
aspirational / claimed-but-absent.
- **Extract behavioral claims.** From the PR writeup, the `CLAUDE.md` one-liner, and the
`docs/PR_CHANGELOG.md` entry, list every verb that asserts behavior — *wired, fixed,
now calls, persists, materializes, self-heals, gated, verified live*. Each verb names a
specific line you must find. Keep this list; Phase 1 checks every item.
- **Honesty check** (from [`_2 Honest Check.md`](_2%20Honest%20Check.md)): "Are there any
uncompleted or deferred tasks in this PR?" Then *prove* the answer with greps, don't
trust the writeup.
## Phase 1 — Claim-vs-reality (highest-yield axis)
This repo has duplicate/parallel implementations and silent wiring seams. A change can
compile, pass its unit test, and do nothing in the running binary. For **every** claim
from Phase 0, find the implementing line **and** prove it's on the live path.
### 1.1 No-op handlers (returns 200/202, never calls the engine)
```bash
# A handler that returns success without invoking its injected service/engine
grep -rn 'func (h \*.*Handler).*echo.Context\|func (h \*.*Handler).*gin.Context' $(git diff --name-only main...HEAD | grep '\.go$')
```
For each new/changed handler: confirm the injected service/engine field is actually
*referenced in the method body* and the result is persisted. **Red flag:** an injected
field (`engine`, `repo`, `store`) never read in the method. *(Real: PR359 `ManualExecute`
returned 202 and never called the engine; the injected engine field was unused.)*
### 1.2 Wiring gaps (constructor exists ≠ handler mounted)
```bash
grep -rn 'RegisterRoutes\|:= New.*Handler\|NewProxy\|NewService' api/cmd/main.go incident-engine/cmd/main.go
```
A handler constructed nowhere, or registered on a sub-router that's never `.Use`d/mounted,
404s live. Confirm every handler the PR claims is **constructed AND mounted**.
*(Real: PR373 #215 `AttachmentProxyHandler` constructed nowhere → every attachments call
404'd since PR359e.)*
### 1.3 Order-of-construction (dependency built after its consumer runs)
A repo/builder constructed *after* the scheduler that consumes it is a silent no-op.
Confirm dependencies of any `.Start()` / `.PreGenerate` / `.Run()` are wired **before**
that call in `cmd/main.go`. *(Real: PR360e — three audit repos constructed after
`reportScheduler.Start`, so the startup audit report silently never generated.)*
### 1.4 Wrong-twin / dead-package edits
The repo has parallel implementations: `api/internal/repository` **vs** `pkg/repository`;
IncidentAnalyzer's `CassandraGraphNodeRepository` **vs** graph-processor's `NodeManager`;
`pkg/repository` `FrameworkRepository` **vs** the `api/internal/handler` subset. A fix on
the wrong twin compiles, passes a unit test on that twin, and does nothing live.
```bash
# Is the edited package actually imported by the running binary?
go list -deps ./cmd 2>/dev/null | grep -E 'api/internal/repository|internal/repository|<edited-pkg>'
```
If an edited package is **not** in `go list -deps ./cmd`, it's vestigial. *(Real: PR375c
— a `node_id` filter added to `api/internal/repository`, never imported by `./cmd`, while
the live `pkg/repository` query builder never got the clause. Real: PR356 — PR353's
partial-PK sweep missed IncidentAnalyzer's second graph-node repo → every analyzer
`GetByID/Update` errored under graph-v2 while the changelog claimed graph-v2 done.)*
### 1.5 Nil-dependency no-ops
```bash
grep -rn 'NewSyncService\|ProductStore\|nil,' incident-engine/cmd/main.go api/cmd/main.go | grep -i 'nil'
```
A service handed a `nil` collaborator silently no-ops. *(Real: PR358d — api passed `nil`
ProductStore to `meta.NewSyncService` since PR20, so `SyncProducts` was a no-op on every
appliance for ~338 PRs while appearing wired.)*
### 1.6 Deployed-binary confirmation (when claims are "verified live")
If a claim says "verified live on soc" but the change is untested locally with no
artifact, treat verification as **unproven**. Locally, confirm the new path is compiled
into the running binary, not just an unimported package:
```bash
( cd <module> && go build -o /tmp/bin ./cmd && strings /tmp/bin | grep -i '<distinctive-new-string>' )
go list -deps ./cmd | grep '<new-pkg>'
```
### 1.7 Status-enum / validator lockstep
A new enum value or status added in one validator but not its twin is a latent acceptance
bug. If the PR touches incident status, permissions, or any dual-validated enum, confirm
**both** validators (and the reporter sets, metrics, UI label/Tag/facet/dropdown, and all
7 locales) widened together. *(Real: PR357/PR354a `StatusOpen(6)`.)*
## Phase 2 — Deferred / stub / dead-code detection
```bash
grep -rniE 'TODO|FIXME|HACK|XXX|stub|not implemented|coming soon|placeholder|for now|in (a )?future PR|deferred|advisory|no-op|nyi|mock data|USE_MOCK_DATA|disclaimer' $(git diff --name-only main...HEAD)
```
- A stub is legitimate **only** if the Mission Contract Non-Goals declared it. Otherwise
it's undisclosed deferred work. *(Real: PR22c shipped `ReceiverService` methods as
`// TODO: Implement … in future PR`; PR22i shipped mock retry-distribution with a UI
disclaimer.)*
- **Dead UI affordances:** a `<Button on:click={fn}>` whose `fn` is a stub/returns; a tab
that renders an empty state for a "complete" feature; a form posting to a missing route.
*(Real: PR359 removed a dead Execute button.)*
- **Silent-skip guards added in this PR:** any new `if … { return nil }` / `continue` on
an error or already-handled branch. Confirm it (a) advances/dequeues state so the same
item isn't reprocessed forever, and (b) logs at WARN — a guard that neither logs nor
changes state is a swallow. *(Real: PR375g — `aiState==completed` returned without
dequeuing the relic → one completed task froze all 17 others; 0 processed.)*
- **"Enabled but never re-read" caches:** a version/snapshot counter incremented on
mutation but never validated on read. *(Real: PR358h — `ParserRegistry.snapshotVersion`
stamped but never read → runtime-enabled parsers reported running while the stale
snapshot was iterated until restart.)* Grep for a counter with `++`/`.Add(1)` sites but
no read/compare site.
## Phase 3 — Security & trust boundaries (if the diff touches auth/handlers/compose)
*(Full rules: [`_0 Write PR.md`](_0%20Write%20PR.md) §Security and
[`_4`](_4%20Code%20Audit%20and%20Review.md) Phase 1. Reproducers:
`./scripts/testing/security-audit.sh` — **PASS = bug fixed, FAIL = bug present**.)*
- **X-Org-ID strip-and-reinject.** Every new `api/` proxy handler must derive `orgID` from
the JWT context (`proxyContext(c)` / `setIncidentEngineHeaders`), never forward the
client header. `grep -rn 'Header.Get("X-Org-ID")' api/internal/handler/` → **zero** hits
in proxy paths. *(F-b-001 cross-tenant.)*
- **Middleware chain & per-route permission.** incident-engine groups chain
`JWTMiddleware → OrgIDMiddleware → RequirePermission`, in that order. A new write route
on a JWT-but-no-`RequirePermission` group is a privesc hole:
```bash
grep -cE '\.(GET|POST|PUT|DELETE|PATCH)\(' incident-engine/cmd/main.go
grep -c 'RequirePermission\|RequireAnyPermission\|RequireAllPermissions' incident-engine/cmd/main.go
```
Investigate any *new* unguarded write group. (incident-engine `HasPermission` must
*expand* admin/manage/write, not literal-match — a literal gate 403s every legit user;
`IsPlatformAdmin` must stay strict-literal. *Real: PR363 reverted, PR366 fixed.*)
- **JWT validators.** Any new `jwt.Parse`/`jwt.NewParser` must set
`jwt.WithAudience(...)` + `jwt.WithValidMethods([]string{"HS256"})`. `aud` is
hard-enforced; `iss` is tolerated-missing (Phase 5a window) — don't be reassured by an
`iss` check alone.
- **Secrets fail-closed.** Services reading `JWT_SECRET`/`AUTH_CONFIG_ENCRYPTION_KEY`/
`AI_ENCRYPTION_KEY` must `log.Fatal` on empty **and** on the public default literal
(`witfoo-precinct-secret-…`). Compose must use `${VAR:?required}`, never `${VAR:-default}`,
and the **same** secret across api/incident-engine/dispatcher/reverse-proxy.
- **SSO scoped lookup.** SAML/LDAP `getOrCreateUser` must use `FindByOrgIDAndEmail`, never
global `FindByEmail`. *(F-b-003 cross-org takeover.)*
- **Secret repos fail-closed.** Any new credential-bearing repo (notifications,
integrations, connector creds) writing a `*_encrypted`/`secret`/`token`/`api_key` column
must return `ErrEncryptionRequired` when no master key; **Create AND Update** both
guarded. Response handlers must blank `APIKey` AND `APIKeyEncrypted` on **every** path.
- **Fail-closed gates.** Tenant-status / license / feature gates return 503 (not `next(c)`)
on lookup error with no cached value. License gates must be **live per-request**, not a
cached boot snapshot. *(Real: frozen boot snapshot 403'd SOAR.)*
- **Error/info leakage.** `grep -rn 'err.Error()' api/internal/handler/ incident-engine/internal/handler/`
inside `c.JSON`/`gin.H{"error":…}`/`response.*` bodies — internal (Cassandra/NATS) errors
must not reach the client. (Logging `err.Error()` server-side is fine.)
- **Download/export header order.** All `c.Header(...)` (Content-Type, Content-Disposition)
must precede the body write, or Go sniffs the body and can return `text/html` → stored
XSS on download.
- **WebSocket.** New `Upgrader.CheckOrigin` must not `return true` unconditionally; DM/
private rooms must use `BroadcastToOrgUsers(memberIDs)`, never `BroadcastToOrgTopic`.
- **Compose port exposure.** Internal services must bind `${LISTEN_IP:-127.0.0.1}:` —
`grep -nE '^\s*-\s*"[0-9]+:[0-9]+"' docker/docker-compose*.yml` (bare host:container on an
internal service bypasses reverse-proxy TLS/JWT/rate-limit). `go test ./tests/security/h/ -run F_h_003`.
## Phase 4 — Architecture / API-proxy / config / errors
- **The 3-hop contract.** Every new incident-engine endpoint needs (a) a matching api
proxy route wired in `api/cmd/main.go`, and (b) a UI `client.ts`/`$lib/api` caller. A
backend endpoint with no proxy route 404s in the browser — *never* caught by Go build or
unit tests. This is the most common half-done-feature signature in the repo.
- **Outbound calls** to incident-engine use `setIncidentEngineHeaders` + `proxyContext`,
never hand-rolled headers.
- **Response envelope.** Two envelopes coexist (api Echo `{success,message,data,meta}` vs
incident-engine raw Gin); `API_STANDARDS.md` documents a *third* idealized shape matching
neither. **Read the actual handler, not the doc** — confirm the UI TS type matches what
the proxy really returns (don't double-wrap).
- **Background services.** `ticker + stopChan` with `done chan struct{}` (not WaitGroup),
`Start(ctx)/Stop()/RunNow()`, with both `case <-ctx.Done()` and `case <-s.stopChan`. A
new ticker with no stop arm leaks for process life. **Exception:** the incident
aggregator (`INCIDENT_AGGREGATOR_INTERVAL_SEC`) must *sleep-after-cycle*, not use a
coalescing `time.Ticker` (overlapping cycles saturate Cassandra — PR351).
- **Cancellation paths** must wrap `ctx.Err()`, not `return nil` (a nil signals
init-success → scheduler promotes → nil-deref; PR340 class).
- **New env vars** route through `pkg/envconfig`, get a `docs/CONFIGURATION.md` row, and
are declared in **every** consuming service's compose env block. *(Real: PR325 added
`AI_ENCRYPTION_KEY` to api only; incident-engine read it and silently failed.)* Note
`envconfig.GetEnvInt/Bool/Duration` **silently falls back to default** on a parse error —
an operator typo applies the default with zero log signal.
- **Errors:** `%w` not `%v` for wrapping (breaks `errors.Is/As` → status-mapping branches
become dead code), generic client message + detailed server log.
## Phase 5 — Database / Cassandra / calculation correctness
- **Partition scoping.** `grep -rn 'ALLOW FILTERING' pkg/repository/ incident-engine/internal/repository/ | grep -iv 'partition_key\|org_id\|shared_token'`
— every hit must carry a partition key or be a known global exception. A missing
partition key = cross-org full scan + tenant leak + SLO breach.
- **`double` not `decimal`** (`grep -n decimal cassandra/init-schema.cql` → only in
comments). **No `!=` in CQL** (parse error or silently ignored). **DDL-only init-schema**
(`grep -niE 'INSERT|UPDATE' cassandra/init-schema.cql` → empty; DML lives in
`cassandra/seeders/`, and an edited seeder source needs `rebuild-consolidated.sh`).
- **Schema change ⇒ paired self-heal migration.** A new column/table/index in
`init-schema.cql` needs an idempotent `EnsureX` (`ADD … IF NOT EXISTS`) in
`pkg/repository/migrations.go` that is **CALLED** from **both** `incident-engine/cmd/main.go`
**and** `graph-processor/cmd/main.go`. Without it, existing appliances NULL-read forever
(init-schema only runs on fresh installs). **No** files under `cassandra/migrations/`
(archived PR184). An `EnsureX` that stamps a version before verifying the real column/
`bucket` state is the PR353 race.
- **Graph-v2 PK.** Any graph read with only `WHERE partition_key = ?` (missing
`node_type`+`bucket` / `bucket`) **errors at runtime** under v2 — resolve type+bucket via
`node_locator`/`graphid.NodeBucket` first. Won't fail `go build`; needs a v2 integration
or live check.
- **Concurrent set columns** (`product_ids`/`framework_ids set<int>`) use atomic
`SET col = col + ?` (AppendNodeLicensing), never read-modify-write.
- **REPLACE BulkUpsert** must `DELETE` (Exec) *before* the INSERT batch — a single batch
shares one timestamp, tombstone shadows inserts → empty table → compliance silently 0%.
- **Calculation correctness** (if the diff touches reporter/compliance/FTE/ROI math):
FTE = `totalHours / (2080 * reportDays/365)` — no `8760`/`525600` (Precinct legacy,
~4.2× wrong); TP = Disrupted(5)+Resolved(3); FP = Closed(4); active/detected =
Open(6)+Investigating(2) (NOT New(1)); thresholds Green>75 / Yellow 0–75 / Red 0;
Manual = −1. Use status-ID **constants** (3↔5 are swapped vs Precinct). Any new
ratio/percent without a `== 0` denominator guard propagates NaN/Inf into report JSON.
Changed formula ⇒ update `CALCULATION_METHODOLOGY.md` + the ChartHelpTip + all 7 locales.
- **Perf cliffs that pass dev tests:** new `ALLOW FILTERING` without SAI index/LIMIT;
O(controls×nodes) compliance loops (memoize `countNodesByProduct`); per-flag
`countNodesWhere` scans (PR351 single-scan). Dev data (~33k entities) won't expose what
soc (~114k nodes) will; dev SLO thresholds are 3× looser than prod.
## Phase 6 — Frontend / Svelte 5 / disconnected / branding / i18n / demo
*(Patterns: [`_22`](_22%20Svelte%205%20Reactivity%20Advanced.md),
[`_20`](_20%20Pagination%20Facets%20Histogram.md), [`_27`](_27%20Monaco%20Editor%20Integration.md),
[`_28`](_28%20FormData%20and%20Fetch%20Patterns.md).)*
- **Base path.** `grep -rn "goto(\`/\|goto('/\|href=\"/" ui/src | grep -v '\${base}'` —
internal nav must prepend `${base}` from `$app/paths` (app mounts at `/ui`). Works in
`npm run dev`, **404s behind the reverse proxy** — test the proxied path. `/api/*` and
`/conductor/*` fetches are deliberately root-mounted (don't flag those).
- **Reactivity.** `$effect` that writes `$state` it also reads → infinite cascade (wrap
self-writes in `untrack()`, or use `$derived`/`$derived.by()`). Arrays/objects holding
`m.key()` i18n calls must be `$derived.by`, not `const` (else frozen at boot locale).
- **Stale-data.** New refresh paths must not clear `dataWindow`/facets/histogram to `[]`
before an await — snapshot `staleDisplayData` and render via `$derived` while
`isRefreshing` (flash-to-empty only shows at live stream rate).
- **Faceting/pagination/histogram** are **3 independent server-side queries** — any
client-side `calculateFacets(dataWindow)`/`calculateHistogram` outside the
`USE_MOCK_DATA===true` branch is the Splunk anti-pattern (wildly wrong at ~300/min).
Confirm `USE_MOCK_DATA` is not left `true`.
- **`{#each}` keying** `(item.id)` for any sortable/filterable list (not the ~204
intentionally-static menus). **Carbon:** `resizable: false` on every new Chart
(ResizeObserver leak ~27 listeners/tab-switch); `on:click` on Carbon components vs
`onclick` on native `<button>` (mismatch = silent dead handler).
- **Monaco/Cytoscape** dynamic-import in `onMount`, dispose model **and** editor in
`onDestroy`. **FormData** uploads must not set Content-Type; empty POSTs send `{}`; use
the `client.ts` wrapper (raw `fetch` drops X-Org-ID/auth).
- **Disconnected network (MUST).** No new runtime external URL. `cd ui && npm run build`
**then** `scripts/util/audit-external-deps.sh` (it scans `ui/build` — a stale build
silently passes). `fonts.css` import must precede Carbon CSS. New map/flag/geo features
need a source-scan test (cf. `WorldMapHeatmap.test.ts`).
- **i18n / branding / demo.** New user-facing strings are Paraglide `m.key()` present in
**all 7** locale files (`en/es/fr/de/ja/ar/mi`) — a key only in `en.json` renders the raw
key elsewhere with no error. No new hardcoded `WitFoo`/`witfoo.com` in customer chrome
(white-label). Demo-mode features gated **server-side**, not just `{#if !isDemoMode}`.
## Phase 7 — Conductor / integrations / connectors (if touched)
- **New parser = FOUR files.** Package dir + `connector.go` import + `parserRegistry` map
entry + conductor-ui `frontend/src/lib/definitions/parsers/types.ts`. Verify the new
parser name appears in **all four**:
```bash
ls -d conductor_suite/signal-parser/pkg/connector/*/ | wc -l
grep -c 'ParserFactory:' conductor_suite/signal-parser/pkg/connector/connector.go
grep -c 'projectPrefix +' conductor_suite/conductor-ui/frontend/src/lib/definitions/parsers/types.ts
```
(These counts already drift — 267/249/218 — so check the *specific* new name.) Missing
the UI file = parser registers but never receives traffic (silently idle).
- **Runtime KV-enable does NOT join the serial chain** — signal-parser must be **restarted**.
A PR claiming "live enable" without the restart caveat (or without the validated
`versionedSnapshot` read path) ships an inert parser. *(PR358h class.)*
- **connector-exists ≠ parser-matches-format.** A wired, "running" connector whose parser
doesn't recognize the message format drops records into the *unknown* stream silently —
watch the Pipeline Sankey "Unknown" slice, not just container-up. JSON arrays decode to
`[]interface{}` not `[]map[string]interface{}` (PR348 silent record-drop).
- **conductor-ui dual-mode.** Proxy/AIO mode must NOT register standalone `/api/auth/*` /
`/api/v1/users/*`; reverse-proxy strips inbound `X-Auth-*` before injecting;
`proxy_auth.go` gates on `RemoteAddr` CIDR (not `RealIP()`) **before** reading `X-Auth-*`.
AIO needs matching `JWT_SECRET`/`WF_JWT_SECRET` + `SecondaryNetworks` to conductor-net.
- **FROM-scratch images** are pre-built binaries (`COPY bin/<svc>`); `conductor-build.sh`
must run first; no shell-based `HEALTHCHECK` (dead). conductor images deploy via the
**conductor repo's own tier branches**, not analytics CI. Stale `bin/` = container runs
old code while source looks updated.
- Conductor submodules are **excluded from analytics `unit-tests.sh`** — run
`cd conductor_suite/<svc> && go test ./...` separately; a `common` change has cross-parser
blast radius (run the **full** conductor `go test ./...`). The authoritative pipeline step
list is the script's `conductor_steps=(…)` array (9 steps), not the stale 7-step doc table.
## Phase 8 — Supply chain / Docker / observability / WFA (if touched)
- **No bare new Go `require`.** Each must have a `third_party/go/<module>` fork + a
`go.work` replace (never per-module go.mod) + a `MANIFEST.md` row. Gate:
`scripts/vendor/verify-vendor.sh` (GOPROXY=off build per module — but it can pass on a
warm module cache; prefer clean). `scripts/vendor/check-no-external-deps.sh` on the
go.mod diff. PATCHES.md↔MANIFEST.md via `check-manifest-patches-sync.sh`.
- **Module-discovery scripts** (`find -name go.mod`) must exclude `*/third_party/*` and
`*/_vendor/*`.
- **Known doc-vs-code trap:** Dockerfiles run `go mod download` (online build) despite docs
claiming a vendored offline build — don't trust an "air-gapped build" claim without
actually building the image with no network.
- **Observability.** A new metric needs both a `prometheus.yml` job and a registered
`/metrics` endpoint; avoid high-cardinality labels (`incident_id`/`org_id`/raw path →
series explosion / OOM) and duplicate names (`MustRegister` panic). A metric defined but
never `.Inc()/.Observe()`d is dead. `remote_write` ships commented-out — "cloud metrics
enabled" by env alone is inert. Don't regress `LOG_LEVEL=warn` / Alloy allowlist /
`HEALTH_STATUS_INTERVAL` (`scripts/testing/verify-log-filtering.sh`).
- **WFA.** New `WF_*` container env must flow through `EnsureContainersHaveCurrentEnv`
(env-drift self-heal) or it's never reconciled. Never roll out via
`systemctl restart wfad` (hangs the stack); never run `wfad --version` (spawns a rogue
daemon). `analytics-nats` `store_dir` must be `/data/nats/analytics` (AIO collision).
## Phase 9 — Coverage gap analysis (always required)
> **`scripts/util/coverage.sh` LIES.** Its header says it does *not* enforce thresholds; it
> runs `go test … 2>/dev/null` (swallows compile errors as a soft warning) and reports a
> **zero-test package as "no testable code" = PASS**. Never quote its green summary as proof.
- **Every new logic file needs a test:**
```bash
git diff --name-only --diff-filter=A main...HEAD | grep '\.go$' | grep -v _test.go \
| while read f; do t="${f%.go}_test.go"; [ -f "$t" ] || echo "MISSING TEST: $t"; done
```
(Schema/config/`cmd/main.go` exempt; business logic & API handlers never exempt.)
- **Measure the touched module directly** (no `2>/dev/null`):
```bash
cd <changed-module> && go test -short ./... # surfaces red/compile, not hidden
go test -short -coverprofile=cov.out ./... && go tool cover -func=cov.out | grep -E '^total:|<changed-file>'
```
Apply the **tiered** target, not a flat 75%: **85%** for security-critical (`pkg/sanitize`,
`pkg/security`, `pkg/middleware`, `reverse-proxy/internal/{middleware,sanitize}`,
`api/internal/middleware`, `artifact-ingestion/internal/middleware`); **60%** integration
layers; **75%** default.
- **Tests that prove nothing** (assert-nothing / test-theater):
```bash
grep -rLE 'assert\.|require\.[A-Z]|t\.(Error|Fatal)' $(git diff --name-only main...HEAD | grep _test.go)
git diff main...HEAD -- '*_test.go' | grep -iE 'TODO|FIXME|t\.Skip|not implemented'
```
Flag: fixtures with N fields but 1 assertion (`go vet`/staticcheck unusedwrite);
`assert.NoError` followed by a deref (should be `require`); E2E specs whose only outcome
is `test.skip()`/`.catch(()=>false)` with no hard `expect`; Svelte `render()` component
tests (known-broken under Svelte 5, quietly excluded); financial assertions via
Playwright `innerText` (numbers render in SVG — false-negative-prone).
- **Mutation check:** for each key new assertion, confirm it would **fail if the fix were
reverted**. *(Real: PR375c's original test passed against dead code; it was moved to the
live `pkg/repository` path.)*
- **Known standing gaps** (don't accept as new baseline): incident-engine 28.6% /
incident-engine/service 26.7% / dispatcher/subscriber 14.8% / api/handler 18.5%;
graph-processor & dispatcher have **no** dedicated unit tests; PR214 deferrals
(pkg/service, pkg/dns, pkg/notification, rp/proxy). A change here is unlikely to be caught
by existing tests — demand explicit tests for the *changed function*.
- **Workspace-wide vet** for any interface/submodule change (compiles in the edited module,
breaks a mock elsewhere — PR358f):
```bash
for d in . incident-engine api graph-processor conductor_suite/common; do (cd "$d" && go vet ./...); done
```
## Phase 10 — Run the full pipeline
```bash
./scripts/testing/full-testing-with-conductor.sh # backgrounds; logs to logs/full-testing-with-conductor.log
tail -f logs/full-testing-with-conductor.log # monitor
```
9 steps: `inspection → docker-rebuild → conductor-health → nats-bootstrap →
generate-artifacts → conductor-inject → system-tests → e2e-tests → conductor-benchmarks`.
Resume a failed step with `--from-step <step>`; isolate with `--only <step> --force
--foreground`; `--skip-passed` re-runs only failures; view state with
`scripts/testing/test-dashboard.sh`.
- Capture **every** failure (step, exit code, the failing assertion/log excerpt) into
`audit.md`. Take ownership of pre-existing failures too.
- The pipeline is the **only** level that proves browser→api→incident-engine wiring and the
broker→parser→filter→exporter→ingestion path — a "tests pass" claim on Go-only unit tests
is not coverage-complete.
- If full Docker is unavailable, run the cheaper gates and **say so** in the honesty
section: `./scripts/testing/unit-tests.sh`, `./scripts/testing/security-audit.sh`,
`./scripts/testing/linter.sh`.
## Calibration — avoid false positives (a credible reviewer)
Don't flag these known-intentional patterns:
- The literal default JWT secret (`witfoo-precinct-secret-…`) **present in source** — it's
retained only to **reject** the value (verify it's in an `==` check, not a fallback
assignment).
- `FindByEmail` **existing** — it's intentional for local auth; the bug is only calling it
from an SSO path (trace the call graph).
- `if x == 0` float guards in `report_calculations.go` — intentional divide-by-zero
sentinels (value is exactly 0 from integer math), not the float-compare anti-pattern.
- ~204 unkeyed `{#each}` on static menus; `client.ts` skipping `/api` for `/incident-engine`
/ `/conductor`; the unauthenticated `/health`/`/ready`/`/metrics`/`/conductor/status`
endpoints. (For `/v1/system/status`, *do* flag any **newly-added** tenant/error field.)
- AstraDB mode ignoring pool-tuning env vars; the per-process AIExecutor rate limiter (not
a distributed budget — flag only if a PR *treats* it as one).
When unsure whether something is a real defect, **say so and show the evidence** rather
than asserting — a finding the author can dismiss in 10 seconds discredits the whole audit.
## Output — `_prompts/audit.md`
Write findings grouped by severity (P0 → P1 → P2), then by phase/dimension. Per finding:
```markdown
### [P0|P1|P2] <short title>
- **Claim vs reality:** <what the PR claimed | what the code/test actually does>
- **Location:** `path/to/file.go:42` (and the dead/live-path note if relevant)
- **Evidence:** <the grep/command output or the missing line that proves it>
- **Impact:** <failure mode — what breaks, for whom, at what scale>
- **Fix:** <specific remediation>
- **Reverts-would-fail-a-test?:** yes/no (if no → unprotected)
```
Severity: **P0** = security / cross-tenant / data-loss / claimed-but-no-op shipped as done.
**P1** = WitFoo-Way MUST violation / missing self-heal migration / coverage regression on a
security-critical pkg / silent perf cliff. **P2** = SHOULD/NOTE / doc-drift / cosmetic.
1. **Summary table** — counts by severity × dimension.
2. **`full-testing-with-conductor.sh` result** — pass/fail per step, with failure excerpts.
3. **Coverage gap analysis** — per touched module: measured % vs tier target, files missing
tests, assert-nothing tests, mutation-check results.
4. **Could-not-verify (honesty section)** — every claim you could not prove or disprove,
and *why* (no Docker, needs a live appliance, needs soc-scale data, etc.). This
section is mandatory — an adversarial review that hides its blind spots is itself
over-claiming.
- For any **P0/P1**, draft `_prompts/PR{N+1} - Audit Remediation.md` (TDD: failing test
first).
- If the review established a *new* recurring defect pattern, add it to
[`_4 Code Audit and Review.md`](_4%20Code%20Audit%20and%20Review.md) and/or
[`_0 Write PR.md`](_0%20Write%20PR.md).
- Archive this run's `audit.md` into `_archives/pr{N}*/` alongside `CURRENT_WORK.md`.