The PM Copilot Field Manual — Six steps, every question, sample answers

p. 02 · editor's note

Editor's note

Why this manual exists

Most product specs die in the same place: somewhere between a hopeful problem statement and the first sprint planning meeting. The problem isn't talent. It's that the questions never got asked sharply enough, or early enough, or by enough of the right people.

The PM Copilot turns those questions into a guided flow. Six steps. Each one builds on the last. AI plays the role of a senior PM, a paranoid security engineer, a privacy lawyer, an analytics lead — pressure-testing the spec in their voice until what's left is shippable.

This manual is the paper version. Every system prompt the app uses is distilled here. Every sample answer the app ships with is printed alongside. You can use this without the app: keep it on a wall, run it in a workshop, hand it to a new PM. Or use it with the app: read a page, run the step, read the next page.

One rule: don't skip steps. The Why unlocks Stakeholders. Stakeholders unlocks Feature. Feature unlocks Edges. Edges unlocks Metrics. Metrics unlocks Traceability. The order is the method.

p. 04 · step 01

01

The Why

Five Whys · root-cause interrogation

3+ whys required
Max 7
Unlocks step 02

State the problem. Then let the system pressure-test it 3+ times. Each question references your exact wording and forces one layer deeper toward root cause.

"Why is that the case, in your own words?"

What the AI does: reads your problem statement plus the prior why chain, then produces ONE next question — 8–18 words, sharp, anchored in your own language. It will not answer for you. It will not propose solutions. If you already addressed root motivation, it surfaces a different angle (constraints, evidence, alternatives ruled out).

Hard rules · system-critique.ts

Output ONE question only. No preamble, no numbering.
8–18 words. Specific. Anchored in author's own language.
Never answer for them. Never propose solutions.
Avoid generic prompts ("why is that?"). Reference a concrete phrase.
If statement is solid → ask about evidence, sample size, or adjacent segments.

Pin the real business pain. The Five Whys forces you past surface symptoms to root cause.

Sample problem statements (the app ships these as examples)

Telecom · churn

Prepaid customers churn within 30 days of their first international roaming event, citing bill shock even when a roaming pack was active.

Streaming · activation

New broadband subscribers do not activate the bundled streaming app within the first week, suppressing trial-to-paid conversion.

B2B · onboarding

Enterprise admins drop off before completing SSO setup, blocking 40% of seat activations within the first month.

A sample why chain

Statement → why → why → why

Why #1. Why do prepaid customers churn after roaming, even with a pack active?
→ Pack confirmation arrives by SMS but in roaming many devices suppress local SMS until reboot.
Why #2. Why does the pack confirmation depend on a delivery channel we know is unreliable abroad?
→ It was designed for domestic use and was never re-validated for the visited-network case.
Why #3. Why was the visited-network case never re-validated?
→ No persona owns roaming experience end-to-end; it sits between Network Ops and CVM.

Kill criterion — No named decision-maker, no baseline metric, no cost-of-wrong → no product.

p. 07 · step 02

02

Stakeholders

8 internal personas · pick 2–5

Auto-flagged by trigger words
Accept or reject each
Unlocks step 03

Surface the people whose blockers will derail this. Better caught now than at engineering grooming.

"Who is impacted, and who must review before this ships?"

What the AI does: reads your problem statement plus feature description, identifies which of 8 internal personas are impacted, and quotes the specific phrase from your input that flagged each one. Returns 2–5 personas. Never invents personas outside the list.

Hard rules · system-stakeholders.ts

Pick from the exact list of 8 personas. Never invent.
Return between 2 and 5. Only the most clearly relevant.
Skip personas whose triggers do not appear in the input.
Each "trigger" must quote the specific phrase from the input. 12 words max.

The 8 personas (each speaks in first person when critiquing)

security

Lead Security Engineer

Paranoid by trade. Asks about authn/authz, secrets handling, encryption, third-party data sharing, audit logs, abuse vectors.

Triggersauth · token · oauth · pii · gdpr · third-party · api key · encryption

legal

Privacy & Legal Counsel

Thinks in lawful basis, retention, consent, ePrivacy, CPNI, regulator scrutiny, cross-border transfer, right-to-erasure.

Triggersgdpr · consent · minor · eu · retention · profiling · lawful intercept

devops

Platform / DevOps Lead

Worries about deployment, rollback, observability, error budgets, scaling, cost, third-party dependencies.

Triggersdeploy · scale · latency · sla · uptime · observ · cost · webhook

support

Customer Support Lead

Thinks support load. Asks about ambiguous error states, customer copy, human handoff, refund/credit logic.

Triggerserror · refund · credit · cancel · help · tier · churn · complaint

finance

FP&A / Commercial Lead

Unit economics, margin, billing accuracy, revenue recognition, regulator-reportable numbers, chargeback risk.

Triggersrevenue · arpu · pricing · discount · invoice · billing · subscription

data

Analytics / Data Lead

Event schemas, instrumentation, data quality, joins, late-arriving data, identity resolution, measurability.

Triggersmetric · event · track · analytics · dashboard · churn · attribution

design

Design Lead

User flow, error states, empty states, accessibility, perceived speed, moment of value within seconds.

Triggersflow · ui · ux · screen · modal · onboarding · accessibility

eng-lead

Engineering Lead

Build cost, integration risk, what gets descoped, testability, whether spec is unambiguous enough to estimate.

Triggersintegration · api · sdk · mobile · web · native · sync · migration

Sample "when to flag" — printed in the app's stuck-drawer

Flag · Security

"We will pull customer location to suggest the right roaming pack." Security wants to know who else will read that location field.

Flag · Legal

"We will track which content kids in the household watch." Legal needs lawful basis, parental consent, and a retention window.

Flag · Finance

"We will absorb roaming overages on the first event." Finance wants the cost-to-serve model and the cap before launch.

Kill criterion — Buyer = user = saboteur. Three different people, or the business case collapses.

p. 10 · step 03

03

Feature

Name it · describe it · refine + audit ambiguity

Two tools: refine + ambiguity
60–120 words target
Unlocks step 04

Name the feature so engineers can build it. Describe what it does, for whom, and when. Skip the marketing voice.

"Who uses it, when, what happens, what gets ruled out?"

Two AI tools available here:

↳ Refine rewrites your description so engineers can build from it. 60–120 words. Active voice. Cuts marketing. Never invents capabilities you didn't state. If too sparse → returns input unchanged.

↳ Ambiguity audit returns 3–6 bullets, each quoting a phrase + asking the clarifying question engineers will hit. If the description is unambiguous: "No major ambiguity detected."

Hard rules · system-feature-tools.ts

Refine output: 60–120 words, plain language, active voice.
Cover: who uses it, when, what happens, what gets ruled out.
Keep user's specifics; cut the marketing.
Ambiguity bullets: "<quoted phrase>" — <clarifying question>. ≤22 words each.
Do not invent capabilities the user did not state.

Good · vague · over-solutioned

Good

Roaming pause guarantee. When a prepaid customer crosses a border, voice and SMS billing pauses automatically. Customer must confirm a plain-language pack via SMS before charges resume.

Who · when · what · what's ruled out — all present.

Too vague

"Improve the roaming experience"

Does not tell engineering what to build. Refine tool would return unchanged — too sparse.

Too solution-y

"Push notification with three CTAs and a sticky banner"

Describes the UI, not the value the feature delivers. Ambiguity audit would flag "three CTAs" first.

Sample ambiguity-audit output

Audit on the "roaming pause guarantee" description

"voice and SMS billing pauses automatically" — What about data and MMS? Pause too, or charge as overage?
"crosses a border" — Detected by which signal: visited-network IMSI swap, geo-fenced location, or border-cell registration?
"confirm a plain-language pack" — What if the SMS is unanswered for 24h? Auto-resume, stay paused, or fall back to call-center?
"before charges resume" — Does "resume" mean retroactive billing from border crossing, or only forward-looking from confirmation?

Kill criterion — If a senior engineer can't estimate from the description alone, you haven't written one yet.

p. 13 · step 04

04

Edge cases

10–14 cases · 5 categories · status per case

Generated, not invented
Status: open · addressed · deferred · oos
Unlocks step 05

List what could go wrong before engineering does. Catch them in spec, not in production.

"What breaks when the boundary is the boundary?"

What the AI does: plays a senior staff engineer. Generates 10–14 short, specific edge cases (10–22 words each) referencing your product's actual mechanics — not generic "what if the server is down." Groups by category. Distribution roughly 3/2/2/2/2. No solutions. Just the case.

Hard rules · system-edges.ts

10–14 edge cases. No fewer, no more.
Each: 10–22 words. Reference actual mechanics.
Distribution ≈ 3 failure · 2 scale · 2 security · 2 ux · 2 data. ±1 OK.
No solutions. No "consider…". Frame as "what if" or "when X, then Y".
No duplicates, no near-duplicates.

Failure

~3

Scale

~2

Security

~2

UX

~2

Data

~2

Samples from the app's stuck drawer

Failure

Customer crosses a border but the SMS confirmation fails to deliver because the visited-network does not support our short code.

Scale

World Cup weekend: 3M customers cross borders in 72 hours. Confirmation queue must drain before voice service degrades.

Security

Attacker spoofs an SMS reply pretending to be the customer accepting a pack. Confirms with what authentication signal?

No solutions. Just the case framed as "when X happens, then…".

Status workflow (per case)

open

Default. Has not been resolved or scoped out yet.

addressed

Handled in design. Carry into PRD with the resolution.

deferred

Acknowledged. Punted to v2 with a date.

oos

Out of scope. Explicit refusal. Stays in PRD as written exclusion.

Kill criterion — Every edge case must end with a status. Open lists in PRDs become production incidents.

p. 16 · step 05

05

SMART metrics

Validate · suggest · instrument

Specific · Measurable · Achievable · Relevant · Time-bound
3–5 draft suggestions
1–5 event names

Make the metric measurable today. SMART forces a population, a number, a target, and a deadline.

"What number moves, by how much, by when?"

Two AI tools available here:

↳ Validate checks your draft metric against the 5 SMART letters. Defaults to false when ambiguous. For each failed letter: one short hint ≤14 words explaining what's missing. Plus 1–5 instrumentation event names in snake_case.

↳ Suggest proposes 3–5 distinct draft metrics, SMART by construction. Mixes leading (behavioral) and lagging (outcome) indicators. Never repeats your already-captured ones.

Hard rules · system-smart.ts

True only if the metric clearly satisfies the letter. Default false.
Events: verbs in snake_case ("session_started"), 1–5 names.
Suggestion drafts: 12–28 words, include population + direction + baseline + target + deadline.
Reference the actual feature wording. Avoid generic "increase engagement".

S

Specific

Names a population, a behavior, and a direction of change.

M

Measurable

A number or rate that can be queried from product data today.

A

Achievable

Within plausible reach with the current product surface. Not a moonshot.

R

Relevant

Clearly tied to the stated problem or feature, not a vanity number.

T

Time-bound

Has a window: week, month, quarter, year, or a specific date.

Strong · weak · leading indicator

Strong (S·M·A·R·T)

Reduce 30-day post-roaming churn for prepaid roamers from 18% to 12% by end of Q3 2026.

Suggested events: roaming_session_started, pack_confirmed, subscription_canceled.

Weak (fails all 5)

"Improve customer satisfaction"

No population. No baseline. No target. No deadline. Validation would mark all 5 letters false.

Strong (leading)

Achieve 90% pack-confirmation rate (SMS reply within 5 min of border crossing) within 6 weeks of launch.

Pairs with the lagging churn metric. Detects problems before the churn number moves.

Kill criterion — A metric without a counter-metric will be gamed. Always pair leading + lagging.

p. 19 · step 06

06

Traceability

Goal → Feature → Metric → Event

Every row auditable
Orphans = scope creep
Required for BRD/PRD

Connect each business goal to the feature, the metric proving it, and the event tracking it. Orphans signal scope creep.

"Can you draw a line from goal to event, on every feature?"

What the app does here: no AI call. Pure structuring. Each row is a thread: a business goal, the feature serving it, the metric that proves it, the event that measures it. Orphan features and orphan metrics are highlighted automatically.

Validation rules

Every feature must appear in at least one trace row.
Every SMART-valid metric must appear in at least one trace row.
Orphan feature → either link to a goal or drop from scope.
Orphan metric → link or remove. No measurement without purpose.
One feature can serve multiple goals — add a row per goal.

Sample traceability thread

Business goal		Feature		Metric		Event
Reduce prepaid churn	→	Roaming pause guarantee	→	30-day post-roaming churn ≤12%	→	subscription_canceled
Reduce prepaid churn	→	Roaming pause guarantee	→	Pack-confirmation rate ≥90%	→	pack_confirmed
Increase NPS	→	Roaming pause guarantee	→	Post-roaming NPS +8pts	→	nps_response_submitted

Orphans signal scope creep. Link it or drop it.

Kill criterion — A feature with no goal row, or a metric with no event, doesn't ship. No exceptions.

p. 22 · the brd/prd

▸

Run as agent

Consolidate everything · BRD · PRD · or both

≤1800 tokens output
Markdown only
Exact section structure

The final step. The agent takes the problem, why-chain, accepted stakeholders, accepted critiques, feature, edges (with status), SMART-valid metrics, and traceability rows. Outputs a polished BRD, PRD, or both.

"Could a stranger ship this from the document alone?"

Hard rules · system-consolidate.ts

GitHub-flavored markdown only. No preamble.
Exact section structure. Do not add or remove sections.
Reuse user's wording. Do not invent facts.
Missing data → "_TBD — [what's needed]_", never fabricate.
Edge cases tagged with status. Stakeholders carry trigger reason.
Metrics: SMART-valid first, weak last with failed letters called out.
Traceability table: goal → feature → metric → event.
Stay under 1800 tokens of output.

Output skeletons (exact, every time)

Business Requirements

BRD — for the sponsor

1.Executive Summary
2.Business Problem
3.Business Goals
4.Strategic Alignment
5.Stakeholders
6.Scope (IN)
7.Out of Scope
8.Success Metrics
9.Risks
10.Open Questions

Product Requirements

PRD — for engineering

1.Overview
2.Problem & Why Now
3.Target Users
4.Feature Description
5.User Flows
6.Functional Requirements
7.Non-Functional Requirements
8.Edge Cases
9.Metrics & Instrumentation
10.Dependencies
11.Risks
12.Open Questions

Definition of done — A senior IC reads the doc, asks zero questions, and starts ticketing.

p. 25 · faq

?

Engineering FAQ

Seven questions engineering will ask before they estimate

For: eng-lead · staff eng · architect
Answer in PRD, not in Slack
Source: data-product playbook

The six steps build the spec. These seven questions stress-test it from the build side. If you can't answer them on paper, engineering can't estimate — and you'll lose two sprints in clarification.

01

Functional spec

What exactly will the system do?

Specific features and functionalities the system will provide.
Detailed steps for each user interaction or system process.
Sample format — "User clicks X button → System displays Y screen → User enters Z data."

If the answer is a paragraph, it's not a spec yet. Decompose into atomic user × system steps.

02

Input / output contract

How will the system respond to specific inputs?

Exact inputs (user actions, data) the system accepts.
Precise output or response for each input.
Sample — "If user enters text, system validates for alphanumeric only. If invalid → error message 'Invalid characters' displayed."

Every input gets one expected output and one error path. No "TBD".

03

Business rules

What detailed business rules must the system enforce?

Calculations, validations, logic the system must apply.
Sample — "Discount of 10% applied to orders over $500."
Sample — "Minimum password length 8 chars: 1 uppercase, 1 number, 1 special."

Each rule needs a unit-testable definition. Vague rules = bugs in disguise.

04

Data lifecycle

How will data be handled within the system?

Specific data inputs the system processes.
Data stored + its attributes.
Sample — "user_id: unique integer · email: string, max 255 chars."
How data is manipulated, retrieved, presented.

Bring the schema. Type + constraint + cardinality per field. Schema first, UI second.

05

Integrations

How will the system interact with other systems?

Detailed integration points with existing or external systems.
Specific data exchanged + format.
Sample — "System sends order details via API call to fulfillment system in JSON format."

Sync vs async · auth method · retry policy · idempotency key — name them or get burned.

06

Non-functional reqs

What are the specific NFRs (system view)?

Performance — "Page load < 2s for 90% of requests."
Security — "Encrypted at rest + in transit · RBAC enforced."
Usability — "Nav menu visible on all pages · errors actionable."
Reliability — "Uptime 99.9%."
Scalability — Concurrent users / transactions target.
Maintainability — Technical standards or architectures to follow.

Numbers, not adjectives. "Fast" is not an SLA. "≤200ms p95" is.

07

Failure modes

What are the error handling and recovery procedures?

What happens when an error occurs. Sample — "System logs error and displays user-friendly message · administrator notified via email."
How the system recovers from failures: retry policy, circuit breaker, manual override, rollback path.
Map to Step 04 edge cases: every "open" status here = a known unanswered failure mode.

Pair with Step 04 edge cases. Each edge needs a detection rule, a user-facing message, and a recovery action.

Definition of ready — All 7 answers in the PRD before engineering estimates. No exceptions.

p. 28 · syncing expectations

≡

Syncing expectations

Requirements gathering · project charter

For: data-product kickoff
Use before Step 01
Foundation work

Specify the goals, requirements, and key stakeholders of your project to ensure clear communication. These questions feed the project charter directly — and surface assumptions before they become rework.

During requirements gathering — the 13 questions

Which KPIs / metrics must this data product track and display?Defines purpose + value. Drives sourcing, transformation, presentation.
How will it integrate with existing BI tools and platforms?Usability + adoption. Avoid siloed tools.
What are the data governance and security requirements?Compliance, trust, risk. Catch sensitivity + access rules early.
What level of interactivity / customization is expected?UX + technical complexity. Guides design choices.
What are the OKRs and goals of this data product?Strategic alignment. Yardstick for success.
Who are the primary end-users / target audience?Tailor visualizations + UX to their proficiency.
What specific data sources / datasets will be used?Technical feasibility. Quality, accessibility, integration cost.

Preferred data-viz tools or techniques?Organizational standards, user familiarity, streamlines dev.
Expected timelines and milestones?Project planning, resourcing, stakeholder expectations.
How does this align with the org's overall data strategy?Avoid creating isolated solutions.
What known pain points should this address?Frame as solution. Relevance + impact.
What tools, technologies, platforms — and do they fit existing infra?Feasibility, cost, maintainability.
Scope for cross-team collaboration (data · IT · business)?Communication channels + alignment for smooth execution.

When creating the project charter

Each gathering question feeds a charter section

Project objectives & goals	← What are the key OKRs and goals?
Project scope	← Data sources · interactivity level · tools & tech
Key stakeholders	← End-users · cross-team collaboration scope
High-level requirements	← KPIs · integration · governance · viz · data-strategy alignment
Project deliverables	← Implied by expected functionality and outputs
High-level timelines / milestones	← Expected timelines question
Assumptions & constraints	← Existing BI tools · governance · preferred tools · infra fit
Success criteria	← KPIs + OKRs (the same numbers that prove it worked)

Not just good questions. Essential ones. Answer them in the charter, or they answer themselves in rework.

Order of operations — Charter first. Then the six steps. Then engineering FAQ. Skip a layer and the next one breaks.