Walkthrough: Building "Streak" from Scratch
Product Kit is a free, open-source toolkit that brings structured product thinking to Claude. It walks you through 9 steps — from understanding your market to writing a validated product spec — via guided conversations that produce markdown artifacts.
In this walkthrough, we'll build Streak — a habit tracker for people who keep starting habits and quitting. You'll see every conversation, every artifact, and every decision along the way.
Prerequisites
- Claude with the Cowork plugin, or Claude Code with the CLI
- Product Kit set up (plugin uploaded, or `productkit workspace` / `productkit init` in the CLI — covered in the next two sections)
- A product idea you want to validate (or follow along with Streak)
Command format
Slash commands look slightly different depending on how you use Product Kit: the CLI setup installs commands in the `/productkit.name` form (e.g. `/productkit.constitution`), while the plugin uses the `/product-kit:name` form. Both appear in this walkthrough — use whichever matches your setup.
Estimated time
- Steps 0–4 (Landscape through Assumptions): ~2 hours
- Step 5 (Validate): 1–2 weeks of fieldwork (interviews, surveys, data collection)
- Steps 6–8 (Solution through Spec): ~1 hour
Already have a product? Run /product-kit:bootstrap to generate artifacts from your codebase, then re-run individual steps (like Problem, Assumptions, and Validate) scoped to your next feature. You don't need to redo Landscape or Constitution every time.
Create a Workspace
A workspace is a parent directory that holds one or more Product Kit projects. It gives you a shared landscape, a shared knowledge base, and a single place to run /productkit.landscape that feeds every project inside it.
Why use a workspace?
- Your company has multiple products that share the same landscape (team, stack, competitors)
- You want a single `knowledge/` directory that informs all projects
- You want `landscape.md` written once at the org level, not repeated per project
Run the command
```
$ productkit workspace streak-org

Workspace created successfully!

Workspace: streak-org

Next steps:
  1. cd streak-org
  2. productkit init my-app
  3. claude
  4. /productkit.landscape   (writes to workspace root)
```
What it creates
```
streak-org/
├── .productkit/
│   └── config.json          # { "type": "workspace", ... }
├── .claude/
│   └── commands/
│       └── productkit.landscape.md
└── knowledge/
    └── README.md
```
The workspace gets the landscape slash command and a knowledge/ directory. Individual projects created inside it will automatically detect the workspace and link to it.
Building a single product? Skip the workspace and run `productkit init streak` directly. Workspaces are for orgs managing multiple products under one roof.
Initialize Your Project
This scaffolds a new Product Kit project with all the slash commands, a CLAUDE.md for agent context, and a knowledge/ directory. If you're inside a workspace, the project automatically links to it.
Run the command
```
$ cd streak-org
$ productkit init streak

How are you building this product?
  1. Solo — I'm building it myself
  2. Team — I'm working with engineers/designers
Choose (1 or 2): 2

Project initialized successfully!

Next steps:
  1. cd streak
  2. claude
  3. /productkit.constitution
```
What it creates
```
streak/
├── .productkit/
│   └── config.json          # { "type": "project", "workspace": "..", "mode": "team", ... }
├── .claude/
│   └── commands/
│       ├── productkit.constitution.md
│       ├── productkit.users.md
│       ├── productkit.problem.md
│       └── ... (all slash commands except landscape)
├── knowledge/
│   └── README.md
├── CLAUDE.md
├── README.md
└── .gitignore
```
Notice that landscape is not included in the project — it lives at the workspace level. When you run /productkit.landscape from Claude Code, it writes landscape.md to the workspace root, and all projects inside the workspace read from it.
Solo vs. Team mode
The mode you choose during init affects how downstream commands behave:
- Solo: Task lists instead of user stories, personal build plans instead of sprint estimates, and Tech Review resolves questions conversationally
- Team: User stories with acceptance criteria, engineering handoffs, sprint-ready story points
You can also skip the prompt by passing `--mode solo` or `--mode team`.
Adding Product Kit to an existing codebase? Run `productkit init --existing` from inside your project directory to add Product Kit without creating a new folder.
No workspace? You can run `productkit init streak` from any directory. The project gets all slash commands except landscape, which lives at the workspace level. To follow this walkthrough's Step 0, create a workspace first.
Set Up Your Knowledge Base
Before you begin the 9-step workflow, set up your knowledge/ directory. This is where you store research files — competitor analyses, user interview transcripts, market reports, articles, anything that informs your product thinking.
How it works
- Drop any research files into `knowledge/` (PDFs, markdown, text files)
- Run `/product-kit:learn` to index them into `knowledge-index.md`
- All other commands automatically read this index and reference your evidence
For Streak, you might add articles about habit formation psychology, competitor app reviews, or behavioral science research. When you later run /product-kit:problem, Claude will reference this research when framing the problem.
You can re-run /product-kit:learn at any point in the workflow. The index updates automatically.
Landscape
"Front-load the context so every future step is smarter."
Why this step matters
Before you can build something useful, Claude needs to understand your world. Without landscape context, later commands will suggest features that don't fit your team size, market position, or tech stack.
This step isn't about your product idea yet — it's about the environment your product will live in. A 2-person bootstrapped startup and a 50-person funded company will get very different suggestions in every subsequent step.
Think of it as giving Claude the same context a new hire would need on their first day. Who are we? What do we do? Who else is doing it?
What to prepare
- Your company basics (stage, size, funding)
- Team composition and skills
- Tech stack and platform priorities
- 3-5 competitors you've researched
- Market gaps you've noticed
The conversation
Claude
Let's start with the basics. What does your company do, and what stage are you at?
You
We're a two-person startup building personal productivity tools. Pre-revenue, bootstrapped.
Claude
Got it. Two people — what does each person do? And what's your tech stack?
You
I'm the PM and my co-founder writes code. We're targeting mobile-first, building in React Native with Expo.
Claude
Who are your competitors in the habit tracking space? I need specifics — not just "there are some apps," but which ones and what they get right or wrong.
You
Habitica (gamified, complex — lots of features but overwhelming), Streaks (Apple-only, simple but limited), and Notion templates (flexible but zero guidance). The gap we see is between "too simple" and "too complex."
Full artifact
## Company
- Name: (unnamed startup)
- Stage: Pre-revenue / Bootstrapped
- Team Size: 2 (PM + Engineer)
- Tech Stack: React Native, Expo
- Revenue: None (pre-launch)
## Domain
- Market: Personal productivity — habit tracking
- Competitors:
- Habitica: Gamified habit tracker. Rich feature set but steep learning curve. Appeals to RPG fans.
- Streaks: Apple-only, minimal UI. Simple but limited — no flexibility in streak definitions.
- Notion templates: Infinitely flexible but no built-in guidance or motivation mechanics.
- Competitive Gap: No product occupies the middle ground between oversimplified trackers and complex gamified systems.
## Constraints
- Budget: Zero (bootstrapped)
- Timeline: Aiming for MVP in 8-12 weeks
- Platform: Mobile-first (iOS + Android via React Native)
When to re-run
- Team size changes (hiring, co-founder leaves)
- A major new competitor enters the market
- You pivot to a different market or platform
Constitution
"Make the hard tradeoffs before you write a line of code."
Why this step matters
Principles prevent scope creep. When your engineer asks "should we add social features?" the constitution gives you the answer without a debate.
Good principles are opinionated. "We value quality" isn't a principle — everyone values quality. "Simplicity over power" is a principle because it tells you what to sacrifice.
Without explicit principles, every feature request becomes a subjective argument. With them, you have a decision filter that works even when you're not in the room.
What to prepare
- What your product stands for
- What you will never do
- Past decisions you regret or are proud of
- How you'd resolve a tradeoff between speed and quality
The conversation
Claude
I see you're a 2-person team building mobile-first. Let's define what Streak stands for. What's the one word that describes the experience you want users to have?
You
Simplicity.
Claude
"Simplicity" is vague — Streaks is simple and you said it's too limited. What do you mean specifically? Give me a concrete example.
You
If a feature needs a tooltip or explanation, it's too complex. We'd rather have 3 features that feel effortless than 10 that need a tutorial.
Claude
That's much clearer. Now, when simplicity conflicts with what power users want, what do you sacrifice?
You
Power users. We're not building for people who want 50 configurable settings. If someone needs that, Habitica exists. We're building for people who gave up on habit apps because they were too much work.
Full artifact
## Vision
A habit tracker so simple that keeping a habit feels easier than breaking one.
## Core Principles
1. Effortless over powerful — No feature should require an explanation. If it needs a tooltip, cut it.
2. Momentum over metrics — Show streaks and progress, not spreadsheets and charts. Feeling good matters more than data.
3. Forgiveness over perfection — Missing one day shouldn't feel like failure. The system should encourage recovery, not punishment.
## Non-Negotiables
- No dark patterns (guilt-based notifications, shame screens)
- No data selling — user data stays on device
- Free core experience — monetize through premium, not paywalls on basics
## Decision Framework
When in doubt: Does this feature make the app feel lighter or heavier? If heavier, cut it.
## Anti-Patterns
- Never add a feature just because a competitor has it
- Never show a user a wall of settings on first launch
- Never punish a missed day with negative messaging
When to re-run
- After validation reveals your assumptions were wrong
- When team values shift significantly
- Before a major pivot
Users
"'Everyone' is not a user."
Why this step matters
Building for "everyone" means building for no one. A specific persona lets you make decisions. Would "The Restarter" want this feature? You can answer that. Would "people" want it? You can't.
Personas turn subjective debates into testable claims. Instead of arguing about whether to add a feature, you ask: "Does our primary persona need this?" That question has a researchable answer.
What to prepare
- Who you've talked to or observed
- Demographic details and behavioral patterns
- What they currently do to solve this problem
The conversation
Claude
I've read your landscape and constitution. Now — who has the problem you're solving? Who specifically keeps failing at building habits?
You
People who want to build habits.
Claude
That's 3 billion people. I need you to be much more specific. Who specifically keeps downloading habit apps and quitting? What's their life like? Walk me through their Tuesday.
You
Mid-20s professionals, 1-3 years into their career. They have unpredictable schedules — some days they're in back-to-back meetings, others they have free time. They've tried habit apps before, used each one for about two weeks, then forgot about them.
Claude
Good. What does their phone look like? How many habit apps are in their app graveyard?
You
They've downloaded 2-3 habit apps. Habitica was too complex, Streaks was too simple, and they tried a Notion template that they forgot about after a week. They feel guilty about it — "I can't even stick to a simple habit tracker."
Full artifact
## Primary Persona: The Restarter
- Who: 24-30 year old professionals, 1-3 years into career
- Context: Unpredictable schedule, high ambition, low follow-through on personal goals
- Day-in-the-life: Wakes up with good intentions, gets pulled into work, remembers their habit at 10 PM, feels too tired, skips it, feels guilty
- Pain Points: Guilt from broken streaks, app fatigue, feature overload
- Current Solutions: Downloaded 2-3 habit apps, used each <2 weeks. Currently using nothing.
- Quote: "I can't even stick to a habit tracker, let alone the habit."
## Secondary Persona: The Optimizer
- Who: 28-35 year old self-improvement enthusiasts
- Context: Already tracks multiple habits, finds current apps too simple
- Pain Points: Wants flexible scheduling, finds binary daily tracking too rigid
## Key Insight
Both personas share a frustration with rigid daily tracking. The Restarter is punished by it, the Optimizer is limited by it. A flexible model serves both.
When to re-run
- After validation interviews reveal a different primary user
- When you discover a new user segment you hadn't considered
Problem
"'We need an app' is a solution. What's the problem?"
Why this step matters
Most failed products solve the wrong problem. They build what the founder wants, not what users need. Problem framing is the highest-leverage activity in product development — get it wrong and nothing downstream matters.
The key insight: separate the problem from the solution. "We need a habit tracker" is a solution. The problem is something deeper — something that could be solved in multiple ways.
What to prepare
- User pain points from step 2
- Specific moments where the problem occurs
- Root causes vs. symptoms
The conversation
Claude
I've read your users.md. The Restarter feels guilty about broken streaks. What specific pain is driving that guilt?
You
They can't stick to habits. They start strong but always fall off after a week or two.
Claude
That's the symptom, not the problem. Lots of people can't stick to habits for lots of reasons — low motivation, bad goals, unrealistic expectations. What specifically goes wrong for your user?
You
They miss one day because of their unpredictable schedule, the app resets their streak to zero, and they feel like they've failed completely. So they give up.
Claude
Now we're getting somewhere. Is the problem that they break streaks, or that breaking a streak feels like total failure?
You
The second one. They will miss days — that's inevitable. The problem is that existing apps treat a missed day as a reset, so they give up entirely instead of recovering.
Full artifact
## Problem Statement
Existing habit trackers punish imperfection. When users with unpredictable schedules inevitably miss a day, the all-or-nothing streak model makes them feel like they've failed completely, leading to abandonment rather than recovery.
## Impact
- Frequency: Happens within 2-3 weeks of starting any new habit app
- Severity: Users don't just stop using the app — they stop trying to build the habit entirely
- Scope: Affects anyone with an unpredictable schedule (most working professionals)
## Root Cause
The streak model is binary: you either did it today or you didn't. This works for people with perfectly consistent schedules but fails everyone else. One missed day erases weeks of progress, making the psychological cost of failure disproportionate to the actual setback.
## Current Alternatives & Why They Fail
- Habitica: Still uses daily tracking. Gamification distracts from the core problem.
- Streaks: Rigid daily model with no flexibility in definitions.
- Manual tracking: No motivation or accountability mechanics.
## Scope
- In: How habit apps handle imperfect adherence
- Out: Motivation, goal-setting, social accountability (for now)
## Success Criteria
Users continue tracking habits after missing a day, instead of abandoning the app.
When to re-run
- After validation proves the problem is different than expected
- When user research reveals a deeper root cause
Assumptions
"What are you betting on without knowing it?"
Why this step matters
Every product decision is built on assumptions. Most teams don't make these explicit, so they never test them. The assumptions step surfaces your hidden bets and ranks them by risk.
This is what separates validated products from expensive guesses. If your critical assumptions are wrong, you'll find out before you build, not after.
What to prepare
- Nothing — Claude reads your artifacts and surfaces assumptions automatically
- Think about what you're least sure about
The conversation
Claude
I've analyzed your landscape, constitution, users, and problem artifacts. I found 7 assumptions embedded in your work. Let me walk you through the riskiest ones. First: you're assuming users abandon because of the streak model, not because of low motivation. If it's actually low motivation, your entire solution is misdirected.
You
That's our biggest bet, yeah. We're not 100% sure that's the cause.
Claude
That's a critical assumption then — low confidence, high impact. Here's another: you're assuming "Restarters" exist as a distinct, reachable segment. If they're not reachable through online communities, your go-to-market strategy fails. How confident are you about that?
You
Honestly, not very. We think they're on Reddit and Twitter but we haven't tested it.
Full artifact
## Critical (Low Confidence, High Impact)
1. Users abandon because of the streak model, not low motivation
- Source: problem.md
- Confidence: Low
- Risk: If it's low motivation, a forgiving streak model won't help
- Test: Interview 5-8 lapsed habit app users about why they quit
2. "Restarters" exist as a distinct segment and are reachable
- Source: users.md
- Confidence: Low
- Risk: If not a real segment, positioning and marketing fail
- Test: Survey in r/getdisciplined and r/habits
## Important (Medium Confidence)
3. A "weekly rhythm" model would feel more forgiving than daily streaks
- Source: constitution.md principle #3
- Test: Prototype test with 5 users
4. Mobile-first is the right platform choice
- Source: landscape.md
- Test: Check where users currently track habits
## Minor (High Confidence)
5. Users prefer simple UI over feature-rich UI
- Source: constitution.md, competitive analysis
- Confidence: High (supported by competitor reviews)
## Validation Priority
1. Assumption #1 (streak model causation)
2. Assumption #2 (segment reachability)
3. Assumption #3 (weekly rhythm preference)
When to re-run
- After completing validation (to update confidence levels)
- When new information surfaces that changes your risk profile
Validate
"Don't guess. Go find out."
Why this step matters
This is where Product Kit earns its keep. Assumptions without evidence are just opinions. Validation turns opinions into data.
This step generates the actual instruments — interview scripts, survey questions, CSV templates — and then processes the results. It's the step that takes real-world effort: you have to talk to people.
It's also the step most people skip. Don't. The cost of 5-8 interviews is a few hours. The cost of building the wrong product is months.
What to prepare
- Access to potential users (Reddit communities, Twitter, personal network)
- 1-2 weeks for data collection
- A willingness to hear that you're wrong
Finding your first 5 interviewees
The hardest part of validation isn't the interview — it's finding people to talk to. Here's where to start:
- Reddit & online communities: Post in subreddits related to your problem space. For Streak, r/getdisciplined, r/habits, and r/productivity are goldmines. Write a short post: "I'm researching why people quit habit apps. If you've downloaded and abandoned one, I'd love a 15-minute chat. DM me."
- Twitter/X: Search for people complaining about the problem. "quit habit app" or "broken streak" surface real frustrations in real time.
- Your extended network: Post on LinkedIn or ask friends to introduce you. "Know anyone who's tried and quit a habit tracker?" One warm intro often leads to 2-3 more.
- In-person: Coffee shops, co-working spaces, meetups. "Quick question — have you ever used a habit tracking app?" works surprisingly well.
- Respondent services: If you have budget, tools like Respondent.io or User Interviews can find screened participants in 24-48 hours.
The conversation (Phase 1: Generate instruments)
Claude
I'll create validation instruments for your 2 critical assumptions. For assumption #1 (streak model causes abandonment), here's an interview guide with open-ended questions that avoid leading the witness.
You
Great. What about reaching Restarters for the survey?
Claude
For assumption #2, here's a short survey you can post in r/getdisciplined and r/habits. I've also created a CSV template for recording interview responses so I can analyze them later.
The conversation (Phase 2: After data collection)
You
I interviewed 8 people and have notes. I also got 45 survey responses from Reddit.
Claude
Drop the files into your project and let me analyze them. ... Done. 6 of 8 interviewees cited "streak reset" as the specific moment they quit. Here's the strongest quote: P3 said "I missed one day and it said Day 1. I just deleted it." Your first critical assumption is validated with strong evidence.
You
What about the Restarter segment?
Claude
Your survey got 45 responses in 3 days from two subreddits. 38 of 45 respondents matched the Restarter profile. The segment exists and is reachable. Both critical assumptions are now validated.
Full artifact
## Assumption Status
1. Users abandon because of the streak model — VALIDATED
- Evidence: 6/8 interviewees cited streak reset as the moment they quit
- Key quote: P3: "I missed one day and it said Day 1. I just deleted it."
- Quality: Strong (consistent pattern across diverse interviewees)
2. Restarters exist as a reachable segment — VALIDATED
- Evidence: 38/45 survey respondents matched the Restarter profile
- Channels: r/getdisciplined (28 responses), r/habits (17 responses)
- Quality: Moderate (self-selected sample, but strong signal)
## Validation Gate: PASSED
All critical assumptions validated. Cleared to proceed to Solution.
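Once interview notes are coded into the CSV template, a tally like "6/8 cited streak reset" is a simple count. Here's a minimal TypeScript sketch — the field names are illustrative, not Product Kit's actual CSV schema:

```typescript
// Illustrative: one record per interviewee, with the quit reason coded by hand.
interface InterviewResponse {
  participant: string;
  quitReason: string; // e.g. "streak-reset", "forgot-app"
}

// Count how many interviewees cited each reason.
function tallyReasons(responses: InterviewResponse[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const r of responses) {
    counts.set(r.quitReason, (counts.get(r.quitReason) ?? 0) + 1);
  }
  return counts;
}

const notes: InterviewResponse[] = [
  { participant: "P1", quitReason: "streak-reset" },
  { participant: "P2", quitReason: "forgot-app" },
  { participant: "P3", quitReason: "streak-reset" },
  { participant: "P4", quitReason: "streak-reset" },
  { participant: "P5", quitReason: "streak-reset" },
  { participant: "P6", quitReason: "forgot-app" },
  { participant: "P7", quitReason: "streak-reset" },
  { participant: "P8", quitReason: "streak-reset" },
];
console.log(tallyReasons(notes).get("streak-reset")); // 6
```

In practice Claude does this analysis for you when you drop the files in — the point is that the evidence behind a VALIDATED label is a countable pattern, not a vibe.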
What if an assumption fails?
You
I interviewed 6 people and... they actually said the streak reset wasn't a big deal. Most of them quit because they simply forgot about the app after a few days. Notifications didn't help either.
Claude
That's a significant finding. Your critical assumption #1 is invalidated — the root cause isn't the streak model, it's forgetting. This changes everything downstream. Here's what I recommend:
1. Update problem.md: The root cause shifts from "streak punishment" to "habit apps fail to stay top-of-mind." That's a different problem with different solutions.
2. Re-run /product-kit:solution: Weekly rhythms may still help, but the core solution should focus on re-engagement, not forgiveness.
3. Don't panic: This is exactly why you validate. You just saved months of building the wrong thing.
If critical assumptions still have [PENDING] evidence, /product-kit:solution will refuse to proceed. This prevents building on guesses.
When to re-run
- When you collect new data
- When you discover new assumptions to test
Solution
"NOW you can brainstorm what to build."
Why this step matters
Notice that solution is step 6, not step 1. Most teams start here. Product Kit makes you earn the right to brainstorm by validating the problem first.
Now that you know the problem is real and your assumptions are tested, solution ideas are grounded in evidence, not fantasy. Every option gets evaluated against what you've learned.
What to prepare
- Any solution ideas you've been sitting on (it's normal to have them by now)
- Competitor approaches you've studied
The conversation
Claude
Your validation confirmed that streak resets cause abandonment. Let me generate 5 solution approaches. Option A: Streak Freeze (let users pause streaks). Option B: Grace Days (allow 1-2 misses per streak). Option C: Weekly Rhythm (track "days this week" with a flexible target). Option D: Decay Model (streaks fade slowly instead of resetting). Option E: Milestone Streaks (celebrate milestones, ignore daily gaps).
You
I like Option C — the weekly rhythm idea.
Claude
What about Option A? Streak Freeze is simpler to build — just add a "pause" button.
You
It still punishes missing days, just delays the punishment. That conflicts with our constitution principle — "Forgiveness over perfection."
Claude
Good point — Option C aligns better with your principles. It reframes the whole mental model: instead of "did I do it today?" it becomes "am I on track this week?" That's fundamentally more forgiving. Let's scope the v1.
Full artifact
## Options Considered
A. Streak Freeze — Pause streaks manually. Pro: Simple. Con: Still binary thinking; conflicts with "forgiveness" principle.
B. Grace Days — Allow 1-2 misses. Pro: Familiar model. Con: Still punishes the 3rd miss; arbitrary threshold.
C. Weekly Rhythm — Track days per week with flexible target. Pro: Reframes the mental model; aligns with constitution. Con: Unfamiliar to users; needs clear onboarding.
D. Decay Model — Streaks fade over days. Pro: Gradual. Con: Hard to visualize; feels abstract.
E. Milestone Streaks — Celebrate milestones only. Pro: Positive. Con: No daily accountability.
## Recommended Solution: Weekly Rhythm (Option C)
Replace all-or-nothing daily streaks with a "weekly rhythm" model. Users set a target like "4 out of 7 days." Missing a day doesn't break anything — you see progress toward the week's goal. Streaks count weeks met, not consecutive days.
## v1 Scope
- Core rhythm tracker with weekly targets
- Simple daily check-in (one tap)
- Recovery nudge after a missed day
## Deferred to v2+
- Social accountability (unvalidated — low confidence)
- Advanced analytics / charts (conflicts with "momentum over metrics")
- Integrations with other apps
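The weekly rhythm model is simple enough to sketch in a few lines. This TypeScript sketch (hypothetical types and field names — not Streak's actual code) shows the key property: a missed day contributes nothing, but nothing ever resets.

```typescript
// Hypothetical data model for the weekly rhythm — illustrative only.
interface Habit {
  weeklyTarget: number;     // e.g. 4 means "4 out of 7 days"
  completedWeeks: number[]; // days checked in per past week, oldest first
  currentWeekDays: number;  // check-ins so far this week
}

// Progress toward this week's goal. A day with no check-in simply
// adds nothing — there is no reset path at all.
function weekProgress(habit: Habit): number {
  return Math.min(habit.currentWeekDays / habit.weeklyTarget, 1);
}

// Streak = consecutive most-recent completed weeks that met the target.
function weekStreak(habit: Habit): number {
  let streak = 0;
  for (let i = habit.completedWeeks.length - 1; i >= 0; i--) {
    if (habit.completedWeeks[i] >= habit.weeklyTarget) streak++;
    else break;
  }
  return streak;
}

const meditate: Habit = { weeklyTarget: 4, completedWeeks: [5, 4], currentWeekDays: 2 };
console.log(weekProgress(meditate)); // 0.5 — halfway to this week's goal
console.log(weekStreak(meditate));   // 2  — both past weeks met the target
```

Notice that "missing a day does not reset progress" falls out of the data model itself: there is no code path that zeroes anything.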
When to re-run
- When validation results change
- When you realize the solution doesn't address the root cause
Prioritize
"What's the smallest thing you can ship that solves the core problem?"
Why this step matters
Even a focused solution has features that can be ranked. Prioritization prevents scope creep by forcing every feature to justify itself with a score. It's the difference between "let's add this too" and "that scored 1.5, it's v2."
What to prepare
- The solution from step 6
- A sense of engineering effort for each feature (or access to your engineer)
The conversation
Claude
I've broken your solution into 8 features and scored each one using Impact × Confidence / Effort. The top scorer is "Single daily check-in" at 20.0 — high impact, high confidence, low effort. Social accountability scored 1.5 — I'd cut it from v1.
You
Agreed on cutting social. What about the recovery nudge?
Claude
It scored 8.0 — solid but not essential. Should it be must-have or nice-to-have?
You
Nice-to-have. The core rhythm tracker comes first.
Claude
One note: these effort scores are PM estimates. Share this table with your engineer before we write the spec — engineering effort is the most commonly misjudged score.
Full artifact
| Rank | Feature | Impact | Confidence | Effort | Score | Status |
|------|---------|--------|------------|--------|-------|--------|
| 1 | Single daily check-in | 5 | 4 | 1 | 20.0 | v1 must-have |
| 2 | Week completion celebration | 4 | 4 | 1 | 16.0 | v1 must-have |
| 3 | Habit creation flow | 3 | 5 | 1 | 15.0 | v1 must-have |
| 4 | Weekly rhythm tracker | 5 | 5 | 2 | 12.5 | v1 must-have |
| 5 | Recovery nudge | 4 | 4 | 2 | 8.0 | v1 nice-to-have |
| 6 | Progress visualization | 3 | 3 | 2 | 4.5 | v2 |
| 7 | Notifications | 3 | 3 | 3 | 3.0 | v2 |
| 8 | Social accountability | 3 | 2 | 4 | 1.5 | v2 |
## Engineering Review
Effort scores are PM estimates. Engineer validation required before spec.
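The Impact × Confidence / Effort formula is mechanical once the scores exist. A minimal TypeScript sketch (hypothetical types — Product Kit runs this scoring conversationally, you don't write this yourself):

```typescript
// Illustrative scoring helper for Impact × Confidence / Effort.
interface Feature {
  name: string;
  impact: number;     // 1–5: how much it moves the core problem
  confidence: number; // 1–5: how sure you are, given validation evidence
  effort: number;     // relative build cost (engineer-validated if possible)
}

const score = (f: Feature): number => (f.impact * f.confidence) / f.effort;

// Rank features by score, highest first.
function prioritize(features: Feature[]): Feature[] {
  return [...features].sort((a, b) => score(b) - score(a));
}

const ranked = prioritize([
  { name: "Weekly rhythm tracker", impact: 5, confidence: 5, effort: 2 },
  { name: "Single daily check-in", impact: 5, confidence: 4, effort: 1 },
  { name: "Social accountability", impact: 3, confidence: 2, effort: 4 },
]);
for (const f of ranked) console.log(f.name, score(f));
// Scores: 20, 12.5, and 1.5 — matching the table above.
```

The formula rewards cheap, well-evidenced wins: dividing by effort is what pushes a high-impact but expensive feature like social accountability to the bottom.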
When to re-run
- After engineering review changes effort scores
- When scope changes significantly
Spec
"Everything synthesized. Every claim traced. Ready to build."
Why this step matters
The spec is the culmination of all previous work. It synthesizes 8 artifacts into one document that an engineer can build from. Every feature traces to evidence. Every decision traces to a principle.
It's not a requirements doc — it's a validated blueprint. The difference? Requirements docs say "build this feature." A spec says "build this feature because 6 of 8 users cited this as their #1 pain point."
What to prepare
- All 8 previous artifacts complete
- Priorities reviewed by engineering
The conversation
Claude
I'll synthesize all your artifacts into a product spec. Every feature will link back to validation evidence and constitution principles.
You
Sounds good. Include acceptance criteria for every feature — specific, testable conditions that must be true for a feature to be considered done.
Claude
Done. I also noticed a gap — your success metrics don't include a retention metric, which is critical given your problem is about abandonment. Want me to add week-over-week retention as a primary metric?
You
Yes, that's a good catch. Add it.
Full artifact
## Overview
A mobile habit tracker that replaces all-or-nothing daily streaks with a forgiving "weekly rhythm" model, designed for 24-30 year old professionals with unpredictable schedules who've abandoned 2-3 habit apps.
## Principles (from constitution.md)
1. Effortless over powerful
2. Momentum over metrics
3. Forgiveness over perfection
## Target User: The Restarter (from users.md)
24-30 year old professionals. Unpredictable schedules. Downloaded 2-3 habit apps, used each <2 weeks.
## Problem (from problem.md)
All-or-nothing streak models punish imperfection, causing abandonment. 6/8 interviewees confirmed (validation.md).
## v1 Features
### Weekly Rhythm Tracker — Must Have
- What: Users set a weekly target (e.g., 4/7 days). Progress shown as filled dots.
- Why: 6/8 interviewees cited streak resets as why they quit (validation.md)
- Acceptance criteria:
☐ User can create a habit with a weekly frequency target
☐ Missing a day does not reset progress
☐ Weekly completion is celebrated, not daily perfection
### Single Daily Check-in — Must Have
- What: One-tap "did it" button. No forms, no details.
- Why: Constitution principle #1 (effortless over powerful)
- Acceptance criteria:
☐ Check-in takes <2 seconds
☐ No required fields beyond the tap
## Success Metrics
- Week-over-week retention (primary)
- % of users who continue after missing a day
- Weekly rhythm completion rate
## Risks
- Users may not understand the weekly model without onboarding
- Weekly rhythms may feel less motivating than daily streaks for some users
## Appendix
Source artifacts: landscape.md, constitution.md, users.md, problem.md, assumptions.md, validation.md, solution.md, priorities.md
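Week-over-week retention can be defined several ways; one common definition — of the users active last week, the share active again this week — sketched in TypeScript (illustrative, not Streak's actual analytics code):

```typescript
// Of the users active last week, what fraction came back this week?
function weekOverWeekRetention(
  activeLastWeek: Set<string>,
  activeThisWeek: Set<string>,
): number {
  if (activeLastWeek.size === 0) return 0; // avoid division by zero
  let retained = 0;
  for (const user of activeLastWeek) {
    if (activeThisWeek.has(user)) retained++;
  }
  return retained / activeLastWeek.size;
}

const lastWeek = new Set(["u1", "u2", "u3", "u4"]);
const thisWeek = new Set(["u2", "u3", "u5"]);
console.log(weekOverWeekRetention(lastWeek, thisWeek)); // 0.5
```

Whatever definition you pick, write it into the spec — "retention" left undefined is exactly the kind of ambiguity the Clarify command exists to catch.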
When the spec is done, give spec.md to Claude and start building.
When to re-run
- When priorities change after engineering review
- After significant user feedback post-launch
Utility Commands
These commands can be run at any point in the workflow. Here's how each one helps during the Streak project.
Clarify
After writing solution.md, you notice it mentions a "daily check-in" but your constitution leans against daily pressure. Is that a contradiction?
Claude
I found a potential contradiction. Your solution includes a "single daily check-in," while your constitution principles "momentum over metrics" and "forgiveness over perfection" both push against daily pressure. However, looking deeper, the check-in is optional and frictionless (one tap). The constitution prohibits pressure, not daily actions. Recommendation: clarify in the spec that the check-in has no penalty for skipping.
Analyze
After completing step 5, run Analyze for a quality check.
Claude
Score: 7/10. Strengths: strong validation evidence, clear problem framing, well-defined persona. Gaps: competitive differentiation is weak — your problem statement doesn't explain why you will solve this better than a competitor who copies the weekly rhythm model. Recommendation: add a "why us" section to your problem statement.
Bootstrap
Your friend has a half-built habit app with no product artifacts. Run Bootstrap on their codebase.
Claude
I scanned the codebase and found: a React Native app with daily tracking, streak counting, and push notifications. I've auto-drafted users.md (inferred from UI text), problem.md (inferred from the streak reset handler), and solution.md (the current implementation). Review each one and refine — these are starting points, not finished artifacts.
Audit
After building Streak v1, run Audit to compare spec vs. code.
Claude
Spec vs. Code Comparison:
• Weekly rhythm tracker — Implemented
• Single daily check-in — Implemented
• Recovery nudge — Not implemented (was nice-to-have)
• Settings page — Built but not specced (scope creep?)
Learn
You found 3 articles about habit formation psychology and saved them to knowledge/.
Claude
Indexed 3 files into knowledge-index.md. Key findings: "implementation intentions" increase habit adherence by 2-3x (source: habit-research.pdf). I'll reference this in future artifact drafts when relevant.
Tech Review
Before writing stories, run Tech Review so engineering concerns are surfaced early.
Claude
Feature-by-feature feasibility:
• Weekly rhythm tracker — Straightforward, ~2 days. Uses standard date math.
• Recovery nudge — [Needs engineering input] Push notification timing depends on OS-level scheduling constraints. Consider local notifications vs server-triggered.
• Social accountability — High effort (~2 weeks). Requires real-time sync, friend graph, privacy controls. Recommend deferring to v2.
Stories
After Tech Review, break the spec into implementable stories.
Claude
Epic 1: Core Tracking
• E1-S1: Weekly rhythm setup — Must-have, 3 pts
• E1-S2: Daily check-in — Must-have, 2 pts, depends on E1-S1
Epic 2: Engagement
• E2-S1: Recovery nudge — Nice-to-have, 5 pts
Each story has acceptance criteria and definition of done. Export with productkit export --stories-csv for Jira/Linear import.
Next Steps
You've gone from "I want to build a habit tracker" to a validated, evidence-based product spec. Along the way, you:
- Mapped your competitive landscape and constraints
- Defined principles that filter every decision
- Built a specific, research-grounded persona
- Framed a testable problem (not a solution in disguise)
- Surfaced your hidden assumptions and tested them
- Brainstormed solutions grounded in evidence
- Prioritized ruthlessly with a scoring framework
- Synthesized everything into a build-ready spec
The spec is your handoff to development. Give spec.md to Claude and start building the right thing. Need to share your artifacts with stakeholders? Run productkit export to bundle all artifacts into a single markdown file.