DOMAIN:TESTING — RECONCILIATION_CALIBRATION

OWNER: jasper
ALSO_USED_BY: antje (TDD source), marije (post-impl source), anna (spec arbitration)
UPDATED: 2026-03-24
SCOPE: calibration examples for TDD-vs-post-impl test reconciliation — JIT injected before every reconciliation task
PURPOSE: ensure Jasper consistently resolves test conflicts by tracing back to the specification, and correctly triages coverage gaps


HOW_TO_USE_THIS_PAGE

Read these examples BEFORE reconciling any test pair.
Jasper's job: compare TDD tests (Antje, pre-implementation) against post-impl tests (Marije, post-implementation) and resolve discrepancies.

RECONCILIATION_PRINCIPLES:
- The SPEC is the ultimate authority, not either test suite
- When TDD and post-impl disagree, ask: "What does Anna's spec say?"
- If the spec is ambiguous, escalate to Anna for clarification — do NOT guess
- Coverage gaps are blocking only when they leave spec-required behavior, or error paths with real impact, untested
- Cosmetic gaps (logging, dev utilities) are noted but never blocking

DECISION_FRAMEWORK:

TDD says X, Post-impl says Y
  ├─ Spec says X → TDD is right, post-impl needs update
  ├─ Spec says Y → Post-impl is right, TDD was based on wrong assumption
  ├─ Spec says both X and Y are valid → Not a conflict, both pass
  ├─ Spec says neither X nor Y → Escalate to Anna
  └─ Spec is silent on this behavior → Escalate to Anna
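
The tree above can be sketched as a small lookup. This is a minimal illustration, not part of any existing tooling; the type names `SpecFinding` and `Verdict` and the function `resolveConflict` are hypothetical.

```typescript
// What the spec says about the two observed behaviors:
// X is what the TDD test expects, Y is what the post-impl test expects.
type SpecFinding =
  | "supports_tdd"      // spec says X
  | "supports_postimpl" // spec says Y
  | "allows_both"       // spec says both X and Y are valid
  | "supports_neither"  // spec says neither X nor Y
  | "silent";           // spec does not address this behavior

type Verdict =
  | "update_postimpl"
  | "update_tdd"
  | "no_conflict"
  | "escalate_to_anna";

// One branch per line of the decision framework.
function resolveConflict(finding: SpecFinding): Verdict {
  switch (finding) {
    case "supports_tdd":
      return "update_postimpl";
    case "supports_postimpl":
      return "update_tdd";
    case "allows_both":
      return "no_conflict";
    case "supports_neither":
    case "silent":
      return "escalate_to_anna";
  }
}
```

The input is a finding about the spec, never a judgment about which suite looks more plausible. That keeps the spec as the only authority.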


EXAMPLE_1: TDD CONTRADICTS POST-IMPL — TDD IS RIGHT

RESOLUTION: KEEP TDD, UPDATE POST-IMPL

SCENARIO

Feature: User registration with email validation.
Anna's spec states: "Registration MUST reject email addresses without a TLD (e.g., user@localhost is invalid)."

Antje's TDD test:

it('should reject email without TLD', async () => {
  const result = await registerUser({ email: 'admin@localhost', password: 'Str0ng!Pass' });
  expect(result.success).toBe(false);
  expect(result.error).toContain('invalid email');
});

Marije's post-impl test:

it('should accept valid email formats', async () => {
  // Developer's regex accepts user@hostname (no TLD required)
  const result = await registerUser({ email: 'admin@localhost', password: 'Str0ng!Pass' });
  expect(result.success).toBe(true);
});

RECONCILIATION_ANALYSIS

CONFLICT: TDD expects rejection of admin@localhost, post-impl expects acceptance.
SPEC_CHECK: Anna's spec explicitly says "MUST reject email addresses without a TLD."
VERDICT: TDD is correct. The developer's implementation is too permissive.

ROOT_CAUSE: The developer used a regex that treats user@hostname as valid. The spec requires stricter validation: the domain part must end in a TLD, so it needs at least one dot separating non-empty labels.

ACTION_ITEMS:
- Developer must update email validation to reject addresses without TLD
- Marije must update post-impl test to expect rejection for admin@localhost
- Add TDD-aligned tests for other malformed domains: user@.com, user@com., user@-domain.com

BLOCKING: YES — spec violation, security-relevant (localhost could bypass email verification flows)
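
For calibration, a minimal sketch of a check that would satisfy the spec sentence and the extra cases listed in the action items. The helper name `hasTld` is hypothetical and this is not the project's validator. It only checks the domain shape and does not attempt full RFC-level email validation.

```typescript
// Hypothetical helper: true only when the domain part carries a TLD.
function hasTld(email: string): boolean {
  const at = email.lastIndexOf("@");
  // Reject a missing "@", an empty local part, or an empty domain.
  if (at <= 0 || at === email.length - 1) return false;

  const labels = email.slice(at + 1).split(".");
  // A single label means there is no dot, so no TLD (e.g. "localhost").
  if (labels.length < 2) return false;

  // Every label must be non-empty and must not start or end with a hyphen.
  return labels.every(
    (label) =>
      label.length > 0 && !label.startsWith("-") && !label.endsWith("-")
  );
}

// Inputs from the example and its action items.
const mustReject = ["admin@localhost", "user@.com", "user@com.", "user@-domain.com"];
const rejectedAll = mustReject.every((email) => !hasTld(email));

// One ordinary address that must still pass.
const acceptsNormal = hasTld("user@example.com");
```

Any replacement for the developer's regex should reject all four inputs in `mustReject` and still accept the ordinary address.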


EXAMPLE_2: TDD CONTRADICTS POST-IMPL — POST-IMPL IS RIGHT

RESOLUTION: UPDATE TDD, KEEP POST-IMPL

SCENARIO

Feature: Search results pagination.
Anna's spec states: "Search endpoint returns paginated results. Default page size is configurable."

Antje's TDD test:

it('should return 10 results per page by default', async () => {
  await seedProducts(25);
  const result = await searchProducts({ query: 'test' });
  expect(result.items).toHaveLength(10);
  expect(result.totalPages).toBe(3);
});

Marije's post-impl test:

it('should return default page size from config', async () => {
  await seedProducts(25);
  // Config sets default page size to 20
  const result = await searchProducts({ query: 'test' });
  expect(result.items).toHaveLength(20);
  expect(result.totalPages).toBe(2);
});

RECONCILIATION_ANALYSIS

CONFLICT: TDD assumes default page size is 10, post-impl says it's 20.
SPEC_CHECK: Anna's spec says "default page size is configurable." It does NOT specify the default value.
IMPLEMENTATION_CHECK: The config file sets DEFAULT_PAGE_SIZE=20.

VERDICT: Post-impl is correct. TDD hardcoded an assumption (10) that the spec did not mandate. The spec said "configurable" — the configured value is 20.

ROOT_CAUSE: Antje assumed a common convention (10 per page) because the spec didn't specify. This is not Antje's fault — the spec was ambiguous on the exact default value.

ACTION_ITEMS:
- Antje must update TDD test to read from config or use the configured value (20)
- Flag to Anna: spec should explicitly state the default page size to prevent future ambiguity
- Consider making the TDD test config-aware: expect(result.items).toHaveLength(config.defaultPageSize)

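BLOCKING: NO. The spec delegates the default page size to configuration and the implementation honors the configured value, so there is no code defect. The unspecified default is a spec gap and is handled through the follow-up.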

FOLLOW_UP: File spec clarification request to Anna. This prevents the same class of ambiguity in future features.
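
A minimal sketch of the config-aware assertion suggested in the action items. The `config` object and the `paginate` function are hypothetical stand-ins for the real config module and `searchProducts`. The source only states that DEFAULT_PAGE_SIZE is 20.

```typescript
// Hypothetical config object mirroring DEFAULT_PAGE_SIZE=20.
const config = { defaultPageSize: 20 };

interface Page<T> {
  items: T[];
  totalPages: number;
}

// Hypothetical stand-in for searchProducts: paginates an in-memory list.
function paginate<T>(
  all: T[],
  page: number = 1,
  pageSize: number = config.defaultPageSize
): Page<T> {
  const start = (page - 1) * pageSize;
  return {
    items: all.slice(start, start + pageSize),
    totalPages: Math.ceil(all.length / pageSize),
  };
}

// Same setup as both tests in the example: 25 seeded products.
const products = Array.from({ length: 25 }, (_, i) => `product-${i + 1}`);
const firstPage = paginate(products);

// Expectations are derived from config, not hardcoded,
// so the test survives a change of the configured default.
const expectedLength = Math.min(products.length, config.defaultPageSize);
const expectedPages = Math.ceil(products.length / config.defaultPageSize);
```

With this shape the TDD test asserts what the spec actually promises (the default comes from config) instead of a guessed number.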


EXAMPLE_3: COVERAGE GAP THAT MATTERS

RESOLUTION: BLOCKING — ADD TESTS BEFORE RELEASE

SCENARIO

Feature: Payment processing with Stripe.
Anna's spec states: "Payment must handle: success, card declined, insufficient funds, network timeout, and Stripe outage."

TDD tests cover: success, card declined, insufficient funds.
Post-impl tests cover: success, card declined, network timeout.

Gaps: post-impl does not cover insufficient funds, TDD does not cover network timeout, and neither suite covers Stripe outage.

RECONCILIATION_ANALYSIS

COVERAGE_MATRIX:

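| Scenario           | TDD (Antje) | Post-Impl (Marije) | Spec Required |
|--------------------|-------------|--------------------|---------------|
| Success            | YES         | YES                | YES           |
| Card declined      | YES         | YES                | YES           |
| Insufficient funds | YES         | NO                 | YES           |
| Network timeout    | NO          | YES                | YES           |
| Stripe outage      | NO          | NO                 | YES           |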

GAPS_IDENTIFIED:
- Stripe outage: UNTESTED BY BOTH — spec-required error path in payment flow
- Insufficient funds: only TDD tested — post-impl should verify implementation handles it
- Network timeout: only post-impl tested — TDD should have caught this from spec
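
The gap list can be derived mechanically from the coverage matrix. The sketch below is a hypothetical illustration: `findGaps` and the data shapes are not existing tooling, and the scenario lists are copied from this example.

```typescript
type Suite = "TDD" | "post-impl";

interface Gap {
  scenario: string;
  missingIn: Suite[];
}

// Scenarios copied from the spec sentence and the two suites above.
const specRequired = [
  "success",
  "card declined",
  "insufficient funds",
  "network timeout",
  "Stripe outage",
];
const tddCovered = new Set(["success", "card declined", "insufficient funds"]);
const postImplCovered = new Set(["success", "card declined", "network timeout"]);

// For each spec-required scenario, record which suites fail to cover it.
function findGaps(
  required: string[],
  tdd: Set<string>,
  postImpl: Set<string>
): Gap[] {
  const gaps: Gap[] = [];
  for (const scenario of required) {
    const missingIn: Suite[] = [];
    if (!tdd.has(scenario)) missingIn.push("TDD");
    if (!postImpl.has(scenario)) missingIn.push("post-impl");
    if (missingIn.length > 0) gaps.push({ scenario, missingIn });
  }
  return gaps;
}

const gaps = findGaps(specRequired, tddCovered, postImplCovered);

// Scenarios untested by both suites are the most urgent.
const untestedByBoth = gaps
  .filter((gap) => gap.missingIn.length === 2)
  .map((gap) => gap.scenario);
```

Iterating over the spec-required list, not over the union of existing tests, is the point: a scenario that nobody tested would otherwise never appear in the comparison.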

WHY_THIS_MATTERS:
- Payment flows are the highest-risk code path in any client project
- Stripe outage is not hypothetical — Stripe has had 4 incidents in the past 12 months
- If the app doesn't handle Stripe outage gracefully, users see a blank screen or a 500 error
- Worst case: the user's card is charged but the order is never created

BLOCKING: YES — Stripe outage is untested spec-required behavior in a financial flow.

ACTION_ITEMS:
- Antje: Add TDD tests for the network timeout and Stripe outage scenarios
- Marije: Add post-impl tests for insufficient funds and Stripe outage
- Priority: Stripe outage test is the most critical gap — covers a scenario with real financial risk
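
A minimal sketch of what a Stripe outage test could exercise. Everything here is hypothetical: `PaymentGateway`, `processPayment`, and the `provider_outage` marker are illustrative names, not the Stripe SDK's API and not the project's payment code. The outage is simulated with a test double, so no network access is needed.

```typescript
// Hypothetical gateway abstraction (NOT the Stripe SDK's API).
interface PaymentGateway {
  charge(amountCents: number): Promise<{ chargeId: string }>;
}

type PaymentResult =
  | { status: "paid"; chargeId: string }
  | { status: "provider_unavailable"; retryable: true };

// Hypothetical convention: the gateway adapter marks outage errors with a `kind` field.
function isProviderOutage(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { kind?: unknown }).kind === "provider_outage"
  );
}

async function processPayment(
  gateway: PaymentGateway,
  amountCents: number
): Promise<PaymentResult> {
  try {
    const { chargeId } = await gateway.charge(amountCents);
    return { status: "paid", chargeId };
  } catch (err) {
    if (isProviderOutage(err)) {
      // Return a controlled result instead of letting a 500 reach the user.
      return { status: "provider_unavailable", retryable: true };
    }
    throw err; // other failure modes are out of scope for this sketch
  }
}

// Test double that simulates the provider being down.
const downGateway: PaymentGateway = {
  charge: async () => {
    throw Object.assign(new Error("provider unavailable"), {
      kind: "provider_outage",
    });
  },
};

// Test double for the healthy path, used as a control.
const healthyGateway: PaymentGateway = {
  charge: async () => ({ chargeId: "ch_test_1" }),
};
```

A real test would also assert that no order is created when the result is `provider_unavailable`. That assertion depends on the project's order model and is left out here.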


EXAMPLE_4: COVERAGE GAP THAT IS COSMETIC

RESOLUTION: NOT BLOCKING — NOTE AND MOVE ON

SCENARIO

Feature: Application logging utility.
Anna's spec does NOT mention logging requirements (logging is infrastructure, not feature behavior).

TDD tests: None (Antje correctly skipped — logging is not in the spec).
Post-impl tests: None (Marije did not test the logging utility).

Code coverage tool flags: lib/utils/logger.ts has 0% coverage.

RECONCILIATION_ANALYSIS

COVERAGE_CHECK: logger.ts has no tests.
SPEC_CHECK: The spec does not mention logging. Logging is an internal utility, not user-facing behavior.

WHY_THIS_DOES_NOT_MATTER:
- The logger is a thin wrapper around pino — testing it would test the pino library, not our code
- Logger failures do not affect user-facing behavior (fire-and-forget)
- The logger has no branching logic — it formats and outputs. There is nothing to "get wrong"
- Adding tests here would be test theater (see testing/calibration-examples.md, Example 5)

WHEN_LOGGING_GAPS_WOULD_MATTER:
- If the logger contained PII filtering logic — that MUST be tested
- If the logger wrote to a database (audit trail) — the write must be tested
- If the logger had conditional output (log level routing) — the routing must be tested
- If the spec explicitly required "all API calls must be logged" — coverage is required

BLOCKING: NO — cosmetic gap. Logging utility has no spec requirement and no business logic.

ACTION_ITEMS:
- Note in reconciliation report: "logger.ts untested — no spec requirement, no business logic, acceptable"
- If the code coverage gate fails because of this file, exclude lib/utils/logger.ts from the coverage calculation (not from the codebase)
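
The exclusion in the last action item could look like the sketch below. It assumes the project uses Jest, which the source does not state. `coveragePathIgnorePatterns` and `coverageThreshold` are Jest configuration options, while the threshold value of 80 is a placeholder, not the project's actual gate.

```typescript
// jest.config.ts (assumption: Jest is the test runner)
const jestConfig = {
  collectCoverage: true,
  // Regex patterns matched against file paths; matching files are
  // skipped in the coverage calculation but stay in the codebase.
  coveragePathIgnorePatterns: [
    "/node_modules/",
    "<rootDir>/lib/utils/logger.ts",
  ],
  // Placeholder gate value; the source does not state the real threshold.
  coverageThreshold: {
    global: { lines: 80 },
  },
};

export default jestConfig;
```

Excluding one named file keeps the gate meaningful for everything else. Lowering the global threshold to make room for the logger would hide real gaps.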


RECONCILIATION_DECISION_TABLE

| Situation | Blocking? | Action |
|-----------|-----------|--------|
| TDD and post-impl agree | No | Confirm and move on |
| TDD and post-impl disagree, spec is clear | Yes | Spec wins, update the wrong suite |
| TDD and post-impl disagree, spec is ambiguous | Yes | Escalate to Anna for clarification |
| Coverage gap in spec-required behavior | Yes | Add tests before release |
| Coverage gap in error path with financial/security impact | Yes | Add tests before release |
| Coverage gap in internal utility with no spec requirement | No | Note and move on |
| Coverage gap in cosmetic feature (tooltips, animations) | No | Note, low-priority ticket |
| Both suites test the same thing differently but equivalently | No | Keep both; independent verification has value |
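
The table can also be read as a typed lookup, sketched below. The `Situation` keys and the `summarize` helper are hypothetical names. The rule that a single blocking finding blocks the whole reconciliation is an assumption drawn from the "before release" wording.

```typescript
type Situation =
  | "suites_agree"
  | "disagree_spec_clear"
  | "disagree_spec_ambiguous"
  | "gap_spec_required"
  | "gap_high_impact_error_path"
  | "gap_internal_utility"
  | "gap_cosmetic_feature"
  | "equivalent_tests";

interface Decision {
  blocking: boolean;
  action: string;
}

// One entry per row of the decision table.
const decisionTable: Record<Situation, Decision> = {
  suites_agree: { blocking: false, action: "Confirm and move on" },
  disagree_spec_clear: { blocking: true, action: "Spec wins, update the wrong suite" },
  disagree_spec_ambiguous: { blocking: true, action: "Escalate to Anna for clarification" },
  gap_spec_required: { blocking: true, action: "Add tests before release" },
  gap_high_impact_error_path: { blocking: true, action: "Add tests before release" },
  gap_internal_utility: { blocking: false, action: "Note and move on" },
  gap_cosmetic_feature: { blocking: false, action: "Note, low-priority ticket" },
  equivalent_tests: { blocking: false, action: "Keep both" },
};

// Assumption: one blocking finding is enough to block the reconciliation.
function summarize(findings: Situation[]): { blocking: boolean; actions: string[] } {
  return {
    blocking: findings.some((finding) => decisionTable[finding].blocking),
    actions: findings.map((finding) => decisionTable[finding].action),
  };
}

// Example: an untested logger plus an untested payment error path.
const report = summarize(["gap_internal_utility", "gap_high_impact_error_path"]);
```

Using `Record<Situation, Decision>` makes the compiler reject a table that is missing a row, which mirrors the intent that every situation has exactly one prescribed action.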

ESCALATION_RULES

ESCALATE_TO_ANNA when:
- Spec is ambiguous and both interpretations are reasonable
- Spec is missing a scenario that both test suites assumed differently
- A behavior exists in code that the spec never mentioned

ESCALATE_TO_MARIJE/ANTJE when:
- Their test has a bug (wrong assertion, wrong setup)
- Their test is flaky (passes sometimes, fails sometimes)
- Their test is redundant with the other suite (consolidation opportunity)

ESCALATE_TO_MARTA when:
- Reconciliation reveals a pattern of spec gaps across multiple features
- The same class of conflict keeps recurring (systemic issue)
- Coverage gap is in a security-critical path and release is imminent