DOMAIN:DEVOPS:MERGE_GATE

OWNER: marta (Team Alfa), iwona (Team Bravo)
UPDATED: 2026-03-24
SCOPE: PR merge governance, quality gates, DORA evidence, release readiness
AGENTS: marta (Alfa goalkeeper), iwona (Bravo goalkeeper), koen (quality), jasper (reconciliation), marco (conflict resolution)


MERGE_GATE:OVERVIEW

PURPOSE: prevent broken, undertested, or spec-drifted code from reaching main
PRINCIPLE: the merge gate is the last defense before production. It must be strict, automated, and auditable.
PRINCIPLE: merge gate agents (marta/iwona) are GOALKEEPERS, not gatekeepers. They help code get IN safely, not keep it OUT arbitrarily.

FLOW: PR opened -> automated checks -> marta/iwona review -> merge queue -> main

IF automated checks fail THEN PR blocked, author notified
IF marta/iwona flag issues THEN PR blocked with structured feedback
IF all green THEN PR enters merge queue
IF merge queue conflict THEN marco handles resolution


MERGE_GATE:PR_SIZE_ENFORCEMENT

THRESHOLDS

METRIC: total diff lines (additions + deletions)
SMALL: <= 200 lines (ideal, fast review)
MEDIUM: 201-500 lines (acceptable, needs focused review)
LARGE: 501-1000 lines (warning, must justify)
EXCESSIVE: > 1000 lines (blocked, must split)

RULE: PRs over 1000 diff lines are AUTO-BLOCKED
REASON: review effectiveness falls and defect risk rises as PR size grows (Microsoft Research, 2015)
EXCEPTION: generated code (migrations, lockfiles) excluded from count
EXCEPTION: bulk renames/moves counted as 0 if content unchanged
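
EXAMPLE: a minimal bash sketch of the threshold logic above, assuming the diff total is computed from `git diff --numstat` output. The function names classify_pr_size and sum_numstat are hypothetical and not part of the existing workflow.

```shell
#!/usr/bin/env bash

# classify_pr_size LINES -> prints SMALL, MEDIUM, LARGE, or EXCESSIVE.
# Boundaries mirror the THRESHOLDS list (200 / 500 / 1000 diff lines).
classify_pr_size() {
  local lines="$1"
  if [ "$lines" -le 200 ]; then
    echo "SMALL"
  elif [ "$lines" -le 500 ]; then
    echo "MEDIUM"
  elif [ "$lines" -le 1000 ]; then
    echo "LARGE"
  else
    echo "EXCESSIVE"
  fi
}

# sum_numstat reads `git diff --numstat` output on stdin
# (added<TAB>deleted<TAB>path) and prints additions + deletions.
# Binary files appear as "-" and pure renames as "0 0", so both add nothing.
sum_numstat() {
  awk '{ added += $1; deleted += $2 } END { print added + deleted + 0 }'
}
```

Possible usage in CI: `classify_pr_size "$(git diff --numstat origin/main...HEAD | sum_numstat)"`, with only EXCESSIVE treated as blocking.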

IMPLEMENTATION

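# .github/workflows/pr-size-check.yml
name: PR Size Check

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  # Job id matches the required status check name "pr-size-check"
  pr-size-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Check PR size
        run: |
          # --numstat prints "added<TAB>deleted<TAB>path" per file.
          # Pure renames show 0/0 and binary files show "-", so neither adds to the total.
          # Summing in awk also yields 0 (not an empty string) when the filtered diff is empty.
          DIFF_LINES=$(git diff --numstat origin/main...HEAD -- \
            ':!pnpm-lock.yaml' \
            ':!*.generated.*' \
            ':!drizzle/migrations/*' \
            | awk '{ added += $1; deleted += $2 } END { print added + deleted + 0 }')
          echo "Total diff lines: $DIFF_LINES"
          if [ "$DIFF_LINES" -gt 1000 ]; then
            echo "::error::PR too large ($DIFF_LINES lines). Max 1000. Split into smaller PRs."
            exit 1
          elif [ "$DIFF_LINES" -gt 500 ]; then
            echo "::warning::PR is large ($DIFF_LINES lines). Consider splitting."
          fi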

SPLIT_STRATEGIES

STRATEGY: vertical slice (one feature end-to-end)
USE_WHEN: feature touches multiple layers (API + UI + tests)
HOW: one PR per user story or acceptance criterion

STRATEGY: horizontal slice (one layer at a time)
USE_WHEN: refactoring or infrastructure changes
HOW: one PR for DB migration, one for API, one for UI

STRATEGY: preparatory refactor
USE_WHEN: feature requires refactoring existing code
HOW: refactor PR first (no behavior change), then feature PR


MERGE_GATE:TEST_WEAKENING_DETECTION

WHAT_IS_TEST_WEAKENING

DEFINITION: any change that reduces test coverage or loosens assertions without justification
EXAMPLES:
- deleting test files without deleting the code they tested
- changing expect(result).toBe(42) to expect(result).toBeTruthy()
- adding .skip to test cases
- reducing assertion count in a test
- lowering coverage thresholds

DETECTION_RULES

RULE: if test files deleted AND source files NOT deleted THEN flag
RULE: if .skip or .todo added to test THEN flag
RULE: if coverage threshold lowered in vitest.config THEN block
RULE: if assertion count in test file decreases THEN flag
RULE: if expect.anything() or expect.any() replaces specific matcher THEN flag

IMPLEMENTATION

- name: Test weakening check
  run: |
    # Check for .skip/.todo additions.
    # Only added lines are counted; "+++ b/file" headers are dropped first.
    SKIPS=$(git diff origin/main...HEAD -- '*.test.*' '*.spec.*' \
      | grep -vE '^\+\+\+ ' \
      | grep -cE '^\+.*\.(skip|todo)[(.]' || true)
    if [ "$SKIPS" -gt 0 ]; then
      echo "::warning::$SKIPS test skip/todo additions detected. Justify in PR description."
    fi

    # Check for test file deletions without source deletions.
    # Every deleted non-test file (except .d.ts) counts as a source deletion here.
    DELETED_TESTS=$(git diff --name-only --diff-filter=D origin/main...HEAD | grep -cE '\.(test|spec)\.' || true)
    DELETED_SOURCE=$(git diff --name-only --diff-filter=D origin/main...HEAD | grep -cvE '\.(test|spec)\.|\.d\.ts$' || true)
    if [ "$DELETED_TESTS" -gt "$DELETED_SOURCE" ]; then
      echo "::error::More test files deleted ($DELETED_TESTS) than source files ($DELETED_SOURCE). Unjustified test removal."
      exit 1
    fi
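
EXAMPLE: the assertion-count rule in DETECTION_RULES has no implementation in the step above. One possible sketch, assuming assertions are written as expect(...) calls. The helper names count_assertions and assertions_decreased are hypothetical.

```shell
#!/usr/bin/env bash

# count_assertions reads test file content on stdin and prints
# the number of "expect(" occurrences (0 when there are none).
count_assertions() {
  grep -o 'expect(' | wc -l | tr -d ' '
}

# assertions_decreased BEFORE_FILE AFTER_FILE
# Exits 0 when AFTER_FILE contains fewer assertions than BEFORE_FILE.
assertions_decreased() {
  local before after
  before=$(count_assertions < "$1")
  after=$(count_assertions < "$2")
  [ "$after" -lt "$before" ]
}
```

In CI the "before" content could come from `git show origin/main:<path>` written to a temporary file and the "after" content from the checked-out file. A raw count is only a heuristic: it flags a test for review and does not prove that the test was weakened.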

MERGE_GATE:CODE_CHURN_FLAGGING

WHAT_IS_CODE_CHURN

DEFINITION: code that is written and then rewritten within a short period
SIGNAL: high churn = unstable design, unclear requirements, or trial-and-error coding
THRESHOLD: file modified in 3+ PRs within 5 business days = high churn

DETECTION

METHOD: track file modification frequency via git log
TOOL: custom script run by marta/iwona during review

#!/bin/bash
# churn-check.sh - detect high-churn files in current PR
# Counts commits on origin/main: with squash merging, one commit there equals one merged PR.
# "7 days ago" approximates the 5-business-day window.
git diff --name-only origin/main...HEAD | while IFS= read -r file; do
  RECENT_PRS=$(git log origin/main --since="7 days ago" --oneline -- "$file" | wc -l)
  if [ "$RECENT_PRS" -ge 3 ]; then
    echo "HIGH CHURN: $file modified by $RECENT_PRS merged PRs in the last 5 business days"
  fi
done

RESPONSE_TO_HIGH_CHURN:
IF high churn detected THEN marta/iwona flags in PR review
IF same file churns across 5+ PRs THEN escalate to Faye/Sytske (PM) for spec review
NEVER block PR solely for churn — it is a signal, not a gate


MERGE_GATE:SCAFFOLDING_DETECTION

WHAT_IS_SCAFFOLDING

DEFINITION: placeholder code intended to be replaced (TODO, FIXME, stub functions, hardcoded values)
RISK: scaffolding that reaches production becomes permanent technical debt

DETECTION_RULES

RULE: scan for TODO, FIXME, HACK, XXX, TEMP in added lines
RULE: scan for hardcoded localhost URLs, test credentials, magic numbers
RULE: scan for empty catch blocks, functions returning hardcoded values
RULE: scan for console.log in production code (not test files)

- name: Scaffolding detection
  # Draft PRs are exempt from scaffolding checks
  if: ${{ !github.event.pull_request.draft }}
  run: |
    # TODO/FIXME in new code.
    # Case-sensitive whole-word match on added lines, so "temp:" or "attempt" do not count.
    SCAFFOLDING=$(git diff origin/main...HEAD | grep -E '^\+' | grep -vE '^\+\+\+ ' | grep -cE '\b(TODO|FIXME|HACK|XXX|TEMP)\b' || true)
    if [ "$SCAFFOLDING" -gt 0 ]; then
      echo "::warning::$SCAFFOLDING scaffolding markers found in new code. Track or resolve before merge."
    fi

    # Console.log in production code (test and spec files excluded)
    CONSOLE_LOGS=$(git diff origin/main...HEAD -- 'src/**' ':!src/**/*.test.*' ':!src/**/*.spec.*' | grep -cE '^\+.*console\.log' || true)
    if [ "$CONSOLE_LOGS" -gt 0 ]; then
      echo "::error::$CONSOLE_LOGS console.log statements in production code. Remove before merge."
      exit 1
    fi

EXCEPTION: console.error and console.warn are allowed (they serve a purpose)
EXCEPTION: scaffolding in draft PRs is expected and not flagged
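
EXAMPLE: the rules for hardcoded localhost URLs and empty catch blocks are not covered by the step above. A rough sketch that scans the added lines of a unified diff, with hypothetical helper names. It only catches single-line patterns and will miss multi-line empty catch blocks.

```shell
#!/usr/bin/env bash

# added_lines reads a unified diff on stdin and prints the added lines
# without their leading "+", skipping the "+++ b/file" headers.
added_lines() {
  grep -E '^\+' | grep -vE '^\+\+\+ ' | sed 's/^+//'
}

# count_localhost_urls counts added lines containing a localhost or 127.0.0.1 URL.
count_localhost_urls() {
  added_lines | grep -cE 'https?://(localhost|127\.0\.0\.1)' || true
}

# count_empty_catches counts added lines with a single-line empty catch block,
# for example "catch (e) {}" or "catch {}".
count_empty_catches() {
  added_lines | grep -cE 'catch[[:space:]]*(\([^)]*\))?[[:space:]]*\{[[:space:]]*\}' || true
}
```

Possible usage: `git diff origin/main...HEAD -- 'src/**' | count_localhost_urls`, reported as a warning so reviewers can judge each hit.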


MERGE_GATE:SPEC_TO_PR_TRACEABILITY

REQUIREMENT

EVERY PR must link to its originating work package or spec
FORMAT: PR description must contain Spec: WP-{id}, Closes: #issue-number, or Resolves: #issue-number
REASON: SOC 2 CC8.1 requires traceability from requirement to deployment

VALIDATION

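- name: Traceability check
  env:
    # Pass the body through env. Inlining the expression in the script would let
    # a crafted PR description inject shell commands.
    PR_BODY: ${{ github.event.pull_request.body }}
  run: |
    if ! printf '%s\n' "$PR_BODY" | grep -qE '(Spec:\s*WP-[0-9]+|Closes:\s*#[0-9]+|Resolves:\s*#[0-9]+)'; then
      echo "::error::PR must reference a work package (Spec: WP-xxx) or issue (Closes: #xxx or Resolves: #xxx)"
      exit 1
    fi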

EXCEPTION: dependency updates (Dependabot/Renovate) exempt
EXCEPTION: hotfix PRs may reference incident ID instead
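
EXAMPLE: one way the two exceptions could be folded into the check. Everything specific here is an assumption: the bot logins dependabot[bot] and renovate[bot], the hotfix/ branch prefix, and the "Incident: INC-123" reference format are illustrative, and has_trace_ref is a hypothetical helper.

```shell
#!/usr/bin/env bash

# has_trace_ref BODY AUTHOR [BRANCH]
# Exits 0 when the PR satisfies the traceability rule.
has_trace_ref() {
  local body="$1" author="$2" branch="${3:-}"

  # Dependency-update bots are exempt (assumed logins).
  case "$author" in
    'dependabot[bot]'|'renovate[bot]') return 0 ;;
  esac

  local pattern='Spec:[[:space:]]*WP-[0-9]+|(Closes|Resolves):[[:space:]]*#[0-9]+'

  # Hotfix branches may reference an incident ID instead (assumed format).
  case "$branch" in
    hotfix/*) pattern="$pattern|Incident:[[:space:]]*INC-[0-9]+" ;;
  esac

  printf '%s\n' "$body" | grep -qE "($pattern)"
}
```

In a workflow step the three arguments would be passed in through env variables rather than interpolated into the script text.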

TRACEABILITY_CHAIN

Client request → Aimee scope → Anna spec → Antje TDD → Work Package (WP-xxx)
  → Developer PR (references WP-xxx) → Marta/Iwona merge gate → Main → Deploy

AUDIT: full chain must be reconstructable for any production change
STORED: work package references in PR metadata, queryable via GitHub API


MERGE_GATE:DORA_METRICS

SEE: domains/devops/dora-metrics.md for deep dive

TRACKED_AT_MERGE_GATE:
- change_failure_rate: PRs that cause revert within 7 days
- lead_time: time from first commit to merge
- rework_rate: PRs that require force-push after review

EXPOSED: via PR labels and GitHub Actions annotations
AGGREGATED: weekly by marta/iwona into project metrics
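
EXAMPLE: a small sketch of how two of the tracked values could be computed during the weekly aggregation. It assumes the merged and reverted PR counts and the commit and merge timestamps (epoch seconds, for example from `git log --format=%ct`) are already collected. The function names are hypothetical.

```shell
#!/usr/bin/env bash

# change_failure_rate MERGED REVERTED
# Prints the share of merged PRs reverted within the window, as a percentage
# with one decimal place. Prints 0.0 when nothing was merged.
change_failure_rate() {
  awk -v merged="$1" -v reverted="$2" 'BEGIN {
    if (merged == 0) { print "0.0"; exit }
    printf "%.1f\n", (reverted / merged) * 100
  }'
}

# lead_time_hours FIRST_COMMIT_EPOCH MERGED_EPOCH
# Prints whole hours between the first commit and the merge.
lead_time_hours() {
  echo $(( ($2 - $1) / 3600 ))
}
```

For instance, `change_failure_rate 40 3` prints 7.5 and `lead_time_hours 1000 8200` prints 2.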


MERGE_GATE:MERGE_QUEUE

WHAT_IS_MERGE_QUEUE

FEATURE: GitHub native merge queue (enabled per-repo)
PURPOSE: serialize merges, run CI on merge result (not just branch)
WHY: prevents "merge skew" where two green PRs conflict when combined

CONFIGURATION

# Branch protection rules for main:
require_status_checks: true
required_status_checks:
  - ci
  - e2e
  - pr-size-check
  - traceability-check
require_merge_queue: true
merge_queue:
  merge_method: squash
  max_entries: 5
  min_entries_to_merge: 1
  grouping_strategy: ALLGREEN

RULES

RULE: all PRs to main go through merge queue
RULE: merge method is SQUASH (clean history, one commit per PR)
RULE: max 5 PRs in queue (prevents long CI waits)
RULE: if queue entry fails CI, it is ejected and author notified

PITFALL: merge queue runs CI on the MERGED result, not the branch
IMPLICATION: a PR that passes CI alone may fail in queue if it conflicts with another queued PR
RESPONSE: author must rebase and re-enter queue


MERGE_GATE:SOC2_CC8_EVIDENCE

CONTROLS_MAPPED

CC8.1: change management (the only criterion in the CC8 series; it covers authorization, testing, approval, and implementation of changes)

CC8.1 (process): changes follow a documented change management process
EVIDENCE: PR metadata (author, reviewers, timestamps, linked spec)
COLLECTED: automatically by GitHub, queryable via API

CC8.1 (authorization): infrastructure and software changes are authorized
EVIDENCE: branch protection rules, required reviewers, environment approvals
COLLECTED: GitHub audit log + branch protection API

CC8.1 (testing): changes are tested before deployment
EVIDENCE: CI status checks (required), test results, coverage reports
COLLECTED: GitHub Actions run results

EVIDENCE_AUTOMATION

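# Monthly SOC 2 evidence collection
# state=closed is required: the endpoint lists only open PRs by default,
# and open PRs never have merged_at set.
mkdir -p evidence
gh api 'repos/{owner}/{repo}/pulls?state=closed&per_page=100' \
  --paginate \
  -q '.[] | select(.merged_at != null) | {
    number: .number,
    title: .title,
    author: .user.login,
    merged_at: .merged_at,
    merge_commit: .merge_commit_sha
  }' > evidence/prs-$(date +%Y-%m).json

# requested_reviewers only lists reviewers who have not responded yet,
# so collect the submitted reviews per PR instead.
jq -r '.number' evidence/prs-$(date +%Y-%m).json | while read -r n; do
  gh api "repos/{owner}/{repo}/pulls/$n/reviews" \
    -q "{pr: $n, reviews: [.[] | {reviewer: .user.login, state: .state}]}"
done > evidence/reviews-$(date +%Y-%m).json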

STORAGE: evidence exported monthly to compliance archive
RETENTION: 7 years (internal policy; SOC 2 itself does not prescribe a fixed retention period)


MERGE_GATE:EMERGENCY_PROCEDURES

BREAK_GLASS

DEFINITION: bypass merge gate for critical production fix
AUTHORITY: Dirk-Jan (CEO) only
PROCESS:
1. developer creates hotfix branch from production tag
2. Dirk-Jan approves bypass via GitHub admin override
3. PR merged without full CI suite (smoke test still required)
4. incident report filed within 24 hours
5. marta/iwona retroactively review the change
6. SOC 2 exception logged with justification

TRACKING:

BREAK_GLASS_LOG:
  date: YYYY-MM-DD
  approver: dirk-jan
  pr: #xxx
  reason: [production incident description]
  retroactive_review: [date completed]
  remediation: [follow-up PR number]

RULE: break-glass used more than twice per quarter triggers process review
RULE: every break-glass MUST have a follow-up PR with proper tests
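
EXAMPLE: a sketch of the quarterly threshold check, assuming the date fields of all BREAK_GLASS_LOG entries have been extracted one per line in YYYY-MM-DD form. Both function names are hypothetical.

```shell
#!/usr/bin/env bash

# break_glass_count YEAR QUARTER
# Reads YYYY-MM-DD dates on stdin and prints how many fall in that quarter.
break_glass_count() {
  awk -F- -v year="$1" -v quarter="$2" '
    $1 == year && int(($2 - 1) / 3) + 1 == quarter { n++ }
    END { print n + 0 }'
}

# needs_process_review COUNT
# Exits 0 when break-glass was used more than twice in the quarter.
needs_process_review() {
  [ "$1" -gt 2 ]
}
```

Possible usage: `needs_process_review "$(break_glass_count 2026 1 < dates.txt)" && echo "process review required"`, where dates.txt is a hypothetical export of the log dates.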


MERGE_GATE:RELEASE_READINESS_SCORE

CALCULATION

SCORE: 0.0 to 1.0 (weighted composite)

| Factor | Weight | Scoring |
|---|---|---|
| All CI checks green | 0.25 | 1.0 if all pass, 0.0 if any fail |
| Test coverage >= threshold | 0.15 | 1.0 if >= 80%, proportional below |
| No high/critical vulnerabilities | 0.15 | 1.0 if clean, 0.5 if medium only, 0.0 if high/critical |
| All PRs have spec traceability | 0.10 | ratio of traced PRs |
| No scaffolding in production code | 0.10 | 1.0 if clean, deduct 0.1 per marker |
| E2E pass rate | 0.15 | pass rate (flaky = 0.5 per flaky test) |
| Change failure rate < 15% | 0.10 | 1.0 if < 5%, 0.5 if < 15%, 0.0 if >= 15% |

THRESHOLD: release readiness >= 0.8 required for production deploy
THRESHOLD: release readiness >= 0.6 required for staging deploy
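
EXAMPLE: a worked sketch of the weighted composite, using the weights and scoring rules listed above. Two interpretations are assumptions: "proportional below" is read as coverage / 80, and the scaffolding deduction is floored at 0.0. The function names and argument order are hypothetical.

```shell
#!/usr/bin/env bash

# readiness_score CI_GREEN COVERAGE_PCT VULN TRACED_RATIO MARKERS E2E_RATE CFR_PCT
#   CI_GREEN: 1 or 0        VULN: none | medium | high
#   TRACED_RATIO, E2E_RATE: 0.0 to 1.0
# Prints the composite score with two decimal places.
readiness_score() {
  awk -v ci="$1" -v cov="$2" -v vuln="$3" -v traced="$4" \
      -v markers="$5" -v e2e="$6" -v cfr="$7" 'BEGIN {
    cov_s  = (cov >= 80) ? 1.0 : (cov / 80)
    vuln_s = (vuln == "none") ? 1.0 : ((vuln == "medium") ? 0.5 : 0.0)
    scaf_s = 1.0 - 0.1 * markers
    if (scaf_s < 0) { scaf_s = 0 }
    cfr_s  = (cfr < 5) ? 1.0 : ((cfr < 15) ? 0.5 : 0.0)

    score  = 0.25 * ci
    score += 0.15 * cov_s
    score += 0.15 * vuln_s
    score += 0.10 * traced
    score += 0.10 * scaf_s
    score += 0.15 * e2e
    score += 0.10 * cfr_s
    printf "%.2f\n", score
  }'
}

# deploy_target SCORE -> prints production, staging, or none per the thresholds.
deploy_target() {
  awk -v s="$1" 'BEGIN {
    if (s >= 0.8) { print "production" }
    else if (s >= 0.6) { print "staging" }
    else { print "none" }
  }'
}
```

Worked case: green CI, 72% coverage, medium-only vulnerabilities, 0.9 traced ratio, 2 markers, 0.95 E2E rate, and 8% change failure rate give 0.25 + 0.135 + 0.075 + 0.09 + 0.08 + 0.1425 + 0.05 = 0.8225, printed as 0.82, which clears the production threshold.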

DISPLAY

FORMAT: badge in PR description, updated on each push
AGGREGATE: project-level score on admin-ui dashboard


MERGE_GATE:PITFALLS

PITFALL: approving PRs without reading the diff
IMPACT: quality erosion, SOC 2 evidence becomes meaningless
RULE: marta/iwona must leave substantive review comment on every PR

PITFALL: adding required status checks AFTER PRs are open
IMPACT: existing PRs stuck, cannot satisfy new checks
FIX: always make new checks "soft" (warning) for one sprint, then required

PITFALL: merge queue ejecting PRs due to transient CI failures
IMPACT: developer frustration, merge delays
FIX: retry logic in CI, separate flaky test quarantine

PITFALL: squash merge losing granular commit history
IMPACT: harder to git bisect
MITIGATION: PR description must summarize all changes, link to original commits

PITFALL: break-glass becoming routine
IMPACT: SOC 2 non-conformity, quality erosion
SIGNAL: if break-glass > 2x per quarter, process is broken
RESPONSE: root cause analysis on why normal process was too slow