Cross-Session Struggle Follow-up Tracking

Problem

The struggle detector's dimension 5 (outcome_quality) is scored only from the session's own outcome signals. It cannot detect when Agent B had to fix Agent A's broken output, so Agent A's session looks clean even though it was actually a struggle.

Impact

False negatives: sessions that produce broken output but exit "successfully" are invisible to the learning pipeline.

Proposed Solution

Cross-reference completion files with subsequent task assignments via the hook trigger graph in Redis. If a completion triggers a follow-up task for the same work item within 30 minutes, and the follow-up task contains "fix", "correct", or "redo", raise the original session's struggle score.
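
A minimal sketch of the matching rule, to make the proposal concrete. It is not the detector's actual code: the record types (Completion, FollowUp), the function names, and the penalty value are all hypothetical, and the Redis hook trigger graph lookup is assumed to have already produced plain in-memory records. Matching keywords at the start of a word (so "fixes" counts but "prefix" does not) is also an assumption; the proposal only says the task "contains" the keyword.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta

# Values taken from the proposal: 30-minute window and three keywords.
FOLLOW_UP_WINDOW = timedelta(minutes=30)
FIX_KEYWORDS = ("fix", "correct", "redo")

# Assumption: match a keyword at the start of a word, case-insensitively,
# so "Fixes" and "correction" match but "prefix" does not.
_FIX_PATTERN = re.compile(r"\b(?:" + "|".join(FIX_KEYWORDS) + r")", re.IGNORECASE)


@dataclass(frozen=True)
class Completion:
    """Hypothetical record built from a session's completion file."""
    session_id: str
    work_item: str
    finished_at: datetime


@dataclass(frozen=True)
class FollowUp:
    """Hypothetical record for a task assignment triggered by a completion."""
    work_item: str
    description: str
    assigned_at: datetime


def is_fix_followup(completion: Completion, task: FollowUp) -> bool:
    """True if `task` looks like a repair of `completion`'s output."""
    if task.work_item != completion.work_item:
        return False
    delay = task.assigned_at - completion.finished_at
    if delay < timedelta(0) or delay > FOLLOW_UP_WINDOW:
        return False
    return _FIX_PATTERN.search(task.description) is not None


def adjust_outcome_quality(base_score: float, completion: Completion,
                           followups: list[FollowUp],
                           penalty: float = 1.0) -> float:
    """Raise the struggle score once if any follow-up is a fix.

    `penalty` is a placeholder; the real scale of outcome_quality is not
    specified in this proposal.
    """
    if any(is_fix_followup(completion, t) for t in followups):
        return base_score + penalty
    return base_score
```

An ordinary handoff such as "Deploy parser to staging" for the same work item leaves the score unchanged under this rule, because the keyword filter and the time window must both match.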

Acceptance Criteria

  • Detector flags Agent A's session when Agent B fixes A's output within 30 minutes
  • No false positives from normal handoff chains