DOMAIN:CREATIVE — EXPERT PANEL SCORING¶
OWNER: leah ALSO_USED_BY: rick, christel, alexander, ilian, felice (understand how their work is scored) UPDATED: 2026-04-03 SCOPE: multi-perspective quality scoring for all creative deliverables, applied during Leah's reconciliation
PURPOSE¶
RULE: every creative deliverable receives multi-perspective scoring alongside contradiction scanning RULE: scoring dimensions vary by deliverable type (see matrices below) RULE: AI Humanizer perspective ALWAYS included at 1.5x weight for text-containing deliverables RULE: scoring is additive to — not a replacement for — SG-03 contradiction scanning
SCORING_MATRICES¶
FOR_COPY (Rick deliverables — headlines, campaigns, taglines, scripts)¶
| Dimension | Weight | What to evaluate |
|---|---|---|
| brand_alignment | 0.20 | Does the copy express the strategic brief's positioning and messaging pillars? |
| audience_resonance | 0.15 | Does the language match what the target audience actually uses? Not jargon, not corporate. |
| ai_humanization | 0.30 | Does the copy pass AI detection patterns? (1.5x weight — read ai-detection-patterns.md) |
| clarity_impact | 0.20 | Does the reader know exactly what to do after reading? Is every sentence earning its space? |
| craft_quality | 0.15 | Rhythm, word economy, originality. Does it sound like a person wrote it with care? |
FOR_BRAND_IDENTITY (Christel deliverables — logos, visual identity, brand books)¶
| Dimension | Weight | What to evaluate |
|---|---|---|
| strategic_alignment | 0.25 | Does the visual identity express the brand positioning from the strategic brief? |
| system_coherence | 0.25 | Do all elements (logo, colors, typography, patterns) work as a unified system? |
| scalability | 0.20 | Does the identity work across all required touchpoints (web, mobile, print, social)? |
| differentiation | 0.15 | Does it look distinct from competitors? Not generic, not derivative. |
| production_quality | 0.15 | Clean vectors, correct color spaces, proper file formats, usable brand book. |
FOR_UI_DESIGN (Alexander deliverables — interfaces, design system, components)¶
| Dimension | Weight | What to evaluate |
|---|---|---|
| brand_alignment | 0.20 | Does the UI express the brand identity? Color tokens, typography, visual language. |
| usability | 0.25 | Clear hierarchy, obvious interactions, no ambiguity. Krug test: don't make me think. |
| accessibility | 0.20 | WCAG AA minimum (AAA target). Contrast, touch targets, screen reader compatibility. |
| consistency | 0.20 | Does every screen follow the same patterns? No one-off components. |
| ai_humanization | 0.15 | For any UI copy — microcopy, labels, errors. (1.5x weight applied to this dimension) |
FOR_MOTION (Ilian deliverables — animations, video, motion graphics)¶
| Dimension | Weight | What to evaluate |
|---|---|---|
| brand_alignment | 0.25 | Does the motion language match the brand identity? Timing, easing, energy level. |
| narrative_clarity | 0.20 | Does the viewer understand the message without sound? Is the story clear? |
| technical_quality | 0.20 | Frame rate, compression, export specs, accessibility (reduced motion, captions). |
| emotional_impact | 0.20 | Does it evoke the intended feeling? Premium vs playful vs urgent. |
| production_efficiency | 0.15 | Appropriate complexity for the context. Hero asset vs social clip = different bar. |
FOR_PRODUCTION_ASSETS (Felice deliverables — images, infographics, video variants)¶
| Dimension | Weight | What to evaluate |
|---|---|---|
| brand_compliance | 0.25 | Correct colors, fonts, logo usage, visual style per brand book. |
| brief_adherence | 0.25 | Does the asset match what was requested? Dimensions, format, content. |
| technical_quality | 0.20 | Resolution, file format, file size, color profile (sRGB web, CMYK print). |
| accessibility | 0.15 | Alt text, contrast, no text-in-image without accessible alternative. |
| ai_humanization | 0.15 | For assets with text — social posts, infographics. (1.5x weight applied) |
SCORING_PROTOCOL¶
STEP 1: identify deliverable type → select correct scoring matrix above STEP 2: score each dimension 0-100 with specific evidence (cite the element, not a feeling) STEP 3: for ai_humanization dimensions — READ wiki/docs/domains/content/ai-detection-patterns.md and score against 24+ patterns STEP 4: apply weights — multiply each score by its weight STEP 5: apply 1.5x multiplier on ai_humanization dimension weight (e.g., 0.30 becomes effective 0.45) STEP 6: normalize total back to 0-100 scale STEP 7: determine outcome per thresholds below
THRESHOLD_ACTIONS¶
| Total Score | Determination | Action |
|---|---|---|
| >= 85 | PASS | Deliverable proceeds to SG-05 client review |
| 70-84 | CONDITIONAL_PASS | Document improvement areas. Proceed if no CRITICAL contradiction findings. |
| < 70 | BLOCKED | Generate improvement brief (see creative-feedback-protocol.md). Agent must rework. |
RULE: CONDITIONAL_PASS + any CRITICAL contradiction finding = BLOCKED (contradiction scanning overrides) RULE: scoring result is included in reconciliation report alongside contradiction findings
RECURSIVE_IMPROVEMENT¶
RULE: when score < 85, generate structured improvement brief per creative-feedback-protocol.md RULE: after agent rework, re-score ENTIRE deliverable (not just improved dimensions) RULE: maximum 3 scoring cycles before human escalation RULE: track score progression: [cycle 1: 62] → [cycle 2: 78] → [cycle 3: 87 → PASS] RULE: if score does not improve between cycles, escalate immediately (agent is stuck)
INTEGRATION_WITH_CONTRADICTION_SCANNING¶
Expert panel scoring runs ALONGSIDE Leah's existing contradiction scan (SESSION_FLOW steps 3-6). The two systems produce independent results that combine at the GATE phase:
PASS = zero CRITICAL contradictions AND expert panel score >= 85
CONDITIONAL = zero CRITICAL contradictions AND score 70-84
BLOCKED = ANY CRITICAL contradiction OR score < 70
IF contradiction scan says PASS but expert panel says BLOCKED → BLOCKED (quality gate wins) IF expert panel says PASS but contradiction scan says BLOCKED → BLOCKED (compliance gate wins)
SELF_CHECK¶
CHECK: did I select the correct scoring matrix for this deliverable type? CHECK: did I include ai_humanization for text-containing deliverables? CHECK: did I apply the 1.5x weight multiplier on ai_humanization? CHECK: is every dimension score backed by specific evidence (not "looks good")? CHECK: did I combine panel scoring with contradiction scan results at the GATE?
READ_ALSO: domains/content/ai-detection-patterns.md READ_ALSO: domains/creative/creative-acceptance-criteria.md READ_ALSO: domains/creative/brand-compliance.md READ_ALSO: domains/creative/creative-feedback-protocol.md