
DOMAIN:VISUAL_PRODUCTION:ACCESSIBILITY_MEDIA

OWNER: felice
UPDATED: 2026-03-24
SCOPE: Media accessibility — captions, audio descriptions, alt text, motion, color, keyboard
AGENTS: felice (primary), floris/floor (frontend implementation)
PARENT: Visual Production
COMPLIANCE: WCAG 2.2 AA minimum, European Accessibility Act (EAA)


A11Y:CAPTIONS

REQUIREMENTS

WCAG 1.2.2 (Level A): Captions for prerecorded audio content in synchronized media.
WCAG 1.2.4 (Level AA): Captions for live audio content in synchronized media.

RULE: ALL videos with audio MUST have captions — no exceptions
RULE: auto-generated captions MUST be human-reviewed before delivery
RULE: captions must be accurate — not approximate (accuracy target: 99%+)
RULE: sync tolerance: captions within 100ms of spoken audio
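
The 99% target is only checkable once "accuracy" has a definition. The sketch below assumes one common choice that this document does not itself specify: word-level accuracy, computed as 1 minus the word error rate against a human-verified reference transcript. The names `tokenize` and `captionAccuracy` are hypothetical.

```javascript
// Hypothetical helper: word-level accuracy of a caption draft against a
// human-verified reference transcript (assumed definition: 1 - word error rate).
function tokenize(text) {
  // Lowercase and strip punctuation so "Europe." and "europe" compare equal.
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s'-]/gu, "")
    .split(/\s+/)
    .filter(Boolean);
}

function captionAccuracy(reference, draft) {
  const ref = tokenize(reference);
  const hyp = tokenize(draft);
  if (ref.length === 0) return hyp.length === 0 ? 1 : 0;

  // Word-level Levenshtein distance, keeping only the previous and current row.
  let prev = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  const wordErrorRate = prev[hyp.length] / ref.length;
  return Math.max(0, 1 - wordErrorRate);
}
```

A review step could then gate delivery on `captionAccuracy(reference, draft) >= 0.99`. Punctuation and casing are ignored here, so they still need the human pass.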

FORMAT_SRT

SRT (SubRip Text) — universal compatibility, simple format.

1
00:00:01,000 --> 00:00:04,500
Welcome to Growing Europe.
Today we will walk through the platform.

2
00:00:05,000 --> 00:00:08,200
The dashboard shows your real-time
analytics at a glance.

3
00:00:09,000 --> 00:00:12,800
[upbeat background music]

4
00:00:13,000 --> 00:00:16,500
Click the settings icon
to configure your workspace.

RULES:
- sequential numbering (no gaps)
- timestamp format: HH:MM:SS,mmm (comma before milliseconds)
- max 2 lines per subtitle
- max 42 characters per line
- minimum display time: 1 second
- indicate non-speech audio: [music], [applause], [phone ringing]
- identify speakers when multiple: ANNA: Welcome to the demo.
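
Most of the rules above can be checked mechanically before the human review. The following is a minimal sketch, not a full SRT parser; `srtTimeToMs` and `lintSrt` are hypothetical names, and speaker identification and non-speech indicators are left to the reviewer.

```javascript
// Hypothetical linter for the SRT rules listed above. Returns a list of problems.
const SRT_TIME = /^(\d{2}):(\d{2}):(\d{2}),(\d{3})$/;

function srtTimeToMs(stamp) {
  const m = SRT_TIME.exec(stamp);
  if (!m) return null; // wrong format, e.g. a full stop before the milliseconds
  const [h, min, s, ms] = m.slice(1).map(Number);
  return ((h * 60 + min) * 60 + s) * 1000 + ms;
}

function lintSrt(srtText) {
  const problems = [];
  // Cues are separated by one or more blank lines.
  const blocks = srtText.replace(/\r\n/g, "\n").trim().split(/\n{2,}/);

  blocks.forEach((block, index) => {
    const [number, timing = "", ...textLines] = block.split("\n");
    const expected = index + 1;
    if (number.trim() !== String(expected)) {
      problems.push(`cue ${expected}: expected number ${expected}, found "${number}"`);
    }
    const [startRaw = "", endRaw = ""] = timing.split(" --> ");
    const start = srtTimeToMs(startRaw.trim());
    const end = srtTimeToMs(endRaw.trim());
    if (start === null || end === null) {
      problems.push(`cue ${expected}: timestamps must use HH:MM:SS,mmm`);
    } else if (end - start < 1000) {
      problems.push(`cue ${expected}: displayed for less than 1 second`);
    }
    if (textLines.length > 2) {
      problems.push(`cue ${expected}: more than 2 lines`);
    }
    textLines.forEach((line) => {
      if (line.length > 42) {
        problems.push(`cue ${expected}: line longer than 42 characters`);
      }
    });
  });
  return problems;
}
```

An empty result means the file passes these structural checks only; accuracy and sync still need a person.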

FORMAT_WEBVTT

WebVTT (Web Video Text Tracks) — preferred for web delivery.

WEBVTT

NOTE This is a comment

STYLE
::cue {
  font-family: Inter, sans-serif;
  font-size: 1.2em;
  background: rgba(0, 0, 0, 0.75);
  color: #FFFFFF;
  /* padding is not among the properties ::cue accepts, so it is left out */
}

00:00:01.000 --> 00:00:04.500
Welcome to Growing Europe.
Today we will walk through the platform.

00:00:05.000 --> 00:00:08.200
The dashboard shows your real-time
analytics at a glance.

00:00:09.000 --> 00:00:12.800
<i>[upbeat background music]</i>

00:00:13.000 --> 00:00:16.500
Click the <b>settings icon</b>
to configure your workspace.

ADVANTAGES over SRT:
- styling via CSS (::cue pseudo-element)
- positioning (line, position, align settings)
- inline formatting (<b>, <i>, <u>)
- metadata headers and comments
- native browser support via <track> element

HTML_INTEGRATION

<video controls preload="metadata">
  <source src="product-demo.mp4" type="video/mp4">
  <track
    kind="captions"
    src="captions-en.vtt"
    srclang="en"
    label="English"
    default
  >
  <track
    kind="captions"
    src="captions-nl.vtt"
    srclang="nl"
    label="Nederlands"
  >
  <track
    kind="captions"
    src="captions-de.vtt"
    srclang="de"
    label="Deutsch"
  >
</video>

RULE: use kind="captions" for accessibility tracks (dialogue plus non-speech audio); kind="subtitles" is for translated dialogue only
RULE: default attribute on the primary language track
RULE: provide captions in all languages the product supports
RULE: label must be human-readable language name (not code)

GENERATING_CAPTIONS

WORKFLOW:
1. generate initial transcript with Whisper API
2. human review for accuracy (proper nouns, technical terms, punctuation)
3. add speaker identification if multiple speakers
4. add non-speech audio indicators
5. verify timing synchronization
6. export as WebVTT (web) and SRT (fallback)

WHISPER_API:

curl https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file=@audio.mp3 \
  -F model=whisper-1 \
  -F response_format=srt \
  -F language=en

RULE: Whisper output is a starting point — never deliver without human review
RULE: Whisper struggles with: brand names, technical jargon, accented speech, overlapping speakers
RULE: cost: ~$0.006 per minute of audio (whisper-1 list price at the time of writing; verify current pricing)
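
The curl call above requests SRT, while step 6 of the workflow also needs WebVTT. The conversion is small enough to sketch: add the `WEBVTT` header and switch the millisecond separator on timing lines. `srtToVtt` is a hypothetical name, and the sketch assumes the input has already passed review; SRT cue numbers are kept because WebVTT accepts them as cue identifiers.

```javascript
// Hypothetical converter: turns reviewed SRT text into WebVTT.
function srtToVtt(srtText) {
  const body = srtText
    .replace(/^\uFEFF/, "") // drop a byte-order mark if present
    .replace(/\r\n/g, "\n")
    .trim()
    .split("\n")
    .map((line) =>
      // Only timing lines are rewritten, so commas in cue text are untouched.
      line.includes("-->")
        ? line.replace(/(\d{2}:\d{2}:\d{2}),(\d{3})/g, "$1.$2")
        : line
    )
    .join("\n");
  return `WEBVTT\n\n${body}\n`;
}
```

Styling and positioning are not produced by the conversion; add a STYLE block or cue settings afterwards if needed.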

BURN_IN_VS_SOFT_CAPTIONS

BURN-IN (hard-coded into video pixels):
- always visible, no user choice to disable
- use for: social media (viewers can't enable tracks), short-form content
- FFmpeg: ffmpeg -i input.mp4 -vf "subtitles=captions.srt" output.mp4

SOFT (separate track, user-toggleable):
- user can enable/disable and choose language
- use for: web embed, long-form content, multi-language
- HTML: <track> element

RULE: social media videos: always burn in captions (a widely cited industry figure puts about 85% of Facebook video views at sound-off)
RULE: web embed: soft captions via <track> (user choice, multi-language)
RULE: provide both when possible — burned-in for social, soft for web


A11Y:AUDIO_DESCRIPTIONS

REQUIREMENTS

WCAG 1.2.3 (Level A): Audio description or media alternative for prerecorded video.
WCAG 1.2.5 (Level AA): Audio description for prerecorded video content.

WHEN_REQUIRED:
- video shows important information NOT conveyed in the existing audio track
- examples: on-screen text, visual demonstrations, charts, scene changes
- NOT required if: video is talking-head only, or narration describes all visual content

TECHNIQUES

TECHNIQUE 1: INTEGRATED DESCRIPTION (preferred)
- write the narration script to include visual descriptions naturally
- example: "As you can see on the dashboard, the blue line chart shows revenue growing 40% this quarter"
- cheaper and better UX than a separate audio description track
- RULE: always prefer integrated description for new content

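TECHNIQUE 2: EXTENDED AUDIO DESCRIPTION
- the video pauses while a separate description is spoken, then resumes
- used when existing narration leaves no gaps for description
- goes beyond the AA baseline: extended audio description is WCAG 1.2.7 (Level AAA)
- delivery: a player that can pause for descriptions, or a separately rendered described version of the video
- NOTE: <track kind="descriptions"> is a timed text track (WebVTT) meant to be read aloud by assistive technology, not an audio track; native browser support is limited, so do not rely on it alone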

TECHNIQUE 3: TEXT TRANSCRIPT
- full text alternative below or linked from the video
- includes both spoken content and visual descriptions
- minimum requirement when audio description is impractical
- RULE: every video must have a transcript regardless of other accommodations
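
Because the transcript must carry both spoken content and visual descriptions, one way to produce a first draft is to interleave caption cues and description notes by start time. This is a sketch under assumed data shapes: `captions` and `descriptions` are arrays of objects with a `start` in seconds and a `text`, and the `[Visual: ...]` marker is an illustrative convention, not one this document prescribes.

```javascript
// Hypothetical data shapes: caption cues and visual descriptions both carry a
// start time in seconds, so they can be interleaved chronologically.
function buildTranscript(captions, descriptions) {
  const entries = [
    ...captions.map((cue) => ({
      start: cue.start,
      text: cue.speaker ? `${cue.speaker}: ${cue.text}` : cue.text,
    })),
    ...descriptions.map((note) => ({
      start: note.start,
      text: `[Visual: ${note.text}]`,
    })),
  ];
  // sort() is stable, so at equal start times captions stay ahead of descriptions.
  entries.sort((a, b) => a.start - b.start);
  return entries.map((entry) => entry.text).join("\n");
}
```

The output is a draft for the human reviewer, who decides which descriptions actually add information beyond the audio.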

WRITING_AUDIO_DESCRIPTIONS

STRUCTURE:
- describe WHAT is shown, not HOW to interpret it
- be concise — description must fit in natural pauses
- describe actions, expressions, scene changes, on-screen text

EXAMPLE (narrator says "Let me show you the dashboard"):

AUDIO DESCRIPTION: The screen shows a web dashboard with a left sidebar
containing six navigation items. The main area displays a line chart
trending upward and a data table with five rows of client data.

ANTI_PATTERN: "A really nice-looking dashboard is shown"
FIX: describe specific, factual visual content

ANTI_PATTERN: describing every visual detail
FIX: describe only what adds information beyond the audio track


A11Y:ALT_TEXT

REQUIREMENTS

WCAG 1.1.1 (Level A): All non-text content has a text alternative.

WRITING_GUIDE

INFORMATIVE_IMAGES (convey information):

<img src="chart.png" alt="Bar chart showing Q4 revenue of $2.4M, up 40% from Q3's $1.7M">

FUNCTIONAL_IMAGES (buttons, links):

<img src="search-icon.svg" alt="Search">
<img src="logo.svg" alt="Growing Europe — go to homepage">

DECORATIVE_IMAGES (purely visual):

<img src="decorative-wave.svg" alt="" role="presentation">

COMPLEX_IMAGES (charts, diagrams, infographics):

<img src="architecture-diagram.png" alt="System architecture diagram" aria-describedby="arch-desc">
<div id="arch-desc">
  <p>The system consists of three layers: the client application connects to the API gateway,
  which routes requests to microservices. Each microservice has its own database.
  A message queue handles asynchronous communication between services.</p>
</div>

RULES

RULE: every <img> must have an alt attribute — even if empty (alt="" for decorative)
RULE: alt text should describe the PURPOSE of the image in context, not just what it shows
RULE: max recommended length: 125 characters (screen readers may truncate)
RULE: do not start with "Image of" or "Picture of" — screen readers already announce it as an image
RULE: include data values for charts and graphs — not just "a chart"
RULE: for linked images, alt text should describe the link destination
RULE: decorative images MUST have alt="" — not omit alt entirely (missing alt is an error)
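
The mechanical parts of these rules (missing alt, redundant prefixes, length) can be linted; whether the text serves the image's purpose cannot. A minimal sketch with a hypothetical `lintAltText` helper, where `undefined` stands for a missing alt attribute and the `decorative` flag is supplied by whoever classified the image:

```javascript
// Hypothetical checker for the alt text rules above.
// Pass `undefined` when the alt attribute is missing entirely.
function lintAltText(alt, { decorative = false } = {}) {
  const problems = [];
  if (alt === undefined || alt === null) {
    problems.push('missing alt attribute (use alt="" for decorative images)');
    return problems;
  }
  if (decorative) {
    if (alt !== "") problems.push('decorative image should have alt=""');
    return problems;
  }
  const text = alt.trim();
  if (text === "") problems.push("informative image has empty alt text");
  if (/^(image|picture)\s+of\b/i.test(text)) {
    problems.push('do not start with "Image of" or "Picture of"');
  }
  if (text.length > 125) {
    problems.push("longer than the recommended 125 characters");
  }
  return problems;
}
```

For example, `lintAltText("Image of a chart")` reports the redundant prefix, while the chart example from the writing guide passes cleanly.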

AI_GENERATED_ALT_TEXT

For batch processing, LLMs can generate draft alt text:
1. send image to vision model (Claude, GPT-4V)
2. prompt: "Write alt text for this image for web accessibility. Be concise and descriptive. Max 125 characters."
3. human review for accuracy and context appropriateness
4. RULE: never deploy AI-generated alt text without human review
5. RULE: AI cannot know the PURPOSE of the image in context — human must verify


A11Y:REDUCED_MOTION

REQUIREMENTS

WCAG 2.3.3 (Level AAA): Motion from interaction can be disabled.
WCAG 2.3.1 (Level A): No content flashes more than 3 times per second.
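
The flash limit can be expressed as a sliding one-second window over flash timestamps. The sketch below assumes the timestamps were detected elsewhere (for instance by frame analysis) and checks only the count; it ignores the size and luminance thresholds that WCAG 2.3.1 also allows for. `exceedsFlashLimit` is a hypothetical name.

```javascript
// Hypothetical check: flashTimes holds the timestamps (in seconds) at which
// flashes were detected by some upstream analysis.
function exceedsFlashLimit(flashTimes, maxPerSecond = 3) {
  const sorted = [...flashTimes].sort((a, b) => a - b);
  let windowStart = 0;
  for (let i = 0; i < sorted.length; i++) {
    // Advance the window start until the window spans less than one second.
    while (sorted[i] - sorted[windowStart] >= 1) windowStart++;
    if (i - windowStart + 1 > maxPerSecond) return true;
  }
  return false;
}
```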

PREFERS_REDUCED_MOTION

Users with vestibular disorders, motion sickness, or cognitive disabilities may enable
reduced motion in their OS settings. Respect this preference.

CSS:

/* Default: animations enabled */
.animated-element {
  animation: slideIn 0.3s ease-out;
  transition: transform 0.2s ease;
}

/* Reduced motion: disable or minimize */
@media (prefers-reduced-motion: reduce) {
  .animated-element {
    animation: none;
    transition: none;
  }
}

VIDEO:

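/* Static fallback stays hidden unless reduced motion is requested */
.video-static-fallback {
  display: none;
}

@media (prefers-reduced-motion: reduce) {
  /* display: none hides the element but may not stop playback; also pause it in JS */
  video[autoplay] {
    display: none;
  }
  .video-static-fallback {
    display: block;
  }
}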

LOTTIE:

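// LottiePlayer stands in for whichever Lottie React component the project uses
function FeatureIllustration() {
  // Guard for server-side rendering, where window is undefined
  const prefersReducedMotion =
    typeof window !== 'undefined' &&
    window.matchMedia('(prefers-reduced-motion: reduce)').matches;

  return prefersReducedMotion ? (
    <img src="/images/illustration-static.svg" alt="Feature illustration" />
  ) : (
    <LottiePlayer autoplay loop src="/animations/feature.json" />
  );
}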

JAVASCRIPT:

const prefersReducedMotion = window.matchMedia('(prefers-reduced-motion: reduce)');

function setupAnimation() {
  if (prefersReducedMotion.matches) {
    // Show static content, skip animation
    return;
  }
  // Start animation
}

// Listen for changes (user may toggle the OS setting during the session)
// stopAllAnimations() is the project's own teardown helper
prefersReducedMotion.addEventListener('change', (event) => {
  if (event.matches) {
    stopAllAnimations();
  } else {
    setupAnimation();
  }
});

RULES:
- RULE: never autoplay video with motion — always require user interaction to start
- RULE: provide static alternative (image + transcript) for every animated/video element
- RULE: parallax effects must be disabled when reduced motion is preferred
- RULE: page transitions and micro-animations must respect the preference
- RULE: Remotion-rendered videos are pre-rendered, so the media query cannot change their content; the page that embeds them must still respect the preference (no autoplay, static fallback available)


A11Y:PREFERS_COLOR_SCHEME

DARK_MODE_IMAGES

Some images need different versions for light and dark mode:

<picture>
  <source srcset="/img/logo-dark.svg" media="(prefers-color-scheme: dark)">
  <img src="/img/logo-light.svg" alt="Growing Europe">
</picture>

CSS:

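/* For decorative backgrounds only; informative images should use <picture> so they keep their alt text */
.hero-image {
  background-image: url('/img/hero-light.jpg');
}

@media (prefers-color-scheme: dark) {
  .hero-image {
    background-image: url('/img/hero-dark.jpg');
  }
}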

WHEN_NEEDED:
- logos with light/dark variants
- illustrations with background-dependent colors
- screenshots showing light vs dark UI
- diagrams where contrast depends on background

RULE: not every image needs a dark variant — only when contrast/readability is affected
RULE: test images against both light (#FFFFFF) and dark (#1E293B) backgrounds


A11Y:VIDEO_PLAYER_KEYBOARD

REQUIREMENTS

WCAG 2.1.1 (Level A): All functionality available from keyboard.
WCAG 2.1.2 (Level A): No keyboard trap.

CONTROLS

Every video player must support these keyboard interactions:

Key              Action
Space / Enter    play / pause
Left Arrow       seek back 5 seconds
Right Arrow      seek forward 5 seconds
Up Arrow         volume up
Down Arrow       volume down
M                mute / unmute
F                fullscreen toggle
C                captions toggle
Escape           exit fullscreen
Tab              navigate between controls
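
The playback, seek, volume, and mute rows of this table map directly onto `HTMLMediaElement` properties. The sketch below is a minimal key handler under stated assumptions: `handlePlayerKey` is a hypothetical name, `player` is any object exposing `paused`, `currentTime`, `duration`, `volume`, `muted`, `play()` and `pause()`, and the 10% volume step is an assumption because the table does not specify one. Fullscreen and captions keys are omitted since they depend on the surrounding player markup.

```javascript
// Hypothetical key map for the media rows of the table above.
const SEEK_STEP_SECONDS = 5;
const VOLUME_STEP = 0.1; // assumed step; the table does not give one

function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

function nudgeVolume(player, delta) {
  // Round to two decimals to avoid drift such as 0.9000000000000001.
  player.volume = Math.round(clamp(player.volume + delta, 0, 1) * 100) / 100;
}

function handlePlayerKey(key, player) {
  switch (key) {
    case " ":
    case "Enter":
      if (player.paused) player.play();
      else player.pause();
      return "play-pause";
    case "ArrowLeft":
      player.currentTime = clamp(player.currentTime - SEEK_STEP_SECONDS, 0, player.duration);
      return "seek-back";
    case "ArrowRight":
      player.currentTime = clamp(player.currentTime + SEEK_STEP_SECONDS, 0, player.duration);
      return "seek-forward";
    case "ArrowUp":
      nudgeVolume(player, VOLUME_STEP);
      return "volume-up";
    case "ArrowDown":
      nudgeVolume(player, -VOLUME_STEP);
      return "volume-down";
    case "m":
    case "M":
      player.muted = !player.muted;
      return "mute-toggle";
    default:
      return null; // Tab, Escape and all other keys keep their default behavior
  }
}
```

In a page this would be called from a `keydown` listener on the player container with `event.key`. The listener should skip events whose target is a button or slider, so that Enter on a focused control is not handled twice.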

ARIA_MARKUP

<div role="region" aria-label="Video player">
  <video id="player" aria-describedby="video-description">
    <source src="demo.mp4" type="video/mp4">
    <track kind="captions" src="captions.vtt" srclang="en" label="English" default>
  </video>

  <div role="group" aria-label="Video controls">
    <!-- Play/pause swaps its label ("Play" / "Pause"), so it does not also use aria-pressed -->
    <button type="button" aria-label="Play">
      <svg aria-hidden="true"><!-- play icon --></svg>
    </button>

    <!-- Native range input: min/max/value already provide the slider role and values -->
    <input
      type="range"
      aria-label="Seek"
      min="0"
      max="300"
      value="45"
      aria-valuetext="45 seconds of 5 minutes"
    >

    <button type="button" aria-label="Mute" aria-pressed="false">
      <svg aria-hidden="true"><!-- volume icon --></svg>
    </button>

    <!-- Native range input: min/max/value already provide the slider role and values -->
    <input
      type="range"
      aria-label="Volume"
      min="0"
      max="100"
      value="80"
      aria-valuetext="80 percent"
    >

    <button type="button" aria-label="Captions" aria-pressed="true">
      <svg aria-hidden="true"><!-- CC icon --></svg>
    </button>

    <button aria-label="Fullscreen">
      <svg aria-hidden="true"><!-- fullscreen icon --></svg>
    </button>
  </div>

  <p id="video-description" class="sr-only">
    Product demonstration showing how to configure your workspace settings.
  </p>

  <div aria-live="polite" aria-atomic="true" class="sr-only" id="player-status">
    <!-- Announce state changes: "Playing", "Paused", "Muted" -->
  </div>
</div>

RULES:
- RULE: aria-live="polite" region for announcing player state changes to screen readers
- RULE: all controls must have visible focus indicators (never set outline: none without a visible replacement)
- RULE: icons in buttons must have aria-hidden="true" — the aria-label provides the name
- RULE: aria-valuetext on sliders must be human-readable (not just a number)
- RULE: Tab order must follow visual layout (play, seek, mute, volume, captions, fullscreen)
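
Human-readable `aria-valuetext` has to be regenerated whenever the slider moves. The helpers below sketch one way to format it so that the output matches the markup example ("45 seconds of 5 minutes", "80 percent"). All function names are hypothetical, and the wording is English-only; localized players would need their own phrasing.

```javascript
// Hypothetical formatters for slider aria-valuetext.
function pluralize(count, unit) {
  return `${count} ${unit}${count === 1 ? "" : "s"}`;
}

function spokenDuration(totalSeconds) {
  const seconds = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(seconds / 60);
  const rest = seconds % 60;
  if (minutes === 0) return pluralize(rest, "second");
  if (rest === 0) return pluralize(minutes, "minute");
  return `${pluralize(minutes, "minute")} ${pluralize(rest, "second")}`;
}

function seekValueText(currentSeconds, durationSeconds) {
  return `${spokenDuration(currentSeconds)} of ${spokenDuration(durationSeconds)}`;
}

function volumeValueText(volume) {
  // volume is the 0..1 value used by HTMLMediaElement.volume
  return `${Math.round(volume * 100)} percent`;
}
```

A `timeupdate` handler could set `seekSlider.setAttribute("aria-valuetext", seekValueText(video.currentTime, video.duration))`, where `seekSlider` and `video` are the page's own elements.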


A11Y:SCREEN_READER_CONSIDERATIONS

DECORATIVE_VS_INFORMATIVE

DECORATIVE (adds no information):
- background patterns, dividers, ornamental graphics
- images that duplicate adjacent text content
- purely aesthetic illustrations
- TREATMENT: alt="" + role="presentation" or aria-hidden="true"

INFORMATIVE (conveys meaning):
- photos showing products, people, places
- charts, graphs, diagrams
- icons that indicate function (search, settings, etc.)
- screenshots of UI
- TREATMENT: descriptive alt text or aria-describedby for complex images

When an image is wrapped in a link:

<!-- Image IS the link content -->
<a href="/dashboard">
  <img src="dashboard-preview.png" alt="Go to your analytics dashboard">
</a>

<!-- Image accompanies text link -->
<a href="/features">
  <img src="feature-icon.svg" alt="" aria-hidden="true">
  <span>View all features</span>
</a>

RULE: if image is the only content in a link, alt text describes the destination
RULE: if text already describes the link, image should be alt="" (avoid redundancy)

FIGURE_AND_FIGCAPTION

<figure>
  <img src="architecture.png" alt="System architecture showing three layers">
  <figcaption>
    Figure 1: The Growing Europe platform architecture consists of an
    orchestration layer (Redis Streams), a knowledge layer (LLM summarization),
    and a company layer (agent identities and processes).
  </figcaption>
</figure>

RULE: <figure> + <figcaption> for images that need visible captions
RULE: alt text should be complementary to figcaption, not duplicate it
RULE: screen readers announce both — so alt should be brief if figcaption is detailed


A11Y:COLOR_INDEPENDENCE

REQUIREMENTS

WCAG 1.4.1 (Level A): Color is not the only means of conveying information.
WCAG 1.4.3 (Level AA): Text contrast ratio minimum 4.5:1 (3:1 for large text).
WCAG 1.4.11 (Level AA): Non-text contrast minimum 3:1.
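
The ratios above come from the WCAG relative luminance formula, which is short enough to compute directly when checking text overlays or chart colors. The sketch below follows the WCAG 2.x definition for sRGB; the function names are hypothetical, and only 6-digit hex colors are handled.

```javascript
// Contrast ratio per the WCAG 2.x definition (sRGB, 6-digit hex input only).
function channelToLinear(value8bit) {
  const c = value8bit / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance(hex) {
  const match = /^#?([0-9a-f]{6})$/i.exec(hex);
  if (!match) throw new Error(`expected a 6-digit hex color, got "${hex}"`);
  const n = parseInt(match[1], 16);
  const r = channelToLinear((n >> 16) & 255);
  const g = channelToLinear((n >> 8) & 255);
  const b = channelToLinear(n & 255);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

function contrastRatio(foreground, background) {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  // The lighter color always goes in the numerator, so argument order does not matter.
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

function meetsTextContrast(foreground, background, { largeText = false } = {}) {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}
```

For text placed over a photo there is no single background color, so sample the darkest and lightest regions behind the text and check both.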

IN_IMAGES

- charts must use patterns/labels in addition to color
- status indicators need icons + color (not just red/green)
- text overlay on images must maintain contrast ratio

EXAMPLE (chart accessible to colorblind users):

Instead of: red bar = loss, green bar = profit
Use: red bar with diagonal lines = loss, green solid bar = profit, plus text labels

IN_VIDEO

- never convey information solely through color
- use labels, patterns, icons alongside color
- test video frames against colorblind simulation
- TOOLS: Color Oracle (desktop simulator), Sim Daltonism (macOS)

RULE: every piece of color-coded information must have a non-color alternative
RULE: test all charts and infographics with deuteranopia simulation (red-green deficiency is the most common form of color blindness)


A11Y:TESTING_CHECKLIST

AUTOMATED

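- [ ] all <img> elements have alt attributes
- [ ] all <video> elements have <track kind="captions">
- [ ] color contrast meets 4.5:1 minimum (text over images usually needs a manual spot check; general-purpose checkers cannot measure it reliably)
- [ ] prefers-reduced-motion handled in CSS/JS
- [ ] no content flashes more than 3 times per second (needs a dedicated flash analysis tool such as PEAT; the tools below do not test this)
- TOOLS: axe DevTools, Lighthouse accessibility audit, pa11y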

MANUAL

- [ ] captions are accurate and synchronized
- [ ] alt text is meaningful in context
- [ ] video player fully keyboard navigable
- [ ] screen reader announces all interactive states
- [ ] reduced motion shows appropriate fallback
- [ ] color is never the sole information carrier
- [ ] transcript available for all video content

SCREEN_READER_TESTING

Test with at least 2 screen readers:
- VoiceOver (macOS/iOS) — Safari
- NVDA (Windows) — Chrome/Firefox
- TalkBack (Android) — Chrome

RULE: test media content with screen reader before every client delivery
RULE: automate what you can (axe, Lighthouse) but manual testing is not optional


CROSS_REFERENCES