DOMAIN:VISUAL_PRODUCTION:ACCESSIBILITY_MEDIA¶
OWNER: felice
UPDATED: 2026-03-24
SCOPE: Media accessibility — captions, audio descriptions, alt text, motion, color, keyboard
AGENTS: felice (primary), floris/floor (frontend implementation)
PARENT: Visual Production
COMPLIANCE: WCAG 2.2 AA minimum, European Accessibility Act (EAA)
A11Y:CAPTIONS¶
REQUIREMENTS¶
WCAG 1.2.2 (Level A): Captions for prerecorded audio content in synchronized media.
WCAG 1.2.4 (Level AA): Captions for live audio content in synchronized media.
RULE: ALL videos with audio MUST have captions — no exceptions
RULE: auto-generated captions MUST be human-reviewed before delivery
RULE: captions must be accurate — not approximate (accuracy target: 99%+)
RULE: sync tolerance: captions within 100ms of spoken audio
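The accuracy and sync targets above can be checked mechanically once a human-verified reference transcript exists. The sketch below is one possible way to do that, not a prescribed method: it assumes accuracy is measured at word level as 1 minus the word error rate, which the rules above do not specify. The function names `wordAccuracy` and `withinSyncTolerance` are hypothetical.

```javascript
// Split text into lowercase word tokens, ignoring punctuation.
function words(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}']+/gu) ?? [];
}

// Word-level accuracy = 1 - (word edit distance / reference word count).
// Assumption: this is how "accuracy" is measured; the rule itself does not say.
function wordAccuracy(reference, hypothesis) {
  const ref = words(reference);
  const hyp = words(hypothesis);
  if (ref.length === 0) return hyp.length === 0 ? 1 : 0;
  // prev[j] holds the edit distance between the reference words seen so far
  // and the first j hypothesis words.
  let prev = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const substitution = prev[j - 1] + (ref[i - 1] === hyp[j - 1] ? 0 : 1);
      curr[j] = Math.min(substitution, prev[j] + 1, curr[j - 1] + 1);
    }
    prev = curr;
  }
  return Math.max(0, 1 - prev[hyp.length] / ref.length);
}

// True when a caption starts within the tolerance of the spoken audio.
function withinSyncTolerance(captionStartMs, speechStartMs, toleranceMs = 100) {
  return Math.abs(captionStartMs - speechStartMs) <= toleranceMs;
}
```

A reviewed caption file would pass when `wordAccuracy(reference, captions) >= 0.99` and every cue satisfies `withinSyncTolerance`.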
FORMAT_SRT¶
SRT (SubRip Text) — universal compatibility, simple format.
1
00:00:01,000 --> 00:00:04,500
Welcome to Growing Europe.
Today we will walk through the platform.

2
00:00:05,000 --> 00:00:08,200
The dashboard shows your real-time
analytics at a glance.

3
00:00:09,000 --> 00:00:12,800
[upbeat background music]

4
00:00:13,000 --> 00:00:16,500
Click the settings icon
to configure your workspace.
RULES:
- sequential numbering (no gaps)
- timestamp format: HH:MM:SS,mmm (comma before milliseconds)
- max 2 lines per subtitle
- max 42 characters per line
- minimum display time: 1 second
- indicate non-speech audio: [music], [applause], [phone ringing]
- identify speakers when multiple: ANNA: Welcome to the demo.
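Most of the SRT rules above are mechanical and can be linted before human review. The following is a minimal sketch, assuming cues are separated by blank lines as the SRT format requires; `lintSrt` is a hypothetical helper, and it does not check speaker labels or non-speech indicators, which still need a human reviewer.

```javascript
const TIMING =
  /^(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})$/;

// Convert HH, MM, SS, mmm string parts to milliseconds.
function toMs(h, m, s, ms) {
  return ((Number(h) * 60 + Number(m)) * 60 + Number(s)) * 1000 + Number(ms);
}

// Returns a list of human-readable problems; an empty list means the
// mechanical rules (numbering, timestamps, line limits, duration) pass.
function lintSrt(srtText) {
  const problems = [];
  const blocks = srtText.trim().split(/\r?\n\s*\r?\n/);
  blocks.forEach((block, i) => {
    const [index, timing = '', ...lines] = block.split(/\r?\n/);
    const n = i + 1;
    if (index.trim() !== String(n)) {
      problems.push(`cue ${n}: expected index ${n}, found "${index.trim()}"`);
    }
    const t = TIMING.exec(timing.trim());
    if (!t) {
      problems.push(`cue ${n}: timestamp must be HH:MM:SS,mmm --> HH:MM:SS,mmm`);
    } else if (toMs(...t.slice(5, 9)) - toMs(...t.slice(1, 5)) < 1000) {
      problems.push(`cue ${n}: displayed for less than 1 second`);
    }
    if (lines.length > 2) problems.push(`cue ${n}: more than 2 lines`);
    lines.forEach((line) => {
      if (line.length > 42) problems.push(`cue ${n}: line longer than 42 characters`);
    });
  });
  return problems;
}
```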
FORMAT_WEBVTT¶
WebVTT (Web Video Text Tracks) — preferred for web delivery.
WEBVTT

NOTE This is a comment

STYLE
::cue {
font-family: Inter, sans-serif;
font-size: 1.2em;
background: rgba(0, 0, 0, 0.75);
color: #FFFFFF;
padding: 0.2em 0.5em;
}

00:00:01.000 --> 00:00:04.500
Welcome to Growing Europe.
Today we will walk through the platform.

00:00:05.000 --> 00:00:08.200
The dashboard shows your real-time
analytics at a glance.

00:00:09.000 --> 00:00:12.800
<i>[upbeat background music]</i>

00:00:13.000 --> 00:00:16.500
Click the <b>settings icon</b>
to configure your workspace.
ADVANTAGES over SRT:
- styling via CSS (::cue pseudo-element)
- positioning (line, position, align settings)
- inline formatting (<b>, <i>, <u>)
- metadata headers and comments
- native browser support via <track> element
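For plain cues, the two formats differ mainly in the `WEBVTT` header and the millisecond separator (comma in SRT, period in WebVTT), so a reviewed SRT file can be converted mechanically. This is a minimal sketch under that assumption; `srtToVtt` is a hypothetical helper and it does not translate SRT-specific markup such as `<font>` tags.

```javascript
// Convert SRT text to WebVTT: normalize line endings, add the WEBVTT header,
// and switch the millisecond separator on timing lines only.
// SRT cue numbers are kept; WebVTT accepts them as optional cue identifiers.
function srtToVtt(srtText) {
  const body = srtText
    .replace(/^\uFEFF/, '') // drop a byte order mark if present
    .replace(/\r\n?/g, '\n')
    .trim()
    .replace(
      /^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})$/gm,
      '$1.$2 --> $3.$4'
    );
  return `WEBVTT\n\n${body}\n`;
}
```

Because the replacement is anchored to whole timing lines, commas inside caption text are left untouched.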
HTML_INTEGRATION¶
<video controls preload="metadata">
<source src="product-demo.mp4" type="video/mp4">
<track
kind="captions"
src="captions-en.vtt"
srclang="en"
label="English"
default
>
<track
kind="captions"
src="captions-nl.vtt"
srclang="nl"
label="Nederlands"
>
<track
kind="captions"
src="captions-de.vtt"
srclang="de"
label="Deutsch"
>
</video>
RULE: use kind="captions" (speech plus non-speech audio) for accessibility; kind="subtitles" (speech only, typically translation) does not replace it
RULE: default attribute on the primary language track
RULE: provide captions in all languages the product supports
RULE: label must be human-readable language name (not code)
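When the list of supported languages lives in configuration, the `<track>` markup can be generated so the rules above hold by construction. The sketch below is illustrative only: `captionTracks` and the `{ code, label }` shape are hypothetical, and the `captions-<code>.vtt` file naming simply mirrors the example above.

```javascript
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// Build one <track kind="captions"> per language; only the primary
// language gets the default attribute.
function captionTracks(languages, primaryLang) {
  return languages
    .map(({ code, label }) => {
      if (!label || label.toLowerCase() === code.toLowerCase()) {
        throw new Error(`label for "${code}" must be a human-readable language name`);
      }
      const isDefault = code === primaryLang ? ' default' : '';
      return `<track kind="captions" src="captions-${escapeAttr(code)}.vtt" ` +
        `srclang="${escapeAttr(code)}" label="${escapeAttr(label)}"${isDefault}>`;
    })
    .join('\n');
}
```

For example, `captionTracks([{ code: 'en', label: 'English' }, { code: 'nl', label: 'Nederlands' }], 'en')` yields two tracks with `default` on the English one only.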
GENERATING_CAPTIONS¶
WORKFLOW:
1. generate initial transcript with Whisper API
2. human review for accuracy (proper nouns, technical terms, punctuation)
3. add speaker identification if multiple speakers
4. add non-speech audio indicators
5. verify timing synchronization
6. export as WebVTT (web) and SRT (fallback)
WHISPER_API:
curl https://api.openai.com/v1/audio/transcriptions \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-F file=@audio.mp3 \
-F model=whisper-1 \
-F response_format=srt \
-F language=en
RULE: Whisper output is a starting point — never deliver without human review
RULE: Whisper struggles with: brand names, technical jargon, accented speech, overlapping speakers
RULE: cost: ~$0.006 per minute of audio
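The per-minute rate above makes budgeting a simple multiplication. This is a rough estimate only: the sketch assumes billing is linear in audio duration and takes the rate as a parameter because pricing can change; `estimateTranscriptionCostUsd` is a hypothetical helper.

```javascript
// Estimate transcription cost in USD for a given audio duration.
// Assumption: cost scales linearly with duration at the given per-minute rate.
function estimateTranscriptionCostUsd(durationSeconds, usdPerMinute = 0.006) {
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError('durationSeconds must be a non-negative number');
  }
  return (durationSeconds / 60) * usdPerMinute;
}
```

At the default rate, a 10-minute video (`estimateTranscriptionCostUsd(600)`) comes to roughly $0.06.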
BURN_IN_VS_SOFT_CAPTIONS¶
BURN-IN (hard-coded into video pixels):
- always visible, no user choice to disable
- use for: social media (viewers can't enable tracks), short-form content
- FFmpeg: ffmpeg -i input.mp4 -vf "subtitles=captions.srt" output.mp4
SOFT (separate track, user-toggleable):
- user can enable/disable and choose language
- use for: web embed, long-form content, multi-language
- HTML: <track> element
RULE: social media videos: always burn in captions (an often-cited industry figure is that about 85% of social video is viewed without sound)
RULE: web embed: soft captions via <track> (user choice, multi-language)
RULE: provide both when possible — burned-in for social, soft for web
A11Y:AUDIO_DESCRIPTIONS¶
REQUIREMENTS¶
WCAG 1.2.3 (Level A): Audio description or media alternative for prerecorded video.
WCAG 1.2.5 (Level AA): Audio description for prerecorded video content.
WHEN_REQUIRED:
- video shows important information NOT conveyed in the existing audio track
- examples: on-screen text, visual demonstrations, charts, scene changes
- NOT required if: video is talking-head only, or narration describes all visual content
TECHNIQUES¶
TECHNIQUE 1: INTEGRATED DESCRIPTION (preferred)
- write the narration script to include visual descriptions naturally
- example: "As you can see on the dashboard, the blue line chart shows revenue growing 40% this quarter"
- cheaper and better UX than a separate audio description track
- RULE: always prefer integrated description for new content
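TECHNIQUE 2: EXTENDED AUDIO DESCRIPTION
- separate description that pauses the video to describe visual content
- used when existing narration leaves no gaps for description
- delivered as an alternate audio-described version of the video, or as text cues in <track kind="descriptions"> (a WebVTT text track, not an audio file; browser and screen reader support is limited, so verify before relying on it)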
TECHNIQUE 3: TEXT TRANSCRIPT
- full text alternative below or linked from the video
- includes both spoken content and visual descriptions
- minimum requirement when audio description is impractical
- RULE: every video must have a transcript regardless of other accommodations
WRITING_AUDIO_DESCRIPTIONS¶
STRUCTURE:
- describe WHAT is shown, not HOW to interpret it
- be concise — description must fit in natural pauses
- describe actions, expressions, scene changes, on-screen text
EXAMPLE (narrator says "Let me show you the dashboard"):
AUDIO DESCRIPTION: The screen shows a web dashboard with a left sidebar
containing six navigation items. The main area displays a line chart
trending upward and a data table with five rows of client data.
ANTI_PATTERN: "A really nice-looking dashboard is shown"
FIX: describe specific, factual visual content
ANTI_PATTERN: describing every visual detail
FIX: describe only what adds information beyond the audio track
A11Y:ALT_TEXT¶
REQUIREMENTS¶
WCAG 1.1.1 (Level A): All non-text content has a text alternative.
WRITING_GUIDE¶
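INFORMATIVE_IMAGES (convey information): alt text states the information the image conveys in context.
FUNCTIONAL_IMAGES (buttons, links): alt text names the action or the link destination, not the picture.
DECORATIVE_IMAGES (purely visual): empty alt (alt="") so screen readers skip the image.
COMPLEX_IMAGES (charts, diagrams, infographics): brief alt plus a longer description linked with aria-describedby: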
<img src="architecture-diagram.png" alt="System architecture diagram" aria-describedby="arch-desc">
<div id="arch-desc">
<p>The system consists of three layers: the client application connects to the API gateway,
which routes requests to microservices. Each microservice has its own database.
A message queue handles asynchronous communication between services.</p>
</div>
RULES¶
RULE: every <img> must have an alt attribute — even if empty (alt="" for decorative)
RULE: alt text should describe the PURPOSE of the image in context, not just what it shows
RULE: max recommended length: 125 characters (screen readers may truncate)
RULE: do not start with "Image of" or "Picture of" — screen readers already announce it as an image
RULE: include data values for charts and graphs — not just "a chart"
RULE: for linked images, alt text should describe the link destination
RULE: decorative images MUST have alt="" — not omit alt entirely (missing alt is an error)
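The mechanical parts of these rules (missing alt, redundant prefix, length) can be linted automatically; whether the text fits the image's purpose still needs a human. A minimal sketch, with `lintAltText` and its `decorative` option as hypothetical names:

```javascript
// Check an alt value against the mechanical alt-text rules.
// Pass alt = undefined or null when the attribute is missing entirely.
function lintAltText(alt, { decorative = false } = {}) {
  if (alt === undefined || alt === null) {
    return ['missing alt attribute (use alt="" for decorative images)'];
  }
  if (decorative) {
    return alt === '' ? [] : ['decorative image should have alt=""'];
  }
  const problems = [];
  const text = alt.trim();
  if (text === '') problems.push('informative image has empty alt text');
  if (/^(image|picture) of\b/i.test(text)) {
    problems.push('starts with a redundant "Image of" / "Picture of" phrase');
  }
  if (text.length > 125) {
    problems.push(`alt text is ${text.length} characters (recommended max: 125)`);
  }
  return problems;
}
```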
AI_GENERATED_ALT_TEXT¶
For batch processing, LLMs can generate draft alt text:
1. send image to vision model (Claude, GPT-4V)
2. prompt: "Write alt text for this image for web accessibility. Be concise and descriptive. Max 125 characters."
3. human review for accuracy and context appropriateness
4. RULE: never deploy AI-generated alt text without human review
5. RULE: AI cannot know the PURPOSE of the image in context — human must verify
A11Y:REDUCED_MOTION¶
REQUIREMENTS¶
WCAG 2.3.3 (Level AAA): Motion from interaction can be disabled.
WCAG 2.3.1 (Level A): No content flashes more than 3 times per second.
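The three-flashes limit can be screened with a sliding one-second window over flash timestamps. The sketch below covers only the counting part and assumes the timestamps come from some earlier frame-analysis step that is not shown; it does not evaluate the flash size or luminance thresholds that WCAG 2.3.1 also defines. `exceedsFlashLimit` is a hypothetical name.

```javascript
// Returns true when any window of windowMs contains more than maxFlashes flashes.
// flashTimesMs: timestamps (in milliseconds) at which a flash was detected.
function exceedsFlashLimit(flashTimesMs, maxFlashes = 3, windowMs = 1000) {
  const times = [...flashTimesMs].sort((a, b) => a - b);
  let start = 0;
  for (let end = 0; end < times.length; end++) {
    // Shrink the window until it spans less than windowMs.
    while (times[end] - times[start] >= windowMs) start++;
    if (end - start + 1 > maxFlashes) return true;
  }
  return false;
}
```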
PREFERS_REDUCED_MOTION¶
Users with vestibular disorders, motion sickness, or cognitive disabilities may enable
reduced motion in their OS settings. Respect this preference.
CSS:
/* Default: animations enabled */
.animated-element {
animation: slideIn 0.3s ease-out;
transition: transform 0.2s ease;
}
/* Reduced motion: disable or minimize */
@media (prefers-reduced-motion: reduce) {
.animated-element {
animation: none;
transition: none;
}
}
VIDEO:
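@media (prefers-reduced-motion: reduce) {
/* Hide autoplaying video for users who prefer reduced motion */
video[autoplay] {
display: none;
}
/* Assumes .video-static-fallback is display: none by default */
.video-static-fallback {
display: block;
}
}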
LOTTIE:
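function FeatureIllustration() {
  // Guard for server-side rendering, where window is undefined
  const prefersReducedMotion =
    typeof window !== 'undefined' &&
    window.matchMedia('(prefers-reduced-motion: reduce)').matches;
  return prefersReducedMotion ? (
    <img src="/images/illustration-static.svg" alt="Feature illustration" />
  ) : (
    <LottiePlayer autoplay loop src="/animations/feature.json" />
  );
}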
JAVASCRIPT:
const prefersReducedMotion = window.matchMedia('(prefers-reduced-motion: reduce)');
function setupAnimation() {
if (prefersReducedMotion.matches) {
// Show static content, skip animation
return;
}
// Start animation
}
// Listen for changes (user may toggle during session)
prefersReducedMotion.addEventListener('change', () => {
if (prefersReducedMotion.matches) {
stopAllAnimations();
}
});
RULES:
- RULE: never autoplay video with motion — always require user interaction to start
- RULE: provide static alternative (image + transcript) for every animated/video element
- RULE: parallax effects must be disabled when reduced motion is preferred
- RULE: page transitions and micro-animations must respect the preference
- RULE: Remotion-rendered videos are pre-rendered, so the media query cannot change their content; the page that embeds them must still respect the preference (no autoplay, static fallback)
A11Y:PREFERS_COLOR_SCHEME¶
DARK_MODE_IMAGES¶
Some images need different versions for light and dark mode:
<picture>
<source srcset="/img/logo-dark.svg" media="(prefers-color-scheme: dark)">
<img src="/img/logo-light.svg" alt="Growing Europe">
</picture>
CSS:
.hero-image {
content: url('/img/hero-light.jpg');
}
@media (prefers-color-scheme: dark) {
.hero-image {
content: url('/img/hero-dark.jpg');
}
}
WHEN_NEEDED:
- logos with light/dark variants
- illustrations with background-dependent colors
- screenshots showing light vs dark UI
- diagrams where contrast depends on background
RULE: not every image needs a dark variant — only when contrast/readability is affected
RULE: test images against both light (#FFFFFF) and dark (#1E293B) backgrounds
A11Y:VIDEO_PLAYER_KEYBOARD¶
REQUIREMENTS¶
WCAG 2.1.1 (Level A): All functionality available from keyboard.
WCAG 2.1.2 (Level A): No keyboard trap.
CONTROLS¶
Every video player must support these keyboard interactions:
| Key | Action |
|---|---|
| Space / Enter | play / pause |
| Left Arrow | seek back 5 seconds |
| Right Arrow | seek forward 5 seconds |
| Up Arrow | volume up |
| Down Arrow | volume down |
| M | mute / unmute |
| F | fullscreen toggle |
| C | captions toggle |
| Escape | exit fullscreen |
| Tab | navigate between controls |
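One way to implement the table is to keep the key-to-action mapping as data, separate from the DOM wiring, so it can be unit-tested. The sketch below is a possible shape, not a required one; the action names and the helpers `actionForKey` and `seekTarget` are hypothetical. Keys are `KeyboardEvent.key` values, and Tab is deliberately left unmapped so the browser's native focus navigation handles it.

```javascript
// Map of KeyboardEvent.key values to player actions (action names are illustrative).
const KEY_ACTIONS = new Map([
  [' ', 'togglePlay'],
  ['Enter', 'togglePlay'],
  ['ArrowLeft', 'seekBack'],
  ['ArrowRight', 'seekForward'],
  ['ArrowUp', 'volumeUp'],
  ['ArrowDown', 'volumeDown'],
  ['m', 'toggleMute'],
  ['f', 'toggleFullscreen'],
  ['c', 'toggleCaptions'],
  ['Escape', 'exitFullscreen'],
]);

// Resolve a key to an action, treating letter keys case-insensitively.
// Returns null for unmapped keys (including Tab).
function actionForKey(key) {
  const normalized = key.length === 1 ? key.toLowerCase() : key;
  return KEY_ACTIONS.get(normalized) ?? null;
}

// Compute the new playback position for a seek action, clamped to the media length.
function seekTarget(currentTime, duration, action, stepSeconds = 5) {
  const delta =
    action === 'seekForward' ? stepSeconds : action === 'seekBack' ? -stepSeconds : 0;
  return Math.min(duration, Math.max(0, currentTime + delta));
}
```

In a real player, a `keydown` listener on the player container would call `actionForKey(event.key)` and dispatch the result. Note that when focus is already on a button, Space and Enter activate it natively, so the handler should avoid toggling twice.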
ARIA_MARKUP¶
<div role="region" aria-label="Video player">
<video id="player" aria-describedby="video-description">
<source src="demo.mp4" type="video/mp4">
<track kind="captions" src="captions.vtt" srclang="en" label="English" default>
</video>
<div role="toolbar" aria-label="Video controls">
<button aria-label="Play" aria-pressed="false">
<svg aria-hidden="true"><!-- play icon --></svg>
</button>
<input
type="range"
min="0"
max="300"
value="45"
aria-label="Seek"
aria-valuetext="45 seconds of 5 minutes"
>
<button aria-label="Mute">
<svg aria-hidden="true"><!-- volume icon --></svg>
</button>
<input
type="range"
min="0"
max="100"
value="80"
aria-label="Volume"
aria-valuetext="80 percent"
>
<button aria-label="Toggle captions" aria-pressed="true">
<svg aria-hidden="true"><!-- CC icon --></svg>
</button>
<button aria-label="Fullscreen">
<svg aria-hidden="true"><!-- fullscreen icon --></svg>
</button>
</div>
<p id="video-description" class="sr-only">
Product demonstration showing how to configure your workspace settings.
</p>
<div aria-live="polite" aria-atomic="true" class="sr-only" id="player-status">
<!-- Announce state changes: "Playing", "Paused", "Muted" -->
</div>
</div>
RULES:
- RULE: aria-live="polite" region for announcing player state changes to screen readers
- RULE: all controls must have visible focus indicators (no outline: none)
- RULE: icons in buttons must have aria-hidden="true" — the aria-label provides the name
- RULE: aria-valuetext on sliders must be human-readable (not just a number)
- RULE: Tab order must follow visual layout (play, seek, volume, captions, fullscreen)
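The human-readable `aria-valuetext` strings can be produced by small formatting helpers that run whenever the slider value changes. A minimal English-only sketch; `seekValueText`, `volumeValueText`, and `describeDuration` are hypothetical names, and localized products would need translated strings.

```javascript
function plural(n, unit) {
  return `${n} ${unit}${n === 1 ? '' : 's'}`;
}

// Describe a duration in whole minutes and seconds, e.g. "1 minute 15 seconds".
function describeDuration(totalSeconds) {
  const s = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(s / 60);
  const seconds = s % 60;
  if (minutes === 0) return plural(seconds, 'second');
  if (seconds === 0) return plural(minutes, 'minute');
  return `${plural(minutes, 'minute')} ${plural(seconds, 'second')}`;
}

// Text for the seek slider's aria-valuetext.
function seekValueText(currentSeconds, durationSeconds) {
  return `${describeDuration(currentSeconds)} of ${describeDuration(durationSeconds)}`;
}

// Text for the volume slider's aria-valuetext.
function volumeValueText(percent) {
  return `${Math.round(percent)} percent`;
}
```

For instance, `seekValueText(45, 300)` produces the same string used in the markup above, "45 seconds of 5 minutes".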
A11Y:SCREEN_READER_CONSIDERATIONS¶
DECORATIVE_VS_INFORMATIVE¶
DECORATIVE (adds no information):
- background patterns, dividers, ornamental graphics
- images that duplicate adjacent text content
- purely aesthetic illustrations
- TREATMENT: alt="" + role="presentation" or aria-hidden="true"
INFORMATIVE (conveys meaning):
- photos showing products, people, places
- charts, graphs, diagrams
- icons that indicate function (search, settings, etc.)
- screenshots of UI
- TREATMENT: descriptive alt text or aria-describedby for complex images
IMAGE_LINK_PATTERN¶
When an image is wrapped in a link:
<!-- Image IS the link content -->
<a href="/dashboard">
<img src="dashboard-preview.png" alt="Go to your analytics dashboard">
</a>
<!-- Image accompanies text link -->
<a href="/features">
<img src="feature-icon.svg" alt="" aria-hidden="true">
<span>View all features</span>
</a>
RULE: if image is the only content in a link, alt text describes the destination
RULE: if text already describes the link, image should be alt="" (avoid redundancy)
FIGURE_AND_FIGCAPTION¶
<figure>
<img src="architecture.png" alt="System architecture showing three layers">
<figcaption>
Figure 1: The Growing Europe platform architecture consists of an
orchestration layer (Redis Streams), a knowledge layer (LLM summarization),
and a company layer (agent identities and processes).
</figcaption>
</figure>
RULE: <figure> + <figcaption> for images that need visible captions
RULE: alt text should be complementary to figcaption, not duplicate it
RULE: screen readers announce both — so alt should be brief if figcaption is detailed
A11Y:COLOR_INDEPENDENCE¶
REQUIREMENTS¶
WCAG 1.4.1 (Level A): Color is not the only means of conveying information.
WCAG 1.4.3 (Level AA): Text contrast ratio minimum 4.5:1 (3:1 for large text).
WCAG 1.4.11 (Level AA): Non-text contrast minimum 3:1.
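The contrast thresholds above are defined in terms of relative luminance, so they can be checked in code. The sketch below follows the WCAG 2.x relative-luminance and contrast-ratio definitions for 6-digit hex colors; the function names are hypothetical, and it ignores transparency and text placed over photographs, where the background color varies.

```javascript
// Linearize an 8-bit sRGB channel as defined by WCAG 2.x.
function channelToLinear(value8bit) {
  const c = value8bit / 255;
  return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
}

// Relative luminance of a 6-digit hex color such as "#1E293B".
function relativeLuminance(hex) {
  const m = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!m) throw new Error(`expected a 6-digit hex color, got "${hex}"`);
  const [r, g, b] = m.slice(1).map((h) => channelToLinear(parseInt(h, 16)));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(hexA, hexB) {
  const [lighter, darker] = [relativeLuminance(hexA), relativeLuminance(hexB)].sort(
    (a, b) => b - a
  );
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG 1.4.3 AA check: 4.5:1 for normal text, 3:1 for large text.
function meetsTextContrastAA(foreground, background, largeText = false) {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}
```

The same `contrastRatio` function can be compared against 3 for the non-text contrast requirement in 1.4.11.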
IN_IMAGES¶
- charts must use patterns/labels in addition to color
- status indicators need icons + color (not just red/green)
- text overlay on images must maintain contrast ratio
EXAMPLE (chart accessible to colorblind users):
Instead of: red bar = loss, green bar = profit
Use: red bar with diagonal lines = loss, green solid bar = profit, plus text labels
IN_VIDEO¶
- never convey information solely through color
- use labels, patterns, icons alongside color
- test video frames against colorblind simulation
- TOOLS: Color Oracle (desktop simulator), Sim Daltonism (macOS)
RULE: every piece of color-coded information must have a non-color alternative
RULE: test all charts and infographics with deuteranopia simulation (red-green deficiencies are the most common type)
A11Y:TESTING_CHECKLIST¶
AUTOMATED¶
- [ ] all <img> elements have alt attributes
- [ ] all <video> elements have <track kind="captions">
- [ ] color contrast meets 4.5:1 minimum (text on images)
- [ ] prefers-reduced-motion handled in CSS/JS
- [ ] no content flashes more than 3 times per second
- TOOLS: axe DevTools, Lighthouse accessibility audit, pa11y
MANUAL¶
- [ ] captions are accurate and synchronized
- [ ] alt text is meaningful in context
- [ ] video player fully keyboard navigable
- [ ] screen reader announces all interactive states
- [ ] reduced motion shows appropriate fallback
- [ ] color is never the sole information carrier
- [ ] transcript available for all video content
SCREEN_READER_TESTING¶
Test with at least 2 screen readers:
- VoiceOver (macOS/iOS) — Safari
- NVDA (Windows) — Chrome/Firefox
- TalkBack (Android) — Chrome
RULE: test media content with screen reader before every client delivery
RULE: automate what you can (axe, Lighthouse) but manual testing is not optional
CROSS_REFERENCES¶
- Video production pipeline: video-production.md
- Delivery specifications (caption formats per platform): delivery-specs.md
- Asset optimization (contrast, format support): asset-optimization.md
- Image generation (alt text for AI images): image-generation.md