Captions

Overview

Captions convey a digital video's spoken content, identify the different speakers, and describe any important non-verbal sounds (eg sound effects, laughter).

All pre-recorded digital video content produced or controlled by the University must be published with captions.

For advice on producing captions for education materials, see our page Providing captions for online learning.

What and why: Captions give everyone access to the audio information in a digital video. This benefits people with hearing impairments and cognitive differences, as well as people without disabilities who prefer to read captions as they listen, or who are in an environment where they cannot listen to the audio.

Types of captions

Captions can be ‘open’ or ‘closed’. Open captions are ‘burnt in’ on the video. Closed captions can be turned on and off.

Open captions should only be used in place of closed captions where it’s known viewers won’t have the ability to turn captions on (eg public display screens, some social media).

Caption requirements

Accuracy

Captions are considered accessible when they are accurate. Automatic captions aren't considered accessible because they may contain errors. However, they can be edited and corrected to be made accessible.
Do not paraphrase. Only condense or delete wording where there is too much text to read at a comfortable speed.

Visual presentation

Media players typically define a visual style for captions, or allow the user to customise the appearance. Where you are required to set a visual style (eg 'burnt in' open captions), apply the following.

Captions mustn't obscure important information (eg on-screen text, key steps in demonstrations).
Left-align the caption text.
Use a common sans serif font, such as Arial or Helvetica.
Use white text on an opaque dark grey (hex value #3B3B3B) background or a translucent black background. For a translucent background, use black at a minimum of 75% opacity (25% transparency) so that the white text keeps at least a 4.5:1 contrast ratio whatever video content is underneath.
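
The contrast figures above can be checked with a short calculation. The following is a minimal sketch, assuming the WCAG 2.x relative luminance formula and a simple alpha blend of the black background over the brightest possible video content (pure white). The function names are illustrative and not part of any media player.

```python
def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light."""
    c = channel / 255
    # WCAG text historically lists 0.03928 as the threshold; for 8-bit
    # values it gives the same result as the sRGB value used here.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB colour, from 0 (black) to 1 (white)."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Contrast ratio between two colours, from 1:1 up to 21:1."""
    lum_a, lum_b = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def black_overlay(video_pixel: tuple[int, int, int], opacity: float) -> tuple[int, ...]:
    """Assumption: the player blends a black box over the video in sRGB space."""
    return tuple(round(v * (1 - opacity)) for v in video_pixel)


WHITE = (255, 255, 255)
DARK_GREY = (0x3B, 0x3B, 0x3B)

# Opaque dark grey background.
print(round(contrast_ratio(WHITE, DARK_GREY), 2))

# Worst case for the translucent option: 75% black over pure white video.
worst_case = black_overlay(WHITE, 0.75)
print(round(contrast_ratio(WHITE, worst_case), 2))
```

Both results come out well above the 4.5:1 minimum. Lowering the opacity argument shows how quickly the worst-case ratio drops when the background is more transparent.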

Content guide

As a guide, captions should be visible for a minimum of around 0.3 seconds per word (eg 3 seconds for 10 words). However, following the speed of typical speech should be acceptable.
There should be no more than two lines of captions on screen at a time.
Capitalise words only as you would in a typical sentence. Do not use all caps unless for a linguistic, pedagogic or informative purpose.
Non-verbal audio information should be presented inside square brackets: [loud alarm], [in French], [shouting].
If there are named speakers, their speech should be introduced with the name in square brackets. Each new speaker should be introduced this way.
Similarly, use names in square brackets to distinguish multiple speakers where their speech will display on screen at the same time. Different speakers should be placed on different lines.
Disrupted speech should be conveyed by using double hyphens (--) at the end of the broken speech.
If strong language is present in the audio, it should be in the captions. If it is 'bleeped' or muted in the audio, show the bleep or muting in the captions too.
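
The timing and line-count guides above lend themselves to an automated check. The following is a rough sketch, assuming each caption is held as a start time, end time and text. The `Cue` class and `check_cue` function are hypothetical names for illustration, and the word count is a simple whitespace split that also counts bracketed labels.

```python
from dataclasses import dataclass

SECONDS_PER_WORD = 0.3  # guide value: around 0.3 seconds per word
MAX_LINES = 2           # no more than two lines on screen at a time


@dataclass
class Cue:
    """Hypothetical container for one caption."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # caption text, lines separated by "\n"


def check_cue(cue: Cue) -> list[str]:
    """Return a list of guide items this caption does not meet."""
    problems = []

    word_count = len(cue.text.split())
    minimum = word_count * SECONDS_PER_WORD
    duration = cue.end - cue.start
    # Small tolerance so floating point rounding does not cause false alarms.
    if duration + 1e-9 < minimum:
        problems.append(
            f"on screen for {duration:.2f}s, guide minimum is {minimum:.2f}s"
        )

    line_count = len(cue.text.splitlines())
    if line_count > MAX_LINES:
        problems.append(f"{line_count} lines, guide maximum is {MAX_LINES}")

    return problems


# Ten words shown for three seconds meets the guide.
ok = Cue(0.0, 3.0, "one two three four five\nsix seven eight nine ten")
print(check_cue(ok))

# The same ten words shown for one second is flagged as too fast.
too_fast = Cue(0.0, 1.0, "one two three four five\nsix seven eight nine ten")
print(check_cue(too_fast))
```

Because the 0.3 seconds figure is a guide and not a hard rule, treat any flagged caption as a prompt for a manual review, particularly where the caption simply follows the pace of natural speech.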

Further guidance

For any topic not covered here, taking your lead from one of the following standards should result in an accessible choice:

Described and Captioned Media Program Captioning Tip Sheet.
Netflix English Timed Text Style Guide.
BBC Subtitle Guidelines. Ensure you're using guidance applicable to online and not only broadcast subtitles. Broadcast has different constraints compared to digital video.

How to produce captions

Commissioned videos

If you're commissioning a video, always ask the supplier to provide closed captions. They will likely do this by providing an SRT file (with the .srt extension). This file contains the text of what's being said in a video, along with the timing of the dialogue and the order in which it appears. It can be uploaded to YouTube or other media players so that users can turn the captions on or off.
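
To show what such a file looks like inside, here is a minimal sketch that writes two captions in SRT layout: a sequence number, a start and end time in `HH:MM:SS,mmm` form joined by `-->`, the caption text, then a blank line. The function names, speaker name and dialogue are made up for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start, end, text) tuples, numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates one caption from the next.
    return "\n".join(blocks)


# Hypothetical captions: a named speaker, then a non-verbal sound.
example = to_srt([
    (0.0, 3.0, "[Anna] Welcome to the session."),
    (3.5, 6.25, "[loud alarm]"),
])
print(example)
```

Opening a supplier's .srt file in a plain text editor should show the same pattern, which makes it straightforward to spot-check the text against the audio before uploading.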

Self-produced video or applying captions retroactively

Use the following workflow to produce captions in a way that will put you in a good position to provide a full transcript and, if necessary, a script for audio description.

Use an automatic transcription as a starter. If you don't have one already - such as one produced by Microsoft Stream, Mediasite or YouTube - then Word in Office 365 (free to all students and staff) has a transcription feature. Word allows you to upload audio or video from which it will create a written document of the verbal information. See Microsoft's guidance on transcribing your recordings.
Read the transcript as you listen to the audio. Correct any speech or grammar mistakes. Introduce speakers and add descriptions of other important audio-only content, placing the information in square brackets and on a new line.
Save two copies: one to use as a captions starter, and another to use later for the transcript.
From here it's possible to upload or adjust the captions starter to provide captions for end users. How this is done depends on the platform your video is hosted on.

Detail about how to ensure accurate captions in platforms such as Microsoft Stream, Mediasite or YouTube will follow shortly.

Correcting a subtitle file

You can easily correct subtitle files you already have using the free Subtitle Edit software, available via AppsAnywhere.