Captioning & description
What is captioning?
Captions are meant to support people who are D/deaf and hard of hearing. They are different from subtitles, which are only meant to translate dialogue for viewers who speak a different language. Subtitles assume the audience can hear music, background sounds, and other non-verbal content. Captions, by contrast, include these sounds in addition to all dialogue: they describe sound effects, the type of music playing, or whether the speaker has an accent.
Captions have been shown to support the learning of students who speak English as an additional language, students with learning disabilities, and students who are new to a discipline and may be unfamiliar with unique terminology.
Automatic captions
Automatic captions are generated using speech recognition technology powered by machine learning. Although the accuracy and efficiency of the technology are always improving, it does not offer 100% accuracy and requires significant editing. Automatic captions can be used as a starting point for developing accurate captions and transcripts.
Captions that are at least 99% accurate are the only way to ensure that people who are D/deaf or hard of hearing can understand audio content. Unedited automatic captions should never be used as a substitute for accurate captions or ASL interpreting.
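As a rough illustration of what the 99% threshold means, the sketch below compares a caption text against a corrected reference transcript. It assumes accuracy is measured at the word level as one minus the word error rate; this page does not define a measurement method, so treat that as an assumption. The function name `word_accuracy` is hypothetical.

```python
def word_accuracy(reference: str, caption: str) -> float:
    """Estimate word-level caption accuracy as 1 - (word errors / reference words).

    Assumption: errors are counted as the word-level edit distance
    (substitutions, insertions, deletions), ignoring letter case.
    """
    ref = reference.lower().split()
    hyp = caption.lower().split()
    if not ref:
        return 1.0 if not hyp else 0.0

    # Classic dynamic-programming edit distance, computed over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from caption
                            curr[j - 1] + 1,      # extra word in caption
                            prev[j - 1] + cost))  # match or substitution
        prev = curr

    errors = prev[-1]
    return max(0.0, 1.0 - errors / len(ref))


def meets_threshold(reference: str, caption: str, threshold: float = 0.99) -> bool:
    """Return True when the estimated accuracy reaches the threshold."""
    return word_accuracy(reference, caption) >= threshold
```

Under this measure, a 1,000-word transcript can contain at most 10 word errors and still reach 99%.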
Open versus closed captioning
There are two types of captioning: open and closed. Open or “hard” captions are permanently embedded in the video stream and cannot be turned off by the user. Closed captions contain the exact same text as open captions, although users have the ability to toggle them on or off using the video player.
There are different factors to consider when deciding between open and closed captioning, such as the target audience, the platform or video player where the video will be uploaded, and the accessibility features of that player. In most cases, closed captioning is recommended.
When recording to the Zoom Cloud, a transcript will automatically be generated in WebVTT format. While this isn't a compatible subtitle format for many media players, it can be converted into one that is, or it can be uploaded as an accompanying transcript.
Step 1: Edit the transcript for accuracy
Under Cloud Recordings, find your recording. Select the Play thumbnail to open the Zoom media player. Navigate to the Audio Transcript panel on the right and select the Edit button (the pencil icon next to the phrase you want to edit). You can adjust the playback speed of the video if that makes corrections easier.
Step 2: Download recording and corrected audio transcript
If you are re-uploading your recording to Google Drive, Stream, or any other video hosting platform, you will need both the video file and the transcript file. Select Download in the top right corner, or download the files individually from the previous screen. The Zoom recording and transcript will download as an .mp4 file and a .vtt file, respectively.
Step 3: Upload to your preferred hosting platform
If you receive any captioning error messages when uploading the VTT file, please convert the VTT file into SRT format using a third-party tool.
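If you would rather not rely on a third-party tool, the conversion can also be scripted. The following is a minimal sketch, assuming a simple transcript file with plain cue text (no styling or positioning settings). The function name `vtt_to_srt` is hypothetical and is not part of Zoom or any hosting platform. It drops the WEBVTT header, renumbers the cues, and switches the millisecond separator from a period to a comma, as the SRT format expects.

```python
import re

# Matches a WebVTT cue timing line; the hours field is optional in WebVTT.
TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})"
)


def _srt_time(clock: str, millis: str) -> str:
    """Build an SRT timestamp (HH:MM:SS,mmm) from WebVTT timestamp parts."""
    if clock.count(":") == 1:  # MM:SS only, so add the missing hours field
        clock = "00:" + clock
    return f"{clock},{millis}"


def vtt_to_srt(vtt_text: str) -> str:
    """Convert simple WebVTT text to SRT text."""
    normalized = vtt_text.replace("\r\n", "\n").strip()
    blocks = re.split(r"\n\s*\n", normalized)
    cues = []
    for block in blocks:
        lines = block.split("\n")
        # Find the timing line; anything before it is a cue identifier.
        for index, line in enumerate(lines):
            match = TIMING.match(line.strip())
            if match:
                break
        else:
            continue  # header or note block: no timing line, so skip it
        start = _srt_time(match.group(1), match.group(2))
        end = _srt_time(match.group(3), match.group(4))
        text = "\n".join(lines[index + 1:])
        # SRT cues are numbered sequentially starting at 1.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + "\n"
```

To use it, read the downloaded .vtt file as UTF-8 text, pass the text to `vtt_to_srt`, and save the result with an .srt extension. Review the converted file in your video player before publishing, since this sketch does not handle every WebVTT feature.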
Live captioning for events
Live captioning is similar to closed captioning on a video, except that it is produced in real time: a captioner listens in remotely over the internet (via Skype, for example) or by phone, and delivers the reproduced text instantaneously to a projected screen, a TV, or a user’s mobile device.
For more information, please visit Remote Captioning for Events or Virtual Events and Meetings.
Information for faculty & staff
The Accessibility for Ontarians with Disabilities Act (AODA) stipulates that all video and audio content shared on a public-facing website must be captioned and/or transcribed. Any video or audio that is not intended for general public use must be captioned upon request.
TMU has an official Vendor of Record for Audio/Video Captioning and Transcription Services (Google Doc) with Ai-Media.
Guidance on captioning and description
If you’re creating multimedia for university-affiliated websites or social media, videos must have closed captions and audio content must have an accompanying transcript. Whenever starting a new multimedia project, budget for captioning in the same way you would budget for video editing, equipment and other expenses. Alternatively, you can learn how to caption videos yourself for free.
Video or audio 1-3 minutes
Free: Manually edit Zoom or YouTube's automatically-generated captions for accuracy.
Video or audio 3-60+ minutes
Budget for professional captions. Refer to TMU's Vendor of Record for Audio/Video Captioning and Transcription Services (Google Doc) with Ai-Media.
Recorded lectures
Recorded lectures can run as long as 3 hours and may only be used for one term. We do not recommend budgeting for professional captioning of recorded lectures unless they will be posted on a public-facing university website. There are currently no viable closed captioning solutions for lectures of this length. Consider editing Zoom's automatically generated transcript instead.
For students registered with Academic Accommodation Support (AAS) who require captioning or description:
- For lectures that require live captioning, please contact Academic Accommodation Support.
- For pre-recorded lectures, videos or audio content used within a course, please contact the Library’s Accessibility Services.
Content within D2L Brightspace
When multimedia content is developed for courses and will be reused in subsequent courses, videos must be captioned and audio content must be transcribed prior to dissemination. While this requirement does not apply to third-party or supplementary content, it is highly recommended that captioned content is sourced at the outset to minimize the need for individuals to request accommodation.
- You can use Zoom to pre-record shorter videos or lecture components, and leverage Zoom’s auto transcription as a starting point for creating accurate captions.
Classroom accommodations
For students who require captioned media, please contact Library Accessibility Services as soon as possible. Library Accessibility Services will work with everyone involved to ensure access to course materials, including the student, instructor and Academic Accommodation Support.
Audio description
For any public-facing videos without sound or videos that only contain music, a text alternative must be provided at a minimum. Text or audio descriptions ensure people who are blind or have low vision can understand what is happening in the video.
- Minimum: Provide a simple text description of what's happening on screen below the video frame (WCAG 2.0: Level A).
- Recommended: Provide a voice-over or narration for all on-screen text elements, and describe any complex visuals or interactions.
- Best: Provide audio description or a detailed narration for your video if possible. Audio description is not mandatory but is highly encouraged, because it provides the most accessible experience.