Copy of Transcript Formats

You can upload these transcript file types to Audio-Video:

SubRip (.srt extension)
.xml
.txt

Support for .xml and .txt files is added on a project-by-project basis. To request it, contact shanti-support@virginia.edu and tell them which format you use to encode your transcripts.

These formats assume you have a transcription of your file that includes timestamps – that is, start times and end times for the dialogue. If you don't have timestamped transcripts, you can paste the text of the transcript in the resource's description.

SubRip

A transcript in .srt format is a sequence of entries. Each entry is written in the following order:

A number, which identifies the entry's position in the sequence
The start time for the entry in hours:minutes:seconds,milliseconds, followed by -->, then the end time for the entry
The transcript text
A blank line to mark the end of the entry

Here's an example:

120
00:00:17,710 --> 00:00:19,820 
I am speaking
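
The entry layout above can be checked with a short script before uploading. The following is a minimal sketch, not how Audio-Video processes files: the names `Cue`, `to_ms`, and `parse_srt` are hypothetical, and skipping malformed entries is an assumed policy.

```python
import re
from typing import NamedTuple

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm", the timing line described above.
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


class Cue(NamedTuple):
    """One .srt entry (hypothetical helper type)."""
    index: int
    start_ms: int
    end_ms: int
    text: str


def to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert the four timestamp fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(source: str) -> list[Cue]:
    """Split .srt text into entries; entries are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if len(lines) < 3:
            continue  # assumed policy: skip entries missing a part
        match = TIMING.fullmatch(lines[1])
        if not lines[0].isdigit() or match is None:
            continue  # assumed policy: skip entries with a bad number or timing
        g = match.groups()
        cues.append(
            Cue(int(lines[0]), to_ms(*g[:4]), to_ms(*g[4:]), "\n".join(lines[2:]))
        )
    return cues
```

Calling `parse_srt` on the example entry above yields one `Cue` with index 120, a start time of 17710 ms, and an end time of 19820 ms.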

To mark a change in speakers, start the transcript text with >>, followed by the speaker's name and a colon. For example:

120
00:00:17,710 --> 00:00:19,820 
>> JOHN: I want to upload a transcript.

121
00:00:20,710 --> 00:00:21,820 
>> ALICE: I can help you with that. 
 
122
00:00:22,910 --> 00:00:23,820 
>> Do you have an Audio-Video account? 
 
123
00:00:25,710 --> 00:00:26,820 
>> JOHN: Yes, I do.
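
To illustrate the speaker convention, here is a minimal sketch that separates the >> marker, the optional speaker name, and the spoken text. The function name `split_speaker` is hypothetical, and treating everything before the first colon as the name is an assumption that will misread text such as ">> It is 3:00".

```python
import re
from typing import Optional, Tuple

# ">>" marks a speaker change; the "NAME:" part is optional, as in entry 122 above.
SPEAKER = re.compile(r"^>>\s*(?:([^:]+):\s*)?(.*)$", re.DOTALL)


def split_speaker(text: str) -> Tuple[bool, Optional[str], str]:
    """Return (speaker_changed, speaker_name, spoken_text) for one entry's text."""
    match = SPEAKER.match(text.strip())
    if match is None:
        # No ">>" marker: the same speaker continues.
        return False, None, text.strip()
    name, spoken = match.groups()
    # Assumption: anything before the first colon is the speaker's name.
    return True, (name.strip() if name else None), spoken.strip()
```

For example, `split_speaker(">> JOHN: Yes, I do.")` returns `(True, "JOHN", "Yes, I do.")`, while text without >> is returned unchanged with no speaker.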

You can find out more with the external YouTube transcript guide.

Multilingual Transcripts

You can use multiple languages in SubRip transcripts in Audio-Video. Each language will have its own line in the public video interface. Viewers can then show or hide individual languages in the transcript or subtitles.

Enabling Multilingual Transcripts

To make this feature work for your video:

Upload your transcript in Audio-Video
- A “Language Tier” field will open
Choose the languages in your transcript from the drop-down menu
Click Connect
- Audio-Video will process your transcript to let viewers choose languages

Formatting in InqScribe

You can make multilingual transcripts using InqScribe Transcription Software. To mark multilingual transcripts in InqScribe, separate the languages within each time-coded line with a /. Here's an example of a single time-coded line in English and Tibetan:

[00:00:00.14]??????????????? / ??????????????????????????????? / Now what is this we call "böd" (Tibet)?

If you want to mark who is speaking:

put the speaker's name inside the language segment, after the / that separates the languages
follow the speaker's name with a colon

Here's an example of the same single time-coded line in English and Tibetan with one speaker, Tsering Gyalpo:

[00:00:00.14]??????????????? / Tsering Gyalpo: ??????????????????????????????? / Now what is this we call "böd" (Tibet)?
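
The line format above can be illustrated with a minimal sketch that splits one time-coded line into its language segments and optional speaker names. The names `Segment` and `parse_line` are hypothetical. Splitting on " / " with surrounding spaces and treating the text before the first colon as a speaker name are assumptions drawn from the examples, not documented InqScribe or Audio-Video behavior.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# A bracketed timecode such as "[00:00:00.14]" followed by the segments.
LINE = re.compile(r"^\[(\d{2}:\d{2}:\d{2}\.\d{2})\](.*)$")


class Segment(NamedTuple):
    """One language segment of a time-coded line (hypothetical helper type)."""
    speaker: Optional[str]
    text: str


def parse_line(line: str) -> Tuple[str, List[Segment]]:
    """Return the timecode and the language segments of one line."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a time-coded line: {line!r}")
    timecode, body = match.groups()
    segments = []
    # Assumption: languages are separated by " / " with surrounding spaces.
    for part in body.split(" / "):
        name, colon, rest = part.partition(":")
        if colon:
            # Assumption: text before the first colon is the speaker's name.
            segments.append(Segment(name.strip(), rest.strip()))
        else:
            segments.append(Segment(None, part.strip()))
    return timecode, segments
```

Because any colon is read as a speaker delimiter, a segment whose text itself contains a colon would need extra handling, for example a list of known speaker names.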

To export the multilingual transcript with speakers in InqScribe:

From the menu, click File > Export > XML..
- A window will open
Check Export Speaker Names
Enter a colon ( : ) for the speaker name delimiter
Export the file
- Your file is ready for Audio-Video
