Friday, September 23, 2011

Adding a Transcription in YouTube

A while back I wrote about adding Captions in YouTube. It's a pretty straightforward process, though you have to put in start and stop times for each piece of text, which can be quite a lot of additional work.

The other day, someone suggested that adding a transcription to a video served much the same purpose as doing captions, for a lot less work, so I thought I'd give it a try.

I typed up a simple transcription of the video I made of the Big 'Ole Lens Test Party. I put in no timing information, but I did put the name of each speaker in square brackets in front of each piece of dialog, because I thought that would make it more understandable to any reader. The file looked something like this:
[Chris Loughran] Ben and I were having dinner at the Border Cafe in Cambridge, and started talking about lenses, and Ben doesn’t really think there’s a big difference between a hundred dollar lens and a 20,000 lens, so we're here today to prove him wrong.


[Ben Eckstein] I'm not sure that that's exactly what I said, but ah, yeah, my thesis statement going into this is that we will see very marginal image quality differences.
Uploading the transcription is simple:
  1. Go to your list of videos in YouTube
  2. Click the Edit info button
  3. Click the Captions and Subtitles link at the top of the screen
  4. Click the Add New Captions or Transcript button
At the next screen, choose Transcript file, click the Choose File button, and choose the transcription file using the open file dialog. Then click the Upload file button.

At this point, YouTube will say that it is processing the file, and this can take several minutes. I didn't time it, but I think it was processing for about ten minutes for this 11-minute video.

When it was finished, I was surprised to discover that YouTube had taken the transcription and attempted to match it to the video as though it were a caption file.

It wasn't perfect by any means. I'd created my transcription file in Microsoft Word, and it contained ellipses […] and special apostrophe characters [ ’ instead of ' ]. These came up as garbage characters - YouTube only likes plain text - so I replaced all the special characters in the original file, deleted the transcription, and uploaded a second time.
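If you'd rather not hunt down those characters by hand, a few lines of Python could do the clean-up before you upload. This is only a rough sketch: the list of characters is my guess at what Word typically inserts (curly quotes, ellipses, long dashes, non-breaking spaces), and the function name is made up for illustration.

```python
# Characters a word processor commonly substitutes, mapped to plain-text
# equivalents. This list is an assumption, not a complete inventory.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / curly apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # single-character ellipsis
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u00a0": " ",    # non-breaking space
}


def to_plain_text(text: str) -> str:
    """Return text with the characters above replaced by plain equivalents."""
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    return text


if __name__ == "__main__":
    sample = "Ben doesn\u2019t really think there\u2019s a big difference\u2026"
    print(to_plain_text(sample))
```

Saving the transcript from Word as plain text may achieve much the same thing, but checking the result in a text editor before uploading is still worthwhile.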

All things considered, YouTube does a pretty good job of distributing the transcription text to the video. And it's not just breaking it up evenly along the length of the movie; clearly it's doing some sort of audio analysis, as it got about 75% of the text almost perfectly matched up. There were, however, three or four sections where it got seriously messed up, displaying sections of text at the wrong time. Also, the beginning of the clip has music for several seconds, and the very first line of text is displayed long before anyone starts speaking!

If you want to fix the timing, you can download the transcription from YouTube as an SBV file. You'll find that the transcription now has timecode added. So you can use this as a step in the process of creating a good captions file.
[Chris Loughran] Ben and I were having dinner
at the Border Cafe in Cambridge, and started

talking about lenses, and Ben doesn't really
think there's a big difference between a

hundred dollar lens and a 20,000 lens, so
we're here today to prove him wrong.


If I had the enthusiasm - and the energy - I might edit the timecode and upload the file again, but editing the timecode would still be a lot of additional work, particularly as from this point on, you're doing it all manually.
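Some of that manual work could probably be scripted. The sketch below shifts every timecode in an SBV file by a fixed number of milliseconds. It assumes each caption starts with a line of the form start,end written as H:MM:SS.mmm, which is my understanding of the SBV format; the function names and the sample timings are invented for illustration and are not taken from my actual file.

```python
import re

# Assumed SBV timestamp layout: H:MM:SS.mmm
TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})\.(\d{3})")


def to_ms(stamp: str) -> int:
    """Convert an H:MM:SS.mmm timestamp to milliseconds."""
    h, m, s, ms = map(int, TIME_RE.fullmatch(stamp).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def from_ms(total: int) -> str:
    """Convert milliseconds back to H:MM:SS.mmm, clamping at zero."""
    total = max(total, 0)
    seconds, ms = divmod(total, 1000)
    minutes, s = divmod(seconds, 60)
    h, m = divmod(minutes, 60)
    return f"{h}:{m:02d}:{s:02d}.{ms:03d}"


def shift_sbv(sbv_text: str, offset_ms: int) -> str:
    """Shift every start,end line by offset_ms; leave caption text alone."""
    out = []
    for line in sbv_text.splitlines():
        parts = line.strip().split(",")
        # Only lines made of exactly two valid timestamps are timing lines.
        if len(parts) == 2 and all(TIME_RE.fullmatch(p) for p in parts):
            line = ",".join(from_ms(to_ms(p) + offset_ms) for p in parts)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    sample = (
        "0:00:00.500,0:00:04.000\n"
        "[Chris Loughran] Ben and I were having dinner\n"
        "\n"
        "0:00:04.000,0:00:06.250\n"
        "at the Border Cafe in Cambridge, and started"
    )
    print(shift_sbv(sample, 1500))
```

A uniform shift like this only helps when everything is off by the same amount; the sections that YouTube matched to the wrong part of the audio would still need to be fixed one by one.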

Even so, I'm really impressed by what YouTube has done, but it only makes me want more. It's a pity that YouTube hasn't provided some kind of online editor that lets you play the video and adjust the timing of the captions as you move through the video and caption file. Perhaps they will have something like that in the future.

Note: if you don't see captions, click the CC button in the control bar.

NotesOnVideo: Adding Captions in YouTube
YouTube Help: Adding and Editing captions / subtitles
WebDev-il: SBV file format for Youtube Subtitles and Captions


Arlen said...

InqScribe is worth a try if you make a lot of transcriptions

Michael Murie said...

That looks really interesting, even if it is $100. I'm going to have to try it out; could be useful for transcribing meeting recordings too...