Crowdsourcing YouTube Video Captioning

Oh, I just love Christian Heilmann’s blog! He’s one of the most skilled JavaScript developers I know, so I can still learn tricks from him and his colleagues at Yahoo! But he also does everything with JavaScript, and every time that happens, it inspires me to mull over how it could be done without it! You know, because restrictive company proxies filter JavaScript, or just to provide a graceful alternative, or because I’m so old-fashioned. ;)

So when Chris mentions JavaScript badges for, I think of Ed Eliot’s PhpDelicious or how to get similar results in WordPress. Naturally, when he had the splendid idea to add captioning to YouTube videos with Google’s JavaScript API, I asked myself if there wasn’t a better way. There is, but to my surprise neither YouTube nor Yahoo! Video takes advantage of that capability:

  1. It’s common practice to implement text content in Flash via XML.
  2. DFXP is a W3C XML standard for captioning videos.
  3. There are free tools like MAGpie for creating captions, and they all support DFXP.
  4. If YouTube or Yahoo! Video allowed users to upload and attach a DFXP file to a video, it would become dead easy for anybody to caption it. Not only the filmmaker could do it: captioning could be crowdsourced!

Just anticipating Joe Clark’s inevitable (and justifiable) objection: of course captioning is not an easy thing that anybody can do; doing it right requires people with special training. But considering the number of videos on those platforms, the only affordable and practical way to provide any captioning at all is crowdsourcing. The same approach can be applied to quality control, accuracy checks, and abuse reporting.

Of course this would primarily enhance accessibility, but the XML files would also help search engines index video content. And their marketing people would love such a feature for the positive PR!

Well, I submitted the suggestion to both Google and Yahoo!; you can vote for it on the Yahoo! Developer Network (Google is more closed-lipped). I’m curious who will be the first to offer that feature …

13 Responses to ‘Crowdsourcing YouTube Video Captioning’

  1. chris heilmann

    Hm… Yes, there are these free tools and there are standards for captioning. The roaring success that captioning in online video has is based on these being a real pain to use and them being overhead for development and deployment. I used JavaScript not because I wanted to use JavaScript but to use the API the player comes with. This means with a script like that and for example a browser extension we could caption any video that is already on youtube within youtube. You will not get people to do extra work with extra tools but if a very high profile public tool gets a support functionality like this we might at least get some more quasi captions out there which is a lot more than nothing — as we get now.

  2. Martin Kliehm

    Oh, no criticism intended. The proposed JavaScript solution is an improvement and works right now, that’s great! But thinking forward I’m sure there are people who would take the burden to create captions for their own videos or those of other people they deem important enough to contribute time to. I’m not talking about developers doing extra work, this could be crowdsourced to the community. If only there was an upload mechanism or interface…

    With JavaScript, captioning is limited to the page of the person who embeds the video. With captions integrated into Flash, they would come along automatically wherever the Flash video is embedded, even directly on YouTube or Yahoo Video. Now that would be an improvement!

  3. chris heilmann

    Actually with the captioning format created with this tool it could be part of the embedding script and load the caption file showing it in clear text or in a form field on the page. That way you wouldn’t even have to access the flash movie. You won’t be able to change all Flash players out there, but you can pimp them, that is why Google offers an API.

  4. Martin Kliehm

    OK, I get what’s the advantage of the Google API concerning third party players. Nevertheless Google or Yahoo could pimp their players, and millions of users would benefit from it. That’s my whole suggestion. ;)

  5. Don Rideaux-Crenshaw

    I’m working in my spare time on creating a dfxp parser and tying it into the Google API to create a player that can read the dfxp file, get the timecode from the api and display the text in an “open caption” box.

    In concept it’s pretty simple. Use javascript to read the captioning text and begin/end times into an array. Grab the playhead position, search the array and use innerHTML on a span to display the caption text associated with the current time. Style information could also be incorporated. I’ve built similar things in the past in actionscript as part of custom flash video players. In that case, I wrote my own XML data island so I could focus on solving the problem at hand, not on a more general approach.

    If the project comes to fruition, I’ll make the javascript widely available. Then yes, anyone could embed a YouTube video on their site, point the script at their dfxp file, and be rocking and rolling. Alternative dfxp files could give “subtitles” in alternative languages, etc.

    As time permits, I’m taking the first step learning how to parse the dfxp and building the parser. Anyone with a javascript dfxp parser they’d like to donate to the cause, email me at dgcrenshaw at comcast dot net. Once I get a little traction on this project, I’ll put a page up on my site to support it.
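
The approach described in comment 5 (read begin/end times and caption text into an array, then look up the caption for the current playhead position) could be sketched roughly as follows. This is only an illustration, not Don's parser: the names `clockToSeconds`, `parseCues`, `captionAt` and the sample file are hypothetical, a regular expression stands in for a real XML parser, and the sketch assumes that `begin` is written before `end` and that times use the hh:mm:ss.fraction form.

```javascript
// Hypothetical sample caption file. The namespace shown is the one used by
// the final TTML 1.0 recommendation; earlier DFXP drafts used other URIs.
const sampleDfxp = `<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
  <body>
    <div>
      <p begin="00:00:01.000" end="00:00:03.500">Hello, world.</p>
      <p begin="00:00:04.000" end="00:00:06.000">Second<br/>line.</p>
    </div>
  </body>
</tt>`;

// Convert a clock time such as "00:01:02.500" to seconds.
// Assumption: offset times like "12.5s" are not handled in this sketch.
function clockToSeconds(clock) {
  const [h, m, s] = clock.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Read begin/end times and caption text into an array of cue objects.
// Assumption: a regex is good enough for simple files; a real parser
// should use a proper XML parser instead.
function parseCues(dfxp) {
  const cues = [];
  const pattern =
    /<p\b[^>]*\bbegin="([^"]+)"[^>]*\bend="([^"]+)"[^>]*>([\s\S]*?)<\/p>/g;
  let match;
  while ((match = pattern.exec(dfxp)) !== null) {
    cues.push({
      begin: clockToSeconds(match[1]),
      end: clockToSeconds(match[2]),
      // Turn <br/> into line breaks and drop any other inline markup.
      text: match[3]
        .replace(/<br\s*\/?>/g, "\n")
        .replace(/<[^>]+>/g, "")
        .trim(),
    });
  }
  return cues;
}

// Search the array for the cue covering the given playhead position.
// Returns an empty string when no caption is active.
function captionAt(cues, seconds) {
  const cue = cues.find((c) => seconds >= c.begin && seconds < c.end);
  return cue ? cue.text : "";
}

const cues = parseCues(sampleDfxp);
console.log(captionAt(cues, 2));
```

In a browser, one would presumably poll the player for its current time (the YouTube player API exposes `getCurrentTime()`) and write the result of `captionAt` into a span on the page.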

  6. Bill Shackleton

    Thank you for this post, Martin.

    I am with you 100% on the idea of crowdsourcing this functionality. Think wikipedia collaboration for accessifying that part of the web that requires the human touch. IBM’s Social Accessibility Project seems to be heading in this direction for providing needed ALT-text for images.

    I think that although much could be accomplished by getting the key organizations onboard (YouTube, Google), they shouldn’t, and don’t need to be, barriers to this effort. Don, you seem to be suggesting a way that would enable the independent development of the video and captioning (design-time) in a way that could integrate during run-time. I think that’s the key.

    Have there been any developments since you made this post Martin? I’d like to explore hosting a way for volunteers to provide captions, and users to be able to request and enjoy captioned videos.



  7. Bill


    I Voted!

  8. Javier

    The editor at TubeCaption is completely in Javascript. TubeCaption allows users to add captions to any YouTube video. It uses a timeline with tracks where you can add your captions as if you were using software like Adobe Premiere. The editor is all in javascript; only the player is in flash.

    Check it out and let us know what you think.

  9. jj

    the .srt files for subtitles used in movies you can get from “the dark side of the net” are dead simple. Online video players would just need to parse the text according to the timestamp — as desktop video players already do (VLC, QuickTime player, etc)
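
To illustrate how simple the .srt format mentioned in comment 9 is, here is a minimal sketch of a parser. The function name `parseSrt` and the sample text are hypothetical, and the sketch assumes well-formed blocks (a counter line, a timing line with `-->`, then one or more text lines) separated by blank lines.

```javascript
// Hypothetical sample in the common SubRip layout.
const sampleSrt = `1
00:00:01,000 --> 00:00:03,500
Hello, world.

2
00:00:04,000 --> 00:00:06,000
Two
lines.
`;

// Matches a timing line such as "00:00:01,000 --> 00:00:03,500".
const TIMING =
  /(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})/;

// Convert captured hour/minute/second/millisecond strings to seconds.
function toSeconds(h, m, s, ms) {
  return Number(h) * 3600 + Number(m) * 60 + Number(s) + Number(ms) / 1000;
}

// Parse SRT text into an array of { begin, end, text } cues.
// Blocks without a recognizable timing line are skipped.
function parseSrt(srt) {
  const blocks = srt.replace(/\r\n/g, "\n").trim().split(/\n{2,}/);
  const cues = [];
  for (const block of blocks) {
    const lines = block.split("\n");
    const timingIndex = lines.findIndex((line) => TIMING.test(line));
    if (timingIndex === -1) continue;
    const t = lines[timingIndex].match(TIMING);
    cues.push({
      begin: toSeconds(t[1], t[2], t[3], t[4]),
      end: toSeconds(t[5], t[6], t[7], t[8]),
      // Everything after the timing line is caption text.
      text: lines.slice(timingIndex + 1).join("\n"),
    });
  }
  return cues;
}

console.log(parseSrt(sampleSrt));
```

The resulting array has the same shape a player would need for a timestamp lookup, so the display logic would not have to care which caption format the file came from.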

  10. Karen Putz

    Just wanted to pop in here and say “thanks” for all your efforts in making the web accessible. Nothing is more frustrating to me than to come across a video, audio or podcast that I want to access and having the door slam shut. I look forward to the day when everything, everywhere on the ‘net is accessible.

  11. CC

    Has anyone recently tried uploading a DFXP file to YouTube? It seems to accept it.

  12. Martin Kliehm

    @CC, YouTube says:

    Although you can upload your captions/subtitles in any format, only supported formats will be displayed properly on the playback page.

    I haven’t tried it recently, but the text implies it might or might not work…

  13. Melian

    Hi!!! Anyone know how to display text plus icons using DFXP??