So videos don't have Ctrl+F...
Lucas Eduardo
Posted on September 13, 2020
youtube-dl is an awesome tool for downloading video and audio from a huge number of sites, but it has some features that people rarely use. One of them is downloading a video's subtitles, when they are available.
youtube-dl is a CLI program, so it can be a bit difficult for beginners, but it is worth learning. I promise.
People usually just run youtube-dl with a link, or maybe point it at a file with a list of links using the -a flag.
youtube-dl $link
youtube-dl -a $file
If you run youtube-dl --help you will see a lot of options. Here are some of the ones related to subtitles:
--list-subs: Lists the subtitles available for the video. Usually not that useful, at least not in this case xD;
--write-sub: Downloads the subtitles that were uploaded with the video;
--sub-lang: Specifies the language of the subtitles to download. The default is en. For Portuguese, as we speak in Brazil, use pt;
--sub-format: Specifies the file format of the downloaded subtitles. The default is VTT, which is also the fallback if the formats you asked for aren't available;
--write-auto-sub: This is the flag I was talking about xD. It downloads the subtitle that YouTube generates automatically, the video transcript. The lang and format flags are hints about which subtitle will be downloaded.
After downloading the subtitle, the program will also download the video. If you just want the subtitle, add the --skip-download flag.
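Putting the flags above together, a minimal sketch could look like the one below. The function name fetch_auto_subs and the YTDL variable are my own inventions for illustration, not something youtube-dl provides: YTDL only exists so you can swap in echo and see the command that would run without touching the network.

```shell
# Hypothetical helper: fetch only the auto-generated subtitle of one video.
# $1 = video URL, $2 = subtitle language (optional, falls back to en).
# YTDL is an assumed override; set YTDL=echo for a dry run.
fetch_auto_subs() {
  "${YTDL:-youtube-dl}" \
    --write-auto-sub \
    --sub-lang "${2:-en}" \
    --sub-format vtt \
    --skip-download \
    "$1"
}
```

Something like fetch_auto_subs "$link" pt should then leave only a .vtt file next to you, assuming the video has an automatic transcript in that language.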
Batching works too, so you can download the subtitles for a full playlist, or for a file with all the URLs using -a.
What you download is not a TXT file full of words but a VTT file. If you want just the text, you can delete the unwanted parts: each piece of text comes with lines of other data, such as timestamps, that you can delete using something like a vim macro, sed or any other text manipulation tool you like.
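As a rough sketch of the sed route, here is one way to do the cleanup. The name vtt_to_text is made up for this post, and the patterns assume the file looks like the ones I get from YouTube: a WEBVTT header followed by Kind: and Language: lines, timestamp lines containing -->, and inline tags such as <c> inside the text. Files from other sites may need different patterns.

```shell
# Hypothetical helper: reduce a VTT subtitle to plain text.
# Reads the files given as arguments, or stdin when there are none.
vtt_to_text() {
  # 1-3: drop the header lines (assumed YouTube-style header)
  # 4:   drop the timestamp lines
  # 5:   strip inline tags like <c> or <00:00:01.000>
  # 6:   drop lines that are now empty
  sed -e '/^WEBVTT/d' \
      -e '/^Kind:/d' \
      -e '/^Language:/d' \
      -e '/-->/d' \
      -e 's/<[^>]*>//g' \
      -e '/^[[:space:]]*$/d' "$@" |
    uniq  # automatic captions tend to repeat lines back to back
}
```

Usage would be something like vtt_to_text video.en.vtt > video.txt, where the file name is just an example.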