Tool for extracting vocab words from YouTube video subtitles

dightonjw

举人
I mentioned this in another thread. It's a Python tool that takes the video ID of a YouTube video, downloads the subtitles, tokenizes them, orders the words by frequency, and finally outputs an Anki deck as an .apkg file.
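For anyone curious what the tokenize-and-count step looks like, here is a rough sketch using only the standard library. It is not the code from the zip: the function names are made up for illustration, and the regex tokenizer only works for languages written with spaces. Chinese or Japanese subtitles would need a proper word segmenter passed in as the `tokenizer` argument.

```python
import re
from collections import Counter

def tokenize(text):
    # Naive tokenizer (an assumption for this sketch): lowercase the text
    # and take runs of word characters. Unspaced scripts such as Chinese
    # need a real segmenter instead.
    return re.findall(r"\w+", text.lower())

def frequency_list(subtitle_lines, tokenizer=tokenize):
    # Count every token across all subtitle lines.
    counts = Counter()
    for line in subtitle_lines:
        counts.update(tokenizer(line))
    # most_common() returns (word, count) pairs, highest count first.
    return counts.most_common()

# Made-up subtitle lines standing in for a downloaded transcript.
lines = ["the cat sat on the mat", "the dog sat"]
ranked = frequency_list(lines)
```

Each entry in `ranked` is a (word, count) pair, so the words that appear most often in the video come first.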

A script is included for converting the output to a (hopefully) more Pleco-friendly format. Download the zip at the link below and open the documentation for usage information.

Download (valid for 6 days from today)

I use it when I'm trying to understand YouTube content from a context I'm not familiar with. Just filter out the high-frequency words, and bam, you have a deck with the weird/less-common words that you want to study.
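The filtering idea can be sketched like this. It assumes you already have a set of common words you don't need to study (for example, the top entries of a general frequency list); the names `rare_words` and `common` are hypothetical and not taken from the tool itself.

```python
def rare_words(ranked, common_words):
    # Keep only the (word, count) pairs whose word is not in the set of
    # common words, preserving the original frequency order.
    return [(word, count) for word, count in ranked if word not in common_words]

# Made-up data: a ranked list from one video and a small "already known" set.
ranked = [("the", 3), ("sat", 2), ("quokka", 1)]
common = {"the", "sat", "on"}
deck_candidates = rare_words(ranked, common)
```

What is left in `deck_candidates` is the less common vocabulary that would go into the deck.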

I've found it to be helpful. It works for Mandarin, Russian, Spanish, and Japanese, although I only use it for Chinese these days and didn't do any tweaking or tuning for the other languages.
 

Shun

状元
Hi dightonjw,

thanks a lot for this package! I can see it's very professionally done, and also the documentation is great! It ought to be able to run on Macs running Homebrew, too.

Cheers,

Shun
 