Discussion started by Captain Planet, Jul 21, 2010.

  1. Hello,

    I would like to share with you something I've been working on for quite some time. Namely, a flashcard list and a user dictionary based on all the five volumes of the new edition of Practical Audio-Visual Chinese (新版實用視聽華語). This is the most popular book for learning Chinese in Taiwan, particularly at the National Taiwan Normal University Mandarin Training Center (國立臺灣師範大學國語教學中心, popularly known as Shida), although it is also used in many other places. More details about the books can be found here.

    The files were last updated in May 2015. For the list of changes and download links, see the bottom of this post.

    Scope. Included are words appearing anywhere within the first four volumes, either in the textbook or copybook, except technical pages. For the fifth volume, the list includes all words and proverbs from the textbook list for all chapters. Also included, in separate categories (see below), are the vocabulary lists for all four levels of the TOP (Test of Proficiency - Huayu: Beginner, Learner, Superior and Master) exam, less any words that had already appeared before somewhere within the lesson categories. In total, the dictionary comprises nearly 7,000 entries.

    Categorization. The words are categorized by book and chapter in the format of AV Chinese/Book n/Lesson [m]m, where n is the volume number, and [m]m is the lesson number. For example, the words from lesson 6 of book 3 can be found under the category AV Chinese/Book 3/Lesson 6. An additional subcategory for each book, AV Chinese/Book n/Extra, includes words that appear outside the main word lists (i.e. as footnotes, in the grammar section, in the supplementary exercises at the end of each chapter, or in the copybook) but were never included in the main word lists in any of the chapters, until the end of the last book. These words can be considered non-essential, and somewhat random, but many are actually very useful to know in the long term. Finally, all the TOP vocabulary that did not appear in the books is included under the categories AV Chinese/TOP/{Beginner,Learner,Superior,Master}, separately for each exam level.
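
    As a rough illustration of the naming scheme above, here is a small Python sketch that builds the category paths. The function names are made up for this example and are not part of Pleco or of the files themselves; only the path format is taken from the description.

```python
# Hypothetical helpers: only the path format comes from the post above.
TOP_LEVELS = ("Beginner", "Learner", "Superior", "Master")


def lesson_category(book: int, lesson: int) -> str:
    """Category of a main word list; the lesson number is not zero-padded."""
    return f"AV Chinese/Book {book}/Lesson {lesson}"


def extra_category(book: int) -> str:
    """Category of the words that appear outside the main word lists."""
    return f"AV Chinese/Book {book}/Extra"


def top_category(level: str) -> str:
    """Category of the TOP vocabulary for one exam level."""
    if level not in TOP_LEVELS:
        raise ValueError(f"unknown TOP level: {level!r}")
    return f"AV Chinese/TOP/{level}"


print(lesson_category(3, 6))
print(extra_category(3))
print(top_category("Beginner"))
```

    Something like this could be handy if you script your own category selection, but the category tree inside Pleco is the authority on what actually exists.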

    Definition syntax. Each entry is marked for the part of speech it represents. The abbreviations used are as follows: A: adverb, AT: attributive, AV: auxiliary verb, CONJ: conjunction, CV: co-verb, DEM: demonstrative pronoun, I: interjection, IE: idiomatic expression, M: measure word (classifier), N: noun, P: particle, PRON: pronoun, PV: proverb (note this is used differently in the books), RC: resultative compound, SV: stative verb, V: verb, VO: verb-object compound. If a word can appear as more than one part of speech, the abbreviations are listed separated by commas, and the corresponding definitions are delimited by semicolons. If the entry is a compound that can be split into several parts, the abbreviation for each part is listed, separated by hyphens. To make parts of speech easier to tell apart, verb definitions are always preceded by "to," and those of stative verbs are always preceded by "to be." Nouns, however, are not preceded by an article.
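
    For anyone post-processing the list, the abbreviation rules above can be sketched in Python as follows. This assumes you have already isolated the part-of-speech marker (for example "N, SV") from the rest of the definition; how the marker is delimited inside the definition text is not covered here, and the function name is made up for the example.

```python
# Abbreviation table copied from the post; expand_pos is a hypothetical helper.
POS = {
    "A": "adverb", "AT": "attributive", "AV": "auxiliary verb",
    "CONJ": "conjunction", "CV": "co-verb", "DEM": "demonstrative pronoun",
    "I": "interjection", "IE": "idiomatic expression", "M": "measure word",
    "N": "noun", "P": "particle", "PRON": "pronoun", "PV": "proverb",
    "RC": "resultative compound", "SV": "stative verb", "V": "verb",
    "VO": "verb-object compound",
}


def expand_pos(marker: str) -> list:
    """Expand an isolated marker into full names.

    Commas separate alternative parts of speech; hyphens separate the
    parts of a splittable compound. Each alternative becomes one inner list.
    """
    return [
        [POS[part.strip()] for part in alternative.split("-")]
        for alternative in marker.split(",")
    ]


print(expand_pos("N, SV"))  # two alternative parts of speech
print(expand_pos("V-N"))    # one compound made of two parts
```

    An unknown abbreviation raises a KeyError, which is probably what you want when checking a file for typos.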

    Definition tags. Each definition is followed by the book and chapter number, provided in square brackets, [ and ], in the format PAVC-nmm, where n is the book number, and mm is the lesson number, padded with a leading zero if less than 10. For example, a word with its definition tagged as [PAVC-306] can be found in lesson 6 of book 3 (possibly within the Extra vocabulary). The TOP words are tagged as [TOP-c], where the character c is the first letter of the English name of the corresponding exam level, and can be any of the following: B for Beginner, L for Learner, S for Superior, M for Master.
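
    If you want to pull these tags out of a definition programmatically, a minimal Python sketch could look like the one below. It assumes a single-digit book number and a two-digit lesson number, as described above; the function name and the sample definitions are invented for the example.

```python
import re

# Matches [PAVC-nmm] (one book digit, two lesson digits) or [TOP-c].
TAG_RE = re.compile(r"\[PAVC-(\d)(\d{2})\]|\[TOP-([BLSM])\]")

TOP_LEVELS = {"B": "Beginner", "L": "Learner", "S": "Superior", "M": "Master"}


def parse_tags(definition: str) -> list:
    """Return every tag found in a definition, in order of appearance.

    Book tags come back as ("PAVC", book, lesson); exam tags as
    ("TOP", level_name).
    """
    tags = []
    for match in TAG_RE.finditer(definition):
        book, lesson, level = match.groups()
        if level is not None:
            tags.append(("TOP", TOP_LEVELS[level]))
        else:
            tags.append(("PAVC", int(book), int(lesson)))
    return tags


# Sample definitions are made up; only the tag format comes from the post.
print(parse_tags("to help [PAVC-306]"))
print(parse_tags("to be rare [TOP-S]"))
```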

    Additional notes:
    • To make the most of Pleco's features, entries were trimmed so as not to exceed four characters. Wherever the entry is actually part of a longer phrase, this is mentioned in parentheses in the entry definition.
    • Wherever there is more than one word with a given meaning, and both words are likely to be studied at the same time (i.e. they are in the same chapter or in the Extra subcategory for the same book), the definition provides a cue by revealing one (usually the first) letter of its pronunciation. This might be useful if you configure a test to only show the definition (without revealing pronunciation).
    • The pronunciation follows the book, which follows Taiwanese conventions. This sometimes means different tones, and sometimes even entirely different sounds (as is the case with 血 or 垃圾). The definitions are based on those provided in the book, but were significantly edited for clarity, or rewritten where incomprehensible. The list focuses on Taiwanese usage, and does not provide information on the Mainland Chinese pronunciation.
    • Simplified character variants are included for completeness. These do not appear anywhere within the books or TOP word lists, and were automatically generated with the OpenCC tool.
    How to use it:
    (Feel free to adjust these steps to suit your usage scenario.)
    1. Install the dictionary:
      • The easy way: point Pleco to the file AV Chinese.pqb, which is a ready-to-use dictionary file for the current version of Pleco (3.2 as of the time of writing), and can be added as an existing user dictionary to your Pleco installation.
      • The other way: create a new user dictionary on your own, and import the entries from the file AV Chinese Dictionary.txt.
    2. Install the flashcards:
      • The easy way, assuming you don't need to keep your current flashcards: use the provided Pleco Flashcards.pqb file to restore your flashcard backup, or - if you're on Android - just copy it over your current flashcard file. (Warning: all your current flashcards will be irreversibly lost.)
      • The other way: import the flashcards from the file AV Chinese Flashcards.txt. The recommended settings are:
        • Text encoding: UTF-8
        • Definition source: Prefer dicts
        • Dictionaries: 1 dicts, choose the newly-installed AV Chinese
        • Store in user dict: Off
        • Ambiguous Entries: Prompt
        • Duplicate Entries: Prompt
        Note that there are four (4) legitimately duplicate flashcards in the set. These are: 兒 (儿) /er2/ 'son' and /er5/ 'suffix', /na3/ 'which' and /na5/ 'ending particle', /de2/ 'get' and /de5/ 'particle', and /zi3/ 'terrestrial branch' and /zi5/ 'suffix'. For this reason, you should leave the above settings at Prompt and, when asked what to do, answer Allow Duplicate for these four cards and Merge Cats for all the others (unfortunately, this means answering the question quite a lot).
    3. Install the HSK flashcards addendum: this additional flashcard list contained in the file AV Chinese Flashcards HSK Addendum.txt provides the vocabulary from all levels of the old and new HSK exam that did not appear in any of the previous categories. There are no definitions, so change the import settings accordingly to map them to the dictionary of your choice (for convenience, set the "Duplicate Entries" to "Skip", and "Ambiguous Entries" to "Use First" for this import). More details about the HSK list can be found in another thread.
    List of changes:
    • July 2010: First public release, 3982 unique entries.
    • November 2010: Updated to include all the extra words from volumes 3 and 4, as well as a complete rewrite of the lists for the second part of book 4 and book 5, which were previously based on another source of insufficient quality. Also, many definitions from the earlier chapters were edited and occasional errors were fixed. 4674 unique entries.
    • February 2012: Extended to include all the vocabulary from all four levels of the TOP - Huayu (TOCFL) exam that did not appear in any of the books.
    • May 2015: Added simplified headwords. Brought back per-chapter categories for the first two books. Cosmetic changes to category names and definition tags. Usage instructions rewritten. Removed duplicate entry for 協助. Fixed duplicated quotation marks. 6679 unique entries.
    Sharing. Feel free to redistribute these files as long as you don't charge any money for them, but please provide a reference to the original thread on this forum, so that all users can learn about the most up-to-date version. Please report any errors back in this thread, so that the quality can be improved for everyone. Enjoy!

    Dude, you just totally rocked my world. I had spent two hours today working on pavc5 ch 8-11, using Anki to make those cards. And now, thanks to you, my new best friend, I have the set for all 5 books, and they come with sound, an awesome thing since my 聽力 is so bad. Thank you a million and one!!!

    This is amazing. Thank you a lot. To the above poster: I see you mentioned you were making Anki cards. Do you have any completed books for the PAV series?
    Nice! I'm using these books at ShiDa for review, so I'm going through the chapters quickly and don't have time to enter all the flashcards myself. This is hella useful.
  6. A significant update. Lots of errors fixed (unfortunately there were many, mostly in books 4 and 5, hopefully all fixed now). Also the list now includes all the extra vocabulary from books 3 and 4, which is about 700 new words. Please see the first post (also updated) to download.
    Does anyone know if I can use this with Pleco on my iPod Touch 4?
    I will gladly purchase the flashcard option for Pleco if I can make use of Captain Planet's flashcards.

    Yes - should work fine with that, there's not really any difference between flashcards on the iPod Touch and any other iDevice.
    Mr. Love,

    I don't know how you do it. I am indebted to you for your generosity.

    Installed and working on iPod touch4. Thank you!
    Now I just need to figure out how to limit the flashcards to show only book one at this time. I'm not ready for the other books yet but when testing, flashcards from all books are being used.
    Go to Flashcard Testing > Card Categories, then deselect the books you don't want to be tested on.
    How did you install it on your iPod? I have an iPhone 4, and I have downloaded it to my computer and uploaded it to my phone, but when I go to look at the file it is still a TXT, and when I click it, it just takes me to a place to change the name.
    Go into the Flashcards tab / Import Cards and select the file in there.
    I was wondering if anyone can help me. I'm trying to use these databases, and for the most part they are awesome. I was just wondering if it is possible to limit the flashcards to each chapter. Every time I boot them up I get the flashcards for the entire book, which isn't very useful if I'm only up to Chapter three or Chapter four! Excuse my interest, Pleco is awesome but it's also a bit confusing, and I can't figure out how to limit it even though I've been looking everywhere on the site, so any help would be super appreciated.

    Now I also have these dictionaries installed but I don't know if I've installed them properly, are there other dictionaries with Traditional characters that I can install?

    thanks again!
    Here's an updated version of the flashcard list with books 1 and 2 divided into lessons; I also removed the superfluous bracketed headword characters (since the lists only supply traditional versions and they're just duplicates). Captain Planet, if you want to incorporate any of this into your version and update the first post, feel free to do so.

    Thank you for posting this! :) I'm thinking of using these textbooks so this will be extremely useful.

    Slightly off-topic, but does anyone know an international (specifically UK) distributor for these books (in either physical or electronic form)?

