Common-Words List

MALAILI

进士
CHINESE WORD LISTS

Recently, I have been trying without success to find a list (or lists) of the most common Chinese words, as opposed to the most common Chinese characters.

Much to my chagrin, I have been unable to come up with a single list. I have tried contacting anyone and everyone, but to no avail.

Does anyone know of such a list, or where or how I can get one? I need something online, or something I can access online.

The HSK vocabulary lists don't offer any type of usage frequency.

Any and all help would be appreciated.
 

renovator

榜眼
I think you are going to have a hard time finding what you are looking for, because in Chinese there are so many more words that convey an idea than in other languages. This is compounded by the fact that a 3-year-old does not use the same common words as an 18-year-old student. A doctor uses many common words that an engineer does not, and so on. Just think of the common words a chef uses in a restaurant every day. You will find very few of these in the HSK lists and other textbooks.

If you want to define the most basic couple of thousand words, you can probably achieve that goal. The problem is that those will not allow you to communicate beyond the most basic situations. Most likely those basic words will allow you to get a few points across to someone else, but good luck trying to understand their reply.
 

MALAILI

进士
Idolse,

Actually, I had already visited the site you suggested, but since I didn't know that what I was looking for was listed under "Bigrams," I completely missed it. Once you suggested it, I went back and found essentially what I wanted.

Thank you for your help!
 

Erik

Member
Hello,

Wenlin (wenlin.com) has this function; it will list characters as well as words by frequency. If you are interested, I can send you the list as a PDF document with perfect formatting (but more than 40 MB), or as an OpenDocument/Word file with somewhat less perfect formatting (but a normal file size).

Erik
 
Hi, Michael,

Nice lists, although it seems strange that "detonator" comes before "to take care of" :lol:
1778 信管
1779 看管

Would you consider also uploading the lists to my wiki (http://china.panlogicsoftware.com/)? This is the wiki that Mike links to (last I checked) on the Pleco downloads page. Perhaps something more Pleco specific, though, so people could download and import directly.

Second, would you be interested in making the frequency list compiler available? I have a lot of full-length (modern) e-books in Chinese, and I would like to be able to build a vocabulary list from a book before starting it. These are books I've scanned and OCR'd myself, so there are no public lists available for them. (Sorry, to anticipate any questions: obviously I can't make the book text available.)

Stephan
 

mihobu

秀才
stephanhodges said:
Nice lists, although seems strange that "detonator" comes before "to take care of" :lol:

There's lots of strangeness in the list, such as the appearance of high-frequency collocations that also happen to be uncommon words, crude orthography rules, etc.

Would you consider also uploading the lists to my wiki (http://china.panlogicsoftware.com/)? This is the wiki that Mike links to (last I checked) on the Pleco downloads page. Perhaps something more Pleco specific, though, so people could download and import directly.

I've got this on my mental to-do list, just gotta figure out what format is appropriate and get it done.

Second, would you be interested in making the frequency list compiler available?

The code is pretty unfriendly right now, so I need to do a little cleanup first. Also, my scripts do not actually do the work of compiling the frequency data from a corpus of text. Instead, they munch on frequency data taken from Jun Da's Chinese Computing website to produce the list.
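
For anyone curious what that kind of post-processing might look like, here is a minimal Python sketch. It is not the actual script: it assumes the frequency data has already been saved as tab-separated "word, count" lines, which is a hypothetical layout and not necessarily the format used on Jun Da's site. The function name rank_by_frequency and the min_len parameter are likewise made up for illustration.

```python
def rank_by_frequency(lines, min_len=2):
    """Turn 'word<TAB>count' lines into a ranked word list.

    Assumed (hypothetical) input format: one entry per line,
    word and integer count separated by a tab.
    Entries shorter than min_len characters are dropped, so that
    single characters do not crowd out multi-character words.
    """
    entries = []
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, raw_count = parts[0], parts[1]
        try:
            count = int(raw_count)
        except ValueError:
            continue  # skip lines whose count is not a number
        if len(word) >= min_len:
            entries.append((word, count))

    # Highest count first; ties keep their input order (stable sort).
    entries.sort(key=lambda entry: -entry[1])
    return [(rank, word, count)
            for rank, (word, count) in enumerate(entries, start=1)]


# Made-up sample counts, purely for demonstration.
sample = ["的\t100", "我们\t50", "看管\t3", "信管\t4", "bad line"]
for rank, word, count in rank_by_frequency(sample):
    print(rank, word, count, sep="\t")
```

A ranking like this is only as good as the counts going in, which is one way odd orderings such as 信管 ahead of 看管 can arise.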

-mhb
 