How many characters do you know?

Rafael

秀才
I've often been asked how many characters I know, particularly, though not only, by non-Chinese speakers.

Of course, as anyone who has studied the language knows, just memorizing the individual characters is not enough. You need to know combinations of characters too, since there are a lot of two-character words.

Nevertheless, knowing the number of characters you have memorized, whether it is 500, 1000 or 2000, does give an indication of your Chinese reading fluency, and it can also be useful (read: motivating) as a way to track your progress over time.

Does anyone know of a test that can estimate the number of words you know? I assume that it would not be too difficult to create such a test using, for example, the occurrence figure assigned to each character, as in Wenlin.
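One way such a test might work, as a rough sketch only: take a list of characters ranked by frequency, split it into bands, quiz a small random sample from each band, and scale the results up. The function name `estimate_known`, the band size, the sample size, and the `is_known` callback (standing in for asking the learner) are all assumptions for illustration, not anything Wenlin provides.

```python
import random


def estimate_known(ranked_chars, is_known, band_size=500, sample_per_band=20, seed=0):
    """Estimate how many characters in a frequency-ranked list a learner knows.

    ranked_chars: list of characters, most frequent first (assumed input).
    is_known:     callback returning True if the learner knows the character;
                  in a real test this would be a quiz question.
    """
    rng = random.Random(seed)  # fixed seed so the same sample is drawn each run
    estimate = 0.0
    for start in range(0, len(ranked_chars), band_size):
        band = ranked_chars[start:start + band_size]
        sample = rng.sample(band, min(sample_per_band, len(band)))
        known = sum(1 for ch in sample if is_known(ch))
        # Scale the fraction known in the sample up to the whole band.
        estimate += len(band) * known / len(sample)
    return round(estimate)


if __name__ == "__main__":
    # Synthetic stand-in for a real frequency list: 3000 consecutive code points.
    ranked = [chr(0x4E00 + i) for i in range(3000)]
    # Pretend the learner knows exactly the 1000 most frequent characters.
    print(estimate_known(ranked, lambda ch: ord(ch) - 0x4E00 < 1000))
```

With only 20 questions per band the result is a rough estimate, so it is probably better for tracking progress over time than for an exact count.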
 

goulniky

榜眼
Along those lines, I just uploaded a few scripts that you may find useful: they analyse the number of unique characters (hanzi) on any GB or Big5-formatted webpage (UTF-8 is not supported). There's also an option to copy/paste text for analysis, or to upload text documents. The output is a list of hanzi sorted by frequency, with the number of times each one appears.

Note that this only extracts individual characters. It would be nice to segment words too, but that's a lot more complicated and would require a dictionary (or Wenlin pre-processing).
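For anyone who wants to try the same kind of count locally, here is a minimal sketch of the idea in Python. It is not the uploaded scripts themselves: the function names are made up for illustration, it assumes the text has already been decoded to a Python string (GB or Big5 bytes could first be decoded with `raw.decode("gb2312")` or `raw.decode("big5")`), and it only treats the basic CJK Unified Ideographs block as hanzi.

```python
from collections import Counter


def is_hanzi(ch):
    # Assumption: only the basic CJK Unified Ideographs block (U+4E00..U+9FFF)
    # counts as hanzi; rarer extension blocks are ignored in this sketch.
    return "\u4e00" <= ch <= "\u9fff"


def hanzi_frequencies(text):
    """Return (character, count) pairs, most frequent first.

    Ties are broken by code point so the order is reproducible.
    """
    counts = Counter(ch for ch in text if is_hanzi(ch))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = "我是学生，我学中文。abc"
    for ch, n in hanzi_frequencies(sample):
        print(ch, n)
```

Punctuation and Latin letters are skipped, so the length of the returned list is the number of unique hanzi in the text.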
 
Regarding how many characters you know or need to know, the basic list is 2000 and the general ability to read a newspaper is 7000.

Still, this does not give you a complete ability to recognize names [as that is a culturally acquired knowledge] and it does not include your comprehension of Chinese adages [these are pretty much like idioms that are made up of 3 to 6 characters].

At some point, you have to go beyond mere character study and learn how to think in a Chinese context.

Similarly, I keep struggling with my English students that think the entire language is in a dictionary.
 

goulniky

榜眼
There's lots to learn, but then you don't have to understand every single character / word / chengyu to get the meaning.

I've been working with 'An Extensive Reading Course of Intermediate Chinese' published by BLCU (中级汉语 - 阅读教程 2 volumes). It's a very interesting textbook that systematically goes through a series of speed-reading / text-analysis techniques with lots of examples to practice (that's what I've used for HSK preparation this year, we'll see how effective it really is :wink: )
 
Word counts can be very misleading

Traditionally, Chinese educators ranked students by the number of characters that they learned, but they already had their 1st language ability to communicate and had a far wider vocabulary than their written ability. Second language learning often tries and fails to assimilate first language educational techniques. There are lots of examples.

As a gauge of language progress for 2nd language learners, the number of characters is quite artificial and only relates superficially to reading skills.

What this means is that mere character recognition is not enough to decipher the semantics [full linguistic meaning] of the Chinese language. While it does indeed help you to become literate, you still have to acquire a lot of other information about the language [including how to recognize phrases, a grammatical sense, Chinese usage of punctuation, a phonological sense, and a large inventory of translated names].

A world map in Chinese is just as important as a dictionary. A bilingual encyclopedia of Chinese culture would be useful, but there is none that I know of. The best all-English options might be to read the British history of China's development of Science and Technology and then Wilhelm's I Ching.

You begin to see the problem. The language barrier evolves into an experiential cultural barrier as we begin to assimilate the words.

All I can say is don't give up. I started Chinese 12 years ago at age 47 and it has been a fountain of youth for me. It taught me that every day for the rest of my life I can and will learn something I didn't know. And, my point of view is more global, less ethnocentric.

Incidentally, the Chinese were also the inventors of the civil service exam. In part this was because their character-based language, unlike an alphabetic one, is more challenging and likely to require lifelong scholarship even for the 1st language user. The written character linked a vast empire of people who actually could not understand each other's speech. Being a newer language [only 700 years old] and having modern communication facilities, English is easier. We really don't have to fully understand the look of the word, just the sound.

And modern linguistics is really a Western, culturally based science and doesn't fully appreciate or understand character-based languages. So, if you really want to fully understand what language is, you need both the occidental and the oriental viewpoints.

In sum....

Nu3 li4 du2 shu1; bu2 pa4 man4, zhi3 pa4 zhan4.

"Study diligently; don't fear slowness, only fear standstill."
 

feng

榜眼
Regarding how many characters you know or need to know, the basic list is 2000 and the general ability to read a newspaper is 7000.

Still, this does not give you a complete ability to recognize names [as that is a culturally acquired knowledge] and it does not include your comprehension of Chinese adages [these are pretty much like idioms that are made up of 3 to 6 characters].

At some point, you have to go beyond mere character study and learn how to think in a Chinese context.

Similarly, I keep struggling with my English students that think the entire language is in a dictionary.
Outside of those who majored in Chinese literature or history, university-educated Chinese and Taiwanese typically do not even know 5,000 characters.
For a non-native speaker, 2,000 characters would be woefully insufficient for reading, good only for guessing.
Names, idioms (of whatever number of characters), and multi-character words are simply vocabulary one must learn.
Character study is fundamental to going beyond.
Thinking the language is in a dictionary is common to students of all foreign languages, in my humble experience.
 

Sy

进士
QUOTE : "Regarding how many characters you know or need to know, the basic list is 2000 and the general ability to read a newspaper is 7000."

I don't fully agree with the above statement. 7000 characters, no; 2000 is okay.
I studied the character counts in different dictionaries for a long time, and I have collected many Chinese dictionaries.
When I was going to school I owned none.
Statistics show about 6000 characters for college students; general workers use about 4000.
Now the government has announced 3500 characters as the commonly used ones.
You can test yourself by using Google to find and print out the 3500-character list, then checking off the characters you know.
Make a glossary with a two-character word for each one as a rough definition or a reminder of its meaning.
I made a list of 1600 characters as basic and use 3333 characters for my daily use. I have compiled many reports for my own use in an index system of my own design called Weizima.
When one sees a new character, one can now consult the Pleco dictionary or search on Baidu.
No one should be discouraged from learning Chinese, which is an ideogram type of language. There are logical ways to help you remember. Once you have learned a character, it locks into your brain. A character is a form or an image.
3000 single characters can make 9,000,000 bigrams (3000 x 3000). Dividing 9,000,000 by 2 to cut the inverted bigrams gives 4,500,000 possible words. By knowing 3000 characters, you can get clues for those 4,500,000 words.
All you have to know is 3000 characters, whereas in English you probably have to know 20,000 words to get equal value.
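The bigram arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting; the function name `bigram_counts` is made up for illustration. It also shows the exact number of unordered pairs of two different characters, which comes out slightly below the simple halving because doubled characters are not counted.

```python
def bigram_counts(n):
    """Count two-character combinations that n single characters can form."""
    ordered = n * n                      # every ordered pair, including doubled characters
    halved = ordered // 2                # the rough figure: drop inverted duplicates
    unordered_distinct = n * (n - 1) // 2  # exact count of pairs of two different characters
    return {"ordered": ordered, "halved": halved, "unordered_distinct": unordered_distinct}


if __name__ == "__main__":
    for name, value in bigram_counts(3000).items():
        print(f"{name}: {value:,}")
```

These are upper bounds on possible combinations, not counts of words that actually exist.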
Of course there are problems in Chinese, as there are in any language.
As a frame of reference: <<The Three Principles of the People>> has 2134 different characters;
the 4 volumes of <<Mao's Selected Works>> have 3300 different characters;
<<Rickshaw Boy>> has 2413 different characters.
This reply is 10 years late. Things have changed. Some problems remain. New tools have been made for convenience.
 