Outlier Dictionary of Chinese Characters

Shun

状元
What you're referring to is 分化字, i.e., a new character being created out of an existing character. I'm not sure the Wikipedia explanation is correct regarding 采 though.

Thank you very much! OK, then the Wikipedia author must have been looking for a quick example and stumbled.

As far as the definition length, the Mini Edition won't give long definitions. The Essentials Edition, however, gives a lot more meanings and shows the logical relationships between the different meanings (see the entry for 反 or 角 for example).

Indeed, I see there are FORM, MEANINGS, and STROKE ORDER sections, and often a COMPONENTS section, and I have liked it so far. Here, I have two questions (sorry if they've been answered before):

1. According to the website, the Expert edition will include "detailed information on each character's history." Will you trace how a character's meanings changed over time, how and when components were added to it or removed from it (which is probably quite rare), when it was simplified and again made more complex, and so on? Or would that require one to draw a tree-like structure, which would go beyond what a dictionary can do? I think that would be great for reading texts from different epochs, because then you can quickly recognize the changes of meaning a character has undergone.

2. Will there be an effort to classify the origin of each character by the traditional six categories from the 说文解字? I see that the wording "反 depicts" or "角 depicts" would mean it was pictographic, but it might be harder to tell for the other five categories. (I know it's sometimes hard to assign a single category, then one could of course name them all and give an explanation as to why that is so.) I think such a classification would help language learners remember characters through a deeper understanding of their formation, as well.
 

Ash

进士
Answer to #1.
If you want to see what the Expert Edition looks like, check out the demo entries: 土 屮 攵 立 各 艸 足 尚 春 美 草 射 造 堂 黑 路 監 謝 藍 變 艹 攴 (or their simplified equivalents). As for meaning, the dictionary focuses on a character's meaning in modern Mandarin. A dictionary like the one you describe would be super cool, but it's outside the scope of this dictionary, which focuses on character learning. This dictionary does show you the logical connections between different senses of a character in modern Mandarin, but it focuses only on common meanings (because we want to minimize what people have to memorize).

Answer to #2.
We don't use the 6 categories for several reasons. Many top-notch paleographers, such as 陳夢家、唐蘭、裘錫圭, don't use these categories because they are inadequate to explain character evolution and form. Other scholars, such as 季旭昇 (I learned a lot of my paleography from him), still use them, but have to modify the categories to make things fit. To be honest, it gets pretty unwieldy. For instance, 詹鄞鑫 distinguishes 形聲字 whose sound component gives a meaning from 會意字 whose meaning component gives a sound. His criteria are rigorous, but the distinction isn't at all important for someone learning Chinese. In fact, it would be more of a burden. Abstract character-type categories are not useful for learners, because you need to know a lot of characters already to make use of them.

Our system explains characters in terms of their functional components, i.e., the parts of a character that are doing something in that character. Each component has 3 attributes: form, sound and meaning. There are 4 types of functional components: Form components, Meaning components (these two are collectively called semantic components), Sound components and Empty components. Each time these words appear in the dictionary, there is a link to an explanation of what they are. Understanding these 4 types is crucial!
In order to truly understand a character, you need to know how it breaks down component-wise and what each of those components is doing. If you get that, the type of the character itself is irrelevant, because you already thoroughly understand the character. And your understanding is tied to concrete things, like meaning and sound expression, rather than abstract categories. We have blog posts that explain this stuff too, but we recently switched sites and the blog posts haven't been put back up yet.
Takeaway: one of the most important aspects of this dictionary is the character breakdowns and the system of functional components. These are much more important than looking at ancient forms (though I love looking at ancient forms!).
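For readers who think in data structures, the four functional component types described above could be modeled roughly as follows. This is only an illustrative sketch, not how the dictionary actually stores its entries: the names `Role`, `Component`, and `Breakdown` are made up for this example, and the sample breakdowns of 臧 and 藏 are the ones given later in this thread.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Role(Enum):
    """The four functional component types named in the post (hypothetical encoding)."""
    FORM = "form"
    MEANING = "meaning"
    SOUND = "sound"
    EMPTY = "empty"

    @property
    def is_semantic(self) -> bool:
        # Form and meaning components are collectively called semantic components.
        return self in (Role.FORM, Role.MEANING)


@dataclass(frozen=True)
class Component:
    glyph: str
    # A tuple rather than a single role, on the assumption that one
    # component may do more than one job in a character.
    roles: Tuple[Role, ...]


@dataclass(frozen=True)
class Breakdown:
    character: str
    components: Tuple[Component, ...]

    def by_role(self, role: Role) -> List[str]:
        """Return the glyphs of all components playing the given role."""
        return [c.glyph for c in self.components if role in c.roles]


# Breakdowns as stated later in this thread.
ZANG = Breakdown("臧", (Component("臣", (Role.FORM,)), Component("戕", (Role.SOUND,))))
CANG = Breakdown("藏", (Component("艹", (Role.MEANING,)), Component("臧", (Role.SOUND,))))

if __name__ == "__main__":
    for entry in (ZANG, CANG):
        for comp in entry.components:
            kinds = ", ".join(r.value for r in comp.roles)
            print(f"{entry.character}: {comp.glyph} is a {kinds} component")
```

The point of the sketch is the same as the post's: once each component carries its own role, the "type" of the whole character never needs to be stored.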
 

Shun

状元
Many thanks for these answers! I agree it's not the categorization that matters, but the thinking that led to the various categorizations, and that is already included in the dictionary (as with the classic 会意字 “明” or “好”).

Of course, I've never learned characters based on functional explanations; I just used subconscious mnemonics. But as long as a system can account for many phenomena in a coherent whole that is held together by reason, it will definitely provide a good foundation that learners can build on.

It's like learning the grammar of languages like French. You need it at first, but once you feel the language subconsciously, you can do without thinking about grammar. Like here, you build a conscious structure using the Outlier system that you can rely on, then later you just remember the characters without needing any additional explanations to remember them.
 

Ash

进士
Well said!
 

shaluig

举人
It's like learning the grammar of languages like French. You need it at first, but once you feel the language subconsciously, you can do without thinking about grammar. Like here, you build a conscious structure using the Outlier system that you can rely on, then later you just remember the characters without needing any additional explanations to remember them.

Yes, and it's all the more subconscious when you're French. :)

Will you trace how a character's meanings changed over time, how and when components were added to it or removed from it (which is probably quite rare), when it was simplified and again made more complex, and so on?

Grand Ricci has that, in the form of the GRH (Grand Ricci Historical). From oracle bones to more recent forms, you often get the details of how a specific character was constructed,
but I guess you already know that, Shun.
 

Shun

状元
Yes, and it's all the more subconscious when you're French. :)

Very nice, absolutely. :) Put another way: the more French you are, the more subconscious these rules become.

Grand Ricci has that, in the form of the GRH (Grand Ricci Historical). From oracle bones to more recent forms, you often get the details of how a specific character was constructed,
but I guess you already know that, Shun.

Oh yes, and the regular Grand Ricci is also very diachronic. Thanks for reminding me!

Greetings from German-speaking Switzerland,

Shun
 

shaluig

举人

That's right, it really is an excellent dictionary!
Kind regards from Brittany in France.
 

rizen suha

状元
@rizen suha: You will be pleased to know that as I was poking around looking for information and examples to answer your post, I discovered that for 把, we had 巴 listed as only a sound component, but it actually gives a meaning! So, I've already corrected that and it will go out in the next update.
thanks!

how about this one? (stud classical medieval dict)

廣 broad(en), wide(n), spacious
擴 broaden, widen, expand, extend

i'm guessing that 廣 is the original character and that 扌 has been added to disambiguate or "verbify"?
 

Ash

进士
Yes, you are correct in this case. I've updated our entry and it will go out in the next update. It's important to keep in mind, however, that characters are merely symbols used to write spoken language. This is a crucial distinction to make when doing this kind of research. So, it's far more likely that there was already a spoken word kuò "to expand," quite probably etymologically related to guǎng "vast; broad," and native speakers of the time possibly even felt the relationship (similar to how we can feel the relationship between "national" and "nationalistic"). So, 扌 was added to write the verb form that already existed in the spoken language, rather than to verbify the character. I'm not sure exactly what you meant in your comment, but I wanted to clarify just in case.

Native speakers often say stuff like, "If you change the tone of a character, it changes the meaning." You can't change the tone of a character without somehow getting the Ministry of Education to change it. What really happens is that you say a syllable of speech incorrectly, causing the listener to hear a word different from the one you intended. We speak in syllables of speech, not in characters. Characters are merely used to write spoken language. It may seem like a small thing, but when you're doing research like this, if you don't make this distinction, it will cause errors.
 
I have a question about 卖 while we're on the subject.

This Stack Exchange answer says that "[t]he original form of the top part of 賣 is 出," and then gives a Shuowen reference. OTC says 土 on top is an empty component and OSC says 十 on top is an empty component.

Are these unfinished entries (well, obviously they are) or has the Shuowen been refuted on this point in later studies?
 
The Shuowen is right about this particular character. Our analysis doesn't actually conflict with it, though it may appear to at first.

We analyze characters in terms of their components in the modern script because our dictionary is meant to answer the question, "Why does this character look like that?" Since the modern form contains 土 rather than 出, we call it an empty component because neither the meaning "dirt" nor the sound tǔ is related to the meaning or sound of 賣. So "empty" refers to the component in the modern script.

When that entry gets filled out in a future version, it will say something like "賣 was originally composed of the meaning components 買 'to buy' and 出 'to go out,' which pointed to the meaning 'to sell'. In the modern form, 出 has corrupted to 土. 買 also gave the sound." Something along those lines.

The same applies to the simplified 卖. The 十 on top does not give the sound shí and does not give the meaning "10" or "needle." Notice that the 头 on the bottom is also empty. It doesn't mean "head" and does not give the sound tóu. It's merely a placeholder for an earlier 買.
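To make the "empty in the modern script, meaningful in the earlier form" idea concrete, here is a small sketch. It is purely illustrative and assumes a made-up record type (`ModernComponent`) and lookup table (`MODERN_FORMS`); it lists only the components discussed in this post and is not the dictionary's real data format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ModernComponent:
    glyph: str
    function: str                       # "form", "meaning", "sound", or "empty"
    earlier_form: Optional[str] = None  # what the glyph stands in for, if stated


# Only what the post states explicitly is recorded here.
# The post does not spell out an earlier form for 十, so it is left as None.
MODERN_FORMS: Dict[str, List[ModernComponent]] = {
    "賣": [ModernComponent("土", "empty", earlier_form="出")],
    "卖": [
        ModernComponent("十", "empty"),
        ModernComponent("头", "empty", earlier_form="買"),
    ],
}


def empty_components(character: str) -> List[str]:
    """Glyphs that give neither sound nor meaning in the modern form."""
    return [c.glyph for c in MODERN_FORMS[character] if c.function == "empty"]


def describe(component: ModernComponent) -> str:
    """One-line explanation of a component's job in the modern script."""
    text = f"{component.glyph}: {component.function} component"
    if component.earlier_form is not None:
        text += f" (stands in for earlier {component.earlier_form})"
    return text


if __name__ == "__main__":
    for char, comps in MODERN_FORMS.items():
        for comp in comps:
            print(char, "|", describe(comp))
```

Keeping the modern function and the earlier form in separate fields is what lets "empty" and "the Shuowen is right" both be true at once.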
 

rizen suha

状元
please have a look at 臟

practical dict of chinese medicine:
The Chinese ideogram was originally written as 藏 cang2 meaning to store, and was later distinguished by the addition of the flesh signifier 肉(月)

let me just reiterate my full support for and enthusiasm with the outlier project.

i must however point out that the current edition seems to be somewhat lacking in exactness. as it is a work in progress, it would perhaps be preferable to mark information that is pending research or verification as such.

also allow me to point out that i have posted only a few examples. there are many more i have not posted, since a few suffice to illustrate the issue. my (subjective) observation is that around 20% of characters seem to have an obvious (apparent) semantic relation with both the "1st" and "2nd" component, and that a less "obvious" relationship can be seen in up to 50% of characters in total. i also refrain from posting more examples because this is work for me and for the outlier team, work that would be better invested in the dictionary itself.

best regards and good luck going forward
 

朱真明

进士
What you say is obvious doesn't really mean much. You either have research supporting your claims or you don't.
 

rizen suha

状元
you may be quite right. no hard research, only observations that i have not written down. my posts only aim to give my point of view, to be researched or discarded by whoever it may concern.
 

Ash

进士
You're confusing several issues here. The question at hand is “Does 藏 give a meaning in 臟?” First, look at
臧 zāng, cáng, zàng:
Original meaning: either “善” or “slave.” (For our purposes here, it doesn't matter which one it is.)
Form component: 臣 “servant”
Sound component: 戕 qiāng, zāng

藏 cáng, zàng, zāng:
Original meaning: name for a type of plant used to feed cattle.
Meaning component: 艹 “vegetation”
Sound component: 臧 zāng, cáng, zàng

臧 was used in ancient texts to mean 善 or “slave.” It was also used via sound loan to mean “to store.” In order to keep these apart, 藏 began to be used via sound loan to represent “to store.” 贓 zāng “goods obtained by illegal means” and 臟 zàng “internal organs” are both meanings extended from “to store.” Note that one character uses 臧 while the other uses 藏. This is because both 臧 and 藏 were used for their sound, not for their original meanings, which have nothing to do with “to store” or any of its derived meanings.

Like I said last time, if you don't distinguish between characters and spoken words, you will make errors. This is one of those cases. Yes, 臟 and 贓 are 分化字 from 臧 and 藏, but NOT because of their original meanings or meanings extended from their original meanings (i.e., meanings that are related to the characters' forms), but from sound loan meanings. It's like using 馬來西亞 Mǎláixīyà (or Mǎláixīyǎ in Taiwan) to represent “Malaysia.” This has nothing to do with any meanings of 馬, 來, 西, or 亞. These characters are being used for their sounds, pure and simple.

In summary, since neither 藏's original meaning nor any meaning extended from it has anything to do with “to store,” we say it only gives a sound in 臟. Also, please note that regular dictionaries (even Chinese-Chinese ones) are probably not going to contain reliable paleography.
 

rizen suha

状元
@Ash

thank you very much.

you say:

>>藏 began to be used via sound loan to represent “to store.” 贓 zāng “goods obtained by illegal means” and 臟 zàng “internal organs” are both meanings extended from “to store.”<<

i was wondering if the dictionary will include this kind of analysis, going beyond stating that 藏 is a sound component.
 