Outlier Dictionary of Chinese Characters

JD · Jun 25, 2021

MMarks said:
I don't understand the pricing of the Outlier dictionary. I seem to have Expert and Essentials. The latest upgrade to Essentials would cost me $19.99. Is that correct? Just asking. I had assumed that buying the dictionary would include the upgrade. Not complaining, I just don't find it clear.

It's a one-time purchase where you automatically get the upgrades. If you aren't getting the upgrades, there might be something amiss with your Pleco account (purchases not synced with your ID, etc.).

Ash · Jun 25, 2021

MMarks said:
I don't understand the pricing of the Outlier dictionary. I seem to have Expert and Essentials. The latest upgrade to Essentials would cost me $19.99. Is that correct? Just asking. I had assumed that buying the dictionary would include the upgrade. Not complaining, I just don't find it clear.

No, that's not correct. If you buy Essentials, you only have to buy it once. All future updates to Essentials are included.
If you have Expert, that includes Essentials and Expert. Also, you only have to buy it once. All future updates to Essentials and Expert are included.
The updates should happen automatically.
Go to the 話 entry. If you see a blue button with a white S (on iPhone) or just an S (on Android) to the left of "In 話, 言 'speech'..." in the component breakdown, then you already have the update. If you click on the S, you'll see the system data.
Pleco is working on fixing the S button for Android.

MMarks · Jun 25, 2021

Thanks, JD and Ash, that's what I was wondering. Pleco is showing me I need to pay for the upgrade.
There's nothing to the left of that (iPhone).

MMarks · Jun 25, 2021

So how do I sync my purchases with my ID?

MMarks · Jun 25, 2021

Actually, it looks as if I have been able to download the updates without paying. The original $19.99 was assuming I had bought Outlier Mini. The "S" has now appeared! The update was not automatic. Thanks for the confirmation that I don't have to pay.

JD · Jun 25, 2021

MMarks said:
So how do I sync my purchases with my ID?

Send a PM to @mikelove

Vitali · Aug 5, 2021

@John Renfroe On your website it says that the dictionary is also bilingual in German. Is only the translation of the searched word in German, or all the other information as well? In my Expert edition everything is only in English, and if it is possible, I don't know how to switch to German.

JD · Aug 5, 2021

@Vitali I don’t see anything on the Outlier website saying that their dictionary is bilingual. I do see that Outlier, the company, offers a translation service that includes German, but this is not the dictionary. Where are you seeing that their website says they have a German version?

aha…..after some digging, I did find an option for the German version of the Expert Edition, but it says the following:
Also note that the German version hasn't been released yet. If you order it, you'll get the English version for now, and we'll let you know when the German version is ready so you can switch over if you'd like.
At this link: https://www.outlier-linguistics.com/products/outlier-dictionary-of-chinese-characters-expert-edition

alex_hk90 · Aug 6, 2021

JD said:
aha…..after some digging, I did find an option for the German version of the Expert Edition, but it says the following:
Also note that the German version hasn't been released yet. If you order it, you'll get the English version for now, and we'll let you know when the German version is ready so you can switch over if you'd like.
At this link: https://www.outlier-linguistics.com/products/outlier-dictionary-of-chinese-characters-expert-edition

@Ash @John Renfroe
It looks like that page on the website needs updating - I'm pretty sure there are many more than 38 Expert entries now.

Note that this version hasn't been released yet. If you pre-order it, you'll get the current version of the Essentials Edition, plus all the Expert entries we've finished so far (only 38 right now, but a few hundred more will be added in the next few months). You'll also get regular updates until the dictionary is finished!


Vitali · Aug 6, 2021

@JD Here you can see the information. Essentials and Expert are bilingual and the Mini version is monolingual; in this version you can choose between German and English.

Three Editions of the Outlier Dictionary

The Outlier Dictionary of Chinese Characters comes in three “flavors”: - The Mini Edition - The Essentials Edition - The Expert Edition So what are the differences, and which one is right for you? The Mini Edition The Mini Edition contains: 2000 characters, both Simpilfied and Traditional Full...

www.outlier-linguistics.com

Here you can select and purchase them.

Chinese

Creators of groundbreaking Chinese character/Japanese kanji etymology dictionaries for iOS and Android and other tools for learning Chinese characters/kanji. We show how characters/kanji work as a system, rather than as a bunch of disconnected single characters to be individually mastered...

www.outlier-linguistics.com

Longdan · Aug 25, 2021

mikelove said:
Innovative new Chinese character etymology dictionary; on Kickstarter here, will be available as a Pleco add-on upon release. So we're finally fulfilling one of our very oldest feature requests and offering an etymology dictionary

That’s great news! Outlier is doing some amazing work already, can’t wait to download this new product. I wonder if there may also be a Chinese WORD etymology dictionary in the pipeline. It’s sorely missing, even among paper dictionaries, apart from Schuessler’s dictionaries of Old Chinese. Any chance we may be getting something along those lines?

Longdan · Aug 25, 2021

Longdan said:
That’s great news! Outlier is doing some amazing work already, can’t wait to download this new product. I wonder if there may also be a Chinese WORD etymology dictionary in the pipeline. It’s sorely missing, even among paper dictionaries, apart from Schuessler’s dictionaries of Old Chinese. Any chance we may be getting something along those lines?

Oh wait, I just realized this is an old post and that the Outlier dictionary I’ve been using actually is the one you’re talking about here! My bad. But the question remains relevant: is there any hope of getting “real” etymological dictionaries (etymology of the spoken word rather than philological analysis of the graphic components of the written characters)?

mikelove · Aug 25, 2021

We have a license for the “ABC Etymological Dictionary of Old Chinese” which is about the closest anyone has gotten to what you describe in English, but the data files are thorny enough that we had to put them aside until after 4.0 is out.

We could consider licensing a Chinese-only title for this but I don’t know of any good ones that are new enough to exist in digital form.

Longdan · Aug 25, 2021

mikelove said:
We have a license for the “ABC Etymological Dictionary of Old Chinese” which is about the closest anyone has gotten to what you describe in English, but the data files are thorny enough that we had to put them aside until after 4.0 is out.

We could consider licensing a Chinese-only title for this but I don’t know of any good ones that are new enough to exist in digital form.

That’s wonderful, looking forward to version 4.0 then!

Ash · Sep 7, 2021

@Vitali, @JD, @Longdan, @alex_hk90

There are about 190 Expert entries at the moment, we're about to add another 70 very soon. John is going to make an announcement here with more details. Re: German. We are going to do that, but it's on hold at the moment. We were in talks with a translator, but he disappeared into China. We do have semantic component posters in German that are currently available.

And, word etymology resources are fairly sparse, even in Chinese. We don't have any plans in the near future to do that. That would require a complete retooling.

Book-wise, James Matisoff's Handbook of Proto-Tibeto-Burman has some in it, but it's tainted by the fact that he bases things on Karlgren's OC reconstruction. One of the major issues being there are no -- or almost no-- open syllables in Karlgren's system. So, words that would be *ka in Baxter92, Baxter & Sagart 2014, 王力、鄭張尚芳, etc. would be *kak (or *kap, *kat depending on the rhyme group) in Karlgren's system (as it would in 李方桂). Needless to say, having that extra consonant at the end could make unrelated words look related, and related words look unrelated. I've not actually evaluated the quality of Matisoff's Chinese etymologies, but I don't see how using Karlgren's OC can do anything but have a negative effect.
王力《同源字典》 despite the name is basically about word etymology. His 《古漢語字典》 points those things out as well.
吳安其《漢藏語同源研究》2002 (You could get some Chinese word etymology from this, but it's not going to be direct.)
姚榮松《古代漢語詞源研究論衡》(This looks like what you want, but I've never read it, so I can't say anything about its quality.)

One more thing to point out, those are all dealing with the etymologies of ancient words, which is a completely different question from how did the word 社會 come about. The Taiwanese MOE online dictionary does show quotes from ancient texts for words, so that might be a place to start (assuming you already read Chinese). However, I'm not sure what process they used for those quotes, so they may not represent the earliest uses.

John Renfroe · Sep 8, 2021

Hi all. We're getting ready to release another big update. This one will have over 380 new Essentials entries and almost 70 new Expert entries (some super interesting ones this time around). We're putting the last touches on it right now, and we should be able to release it this month.

alex_hk90 said:
I'm pretty sure there are many more than 38 Expert entries now.

Thanks for pointing that out. That page isn't accessible via normal navigation of our site (the normal page is here: https://www.outlier-linguistics.com/products/outlier-dictionary-of-chinese-characters), so it hasn't been updated in a while. I guess you must have followed an old link on a forum or social media post somewhere. At any rate, it's updated now.

John Armstrong · Sep 17, 2021

Ash said:
@Vitali, @JD, @Longdan, @alex_hk90

There are about 190 Expert entries at the moment, we're about to add another 70 very soon. John is going to make an announcement here with more details. Re: German. We are going to do that, but it's on hold at the moment. We were in talks with a translator, but he disappeared into China. We do have semantic component posters in German that are currently available.

And, word etymology resources are fairly sparse, even in Chinese. We don't have any plans in the near future to do that. That would require a complete retooling.

Book-wise, James Matisoff's Handbook of Proto-Tibeto-Burman has some in it, but it's tainted by the fact that he bases things on Karlgren's OC reconstruction. One of the major issues being there are no -- or almost no-- open syllables in Karlgren's system. So, words that would be *ka in Baxter92, Baxter & Sagart 2014, 王力、鄭張尚芳, etc. would be *kak (or *kap, *kat depending on the rhyme group) in Karlgren's system (as it would in 李方桂). Needless to say, having that extra consonant at the end could make unrelated words look related, and related words look unrelated. I've not actually evaluated the quality of Matisoff's Chinese etymologies, but I don't see how using Karlgren's OC can do anything but have a negative effect.
王力《同源字典》 despite the name is basically about word etymology. His 《古漢語字典》 points those things out as well.
吳安其《漢藏語同源研究》2002 (You could get some Chinese word etymology from this, but it's not going to be direct.)
姚榮松《古代漢語詞源研究論衡》(This looks like what you want, but I've never read it, so I can't say anything about its quality.)

One more thing to point out, those are all dealing with the etymologies of ancient words, which is a completely different question from how did the word 社會 come about. The Taiwanese MOE online dictionary does show quotes from ancient texts for words, so that might be a place to start (assuming you already read Chinese). However, I'm not sure what process they used for those quotes, so they may not represent the earliest uses.

Hi. I just joined this forum and your post caught my eye. My main interest is Middle Chinese as seen through the window of rhyme dictionaries and rhyme tables, and mostly avoid Old Chinese and Tibeto-Burman and other proto-language hypotheses, which are too speculative for my tastes.

But I do recall that the system of final consonants that Karlgren postulated for Old Chinese (his Archaic Chinese, as opposed to his Ancient Chinese, which is Middle Chinese) arose from his study of character phonetic series (諧聲 xiéshēng) which showed a mix of apparently vowel-final syllables with 去 qù ‘departing’ tone and stop-final (-k -t –p) syllables with 入 rù ‘entering’ tone (the only choice for such finals in Middle Chinese as we know it). A standard example is:

夜 Mand yè Cant je6 ‘night’ MC /jiaH/ (final H = qù tone)
液 Mand yì yè Cant jik6 jat6 ‘liquid’ MC /jiɛk/ (final stop = rù tone)

I know phonetic series play a big role in the Outlier character learning method. Have you encountered the kinds of series that Karlgren was concerned with and if so what is your take on them? (BTW I notice that the example is not problematic in modern Mandarin because the Middle Chinese final -k was lost long ago; but the difference can still be seen in Cantonese.)

Ash · Sep 17, 2021

John Armstrong said:
Hi. I just joined this forum and your post caught my eye. My main interest is Middle Chinese as seen through the window of rhyme dictionaries and rhyme tables, and mostly avoid Old Chinese and Tibeto-Burman and other proto-language hypotheses, which are too speculative for my tastes.

But I do recall that the system of final consonants that Karlgren postulated for Old Chinese (his Archaic Chinese, as opposed to his Ancient Chinese, which is Middle Chinese) arose from his study of character phonetic series (諧聲 xiéshēng) which showed a mix of apparently vowel-final syllables with 去 qù ‘departing’ tone and stop-final (-k -t –p) syllables with 入 rù ‘entering’ tone (the only choice for such finals in Middle Chinese as we know it). A standard example is:

夜 Mand yè Cant je6 ‘night’ MC /jiaH/ (final H = qù tone)
液 Mand yì yè Cant jik6 jat6 ‘liquid’ MC /jiɛk/ (final stop = rù tone)

I know phonetic series play a big role in the Outlier character learning method. Have you encountered the kinds of series that Karlgren was concerned with and if so what is your take on them? (BTW I notice that the example is not problematic in modern Mandarin because the Middle Chinese final -k was lost long ago; but the difference can still be seen in Cantonese.)

If you're trying to understand sound series, OC is the only game in town. Most characters in use today were created before the Middle Chinese period. Not only that, it's not all highly speculative. There's a core of things that most scholars agree upon. The newer stuff systematically answers a lot of questions. Having said that, here's part of a response on Reddit I gave that answers your question:

As far as post-codas, it's the same kind of logic. The coda is part of the root, while a post-coda is something attached to the end of the root. Baxter & Sagart use Haudricourt's explanation for tonogenesis in Vietnamese (which is from the 1940's or 50's). Haudricourt basically showed that a language related to Vietnamese which wasn't tonal had syllable segments that were missing in Vietnamese, and these missing segments corresponded to tones in Vietnamese.

Applied to OC, *-ʔ is used as the origin of MC 上聲 and *-s is the origin of MC 去聲. Additionally, this reconstruction also accounts for what had been considered a rather puzzling connection between 入聲 and 去聲 words. In Baxter & Sagart's reconstruction, an OC *-k might also have a post-coda like *-s, so *-ks > *-s > -H (MC 去聲). If you know Cantonese, if you can think of a 入聲 character that also has an open syllable reading, that open syllable reading will almost always be the Canto 3rd tone or 6th tone (both come from MC 去聲). A competing idea accepted by 李方桂, 董同龢, 竺家寧, etc. is that all syllables in OC were closed. But, there are no known languages in the entire world where there are no open syllables (i.e., with only closed syllables), so this is highly unlikely. In contrast, the post-codas provide an elegant explanation with greater explanatory power. They not only explain the origin of MC tones, they also explain why MC 去聲 and 入聲 have so much contact, in addition to providing a motivation for the change: The *-s assimilates the *-k (where assimilation is a very normal sound change).
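To make the post-coda logic concrete, here is a toy sketch in Python. It is only an illustration under simplifying assumptions: the function name `oc_to_mc_tone`, the bare-string notation for finals, and the treatment of every *-s final alike (the text above only spells out *-ks) are all hypothetical, and the default to 平聲 for finals with no stop coda and no post-coda is the usual textbook assumption rather than something stated above. It is not Baxter & Sagart's actual rule set.

```python
# Toy illustration of the post-coda account of MC tones described above.
# Names and notation are assumptions for illustration only; this is not
# Baxter & Sagart's actual reconstruction procedure.

STOP_CODAS = ("k", "t", "p")


def oc_to_mc_tone(oc_final: str) -> str:
    """Map a simplified OC final (written without the leading '*') to an MC tone."""
    if oc_final.endswith("s"):
        # Post-coda *-s, including *-ks where the stop is absorbed
        # (*-ks > *-s > -H), gives MC 去聲.
        return "去聲"
    if oc_final.endswith("ʔ"):
        # Post-coda *-ʔ gives MC 上聲.
        return "上聲"
    if oc_final.endswith(STOP_CODAS):
        # A bare stop coda with no post-coda stays 入聲.
        return "入聲"
    # Assumption: no stop coda and no post-coda gives 平聲.
    return "平聲"


for final in ("ak", "aks", "aʔ", "a"):
    print(f"*-{final} > {oc_to_mc_tone(final)}")
```

The point of the sketch is the pair `ak` / `aks`: the same root with and without the *-s post-coda ends up in 入聲 and 去聲 respectively, which is the kind of contact between the two tones that shows up in sound series.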

If you're interested in the rest of the post (it talks about OC and its reliability), you can check it out here:
https://www.reddit.com/r/classicalchinese/comments/lfzczb/whats_your_opinion_on_the_baxtersagart_2014/gms9dq9/?context=3

If not, then no worries!

John Armstrong · Sep 17, 2021

Ash said:
If you're trying to understand sound series, OC is the only game in town. Most characters in use today were created before the Middle Chinese period. Not only that, it's not all highly speculative. There's a core of things that most scholars agree upon. The newer stuff systematically answers a lot of questions. Having said that, here's part of a response on Reddit I gave that answers your question:

As far as post-codas, it's the same kind of logic. The coda is part of the root, while a post-coda is something attached to the end of the root. Baxter & Sagart use Haudricourt's explanation for tonogenesis in Vietnamese (which is from the 1940's or 50's). Haudricourt basically showed that a language related to Vietnamese which wasn't tonal had syllable segments that were missing in Vietnamese, and these missing segments corresponded to tones in Vietnamese.

Applied to OC, *-ʔ is used as the origin of MC 上聲 and *-s is the origin of MC 去聲. Additionally, this reconstruction also accounts for what had been considered a rather puzzling connection between 入聲 and 去聲 words. In Baxter & Sagart's reconstruction, an OC *-k might also have a post-coda like *-s, so *-ks > *-s > -H (MC 去聲). If you know Cantonese, if you can think of a 入聲 character that also has an open syllable reading, that open syllable reading will almost always be the Canto 3rd tone or 6th tone (both come from MC 去聲). A competing idea accepted by 李方桂, 董同龢, 竺家寧, etc. is that all syllables in OC were closed. But, there are no known languages in the entire world where there are no open syllables (i.e., with only closed syllables), so this is highly unlikely. In contrast, the post-codas provide an elegant explanation with greater explanatory power. They not only explain the origin of MC tones, they also explain why MC 去聲 and 入聲 have so much contact, in addition to providing a motivation for the change: The *-s assimilates the *-k (where assimilation is a very normal sound change).

If you're interested in the rest of the post (it talks about OC and its reliability), you can check it out here:
https://www.reddit.com/r/classicalchinese/comments/lfzczb/whats_your_opinion_on_the_baxtersagart_2014/gms9dq9/?context=3

If not, then no worries!

Re the relationship between phonetic series (xiesheng) and OC, yes, it’s definitely a close one. But it’s also a circular one, in that phonetic series have always been one of the main sources for the reconstruction of OC, maybe even the most important of all. A big part of what OC is, then, is a theory developed to explain the range of phonetic series.

I like your reddit post. It includes the qu-ru phonetic series issue I mentioned and provides a nice summary of the basic methodology of OC reconstruction. I am a fan of Baxter and Sagart myself, because they try to go beyond pure phonological reconstruction and identify phonological rules and (traces of) morphological formations (especially prefixation and suffixation). I’m aware of their stated goal to present a model of OC that is supported by empirical evidence and potentially falsifiable by counter evidence, but my impression is that while the individual reconstructions can be challenged (and replaced) the model as a whole is so resilient that it’s practically impossible to prove that it’s wrong, at least without a major discovery of new evidence or a major advance in methodology. I therefore regard it as speculative in the specific sense of being not confirmable or disconfirmable to a reasonable degree of confidence.

There’s a question about OC that has been on my mind for a long time but I haven’t seen addressed and would love to hear your opinion on. Evidence continues to mount that:

(1) what appear as syllables in MC were segmentable into optional prefixes, roots (always present), and optional suffixes;

(2) these prefixes and suffixes sometimes had derivational force and as such were morphemes in their own right, separate from the root morphemes to which they were attached; and

(3) some prefixes and suffixes may have had vowels of their own or at least may have been joined to the roots via connecting vowels, even if they tended to be short (forming so-called sesquisyllables with the root) and subject to elision.

My question is, given this situation, why weren’t the prefixes and suffixes represented in the writing system? Two ways they could have been were (a) as separate characters or (b) as character components added to the root character. But as far as I know there is no evidence of either practice.

(BTW an interesting case of (a) in a Vietnamese text, which you may already be familiar with, is discussed in GONG Xun’s 2019 paper “Chinese loans in Old Vietnamese with a sesquisyllabic phonology” https://eprints.soas.ac.uk/32159/.)

Ash · Sep 17, 2021

re: "in that phonetic series have always been one of the main sources for the reconstruction of OC, maybe even the most important of all. "
That is not an accurate characterization of OC. OC isn't "mainly" reconstructed based on xiesheng series. The rhyme categories are "mainly" reconstructed on how characters rhyme in the 《詩經》 (which is independent of xiesheng; in fact, xiesheng provides independent evidence for rhyming in addition to the rhyming in the Shijing and other ancient rhyming texts). 諧聲 is taken into account, but is not the only thing taken into account. There are tons of 異體字, which represent the same word with different phonetics, and there are tons and tons and tons of 通假, which is characters being used as sound symbols without regard to their meanings. In Warring States excavated texts, it's not uncommon for 70% of the characters in a single sentence to be 通假. That's a lot of sound data. There's also how foreign place names are written in Chinese characters. There are also 異文 of ancient texts. So, 異體字、通假字、異文 all give independent evidence for the sound relations between both individual characters and sound components. There are a lot of dictionaries of 通假 pairs.

OC is not "a theory developed to explain the range of phonetic series." In fact, the early interest in OC had to do with understanding ancient texts. It has hundreds of years of history behind it. One of the first people that tried to reconstruct OC was 鄭庠 in the Song dynasty, though it was very rudimentary. The traditional analysis goes back to 顧炎武 in the 1600's. I don't think you can characterize any of these guys as being "primarily concerned with phonetic series."

Re: "I therefore regard it as speculative in the specific sense of being not confirmable or disconfirmable to a reasonable degree of confidence. "
I just saw a paper given at a conference about how well Baxter & Sagart 2014 vs. 鄭張尚芳 can explain rhyming phenomena in excavated texts, and B&S did really well. They did have a problem with 侵部. Interestingly, Sagart gave a paper at the same conference on some geographical differences with OC 部, and 侵部 was the main one. I'm pretty sure (though I'd have to confer with the author of the BnS2014 vs. zzsf paper to be 100% sure) that some of the issues he came across were cleared up due to the geographical differences.
So, to challenge your assertion, here is a case of excavated texts being used to confirm theories based on the 《詩經》、諧聲、通假、異體字 (all of which are independent from each other), and performing rather well. In places where it doesn't perform, it points out areas to fix. That is not circular.

So, your characterization of OC is not accurate.

Re: "what appear as syllables in MC were segmentable into optional prefixes, roots (always present), and optional suffixes; "
I'm not sure what you mean. Are you saying that there was suffixation in OC? What "appear as syllables in MC" are syllables. As are the roots plus affixation in OC. I'm not sure what you mean by "appearing as syllables".

Re: "My question is, given this situation, why weren’t the prefixes and suffixes represented in the writing system? "
There are examples of them being reflected in writing. There are cases like, I think it's in 《方言》, where they say that 筆 is pronounced 不律 in some region, which fits exactly BnS2014 "loosely attached" (不律) vs. "tightly attached" (筆) prefix types. I've found other examples where 注釋家 explain some character like:
無X，X也. So, "not X" = "X". The reason? The 無 is representing a sound, a loosely attached pre-initial. So, there are instances of your (a).
I don't think your (b) is viable though. How would you know that a given component is representing affixation? I've never seen any marking on a character that would indicate some internal aspect of a character's pronunciation. In fact, native speakers of languages don't analyze word-internal grammar, which is what your (b) is. So, (b) is out, but there are examples of (a). Pre-Qin characters reflect syllables, but not parts of syllables.

There are things like 合文, where two characters are written together as a single character, but they are usually marked with a = symbol, showing that it should be read as two characters. But having a component represent a prefix doesn't really match how things worked in pre-Qin scripts.

I'm familiar with GONG Xun, but haven't read that particular paper.

EDIT:
I forgot to mention that any OC reconstruction also has to systematically evolve into MC. That's yet another independent constraint, given the data in rhyme books. Of course, there are also Chinese characters being borrowed into Japanese, Korean, Vietnamese. There are more lines of independent evidence.
