Dictionary showing character components?

pravit · Jan 13, 2013

I often use Pleco in conjunction with the excellent ChineseEtymology.org site as it really helps to see the phonetic and signific components of a character broken out. Oftentimes that phonetic has its own phonetic and it's a trip down the rabbit hole of tracing down the most basic phonetic component, often some rather obscure ancient character or radical I would have never learned otherwise. I was thinking this would be an excellent type of dictionary to have in Pleco as the tap interface would make tracing down character component roots much faster. Does any sort of dictionary like this exist or are there plans to include one? I'd buy it. It'd be awesome if you managed to somehow link up with the database that ChineseEtymology uses. Displaying the seal script and oracle script characters would be nice too, although I have no idea if Pleco even supports images in definitions.

mikelove · Jan 13, 2013

pravit said:
I often use Pleco in conjunction with the excellent ChineseEtymology.org site as it really helps to see the phonetic and signific components of a character broken out. Oftentimes that phonetic has its own phonetic and it's a trip down the rabbit hole of tracing down the most basic phonetic component, often some rather obscure ancient character or radical I would have never learned otherwise. I was thinking this would be an excellent type of dictionary to have in Pleco as the tap interface would make tracing down character component roots much faster. Does any sort of dictionary like this exist or are there plans to include one? I'd buy it.

We've had a devil of a time licensing a dictionary like that. We couldn't get anywhere with ChineseEtymology or anybody else, and in frustration have been considering developing our own instead (in which case we'd probably also put it online, as it would be a bit of a waste not to).

pravit said:
Displaying the seal script and oracle script characters would be nice too although I have no idea if Pleco even supports images in definitions.

We don't yet, but we're working on that, and one of our upcoming dictionaries actually includes some of those for common characters at least.

pravit · Jan 14, 2013

What about something like Yellowbridge's "Character Decomposition" which breaks out a character into its constituent components?
http://www.yellowbridge.com/chinese/character-etymology.php?searchChinese=1&zi=尉

That would be a great feature. I wonder where they are getting that data from.

mikelove · Jan 14, 2013

pravit said:
What about something like Yellowbridge's "Character Decomposition" which breaks out a character into its constituent components?
http://www.yellowbridge.com/chinese/cha ... =%E5%B0%89

That would be a great feature. I wonder where they are getting that data from.

Their "Character Formation" data seems to be something they generated themselves (we already wrote to them, and they weren't interested in licensing it out). The Decomposition data isn't really any different from what we already offer in-app in our "Chars" tab.

pravit · Jan 15, 2013

Thanks - that "Chars" tab is an awesome feature! I think it's quite helpful as is; only thing I'd really like to see is a hierarchical, collapsible menu that initially hides all the sub-components until you tap on a parent component. It can be hard to see the main components initially amidst the big list of nameless sub-components.
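To make the collapsible idea concrete, here is a minimal sketch of how such a tree could behave: only the top-level components show at first, and tapping a parent reveals its children. Everything in it is an assumption for illustration: the `Node` class, the `render` and `toggle` helpers, and the hand-written breakdown of 慰 are hypothetical and are not Pleco's data or code.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One component in a decomposition tree (hypothetical structure)."""
    char: str
    children: list = field(default_factory=list)
    expanded: bool = False


def render(node, depth=0):
    """Return the visible lines: children appear only under expanded parents."""
    if not node.children:
        marker = " "                      # leaf: nothing to expand
    else:
        marker = "-" if node.expanded else "+"
    lines = [f"{'  ' * depth}{marker} {node.char}"]
    if node.expanded:
        for child in node.children:
            lines.extend(render(child, depth + 1))
    return lines


def toggle(node, char):
    """Simulate a tap: flip the first node showing `char`; True if found."""
    if node.char == char:
        node.expanded = not node.expanded
        return True
    return any(toggle(child, char) for child in node.children)


# Illustrative, hand-written breakdown (an assumption, not Pleco's data).
tree = Node("慰", [Node("尉", [Node("尸"), Node("示"), Node("寸")]), Node("心")],
            expanded=True)

print("\n".join(render(tree)))   # main components only
toggle(tree, "尉")               # "tap" the parent component
print("\n".join(render(tree)))   # sub-components now visible
```

The point of the design is that the sub-components of 尉 stay hidden until asked for, so the two main components of 慰 are easy to spot.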

For anyone who was unaware of the "Chars" tab until now, like me: if you highlight a character and press the "字" button on the top right, it brings up another screen where you can look at character info and words containing the character. If you buy the "Stroke Diagrams" add-on, then along with a "Strokes" tab you also get a "Chars" tab, which breaks the character down into its components. You can even search for other characters containing that component. Worth buying for this feature alone, in my opinion.

One more question. I noticed that there is a field "phonetic" in the character info with a numerical code. Doing some searching, you can use this index as a key to show all characters sharing the same phonetic (see attached text file in this thread): http://www.plecoforums.com/viewtopic.php?f=17&t=3174
Would it be possible to add this list of same-phonetic characters into the interface with their pinyin?

mikelove · Jan 15, 2013

pravit said:
One more question. I noticed that there is a field "phonetic" in the character info with a numerical code. Doing some searching, you can use this index as a key to show all characters sharing the same phonetic (see attached text file in this thread): viewtopic.php?f=17&t=3174
Would it be possible to add this list of same-phonetic characters into the interface with their pinyin?

I suppose, but why not just type that Pinyin into the Pleco search field?

pravit · Jan 15, 2013

mikelove said:
I suppose, but why not just type that Pinyin into the Pleco search field?

Because there is no immediate mapping of the numerical code shown in Pleco to the component or Pinyin syllable, e.g. if I look up the character 慰 in Pleco, I get this field:

Phonetic: 1429

Pulling up the text file linked in the other thread, I search through for all instances of 1429 to find every character which shares the same phonetic:

尉 [U+5C09] 1176 1429 (wei4 yu4)
嶎 [U+5D8E] 1429 (yu4)
慰 [U+6170] 1429 (wei4)
熨 [U+71A8] 1429 (yu4 yun4)
罻 [U+7F7B] 1429 (wei4 yu4)
蔚 [U+851A] 1429 (wei4 yu4)
etc.

From this I can infer that phonetic component 1429 = 尉 and that characters with this phonetic are typically pronounced "wei" or "yu".

Doing a Pinyin search on "wei" will return many characters that happen to have the same pronunciation but do not share the phonetic component "尉".

If you know that the phonetic of this character is "尉", then you can search for characters containing that component, but that isn't immediately obvious from the "1429" code.

Essentially it'd be nice if this "1429" code were replaced with something more useful, e.g. the actual component "尉", or a listing of characters with the same phonetic component, since this info seems to be already available in the Unihan data.
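The manual lookup described above can be sketched in a few lines of Python. This is only a rough illustration, assuming the text file uses exactly the line format quoted in this post (character, code point, one or more plain numeric phonetic codes, then readings in parentheses); the names `build_index` and `same_phonetic` are made up for the example, and lines that do not match the assumed format are simply skipped.

```python
import re
from collections import defaultdict

# Sample lines copied from the post above; the real file is much longer.
SAMPLE = """\
尉 [U+5C09] 1176 1429 (wei4 yu4)
嶎 [U+5D8E] 1429 (yu4)
慰 [U+6170] 1429 (wei4)
熨 [U+71A8] 1429 (yu4 yun4)
罻 [U+7F7B] 1429 (wei4 yu4)
蔚 [U+851A] 1429 (wei4 yu4)
"""

# Assumed format: <char> [U+XXXX] <code> [<code> ...] (<reading> ...)
LINE_RE = re.compile(r"^(\S+) \[U\+[0-9A-F]+\] ((?:\d+ )+)\((.*)\)$")


def build_index(text):
    """Map each phonetic code to a list of (character, [readings])."""
    index = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not in the assumed format
        char, codes, readings = match.groups()
        for code in codes.split():
            index[code].append((char, readings.split()))
    return index


def same_phonetic(index, char):
    """Return {code: entries} for every phonetic code that `char` carries."""
    return {code: entries for code, entries in index.items()
            if any(c == char for c, _ in entries)}


index = build_index(SAMPLE)
for code, entries in same_phonetic(index, "慰").items():
    print(code, " ".join(f"{c}({'/'.join(r)})" for c, r in entries))
```

Looking up 慰 this way lists every character filed under code 1429 together with its readings, which is the kind of listing that could stand in for the bare number.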

mikelove · Jan 16, 2013

pravit said:
Because there is no immediate mapping of the numerical code shown in Pleco to the component or Pinyin syllable, e.g. if I look up the character 慰 in Pleco, I get this field:

Hmm, didn't realize that's what that field was for, actually... we'll have to investigate this data set further; we might be able to do something with it now that we know it's there.
