Pinyin Converter Website / App Request (Not PurpleCulture)

Hi there,

I need to find a 汉字 to Pinyin converter that lets me paste the pinyin above the characters into MS Word without MS Word treating it as an image. I've tried Purple Culture but had no luck. Other websites / apps I've used end up with one character per line for a whole paragraph. Or, if it does work, the pinyin is so close together that it's impossible to tell where one word ends and the next begins.

Any suggestions would be much appreciated.

Thanks

Christopher
 

Shun

状元
Hello Christopher, hello rizen,

The best option I know of is to add a pinyin Phonetic Guide in Apple Pages, which you can reach by selecting the Chinese text and right-clicking. This adds the pinyin automatically, but it pays no attention to the spacing between pinyin syllables. Below is the Pages document converted to Word format. To me, it looks barely acceptable (provided the reader already knows which Chinese syllables exist).

Export from Apple Pages to Word.png

This problem was surely solved a long time ago in the Chinese market, considering that there are plenty of books for children and other learners that include nice pinyin lines with the Chinese text. Of course, some manual work is always involved, since the pronunciation sometimes depends on the meaning of the character, or on whether a word is used as a noun or a verb. It might be worthwhile to search the Chinese internet for this type of software.

Google Translate does a pretty good job of adding pinyin. Perhaps that would be the most suitable compromise, since pinyin as separate text will always look good.

Hope this helps,

Shun
 

Attachments

  • Export from Pages to Word format.docx.zip
Thanks Rizen and Shun for your suggestions. Yes, I might just have to go with the pinyin-only solution. But I will first ask Chinese friends about the software you mentioned, as I agree that this must have been done already.
Many thanks
Christopher
 

Shun

状元
Hi Thessaliad,

thanks for the font. As you surely know, it's too bad the pinyin for each character isn't always the same. That would have made the implementation of ruby text a lot simpler. A good piece of pinyin annotation software needs to have a large dictionary and AI to get a low error rate. In some cases, the pinyin will still be wrong, because the computer still can't understand the meaning of sentences at a pragmatic level. In search of a good example for this problem, I found the following on Wikipedia:

"The sentence "You have a green light" is ambiguous. Without knowing the context, the identity of the speaker or the speaker's intent, it is difficult to infer the meaning with certainty. For example, it could mean:
  • the space that belongs to you has green ambient lighting;
  • you are driving through a green traffic signal;
  • you no longer have to wait to continue driving;
  • you are permitted to proceed in a non-driving context;
  • your body is cast in a greenish glow;
  • you possess a light source which radiates green; or
  • you possess a light with a green surface."
I would guess that in the writing of a Chinese book, the effort needed to go over the ruby text isn't that great compared to the writing of the book itself, so the gold standard for good ruby text probably still is editing done by humans, perhaps aided by a computer for the initial annotation.
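
To make the "computer first, human afterwards" idea a bit more concrete, here is a minimal Python sketch. It assumes a tiny, made-up table of readings (`READINGS`) and a hypothetical helper `first_pass`; neither comes from any real annotation tool. The computer proposes the first listed reading and flags every character that has more than one, so the human editor only has to check the flagged ones.

```python
# Hypothetical table of possible readings per character (illustration only).
# The first entry is treated as the default guess.
READINGS = {
    "的": ["de", "dī", "dí", "dì"],
    "好": ["hǎo", "hào"],
    "处": ["chù", "chǔ"],
    "我": ["wǒ"],
    "爱": ["ài"],
}


def first_pass(text):
    """Return (character, guessed_pinyin, needs_review) triples.

    A character needs review if it has several readings or is
    missing from the table altogether.
    """
    rows = []
    for ch in text:
        options = READINGS.get(ch)
        if not options:
            rows.append((ch, None, True))
        else:
            rows.append((ch, options[0], len(options) > 1))
    return rows


for ch, guess, review in first_pass("我爱好"):
    marker = "  <- check by hand" if review else ""
    print(ch, guess, marker)
```

A real tool would of course need a far larger dictionary and some word-level context, but the division of labour would probably look similar.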

Did you know this site? It looks all right:


This app "Chinese Annotator" for 1 USD also looks good, with positive reviews:


Cheers, Shun
 

Thessaliad

Member
Hi Shun! Thanks for the links!

Oddly enough, I came across this very situation looking at the tone shifts in yi1 to yi2 when I was removing duplicates from a flashcard deck in Pleco. Pleco's dictionary reflects both variants, but it would be impossible in this font.

Of course, the Chinese Pronunciation Wiki came to the rescue! They explained why the tone changes are (unfortunately) not reflected in the text:

Normally the tone changes above are not written in the pinyin; you are supposed to just know the rule and apply it if you say the word(s) aloud.

Which certainly explains why the pinyin showed one thing and native speakers said something else. I focus on memorizing the speech. :)
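
For what it's worth, the basic rule for 一 is simple enough to sketch in Python. This is only a rough illustration under my own assumptions: the function names `yi_spoken` and `apply_yi_sandhi` are made up, tones are written as numbers, and it ignores special cases such as ordinals, digits read out one by one, and neutral-tone syllables that were originally fourth tone. The rule used here is: 一 becomes yi2 before a fourth tone, yi4 before a first, second or third tone, and stays yi1 otherwise.

```python
def yi_spoken(next_tone):
    """Spoken form of 一 given the tone number of the following syllable."""
    if next_tone == 4:
        return "yi2"
    if next_tone in (1, 2, 3):
        return "yi4"
    return "yi1"  # nothing usable follows: keep the written tone


def apply_yi_sandhi(pairs):
    """Take (character, numbered_pinyin) pairs and return the spoken pinyin.

    Only the character 一 is changed; other syllables spelled "yi1"
    (for example 医) are left alone.
    """
    spoken = []
    for i, (char, pinyin) in enumerate(pairs):
        if char == "一" and i + 1 < len(pairs):
            next_pinyin = pairs[i + 1][1]
            next_tone = int(next_pinyin[-1]) if next_pinyin[-1:].isdigit() else None
            pinyin = yi_spoken(next_tone)
        spoken.append(pinyin)
    return spoken


print(apply_yi_sandhi([("一", "yi1"), ("定", "ding4")]))
print(apply_yi_sandhi([("一", "yi1"), ("起", "qi3")]))
```

The written pinyin stays yi1 in both cases; only the spoken form changes, which matches what the wiki says.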

My search for a ruby pinyin font came from wanting to read webnovels with audio and follow along. Didn't know about Edge's TTS until I had created a TTS in Azure, haha.

I'm trying to use the text of my current novel as a basis for sentence mining and making up my own sentences. I just bumped into Pierre's Python script, so now I guess I get to learn how to play with Python. Still can't believe that reading danmei is what got me into programming.
 

Shun

状元
Hi Thessaliad,

You're welcome! You're absolutely right about the tone changes not being written down. But about one tenth of all Chinese characters, so I've read, have more than one pronunciation. A few examples of such homographs / 多音字:

打的 dǎdī
我的 wǒ de
好处 hǎochu
处于 chǔyú
别处 biéchù
爱好 àihào
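
Here is a rough Python sketch of why looking up whole words rather than single characters helps with homographs like these. Everything in it is an assumption for illustration: the two small dictionaries are made up from the examples above, and `annotate` is a hypothetical helper that does a greedy longest match, falling back to a single default reading per character.

```python
# Toy word-level dictionary (hypothetical, for illustration only).
WORD_PINYIN = {
    "打的": "dǎdī",
    "我的": "wǒ de",
    "好处": "hǎochu",
    "处于": "chǔyú",
    "别处": "biéchù",
    "爱好": "àihào",
}

# Fallback: one default reading per character (also made up for this sketch).
CHAR_PINYIN = {
    "打": "dǎ", "的": "de", "我": "wǒ", "好": "hǎo",
    "处": "chù", "于": "yú", "别": "bié", "爱": "ài",
}

MAX_WORD_LEN = max(len(word) for word in WORD_PINYIN)


def annotate(text):
    """Return (chunk, pinyin) pairs, preferring the longest dictionary word."""
    result = []
    i = 0
    while i < len(text):
        # Try multi-character words first, longest candidates first.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 1, -1):
            chunk = text[i:i + length]
            if chunk in WORD_PINYIN:
                result.append((chunk, WORD_PINYIN[chunk]))
                i += length
                break
        else:
            # No word matched: fall back to the single-character reading.
            ch = text[i]
            result.append((ch, CHAR_PINYIN.get(ch, "?")))
            i += 1
    return result


print(annotate("爱好"))
print(annotate("好"))
```

With only character-level lookup, 好 would always come out as hǎo; the word entry is what lets 爱好 come out as àihào. Sentences where the right reading depends on meaning would still need a human.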

Have fun with Python, and do share your results.

Cheers, Shun
 

Thessaliad

Member
Hey Shun,

You're right! I forgot about all those pesky homographs (facepalm).

Incidentally, I typed those words in the ruby font, and only 别处 biéchù came out correct. That's a failing grade by anyone's account. You'd think the designers would have accounted for homographs.

While I like Windows IME's predictive capability, I sometimes get mixed up when I'm trying to type tone numbers. I like typing the tone numbers as a way of reinforcing my memory, à la Glossika.
 

Shun

状元
Hey Thessaliad,

no problem! Yeah, in the end the ruby text is still going to be done by hand. No computer is good enough for that yet. ;)

That certainly is a good way of practicing tones.

Shun
 