TTS System Question

mikelove

皇帝
Staff member
We're looking into incorporating an offline text-to-speech system from NeoSpeech as one possible way of meeting the many requests people have had for some sort of system for reading documents etc. aloud, but it seems that, due to technical limitations, we would only be able to include one of their two Chinese voices in our app.

So if you've got a free minute, could you go to their website, try out the two Chinese voices in the demo box on the left side of the page, and let us know what you think? Which one do you like better, male or female, and are they good enough in general for you to consider them worth buying as a paid upgrade over the current syllable-by-syllable / word-by-word audio system?

Thanks.
 

gato

状元
I tried both voices with this news article below.
http://news.qq.com/a/20101220/001942.htm
上海今日发生两起高楼火灾 相关人员紧急疏散

The female voice sounds much more natural and clear than the male one. Aside from being slightly robotic, the male voice seems to have some kind of buzz or feedback in the background.
 

numble

状元
I also like the female voice better. I think we're conditioned for robotic male voices, so his robotic-ness stands out.
 
I also have a preference for the female voice (Hui); the cadence is really quite amazing. We've come a long way from Microsoft Sam, haven't we? :)
 

mikelove

皇帝
Staff member
Thanks for the feedback - Hui sounds better to me too, and makes up for the fact that the female voice recordings in our regular audio add-on are of slightly lower quality than the male ones.
 
Having a real TTS would be brilliant. I almost never use the current system to read individual characters/words, but if I could have entire texts read aloud, that would be a whole other matter. And the female voice is really good! My ex-gf in China used to listen to English audiobooks while reading along in the texts, and found that it helped both her pronunciation/listening and gently increased her reading speed.
 

mikelove

皇帝
Staff member
Not looking so good for licensing this particular one, actually, but we've got another option lined up that we think is pretty comparable - downside is that it requires an internet connection, but the upside is that it's free (at least for now) and we can get a rudimentary version of it implemented possibly as soon as 2.2.2.
 
That's too bad about not being able to license that one; the woman's voice sounds extremely realistic. Just curious, is it a budget issue, inflexible licensing terms, or a compatibility-related problem?
 

sburkle

Member
I'm disappointed to hear that licensing won't work out with this one; the woman's voice sounds really good! The man's, on the other hand, sounds very Microsoft Sam-ish.
 

kun4

举人
mikelove said:
downside is that it requires an internet connection

If there's a web site out there which offers free text-to-speech, please tell us all about it. But please, please don't convert Pleco into just another spruced-up graphical interface for a web site.

One of the main features of Pleco is that, once installed, it just works. Yes, Pleco costs money, but I feel it's worth it. Pleco is self-contained, meaning it works reliably even in places where phone coverage is spotty, or in places where asking for an internet connection is met with polite smiles. As no internet access is needed, there are no "data roaming" charges, no unpleasant surprises when the phone bill arrives.

To me, Pleco is a notch above many apps which are merely a front-end to Yellowbridge, Google Translate or other web sites.
 

mikelove

皇帝
Staff member
taiwanshaun said:
That's too bad about not being able to license that one; the woman's voice sounds extremely realistic. Just curious, is it a budget issue, inflexible licensing terms, or a compatibility-related problem?

License isn't dead yet, I just don't want everyone to get their hopes up too much... it's primarily a licensing terms issue. More of an Apple problem than a licensor problem, they're very nice people but it's just difficult to get everything lined up legally... ironically it wouldn't be a problem at all on Android, but we feel like built-in Chinese TTS support on Android is imminent (already supported for several other languages) and it's probably best to just use whatever Google builds into the OS (as we plan to do with speech recognition).

kun4 said:
If there's a web site out there which offers free text-to-speech, please tell us all about it. But please, please don't convert Pleco into just another spruced-up graphical interface for a web site.

One of the main features of Pleco is that, once installed, it just works. Yes, Pleco costs money, but I feel it's worth it. Pleco is self-contained, meaning it works reliably even in places where phone coverage is spotty, or in places where asking for an internet connection is met with polite smiles. As no internet access is needed, there are no "data roaming" charges, no unpleasant surprises when the phone bill arrives.

To me, Pleco is a notch above many apps which are merely a front-end to Yellowbridge, Google Translate or other web sites.

There are several free TTS websites out there, actually, but in this case we'd be using one of them as a prelude to ultimately running our own - it turns out there's a very good Chinese TTS system which, unlike most of the others out there, can be hosted on our own server without paying some sort of silly fee per minute of audio generated, so we'd be building in one of the free ones now and switching to our own server if that proves popular.

We have no intention of converting Pleco into a glorified website front-end, but there's been a lot of interest lately in things that require an internet connection (Google Translate, options to redirect search terms to online dictionaries / web searches for usage examples) and I'm starting to think my attitude of absolutely refusing to do any of that no longer makes sense - if there's a useful function that we simply can't provide offline then an online version may be better than nothing. I don't think any of these things would be core features or even be on by default, but at least for some people they might be worth having - enhancing our offline features with the option of bringing in additional online content can only make things better.
 

mikelove

皇帝
Staff member
So it looks like we're going to be able to work this out after all - turned out that the issue was due to a misunderstanding on my part of a particular Apple requirement. No exact prediction on when this will make it into Pleco, but almost certainly by version 2.3 if not before.
 

mikelove

皇帝
Staff member
gato said:
This is going to be awesome, Mike. You should post another video when it's ready.

Makes sense... between that and a 2.2.2 OCR demo and maybe something in the yes-it-does-exist vein for Android users I think our YouTube page could be quite busy in the next month or two.
 

mikelove

皇帝
Staff member
numble said:
Do you have any comments on Nuance's capabilities?

They don't seem to do anything much for Chinese; also, anything they do on iOS is likely to require a network connection, not to mention that it would rely on a bunch of US-based servers. So at least for text-to-speech, since it's feasible both technologically and financially for us to offer an entirely offline solution, I think that's probably still the best bet.

There isn't much out there in the way of offline Chinese voice recognition for iOS, but I'm hoping that Google responds to the Apple-Nuance deal by making the speech recognition APIs they currently employ on Android / Chrome available on other platforms as well; their Chinese speech recognition is quite good and, all other things being equal, I'd rather rely on the same algorithm on Android and iOS to deliver a more consistent experience. (We're soon going to be updating our handwriting library on iOS to the latest version of Hanwang's algorithm in order to match the version we're deploying on Android.)
 