The Cantonese Jyutping for 裡 is wrong: it shows lei5 (incorrect) where it should be leoi5 (correct). Can this please be fixed?

mikelove

皇帝
Staff member
Sorry, do you have a source for that? I see several non-Pleco dictionaries listing that pronunciation, including Wiktionary and CantoDict, and it’s also in Unihan. I do see leoi5 too but my impression is that both are valid (and we include both in Pleco).
 
My source was simply an Anki deck (I'm using both the Anki deck and Pleco to supplement my learning), but I cross-referenced it with family, and the audio on Pleco (lei5) didn't match what I was hearing from either the Anki deck or my family.

Are you sure that the "lei5" isn't corresponding to the "裏" character, instead of the "裡" character, which should be "leoi5"?
 

mikelove

皇帝
Staff member
There are enough sources saying that '裡' can be 'lei5' that I'm not inclined to discard that pronunciation altogether - maybe it's a regional variation? If you're trying to make a Pleco flashcard with that specific mapping and it's giving you lei5 instead, you can use the 'change dictionary entry' command in the flashcard info screen to pick a different dictionary entry (both CCY and WHK have them) that uses leoi5, or 'convert to custom card' and edit it manually.
 
Can you look through the provided examples for the "裡" character?

Throughout all of the examples listed on Pleco, I only see/hear Cantonese audio for the Jyutping "lei5". Can audio also be provided for "leoi5" for the "裡" character?

In particular, this example for the noun sounds really weird with "lei5" instead of "leoi5".

指內部 inside

他住在城裡。[I have never heard this as "sing4 lei5"] [I have heard "sing4 leoi5"]
He lives in the town.

▶ 裡間, 裡應外合, 表裡如一

~~~~~

With regards to regional pronunciation, which pronunciation does Pleco base the Cantonese Jyutping off of?

My family speaks Hong Kong Cantonese.
 

mikelove

皇帝
Staff member
Sorry, are you seeing Cantonese example sentences with a printed 'lei5' reading under them? Which dictionary? If the example doesn't have a printed Cantonese reading and you're just hearing it in the audio, then the 'lei' is being auto-generated by the TTS engine - are you using the built-in system TTS (Apple or Google?) or our paid Cantonese TTS add-ons?

We don't currently try to track specific regional variations, we just try to show any reading we have reason to think is valid. We would be delighted if somebody made a Cantonese dictionary that got into details about regional pronunciation differences and where they're used, and would be happy to license + make use of that information if they did, but as far as we know nobody's done that yet and we don't really have the resources to develop something like it internally.
 

mikelove

皇帝
Staff member
I just did a quick test on this end and it looks like our paid Cantonese TTS add-ons are reading 裡 as 'leoi5' but both the Apple and Google ones are reading it as 'lei5.' Which is interesting... maybe they're going by the reading in Unihan?
 

mikelove

皇帝
Staff member
No, this wasn't a fix we made on our end, it's just what I found from testing - our paid TTS engines read 裡 as 'leoi5' already and Apple and Google's built-in system ones seemingly do not. So the one place where we actually have the ability to override the way it reads 裡 out loud in sentences, it seems to already be using your preferred reading.

Overriding Apple's engine is dicey and Google's is impossible, so if those are what you're using and they're reading sentences that way then I'm afraid all I can really do is file bug reports with them. (but since it would be replacing lei5 with leoi5 rather than supporting both, it would help if I had a more unambiguous reference to point to to say that it should be pronounced leoi5 and not lei5)
 

Akizhuzhu

Member
Words.hk proposes 裏 leoi5
Canto Dict says 裡 leoi5 lei5

Humanum (https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/) says both exist, and leoi5 is a variant of lei5

shyyp.net has both leoi5 and lei5

Using VF fonts, the first reading of 裏 is leoi5, but it is not authoritative and can be changed through the font syntax

A native speaker from Foshan: first reaction was lei5; when asked about leoi5,
changed to leoi5 and was unsure whether lei5 maybe had a special context only

ZDIC lists both lei5 and leoi5

HK Education Bureau lists leoi5

The Graphical Cantonese Generator proposes leoi5
not authoritative, and can be edited

To see what people actually use, there is the Hong Kong Cantonese Corpus (HKCanCor), a resource for studying conversational Cantonese:

Downloading the full text
UTF8 (zipped, transcriptions only, no sound)
and checking the file FR-R013a_v for 裏 (not 裡) gives

裏邊/f/leoi5bin6/
心裏/f/sam1leoi5/

I checked all files, and I found only those two words.
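For anyone who wants to repeat that check without reading the files by hand, here is a minimal sketch of the same tally in plain Python. It does not use PyCantonese. It assumes every token has the `word/POS/jyutping/` shape of the two examples above and that tokens are separated by whitespace; the function name `tally_readings` and the regular expressions are my own, not part of the corpus or any library.

```python
import re
from collections import Counter

# Assumed token shape, based only on the two examples quoted above:
#   word/POS/jyutping/   e.g. 裏邊/f/leoi5bin6/
TOKEN_RE = re.compile(r"([^\s/]+)/([^\s/]*)/([a-z]+[1-6](?:[a-z]+[1-6])*)/")
# One Jyutping syllable: lowercase letters followed by a tone digit 1-6.
SYLLABLE_RE = re.compile(r"[a-z]+[1-6]")


def tally_readings(text, target):
    """Count the Jyutping syllables that line up with `target` in tagged text."""
    counts = Counter()
    for word, _pos, jyutping in TOKEN_RE.findall(text):
        syllables = SYLLABLE_RE.findall(jyutping)
        # Skip tokens where characters and syllables do not pair up one-to-one.
        if len(syllables) != len(word):
            continue
        for char, syllable in zip(word, syllables):
            if char == target:
                counts[syllable] += 1
    return counts


sample = "裏邊/f/leoi5bin6/ 心裏/f/sam1leoi5/"
print(tally_readings(sample, "裏"))  # Counter({'leoi5': 2})
```

To run it over the real download, read each transcription file as UTF-8 and pass its text to `tally_readings`, once with 裏 and once with 裡, then add up the counters. If the real files deviate from the assumed token shape, the pattern would need adjusting.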

There is a PyCantonese library that allows Python searches on other corpora.
PyCantonese comes with one built-in corpus, the Hong Kong Cantonese Corpus.
For corpora other than HKCanCor, PyCantonese provides the function read_chat() to read in Cantonese data in the CHAT format.

Someone with more skills than me could try searching for 裏 in other corpora with this Python library and see what the result is.
 
Oh my goodness, thank you so much for bringing the receipts!!!

MikeLove, now that there's proof for leoi5, can you please make the necessary changes so that 裡 is leoi5 instead of lei5? Lei5 should be 裏.

Also curious, MikeLove, do you speak Cantonese, yourself?
 

mikelove

皇帝
Staff member
Thanks!

Sorry, this was proof I was seeking to file a bug report with Apple/Google to hopefully get their TTS engines updated to use that reading, so that sentences with 裡 will be read with leoi5 instead of lei5. Having a lot of evidence like this is very helpful in convincing them. (as a bonus it'll also benefit anybody else listening to Cantonese read aloud on their phone) Wasn't your complaint that you were hearing it read as 'lei5' and it sounded weird?

We already use 'leoi5' in our TTS engines, so there's not anything we would need to do on our end for those - if you play a sentence with 裡 with Kaho or Kayan it should already pronounce it that way. (if there's a particular sentence where it doesn't, let me know and I'll investigate, but it was consistent in my testing at least) We support both readings in our dictionaries, which seems correct, though we can look at adjusting the order pronunciations are listed in some of them - it looks like if you sort CC ahead of PLC you'll generally get the leoi5 reading first.

And no, I don't speak much Cantonese - when we update our Cantonese dictionaries we generally do it in a big batch of changes that we farm out to outside editors who do.
 