why are British spellings sometimes included?

mikelove

皇帝
Staff member
JimmyTheSaint said:
If it can be reduced to a straightforward batch string-search-and-replace from a reference list of alternative spellings, then it's quite fair to ask for consistency of spelling convention--whether British or American--within each dictionary. Mike, however, says it's too labor intensive to be cost effective. But it's sloppy. It's easy enough for the average customer to get used to overlooking it, unfortunately, but with language professionals, such inconsistency reflects badly on the product. When's the last time you saw a published text with mixed British/American spelling conventions?

Have you spent a lot of time with printed Chinese-English dictionaries? A great many of them have similar errors - these errors in PLC come entirely from the printed title we licensed to base it on, so you're asking us to fix errors in that printed text rather than errors that we made ourselves. Dictionaries are simply too big and low-margin for everything to be perfect - even the OED has the occasional embarrassing glitch.

I do agree that we ought to do something to straighten this out for full-text searching - not hard, we'd just build a little list of British and American spellings into our app and automatically search on both when you type in either one - but with all of the other editorial projects we're working on it's very hard to justify going through and rewriting every entry to use the same spelling; that's time we could be using to, say, clean up one of our not-so-nicely-formatted dictionaries (like 21C), or for that matter to adapt a newly-licensed dictionary. Even in PLC specifically, the time might be better put towards updating some outdated example sentences, or adding entries for some more of the thousands of missing words people have reported to us. Mixed British and American spelling might look sloppy, but not having an entry for 超市 (as we didn't until last summer) or having an example sentence with embarrassing 70s-era political rhetoric in it (as we still have quite a number of) looks considerably sloppier.
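To make the "search on both when you type in either one" idea concrete, here is a minimal sketch in Python. It is not Pleco's actual code: the pair list, `expand_query`, and `full_text_search` are hypothetical names invented for illustration, and a real list of spelling pairs would be far longer.

```python
import re

# Hypothetical, tiny sample of British/American spelling pairs.
SPELLING_PAIRS = [
    ("colour", "color"),
    ("realise", "realize"),
    ("centre", "center"),
]

# Map each spelling to the full set of its variants, in both directions.
VARIANTS = {}
for british, american in SPELLING_PAIRS:
    VARIANTS[british] = {british, american}
    VARIANTS[american] = {british, american}


def expand_query(word):
    """Return every spelling to search for when the user types `word`."""
    key = word.lower()
    return VARIANTS.get(key, {key})


def full_text_search(word, entries):
    """Return entries containing any spelling variant of `word` as a whole word."""
    alternatives = "|".join(re.escape(v) for v in sorted(expand_query(word)))
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [entry for entry in entries if pattern.search(entry)]
```

With this approach the dictionary text itself is never rewritten; only the query is widened, which is why it is much cheaper than editing every entry.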
 

LantauMan

进士
I'm late to this discussion, but as an American long-term resident of Asia, I don't see what the problem is with mixed US/UK usage in Pleco's various English dictionaries. US English is still the minority usage in printed media outside of the United States. Even in speech, try living in Hong Kong or Singapore, or among expat communities elsewhere, and the hodgepodge of spoken English is rampant. My US-passport-holding born-in-Asia kids tell me, in pure California accents, how "cross" they are, not "angry", at the "rubbish" (not "garbage") they learn in school. Their English-accented friends talk about stopping at the "gas station" rather than the "petrol station". And so on; it goes both ways.

If one is learning Chinese, it sort of implies an international outlook, so an insistence on being consistently either American or British in spelling actually goes against the grain of international English. I think Mike should leave well enough alone and not waste his time Americanizing any of the dictionaries.
 
LantauMan said:
an insistence on being consistently either American or British in spelling actually goes against the grain of international English.

Please provide an example of a respected publisher whose publications contain an inconsistent mix of British and American spelling conventions.
 

LantauMan

进士
JimmyTheSaint said:
LantauMan said:
an insistence on being consistently either American or British in spelling actually goes against the grain of international English.

Please provide an example of a respected publisher whose publications contain an inconsistent mix of British and American spelling conventions.

I never said that about publishers. If you want to be a stickler about it, the majority of publications published in English in Asia use British spelling. A few follow Australian usage (annoyingly, a newspaper I used to work for switched from British to Australian and then back again). So asking for consistent American spelling in PLC is barking in the wrong direction. In my kids' international school, which follows a British curriculum, either American or British usage is acceptable. However, I take your point that publications and school assignments are normally consistent within one spelling track. Nevertheless, the actual usage of English outside native English-speaking countries is a stew of accents, vocabulary and, yes, spelling. If an ESL student in Beijing spelled color with a U, would you mark them as incorrect?

In my opinion, in an open source dictionary such as PLC, an eclectic mix of spellings is to be expected. It should be accepted for what it is, without placing restrictions on it, so that it continues to expand based on voluntary effort. I honestly see nothing wrong with it.

But I also see that this is an unresolvable argument against your equally valid point, so I don't plan to drag it out. I'd rather see the Pleco folks put their efforts into something else--an English-Mandarin slang and idiom dictionary maybe?
 

mikelove

皇帝
Staff member
I'll reiterate that I think we do need to handle this properly on the search side (along with stemming, so singular forms of nouns will match plurals and so on) but probably not right away on the text side since that's a lot more labor-intensive.
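As a rough illustration of what stemming on the search side could mean, the sketch below strips a few regular English plural endings so that a singular query matches a plural in the text and vice versa. It is deliberately crude and is not a real stemming algorithm such as Porter's; `naive_stem` and `matches` are hypothetical names, and irregular plurals are not handled at all.

```python
import re


def naive_stem(word):
    """Crudely reduce a regular English plural to its singular form."""
    w = word.lower()
    if len(w) > 4 and w.endswith("ies"):
        return w[:-3] + "y"  # "dictionaries" -> "dictionary"
    if len(w) > 4 and w.endswith(("ches", "shes", "sses", "xes")):
        return w[:-2]  # "boxes" -> "box"
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]  # "emperors" -> "emperor"
    return w


def matches(query, text):
    """True if any word in `text` shares a stem with `query`."""
    target = naive_stem(query)
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(naive_stem(token) == target for token in tokens)
```

A production search would more likely use an established stemmer or a lemma table, since suffix rules this simple mis-handle many words.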

PLC isn't open-source, though - it's a licensed dictionary that we've been heavily modifying in-house. And the errors you're complaining about are actually in the original title as well - however, it's tough to prioritize fixing them over (say) rewriting the example sentences to no longer sound like 70s propaganda.
 
Please consider allowing user-defined pairings as well when implementing the search enhancements. I'd imagine this could work similarly to Word, since they've been doing substitution stuff for a long while (albeit in a different context and for a different purpose). The point with Word is just that they've got a fairly stable user model for substitutions, etc.

I'm looking forward to an enhanced search feature.
 

mikelove

皇帝
Staff member
stephanhodges said:
Please consider allowing user-defined pairings as well when implementing the search enhancements. I'd imagine this could work similarly to Word, since they've been doing substitution stuff for a long while (albeit in a different context and for a different purpose). The point with Word is just that they've got a fairly stable user model for substitutions, etc.

Maybe at some point, but initially we'll probably be doing it on the database encoding side for performance reasons. (Word doesn't have to search 8 different 20-50 MB data files for matches *on a smartphone* and return the results in less time than it takes to move your finger to the next keyboard key.)
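One possible reading of "on the database encoding side" is that spelling variants get folded to a single key when the search index is built, so that each query costs one lookup instead of several scans. The sketch below illustrates that idea only; it is an assumption about the general technique, not a description of Pleco's file format, and `CANONICAL`, `normalize`, `build_index`, and `lookup` are hypothetical names.

```python
import re
from collections import defaultdict

# Hypothetical sample mapping from variant spellings to one canonical key.
CANONICAL = {"colour": "color", "realise": "realize", "centre": "center"}


def normalize(word):
    """Fold a word to the canonical spelling used as the index key."""
    key = word.lower()
    return CANONICAL.get(key, key)


def build_index(entries):
    """Build an inverted index once, mapping normalized words to entry positions."""
    index = defaultdict(set)
    for position, text in enumerate(entries):
        for token in re.findall(r"[a-z]+", text.lower()):
            index[normalize(token)].add(position)
    return index


def lookup(index, word):
    """Answer a query with a single dictionary lookup on the normalized key."""
    return sorted(index.get(normalize(word), set()))
```

The trade-off is that the variant list is baked in at build time, which is consistent with user-defined pairings being harder to support in this design than in a query-time approach.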
 

alanmd

探花
JimmyTheSaint said:
Please provide an example of a respected publisher whose publications contain an inconsistent mix of British and American spelling conventions.

Easy: just name pretty much any Canadian publisher! Canadian English is officially a mix of UK and US English spelling [1], but the rules are not too well defined and there are inconsistencies everywhere. A Google search in a well-respected publication such as The Globe and Mail turns up a mix of UK/US spelling [2]. I demand that all dictionaries should be converted to Canadian spellings (which include 'aboot' for 'about' of course).

[1] http://en.wikipedia.org/wiki/Canadian_E ... ctionaries
[2] https://www.google.ca/search?hl=en&safe ... ze+realise
 

abhoriel

Member
what i would imagine most people do: type "color" or "colour" (depending on preference) into the dictionary. if no relevant results are found, try the other one.

I agree that consistency is nice, but this is a ridiculously minor issue and I don't think it's a simple fix as suggested (dictionaries are third party; simple substitutions will not catch every case). I don't think we need an argument over this.
 

mikelove

皇帝
Staff member
abhoriel said:
what i would imagine most people do: type "color" or "colour" (depending on preference) into the dictionary. if no relevant results are found, try the other one.

Actually, we fixed this as of 2.4 - for full-text searches the software automatically matches both, regardless of which one you type.
 