The MoE dictionary is now open source

Thanks so much, together these two responses did the trick. At first I was scratching my head over why I couldn't find the "Add user dictionary" option, until I saw the second post and added Flashcards. Good to go now, thanks!

I am a US-trained eye surgeon based in China for the last 10 years, working at a Chinese danwei in Guangzhou. I need professional-level language tools, as all of our meetings, projects, forms, protocols etc. are in Chinese (which is part of the appeal of the job!). I always thought of Pleco as entry level and relied on the (still very good) topline Casio e-dictionary. But now that a few C-C dictionaries have been added to Pleco, especially Guifan and HYDCD, and the responses from the different dictionaries have been integrated, Pleco is a professional tool for me, as I can't beat the convenience of having it all on my phone. Nice, too, to read Chinese novels with the option of clicking for a definition in the reader. I expect Pleco will now be my go-to reference source. Now if we could just get a general medical dictionary (E-C, C-E), a chengyu dictionary, Xiandai Hanyu Cidian all on Android...and the HYDCD 2010 supplement! A long wish list! But this is now a really great product for all levels of users. And I was very impressed to get an answer to my first query within hours from Mike Love himself. Can't beat that for service!

Best regards,
Nathan
 

mikelove

皇帝
Staff member
Thank you!

Medical and Chengyu are coming - only reason they're not on Android already is that we don't want to have to backport them to our old database format when we're about to support the new one in 3.0. HYDCD supplement as well, we've got that licensed but just haven't released it yet. Xiandai Hanyu Cidian is a tricky license to get and we're not sure if it adds enough to what we have in Guifan to justify getting; Xiandai Hanyu Da Cidian we do have a license for, though (different publisher - same one as HYDCD) so that probably will show up at some point.
 
All sounds great! You have a new long-term customer, and I will certainly be showing off the HYDCD to my friends here in GZ: expect a few sales!

Nathan
 

alex_hk90

状元
EDIT: An official Pleco version of MoEDict is now available.

EDIT: Migrated information from this thread to Pleco-User-Dictionaries repo (GitHub).

Right, I've finally got round to looking at this again and cleaned up some of the SQL for the MoEDict Pleco conversion (optimising and formatting the queries better), as well as updating for new source data (based on the latest commit - 18 Nov 2014 - here, derived from the original source 201311 file here; MoEDict-04c was based on the 201305 source file), and merged the notes for the Simplified conversion into a single Bash script.

MoEDict Pleco Conversion 05 script (SQLite) with comments:
(attached as 20150123-MoEDict-Pleco-05.sql.txt)

Resulting output (MoEDict-05) from the above (traditional characters only, as the original data):
- Pleco flashcards (.txt) (165,814 entries):
http://www.mediafire.com/download/3j8un86dri4stjv/MoEDict-05-cards.txt
https://www.dropbox.com/s/3g8ewbvtdwzwdh9/MoEDict-05-cards.txt.7z
- Pleco user dictionary file (.pqb) (165,814 entries):
http://www.mediafire.com/download/b7b7hx91qet7b6q/MoEDict-05.pqb.7z
https://www.dropbox.com/s/s5amxb0cvnz5kj2/MoEDict-05.pqb.7z

Compared to MoEDict-04c (using a simple diff) there appears to be something like 4 new entries, 9 or so Hanzi replaced in the headwords and 399 or so modified definitions. For details see the links to the source data above.
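For anyone who wants to repeat that kind of comparison, here is a rough Python sketch. It assumes each flashcard line is tab-separated as headword, pinyin, definition and that category header lines start with "//"; both are assumptions about the export layout rather than something taken from the files above. The names parse_cards and compare_cards are hypothetical, and the sample rows are invented placeholders, not MoEDict data.

```python
def parse_cards(text):
    """Map (headword, pinyin) -> definition for each tab-separated line."""
    cards = {}
    for line in text.splitlines():
        # Skip blank lines and (assumed) category header lines.
        if not line.strip() or line.startswith("//"):
            continue
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # Not a complete card under the assumed layout.
        headword, pinyin = parts[0], parts[1]
        cards[(headword, pinyin)] = "\t".join(parts[2:])
    return cards


def compare_cards(old_text, new_text):
    """Return (added, removed, modified) keys between two card files."""
    old, new = parse_cards(old_text), parse_cards(new_text)
    added = sorted(key for key in new if key not in old)
    removed = sorted(key for key in old if key not in new)
    modified = sorted(key for key in new if key in old and new[key] != old[key])
    return added, removed, modified


if __name__ == "__main__":
    # Invented placeholder rows, only to exercise the functions.
    old_sample = "A\ta1\tfirst\nB\tb2\tsecond\n"
    new_sample = "A\ta1\tfirst (revised)\nB\tb2\tsecond\nC\tc3\tthird\n"
    added, removed, modified = compare_cards(old_sample, new_sample)
    print(len(added), "added;", len(removed), "removed;", len(modified), "modified")
```

Note that a headword with a replaced Hanzi would show up here as one removed entry plus one added entry, so the counts will not line up exactly with a line-based diff.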

MoEDict Simplified Conversion 03 script (Bash and SQLite) with comments:
(attached as 20150123-Conversion-Simp03.sh.txt, supporting "Conversion-*.csv" input files Conversion-Simp03.zip)

Resulting output (MoEDict-05-Simp03) from the above (using the same data, in fact via importing the resulting flashcards, but with simplified headwords added):
- Pleco flashcards (.txt) (165,814 entries):
http://www.mediafire.com/download/2orvm92ig6118u1/MoEDict-05-Simp03-cards.txt
https://www.dropbox.com/s/avy920um3wuijpl/MoEDict-05-Simp03-cards.txt.7z
- Pleco user dictionary file (.pqb) (165,814 entries):
http://www.mediafire.com/download/642ykctfv9a4300/MoEDict-05-Simp03.pqb.7z
https://www.dropbox.com/s/mby9xmnix3vdzcr/MoEDict-05-Simp03.pqb.7z
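To give a rough idea of what "simplified headwords added" means for a flashcard line, here is a minimal Python sketch. It is not the attached Bash/SQLite script: it assumes a plain character-by-character mapping, a tab-separated line with the headword first, and a "simplified[traditional]" headword layout, all of which are assumptions for illustration. The function names and the one-entry mapping are hypothetical; the real conversion relies on the "Conversion-*.csv" files, whose contents are not shown here.

```python
def to_simplified(headword, char_map):
    """Replace each character that has a mapping; leave the rest unchanged."""
    return "".join(char_map.get(char, char) for char in headword)


def add_simplified_headword(line, char_map):
    """Rewrite 'trad<TAB>rest' as 'simp[trad]<TAB>rest' when the forms differ."""
    headword, separator, rest = line.partition("\t")
    simplified = to_simplified(headword, char_map)
    if simplified == headword:
        return line  # Nothing to add when both forms are identical.
    return f"{simplified}[{headword}]{separator}{rest}"


if __name__ == "__main__":
    # Hypothetical one-entry mapping, standing in for a real conversion table.
    sample_map = {"國": "国"}
    print(add_simplified_headword("中國\tzhong1 guo2\tChina", sample_map))
```

A per-character mapping like this cannot handle characters whose simplified form depends on the word they appear in, which is one reason a real conversion needs more than a single lookup table.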

As before I would recommend only installing one of the two (MoEDict-05 or MoEDict-05-Simp03) user dictionaries above (or changing one of the icon abbreviations from MOE). And as always you need the paid flashcard add-on to load this (or any) user dictionary.

If you have any feedback or suggestions for improvements, let me know in this thread. :)
 

Attachments

  • 20150123-MoEDict-Pleco-05.sql.txt (15.7 KB)
  • 20150123-Conversion-Simp03.sh.txt (14.1 KB)
  • Conversion-Simp03.zip (36.4 KB)

jlnr

进士
Thanks for the update! :)

One question about updating user dictionaries - do I have to be careful with importing the updated dictionary, is there any way I could mess up my flashcards that link to the MOE dictionary?
 

mikelove

皇帝
Staff member
Yes, the entry IDs will almost certainly be different, so you'll want to search for "incomplete" cards and remap them afterwards. Or just keep both copies of the MoE dictionary around but totally disable the old one (uncheck all of its options in Manage Dictionaries).

Alternatively, you could search for your MoE cards beforehand, export them all to a text file with definitions included and then reimport them - that will put the definitions on the cards instead of the dictionary. (we don't currently offer a batch way to do this without exporting/importing)
 

jleeyap

Member
Sorry, I must be a real klutz or something, I have tried downloading this file several times. When I come to "Load Existing" the file MoEDict-05-Simp03 shows up but when I click on it, a popup says "Add Dict Failed. Sorry, the dictionary could not be installed correctly, either because it's invalid or because the same dictionary has already been installed." Yet it is not yet under the list of my installed dictionaries. What am I doing wrong?
 

mikelove

皇帝
Staff member
This is definitely the .pqb file, right? And not the .7z archive containing it (or the original text file)?
 

jleeyap

Member
Thank you for your answer. I examined the issue further, and it seems that although I did load the .pqb file initially using the web uploader method with the https address, somehow it was terminating before the whole file was loaded (I figured it out because the file sizes were different). When I reloaded the .pqb file using iTunes and USB cable, it's now working well. Thank you.
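For anyone who hits the same thing, a quick way to check for a cut-off transfer on a computer is to compare the size and a checksum of two copies of the file. The Python sketch below is only an illustration: file_fingerprint and same_file are hypothetical helper names, and the demo writes stand-in bytes to temporary files rather than touching a real .pqb file.

```python
import hashlib
import os
import tempfile


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def same_file(path_a, path_b):
    """True when both files have the same size and the same SHA-256 hash."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        complete = os.path.join(folder, "complete.pqb")
        truncated = os.path.join(folder, "truncated.pqb")
        payload = b"stand-in bytes, not real dictionary data" * 100
        with open(complete, "wb") as handle:
            handle.write(payload)
        with open(truncated, "wb") as handle:
            handle.write(payload[: len(payload) // 2])  # Simulate a cut-off upload.
        print(same_file(complete, complete))
        print(same_file(complete, truncated))
```

A size mismatch alone is enough to spot a truncated file; the hash additionally catches copies that have the right length but different contents.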
 

giokve

进士
Annoying glitch after the update: pinyin in MOE has spaces while no other dictionary's does, so I always end up with two different entries. (Actually, I don't know if that's the cause, but it's the only difference I can see between the two entries.)
 

alex_hk90

状元
The source data for MoEDict has always had spaces in the Pinyin, so I guess this was after a Pleco app update?
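To illustrate why spacing alone could split results, here is a small Python sketch that groups entries by headword and by pinyin with whitespace removed. This is not how Pleco merges entries (that logic is not described in this thread); pinyin_key and group_entries are hypothetical names, and the two sample rows are made up for the example.

```python
def pinyin_key(pinyin):
    """Drop whitespace and case so spaced and unspaced pinyin compare equal."""
    return "".join(pinyin.split()).lower()


def group_entries(entries):
    """Group (dictionary, headword, pinyin) tuples by headword and pinyin key."""
    groups = {}
    for dictionary, headword, pinyin in entries:
        key = (headword, pinyin_key(pinyin))
        groups.setdefault(key, []).append(dictionary)
    return groups


if __name__ == "__main__":
    # Made-up rows: the same word with spaced and unspaced pinyin.
    sample = [
        ("MOE", "中國", "zhong1 guo2"),
        ("OTHER", "中國", "zhong1guo2"),
    ]
    for key, dictionaries in group_entries(sample).items():
        print(key, dictionaries)
```

With a comparison like this the two rows land in one group; a comparison on the raw pinyin strings would keep them apart.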
 

mikelove

皇帝
Staff member
I know, but it could be an issue with the version of Pleco in which that version of MoEDict was encoded.
 

giokve

进士
I updated everything: Android (now 5.1), Pleco and MOE. Also (maybe I should post this somewhere else), all the strokes in the stroke order diagrams are now black; they don't change color like they did before.
 

Taichi

榜眼
I came across the two-different-entries problem, as shown in the screenshot.
The headwords are exactly the same but are split into two results.
I'm using MOEDict-04c-simp03 (the old one) with Pleco 3.2.10.
I'm not sure whether this is new in Pleco 3.2.10, though.
 

Attachments

  • Screenshot_2015-03-20-23-27-36.png (182.2 KB)

giokve

进士
It happens with every word formed by two characters or more in my case.
 

mikelove

皇帝
Staff member
@giokve - I can't seem to reproduce this here; which dictionaries are listed in Manage Dictionaries on your device now, and in what order? Could you give me some examples of search terms that produce this behavior?

With the stroke order, that was a setting that was accidentally switched on for some people - you can turn it on permanently in Settings / Definition Screen / Stroke Order / Fade Strokes.

@Taichi - in this case, MoEDict seems to have a completely different pronunciation for this word (not just tones but whole syllables) - we don't merge entries in that case.
 

giokve

进士
@mikelove every search term that has more than one syllable and, I just noticed, whose traditional and simplified forms are different. I believe this wasn't the case before.
Screenshot_2015-03-20-16-53-35.jpg
Screenshot_2015-03-20-16-53-15.jpg
 

mikelove

皇帝
Staff member
@giokve - It looks like you have a version of MoEDict without any simplified characters in it; it needs those to merge correctly, but @alex_hk90 added them a while ago. Try deleting that one and then installing the latest database file (MoEDict-05-Simp03.pqb.7z) - does that work any better?
 