MakePlecoDict crashing

therudds

Member
Can anyone shed more light on what is and isn't allowed with running the MakePlecoDict software? I have been trying many different things and it keeps crashing.

Specifically, I am able to make a .pdb file that works, but if I manipulate the data in Word or Excel (e.g. a search and replace, converting to a table and back, or sometimes even just typing in some text) and then save it again in what I believe to be the proper format (see below) in Word or Wenlin, MakePlecoDict makes a 0 KB .pdb file and crashes, returning a Microsoft Error Report. Even if I try to convert a single line of text that had previously converted successfully, without making any visible changes to the text, it still crashes. (This is in the Chinese-to-English direction; I haven't tried the English-to-Chinese direction yet.)

I know the MakePlecoDict documentation states, "MakePlecoDict accepts input files ONLY in little-endian, UTF-16 Unicode format." In Word 2003, is this the straight "Unicode" option, as opposed to "Unicode (Big-Endian)", "Unicode (UTF-7)", "Unicode (UTF-8)", or any of the "Chinese Simplified" options (such as (Auto-select), (EUC), (GB...), (HZ), (ISO...))? In Wenlin, is this the "Little-endian Unicode" option, as opposed to the "Unicode (recommended, international)" option? In Excel, is the Paste Special "Unicode Text" option the proper format?

In addition, I wanted to know if there are any characters that are not allowed in the various fields, and specifically if there are certain format/character constraints in the pinyin and character fields (and definition fields for that matter)?

Thanks for the help!
 

mikelove

皇帝
Staff member
Yes, the straight "Unicode" option in Word is the one you want, and in Wenlin it is indeed little-endian Unicode. And Unicode Text in Excel should be correct, too. You might want to try a more Unicode-oriented text editor, though; we usually recommend EmEditor since it's what we use for most of our Chinese text editing.
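If it is unclear what encoding an editor actually wrote, the first few bytes of the file (the byte-order mark, or BOM) can be checked directly. The following is a minimal Python sketch, not part of MakePlecoDict: the function names sniff_bom and to_utf16_le are made up for illustration, the UTF-8 fallback for files without a BOM is an assumption, and so is the choice to write a BOM at the start of the output (whether MakePlecoDict requires or merely tolerates one is not stated in this thread).

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Guess the encoding of raw file bytes from the byte-order mark, if any."""
    # UTF-32 must be tested first: its little-endian BOM begins with the
    # same two bytes (FF FE) as the UTF-16 little-endian BOM.
    if data.startswith(codecs.BOM_UTF32_LE) or data.startswith(codecs.BOM_UTF32_BE):
        return "utf-32"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "unknown"


def to_utf16_le(data: bytes, fallback: str = "utf-8") -> bytes:
    """Re-encode file bytes as little-endian UTF-16 with a leading BOM.

    Assumption: a file without a BOM is decoded with `fallback`.
    """
    kind = sniff_bom(data)
    if kind == "unknown":
        text = data.decode(fallback)
    elif kind in ("utf-16-le", "utf-16-be"):
        # The generic "utf-16" codec reads the BOM to pick the byte order
        # and drops it from the decoded text.
        text = data.decode("utf-16")
    else:
        # The "utf-32" and "utf-8-sig" codecs also consume their BOM.
        text = data.decode(kind)
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
```

To convert a file, read it in binary mode, pass the bytes through to_utf16_le, and write the result back in binary mode. A file that sniff_bom reports as anything other than "utf-16-le" was not saved in the format the documentation asks for.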

I believe the only characters that MakePlecoDict won't accept are those in the Unicode "private use" area, many of which we use for internal formatting data; there's not really any reason you'd ever want to use those, and it's almost impossible that you'd do so accidentally. Beyond that, the formatting is pretty much wide open as long as you use tabs correctly (only to separate text fields) and use Pinyin tone numbers instead of tone marks.
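The two constraints mentioned above (no private-use characters, tone numbers rather than tone marks) can be checked mechanically before running the converter. This is a rough Python sketch under stated assumptions: check_line and its helpers are hypothetical names, and because the field order is not spelled out in this thread, the tone-mark check scans the whole line rather than only the Pinyin field. An accented letter in a definition would therefore be flagged too, so treat the results as warnings to look at, not as definite errors.

```python
import unicodedata

# Combining marks used for the four Pinyin tones:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def is_private_use(ch: str) -> bool:
    """True if the character is in a Unicode private-use area (category Co)."""
    return unicodedata.category(ch) == "Co"


def has_tone_mark(ch: str) -> bool:
    """True if the character is, or decomposes to include, a tone-mark diacritic."""
    return any(c in TONE_MARKS for c in unicodedata.normalize("NFD", ch))


def check_line(line: str) -> list:
    """Return (index, kind, character) tuples for suspicious characters in one line."""
    problems = []
    for index, ch in enumerate(line):
        if is_private_use(ch):
            problems.append((index, "private use", ch))
        elif has_tone_mark(ch):
            problems.append((index, "tone mark", ch))
    return problems
```

Running check_line over each line of the decoded text file and printing any non-empty results should make invisible characters introduced by Word or Excel easier to spot.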

Even though MakePlecoDict is technically "unsupported," if you send us an e-mail with one of your non-working text files we'd be glad to take a look at it and see if there are any obvious problems.
 