Feature Suggestions


mikelove said:
The reader actually isn't very good at breaking down sentences into words, we need a proper text segmenter for that - there's a very good open-source one of those we've been thinking about using but it'll take some work to port it to Palm/WM.
Mike, could you please mention which text segmenter you found? The ones I've run across have not had very friendly licenses. I had a non-dictionary/reader project but didn't want to write my own Chinese-aware text segmenter.


The forecast comes from the Pleco flashcard database. I back up my Pleco db every week (or month) and can then run this program from my PC to get some idea of how the cards will stack up in the future. Good to hear some good things are being planned for 2.1 (sentences and SRS algorithms).


Staff member
character - it's called ICTCLAS - see this page for source downloads. The license is proprietary but seems to explicitly allow for commercial use. The Linux file is mistakenly named .tar.tar instead of .tar.gz, so if you download that one, change the file extension to .tar.gz before you try to open it. If you're not worried about modifying the source code, there's a newer binary-only version available here for Windows or here for Linux. It seems to be pretty widely deployed; it's part of Apache Lucene, I believe.

taijidan - neat. Excellent argument for adding some sort of API for third-party programs to easily access the flashcard database once we get our desktop version released, though it seems like you didn't have much difficulty with that anyway.


I have commented a bit on this before, but I think it's worth repeating with a slightly different spin...

I am wondering if one easy (relatively speaking...) option for the audio is to just have the sentences created by the user, and give them the capability to record an audio file associated with the card. (I personally would like to get this functionality.) A native Chinese speaker could read each sentence and the file could be saved and advanced to the next in less than 10 seconds.

I have a friend who helps me record, but I could also get any of my teachers to do it as well (maybe for a small fee....).

Not sure if it would simplify things, but perhaps the audio could be for "import" only. That is, the sentences would be constructed off-board, and then audio/hanzi/definition would all be imported at the same time. This is the way I work with the sentences now, basically creating them first and then importing them (I don't want to do all that editing on a small screen).

One problem is that this approach would not lend itself to the copying of sentences from the dictionaries unless they existed on the external computer, or unless they were somehow created in Pleco, and then exported to be matched with the audio recordings for import.

Maybe not the best ultimate "text to speech" solution, but maybe could serve as an interim one.


Staff member
That's certainly doable; SQLite can easily support database records as large as a sentence-length audio recording. However, it'd run up against a bunch of memory limitations on Palm, so it's only something we can do on WM (and iPhone, which actually has a nice simple API for recording audio with the built-in microphone, at least on microphone-equipped models).

An alternative would be to have an audio recordings directory and simply store the filenames in the flashcard database, but I'm not wild about that since it's very easy to back the flashcard database up but forget to back up that directory - that would, however, work on Palm, though only with Ogg since the lack of built-in MP3 support on Palm means we can't legally play back MP3s without an expensive license from Fraunhofer.
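As a rough illustration of the first option above (keeping the recording inside the database record itself), here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are made up for the example and are not Pleco's actual schema, and the "recording" is just placeholder bytes.

```python
import sqlite3

# Hypothetical schema for illustration only; the real flashcard
# database layout is not described in this thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cards ("
    " id INTEGER PRIMARY KEY,"
    " hanzi TEXT NOT NULL,"
    " definition TEXT,"
    " audio BLOB)"  # the recording is stored inline with the card
)


def add_card(conn, hanzi, definition, audio_bytes=None):
    """Insert a card, optionally with its recording stored as a BLOB."""
    cur = conn.execute(
        "INSERT INTO cards (hanzi, definition, audio) VALUES (?, ?, ?)",
        (hanzi, definition, audio_bytes),
    )
    conn.commit()
    return cur.lastrowid


def get_audio(conn, card_id):
    """Return the stored recording as bytes, or None if there is none."""
    row = conn.execute(
        "SELECT audio FROM cards WHERE id = ?", (card_id,)
    ).fetchone()
    return row[0] if row else None


# Placeholder bytes standing in for a short Ogg recording.
fake_recording = b"OggS" + bytes(64_000)
card_id = add_card(conn, "你好吗?", "How are you?", fake_recording)
```

Because the audio lives in the same file as the card text, backing up the one database file backs up the recordings too, which is the advantage described above over a separate recordings directory.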


After my hand started hurting from writing ridiculous amounts of characters, it occurred to me that there should be a way to increase the efficiency of the scoring model for writing words when given pinyin + definition. For example, if I know the word 你 and the word 好 very well, then the card for 你好 should probably have its score advanced much more quickly than it would be otherwise. Granted, knowing the first two doesn't guarantee that you will know the combination, but it makes it a lot more likely. It could potentially be a lot of work to implement something like this, but the benefits could also be great.

I was thinking if within the score file, you could keep some kind of statistics on how well you know an individual character.. each time you test a word containing one of those characters, the statistics would be updated. The statistics of the individual characters could somehow influence the difficulty rating of a newly added card.

The scoring model might also be appropriate for other tests, considering that if you know all the components well, your likelihood of discerning and/or retaining the meaning of a word containing those components would probably be much greater.
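To make the idea a bit more concrete, here is a small sketch of what per-character statistics feeding into a new card's starting score might look like. Everything in it is an assumption for illustration: the class and function names, the smoothing, the "weakest character" rule and the score numbers are invented, and none of it reflects how Pleco actually scores cards.

```python
from collections import defaultdict


class CharacterStats:
    """Tracks how often each individual character was answered correctly."""

    def __init__(self):
        self.correct = defaultdict(int)
        self.attempts = defaultdict(int)

    def record(self, word, was_correct):
        # Every test of a word updates the stats of each character in it.
        for ch in word:
            self.attempts[ch] += 1
            if was_correct:
                self.correct[ch] += 1

    def familiarity(self, ch):
        # Smoothed success rate; a never-seen character comes out at 0.5.
        return (self.correct[ch] + 1) / (self.attempts[ch] + 2)

    def word_familiarity(self, word):
        # Assumption: the least-known character limits the whole word.
        return min(self.familiarity(ch) for ch in word)


def initial_score(word, stats, base=100, max_bonus=400):
    """Starting score for a new card (arbitrary illustrative numbers)."""
    f = stats.word_familiarity(word)
    bonus = max(0.0, f - 0.5) * 2 * max_bonus
    return round(base + bonus)


stats = CharacterStats()
for _ in range(8):
    stats.record("你", True)
    stats.record("好", True)
```

With these stats, a new card for 你好 would start well above the base score, while a word made of characters that have never been tested would start at the base score. Taking the minimum over the characters is just one possible choice; an average would be more forgiving.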


Staff member
That's a great idea - rather thorny thing to implement, and coming up with a UI for it that wasn't painfully intimidating would also be a challenge, but if we could pull it off it certainly makes sense for character/Pinyin mappings at least (and to a lesser extent definitions too). Thanks.


Two feature requests born out of my recent flashcard session:

1) A way for a custom flashcard to be saved as a dictionary entry (just a single one - not a bulk import). I am thinking this could be good for people's names or names of places, where I make the flashcard, use it, and maybe upon later review decide it should be a dictionary entry.

2) If I am looking at a list in the flashcard module, to sort columns by just pressing the column identifier at the top of the column (assuming there is one - similar to sorting in Excel).


Staff member
#1 is already implemented - just go into the "Edit Card" screen and hit the "To User" button. #2 would make sense if we added column headers, but that screen is so crowded already that I'm not sure if it makes sense to take away a line of search results for that.


Speaking of that writing flashcard discussion above, a while back during the beta we discussed a writing review function that combined handwriting recognition and stroke order playback into one screen. Any more thought go into that?


Staff member
Still an interesting idea, but tricky new features like that have had to take a backseat to platform support lately - between iPhone and desktops, a lot of the things we'd like to add to 2.0 are having to wait a while unfortunately.


I have only custom flashcards, which I add using the "Add to flashcards"-button. Afterwards I change the cards to "custom" and usually change the definition. I would like some changes in Pleco:

(1) in Manage Flashcards / advanced / dictionary a selection of "any" (the opposite of none) would show all flashcards that were entered, but not changed yet, because they are still linked to any dictionary (I could of course look for all dictionaries one after the other)

(2) in Manage Flashcards a selection with "definition" would help (only full text search, because the definitions usually differ slightly, so that an exact search would not always lead to results): In a flashcard session with Input=definition, output= headword there is an increasing number of flashcards with the same definition, but different headwords (e.g. "hour" = 小时 or 钟头). I would like to find all flashcards with the same or almost the same definition and then change some of the definitions slightly to increase my chance to get the right answer from 50% to 100%.

(3) When in "Edit Card" I change the dictionary linked card to "Card Text:" = custom, then the definition, which came from the dictionary, is in some cases deleted. It seems to depend on the dictionary if the definition is deleted or not. Is this a bug or a result due to the licence? (This is not a very important problem because I usually change the definition)

(4) In Pleco 1.0 all my flashcards were linked to the user dictionary. As this dictionary was difficult to change, especially to delete duplicate entries, all my flashcards are now in flashcards.pqb. Checking whether a word is already included in flashcards.pqb takes some time: start flashcards, manage flashcards, search ... With a USR dictionary it would certainly take less time. Is it possible to copy the words of flashcards.pqb to PlecoCUserxxx.pqb and to keep the information synchronized?

All these are not very important, but would make Pleco even more comfortable!
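Until something like request (2) exists, the "same or almost the same definition" check could be approximated outside Pleco on an exported card list. The sketch below uses Python's standard difflib module; the card tuples and the 0.7 threshold are assumptions chosen for the example, not anything Pleco provides.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_definitions(cards, threshold=0.7):
    """Return (headword_a, headword_b, ratio) for cards whose headwords
    differ but whose definitions are the same or nearly the same."""
    pairs = []
    for (hw_a, def_a), (hw_b, def_b) in combinations(cards, 2):
        if hw_a == hw_b:
            continue
        ratio = SequenceMatcher(None, def_a.lower(), def_b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((hw_a, hw_b, ratio))
    return pairs


# Hypothetical (headword, definition) pairs as they might look after export.
cards = [
    ("小时", "hour"),
    ("钟头", "an hour"),
    ("苹果", "apple"),
]

for hw_a, hw_b, ratio in similar_definitions(cards):
    print(f"{hw_a} / {hw_b}: {ratio:.2f}")
```

This compares every pair of cards, so it is only practical for a few thousand cards at most, and the threshold would need tuning against real definitions.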


Staff member

That "any" search in 1) would certainly be doable, but it's a bit tricky if we want to also handle cards which link to deleted entries in valid dictionaries differently - that would require a card-by-card check. (I guess for what you're trying to do you'd want those cards to be included too, but other applications of this might require them to be separated)

Full-text search on flashcard (and user dictionary) definitions will be doable once we drop Palm OS support - the SQLite module that enables full-text searches isn't something we can implement on Palm, so it would pretty much require separating the Palm OS code base from the Windows Mobile / iPhone ones in order to implement it in our code.

The definition-dropping is indeed a licensing issue - actually it isn't required by every dictionary license, but we don't yet have a way to securely code different usage rights into different dictionaries (we could add one, but it's not really enough of a priority to justify the work it would involve).

#4 is already doable actually - you can convert a custom flashcard into a user dictionary entry with the "To User" button in the Edit Card screen. Though at that point you'll only be able to edit its text as a dictionary entry (with Add / Edit Entry) and not as a flashcard.


With "any" I only want to see cards I just entered by the "Add to flashcards"-button - those cards should be valid and not deleted.

#4: I will try to export all flashcards, save and rename flashcards.pqb, import the exported flashcards again with "store imported definitions in user dict" (thereby creating a new flashcards.pqb with linked entries), delete this new flashcards.pqb and restore the old. I hope to get a user dictionary with the same information as in my flashcards.pqb, and no flashcard is linked to my user dictionary. Adding the flashcards card by card seems to take some time ...


Staff member
Actually, you can just export the flashcards to a text file and import them directly into your user dictionary (via the "Manage Dicts" screen) - there's no need to fiddle around with flashcards on the import side, only the export side.


But then all flashcards are linked to the user dictionary, with the restrictions you mentioned and that I remember from Pleco 1.0, such as duplicate entries. I would like to have an additional dictionary containing my flashcards, while keeping a flashcards.pqb with all the comfort it provides.


Staff member
Oh, well then the procedure you're describing won't really work - the flashcards from the old database won't automatically link to the new user dictionary entries. But there's no need for that anyway - just do a search in Manage Flashcards for "all cards," tap "Batch," and tap on the "Custom -> User Dict" button - that will convert all of those flashcards to user dictionary entries in a single operation. (sorry, for some reason I thought you wanted to preserve the custom flashcards while also having a user dictionary with those same definitions)


It was only now that I found out that there are 2 import functions - in spite of your hint to "Manage Dict." I did not see the import to user dict! That is exactly what I need, sorry to bore you.


mikelove said:
Actually, you can just export the flashcards to a text file and import them directly into your user dictionary (via the "Manage Dicts" screen) - there's no need to fiddle around with flashcards on the import side, only the export side.

It works just as you described. To prevent duplicate entries from flashcards that belong to more than one category, however, it would help if every flashcard were exported only once, or if I could skip duplicate entries when I reimport the file (as in flashcard import). With some Excel and text-editor work between export and import, the same result can be achieved.
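For anyone who would rather script the step between export and import than do it in Excel, a small sketch follows. It assumes the export is a plain text file with one card per line and category header lines starting with "//"; that layout is an assumption for the example, so check it against your own export before relying on it.

```python
def dedupe_export(lines):
    """Keep only the first occurrence of each card line.

    Assumed format: one card per line, with category headers
    starting with "//". Headers and blank lines are kept as-is.
    """
    seen = set()
    result = []
    for line in lines:
        stripped = line.rstrip("\n")
        if stripped.startswith("//") or not stripped.strip():
            result.append(stripped)
            continue
        if stripped in seen:
            continue  # same card already exported under another category
        seen.add(stripped)
        result.append(stripped)
    return result


# Hypothetical export in which 小时 appears under two categories.
exported = [
    "//Time\n",
    "小时\txiao3shi2\thour\n",
    "钟头\tzhong1tou2\thour\n",
    "//Lesson 1\n",
    "小时\txiao3shi2\thour\n",
    "你好\tni3hao3\thello\n",
]

cleaned = dedupe_export(exported)
```

Only lines that are identical character for character are treated as duplicates, so a card whose definition was edited in one category would still appear twice.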