Pleco Repetition Algorithm

chaorendaifu · Jul 25, 2006

I don't know if this has been covered in the older threads, but I wanted to ask whether the new version, Pleco 2.0, will have an algorithm to determine repetition spacing for reviewing vocabulary words.

Something similar to SuperMemo or other software which has repetition algorithms to help determine the correct repetition spacing for maximum retention of vocabulary words?

If not, I strongly recommend one be implemented!! SuperMemo offers their algorithms for software such as Pleco. The reason I strongly recommend this as an upgrade is that I hated to part from Pleco as my software of choice for flashcard vocabulary review since Pleco has sooo many nice features. But, I was having tremendous difficulty recalling many of the flashcard terms. I tinkered with the repetition spacing options for a very long time and still had poor recall.

As a result, I reluctantly switched to software which has a repetition spacing algorithm built in. My retention went from 60% to now upwards of 93%. I'd very much like to switch back to Pleco at some point, as it is much better for Chinese study. (It's a massive pain to have to make each of my flashcards by hand for the other software.) So I am desperately hoping Pleco makers will strongly consider my suggestion to implement the SuperMemo (or similar) algorithm to greatly enhance Pleco and make it a complete flashcard / dictionary software.

mikelove · Jul 25, 2006

We're not actually planning to make too many changes to the repetition spacing algorithm in 2.0. We've gotten quite a few requests to integrate a SuperMemo-esque algorithm into PlecoDict's flashcard system, but I remain unconvinced of the effectiveness of such an algorithm. At its core, SuperMemo is really just about letting you make the optimal use of the time you have for studying, which it does by displaying words at only the minimum frequency necessary to ensure memorization and continued retention. You review words less often and hence get to review more words in the time you have available. There's nothing inherently wrong with that idea, but it can only work optimally in a situation where your only interaction with those words is inside of SuperMemo.

Suppose you've just added the word kao3ya1 "roast duck" to your vocabulary list. You review it once, answer it correctly, and SuperMemo's algorithm determines that the next time you review it should be 6 days later. Two days before SuperMemo is scheduled to show you the word again, you go to a restaurant with some friends and order some Beijing duck. You've just reviewed the word kao3ya1 again, but SuperMemo doesn't know about it, so it still tests you on kao3ya1 two days later, even though the optimum spacing might not have you seeing it again for another 5 or 6 days. Because it has no way of knowing about that extra kao3ya1 practice, it's impossible for SuperMemo to calculate the optimum repetition spacing, regardless of how advanced its algorithm is - it just doesn't have enough information.

If you're learning Chinese in a vacuum (with no other vocabulary review beyond what you do in SuperMemo), or if you're studying something like state capitals or licensing exam questions that you're not likely to encounter in any other context, then SuperMemo is a great tool for that, but I just don't think it's all that effective for real-life language learning.

That being said, there certainly is room for improvement in our repetition spacing system, chiefly in the area of its user interface. It really should be a lot easier to start using our repetition spacing system; ideally there would be no customization required at all, "set it and forget it," and I think in 2.0 we may actually be able to offer something at that level of simplicity. There are also some common-sense improvements we can make that should bring our system's "performance" a lot closer to SuperMemo's; for example, it should be easier to add words that you already know and have them immediately treated the same as words that you've learned through our system, and the default spacing days / rank advancement thresholds could certainly be tweaked to make them a bit more optimal for language learning.

But in terms of actually licensing SuperMemo or a similar algorithm, even if we thought it was useful for language study, it would still be impractical for PlecoDict because of all the new test modes we're adding. With a multiple-choice test, for example, there are really only two possible results for each card, correct and incorrect; there's no way to map the results of a multiple-choice test onto SuperMemo's six-level answer system. I suppose we could prompt the user to score themselves after each answer on how well / easily they recalled a word (or how close they came if they answered it incorrectly), but that seems like kind of an awkward fix.
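
For readers unfamiliar with the six-level answer system mentioned above, here is a minimal Python sketch of SM-2, one of the early, publicly documented SuperMemo algorithms. Grades run from 0 to 5. Implementations differ on small details (for example, whether the new or old easiness factor is used for the next interval), so treat this as an illustration rather than SuperMemo's exact behaviour. The `binary_to_quality` helper is a hypothetical mapping from a correct/incorrect result to a grade, added only to show where a multiple-choice test loses information; nothing like it is defined by SuperMemo or Pleco.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0     # consecutive successful reviews
    interval_days: int = 0   # days until the next review
    easiness: float = 2.5    # SM-2 starting easiness factor


def sm2_review(state: CardState, quality: int) -> CardState:
    """Apply one SM-2 review with a grade from 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the sequence, leave the easiness factor alone.
        return CardState(0, 1, state.easiness)
    # Update the easiness factor, never letting it drop below 1.3.
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    easiness = max(1.3, state.easiness + delta)
    repetitions = state.repetitions + 1
    if repetitions == 1:
        interval = 1
    elif repetitions == 2:
        interval = 6
    else:
        interval = round(state.interval_days * easiness)
    return CardState(repetitions, interval, easiness)


def binary_to_quality(correct: bool) -> int:
    """Hypothetical mapping for tests that only report correct/incorrect."""
    return 4 if correct else 1
```

With only two outcomes, every correct answer gets the same grade, so the easiness factor never adapts to how easily a word was recalled; that is the information a multiple-choice test cannot supply.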

gato · Jul 25, 2006

chaorendaifu said:
My retention went from 60% to now upwards of 93%.

May I ask how you were studying vocabs when your retention was at 60%? Were you using PlecoDict's flashcards? If so, in which mode?

chaorendaifu · Jul 30, 2006

Mike, I'm very sorry to hear your response.

Such algorithms are a phenomenal means of helping users improve their Chinese characters. While I agree that it's not 100% accurate, as you may encounter words used in, say, conversations, thus reinforcing retention, it does help a great deal. Furthermore, the more data the algorithm has about a particular user's learning style, the more accurate it becomes. The problem ones will occur more frequently and those that are more ingrained will appear much less frequently.

Regarding the multiple choice addition to the flashcard, you could always add that as an option for the user. If the user chooses, say, 'algorithm mode' then the multiple choice quizzes are not available, etc. The same would go for the additional 'test modes' you have in store for the new version. 'Algorithm mode' could be a separate mode in and of itself for those users who truly believe in its value. (Even if there are some skeptics *wink*)

Mike, I understand your skepticism from a theoretical perspective. But my question is whether you have actually tried using such a method of learning for an extended period of time before? (Or at least know of many people who have used such a method for an extended period of time with very little positive results.) If you have used such a method and you did not personally experience any improvement, or not any improvement greater than the method you use now (or previously used), then I would better (somewhat) understand your reluctance to implement such a feature for those who truly believe in such a learning method. However, if your skepticism is based only in theory without applying it in practice, then I'd encourage you to at least try it before you reject it as a great option for Pleco.

mikelove · Aug 1, 2006

I have had some experience with graduated recall - both SuperMemo and static systems like Pimsleur - and honestly I haven't found it to make much of a difference for me. But people's minds work differently, and I can certainly imagine it might make a difference for some people. I don't believe, however, that the level of refinement in SuperMemo's latest versions makes any significant difference for language learning - it really is impossible to optimize things that much with all of the extra review that goes on outside of the software. So the only reason we'd want to license SuperMemo's algorithm would be for the brand name, and given all the money we've already tied up in licenses for Pleco 2.0, I don't think the difference that would make would justify the extra spending (at least not until we've recovered much of that investment from 2.0 sales).

That being said, I suppose it wouldn't actually be that difficult to implement something at the level of SuperMemo for Palm; my understanding is that they use one of the early SuperMemo algorithms, and mathematically the difference between those and our current system is pretty negligible - an extra variable or two and that's about it. Not much more than we were planning to add anyway. It wouldn't be exactly the same, obviously, but I do think we could get something comparable to the performance of SuperMemo for Palm without having to license anything.

Whether or not we actually have time to do so remains to be seen - if we do implement this, though, it would probably be "experimental" and implemented through a checkbox rather than a whole separate mode: checking a box in Preferences would get you the extra answer choices and the more advanced algorithm. (Though if it gets rave reviews in the beta version we could certainly consider promoting it to a more prominent position in the finished release.) One of the nice things about our new database system is that it makes it a lot easier for us to insert extra variables, fields, etc, so really the whole thing would only require one or two extra functions along with the interface changes to support the extra answer choices.

shenkuang · Sep 4, 2006

What about using the Mnemosyne Project algorithm?

Mike,

I have discovered a free, open source flash card program with a spaced repetition algorithm. It is called the Mnemosyne Project and runs on the PC. This eliminates the licensing issue since you could just copy the algorithm that Mnemosyne uses. So long as you do not copy any code, you will not have to worry about the GNU license.

I would like to add my vote to the helpfulness of spaced repetition (using Mnemosyne). It has let me learn 400 words in the last 5 months by reviewing for 20 minutes a day. This is much faster than when I took Chinese classes, where we learned 200 words a year.

The only drawback is that you would have to learn Python if you don't know it already, since the source code is in Python. I recently learned Python by reading the tutorial, 1 chapter a day. It took about 10 days, and at the end I was able to write a utility to automatically generate Mnemosyne flash cards from Chinese text.

mikelove · Sep 10, 2006

Actually, something on the level of what Mnemosyne does wouldn't be difficult at all - it's almost exactly the same as one of the "early SuperMemo algorithms" I mentioned in my earlier posting. So it's definitely within the range of what we might do in 2.0. We've hardly done any work on higher-level parts of the flashcard system yet, and the first preview release likely won't even include flashcard support, so all this stuff is still up in the air at the moment.

longjie · Sep 14, 2006

I'd add my vote for some kind of *easily configured* basic repetition spacing that does not restrict how much you can study in any given day.

sthubbar · Jul 8, 2007

Enhancement to repetition spacing algorithm

Mike, first let me say that Plecodict is awesome. The current implementation of the repetition spacing algorithm is also awesome. I doubt I would be where I am studying Mandarin if it were not for Plecodict.

OK, enough praise...

You make a valid point about a repetition spacing program lacking complete information. A point that I would like to propose is that despite this lack of complete information, a repetition spacing program is the most effective and efficient method of remembering stuff. As of now there is not another method that can even come close in terms of the amount of time required to retain information.

The small improvement that I could see being made in Plecodict 2.0 is that it would be nice if the repetition interval could be different for each item instead of being determined by rank. This is what the Supermemo and other such programs do. Supermemo has simple formulas on their website that could be used.

Allowing this individualized repetition spacing would quickly spread out the items that the learner is also encountering outside of the reviews, and would allow the more difficult items to be reviewed more often.

Just an idea.

mikelove · Jul 9, 2007

Sorry this thread hasn't been updated in so long. The situation has actually changed considerably now: the new scoring system in Pleco 2.0 does track spacing individually for each item, and in Automated mode it even has a "difficulty factor" like SuperMemo does, but unlike in SuperMemo you can tweak our algorithm to make the spacing change faster or slower with a special "Aggressiveness" control. Not exactly the same algorithm, but close enough that people who are fans of SuperMemo should feel quite comfortable with our system.
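
The post does not give Pleco's actual formula, so the following Python sketch is purely illustrative of the general idea: each card carries its own interval and its own difficulty-style multiplier, and a single global knob scales how quickly the interval changes. The function name `next_interval`, the parameters `ease` and `aggressiveness`, and the arithmetic are all assumptions made for this example, not Pleco's implementation.

```python
def next_interval(current_days: float, ease: float, correct: bool,
                  aggressiveness: float = 1.0) -> float:
    """Hypothetical per-card interval update (not Pleco's real formula).

    ease           -- per-card multiplier, playing the role of a difficulty factor
    aggressiveness -- global knob; higher values change spacing faster
    """
    if correct:
        # Grow the interval; aggressiveness scales the growth beyond 1x.
        growth = 1.0 + (ease - 1.0) * aggressiveness
        new_days = current_days * growth
    else:
        # Shrink the interval; higher aggressiveness shrinks it more sharply.
        new_days = current_days / (1.0 + aggressiveness)
    return max(1.0, new_days)  # never schedule sooner than one day out
```

For example, under these assumptions a card with a 10-day interval and `ease=2.0` would move to 20 days after a correct answer at `aggressiveness=1.0`, but only to 15 days at `aggressiveness=0.5`.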

jugdish · Jul 18, 2007

Final Drill

I'm really excited to experience the flashcard system in 2.0. My wife and I are up to our necks in learning using the current system. I've set up the spacing to mimic SuperMemo as best as I know how, including a final drill (by using a flag).

I'd like to suggest a final drill that
1. Quizzes the user on words that he's missed since the last final drill,
2. Does not affect the history of flashcards, and
3. Loops through words endlessly, only allowing cards to be dismissed once they've been answered correctly.
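
The three requirements above can be sketched in a few lines of Python. This is only an illustration of the idea, not Pleco's code: `final_drill` and the `ask` callback are hypothetical names, and cards are represented as plain strings.

```python
from collections import deque
from typing import Callable, Iterable


def final_drill(missed_cards: Iterable[str],
                ask: Callable[[str], bool]) -> int:
    """Loop over missed cards until each one is answered correctly once.

    `ask` shows a card and returns True if the answer was correct.
    Nothing is written to the cards' review history here.
    Returns the total number of prompts shown.
    """
    queue = deque(missed_cards)
    prompts = 0
    while queue:
        card = queue.popleft()
        prompts += 1
        if not ask(card):
            queue.append(card)  # still wrong: goes to the back of the loop
    return prompts
```

A card is dismissed only by a correct answer, so the loop ends exactly when every missed card has been answered correctly once.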

I know you're already including a final drill that looks like the above, but I thought I'd share my thoughts in case you hadn't thought of something. The final drill is where most of my vocabulary learning takes place.

Thanks! And know that I have lots of other language-learners who I'll be talking to about 2.0 whenever it's released!

David

mikelove · Jul 18, 2007

Thanks for your thoughts on this. #2 and #3 should both be in there, but for #1 we only quiz you on words you missed in the current (just-ended) flashcard session; remembering missed cards from previous sessions too seems like it could be rather confusing for people, the idea of a Final Drill is a lot cleaner / easier-to-understand if it only applies to cards you just recently answered incorrectly.

donohuel · Sep 25, 2007

Having covered 7600 Chinese words in the past 60 weeks at the Defense Language Institute here in Monterey CA, I can tell you time is of the essence!! Any improvement on not having to go through a lesson's complete set of flashcards each time I review would save me sooooo much time!! Keep up the good work!

johnh113 · Sep 26, 2007

Dear donohuel,

That is exactly what repetition spacing is for. If you know the character, it rises in rank value and you see it less frequently until it is of a rank so high you never see it again. The flashcards you get wrong stay at a low rank and you see them frequently. There will be a new repetition algorithm with the new Pleco 2, but I think the existing one works very well.

John

punter888 · Dec 14, 2007

Hi. Could somebody recommend some optimized repetition settings for Pleco that mimic the graduated recall process and would enable me to transition from (a) definition from pinyin, to (b) pinyin from definition, and (c) pinyin from character, and (d) character from pinyin. I'm not too clever with software settings and could use the help. Hoping there are some clever tricks/settings that really optimize this functionality. Thanks.

daniu · Dec 17, 2007

punter888 said:
Hi. Could somebody recommend some optimized repetition settings for Pleco that mimic the graduated recall process and would enable me to transition from (a) definition from pinyin, to (b) pinyin from definition, and (c) pinyin from character, and (d) character from pinyin. I'm not too clever with software settings and could use the help. Hoping there are some clever tricks/settings that really optimize this functionality. Thanks.

Hi!

I guess you better wait 2 weeks and ask for settings for PD 2.0 ...

regards
Daniel

jugdish · Dec 17, 2007

If you need a system right away, go into your card settings and set up your repetition spacing like it says in this discussion topic:
viewtopic.php?f=6&t=614&p=3613&hilit=+card+settings+#p3613
Here's an excerpt from the above discussion:
"The length of time between reviews is determined by the number of days you've entered for the rank in question. A spacing of 1 means review the card every day, 2 means every other day, etc. One special case is a spacing of 0, which means review the card an infinite number of times until you get it right enough times to move it to the next rank. Some people use that as their rank 1 spacing so they can keep reviewing new cards on the first day.
My setup? I want a smooth progression of time between reviews, so I have a lot of ranks -- 13 -- and fairly aggressive movement: 2 correct in a row goes up a rank and 2 incorrect in a rank moves down 3 ranks. My spacing values are 1, 2, 4, 7, 11, 15, 21, 30, 45, 68, 94, 141, and 211 days. By the time a card gets to rank 13 I only see it once or twice a year."
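
The rank-based setup quoted above can be sketched in Python as follows. The spacing values and thresholds are taken from the excerpt; everything else is an assumption made for illustration: the names (`RankedCard`, `record_answer`), the reading of both thresholds as consecutive answers, the reset of the streak after a rank change, and the omission of the special spacing-of-0 case. It is not PlecoDict's actual code.

```python
from dataclasses import dataclass

# Spacing in days for ranks 1..13, as listed in the quoted setup.
SPACING_DAYS = [1, 2, 4, 7, 11, 15, 21, 30, 45, 68, 94, 141, 211]
PROMOTE_AFTER = 2  # consecutive correct answers needed to go up one rank
DEMOTE_AFTER = 2   # consecutive incorrect answers needed to go down
DEMOTE_BY = 3      # ranks dropped on demotion


@dataclass
class RankedCard:
    rank: int = 1
    correct_streak: int = 0
    incorrect_streak: int = 0


def record_answer(card: RankedCard, correct: bool) -> int:
    """Update the card's rank and return the days until its next review."""
    if correct:
        card.correct_streak += 1
        card.incorrect_streak = 0
        if card.correct_streak >= PROMOTE_AFTER:
            card.rank = min(card.rank + 1, len(SPACING_DAYS))
            card.correct_streak = 0
    else:
        card.incorrect_streak += 1
        card.correct_streak = 0
        if card.incorrect_streak >= DEMOTE_AFTER:
            card.rank = max(card.rank - DEMOTE_BY, 1)
            card.incorrect_streak = 0
    return SPACING_DAYS[card.rank - 1]
```

Under these assumptions, a card at rank 5 that is answered correctly twice moves to rank 6 and is next shown in 15 days; two misses in a row then drop it to rank 3, with a 4-day spacing.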

Your question about pinyin, English, hanzi and so on could be solved by going into test settings (also under the start new session menu) and changing "display fields" to by rank, then setting up each rank to quiz you the way you want. Each card will then get asked in a different way depending on the rank it's on. This is the only way to get drills that test the same cards in different directions. If you need help, just post again.
David
