Importing including statistics

doctorj · May 6, 2012

Hi Mike

I currently have two SRS programs running on my iPod. I was previously using Flashcards Deluxe, which served me very well until I got to the point where vocab lists for my textbook weren't available anymore. That is where creating flashcards from Pleco came in very handy. Previously it wasn't a problem to use both, but running two sets of flashcards brings additional challenges, particularly that it is harder to spot duplicates.

Is there a way that I can import my previous flashcards along with the SRS data? I don't want to lose the statistics, particularly the current score and next due date.

Thanks, Jonny

mikelove · May 6, 2012

doctorj said:
I currently have two SRS programs running on my iPod. I was previously using Flashcards Deluxe, which served me very well until I got to the point where vocab lists for my textbook weren't available anymore. That is where creating flashcards from Pleco came in very handy. Previously it wasn't a problem to use both, but running two sets of flashcards brings additional challenges, particularly that it is harder to spot duplicates.

Is there a way that I can import my previous flashcards along with the SRS data? I don't want to lose the statistics, particularly the current score and next due date.

Not at the moment, unfortunately - the algorithms are dissimilar enough that it would be difficult to translate data from one to the other. (we don't store due dates, just last reviewed dates and scores to indicate time until next review) Does Flashcards Deluxe give you a way to export only cards that are due in a certain range of times?

doctorj · May 7, 2012

Hi Mike,
This is the output that comes from Flashcards Deluxe:

Status – 0) Pending 1) New 2) Active 3) Exclude
Flag values – 0) not flagged 1) flagged
Review Count – number of times tested
Correct Count – number of times gotten correct
Streak – number of times in row you got it correct. This can be greater than Review Count since a strong correct will increase this by 2 for a single response.
Rounds until shown – Use for Leitner mode to delay card testing
Current Interval – current time between testing for Spaced Repetition mode, in hours.
Last Review – Date/time of last testing for this card. Next due date = last review + current interval.
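
The due-date rule in the last field can be sketched in Python. This is only an illustration of "last review + current interval"; the function name and the sample values are made up, and it assumes the interval has already been read as a number of hours.

```python
from datetime import datetime, timedelta

def next_due(last_review: datetime, interval_hours: float) -> datetime:
    """Next due date = last review + current interval (interval is in hours)."""
    return last_review + timedelta(hours=interval_hours)

# Hypothetical card: last reviewed 6 May 2012 at 09:00, current interval of 72 hours.
due = next_due(datetime(2012, 5, 6, 9, 0), 72)
print(due)  # 2012-05-09 09:00:00
```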

So maybe it could be possible? Should be just a case of transferring an interval in hours to a score, and making sure the dates use the same format. I think other information could be dropped if there is no corresponding field.

mikelove · May 8, 2012

doctorj said:
So maybe it could be possible? Should be just a case of transferring an interval in hours to a score, and making sure the dates use the same format. I think other information could be dropped if there is no corresponding field.

Well, it should theoretically be possible; you'd want to map them to these attributes in the score tag of a Pleco XML file:

Status: no distinction between pending and new; if "active" then set "forcepool" = "include", if "exclude" then set "forcepool" = "exclude".
Flag values: this should probably translate to an extra category assignment for "Flagged."
Review count: set "reviewed" = this, set "incorrect" = this minus "Correct count"
Correct count: set "correct" = this
Streak: set "history" to a string containing this number of 6s (so a streak of 5 means "66666")
Rounds until shown: N/A, I think
Current Interval: multiply this by 25/6 (100/24) and set "score" to that value
Last Review: set "lastreviewedtime" to this (in Unix epoch seconds)

Also, set "difficulty" = 100 (unless there's some other exported value we can use) and "firstreviewedtime" = 0.
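
A rough Python sketch of this mapping might look like the following. The dictionary keys (`interval_hours`, `review_count` and so on) are made-up names for the Flashcards Deluxe export fields, and rounding the score to a whole number is an assumption. The flag is left out because it maps to a category assignment rather than a score attribute.

```python
def fd_to_pleco_score(card: dict) -> dict:
    """Map one Flashcards Deluxe record to Pleco score attribute values.

    The keys of `card` are hypothetical names for the exported fields;
    `last_review` must already be in Unix epoch seconds.
    """
    attrs = {
        # Interval in hours * 100/24 (rounding is an assumption).
        "score": str(round(card["interval_hours"] * 100 / 24)),
        "difficulty": "100",
        # A streak of 5 becomes "66666".
        "history": "6" * card["streak"],
        "correct": str(card["correct_count"]),
        "incorrect": str(card["review_count"] - card["correct_count"]),
        "reviewed": str(card["review_count"]),
        "firstreviewedtime": "0",
        "lastreviewedtime": str(card["last_review"]),
    }
    # Status: 2 = Active, 3 = Exclude; Pending (0) and New (1) set nothing.
    if card["status"] == 2:
        attrs["forcepool"] = "include"
    elif card["status"] == 3:
        attrs["forcepool"] = "exclude"
    return attrs

# Hypothetical record: 72-hour interval, 5 reviews, 4 correct, streak of 3.
example = fd_to_pleco_score({
    "status": 2, "review_count": 5, "correct_count": 4, "streak": 3,
    "interval_hours": 72, "last_review": 1336294800,
})
print(example["score"], example["incorrect"], example["history"])  # 300 1 666
```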

If you export a list of flashcards to XML, the format should be pretty clear. But obviously this is rather technical, and it would probably be best if we found a way to support both this format and Anki ourselves. In general I'd rather wait and do that later in the year when we get our flashcard system database format revamped, though: that would make it easier for us to store values we don't know how to use in a "custom field", from which they could be retrieved later by an app that you imported data from (thus potentially facilitating two-way conversion).

doctorj · May 17, 2012

Thanks. I just had to check up on how epoch seconds work, but now that I understand that, I think I can push the flashcards across.

Thanks for your help.

mikelove · May 19, 2012

doctorj said:
Thanks. I just had to check up on how epoch seconds work, but now that I understand that, I think I can push the flashcards across.

Thanks for your help.

No problem! Let me know if you run into any more issues / questions with this.

doctorj · May 21, 2012

When creating the XML file, there is certain data that I don't know. Can I skip those lines in the XML, or do I need to insert nominal values?
When I look at an exported card in XML format, I get this:

<cards>
  <card language="chinese" created="1330089941" modified="1330090777">
    <entry>
      <headword charset="sc">马</headword>
      <headword charset="tc">馬</headword>
      <pron type="hypy" tones="numbers">Ma3</pron>
      <defn>horse(name)</defn>
    </entry>
    <dictref dictid="PCED" entryid="18526464"/>
    <catassign category="Animals"/>
    <scoreinfo scorefile="Default" score="5929" difficulty="124" history="666" correct="3" incorrect="0" reviewed="3" sincelast="0" firstreviewedtime="1330893392" lastreviewedtime="1333192498"/>
  </card>
</cards>

The things that I know how to fill in are:
card language, headword, pronunciation, definition, category, score, correct, incorrect, reviewed, firstreviewedtime and lastreviewedtime.

However, I am not so sure about the following:
created, modified: I don't know these dates.
entryid: I assume Pleco will generate this.
history: I don't have this data. Can I leave it blank, or does it need to match up with the reviewed count?
sincelast: I am not sure what this means.

Can you advise please?
Thanks in advance for your help.

mikelove · May 22, 2012

doctorj said:
created, modified: I don't know these dates.

Probably best if you just set them to the current date in that case. It's in Unix time format, this website will give you the current value.
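
For reference, Unix time can also be produced with a couple of lines of Python. This is just a sketch using the standard library; treating the sample date as UTC is an assumption, so adjust the time zone if your exported dates are local times.

```python
import time
from datetime import datetime, timezone

# Current time as Unix epoch seconds, usable for "created" and "modified".
now = int(time.time())

# A known date/time converted to epoch seconds (treated as UTC here).
stamp = int(datetime(2012, 5, 22, 0, 0, tzinfo=timezone.utc).timestamp())
print(stamp)  # 1337644800
```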

doctorj said:
entryid I assume Pleco will generate

Yes, in fact you should skip the entire <dictref> tag in this case.

doctorj said:
history: I don't have this data. Can I leave it blank, or does it need to match up with the reviewed count?

It should be OK if you leave it blank - it's built into the design of the software that the history and the reviewed count don't always sync up (history has a limited storage capacity, since it's really only used for "in a row" stuff). You might, however, want to set it to a number of 6s equal to the current "streak" so that correct-in-a-row calculations at least will match up.

doctorj said:
sincelast - I am not sure what this means.

It's an obscure value mainly used for our "manual" scoring system - you can safely leave it at 0.
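
Putting these answers together, a card element could be generated along the following lines. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `build_card` helper and its parameters are made up for illustration. It only reproduces the `<cards>` portion shown in the exported snippet, so compare the result against one of your own exports in case the real file has additional surrounding elements.

```python
import xml.etree.ElementTree as ET

def build_card(sc, tc, pinyin, defn, category, score_attrs, now):
    """Build one <card> element; `now` is the current Unix time in seconds."""
    card = ET.Element("card", language="chinese",
                      created=str(now), modified=str(now))
    entry = ET.SubElement(card, "entry")
    ET.SubElement(entry, "headword", charset="sc").text = sc
    ET.SubElement(entry, "headword", charset="tc").text = tc
    ET.SubElement(entry, "pron", type="hypy", tones="numbers").text = pinyin
    ET.SubElement(entry, "defn").text = defn
    # No <dictref>: the whole tag is skipped, as advised above.
    ET.SubElement(card, "catassign", category=category)
    # sincelast is left at 0; the other score attributes come from the caller.
    attrs = {"scorefile": "Default", "sincelast": "0", **score_attrs}
    ET.SubElement(card, "scoreinfo", attrs)
    return card

cards = ET.Element("cards")
cards.append(build_card(
    "马", "馬", "Ma3", "horse(name)", "Animals",
    {"score": "300", "difficulty": "100", "history": "666",
     "correct": "3", "incorrect": "0", "reviewed": "3",
     "firstreviewedtime": "0", "lastreviewedtime": "1336294800"},
    now=1337644800,
))
xml_text = ET.tostring(cards, encoding="unicode")
print(xml_text)
```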

BenJackson · Apr 10, 2019

mikelove said:
It should be OK if you leave history blank - it's built into the design of the software that the history and the reviewed count don't always sync up.

I used the information here to import Skritter data to Pleco back in January. I set 'history' to match the only streak data available from Skritter (which is "prevSuccess", i.e. did you get it right last time), so either a single "6" or a single "2".

For anyone trying this in the future: be sure to set history to at least as many "6"s as your imported "correct" count. It turns out that under New Test, Card Selection, Limit new cards, the "card is learned if" options "correct in a row" and "correct total" both use history. I expected "correct in a row" to be an issue, but I could never figure out why "correct total" didn't work as expected (it never showed me new cards). In hindsight, I guess these share a codepath internally (e.g. a SQLite WHERE GLOB test).
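
That rule could be written as a small helper like the one below. It is a sketch only: `history_for` is a made-up name, and because history is said earlier in the thread to have limited storage capacity (the limit is not stated), very large counts may not be stored in full.

```python
def history_for(correct: int, streak: int = 0) -> str:
    """Return a history string with at least `correct` 6s.

    If a longer streak is known, use that length instead.
    """
    return "6" * max(correct, streak)

print(history_for(3))     # 666
print(history_for(2, 5))  # 66666
```

Note that padding history this way means "correct in a row" will reflect the total correct count rather than the true streak.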

One other thing I did that really simplified the import: I first generated a plain file of the words I wanted to import, and imported them into Pleco normally. That way all of the usual import machinery created the cards and assigned definitions (and categories). Then my script only modified pleco_flash_scores_n (via a Python script with sqlite3).
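
The general shape of such a script might be something like the sketch below. It runs against a throwaway in-memory database rather than a real Pleco file: the table name follows the pleco_flash_scores_n pattern mentioned above, but every column name (`card`, `score`, `history`, `correct`, `reviewed`) is hypothetical, as is the sample data. Inspect the real schema first (for example with `.schema` in the sqlite3 shell) and work on a copy of the database.

```python
import sqlite3

# Stand-in database; all column names below are assumptions, not Pleco's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pleco_flash_scores_1 ("
    "card INTEGER PRIMARY KEY, score INTEGER, history TEXT, "
    "correct INTEGER, reviewed INTEGER)"
)
# Pretend these rows were created by importing a plain word list first.
conn.executemany("INSERT INTO pleco_flash_scores_1 (card) VALUES (?)",
                 [(1,), (2,)])

# Hypothetical statistics from the other app, keyed by card id.
imported = {
    1: {"score": 300, "correct": 4, "reviewed": 5},
    2: {"score": 100, "correct": 1, "reviewed": 3},
}

with conn:  # commits on success, rolls back on error
    for card_id, s in imported.items():
        conn.execute(
            "UPDATE pleco_flash_scores_1 "
            "SET score = ?, history = ?, correct = ?, reviewed = ? "
            "WHERE card = ?",
            (s["score"], "6" * s["correct"], s["correct"],
             s["reviewed"], card_id),
        )

rows = conn.execute(
    "SELECT card, score, history FROM pleco_flash_scores_1 ORDER BY card"
).fetchall()
print(rows)  # [(1, 300, '6666'), (2, 100, '6')]
```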

Shun · Apr 11, 2019

That's good to know, great work and analysis!
