I think you would like Pleco, or some other code, to go through all the sentences in a text and rate the text's difficulty for you, based on the flashcards already marked as learned in your database. Since most learners study the same couple of thousand words first and specialize only later, a script that rates texts by the six levels of the HSK 2.0 standard (5,000 basic words) should also be helpful to you.
If you know Python, I recommend having a look at the three threads below. They contain scripts that determine the HSK difficulty level of individual sentences and could quite easily be extended to assess the difficulty of entire texts made up of many sentences:
(continued from the thread "79,000 Chinese-English, French, German, Italian, Japanese, and Spanish sentences") Dear @leguan, I believe your current intention is to develop "difficulty" analysis tools as a first step. Using the current sentence contextual flashcards is a good way to evaluate...
Dear all, here is an archive containing translated Chinese sentences in the following language pairs, ready for importing into Pleco: Chinese-English 41,955 sentences Chinese-French 15,740 sentences Chinese-German 4,566 sentences Chinese-Italian 3,800 sentences...
The following algorithm by @Pierre Biannic runs much faster, though its ratings still disagree with those of my old script for many sentences (I will have to get to the bottom of this):
Dear all, I am pleased to share with you this Python script that allows to automatically generate sentences flashcards from the Tatoeba database, that I wrote based on a previous script from and with the help of @Shun . The main features are: Choice of translation language, and automatic...
A precondition for using these scripts is, of course, that the texts be available in unencrypted, DRM-free form.
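To give a rough idea of how a sentence-level rating could be extended to whole texts, here is a minimal Python sketch. It is not taken from any of the scripts in the threads above. The tiny word list, the function names, the "level 7" for words outside the list, and the 90% coverage rule are all my own assumptions for illustration. A real version would load the full HSK word list and would probably use a proper segmenter instead of the simple longest-match splitting used here.

```python
import math
import re

# Hypothetical toy word list (word -> HSK level); a real script would load
# the complete list of about 5,000 words from a file.
HSK_LEVELS = {
    "我": 1, "你": 1, "喜欢": 1, "学习": 1, "汉语": 1,
    "因为": 2, "但是": 2, "环境": 3, "保护": 4,
}
BEYOND_HSK = 7  # assumption: words not in the list count as "beyond HSK 6"


def segment(sentence, vocab):
    """Split a sentence by greedy forward longest match against the word list."""
    max_len = max(map(len, vocab))
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # Fall back to a single character if nothing longer matches.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def sentence_level(sentence, levels=HSK_LEVELS):
    """Rate a sentence by its hardest word; punctuation and Latin text are ignored."""
    words = [w for w in segment(sentence, levels) if re.match(r"[\u4e00-\u9fff]", w)]
    return max((levels.get(w, BEYOND_HSK) for w in words), default=0)


def text_level(text, levels=HSK_LEVELS, coverage=0.9):
    """Return the lowest level at which `coverage` of the sentences are readable."""
    sentences = [s for s in re.split(r"[。！？!?\n]+", text) if s.strip()]
    ratings = sorted(sentence_level(s, levels) for s in sentences)
    if not ratings:
        return 0
    return ratings[math.ceil(coverage * len(ratings)) - 1]


if __name__ == "__main__":
    sample = "我喜欢学习汉语。因为我喜欢环境保护。你学习汉语。"
    print(text_level(sample))
```

To rate a text against your own flashcards rather than the HSK list, the same structure should work if the dictionary is replaced by the words marked as learned in your database (for example, level 1 for every learned word), so that any unknown word pushes a sentence to the "beyond" level.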