Difference between revisions of "Statistics"
Revision as of 16:35, 15 April 2019
- 1 Introduction
- 2 Caption
- 3 Toolbar
- 4 Learning parameters
- 4.1 Date
- 4.2 First day
- 4.3 Period
- 4.4 Memorized
- 4.5 Memorized items
- 4.6 Memorized topics
- 4.7 Memorized/Day
- 4.8 Total
- 4.9 Items
- 4.10 Topics
- 4.11 Outstanding
- 4.12 Review
- 4.13 Protection
- 4.14 Retention
- 4.15 Measured FI
- 4.16 R-Metric
- 4.17 Alarm
- 4.18 Burden
- 4.19 Burden +/-
- 4.20 Avg Workload
- 4.21 Exp Workload
- 4.22 Time
- 4.23 Avg time
- 4.24 Total time
- 4.25 Lapses
- 4.26 Speed
- 4.27 Avg Cost
- 4.28 Exp Cost
- 4.29 Interval (I)
- 4.30 Interval (T)
- 4.31 Repetitions
- 4.32 Rep count
- 4.33 Last Rep (I)
- 4.34 Last Rep (T)
- 4.35 Next Rep (I)
- 4.36 Next Rep (T)
- 4.37 Pending
- 4.38 Dismissed
- 4.39 Average FI
- 4.40 Completion
- 4.41 A-Factor
- 4.42 Additional notes
- 5 Statistics context menu
The Statistics window allows you to inspect the main learning process statistics in the currently opened collection. It can most conveniently be viewed by pressing F5 (Window : Layout : Warrior layout).
- the daily and the monthly calendar of repetitions
- Statistics menu
- open the statistics context menu otherwise available with a right-click over the window
- various statistics related to the learning process and SuperMemo Algorithm
- Memory graphs
- visual presentation of memory functions, their approximations as used by Algorithm SM-18, and options for resetting and re-computing memory data on the basis of repetition histories
- Warrior layout
- arrange windows in the way optimized for the use of incremental reading with the Statistics window conveniently aligned to the left of the element window.
- view this help article
To easily compare the exemplary fields with their corresponding descriptions:
- Shift+click the picture above to open it in full resolution in a new browser window
- Drag the title bar of the picture window to the left or right side of the screen until an outline of the expanded window appears
- Release the mouse to expand the window
- Repeat steps 2 and 3 with this window to arrange the windows side by side
The current date and day of the week. If this value is preceded with Night, it means that the new calendar day has already started but the old repetition day will not start until the time defined in Toolkit : Options : Learning : Midnight shift. When the midnight shift is passed, this field will display a red warning Time to close: Alt+F4. If you see that message, close/restart your collection to prevent collecting learning data with undefined repetition timing. In the example above, the picture snapshot was taken after midnight on Apr 01, 2019 (Mon)
The date on which the learning process started (i.e. the day on which the first element was memorized). The exemplary collection presented in the picture has been in use since December 15, 1987 (i.e. two days after the birth date of SuperMemo for DOS)
The presented collection has been in use for 31 years, 3 months and 17 days
The total number of elements introduced into the learning process with options such as Learn or Remember. If an item takes part in repetitions it is a memorized item. It does not mean it is a remembered item. A proportion of memorized items is always forgotten. The presented collection has 635,699 elements in the learning process and these elements make up 100.0% of all elements destined to enter the learning process, i.e. Memorized/(Memorized+Pending)=100.0%. This indicates that Pending=0 (see below)
- Memorized items
- number of memorized items in the collection and the proportion of memorized items among memorized elements. In the example above, 203,827 items take part in repetitions. These items make 32.1% of all elements taking part in the learning process (the remaining 67.9% of elements are memorized topics, memorized concepts or memorized tasks). The Retention field indicates that 92.4937% of these items should be remembered at any given time
- Memorized topics
- number of memorized topics, concepts and tasks and their cumulative proportion among all memorized elements. In a well-balanced incremental reading process, topics should make a minority of elements served for review. If the proportion of topics increases, the retention drops, and the learning process may gradually start to resemble traditional learning where ineffective passive review predominates. You can store as many topics in your collection as you wish as long as you make sure that you limit their review by setting appropriate repetition sorting criteria (Learn : Sorting : Sorting criteria). In the picture, 431,872 topics make 67.9% of the material taking part in the learning process
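The proportions quoted for Memorized, Memorized items and Memorized topics can be checked with a few lines of arithmetic. The following is a minimal sketch using the counts from the example collection; the variable and function names are illustrative and do not come from SuperMemo.

```python
# Counts taken from the example collection described above.
memorized_items = 203_827
memorized_topics = 431_872  # topics, concepts and tasks counted together
pending = 0

# Memorized elements are the sum of memorized items and memorized topics.
memorized = memorized_items + memorized_topics


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


items_share = percent(memorized_items, memorized)
topics_share = percent(memorized_topics, memorized)
memorized_share = percent(memorized, memorized + pending)

print(memorized, items_share, topics_share, memorized_share)
```

The two counts add up to the 635,699 memorized elements, and the shares match the 32.1%, 67.9% and 100.0% shown in the example.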
- Memorized/Day
- number of items memorized per day: (Memorized items)/Day. In the example, an average of 17.8311 items has been memorized daily in the presented collection over the previous 31+ years. This is typical of an average student as long as regular reviews are executed on a daily basis
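The daily average can be approximated from the First day and Date fields. This sketch assumes that both the first day and the current day are counted in the period; that counting rule is an assumption made for illustration, not something the help text states.

```python
from datetime import date

# Values from the example collection.
first_day = date(1987, 12, 15)
today = date(2019, 4, 1)
memorized_items = 203_827

# Assumption: the period includes both the first and the current day.
days_in_use = (today - first_day).days + 1

# (Memorized items)/Day
memorized_per_day = memorized_items / days_in_use

print(days_in_use, round(memorized_per_day, 4))
```

Under that assumption the period is 11,431 days long and the result rounds to the 17.8311 items per day shown in the example.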
- Total
- number of items, topics, concepts and tasks in the collection. Two relationships hold true:
- Total=Memorized+Pending+Dismissed
- Total=Topics+Items
- Deleted elements do not contribute to the total count of elements in the collection. In the picture, the presented collection is made of 727,259 elements (largest collections reported by users reached beyond a half million elements)
- number of topics, concepts and tasks in the collection. The presented collection includes 519,816 topics (counted together with concepts and tasks)
- number of outstanding items, outstanding topics and final drill items scheduled for repetition on this given day. The first number (before +) indicates the number of items scheduled for this given day and not yet processed. The second number (after the plus sign) indicates the number of topics scheduled for review for this day. The third number (after the second plus sign), if present, indicates the number of items that have already been repeated today but scored less than Good (4). Those are the items that make up the final drill queue. The final drill queue is built only if Skip final drill is unchecked in Toolkit : Options : Learning. In the presented collection, there are still 3521 items scheduled for repetition on Apr 01, 2019. There are also 1297 topics scheduled for review on that day as part of the incremental reading process. There are no elements in the final drill queue (the third component of the Outstanding parameter is missing)
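The two- or three-part value of the Outstanding field can be split into its components as described above. The sketch below assumes the field is available as plain text such as "3521+1297"; the exact text format and the names `Outstanding` and `parse_outstanding` are hypothetical.

```python
from typing import NamedTuple


class Outstanding(NamedTuple):
    items: int        # items scheduled for today and not yet processed
    topics: int       # topics scheduled for review today
    final_drill: int  # items graded below Good (4) today; 0 if absent


def parse_outstanding(text: str) -> Outstanding:
    """Split an Outstanding value such as '3521+1297' into its parts."""
    parts = [int(part) for part in text.split("+")]
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"expected 2 or 3 numbers, got {len(parts)}")
    # The third component is shown only if a final drill queue exists.
    final_drill = parts[2] if len(parts) == 3 else 0
    return Outstanding(parts[0], parts[1], final_drill)


print(parse_outstanding("3521+1297"))
```

A missing third component is reported as 0, which corresponds to an empty final drill queue in the example.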
- number of elements scheduled for subset review (e.g. elements in the neural review in Learn : Go neural, elements in branch repetitions in Contents' Learn, elements in browser subset repetitions in the browser's Learn, elements in the random test queue in Toolkit : Random test, etc.). The display may have a form of Neuro=<elements to do> in neural review, or <items to do>+<topics to do>+<pending to do>+(<subset description>) in subset review, or <elements unprocessed>/<all elements in the test> in random tests. Here 86 elements remain in neural review. Neural review is most often executed with Ctrl+F2 in the element window or browser, or Alt+N in the registry window
Current day's degree of processing of the top priority material. Important: As the statistic is taken from the top of the queue of outstanding items or topics (not the top of outstanding queue, which is a randomized mix of the two), if you change the priority of the top item, you will see a false value in Statistics until you review that item of changed priority (this behavior is by design to prevent the need to scan the entire queue at each update to statistics). In the example, only 0.031% of top priority items, and 0% of top priority topics have been processed. 0.031% protection does not mean going through 0.031% of the outstanding items queue. It means that the highest priority of unprocessed items in the queue is 0.031%
- Retention
- estimated average knowledge retention in the collection. Retention for high-priority items should be higher than the one listed. Retention for low-priority items may be much lower, driving the average down. To judge the retention of top-priority material, see Toolkit : Statistics : Analysis : Graphs : Forgetting index vs. Priority. In the example, 92.8374% of the material should be recalled in a random test on all elements in the collection at any time. You can test your retention using random tests and see if SuperMemo's estimates are accurate. This statistic may be overly optimistic if you have recently abused rescheduling tools such as Postpone or Mercy.
- Measured FI
- value of the measured forgetting index as recorded during repetitions. The number in the parentheses indicates Measured FI for the day. In collections with heavy item overload, measured forgetting index may be much lower than the overall forgetting index for the entire collection due to the fact that repetitions include primarily high-priority material. It can also be lower than the requested forgetting index when transitioning from more randomized to more prioritized sorting (as determined by sorting criteria) or when knowledge formulation and mnemonic skills improve before this fact can be reflected in the forgetting curve. It is also not uncommon to have Measured FI higher than Average FI. This is due to three factors:
- every user will experience delays in repetitions from time to time (e.g. as a result of using Postpone),
- low-priority material in the overloaded incremental reading process is scheduled in intervals longer than optimum intervals, and
- SuperMemo imposes some constraints on the length of intervals that, in some cases, make it schedule repetitions later than it would be implied by the forgetting index. The constraints in computing intervals, for example, prevent the new interval from being shorter than the old interval (assuming the item has not been forgotten). For low values of the forgetting index and for difficult items, the new optimum interval might often be shorter than the old one!
- Measured FI can be reset with Toolkit : Statistics : Reset parameters : Forgetting index record.
- In the presented example, an average of 13.97% of item repetitions end with a grade less than Pass (3) (since the measured forgetting index record has last been reset). On May 04, 2016, 3.8% of repetitions ended in failure thus far (i.e. with a grade less than Pass).
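The measured forgetting index is the proportion of item repetitions graded below Pass (3). A minimal sketch of that ratio follows, assuming grades are available as a list of integers; the function name and the sample grades are illustrative only.

```python
PASS_GRADE = 3  # grades below Pass (3) count as failures


def measured_fi(grades: list[int]) -> float:
    """Return the percentage of repetitions graded below Pass (3)."""
    if not grades:
        raise ValueError("at least one repetition is required")
    failures = sum(1 for grade in grades if grade < PASS_GRADE)
    return 100.0 * failures / len(grades)


# Hypothetical grades from ten repetitions; two of them are failures.
print(measured_fi([5, 4, 2, 3, 1, 4, 5, 3, 4, 4]))
```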
- R-Metric
- absolute measure of performance of two spaced repetition algorithms based on their ability to predict recall before a grade is scored. In SuperMemo 17, R-Metric is used solely to compare Algorithm SM-15 (known from SuperMemo 16) and the new Algorithm SM-17. It is shown as a percentage in Statistics and in Toolkit : Statistics : Analysis : Use : Efficiency : R-Metric. R-Metric is the difference between the performance of the two algorithms:
R-Metric=LSRM(Algorithm SM-15)-LSRM(Algorithm SM-17)
LSRM is the least squares predicted recall measure for a given algorithm. R-Metric greater than zero shows superiority of Algorithm SM-17. R-Metric less than zero indicates underperformance of the new algorithm.
LSRM is the square root of the average of squared differences between the actual and the predicted recall:
LSRM=sqrt(avg(sqr(Recall-PredictedRecall)))
Recall is 1 for passing grades (Grade>=3) and 0 for failing grades (Grade<3).
PredictedRecall is a prediction issued by the algorithm before the repetition. In Algorithm SM-17, the prediction is a weighted average of the value taken from the Recall matrix, and R (retrievability) computed from S (stability) and the used interval:
PredictedRecall=weight*(Recall matrix value)+(1-weight)*R
weight (0..1) depends on the number of prior repetition cases, which indicate the significance of the Recall matrix prediction (the prediction becomes more meaningful with more prior repetition data).
- In this case, R-Metric of 17.0302% shows a huge advantage of Algorithm SM-17 over Algorithm SM-15 on that particular day (May 4, 2016)
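The R-Metric definition can be sketched in code as follows. This is an illustration of the description above, not SuperMemo's implementation: the function names are hypothetical, the weight is passed in rather than derived from prior repetition cases, and the result is a fraction (multiply by 100 for the percentage shown in Statistics).

```python
import math

PASS_GRADE = 3


def recall(grade: int) -> float:
    """Recall is 1 for passing grades and 0 for failing grades."""
    return 1.0 if grade >= PASS_GRADE else 0.0


def predicted_recall(matrix_value: float, retrievability: float,
                     weight: float) -> float:
    """Weighted average of the Recall matrix value and R.

    Assumption: weight (0..1) applies to the Recall matrix value.
    """
    return weight * matrix_value + (1.0 - weight) * retrievability


def lsrm(grades: list[int], predictions: list[float]) -> float:
    """Square root of the average squared recall prediction error."""
    if not grades or len(grades) != len(predictions):
        raise ValueError("grades and predictions must match and be non-empty")
    errors = [(recall(g) - p) ** 2 for g, p in zip(grades, predictions)]
    return math.sqrt(sum(errors) / len(errors))


def r_metric(grades: list[int], old_predictions: list[float],
             new_predictions: list[float]) -> float:
    """Positive values favor the algorithm behind new_predictions."""
    return lsrm(grades, old_predictions) - lsrm(grades, new_predictions)


# Hypothetical data: four repetitions and two sets of predictions.
grades = [4, 2, 5, 3]
print(r_metric(grades, [0.5, 0.5, 0.5, 0.5], [0.9, 0.1, 0.9, 0.9]))
```

In this made-up sample the older predictions are always 0.5 and the newer ones are off by 0.1 each time, so the newer algorithm scores the lower LSRM and the R-Metric is positive.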
The time left till the next alarm and the hour at which the alarm will go off (to learn more about alarms see: Plan). This field is editable. To change the alarm setting, click the field and type in the new time in minutes (e.g. 21.5 will set the alarm to sound in 21 minutes and 30 seconds). To end editing, press Enter. In the example, the alarm will sound in 20 minutes and 21 seconds at 00:52:13 (i.e. 52 minutes after midnight)
- estimation of the average number of items and topics repeated per day. This value is equal to the sum of all interval reciprocals (i.e. 1/interval). The interpretation of this number is as follows: every item with interval of 100 days is on average repeated 1/100 times per day. Thus the sum of interval reciprocals is a good indicator of the total repetitions workload in the collection. The presented collection requires 271 item repetitions per day and 555 topic reviews per day. In incremental reading, it is not unusual to have many more elements in the process than one can handle. Auto-postpone can be used to unload the excess of topics as well as to reduce the load of low-priority items. Postpone skews the Burden statistic. Topics often crowd at lower intervals and are regularly reshuffled with Postpone or Auto-postpone
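The sum of interval reciprocals can be illustrated with a short sketch. The list of intervals is hypothetical input; how SuperMemo stores intervals internally is not covered here.

```python
def burden(intervals_in_days: list[float]) -> float:
    """Sum of 1/interval over all elements: expected repetitions per day."""
    if any(interval <= 0 for interval in intervals_in_days):
        raise ValueError("intervals must be positive")
    return sum(1.0 / interval for interval in intervals_in_days)


# 100 items, each with an interval of 100 days, add up to about
# one repetition per day.
print(burden([100.0] * 100))
```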
- Burden +/-
- change of the Burden parameter above on a given day. Here, on May 04, 2016, the average number of expected daily repetitions decreased slightly (i.e. by 8 items). The topic load also decreased (i.e. by almost 90 topics). Exemplary interpretation of a burden change: Let's say Burden dropped by 39 (burden change of -39). To reduce the burden by 39, one would need to review 78 elements with an interval increase from 1 to 2 days (78*0.5=39). However, one could equally well execute Postpone on 2344 elements with an interval increase from 10 to 12 days (2344*(1/10-1/12)≈39)
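The two scenarios in the burden change example can be reproduced with the reciprocal rule. This is a minimal sketch; `burden_change` is an illustrative helper, not a SuperMemo function.

```python
def burden_change(count: int, old_interval: float,
                  new_interval: float) -> float:
    """Change in Burden when `count` elements move between intervals."""
    return count * (1.0 / new_interval - 1.0 / old_interval)


# 78 elements reviewed, intervals growing from 1 to 2 days.
print(burden_change(78, 1, 2))

# 2344 elements postponed, intervals growing from 10 to 12 days.
print(round(burden_change(2344, 10, 12), 2))
```

Both calls give a change of roughly -39, which shows why Postpone on many long-interval elements can move Burden as much as genuine review of a few short-interval ones.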
- Avg Workload
- average time spent on responding to questions per day (from the first day of learning). For the presented collection, the student spent on average 18 minutes and 30 seconds per day on answering items over the period of 28 years, 4 months and 20 days between December 15, 1987, when the very first repetition was made, and May 04, 2016.
- Exp Workload
- estimation of the average daily time used for responding to questions in a given collection.
- In the presented collection, 271 item repetitions per day taking 9.922 seconds each result in a daily repetition time estimated at 44 minutes and 52 seconds. The real learning time may be twice as long due to grading, editing, reviewing the collection and various interruptions. In incremental reading, the learning time will increase further due to topic review that is not taken into account in the Exp Workload parameter. The real learning time may also be cut if Postpone is used often
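The estimate can be approximated as the item Burden multiplied by Avg time. The sketch below uses the rounded values visible in the example; the helper names are illustrative.

```python
def expected_workload_seconds(item_burden: float,
                              avg_time_seconds: float) -> float:
    """Expected daily response time: repetitions per day x seconds each."""
    return item_burden * avg_time_seconds


def format_min_sec(seconds: float) -> str:
    """Format a duration in seconds as 'M min S s'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"


workload = expected_workload_seconds(271, 9.922)
print(format_min_sec(workload))
```

With the rounded Burden of 271 the result falls a few seconds short of the 44 minutes and 52 seconds displayed in the window, presumably because the displayed Burden value is itself rounded.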
- total question response time on a given day and the total session time (in parentheses). Here the total time needed to respond to questions on May 04, 2016 was 14 minutes and 31 seconds. On the same day, SuperMemo has been running for 3 hours, 19 minutes and 1 second (this value will increase even if you simply keep SuperMemo running)
- Avg time
- average response time in seconds. This is the time that elapses between displaying the question (or equivalent) and choosing Show answer (or equivalent). The timer does not stop if you start editing the question before pressing Show answer. In the presented collection, the average time to answer a single question is around 9.922 seconds. If this number grows beyond 15-20 seconds, you may need to check whether your learning material is overly difficult or badly structured
- Total time
- total time taken by responding to questions in the collection. This time cannot be accurately measured for collections created with SuperMemo 98 or earlier (the measurements were made possible only in SuperMemo 99). If you upgrade older collections, this number will be roughly estimated for you. SuperMemo will derive this time from the total number of items, average number of repetitions, average number of lapses, and the average repetition time. In the presented example, answering questions during repetitions took a total of over 133 days in over 28 years of learning
- average number of times individual items have been forgotten in the collection (only memorized elements are averaged). The number in parentheses shows the number of lapses on a given day. Here an average element has been forgotten 0.51239 times. On May 04, 2016, 3 items have thus far been graded less than Pass (3)
- average knowledge acquisition rate, i.e. the number of items memorized per year per minute of daily work (only answering item questions counts). Initially this value may be as high as 100,000 items/year/minute (esp. if you enthusiastically start working with the program before truly measuring its limitations, and the limitations of human memory). This parameter should later stabilize between 40 and 400 items/year/minute.
- Speed=(Memorized items/Day)/(Repetitions time)*365
- In the presented collection, every minute of work per day resulted in 342 new items memorized each year
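The Speed formula can be written out as a small function. The inputs below are hypothetical round numbers chosen for illustration, and the repetition time is assumed to be expressed in minutes per day.

```python
def speed(memorized_items_per_day: float,
          repetition_minutes_per_day: float) -> float:
    """Speed=(Memorized items/Day)/(Repetitions time)*365."""
    if repetition_minutes_per_day <= 0:
        raise ValueError("repetition time must be positive")
    return memorized_items_per_day / repetition_minutes_per_day * 365


# Hypothetical student: 10 new items a day with 20 minutes of daily work.
print(speed(10, 20))
```

The hypothetical student lands at 182.5 items/year/minute, inside the 40 to 400 range quoted above.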
- Avg Cost
- cost in time of memorizing a single item, i.e. total learning time divided by the number of memorized items.
- In the presented example, the total repetition time per single item is 1 minute and 3 seconds. This is how much an average item has contributed to the total of over 133 days of non-stop repetitions. The cost of editing, collection restructuring, incremental reading, etc. is not included in the Avg Cost parameter
- Exp Cost
- daily repetition time per each newly memorized item assuming no postpones.
- In the presented collection, each of the 18 newly memorized items per day contributes 2 minutes and 23 seconds of repetitions to the total workload of almost 45 minutes per day. As this value is derived from Burden, it may be highly overestimated if you use Postpone a lot (e.g. in incremental reading)
- Interval (I)
- average interval among memorized items in the collection. Here an average memorized item has reached the inter-repetition interval of 7 years, 9 months and 30 days
- Interval (T)
- average interval among memorized topics in the collection. Here an average memorized topic has reached the inter-repetition interval of 6 years, 1 month and 9 days
- average number of repetitions/reviews per memorized item (I) and topic (T) in the collection. Here an average item has been repeated 3.272 times while an average topic has been reviewed 2.431 times
- Rep count
- total count of item repetitions made in the collection. In the presented collection, over 945 thousand item repetitions have been made. This is about 5 repetitions per memorized item. That includes repetitions of items that have been reset, forgotten, dismissed, deleted, etc.
- Last Rep (I)
- average date of the last repetition among memorized items in the collection. Here the average date of the last repetition is September 17, 2008
- Last Rep (T)
- average date of the last review among memorized topics in the collection. Here the average date of the last review is June 21, 2010
- Next Rep (I)
- average date of the next repetition among memorized items in the collection.
- Here the average date of the next repetition is July 19, 2016 or 2,863 days after September 17, 2008
- Next Rep (T)
- average date of the next review among memorized topics in the collection.
- Here the average date of the next review is July 30, 2016 or 2,232 days after June 21, 2010
- the number of elements (topics or items) that have not yet been introduced into the learning process and await memorization (with operations such as Learn, Remember, Schedule, etc). All pending elements are kept in the so-called pending queue that determines the sequence of learning new elements. Dismissed elements are not kept in the pending queue. In the example, the collection contains no pending elements. With incremental reading, the role of the pending queue in SuperMemo is diminishing
- number of elements (items, topics, concepts or tasks) that have been excluded from the learning process and are kept only as reference material, folders in the knowledge tree, or tasklist elements. Dismissed items are neither pending nor memorized. All tasks are dismissed by default, i.e. they usually do not take part in repetitions. In the example, almost 77,000 elements have been dismissed
- Average FI
- average requested forgetting index in the entire collection (the number in parentheses is the default forgetting index). If the forgetting index of individual elements is not changed manually, Average FI is equal to the default forgetting index as set in Toolkit : Options : Learning : Forgetting index (default). The default forgetting index is the requested forgetting index given to all new items added to the collection. Forgetting index, in general, is the proportion of items that are not remembered during repetitions. The lower the value of the forgetting index the better the recall of the element, but the more repetitions will be needed to keep it in memory. Optimum value of the forgetting index falls into the range from 7% to 13%. Too low a forgetting index makes learning too tiresome due to a prohibitively large number of repetitions. All elements can have their desired forgetting index set individually. The easiest way to change the forgetting index of a large number of elements is to use Forgetting index option among subset operations. In the presented example, the average forgetting index is 10.00% while the default forgetting index is 10%. See: Using forgetting index
- Completion
- expected date on which all elements from the pending queue will be memorized assuming the present rate of learning new items. This parameter is particularly useful if you are memorizing large ready-made collections such as Advanced English. For Pending=0, this field shows today's date.
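A rough version of this projection divides the pending count by the daily learning rate. The sketch below is an assumption-based illustration: the rounding up to whole days and the function name `completion_date` are not taken from SuperMemo.

```python
import math
from datetime import date, timedelta


def completion_date(today: date, pending: int,
                    memorized_per_day: float) -> date:
    """Project the day on which the pending queue would be exhausted."""
    if pending == 0:
        return today  # nothing left to memorize
    if memorized_per_day <= 0:
        raise ValueError("learning rate must be positive")
    # Assumption: partial days are rounded up to a full day.
    days_needed = math.ceil(pending / memorized_per_day)
    return today + timedelta(days=days_needed)


# Hypothetical queue: 1000 pending elements learned at 20 per day.
print(completion_date(date(2019, 4, 1), 1000, 20))

# The example collection has no pending elements.
print(completion_date(date(2019, 4, 1), 0, 17.8311))
```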
- average value of A-Factor among memorized items (I) and topics (T) in the collection. For items, A-Factor is a measure of difficulty in Algorithm SM-15. The higher the A-Factor, the easier the item. For topics, A-Factor is the number by which the current interval should be multiplied to get the value of the next interval. In the presented collection, the average A-Factor for items is 4.06. This indicates that the collection is rather well-structured and the material is thus relatively easy to remember. The average A-Factor for topics is 1.246
- Items are added to the final drill not only during standard repetitions when you grade an element below Good (4). Operations such as Remember (Ctrl+M), Cloze (Alt+Z), and Add to drill (Shift+Ctrl+D) will also add to the final drill queue. The final drill queue is created automatically only if you uncheck Toolkit : Options : Learning : Skip final drill
- Some fields of the Statistics window can be edited. For example: Alarm, Total time, Rep count, etc. To edit an entry, click it, type the new value and press Enter. If the entry cannot be modified SuperMemo will warn you (e.g. "Retention entry cannot be modified").
- See Survey 1994 and Survey 1999 for some interesting notes about the speed of learning reached with SuperMemo
To open the context menu:
- Right-click anywhere in the window
- Click the first button in the toolbar
Context menu items: