SuperMemo Algorithm
== Algorithm SM-18 ==


SuperMemo has had a fertile impact on research into the modeling of long-term memory. One of the most interesting fruits of that research is the [http://www.super-memory.com/english/2vm.htm two component model of long-term memory]. Over the last two decades, it has been hinted on many occasions that the model may provide a sound theoretical basis for a better and more universal approach to [[Glossary:Spaced_repetition|spaced repetition]]. Algorithm SM-18 is the most up-to-date implementation of the theory in a practical application; its direct predecessor, Algorithm SM-17 (Alg-SM17 or just SM17 in short), is described in detail below. Due to its universal nature, the algorithm should, in the future, find its way to all SuperMemo products.


<div class="bs-callout bs-callout-warning">
For details see: [https://supermemo.guru/wiki/Algorithm_SM-18 Algorithm SM-18]
'''Important!''' At the moment of writing (April 2016), [[What's new in SuperMemo 17?|SuperMemo 17]] does not use incremental adjustments to optimization matrices in [[SuperMemo Algorithm|Algorithm SM-17]]. This is why you should execute '''[[Tools menu|Tools]] : [[Tools menu#Memory|Memory]] : 4D Graphs : Stability : Compute''' from time to time to adjust the algorithm to newly available data. In the future, the adjustments will be made at each [[Glossary:Repetition|repetition]].
</div>


== Two component model ==


The two component model of long-term memory underlies Algorithm SM-17. The model asserts that two variables are sufficient to describe the status of unitary memory in a healthy human brain:
* [[Glossary:Stability|stability]] - this variable determines how long a memory can last if undisturbed and if not retrieved
* [[Glossary:Retrievability|retrievability]] - this variable determines the probability of retrieving a memory at any given time since the last review/recall


In the literature on memory, the ill-defined term ''memory strength'' often stands for either stability (S) or retrievability (R). This results in a great deal of confusion that slows down the progress of memory research. In 1995 (and earlier), we [http://www.super-memory.com/english/2vm.htm proved] theoretically that the two variables R and S are sufficient to describe the status of memory in [[Glossary:Spaced_repetition|spaced repetition]]. Algorithm SM-17 is the ultimate practical proof of the correctness of the [http://www.super-memory.com/english/2vm.htm two component model].


== Historical note: earlier releases of the algorithm ==


Although the presented algorithm may seem complex, you should find it easier to follow its steps once you understand the evolution of individual concepts such as [[Glossary:E-Factor|E-Factor]], matrix of optimum intervals, optimum factors, [[Glossary:Forgetting curve|forgetting curves]], mid-interval repetition, stability, retrievability, etc.
*'''1985''' - Paper-and-pencil version of SuperMemo is formulated ([http://super-memory.com/english/ol/beginning.htm#Algorithm Algorithm SM-0]). Repetitions of whole pages of material proceed along a fixed table of intervals. See also: [http://super-memory.com/articles/paper.htm Using SuperMemo without a computer]
*'''1987''' - [http://super-memory.com/articles/soft/sm2.htm First computer implementation] makes it possible to divide material into individual items. Items are classified into difficulty categories by means of [[Glossary:E-Factor|E-Factor]]. Each difficulty category has its own optimum spacing of repetitions ([http://super-memory.com/english/ol/sm2.htm Algorithm SM-2])
*'''1989''' - SuperMemo 4 was able to modify the function of optimum intervals depending on the student's performance ([http://super-memory.com/english/ol/sm4.htm Algorithm SM-4]). This was then the first algorithm in which the function of optimal intervals was adaptable
*'''1989''' - SuperMemo 5 replaced the matrix of optimum intervals with the matrix of optimal factors (an optimum factor is the ratio between successive intervals). This approach accelerated the adaptation of the function of optimum intervals ([http://super-memory.com/english/ol/sm5.htm Algorithm SM-5])  
*'''1991''' - [http://super-memory.com/articles/soft/sm6.htm SuperMemo 6] derived optimal factors from [[Glossary:Forgetting curve|forgetting curves]] plotted for each entry of the matrix of optimum factors. This could dramatically speed up the convergence of the function of optimum intervals to its ultimate value ([http://super-memory.com/english/ol/sm6.htm Algorithm SM-6]). This was then the first adaptable algorithm that would use regression to find the best fit to the actual memory performance data. Unlike SuperMemo 5, which could keep converging and diverging depending on the quality of the learning material and the learning process, SuperMemo 6 would get closer to the student's ultimate memory model with each day of learning
*'''1995''' - [http://super-memory.com/articles/soft/sm8.htm SuperMemo 8] capitalized on data collected by users of [http://super-memory.com/articles/soft/sm6.htm SuperMemo 6] and [http://super-memory.com/articles/soft/sm7.htm SuperMemo 7] and added a number of improvements that strengthened the theoretical validity of the function of optimum intervals and made it possible to accelerate its adaptation, esp. in the early stages of learning ([http://super-memory.com/english/algsm8.htm Algorithm SM-8]). New concepts:
**replacing [[Glossary:E-Factor|E-Factors]] with absolute difficulty factors: [[Glossary:A-Factor|A-Factors]]. Item difficulty was thus defined in terms of actual properties of human memory, and would not depend on the average difficulty of the learning material
**fast approximation of [[Glossary:A-Factor|A-Factors]] from the '''First Grade vs. A-Factor''' correlation graph and the '''Grade vs. Forgetting index''' graph. This makes it possible to quickly guess an item's difficulty before more data is available
**real-time adjustment of the matrix of optimal factors based on the power approximation of the decline of optimum factors
*'''2002''' - SuperMemo 2002 introduced the first SuperMemo algorithm that is resistant to interference from delay or advancement of repetitions. This makes it possible to safely delay repetitions ('''[[Postpone]]''') or advance repetitions ([[Subset learning|'''Review''']]):
**accounting for delayed repetitions by introducing the concept of repetition categories
**accounting for advanced repetitions by introducing O-Factor decrements derived from the concept of the ''spacing effect''
*'''2005''' - [http://www.freewarefiles.com/SuperMemo_program_13849.html SuperMemo 2004] introduced boundaries on the A and B parameters computed from the '''Grade vs. Forgetting Index''' data. This plugs a weakness in the algorithm that showed when importing repetitions from other applications (e.g. the open source MemAid). If a large number of easy repetitions occurred at unnaturally long intervals (as after pre-training with another application), the graph might show reversed monotonicity that would temporarily affect the estimation of [[Glossary:A-Factor|A-Factors]] (the speed of self-correction would be inversely proportional to the amount of flawed data). When boundaries are imposed, self-correction is instant, and the accuracy of [[Glossary:A-Factor|A-Factor]] estimation increases with each repetition executed in SuperMemo
*'''2011 '''- with [https://supermemo.guru/wiki/Algorithm_SM-15 Algorithm SM-15], SuperMemo 15 eliminated two weaknesses of Algorithm SM-11 that would show up in heavily overloaded collections with very large item delays:
** [[Glossary:U-Factor|U-Factors]] now allow of correctly interpreting repetition delays of up to 15 years (previously 60 days only)
** [[Glossary:Forgetting curve|forgetting curves]] are now corrected for repetition delays beyond the maximum registered [[Glossary:U-Factor|U-Factor]] value (preventing failed grades in delayed repetitions from decreasing the estimates of the [[Glossary:Optimum interval|optimum interval]] for standardly-scheduled items in the same category)
*'''2015''' - [https://supermemo.guru/wiki/Algorithm_SM-17 Algorithm SM-17] is the successor to all prior versions of the algorithm in future SuperMemos. It passed all important benchmarks and became part of SuperMemo 17 in 2016
*'''2018''' - [https://supermemo.guru/wiki/Algorithm_SM-18 '''Algorithm SM-18'''] is the newest version of the spaced repetition algorithm introduced in SuperMemo 18. The new algorithm improves the way [https://supermemo.guru/wiki/Item_difficulty_in_Algorithm_SM-18 item difficulty is estimated]


== Past vs. Future ==


All SuperMemo algorithms before the year 2015 took one essential assumption that limited their universal applicability in learning: the review of the learning material should occur at the optimum timing determined by SuperMemo. Algorithms employed in SuperMemo 11 through 16 included weak heuristics that would allow of '''repetition at any time''' without a noticeable detriment to learning. In particular, those heuristics would help overcome the impact of the [[Glossary:Spacing_effect|spacing effect]] where frequent reviews lead to less impact or even weakening of memories. New research indicates that the applicability of those heuristics was limited to the less extreme departures from the optimum schedule. The new algorithm, in theory, should make it possible to optimize review for any [[repetition history]] and any repetition timing. It should also make it possible to simulate the status of memory produced with different variants of spaced repetition algorithms (e.g. those used by SuperMemo in the remote past).

=== Weaknesses of Algorithm SM-15 ===

* lack of the retrievability dimension in mapping the function of optimum intervals
* weak spacing effect heuristic (e.g. performing poorly in highly massed repetitions)
* memory-less difficulty determination method (new [[Glossary:A-Factor|A-Factor]] would be derived from the current [[Forgetting_index#estimated_forgetting_index|estimated forgetting index]] and the O-Factor matrix without taking into account past difficulty estimates and [[repetition history]])
* incremental changes to [[Glossary:Difficulty|item difficulty]] would produce nice-looking difficulty distributions that would not truly reflect actual item difficulty
* heterogeneous streams of [[Glossary:Item|items]] passed via the optimization matrices resulted in clusters of "polluted" entries
* relying on the best fit approximation of [[Glossary:O-Factor|optimum factors]] would not reflect variations caused by user strategy, or item heterogeneity
* lack of filtering for the power law in heterogeneous [[Glossary:Forgetting_curve|forgetting curves]]. In particular, the first review forgetting curve is better approximated with a power function, which shows only in [[Glossary:Collection|collections]] with substantial first review delay (beyond 100-200 days)
* excess reliance on the weak correlation between [[Glossary:Grade|grades]] and the [[Forgetting_index#estimated_forgetting_index|estimated forgetting index]]
* weak first grade vs. A-Factor correlation. For a flat correlation only extreme values of [[Glossary:A-Factor|A-Factor]] would be used after the first [[Glossary:Repetition|repetition]]. Such a flat correlation would often develop in well-managed old [[Glossary:Collection|collections]]
* relying on the [[Glossary:Measured_forgetting_index|measured forgetting index]] as the main optimization criterion may be unreliable for heterogeneous learning material where difficulty streams are not well separated
* lack of tools for maximizing the speed of learning for varying levels of the [[Glossary:Forgetting_index|forgetting index]]
* lack of tools for maximizing the increase in [[Glossary:Stability|memory stability]] for varying levels of the [[Glossary:Forgetting_index|forgetting index]]
* first post-lapse [[Glossary:Interval|interval]] would not differentiate between [[Glossary:Item|items]] whose forgetting was caused by different degrees of delay (rather than overestimated [[Glossary:Difficulty|difficulty]])
* assuming optimality of the schedule: manual interventions into the length of [[Glossary:Interval|intervals]] result in discarding the stability record, i.e. well-remembered [[Glossary:Item|items]] with short [[Glossary:Interval|intervals]] are treated in the same way as other items with the same interval length and the same [[Glossary:Difficulty|difficulty]]. See: [http://supermemopedia.com/wiki/Is_Algorithm_SM-17_much_better_than_Algorithm_SM-15%3F#Why_it_is_hard_to_compare_intervals_between_different_algorithms Comparing intervals in SuperMemo 16 and SuperMemo 17]
 
=== Strengths of Algorithm SM-17 ===
 
* extending review optimization to various levels of recall (from cramming to decade-long delays)
* [[Glossary:Optimum_interval|optimum interval]] is always based on the full [[Repetition history|history of repetitions]]
* data-based improvements to the algorithm can instantly be applied in the learning process by re-computing optimization matrices at 500,000 reps/min (previously, it would take months and years to collect corrective data)
* best fit difficulty determination
* universal memory formulas make it possible to simulate any imaginable learning outcomes, e.g. computing the efficiency of older algorithms post factum
* universal memory formulas make it possible to simulate review scenarios for individual [[Glossary:Item|items]], e.g. computing maximum learning speed schedules, or best average retention schedules, or probability of recall at chosen [[Glossary:Interval|interval]], etc.
* separating the first review [[Glossary:Forgetting_curve|forgetting curve]] and the first post-lapse review forgetting curves from the forgetting curve matrix for the sake of keeping homogeneous flow of [[Glossary:Item|items]] when computing changes in recall and [[Glossary:Stability|memory stability]]
* employing power law for the first review [[Glossary:Forgetting_curve|forgetting curve]] as more accurate for long-interval review. Shorter first [[Glossary:Interval|interval]] should be well compensated by faster increase in successive intervals for well-formulated [[Glossary:Item|items]]
* less sensitivity to chaotic oscillation in [[Glossary:Stability|memory stability]] caused by inaccurate stability or retrievability estimates. Those oscillations are not visible in older algorithms as stability estimates were not available to juxtapose against the [[Glossary:Interval|interval]] used in the learning process
* expressing first post-lapse [[Glossary:Interval|interval]] as a function of [[Glossary:Retrievability|retrievability]] (in addition to the number of [[Glossary:Lapse|lapses]]) yields better Recall-Retrievability matching later in the learning process
* expressing first post-lapse [[Glossary:Interval|interval]] as a function of [[Glossary:Retrievability|retrievability]] makes it easier to estimate the [[Glossary:Difficulty|difficulty]] of forgotten [[Glossary:Item|items]] by filtering out [[Glossary:Lapse|lapses]] caused by delays
* 35-quantile [[Glossary:Forgetting_curve|forgetting curve]] for the first review spans a decade (rather than just 3 weeks as in older versions of the algorithm)
 
=== Weaknesses of Algorithm SM-17 ===
 
* unlike all prior versions of the algorithm, Algorithm SM-17 requires a full record of [[repetition history]] for all [[Glossary:Item|items]]. This places an additional burden on developers and licensees
* one-to-one mapping between recall and [[Glossary:Retrievability|retrievability]] remains elusive. This is the main validity test for the underlying theory of memory. There is still room for improvement, esp. for difficult [[Glossary:Item|items]]
* older versions of the algorithm would keep the 255 most recent repetition outcomes when plotting multidimensional [[Glossary:Forgetting_curve|forgetting curves]]. Alg-SM17 keeps the full record of repetitions. This may make it more "conservative" in adapting to the user's own progress (e.g. in formulating [[Glossary:Item|items]]). However, users of auto-memorized [http://super-memo.com/ae2002.html Advanced English 2014] showed dramatically bad recall predicted by the first [[Glossary:Forgetting_curve|forgetting curve]] in SuperMemo 16. This would be less prominent in Algorithm SM-17. Amending the inflexibility of Algorithm SM-17 is still under consideration (e.g. by recency weighing)
 
== Theory ==
 
=== Two component memory model ===
 
The implications of the [http://super-memory.com/english/2vm.htm two component model of long-term memory] are described in this [http://www.super-memory.com/articles/stability.htm 2005 article]:
 
Due to the fact that real-life applications of SuperMemo require tackling learning material of varying difficulty, the third variable involved in the model is item difficulty (D). Some of the implications of item difficulty have also been discussed in the above article, in particular, the impact of composite memories with subcomponents of different memory stability (S). The three component model of memory is referred to as the [[Glossary:DSR_model_of_memory|DSR model]].
 
For the purpose of the new algorithm we have defined the three components of memory as follows:
 
* Memory Stability (S) is defined as the [http://supermemopedia.com/wiki/Is_stability_just_an_interval%3F inter-repetition interval that produces average recall probability of 0.9 at review time]
* Memory Retrievability (R) is defined as the expected probability of recall at any time, on the assumption of negatively exponential forgetting of homogeneous learning material with the decay constant determined by memory stability (S)
* Item Difficulty (D) is defined as the maximum possible increase in memory stability (S) at review, mapped linearly onto the 0..1 interval, with 0 standing for the easiest possible [[Glossary:Item|items]], and 1 standing for the highest difficulty in consideration in SuperMemo (the cut-off limit currently stands at a stability increase 6x less than the maximum possible)
 
See also: [http://supermemopedia.com/wiki/Modeling_memory Modeling memory]
 
=== Historical note ===
 
The three variables of memory were introduced gradually into SuperMemo algorithms:
* Difficulty (D) was first introduced in SuperMemo 1.0 (1987) in the form of [[Glossary:E-Factor|E-Factors]] that reflected difficulty. E-Factors would decrease each time [[Glossary:Item|items]] performed poorly in learning
* Stability (S) was first hinted at in SuperMemo 5.0 (1989), which was able to demonstrate the decline in stability increase with longer [[Glossary:Interval|intervals]] (decrease in [[Glossary:O-Factor|O-Factors]] with the increase in repetition number)
* Retrievability (R) was first accounted for in SuperMemo 11 (2002), which used a heuristic formula to compensate for [[Glossary:Spacing_effect|spacing effect]] in massed presentation and in delayed review
 
All [[Glossary:DSR_model_of_memory|three variables]] can now be computed precisely using [[Repetition history|repetition histories]] in Algorithm SM-17.
 
=== Deviations ===
 
To assess how well computer models fit reality, SuperMemo uses its own deviation measures. In particular:
# the deviation between grades and predicted grades derived from [[Glossary:Retrievability|retrievability]] when computing the SInc[] matrix, and
# the deviation between matrices and theoretical symbolic functions (in particular, between the entries of the SInc[] matrix and the theoretically predicted values of the stability increase function SInc()).
 
==== Grade deviation ====
 
SuperMemo converts [[Glossary:Grade|grades]] to the average [[Glossary:Retrievability|retrievability]] correlated with those grades for a given user (every user has his own system of grading, and different levels of average retrievability will correspond with individual grades). The grade-derived deviation (gDev) is the difference between grade-derived retrievability (Rg) and theoretical retrievability (Rt) derived from the stability estimate (Sw). The fail-success deviation (fDev) is 1-R for grades>2 and R for grades<3.
 
In estimating the deviation of grades from predicted grades, SuperMemo uses the formula:
 
<blockquote>
Dev=BGW*fDev+(1-BGW)*abs(gDev)
 
<div>
where:
* Dev - prediction deviation
* BGW - binary vs. grade-based weight (currently chosen to be 70%)
* fDev - binary success-failure deviation
* gDev - grade derived retrievability deviation
</div>
</blockquote>
 
The deviation for an item repetition history is computed as the square root of the average of squared deviations (Dev) for each repetition.
 
The deviation for a [[Glossary:Collection|collection]] is computed as the square root of the average of squared deviations (Dev) for each [[Glossary:Repetition|repetition]] in each [[repetition history]] of each [[Glossary:Item|item]].
 
In least squares comparisons, for [[Glossary:Item|items]] or for [[Glossary:Collection|collections]], the least squares deviation is:
 
<blockquote>
LSDev=sqrt(sum(square(Dev))/count)
 
<div>
where:
* LSDev - least squares deviation for all grade predictions (Dev) in a subset of repetition histories
* count - number of repetitions in the set
* sqrt - square root
</div>
</blockquote>
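Expressed as code, the two formulas above combine as follows. This is a minimal Python sketch for illustration only; the function names are ours, and the grade-derived retrievability (Rg) is assumed to be supplied from the user's own grade statistics:

<pre>
import math

BGW = 0.7  # binary vs. grade-based weight (currently chosen to be 70%)

def repetition_deviation(grade, r_grade, r_theory):
    # grade: 0..5; r_grade: retrievability historically associated with
    # that grade for this user (Rg); r_theory: retrievability derived
    # from the stability estimate (Rt)
    f_dev = (1 - r_theory) if grade > 2 else r_theory  # fail-success deviation
    g_dev = r_grade - r_theory                         # grade-derived deviation
    return BGW * f_dev + (1 - BGW) * abs(g_dev)

def ls_deviation(devs):
    # least squares deviation over per-repetition deviations (Dev)
    return math.sqrt(sum(d * d for d in devs) / len(devs))
</pre>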
 
The assumed weighing (BGW) is a heuristic based on experiments and on the fact that the grade-retrievability correlation makes for a weak prediction of grades based on R. The said deviation formula is subject to change.
 
The deviation used in estimating [[Glossary:Difficulty|difficulty]] is weighed further to increase the impact of more recent repetitions on the current difficulty estimate.
 
Currently, for large [[Glossary:Collection|collections]], the algorithm can generate a stability increase matrix that produces a deviation of 18%. Considering the stochastic nature of forgetting, this number would be close to the lowest possible theoretical value if only the fail-success deviation (fDev) was used. However, grade-derived deviation (gDev) reduces the span of the deviation (by making R predictions more precise).
 
===== Theoretical minimum to a grade deviation =====
 
When using predicted grade deviations in assessing the strength of particular algorithmic solutions, we need to keep in mind that there is a theoretical limit on the minimum value of the deviation. This means that we will never approach the ideal 0% deviation.
 
For the formulas used above, the limit will depend on weights (e.g. BGW) and particular user's grade-retrievability averages.
 
If only binary success-failure deviation (fDev) was used, and all [[Glossary:Item|items]] were scheduled with a perfect [[Glossary:Interval|interval]] at the [[Glossary:Requested_forgetting_index|requested forgetting index]] of 10%, the minimum deviation would correspond with the retrievability prediction of 0.9. That deviation would then be:
 
<blockquote>
sqrt((R-Failure)*(R-Failure)*P(failure)+(R-Success)*(R-Success)*P(success))
</blockquote>
 
which is:
 
<blockquote>
sqrt((0.9-0)*(0.9-0)*0.1+(0.9-1)*(0.9-1)*0.9)=30%
</blockquote>
 
In addition to the least square deviation, SuperMemo 17 will provide a few other metrics for assessing the effectiveness of the algorithm. If you want to find your metric in SuperMemo options, please join the discussion: [http://supermemopedia.com/wiki/Spaced_repetition_algorithm_metric Spaced repetition algorithm metric]
 
==== Matrix deviation ====
 
SuperMemo uses matrix representations to express various functions, and hill-climbing approximations to find best theoretical symbolic matches for those functions for use in various contexts.
 
To guide the hill-climbing algorithms, a separate deviation measure is used as an optimization criterion. For each individual matrix entry:
 
<blockquote>
Dev=sqr(abs(MatrixValue-ApproximationValue)/MatrixValue)
</blockquote>
 
The deviation for the entire matrix is the square root of the average of individual deviations (Dev). SuperMemo expresses that deviation in percent.
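As a minimal sketch (Python, with illustrative names), the per-entry measure and the percentage for the whole matrix can be written as:

<pre>
import math

def matrix_deviation_percent(entries):
    # entries: (matrix_value, approximation_value) pairs for entries
    # of the matrix that hold data
    devs = [(abs(m - a) / m) ** 2 for m, a in entries]  # per-entry Dev (squared relative error)
    return 100 * math.sqrt(sum(devs) / len(devs))       # percent deviation for the matrix
</pre>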
 
=== Approximations ===
 
SuperMemo uses hill-climbing algorithms to match hypothetical functions that might model data categories of interest for the algorithm. Those functions can later be used to speed up calculations of Algorithm SM-17.
Currently, three best fit approximation procedures are used to verify theoretical formulas used in the algorithm. Those approximations determine the following functions:
* stability increase (arguments: D, S, R)
* recall (arguments: D, S, R)
* first post-lapse interval (arguments: Lapses, R)
 
Those approximation procedures may also be used in research that helps validate the algorithm. For example, to prove the exponential nature of forgetting, take additive and multiplicative variants of approximation formulas for Recall[D,S,R] with a separate set of parameters defining power and exponential components along the R dimension. Outcome: despite the fact that heterogeneous [[Glossary:Forgetting_curve|forgetting curves]] may appear to follow the power law, Algorithm SM-17 filters [[Glossary:Item|items]] for [[Glossary:Difficulty|difficulty]] well enough to eliminate the power component entirely from the equation. Note that the [[Glossary:Forgetting_curve|forgetting curve]] describing the first review does not form a part of the Recall[] matrix. This is because a heterogeneous stream of new [[Glossary:Item|items]] might pollute the matrix and initiate the "ripple effect" of distorted estimates for higher stability levels (as in older SuperMemos). That first review [[Glossary:Forgetting_curve|forgetting curve]] follows the power law pretty well and, unlike in older SuperMemos, power regression is used to produce the default [[#Startup stability|startup stability]] for new [[Glossary:Item|items]].
 
==== Approximation parameters in student assessment ====
 
There are several hypothetical correlations between approximation parameters and the characteristics of the user's learning process and knowledge quality. For example:
* RMax in stability increase may be a good overall indicator of the quality of learning: the higher the RMax, the closer the approach to the optimum characterized by the [[Glossary:Minimum_information_principle|minimum information principle]] (RMax is the [[Glossary:Retrievability|retrievability]] that maximizes the increase in [[Glossary:Stability|stability]])
* S Decay in stability increase may be a good indicator of the quality of knowledge: the worse the formulation, the higher the S Decay (S Decay is the decay constant determining how fast SInc[] decreases in the S dimension)
* Recall Intercept in recall approximation is higher in well-formulated [[Glossary:Collection|collections]] (Recall Intercept is a measure of the agreement between the Recall[] matrix and the theoretical exponential decline in recall with time)
 
== The Algorithm ==
 
=== Intro ===
 
Algorithm SM-17 computes new [[Glossary:Interval|intervals]] using the '''stability increase function'''. This function tells SuperMemo how much intervals should increase for all possible combinations of memory status variables: [[Glossary:Difficulty|Difficulty]], [[Glossary:Stability|Stability]] and [[Glossary:Retrievability|Retrievability]]. SuperMemo determines item difficulty by finding which difficulty level provides the best fit to [[Glossary:Grade|grades]] scored in [[repetition history]]. The first [[Glossary:Interval|interval]] is computed in a way that is similar to earlier SuperMemos. The stability increase function is computed using data collected during repetitions.
 
==== Stability increase function ====
 
Central to Algorithm SM-17 is the stability increase function represented by the stability increase matrix (abbreviated to SInc[]). The simple formula at the core of SuperMemo is:
 
<blockquote>
Int[n]=Int[n-1]*SInc[D,S,R]
</blockquote>
 
This means that [[Glossary:Optimum_interval|optimum intervals]] in SuperMemo increase by SInc[] in each [[Glossary:Repetition|repetition]] (assuming the [[Glossary:Requested_forgetting_index|requested forgetting index]] is 10%). D, S, R stand for 3 memory variables: [[Glossary:Difficulty|difficulty]], [[Glossary:Stability|stability]] and [[Glossary:Retrievability|retrievability]] respectively.
 
==== Startup stability and startup interval ====
 
Before the formula for the increase in the length of [[Glossary:Interval|intervals]] can be used, we need to determine the initial [[Glossary:Stability|stability]] and the first interval to use in repetitions:
* '''startup interval''' (SI) is the common first [[Glossary:Interval|interval]] for all new [[Glossary:Item|items]] determined by the [[Glossary:Forgetting_curve|forgetting curve]] for new heterogeneous material
* '''startup stability''' (SS) is the estimate of [[Glossary:Stability|stability]] after the first review determined by the best match to [[Glossary:Grade|grades]] obtained in later [[Glossary:Repetition|repetitions]] (that estimate may change with the arrival of new data)
* '''post-lapse stability''' (PLS) is the [[Glossary:Stability|stability]] of an [[Glossary:Item|item]] after a [[Glossary:Repetition|repetition]] that scored a failing [[Glossary:Grade|grade]]
 
The relevant formulas used to determine first [[Glossary:Interval|intervals]] in SuperMemo are:
 
<blockquote>
<p>Int[1]=SI (for the first repetition)</p>
<p>Int[1]=PLS (for the first review after a failing grade)</p>
<p>Int[2]=SS*SInc[D,S,R]</p>
</blockquote>
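A sketch of how those cases might be dispatched in code (Python; SI, SS, PLS and SInc mirror the definitions above, and treating PLS and SInc as callables is an assumption of this sketch):

<pre>
def first_interval(lapses, R, SI, PLS):
    # Int[1]: the common startup interval for a new item, or the
    # post-lapse stability for an item that scored a failing grade
    return SI if lapses == 0 else PLS(lapses, R)

def second_interval(SS, SInc, D, S, R):
    # Int[2] = SS * SInc[D,S,R]
    return SS * SInc(D, S, R)
</pre>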
 
=== Historical note ===
 
The SInc[] matrix is similar to the optimum factor matrix from earlier SuperMemos (the OF matrix in SuperMemo 5.0 through SuperMemo 16.0), but it takes different arguments. SInc[] determines the increase of [[Glossary:Interval|intervals]] in exactly the same way as the OF matrix did, except it takes three arguments: [[Glossary:Difficulty|Difficulty]], [[Glossary:Stability|Stability]] and [[Glossary:Retrievability|Retrievability]]. The OF matrix was indexed by [[Glossary:A-Factor|A-Factors]] (reflecting Difficulty) and [[Glossary:Repetition_category|repetition category]] (roughly related to Stability). However, the OF matrix did not include the Retrievability dimension as repetitions were supposed to be executed at [[Glossary:Optimum_interval|optimum intervals]] determined by the [[Glossary:Requested_forgetting_index|requested forgetting index]].
 
=== Item difficulty ===
 
==== Difficulty: Definition ====
 
Item difficulty in SuperMemo Algorithm SM-17 is a number from 0 (easiest) to 1 (hardest). [[Glossary:Item|Items]] are linearly mapped onto the 0..1 difficulty range by using their maximum possible memory stability increase. The maximum increase in [[Glossary:Stability|stability]] occurs in the first review when the length of the [[Glossary:Interval|interval]] corresponds with the [[Glossary:Retrievability|retrievability]] that produces the greatest increase in the strength of memory. [[Glossary:Item|Items]] with the theoretically highest possible stability increase are assigned a difficulty of 0. These are the easiest [[Glossary:Item|items]] in the process. Those [[Glossary:Item|items]] are the target of all forms of learning as they are a perfect reflection of the [[Glossary:Minimum_information_principle|minimum information principle]]. As the maximum stability increase for very difficult [[Glossary:Item|items]] may approach 1.0 (or even drop below that value, i.e. produce memory liability!), all items at a certain lowest acceptable maximum stability increase are classified as having difficulty 1.0. The exact cut-off number is a parameter of the algorithm that is subject to change and further optimization (see the value N in ''Computing difficulty'' below).
 
==== Historical note ====
 
Starting with Algorithm SM-8, SuperMemo used the concept of A-Factor to define item difficulty. A-Factor was defined as the increase in [[Glossary:Optimum_interval|optimum interval]] during the first [[Glossary:Repetition|repetition]] (RepNo=2 in SuperMemo). In most situations, A-Factor would fall in the 1.0-10.0 range. SuperMemo employed the range from 1.2 to 6.9, where the 1.2 limit was set for the sake of filtering out the so-called "leech material", i.e. material whose difficulty would disqualify it from efficient use of [[Glossary:Spaced_repetition|spaced repetition]]. Alg-SM17 might have used that A-Factor definition, but strives for the more universal concept of the maximum increase in [[Glossary:Stability|memory stability]]. Those two approaches need not differ much, as newer data indicates that the maximum stability increase occurs at [[Glossary:Retrievability|retrievability (R)]] levels that are close to the default levels used in SuperMemo. Alg-SM17 normalizes difficulty to the 0 (easiest) to 1 (hardest) range, roughly corresponding with A-Factors from 1.1 to 10.0. The assumption is that [[Glossary:Item|items]] that are harder than 1.0 should be eliminated from the learning process, while items that are easier than 10.0 are either extremely rare or can safely be treated along with the rest of the 10.0 items without a detriment to total workload.
 
==== Computing difficulty ====
 
Alg-SM17 requires a full [[Repetition history|history of repetitions]] to compute [[Glossary:Difficulty|item difficulty]] (see below for a simplified procedure that works around that requirement for developers who might wish to develop ''light'' SuperMemos that look at smaller data sets on mobile devices, etc.).
 
For the full range of possible [[Glossary:Difficulty|item difficulties]], Alg-SM17 computes the deviations of predicted [[Glossary:Grade|grades]] (based on a given assumed difficulty) from actual grades as recorded in [[repetition history]] of a given [[Glossary:Item|item]]. The algorithm chooses the [[Glossary:Difficulty|difficulty]] that produces the least deviation.
 
An important parameter in computing grade deviation in difficulty estimates is the '''deviation exponent'''. The least squares approach is prevalent in statistics; however, different values of the deviation exponent yield slightly different results. The ultimate goal is to set all parameters of the optimization so as to produce a closer adherence of predicted outcomes to theoretical models.
 
===== Computing difficulty: Algorithm outline =====
 
Computing difficulty:
 
# Set S0 to the default optimum first [[Glossary:Interval|interval]] from the first [[Glossary:Forgetting_curve|forgetting curve]] for mixed difficulty material
# For all difficulty quantiles, using Algorithm SM-17, compute the deviation of predicted [[Glossary:Grade|grades]] based on that difficulty, S0, and the actual [[repetition history]] from the grades scored in that repetition history
# Find the maximum and minimum quantiles with the lowest deviation (very often, several quantiles will produce exactly the same deviation, as real-number [[Glossary:Retrievability|retrievability]] is converted to [[Glossary:Grade|grades]], which are integers)
# Assume that the [[Glossary:Item|item]] in question has a [[Glossary:Difficulty|difficulty]] that corresponds with the average of minimum and maximum quantiles that generate the minimum deviation
 
Note that weighing quantiles by deviations tends to amass most [[Glossary:Difficulty|difficulties]] around 0.5, which shows that this is not a valid approach.
 
Computing the deviation:
 
# set S to S0
# for each [[Glossary:Repetition|repetition]], using the [[Glossary:Interval|interval]], compute new S using SInc[] matrix indexed by the assumed [[Glossary:Difficulty|difficulty]], [[Glossary:Stability|S]], and [[Glossary:Retrievability|R]] (corresponding with the interval)
# for each repetition, using [[Glossary:Stability|S]] and [[Glossary:Interval|interval]], compute [[Glossary:Retrievability|R]] and convert it to a predicted [[Glossary:Grade|grade]] (using R:grade mapping characteristic for the user)
# compare that [[Glossary:Grade|grade]] to the actual grade and add the difference to the deviation
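A Python sketch of that search follows; predicted_grade() is a hypothetical helper standing for the deviation walk described in the steps above, and all names are illustrative:

<pre>
def estimate_difficulty(history, S0, predicted_grade, quantiles=100):
    # Find the difficulty quantile(s) whose predicted grades deviate
    # least from the grades recorded in the item's repetition history.
    best, best_dev = [], float("inf")
    for q in range(quantiles + 1):
        d = q / quantiles  # candidate difficulty in the 0..1 range
        dev = sum(abs(predicted_grade(d, S0, rep) - rep.grade) for rep in history)
        if dev < best_dev:
            best, best_dev = [d], dev
        elif dev == best_dev:
            best.append(d)  # ties are common: grades are integers
    # average of the minimum and maximum quantiles with the least deviation
    return (min(best) + max(best)) / 2
</pre>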
 
Some important considerations in difficulty calculations:
# recent repetitions have a greater weight than older repetitions due to the fact that [[Glossary:Difficulty|difficulty]] is subject to change over time (e.g. due to new learning or minor item reformulation)
# instead of squared deviations, SuperMemo seeks to find the optimum exponent in weighing grade deviations to find the best match to theoretical memory models
# [[Glossary:Difficulty|difficulty]] can be computed using theoretical SInc() function, or actual SInc[] matrix data. The outcomes will differ. In particular, the range of minimum and maximum quantiles with the minimum deviation may be larger when using real/irregular data. The value of both approaches is currently under intense study
 
At the moment, the most likely solution to be adopted in SuperMemo is to use the BestSInc() function that transitions from SInc() towards SInc[] with the gradual inflow of data.
 
The theoretical SInc() function is the root where the baseline mapping of the maximum stability increase to difficulty is established:
 
<blockquote>
SInc:=(5*(1-Difficulty)+1)*f(R,S)+1
<div>
where:
* SInc - stability increase
* Difficulty - difficulty in 0..1 range
* f(R,S) - component of SInc that depends on R and S
</div>
</blockquote>
 
This baseline formula currently determines the cutoff value for the hardest possible [[Glossary:Item|items]] at an SInc that is N times less than the maximum value possible for the easiest items (with N=(6*f(R,S)+1)/(f(R,S)+1)).
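For illustration, the cutoff ratio N can be checked directly against the baseline formula; the value of f(R,S) below is an arbitrary assumption:

<pre>
def sinc_baseline(difficulty, f_rs):
    # SInc := (5*(1-Difficulty)+1)*f(R,S)+1
    return (5 * (1 - difficulty) + 1) * f_rs + 1

f_rs = 3.0  # assumed value of f(R,S), for illustration only
N = sinc_baseline(0, f_rs) / sinc_baseline(1, f_rs)
print(N)  # (6*3+1)/(3+1) = 4.75: easiest items gain stability ~4.75x faster
</pre>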
 
===== Computing difficulty: Simplified procedure =====
 
We have also developed a simplified solution for computing [[Glossary:Difficulty|item difficulties]]. This might be useful for developers who look for simplicity and thinner, faster code. The simplified procedure is similar to the approach used in older SuperMemos, esp. SuperMemo 8 through SuperMemo 16.

[[Glossary:Difficulty|Item difficulty]] may be made history-less by relying on the fact that the difficulty of an [[Glossary:Item|item]] keeps evolving depending on the context of knowledge and through interference. For that reason, the performance in the most recent [[Glossary:Repetition|repetition]] counts more than performance in prior repetitions.
 
Simplified difficulty can then be obtained using the formula:
 
<blockquote>
D[n]=sDF(D[n-1],Grade,Rr)
<div>
where:
* sDF - simplified difficulty function
* D[n] - difficulty estimate after the n-th repetition
* Grade - grade scored in the n-th repetition
* Rr - expected recall at the time of the n-th repetition (based on the Recall[D,S,R] matrix)
</div>
</blockquote>
 
Unlike best-fit [[Glossary:Difficulty|difficulty]], simplified difficulty requires no expensive hill-climbing optimization that may slow down SuperMemo (e.g. when computing learning data for long [[Repetition history|repetition histories]]). The accuracy of the simplified difficulty estimation is bound to be lower and is currently being assessed.
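SuperMemo's actual sDF is not spelled out here; the following Python sketch only illustrates the shape of such an update rule, under the assumption that a worse-than-expected outcome nudges difficulty up (the 0.1 rate is purely illustrative):

<pre>
def simplified_difficulty(d_prev, grade, expected_recall, rate=0.1):
    # D[n] = sDF(D[n-1], Grade, Rr); the functional form below is an
    # assumption, not SuperMemo's published formula
    outcome = 1.0 if grade > 2 else 0.0          # binary recall from the grade
    d_new = d_prev + rate * (expected_recall - outcome)
    return min(1.0, max(0.0, d_new))             # clamp to the 0..1 range
</pre>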
 
=== Startup interval ===
 
The first interval Int[1] is determined using the first [[Glossary:Forgetting_curve|forgetting curve]]. Unlike the entries of the Recall[] matrix, the first forgetting curve data is collected using 35 quantiles determined by the interval distribution on a large representative sample collected from an incremental reading [[Glossary:Collection|collection]]. Those quantiles span nearly a decade (older SuperMemos collected data for only the first 20 days after memorizing the [[Glossary:Item|item]]). The startup interval (the first [[Glossary:Optimum_interval|optimum interval]]) is determined by power regression on that forgetting curve data. Power law forgetting is the result of the superposition of exponential [[Glossary:Forgetting_curve|forgetting curves]] for heterogeneous material of indeterminate difficulty entering the learning process. The first [[Glossary:Interval|interval]] is corrected for the [[Glossary:Requested_forgetting_index|requested forgetting index (rFI)]] and may be subject to the random dispersion that has been a standard fixture of SuperMemo algorithms since their inception.
 
In short, the startup interval is determined by the point in time at which the [[Glossary:Forgetting_curve|forgetting curve]] indicates the recall is likely to fall below the value determined by the [[Glossary:Requested_forgetting_index|requested forgetting index]].
 
Note that the startup interval (with rFI=10%) is not the same as the startup stability. The interval needs to be determined before the first review (i.e. be a resultant of prior [[Glossary:Repetition|repetitions]] of other [[Glossary:Item|items]]). The startup stability is determined from [[Repetition history|repetition histories]] and may differ after each successive [[Glossary:Repetition|repetition]]. In short, it represents the best fit to data.
 
=== Startup stability ===
 
Startup stability S[0] is the stability accomplished with a single (first) review. Unlike stability computed at later repetitions, this stability cannot be derived from the SInc[] matrix. Instead, it is computed in a way similar to the way [[Glossary:Difficulty|difficulty]] is computed. Within a range of possible values, S[0] is used to compute [[Glossary:Stability|memory stabilities]] after successive [[Glossary:Repetition|repetitions]]. The startup stability S[0] is taken as the value that produces the least predicted grade deviation from [[Glossary:Grade|grades]] in [[repetition history]].
 
=== Post-lapse stability ===
 
Post-lapse stability (PLS) is the [[Glossary:Stability|stability]] resulting from a review with a failing [[Glossary:Grade|grade]]. Unlike [[Glossary:Stability|stability]] computed after successful [[Glossary:Repetition|repetitions]], this stability cannot be derived from the SInc[] matrix.
 
In the ideal case, for simple memories, forgetting results in a reset of [[Glossary:Stability|stability]] back to near-zero. In theory, only difficult [[Glossary:Item|items]] made of composite memories may show a substantial decrease in the costs of re-learning; however, even that does not show in the data (perhaps due to imperfect difficulty estimations, or due to SuperMemo's natural tendency to help users eliminate hard-to-learn material in the learning process).
 
It was shown long ago that the length of the first post-lapse optimum interval correlates best with the number of [[Glossary:Lapse|memory lapses]] afforded an [[Glossary:Item|item]]. Even then, the post-lapse interval usually oscillates in the 1-4 days range for a [[Glossary:Forgetting_index|forgetting index]] of 10%, and the correlation is not strong enough to add much to the efficiency of learning by optimizing the length of that interval. Some competitive spaced repetition software, as well as SuperMemo in its first years, experimented with re-learning hypotheses based on the ancient wisdom of learning psychology, e.g. by halving [[Glossary:Interval|intervals]] after a [[Glossary:Lapse|memory lapse]]. Current data shows clearly that this is a harmful and time-wasting approach.
 
In addition to [[Glossary:Lapse|memory lapses]], there is also a strong correlation between the length of the post-lapse interval and [[Glossary:Retrievability|retrievability (R)]]. This should be obvious considering that a substantial delay in review will result in an "undeserved" [[Glossary:Lapse|lapse]] on [[Glossary:Item|items]] that might then be quite easy to re-learn. Algorithm SM-17 takes the retrievability dimension into account.
 
In the light of the above, SuperMemo uses a separate matrix for post-lapse stabilities: PLS[] with Lapse and Retrievability dimensions. The first [[Glossary:Interval|interval]] after scoring a failing [[Glossary:Grade|grade]] is then determined as follows:
 
<blockquote>
Int[1]:=PLS[Lapses,R]
<div>
where:
* Int[1] - first [[Glossary:Interval|interval]] (after a failing [[Glossary:Grade|grade]])
* PLS[] - post-lapse stability matrix
* Lapses - total number of [[Glossary:Lapse|memory lapses]] (failing grades) scored by the [[Glossary:Item|item]]
* R - [[Glossary:Retrievability|retrievability]] at the moment of the lapse
</div>
</blockquote>
 
=== Retrievability ===
 
==== Theory ====
 
In theory, [[Glossary:Retrievability|retrievability R]] should be easy to derive from [[Glossary:Stability|stability]] and [[Glossary:Interval|interval]]:
 
<blockquote>
R[n]:=exp(-k*t/S[n-1])
<div>
where:
* R[n] - [[Glossary:Retrievability|retrievability]] at the n-th [[Glossary:Repetition|repetition]]
* k - decay constant
* t - time ([[Glossary:Interval|interval]])
* S[n-1] - [[Glossary:Stability|stability]] after the (n-1)th [[Glossary:Repetition|repetition]]
</div>
</blockquote>
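Since stability is defined above as the interval that produces a recall probability of 0.9, the decay constant works out to k=-ln(0.9); a minimal Python sketch:

<pre>
import math

K = -math.log(0.9)  # decay constant implied by R=0.9 when t equals S

def retrievability(t, S):
    # theoretical R after t days at stability S: R = exp(-k*t/S)
    return math.exp(-K * t / S)

print(retrievability(30, 30))  # 0.9: an interval equal to stability leaves 90% recall
</pre>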
 
That neat theoretical approach is made a bit more complex when we consider that forgetting may not be perfectly exponential if [[Glossary:Item|items]] are difficult or of mixed [[Glossary:Difficulty|difficulty]]. In addition, [[Glossary:Forgetting_curve|forgetting curves]] in SuperMemo can be marred by user strategies in all situations where [[Glossary:Grade|grades]] are not an ideal reflection of recall/[[Glossary:Retrievability|retrievability]].
 
==== Three measures of retrievability ====
 
The theoretical two-component model may be undermined by imperfect difficulty filtering, user strategies, implementation shortcomings (e.g. a poor choice of parameters), and imperfect modelling of reality. For that reason, SuperMemo uses three different measures of retrievability:
* '''Theoretical retrievability''' (R) as derived from the exponential decline of recall over time (as above). This measure is a perfect reflection of passing time
* '''Recall matrix''': measured recall R for different levels of [[Glossary:Difficulty|item difficulty]], [[Glossary:Stability|memory stability]], and theoretical R (or time). This measure is a perfect reflection of average recall (for imperfectly measured memory status parameters)
* '''Recall-Grade correlation''': all users have their own average level of recall corresponding with each of the five [[Glossary:Grade|grades]] used in SuperMemo. This measure is highly unreliable (for example, there have been cases in which higher grades were associated with lower recall)
 
'''Recall matrix''': Algorithm SM-17 accounts for possible non-exponential decay of memory, as well as for D and S estimate errors, by using the '''Recall[D,S,R] matrix'''. Recall matrix takes the same arguments as the SInc[] matrix. Recall[D,S,R] is the average recall for [[Glossary:Item|items]] of [[Glossary:Difficulty|difficulty D]], at [[Glossary:Stability|stability level S]], and at [[Glossary:Retrievability|retrievability R]]. Recall matrix keeps the forgetting curve record with the time axis expressed as [[Glossary:Retrievability|retrievability]]. The main functions of Recall[] matrix are:
# to measure the actual level of recall for different retrievability levels and
# to verify the soundness of the model and its parameters (in the ideal theoretical case, Recall should equal [[Glossary:Retrievability|retrievability]]).
The actual recall level statistic is used to amend the theoretical retrievability level derived from stability and time.
 
'''Recall-Grade correlation''': For individual [[Glossary:Repetition|repetitions]], [[Glossary:Retrievability|retrievability]] can be checked against the average recall level registered for individual [[Glossary:Grade|grades]]. Success and failure of memory, as defined in the grade scale, are also used to measure binary recall (0 - failure, 1 - success). Algorithm SM-17 uses both the recall-grade correlation and binary recall to add to the set of information needed to estimate [[Glossary:Stability|memory stability]]. [[Glossary:Grade|Grades]] carry more information, while binary recall is a better reflection of forgetting. Those two must be weighed up when looking at the grade-derived estimate of R.
 
==== Retrievability estimate ====
 
[[#Three measures of retrievability|All three measures of retrievability]] can be weighed up to provide the best estimate of [[Glossary:Retrievability|R]] that is used in further estimates of [[Glossary:Stability|stability]] and stability increase. Weights used in the estimate will change with the inflow of new information (e.g. repetition cases registered in the Recall matrix). The ultimate goal of the weight parameters is to minimize the deviation between the model and the [[Glossary:Grade|grades]] scored in learning.
 
A careful distinction must be made between all measures of recall. The theoretical retrievability (R) concept is essential for indexing optimization matrices, as R is the ultimate target when converging onto the theoretical model of memory. At the other extreme, stability estimates are item-specific and need to take into account past recall statistics and the actual [[Glossary:Grade|grades]] scored by a particular [[Glossary:Item|item]] in the learning process.
 
=== Stability ===
 
The [[Glossary:Interval|interval]] and the resulting [[Glossary:Grade|grade]] can be used to draw conclusions about the stability estimate. If the [[Glossary:Interval|interval]] is greater than [[Glossary:Stability|stability]], good [[Glossary:Grade|grades]] indicate a stability underestimate. Conversely, short [[Glossary:Interval|intervals]] and poor [[Glossary:Grade|grades]] may indicate an overestimate. This type of reasoning underlay the first attempt to compute the stability increase function in [http://www.super-memory.com/articles/stability.htm this 2005 article]. However, because it applies only to a subset of repetition scenarios, this information is insufficient to estimate stability for all [[Glossary:Item|items]] and all [[Glossary:Repetition|repetitions]]. In addition, conclusions drawn in 2005 differ substantially from the new data based on the far more accurate recall-verified retrievability estimate (as described above).
 
The second, and more reliable, source of information about [[Glossary:Stability|stability]] is the value derived from the retrievability estimate modified by prior recall measurements. [[Glossary:Stability|Stability]] can be estimated from the retrievability level and the stability estimate value before the [[Glossary:Repetition|repetition]]. This is the most reliable estimate.
 
Last but not least, each [[Glossary:Repetition|repetition]] produces a new value of estimated [[Glossary:Stability|stability]], which becomes the pre-repetition estimate for the next repetition. That pre-repetition estimate is the third source of information on the actual level of [[Glossary:Stability|stability]].
 
Those three sources of information are weighed up for reliability (e.g. based on the number of cases in the Recall matrix used to modify [[Glossary:Retrievability|R]], or the number of cases in the SInc[] matrix used to generate the pre-repetition [[Glossary:Stability|stability]]). The actual parameters used in weighing up the information have been computed using optimization methods targeted at minimizing the predicted grade deviation in larger [[Glossary:Collection|collections]].
 
=== Stability increase ===
 
The stability increase function is used to compute [[Glossary:Interval|intervals]] in Algorithm SM-17.
 
The stability increase function is expressed in Algorithm SM-17 using three different methods:
* the theoretical formula, which is a multiplicative compound expression of the impact of all its three [[Glossary:Difficulty|D]], [[Glossary:Stability|S]], [[Glossary:Retrievability|R]] arguments
* the matrix SInc[] built on the basis of data collected from the [[Glossary:Collection|collection]]'s [[Repetition history|repetition histories]]
* the ''best compromise'' function (BestSInc) that combines the information from the matrix, with support of neighboring matrix entries where data is missing, and with support of the theoretical function where no data is available
 
Due to its monotonicity, the theoretical formula is used in all hill-climbing procedures, which are employed richly in the algorithm. It is also used exclusively in new [[Glossary:Collection|collections]] where no learning data is available. The SInc[] matrix is the truest expression of stability changes in the user's particular [[Glossary:Collection|collection]]. The SInc[] matrix has the greatest research value and is used in all contexts where true properties of memory need to be taken into account. The matrix is the key to Alg-SM17's adaptability. Finally, the ultimate compromise, the BestSInc() function, is a statement on what is best known about the memory properties and learning strategies of the user and his or her particular [[Glossary:Collection|collection]] at any given moment in time.
 
The choice between the three expressions of SInc[], and the choice of compromise parameters in BestSInc(), are essential for the ultimate efficiency of the algorithm. Those choices and parameters are still in the process of being optimized.
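One way to picture the BestSInc() compromise is a data-weighted blend that trusts the SInc[] matrix in proportion to the number of repetition cases behind an entry and falls back on the theoretical SInc() elsewhere. The blending rule below is only an illustrative assumption; the actual compromise parameters are, as noted, still being optimized, and this sketch omits the support of neighboring matrix entries:

<pre>
def best_sinc(D, S, R, sinc_matrix, cases, sinc_theory, saturation=100):
    # Blend matrix data with the theoretical function by the number of
    # repetition cases backing the entry (illustrative rule only).
    w = min(1.0, cases(D, S, R) / saturation)  # full trust after `saturation` cases
    return w * sinc_matrix(D, S, R) + (1 - w) * sinc_theory(D, S, R)
</pre>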
 
The SInc[] matrix can be computed using three different methods:
* in case of lack of data, from a theoretical universal formula, for all new users
* incrementally during learning while collecting repetition data and executing Algorithm SM-17
* wholesale by recomputing stability increase noted in all past [[Glossary:Repetition|repetitions]] (esp. for users of earlier versions of SuperMemo)
 
The recommended approach for users of older SuperMemos is to compute a new SInc[] matrix as the first step, and then proceed with incremental changes during the course of standard learning with Algorithm SM-17.
 
=== Interval ===
 
[[Glossary:Interval|Intervals]] in Algorithm SM-17 are computed using the stability increase function SInc() (the ''best compromise'' variant described above as ''BestSInc'').
 
The following formula is used:
 
<blockquote>
Int[n]:=S[n-1]*SInc[D[n-1],S[n-1],R[n]]*ln(1-rFI/100)/ln(0.9)
<div>
where:
* Int[n] - [[Glossary:Interval|interval]] after the n-th [[Glossary:Repetition|repetition]]
* S[n] - [[Glossary:Stability|memory stability]] after the n-th repetition
* R[n] - [[Glossary:Retrievability|memory retrievability]] at the time of the n-th repetition
* SInc[D,S,R] - stability increase for a given level of [[Glossary:Difficulty|difficulty (D)]], [[Glossary:Stability|stability (S)]] and [[Glossary:Retrievability|retrievability (R)]]
* rFI - [[Glossary:Requested_forgetting_index|requested forgetting index]] expressed in percent
</div>
</blockquote>
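A direct transcription of this formula in Python; note that at the default rFI of 10% the logarithmic correction term equals 1, reducing the formula to Int[n]=S[n-1]*SInc:

<pre>
import math

def next_interval(s_prev, sinc, rFI=10.0):
    # Int[n] = S[n-1] * SInc[D,S,R] * ln(1-rFI/100)/ln(0.9)
    return s_prev * sinc * math.log(1 - rFI / 100) / math.log(0.9)

print(next_interval(20.0, 2.0))       # 40.0 days at the default rFI=10%
print(next_interval(20.0, 2.0, 5.0))  # ~19.5 days for a stricter rFI=5%
</pre>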
 
This formula is analogous to similar formulas in all versions of SuperMemo beginning with Algorithm SM-0. The role of the stability increase SInc[] was played earlier by [[Glossary:E-Factor|E-Factors]] (based on item difficulty) and [[Glossary:O-Factor|O-Factors]] (derived from [[Glossary:Forgetting_curve|forgetting curves]]).
 
=== Data ===
 
* [[Repetition history|item repetition histories]] (date and grade)
* recall matrix: Recall[D,S,R]
* stability increase matrix: SInc[D,S,R] (i.e. the matrix representation of the stability increase function)
 
=== The Algorithm: Outline ===
 
The following procedure can be used to determine the status of memory (DSR status) at any moment of time for a given [[Repetition history|history of repetitions]]. Once the DSR status is known, we can determine the next [[Glossary:Interval|interval]] using criteria of choice, e.g. [[Glossary:Forgetting_index|forgetting index]], maximization of [[Glossary:Stability|stability]], long-term workload minimization, etc.
 
# estimate [[Glossary:Difficulty|item difficulty D]] using the [[Repetition history|history of repetitions]] for that [[Glossary:Item|item]]
# determine [[#Startup stability|startup stability S0]] using the [[Repetition history|history of repetitions]]
# for all repetition history records repeat the steps below
# compute theoretical retrievability Rt using current stability estimate Sw and the interval Int
# update Recall[] matrix using D, Sw[n-1], Rt with the grade-derived recall
# compute recall-based retrievability Rr
# compute grade-derived retrievability Rg
# estimated weighted Rw from Rt, Rr, and Rg
# compute Rw-based stability Sr
# compute SInc-based Se (Se=Sw[n-1]*SInc[D,Sw[n-1],Rw])
# compute interval derived stability Si
# estimate weighted Sw from Sr, Se, and Si
# compute the stability increase SInc on the basis of Sw change
# update Sinc[] matrix using D, Sw, Rw with the new SInc value
# compute new interval using Int:=Sw*SInc[D,Sw,Rw]
# go back computing Rt step
 
Note that replacing Int:=Int[n-1]*SInc[D,Sw,Rw] with Int:=Sw*SInc[D,Sw,Rw] frees the user from ever worrying about manual intervention in the length of [[Glossary:Interval|intervals]] (e.g. before an exam, or when encountering knowledge in real life, etc.). This is one of the key advantages of Algorithm SM-17 over all prior versions of the algorithm.
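The outline can be summarized in code. The following is a deliberately reduced, runnable sketch of the loop: it keeps only the theoretical retrievability and the SInc-based stability estimate, omitting the Recall[] and SInc[] matrix updates and the grade- and interval-derived estimates (Rr, Rg, Sr, Si) that the full algorithm blends in. All names are illustrative, and this is not Algorithm SM-17 itself:

<pre>
import math

LN09 = math.log(0.9)

def dsr_trace(dates, d, s0, sinc_fn, r_fi=10.0):
    # dates:   repetition dates as day numbers, oldest first
    # d:       estimated item difficulty
    # s0:      startup stability estimated from the history
    # sinc_fn: stability increase function, e.g. a BestSInc() stand-in
    sw = s0
    for prev, cur in zip(dates, dates[1:]):
        rt = math.exp(LN09 * (cur - prev) / sw)  # theoretical retrievability Rt
        rw = rt                       # Rr and Rg would refine this estimate
        sw = sw * sinc_fn(d, sw, rw)  # SInc-based Se; Sr and Si omitted
    # next interval per the Interval section: Int = S[n] scaled by the rFI term
    return sw, sw * math.log(1.0 - r_fi / 100.0) / LN09

# e.g. an item reviewed on days 0, 3, 10, and 30, with a constant
# (purely illustrative) stability increase of 1.8 per repetition
print(dsr_trace([0, 3, 10, 30], d=0.5, s0=3.0,
                sinc_fn=lambda d, s, r: 1.8))
</pre>

Because the interval is always derived from the current stability estimate rather than from the previous interval, manually shortened or lengthened intervals are absorbed into the Sw estimate on the next replay of the history.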
 
== Related topics ==
 
* [[SuperMemo Algorithm SM-15]]
* [http://super-memory.com/english/ol/sm2.htm Algorithm SM-2]
* [http://supermemopedia.com/wiki/Comparing_SuperMemo_with_other_applications_based_on_spaced_repetition Comparing SuperMemo with other applications based on spaced repetition]
* [http://supermemopedia.com/wiki/Spaced_repetition_algorithm_metric Spaced repetition algorithm metric]
* [http://supermemopedia.com/wiki/SuperMemo_or_Anki SuperMemo or Anki]
* [http://supermemopedia.com/wiki/Open_letter_to_spaced_repetition_developers Open letter to spaced repetition developers]