Evidence (SPRT)
Belief revision in Mneva is a sequential probability ratio test (SPRT). Every observation contributes a log-likelihood ratio; when the accumulator crosses a decision boundary, confidence shifts and sprt_status flips. The math is documented, the trail is queryable, the boundary is defensible.
The thing SPRT solves
Most memory products have a confidence field. Few have a defensible way to move it. Setting confidence is easy; changing it on the right evidence at the right cumulative weight is the hard part.
Naive approaches fail in known ways:
- Threshold counting (three strikes and you're out) ignores evidence strength. Three weak refutations should not move you as far as one strong one.
- Exponential smoothing has no decision point. Confidence drifts; nothing crisply flips.
- Bayesian update is the right family, but without a stopping rule you accumulate forever.
SPRT is Wald's sequential test: accumulate the log-likelihood ratio of support vs refute evidence, stop and decide as soon as you cross either of two boundaries set by your tolerance for false-promote (α) and false-demote (β).
The math
Mneva uses α=0.05, β=0.10:
- Promote boundary: A = ln((1 − β) / α) = ln(0.9 / 0.05) ≈ +2.890
- Demote boundary: B = ln(β / (1 − α)) = ln(0.1 / 0.95) ≈ −2.251
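The two boundaries follow from Wald's standard formulas for a sequential test. As a quick check of the arithmetic, here is a minimal sketch in Python; the function name `sprt_boundaries` is illustrative and not part of Mneva's API.

```python
import math

def sprt_boundaries(alpha: float, beta: float) -> tuple[float, float]:
    """Return the (promote, demote) log-likelihood-ratio boundaries."""
    if not (0 < alpha < 1 and 0 < beta < 1):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    promote = math.log((1 - beta) / alpha)   # A: upper boundary
    demote = math.log(beta / (1 - alpha))    # B: lower boundary
    return promote, demote

A, B = sprt_boundaries(alpha=0.05, beta=0.10)
print(f"A = {A:+.3f}, B = {B:+.3f}")  # A = +2.890, B = -2.251
```

Tightening α raises the promote boundary, and tightening β pushes the demote boundary further down, so a stricter tolerance requires more accumulated evidence before a decision.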
Each evidence(belief_id, kind, strength) call contributes a log-likelihood:
- kind="supporting" → contribution is +ln(s / (1 − s))
- kind="refuting" → contribution is +ln((1 − s) / s)
where s ∈ [0.1, 0.9] is the evidence strength (clamped to keep the log finite).
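The per-observation contribution can be sketched as a small function. This is an illustration of the formulas above, not Mneva's source; the names `contribution`, `S_MIN`, and `S_MAX` are hypothetical.

```python
import math

S_MIN, S_MAX = 0.1, 0.9  # clamp range for evidence strength, from the text

def contribution(kind: str, strength: float) -> float:
    """Log-likelihood contribution of one observation."""
    s = min(max(strength, S_MIN), S_MAX)  # clamp so the log stays finite
    if kind == "supporting":
        return math.log(s / (1 - s))
    if kind == "refuting":
        return math.log((1 - s) / s)
    raise ValueError(f"unknown evidence kind: {kind!r}")

print(round(contribution("supporting", 0.85), 3))  # 1.735
print(round(contribution("refuting", 0.85), 3))    # -1.735
print(round(contribution("supporting", 1.0), 3))   # 2.197 (clamped to 0.9)
```

Two consequences fall out of the clamp: a strength of 0.5 contributes exactly zero, and the largest single contribution is ln(9) ≈ 2.197, which is below the promote boundary and above the demote boundary, so no single observation can trigger a decision from a zero accumulator.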
The contribution is added to beliefs.sprt_log_ratio. When the cumulative ratio crosses A (and sprt_status is not already promoted), the belief is promoted: confidence += 0.10 (capped 0.95), status = promoted. When it crosses B (and status is not already demoted), the belief is demoted: confidence −= 0.20 (floored 0.05), status = demoted.
Each transition applies its confidence adjustment only once while the belief remains in that state: further refuting evidence after a demote does not double-penalize. The audit trail in belief_evidence still records every event.
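The update rule can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not Mneva's implementation: the `Belief` dataclass stands in for a row in the beliefs table, `apply_evidence` is a hypothetical name, and treating a value exactly on a boundary as a crossing (`>=` and `<=`) is an assumption the text does not settle.

```python
import math
from dataclasses import dataclass

PROMOTE = math.log(0.90 / 0.05)  # A ≈ +2.890
DEMOTE = math.log(0.10 / 0.95)   # B ≈ -2.251

@dataclass
class Belief:
    confidence: float
    sprt_log_ratio: float = 0.0
    sprt_status: str = "accumulating"

def apply_evidence(belief: Belief, kind: str, strength: float) -> Belief:
    """Add one observation and apply at most one status transition."""
    s = min(max(strength, 0.1), 0.9)  # clamp to keep the log finite
    if kind == "supporting":
        belief.sprt_log_ratio += math.log(s / (1 - s))
    elif kind == "refuting":
        belief.sprt_log_ratio += math.log((1 - s) / s)
    else:
        raise ValueError(f"unknown evidence kind: {kind!r}")

    # The status guard keeps repeat evidence from adjusting confidence twice.
    if belief.sprt_log_ratio >= PROMOTE and belief.sprt_status != "promoted":
        belief.confidence = min(belief.confidence + 0.10, 0.95)
        belief.sprt_status = "promoted"
    elif belief.sprt_log_ratio <= DEMOTE and belief.sprt_status != "demoted":
        belief.confidence = max(belief.confidence - 0.20, 0.05)
        belief.sprt_status = "demoted"
    return belief

b = Belief(confidence=0.6)
apply_evidence(b, "supporting", 0.85)
apply_evidence(b, "supporting", 0.85)
print(b.sprt_status, round(b.confidence, 2))  # promoted 0.7
```

Note that the accumulator keeps moving after a transition even though confidence does not, which matches the text: the ratio records all evidence, while the confidence adjustment fires once.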
Walk through an example
Belief: the worker pool should be sized at 4 threads. Initial confidence 0.6, sprt_log_ratio 0, status accumulating.
Three refuting observations, each at strength 0.85:
| Evidence | Contribution, ln(0.15/0.85) | Cumulative log ratio | Status | Confidence |
|---|---|---|---|---|
| #1 refute @ 0.85 | −1.735 | −1.735 | accumulating | 0.60 |
| #2 refute @ 0.85 | −1.735 | −3.469 ← crosses −2.25 | demoted | 0.40 |
| #3 refute @ 0.85 | −1.735 | −5.204 | demoted (already) | 0.40 |
The belief is demoted on the second observation. The third refute does not move confidence any further, and neither would a fourth; the agent should call revise with the new wording.
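The walkthrough can be replayed in a few lines of Python. This is a sketch of the arithmetic only, with the variable names chosen for illustration. The cumulative values are computed from the unrounded contribution, so the last digit can differ slightly from a sum of rounded values.

```python
import math

DEMOTE = math.log(0.10 / 0.95)  # B ≈ -2.251
step = math.log(0.15 / 0.85)    # one refute at strength 0.85

log_ratio, confidence, status = 0.0, 0.60, "accumulating"
rows = []
for n in range(1, 4):
    log_ratio += step
    if log_ratio <= DEMOTE and status != "demoted":
        confidence = max(confidence - 0.20, 0.05)  # demote once, floor at 0.05
        status = "demoted"
    rows.append((n, round(log_ratio, 3), status, round(confidence, 2)))

for row in rows:
    print(row)
# (1, -1.735, 'accumulating', 0.6)
# (2, -3.469, 'demoted', 0.4)
# (3, -5.204, 'demoted', 0.4)
```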
You can ask evidence_for(belief_id) at any time and see the full event-by-event trail with timestamps and notes. No magic number; the confidence has a receipt.
Why we do not auto-supersede
It would be technically easy to auto-write a new belief when the ratio crosses B. Mneva does not do this on purpose.
Structural revision changes what recall returns to the agent — silently mutating the working set of beliefs is the wrong default. The intermediate state (status='demoted' + confidence 0.40 + old wording still current) is the honest state: the belief is in trouble and the agent has the responsibility to write the replacement.
When you call revise explicitly, the new wording goes in your voice, with your reasoning. Mneva's job is to surface that the moment has arrived. The wording is yours.
See also
- evidence: record one observation
- evidence_for: see the trail
- believe, revise: the structural side