Evidence (SPRT)

Belief revision in Mneva is a sequential probability ratio test (SPRT). Every observation contributes a log-likelihood ratio; when the accumulator crosses a decision boundary, confidence shifts and sprt_status flips. The math is documented, the trail is queryable, the boundary is defensible.

The thing SPRT solves

Most memory products have a confidence field. Few have a defensible way to move it. Setting confidence is easy; changing it on the right evidence at the right cumulative weight is the hard part.

Naive approaches fail in known ways:

Threshold counting (3 strikes you're out) ignores evidence strength. Three weak refutations should not move you as far as one strong one.
Exponential smoothing has no decision point. Confidence drifts; nothing crisply flips.
Bayesian update is the right family, but without a stopping rule you accumulate forever.
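To make the first failure mode concrete, here is a minimal sketch comparing strike counting with a log-odds weighting of the same ln((1 − s) / s) form used later on this page. The function names are hypothetical and are not part of Mneva's API.

```python
import math

def strike_count(strengths):
    """Threshold counting: every refutation counts as one strike."""
    return len(strengths)

def weighted_refutation(strengths):
    """Sum of log-odds contributions; more negative means stronger refutation."""
    return sum(math.log((1 - s) / s) for s in strengths)

weak = [0.6, 0.6, 0.6]   # three weak refutations
strong = [0.9]           # one strong refutation

print(strike_count(weak), strike_count(strong))   # 3 1
print(f"{weighted_refutation(weak):.3f}")         # -1.216
print(f"{weighted_refutation(strong):.3f}")       # -2.197
```

Counting ranks the three weak refutations above the single strong one; the weighted sum ranks them the other way round.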

SPRT is Wald's sequential test: accumulate the log-likelihood ratio of supporting versus refuting evidence, then stop and decide as soon as the total crosses either of two boundaries. The boundaries are set by your tolerance for false promotion (α) and false demotion (β).

The math

Mneva uses α=0.05, β=0.10:

Promote boundary A = ln((1 − β) / α) = ln(0.9 / 0.05) ≈ +2.890
Demote boundary B = ln(β / (1 − α)) = ln(0.1 / 0.95) ≈ -2.251
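The two boundaries follow directly from Wald's formulas, so they can be recomputed for any error tolerances. The helper below is an illustrative sketch; `sprt_boundaries` is a hypothetical name, not a Mneva function.

```python
import math

def sprt_boundaries(alpha: float, beta: float) -> tuple[float, float]:
    """Return (promote, demote) boundaries for Wald's SPRT.

    alpha: tolerated false-promote rate
    beta:  tolerated false-demote rate
    """
    promote = math.log((1 - beta) / alpha)   # upper boundary A
    demote = math.log(beta / (1 - alpha))    # lower boundary B
    return promote, demote

A, B = sprt_boundaries(alpha=0.05, beta=0.10)
print(f"A = {A:+.3f}, B = {B:+.3f}")  # A = +2.890, B = -2.251
```

Tightening α pushes A upward (more supporting evidence is needed to promote); tightening β pushes B further down.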

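Each evidence(belief_id, kind, strength) call contributes a log-likelihood ratio:

kind="supporting" → contribution is ln(s / (1 − s))
kind="refuting" → contribution is ln((1 − s) / s), the negative of the supporting value

where s ∈ [0.1, 0.9] is the evidence strength (clamped to keep the log finite). For s > 0.5 a supporting observation raises the ratio and a refuting one lowers it. The clamp also bounds any single contribution to ±ln 9 ≈ ±2.197, so one observation alone cannot cross either boundary from zero.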
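A minimal sketch of the contribution rule, assuming out-of-range strengths are clamped rather than rejected. The function name `contribution` and the error raised for unknown kinds are assumptions made for illustration.

```python
import math

S_MIN, S_MAX = 0.1, 0.9  # clamp range stated in the text

def contribution(kind: str, strength: float) -> float:
    """Log-likelihood-ratio contribution of one observation."""
    s = min(max(strength, S_MIN), S_MAX)   # clamp so the log stays finite
    log_odds = math.log(s / (1 - s))
    if kind == "supporting":
        return log_odds
    if kind == "refuting":
        return -log_odds                   # equal to ln((1 - s) / s)
    raise ValueError(f"unknown evidence kind: {kind!r}")  # assumed behavior

print(f"{contribution('refuting', 0.85):.3f}")    # -1.735
print(f"{contribution('supporting', 1.0):.3f}")   # 2.197 (clamped to s = 0.9)
```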

The contribution is added to beliefs.sprt_log_ratio. When the cumulative ratio crosses A (and sprt_status is not already promoted), the belief is promoted: confidence += 0.10 (capped 0.95), status = promoted. When it crosses B (and status is not already demoted), the belief is demoted: confidence −= 0.20 (floored 0.05), status = demoted.

Each transition fires at most once while the belief remains in that state: further refuting evidence after a demotion does not penalize it again. The audit trail in belief_evidence still records every event.
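The update rule can be sketched as a small state machine. This is an illustration under stated assumptions, not Mneva's implementation: the `Belief` class and `apply_evidence` function are hypothetical, boundaries are treated as inclusive, and the ratio keeps accumulating after a decision (as the walk-through table suggests).

```python
import math
from dataclasses import dataclass

PROMOTE = math.log(0.9 / 0.05)   # A ≈ +2.890
DEMOTE = math.log(0.1 / 0.95)    # B ≈ -2.251

@dataclass
class Belief:
    confidence: float
    sprt_log_ratio: float = 0.0
    sprt_status: str = "accumulating"

def apply_evidence(belief: Belief, kind: str, strength: float) -> None:
    """Add one observation and fire a promote/demote transition if due."""
    if kind not in ("supporting", "refuting"):
        raise ValueError(f"unknown evidence kind: {kind!r}")
    s = min(max(strength, 0.1), 0.9)
    delta = math.log(s / (1 - s))
    belief.sprt_log_ratio += delta if kind == "supporting" else -delta

    # Inclusive comparison is an assumption; the text only says "crosses".
    if belief.sprt_log_ratio >= PROMOTE and belief.sprt_status != "promoted":
        belief.confidence = min(belief.confidence + 0.10, 0.95)
        belief.sprt_status = "promoted"
    elif belief.sprt_log_ratio <= DEMOTE and belief.sprt_status != "demoted":
        belief.confidence = max(belief.confidence - 0.20, 0.05)
        belief.sprt_status = "demoted"

b = Belief(confidence=0.60)
for _ in range(3):
    apply_evidence(b, "supporting", 0.85)
    print(f"{b.sprt_log_ratio:+.3f} {b.sprt_status} {b.confidence:.2f}")
# +1.735 accumulating 0.60
# +3.469 promoted 0.70
# +5.204 promoted 0.70
```

The third supporting observation still moves the ratio but leaves confidence at 0.70, because the promote transition has already fired.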

Walk through an example

Belief: the worker pool should be sized at 4 threads. Initial confidence 0.6, sprt_log_ratio 0, status accumulating.

Three refuting observations, each at strength 0.85:

Evidence	Contribution ln(0.15/0.85)	Cumulative log ratio	Status	Confidence
#1 refute @ 0.85	−1.735	−1.735	accumulating	0.60
#2 refute @ 0.85	−1.735	−3.469 ← crosses −2.251	demoted	0.40
#3 refute @ 0.85	−1.735	−5.204	demoted (already)	0.40

The belief is demoted after the second observation; the third leaves confidence at 0.40. A fourth refute would not move confidence any further either; the agent should call revise with the new wording.
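The same arithmetic gives the number of equal-strength refutes needed to demote a fresh belief. The sketch below assumes the ratio starts at 0 and that every refute has the same strength; `refutes_needed` is a hypothetical helper, not part of Mneva.

```python
import math

DEMOTE = math.log(0.1 / 0.95)   # B ≈ -2.251

def refutes_needed(strength: float) -> int:
    """Smallest number of identical refutes that reaches the demote boundary."""
    s = min(max(strength, 0.1), 0.9)
    step = math.log((1 - s) / s)          # negative when s > 0.5
    if step >= 0:
        raise ValueError("strength must exceed 0.5 to push toward demotion")
    return math.ceil(DEMOTE / step)       # both negative, so the quotient is positive

for s in (0.6, 0.85, 0.9):
    print(s, refutes_needed(s))
# 0.6 6
# 0.85 2
# 0.9 2
```

Even at the maximum clamped strength of 0.9, a single refute contributes about −2.197 and falls just short of B, so at least two are always required.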

You can call evidence_for(belief_id) at any time to see the full event-by-event trail with timestamps and notes. No magic number; the confidence has a receipt.

Why we do not auto-supersede

It would be technically easy to auto-write a new belief when the ratio crosses B. Mneva deliberately does not do this.

Structural revision changes what recall returns to the agent, and silently mutating the working set of beliefs is the wrong default. The intermediate state (status='demoted' + confidence 0.40 + old wording still current) is the honest state: the belief is in trouble and the agent is responsible for writing the replacement.

When you call revise explicitly, the new wording goes in your voice, with your reasoning. Mneva's job is to surface that the moment has arrived. The wording is yours.
