Ablation Tests

Every core module in Mnemo has been validated through ablation experiments — systematically removing each component and measuring the impact on retrieval quality.

35 tests, 3 suites, 100% pass.

Why Ablation?

Most memory frameworks claim features without proving they help. Mnemo takes a different approach: every module must earn its place through measurable contribution. If removing a component doesn't degrade performance, it shouldn't exist.

Results Summary

Reflection Ablation (12 tests)

| Module | Method | Measured Contribution |
| --- | --- | --- |
| Logistic decay | Compare logistic vs linear decay curves | 3-5x sharper discrimination between recent and old reflections |
| Invariant/Derived split | Compare differential vs uniform decay | 20-day invariant outranks 20-day derived (identity facts persist) |
| 3-tier system | Compare Core/Working/Peripheral vs flat β=1.0 | >5% score spread at 90 days |
| Access reinforcement | Compare log1p reinforcement vs no reinforcement | 5 accesses → 60% half-life extension, with diminishing returns |
| Resonance gate | Compare full pipeline vs gated pipeline | >50% compute savings by skipping low-relevance queries |
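
The access-reinforcement result can be illustrated with a minimal sketch. It assumes the half-life is multiplied by `1 + GAIN * log1p(accessCount)`; this form, the `GAIN` constant (back-solved so that 5 accesses give a 60% extension), the function name, and the 30-day base half-life are all hypothetical and are not taken from Mnemo's source.

```javascript
// Hypothetical reinforcement model, NOT Mnemo's actual implementation.
// GAIN is chosen so that 5 accesses extend the half-life by exactly 60%.
const GAIN = 0.6 / Math.log1p(5);

// Half-life after `accessCount` accesses; log1p makes each extra access count less.
function reinforcedHalfLife(baseHalfLifeDays, accessCount) {
  return baseHalfLifeDays * (1 + GAIN * Math.log1p(accessCount));
}

const base = 30; // assumed base half-life in days
for (const n of [0, 1, 5, 10, 50]) {
  const extension = reinforcedHalfLife(base, n) / base - 1;
  console.log(`${n} accesses: +${(extension * 100).toFixed(0)}% half-life`);
}
```

Under this assumption the first five accesses add more half-life than the next five, which is the diminishing-returns behavior the ablation reports.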

Salience Ablation (12 tests)

| Ablation | Method | Measured Impact |
| --- | --- | --- |
| Remove salience from recency | Set salience=0 in Weibull formula | Half-life extension lost for emotional memories |
| Remove salience from intrinsic | Set salience=0 in value formula | 15-40% value boost eliminated |
| Coefficient sensitivity | Test 0.2 / 0.5 / 0.8 | 0.5 = balanced sweet spot (~50% of max spread) |
| Tier × salience interaction | Compare spread across Core/Working/Peripheral | Peripheral benefits most (steeper decay amplifies effect) |
| Heuristic signal quality | Score emotional vs routine text | >0.3 separation gap |
| Full ablation (salience=0) | Remove salience entirely | Ranking spread collapses to 0; signal completely lost |
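
The full-ablation row can be sketched as follows. The scoring function here is a stand-in (a base value multiplied by `1 + salienceCoeff * salience`), not Mnemo's Weibull or value formula, and the two sample memories are invented for illustration. The point is only the method: score once with salience, once with salience forced to 0, and compare the ranking spread.

```javascript
// Hypothetical scoring function, NOT Mnemo's actual formula.
function score(memory, salienceCoeff) {
  return memory.base * (1 + salienceCoeff * memory.salience);
}

// Spread = best score minus worst score; `ablate` forces salience to 0.
function spread(memories, salienceCoeff, ablate = false) {
  const scores = memories.map((m) =>
    score(ablate ? { ...m, salience: 0 } : m, salienceCoeff)
  );
  return Math.max(...scores) - Math.min(...scores);
}

// Two invented memories that differ only in salience.
const memories = [
  { text: "routine status update", base: 0.5, salience: 0.1 },
  { text: "emotional milestone", base: 0.5, salience: 0.9 },
];

console.log("with salience:", spread(memories, 0.5));
console.log("salience ablated:", spread(memories, 0.5, true)); // 0
```

Because the sample memories are identical apart from salience, removing it leaves nothing to separate them, which mirrors the collapse to 0 reported in the table.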

Salience A/B Test (11 tests)

This suite compares the current parameters against a proposed alternative, using 12 realistic memories across 4 time horizons:

Config A (Current): importanceModulation=1.5, salienceCoeff=0.5, intrinsicCoeff=0.3
Config B (Candidate): importanceModulation=1.0, salienceCoeff=0.65, intrinsicCoeff=0.4

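| Metric | A | B | Winner |
| --- | --- | --- | --- |
| Discrimination 30d | 0.698 | 0.714 | B |
| Discrimination 90d | 0.738 | 0.731 | A |
| Discrimination 120d | 0.718 | 0.706 | A |
| Ranking tau (60d) | 0.909 | 0.909 | Tie |
| Survival curve | Identical | Identical | Tie |
| Signal balance (60d) | 40/24/36% | 37/25/38% | Both balanced |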

Result: 2:2 tie. Config A wins at the long-term horizons (90d and beyond), and Config B wins at short-term horizons. Because long-term stability matters more for a memory system, the current parameters (Config A) were kept.
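
The "Ranking tau" metric is not defined on this page. Assuming it is Kendall's tau computed over the 12 test memories (an assumption, as is the `kendallTau` helper below), 12 items form 66 pairs, and a value of 0.909 would correspond to 3 discordant pairs: (63 − 3) / 66 ≈ 0.909. A minimal sketch with an invented ranking that has exactly three adjacent swaps:

```javascript
// Kendall's tau-a for two rankings without ties:
// (concordant pairs - discordant pairs) / total pairs.
function kendallTau(rankX, rankY) {
  const n = rankX.length;
  let concordant = 0;
  let discordant = 0;
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const sign =
        Math.sign(rankX[i] - rankX[j]) * Math.sign(rankY[i] - rankY[j]);
      if (sign > 0) concordant++;
      else if (sign < 0) discordant++;
    }
  }
  return (concordant - discordant) / ((n * (n - 1)) / 2);
}

// Invented example: 12 items, reference order vs. an order with three adjacent swaps.
const reference = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11];
const observed = [1, 0, 3, 2, 5, 4, 6, 7, 8, 9, 10, 11];

console.log(kendallTau(reference, observed).toFixed(3)); // 0.909
```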

Key Findings

  1. importanceModulation is the strongest factor: exp(1.5 × 0.7) ≈ 2.86x half-life amplification dominates the decay formula
  2. Salience is a secondary but essential signal: its value is greatest at 60d+ horizons and for peripheral-tier memories
  3. Every module is load-bearing: no component can be safely removed without measurable quality loss
  4. Parameters are at the sweet spot: A/B testing confirmed no adjustment needed
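
The arithmetic behind the first finding can be checked directly. The sketch below assumes the amplification has the form exp(importanceModulation × importance) and that 0.7 is an importance score; the source gives only the numbers, so the `importance` name and the helper function are hypothetical.

```javascript
// Assumed form: half-life amplification = exp(importanceModulation * importance).
function halfLifeAmplification(importanceModulation, importance) {
  return Math.exp(importanceModulation * importance);
}

const importance = 0.7; // assumed meaning of the 0.7 factor in the finding above

console.log(halfLifeAmplification(1.5, importance).toFixed(2)); // Config A: 2.86
console.log(halfLifeAmplification(1.0, importance).toFixed(2)); // Config B: 2.01
```

Under the same assumption, Config B's lower importanceModulation of 1.0 would reduce the amplification from about 2.86x to about 2.01x.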

Running the Tests

```bash
# Reflection ablation
node --test packages/core/test/reflection-ablation.test.mjs

# Salience ablation
node --test packages/core/test/salience-ablation.test.mjs

# Salience A/B parameter test
node --test packages/core/test/salience-ab-test.mjs
```

Released under the MIT License.