Abstract. This document describes the AGISystem2 saturation evaluation suite (evals/runSaturationEval.mjs), designed to measure how quickly different HDC strategies lose discriminative power when many facts are superposed into a single composite representation. We construct synthetic “books” as hierarchical bundles (records → chapters → book) and test whether the resulting @Book vector supports pure holographic membership queries via the reasoning equation UNBIND(Book, QueryKey), alongside symbolic query validation. The suite is explicitly comparative: classic probabilistic VSA strategies are measured against the lossless EXACT strategy, which provides an upper bound on retrievability.

1. What “Saturation” Means Here

In HDC/VSA systems, a bundle/superposition combines many items into one vector. As the number of bundled items grows, the composite vector may become less discriminative: unrelated candidates can become “accidentally” similar to a query, increasing ambiguity and false positives. We refer to this effect as saturation.

Key idea: the suite does not test “can the symbolic engine answer the query?” (it usually can), but “can the @Book composite alone support retrieval via UNBIND + cleanup?”

2. Data Model: Books, Chapters, Ideas, and Records

The suite uses DSL files in evals/saturation/books/ to simulate books:

Records (facts)

Index-like facts that connect a book identifier, a key, and an idea.

Mentions Book02 Key_B02_C04_I02 ActionSequencing

Chapters

Content is a bundle of record vectors. Optional ordered structure is kept separately.

@Chapter04:Chapter04 bundle [$B02_R0007, $B02_R0008]

Book

Book content is a bundle of chapter content vectors (hierarchical superposition).

@Book:Book bundle [$Chapter01, ..., $Chapter04]
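
In code terms the hierarchy is just a bundle of bundles. A toy sketch over plain bipolar vectors (helper names and encoding are illustrative, not the suite's):

// Toy "bundle of bundles": records -> chapter vectors -> book vector.
const D = 1024;
const randVec = () => Array.from({ length: D }, () => (Math.random() < 0.5 ? -1 : 1));
const bundle = (vs) => vs.reduce((acc, v) => acc.map((x, i) => x + v[i]), new Array(D).fill(0));

const chapters = [
  { records: [randVec(), randVec()] },             // stands in for $B02_R0007, $B02_R0008, ...
  { records: [randVec(), randVec(), randVec()] },
];
const chapterVecs = chapters.map(ch => bundle(ch.records));  // ~ @ChapterNN content
const bookVec = bundle(chapterVecs);                         // ~ @Book content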

3. Why We Keep _Seq Variables (Structure vs. Membership)

Real books have order (ideas in a chapter; chapters in a book). In AGISystem2, the structural operator __Sequence builds an ordered superposition by binding each element to a position marker (Pos1, Pos2, …) before bundling.

@Chapter04_Seq __Sequence [$B02_R0007, $B02_R0008]
@Chapter04:Chapter04 bundle [$B02_R0007, $B02_R0008]

For saturation testing we need a membership-oriented representation: the @Book vector should behave like a superposition of queryable records. If we used __Sequence everywhere, the “book content” would be expressed in a different algebra (everything becomes “positioned”), and the simple fact-unbinding pattern would no longer reflect the intended query semantics.

Design choice: keep ordered structure as *_Seq metadata, while using pure bundle for @ChapterNN and @Book content vectors.
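
The distinction is easy to see in a toy encoding (illustrative helpers, not the DSL's internals): a sequence binds each element to a position marker before bundling, while a plain bundle keeps the elements directly comparable to the composite:

// Ordered (sequence-style) vs membership-oriented (bundle-style) structure.
const D = 1024;
const randVec = () => Array.from({ length: D }, () => (Math.random() < 0.5 ? -1 : 1));
const bind = (a, b) => a.map((x, i) => x * b[i]);            // elementwise product binding
const bundle = (vs) => vs.reduce((acc, v) => acc.map((x, i) => x + v[i]), new Array(D).fill(0));

const [r7, r8, pos1, pos2] = [randVec(), randVec(), randVec(), randVec()];
const chapterSeq    = bundle([bind(r7, pos1), bind(r8, pos2)]); // ~ @Chapter04_Seq (__Sequence)
const chapterBundle = bundle([r7, r8]);                         // ~ @Chapter04 content (bundle)
// chapterBundle correlates directly with r7 and r8; chapterSeq only matches
// them after the position markers are unbound again.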

4. Query Protocol (Two Holographic Tests + Symbolic Validation)

Each book file contains two markers parsed by the runner:

# SAT_QUERY_POS op=Mentions book=Book02 key=Key_B02_C04_I02 expect=ActionSequencing
# SAT_QUERY_NEG op=Mentions book=Book02 key=Key_B02_Missing expect=none
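
For illustration, a small parser along these lines could pull the markers and their key=value fields out of a book file (the regex and field names are inferred from the examples above, not taken from the runner):

// Extract SAT_QUERY_POS / SAT_QUERY_NEG markers and their key=value fields.
const parseSatQueries = (text) =>
  [...text.matchAll(/^#\s*SAT_QUERY_(POS|NEG)\s+(.+)$/gm)].map(([, kind, rest]) => ({
    kind,
    ...Object.fromEntries(rest.trim().split(/\s+/).map(kv => kv.split('='))),
  }));
// => [{ kind: 'POS', op: 'Mentions', book: 'Book02', key: '...', expect: '...' }, ...]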

4.1 Holographic decode A: (book, key) → idea

We treat Mentions(book, key, idea) as a 3-argument record with positional encoding (Pos1..Pos3). For a query where book and key are known, we compute a query key and unbind:

partial = Mentions BIND (Book BIND Pos1) BIND (Key BIND Pos2)
answer  = UNBIND(BookVector, partial)
ideaVec = UNBIND(answer, Pos3)

The recovered ideaVec is then “cleaned up” by ranking against a bounded candidate set.
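
A self-contained toy version of this decode under an assumed XOR/majority encoding; every helper and symbol name here (bind, bundle, sim, sym, ...) is illustrative and not the runner's API:

// Toy decode A: (book, key) -> idea from a bundled composite.
const D = 2048;
const randVec = () => Uint8Array.from({ length: D }, () => (Math.random() < 0.5 ? 0 : 1));
const bind = (a, b) => a.map((x, i) => x ^ b[i]);   // XOR binding (self-inverse)
const unbind = bind;
const bundle = (vs) => Uint8Array.from({ length: D }, (_, i) => {
  const ones = vs.reduce((s, v) => s + v[i], 0);    // bitwise majority vote
  if (ones * 2 === vs.length) return Math.random() < 0.5 ? 1 : 0;
  return ones * 2 > vs.length ? 1 : 0;
});
const sim = (a, b) => a.reduce((s, x, i) => s + (x === b[i] ? 1 : 0), 0) / D;

const sym = Object.fromEntries(
  ['Mentions', 'Book02', 'Key1', 'Key2', 'IdeaA', 'IdeaB', 'Pos1', 'Pos2', 'Pos3']
    .map(name => [name, randVec()]));

// Mentions(book, key, idea) as a positionally encoded record.
const record = (book, key, idea) =>
  bind(bind(bind(sym.Mentions, bind(book, sym.Pos1)), bind(key, sym.Pos2)), bind(idea, sym.Pos3));

// @Book stand-in: a bundle of two records.
const bookVec = bundle([
  record(sym.Book02, sym.Key1, sym.IdeaA),
  record(sym.Book02, sym.Key2, sym.IdeaB),
]);

// Decode A for (Book02, Key1): unbind the partial key, then the position marker.
const partial = bind(bind(sym.Mentions, bind(sym.Book02, sym.Pos1)), bind(sym.Key1, sym.Pos2));
const ideaVec = unbind(unbind(bookVec, partial), sym.Pos3);
console.log(sim(ideaVec, sym.IdeaA), sim(ideaVec, sym.IdeaB)); // IdeaA should score clearly higher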

4.2 Holographic decode B: (book, idea) → key (membership test)

To approximate the question “is this idea in the book?”, we invert the missing slot:

partial = Mentions BIND (Book BIND Pos1) BIND (Idea BIND Pos3)
answer  = UNBIND(BookVector, partial)
keyVec  = UNBIND(answer, Pos2)

If the idea is not present, the decoded key should not match any real key confidently.
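
Continuing the toy sketch from 4.1 (same illustrative helpers and symbols), decode B swaps which slot is missing; for a NEG query, the decoded key should stay near the noise floor for every real key:

// Toy decode B: (Book02, IdeaA) -> key; reuses bind/unbind/sim/sym/bookVec from 4.1.
const partialB = bind(bind(sym.Mentions, bind(sym.Book02, sym.Pos1)), bind(sym.IdeaA, sym.Pos3));
const keyVec = unbind(unbind(bookVec, partialB), sym.Pos2);
// POS case: sim(keyVec, sym.Key1) should clearly beat sim(keyVec, sym.Key2).
// NEG case: repeating this with an idea that was never bundled leaves every real key near chance.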

4.3 Candidate sets (cleanup) simulate a reverse-index

Pure holographic decode typically needs cleanup against a candidate set. The runner simulates a "reverse-index narrowed" candidate pool of fixed size (default 10), and the decoded vector is ranked against this bounded pool rather than against the full vocabulary.
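
A minimal sketch of such a cleanup step, assuming a precomputed candidate pool and some strategy-specific similarity function passed in as sim (illustrative, not the runner's code):

// Rank a bounded candidate pool against the decoded vector and report the
// top1 - top2 margin used by the pass criteria in 5.3.
function cleanup(decoded, candidates, sim) {
  const ranked = candidates
    .map(c => ({ name: c.name, score: sim(decoded, c.vector) }))
    .sort((a, b) => b.score - a.score);
  const margin = ranked.length > 1 ? ranked[0].score - ranked[1].score : ranked[0].score;
  return { best: ranked[0], margin, simChecks: candidates.length };
}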

4.4 Symbolic validation

In parallel, the suite validates correctness via the query engine:

@q Mentions Book02 Key_B02_C04_I02 ?idea
@q Mentions Book02 ?key ActionSequencing

This ensures that failures in holographic decode are interpreted as saturation effects (representation/geometry), not “missing knowledge”.

5. Running the Suite and Interpreting Outputs

node evals/runSaturationEval.mjs
node evals/runSaturationEval.mjs --full
node evals/runSaturationEval.mjs --huge
node evals/runSaturationEval.mjs --extra-huge
node evals/runSaturationEval.mjs --strategies=dense-binary,metric-affine,metric-affine-elastic,exact
node evals/runSaturationEval.mjs --priority=holographicPriority
node evals/runSaturationEval.mjs --no-color

5.1 Suite modes (geometry sweeps)

Each strategy interprets its "geometry" parameter differently (bits for Dense-Binary, k for Sparse-Polynomial, bytes for Metric, etc.). The suite provides fast/full/huge/extra-huge presets to explore scaling without changing code.
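
To make the sweep concrete, a preset table could be shaped roughly like the object below. Every number here is a made-up placeholder for illustration; the real values live in runSaturationEval.mjs and are not reproduced here.

// Hypothetical shape of a geometry sweep; strategy names match the CLI, values are placeholders.
const presets = {
  fast: { 'dense-binary': [2048],        'sparse-polynomial': [16] },
  full: { 'dense-binary': [2048, 8192],  'sparse-polynomial': [16, 32] },
  huge: { 'dense-binary': [8192, 32768], 'sparse-polynomial': [32, 64] },
};
// Each strategy reads its number differently (bits, k, bytes, ...), so a
// geometry value is only comparable within a single strategy.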

5.2 Summary metrics

The runner prints a summary table; the main columns, and why each matters for saturation, are as follows (names may vary by mode):

HDC: holographic decode A pass rate (book, key → idea). Measures retrieval from the composite alone.
HMem: holographic membership pass rate (book, idea → key). Direct proxy for "idea ∈ book?".
Query, QMem: symbolic validation pass rates. Ground truth; these should usually pass unless the DSL is wrong.
AvgPosM, AvgNegM: average margin (top1 - top2 similarity) for POS/NEG queries. Lower margins indicate ambiguity and an approach to saturation.
SimChk: number of similarity checks performed during cleanup. Cost proxy; cleanup is O(|candidates|).
UnbChk (EXACT): operation-count proxy for polynomial unbind (subset checks). EXACT cost scales with the number of terms in the composite.
Time: total per-config time (learn + decode). Allows throughput comparison across strategies.

5.3 Pass criteria (avoiding “tie passes”)

In saturation experiments it is easy to accidentally count a decode as "correct" even when the representation carries no usable signal (e.g., every candidate has similarity 0.000, and the ranking depends on iteration order). To ensure the suite measures real retrieval, the runner uses a strict positive criterion and a conservative negative criterion.
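
A minimal sketch of what such criteria can look like; the threshold values below are illustrative assumptions, not the runner's actual numbers:

// Strict positive / conservative negative decision over a ranked candidate list.
function judge(ranked, expected, minScore = 0.05, minMargin = 0.01) {
  const [top1, top2] = ranked;                        // sorted by similarity, descending
  const margin = top1.score - (top2 ? top2.score : 0);
  if (expected === 'none') {
    // Negative query: pass only if nothing matches confidently.
    return top1.score < minScore;
  }
  // Positive query: the expected answer must win with real signal, not by tie-break.
  return top1.name === expected && top1.score >= minScore && margin >= minMargin;
}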

Consequence: symbolic queries can be 100% correct while holographic decode reports failures—this is expected and indicates saturation or threshold mismatch, not missing data.

6. Why Results Differ Across Strategies and Parameters

6.1 Geometry and capacity

Higher geometry typically increases capacity (more nearly-orthogonal space), delaying saturation. However, strategies differ in how "capacity" manifests.
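
For a rough feel of why higher dimension buys capacity, the tiny self-contained experiment below bundles random bipolar vectors by summation (a common textbook VSA setting; the suite's strategies encode things differently, so treat this purely as intuition). A member of the bundle keeps similarity on the order of 1/sqrt(N), while an unrelated vector sits around 0 ± 1/sqrt(D); saturation is the regime where those two ranges start to overlap.

// Toy capacity experiment: similarity of a bundle to a member vs. an outsider.
const D = 4096, N = 200;
const rand = () => Array.from({ length: D }, () => (Math.random() < 0.5 ? -1 : 1));
const items = Array.from({ length: N }, rand);
const bundleVec = items.reduce((acc, v) => acc.map((x, i) => x + v[i]), new Array(D).fill(0));
const cos = (a, b) => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < D; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return dot / Math.sqrt(na * nb);
};
console.log('member   ', cos(bundleVec, items[0]).toFixed(3)); // ~ 1/sqrt(N) ≈ 0.071
console.log('unrelated', cos(bundleVec, rand()).toFixed(3));   // ~ 0 ± 1/sqrt(D) ≈ 0.016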

6.2 Strategy-specific UNBIND semantics

In XOR-based strategies, UNBIND is often implemented by BIND again (cancellation). In EXACT, UNBIND is quotient-like: it searches for residual terms consistent with the component. The saturation suite runs both default modes.

In this suite, most UNBIND calls use single-term components, so A/B are often similar; differences emerge in multi-term unbinding workloads.
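
A compact way to see the contrast (toy, self-contained snippets; neither is the suite's implementation): XOR-style unbinding is literally a second bind, while a quotient-style unbind has to scan an explicit list of terms, which is the kind of work the UnbChk column proxies.

// XOR binding is self-inverse: unbind is just bind again.
const xorBind = (a, b) => a.map((x, i) => x ^ b[i]);
const xorUnbind = xorBind;                       // a ^ b ^ b === a, bitwise cancellation

// Naive quotient-style unbind over an explicit term list (stand-in for EXACT):
// keep the terms that contain the component and strip it out of them.
const exactUnbind = (terms, component) =>        // terms: Array<Set<string>>
  terms.filter(t => t.has(component))
       .map(t => new Set([...t].filter(s => s !== component)));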

6.3 Why symbolic queries can pass while holographic decode fails

Symbolic query engines use structured metadata and indexes over persisted facts. They can answer a query even when the holographic composite saturates, because they do not rely on approximate similarity in a single bundled representation. In contrast, holographic decode is intentionally “index-less” and therefore sensitive to saturation.

7. Book Families and Scaling Axes

The suite includes multiple book families to test different scaling regimes.

Interpretation: different scaling axes saturate strategies differently. Increasing “ideas per chapter” increases local bundle density; increasing “per-idea width” increases the number of facts and tokens competing in the same composite.

8. Practical Reading of Variations

When comparing runs across presets (--fast/--full/--huge/--extra-huge), interpret changes as follows: