Performance & accuracy across strategies, geometries, and reasoning priorities.
Capacity study: superposition stress tests and holographic membership queries.
Research directions for Dense, SPHDC, Metric-Affine, EMA, and EXACT.
Near-term and long-term plan for scaling experiments and DSL growth.
HDC Tried, HDC Valid, HDC Match, HDC Final), since “HDC worked” can mean “HDC matched symbolic” even when symbolic is chosen for richer proof traces.
For theoretical foundations, see HRR/VSA Comparison.
The Research section is organized around reproducible experiments (evaluation suites), theory notes, and forward-looking directions. These are the current topics we actively study:
Measures how quickly different HDC strategies lose discriminative power under hierarchical superposition (“book” = bundle(chapters) = bundle(records)) and evaluates pure holographic membership queries vs. symbolic ground truth.
Comparative timing and success rates across strategies, geometries, and reasoning priorities on the Core Theory suite.
Near-term and long-term plan: scaling experiments, new query operators, improved cleanup, and strategy exploration.
Reasoning engines and priorities, decoding workflows, and the tradeoffs between purely holographic steps and symbolic validation.
Additional research directions are documented under HDC Strategies, Learning from Text, NL→DSL, Proof→NL, and Semantic Libraries.
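The hierarchical superposition used in the capacity study ("book" = bundle(chapters) = bundle(records)) can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the 2048-bit dense binary vectors, the majority-vote `bundle`, the 3×3 layout, and all function names are hypothetical and are not taken from the AGISystem2 code base.

```javascript
// Hypothetical sketch: names and parameters are illustrative, not AGISystem2 APIs.
const DIM = 2048; // bits per hypervector (assumed)

// Random dense binary hypervector, stored one bit per byte for clarity.
function randomVector() {
  const v = new Uint8Array(DIM);
  for (let i = 0; i < DIM; i++) v[i] = Math.random() < 0.5 ? 0 : 1;
  return v;
}

// Bundle by bitwise majority vote; an odd number of inputs avoids ties.
function bundle(vectors) {
  const out = new Uint8Array(DIM);
  for (let i = 0; i < DIM; i++) {
    let ones = 0;
    for (const v of vectors) ones += v[i];
    out[i] = ones * 2 > vectors.length ? 1 : 0;
  }
  return out;
}

// Fraction of matching bits: 1.0 means identical, about 0.5 means unrelated.
function similarity(a, b) {
  let same = 0;
  for (let i = 0; i < DIM; i++) if (a[i] === b[i]) same++;
  return same / DIM;
}

// Three chapters of three records each; the book bundles the chapter bundles.
const chapters = [0, 1, 2].map(() => [randomVector(), randomVector(), randomVector()]);
const chapterVectors = chapters.map((records) => bundle(records));
const book = bundle(chapterVectors);

const member = chapters[0][0];
const chapterSim = similarity(member, chapterVectors[0]); // one level: about 0.75 in expectation
const memberSim = similarity(member, book);               // two levels: about 0.625 in expectation
const outsiderSim = similarity(randomVector(), book);     // non-member: about 0.5
console.log({ chapterSim, memberSim, outsiderSim });
```

Each extra level of bundling pulls the member's similarity closer to the 0.5 noise floor, which is the loss of discriminative power that a membership threshold has to survive.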
Benchmark Configs (Dec 2025 snapshot)
Current Full Matrix (incl. EMA)
Total Tests (27 Suites)
Success Rate (All Configurations)
Fastest Config (Metric 32-byte)
| Strategy | Vector Size(s) Tested | Bind Operation | Similarity | Status |
|---|---|---|---|---|
| Dense-Binary (Classic VSA) | 256, 512 bytes | XOR (O(n/32) ops) | Hamming distance | Baseline (Standard HRR) |
| Sparse-Polynomial (Novel) | k=2, k=4 BigInt exponents | Symmetric difference (O(k²) ops) | Jaccard index | Novel paradigm (NOT HRR) |
| Metric-Affine (Novel) | 16, 32 bytes | Affine transformation | Channel overlap | ⚡ Fastest (317ms metric-16 vs 845ms sparse-4) |
| Metric-Affine Elastic (Extension) | 32+ bytes (elastic) | Affine transformation | Channel overlap (max over chunks) | Extension for large KB superpositions |
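The Dense-Binary row (XOR bind, Hamming similarity) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: the vector is assumed to be 2048 bits packed into 64 32-bit words (the n/32 in the table), and `randomVector`, `bind`, `popcount32`, `hammingDistance`, and `similarity` are hypothetical names.

```javascript
// Illustrative sketch only; function names are hypothetical.
const WORDS = 64; // 2048 bits packed into 64 32-bit words (n/32)

function randomVector() {
  const v = new Uint32Array(WORDS);
  for (let i = 0; i < WORDS; i++) v[i] = (Math.random() * 0x100000000) >>> 0;
  return v;
}

// Bind: word-wise XOR. XOR is its own inverse, so bind(bind(a, b), b) equals a.
function bind(a, b) {
  const out = new Uint32Array(WORDS);
  for (let i = 0; i < WORDS; i++) out[i] = a[i] ^ b[i];
  return out;
}

// Count the set bits of a 32-bit word (classic parallel bit count).
function popcount32(x) {
  x >>>= 0;
  x = x - ((x >>> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);
  x = (x + (x >>> 4)) & 0x0f0f0f0f;
  return Math.imul(x, 0x01010101) >>> 24;
}

// Hamming distance: number of bit positions where a and b differ.
function hammingDistance(a, b) {
  let d = 0;
  for (let i = 0; i < WORDS; i++) d += popcount32(a[i] ^ b[i]);
  return d;
}

// Normalized similarity in [0, 1]; unrelated random vectors land near 0.5.
function similarity(a, b) {
  return 1 - hammingDistance(a, b) / (WORDS * 32);
}

const a = randomVector();
const b = randomVector();
const bound = bind(a, b);
const recovered = bind(bound, b); // unbinding with the same key restores a
console.log(hammingDistance(recovered, a), similarity(a, bound).toFixed(2));
```

The bound vector is dissimilar to both inputs, while unbinding with either input recovers the other exactly, which is the property the holographic query path relies on.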
We evaluate reasoning systems using two complementary test suites:
runStressCheck.js
Purpose: Validate theory loading and detect errors (syntax, missing dependencies, contradictions)
| Test Phase | Description | Files Tested |
|---|---|---|
| Base Theories | Core reasoning theories (relations, logic, temporal, modal) | 17 Core files |
| Stress Theories | Domain knowledge (biology, sociology, logic, math, medicine, etc.) | 12 domain files |
| Validation | Syntax check, dependency resolution, contradiction detection | All .sys2 files |
Default Run (--full): 6 configurations in parallel
───────────────────────────────────────────────────────────────
Strategy       | Reasoning    | Load Time | Result
───────────────────────────────────────────────────────────────
dense-binary   | symbolic     | 858ms     | ✓ 0 errors
dense-binary   | holographic  | 710ms     | ✓ 0 errors
sparse-poly    | symbolic     | 608ms     | ✓ 0 errors
sparse-poly    | holographic  | 505ms     | ✓ 0 errors
metric-affine  | symbolic     | 412ms     | ✓ 0 errors
metric-affine  | holographic  | 326ms     | ✓ 0 errors
───────────────────────────────────────────────────────────────
All strategies still load 1,314 facts from stress theories
runQueryEval.mjs
Purpose: Test advanced semantic reasoning (analogy, abduction, induction, explanation)
12 complex queries testing:
Core Theory results (npm run eval -- --full)

| Configuration | Geometry | Success Rate | Total Time | Speedup vs Slowest |
|---|---|---|---|---|
| Metric-Affine + Symbolic | 32 bytes | 100% (364/364) | 318ms | ⚡ 2.6x (FASTEST) |
| Metric-Affine + Symbolic | 16 bytes | 100% (364/364) | 337ms | 2.5x |
| Sparse-Polynomial + Symbolic | k=2 | 100% (364/364) | 349ms | 2.4x |
| Dense-Binary + Symbolic | 512 bytes | 100% (364/364) | 355ms | 2.4x |
| Metric-Affine + Holographic | 32 bytes | 100% (364/364) | 386ms | 2.2x |
| Sparse-Polynomial + Holographic | k=2 | 100% (364/364) | 390ms | 2.1x |
| Metric-Affine + Holographic | 16 bytes | 100% (364/364) | 411ms | 2.0x |
| Dense-Binary + Symbolic | 256 bytes | 100% (364/364) | 456ms | 1.8x |
| Dense-Binary + Holographic | 512 bytes | 100% (364/364) | 475ms | 1.8x |
| Dense-Binary + Holographic | 256 bytes | 100% (364/364) | 530ms | 1.6x |
| Sparse-Polynomial + Symbolic | k=4 | 100% (364/364) | 711ms | 1.2x |
| Sparse-Polynomial + Holographic | k=4 | 100% (364/364) | 835ms | 1.0x (baseline) |
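The "Speedup vs Slowest" column is the slowest total time divided by each configuration's total time, rounded to one decimal place. The short sketch below recomputes it from the times in the table; the object keys are just labels chosen for this example.

```javascript
// Total times (ms) copied from the Core Theory table above.
const totalTimeMs = {
  "Metric-Affine + Symbolic (32 bytes)": 318,
  "Metric-Affine + Symbolic (16 bytes)": 337,
  "Sparse-Polynomial + Symbolic (k=2)": 349,
  "Dense-Binary + Symbolic (512 bytes)": 355,
  "Metric-Affine + Holographic (32 bytes)": 386,
  "Sparse-Polynomial + Holographic (k=2)": 390,
  "Metric-Affine + Holographic (16 bytes)": 411,
  "Dense-Binary + Symbolic (256 bytes)": 456,
  "Dense-Binary + Holographic (512 bytes)": 475,
  "Dense-Binary + Holographic (256 bytes)": 530,
  "Sparse-Polynomial + Symbolic (k=4)": 711,
  "Sparse-Polynomial + Holographic (k=4)": 835,
};

// Baseline is the slowest configuration (835 ms).
const slowest = Math.max(...Object.values(totalTimeMs));

// speedup = slowest / time, formatted with one decimal place.
const speedups = Object.fromEntries(
  Object.entries(totalTimeMs).map(([name, ms]) => [name, (slowest / ms).toFixed(1) + "x"])
);

console.log(speedups);
```

For the fastest row this gives 835 / 318 ≈ 2.63, reported as 2.6x.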
| Suite Category | Tests | Coverage |
|---|---|---|
| Foundations & Hierarchies | 35 | Deep transitive chains (6-10 steps), type taxonomies, property inheritance |
| Logic & Rules | 75 | Rule inference, negation, compound logic (AND/OR/NOT), modal operators |
| Temporal & Causal | 28 | before/after chains, causes relationships, event ordering |
| Advanced Reasoning | 105 | Composition, CSP, fuzzy matching, meta-operators (similar, analogy, deduce) |
| Domain-Specific | 45 | Set theory, biological pathways, predicate logic, planning primitives |
| Integrity & Robustness | 76 | Contradiction detection, deduction, atomic learn transactions |
runQueryEval.mjs
Operator definitions for similar, analogy, happenedBefore, solve, and isBestExplanation are still missing, which
confirms that semantic coverage, not HDC compute, is the current bottleneck. EMA and EXACT are not part of this historical benchmark.
| Strategy | Priority | Geometry | Success Rate | Total Time | Speedup vs Dense Sym |
|---|---|---|---|---|---|
| Metric-Affine | holographic | 32 bytes | 100% (12/12) | 326ms | ⚡ 2.6x |
| Metric-Affine | symbolic | 32 bytes | 100% (12/12) | 412ms | 2.1x |
| Sparse-Polynomial | holographic | k=4 | 25% (3/12) | 505ms | 1.7x |
| Sparse-Polynomial | symbolic | k=4 | 25% (3/12) | 608ms | 1.4x |
| Dense-Binary | holographic | 2048 bits | 67% (8/12) | 710ms | 1.2x |
| Dense-Binary | symbolic | 2048 bits | 67% (8/12) | 858ms | 1.0x (baseline) |
The 12 advanced queries cover causality, analogy, temporal reasoning, inductive generalization, CSP, explanation, and property inheritance.
Only Q1 (causal chains), Q6 (deductive proof), and Q11 (whatif) return successful results on all six configurations. The remaining nine
queries succeed in only 2-4 sessions because they require operator definitions that are still missing from the stress theories or parser
(common names: similar, analogy, abduce, induce, hasAttribute,
happenedBefore, solve, isAnalytic, isNecessary, isTransitive,
isBestExplanation, etc.).
Similarity queries (similar), analogical mappings, and explanations trip over missing HDC algebra.
happenedBefore, hasAttribute, and the CSP solve operator require new definitions or parsers, so the symbolic engines report "unknown" for those queries.
Operators still to be defined: caused, similar/analogy, happenedBefore, hasAttribute, induce/abduce, solve, and isTransitive. Addressing those definitions before adding new HDC strategies will unlock the remaining reasoning gaps.
AGISystem2 uses a multi-source query fusion strategy with configurable priority:
| Mode | Priority Order | Best For | Trade-off |
|---|---|---|---|
| symbolicPriority | Direct > Transitive > Rules > HDC | Knowledge bases, taxonomies | Fast, exact, but limited to KB content |
| holographicPriority | HDC > Direct > Transitive > Rules | Similarity search, approximation | Flexible, but requires good HDC retrieval |
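One simple reading of the priority orders in the table is "the first source that produces an answer wins". The sketch below illustrates only that reading; the actual fusion logic may combine sources differently. `fuseQuery`, the `sources` object, the query string, and the toy answers are all hypothetical.

```javascript
// Hypothetical sketch: source names mirror the table, the functions are stand-ins.
const PRIORITY_ORDER = {
  symbolicPriority: ["direct", "transitive", "rules", "hdc"],
  holographicPriority: ["hdc", "direct", "transitive", "rules"],
};

// Try each source in priority order and return the first non-null answer.
function fuseQuery(query, sources, mode) {
  for (const name of PRIORITY_ORDER[mode]) {
    const answer = sources[name](query);
    if (answer !== null) return { source: name, answer };
  }
  return { source: null, answer: null };
}

// Toy sources: only the transitive and HDC paths know an answer here.
const sources = {
  direct: () => null,
  transitive: () => "Animal",
  rules: () => null,
  hdc: () => "Mammal",
};

const symbolicResult = fuseQuery("isA Dog ?", sources, "symbolicPriority");
const holographicResult = fuseQuery("isA Dog ?", sources, "holographicPriority");
console.log(symbolicResult.source, holographicResult.source);
```

With these toy sources, symbolicPriority answers from the transitive path and holographicPriority answers from the HDC path, so the same query can return different results depending on the configured mode.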
Operator backlog and research notes in future-improvements.md
| Operator | Implementation | Quality | Impact on Query Success |
|---|---|---|---|
| similar | Jaccard similarity on properties | ⭐⭐⭐⭐⭐ Complete | Q2: 67% success (geometry-dependent) |
| analogy | Symbolic relation lookup | ⭐⭐⭐ Basic (missing HDC algebra) | Q3: 33% success (needs HDC bind/unbind) |
| abduce | Rule backward chaining | ⭐⭐⭐ Basic (missing Bayesian) | Q4: 67% success (heuristic scoring) |
| induce | Pattern frequency counting | ⭐⭐⭐ Basic (missing statistics) | Q5: 67% success (no significance testing) |
| whatif | Causal chain tracing | ⭐⭐⭐ Basic (missing do-calculus) | Q11: 100% success (simple cases work) |
| explain | Wrapper around abduce | ⭐⭐ Thin wrapper | Q10: 67% success (just calls abduce) |
| deduce | Forward chaining | ⭐⭐⭐⭐ Good | Q6: 100% success (works well) |
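The table describes similar as Jaccard similarity on properties. As a minimal sketch of that measure (the `jaccard` helper and the property names are made up for illustration and are not the engine's code):

```javascript
// Jaccard index: |A ∩ B| / |A ∪ B| over two property lists.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  let intersection = 0;
  for (const x of setA) if (setB.has(x)) intersection++;
  const union = setA.size + setB.size - intersection;
  return union === 0 ? 0 : intersection / union; // two empty sets score 0 by convention here
}

// Hypothetical property sets for two concepts.
const dog = ["hasFur", "barks", "fourLegs", "mammal"];
const cat = ["hasFur", "meows", "fourLegs", "mammal"];

const dogCat = jaccard(dog, cat); // 3 shared properties out of 5 distinct ones = 0.6
console.log(dogCat);
```

Because the score depends only on which properties are asserted, a sparse knowledge base lowers similarity for every pair, independent of the HDC strategy in use.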
| Operation | Dense-Binary | Sparse-Polynomial | Metric-Affine |
|---|---|---|---|
| Bind complexity | O(n/32) = 64 XOR ops | O(k²) = 16-64 XOR ops | O(m) = 32 byte ops (byte-wise XOR) |
| Similarity computation | Hamming (bit count) | Jaccard (set operations) | Channel overlap (byte compare) |
| Memory access pattern | 32-byte chunks | Random BigInt access | Sequential byte access |
| Cache efficiency | Good | Poor (sparse access) | Excellent (sequential) |
Metric-Affine's byte-channel representation aligns better with symbolic reasoning:
Key findings:
| Strategy | Theoretical Capacity | Practical Limit | Bottleneck |
|---|---|---|---|
| Dense-Binary | 2^2048 unique vectors | ~10K concepts (similarity threshold) | Noise accumulation in bundles |
| Sparse-Polynomial | (2^64)^k unique sets | ~100K concepts (tested) | Jaccard similarity degrades |
| Metric-Affine | 256^m unique patterns | Unknown (not tested at scale) | Channel saturation (hypothesized) |
| Use Case | Recommended Strategy | Rationale |
|---|---|---|
| Production systems | Metric-Affine (32 bytes) | 100% accuracy with holographic mode, ≈2.6x speed, 8x memory savings vs dense |
| Similarity-based retrieval | Dense-Binary (2048 bits) | Better HDC Master Equation performance (35% vs 0%) |
| Memory-constrained devices | Metric-Affine (32 bytes) | Smallest footprint with full functionality |
| Maximum speed | Metric-Affine (32 bytes, holographic) | 326ms total (2.6x faster than dense-symbolic baseline) |
| Research/experimentation | Dense-Binary (2048 bits) | Standard HDC semantics, widely understood |
| Symbolic reasoning only | Any strategy | All achieve similar performance (symbolic path dominates) |
Priority 1: HDC Relational Algebra for Analogy
bind(A, unbind(B, KB)) for proportional reasoning
Priority 2: Bayesian Abduction
Priority 3: Statistical Induction
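The bind(A, unbind(B, KB)) pattern listed under Priority 1 can be illustrated with the well-known "dollar of Mexico" analogy from the VSA literature. This sketch assumes dense binary vectors with XOR binding and majority-vote bundling; the records, item names, and helper functions are hypothetical and do not reflect how AGISystem2 would implement the operator.

```javascript
// Hypothetical sketch of analogy by bind/unbind; all names are illustrative.
const DIM = 2048;
const randomVector = () =>
  Uint8Array.from({ length: DIM }, () => (Math.random() < 0.5 ? 0 : 1));

// XOR binding is its own inverse, so unbind is the same operation.
const bind = (a, b) => a.map((bit, i) => bit ^ b[i]);
const unbind = bind;

// Majority-vote bundle (odd number of inputs avoids ties).
function bundle(vectors) {
  const out = new Uint8Array(DIM);
  for (let i = 0; i < DIM; i++) {
    let ones = 0;
    for (const v of vectors) ones += v[i];
    out[i] = ones * 2 > vectors.length ? 1 : 0;
  }
  return out;
}

function similarity(a, b) {
  let same = 0;
  for (let i = 0; i < DIM; i++) if (a[i] === b[i]) same++;
  return same / DIM;
}

// Item memory used for cleanup.
const names = ["NAME", "CAPITAL", "CURRENCY", "usa", "mexico",
  "washington", "mexicoCity", "dollar", "peso"];
const item = Object.fromEntries(names.map((n) => [n, randomVector()]));

// Each record bundles three role-filler bindings.
const usaRecord = bundle([
  bind(item.NAME, item.usa),
  bind(item.CAPITAL, item.washington),
  bind(item.CURRENCY, item.dollar),
]);
const mexicoRecord = bundle([
  bind(item.NAME, item.mexico),
  bind(item.CAPITAL, item.mexicoCity),
  bind(item.CURRENCY, item.peso),
]);

// bind(A, unbind(B, KB)) with A = mexicoRecord, B = dollar, KB = usaRecord:
// unbinding dollar from the USA record yields a noisy CURRENCY role,
// and binding that role with the Mexico record yields a noisy peso.
const noisyAnswer = bind(mexicoRecord, unbind(item.dollar, usaRecord));

// Cleanup: pick the most similar known item.
function cleanup(v) {
  let best = { name: null, sim: -1 };
  for (const n of names) {
    const sim = similarity(v, item[n]);
    if (sim > best.sim) best = { name: n, sim };
  }
  return best;
}

const answer = cleanup(noisyAnswer);
console.log(answer.name, answer.sim.toFixed(2));
```

The raw result is only weakly similar to the right item (roughly 0.6 against a 0.5 noise floor in this setup), so the cleanup step against an item memory is what turns the algebra into a usable answer.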
To reproduce these experiments:
# Run Core Theory evaluation (364 tests, 27 suites; default: 8 configs = 4 strategies × 2 priorities)
npm run eval # Default geometries
# Expected: 100% success, ~300-850ms depending on config
# Run Core Theory with ALL configurations (current full matrix: 16 configs incl. EMA)
npm run eval -- --full # Includes dense(256,512), sparse(2,4), metric(16,32), metric-elastic(16,32)
# Expected: 100% success on the benchmark subset; measure EMA on your machine
# Run stress testing (theory loading validation)
node evals/runStressCheck.js # Default: 12 configs (dense/sparse/metric × 2 geometries × 2 priorities)
node evals/runStressCheck.js --fast # Single config only
# Run cross-domain query evaluation (12 queries, 12 configs)
node evals/runQueryEval.mjs # Quiet mode
node evals/runQueryEval.mjs --verbose # Show per-query progress
# Expected: Low success (missing operators in KB), but speed results valid
# Run all evaluations sequentially
node evals/runAll.js # Core Theory + Cross-Domain
node evals/runAll.js --fast --verbose # Fast mode with details
# Test specific HDC strategy
SYS2_HDC_STRATEGY=metric-affine npm run eval
SYS2_HDC_STRATEGY=sparse-polynomial node evals/runQueryEval.mjs
# Test with specific geometry size
SYS2_GEOMETRY=16 SYS2_HDC_STRATEGY=metric-affine npm run eval
✓ What works (Core Theory - 100%): 364 tests across 27 suites: Foundations, hierarchies, deep transitive chains (6-10 steps), rule inference, negation, compound logic, temporal/causal reasoning, modal operators, composition, CSP solving, fuzzy matching, property inheritance, meta-operators (similar, analogy, deduce), macros, set theory, biological pathways, predicate logic, planning primitives, contradiction detection.
⚠ What needs work (Cross-Domain Queries): Stress theory files lack domain-specific operator definitions. The reasoning engine is capable, but the knowledge base is incomplete. This is a content issue, not an architecture limitation.
🚀 Performance validated: Metric-Affine HDC (32 bytes) is the fastest configuration at 318ms—2.6x faster than the slowest (sparse-4 holographic at 835ms). Memory savings: 8-16x vs Dense-Binary. The byte-channel approach is benchmark-validated in this evaluation.
🔬 Novel contributions: Two original HDC strategies validated. See HRR Comparison for theoretical analysis.
Next phase: Complete operator ecosystem (estimated 20-40 hours) to unlock full reasoning capabilities on cross-domain queries.