AGISystem2 is a platform for exploring multiple strategies for representing and using knowledge extracted from natural language. The practical goal is to formalize scientific, technical, and creative “theories” into a small DSL so they can be used: queried, tested, compared, composed, audited, and revised.
We treat “meaning” as formal pragmatics—a family of engineered, context-sensitive interpretation modes—rather than assuming a single, final, universal semantics. The same text can support different tasks (explain vs. predict vs. verify vs. design), and the system is built to make those task contracts explicit.
AGISystem2 uses VSA/HDC ideas as one major axis, but the project is explicitly a strategy lab: different representation strategies (including lossless and hybrid approaches) can be swapped and evaluated under the same DSL and test suite.
Why we treat meaning as engineered “use contracts”, not a single final semantics.
Translate natural language into DSL so theories become executable and testable.
How our strategies relate (or don’t) to classic HRR/VSA formulations.
Background on “distributed” representations and similarity-first retrieval.
How names become vectors (hashing/PRNG), plus reproducibility and privacy notes.
Patterns for verification, explainability, compliance, and agent planning.
Honest limitations and why we take a pragmatic, engineering-first stance.
Turn informal statements into compact DSL artifacts: relations, rules, constraints, and reusable “theory modules”.
Support pragmatic tasks: ask queries, run proof attempts, detect contradictions, track assumptions, and generate structured explanations.
Run the same workloads across multiple backends (vector‑symbolic, hybrid, exact) and measure quality, stability, and cost.
Prefer deterministic behavior, reproducibility, and inspectable intermediate states over “mysterious” end‑to‑end predictions.
Unlike probabilistic neural networks, HDC operations are fully deterministic. The same input always produces the same output, enabling perfect reproducibility and debugging.
Complex structures can be built from simple parts using only two operations (Bind and Bundle). This enables systematic construction and deconstruction of knowledge.
High-dimensional representations are naturally robust to noise and errors. Small perturbations don't significantly affect similarity comparisons.
Core operations are extremely fast on modern hardware. The mathematical structure allows for hardware-friendly implementations.
The fundamental data structure is a hypervector - a high-dimensional representation where each concept occupies a unique region in the vector space.
| Operation | Purpose | Mathematical Property |
|---|---|---|
| BIND | Create associations between concepts | Supports XOR-style cancellation where available. Other strategies use a distinct UNBIND and may require decoding/cleanup. |
| UNBIND | Remove a known component from a composite | Acts as an inverse of BIND (exact or approximate, depending on the strategy); some strategies implement UNBIND ≡ BIND (XOR-based), but the contract does not require this |
| BUNDLE | Combine multiple vectors into one | Result is similar to all inputs |
| SIMILARITY | Measure relatedness | Range [0, 1], 0.5 = unrelated |
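The contract in the table can be illustrated with a minimal dense-binary sketch. Everything here is an assumption made for illustration rather than AGISystem2's actual API: the function names `atom`, `bind`, `bundle`, and `similarity`, the 4096-bit dimension, and the SHA-256-seeded generator used to turn names into vectors.

```python
import hashlib

import numpy as np

DIM = 4096  # assumed dimension for this sketch


def atom(name: str, dim: int = DIM) -> np.ndarray:
    """Derive a reproducible random bit vector from a name (assumed scheme)."""
    seed = int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest()[:8], "big")
    return np.random.default_rng(seed).integers(0, 2, size=dim, dtype=np.uint8)


def bind(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """XOR binding: the result looks unrelated to either input."""
    return np.bitwise_xor(a, b)


def bundle(vectors) -> np.ndarray:
    """Bitwise majority vote; this sketch requires an odd count to avoid ties."""
    if len(vectors) % 2 == 0:
        raise ValueError("pass an odd number of vectors")
    counts = np.stack(vectors).astype(np.int64).sum(axis=0)
    return (2 * counts > len(vectors)).astype(np.uint8)


def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Fraction of matching bits: 1.0 for identical, about 0.5 for unrelated."""
    return float(np.mean(a == b))


cat, dog, car = atom("Cat"), atom("Dog"), atom("Car")
pets = bundle([cat, dog, car])

print(similarity(cat, cat))             # identical vectors
print(similarity(cat, dog))             # unrelated atoms, close to 0.5
print(similarity(pets, cat))            # bundle stays similar to each input
print(similarity(bind(cat, dog), cat))  # bound pair looks unrelated to its parts
```

Because the vectors are derived from a hash of the name, repeated runs produce the same bits, which is the reproducibility property described above.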
Because BIND is commutative, we need a mechanism to distinguish argument positions. Position vectors (Pos1, Pos2, ..., Pos20) solve this:
// Without positions: loves(John, Mary) = loves(Mary, John) (WRONG!)
// With positions:
fact = Loves BIND ( (Pos1 BIND John) BUNDLE (Pos2 BIND Mary) )
fact = Loves BIND ( (Pos1 BIND Mary) BUNDLE (Pos2 BIND John) ) // DIFFERENT!
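The effect of position vectors can be checked with a small dense-binary sketch. The helper names, the 4096-bit dimension, the hash-seeded atoms, and the fixed tie-breaker vector used for even-sized bundles are all assumptions of this example, not the project's implementation.

```python
import hashlib

import numpy as np

DIM = 4096  # assumed dimension


def atom(name: str) -> np.ndarray:
    """Reproducible random bit vector for a name (assumed scheme)."""
    seed = int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest()[:8], "big")
    return np.random.default_rng(seed).integers(0, 2, size=DIM, dtype=np.uint8)


def bind(*vectors: np.ndarray) -> np.ndarray:
    """XOR all arguments together."""
    return np.bitwise_xor.reduce(np.stack(vectors), axis=0)


TIE_BREAKER = atom("__tie_breaker__")  # hypothetical: resolves ties in even bundles


def bundle(vectors) -> np.ndarray:
    """Majority vote; an even count gets the tie-breaker as an extra voter."""
    voters = list(vectors)
    if len(voters) % 2 == 0:
        voters.append(TIE_BREAKER)
    counts = np.stack(voters).astype(np.int64).sum(axis=0)
    return (2 * counts > len(voters)).astype(np.uint8)


def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean(a == b))


loves, john, mary = atom("Loves"), atom("John"), atom("Mary")
pos1, pos2 = atom("Pos1"), atom("Pos2")

# Without positions the two readings collapse into the same vector.
flat_jm = bind(loves, bundle([john, mary]))
flat_mj = bind(loves, bundle([mary, john]))

# With positions each argument is tagged with its slot before bundling.
fact_jm = bind(loves, bundle([bind(pos1, john), bind(pos2, mary)]))
fact_mj = bind(loves, bundle([bind(pos1, mary), bind(pos2, john)]))


def first_argument(fact: np.ndarray) -> str:
    """Strip the relation and Pos1, then pick the closest known name."""
    slot = bind(fact, loves, pos1)
    return max(("John", "Mary"), key=lambda name: similarity(slot, atom(name)))


print(np.array_equal(flat_jm, flat_mj))  # True: argument order is lost
print(np.array_equal(fact_jm, fact_mj))  # False: argument order is kept
print(first_argument(fact_jm), first_argument(fact_mj))
```

The two positional facts still share many bits, so the reliable way to tell them apart is to decode a slot, as `first_argument` does, rather than to compare the facts directly.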
All queries reduce to this fundamental principle:
Answer = UNBIND(KB, QueryKey)
QueryKey is the part of the query you already know (relation + bound arguments). UNBIND removes that key from the KB composite to reveal the unknown parts. In XOR-based strategies, UNBIND is often implemented by calling BIND again. In other strategies (e.g. lossless EXACT), UNBIND yields a residual that must be projected back to entity candidates via strategy-aware decoding/cleanup.
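The XOR-based case can be sketched end to end as follows. To let a single key cover the relation plus the known argument, this example assumes a fully bound fact encoding (relation, Pos1-argument, and Pos2-argument all XORed together), which differs from the bundled-argument form shown earlier. The helper names, the dimension, and the three-fact knowledge base are likewise hypothetical.

```python
import hashlib

import numpy as np

DIM = 4096  # assumed dimension


def atom(name: str) -> np.ndarray:
    """Reproducible random bit vector for a name (assumed scheme)."""
    seed = int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest()[:8], "big")
    return np.random.default_rng(seed).integers(0, 2, size=DIM, dtype=np.uint8)


def bind(*vectors: np.ndarray) -> np.ndarray:
    return np.bitwise_xor.reduce(np.stack(vectors), axis=0)


def bundle(vectors) -> np.ndarray:
    """Majority vote over an odd number of vectors."""
    counts = np.stack(vectors).astype(np.int64).sum(axis=0)
    return (2 * counts > len(vectors)).astype(np.uint8)


def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean(a == b))


def encode(relation: str, arg1: str, arg2: str) -> np.ndarray:
    """Assumed fully bound encoding of a binary fact."""
    return bind(atom(relation), atom("Pos1"), atom(arg1), atom("Pos2"), atom(arg2))


kb = bundle([
    encode("Loves", "John", "Mary"),
    encode("Loves", "Alice", "Bob"),
    encode("Parent", "John", "Carol"),
])

ENTITIES = ["John", "Mary", "Alice", "Bob", "Carol"]


def query_second_arg(relation: str, arg1: str):
    """Answer relation(arg1, ?x) by unbinding the known part from the KB."""
    query_key = bind(atom(relation), atom("Pos1"), atom(arg1))
    residual = bind(kb, query_key)          # XOR-based UNBIND is BIND again
    guess = bind(residual, atom("Pos2"))    # strip the slot marker
    scores = {name: similarity(guess, atom(name)) for name in ENTITIES}
    best = max(scores, key=scores.get)      # cleanup: nearest known entity
    return best, scores[best]


print(query_second_arg("Loves", "John"))
print(query_second_arg("Parent", "John"))
```

The other facts in the bundle contribute noise to the residual, which is why the final step compares against a codebook of known entities instead of expecting an exact match.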
A proposed engine primitive is to turn one-shot UNBIND into a bounded closure loop: STAR(query) repeatedly applies a strategy-provided forward step (like “UNBIND + decode”) to chain inferences; UNSTAR(goal) applies reverse steps for abduction/explanations. This is research-level and not implemented yet. Read the STAR/UNSTAR overview → and DS40 (engine integration proposal).
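Since STAR is described as a proposal, the following is only a generic illustration of what a bounded closure loop looks like, not the planned engine primitive. The function name `star`, the `step` callback, the `max_depth` parameter, and the toy rule table are hypothetical; a real forward step would be the strategy-provided "UNBIND + decode" operation.

```python
from typing import Callable, Hashable, Iterable, Set


def star(
    seeds: Iterable[Hashable],
    step: Callable[[Hashable], Iterable[Hashable]],
    max_depth: int = 4,
) -> Set[Hashable]:
    """Apply `step` repeatedly to newly derived items, at most `max_depth` rounds."""
    known = set(seeds)
    frontier = set(known)
    for _ in range(max_depth):
        derived = set()
        for item in frontier:
            derived.update(step(item))
        frontier = derived - known      # keep only items not seen before
        if not frontier:                # fixed point reached early
            break
        known |= frontier
    return known


# Toy forward step: a lookup table standing in for "UNBIND + decode".
IS_A = {"Dog": ["Mammal"], "Mammal": ["Animal"], "Animal": ["LivingThing"]}


def forward(item: str):
    return IS_A.get(item, [])


print(sorted(star(["Dog"], forward, max_depth=2)))   # stops after two hops
print(sorted(star(["Dog"], forward, max_depth=10)))  # runs to the fixed point
```

The depth bound is what keeps the loop from running away when a step keeps producing noisy candidates; the early exit covers the case where nothing new is derived.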
AGISystem2 implements multiple HDC strategies, each with different internal representations while maintaining the same mathematical contract:
Classic HDC with fixed-length binary vectors. Uses XOR for binding and majority vote for bundling.
Set-based HDC with k integer exponents. Uses Cartesian XOR for binding and Jaccard index for similarity.
Fuzzy-Boolean Hyper-Lattice: compact D-byte vectors over Z₂₅₆. Uses XOR binding with L₁ similarity and arithmetic-mean bundling. Geometry = D (bytes per vector, default 32).
Metric-Affine extension with chunked bundling (bounded depth) for large KB superpositions. Geometry remains a manual knob (D bytes; no auto-growth in the current runtime).
Lossless bitset-polynomial HDC over BigInt monomials. Session-local atom IDs (appearance index dictionary). UNBIND is not required to equal BIND.
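One way to picture "different internal representations, same contract" is an interface that every strategy satisfies. The sketch below is a guess at such a shape, not the project's code: the `Strategy` protocol, the `ByteVectorSketch` class, and the similarity normalization (1 minus mean absolute difference over 255) are assumptions. The class is only loosely modeled on the D-byte description above (XOR binding, arithmetic-mean bundling, L₁ similarity).

```python
import hashlib
from typing import Protocol, Sequence

import numpy as np


class Strategy(Protocol):
    """Hypothetical shape of the shared contract."""

    def atom(self, name: str) -> np.ndarray: ...
    def bind(self, a: np.ndarray, b: np.ndarray) -> np.ndarray: ...
    def unbind(self, composite: np.ndarray, key: np.ndarray) -> np.ndarray: ...
    def bundle(self, vectors: Sequence[np.ndarray]) -> np.ndarray: ...
    def similarity(self, a: np.ndarray, b: np.ndarray) -> float: ...


class ByteVectorSketch:
    """D-byte vectors: XOR bind, rounded mean bundle, L1-based similarity."""

    def __init__(self, d: int = 32) -> None:
        self.d = d

    def atom(self, name: str) -> np.ndarray:
        digest = hashlib.shake_256(name.encode("utf-8")).digest(self.d)
        return np.frombuffer(digest, dtype=np.uint8).copy()

    def bind(self, a: np.ndarray, b: np.ndarray) -> np.ndarray:
        return np.bitwise_xor(a, b)

    def unbind(self, composite: np.ndarray, key: np.ndarray) -> np.ndarray:
        return np.bitwise_xor(composite, key)  # XOR is its own inverse

    def bundle(self, vectors: Sequence[np.ndarray]) -> np.ndarray:
        mean = np.stack(vectors).astype(np.float64).mean(axis=0)
        return np.rint(mean).astype(np.uint8)

    def similarity(self, a: np.ndarray, b: np.ndarray) -> float:
        l1 = np.abs(a.astype(np.int16) - b.astype(np.int16)).mean()
        return 1.0 - float(l1) / 255.0  # assumed normalization to [0, 1]


def roundtrip_is_exact(strategy: Strategy, left: str, right: str) -> bool:
    """Check that unbinding the key from a bound pair returns the other part."""
    a, b = strategy.atom(left), strategy.atom(right)
    recovered = strategy.unbind(strategy.bind(a, b), b)
    return strategy.similarity(recovered, a) == 1.0


s = ByteVectorSketch()
print(roundtrip_is_exact(s, "John", "Mary"))
```

Code written against `Strategy` (such as `roundtrip_is_exact`) does not need to know which backend it is exercising, which is what allows the same workload to be run across strategies.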
AGISystem2 implements five HDC strategies. One (Dense-Binary) is a standard baseline, two (Sparse Polynomial and Metric-Affine) are original contributions not found in existing literature, one (EMA) is an elastic extension of Metric-Affine, and one (EXACT) is a lossless, session-local “bitset polynomial” exploration. How do they relate to Tony Plate's Holographic Reduced Representations (HRR)?
Strategy pages: Dense-Binary, SPHDC, Metric-Affine, EMA, EXACT.
Detailed analysis of Dense-Binary (standard), Sparse Polynomial (novel), and Metric-Affine (novel) against classic HRR, with notes on the EMA extension.
STANDARD: A classic VSA/HDC baseline: fixed-length binary vectors, XOR binding, majority-vote bundling. Useful as a reference point when comparing to HRR-style binding and to our novel strategies.
ORIGINAL: Set-based HDC with Cartesian XOR binding and Min-Hash sparsification. NOT HRR - a novel paradigm.
ORIGINAL: Fuzzy-Boolean hybrid combining XOR binding with continuous bundling. HRR-inspired but novel.
EXTENSION: Chunked bundling (bounded depth) for stable superposition at scale; geometry is configurable but not auto-grown in the current runtime.
EXPLORATION: Lossless bitset-polynomial strategy (no PRNG/hashing for atom IDs inside a session). Uses a quotient-like UNBIND instead of bind reuse.
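To make the set-based idea more concrete, here is a toy version of Cartesian XOR binding with Jaccard similarity. It is an assumption-heavy sketch: the value of `K`, the hash-derived 64-bit integers, and the bottom-k selection used as a stand-in for Min-Hash sparsification are choices made for this example and may not match the actual Sparse Polynomial strategy.

```python
import hashlib

K = 4  # assumed number of integers per atom


def atom(name: str, k: int = K) -> frozenset:
    """k pseudo-random 64-bit integers derived from the name (assumed scheme)."""
    return frozenset(
        int.from_bytes(hashlib.sha256(f"{name}:{i}".encode("utf-8")).digest()[:8], "big")
        for i in range(k)
    )


def cartesian_xor(a, b) -> frozenset:
    """XOR every element of a with every element of b."""
    return frozenset(x ^ y for x in a for y in b)


def sparsify(values, k: int = K) -> frozenset:
    """Keep the k smallest values: a simple bottom-k reduction (assumed)."""
    return frozenset(sorted(values)[:k])


def bind(a, b, k: int = K) -> frozenset:
    return sparsify(cartesian_xor(a, b), k)


def jaccard(a, b) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


john, mary = atom("John"), atom("Mary")

print(bind(john, mary) == bind(mary, john))  # binding is commutative
print(jaccard(john, john))                   # identical sets
print(jaccard(john, mary))                   # disjoint sets

# Without sparsification, XORing with the key again yields a superset of the
# original atom, so recovery requires a cleanup step rather than equality.
roundtrip = cartesian_xor(cartesian_xor(john, mary), mary)
print(john <= roundtrip, jaccard(john, roundtrip))
```

Note that unrelated atoms score 0 under Jaccard rather than 0.5, so similarity thresholds are strategy-specific even when the operations share a contract.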
HDC's deterministic, explainable nature makes it ideal for building trustworthy AI systems—AI that can be verified, audited, and understood.
How HDC enables verifiable, explainable, auditable AI. Common patterns and approaches.
Research pattern: formal tool semantics and plan validation (external planner/runtime required).
Research pattern: compliance checks with proof traces (audit logging/export is external).
Open problems: formal verification, privacy-preserving reasoning, LLM+HDC hybrids.
Regardless of which strategy is used, the following properties are guaranteed:
AGISystem2's approach to knowledge representation has deep roots in holographic computing - a paradigm where information is distributed across the entire representation rather than localized in specific positions. Read the full documentation →
The concept of holographic representations in computing traces back to several foundational works:
When we bind two concepts A and B, the result contains "holographic traces" of both:
// A BIND B is "holographically" related to both A and B
// For XOR-based bindings, unbinding can be done by binding again:
(A BIND B) BIND B = A // B is the "key" to recover A
(A BIND B) BIND A = B // A is the "key" to recover B
This is analogous to optical holography where the reference beam is needed to reconstruct the image.
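The "bind again to unbind" identity is specific to XOR. The sketch below contrasts it with byte-wise modular addition, which is used here purely as a hypothetical example of a binding that needs a distinct UNBIND; it is not one of the AGISystem2 strategies, and all function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=32, dtype=np.uint8)
b = rng.integers(0, 256, size=32, dtype=np.uint8)


def xor_bind(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Self-inverse binding: applying the same key twice cancels it."""
    return np.bitwise_xor(x, y)


def add_bind(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Hypothetical binding: byte-wise addition, wrapping modulo 256."""
    return x + y


def add_unbind(composite: np.ndarray, key: np.ndarray) -> np.ndarray:
    """Inverse of add_bind: byte-wise subtraction, wrapping modulo 256."""
    return composite - key


# XOR: either part acts as the key that recovers the other.
print(np.array_equal(xor_bind(xor_bind(a, b), b), a))
print(np.array_equal(xor_bind(xor_bind(a, b), a), b))

# Addition: binding again does not undo the binding; a separate UNBIND does.
print(np.array_equal(add_bind(add_bind(a, b), b), a))
print(np.array_equal(add_unbind(add_bind(a, b), b), a))
```

This is why the operation contract lists UNBIND separately from BIND: a caller that always reuses BIND would only be correct for self-inverse strategies.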
AGISystem2's DSL (Domain Specific Language) provides high-level primitives for holographic computation:
| DSL Primitive | Holographic Operation | Purpose |
|---|---|---|
| `relation arg1 arg2` | Encode structured fact | Create holographic record |
| `query relation ?var` | Holographic unbinding | Content-addressable retrieval |
| `prove goal` | Recursive unbinding + matching | Logical inference via retrieval |
| `bundle [facts]` | Superposition of records | Build knowledge base |
The DSL abstracts away the underlying vector operations, allowing programmers to think in terms of relations, rules, and queries while the system performs holographic computation underneath.
The holographic representation enables interesting possibilities for search and backtracking:
There is an intriguing parallel between holographic representations and homomorphic encryption:
This suggests potential applications in:
While not equivalent to formal homomorphic encryption, the holographic paradigm offers similar "compute without decode" properties in a probabilistic, similarity-based setting. Read the full analysis of HDC privacy properties →