AGISystem2 is a platform for exploring multiple strategies for representing and using knowledge extracted from natural language. The practical goal is to formalize scientific, technical, and creative “theories” into a small DSL so they can be queried, tested, compared, composed, audited, and revised.

We treat “meaning” as formal pragmatics—a family of engineered, context-sensitive interpretation modes—rather than assuming a single, final, universal semantics. The same text can support different tasks (explain vs. predict vs. verify vs. design), and the system is built to make those task contracts explicit.

Terminology:

AGISystem2 uses VSA/HDC ideas as one major axis, but the project is explicitly a strategy lab: different representation strategies (including lossless and hybrid approaches) can be swapped and evaluated under the same DSL and test suite.

Quick Links

Formal Pragmatics

Why we treat meaning as engineered “use contracts”, not a single final semantics.

NL→DSL

Translate natural language into DSL so theories become executable and testable.

Strategies

Same DSL, different representations:

HRR Comparison

How our strategies relate (or don’t) to classic HRR/VSA formulations.

Holographic Notes

Background on “distributed” representations and similarity-first retrieval.

Deterministic Vectors

How names become vectors (hashing/PRNG), plus reproducibility and privacy notes.

Trustworthy AI

Patterns for verification, explainability, compliance, and agent planning.

Grounding Problem

Honest limitations and why we take a pragmatic, engineering-first stance.

What This Project Tries To Do

Formalize Theories

Turn informal statements into compact DSL artifacts: relations, rules, constraints, and reusable “theory modules”.

Use Theories

Support pragmatic tasks: ask queries, run proof attempts, detect contradictions, track assumptions, and generate structured explanations.

Compare Strategies

Run the same workloads across multiple backends (vector‑symbolic, hybrid, exact) and measure quality, stability, and cost.

Engineer Trust

Prefer deterministic behavior, reproducibility, and inspectable intermediate states over “mysterious” end‑to‑end predictions.

Why Vector-Symbolic Methods?

Determinism

Unlike probabilistic neural networks, HDC is fully deterministic: the same input always produces the same output, enabling perfect reproducibility and debugging.

Compositionality

Complex structures can be built from simple parts using only two operations (Bind and Bundle). This enables systematic construction and deconstruction of knowledge.

Noise Tolerance

High-dimensional representations are naturally robust to noise and errors. Small perturbations don't significantly affect similarity comparisons.

Efficiency

Core operations are extremely fast on modern hardware. The mathematical structure allows for hardware-friendly implementations.

Core Concepts

The Hypervector

The fundamental data structure is a hypervector - a high-dimensional representation where each concept occupies a unique region in the vector space.

Key Insight: In high-dimensional spaces, randomly generated vectors are almost orthogonal to each other. This "quasi-orthogonality" property means any new concept gets a representation that doesn't interfere with existing ones.
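The quasi-orthogonality claim is easy to check empirically. A minimal Python sketch (not AGISystem2 code; the dimension and the matching-bits similarity measure are illustrative assumptions):

```python
# Demo: random high-dimensional binary vectors are nearly "unrelated"
# (similarity ~0.5), so a new concept barely interferes with existing ones.
import random

DIM = 10_000
rng = random.Random(42)  # fixed seed: fully deterministic

def random_hv():
    """A random binary hypervector (list of 0/1 ints)."""
    return [rng.randint(0, 1) for _ in range(DIM)]

def similarity(a, b):
    """Fraction of matching bits in [0, 1]; 0.5 = unrelated."""
    return sum(x == y for x, y in zip(a, b)) / DIM

a, b = random_hv(), random_hv()
print(f"sim(a, a) = {similarity(a, a):.3f}")  # 1.000
print(f"sim(a, b) = {similarity(a, b):.3f}")  # close to 0.500
```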

The Core Operations

| Operation | Purpose | Mathematical Property |
| --- | --- | --- |
| BIND | Create associations between concepts | Supports XOR-style cancellation where available; other strategies use a distinct UNBIND and may require decoding/cleanup |
| UNBIND | Remove a known component from a composite | Inverse-like of BIND; some strategies implement UNBIND ≡ BIND (XOR-based), but this is not required by the contract |
| BUNDLE | Combine multiple vectors into one | Result is similar to all inputs |
| SIMILARITY | Measure relatedness | Range [0, 1]; 0.5 = unrelated |
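A sketch of how the four operations fit together in an XOR-based binary strategy (illustrative only; each AGISystem2 strategy realizes this contract with different internals):

```python
# Minimal XOR-based realization of the BIND/UNBIND/BUNDLE/SIMILARITY contract.
import random

DIM = 10_000
rng = random.Random(7)

def random_hv():
    return [rng.randint(0, 1) for _ in range(DIM)]

def bind(a, b):
    """XOR binding: the result is dissimilar to both inputs."""
    return [x ^ y for x, y in zip(a, b)]

unbind = bind  # XOR is self-inverse; other strategies need a true inverse

def bundle(vectors):
    """Per-bit majority vote: the result stays similar to every input."""
    threshold = len(vectors) / 2
    return [1 if sum(bits) > threshold else 0 for bits in zip(*vectors)]

def similarity(a, b):
    """Fraction of matching bits in [0, 1]; 0.5 = unrelated."""
    return sum(x == y for x, y in zip(a, b)) / DIM

a, b, c = random_hv(), random_hv(), random_hv()
assert similarity(unbind(bind(a, b), b), a) == 1.0  # exact recovery
s = bundle([a, b, c])
print(f"sim(bundle, member) = {similarity(s, a):.2f}")        # ~0.75
print(f"sim(bundle, stranger) = {similarity(s, random_hv()):.2f}")  # ~0.5
```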

Position Vectors

Because BIND is commutative, we need a mechanism to distinguish argument positions. Position vectors (Pos1, Pos2, ..., Pos20) solve this:

// Without positions: loves(John, Mary) = loves(Mary, John) (WRONG!)
// With positions:
fact1 = Loves BIND ( (Pos1 BIND John) BUNDLE (Pos2 BIND Mary) )
fact2 = Loves BIND ( (Pos1 BIND Mary) BUNDLE (Pos2 BIND John) )  // DIFFERENT!
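The same encoding can be sketched in runnable Python, assuming XOR binding and majority-vote bundling as in the Dense-Binary strategy; all vector names are illustrative:

```python
# Position vectors make argument order matter despite commutative BIND.
import random

DIM = 10_000
rng = random.Random(3)

def random_hv():
    return [rng.randint(0, 1) for _ in range(DIM)]

def bind(a, b):
    return [x ^ y for x, y in zip(a, b)]

def bundle(vectors):
    # Majority vote; ties (possible with an even count) break to 0 here.
    threshold = len(vectors) / 2
    return [1 if sum(bits) > threshold else 0 for bits in zip(*vectors)]

def similarity(a, b):
    return sum(x == y for x, y in zip(a, b)) / DIM

Loves, John, Mary, Pos1, Pos2 = (random_hv() for _ in range(5))

fact1 = bind(Loves, bundle([bind(Pos1, John), bind(Pos2, Mary)]))
fact2 = bind(Loves, bundle([bind(Pos1, Mary), bind(Pos2, John)]))

# Swapping the arguments now produces a measurably different vector:
print(f"{similarity(fact1, fact2):.2f}")  # well below 1.0
```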

The Reasoning Equation

All queries reduce to this fundamental principle:

Answer ≈ UNBIND(Knowledge, QueryKey)

QueryKey is the part of the query you already know (relation + bound arguments). UNBIND removes that key from the KB composite to reveal the unknown parts. In XOR-based strategies, UNBIND is often implemented by calling BIND again. In other strategies (e.g. lossless EXACT), UNBIND yields a residual that must be projected back to entity candidates via strategy-aware decoding/cleanup.
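A toy end-to-end sketch of the reasoning equation for an XOR-based strategy with similarity cleanup. All names are illustrative, and the key is simplified to relation + queried position rather than a full QueryKey:

```python
# Answer ≈ UNBIND(Knowledge, QueryKey), then cleanup against a codebook.
import random

DIM = 10_000
rng = random.Random(1)

def random_hv():
    return [rng.randint(0, 1) for _ in range(DIM)]

def bind(a, b):
    return [x ^ y for x, y in zip(a, b)]

unbind = bind  # XOR-based: UNBIND is just BIND again

def bundle(vectors):
    threshold = len(vectors) / 2
    return [1 if sum(bits) > threshold else 0 for bits in zip(*vectors)]

def similarity(a, b):
    return sum(x == y for x, y in zip(a, b)) / DIM

Loves, John, Mary, Pos1, Pos2 = (random_hv() for _ in range(5))
codebook = {"John": John, "Mary": Mary}  # known entities for cleanup

# Knowledge: loves(John, Mary), encoded with position vectors
knowledge = bind(Loves, bundle([bind(Pos1, John), bind(Pos2, Mary)]))

# Query loves(John, ?x). Simplified key: relation + queried position
# (a full QueryKey would also fold in the bound argument John).
query_key = bind(Loves, Pos2)
residual = unbind(knowledge, query_key)

# Cleanup: project the noisy residual onto the known entities
answer = max(codebook, key=lambda n: similarity(residual, codebook[n]))
print(answer)  # Mary
```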

Research (VSA/HDC upgrade): STAR / UNSTAR closure.

A proposed engine primitive is to turn one-shot UNBIND into a bounded closure loop: STAR(query) repeatedly applies a strategy-provided forward step (like “UNBIND + decode”) to chain inferences; UNSTAR(goal) applies reverse steps for abduction/explanations. This is research-level and not implemented yet. Read the STAR/UNSTAR overview → and DS40 (engine integration proposal).

HDC Strategies

AGISystem2 implements multiple HDC strategies, each with different internal representations while maintaining the same mathematical contract:

Dense-Binary

Classic HDC with fixed-length binary vectors. Uses XOR for binding and majority vote for bundling.

Full Documentation →

Sparse Polynomial (SPHDC)

Set-based HDC with k integer exponents. Uses Cartesian XOR for binding and Jaccard index for similarity.

Full Documentation →
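A rough sketch of the set-based idea, with the sparsity parameter, exponent width, and the omitted Min-Hash sparsification all being assumptions; see the full documentation for the real design:

```python
# SPHDC-style sketch: vectors are sets of integer exponents;
# binding XORs every pair (Cartesian XOR); similarity is Jaccard.
import random

K = 16     # exponents per vector (sparsity parameter, assumed)
BITS = 32  # exponent width (assumed)
rng = random.Random(11)

def random_sv():
    """A sparse vector: a set of K random integer exponents."""
    return frozenset(rng.getrandbits(BITS) for _ in range(K))

def bind(a, b):
    """Cartesian XOR: XOR every exponent pair. The real strategy then
    sparsifies the result (e.g. via Min-Hash), omitted here."""
    return frozenset(x ^ y for x in a for y in b)

def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| in [0, 1]."""
    return len(a & b) / len(a | b)

a, b = random_sv(), random_sv()
print(jaccard(a, a))    # 1.0
print(jaccard(a, b))    # ~0.0 for unrelated random sets
print(len(bind(a, b)))  # grows toward K*K before sparsification
```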

Metric-Affine

Compact D-byte vectors over Z₂₅₆. Uses XOR binding with L₁ similarity and arithmetic mean bundling. Geometry = D (bytes per vector, default 32). Also described as a Fuzzy-Boolean Hyper-Lattice.

Full Documentation →
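A rough sketch of the byte-vector idea; the similarity normalization below is an assumption for illustration, not the strategy's actual formula:

```python
# Metric-Affine-style sketch: D bytes over Z_256, XOR binding,
# mean bundling, L1-distance-based similarity.
import random

D = 32  # bytes per vector (the documented default geometry)
rng = random.Random(5)

def random_mv():
    return [rng.randrange(256) for _ in range(D)]

def bind(a, b):
    """XOR binding per byte (self-inverse)."""
    return [x ^ y for x, y in zip(a, b)]

def bundle(vectors):
    """Arithmetic mean per byte, rounded back into Z_256."""
    return [round(sum(col) / len(vectors)) for col in zip(*vectors)]

def similarity(a, b):
    """1 - normalized L1 distance (normalization assumed); 1.0 = identical."""
    l1 = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - l1 / (255 * D)

a, b = random_mv(), random_mv()
assert similarity(bind(bind(a, b), b), a) == 1.0  # XOR cancels exactly
m = bundle([a, b])
print(similarity(m, a) > similarity(m, random_mv()))  # mean stays close
```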

Metric-Affine Elastic (EMA)

Metric-Affine extension with chunked bundling (bounded depth) for large KB superpositions. Geometry remains a manual knob (D bytes; no auto-growth in the current runtime).

Full Documentation →

EXACT (Exact-Sparse)

Lossless bitset-polynomial HDC over BigInt monomials. Session-local atom IDs (appearance index dictionary). UNBIND is not required to equal BIND.

Full Documentation →

HRR Comparison: Original Contributions

AGISystem2 implements five HDC strategies: two are original contributions not found in existing literature, one (Dense-Binary) is a standard baseline, one is an elastic extension of Metric-Affine, and one (EXACT) is a lossless, session-local “bitset polynomial” exploration. How do they relate to Tony Plate's Holographic Reduced Representations (HRR)?

Strategy pages: Dense-Binary, SPHDC, Metric-Affine, EMA, EXACT.

HRR vs. Our Strategies

Detailed analysis of Dense-Binary (standard), Sparse Polynomial (novel), and Metric-Affine (novel) against classic HRR, with notes on the EMA extension.

Dense-Binary

STANDARD: A classic VSA/HDC baseline: fixed-length binary vectors, XOR binding, majority-vote bundling. Useful as a reference point when comparing to HRR-style binding and to our novel strategies.

Sparse Polynomial (SPHDC)

ORIGINAL: Set-based HDC with Cartesian XOR binding and Min-Hash sparsification. NOT HRR - a novel paradigm.

Metric-Affine

ORIGINAL: Fuzzy-Boolean hybrid combining XOR binding with continuous bundling. HRR-inspired but novel.

Metric-Affine Elastic (EMA)

EXTENSION: Chunked bundling (bounded depth) for stable superposition at scale; geometry is configurable but not auto-grown in the current runtime.

EXACT (Exact-Sparse)

EXPLORATION: Lossless bitset-polynomial strategy (no PRNG/hashing for atom IDs inside a session). Uses a quotient-like UNBIND instead of bind reuse.

Summary: Dense-Binary is standard VSA/HDC. Sparse Polynomial and Metric-Affine are original contributions developed for AGISystem2, Metric-Affine Elastic extends Metric-Affine for large KB superpositions, and EXACT explores a fully lossless session-local representation.

Trustworthy AI

HDC's deterministic, explainable nature makes it ideal for building trustworthy AI systems—AI that can be verified, audited, and understood.

Status: Trustworthy AI patterns are research-level and not shipped as runnable Core/config theory sets.

Trustworthy AI Overview

How HDC enables verifiable, explainable, auditable AI. Common patterns and approaches.

Agent Planning

Research pattern: formal tool semantics and plan validation (external planner/runtime required).

Compliance & Verification

Research pattern: compliance checks with proof traces (audit logging/export is external).

Research Directions

Open problems: formal verification, privacy-preserving reasoning, LLM+HDC hybrids.

Mathematical Guarantees

Regardless of which strategy is used, the following properties are guaranteed:

Holographic Representations

AGISystem2's approach to knowledge representation has deep roots in holographic computing - a paradigm where information is distributed across the entire representation rather than localized in specific positions. Read the full documentation →

Historical Context

The concept of holographic representations in computing traces back to several foundational works:

Key Insight: Just as in optical holography where each fragment contains information about the whole image, holographic computing distributes information across all dimensions of the vector. This provides natural noise tolerance and content-addressable memory.

The Holographic Property

When we bind two concepts A and B, the result contains "holographic traces" of both:

// A BIND B is "holographically" related to both A and B
// For XOR-based bindings, unbinding can be done by binding again:
(A BIND B) BIND B = A    // B is the "key" to recover A
(A BIND B) BIND A = B    // A is the "key" to recover B

This is analogous to optical holography where the reference beam is needed to reconstruct the image.

A Programming Language for Holographic Computation

AGISystem2's DSL (Domain Specific Language) provides high-level primitives for holographic computation:

| DSL Primitive | Holographic Operation | Purpose |
| --- | --- | --- |
| relation arg1 arg2 | Encode structured fact | Create holographic record |
| query relation ?var | Holographic unbinding | Content-addressable retrieval |
| prove goal | Recursive unbinding + matching | Logical inference via retrieval |
| bundle [facts] | Superposition of records | Build knowledge base |
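Read together, these primitives suggest programs like the following hypothetical snippet (primitive names are taken from the table above; the DSL's exact surface syntax may differ):

```
bundle [
  loves John Mary        // encode structured facts as holographic records
  loves Mary Peter
]
query loves John ?x      // content-addressable retrieval: who does John love?
prove loves Mary ?y      // logical inference via recursive unbinding
```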

The DSL abstracts away the underlying vector operations, allowing programmers to think in terms of relations, rules, and queries while the system performs holographic computation underneath.

Backtracking and Search

The holographic representation enables interesting possibilities for search and backtracking:

Connections to Homomorphic Encryption

There is an intriguing parallel between holographic representations and homomorphic encryption:

Conceptual Link: Just as homomorphic encryption allows computation on encrypted data without decryption, holographic binding allows composition of concepts without "unpacking" their internal structure. The vector for "loves(John, Mary)" is meaningful without explicitly storing "John" or "Mary" as separate retrievable entities.

This suggests potential applications in:

While not equivalent to formal homomorphic encryption, the holographic paradigm offers similar "compute without decode" properties in a probabilistic, similarity-based setting. Read the full analysis of HDC privacy properties →

Epistemological Foundations

The Symbol Grounding Problem — An honest assessment of what formal systems can and cannot achieve. Why we adopt a meta-rational approach: pragmatic utility over theoretical idealization. No magic, no singularity, just useful engineering.

Further Reading