Dense-Binary HDC represents concepts as fixed-length binary vectors where approximately half the bits are 1 and half are 0. This is the classic approach to Hyperdimensional Computing, providing robust distributed representations.

1. The Core Idea: Binary Patterns as Representations

Fundamental Insight

In Dense-Binary HDC, a concept is a long binary string. The key insight is that in high-dimensional spaces (thousands of bits), randomly generated binary vectors are almost orthogonal: any two agree in approximately 50% of their bit positions, which is the mathematical "neutral" point.

Definition: Binary Hypervector

A binary hypervector V is a fixed-length sequence of bits:

V = [b_0, b_1, ..., b_(n-1)] where b_i ∈ {0, 1}

Typical dimensions: n = 2048 bits (256 bytes) or n = 32768 bits (4KB).

Why Binary?

Intuition: Binary vectors are the simplest possible representation. Each bit is a yes/no answer to an abstract question about the concept. With thousands of such questions, concepts become uniquely identifiable patterns.
[Figure: Binary hypervector structure (2048 bits), bit 0 through bit 2047. Storage: 64 × 32-bit words = 2048 bits = 256 bytes. Operations use native 32-bit bitwise instructions (fast!).]

2. The Quasi-Orthogonality Principle

The power of high-dimensional binary spaces comes from a remarkable statistical property.

Theorem: Quasi-Orthogonality

In a binary space of n dimensions, two randomly generated vectors agree in approximately n/2 bit positions. The number of matching bits follows a Binomial(n, 1/2) distribution, with mean n/2 and standard deviation √n/2. For n = 2048, the normalized similarity of two random vectors is 0.5 with a standard deviation of about 0.011.

Why This Matters

Because random vectors are "equidistant" from each other (all at ~50% similarity), the space can accommodate an essentially unlimited number of distinct concepts. Each new random vector is, with overwhelming probability, "far enough" from all existing vectors.

[Figure: Similarity distribution in high dimensions — frequency vs. similarity on a 0.0 to 1.0 axis, sharply peaked at the mean of 0.5, with "inverse" and "related" regions in the tails.]

3. Binding: BIND (XOR Operation)

BIND creates associations between concepts. In Dense-Binary, BIND is implemented as bitwise XOR.

Definition: BIND (XOR)

Given vectors A and B, their binding BIND(A, B) is computed bit-by-bit:

BIND(A, B)_i = A_i XOR B_i

XOR produces 1 when bits differ, 0 when they match.

[Figure: bit-by-bit XOR of A = 10110010... and B = 01100110... giving BIND(A, B) = 11010100... Properties: commutative (BIND(A, B) = BIND(B, A)); XOR cancellation (BIND(A, A) = 0); reversible (BIND(BIND(A, B), B) = A).]

Why XOR?

The XOR Cancellation Property

XOR is its own inverse: binding any vector with itself yields the all-zeros vector, BIND(A, A) = 0. This is what makes Dense-Binary binding reversible:

BIND(BIND(A, B), B) = A

Binding with B, then binding again with B, perfectly recovers A. This is the mathematical foundation for querying knowledge bases.
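These three properties can be verified directly. A sketch, again assuming hypervectors packed into Python integers so that `^` is bitwise XOR over all components:

```python
import random

N = 2048
rng = random.Random(1)
A = rng.getrandbits(N)
B = rng.getrandbits(N)

bound = A ^ B               # BIND(A, B) is bitwise XOR on the packed integers

assert bound == B ^ A       # commutative: BIND(A, B) = BIND(B, A)
assert A ^ A == 0           # cancellation: binding a vector with itself gives 0
assert bound ^ B == A       # reversible: unbinding B recovers A exactly
print("all BIND properties hold")
```

Note that recovery here is exact, not approximate; approximation only enters later, when bundling superimposes multiple vectors.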

4. Bundling: Majority Vote

Bundling combines multiple vectors into one by voting on each bit position.

Definition: Majority Vote Bundling

Given vectors V_1, V_2, ..., V_k, their bundle is computed bit-by-bit:

bundle(V_1, ..., V_k)_i = 1 if the majority of V_(j,i) are 1, else 0

[Figure: majority-vote bundling of A = 10110010, B = 01100110, C = 11101000 giving Result = 11100010. Bit-by-bit voting: bit 0: 1+0+1 = 2 → 1; bit 1: 0+1+1 = 2 → 1; bit 2: 1+1+1 = 3 → 1; bit 3: 1+0+0 = 1 → 0; bit 4: 0+0+1 = 1 → 0; ...]
Intuition: The bundle is a "consensus" vector. It preserves features that appear in most inputs while filtering out idiosyncratic ones. The result remains similar to ALL inputs, which is what enables content-addressable memory.
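A direct implementation of the definition, reproducing the 8-bit toy example above (a sketch; vectors are Python integers with bit 0 stored as the leftmost digit of the display strings, so the strings are reversed before parsing):

```python
def bundle(vectors, n):
    """Bit-wise strict-majority vote over an odd number of n-bit integer hypervectors."""
    out = 0
    k = len(vectors)
    for i in range(n):
        ones = sum((v >> i) & 1 for v in vectors)
        if ones * 2 > k:            # strict majority of 1s at this position
            out |= 1 << i
    return out

# The toy example: bit 0 is the leftmost digit shown, so reverse before parsing.
A = int("10110010"[::-1], 2)
B = int("01100110"[::-1], 2)
C = int("11101000"[::-1], 2)

result = bundle([A, B, C], 8)
print(format(result, "08b")[::-1])  # 11100010
```

With an even number of inputs a tie-break rule is needed at positions that split evenly; a common choice is a random (but fixed) tie-break vector.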

Bundle Capacity

As more vectors are bundled, similarity to each original decreases.

Vectors Bundled | Expected Similarity to Each | Quality
3    | ~0.67 | Excellent
10   | ~0.60 | Good
50   | ~0.54 | Usable
100  | ~0.52 | Marginal
200+ | ~0.51 | Near noise

5. Similarity: Hamming Distance

Similarity measures how many bits two vectors share.

Definition: Normalized Hamming Similarity

sim(A, B) = 1 - (popcount(A xor B) / n)

Where popcount counts the number of 1-bits (differing positions) and n is the dimension.

Similarity scale: 0.0 = inverse; 0.25 = very different; 0.5 = random/unrelated; 0.75 = related; 1.0 = identical.
Example: Computing Similarity
A = 10110010... (2048 bits)
B = 01100110... (2048 bits)

A xor B = 11010100...  (bits that differ)
popcount(A xor B) = 980  (number of differing bits)

similarity = 1 - (980 / 2048) = 1 - 0.478 = 0.522

Interpretation: Nearly random (close to 0.5)

6. Creating Concept Vectors

Each concept gets a deterministic binary vector generated from its name.

Algorithm: Deterministic Vector Creation
  1. Hash the concept name to get a seed: seed = DJB2("Dog")
  2. Initialize a PRNG with the seed
  3. Generate n random bits (50% ones, 50% zeros)


Why Deterministic?

The same name always produces the same vector. This enables reproducibility (vectors can be regenerated on demand rather than stored) and interoperability (independent encoders derive identical vectors for the same name), with no codebook to synchronize.

7. Encoding Relations

Structured knowledge is encoded by binding concepts with position vectors.

Relation Encoding Formula

fact(loves, John, Mary) = Loves BIND (Pos1 BIND John) BIND (Pos2 BIND Mary)

Position vectors (Pos1, Pos2, ...) distinguish argument positions.

[Figure: Loves BIND (Pos1 BIND John) BIND (Pos2 BIND Mary) = Fact Vector, a 2048-bit encoding of "loves(John, Mary)". Position vectors tag each argument with its slot: unbinding with Pos1 retrieves the first argument and unbinding with Pos2 retrieves the second.]
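A sketch of the encoding formula, with BIND as XOR on integer hypervectors. The `hv` helper and its seeding scheme are illustrative assumptions standing in for the deterministic generation of Section 6:

```python
import random

N = 2048

def hv(name: str, n: int = N) -> int:
    # deterministic stand-in vectors for this sketch; seeding scheme assumed
    return random.Random("hv:" + name).getrandbits(n)

Loves, John, Mary = hv("loves"), hv("John"), hv("Mary")
Pos1, Pos2 = hv("pos1"), hv("pos2")

# fact(loves, John, Mary) per the formula above
fact = Loves ^ (Pos1 ^ John) ^ (Pos2 ^ Mary)

# Unbinding the predicate plus one role-filler pair, then the remaining
# position vector, recovers the other argument exactly:
assert fact ^ Loves ^ (Pos2 ^ Mary) ^ Pos1 == John
assert fact ^ Loves ^ (Pos1 ^ John) ^ Pos2 == Mary
```

Because XOR binding is exact, each argument can be peeled out of the fact vector without any similarity search when the rest of the fact is known.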

8. Querying: The Unbind Operation

Since Dense-Binary implements BIND as XOR, queries can cancel known parts to find unknowns.

Query Principle

To find ?who in "loves(?who, Mary)", unbind the known parts:

candidate = KB BIND Loves BIND (Pos2 BIND Mary)

Then extract the answer by unbinding Pos1 and matching against vocabulary.

Example: Query Resolution
# KB contains: loves(John, Mary) encoded as fact_vector

# Query: Who loves Mary?  (loves(?who, Mary))
partial = Loves BIND (Pos2 BIND Mary)

# Unbind from KB (XOR is reversible!)
candidate = KB BIND partial
         = fact_vector BIND Loves BIND (Pos2 BIND Mary)
         = (Pos1 BIND John)

# Extract answer
raw_answer = candidate BIND Pos1 = John

# Match against vocabulary
similarity(raw_answer, "John") ≈ 1.0  → Answer: John

9. Mathematical Properties

Key Properties of Dense-Binary HDC