Theory Guide: Hyperdimensional Computing & Vector Symbolic Architectures¶

Last Updated: November 5, 2025

Table of Contents¶

Introduction
Mathematical Foundations
VSA Models Implemented
Core Operations
Properties & Characteristics
Model Comparison
Capacity & Scalability
Applications
References

Introduction¶

What is Hyperdimensional Computing?¶

Hyperdimensional Computing (HDC) / Vector Symbolic Architectures (VSA) is a computing paradigm inspired by the brain’s use of high-dimensional distributed representations. The key idea is to represent concepts and data as large vectors (hypervectors) in very high-dimensional spaces (typically D = 1,000 to 10,000+ dimensions).

Why High Dimensions?¶

In high-dimensional spaces, several remarkable properties emerge:

Quasi-orthogonality: Random vectors are very likely to be nearly orthogonal
Robust representation: Information is distributed across many dimensions
Noise tolerance: Small perturbations don’t significantly affect similarity
Compositional structure: Complex structures can be encoded in fixed-size vectors

Historical Context¶

1990s: Tony Plate introduces Holographic Reduced Representations (HRR)
2009: Pentti Kanerva formalizes Hyperdimensional Computing framework
2020s: Multiple VSA variants developed for different application domains
2024: Modern implementations with neural network integration

Mathematical Foundations¶

High-Dimensional Spaces¶

A hypervector $\mathbf{h} \in \mathbb{V}^D$ lives in a $D$-dimensional vector space $\mathbb{V}$.

Common vector spaces:

Real: $\mathbb{V} = \mathbb{R}^D$ (continuous values)
Bipolar: $\mathbb{V} = \{-1, +1\}^D$ (binary with sign)
Binary: $\mathbb{V} = \{0, 1\}^D$ (unsigned binary)
Complex: $\mathbb{V} = \mathbb{C}^D$ (complex numbers)
Sparse: $\mathbb{V} = \{0, 1\}^D$ with $\|h\|_0 \ll D$ (few non-zeros)
Matrix: $\mathbb{V} = \mathbb{C}^{D \times m \times m}$ (matrices per dimension)

Quasi-Orthogonality¶

Definition: Random vectors $\mathbf{a}, \mathbf{b} \in \mathbb{V}^D$ sampled independently are approximately orthogonal with high probability.

Mathematical statement: $$\mathbb{P}(\cos(\mathbf{a}, \mathbf{b}) \approx 0) \to 1 \quad \text{as } D \to \infty$$

Intuition: In high dimensions, “most of the space” is far from any given vector.

Example (Real vectors):

For $D = 10,000$ and $\mathbf{a}, \mathbf{b} \sim \mathcal{N}(0, I)$:
$\mathbb{P}(85° < \angle(\mathbf{a}, \mathbf{b}) < 95°) > 0.99$

Similarity Measures¶

Cosine Similarity (Real/Bipolar): $$\delta(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\| \|\mathbf{b}\|} = \cos(\mathbf{a}, \mathbf{b})$$

Hamming Distance (Binary): $$d_H(\mathbf{a}, \mathbf{b}) = \sum_{i=1}^D \mathbb{1}[a_i \neq b_i]$$

Overlap Similarity (Sparse Binary): $$\delta(\mathbf{a}, \mathbf{b}) = \frac{|\{i : a_i = 1 \land b_i = 1\}|}{D \cdot p}$$

Complex Angle Distance (Complex): $$\delta(\mathbf{a}, \mathbf{b}) = \frac{1}{D} \sum_{i=1}^D \cos(\arg(a_i) - \arg(b_i))$$

VSA Models Implemented¶

1. MAP (Multiply-Add-Permute)¶

Space: Bipolar $\{-1, +1\}^D$ or Real $\mathbb{R}^D$

Operations:

Binding: $\mathbf{c} = \mathbf{a} \odot \mathbf{b}$ (element-wise multiplication)
Unbinding: $\mathbf{a} = \mathbf{c} \odot \mathbf{b}$ (self-inverse)
Bundling: $\mathbf{s} = \sum_i \mathbf{a}_i$ (sum + normalization)

Properties:

✅ Self-inverse
✅ Commutative
✅ Associative
✅ Exact inverse (bipolar)
⚠️ Approximate inverse (real)

Best for: Simple operations, hardware implementations, self-inverse requirements

Reference: Gayler (1998), Schlegel et al. (2022)

2. FHRR (Fourier Holographic Reduced Representations)¶

Space: Complex unit circle $\{e^{i\theta} : \theta \in [0, 2\pi)\}^D$

Operations:

Binding: $c_i = a_i \cdot b_i = e^{i(\theta_a + \theta_b)}$ (angle addition)
Unbinding: $a_i = c_i \cdot b_i^* = e^{i(\theta_c - \theta_b)}$ (angle subtraction)
Bundling: $s_i = \arg\left(\sum_j e^{i\theta_j}\right)$ (vector addition → angle)

Properties:

✅ Exact inverse
✅ Commutative
✅ Associative
✅ Best bundling capacity (~0.35D items)
✅ Element-wise operations (O(D))

Best for: High capacity requirements, exact recovery, kernel methods

Reference: Plate (2003), Schlegel et al. (2022)

3. HRR (Holographic Reduced Representations)¶

Space: Real $\mathbb{R}^D$

Operations:

Binding: $\mathbf{c} = \mathbf{a} \circledast \mathbf{b}$ (circular convolution) $$c_j = \sum_{k=0}^{D-1} a_k b_{\text{mod}(j-k, D)}$$
Unbinding: $\mathbf{a} \approx \mathbf{c} \circledast \tilde{\mathbf{b}}$ (circular correlation) $$a_j \approx \sum_{k=0}^{D-1} c_k b_{\text{mod}(k+j, D)}$$
Bundling: $\mathbf{s} = \sum_i \mathbf{a}_i / \|\sum_i \mathbf{a}_i\|$

Properties:

⚠️ Approximate inverse (~0.70 similarity after one unbinding)
✅ Commutative
✅ Associative
⚠️ Degrades with depth (~0.05 after 40 operations)
⚠️ O(D log D) via FFT

Best for: Standard VSA tasks, good balance of properties

Reference: Plate (1995), Schlegel et al. (2022)

4. BSC (Binary Spatter Codes)¶

Space: Binary $\{0, 1\}^D$

Operations:

Binding: $\mathbf{c} = \mathbf{a} \oplus \mathbf{b}$ (XOR)
Unbinding: $\mathbf{a} = \mathbf{c} \oplus \mathbf{b}$ (XOR is self-inverse)
Bundling: Majority vote

Properties:

✅ Self-inverse
✅ Exact inverse
✅ Commutative
✅ Associative
✅ Simple hardware (XOR gates)
⚠️ Lower capacity than FHRR

Best for: Hardware implementations, digital circuits, exact operations

Reference: Kanerva (1993), Schlegel et al. (2022)

5. GHRR (Generalized Holographic Reduced Representations)¶

Space: Unitary matrices $U(m)^D$ where each element is $m \times m$ unitary matrix

Operations:

Binding: $[\mathbf{A}]_j \cdot [\mathbf{B}]_j$ (element-wise matrix multiplication)
Unbinding: $[\mathbf{C}]_j \cdot [\mathbf{B}]_j^\dagger$ (conjugate transpose)
Bundling: $\sum_i [\mathbf{A}_i]_j$ (element-wise matrix addition)

Properties:

✅ Exact inverse
✅ Non-commutative (tunable via diagonality)
✅ Associative (matrix multiplication is associative: (A·B)·C = A·(B·C))
✅ Better capacity than FHRR for bound vectors
✅ No permutation needed for order
⚠️ Higher memory (D × m × m)

Best for: Compositional structures, trees, nested relationships

Reference: Yeung, Zou, & Imani (2024)

6. VTB (Vector-derived Transformation Binding)¶

Space: Real $\mathbb{R}^D$

Operations (implemented via circular convolution):

Binding: $\mathbf{c} = \mathbf{a} \circledast \mathbf{b}$ (circular convolution)
Unbinding: $\mathbf{a} \approx \mathbf{c} \circledast \tilde{\mathbf{b}}$ (circular correlation)
Bundling: $\mathbf{s} = \sum_i \mathbf{a}_i / \|\sum_i \mathbf{a}_i\|$

Properties:

⚠️ Approximate inverse (~0.69 similarity)
✅ Commutative (via circular convolution)
✅ Better unbinding than HRR
⚠️ O(D log D) via FFT

Best for: Better approximate unbinding, similar to HRR with improvements

Reference: Gosmann & Eliasmith (2019), Schlegel et al. (2022)

Note: Our implementation uses FFT-based circular convolution (O(D log D)) instead of explicit circulant matrices (O(D²)) for efficiency.

7. BSDC (Binary Sparse Distributed Codes)¶

Space: Sparse binary $\{0, 1\}^D$ with $\|h\|_0 = pD$ where $p = 1/\sqrt{D}$

Operations:

Binding: $\mathbf{c} = \mathbf{a} \oplus \mathbf{b}$ (XOR)
Unbinding: $\mathbf{a} = \mathbf{c} \oplus \mathbf{b}$ (self-inverse)
Bundling: Top-k selection (maintains sparsity)

Properties:

✅ Self-inverse
✅ Exact inverse
✅ Optimal sparsity formula
✅ ~50x memory savings at D=10,000
✅ Excellent bundling capacity
⚠️ Density increases with binding

Best for: Memory-constrained applications, hardware implementations

Reference: Kanerva (2009), Rachkovskij (2001)

Core Operations¶

Bundling (+)¶

Purpose: Superimpose multiple hypervectors to create a representation similar to all inputs.

Property: If $\mathbf{h} = \mathbf{a} + \mathbf{b} + \mathbf{c}$, then:

$\delta(\mathbf{h}, \mathbf{a}) > 0$ (similar)
$\delta(\mathbf{h}, \mathbf{b}) > 0$ (similar)
$\delta(\mathbf{h}, \mathbf{c}) > 0$ (similar)

Cognitive interpretation: Memory consolidation, prototype formation

Example:

from holovec import VSA

model = VSA.create('FHRR', dim=10000)

# Create base concepts
dog = model.random(seed=1)
cat = model.random(seed=2)
bird = model.random(seed=3)

# Bundle into "pets" concept
pets = model.bundle([dog, cat, bird])

# pets is similar to all three
print(model.similarity(pets, dog))   # ~0.58
print(model.similarity(pets, cat))   # ~0.58
print(model.similarity(pets, bird))  # ~0.58

Binding (⊗)¶

Purpose: Connect two hypervectors to create a representation dissimilar to both.

Properties (Plate 1997):

Dissimilarity: $\delta(\mathbf{a} \otimes \mathbf{b}, \mathbf{a}) \approx 0$
Similarity preservation: If $\mathbf{a}' \approx \mathbf{a}$ and $\mathbf{b}' \approx \mathbf{b}$, then $\mathbf{a}' \otimes \mathbf{b}' \approx \mathbf{a} \otimes \mathbf{b}$
Invertible: Can recover $\mathbf{a}$ from $\mathbf{c} = \mathbf{a} \otimes \mathbf{b}$ using $\mathbf{b}$

Cognitive interpretation: Association, role-filler binding

Example (Role-filler pairs):

# Roles and fillers
role_subject = model.random(seed=10)
role_verb = model.random(seed=11)
role_object = model.random(seed=12)

filler_cat = model.random(seed=20)
filler_chased = model.random(seed=21)
filler_mouse = model.random(seed=22)

# Encode "cat chased mouse"
sentence = model.bundle([
    model.bind(role_subject, filler_cat),
    model.bind(role_verb, filler_chased),
    model.bind(role_object, filler_mouse)
])

# Query: What is the subject?
subject = model.unbind(sentence, role_subject)
print(model.similarity(subject, filler_cat))    # High!
print(model.similarity(subject, filler_mouse))  # Low

Permutation (ρ)¶

Purpose: Reorder vector elements to encode position or sequence information.

Property: $\delta(\rho(\mathbf{h}), \mathbf{h}) \approx 0$ (dissimilar)

Usage: Encoding sequences without binding

Example:

# Encode sequence [A, B, C]
A = model.random(seed=1)
B = model.random(seed=2)
C = model.random(seed=3)

sequence = model.bundle([
    A,                    # Position 0
    model.permute(B, 1),  # Position 1
    model.permute(C, 2)   # Position 2
])

# Different from [C, B, A]
sequence2 = model.bundle([
    C,
    model.permute(B, 1),
    model.permute(A, 2)
])

print(model.similarity(sequence, sequence2))  # Low!

Properties & Characteristics¶

Self-Inverse vs Non Self-Inverse¶

Self-Inverse (MAP, BSC, BSDC):

Binding = Unbinding operation
$\mathbf{a} \otimes \mathbf{b} \otimes \mathbf{b} = \mathbf{a}$
Enables elegant solutions (e.g., “Dollar of Mexico” problem)

Non Self-Inverse (FHRR, HRR, GHRR, VTB):

Separate unbinding operation
More complex but often more powerful
May require “readout machines” for hierarchical structures

Exact vs Approximate Inverse¶

Exact Inverse (MAP-bipolar, FHRR, BSC, GHRR, BSDC):

Perfect recovery: $\mathbf{a} = \mathbf{c} \oslash \mathbf{b}$ when $\mathbf{c} = \mathbf{a} \otimes \mathbf{b}$
No degradation with multiple bind/unbind cycles
Similarity after recovery = 1.0

Approximate Inverse (MAP-real, HRR, VTB):

Imperfect recovery: $\mathbf{a} \approx \mathbf{c} \oslash \mathbf{b}$
Degrades with sequence depth
Similarity after recovery: 0.50-0.70 (single), 0.05-0.25 (after 40 ops)
Often requires “clean-up memory” for practical use

Commutativity¶

Commutative (MAP, FHRR, HRR, BSC, VTB, BSDC):

$\mathbf{a} \otimes \mathbf{b} = \mathbf{b} \otimes \mathbf{a}$
Cannot distinguish order without additional techniques (permutation)
Simpler algebraic properties

Non-Commutative (GHRR):

$\mathbf{a} \otimes \mathbf{b} \neq \mathbf{b} \otimes \mathbf{a}$
Naturally encodes order
Better for hierarchical/compositional structures
Tunable via diagonality parameter

Associativity¶

Associative (MAP, FHRR, HRR, BSC, BSDC, GHRR):

$(\mathbf{a} \otimes \mathbf{b}) \otimes \mathbf{c} = \mathbf{a} \otimes (\mathbf{b} \otimes \mathbf{c})$
Order of operations doesn’t matter
Simplifies nested bindings
GHRR is associative because matrix multiplication is associative

Non-Associative (VTB for unbinding):

Order of operations may matter in certain unbinding scenarios
Note: VTB binding itself is typically associative; non-associativity may appear in complex factorization

Model Comparison¶

Quick Reference Table¶

Model	Space	Self-Inv	Exact	Comm	Assoc	Capacity	Complexity
MAP	Bipolar/Real	✅	✅/⚠️	✅	✅	Medium	O(D)
FHRR	Complex	❌	✅	✅	✅	Best	O(D)
HRR	Real	❌	⚠️	✅	✅	Medium	O(D log D)
BSC	Binary	✅	✅	✅	✅	Medium	O(D)
GHRR	Matrices	❌	✅	❌	✅	High	O(Dm²)
VTB	Real	❌	⚠️	❌	✅	Medium	O(D²)
BSDC	Sparse	✅	✅	✅	✅	High	O(Dp)

Legend:

✅ = Yes/Good
⚠️ = Approximate/Conditional
❌ = No

When to Use Each Model¶

Use MAP when¶

Need self-inverse binding
Implementing on simple hardware
Want exact recovery (bipolar mode)
Prioritize simplicity

Use FHRR when¶

Need maximum bundling capacity
Want exact inverse
Can work with complex numbers
Kernel methods are relevant

Use HRR when¶

Standard VSA tasks
Good balance of properties needed
Real-valued operations preferred
Approximate recovery acceptable

Use BSC when¶

Implementing on digital circuits
Need exact XOR-based operations
Binary operations required
Simple hardware

Use GHRR when¶

Encoding compositional structures
Need non-commutative binding
Working with trees/graphs
Order is semantically important

Use VTB when¶

Need slightly better unbinding than HRR
Approximate recovery acceptable
Real-valued operations preferred

Use BSDC when¶

Memory is constrained
Need sparse representations
Exact recovery required
Hardware efficiency critical

Capacity & Scalability¶

Bundling Capacity¶

Definition: Maximum number of hypervectors that can be bundled while maintaining retrieval accuracy.

Empirical Results (Schlegel et al. 2022):

For 99% retrieval accuracy:

Model	Dimensions Needed (for k=15)	Efficiency
FHRR	~330	⭐⭐⭐⭐⭐ Best
BSDC	~320	⭐⭐⭐⭐⭐ Best
HRR	~510	⭐⭐⭐⭐ Good
VTB	~510	⭐⭐⭐⭐ Good
MAP-C	~640	⭐⭐⭐ Adequate
BSC	~750	⭐⭐⭐ Adequate
MAP-B	~790	⭐⭐ Lower

Scaling: Most models require dimensions to scale linearly with number of items: $D \propto k$

Memory Efficiency¶

Dense Models (MAP, FHRR, HRR, BSC, VTB, GHRR):

Memory: $O(D)$ for scalars, $O(Dm^2)$ for matrices
Storage: 32-64 bits per element

Sparse Models (BSDC):

Memory: $O(Dp)$ where $p = 1/\sqrt{D}$
Savings: ~50x at D=10,000 (100 ones vs 5,000)
Can use sparse data structures

Computational Complexity¶

Operation	MAP	FHRR	HRR	BSC	GHRR	VTB	BSDC
Random	O(D)	O(D)	O(D)	O(D)	O(Dm²)	O(D)	O(Dp)
Bind	O(D)	O(D)	O(D log D)	O(D)	O(Dm³)	O(D log D)	O(D)
Unbind	O(D)	O(D)	O(D log D)	O(D)	O(Dm³)	O(D log D)	O(D)
Bundle	O(kD)	O(kD)	O(kD)	O(kD)	O(kDm²)	O(kD)	O(kDp)
Similarity	O(D)	O(D)	O(D)	O(D)	O(Dm²)	O(D)	O(Dp)

Where:

$D$ = dimension
$k$ = number of vectors to bundle
$m$ = matrix size (GHRR)
$p$ = sparsity (~$1/\sqrt{D}$)

Capacity Formulas¶

FHRR Capacity (Plate 2003): $$k_{\max} \approx 0.35D$$

BSDC Optimal Sparsity (Rachkovskij 2001): $$p = \frac{1}{\sqrt{D}}$$

Expected Ones: $$N_{\text{ones}} = pD = \sqrt{D}$$

Applications¶

Classic Applications¶

Analogical Reasoning
- “What is the Dollar of Mexico?” → Peso
- Self-inverse models excel
Semantic Memory
- Store facts as bundled role-filler pairs
- Query by unbinding
Sequence Encoding
- Text, time series, trajectories
- Use permutation or GHRR
Image Similarity
- Encode spatial structure
- Fast similarity search
Cognitive Architectures
- Working memory
- Reasoning systems
- Symbol grounding

Modern Applications¶

Few-Shot Learning
- Fast adaptation with hypervectors
- One-shot classification
Neuromorphic Computing
- Brain-inspired hardware
- Energy-efficient inference
Edge AI
- Low-power devices
- IoT sensors
Bioinformatics
- DNA sequence analysis
- Protein structure
Robotic Navigation
- Place recognition
- SLAM (Simultaneous Localization and Mapping)

References¶

Foundational Papers¶

Plate, T. A. (2003) “Holographic Reduced Representations” IEEE Transactions on Neural Networks
- Original HRR and FHRR formulation
Kanerva, P. (2009) “Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors” Cognitive Computation
- Foundational HDC framework
Gayler, R. W. (1998) “Multiplicative Binding, Representation Operators & Analogy”
- MAP architecture
Kanerva, P. (1993) “Sparse Distributed Memory and Related Models”
- Binary spatter codes, SDM

Modern Advances¶

Schlegel, K., Neubert, P., & Protzel, P. (2022) “A Comparison of Vector Symbolic Architectures” Artificial Intelligence Review
- Comprehensive comparison of 11 VSA implementations
Yeung, C., Zou, Z., & Imani, M. (2024) “Generalized Holographic Reduced Representations” arXiv:2405.09689v1
- GHRR with non-commutative binding
Gosmann, J., & Eliasmith, C. (2019) “Vector-Derived Transformation Binding”
- VTB model
Rachkovskij, D. A. (2001) “Representation and Processing of Structures with Binary Sparse Distributed Codes”
- Optimal sparsity for BSDC

Survey Papers¶

Kleyko, D., Osipov, E., & Gayler, R. W. (2023) “A Survey on Hyperdimensional Computing aka Vector Symbolic Architectures, Part I: Models and Data Transformations” ACM Computing Surveys
- Comprehensive modern survey
Frady, E. P., et al. (2021) “Computing on Functions Using Randomized Vector Representations” Physical Review Research
- Fractional power encoding, kernel methods

Appendix: Mathematical Notation¶

Symbols¶

$\mathbf{h}$: Hypervector
$D$: Dimension
$\otimes$: Binding operation
$\oslash$: Unbinding operation
$+$: Bundling operation
$\rho$: Permutation operation
$\delta(\cdot, \cdot)$: Similarity measure
$\|\cdot\|$: Norm
$\odot$: Element-wise (Hadamard) product
$\oplus$: XOR operation
$\circledast$: Circular convolution
$\circledast$: Circular correlation

Notation Conventions¶

Bold lowercase ($\mathbf{a}$): Vectors
Bold uppercase ($\mathbf{A}$): Matrices or hypervectors with matrix elements
Italic ($D$, $k$, $m$): Scalars
Blackboard bold ($\mathbb{R}$, $\mathbb{C}$): Number sets

End of Theory Guide

For practical usage examples, see the main README and examples directory. For implementation details, see the API documentation. For validation results, see docs/validation_results.md.

Theory Guide: Hyperdimensional Computing & Vector Symbolic Architectures¶

Table of Contents¶

Introduction¶

What is Hyperdimensional Computing?¶

Why High Dimensions?¶

Historical Context¶

Mathematical Foundations¶

High-Dimensional Spaces¶

Quasi-Orthogonality¶

Similarity Measures¶

VSA Models Implemented¶

1. MAP (Multiply-Add-Permute)¶

2. FHRR (Fourier Holographic Reduced Representations)¶

3. HRR (Holographic Reduced Representations)¶

4. BSC (Binary Spatter Codes)¶

5. GHRR (Generalized Holographic Reduced Representations)¶

6. VTB (Vector-derived Transformation Binding)¶

7. BSDC (Binary Sparse Distributed Codes)¶

Core Operations¶

Bundling (+)¶

Binding (⊗)¶

Permutation (ρ)¶

Properties & Characteristics¶

Self-Inverse vs Non Self-Inverse¶

Exact vs Approximate Inverse¶

Commutativity¶

Associativity¶

Model Comparison¶

Quick Reference Table¶

When to Use Each Model¶

Use MAP when¶

Use FHRR when¶

Use HRR when¶

Use BSC when¶

Use GHRR when¶

Use VTB when¶

Use BSDC when¶

Capacity & Scalability¶

Bundling Capacity¶

Memory Efficiency¶

Computational Complexity¶

Capacity Formulas¶

Applications¶

Classic Applications¶

Modern Applications¶

References¶

Foundational Papers¶

Modern Advances¶

Survey Papers¶

Further Reading¶

Implementation Guides¶

Tutorials¶

Applications Papers¶

Appendix: Mathematical Notation¶

Symbols¶

Notation Conventions¶