Retrieval & Memory

class holovec.retrieval.Codebook(items: Dict[str, Any] | None = None, backend: Backend | None = None)

Bases: object

Thin wrapper for label→vector mappings with convenience methods.

Keeps insertion order of labels. Vectors are backend arrays.

__init__(items: Dict[str, Any] | None = None, backend: Backend | None = None)

add(label: str, vector: Any) → None

extend(items: Dict[str, Any]) → None

property labels: List[str]

property size: int

as_list() → List[Tuple[str, Any]]

as_matrix(backend: Backend | None = None) → Tuple[List[str], Any]

Return (labels, matrix), where matrix has shape (L, D): one row per label, in label order.

save(path: str) → None

classmethod load(path: str, backend: Backend | None = None) → Codebook
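
To make the label→vector mapping concrete, here is a minimal pure-Python sketch. TinyCodebook is a hypothetical stand-in written for illustration, not the holovec implementation: vectors are plain lists rather than backend arrays, and persistence (save/load) is left out.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]


class TinyCodebook:
    """Hypothetical stand-in for a label -> vector mapping (not holovec's class)."""

    def __init__(self, items: Optional[Dict[str, Vector]] = None) -> None:
        # Python dicts preserve insertion order, so label order is kept for free.
        self._items: Dict[str, Vector] = {}
        if items:
            self.extend(items)

    def add(self, label: str, vector: Vector) -> None:
        self._items[label] = list(vector)

    def extend(self, items: Dict[str, Vector]) -> None:
        for label, vector in items.items():
            self.add(label, vector)

    @property
    def labels(self) -> List[str]:
        return list(self._items)

    @property
    def size(self) -> int:
        return len(self._items)

    def as_matrix(self) -> Tuple[List[str], List[Vector]]:
        # Row i of the matrix is the vector for labels[i], giving shape (L, D).
        labels = self.labels
        return labels, [self._items[label] for label in labels]


book = TinyCodebook({"cat": [1.0, -1.0, 1.0]})
book.add("dog", [-1.0, 1.0, 1.0])
labels, matrix = book.as_matrix()
print(labels, len(matrix), len(matrix[0]))
```

Returning the labels alongside the matrix keeps row indices and labels aligned, which is what a batched similarity routine needs to map a winning row back to a name.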

class holovec.retrieval.ItemStore(model: VSAModel, cleanup: CleanupStrategy | None = None)

Bases: object

Thin retrieval wrapper around a Codebook and a CleanupStrategy.

Provides nearest-neighbor queries and multi-factor factorization via the configured cleanup strategy.

__init__(model: VSAModel, cleanup: CleanupStrategy | None = None) → None

fit(items: Dict[str, Any] | Codebook) → ItemStore

add(label: str, vector: Any) → None

extend(items: Dict[str, Any]) → None

query(vec: Any, k: int = 1, return_similarities: bool = True, fast: bool = True) → List[Tuple[str, float]]

Query top-k nearest items.

If fast=True, uses a batched matrix routine when possible; otherwise falls back to the scalar nearest_neighbors routine.
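
The scalar fallback path can be pictured with the sketch below. It is an assumption-laden illustration, not the library's code: top_k and cosine are hypothetical helpers, similarity is taken to be cosine, and vectors are plain lists. A batched variant would compute all similarities with a single matrix-vector product instead of the per-item loop.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float], items: Dict[str, Sequence[float]], k: int = 1
) -> List[Tuple[str, float]]:
    """Return the k most similar (label, similarity) pairs, best first."""
    scored = ((label, cosine(query, vec)) for label, vec in items.items())
    # heapq.nlargest returns results sorted from largest to smallest.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


items = {"x": [1.0, 0.0, 0.0], "y": [0.0, 1.0, 0.0], "xy": [1.0, 1.0, 0.0]}
hits = top_k([1.0, 0.1, 0.0], items, k=2)
for label, sim in hits:
    print(f"{label}: {sim:.3f}")
```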

factorize(vec: Any, n_factors: int, **kwargs) → Tuple[List[str], List[float]]

save(path: str) → None

classmethod load(model: VSAModel, path: str, cleanup: CleanupStrategy | None = None) → ItemStore

class holovec.retrieval.AssocStore(model: VSAModel)

Bases: object

Lean heteroassociative store: keys → values via aligned codebooks.

Stores two codebooks with aligned label order. Query by a key vector returns the best-matching key label and its corresponding value label/vector.

__init__(model: VSAModel) → None

fit(key_items: Dict[str, Any], value_items: Dict[str, Any]) → AssocStore

add(label: str, key_vec: Any, value_vec: Any) → None

query_label(key_vec: Any, k: int = 1) → List[Tuple[str, float]]

query_value(key_vec: Any, top: int = 1) → Tuple[str, Any]

save(keys_path: str, values_path: str) → None

classmethod load(model: VSAModel, keys_path: str, values_path: str) → AssocStore
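
The aligned-codebook idea behind AssocStore can be sketched as follows. TinyAssocStore is a hypothetical illustration, not the holovec class: it keeps keys and values in parallel lists, scores keys with a plain dot product (an assumption), and returns the value stored at the winning key's position.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


class TinyAssocStore:
    """Hypothetical key -> value store built on position-aligned lists."""

    def __init__(self) -> None:
        self.labels: List[str] = []
        self.keys: List[List[float]] = []
        self.values: List[List[float]] = []

    def add(self, label: str, key_vec: Sequence[float], value_vec: Sequence[float]) -> None:
        # Appending to all three lists together keeps their order aligned.
        self.labels.append(label)
        self.keys.append(list(key_vec))
        self.values.append(list(value_vec))

    def query_value(self, key_vec: Sequence[float]) -> Tuple[str, List[float]]:
        if not self.labels:
            raise ValueError("store is empty")
        # Find the best-matching key, then read the value at the same index.
        best = max(range(len(self.keys)), key=lambda i: dot(key_vec, self.keys[i]))
        return self.labels[best], self.values[best]


store = TinyAssocStore()
store.add("paris", [1, 1, -1, -1], [1, -1, 1, -1])
store.add("rome", [-1, 1, 1, -1], [1, 1, 1, 1])

# A noisy copy of the "paris" key (last component flipped).
label, value = store.query_value([1, 1, -1, 1])
print(label, value)
```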

Cleanup Strategies

class holovec.utils.cleanup.BruteForceCleanup

Bases: CleanupStrategy

Brute-force cleanup via exhaustive codebook search.

This is the baseline cleanup strategy that computes similarity between the query and every codebook entry, returning the best match. Simple and effective, but slow for large codebooks.

Performance:

Time complexity: O(n × d) for n items, d dimensions
Space complexity: O(1)
Best for: Small codebooks (< 1000 items)

Examples

>>> # Create strategy
>>> cleanup = BruteForceCleanup()
>>>
>>> # Single cleanup
>>> label, sim = cleanup.cleanup(query, codebook, model)
>>> print(f"Found: {label}")
>>>
>>> # Multi-factor factorization
>>> labels, sims = cleanup.factorize(query, codebook, model, n_factors=3)
>>> print(f"Factors: {labels}")

References

Kanerva (2009): Classic cleanup operation

cleanup(query: Any, codebook: Dict[str, Any], model: VSAModel) → Tuple[str, float]

Find best match via exhaustive search.

Computes similarity between query and every codebook entry, returning the label with highest similarity.

Parameters:

query – Query hypervector to clean up
codebook – Dictionary mapping labels to hypervectors
model – VSA model for similarity computation

Returns:

Tuple of (label, similarity) for the best match

Raises:

TypeError – If arguments are not correct types
ValueError – If codebook is empty

Examples

>>> label, sim = cleanup.cleanup(query, codebook, model)
>>> print(f"Best match: {label} (sim: {sim:.3f})")
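
For readers without a model at hand, the exhaustive search can be reproduced in plain Python. This is a sketch under stated assumptions rather than the library's code: brute_force_cleanup and similarity are hypothetical helpers, vectors are bipolar (±1) lists, and similarity is the normalized dot product, which equals cosine similarity for bipolar vectors.

```python
import random
from typing import Dict, List, Tuple


def similarity(a: List[int], b: List[int]) -> float:
    # For bipolar vectors the normalized dot product equals the cosine.
    return sum(x * y for x, y in zip(a, b)) / len(a)


def brute_force_cleanup(query: List[int], codebook: Dict[str, List[int]]) -> Tuple[str, float]:
    """Compare the query with every entry (O(n x d)) and return the best match."""
    if not codebook:
        raise ValueError("codebook is empty")
    label = max(codebook, key=lambda name: similarity(query, codebook[name]))
    return label, similarity(query, codebook[label])


rng = random.Random(0)
dim = 1000
codebook = {name: [rng.choice((-1, 1)) for _ in range(dim)] for name in ("a", "b", "c")}

# Corrupt "b" by flipping its first 100 components, then clean it up.
noisy = [-x for x in codebook["b"][:100]] + codebook["b"][100:]
label, sim = brute_force_cleanup(noisy, codebook)
print(f"Best match: {label} (sim: {sim:.3f})")
```

Flipping 100 of 1000 components leaves 900 agreements and 100 disagreements, so the similarity to the original is (900 - 100) / 1000 = 0.8, while unrelated random vectors score close to zero.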

factorize(query: Any, codebook: Dict[str, Any], model: VSAModel, n_factors: int = 2, max_iterations: int = 20, threshold: float = 0.99) → Tuple[List[str], List[float]]

Factorize via iterative cleanup and unbinding.

Repeatedly finds the best match, unbinds it from the query, and continues until n_factors factors have been extracted or the process converges.

Parameters:

query – Composite hypervector to factorize
codebook – Dictionary mapping labels to hypervectors
model – VSA model for bind/unbind/similarity operations
n_factors – Number of factors to extract (default: 2)
max_iterations – Maximum iterations per factor (default: 20)
threshold – Convergence threshold for similarity (default: 0.99)

Returns:

Tuple of (labels, similarities):

labels – List of factor labels in extraction order
similarities – List of similarities for each factor

Raises:

TypeError – If arguments are not correct types
ValueError – If n_factors < 1 or codebook is empty

Examples

>>> labels, sims = cleanup.factorize(
...     query, codebook, model, n_factors=3, threshold=0.95
... )
>>> print(f"Extracted {len(labels)} factors")

class holovec.utils.cleanup.ResonatorCleanup

Bases: CleanupStrategy

Resonator network cleanup via iterative refinement.

Implements the resonator network algorithm from Kymn et al. (2024), which uses iterative attention mechanisms to refine factor estimates. Achieves 10-100x speedup over brute-force for multi-factor unbinding.

Algorithm:

1. Initialize estimates for all factors.
2. For each iteration:
   a. Unbind the other factors to isolate the target.
   b. Clean up the result against the codebook.
   c. Update the estimate.
3. Repeat until convergence or max_iterations.

Performance:

Convergence: Typically 5-15 iterations
Speedup: 10-100x over brute-force
Best for: Multi-factor compositions (3+ factors)

Examples

>>> # Create resonator cleanup
>>> cleanup = ResonatorCleanup()
>>>
>>> # Single cleanup (same as brute-force)
>>> label, sim = cleanup.cleanup(query, codebook, model)
>>>
>>> # Multi-factor with resonator (much faster)
>>> labels, sims = cleanup.factorize(
...     query, codebook, model, n_factors=5, threshold=0.99
... )
>>> print(f"Converged with {len(labels)} factors")

Attributes:

None (stateless)

References

Kymn et al. (2024): Attention Mechanisms in VSAs

Section 3: Resonator Networks
Algorithm 1: Iterative factorization

cleanup(query: Any, codebook: Dict[str, Any], model: VSAModel) → Tuple[str, float]

Find best match via exhaustive search.

For single-factor cleanup, resonator networks reduce to brute-force search. Use factorize() for multi-factor speedup.

Parameters:

query – Query hypervector to clean up
codebook – Dictionary mapping labels to hypervectors
model – VSA model for similarity computation

Returns:

Tuple of (label, similarity) for the best match

Raises:

TypeError – If arguments are not correct types
ValueError – If codebook is empty

Examples

>>> label, sim = cleanup.cleanup(query, codebook, model)

factorize(query: Any, codebook: Dict[str, Any], model: VSAModel, n_factors: int = 2, max_iterations: int = 20, threshold: float = 0.99, temperature: float = 20.0, top_k: int = 1, patience: int = 3, min_delta: float = 0.0001, mode: str = 'hard') → Tuple[List[str], List[float]]

Factorize via resonator network iteration.

Uses iterative attention to refine factor estimates simultaneously, achieving much faster convergence than sequential unbinding.

Algorithm (from Kymn et al. 2024):

1. Initialize: estimates = [random from codebook] × n_factors
2. Repeat for max_iterations:
   a. For each factor i:
      - Unbind all other estimates from query
      - Cleanup result against codebook
      - Update estimate[i]
   b. Check convergence (all similarities >= threshold)
3. Return final estimates and similarities

Parameters:

query – Composite hypervector to factorize
codebook – Dictionary mapping labels to hypervectors
model – VSA model for bind/unbind/similarity operations
n_factors – Number of factors to extract (default: 2)
max_iterations – Maximum iterations (default: 20)
threshold – Convergence threshold for similarity (default: 0.99)

Returns:

Tuple of (labels, similarities):

labels – List of factor labels
similarities – List of similarities for each factor

Raises:

TypeError – If arguments are not correct types
ValueError – If n_factors < 1 or codebook is empty

Examples

>>> # Fast multi-factor unbinding
>>> labels, sims = cleanup.factorize(
...     query, codebook, model, n_factors=5
... )
>>> print(f"Factors: {labels}")
>>> print(f"Avg similarity: {sum(sims)/len(sims):.3f}")
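
The unbind, cleanup, update loop listed for factorize() can be tried out with the self-contained sketch below. It is not the holovec implementation and makes several assumptions: vectors are bipolar (±1) lists, binding is elementwise multiplication (so binding again undoes it), cleanup is a hard argmax, and the temperature, top_k, patience, min_delta and mode options are ignored. It also swaps the random initialization listed above for a superposition of all codebook vectors, a common choice in the resonator-network literature that keeps this small example deterministic. All function names are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Vec = List[int]


def bind(a: Vec, b: Vec) -> Vec:
    # Elementwise product; self-inverse for bipolar vectors (assumed model).
    return [x * y for x, y in zip(a, b)]


def similarity(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b)) / len(a)


def resonator_factorize(
    query: Vec,
    codebook: Dict[str, Vec],
    n_factors: int = 2,
    max_iterations: int = 20,
    threshold: float = 0.99,
) -> Tuple[List[str], List[float]]:
    if n_factors < 1 or not codebook:
        raise ValueError("need n_factors >= 1 and a non-empty codebook")
    labels = list(codebook)
    # Superposition init (assumption): sign of the sum of all codebook vectors.
    start = [1 if sum(codebook[l][i] for l in labels) >= 0 else -1
             for i in range(len(query))]
    estimates = [list(start) for _ in range(n_factors)]
    best: List[Tuple[str, float]] = [("", 0.0)] * n_factors
    for _ in range(max_iterations):
        for i in range(n_factors):
            # Unbind every other estimate from the query to isolate factor i.
            residual = list(query)
            for j in range(n_factors):
                if j != i:
                    residual = bind(residual, estimates[j])
            # Hard cleanup: replace the estimate with the nearest codebook entry.
            label = max(labels, key=lambda l: similarity(residual, codebook[l]))
            best[i] = (label, similarity(residual, codebook[label]))
            estimates[i] = codebook[label]
        if all(sim >= threshold for _, sim in best):
            break
    return [l for l, _ in best], [s for _, s in best]


rng = random.Random(0)
names = ("red", "blue", "circle", "square")
codebook = {n: [rng.choice((-1, 1)) for _ in range(1024)] for n in names}
query = bind(codebook["blue"], codebook["circle"])
labels, sims = resonator_factorize(query, codebook, n_factors=2)
print(sorted(labels), sims)
```

Because this binding is commutative and both factors come from the same codebook, the order of the returned labels is not meaningful here; only the set of recovered factors is.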

factorize_verbose(query: Any, codebook: Dict[str, Any], model: VSAModel, n_factors: int = 2, max_iterations: int = 20, threshold: float = 0.99, temperature: float = 20.0, top_k: int = 1, patience: int = 3, min_delta: float = 0.0001, mode: str = 'hard') → Tuple[List[str], List[float], List[float]]

Like factorize(), but also returns the history of average similarity per iteration.
