
How to Archive Memories Without Losing Access

Archiving moves inactive memories out of the fast retrieval index and into cold storage, reducing search candidates and storage costs while preserving the data for future use. A well-built archival system lets you restore memories to the active tier when they become relevant again, so forgetting is reversible and no knowledge is permanently lost unless you explicitly delete it.

Before You Start

Two-tier archival makes sense when your memory store has grown large enough that inactive memories measurably degrade retrieval quality or increase costs. If you have fewer than a thousand memories, the overhead of maintaining two storage tiers probably exceeds the benefit. Once you cross several thousand memories, especially if many are months old and rarely retrieved, archival becomes valuable.

You need a storage backend that supports two access patterns: fast, indexed retrieval for the active tier (your vector database or hybrid search index) and cheap, bulk storage for the archive tier (object storage, a relational database, or a document store). The archive tier does not need vector search capability, because archived memories are queried differently than active ones.

Step-by-Step Implementation

Step 1: Design the two-tier storage model.
The active tier holds memories that are indexed for vector search, have current activation values, and participate in cognitive scoring during retrieval. This tier needs fast read access and vector similarity search. The archive tier holds complete memory records, including content, metadata, entity lists, access history, and the reason for archival, but does not maintain vector embeddings or search indexes. This tier optimizes for low storage cost and bulk operations rather than fast individual lookups.
```python
# Active tier: vector DB with full indexing
active_tier = VectorStore(
    index_type='hnsw',
    dimensions=1536,
    metadata_fields=['confidence', 'entities', 'activation']
)

# Archive tier: cheap document storage
archive_tier = DocumentStore(
    storage_class='standard',
    indexed_fields=['id', 'archived_at', 'entities']
)
```
Step 2: Define archival criteria.
A memory qualifies for archival when its activation has dropped below the forgetting threshold and it is not protected by high importance. The simplest criteria combine an activation check with a minimum age requirement to prevent archiving recently created memories that simply have not been retrieved yet. A memory must be below the activation threshold and at least 30 days old, with no access in the last 14 days. These values should be tuned based on your domain.
```python
def should_archive(memory, threshold=-3.0, min_age_days=30, min_idle_days=14):
    # Protected by activation: still retrievable often enough to keep hot
    activation = compute_activation(memory['access_times'])
    if activation >= threshold:
        return False
    # Too young: has not had a fair chance to be retrieved yet
    now = time.time()
    age_days = (now - memory['created_at']) / 86400
    if age_days < min_age_days:
        return False
    # Recently accessed: leave in the active tier
    last_access = max(memory['access_times']) if memory['access_times'] else 0
    idle_days = (now - last_access) / 86400
    if idle_days < min_idle_days:
        return False
    # Protected by high importance
    if memory.get('confidence', 0) >= 8.0:
        return False
    return True
```
Step 3: Build the archival migration.
The migration process copies the complete memory record to the archive tier, then removes it from the active tier. Order matters: write to the archive first, verify the write succeeded, then remove from the active tier. This prevents data loss if the process is interrupted between the two operations. Record the archival timestamp and the reason for archival in the archive record so you can audit and filter archived memories later.
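The archive-first ordering can be sketched as follows. This is a minimal illustration, not a production migration: `archive_memory` is a hypothetical helper, and the `active_tier` and `archive_tier` objects are assumed to expose simple `put`/`get`/`delete` methods like the stores from Step 1.

```python
import time

def archive_memory(memory, active_tier, archive_tier, reason='low_activation'):
    # 1. Write the complete record to the archive, with audit fields
    record = {
        'id': memory['id'],
        'metadata': memory,
        'archived_at': time.time(),
        'reason': reason,
    }
    archive_tier.put(record)
    # 2. Verify the archive write before touching the active tier
    if archive_tier.get(memory['id']) is None:
        raise RuntimeError('Archive write failed; memory left in active tier')
    # 3. Only now remove from the active tier. A crash before this line
    #    leaves a harmless duplicate in both tiers, never data loss.
    active_tier.delete(memory['id'])
```

If the process dies between steps 1 and 3, a periodic sweep can detect memories present in both tiers and finish the removal, which is a much easier failure to recover from than a memory that was deleted before it was safely archived.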
Step 4: Add archive search capability.
Archived memories are not in the vector index, so they cannot be found through normal retrieval. Build a separate search function that queries the archive tier using entity matching, keyword search, or metadata filters. This function is called explicitly when a user or application wants to search historical knowledge that might have been archived. It is deliberately slower than active retrieval because it prioritizes completeness over speed.
```python
def search_archive(query_entities, archive_store, limit=20):
    results = []
    for entity in query_entities:
        matches = archive_store.query(
            filter={'entities': {'contains': entity}},
            limit=limit
        )
        results.extend(matches)
    # Deduplicate by id, then rank by confidence
    seen = set()
    unique = []
    for r in results:
        if r['id'] not in seen:
            seen.add(r['id'])
            unique.append(r)
    return sorted(
        unique, key=lambda m: m.get('confidence', 0), reverse=True
    )[:limit]
```
Step 5: Implement restoration.
When an archived memory becomes relevant again, restore it to the active tier. Restoration regenerates the vector embedding (which was removed during archival), resets the activation value to a moderate starting point, and re-inserts the memory into the active search index. The memory's confidence, entity connections, and historical metadata carry forward from the archive, so it does not lose its established reputation. Add the restoration event to the access history so the restored memory starts accumulating activation immediately.
```python
def restore_memory(memory_id, archive_store, active_tier):
    archived = archive_store.get(memory_id)
    if not archived:
        raise ValueError('Memory not found in archive')
    memory = archived['metadata']
    # Regenerate the embedding that was dropped at archival time
    memory['embedding'] = generate_embedding(memory['content'])
    # Count restoration as an access so activation starts accumulating
    memory['access_times'].append(time.time())
    memory['restored_at'] = time.time()
    active_tier.put(memory)
    archive_store.delete(memory_id)
    return memory
```

Cost Savings from Two-Tier Storage

The primary cost benefit comes from removing vector embeddings from the active search index. Vector indexes are the most expensive component of a memory system, requiring high-performance storage with specialized indexing structures. Moving 40% of memories to cold storage reduces your vector index size by 40%, which directly reduces the compute and memory costs of every retrieval operation. The archive tier uses commodity storage that costs a fraction of vector database pricing.
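As a back-of-envelope illustration (the memory count and archival fraction below are assumptions, not benchmarks), the raw vector savings scale linearly with the fraction archived:

```python
DIM = 1536            # embedding dimensions, as in the Step 1 example
BYTES_PER_FLOAT = 4   # float32 components
total_memories = 100_000   # assumed store size
archived_fraction = 0.40   # assumed share of cold memories

vector_bytes = DIM * BYTES_PER_FLOAT                 # 6,144 bytes per embedding
index_bytes = total_memories * vector_bytes          # ~614 MB of raw vectors
saved_bytes = int(index_bytes * archived_fraction)   # ~246 MB leaves the hot index

print(f'{saved_bytes / 1e6:.0f} MB of vectors move to cheap storage')
```

The real savings are larger than the raw vector math suggests, because HNSW-style indexes also carry graph structure and must typically be held in RAM, whereas archived records can sit on disk.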

A secondary benefit is faster retrieval. With fewer candidates in the active index, vector search runs faster and cognitive scoring evaluates fewer items. The quality improvement from smaller, cleaner candidate sets often matters more than the direct cost savings.

Compliance Considerations

Archival differs from deletion in that the data still exists. For GDPR right-to-erasure requests or similar compliance requirements, archived memories must be searchable and deletable from the archive tier. Build your archive with the ability to locate and permanently remove specific memories by user ID, content match, or entity reference. Adaptive Recall's forget tool operates across both tiers, ensuring that a deletion request removes the memory from whichever tier it currently resides in.
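Cross-tier deletion might look like the following sketch. The `forget_everywhere` helper and its store API are illustrative assumptions, not Adaptive Recall's actual interface:

```python
def forget_everywhere(memory_id, active_tier, archive_tier):
    """Permanently remove a memory from whichever tier currently holds it."""
    deleted = False
    if active_tier.get(memory_id) is not None:
        active_tier.delete(memory_id)
        deleted = True
    if archive_tier.get(memory_id) is not None:
        archive_tier.delete(memory_id)
        deleted = True
    return deleted  # False means the memory was already gone from both tiers
```

Checking both tiers unconditionally, rather than stopping at the first hit, also covers the migration edge case where a crash left a copy in each tier.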

Two-tier archival and automatic lifecycle management, built into every account. Focus on your application while your memory stays lean.

Get Started Free