Customer Memory Architecture: What to Store

The most common mistake in customer memory systems is storing too much. Raw conversation transcripts, irrelevant personal details, and transient operational data bloat the memory store, degrade retrieval quality, and create unnecessary privacy exposure. Effective customer memory stores only information that changes how the AI should handle future interactions: the customer's environment, their preferences, their issue history, and their relationship context. Everything else should be processed and discarded.

The Principle: Store What Changes Future Behavior

Before storing any piece of information, apply a single test: will this information change how the AI handles a future conversation with this customer? If the answer is yes, store it. If the answer is no, let it expire with the conversation. This test eliminates most of the noise that accumulates in uncurated memory systems.

A customer mentioning they use Python changes future behavior because the AI should provide Python examples instead of generic ones. That gets stored. A customer saying "hold on, my cat just jumped on the keyboard" does not change future behavior. That does not get stored. A customer expressing frustration about a recurring issue changes future behavior because the AI should acknowledge the pattern and prioritize resolution. That gets stored. A customer saying "thanks, have a good evening" does not change future behavior. That does not get stored.

This principle also applies to structured data. The customer's subscription plan changes future behavior because the AI should tailor recommendations to available features. Store it. The customer's internal CRM score might not change how the AI interacts, even if it is useful for other systems. Do not store it in the AI memory unless it influences the AI's responses.

What to Store: Six Categories

1. Technical Environment

What technology the customer uses and how their systems are configured. This is the highest-value memory category for technical support because it eliminates the most common re-explanation: "What language/framework/platform are you using?"

Store: programming languages and versions, frameworks, cloud provider and services, database systems, operating system, deployment method, integration count and types, API version in use. Update when the customer mentions changes. Priority: high, retrieve for every technical support interaction.

2. Communication Preferences

How the customer prefers to interact with support. This category drives personalization and requires evidence-gating because preferences should be learned from patterns, not assumed from single interactions.

Store: preferred response detail level (concise vs thorough), preferred technical depth (code examples vs plain language), preferred follow-up method (email, chat, none), preferred contact channel, sensitivity to response time. Learn from at least three interactions before storing as a confirmed preference. Priority: medium, retrieve for every interaction.
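
The three-interaction rule above can be sketched as a small evidence gate. This is a minimal illustration, not a prescribed implementation: the class name PreferenceTracker, its methods, and the interaction IDs are hypothetical, and the threshold of three comes from the text.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Optional

MIN_INTERACTIONS = 3  # threshold from the text: at least three interactions


@dataclass
class PreferenceTracker:
    """Collects signals for one preference (hypothetical helper)."""
    name: str
    # interaction_id -> observed value. Keying by interaction means one
    # long conversation cannot count as several pieces of evidence.
    observations: Dict[str, str] = field(default_factory=dict)

    def observe(self, interaction_id: str, value: str) -> None:
        self.observations[interaction_id] = value

    def confirmed_value(self) -> Optional[str]:
        """Return the value only once enough interactions agree on it."""
        if not self.observations:
            return None
        value, count = Counter(self.observations.values()).most_common(1)[0]
        return value if count >= MIN_INTERACTIONS else None


tracker = PreferenceTracker("detail_level")
tracker.observe("conv-1", "concise")
tracker.observe("conv-2", "concise")
print(tracker.confirmed_value())  # None: only two interactions so far
tracker.observe("conv-3", "concise")
print(tracker.confirmed_value())  # concise
```

Until the gate opens, the signal can still be kept as an unconfirmed observation; only the confirmed value should drive personalization.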

3. Issue History

What problems the customer has had and how they were resolved. This category prevents repeated troubleshooting and enables pattern recognition for recurring issues.

Store as episodic memories: the issue summary (not the full conversation), what was tried, what worked, what did not work, resolution status, and date. Consolidate over time: multiple interactions about the same issue should be merged into a single memory with the full troubleshooting history. Priority: high for recent issues (last 30 days), medium for older issues.
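
Consolidation of repeated issue records might look like the following sketch. It assumes each memory already carries a stable issue_key that identifies "the same issue"; that key, the IssueMemory fields, and the sample data are hypothetical, and deciding which interactions belong together is the hard part that is not shown here.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, List


@dataclass
class IssueMemory:
    issue_key: str        # hypothetical key grouping interactions on one issue
    summary: str          # short summary, not the full conversation
    tried: List[str]      # every step attempted
    worked: List[str]     # steps that helped; the rest of `tried` did not
    status: str           # e.g. "open" or "resolved"
    last_seen: date


def consolidate(memories: List[IssueMemory]) -> List[IssueMemory]:
    """Merge memories that share an issue_key into one record each."""
    merged: Dict[str, IssueMemory] = {}
    for m in sorted(memories, key=lambda mem: mem.last_seen):
        kept = merged.get(m.issue_key)
        if kept is None:
            # Copy the lists so the input records are not mutated later.
            merged[m.issue_key] = replace(
                m, tried=list(m.tried), worked=list(m.worked)
            )
            continue
        # dict.fromkeys drops repeated steps while preserving order.
        kept.tried = list(dict.fromkeys(kept.tried + m.tried))
        kept.worked = list(dict.fromkeys(kept.worked + m.worked))
        # The most recent interaction supplies summary, status, and date.
        kept.summary, kept.status, kept.last_seen = m.summary, m.status, m.last_seen
    return list(merged.values())


first = IssueMemory("webhook-timeout", "Webhooks time out",
                    ["raise timeout"], [], "open", date(2024, 1, 5))
second = IssueMemory("webhook-timeout", "Webhooks time out under load",
                     ["raise timeout", "enable retries"], ["enable retries"],
                     "resolved", date(2024, 1, 12))
print(consolidate([second, first]))
```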

4. Account Context

The customer's business relationship with your organization. This category helps the AI understand the customer's value, their constraints, and the appropriate level of service.

Store: subscription plan and tier, customer tenure (how long they have been a customer), company name and size, industry, primary use case for the product, account health indicators if available from CRM sync. Priority: medium, retrieve when the conversation involves account-level decisions (upgrades, billing, feature access).

5. Product Usage

Which features the customer uses and how they use them. This helps the AI provide relevant recommendations and avoid suggesting features the customer already uses or features that are not available on their plan.

Store: primary features used, features mentioned as critical to their workflow, feature requests expressed in conversations, adoption stage (new user vs power user), and any custom configurations. Priority: medium, retrieve when discussing feature questions or troubleshooting.

6. Relationship Signals

Indicators of the customer's satisfaction and engagement trajectory. These help the AI recognize when extra care is needed, such as when a customer has been through multiple frustrating interactions.

Store: sentiment trajectory (improving, stable, declining), escalation history, explicit feedback (compliments or complaints), and churn risk indicators like mentions of competitors or frustration with pricing. Priority: low for routine interactions, high when sentiment signals indicate risk.
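
The priority rules stated for the six categories can be collected into one lookup, sketched below. The ConversationContext flags are hypothetical and would be set by an upstream classifier. The "skip" value is an assumption for categories the text says to retrieve only in certain conversations, and issue recency is simplified to a single per-customer number rather than a per-memory check.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ConversationContext:
    # Hypothetical flags describing the current conversation.
    is_technical: bool = False
    is_account_level: bool = False            # upgrades, billing, feature access
    is_feature_or_troubleshooting: bool = False
    sentiment_at_risk: bool = False
    days_since_last_issue: Optional[int] = None


def retrieval_priorities(ctx: ConversationContext) -> Dict[str, str]:
    """Map each memory category to 'high', 'medium', 'low', or 'skip'."""
    recent_issue = (
        ctx.days_since_last_issue is not None
        and ctx.days_since_last_issue <= 30
    )
    return {
        "technical_environment": "high" if ctx.is_technical else "skip",
        "communication_preferences": "medium",  # retrieved for every interaction
        "issue_history": "high" if recent_issue else "medium",
        "account_context": "medium" if ctx.is_account_level else "skip",
        "product_usage": (
            "medium" if ctx.is_feature_or_troubleshooting else "skip"
        ),
        "relationship_signals": "high" if ctx.sentiment_at_risk else "low",
    }


ctx = ConversationContext(is_technical=True, days_since_last_issue=12)
print(retrieval_priorities(ctx))
```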

What Not to Store

Raw conversation transcripts. Full transcripts are too verbose for retrieval, too expensive to store at scale, and contain too much noise (greetings, pleasantries, thinking out loud) that degrades search quality. Store structured summaries instead.

Payment and authentication details. Credit card numbers, passwords, security questions, and any authentication credentials should never be stored in the memory system. These are handled by dedicated, PCI-compliant systems and have no place in AI memory.

Personal identifiers beyond what is needed. Social security numbers, government IDs, dates of birth, and similar personal identifiers should not be in the memory system unless they are required for the service (and even then, consider whether the AI actually needs them or whether they belong in a separate secure store).
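
One way to enforce the two exclusions above is a scrubbing pass over every summary before it is written to memory. The sketch below is a backstop under stated assumptions, not a guarantee: the function names and placeholder strings are hypothetical, the patterns cover only card-like digit runs (checked with the Luhn algorithm) and US-style social security numbers, and free-form secrets such as passwords cannot be caught by pattern matching at all.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or dashes.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
# US social security number in the common 3-2-4 dashed form.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum over a string of digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scrub(text: str) -> str:
    """Mask card-like and SSN-like values before a summary is stored."""
    def mask_card(match: "re.Match[str]") -> str:
        digits = re.sub(r"\D", "", match.group())
        # Only mask digit runs that pass Luhn, to spare order numbers etc.
        return "[REDACTED_CARD]" if luhn_valid(digits) else match.group()

    text = CARD_RE.sub(mask_card, text)
    return SSN_RE.sub("[REDACTED_SSN]", text)


print(scrub("Customer read out card 4111 1111 1111 1111 and SSN 123-45-6789."))
```

The more reliable control is upstream: instruct the summarization step never to extract these values, and treat the scrubber as a second line of defense.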

Transient operational details. The customer's current browser, today's error log output, and the exact timestamp of their last login are useful during a conversation but not worth persisting. They change too frequently to be reliable, and they clutter the memory store with rapidly outdated information.

Information the CRM already holds. If your CRM stores the customer's company size, industry, and contract value, do not maintain an independent copy of that data in the memory system. Instead, sync a curated subset of CRM data into memory at query time or through periodic import, keeping the CRM as the source of truth for structured account data. The Account Context fields described earlier should be populated this way rather than captured separately from conversations.

Metadata Schema

Every stored memory should carry metadata that enables effective retrieval, lifecycle management, and privacy compliance. A practical metadata schema for customer memories includes: customer_id (the canonical identifier linking all memories to a customer profile), type (episodic, semantic, preference, or procedural), topic (a category tag like billing, api_integration, account_management), source (conversation, crm_sync, consolidation), channel (chat, email, phone, social), confidence (how certain the system is about this information), created_at (when the memory was stored), updated_at (when the memory was last modified), and expires_at (when the memory should be automatically deleted).

The metadata should not include any content that would be sensitive if the metadata layer were exposed independently. Customer IDs and topic tags are fine. Memory content fragments in metadata fields are not. Keep the metadata structural and use it for filtering, sorting, and lifecycle management rather than as a secondary content store.
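
The schema described above could be expressed as a typed record like the one below. The field names come from the text; everything else is an assumption made for illustration, including the 0.0 to 1.0 confidence scale, the optional expires_at, the is_expired helper, and the one-year lifetime in the usage lines.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class MemoryType(str, Enum):
    EPISODIC = "episodic"
    SEMANTIC = "semantic"
    PREFERENCE = "preference"
    PROCEDURAL = "procedural"


@dataclass
class MemoryMetadata:
    customer_id: str                # canonical customer identifier
    type: MemoryType
    topic: str                      # e.g. "billing", "api_integration"
    source: str                     # "conversation", "crm_sync", "consolidation"
    channel: str                    # "chat", "email", "phone", "social"
    confidence: float               # assumed scale: 0.0 (guess) to 1.0 (certain)
    created_at: datetime
    updated_at: datetime
    expires_at: Optional[datetime] = None   # None means no automatic expiry

    def is_expired(self, now: datetime) -> bool:
        """True once the memory is due for automatic deletion."""
        return self.expires_at is not None and now >= self.expires_at


now = datetime.now(timezone.utc)
meta = MemoryMetadata(
    customer_id="cust_123",         # hypothetical identifier
    type=MemoryType.SEMANTIC,
    topic="api_integration",
    source="conversation",
    channel="chat",
    confidence=0.9,
    created_at=now,
    updated_at=now,
    expires_at=now + timedelta(days=365),   # assumed retention period
)
print(meta.is_expired(now))  # False
```

Note that no field holds memory content, which keeps the metadata layer structural as described above.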

Storage Volume Planning

A typical customer generates 1 to 3 memories per support interaction: an interaction summary (episodic), any new factual observations (semantic), and preference signal updates (preference). With consolidation running weekly, deduplication and merging bring the net growth down to roughly 1 memory per interaction. A customer who contacts support monthly therefore adds about 12 memories per year, which longer-term consolidation reduces to 5 to 8 consolidated memories plus a handful of recent episodic records.

For a business with 10,000 customers, plan for 50,000 to 100,000 total memories in the store, with 5,000 to 15,000 new memories per month and consolidation removing a similar number. This volume is well within the capacity of any modern vector database and adds minimal storage cost (typically $50 to $200 per month depending on the provider and embedding dimensions).
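
The arithmetic behind these figures can be reproduced with a short estimator. The function names and defaults are hypothetical: the 5 to 10 memories per customer is derived by dividing the stated totals by 10,000 customers, and the monthly figure assumes one interaction per customer per month at a net of one memory each.

```python
from typing import Tuple


def estimate_store_size(
    customers: int,
    memories_per_customer: Tuple[int, int] = (5, 10),  # derived from the text
) -> Tuple[int, int]:
    """Return the (low, high) steady-state memory count."""
    low, high = memories_per_customer
    return customers * low, customers * high


def estimate_monthly_new(
    customers: int,
    interactions_per_month: float = 1.0,        # assumed: monthly contact
    net_memories_per_interaction: float = 1.0,  # net of weekly consolidation
) -> int:
    """Return the expected number of new memories written per month."""
    return round(customers * interactions_per_month * net_memories_per_interaction)


print(estimate_store_size(10_000))    # (50000, 100000)
print(estimate_monthly_new(10_000))   # 10000
```

Varying interactions_per_month between 0.5 and 1.5 spans the 5,000 to 15,000 range quoted above.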

Store what matters, skip what does not. Adaptive Recall's memory API handles storage, retrieval, and consolidation so your team focuses on the customer experience rather than the data architecture.
