How to Build an AI Assistant with LangChain
Before You Start
You need Python 3.10 or later and API keys for your model provider. This guide uses Claude through langchain-anthropic, but the agent architecture works the same way with OpenAI, Google, or any supported provider. You also need a plan for memory. LangChain's built-in memory classes (ConversationBufferMemory, ConversationSummaryMemory) handle conversation history within a session but do not provide persistent cross-session memory, knowledge extraction, or cognitive retrieval. For production use, you will want to integrate a persistent memory service.
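To make the scoping concrete, here is a minimal plain-Python sketch of the distinction (no LangChain involved; `SessionBuffer` and `PersistentStore` are hypothetical stand-ins, for illustration only): a session buffer lives only as long as the process, while a persistent store survives restarts because it is written to disk.

```python
import json, os, tempfile

class SessionBuffer:
    """Session-scoped history: lost when the process exits."""
    def __init__(self):
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})

class PersistentStore:
    """Cross-session memory: facts survive because they live on disk."""
    def __init__(self, path):
        self.path = path

    def store(self, fact):
        facts = self.load()
        facts.append(fact)
        with open(self.path, "w") as f:
            json.dump(facts, f)

    def load(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return json.load(f)

# First "session": both the buffer and the store see the fact.
path = os.path.join(tempfile.mkdtemp(), "memory.json")
session = SessionBuffer()
session.add("user", "My name is Ada")
store = PersistentStore(path)
store.store("User's name is Ada")

# Second "session": a fresh buffer is empty, but the store still has the fact.
new_session = SessionBuffer()
print(len(new_session.messages))      # 0 — conversation history is gone
print(PersistentStore(path).load())   # ["User's name is Ada"]
```

A persistent memory service plays the role of `PersistentStore` here, with retrieval and lifecycle management layered on top.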
Step-by-Step Setup
Install the core LangChain package and the provider integration for your model. If you plan to use retrieval, install the vector store integration as well.
```
pip install langchain langchain-anthropic langchain-core

# Optional: for retrieval-augmented assistants
pip install langchain-community faiss-cpu
```

Initialize the model with your API key, define tools using LangChain's @tool decorator, and create the agent. LangChain's tool decorator converts a Python function with a docstring into a tool definition that the model can call: the docstring becomes the tool description, parameter annotations become the schema, and the function body becomes the execution handler.
```python
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Initialize model
llm = ChatAnthropic(model="claude-sonnet-4-6", temperature=0)

# Define tools (kb_client, db, and helpdesk stand for your application's
# existing clients; wire them up to your own backends)
@tool
def search_knowledge_base(query: str) -> str:
    """Search the internal knowledge base for relevant documentation.
    Use this when the user asks about product features, policies, or procedures."""
    results = kb_client.search(query, limit=5)
    return "\n".join([r["content"] for r in results])

@tool
def get_user_profile(user_id: str) -> dict:
    """Retrieve a user's profile including preferences, plan, and history.
    Use this when you need to personalize a response or check account details."""
    return db.get_user(user_id)

@tool
def create_support_ticket(title: str, description: str, priority: str = "medium") -> dict:
    """Create a support ticket in the helpdesk system.
    Priority must be low, medium, or high."""
    return helpdesk.create_ticket(title=title, description=description, priority=priority)

tools = [search_knowledge_base, get_user_profile, create_support_ticket]
```

LangChain's built-in memory handles conversation history within a session. For persistent memory that survives across sessions, integrate Adaptive Recall as a tool that the agent can call to store and retrieve long-term knowledge. This approach works with LangChain's agent architecture because the agent treats memory operations like any other tool call.
```python
@tool
def remember(content: str) -> str:
    """Store an important fact, preference, or decision in long-term memory.
    Use this when the user shares information that should be remembered
    across future conversations."""
    memory_client.store(content=content)
    return f"Stored in memory: {content}"

@tool
def recall_memories(query: str) -> str:
    """Retrieve relevant memories from previous conversations.
    Use this at the start of a conversation or when context from
    past interactions would help answer the current question."""
    memories = memory_client.recall(query=query, limit=10)
    if not memories:
        return "No relevant memories found."
    return "\n".join([f"- {m['content']}" for m in memories])

tools.extend([remember, recall_memories])
```

Create a prompt template that includes the system instructions, conversation history, and the agent scratchpad (where LangChain tracks the tool-calling loop). Then create the agent and wrap it in an AgentExecutor that manages the execution loop.
```python
# Create prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful AI assistant with access to tools and
persistent memory. At the start of each conversation, use recall_memories
to check for relevant context from previous sessions. When the user shares
important information, use remember to store it for future reference.
Be concise, accurate, and proactive about using tools when they would
help answer the user's question."""),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# Create agent and executor
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=10,
    handle_parsing_errors=True,
)
```

For assistants that need to answer questions from a specific knowledge base, add a retriever that provides grounding context. This can be a vector store retriever for document search, a knowledge graph retriever for structured facts, or both. The retriever can be integrated as a tool (the agent decides when to search) or as a pre-processing step (relevant documents are always included in context).
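The difference between the two integration modes can be sketched without any framework. Below, a toy keyword-overlap retriever stands in for a vector store; the corpus, the `agent_turn` routing heuristic, and the prompt format are hypothetical, for illustration only:

```python
import re

# Toy corpus standing in for an indexed knowledge base
DOCS = [
    "Refunds are available within 30 days of purchase.",
    "The Pro plan includes priority support and SSO.",
    "API rate limits are 100 requests per minute.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (vector-store stand-in)."""
    q = set(re.findall(r"[a-z]+", query.lower()))
    ranked = sorted(DOCS, key=lambda d: -len(q & set(re.findall(r"[a-z]+", d.lower()))))
    return ranked[:k]

# Option 1: retrieval as a tool — the agent decides when to search,
# so some turns never touch the knowledge base at all.
def agent_turn(user_input: str) -> str:
    if "refund" in user_input.lower() or "plan" in user_input.lower():
        context = retrieve(user_input)          # agent chose to search
        return f"(answer grounded in: {context[0]})"
    return "(answered from general knowledge)"  # agent skipped retrieval

# Option 2: retrieval as pre-processing — relevant documents are always
# prepended to the prompt before the model sees the question.
def preprocessed_prompt(user_input: str) -> str:
    context = "\n".join(retrieve(user_input))
    return f"Context:\n{context}\n\nQuestion: {user_input}"
```

The tool approach saves tokens on turns that need no grounding; the pre-processing approach guarantees grounding at the cost of retrieving on every turn. In LangChain terms, Option 1 is a `@tool`-wrapped retriever added to `tools`, while Option 2 populates a context variable in the prompt template before invoking the executor.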
```python
# Run the assistant
from langchain_core.messages import HumanMessage, AIMessage

chat_history = []

async def chat(user_input):
    result = await executor.ainvoke({
        "input": user_input,
        "chat_history": chat_history,
    })
    chat_history.append(HumanMessage(content=user_input))
    chat_history.append(AIMessage(content=result["output"]))
    return result["output"]
```

When to Use LangChain vs Building from Scratch
LangChain accelerates development when your assistant fits standard patterns: tool-using agents, retrieval-augmented Q&A, or multi-step chains. The framework handles the boilerplate of tool routing, context management, and provider abstraction, letting you focus on your tools, prompts, and business logic. For prototypes and MVPs, LangChain is typically the fastest path to a working assistant.
The trade-off appears in production. LangChain's abstractions can make debugging harder because errors propagate through several layers before surfacing. Performance tuning requires understanding the framework's internals, particularly around context management and token usage. Custom behavior that does not fit LangChain's patterns (unusual tool routing logic, non-standard context assembly, custom streaming behavior) often requires working against the framework rather than with it. Many teams start with LangChain for the speed advantage and migrate to direct SDK usage once their assistant's behavior is well-defined and they need more control.
Add persistent memory to your LangChain assistant. Adaptive Recall integrates as a set of tools that give your agent long-term knowledge, cognitive retrieval, and memory lifecycle management.
Get Started Free