MCP Server Setup and Integration
On This Page
- What MCP Is and Why It Exists
- MCP Architecture: The Three Primitives
- Transport Layer: stdio vs Streamable HTTP
- Building Your First MCP Server
- Connecting MCP Servers to AI Clients
- Security and Authentication
- Production Deployment
- Debugging and Troubleshooting
What MCP Is and Why It Exists
Before MCP, every AI application that needed to call external tools had to build its own integration layer. If you wanted Claude to query a database, you wrote a function calling wrapper. If you wanted GPT to search your codebase, you built a custom plugin. If you wanted to switch between AI providers, you rewrote the integration from scratch. The tool existed independently from the AI, but the glue code that connected them was tightly coupled to both the specific tool and the specific AI client.
MCP solves this by defining a standard protocol for the connection between AI clients and external tools. An MCP server exposes capabilities through three primitives: tools (actions the AI can invoke), resources (data the AI can read), and prompts (templates the AI can use). An MCP client, typically an AI assistant or IDE, discovers these capabilities at connection time and presents them to the language model as available actions. The model decides when to use them based on the conversation context.
Anthropic released MCP as an open specification in late 2024, and adoption has grown rapidly since. Claude Code, Claude Desktop, Cursor, Windsurf, Cline, and dozens of other AI clients now support MCP natively. The ecosystem has expanded from a handful of reference implementations to thousands of community servers covering databases, APIs, cloud services, developer tools, and domain-specific functionality. Building an MCP server is now one of the most effective ways to make any capability available to AI assistants.
The analogy that stuck is "USB-C for AI." Just as USB-C provides a universal physical and electrical interface for connecting devices to computers, MCP provides a universal protocol interface for connecting tools to AI. You build one server, and any MCP-compatible client can use it without modification. The server does not need to know which AI model is calling it, and the client does not need custom code for each server.
MCP Architecture: The Three Primitives
MCP organizes server capabilities into three distinct primitive types. Understanding the differences between them is essential for designing a server that clients can use effectively.
Tools
Tools are functions that the AI can invoke to perform actions or computations. When you register a tool with your MCP server, you provide a name, a description, and a JSON schema for the input parameters. The AI model reads this description and schema to decide when and how to call the tool. Each tool call sends the parameters to your server, your code executes the logic, and the result is returned to the model.
Tools are the most commonly used primitive because they map directly to actions: store a memory, run a database query, create a file, send a notification, fetch data from an API. The description you write for each tool is critically important because the AI model uses it to decide whether the tool is appropriate for the current task. A vague description leads to incorrect tool selection; a precise description ensures the model calls the right tool with the right parameters.
Resources
Resources are data that the AI can read without invoking an action. They function like GET endpoints in a REST API, providing information the model can incorporate into its context. Each resource has a URI, a name, and a MIME type. The AI client can list available resources and read their content to build context for the conversation.
Resources are useful for exposing configuration files, documentation, database schemas, system status, or any other read-only data that informs the model's decisions. Unlike tools, resources do not perform side effects. Reading a resource does not change any state; it simply returns data.
Prompts
Prompts are reusable templates that guide the AI's behavior for specific workflows. A prompt template includes a name, a description, optional parameters, and a predefined message structure. When a user or the AI invokes a prompt, the template is expanded with the provided parameters and added to the conversation context.
Prompts are less commonly used than tools and resources, but they are valuable for encoding complex workflows. A code review prompt might include instructions for checking security, performance, and style along with template slots for the file path and review criteria. A data analysis prompt might structure the model's approach to exploring a dataset. By packaging these workflows as prompts, you ensure consistent behavior across sessions and users.
Transport Layer: stdio vs Streamable HTTP
MCP supports two transport mechanisms for communication between clients and servers: stdio and Streamable HTTP. The choice of transport determines how the server is started, how messages are exchanged, and where the server can run.
The stdio transport runs the MCP server as a subprocess of the client. The client launches the server process, and all communication happens through standard input and output streams. This is the simplest setup: no network configuration, no authentication layer, no firewall rules. The server runs on the same machine as the client, starts and stops automatically, and has access to the local filesystem and environment. Most development-time MCP servers use stdio because it requires zero infrastructure.
Streamable HTTP runs the MCP server as a network service that the client connects to over HTTP. The client sends JSON-RPC requests to the server's endpoint, and the server responds with results. This transport supports remote deployment: the server can run on a different machine, in a container, or in the cloud. It also supports authentication (the client sends credentials with each request), load balancing (multiple server instances behind a proxy), and multi-user access (different clients connecting to the same server).
The original MCP specification also included a Server-Sent Events (SSE) transport, which was used in early implementations for streaming responses over HTTP. The specification has since consolidated around Streamable HTTP as the standard network transport, which handles streaming through standard HTTP mechanisms. If you encounter older documentation referencing SSE transport, note that it has been superseded.
For local development tools like file managers, git helpers, and code analyzers, stdio is the right choice. For shared services like databases, APIs, and memory systems, Streamable HTTP is the right choice because it allows the server to run independently and serve multiple clients. Adaptive Recall uses Streamable HTTP because the memory service needs to persist between sessions and serve multiple users, which is not possible with a stdio transport that starts and stops with the client.
Building Your First MCP Server
Building an MCP server starts with choosing a language and SDK. Official SDKs are maintained for Python, TypeScript, and a growing set of other languages, and community SDKs fill in the remaining gaps. The Python and TypeScript SDKs are the most mature and the best documented, so they are the recommended starting points.
A minimal MCP server in Python requires only a few lines. You import the MCP library, create a server instance, register your tools with their descriptions and schemas, and start the server on your chosen transport. The SDK handles protocol negotiation, message parsing, capability advertisement, and error formatting. Your code focuses on the actual tool logic.
```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-tools")

@mcp.tool()
def search_docs(query: str, limit: int = 5) -> str:
    """Search the documentation for relevant articles.

    Args:
        query: The search query string
        limit: Maximum number of results to return
    """
    results = your_search_function(query, limit)
    return format_results(results)

if __name__ == "__main__":
    mcp.run()
```

The TypeScript SDK follows a similar pattern but uses the standard MCP server class with explicit tool registration. You create a server, define tool handlers with their JSON schemas, and connect the server to a transport.
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "my-tools",
  version: "1.0.0"
});

server.tool(
  "search_docs",
  "Search the documentation for relevant articles",
  { query: z.string(), limit: z.number().default(5) },
  async ({ query, limit }) => {
    const results = await yourSearchFunction(query, limit);
    return { content: [{ type: "text", text: formatResults(results) }] };
  }
);

const transport = new StdioServerTransport();
await server.connect(transport);
```

In both cases, the actual tool logic (your_search_function, yourSearchFunction) is your code, completely independent of MCP. The protocol layer just wraps it in a standard interface that any MCP client can discover and invoke. This separation means you can start with existing functions and expose them through MCP without rewriting anything.
Connecting MCP Servers to AI Clients
Once your server is built, you need to tell the AI client how to connect to it. Each client has its own configuration format, but they all need the same information: the server name, the transport type, and the connection details (command and arguments for stdio, or URL and headers for HTTP).
Claude Code uses a JSON configuration file. For project-specific servers, create a .mcp.json file in the project root. For servers that should be available across all projects, add them at user scope (stored in ~/.claude.json). The configuration specifies the server type, the command to run (for stdio) or the URL to connect to (for HTTP), and any required environment variables or headers.
```json
{
  "mcpServers": {
    "adaptive-recall": {
      "type": "http",
      "url": "https://mcp.adaptiverecall.com/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    },
    "local-files": {
      "command": "python",
      "args": ["./tools/file_server.py"]
    }
  }
}
```

Cursor uses a similar JSON format in its settings. Windsurf, Cline, and other editors each have their own configuration paths, but the structure is consistent: server name, transport details, and credentials. The MCP specification ensures that any server you build works with any client, so you write the server once and configure it into each client as needed.
When a client connects to your server, it sends an initialization request. Your server responds with its capabilities: the list of available tools, resources, and prompts. The client presents these to the language model, which can then invoke them during the conversation. Tool calls flow from the model through the client to your server, and results flow back the same path.
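That handshake can be sketched as JSON-RPC messages. The shapes below follow the MCP specification's initialization flow; the protocol version string and client name are illustrative values:

```python
import json

# JSON-RPC initialize request a client sends on connect (shape per the
# MCP spec; the protocolVersion and clientInfo values are illustrative).
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "1.0.0"},
    },
}

# After initialization completes, the client asks for the tool list,
# which the server answers with each tool's name, description, and schema.
list_tools_request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}

print(json.dumps(initialize_request, indent=2))
```

Every subsequent tool call is another JSON-RPC request (`tools/call`) flowing over the same transport, which is why the transport choice is invisible to the model.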
Security and Authentication
MCP servers that run over stdio inherit the security context of the local user. The server process has the same filesystem access, network access, and credentials as the user who started it. This is appropriate for local development tools but insufficient for shared or remote servers.
For HTTP-based servers, MCP supports OAuth 2.1 as the standard authentication mechanism. The client obtains a token through the OAuth flow and includes it in the Authorization header of each request. The server validates the token and uses the associated identity to scope access. This allows a single server to serve multiple users with different permissions, enforce rate limits, and audit access.
Even without full OAuth, you can secure an HTTP server with API keys. The client sends the key in the Authorization header, and the server validates it against a stored set of keys. This is simpler than OAuth but lacks features like token expiration, refresh, and per-user scoping. For single-user or small-team deployments, API key authentication is sufficient. For multi-tenant or enterprise deployments, OAuth is the right choice.
Beyond authentication, consider what your server exposes. A tool that executes SQL queries against a production database needs careful input validation and query scoping. A tool that reads files from the filesystem needs path validation to prevent directory traversal. A tool that calls external APIs needs rate limiting to prevent abuse. These are standard security concerns that apply regardless of whether the caller is a human user or an AI model, but the attack surface is different because AI models can make many rapid, programmatic calls that a human user would not.
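For the filesystem case, directory-traversal protection reduces to resolving the requested path and checking it stays inside a sandbox. The sandbox root below is a hypothetical example:

```python
from pathlib import Path

ALLOWED_ROOT = Path("/srv/mcp-files")  # hypothetical sandbox directory

def resolve_safe(requested: str) -> Path:
    """Resolve a requested path and reject escapes from the sandbox root."""
    candidate = (ALLOWED_ROOT / requested).resolve()
    # ".." segments and absolute paths resolve outside the root and fail here.
    if not candidate.is_relative_to(ALLOWED_ROOT.resolve()):
        raise PermissionError(f"path escapes sandbox: {requested}")
    return candidate
```

Any file-reading tool then calls `resolve_safe` before opening anything, so a model-supplied `../../etc/passwd` is rejected rather than silently served.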
Production Deployment
Moving an MCP server from development to production involves the same considerations as any networked service: containerization, health monitoring, scaling, logging, and error handling. A Streamable HTTP server can be written to run statelessly, with all durable state kept in the storage backend, which simplifies horizontal scaling: you can run multiple server instances behind a load balancer and route requests to any instance.
Containerization with Docker is the standard approach. Your Dockerfile installs dependencies, copies your server code, and sets the entrypoint to start the server on the HTTP transport. You expose the server port and add a health check endpoint that the orchestrator can use to verify the server is running. Environment variables control configuration: API keys, database connection strings, model endpoints, and feature flags.
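A sketch of such a Dockerfile follows; the file names, port, health endpoint, and entrypoint are illustrative assumptions, not conventions the MCP spec requires:

```dockerfile
# Minimal sketch; adjust file names, port, and entrypoint to your server.
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# The server is assumed to listen on 8000 and expose a /health endpoint.
EXPOSE 8000
HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"
CMD ["python", "server.py"]
```

Secrets like API keys and database URLs are injected as environment variables at deploy time rather than baked into the image.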
For memory servers like Adaptive Recall, the deployment also needs persistent storage. The memory store must survive container restarts, which means either connecting to an external database (the recommended approach for production) or mounting a persistent volume. The server itself remains stateless; all state lives in the storage backend. This separation allows you to scale the server independently from the storage layer.
Monitoring an MCP server means tracking request latency, error rates, tool invocation counts, and resource consumption. Standard observability tools (Prometheus, Grafana, Datadog, CloudWatch) work because MCP servers are ordinary HTTP services. Add structured logging to each tool handler so you can trace requests from client through server to backend and back.
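One lightweight way to get that structured logging is to wrap each tool handler in a decorator that emits a JSON log line per invocation. The field names below are illustrative:

```python
import functools
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("mcp.tools")

def traced(tool):
    """Wrap a tool handler with structured timing and status logs."""
    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"
        try:
            result = tool(*args, **kwargs)
            status = "ok"
            return result
        finally:
            log.info(json.dumps({
                "tool": tool.__name__,
                "status": status,
                "duration_ms": round((time.perf_counter() - start) * 1000, 2),
            }))
    return wrapper

@traced
def search_docs(query: str) -> str:
    return f"results for {query!r}"
```

Because the log lines are JSON, they can be shipped straight into the observability stack and joined with client-side traces by tool name and timestamp.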
Debugging and Troubleshooting
The most common MCP integration problems are connection failures, tool discovery issues, and parameter mismatches. Connection failures happen when the client cannot reach the server: wrong URL, missing credentials, network configuration, or the server process failing to start. Tool discovery issues happen when the server connects but the client does not show the expected tools: version mismatches, capability negotiation failures, or tools failing to register during initialization. Parameter mismatches happen when the model sends parameters that do not match the tool's schema: missing required fields, wrong types, or extra fields that the schema does not accept.
The MCP Inspector is the primary debugging tool. It connects to your server, displays the registered tools, resources, and prompts, and lets you invoke tools manually with test parameters. This isolates your server from the AI client, so you can verify that the server works correctly before debugging the client integration. If a tool works in the Inspector but fails in the client, the problem is in the client configuration or the model's tool invocation, not in your server.
For stdio servers, the most common startup problem is environment configuration. The server process inherits environment variables from the client, which may not include your PATH, virtual environment, or required API keys. Test your server by running it manually from the command line with the same command and arguments that the client configuration specifies. If it works from the command line but not from the client, the issue is likely a missing environment variable or path.
For HTTP servers, use curl or any HTTP client to test the MCP endpoint directly. Send the initialization request and verify that the server responds with its capability list. Then send a tool invocation request and verify the response. This confirms that the server is reachable, authenticated, and functional before you involve the AI client in the debugging process.