Do I Need a Dedicated Vector Database?
When an Extension Is Enough
For applications where vector search is a feature rather than the core product (adding semantic search to a documentation site, building a RAG pipeline for a chatbot, adding memory to an AI assistant), pgvector or a similar extension is usually sufficient. These applications have a few hundred thousand to a few million vectors, queries happen alongside other database operations, and the team benefits from managing one database instead of two.
pgvector gives you vector search with zero additional infrastructure. Your vectors, metadata, and application data share a single database, single backup strategy, single monitoring system, and single connection pool. Joins between vector search results and relational data are a single SQL query. There is no synchronization lag between your application database and a separate vector store. If you need to filter vector results by user ID, date range, or any other attribute, that is a WHERE clause in the same query rather than a separate API call with a filter parameter.
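As a concrete illustration of that point, here is a sketch of a single query that combines vector similarity with relational filters and a join. The schema (`documents`, `users`, column names) and the named parameters are hypothetical; `<=>` is pgvector's cosine-distance operator.

```sql
-- Hypothetical schema: documents(id, user_id, created_at, content, embedding vector(1536))
-- Top-5 similar documents for one user from the last 30 days, in one query.
SELECT d.id,
       d.content,
       u.name AS author,
       d.embedding <=> :query_embedding AS distance
FROM documents d
JOIN users u ON u.id = d.user_id
WHERE d.user_id = :user_id
  AND d.created_at > now() - interval '30 days'
ORDER BY d.embedding <=> :query_embedding
LIMIT 5;
```

With a separate vector store, the same result would require a filtered similarity query against one system followed by a lookup against the other, plus logic to keep the two consistent.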
The practical limit for pgvector on a reasonably sized instance (32 to 64 GB RAM) is 2 to 5 million 1,536-dimensional vectors with sub-10ms query latency. Most AI applications, especially those in the early to mid-growth stage, never exceed this range. Starting with pgvector and migrating later if you outgrow it is a sound strategy because the migration path is well-documented and the initial time savings are significant.
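A back-of-envelope calculation shows why that range lines up with a 32 to 64 GB instance. The sketch below assumes float32 storage and a rough estimate of HNSW graph overhead (the `m` link count and per-link cost are illustrative; real overhead varies by implementation and index parameters).

```python
def hnsw_memory_gb(n_vectors: int, dims: int = 1536,
                   bytes_per_float: int = 4, m: int = 16) -> float:
    """Rough in-memory HNSW footprint: raw vectors plus graph links.

    Assumes float32 vectors and ~2*m neighbor links (4 bytes each)
    per vector at the base layer; a rule of thumb, not a spec.
    """
    vector_bytes = n_vectors * dims * bytes_per_float
    graph_bytes = n_vectors * 2 * m * 4  # neighbor ids at the base layer
    return (vector_bytes + graph_bytes) / 1e9

# 5M 1536-dim vectors: ~30.7 GB of raw vectors plus ~0.6 GB of links,
# which is why 5M is roughly where a 64 GB instance stops being comfortable.
print(round(hnsw_memory_gb(5_000_000), 1))
```

The raw vectors dominate: at 1,536 dimensions, each float32 embedding is about 6 KB, so the vector payload alone crosses 30 GB at the 5 million mark.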
Beyond pgvector, other database extensions support vector search for different stacks. SQLite has sqlite-vss and sqlite-vec for embedded and mobile applications. MongoDB Atlas supports vector search with $vectorSearch aggregation stages. Redis has a vector similarity search module. DynamoDB does not support vector search natively, but you can pair it with OpenSearch for a serverless vector search layer. If your application already uses one of these databases, check whether a vector extension exists before adding a separate system.
When You Need a Dedicated Database
Scale beyond single-machine limits. If you need to store and query more than 10 million vectors at production latency, a single PostgreSQL instance cannot keep the HNSW index resident in memory. Dedicated vector databases like Qdrant and Pinecone distribute the index across multiple machines, maintaining low latency as the corpus grows.
High query throughput. If your application serves thousands of vector search queries per second, the CPU and memory resources required for vector search may compete with your other database workloads. A dedicated vector database isolates the search workload so it does not affect your application database's performance.
Advanced vector features. Product quantization, binary quantization, sparse vector support for hybrid search, multi-vector representations, and GPU-accelerated search are features that dedicated vector databases implement but pgvector does not (or implements partially). If these features are important for your use case, a dedicated database provides them out of the box.
Multi-region or multi-cloud deployment. If your application serves users globally and needs low-latency vector search in multiple regions, dedicated vector databases with built-in replication (Pinecone, Qdrant Cloud, Weaviate Cloud) handle this natively. Replicating PostgreSQL across regions is possible but significantly more complex to manage.
The Migration Question
A common concern is "what if we start with pgvector and outgrow it?" The migration from pgvector to a dedicated vector database is straightforward because the core data is the same: vectors, metadata, and the original text. You export vectors from PostgreSQL, load them into the new database, and update your query code to use the new client library. The vector search interface (embed query, find top-k, return results) is the same regardless of backend. The migration effort is typically a few days of engineering work, not a multi-month project.
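The shape of that migration is a streaming export/upsert loop. The batching helper below is the runnable core; the surrounding loop is illustrative only (`pg_cursor` and `new_db.upsert` are hypothetical names, not a real API).

```python
from itertools import islice
from typing import Iterable, Iterator

def batched(rows: Iterable, size: int) -> Iterator[list]:
    """Yield fixed-size batches so the export/upsert loop can stream
    millions of rows without holding them all in memory at once."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch

# Hypothetical migration loop (client names are illustrative):
#   for batch in batched(pg_cursor, 1000):   # SELECT id, embedding, metadata ...
#       new_db.upsert(batch)                 # bulk-load into the new vector store
print([len(b) for b in batched(range(2500), 1000)])  # -> [1000, 1000, 500]
```

Because the loop is stateless between batches, it can be checkpointed on the last exported id and resumed, which is what keeps the migration a few days of work rather than a project.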
The bigger risk is the opposite direction: starting with a dedicated vector database when you do not need one. You commit to a separate system that needs its own monitoring, backup strategy, scaling plan, and cost management. If your application also needs the vector results joined with relational data (which most do), you add synchronization logic between your application database and the vector database. This complexity is justified at scale, but at early stages it slows development and increases operational burden for no retrieval quality benefit.
The Decision Framework
Ask three questions. First, do you already run PostgreSQL (or another database with a vector extension)? If yes, start with the extension. The incremental cost is zero and the setup takes minutes. Second, will you exceed 5 million vectors in the next 12 months? If no, pgvector is sufficient for the foreseeable future. If yes, plan for a migration but start with pgvector anyway because you will have more information about your query patterns when the time comes to migrate. Third, is vector search your primary workload? If it is the main thing your system does (a search engine, a similarity matching service), start with a dedicated database. If it is one feature among many (adding search to a web app, memory for a chatbot), an extension is the right starting point.
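The three questions above can be encoded as a small decision function. The thresholds mirror the article's numbers and should be read as rules of thumb, not hard limits; the function name and return strings are illustrative.

```python
def choose_vector_store(has_db_with_extension: bool,
                        vectors_next_12mo: int,
                        search_is_primary_workload: bool) -> str:
    """Encode the three-question framework as rules of thumb."""
    if search_is_primary_workload:
        # A search engine or similarity-matching service: start dedicated.
        return "dedicated vector database"
    if has_db_with_extension:
        if vectors_next_12mo > 5_000_000:
            # Start with the extension anyway; migrate with real query data.
            return "extension now, plan migration"
        return "extension (e.g. pgvector)"
    # No extension available on the current stack.
    return "adopt a database with a vector extension"

print(choose_vector_store(True, 1_000_000, False))  # -> extension (e.g. pgvector)
```

Note the ordering: the primary-workload question short-circuits the others, because a search-first product justifies the dedicated system regardless of current scale.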
For most teams building AI features into existing applications, the answer is clear: use an extension on your current database, focus your engineering time on retrieval quality (embedding model selection, chunking, hybrid search), and revisit the infrastructure decision when you have concrete evidence that the extension is a bottleneck. Premature infrastructure optimization is one of the most common ways teams waste time on AI projects.
Adaptive Recall provides managed vector search alongside cognitive scoring and knowledge graphs. No vector database to choose, provision, or maintain.
Try It Free