sudo_sid

Befriending Vector Databases · Part 1

Part 1: What Is a Vector Database and What Options Are Available?

May 7, 2026 · 5 min read

Vector Database · Semantic Search · RAG · AI Infrastructure

Large language models are great at generating text, but they do not automatically know your private documents, product catalog, support tickets, codebase, or latest business data. To make AI applications useful, we need a way to retrieve the right context at the right moment. That is where vector databases come in.

A vector database stores data as embeddings: arrays of numbers that represent the meaning of text, images, audio, video, users, products, or events. Instead of asking "which document contains this exact keyword?", a vector database asks "which items are closest in meaning to this query?"
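To make "closest in meaning" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The three-dimensional vectors below are hand-written for illustration and are not the output of any real embedding model; the `closest` helper is likewise a hypothetical name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example.
# Real models produce hundreds or thousands of dimensions.
EMBEDDINGS = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "Steps to recover account access": [0.8, 0.2, 0.1],
    "Quarterly revenue report": [0.0, 0.1, 0.9],
}


def closest(query_vector, embeddings=EMBEDDINGS):
    """Return the stored text whose embedding is most similar to the query."""
    return max(
        embeddings,
        key=lambda text: cosine_similarity(query_vector, embeddings[text]),
    )


# Pretend this is the embedding of "I forgot my login".
query = [0.7, 0.3, 0.1]
best = closest(query)
```

The query shares no keyword with "Steps to recover account access", yet that entry is the nearest neighbor because the vectors point in a similar direction.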

This enables semantic search, recommendations, duplicate detection, anomaly detection, personalization, and retrieval-augmented generation (RAG).

Under the hood, vector databases usually combine several capabilities:

  • vector storage
  • similarity search
  • metadata filtering
  • approximate nearest-neighbor indexing
  • hybrid keyword + vector search
  • replication and access control
  • monitoring and production APIs
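Of the capabilities above, hybrid keyword + vector search is the least self-explanatory. One widely used way to merge a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below assumes each retriever has already returned an ordered list of document ids; the ids and the `k=60` constant are illustrative assumptions, and individual products may fuse results differently.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. The constant k dampens the effect of top ranks;
    60 is a commonly used default and is an assumption here.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two retrievers for the same query.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever still appear further down.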

The core query pattern is straightforward:

  1. Convert a query into an embedding.
  2. Compare it to stored embeddings.
  3. Retrieve nearest results.
  4. Optionally rerank.
  5. Pass context to your app or LLM.
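The five steps above can be sketched end to end in plain Python. Everything here is a stand-in: `embed` counts words from a tiny fixed vocabulary instead of calling a real embedding model, the search is brute force rather than an approximate index, and `rerank` is a placeholder for a proper reranking model.

```python
import math

# Hypothetical vocabulary used by the toy embedding function.
VOCAB = ["refund", "invoice", "password", "login", "shipping"]


def embed(text):
    """Toy embedding: how often each vocabulary word occurs in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


DOCS = [
    "reset your password from the login page",
    "request a refund for a duplicate invoice",
    "shipping times for international orders",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]  # the stored embeddings


def retrieve(query, k=2):
    q = embed(query)                                        # 1. embed the query
    scored = [(cosine(q, vec), doc) for doc, vec in INDEX]  # 2. compare
    scored.sort(reverse=True)                               # 3. nearest first
    return [doc for _, doc in scored[:k]]


def rerank(query, docs):
    """4. Optional rerank: placeholder that orders by shared words."""
    q_words = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q_words & set(d.split())), reverse=True)


def build_context(query, k=2):
    """5. Join the retrieved passages into context for an app or LLM prompt."""
    return "\n".join(rerank(query, retrieve(query, k)))


context = build_context("login password problem")
```

A production system would swap each piece for a real component, but the shape of the pipeline stays the same.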

The market has become crowded because "vector database" now means several things. Some products are purpose-built vector-native databases. Others are search engines or traditional databases that added vector search. Both can be excellent, depending on where your data lives and what you are building.

The Market Map

Pinecone is a managed, production-grade vector service with low operational burden. It is a strong fit for SaaS RAG apps, multitenant retrieval, and teams that want infrastructure handled for them.

Weaviate is an open-source, AI-native database (cloud or self-hosted) that works well for semantic + hybrid search, object-plus-vector modeling, and flexible deployment paths.

Milvus, and its managed counterpart Zilliz Cloud, are built for large-scale vector infrastructure, especially when you’re aiming for very high throughput or massive collections and you’re comfortable operating distributed systems (often in Kubernetes-heavy environments).

Qdrant stands out in metadata-rich retrieval, with strong filtering, hybrid queries, and developer-friendly APIs. It is a good fit for product catalogs, recommendations, and filtered RAG.

pgvector is often the best starting point when your data already lives in PostgreSQL, because you get vectors alongside relational data with SQL, joins, transactions, and familiar operational workflows.

MongoDB Atlas Vector Search is a strong option for document-oriented applications where embeddings belong next to JSON documents and you want vector search + full-text search + filters in one workflow.

Elasticsearch (Elastic) is practical when you already run Elastic for search or observability and want hybrid retrieval (dense-vector kNN search combined with classic keyword scoring) backed by mature search tooling.

Amazon OpenSearch Service is an AWS-native option for semantic and hybrid search that can fit neatly when you’re already in the AWS ecosystem and have existing OpenSearch pipelines.

Azure AI Search is often the hub for Azure-based RAG, offering enterprise security, integrated ingestion patterns, and hybrid retrieval designed for production.

Google Vertex AI Vector Search is a high-scale GCP option based on ScaNN, and it’s worth considering for large retrieval systems and recommendation workloads.

AlloyDB AI is PostgreSQL-compatible on Google Cloud with pgvector compatibility and ScaNN acceleration, which is useful when you want SQL, operational data, and high-performance retrieval together.

Redis is compelling for ultra-low-latency retrieval close to application state, which suits session memory, LLM caching, and real-time recommendations.

Vespa is an advanced search/recommendation engine where vector retrieval is one piece of a larger relevance and ranking system with deep control.

Chroma is a friendly default for local development, prototypes, and smaller agent/RAG systems where simplicity matters more than heavy ops.

LanceDB is well suited to multimodal datasets and AI data workflows such as training-data curation, versioned datasets, and retrieval across text, image, audio, and video in a more “lakehouse-like” way.

How To Choose

If your product data already lives in PostgreSQL, start with pgvector or AlloyDB AI before adding another system.

If you want a fully managed vector-native service, look at Pinecone. If open source and deployment flexibility matter, compare Weaviate, Qdrant, and Milvus.

If metadata filtering drives retrieval quality, pay close attention to Qdrant, Postgres/pgvector, and mature search engines.
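One reason filtering matters so much is the order in which it is applied. The sketch below contrasts filtering after the nearest-neighbor step with filtering before it, using a hypothetical four-item catalog and a brute-force search; real engines implement filtered search in more sophisticated ways, so treat this only as an illustration of the failure mode.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical catalog entries: (id, embedding, metadata).
ITEMS = [
    ("shoe-1", [0.9, 0.1], {"in_stock": False}),
    ("shoe-2", [0.8, 0.2], {"in_stock": False}),
    ("shoe-3", [0.7, 0.3], {"in_stock": True}),
    ("hat-1", [0.1, 0.9], {"in_stock": True}),
]


def post_filter_search(query, predicate, k):
    """Take the k nearest items first, then drop those failing the filter."""
    ranked = sorted(ITEMS, key=lambda item: dot(query, item[1]), reverse=True)
    return [item_id for item_id, _, meta in ranked[:k] if predicate(meta)]


def pre_filter_search(query, predicate, k):
    """Apply the filter first, then take the k nearest of what remains."""
    allowed = [item for item in ITEMS if predicate(item[2])]
    ranked = sorted(allowed, key=lambda item: dot(query, item[1]), reverse=True)
    return [item_id for item_id, _, _ in ranked[:k]]


def in_stock(meta):
    return meta["in_stock"]


query = [1.0, 0.0]  # pretend embedding of "running shoes"
after = post_filter_search(query, in_stock, k=2)
before = pre_filter_search(query, in_stock, k=2)
```

With post-filtering, the two nearest items are both out of stock, so the query returns nothing even though a relevant in-stock shoe exists. Filtering first avoids that, which is why it is worth checking how a candidate system combines filters with vector search.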

If you already run search infrastructure, vector search inside Elasticsearch, OpenSearch, Azure AI Search, or Vespa may be more practical than introducing a separate vector database.

If you are building an Azure enterprise RAG app, Azure AI Search is often the natural center of gravity. If you are deep in AWS, OpenSearch plus Bedrock or SageMaker can fit neatly. If you are on GCP and need very large-scale retrieval or recommendations, Vertex AI Vector Search deserves a close look.

For prototypes, agent memory, and fast local development, Chroma is quick and friendly. For multimodal datasets and AI data engineering, LanceDB is less "just a vector index" and more a storage layer for evolving AI datasets. For very low-latency online workloads, Redis can be compelling when vector search sits close to cached application state.

The Most Important Point

A vector database will not magically make AI accurate. Retrieval quality depends on chunking, embedding model choice, metadata design, hybrid retrieval, reranking, evaluation, and observability.
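Evaluation is the item on that list most often skipped, and it does not need heavy tooling to start. A minimal sketch, assuming you have a handful of test queries with hand-labeled relevant document ids (the ids below are made up), is to track recall@k whenever you change chunking, embedding models, or index settings.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def mean_recall_at_k(runs, k):
    """Average recall@k over (retrieved, relevant) pairs, one per test query."""
    return sum(recall_at_k(ret, rel, k) for ret, rel in runs) / len(runs)


# Hypothetical results for two labeled test queries.
runs = [
    (["d1", "d7", "d3"], {"d1", "d3"}),  # both relevant docs are in the top 3
    (["d9", "d2", "d5"], {"d4"}),        # the only relevant doc was missed
]

score_at_3 = mean_recall_at_k(runs, k=3)
score_at_2 = mean_recall_at_k(runs, k=2)
```

Recall@k only says whether the right documents were retrieved, not whether the final answer was good, so it complements rather than replaces end-to-end checks.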

The best system is not always the one with the fastest benchmark. It is the one that fits your data gravity, scale, query patterns, compliance needs, team skills, and cost model.

The market is converging from both directions: traditional databases are adding vector search, while vector-native systems are adding full-text search, reranking, hosted embeddings, and agent-friendly features. In practice, the winner is rarely "the best vector database in the abstract." It is the one that makes your retrieval pipeline reliable, understandable, and easy to operate.