A vector database is a database optimised for storing and querying high-dimensional vectors (embeddings) using similarity search rather than exact match. It returns the nearest results based on cosine similarity, dot product, or Euclidean distance, which is what makes RAG, semantic search, and AI-native AR workflows possible at scale.
A vector database is a database built for one specific job: storing high-dimensional numerical vectors (typically embeddings of 384 to 3,072 dimensions) and finding the ones most similar to a query vector. Instead of asking does this row equal that value, you ask which rows are closest to this point in vector space. Closeness is measured with cosine similarity, dot product, or Euclidean distance.
Relational databases struggle with this because B-tree indexes only work for one-dimensional ordered keys. Keyword search engines like Elasticsearch are excellent at lexical matching but do not understand that the bill was wrong and pricing dispute mean the same thing. A vector database closes that gap by indexing meaning rather than exact tokens, which is the prerequisite for any AI-native retrieval pattern.
Brute-force comparison of a query vector against millions of stored vectors is too slow for production. Vector databases use approximate nearest neighbour (ANN) indexes that give up a small amount of recall in exchange for very large speed gains.
You also choose a distance metric. Cosine similarity is the default for most embedding models because it ignores magnitude and focuses on direction. Dot product is faster on normalised vectors. Euclidean distance is rarely the right choice for modern text embeddings.
The market is now mature enough that the choice is rarely about capability, and more about operational fit.
For an AR or O2C team, the value of a vector database is not the technology itself, it is what it unlocks.
Vector databases look simple until you put them in production. A few things bite teams in the first 90 days.
The pragmatic path for most finance organisations: start with pgvector if you already run Postgres. You get vector search next to your customer master and invoice tables, with the same backup, access control, and compliance posture you already trust. Move to a dedicated vector database only when query latency, vector count, or specialised features (high-end hybrid search, very large scale) demand it.
The pitfalls that derail projects are almost always the same: dimension mismatch after a model upgrade, the wrong distance metric for the embedding model in use, missing metadata filtering so the system retrieves irrelevant chunks at scale, no monitoring of recall and latency over time, and underestimating the operational cost of backup and restore (which is materially harder than for SQL). Treat the vector database as production infrastructure from day one, not as a notebook experiment that happens to be live.
If you are doing anything beyond a single prompt to an LLM (such as RAG over your contracts, semantic search across dispute history, or agentic workflows that retrieve context before acting), yes. The vector database is what lets the AI find the right piece of your data before it answers. Without it, you are limited to whatever fits in the model's context window, which does not scale to a real enterprise document set.
Often yes. The pgvector extension adds vector columns and HNSW/IVF indexes to Postgres, and for many finance teams that is the right starting point. You keep vectors next to your structured data, you reuse existing backup and access controls, and you avoid a second system to operate. You typically only need a dedicated vector database when you outgrow Postgres on scale, latency, or need very advanced hybrid search.
Elasticsearch is built for keyword and full-text search using inverted indexes. It is excellent when the user's query shares words with the document. A vector database searches by meaning, so it finds documents that are semantically related even when the wording is completely different. Modern Elasticsearch does support vector search, and most vector databases now support keyword search, so the line is blurring. Hybrid search (both at once) is the production sweet spot.
HNSW gives lower query latency and higher recall but uses more memory and is slower to build. IVF uses less memory and builds faster but typically has slightly higher query latency for the same recall target. For most AR use cases (hundreds of thousands to a few million vectors, sub-100ms latency target), HNSW is the default choice. IVF or IVF plus product quantization becomes attractive at very large scale or when memory cost dominates.
Costs come from three places: storage of the vectors and index, query volume, and (for managed services) a base platform fee. As a rough order of magnitude, expect tens to low hundreds of euros per month for a few hundred thousand vectors on a managed service, scaling roughly linearly with vector count. Self-hosted pgvector or Qdrant on an existing cloud database can be effectively free at small scale beyond the underlying compute. Always benchmark with your real embedding dimensions before committing.
Not storing enough metadata alongside each vector. Without fields like customer ID, document type, region, and date, you cannot filter the search, and the database happily returns the globally most-similar chunk even when it belongs to a different customer or a superseded contract version. The fix is cheap if you do it at ingestion and very expensive if you discover it after a few million vectors are already indexed.