Embeddings are numerical vector representations of text, images or audio that capture semantic meaning, so that items with similar meaning sit close together in mathematical space and can be searched, compared and classified by what they mean rather than the exact words they use.
An embedding is a list of numbers that represents the meaning of a piece of content. Feed the sentence "customer disputes invoice due to short shipment" into an embedding model and you receive back a vector of perhaps 1536 floating-point numbers. Feed in "buyer raised a claim for missing units" and you receive a different vector, but one that sits very close to the first in mathematical space. The two sentences share no keywords, yet an embedding model recognises that they describe the same business event.
This is the breakthrough that makes modern AI useful in finance. Traditional keyword search treats "remittance" and "payment advice" as unrelated strings. Embeddings treat them as near-identical concepts. That semantic understanding is the connective tissue beneath nearly every agentic AR capability, from email routing to dispute resolution to credit decisioning.
An embedding model is a neural network, usually a transformer variant, that has been trained on enormous volumes of text to predict context. During training it learns that words appearing in similar contexts should produce similar internal representations. At inference time the model condenses those representations, typically by pooling the per-token outputs of its final layer, into a single fixed-length vector.
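The dimensions are not human-readable. You cannot point to position 412 and say it represents urgency. What matters is the geometry of the full vector. Similarity is usually measured with cosine similarity, which returns a score between minus one and one, or with the dot product, which is unbounded in general but gives the same score as cosine similarity when the vectors are normalised to unit length. Thresholds depend on the model: a cosine score above roughly 0.8 often indicates strong semantic overlap, but the cut-off should be calibrated on your own data.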
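The two similarity measures can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the three-dimensional vectors are invented toy values (real embeddings have hundreds or thousands of dimensions), and the variable names are hypothetical.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)

# Toy vectors invented for illustration; they do not come from a real model.
short_shipment = [0.9, 0.1, 0.3]
missing_units = [0.8, 0.2, 0.4]
payment_confirmation = [0.1, 0.9, 0.2]

related = cosine_similarity(short_shipment, missing_units)
unrelated = cosine_similarity(short_shipment, payment_confirmation)
print(f"related pair:   {related:.3f}")
print(f"unrelated pair: {unrelated:.3f}")
```

Cosine similarity ignores vector length, so doubling every value in a vector leaves the score unchanged, whereas the raw dot product doubles. That is why the two measures only agree once vectors are normalised.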
Common dimensionalities are 384, 768, 1024, 1536 and 3072. Higher dimensions can capture more nuance but cost more to store and search. The same input text produces the same vector from the same model version (hosted APIs may differ in the last few decimal places), which makes embeddings effectively deterministic and cacheable.
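Because the same text and model yield the same vector, results can be cached rather than recomputed. The sketch below is one possible shape for such a cache, keyed on the model name plus a hash of the exact input text. The class name EmbeddingCache and the stand-in fake_embed function are hypothetical; in practice embed_fn would wrap a call to whichever embedding model is in use.

```python
import hashlib

class EmbeddingCache:
    """In-memory cache keyed by model name plus a hash of the exact input text."""

    def __init__(self, model_name, embed_fn):
        self.model_name = model_name
        self.embed_fn = embed_fn  # callable: text -> list of floats
        self._store = {}
        self.misses = 0  # number of times the model actually had to be called

    def _key(self, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        # Including the model name keeps vectors from different models apart.
        return f"{self.model_name}:{digest}"

    def get(self, text):
        key = self._key(text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self.embed_fn(text)
        return self._store[key]

def fake_embed(text):
    # Stand-in for a real model call: deterministic, but carries no meaning.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:8]]

cache = EmbeddingCache("demo-model", fake_embed)
first = cache.get("customer disputes invoice due to short shipment")
second = cache.get("customer disputes invoice due to short shipment")
print(first == second, cache.misses)
```

Keying on the exact text means that even a whitespace change produces a new entry, so any normalisation should happen before the lookup.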
Finance teams sit on mountains of unstructured language. Master service agreements, dunning correspondence, dispute case notes, remittance advice, credit memos and customer emails all carry critical information that keyword systems struggle to navigate. Embeddings change the economics of working with that text.
They are the retrieval engine inside RAG architectures, the matching layer behind "find me similar disputes" queries, and the routing logic that decides whether an incoming email is a payment confirmation, a short-pay justification or a logistics complaint. Without embeddings, every AI-native AR feature collapses back to brittle rules and regex patterns.
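Email routing of this kind can be sketched as a nearest-prototype lookup: compare the incoming email's vector against one reference vector per category and pick the closest. Everything in the example is an assumption made for illustration, including the three-dimensional toy vectors, the route names, the 0.6 threshold and the manual_review fallback. A real system would derive prototypes from embedded, labelled emails.

```python
import math

def cosine(a, b):
    """Cosine similarity; assumes non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical prototype vectors, one per route (toy values, not model output).
ROUTE_PROTOTYPES = {
    "payment_confirmation": [0.9, 0.1, 0.1],
    "short_pay_justification": [0.1, 0.9, 0.2],
    "logistics_complaint": [0.1, 0.2, 0.9],
}

def route_email(email_vector, prototypes=ROUTE_PROTOTYPES, min_score=0.6):
    """Return (route, score); fall back to manual review when nothing is close."""
    best_label, best_score = max(
        ((label, cosine(email_vector, proto)) for label, proto in prototypes.items()),
        key=lambda pair: pair[1],
    )
    if best_score < min_score:
        return "manual_review", best_score
    return best_label, best_score

label, score = route_email([0.2, 0.8, 0.3])
print(label, round(score, 3))
```

The fallback matters as much as the match: an email that resembles no category should reach a person rather than be forced into the least-bad route.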
Several concrete patterns dominate AR deployments today.
Several model families dominate enterprise deployments in 2026. OpenAI text-embedding-3-small produces 1536-dimensional vectors and text-embedding-3-large produces 3072-dimensional vectors, both with strong general-purpose performance. Cohere Embed v3 is widely used for multilingual workloads. Voyage AI offers voyage-3 and the domain-specialised voyage-finance-2, which is tuned on financial language and often outperforms general models on contract and credit text. Open-source families including BGE, E5 and GTE allow on-premise deployment for organisations that cannot send finance data to a hosted API.
Cost is one of the most attractive properties of embeddings. Pricing typically sits between 0.0001 and 0.001 euros per 1000 tokens, often orders of magnitude cheaper than LLM inference, although the exact gap depends on the models being compared. A finance organisation can often embed a full year of customer correspondence and a complete contract repository for a few hundred euros. The vectors themselves are then stored in a vector store, such as PostgreSQL with the pgvector extension or a managed service, ready for fast retrieval.
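The arithmetic behind an estimate like this is simple enough to check by hand. The sketch below uses the price range quoted above; the corpus volumes (number of emails and contracts, average tokens per document) are hypothetical figures chosen only to show the calculation.

```python
def embedding_cost_eur(total_tokens, price_per_1k_tokens_eur):
    """One-off cost of embedding a corpus at a flat per-1000-token price."""
    return total_tokens / 1000 * price_per_1k_tokens_eur

# Hypothetical corpus volumes, for illustration only.
emails = 500_000
avg_tokens_per_email = 400
contracts = 20_000
avg_tokens_per_contract = 15_000

total_tokens = emails * avg_tokens_per_email + contracts * avg_tokens_per_contract

low_eur = embedding_cost_eur(total_tokens, 0.0001)   # cheapest quoted price
high_eur = embedding_cost_eur(total_tokens, 0.001)   # most expensive quoted price
print(f"{total_tokens:,} tokens: {low_eur:,.0f} to {high_eur:,.0f} euros")
```

With these assumed volumes the corpus is 500 million tokens, which lands at about 50 euros at the low end of the price range and about 500 euros at the high end.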
Embeddings look simple until production exposes the edges. Five issues recur.
Done well, embeddings become the quiet substrate beneath every agentic AR workflow. Done badly, they produce a search engine that feels worse than the keyword system it replaced.
An embedding is a list of numbers that represents the meaning of a piece of text. Two sentences with similar meaning produce two lists of numbers that are mathematically close together, which lets software find related content based on what it means rather than the exact words used.
Keyword search matches strings. If a customer writes "short shipment" and your knowledge base uses the phrase "missing units", keyword search returns nothing. Embeddings recognise that both phrases describe the same concept and return the relevant result, because both map to similar regions of vector space.
For general English and multilingual workloads, OpenAI text-embedding-3-small or Cohere Embed v3 are sensible defaults. For finance-heavy content such as contracts and credit memos, Voyage AI voyage-finance-2 often outperforms general models. Open-source options like BGE work well for on-premise deployments where data cannot leave the network.
Embedding pricing typically falls between 0.0001 and 0.001 euros per 1000 tokens. A full year of customer correspondence and a complete contract repository can usually be embedded for a few hundred euros. The ongoing cost is dominated by storage and re-embedding when documents change, not by the initial vectorisation.
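The storage side can be estimated from the vector count and dimensionality. This is a rough sketch that assumes each value is stored as a 4-byte float and counts only the raw vectors. Index structures, metadata and replication add to the figure, and the one-million-chunk corpus is a hypothetical example.

```python
def raw_vector_storage_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw payload only: excludes index structures, metadata and replication."""
    return num_vectors * dimensions * bytes_per_value

# Hypothetical corpus of one million chunks at three common dimensionalities.
for dims in (384, 1536, 3072):
    size_gb = raw_vector_storage_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dimensions: {size_gb:.2f} GB")
```

Storage grows linearly with dimensionality, so moving from 1536 to 3072 dimensions doubles the raw footprint for the same number of chunks.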
Embeddings live in vector stores such as pgvector (a PostgreSQL extension), Pinecone, Weaviate or Chroma. These systems index vectors using approximate-nearest-neighbour algorithms so that a similarity search across millions of vectors typically returns results in tens of milliseconds. The vector database is the retrieval engine behind every RAG-based AR copilot.
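To see what an approximate-nearest-neighbour index is approximating, it helps to look at the exact version: score every stored vector against the query and keep the top k. The sketch below is that brute-force baseline, not how any of the named databases work internally. The document IDs and toy vectors are invented for illustration.

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity; assumes non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, indexed, k=3):
    """Exact search: score every (doc_id, vector) pair, return the k best."""
    scored = ((doc_id, cosine(query, vector)) for doc_id, vector in indexed)
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Hypothetical dispute notes with invented toy vectors.
indexed = [
    ("dispute-001", [0.9, 0.1, 0.3]),
    ("dispute-002", [0.1, 0.9, 0.2]),
    ("dispute-003", [0.8, 0.2, 0.4]),
]

for doc_id, score in top_k([0.85, 0.15, 0.35], indexed, k=2):
    print(doc_id, round(score, 3))
```

Exact search costs one comparison per stored vector, which is fine for thousands of documents and too slow for millions. Approximate indexes trade a small amount of recall for a large reduction in the number of comparisons.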
The most common mistake is mixing dimensions and models inside the same index. Vectors from different embedding models live in different mathematical spaces and cannot be compared meaningfully. The second most common mistake is poor chunking, where documents are split into pieces that are either too small to carry context or too large to be specific. Both issues silently degrade retrieval quality without producing obvious errors.
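Because mixed indexes fail silently, one defensive option is to make the index itself refuse vectors that do not match its declared model and dimensionality. The sketch below is a hypothetical in-memory wrapper (the GuardedIndex name and its interface are assumptions, not part of any vector database) that turns the silent failure into a loud one.

```python
class GuardedIndex:
    """Toy in-memory index that accepts vectors from exactly one model."""

    def __init__(self, model_name, dimensions):
        self.model_name = model_name
        self.dimensions = dimensions
        self._vectors = {}

    def add(self, doc_id, vector, model_name):
        # Same dimensionality is not enough: two models can share a size
        # and still occupy unrelated vector spaces, so check the name too.
        if model_name != self.model_name:
            raise ValueError(
                f"index holds {self.model_name!r} vectors, got {model_name!r}"
            )
        if len(vector) != self.dimensions:
            raise ValueError(
                f"expected {self.dimensions} dimensions, got {len(vector)}"
            )
        self._vectors[doc_id] = list(vector)

    def __len__(self):
        return len(self._vectors)

index = GuardedIndex("model-a", dimensions=3)
index.add("contract-17", [0.1, 0.2, 0.3], model_name="model-a")

try:
    index.add("contract-18", [0.1, 0.2, 0.3], model_name="model-b")
except ValueError as error:
    print("rejected:", error)
```

Storing the model name alongside each vector makes the same check possible in a real vector store, and makes a later model migration easier to audit.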