
MongoDB as Your Vector Database: Python Guide to Semantic & Hybrid Search

Harsh Soni

Oct 16, 2025 | 4 mins read

MongoDB Atlas now doubles as a vector database. You can store embeddings alongside your documents, run $vectorSearch for semantic retrieval, blend it with full-text for hybrid search, and scale on Search Nodes—no extra system to run. Recent updates add View support (GA) and built-in smarts via the Voyage AI acquisition.

Why MongoDB for vectors?

  • One place for app + AI data: Keep embeddings in the same collection as your source records, then query with $vectorSearch (ANN/ENN). Less glue code, fewer moving parts. 

  • Hybrid search: Combine BM25 full-text with vectors (RRF, semantic boosting) to improve relevance—best of both worlds. 

  • Scales cleanly: Atlas Search Nodes isolate capacity for search/vector workloads; multi-region options and big perf gains reported. 

  • Fresh in 2025: View support GA for Atlas Search/Vector Search—pre-shape and pre-filter data before indexing. 

  • Better embeddings: MongoDB acquired Voyage AI; models (text, multimodal, rerankers) integrate into Atlas to boost retrieval quality and cost-efficiency.

What you can build (fast)

  • Semantic search over product docs, tickets, logs, or knowledge bases. 

  • RAG chatbots that ground LLM answers on your collections. 

  • Hybrid search experiences (keyword + meaning) with tunable weights.

Quick start (Python): index → embed → query

```python
# pip install "pymongo>=4.7"  # plus your embedding library
import os

from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

ATLAS_URI = os.getenv("ATLAS_URI")
client = MongoClient(ATLAS_URI)
col = client.kb.articles
```
# 1) Create a Vector Search index (embedding: 1024 dims, cosine)
```python
vector_index = SearchIndexModel(
    name="vec_articles",
    type="vectorSearch",
    definition={
        "fields": [
            {"type": "vector", "path": "embedding", "numDimensions": 1024, "similarity": "cosine"},
            {"type": "filter", "path": "tenant_id"},  # optional pre-filter field(s)
        ]
    },
)
col.create_search_index(vector_index)
```
# 2) Upsert a doc with an embedding (pseudo-code; swap in your embedding call)
```python
def embed(text: str) -> list[float]:
    # e.g., any embedding service—just return a list of floats sized to your index
    return [0.0] * 1024

doc = {
    "_id": "kb-001",
    "title": "Reset 2FA on iOS",
    "body": "Steps to recover access when you lost your authenticator.",
    "tenant_id": "acme",
    "embedding": embed("Reset 2FA on iOS Steps..."),
}
col.replace_one({"_id": doc["_id"]}, doc, upsert=True)
```

# 3) Semantic search with $vectorSearch (+ optional pre-filter)

```python
user_query = "I can't log in because I lost my phone authenticator"
query_vec = embed(user_query)

pipeline = [
    {"$vectorSearch": {
        "index": "vec_articles",
        "path": "embedding",
        "queryVector": query_vec,
        "numCandidates": 200,
        "limit": 5,
        "filter": {"tenant_id": "acme"},
    }},
    {"$project": {
        "title": 1,
        "score": {"$meta": "vectorSearchScore"},
    }},
]
results = list(col.aggregate(pipeline))
print(results[:2])
```
  • $vectorSearch is an aggregation stage; set numCandidates and limit for recall/latency trade-offs.  

  • Build the vector index with the same dimensions + similarity metric as your embedding model.  

  • You can also create indexes via Atlas UI, CLI, mongosh (`db.collection.createSearchIndex()`), or driver APIs.
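One gotcha worth noting: create_search_index returns before the index is ready to serve queries. A small sketch of polling for readiness, assuming the `queryable` status field that `list_search_indexes` reports on recent driver/Atlas versions:

```python
import time


def wait_for_index(col, name: str, timeout_s: float = 300.0) -> None:
    """Poll until the named search index reports queryable=True."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        for idx in col.list_search_indexes(name):
            if idx.get("queryable"):
                return
        time.sleep(5)
    raise TimeoutError(f"search index {name!r} not queryable after {timeout_s}s")
```

Call it right after create_search_index so the first $vectorSearch query doesn't silently return nothing while the index is still building.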

Hybrid search (keyword + vectors)

Blend BM25 full-text with vectors for higher quality. A common pattern is Reciprocal Rank Fusion (RRF) to merge results. MongoDB docs walk through RRF/semantic boosting with $search + $vectorSearch.
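To make the RRF idea concrete, here is a minimal client-side sketch: run a $search pipeline and a $vectorSearch pipeline separately, then fuse the two ranked ID lists. The `k = 60` smoothing constant is a common default from the RRF literature, not anything MongoDB mandates, and the IDs below are made up:

```python
def rrf_merge(ranked_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: -kv[1])


# e.g. merge a BM25 ($search) ranking with a $vectorSearch ranking
keyword_hits = ["kb-003", "kb-001", "kb-007"]
vector_hits = ["kb-001", "kb-009", "kb-003"]
fused = rrf_merge([keyword_hits, vector_hits])
# kb-001 ranks first: it places high in both lists
```

Documents that appear in both rankings accumulate score from each, which is why RRF rewards agreement between keyword and semantic retrieval without any score normalization.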

RAG on Atlas (simple path)

MongoDB’s RAG guides show ingest → index → retrieve with $vectorSearch → generate with your LLM. There’s even a local RAG tutorial if you want to avoid external APIs while prototyping.
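The retrieve → generate step mostly reduces to stuffing retrieved documents into a prompt. A hedged sketch (the prompt template and field names are illustrative, not from MongoDB's guides; swap in your $vectorSearch pipeline and LLM client):

```python
def build_prompt(question: str, docs: list[dict]) -> str:
    """Ground the LLM on retrieved docs: context first, then the question."""
    context = "\n\n".join(f"[{d['title']}]\n{d['body']}" for d in docs)
    return (
        "Answer using only the context below. If the answer is not in the "
        f"context, say so.\n\n{context}\n\nQuestion: {question}"
    )


# docs would come from the $vectorSearch pipeline shown earlier
docs = [{"title": "Reset 2FA on iOS", "body": "Steps to recover access..."}]
prompt = build_prompt("How do I reset 2FA?", docs)
# prompt then goes to your LLM of choice
```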

Performance & scale notes

  • Search Nodes: dedicate & scale search/vector compute separately from your database replica set (multi-region available).  

  • Views (GA): index a view to pre-filter/transform documents before vector indexing—handy for multi-tenant or curated catalogs.  

  • Quantization & dim choices: fewer dimensions and lower precision reduce storage/latency.  

  • ANN under the hood: Atlas Vector Search uses HNSW for approximate nearest-neighbor (fast and accurate for semantic search).
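Related to the ANN point above: $vectorSearch also supports exact nearest-neighbor (ENN) search, useful for small collections or for measuring ANN recall. A sketch of the stage, assuming an Atlas version where `exact: true` is available (it replaces `numCandidates`):

```python
enn_stage = {"$vectorSearch": {
    "index": "vec_articles",
    "path": "embedding",
    "queryVector": [0.0] * 1024,  # your query embedding
    "exact": True,                # full scan instead of HNSW approximation
    "limit": 5,
}}
# use in place of the ANN stage: col.aggregate([enn_stage, ...])
```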

When to use MongoDB vs a dedicated vector DB

  • MongoDB Atlas: best when you want operational + vector data together, need hybrid search, or already run Atlas and want to keep infra minimal.  

  • Alt options: Cloud-compatible “MongoDB-like” services also ship vector search (e.g., Amazon DocumentDB, Azure Cosmos DB for MongoDB vCore). Consider them if you’re locked into those clouds.

Wrap-up

If you’re already on Atlas, you don’t need a separate vector store. Create a vector index, push embeddings, and start with $vectorSearch. Layer hybrid search for quality, use Views to pre-shape data, and move heavy traffic to Search Nodes. Voyage AI’s models + MongoDB’s native integration round out a practical, production-ready stack for RAG and semantic features.