November 3, 2025

Mem0



What is Mem0?

Mem0 is an open-source memory layer for LLM apps that extracts durable facts from conversations and makes them searchable later. The OSS package gives you a Memory client you can run locally; by default it wires up an OpenAI model for fact extraction, OpenAI embeddings, a local Qdrant vector store, and a small SQLite history DB so you can get started quickly without extra setup. You can later swap in your own LLM, embeddings, vector DB, or enable graph relationships.
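
If you outgrow the defaults, Memory.from_config accepts a configuration dictionary that selects the LLM, embedder, and vector store. The sketch below shows the general shape; the exact provider names and config keys vary by version, so treat them as assumptions to check against the configuration docs.

# Minimal configuration sketch (provider names and config keys are assumptions to verify against the docs)
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

config = {
    "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini", "temperature": 0}},
    "embedder": {"provider": "openai", "config": {"model": "text-embedding-3-small"}},
    "vector_store": {"provider": "qdrant", "config": {"collection_name": "my_memories"}},
    # graph memory is enabled with an extra graph_store section (see the graph memory docs)
}
m = Memory.from_config(config)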

Key characteristics

  • Local first, swappable stack. Run entirely on your machine, then customize LLMs, embedders, and vector stores as you grow.

  • Structured fact extraction. add(...) turns a chat turn into concise, durable facts that survive across sessions.

  • Fine-grained filtering. Search can combine metadata filters with logical operators and comparisons for precise retrieval. 

  • Optional graph memory. Persist relationships (people, places, events) and blend them into retrieval for richer context.

  • Async + performance tools. Async variants and reranker options are available when you need scalability and ranking control.

  • Multimodal. You can attach images and search alongside text memories in the OSS stack.

A runnable example: recording long-term memory with Mem0

# pip install mem0ai
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # the default stack uses OpenAI for extraction and embeddings
m = Memory()

# 1) Add a short user ↔ assistant exchange; Mem0 will infer a durable fact
messages = [
    {"role": "user", "content": "Hi, I'm Alex. I love basketball and gaming."},
    {"role": "assistant", "content": "Nice to meet you, Alex! I’ll remember your interests."},
]
m.add(messages, user_id="alex")

# 2) Retrieve what was learned later
results = m.search("What do you know about me?", user_id="alex")
print(results)
# -> includes a distilled memory like “Name is Alex; enjoys basketball and gaming.”

That flow mirrors the OSS quickstart: you instantiate Memory, call add(messages, user_id=...) to persist what matters, and search(...) to recall it later. Defaults include an OpenAI small model for extraction, OpenAI embeddings, local Qdrant, and SQLite history.

Internal logic: 

  1. Client bootstrap: Memory() loads a default configuration (the OpenAI key is read from the OPENAI_API_KEY environment variable). It prepares an LLM for extracting durable facts, an embedding model for vectorization, and a storage layer for vectors and metadata. It also sets up a namespace so your data is scoped to your app.

  2. Session scoping: When you pass user_id="alex" (and optional metadata), Mem0 uses those fields as filters and tags. Every memory unit written during add(...) carries this identity so you can retrieve per user or per room later.

  3. Ingestion and normalization: add(messages, ...) accepts a short list of chat turns. The library cleans and normalizes the text, removes trivial fillers, and constructs a compact context for extraction. It may keep light provenance such as role, timestamps, and message ids.

  4. Fact extraction: An LLM prompt turns the conversation into candidate “memories.” Each candidate is a concise statement that should be stable across sessions, for example “User name is Alex” or “Alex enjoys basketball and gaming.” Candidates are structured with fields like text, source, created time, and the user or room tags. Operations: LLM generation.

  5. Novelty and dedup checks: Before writing, Mem0 compares candidates against what it already knows. It computes similarity between the new fact and existing ones that share the same scope. If a near duplicate exists, it updates recency or frequency metadata rather than creating a second copy. If the new fact conflicts, the system can keep both with timestamps or prefer the most recent revision, depending on configuration. Operations: embedding is optional here (some setups embed the candidate to compute similarity; others reuse existing vectors and defer new embeddings to step 6); retrieve from the DB (read existing facts and/or their vectors for comparison).

  6. Embedding and indexing: Approved candidates are embedded using the configured embedding model. Vectors are upserted into the index with the accompanying scalar metadata (user_id, kind, room_id, timestamps). A small history log records that this chat produced these memory units so you can audit or rebuild later. Operations: embedding (for the new or updated facts); store to the DB (write vectors and metadata).

  7. Query-time recall (search(...)): When you call search("What do you know about me?", user_id="alex"), Mem0 first applies that scope as a filter to restrict the search space. It embeds the query, runs a similarity search in the filtered index, and collects the top matches. A lightweight ranker can reorder results by relevance, freshness, or confidence. Operations: embedding (for the query text); retrieve from the DB (vector search returns matching items and their metadata).

  8. Result shaping: The search returns memory objects rather than raw chunks. Each object contains the distilled fact text, relevance score, and the metadata you wrote earlier. You can print them directly, feed them to your system prompt, or render them in your UI. Operations: LLM generation is optional (only if a configured reranker uses an LLM; the default can be non-LLM).

  9. Idempotency and updates over time: If Alex later says “I now prefer soccer,” a new candidate will be extracted. On write, Mem0 sees that it updates a preference for the same user. The old preference may be marked superseded, or both may remain with different timestamps. Your retrieval policy decides which one to surface first. Operations: LLM generation (extract the new fact); embedding (for the new or updated fact); retrieve from the DB (check prior state for dedup or conflicts); store to the DB (write the change).

  10. Privacy and persistence: By default the stack runs locally unless you configure remote services. Deletion calls remove both vectors and metadata for a user or a room. Export tools can dump memory objects for backups or migration. Operations: retrieve from the DB for exports or audits; delete operations touch the DB but do not store new data.

That is the flow your small example exercises: ingest a short exchange, distill stable facts, embed and store them with identity metadata, then recall them later with a filtered semantic search that returns structured, durable memories.
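
To make steps 4 and 5 concrete, here is a small self-contained sketch of a similarity-based dedup check of the kind described above. It is illustrative logic only, not Mem0’s actual code, and the 0.9 threshold is an arbitrary assumption.

# Illustrative sketch of the write-path dedup check (steps 4-5); not Mem0's implementation.
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def upsert_fact(store, text, vec, threshold=0.9):
    # store: list of {"text", "vec", "seen"} dicts already scoped to one user or room
    for fact in store:
        if cosine(vec, fact["vec"]) >= threshold:
            fact["seen"] += 1   # near duplicate: bump frequency/recency metadata instead of writing a copy
            return fact
    new_fact = {"text": text, "vec": vec, "seen": 1}
    store.append(new_fact)      # novel fact: index it and record provenance in the history log
    return new_fact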

Example: two users chatting, each with their own long-term memory

# pip install mem0ai
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
m = Memory()

# Alice chat
m.add(
    [
        {"role": "user", "content": "I'm Alice. I live in Seattle and prefer oat milk latte."},
        {"role": "assistant", "content": "Got it, Alice—Seattle and oat milk latte."},
    ],
    user_id="alice",
)

# Bob chat
m.add(
    [
        {"role": "user", "content": "I'm Bob. I live in Austin and I like brisket tacos."},
        {"role": "assistant", "content": "Thanks, Bob—Austin and brisket tacos."},
    ],
    user_id="bob",
)

# Later: recall per-user
alice_view = m.search("Where do I live and what do I like?", user_id="alice")
bob_view = m.search("Where do I live and what do I like?", user_id="bob")
print("Alice recalls:", alice_view)
print("Bob recalls:", bob_view)

Scoping each add(...) and search(...) call with user_id=... keeps each person’s long-term memories isolated while allowing clean retrieval.

Example: two users share team/project context while keeping personal memory

You can tag memories with metadata (for example a shared room_id or project_id) when you add them, then query with logical filters to retrieve the shared “room” view from either side.

# pip install mem0ai
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
m = Memory()
ROOM = "proj-42"

# Alice contributes a shared fact to the room
m.add(
    [
        {"role": "user", "content": "For project 42 our repo is on GitHub at org/proj42."},
        {"role": "assistant", "content": "Logged the repo location for project 42."},
    ],
    user_id="alice",
    metadata={"room_id": ROOM, "kind": "project_knowledge"},
)

# Bob contributes another shared fact
m.add(
    [
        {"role": "user", "content": "Standup is 10am PT for project 42."},
        {"role": "assistant", "content": "Noted the daily standup time for project 42."},
    ],
    user_id="bob",
    metadata={"room_id": ROOM, "kind": "ritual"},
)

# Either user can pull the shared context using metadata filters:
room_view_for_alice = m.search(
    "What shared details do we have for project 42?",
    filters={
        "AND": [
            {"room_id": ROOM},
            {"OR": [{"user_id": "alice"}, {"user_id": "bob"}]},
        ]
    },
)
print(room_view_for_alice)

This pattern relies on adding metadata during add(...) and then using enhanced metadata filtering during search(...) to join across contributors while still respecting user scoping. The OSS docs explicitly support complex metadata filters with AND / OR, comparisons, and substring operators.
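
As a sketch of what a richer query can look like, the call below combines the room scope with a comparison and a substring match. The operator spellings used here (gte, icontains) are assumptions to verify against the metadata-filtering docs for your version; m and ROOM come from the example above.

# Hypothetical richer filter, continuing from the example above
# (operator names "gte" and "icontains" are assumptions; check the filtering docs)
recent_project_facts = m.search(
    "What did we decide recently?",
    filters={
        "AND": [
            {"room_id": ROOM},
            {"created_at": {"gte": "2025-10-01"}},
            {"kind": {"icontains": "project"}},
        ]
    },
)
print(recent_project_facts)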

Data persistence

Where to store each data type

Facts

  • Best place: a transactional database such as Postgres in production, SQLite for local.

  • Why: strong consistency, easy versioning, clean indexing and access control.

  • Tip: keep facts as text with clear ownership fields like user_id and optional room_id.

Embeddings

  • Best place: a vector database such as Qdrant, Weaviate, Milvus, or a managed service like Pinecone.

  • Alternative: Postgres with pgvector for small to medium scale or when you want one database.

  • Tip: keep only vectors here, plus the minimal keys required to join back to facts and to filter.

Metadata

  • Best place: the same transactional database as facts.

  • Why: you will filter on user_id, room_id, timestamps, tags. This is easier and safer in a relational or document style store.

  • Optional: mirror a subset of hot filters into the vector DB payload for faster pre-filtering.

History and versions

  • Best place: an append-only audit table in your transactional database.

  • Alternative: object storage for large exports and cold archives.

  • Tip: record who changed what and when, keep soft deletes and prior values, keep a small reason field for operator actions.
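
As a sketch, an append-only history table of this kind can be as simple as the following (the column names are illustrative, not Mem0’s schema):

# Sketch of an append-only audit table using the standard library; column names are illustrative.
import sqlite3

conn = sqlite3.connect("memory_history.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS memory_history (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        memory_id TEXT NOT NULL,   -- which fact was touched
        actor TEXT NOT NULL,       -- who changed it (user, operator, or background job)
        action TEXT NOT NULL,      -- ADD / UPDATE / SOFT_DELETE
        prior_value TEXT,          -- previous fact text, kept for audits and rollback
        new_value TEXT,            -- new fact text, NULL on delete
        reason TEXT,               -- small free-text field for operator actions
        changed_at TEXT DEFAULT (datetime('now'))
    )
""")
conn.commit()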

Deployment profiles that work

Local developer setup: SQLite for facts, metadata, history. Qdrant for vectors.

One-database simplicity: Postgres with pgvector for all four types. Good up to mid scale; a configuration sketch follows this list.

Split for performance: Postgres for facts, metadata, history. Dedicated vector DB for embeddings.

Cloud scale: Managed vector DB for embeddings. Postgres for facts, metadata, history. Object storage for periodic snapshots.
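
For the one-database profile, a configuration sketch along these lines points Mem0’s vector store at Postgres with pgvector, while facts, metadata, and history live in ordinary tables in the same database. The provider name and config keys are assumptions to verify against the supported vector store docs.

# Sketch: one Postgres instance for everything, with pgvector for the embeddings.
# Provider name and config keys are assumptions to verify against the docs.
from mem0 import Memory

config = {
    "vector_store": {
        "provider": "pgvector",
        "config": {
            "dbname": "memdb",
            "user": "mem0",
            "password": "secret",
            "host": "localhost",
            "port": 5432,
        },
    },
}
m = Memory.from_config(config)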

Anything else worth storing

  • Source transcripts or snippets: short excerpts that justify a fact, useful for audits and UI tooltips.

  • Extraction prompts and model info: which prompt and model version produced a fact, so you can reprocess later.

  • Dedup signatures: hash or fingerprint of a fact to speed novelty checks.

  • Access control state: org_id, tenant, role bindings, policy versions, consent flags.

  • Deletion proofs and retention records: when a user requests erasure, store the request and the confirmation record.

  • Job checkpoints: offsets for background tasks such as compaction, re-embedding, or backfills.

  • Caches: optional short-lived recall caches, stored in Redis or memory, not for durability.

  • Attachments and multimodal assets: images or files linked to a memory, stored in object storage with secure URLs.

  • Operational snapshots: periodic logical dumps of Postgres and snapshots of the vector index for disaster recovery.

Bottom line

Facts, metadata, and history belong in a transactional store that you trust. Embeddings belong in a vector index tuned for similarity search. You can run everything in Postgres with pgvector for simplicity, or split Postgres plus a dedicated vector DB for scale. Add object storage for archives and attachments, and keep small audit and governance records so you can rebuild and explain your memory over time.

Short summary

Mem0 is a lightweight, local memory layer that turns chat into concise, durable facts you can search with precision. You can start with the default OSS stack in minutes, then refine retrieval with metadata, enable a graph for relationships, and scale with async options when needed. The examples above show how to persist personal memory per user and how to model shared team context with a simple metadata convention. 

ref: docs.mem0.ai
