Contexara gives AI agents a three-tier memory engine — hot session turns, long-term semantic memories, and crystallized episode summaries — so nothing gets lost between conversations.
Works via MCP · Python SDK · REST API · Runs where your agent runs
User is a senior backend engineer at a Series B startup
imp=5 · 2h ago
Prefers SQLite for local projects, avoids Postgres overhead
imp=4 · 3h ago
Don't summarize output — user reads the diff directly
imp=5 · 5h ago
Building a memory layer for AI agents — Contexara v2
imp=3 · ttl=30d · 1d ago
Last session: built hybrid FTS5+vector search, moved DBs to ~/.contexara/
crystallized · 1d ago
Most memory systems pick one approach. Contexara runs three in parallel so your agent is never blind — whether it's mid-conversation or resuming a week-old session.
Every user/assistant exchange is stored verbatim in a hot SQLite DB with FTS5 full-text search. The last N turns are injected at the start of every conversation — your agent picks up exactly where it left off.
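The mechanics behind the hot tier can be approximated in a few lines of standard-library Python. This is a generic FTS5 sketch, not Contexara's actual schema; the table and column names are illustrative:

```python
import sqlite3

# Illustrative FTS5-backed turn store (not Contexara internals):
# every exchange is inserted verbatim, then searchable by keyword.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE turns USING fts5(role, text)")
con.executemany(
    "INSERT INTO turns VALUES (?, ?)",
    [
        ("user", "switch the cache to SQLite"),
        ("assistant", "Done, cache now uses SQLite with WAL mode."),
    ],
)

# Full-text search over stored turns; FTS5's default tokenizer is
# case-insensitive for ASCII, so 'sqlite' matches 'SQLite'.
rows = con.execute(
    "SELECT role, text FROM turns WHERE turns MATCH 'sqlite'"
).fetchall()
```

Injecting "the last N turns" is then just a `SELECT ... ORDER BY rowid DESC LIMIT N` over the same table.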
After every turn an LLM extraction pass distills durable facts — preferences, corrections, tech choices, constraints — into typed, versioned memories. Hybrid FTS5 + cosine vector search returns the most relevant facts for every query.
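A minimal sketch of what blending keyword and vector scores looks like; the `alpha` weight, function names, and toy vectors are illustrative, not Contexara's actual ranking formula:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def hybrid_score(fts_score, query_vec, memory_vec, alpha=0.5):
    # Linear blend of a normalized keyword score and vector similarity;
    # alpha controls how much exact-term matches outweigh semantics.
    return alpha * fts_score + (1 - alpha) * cosine(query_vec, memory_vec)
```

The top-k memories are simply the stored facts sorted by this blended score.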
When a session ends, a three-pass LLM crystallizer compresses the full transcript into a structured episode — title, actions, outcomes, open items, and a zero-fact-loss facts list. The next session opens with this summary pre-injected.
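The episode fields named above map naturally onto a small record type. This shape is inferred from the description (title, actions, outcomes, open items, facts), not a confirmed Contexara schema:

```python
from dataclasses import dataclass, field

@dataclass
class Episode:
    # Hypothetical crystallized-episode record; field names follow
    # the structure described in the text above.
    title: str
    actions: list[str] = field(default_factory=list)
    outcomes: list[str] = field(default_factory=list)
    open_items: list[str] = field(default_factory=list)
    facts: list[str] = field(default_factory=list)  # zero-fact-loss list
```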
Contexara runs a deterministic lifecycle on every conversation turn — no agent decisions required.
Last N raw turns and the previous episode summary are injected — agent is never blind.
Hybrid FTS5 + cosine search surfaces the most relevant long-term memories before the agent responds.
The full (user, assistant) pair is saved verbatim to the L0 hot store. An LLM extraction pass distills new durable facts into the L2 long-term store.
After 60 min idle or a manual checkpoint, the session is crystallized into a structured episode with zero fact-loss verification.
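The idle-timeout rule in the last step can be sketched as a small tracker. The 60-minute window comes from the text above; the class and method names are illustrative, not Contexara internals:

```python
import time

IDLE_SECONDS = 60 * 60  # 60 min idle window before crystallization

class SessionTracker:
    """Illustrative idle-timeout check for triggering crystallization."""

    def __init__(self):
        self.last_turn_at = time.monotonic()

    def touch(self):
        # Call after every ingested turn to reset the idle clock
        self.last_turn_at = time.monotonic()

    def should_crystallize(self, now=None):
        # True once the session has been idle past the threshold
        now = time.monotonic() if now is None else now
        return now - self.last_turn_at >= IDLE_SECONDS
```

A manual checkpoint simply bypasses this check and crystallizes immediately.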
Use the Python SDK for deterministic production agents, or drop the MCP server into any framework that supports tools.
```python
from contexara import ContextaraClient

# One client per agent / namespace
mem = ContextaraClient(namespace="my-agent")

# Session start — inject history + last episode
ctx = mem.context()

# Search memory before answering
facts = mem.retrieve("user preferences", top_k=5)

# After every response — always
mem.ingest(user_text, assistant_text)

# Explicit memory store
mem.store("User prefers concise bullet responses", kind="style", importance=4)

# Crystallize session on demand
mem.checkpoint()
```
```
# claude_desktop_config.json
{
  "mcpServers": {
    "contexara": {
      "command": "python",
      "args": ["-m", "contexara.mcp_server"]
    }
  }
}

# Tools exposed to your agent:
chat_context        # get last N turns + episode
chat_ingest         # store turn after every reply
memory_search       # hybrid search L2 store
memory_store        # explicit fact storage
session_checkpoint  # crystallize now
episode_search      # search past sessions
last_episode        # most recent episode anchor
```
Contexara classifies every extracted memory into one of nine typed kinds so search, retrieval, and importance scoring are always precise.
Name, role, company, background — who this person is.
User corrected the agent's output, approach, or assumption.
Stated likes and dislikes, workflows the user favors.
Hard limits — budget, team size, rules they operate under.
Specific tech: languages, libraries, tools preferred or avoided.
Communication style: format, detail level, tone preferences.
Recurring behavior — how the user works, iterates, decides.
Current active work item. Expires in 30 days automatically.
Anything worth keeping that doesn't fit a specific kind.
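The nine kinds above could be modeled as an enum. Apart from `style`, which appears in the SDK quick-start, these exact labels are hypothetical; only the nine categories themselves come from the docs:

```python
from enum import Enum

class MemoryKind(Enum):
    # Hypothetical labels for the nine typed kinds described above
    IDENTITY = "identity"      # who this person is
    CORRECTION = "correction"  # user corrected output or assumptions
    PREFERENCE = "preference"  # stated likes and dislikes
    CONSTRAINT = "constraint"  # hard limits and operating rules
    TECH = "tech"              # languages, libraries, tools
    STYLE = "style"            # format, detail level, tone
    PATTERN = "pattern"        # recurring behavior
    PROJECT = "project"        # active work item, 30-day TTL
    OTHER = "other"            # everything else worth keeping
```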
Contexara is in early access. Drop your email and we'll reach out when the hosted cloud tier is ready.
Local-first · No data leaves your machine until you opt in