NHacker Next
Show HN: A memory database that forgets, consolidates, and detects contradiction (github.com)
endymi0n 30 minutes ago [-]
I've experimented quite a bit with mem0 (which is similar in design) for my OpenClaw and stopped using it very soon. My impression is that "facts" are an incredibly blunt and far too rigid tool for any actual job at hand, and for me they were a step backward rather than forward in daily use. In the end, the extracted "facts database" was a complete mess of largely incomplete, invalid, inefficient, and unhelpful sentences that didn't help any of my conversations, and after the third injected wrong fact I went back to QMD and prose / summarization. Prose is sometimes slightly worse at updating stale facts, but I'll take a 1000% better big picture and usefulness over working with "facts".

The failure modes were multiple:

- Facts rarely exist in a vacuum but carry lots of subtlety.
- Inferring facts from conversation has a gazillion failure modes; irony and sarcasm in particular lead to hilarious outcomes (joking about a six-pack with a fat buddy -> "XYZ is interested in achieving an athletic form"), but even something as simple as extracting a concrete date too often goes wrong.
- Facts are almost never as binary as they seem. "ABC has the flights booked for the Paris trip." Then I decided afterwards to continue to New York to visit a friend instead of going home, and completely stumped the agent.

pranabsarkar 4 hours ago [-]
Author here. I built this because I was using ChromaDB for an AI agent's memory and recall quality went to garbage at ~5k memories. The agent kept recalling outdated facts, contradicting itself across sessions, and the context window was full of redundant near-duplicates.

I tried to write the consolidation/conflict-detection logic on top of ChromaDB. It didn't work — the operations need to be transactional with the vector index, and they need an HLC (hybrid logical clock) for ordering across nodes. So I built it as a database.
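For readers unfamiliar with the term: a hybrid logical clock pairs physical time with a logical counter, so events can be totally ordered across nodes even when wall clocks skew. A textbook sketch in Python, purely for illustration and not YantrikDB's actual implementation:

```python
import time

class HLC:
    """Hybrid logical clock: (physical_time, logical_counter) pairs
    give a total order that respects causality across nodes."""

    def __init__(self):
        self.pt = 0   # highest physical time observed so far (ms)
        self.lc = 0   # logical counter to break ties within one ms

    def now(self, wall=None):
        """Timestamp a local event."""
        wall = int(time.time() * 1000) if wall is None else wall
        if wall > self.pt:
            self.pt, self.lc = wall, 0
        else:
            self.lc += 1          # clock didn't advance: bump counter
        return (self.pt, self.lc)

    def update(self, remote, wall=None):
        """Merge a timestamp received from another node."""
        wall = int(time.time() * 1000) if wall is None else wall
        rpt, rlc = remote
        m = max(wall, self.pt, rpt)
        if m == self.pt == rpt:
            self.lc = max(self.lc, rlc) + 1
        elif m == self.pt:
            self.lc += 1
        elif m == rpt:
            self.lc = rlc + 1
        else:
            self.lc = 0
        self.pt = m
        return (self.pt, self.lc)
```

The point is that two writes landing within the same wall-clock millisecond (or under clock skew) still get distinct, causally consistent timestamps, which is what lets consolidation decide which fact is "newer" across nodes.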

The cognitive operations (think, consolidate, detect_conflicts, derive_personality) are the actual differentiator. The clustered server is what made me confident enough to ship — I needed to know the data was safe before I'd put real work on it.

What I genuinely want to know: is this solving a problem you're hitting with your AI agent's memory, or did I build a really polished thing for my own narrow use case? Honest reactions help more than encouragement.

all2 39 minutes ago [-]
I've bookmarked this. I'll let you know what I find over the next few weeks.

I'm in the middle of building an agent harness and I haven't had to deal with long-running memory issues yet, but I will have to deal with it soon.

pranabsarkar 33 minutes ago [-]
Thanks, really appreciate it. I'm using the server as an MCP server and have connected all my workspaces. It has definitely changed my experience.
tcdent 1 hour ago [-]
I appreciate the effort you put into mapping semantics so language constructs can be incorporated into this. You're probably already seeing that the amount of terminology, the ways those terms interact with each other, and the modeling they require have ballooned into a fairly complex system.

The fundamental breakthrough with LLMs is that they handle semantic mapping for you and can (albeit non-deterministically) interpret the meaning and relationships between concepts with a pretty high degree of accuracy, in context.

It just makes me wonder if you could dramatically simplify the schema and data modeling by incorporating more of these learnings.

I have a simple experiment along these lines that’s especially relevant given the advent of one-million-token context windows, although I don’t consider it a scientifically backed or production-ready concept, just an exploration: https://github.com/tcdent/wvf

pranabsarkar 37 minutes ago [-]
Thanks for the careful read — the "schema is ballooning" observation is real and I've felt it building this. You're pointing at a genuine design tension.

My counter, qualified: deterministic consolidation is cheap and reproducible in a way LLM-in-the-loop consolidation isn't, at least today. Every think() invocation is free (cosine + entity matching + SQL). If I put an LLM in the loop the cost is O(N²) LLM calls per consolidation pass — for a 10k-memory database, that's thousands of dollars of inference per tick. So for v1 I'm trading off "better merge decisions" against "actually runs every 5 minutes without burning a budget."

On 1M-context-windows: I think they push the "vector DB break point" out but don't remove it. Context stuffing still has recall-precision problems at scale (lost-in-the-middle, attention dilution on unrelated facts), and 1M tokens ≠ unbounded memory. At 10M memories no context window saves you.

wvf is interesting — just read through. The "append everything, let the model retrieve" approach is the complement of what I'm doing: you lean fully into LLM semantics, I try to do the lookup deterministically. Probably both are right for different workloads. Yours wins when you have unbounded compute + a small corpus; mine wins when you have bounded compute + a large corpus that needs grooming.

Starring wvf now. Curious if you're seeing meaningful quality differences between your approach and traditional retrieval at scale.

tcdent 10 minutes ago [-]
Appreciate the thoughtful reply.

Absolutely agree the deterministic, performance-oriented mindset is still essential for large workloads. Are you expecting that this supplements a traditional vector/semantic store, or that it supersedes it?

My focus has absolutely been on relatively small corpora, which is supported by forcing a subset of data to be included by design. There are intentionally no conventions for things like "we talked about how AI is transforming computing at 1AM"; instead it attempts to focus on "user believes AI is transforming computing", so hopefully there's less of the context poisoning that happens with current memory systems.

Haven't deployed WVF at any scale yet; just a casual experiment among many others.

polotics 2 hours ago [-]
In this day and age, without serious evidence that the software presented has seen some real usage, or at least has a good, reviewable regression test suite, the assumption may sadly be that this is a slopcoded brainwave. The ASCII diagram doesn't help. Also, maybe explain the design more.
6r17 1 hour ago [-]
I kind of agree with the comment here: a lot of what's happening around comes from an idea without proof that the project has a meaningful result. A compacting-memory benchmark is not difficult to pull off, but I'm also having difficulty understanding what the outcome would be on a running system.
pranabsarkar 39 minutes ago [-]
[dead]
hazelnut 47 minutes ago [-]
Congrats, this looks promising. How does it compare to supermemory.ai?
pranabsarkar 40 minutes ago [-]
Fair question. Supermemory is a hosted SaaS built around embedding + ranking. YantrikDB is self-hosted and adds three things Supermemory doesn't do as first-class operations:

- think(): consolidates similar memories into canonical ones (not just deduplication, actual collapse of redundant facts)
- Contradiction detection: when "CEO is Alice" and "CEO is Bob" both exist in memory, it flags the pair as a conflict the agent can resolve
- Temporal decay with a configurable half-life: memories fade, so old unimportant stuff stops polluting recall

Supermemory does more on the cloud side (team sharing, permissions, integrations). YantrikDB does more on the "actively manage my agent's memory" side. Different optimization points, no dig at Supermemory.
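The half-life mechanic is standard exponential decay. A minimal sketch, assuming a simple score-times-decay model (the function name and scoring are illustrative, not YantrikDB's API):

```python
def decayed_importance(importance, age_seconds, half_life_seconds):
    """Exponential decay: importance halves every half_life_seconds."""
    return importance * 0.5 ** (age_seconds / half_life_seconds)

# A memory with importance 1.0 and a 7-day half-life:
week = 7 * 24 * 3600
print(decayed_importance(1.0, week, week))      # 0.5 after one half-life
print(decayed_importance(1.0, 2 * week, week))  # 0.25 after two
```

Tuning the half-life then becomes the knob for how aggressively stale facts drop out of recall ranking.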

Mithriil 1 hour ago [-]
The half-life idea is interesting.

What's the loop behind consolidation? Random sampling and LLM to merge?

pranabsarkar 39 minutes ago [-]
No LLM in the loop. The consolidation pass is deterministic:

1. Pull the N most recent active memories (default 30) with embeddings.
2. Compute pairwise cosine similarity, threshold 0.85.
3. For each similar pair, check whether they share extracted entities.
4. Shared entities + similarity 0.85-0.98 → flag as a potential contradiction (same topic, maybe different facts).
5. No shared entities + similarity > 0.85 → redundancy (mark for consolidation).
6. Second pass at a 0.65 threshold specifically for substitution-category pairs (e.g., "MySQL" vs "PostgreSQL" in otherwise-similar sentences); these are usually real contradictions even at lower similarity.

Consolidation then collapses the redundancy set into canonical memories with combined importance/certainty. No LLM call, no randomness. Reproducible, cheap, runs in a background tick every ~5 minutes.
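For illustration, a rough Python sketch of those pair-classification rules. The thresholds come from the description above; the entity sets and the substitution-pair handling are simplified stand-ins for what the actual Rust code does:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def classify_pair(sim, entities_a, entities_b,
                  lo=0.65, hi=0.85, cap=0.98):
    """Classify one memory pair per the rules above (simplified)."""
    shared = bool(entities_a & entities_b)
    if shared and hi <= sim <= cap:
        return "potential_contradiction"  # same topic, maybe different facts
    if not shared and sim > hi:
        return "redundant"                # mark for consolidation
    if shared and sim >= lo:
        return "substitution_check"       # lower-threshold second pass
    return "unrelated"
```

The merge step itself (combining importance/certainty into a canonical memory) is omitted here; the point is that every branch is a pure function of similarity and entity overlap, hence reproducible.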

The LLM could improve this (better merge decisions, better entity alignment) but the tradeoff is cost and non-determinism. v1 is deterministic on purpose.

Source: crates/yantrikdb-core/src/cognition/triggers.rs and consolidate.rs next to it.

altmanaltman 1 hour ago [-]
Did you check if this leads to any actual benefits? If so, how did you benchmark it?
pranabsarkar 16 minutes ago [-]
[dead]
aayushkumar121 3 hours ago [-]
[dead]