This is a great deep dive into SIMD. I've been experimenting with similar constraints but on even more restrictive hardware. I managed to achieve sub-85ns cycles for 10.8T dataset audits on a budget 3GB RAM ARM chip (A04e) by combining custom zero-copy logic with strict memory mapping. The trick was bypassing the standard allocator entirely to keep the L1 cache hot. Does your SIMD approach account for the memory controller bottleneck on lower-end ARMv8 cores, or is it mostly tuned for x86/high-end silicon?
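The commenter's allocator-bypassing ARM setup isn't shown, but the zero-copy memory-mapping idea they mention can be sketched in a few lines. This is only an illustration of reading file data through `mmap` without heap copies (the function name and toy file are mine, not theirs):

```python
import mmap
import os
import tempfile

def mmap_scan_sum(path):
    """Sum every byte of a file through a read-only memory mapping.

    Sketch only: the OS page cache backs the mapping, so no heap copy
    of the file contents is made before scanning.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as mm:
            view = memoryview(mm)  # a window onto the mapping, no copy
            try:
                return sum(view)
            finally:
                view.release()     # let the mapping close cleanly

# usage
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes([1, 2, 3, 4]))
print(mmap_scan_sum(tmp.name))  # → 10
```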
How does your memory engine actually work?
Hey, Husain here, co-founder of Modulus.
I can talk about this for hours, but here's a summary:
Every repo added by the user is analyzed for technical specifications, which are stored without the code itself.
These are updated every time a significant change is made to the codebase.
At retrieval time, we check the connected repos for relevance and extract the matching specifications as context for the ongoing task.
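The flow described above can be sketched roughly like this. Everything here is hypothetical (the `RepoMemory` class, the `extract` and `relevant` callbacks); Modulus's actual engine isn't public:

```python
import hashlib

class RepoMemory:
    """Hypothetical sketch of the flow described above: store extracted
    specifications per repo (never the code), refresh them only when the
    codebase changes, and surface the relevant ones at retrieval time."""

    def __init__(self):
        self.specs = {}    # repo name -> extracted technical facts
        self.digests = {}  # repo name -> hash of last-analyzed code

    def analyze(self, repo, code, extract):
        """Store specs for `repo`, re-running only if the code changed."""
        digest = hashlib.sha256(code.encode()).hexdigest()
        if self.digests.get(repo) != digest:
            self.specs[repo] = extract(code)  # facts only, code discarded
            self.digests[repo] = digest

    def retrieve(self, task, relevant):
        """Return specs of repos judged relevant to the ongoing task."""
        return {r: s for r, s in self.specs.items() if relevant(task, s)}

# usage with toy extract/relevance functions
mem = RepoMemory()
mem.analyze("api", "uses flask and sqlite", lambda code: set(code.split()))
print(mem.retrieve("flask", lambda task, specs: task in specs))
```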
Hope that answers your question!
Hey, Husain here, co-founder of Modulus.
Good point. We do just that: we store the explicit schema as structured facts.
Relevance is based on a similarity threshold over embeddings of each repo's purpose, and schema fetching is driven by the structured facts.