I am implementing a production-ready RAG pipeline using LangChain and ChromaDB (PersistentClient). To minimize latency, I use RunnableParallel to execute the similarity search and initial metadata filtering simultaneously.
However, under high concurrency (multiple simultaneous user requests), I am seeing intermittent Timeout or Locked errors from ChromaDB's local persistence layer.
Is this a known limitation of ChromaDB's PersistentClient when it is accessed from async Python workers?
Is there a way to optimize connection pooling within a local LangChain setup?
