I implemented a RAG pipeline using:

- OpenAI embeddings (`text-embedding-3-large`)
- Multi-vector indexing (sentence-level chunks)
- A FAISS flat index
- Top-k = 5 retrieval
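For concreteness, here is a minimal sketch of the retrieval step. It uses random placeholder vectors instead of real OpenAI embeddings (the real model returns 3072-dim vectors; 64 here keeps it light) and a brute-force inner-product search standing in for `faiss.IndexFlatIP`, so it runs without the API or FAISS installed:

```python
import numpy as np

def normalize(vectors: np.ndarray) -> np.ndarray:
    # L2-normalize so inner product equals cosine similarity,
    # matching what IndexFlatIP assumes for normalized vectors.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

def top_k_neighbors(query: np.ndarray, index: np.ndarray, k: int = 5):
    # Brute-force inner-product search: equivalent to a FAISS flat
    # index, used here only so the sketch is self-contained.
    scores = index @ query
    order = np.argsort(-scores)[:k]
    return order, scores[order]

rng = np.random.default_rng(0)
# Placeholder sentence-level chunk embeddings (not real data).
chunks = normalize(rng.standard_normal((100, 64)).astype("float32"))
query = normalize(rng.standard_normal((1, 64)).astype("float32"))[0]

ids, scores = top_k_neighbors(query, chunks, k=5)
print(ids, scores)
```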
Even after this, the LLM still hallucinates and sometimes ignores the retrieved context.
Things I have already checked:

- Embedding dimensions are correct
- Normalization is applied
- Prompts include the retrieved context
- FAISS returns the expected neighbors
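These checks can be automated. The sketch below runs them on placeholder vectors (random stand-ins for the real embeddings, with an assumed 64-dim size); the last check, that each chunk retrieves itself as its own nearest neighbor, is a quick way to confirm the index and normalization are consistent:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 64  # placeholder; text-embedding-3-large returns 3072 dims
emb = rng.standard_normal((50, dim)).astype("float32")
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

# Check 1: embedding dimensions match the model's advertised size.
assert emb.shape[1] == dim

# Check 2: normalization held (unit norms within float tolerance).
assert np.allclose(np.linalg.norm(emb, axis=1), 1.0, atol=1e-5)

# Check 3: each chunk's nearest neighbor is itself under cosine
# similarity, so the index returns the expected neighbors.
scores = emb @ emb.T
nearest = scores.argmax(axis=1)
assert (nearest == np.arange(len(emb))).all()
print("all retrieval sanity checks passed")
```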
What additional steps can reduce hallucination in a multi-vector RAG system?
Do I need re-ranking or context compression?
