Project

RAG Decision Support System

Production-grade retrieval-augmented generation system for grounded question answering over private documents with hybrid retrieval, reranking, inline citations, and confidence-aware responses.

PythonFastAPIPostgreSQLpgvectorOpenAI APIDockerGitHub Actions

Started 2026Back to projects View code

RAG Decision Support System is a production-grade retrieval-augmented generation system built for grounded question answering over private document collections, with hybrid retrieval, reranking, inline citations, and confidence-aware response generation.

Key results

Indexed more than 1,600 document chunks across uploaded knowledge sources for retrieval and citation-backed answers.
Reached approximately 0.78 Precision@3 in retrieval evaluation across diverse query types.
Reduced repeated-query latency with caching and parallel reranking, cutting rerank latency by about 76% and repeated-query latency by about 57%.

What I built

Built an end-to-end RAG pipeline with document ingestion, chunking, embeddings, hybrid retrieval, and response generation.
Combined vector search, keyword retrieval, and cross-encoder reranking to improve answer relevance and grounding.
Implemented inline citation support, score filtering, and multi-factor confidence estimation to make outputs more trustworthy.
Designed an evaluation framework covering retrieval quality, groundedness, latency, and hallucination behavior across multiple query categories.

Why this matters

Most LLM demos answer confidently whether the evidence is strong or weak. This system was built to make retrieval quality visible, reduce hallucinations, and produce answers that stay tied to source material.