RAG Knowledge Base
A retrieval-augmented generation system built over internal documentation, letting staff query years of institutional knowledge in natural language and get answers with cited sources.
Overview
An organization had accumulated years of internal documentation — SOPs, policy documents, training materials, meeting notes — spread across Google Drive, Notion, and a legacy intranet. New staff spent weeks onboarding, and experienced staff regularly wasted time hunting for information they knew existed somewhere.
The goal: make the entire knowledge base queryable in natural language, with answers grounded in actual source documents and citations provided for verification.
Technical Approach
- Built an ingestion pipeline that crawls connected document sources, chunks content along structural boundaries such as headings and paragraphs, and generates embeddings via an OpenAI text-embedding model.
- Stored all vectors in Pinecone with rich metadata (source, date, author, document type) to enable filtered retrieval.
- Implemented a FastAPI backend that accepts natural language queries, retrieves the top-k most relevant chunks, and passes them to GPT-4 with a strict grounding prompt that instructs the model to answer only from the retrieved chunks.
- Responses always include source citations with links back to the original document. Citations do not rule out hallucination, but they make unsupported claims easy to spot, which builds user trust.
- Added a feedback loop where users can rate answers, with low-rated responses flagged for prompt refinement.
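The retrieval and grounding steps above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: an in-memory list stands in for Pinecone, the two-dimensional vectors stand in for real embeddings, and the names `Chunk`, `retrieve`, and `build_grounded_prompt` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    vector: list[float]
    # Metadata such as source, date, author, doc_type (field names are assumptions).
    metadata: dict = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, k=3, where=None):
    """Apply an exact-match metadata filter, then return the top-k chunks by similarity."""
    where = where or {}
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in where.items())
    ]
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c.vector), reverse=True)
    return ranked[:k]


def build_grounded_prompt(question, hits):
    """Number each retrieved chunk so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] ({c.metadata.get('source', 'unknown')}) {c.text}"
        for i, c in enumerate(hits, 1)
    )
    return (
        "Answer using ONLY the numbered sources below and cite them as [n]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Toy data: hand-written vectors stand in for embeddings of each chunk.
chunks = [
    Chunk("Expense reports are due by the 5th.", [1.0, 0.0],
          {"source": "finance-sop", "doc_type": "sop"}),
    Chunk("Laptops are issued on day one.", [0.0, 1.0],
          {"source": "onboarding-guide", "doc_type": "training"}),
    Chunk("Travel must be pre-approved.", [0.9, 0.1],
          {"source": "travel-policy", "doc_type": "policy"}),
]

hits = retrieve([1.0, 0.0], chunks, k=2)
prompt = build_grounded_prompt("When are expense reports due?", hits)
```

In a deployment like the one described, the filter and the similarity ranking would be handled by the vector store's query call, and `prompt` would be sent to the chat model. Only the shape of the data flow is shown here.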
Results & Learnings
Staff reported an 85% reduction in time spent searching for information. New employee onboarding time dropped significantly — instead of asking colleagues or digging through folders, they could query the system directly. The citation feature was critical for adoption; users trusted answers they could verify.
The most important technical lesson: chunking strategy matters enormously. Naive fixed-size chunking produced poor retrieval quality, likely because fixed-size windows cut across sentences and section boundaries and strip passages of their context. Switching to semantic chunking, which splits on meaningful boundaries like headings and paragraphs, dramatically improved answer relevance.
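To make the contrast concrete, here is a small sketch of both strategies. It assumes Markdown-style headings (lines starting with `#`) and blank-line paragraph breaks, which may not match the real sources. The function names and the `max_chars` limit are illustrative, not taken from the project.

```python
import re


def fixed_size_chunks(text, size):
    """Naive baseline: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def semantic_chunks(text, max_chars=500):
    """Split on headings and paragraphs; each chunk remembers its heading."""
    chunks = []
    heading = ""
    buffer = []

    def flush():
        # Emit the paragraphs collected so far under the current heading.
        if buffer:
            chunks.append({"heading": heading, "text": "\n\n".join(buffer)})
            buffer.clear()

    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if lines and lines[0].startswith("#"):
            flush()  # a new heading always closes the previous chunk
            heading = lines[0].lstrip("#").strip()
            lines = lines[1:]
        body = " ".join(line.strip() for line in lines)
        if not body:
            continue
        # Start a new chunk rather than exceed the size budget mid-section.
        if buffer and sum(len(p) for p in buffer) + len(body) > max_chars:
            flush()
        buffer.append(body)
    flush()
    return chunks


doc = """# Expenses

Submit reports by the 5th.

Receipts are required over $25.

# Travel

Book through the portal."""

semantic = semantic_chunks(doc)
fixed = fixed_size_chunks(doc, 40)
```

On this toy document, `semantic` holds one chunk per section, each tagged with its heading. The pieces in `fixed` begin and end wherever the 40-character window happens to fall, including mid-sentence.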