Case Study · RAG / Knowledge Systems

RAG Knowledge Base

A retrieval-augmented generation (RAG) system built over internal documentation, letting staff query years of institutional knowledge in natural language and get answers with cited sources.

RAG · Pinecone · OpenAI Embeddings · FastAPI · Python

800+ docs indexed
<3s query response
85% search time saved

Overview

An organization had accumulated years of internal documentation — SOPs, policy documents, training materials, meeting notes — spread across Google Drive, Notion, and a legacy intranet. New staff spent weeks onboarding, and experienced staff regularly wasted time hunting for information they knew existed somewhere.

The goal: make the entire knowledge base queryable in natural language, with answers grounded in actual source documents and citations provided for verification.

Technical Approach

  • Built an ingestion pipeline that crawls connected document sources, chunks content along document structure (headings and paragraphs) rather than at fixed sizes, and generates embeddings via OpenAI's text-embedding model.
  • Stored all vectors in Pinecone with rich metadata (source, date, author, document type) to enable filtered retrieval.
  • Implemented a FastAPI backend that accepts natural language queries, retrieves the top-k most relevant chunks, and passes them to GPT-4 with a strict grounding prompt.
  • Responses always include source citations with links back to the original document, so users can verify each answer; together with the grounding prompt, this limits hallucination and builds user trust.
  • Added a feedback loop where users can rate answers, with low-rated responses flagged for prompt refinement.
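
The retrieval and grounding steps above can be sketched roughly as follows. This is a minimal illustration, not the production code: an in-memory list and a hand-written cosine similarity stand in for Pinecone, the two-dimensional vectors are toy values rather than real embeddings, and the `Chunk` fields, the example URLs, and the prompt wording are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt


@dataclass
class Chunk:
    text: str
    vector: list[float]  # toy 2-d values, standing in for real embeddings
    source_url: str      # hypothetical metadata field used for citations
    doc_type: str        # hypothetical metadata field used for filtering


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, top_k=2, doc_type=None):
    """Return the top_k most similar chunks, optionally filtered by doc_type."""
    candidates = [c for c in chunks if doc_type is None or c.doc_type == doc_type]
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c.vector), reverse=True)
    return ranked[:top_k]


def build_grounded_prompt(question: str, retrieved: list[Chunk]) -> str:
    """Number each retrieved chunk so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] ({c.source_url}) {c.text}" for i, c in enumerate(retrieved, 1)
    )
    return (
        "Answer using ONLY the numbered sources below. Cite sources as [n]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


chunks = [
    Chunk("Expense reports are due by the 5th.", [1.0, 0.0],
          "https://example.com/sop/expenses", "sop"),
    Chunk("Laptops are refreshed every three years.", [0.0, 1.0],
          "https://example.com/policy/it", "policy"),
    Chunk("Submit expenses through the finance portal.", [0.9, 0.1],
          "https://example.com/training/finance", "training"),
]

query_vec = [1.0, 0.05]  # pretend embedding of the question
top = retrieve(query_vec, chunks, top_k=2)
prompt = build_grounded_prompt("When are expense reports due?", top)
print(prompt)
```

In a real deployment the `retrieve` step would be a vector-database query with a metadata filter, and the prompt would be sent to the chat model; numbering the sources in the prompt is what lets the response map each `[n]` back to a document link.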

Results & Learnings

Staff reported an 85% reduction in time spent searching for information. New employee onboarding time dropped significantly — instead of asking colleagues or digging through folders, they could query the system directly. The citation feature was critical for adoption; users trusted answers they could verify.

The most important technical lesson: chunking strategy matters enormously. Naive fixed-size chunking produced poor retrieval quality. Switching to semantic chunking — splitting on meaningful boundaries like headings and paragraphs — dramatically improved answer relevance.
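
To make the contrast concrete, here is a small sketch of both strategies. It assumes Markdown-style `#` headings and blank-line paragraph breaks, which the original sources may not have used, and the function names and `max_chars` limit are illustrative rather than taken from the project.

```python
import re


def fixed_size_chunks(text: str, size: int) -> list:
    """Naive baseline: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def structural_chunks(text: str, max_chars: int = 500) -> list:
    """Split on headings, then pack whole paragraphs into chunks.

    Each chunk is prefixed with its section heading so it stays
    self-describing. A single paragraph longer than max_chars is kept whole.
    """
    chunks = []
    # Zero-width split before every line that starts with '#' heading markers.
    for section in re.split(r"^(?=#+ )", text, flags=re.MULTILINE):
        section = section.strip()
        if not section:
            continue
        first, _, rest = section.partition("\n")
        heading, body = (first, rest) if first.startswith("#") else ("", section)
        paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
        current = ""
        for para in paragraphs:
            # Start a new chunk if adding this paragraph would exceed the limit.
            if current and len(current) + len(para) + 2 > max_chars:
                chunks.append(f"{heading}\n{current}".strip())
                current = ""
            current = f"{current}\n\n{para}".strip()
        if current or heading:
            chunks.append(f"{heading}\n{current}".strip())
    return chunks


doc = """# Expenses

Submit reports by the 5th.

Use the finance portal.

# Laptops

Devices are refreshed every three years.
"""

fixed = fixed_size_chunks(doc, 40)
structured = structural_chunks(doc, max_chars=200)
for chunk in structured:
    print(repr(chunk))
```

The fixed-size version cuts sentences and sections wherever the character count lands, while the structural version keeps each paragraph intact and attached to its heading, which gives the retriever chunks that make sense on their own.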

DragonsWorkshop · AI Integration & Workflow
© 2026 DragonsWorkshop.com — All rights reserved.