Mike Maeda

2026

Alfred RAG Assistant

Python, ChromaDB, Groq LLM, Gradio
Problem

Student-life information is scattered across small documents, and generic chatbots can confidently answer with unsupported guesses. I wanted a grounded assistant that could answer from a known source set and refuse when the answer was not there.

What I built

I built a retrieval-augmented generation (RAG) system that answers Alfred University student-life questions from a small, curated 10-document knowledge base. The pipeline chunks the documents into 47 segments, embeds them with sentence-transformers, stores them in ChromaDB, and generates grounded answers with Groq's LLaMA 3.3 model. A refusal mechanism makes it say "I don't know" instead of guessing.

The most interesting part wasn't building it; it was debugging it. One test question kept returning a confident, wrong answer with no errors anywhere. I had to trace backward through the entire pipeline (chunking, then embeddings, then retrieval) before finding a retrieval-recall failure: the right information existed in the vector store, but it wasn't being retrieved. Fixing it meant rethinking how I chunked content in the first place. It's the project that taught me that in AI systems, correctness depends on every step in the pipeline, not just the model at the end.
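The chunk, retrieve, and refuse flow described above can be sketched as follows. This is a minimal, dependency-free illustration under stated assumptions, not the project's actual code: a bag-of-words cosine similarity stands in for sentence-transformers embeddings, a plain Python list stands in for ChromaDB, and the LLM call is replaced by returning the best-matching chunk. The names `chunk_text`, `embed`, `retrieve`, and `answer_or_refuse`, the window sizes, and the `min_score` threshold are all hypothetical.

```python
import math
import re
from collections import Counter

REFUSAL = "I don't know"


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=3):
    """Return the k highest-scoring (score, chunk) pairs for the query."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def answer_or_refuse(query, chunks, min_score=0.2):
    """Refuse when no chunk is similar enough; otherwise return the top chunk.

    A real pipeline would pass the retrieved chunks to the LLM as context
    instead of returning the chunk text directly.
    """
    hits = retrieve(query, chunks, k=1)
    if not hits or hits[0][0] < min_score:
        return REFUSAL
    return hits[0][1]
```

With two invented sample chunks such as "The dining hall opens at 7 am on weekdays." and "The library closes at midnight during finals week.", a question about the dining hall returns the first chunk, while a question with no supporting chunk returns the refusal string. The threshold is the design lever here: set it too low and the assistant answers from weak matches, too high and it refuses questions it could have answered.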

Impact
  • Built and evaluated an end-to-end RAG pipeline from chunking to embedding to generation.
  • Added refusal behavior so the assistant can say it does not know instead of inventing answers.
  • Diagnosed a retrieval-recall failure by tracing the full pipeline instead of only tuning the model.
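One common way to surface a retrieval-recall failure like the one above is to measure recall@k on a small hand-labeled set of questions. The sketch below assumes each evaluation case records the chunk ids the retriever returned and the chunk ids known to contain the answer; the function names, the dictionary keys, and the chunk ids are hypothetical, and this is not necessarily how the failure was found in this project.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant chunk ids that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def find_recall_failures(eval_cases, k=3):
    """List questions whose relevant chunks were not all retrieved in the top k."""
    failures = []
    for case in eval_cases:
        score = recall_at_k(case["retrieved"], case["relevant"], k)
        if score < 1.0:
            failures.append((case["question"], score))
    return failures


# Hypothetical evaluation cases with made-up chunk ids.
cases = [
    {"question": "q1", "retrieved": ["c04", "c12", "c31"], "relevant": ["c12"]},
    {"question": "q2", "retrieved": ["c07", "c19", "c02"], "relevant": ["c40"]},
]
failures = find_recall_failures(cases, k=3)
```

A question that shows up in `failures` while its relevant chunk is present in the store points at chunking or retrieval rather than at the generation model.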