Abstract:
This report presents the development of a resource-efficient college chatbot assistant that uses Retrieval-Augmented Generation (RAG) to answer academic and administrative queries. The system automates document preprocessing and retrieval with LangChain and stores the resulting embeddings in a Qdrant vector database. A 4-bit quantized Phi-2 model in GGUF format, served through llama-cpp-python, enables fast inference on limited hardware. The chatbot also includes a history-aware retriever, which uses earlier conversation turns to produce context-aware responses, and a voice-to-text interface powered by WhisperAI.
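To make the retrieve-then-generate flow concrete, the following is a minimal, library-free sketch of the idea and not the report's implementation. It assumes a toy bag-of-words embedding in place of a real embedding model, an in-memory list in place of Qdrant, and a simple concatenation of the previous user turn in place of an LLM-based query rewrite. All function names (embed, retrieve, history_aware_query, build_prompt) and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (in-memory stand-in for a vector store)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def history_aware_query(history, question):
    """Fold the previous user turn into the query.

    Assumption: a real history-aware retriever would typically ask the LLM to
    rewrite the question as a standalone query; concatenation is a simplification.
    """
    return f"{history[-1]} {question}" if history else question


def build_prompt(history, question, chunks, k=2):
    """Assemble the prompt that would be passed to the language model."""
    context = retrieve(history_aware_query(history, question), chunks, k)
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}\nAnswer:"


# Hypothetical document chunks, for illustration only.
CHUNKS = [
    "The library opens at 9 am on weekdays.",
    "Admission fees are due in July.",
    "The hostel mess serves dinner at 8 pm.",
]

if __name__ == "__main__":
    print(build_prompt(["What are the admission fees?"], "When are they due?", CHUNKS, k=1))
```

In a full system, the prompt returned by build_prompt would be sent to the quantized model for generation, and the toy similarity search would be replaced by a query against the vector database.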