How to Build a RAG Chatbot with LangChain + Ollama

In this post, you'll learn how to build a powerful RAG (Retrieval-Augmented Generation) chatbot using LangChain and Ollama. We'll also walk through the full flow of adding documents to your agent dynamically!

Let's go step by step.


What Is a RAG Chatbot?

RAG stands for Retrieval-Augmented Generation. Instead of replying only from what it "knows" internally, a RAG chatbot first retrieves relevant documents and then generates an answer grounded in them.

This approach makes your chatbot:

  • More accurate
  • Up-to-date with external knowledge
  • Better at answering domain-specific questions
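
Conceptually, the flow is "retrieve, then generate." Here's a toy, framework-free sketch of that loop (the word-overlap scoring and the generate() stub are illustrative stand-ins for a real vector search and LLM):

def retrieve(query, docs, k=1):
    # Toy retrieval: rank documents by word overlap with the query
    words = set(query.lower().split())
    return sorted(docs, key=lambda d: len(words & set(d.lower().split())), reverse=True)[:k]

def generate(query, context):
    # Stand-in for an LLM call: a real system prompts a model with context + query
    return f"Based on '{context}': {query}"

docs = [
    "Bangkok is the capital of Thailand.",
    "Python was created by Guido van Rossum.",
]
top = retrieve("Who created Python?", docs)[0]
print(generate("Who created Python?", top))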

Tools We'll Use

  • LangChain: Framework for chaining LLMs with retrieval, tools, agents.
  • Ollama: Local LLM runner (models like Mistral, Llama3, etc.).
  • FAISS: Local vector search for fast document retrieval.

Step 1: Setup

First, install the libraries:

pip install langchain langchain-community ollama faiss-cpu

Make sure you have Ollama installed and running:
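
For example, with the Ollama CLI you can pull the two models used below and start the server if it isn't already running (the desktop app usually starts it for you):

ollama pull mistral
ollama pull nomic-embed-text
ollama serve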


Step 2: Basic RAG Chatbot Code

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.text_splitter import CharacterTextSplitter
from langchain_core.documents import Document

# 1. Documents
my_docs = [
    Document(page_content="Python is a programming language created by Guido van Rossum."),
    Document(page_content="LangChain helps build applications powered by language models."),
    Document(page_content="The capital of Thailand is Bangkok."),
]

# 2. Split Documents
splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
split_docs = splitter.split_documents(my_docs)

# 3. Embedding
embedder = OllamaEmbeddings(model="nomic-embed-text")

# 4. FAISS Vector Store
vectorstore = FAISS.from_documents(split_docs, embedder)

# 5. Retriever
retriever = vectorstore.as_retriever()

# 6. LLM (Ollama)
llm = Ollama(model="mistral")

# 7. RetrievalQA Chain
rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

# 8. Ask a Question
query = "Who created Python?"
result = rag_chain.invoke({"query": query})

print("Answer:", result["result"])
print("Sources:", result["source_documents"])

Step 3: Add New Documents to Your Agent

You can dynamically add documents to your chatbot without restarting everything.

Here is the conceptual workflow in Mermaid.js:

graph TD
    A[User Uploads New Document] --> B[Split into Chunks]
    B --> C[Embed Chunks]
    C --> D[Add Embeddings to Vectorstore]
    D --> E[Retriever Automatically Updated]
    E --> F[Chatbot Now Uses New Knowledge]

Python Code to Add a New Document:

# Suppose you get a new document
new_doc = Document(page_content="Django is a popular Python web framework.")

# 1. Split it into chunks
new_splits = splitter.split_documents([new_doc])

# 2. Add to FAISS (it embeds the chunks internally using the same embedder)
vectorstore.add_documents(new_splits)

# Now your retriever has fresh knowledge instantly!
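
If you want those additions to survive a restart, the FAISS index can be saved to disk and reloaded. A minimal sketch (the folder name "faiss_index" is arbitrary; recent langchain_community versions require allow_dangerous_deserialization=True because the index metadata is pickled):

# Save the index (writes the FAISS index plus a pickle of the docstore)
vectorstore.save_local("faiss_index")

# Later, reload it with the same embedding model
restored = FAISS.load_local(
    "faiss_index",
    embedder,
    allow_dangerous_deserialization=True,
)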

Conclusion

Building a RAG chatbot with LangChain + Ollama is powerful and flexible. You can:

  • Control your own models (no external APIs)
  • Add new knowledge live
  • Build highly domain-specific chatbots

In production, you can scale this by connecting to:

  • PDF loaders (see the sketch after this list)
  • Website scrapers
  • Database retrievers
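
For the PDF case, here's a minimal sketch using LangChain's PyPDFLoader (assumes pip install pypdf; the file path is hypothetical):

from langchain_community.document_loaders import PyPDFLoader

# Load a PDF page by page and push it through the same split -> add pipeline
loader = PyPDFLoader("manual.pdf")  # hypothetical path
pdf_docs = loader.load()
vectorstore.add_documents(splitter.split_documents(pdf_docs))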

The future is open-source, private, and customizable.

Stay tuned for Part 2 where we'll make the chatbot stream responses and keep conversation memory! 🚀
