Build a Local Product Recommendation System with LangChain, Ollama, and Open-Source Embeddings

In this post, you’ll learn how to create a fully local, privacy-friendly product recommendation engine for your e-commerce site using LangChain, Ollama (for LLMs), and open-source embeddings. No OpenAI API or external cloud needed—run everything on your machine or private server!

Why This Approach?

Keep your customer data private
Zero API cost—no pay-per-call fees
Use powerful open-source LLMs (like Llama 3, Mistral)
Flexible: works for product catalogs, FAQs, or any knowledge base

Solution Overview

We combine three key components:

SentenceTransformers for generating semantic product embeddings.
Chroma for efficient local vector search.
Ollama to run LLMs (like Llama 3) locally, generating human-like recommendations.

Data Flow Diagram

Here’s how data flows through the system:

flowchart TD
    U["User Query<br/>(e.g., 'waterproof running shoe for women')"]
    Q["LangChain<br/>Similarity Search"]
    V["Chroma Vector Store<br/>+ Embeddings"]
    P["Product Data<br/>(JSON, CSV, DB)"]
    R["Relevant Products"]
    LLM["Ollama LLM<br/>(Llama 3, Mistral, etc.)"]
    A["Final Recommendation<br/>(Chatbot Response)"]

    U --> Q
    Q --> V
    V -->|Top Matches| R
    R --> LLM
    LLM --> A
    P --> V

Flow:

User enters a query.
LangChain searches for the most relevant products using embeddings and Chroma.
The matched products are passed to the LLM (via Ollama) to generate a friendly, personalized recommendation.
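
The three steps above can be sketched as a single function before any real library is wired in. This is a minimal illustration, not the final implementation: `recommend`, `fake_search`, and `fake_generate` are hypothetical names, and the two stand-in functions only mimic what Chroma and Ollama would do.

```python
from typing import Callable, List, Tuple

# Hypothetical alias for this sketch: a product is a (name, description) pair.
Product = Tuple[str, str]


def recommend(
    query: str,
    search: Callable[[str, int], List[Product]],
    generate: Callable[[str], str],
    k: int = 2,
) -> str:
    """Run the flow: take a query, retrieve products, ask the LLM."""
    matches = search(query, k)  # step 2: similarity search
    context = "\n".join(f"{name}: {desc}" for name, desc in matches)
    prompt = f"Customer request: {query}\n\nCandidate products:\n{context}"
    return generate(prompt)  # step 3: the LLM writes the reply


# Stand-ins so the sketch runs without Chroma or Ollama installed.
def fake_search(query: str, k: int) -> List[Product]:
    catalog = [("Nike Pegasus 39", "Waterproof women's running shoe")]
    return catalog[:k]


def fake_generate(prompt: str) -> str:
    return f"[LLM would answer here, given {len(prompt)} characters of prompt]"


answer = recommend("waterproof running shoe for women", fake_search, fake_generate)
print(answer)
```

Passing the search and generation steps in as callables keeps the flow testable; the real vector store and LLM calls can be swapped in later without changing `recommend`.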

Step-by-Step Implementation

1. Prepare Product Data

Store your product catalog in a structured format such as JSON:

[
  {
    "id": "1",
    "name": "Nike Pegasus 39",
    "description": "Waterproof women's running shoe",
    "category": "Running Shoes",
    "tags": ["waterproof", "running", "women"]
  },
  ...
]
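
Because the description is what gets embedded later, it can help to check the catalog before indexing it. The sketch below is one possible sanity check, assuming every record carries the five fields shown above; `validate_products` is a hypothetical helper, and the second sample record is invented purely to trigger the checks.

```python
import json

REQUIRED_FIELDS = ("id", "name", "description", "category", "tags")


def validate_products(products):
    """Return a list of problems; an empty list means the catalog looks usable."""
    problems = []
    seen_ids = set()
    for index, product in enumerate(products):
        missing = [field for field in REQUIRED_FIELDS if field not in product]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
            continue
        if product["id"] in seen_ids:
            problems.append(f"record {index}: duplicate id {product['id']!r}")
        seen_ids.add(product["id"])
        if not str(product["description"]).strip():
            problems.append(f"record {index}: empty description")
        if not isinstance(product["tags"], list):
            problems.append(f"record {index}: tags should be a list")
    return problems


# First record is from the example above; the second is a made-up bad record.
sample = json.loads("""
[
  {"id": "1", "name": "Nike Pegasus 39",
   "description": "Waterproof women's running shoe",
   "category": "Running Shoes", "tags": ["waterproof", "running", "women"]},
  {"id": "1", "name": "Duplicate entry", "description": "",
   "category": "Running Shoes", "tags": "running"}
]
""")

for problem in validate_products(sample):
    print(problem)
```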

2. Install Required Packages

pip install langchain-community langchain-core chromadb sentence-transformers ollama

Make sure Ollama is installed, its server is running, and your chosen model has been downloaded (e.g., ollama pull llama3).
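
To confirm the setup before running the main script, one option is to ask the local Ollama server which models it has. The sketch below assumes the server listens on its default address (http://localhost:11434) and that its /api/tags endpoint returns a JSON object with a "models" list whose entries have a "name" such as "llama3:latest". `model_in_tags` and `ollama_has_model` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request


def model_in_tags(payload: dict, model: str) -> bool:
    """Check whether `model` appears in a parsed /api/tags response."""
    names = [entry.get("name", "") for entry in payload.get("models", [])]
    # Models are listed as "name:tag" (for example "llama3:latest"), so
    # compare on the part before the colon when no tag was requested.
    if ":" in model:
        return model in names
    return any(name.split(":")[0] == model for name in names)


def ollama_has_model(model: str, base_url: str = "http://localhost:11434") -> bool:
    """Return True if the server is reachable and reports the model."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, wrong address, or an unexpected response body.
        return False
    return model_in_tags(payload, model)


if __name__ == "__main__":
    if ollama_has_model("llama3"):
        print("llama3 is available")
    else:
        print("Ollama is not reachable or llama3 is missing; try: ollama pull llama3")
```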

3. Python Code: Bringing It All Together

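# Note: recent LangChain releases deprecate these community imports in favor of the
# separate langchain-ollama, langchain-chroma, and langchain-huggingface packages.
# The imports below still work on versions that ship them.
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import SentenceTransformerEmbeddings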
import json

# Load product data
with open('products.json', encoding='utf-8') as f:
    products = json.load(f)

texts = [p['description'] for p in products]
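# Many Chroma versions accept only scalar metadata values (str, int, float, bool),
# so the tags list is joined into a single string here.
metadatas = [{"id": p["id"], "name": p["name"], "category": p["category"], "tags": ", ".join(p["tags"])} for p in products]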

# Generate embeddings
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# Build vector store
vectorstore = Chroma.from_texts(texts, embeddings, metadatas=metadatas)

# User query
query = "waterproof running shoe for women"
results = vectorstore.similarity_search(query, k=2)

print("Recommended products:")
for r in results:
    print("-", r.metadata['name'], "|", r.page_content)

# LLM: Generate final recommendation
llm = Ollama(model="llama3")
context = "\n".join([f"{r.metadata['name']}: {r.page_content}" for r in results])
user_question = f"Which of these products would you recommend for a woman who needs waterproof running shoes?\n\n{context}"

response = llm.invoke(user_question)
print("\nChatbot answer:")
print(response)
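
The script above hard-codes the question sent to the LLM. One way to make it reusable is to build the prompt from the user's query and the search results. The sketch below is an assumption-laden illustration: `build_prompt` is a hypothetical helper, the wording of the instructions is only a suggestion, and `Doc` is a stand-in for the search results, which expose `page_content` and `metadata` in the same way.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a search result with page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_prompt(query: str, results) -> str:
    """Turn the user's query and the retrieved products into one prompt string."""
    lines = [
        f"{i}. {r.metadata.get('name', 'Unknown product')}: {r.page_content}"
        for i, r in enumerate(results, start=1)
    ]
    return (
        "You are a helpful shopping assistant.\n"
        f"Customer request: {query}\n\n"
        "Candidate products:\n"
        + "\n".join(lines)
        + "\n\nRecommend only from the candidates above and explain why in two or three sentences."
    )


results = [Doc("Waterproof women's running shoe", {"name": "Nike Pegasus 39"})]
print(build_prompt("waterproof running shoe for women", results))
```

With the real pipeline, the returned string would be passed to llm.invoke in place of user_question. Telling the model to choose only from the listed candidates is a simple way to discourage it from inventing products that are not in the catalog.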

How Does It Work?

Semantic Search: When the user asks for a product, we don’t just do keyword search—we find the closest matches in meaning using embeddings.
Chroma Vector DB: Handles fast, efficient similarity search on your local machine.
Ollama LLM: Receives the search results and generates a natural-language reply that reads like advice from a knowledgeable product expert.
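
To make "closest matches in meaning" concrete, here is a toy version of similarity search using cosine similarity, one common way to compare embedding vectors. The three-dimensional vectors and product names are made up for illustration (all-MiniLM-L6-v2 produces 384-dimensional vectors), and Chroma's own distance metric and indexing may differ from this brute-force sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, items, k=2):
    """items is a list of (name, vector); return the k names closest to the query."""
    ranked = sorted(
        items,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up vectors: the first axis loosely stands for "shoe-like" meaning.
catalog = [
    ("trail shoe", [0.9, 0.1, 0.0]),
    ("rain jacket", [0.1, 0.9, 0.1]),
    ("road shoe", [0.8, 0.0, 0.2]),
]
query = [1.0, 0.0, 0.0]

print(top_k(query, catalog))  # ['trail shoe', 'road shoe']
```

A vector store does the same ranking, but with an index that avoids comparing the query against every stored vector.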