# Teach Your RAG Agent to Remember

**Workflow: RAG with Memory**

A powerful Retrieval-Augmented Generation (RAG) chatbot built with n8n, combining document knowledge with conversational AI and persistent memory.
## Features

- Intelligent Chat Agent: Powered by OpenAI GPT-4o-mini for natural conversations
- Persistent Memory: Remembers conversation history using PostgreSQL
- Document Search: Semantic search through your knowledge base
- Document Processing: Automated PDF ingestion and vectorization
- Real-time Chat: Webhook-based chat interface for instant responses
## Architecture

This workflow has two main pipelines:

### 1. Document Processing (Setup)

Manual Trigger → Google Drive Download → Text Splitter → Document Loader → Embeddings → Vector Store

### 2. Chat Interface (Runtime)

Chat Trigger → RAG Agent → [Memory + LLM + Vector Search] → Response
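Before wiring up credentials, it can help to see the runtime pipeline as plain code. The sketch below is illustrative only: the dictionary, list, and helper functions are hypothetical stand-ins for the n8n nodes (Postgres memory, Supabase vector search, the OpenAI model), not real APIs from this workflow.

```python
# Self-contained sketch of the chat pipeline: memory + retrieval + LLM.
# Everything here is a stand-in so the flow is runnable without any real services.
from typing import Dict, List

HISTORY: Dict[str, List[str]] = {}  # stand-in for the PostgreSQL memory table
DOCS = [
    "AWS Lambda is a serverless compute service.",
    "Lambda pricing is based on requests and execution duration.",
]

def vector_search(query: str, k: int = 2) -> List[str]:
    # Stand-in for the Supabase vector search tool: naive keyword match.
    return [d for d in DOCS if any(w.lower() in d.lower() for w in query.split())][:k]

def call_llm(prompt: str) -> str:
    # Stand-in for the gpt-4o-mini chat model node.
    return "(model answer grounded in)\n" + prompt

def handle_chat(session_id: str, user_message: str) -> str:
    history = HISTORY.setdefault(session_id, [])   # load persistent memory
    context = vector_search(user_message)          # retrieve relevant chunks
    prompt = "\n".join(history + context + ["User: " + user_message])
    answer = call_llm(prompt)
    history += ["User: " + user_message, "Agent: " + answer]  # persist the turn
    return answer

print(handle_chat("demo", "What is AWS Lambda?"))
```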
## Quick Start

### Prerequisites

- An n8n instance (cloud or self-hosted)
- OpenAI API key with GPT-4o-mini access
- PostgreSQL database (for memory)
- Supabase account (for vector storage)
- Google Drive API access
### Required Credentials in n8n
| Service | Purpose |
|---|---|
| OpenAI | Chat model + embeddings |
| PostgreSQL | Stores persistent chat memory |
| Supabase | Hosts vector embeddings |
| Google Drive | Downloads source PDFs |
## Setup Instructions

### Step 1: Document Processing (Run Once)

- Upload a PDF to Google Drive
- Replace the file ID in the `Download File` node
- Trigger the workflow manually
- Check Supabase to confirm the vectors were stored
### Step 2: Chat Interface (Always Running)

- Activate the webhook trigger
- Copy the webhook URL for your chat frontend
- POST messages to the URL (see the example request below)
- Receive document-grounded responses
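A minimal request sketch is shown below. The URL is a placeholder, and the field names (`chatInput`, `sessionId`) are the common defaults for n8n's Chat Trigger; verify them against your node's configuration before relying on them.

```python
# Hedged example: POST a chat message to the workflow's webhook.
import requests

WEBHOOK_URL = "https://your-n8n-instance/webhook/<your-webhook-id>/chat"  # placeholder

payload = {
    "sessionId": "demo-session-1",       # keeps memory scoped to one conversation
    "chatInput": "What is AWS Lambda?",  # the user's message
}

resp = requests.post(WEBHOOK_URL, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # the agent's document-grounded answer
```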
## Configuration

### Document Settings

- Text Splitter: Recursive character splitter
- Embedding Model: `text-embedding-ada-002` (see the sketch below)
- Vector Store Table: `documents` (Supabase)
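For reference, the same chunking and embedding step can be reproduced outside n8n. This is a hedged sketch, assuming the `langchain-text-splitters` and `openai` packages are installed; the chunk size, overlap, and `manual.txt` file are illustrative placeholders, not the workflow's exact values.

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
from openai import OpenAI

# Split plain text (e.g. extracted from the PDF) into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(open("manual.txt").read())  # hypothetical input file

# Embed each chunk with the same model the workflow uses.
client = OpenAI()  # reads OPENAI_API_KEY from the environment
embeddings = client.embeddings.create(
    model="text-embedding-ada-002",
    input=chunks,
)
vectors = [item.embedding for item in embeddings.data]
print(len(chunks), "chunks,", len(vectors[0]), "dimensions per vector")
```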
### Agent Settings

- System Message: `"You are a helpful assistant."`
- Model: `gpt-4o-mini` (minimal example below)
- Tools: `aws_knowledge_base` (vector search tool)
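The base model and system message map directly onto a plain OpenAI chat call. The sketch below shows just that core; the tool/vector-search wiring that n8n adds around the agent is omitted.

```python
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is AWS Lambda?"},
    ],
)
print(resp.choices[0].message.content)
```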
### Memory Configuration

- Type: PostgreSQL memory
- Persistence: Across sessions (see the check below)
- Context: Previous messages are included in prompts
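To confirm memory is persisting, you can query the database directly. This is a sketch under assumptions: `n8n_chat_histories` is the usual default table for n8n's Postgres Chat Memory node, and the `id`/`session_id`/`message` columns follow LangChain's Postgres chat-history schema; check both against your own setup, and replace the connection placeholders.

```python
import psycopg2

# Placeholder connection details -- substitute your own.
conn = psycopg2.connect(host="localhost", dbname="n8n", user="n8n", password="...")
with conn, conn.cursor() as cur:
    cur.execute(
        "SELECT session_id, message FROM n8n_chat_histories ORDER BY id DESC LIMIT 5"
    )
    for session_id, message in cur.fetchall():
        print(session_id, message)  # most recent stored turns
```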
## Customization

### Add New Docs

- Upload to Drive → Update the file ID → Run processing

### Modify Behavior

- Edit the system message
- Change the model (e.g. `gpt-4`)
- Tune chunk size and retrieval count

### Extend Functionality

- Add web scraping
- Use different loaders (e.g., DOCX, CSV), as in the sketch below
- Add API integrations or custom tools
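As one example of swapping in a different loader, the hedged sketch below loads a CSV instead of a PDF, assuming the `langchain-community` package is installed; the resulting documents would then feed into the same splitter and embedding steps. The `faq.csv` file is hypothetical.

```python
from langchain_community.document_loaders import CSVLoader

loader = CSVLoader(file_path="faq.csv")  # hypothetical file
docs = loader.load()                     # one Document per CSV row
print(len(docs), "documents loaded")
```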
## Example Use Cases

- User: "What is AWS Lambda?" → The agent searches the docs and answers.
- User: "How do I deploy it?" → The agent combines conversation memory with the relevant documentation.
- User: "What about pricing?" → The agent infers from context that Lambda pricing is meant.
## Troubleshooting

### No Document Results?

- Ensure vectors were actually stored (see the check below)
- Validate the embeddings format
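A quick sanity check is to count rows in the Supabase `documents` table. This sketch uses the `supabase-py` client; the project URL and key are placeholders.

```python
from supabase import create_client

supabase = create_client("https://<project>.supabase.co", "<service-role-key>")  # placeholders
rows = supabase.table("documents").select("id", count="exact").limit(1).execute()
print("stored rows:", rows.count)  # 0 means ingestion never wrote any vectors
```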
### Memory Not Saving?

- Verify the database connection and tables
- Check credentials and permissions
### Chat Issues?

- Confirm the webhook URL and request shape
- Test with Postman or a simple client, as in the snippet below
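A small smoke test (an alternative to Postman) that prints the raw status and body, so errors from n8n are visible. The URL and field names are placeholders; match them to your Chat Trigger configuration.

```python
import requests

resp = requests.post(
    "https://your-n8n-instance/webhook/<your-webhook-id>/chat",  # placeholder
    json={"sessionId": "debug", "chatInput": "ping"},
    timeout=30,
)
print(resp.status_code)  # 404 usually means the workflow is not active
print(resp.text)         # n8n's error payload explains bad request shapes
```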
## Optimization

### Cost

- Use GPT-4o-mini
- Optimize the chunking strategy
- Cache frequent responses
### Speed

- Tune similarity thresholds
- Run document processing asynchronously
- Use indexed vector search (see the index sketch below)
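On Supabase, indexed vector search usually means adding a pgvector index so similarity queries stop scanning the whole table. The sketch below assumes the common LangChain/Supabase layout where the `documents` table has an `embedding` vector column; adjust names and the connection string (a placeholder here) to your schema, or run the same SQL in the Supabase SQL editor.

```python
import psycopg2

# Placeholder connection string -- use your Supabase database credentials.
conn = psycopg2.connect("postgresql://postgres:<password>@db.<project>.supabase.co:5432/postgres")
with conn, conn.cursor() as cur:
    cur.execute(
        "CREATE INDEX IF NOT EXISTS documents_embedding_idx "
        "ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100)"
    )
print("index created (or already present)")
```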
## Security

- Store credentials securely in n8n
- Use read-only database access where possible
- Add webhook authentication (example below)
- Define data retention policies
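If you enable header authentication on the webhook/Chat Trigger node, clients must send the matching header. This is a sketch with placeholder values; the header name and secret are whatever you configure in the node's Header Auth credential.

```python
import requests

resp = requests.post(
    "https://your-n8n-instance/webhook/<your-webhook-id>/chat",  # placeholder
    headers={"X-Api-Key": "<shared-secret>"},  # must match the Header Auth credential
    json={"sessionId": "demo", "chatInput": "Hello"},
    timeout=30,
)
resp.raise_for_status()
```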
## Resources

- Original Tutorial by Nate Herk
- n8n Docs
- LangChain Node Integration
## Contributing

We welcome:

- Bug reports & PRs
- New document types
- Code improvements & tooling
## Learn More

Want to build and customize more AI agents like this?

- AI Bootcamp: Generative AI Beyond the Hype
- Agent Engineering Bootcamp: Developers Edition
- GitHub: Agents in Action
## License

Provided for educational and practical use. Please comply with the terms of service of the APIs you use.

Special thanks to Nate Herk for the original workflow inspiration.

Don't forget to check out my Agentic AI System Design for PMs course on Maven if you're interested in being part of something bigger.
