Build Your Own Perplexity Clone
Workflow: Build Your Own Perplexity Clone
This blog post contains the Webhook + RAG + Internet Search workflow for n8n.
It extends the 301 workflow by adding a web search fallback for non-S3 AWS questions. S3 questions still go through a vector store (RAG), and non-AWS topics are politely refused.
Overview
This workflow demonstrates tool-routed answering inside an n8n Agent:
- S3 questions → RAG (vector store built from your S3 docs)
- Other AWS questions → Internet Search (up-to-date info)
- Non-AWS questions → respectful refusal
Learners see how an agent can classify a question, choose a tool, and ground its answer.
How It Works
Ingestion (one-time / as needed)
graph LR
  MT["Manual Trigger"] --> GD["Google Drive: Download File"]
  GD --> DL["Data Loader"]
  DL --> TS["Text Splitter"]
  TS --> EM["Embeddings"]
  EM --> VS["Vector Store (S3 KB)"]
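To make the Text Splitter step concrete, here is a minimal sketch in plain Python of splitting a document into overlapping chunks before embedding. It is not the n8n node's implementation: the function name `split_text` and the `chunk_size` and `overlap` defaults are assumptions chosen for illustration.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Toy stand-in for a text splitter: fixed-size character windows.

    chunk_size and overlap are illustrative values, not n8n defaults.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.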
Runtime (per request)
graph LR
  WB["Webhook (POST)"] --> AG["AI Agent (Router)"]
  AG --> MEM["Memory (sessionKey = username)"]
  AG --> LLM["OpenAI Chat Model"]
  AG --> S3["s3_knowledge_base (RAG)"]
  AG --> NET["internet_search (HTTP Tool)"]
  AG --> RSP["Respond to Webhook"]
- Webhook receives { query, username }.
- AI Agent classifies:
  - If S3 → query s3_knowledge_base (RAG) and answer.
  - If AWS but not S3 → call internet_search and answer from results.
  - If non-AWS → refuse.
- Memory keeps context per username.
- Respond to Webhook returns the final answer.
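The routing rule above can be sketched in a few lines of Python. In the real workflow the classification is done by the LLM inside the AI Agent, so the keyword sets below (`S3_TERMS`, `AWS_TERMS`) and the `route` and `handle` helpers are hypothetical stand-ins that only illustrate the decision order and the per-username memory.

```python
import re
from collections import defaultdict

# Hypothetical keyword sets; the actual agent relies on the LLM, not keywords.
S3_TERMS = {"s3", "bucket", "buckets", "glacier"}
AWS_TERMS = {"aws", "lambda", "ec2", "iam", "dynamodb"}

# Conversation history keyed by username, mirroring sessionKey = username.
memory = defaultdict(list)

def route(query: str) -> str:
    """Return the tool name for a query, or 'refuse' for non-AWS topics."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    if words & S3_TERMS:          # S3 is checked first: it is the narrower case
        return "s3_knowledge_base"
    if words & AWS_TERMS:         # any other AWS topic falls back to search
        return "internet_search"
    return "refuse"

def handle(query: str, username: str) -> str:
    """Route a query and record the turn in that user's memory."""
    decision = route(query)
    memory[username].append((query, decision))
    return decision
```

Checking S3 before the broader AWS case matters: an S3 question is also an AWS question, so the order of the checks is what sends it to RAG rather than search.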
Architecture
[Figure: architecture diagram]
Inputs (JSON Body)
- query (string, required): user question.
- username (string, recommended): stable ID for memory.
Example
{
"query": "What's the difference between S3 Standard and S3 Glacier?",
"username": "demo-user-1"
}
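As a sketch of the input contract, the check below treats `query` as required and `username` as optional. The function name `validate_body` and the error messages are illustrative assumptions; the workflow itself may handle missing fields differently.

```python
from typing import Optional

def validate_body(body: dict) -> tuple[str, Optional[str]]:
    """Check a parsed JSON body against the documented input fields."""
    query = body.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("'query' is required and must be a non-empty string")
    username = body.get("username")  # recommended, so None is allowed
    if username is not None and not isinstance(username, str):
        raise ValueError("'username' must be a string when provided")
    return query.strip(), username
```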
Output
- HTTP 200 with the agent's answer.
- Replies indicate source:
  - (Answer based on S3 knowledge base)
  - (Answer enriched with Internet Search results)
  - (Refusal: non-AWS topic)
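A client can use these tags to check which path the agent took. The sketch below assumes the tag appears verbatim at the end of the reply text; the `source_mode` helper and the short mode labels are hypothetical.

```python
# Tags copied from the list above; the short labels are illustrative.
SOURCE_TAGS = {
    "(Answer based on S3 knowledge base)": "rag",
    "(Answer enriched with Internet Search results)": "search",
    "(Refusal: non-AWS topic)": "refusal",
}

def source_mode(reply: str) -> str:
    """Return which path produced the reply, based on its closing tag."""
    text = reply.rstrip()
    for tag, mode in SOURCE_TAGS.items():
        if text.endswith(tag):
            return mode
    return "unknown"  # tag missing or worded differently
```

An "unknown" result is a useful signal during testing that the agent did not follow its tagging instructions.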
Setup
- Import perplexity-clone.json into n8n Cloud.
- Credentials
  - OpenAI (for the Agent's LLM)
  - Google Drive (document download for S3 KB)
  - Internet Search tool (set the x-api-key header in the HTTP Request Tool)
- Activate the workflow and copy the Production Webhook URL.
- (Optional) Update Google Drive → fileId to your own S3 reference doc and run the Manual Trigger to rebuild the vector store.
Tip: Keep temperature low (0.1–0.2) in the OpenAI node so the agent follows tool rules reliably.
Try It
Option A: Google Colab (Recommended)
- Open the instructor's Colab: Webhook Client (Colab)
- Click Copy to Drive to make it editable.
- In n8n, activate this 401 workflow and copy the Production Webhook URL (not the Test URL).
- In your Colab copy, replace the webhook variable (url or WEBHOOK_URL) with the Production URL.
- Run all cells. Try:
  - S3 (RAG expected): "How do I enable S3 versioning?"
  - AWS non-S3 (Search expected): "What is AWS Lambda?"
  - Non-AWS (Refusal): "Tell me about Paris."
Tip: Use the same username to observe memory continuity.
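The instructor's Colab is not reproduced here. As a rough equivalent, a minimal Python client using only the standard library might look like the sketch below. The helper names `build_request` and `ask` are assumptions, the URL is a placeholder, and the response is returned as raw text because the exact response format depends on how Respond to Webhook is configured.

```python
import json
import urllib.request

WEBHOOK_URL = "https://<your-n8n>/webhook/<id>"  # placeholder: use your Production URL

def build_request(url: str, query: str, username: str) -> urllib.request.Request:
    """Build the POST request with the documented JSON body."""
    body = json.dumps({"query": query, "username": username}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(url: str, query: str, username: str, timeout: float = 60.0) -> str:
    """Send the question and return the raw response text."""
    request = build_request(url, query, username)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")

# Example call (requires an active workflow and a real URL):
# print(ask(WEBHOOK_URL, "How do I enable S3 versioning?", "demo-user-1"))
```

Calling `ask` several times with the same `username` should let you observe the memory continuity described above.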
Option B: cURL
WEBHOOK_URL="https://<your-n8n>/webhook/<id>" # Production URL
curl -X POST "$WEBHOOK_URL" \
-H "Content-Type: application/json" \
-d '{"query":"What is AWS Lambda?","username":"demo-user-1"}'
Option C: Postman
- New POST → Production Webhook URL
- Body → raw → JSON:
  { "query": "How do I enable S3 versioning?", "username": "demo-user-1" }
- Send → view response.
Teaching Notes
- Routing pattern: Students see S3 → RAG vs. other AWS → search.
- Guardrails: Non-AWS questions are politely declined.
- Grounding: Answers always cite source mode in the closing tag.
- Maintainability: Docs can be refreshed without changing the runtime flow.
Troubleshooting
- Refuses AWS question: Ensure tool names in the Agent match node names (s3_knowledge_base, internet_search) and the Internet Search API key is set.
- Schema errors: Internet Search expects {"query": ["..."]} (array of strings). The S3 tool expects {"query": "..."} (string).
- No response / 404: Workflow may not be Active; use the Production webhook URL.
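The schema difference between the two tools is easy to get wrong, so here is a small sketch of both input shapes side by side. The helper names `s3_tool_input` and `search_tool_input` are hypothetical; only the two JSON shapes come from the notes above.

```python
def s3_tool_input(question: str) -> dict:
    """S3 knowledge base tool: 'query' is a single string."""
    return {"query": question}

def search_tool_input(questions) -> dict:
    """Internet Search tool: 'query' is an array of strings."""
    if isinstance(questions, str):
        questions = [questions]  # wrap a lone question so the shape stays a list
    return {"query": list(questions)}
```

Passing a bare string to the search tool, or a list to the S3 tool, is the kind of mismatch that would surface as a schema error.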
References
- Amazon S3 Getting Started Guide
- n8n: Simple Vector Store node
- n8n: RAG in n8n
- n8n: HTTP Request node
- traversaal.ai: Ares API Documentation
Don't forget to check out my Agentic AI System Design for PMs course on Maven if you'd like to be part of something bigger.
These resources expand on the workflows here and show how to apply AI + n8n in real projects.
