dify-kb-search
Search a Dify Knowledge Base (Dataset) to get accurate context for RAG-enhanced answers.
Install via ClawdBot CLI:
clawdbot install xiaowenzhou/dify-kb-search
Search your Dify Knowledge Base to get accurate, contextual answers.
This skill enables AI agents to query Dify datasets for RAG (Retrieval-Augmented Generation) context retrieval. Perfect for knowledge base Q&A, documentation search, and contextual AI responses.
Set up in openclaw.json:
{
"env": {
"vars": {
"DIFY_API_KEY": "${DIFY_API_KEY}",
"DIFY_BASE_URL": "https://dify.example.com/v1"
}
}
}
Environment Variables:
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| DIFY_API_KEY | ✅ Yes | - | Your Dify API key (from Settings → API) |
| DIFY_BASE_URL | ❌ No | http://localhost/v1 | Base URL of your Dify instance |
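A minimal sketch of how the two variables above might be read, assuming the table's default for DIFY_BASE_URL. The helper name `load_dify_config` is hypothetical and is not taken from the skill's own scripts.

```python
import os

# Default taken from the environment variable table above.
DEFAULT_BASE_URL = "http://localhost/v1"


def load_dify_config(env=None):
    """Return (api_key, base_url) from an environment-like mapping.

    Hypothetical helper: the skill's real scripts may load config differently.
    """
    env = os.environ if env is None else env
    api_key = (env.get("DIFY_API_KEY") or "").strip()
    if not api_key:
        # Mirrors the "Missing DIFY_API_KEY" error named in this document.
        raise RuntimeError("Missing DIFY_API_KEY")
    # Fall back to the default when the variable is unset or empty,
    # and drop any trailing slash so paths can be appended safely.
    base_url = (env.get("DIFY_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")
    return api_key, base_url
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test without touching the real environment.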
pip3 install requests
Lists all available knowledge bases (datasets) in your Dify instance.
Invocation: dify_list tool
Example Response:
{
"status": "success",
"count": 2,
"datasets": [
{
"id": "dataset-abc123",
"name": "Product Documentation",
"doc_count": 42,
"description": "All product guides and tutorials"
},
{
"id": "dataset-xyz789",
"name": "API Reference",
"doc_count": 156,
"description": "REST API documentation"
}
]
}
Usage:
{}
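Because dify_search accepts an optional dataset_id, it can help to look an ID up by name in the dify_list response. The sketch below assumes the response shape shown in the example above; `find_dataset_id` is a hypothetical helper, not part of the skill.

```python
def find_dataset_id(list_response, name):
    """Return the id of the first dataset whose name matches, else None.

    Matching is case-insensitive. Assumes the dify_list response shape
    from the example in this document (status, count, datasets).
    """
    if list_response.get("status") != "success":
        return None
    wanted = name.strip().lower()
    for dataset in list_response.get("datasets", []):
        if dataset.get("name", "").strip().lower() == wanted:
            return dataset.get("id")
    return None


# Sample data copied from the example response above.
sample_list = {
    "status": "success",
    "count": 2,
    "datasets": [
        {"id": "dataset-abc123", "name": "Product Documentation", "doc_count": 42},
        {"id": "dataset-xyz789", "name": "API Reference", "doc_count": 156},
    ],
}
api_docs_id = find_dataset_id(sample_list, "api reference")
```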
Searches a Dify Dataset for relevant context chunks.
Invocation: dify_search tool (mapped to python3 scripts/search.py)
Parameters:
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| query | string | ✅ Yes | - | Search query or question |
| dataset_id | string | ❌ No | Auto-discover | Specific dataset ID to search |
| top_k | integer | ❌ No | 3 | Number of results to return |
| search_method | string | ❌ No | hybrid_search | Search strategy |
| reranking_enable | boolean | ❌ No | false | Enable reranking for better results |
Search Methods:
- hybrid_search: combines semantic and keyword search (recommended)
- semantic_search: meaning-based similarity search
- keyword_search: exact keyword matching
Example Usage:
{
"query": "How do I configure OpenClaw?",
"top_k": 5
}
{
"query": "API authentication methods",
"dataset_id": "dataset-xyz789",
"search_method": "semantic_search",
"reranking_enable": true
}
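The parameter table and search methods above can be checked before a call is made. This is a sketch only: `build_search_params` is a hypothetical helper, and the validation rules are assumptions drawn from the table, not behavior confirmed for scripts/search.py.

```python
SEARCH_METHODS = ("hybrid_search", "semantic_search", "keyword_search")


def build_search_params(query, dataset_id=None, top_k=3,
                        search_method="hybrid_search", reranking_enable=False):
    """Build a dify_search parameter dict using the documented defaults."""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    if search_method not in SEARCH_METHODS:
        raise ValueError(f"search_method must be one of {SEARCH_METHODS}")
    params = {
        "query": query.strip(),
        "top_k": top_k,
        "search_method": search_method,
        "reranking_enable": bool(reranking_enable),
    }
    # Omit dataset_id so the tool can auto-discover a dataset.
    if dataset_id:
        params["dataset_id"] = dataset_id
    return params
```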
Example Response:
{
"status": "success",
"query": "How do I configure OpenClaw?",
"dataset_id": "dataset-abc123",
"count": 2,
"results": [
{
"content": "To configure OpenClaw, edit the openclaw.json file...",
"score": 0.8923,
"title": "Installation Guide",
"document_id": "doc-001"
},
{
"content": "OpenClaw supports environment variables via...",
"score": 0.8451,
"title": "Configuration Options",
"document_id": "doc-002"
}
]
}
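Each result carries a score and a title, so a caller may want to drop weak matches and keep the source title next to each chunk. The sketch below assumes the response shape shown above; `select_chunks` and its `min_score` cutoff are hypothetical and not part of the skill.

```python
def select_chunks(search_response, min_score=0.0):
    """Return "[title] content" strings for results scoring at least min_score.

    Results are ordered from highest to lowest score. A response whose
    status is not "success" yields an empty list.
    """
    if search_response.get("status") != "success":
        return []
    kept = [
        r for r in search_response.get("results", [])
        if r.get("score", 0.0) >= min_score
    ]
    kept.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    return [f'[{r.get("title", "untitled")}] {r["content"]}' for r in kept]


# Sample data copied from the example response above.
sample_search = {
    "status": "success",
    "results": [
        {"content": "To configure OpenClaw, edit the openclaw.json file...",
         "score": 0.8923, "title": "Installation Guide", "document_id": "doc-001"},
        {"content": "OpenClaw supports environment variables via...",
         "score": 0.8451, "title": "Configuration Options", "document_id": "doc-002"},
    ],
}
strong = select_chunks(sample_search, min_score=0.85)
```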
[
{
"tool": "dify_list",
"parameters": {}
},
{
"tool": "dify_search",
"parameters": {
"query": "What are the system requirements?",
"top_k": 5,
"search_method": "hybrid_search"
}
}
]
| Error | Solution |
|-------|----------|
| Missing DIFY_API_KEY | Set DIFY_API_KEY in environment variables |
| Connection refused | Check DIFY_BASE_URL is correct and accessible |
| No datasets found | Verify dataset exists in your Dify workspace |
| API request failed | Check network connectivity and API key permissions |
Run manually to see detailed errors:
DIFY_API_KEY=your-key python3 scripts/search.py <<< '{"query":"test"}'
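The same manual run can be scripted from Python. This sketch assumes the tool reads one JSON object on stdin (as the shell command above does) and prints one JSON object on stdout, which the example responses suggest but this document does not state outright. `run_json_tool` is a hypothetical helper.

```python
import json
import os
import subprocess
import sys


def run_json_tool(command, payload, extra_env=None, timeout=30):
    """Run a command with payload as JSON on stdin and parse JSON from stdout."""
    env = dict(os.environ)
    env.update(extra_env or {})
    proc = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        env=env,
        timeout=timeout,
    )
    if proc.returncode != 0:
        # Surface stderr so the underlying error is visible to the caller.
        raise RuntimeError(proc.stderr.strip() or f"exit code {proc.returncode}")
    return json.loads(proc.stdout)


# Stand-in command that echoes its stdin back, used here instead of the real script.
echo_cmd = [sys.executable, "-c",
            "import sys, json; print(json.dumps({'echo': json.load(sys.stdin)}))"]
echoed = run_json_tool(echo_cmd, {"query": "test"})

# Against the real script this might look like:
# run_json_tool(["python3", "scripts/search.py"], {"query": "test"},
#               extra_env={"DIFY_API_KEY": "your-key"})
```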
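# Example: use search results in an AI response.
# dify_search stands for the agent-side call to the dify_search tool;
# query is the user's question as a string.
results = dify_search(query, top_k=5)
context = "\n".join(r["content"] for r in results["results"])
final_prompt = f"Answer based on context:\n\n{context}\n\nQuestion: {query}"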
For searching across multiple datasets, loop through them:
{
"query": "Find information about authentication",
"dataset_id": "dataset-api-docs"
}
Then query another dataset separately.
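The loop described above could be sketched as follows. `search_many` is a hypothetical helper that takes any callable with dify_search's parameters. Merging by score is an assumption of this sketch: scores from different datasets or search methods may not be directly comparable.

```python
def search_many(search, query, dataset_ids, top_k=3):
    """Query each dataset in turn and return the top_k results overall."""
    merged = []
    for dataset_id in dataset_ids:
        response = search(query=query, dataset_id=dataset_id, top_k=top_k)
        if response.get("status") != "success":
            continue  # skip datasets that failed rather than aborting the loop
        for result in response.get("results", []):
            # Record which dataset each chunk came from.
            merged.append({**result, "dataset_id": dataset_id})
    merged.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    return merged[:top_k]


def fake_search(query, dataset_id, top_k):
    """Stand-in for dify_search with canned data, for illustration only."""
    canned = {
        "dataset-abc123": [{"content": "A", "score": 0.61}],
        "dataset-xyz789": [{"content": "B", "score": 0.87}],
    }
    return {"status": "success", "results": canned.get(dataset_id, [])[:top_k]}


merged = search_many(fake_search, "authentication",
                     ["dataset-abc123", "dataset-xyz789"])
```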
.env files
This skill uses the Dify Dataset API:
GET /v1/datasets
POST /v1/datasets/{id}/retrieve
For API documentation, see: https://docs.dify.ai/reference/api-reference
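For orientation, the retrieve call might be assembled as in the sketch below. The endpoint path and the Bearer API key come from this document and Dify's API conventions. The `retrieval_model` body layout is an assumption based on Dify's public API reference and should be checked against the linked documentation. `build_retrieve_request` is a hypothetical helper that only builds the request and does not send it.

```python
import json


def build_retrieve_request(base_url, api_key, dataset_id, query, top_k=3,
                           search_method="hybrid_search", reranking_enable=False):
    """Describe a POST {base_url}/datasets/{id}/retrieve call as a plain dict.

    base_url is expected to already include the /v1 prefix, as in
    DIFY_BASE_URL. Nothing is sent over the network.
    """
    body = {
        "query": query,
        # Assumed layout; verify against the Dify API reference.
        "retrieval_model": {
            "search_method": search_method,
            "reranking_enable": reranking_enable,
            "top_k": top_k,
        },
    }
    return {
        "method": "POST",
        "url": f"{base_url.rstrip('/')}/datasets/{dataset_id}/retrieve",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }


request = build_retrieve_request(
    "https://dify.example.com/v1", "your-key", "dataset-abc123",
    "How do I configure OpenClaw?", top_k=5,
)
```

The resulting dict maps directly onto an HTTP client call, for example the method, url, headers, and data arguments of the requests library installed earlier.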
v1.1.0 (2026-02-08):
v1.0.0 (2026-02-06):
Generated Mar 1, 2026
Enables AI agents to search internal knowledge bases like product documentation or FAQs to provide accurate, contextual answers to customer inquiries. Reduces manual support load and ensures consistent information delivery.
Allows researchers to query datasets containing academic papers, reports, or internal data for relevant information. Supports RAG-enhanced analysis by retrieving precise context for generating insights or summaries.
Helps employees quickly find company policies, procedures, or technical guides by searching centralized knowledge bases. Improves productivity and reduces time spent navigating disparate information sources.
Assists content teams in retrieving factual data or brand guidelines from knowledge bases to generate accurate marketing materials, blog posts, or social media content. Ensures alignment with company messaging.
Enables medical professionals to search datasets of clinical guidelines, research studies, or patient records for quick reference. Supports decision-making by providing relevant, up-to-date medical context in a secure manner.
Offer this skill as part of a paid AI agent platform where users pay monthly or annual fees for access. Revenue is generated through tiered plans based on usage limits, number of datasets, or advanced features like reranking.
Provide custom implementation services to businesses for integrating this skill into their existing workflows. Revenue comes from one-time project fees or ongoing support contracts for setup, training, and maintenance.
Offer a free basic version with limited searches or datasets, then charge for premium features like higher top-k limits, advanced search methods, or priority support. Revenue is driven by upgrades from free users to paid tiers.
💬 Integration Tip
Use the dify_list tool first to auto-discover dataset IDs, then integrate search results into AI prompts for RAG pipelines to enhance response accuracy.