boof
Convert PDFs and documents to markdown, index them locally for RAG retrieval, and analyze them token-efficiently. Use when asked to: read/analyze/summarize a PDF, process a document, boof a file, extract information from papers/decks/NOFOs, or when you need to work with large documents without filling the context window. Supports batch processing and cross-document queries.
Install via ClawdBot CLI:
clawdbot install chiefsegundo/boof
Local-first document processing: PDF → markdown → RAG index → token-efficient analysis.
Documents stay local. Only relevant chunks go to the LLM. Maximum knowledge absorption, minimum token burn.
bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf
bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf --collection my-project
qmd query "your question" -c collection-name
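The commands above can be chained into a single helper. This is a minimal sketch, not part of boof itself: `boof_and_ask` and the `BOOF_SCRIPT` override are hypothetical names introduced for illustration, and treating `SKILL_DIR` as an environment variable is an assumption. `QMD_BIN` is the documented variable for the qmd binary path.

```shell
#!/usr/bin/env bash
# Sketch: convert and index one PDF, then ask one question against its collection.
# BOOF_SCRIPT is a hypothetical override (not a documented boof variable).

boof_and_ask() {
  local pdf="$1" collection="$2" question="$3"
  local script="${BOOF_SCRIPT:-${SKILL_DIR:-.}/scripts/boof.sh}"
  local qmd="${QMD_BIN:-qmd}"

  if [ ! -f "$pdf" ]; then
    echo "not a file: $pdf" >&2
    return 1
  fi

  # Step 1: PDF -> markdown -> index. Conversion output is sent to stderr
  # so that only the retrieved chunks appear on stdout.
  bash "$script" "$pdf" --collection "$collection" >&2 || return 1

  # Step 2: retrieve only the relevant chunks for the question.
  "$qmd" query "$question" -c "$collection"
}

# Example call (commented out so sourcing this file has no side effects):
# boof_and_ask /path/to/document.pdf my-project "your question"
```

Keeping the conversion log on stderr means the function's stdout can be passed straight to the LLM as context.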
Run boof.sh on a PDF. This converts it to markdown via Marker (local ML, no API) and indexes it into QMD for semantic search.
Use qmd query to retrieve only the relevant chunks. Send those chunks to the LLM, not the entire document.
"Analyze this specific aspect of the paper" → Boof + query (cheapest, most focused)
"Summarize this entire document" → Boof, then read the markdown section by section. Summarize each section individually, then merge summaries. See advanced-usage.md.
"Compare findings across multiple papers" → Boof all papers into one collection, then query across them.
"Find where the paper discusses X" → qmd search "X" -c collection for exact match, qmd query "X" -c collection for semantic match.
Converted markdown files are saved to knowledge/boofed/ by default (override with --output-dir).
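For the cross-document case, every PDF in a directory can be boofed into one collection with a short loop. The sketch below is illustrative only: `boof_dir` and `BOOF_SCRIPT` are hypothetical names, and it assumes that `--output-dir` takes the directory as its next argument and can be combined with `--collection`; check boof.sh before relying on either.

```shell
#!/usr/bin/env bash
# Sketch: index every PDF in a directory into a single collection.
# Assumption: --output-dir takes its value as the following argument.

boof_dir() {
  local dir="$1" collection="$2" out_dir="${3:-}"
  local script="${BOOF_SCRIPT:-${SKILL_DIR:-.}/scripts/boof.sh}"
  local pdf count=0

  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue   # skip the literal pattern when nothing matches
    if [ -n "$out_dir" ]; then
      bash "$script" "$pdf" --collection "$collection" --output-dir "$out_dir" || return 1
    else
      bash "$script" "$pdf" --collection "$collection" || return 1
    fi
    count=$((count + 1))
  done

  echo "Indexed $count PDF(s) into collection '$collection'"
}

# Example (commented out so sourcing this file has no side effects):
# boof_dir ./papers my-project
# qmd query "your question" -c my-project
```

The loop stops at the first failed conversion, so a partially indexed collection is reported as an error rather than silently accepted.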
If boof.sh reports missing dependencies, see setup-guide.md for installation instructions (Marker + QMD).
MARKER_ENV: Path to marker-pdf Python venv (default: ~/.openclaw/tools/marker-env)
QMD_BIN: Path to qmd binary (default: ~/.bun/bin/qmd)
BOOF_OUTPUT_DIR: Default output directory (default: ~/.openclaw/workspace/knowledge/boofed)
Generated Mar 1, 2026
Researchers can use Boof to convert and index academic PDFs, enabling efficient querying of specific sections or findings without reading entire papers. This supports literature reviews and hypothesis validation by retrieving only relevant chunks for LLM analysis, saving time and computational resources.
Law firms can process legal briefs, contracts, and case files into markdown for semantic search, allowing quick retrieval of clauses or precedents. Boof's local indexing ensures data privacy while reducing token costs in AI-assisted legal analysis and summarization tasks.
Analysts can boof financial reports, market studies, and presentations to extract key insights across multiple documents. By indexing into a single collection, cross-document queries enable trend analysis and competitive intelligence without manual data sifting.
Non-profits and research institutions can handle NOFOs (Notices of Funding Opportunity) and proposals by converting PDFs to searchable markdown. This facilitates efficient extraction of requirements and alignment checks, streamlining grant application workflows with focused AI assistance.
Offer Boof as a cloud-based service with enhanced features like team collaboration, API access, and advanced analytics. Charge monthly per user or document volume, targeting businesses needing scalable document processing without local setup overhead.
Sell customized licenses to large organizations requiring strict data control, with support for integration into existing workflows and compliance tools. Include premium support and training, generating revenue through one-time fees or annual contracts.
Provide Boof as a free open-source tool for individual users, monetizing through paid add-ons like batch processing automation, priority support, and enhanced indexing algorithms. This model attracts a broad user base while upselling to power users.
💬 Integration Tip
Ensure dependencies like Marker and QMD are installed locally; use environment variables to customize paths for seamless integration into existing document management systems.
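One way to apply this tip is to pin the three documented variables in a shell profile and run a quick preflight check. In the sketch below the defaults are the documented ones; `check_boof_paths` is a hypothetical helper, not something boof.sh provides.

```shell
#!/usr/bin/env bash
# Documented variables with their documented defaults; override before running boof.sh.
export MARKER_ENV="${MARKER_ENV:-$HOME/.openclaw/tools/marker-env}"
export QMD_BIN="${QMD_BIN:-$HOME/.bun/bin/qmd}"
export BOOF_OUTPUT_DIR="${BOOF_OUTPUT_DIR:-$HOME/.openclaw/workspace/knowledge/boofed}"

# Hypothetical preflight helper: returns non-zero if a dependency path is missing.
check_boof_paths() {
  local missing=0
  [ -d "$MARKER_ENV" ] || { echo "MARKER_ENV not found: $MARKER_ENV" >&2; missing=1; }
  [ -x "$QMD_BIN" ] || { echo "QMD_BIN not executable: $QMD_BIN" >&2; missing=1; }
  return "$missing"
}

# Example (commented out so sourcing this file has no side effects):
# check_boof_paths || echo "see setup-guide.md"
```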