ragliteLocal-first RAG cache: distill docs into structured Markdown, then index/query with Chroma (vector) + ripgrep (keyword).
Install via ClawdBot CLI:
clawdbot install VirajSanghvi1/ragliteRAGLite is a local-first RAG cache.
It does not replace model memory or chat context. It gives your agent a durable place to store and retrieve information the model wasnβt trained on β especially useful for local/private knowledge (school work, personal notes, medical records, internal runbooks).
RAGLite treats extracted document text as untrusted data. If you distill content from third parties (web pages, PDFs, vendor docs), assume it may contain prompt injection attempts.
RAGLiteβs distillation prompts explicitly instruct the model to:
Hi β Iβm Viraj. I built RAGLite to make local-first retrieval practical: distill first, index second, query forever.
If you hit an issue or want an enhancement:
Contributors are welcome β PRs encouraged; maintainers handle merges.
This skill defaults to OpenClaw π¦ for condensation unless you pass --engine explicitly.
./scripts/install.sh
This creates a skill-local venv at skills/raglite/.venv and installs the PyPI package raglite-chromadb (CLI is still raglite).
# One-command pipeline: distill β index
./scripts/raglite.sh run /path/to/docs \
--out ./raglite_out \
--collection my-docs \
--chroma-url http://127.0.0.1:8100 \
--skip-existing \
--skip-indexed \
--nodes
# Then query
./scripts/raglite.sh query "how does X work?" \
--out ./raglite_out \
--collection my-docs \
--chroma-url http://127.0.0.1:8100
RAGLite is a local RAG cache for repeated lookups.
When you (or your agent) keep re-searching for the same non-training data β local notes, school work, medical records, internal docs β RAGLite gives you a private, auditable library:
1) Distill to structured Markdown (compression-before-embeddings)
2) Index locally into Chroma
3) Query with hybrid retrieval (vector + keyword)
It doesnβt replace memory/context β itβs the place to store what you need again.
Generated Mar 1, 2026
Individuals can use RAGLite to distill and index personal notes, research papers, and private documents for quick retrieval. This is ideal for students organizing coursework or professionals managing confidential work materials, ensuring data stays local and secure.
Medical professionals can index patient records, research studies, and internal guidelines into a private RAG cache. This enables fast, secure queries for treatment protocols or historical data without relying on external cloud services, complying with privacy regulations like HIPAA.
Companies can distill runbooks, technical manuals, and policy documents into structured Markdown for hybrid retrieval. Teams can quickly find internal procedures or troubleshooting guides, reducing reliance on external knowledge bases and maintaining audit trails.
Law firms can use RAGLite to index case files, contracts, and legal precedents locally. This allows for efficient keyword and semantic searches during case preparation, ensuring sensitive client information remains private and accessible without internet dependency.
Offer paid consulting services to help organizations set up and customize RAGLite for their specific needs, such as integrating with existing data pipelines or optimizing retrieval performance. Revenue comes from hourly rates or project-based contracts.
Provide a licensed, enhanced version of RAGLite with additional features like advanced security audits, priority support, and integration plugins for enterprise environments. Revenue is generated through annual subscription fees per user or server.
Develop and sell online courses or certification programs that teach users how to implement and leverage RAGLite for local-first RAG applications. Revenue streams include course fees, certification exams, and workshop registrations.
π¬ Integration Tip
Ensure Python3, pip, and ripgrep are installed locally before setup, and consider using the provided scripts to automate the distillation and indexing pipeline for smoother integration.
Use the @steipete/oracle CLI to bundle a prompt plus the right files and get a second-model review (API or browser) for debugging, refactors, design checks, or cross-validation.
Manage Things 3 via the `things` CLI on macOS (add/update projects+todos via URL scheme; read/search/list from the local Things database). Use when a user asks Clawdbot to add a task to Things, list inbox/today/upcoming, search tasks, or inspect projects/areas/tags.
Local search/indexing CLI (BM25 + vectors + rerank) with MCP mode.
Use when designing database schemas, writing migrations, optimizing SQL queries, fixing N+1 problems, creating indexes, setting up PostgreSQL, configuring EF Core, implementing caching, partitioning tables, or any database performance question.
Connect to Supabase for database operations, vector search, and storage. Use for storing data, running SQL queries, similarity search with pgvector, and managing tables. Triggers on requests involving databases, vector stores, embeddings, or Supabase specifically.
Query, design, migrate, and optimize SQL databases. Use when working with SQLite, PostgreSQL, or MySQL β schema design, writing queries, creating migrations, indexing, backup/restore, and debugging slow queries. No ORMs required.