arxiv-paper-processor

Tool for manual, per-paper arXiv processing: batch, source, or PDF download, followed by model-driven full-text reading and `summary.md` writing in the chosen language.
Install via ClawdBot CLI:
clawdbot install xukp20/arxiv-paper-processor

Use this skill for per-paper manual summarization, with optional batch artifact download.
- Choose the summary language (for example `English` or `Chinese`) and apply it manually.
- `summary.md` must be written in the selected language.
- Pass `--language` to the scripts for traceability.

Scripts only fetch artifacts. The model performs reading and writing.

- Do not produce `summary.md` by script-based snippet extraction, regex harvesting, or template autofill.
- Scripts only produce artifacts (source/pdf) and trace logs.
- `summary.md` must come from model-side reading and synthesis of the paper content.

Use this first when Stage B has many papers:
```bash
python3 scripts/download_papers_batch.py \
--run-dir /path/to/run \
--artifact source_then_pdf \
--max-workers 3 \
--min-interval-sec 5 \
--language English
```
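The `--min-interval-sec` flag above, together with the `arxiv_download_state.json` state file listed under key behavior, suggests that requests are spaced out using a persisted timestamp. The sketch below is one hypothetical way such throttling could work, not the script's actual implementation: the function name `wait_for_slot`, the `last_request_ts` field, and the file contents are all assumptions, and it does no file locking, so it is not safe across concurrent workers as written.

```python
import json
import time
from pathlib import Path


def wait_for_slot(state_path: Path, min_interval_sec: float) -> float:
    """Sleep until min_interval_sec has passed since the last recorded request.

    Returns the number of seconds slept. The state file format here
    ({"last_request_ts": <unix time>}) is an assumption for illustration.
    """
    last = 0.0
    if state_path.exists():
        try:
            last = float(json.loads(state_path.read_text()).get("last_request_ts", 0.0))
        except (ValueError, AttributeError, TypeError):
            last = 0.0  # unreadable state: behave as if no request was made yet
    wait = max(0.0, last + min_interval_sec - time.time())
    if wait > 0:
        time.sleep(wait)
    # Record the time of the request that is about to be made.
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps({"last_request_ts": time.time()}))
    return wait
```

A caller would invoke `wait_for_slot(...)` immediately before each HTTP request. Persisting the timestamp on disk means a restarted run still respects the interval, which is the usual reason for keeping such state in a file rather than in memory.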
Key behavior:
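- Supports `--artifact source`, `--artifact pdf`, or `--artifact source_then_pdf` (default).
- Supports concurrency (`--max-workers`) and safe throttling/retry (`--min-interval-sec`, retry args).
- Keeps download state in `/.runtime/arxiv_download_state.json` to reduce 429 risk.
- Skips papers that already have `source/source_extract/*.tex` or an existing `source/paper.pdf` (unless `--force`).
- If a paper already has `summary.md`, you can skip that paper's summary-writing step.
- Writes a log to `/download_batch_log.json` by default.

To download the source for a single paper:

```bash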
python3 scripts/download_arxiv_source.py \
--paper-dir /path/to/run/2602.00528 \
--language English
```
This writes:
- `source/source_bundle.bin`
- `source/source_extract/`
- `source/download_source_log.json`

If usable source already exists and `--force` is not set, the script reuses local artifacts.

To download the PDF for a single paper:
```bash
python3 scripts/download_arxiv_pdf.py \
--paper-dir /path/to/run/2602.00528 \
--language English
```
This writes:
- `source/paper.pdf`
- `source/download_pdf_log.json`

If the PDF already exists and `--force` is not set, the script reuses local artifacts.
- If `summary.md` already exists and follows the required format, skip this paper and mark it complete.
- Read `metadata.md` first.
- If `source/source_extract/` already exists with readable `.tex` files, use it directly.
- If `source/paper.pdf` already exists, use the PDF directly.
- Write `summary.md` in the same paper directory, in the selected language.

Do not rely on rule-based auto summarization.
Do not rely on auto-extracted snippets as the primary writing basis.
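The artifact checks in the workflow above amount to a small decision order. A minimal sketch, assuming the directory layout named in this document; the function name `choose_reading_basis` and its return labels are hypothetical. It only decides which local artifact the model should read. It does not read or summarize anything, and it leaves the "follows the required format" check for an existing `summary.md` to the caller.

```python
from pathlib import Path


def choose_reading_basis(paper_dir: Path) -> str:
    """Return a label describing what to do next for one paper directory.

    Labels (hypothetical): "check_existing_summary", "source", "pdf", "download".
    """
    if (paper_dir / "summary.md").is_file():
        # A summary exists; the caller still has to confirm its format
        # before skipping the paper.
        return "check_existing_summary"
    extract_dir = paper_dir / "source" / "source_extract"
    if extract_dir.is_dir() and any(extract_dir.rglob("*.tex")):
        # Extracted LaTeX source is available locally.
        return "source"
    if (paper_dir / "source" / "paper.pdf").is_file():
        # No usable source, but a PDF is already on disk.
        return "pdf"
    # Nothing local yet: run one of the download scripts first.
    return "download"
```

For example, `choose_reading_basis(Path("/path/to/run/2602.00528"))` returns `"download"` for a directory that contains only `metadata.md`.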
- Follow the examples in `references/summary-example-en.md` and `references/summary-example-zh.md`.
- Write `/summary.md` in fixed section format.
- In `## 10. Brief Conclusion`, write a 3-4 sentence mini-conclusion that covers contribution, method, evaluation setup, and results with paper-specific details.
- In `## 1. Paper Snapshot`, use exact keys: `ArXiv ID`, `Title`, `Authors`, `Publish date`, `Primary category`, `Reading basis`.
- Do not use variants such as `Reading source`, `Author list`, `Published on`, or lowercase key names.

See `references/summary-format.md` for exact section requirements.
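The exact-key rule can be checked after the model has written the summary, without generating any summary text. The following is a sketch under stated assumptions: it assumes each key sits at the start of a line as `Key: value`, optionally bulleted or bold, which may not match the layout defined in `references/summary-format.md`; the function name and message strings are hypothetical.

```python
import re

REQUIRED_KEYS = ["ArXiv ID", "Title", "Authors", "Publish date",
                 "Primary category", "Reading basis"]
FORBIDDEN_KEYS = ["Reading source", "Author list", "Published on"]

# Assumed line shape: optional bullet, optional bold markers, key text, colon.
KEY_LINE = re.compile(
    r"^[ \t]*(?:[-*]\s+)?(?:\*\*)?([A-Za-z][A-Za-z ]*?)(?:\*\*)?\s*:",
    re.MULTILINE,
)


def check_snapshot_keys(summary_text: str) -> list[str]:
    """Return a list of key-naming problems; an empty list means none found."""
    found = [key.strip() for key in KEY_LINE.findall(summary_text)]
    forbidden = {key.lower() for key in FORBIDDEN_KEYS}
    required_lower = {key.lower() for key in REQUIRED_KEYS}
    problems = []
    for key in found:
        if key.lower() in forbidden:
            problems.append(f"forbidden key variant: {key}")
        elif key not in REQUIRED_KEYS and key.lower() in required_lower:
            problems.append(f"wrong capitalization: {key}")
    for key in REQUIRED_KEYS:
        if key not in found:
            problems.append(f"missing key: {key}")
    return problems
```

The check only reports problems. Fixing them remains a manual edit by the model, in keeping with the rule that scripts do not write summary content.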
This skill is a sub-skill of arxiv-summarizer-orchestrator.
Pipeline position:
1. `arxiv-search-collector` produces the selected paper directories and metadata.
2. `arxiv-paper-processor` downloads artifacts and writes one `summary.md` per paper.
3. `arxiv-batch-reporter` uses these per-paper summaries to generate the final collection report.

Use this skill together with Step 1 and Step 3 for full end-to-end execution.
Generated Mar 1, 2026
Researchers use this skill to systematically download and summarize multiple recent papers from arXiv for a literature review. It ensures each summary is manually crafted by the AI model, capturing nuanced details and avoiding automated extraction, which is critical for accurate synthesis in fields like machine learning or physics.
Technology companies employ this skill to monitor competitors' publications on arXiv, downloading artifacts and generating detailed summaries in a preferred language. The manual reading ensures summaries include specific method and evaluation details, aiding in strategic decision-making and innovation tracking.
Educators and course developers use this skill to process arXiv papers into structured summaries for creating course materials or study guides. The model-driven approach guarantees high-quality, language-specific content that students can rely on for learning complex topics without script-generated errors.
Publishers or journal editors utilize this skill to batch-download and summarize submitted manuscripts from arXiv for initial assessment. The manual synthesis helps identify key contributions and evaluation specifics, streamlining the review process while maintaining quality standards.
Data scientists apply this skill to download and summarize arXiv papers to build high-quality datasets for training AI models on scientific text. The emphasis on model-driven reading ensures summaries are accurate and detailed, reducing noise in training data for tasks like summarization or question-answering.
Offer this skill as a cloud-based service where research teams pay a monthly fee to access automated paper processing with manual summarization. It includes features like batch downloads, concurrency, and language support, targeting academic labs and corporate R&D departments.
Sell enterprise licenses to universities, government agencies, or large tech companies for on-premise deployment. This model includes customization, priority support, and integration with existing research workflows, ensuring compliance and scalability for high-volume paper processing.
Provide a free tier for individual researchers with limited batch processing, and charge for premium features like higher concurrency, advanced throttling, or multi-language support. This attracts users from diverse backgrounds and monetizes through upgrades and add-ons.
Integration Tip
Integrate this skill with upstream arXiv search tools and downstream reporting systems to create a seamless pipeline; ensure language parameters are consistently passed across scripts for traceability.
Search, download, and summarize academic papers from arXiv. Built for AI/ML researchers.
Search and summarize papers from ArXiv. Use when the user asks for the latest research, specific topics on ArXiv, or a daily summary of AI papers.
Assistance with writing literature reviews by searching for academic sources via Semantic Scholar, OpenAlex, Crossref and PubMed APIs. Use when the user needs to find papers on a topic, get details for specific DOIs, or draft sections of a literature review with proper citations.
Baidu Scholar Search - Search Chinese and English academic literature (journals, conferences, papers, etc.)
Use this skill when users need to search academic papers, download research documents, extract citations, or gather scholarly information. Triggers include: requests to "find papers on", "search research about", "download academic articles", "get citations for", or any request involving academic databases like arXiv, PubMed, Semantic Scholar, or Google Scholar. Also use for literature reviews, bibliography generation, and research discovery. Requires OpenClawCLI installation from clawhub.ai.
Outcome-driven scientific publishing for AI agents. Publish research papers, hypotheses, and experiments with validated artifacts, structured claims, milestone tracking, and independent replications. Claim replication bounties, submit peer reviews, and collaborate with other AI researchers.