hle-benchmark-evolver
Runs HLE-oriented benchmark reward ingestion and curriculum generation for capability-evolver. Use when the user asks to optimize Humanity's Last Exam score, ...
Install via ClawdBot CLI:
clawdbot install wanng-ide/hle-benchmark-evolver

Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Generated Mar 21, 2026
An online learning platform uses this skill to ingest student performance data from HLE-style benchmark tests, converting it into reward signals that evolve its AI tutor's teaching strategies. It builds easy-first curriculum queues to improve student engagement and target weak areas, keeping learning paths adaptive.
A corporate training provider employs this skill to analyze employee benchmark results on HLE-oriented exams, generating curriculum signals to refine AI-driven assessment tools. It focuses on specific subjects and modalities to boost training efficiency and track progress trends for compliance reporting.
A research institution integrates this skill to process benchmark reports from AI experiments, using reward ingestion to evolve models like OpenClaw for better HLE scores. It automates curriculum generation to prioritize research questions and monitor accuracy improvements over cycles.
A test preparation company leverages this skill to ingest question-level results from practice HLE exams, creating easy-first queues to guide learners. It provides immediate benchmark snapshots to adjust study plans and focus on high-impact subjects for score optimization.
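The easy-first queueing these workflows rely on can be sketched in a few lines. This is a minimal illustration only: the record fields (`question_id`, `subject`, `accuracy`) are hypothetical stand-ins, not the skill's actual report schema.

```python
# Minimal sketch of easy-first curriculum queueing.
# Field names below are illustrative assumptions, not the skill's schema.

def build_easy_first_queue(results):
    """Order question records so the highest-accuracy (easiest) come first,
    breaking ties by subject so related questions stay grouped."""
    return sorted(results, key=lambda r: (-r["accuracy"], r["subject"]))

results = [
    {"question_id": "q1", "subject": "math", "accuracy": 0.35},
    {"question_id": "q2", "subject": "physics", "accuracy": 0.90},
    {"question_id": "q3", "subject": "math", "accuracy": 0.90},
]

queue = build_easy_first_queue(results)
print([r["question_id"] for r in queue])  # → ['q3', 'q2', 'q1']
```

A learner would work the front of this queue first, while the low-accuracy tail marks the high-impact subjects to drill later.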
Offer this skill as a cloud-based service where clients upload benchmark reports via API, paying monthly for automated reward ingestion and curriculum generation. Revenue comes from tiered plans based on usage cycles and support levels, targeting EdTech and corporate users.
Provide custom integration services to deploy this skill within existing AI systems, helping organizations optimize HLE scores through tailored workflows. Revenue is generated from project-based fees and ongoing maintenance contracts, focusing on enterprises and research labs.
License this skill as a white-label component for other software vendors, allowing them to embed HLE benchmark evolution into their products. Revenue comes from upfront licensing fees and royalties based on user adoption, appealing to educational platform developers.
💬 Integration Tip
Ensure the benchmark report JSON follows the expected schema and use absolute paths for inputs; automate cycles with eval_cmd for continuous evolution in production environments.
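A continuous-evolution loop along these lines might validate each report before re-running the evaluation command. This is a hedged sketch: the required fields, the report path, and the `eval_cmd` string are assumptions for illustration, not the skill's documented interface.

```python
import json
import subprocess

# Hypothetical required top-level fields; the real schema may differ.
REQUIRED_FIELDS = {"benchmark", "cycle", "results"}

def load_report(path):
    """Load a benchmark report JSON and check it against the assumed schema."""
    with open(path) as f:
        report = json.load(f)
    missing = REQUIRED_FIELDS - report.keys()
    if missing:
        raise ValueError(f"report missing fields: {sorted(missing)}")
    return report

def run_cycle(report_path, eval_cmd):
    """Validate the current report, then invoke eval_cmd to produce the next one."""
    report = load_report(report_path)  # fail fast on malformed input
    subprocess.run(eval_cmd, shell=True, check=True)
    return report["cycle"]
```

In production this would be driven by a scheduler, with `report_path` always given as an absolute path, per the tip above.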
Scored Apr 19, 2026
Captures learnings, errors, and corrections to enable continuous improvement. Use when: (1) A command or operation fails unexpectedly, (2) User corrects Clau...
Self-reflection + Self-criticism + Self-learning + Self-organizing memory. Agent evaluates its own work, catches mistakes, and improves permanently. Use when...
A self-evolution engine for AI agents. Analyzes runtime history to identify improvements and applies protocol-constrained evolution. Communicates with EvoMap...
AI self-improvement and memory system. Addresses the pain point of repeating the same class of mistakes and not retaining user corrections. Automatically captures errors, user corrections, and best practices, and converts them into long-term memory.
Self-improving agent system that analyzes conversation quality, identifies improvement opportunities, and continuously optimizes response strategies.