skill-tester
Validates, tests, and scores skills for structure, script correctness, documentation, and usability to ensure compliance with tiered quality standards.
Install via ClawdBot CLI:
clawdbot install alirezarezvani/skill-tester
Name: skill-tester
Tier: POWERFUL
Category: Engineering Quality Assurance
Dependencies: None (Python Standard Library Only)
Author: Claude Skills Engineering Team
Version: 1.0.0
Last Updated: 2026-02-16
The Skill Tester is a comprehensive meta-skill designed to validate, test, and score the quality of skills within the claude-skills ecosystem. This powerful quality assurance tool ensures that all skills meet the rigorous standards required for BASIC, STANDARD, and POWERFUL tier classifications through automated validation, testing, and scoring mechanisms.
As the gatekeeping system for skill quality, this meta-skill provides three core capabilities: structural validation, script testing, and quality scoring.
This skill is essential for maintaining ecosystem consistency, enabling automated CI/CD integration, and supporting both manual and automated quality assurance workflows. It serves as the foundation for pre-commit hooks, pull request validation, and continuous integration processes that maintain the high-quality standards of the claude-skills repository.
Automatically classifies skills into the BASIC, STANDARD, or POWERFUL tier based on their complexity and functionality.
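As a rough illustration of tier classification, a heuristic might inspect the skill's on-disk layout. This sketch is hypothetical — the actual classification rules live in the skill-tester's references/ standards and may weigh different signals.

```python
from pathlib import Path

def classify_tier(skill_path: str) -> str:
    """Hypothetical heuristic: classify a skill by on-disk complexity.

    The real skill-tester rules may differ; this only illustrates the idea
    of mapping structure/functionality signals to a tier name.
    """
    root = Path(skill_path)
    scripts = list(root.glob("scripts/*.py"))
    has_references = (root / "references").is_dir()
    if len(scripts) >= 3 and has_references:
        return "POWERFUL"
    if scripts:
        return "STANDARD"
    return "BASIC"
```

A skill with several scripts plus a references/ directory would land in POWERFUL, a single-script skill in STANDARD, and a documentation-only skill in BASIC.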
The skill-tester follows a modular architecture in which each component serves a specific validation purpose.
All validation is performed against well-defined standards documented in the references/ directory.
It is designed for seamless integration into existing development workflows.
# Primary validation workflow
validate_skill_structure() -> ValidationReport
check_skill_md_compliance() -> DocumentationReport
validate_python_scripts() -> ScriptReport
generate_compliance_score() -> float
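The validation workflow above can be sketched with the standard library alone, in keeping with the skill's no-dependency constraint. The file names checked here follow the conventional skill layout described in this document; the required/recommended split is an assumption for illustration.

```python
from pathlib import Path

# Assumed layout: SKILL.md and README.md are mandatory,
# scripts/ and references/ are recommended.
REQUIRED = ["SKILL.md", "README.md"]
RECOMMENDED = ["scripts", "references"]

def validate_skill_structure(skill_path: str) -> dict:
    """Check that the skill directory contains the expected files."""
    root = Path(skill_path)
    checks = {name: (root / name).exists() for name in REQUIRED + RECOMMENDED}
    # The check passes when every required entry exists; recommended
    # entries only affect the detailed report, not pass/fail.
    passed = all(checks[name] for name in REQUIRED)
    return {"passed": passed, "checks": checks}
```

The returned dict mirrors the per-file check structure shown in the JSON report later in this document.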
Key validation checks include directory structure compliance, SKILL.md documentation requirements, and Python script correctness.
# Core testing functions
syntax_validation() -> SyntaxReport
import_validation() -> ImportReport
runtime_testing() -> RuntimeReport
output_format_validation() -> OutputReport
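Syntax validation can be done safely without executing any skill code by parsing each script with the standard-library ast module. This is a minimal sketch mirroring the function names above; the real implementation may collect richer diagnostics.

```python
import ast
from pathlib import Path

def syntax_validation(skill_path: str) -> dict:
    """Parse every script under scripts/ to catch syntax errors
    without importing or running the code."""
    errors = {}
    for script in Path(skill_path).glob("scripts/*.py"):
        try:
            ast.parse(script.read_text(), filename=str(script))
        except SyntaxError as exc:
            errors[script.name] = f"line {exc.lineno}: {exc.msg}"
    return {"passed": not errors, "errors": errors}
```

Import and runtime checks would build on this, e.g. by running each script in a subprocess with a timeout, as the --timeout 30 flag in the CLI examples suggests.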
Testing capabilities encompass syntax validation, import checking, runtime testing, and output format validation.
# Multi-dimensional scoring
score_documentation() -> float # 25% weight
score_code_quality() -> float # 25% weight
score_completeness() -> float # 25% weight
score_usability() -> float # 25% weight
calculate_overall_grade() -> str # A-F grade
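The four equally weighted dimensions combine into a single grade as a weighted average. The 25% weights come from the pseudocode above; the letter-grade cutoffs below are an assumed conventional scale, not the tool's documented one.

```python
# Weights taken from the scoring pseudocode: four dimensions at 25% each.
WEIGHTS = {"documentation": 0.25, "code_quality": 0.25,
           "completeness": 0.25, "usability": 0.25}

def calculate_overall_grade(scores: dict) -> tuple:
    """scores maps each dimension to a 0-100 value.
    Returns (overall 0-100 score, letter grade A-F).
    Cutoffs (90/80/70/60) are an assumption for illustration."""
    total = sum(scores[dim] * w for dim, w in WEIGHTS.items())
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if total >= cutoff:
            return total, letter
    return total, "F"
```

Feeding in the per-dimension scores from the sample report below (88, 80, 72, 84) yields a B under these assumed cutoffs.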
Scoring dimensions include documentation, code quality, completeness, and usability, each weighted at 25% of the overall grade.
# Pre-commit hook validation
skill_validator.py path/to/skill --tier POWERFUL --json
# Comprehensive skill testing
script_tester.py path/to/skill --timeout 30 --sample-data
# Quality assessment and scoring
quality_scorer.py path/to/skill --detailed --recommendations
# GitHub Actions workflow example
- name: Validate Skill Quality
  run: |
    python skill_validator.py engineering/${{ matrix.skill }} --json | tee validation.json
    python script_tester.py engineering/${{ matrix.skill }} | tee testing.json
    python quality_scorer.py engineering/${{ matrix.skill }} --json | tee scoring.json
# Validate all skills in repository
find engineering/ -mindepth 1 -maxdepth 1 -type d | xargs -I {} skill_validator.py {}
# Generate repository quality report
quality_scorer.py engineering/ --batch --output-format json > repo_quality.json
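The same batch run can be driven from Python instead of the shell, which makes it easier to aggregate results. The script path and --json flag are taken from the examples in this document; everything else is illustrative.

```python
import subprocess
from pathlib import Path

def validate_all(repo_dir: str = "engineering") -> dict:
    """Run the validator on every skill directory and record pass/fail.

    Assumes the validator exits non-zero on failure, as the pre-commit
    hook example in this document relies on.
    """
    results = {}
    for skill in sorted(p for p in Path(repo_dir).iterdir() if p.is_dir()):
        proc = subprocess.run(
            ["python", "engineering/skill-tester/scripts/skill_validator.py",
             str(skill), "--json"],
            capture_output=True, text=True)
        results[skill.name] = proc.returncode == 0
    return results
```

The resulting dict can be dumped to JSON for the repository quality report shown above.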
All tools provide both human-readable and machine-parseable output:
=== SKILL VALIDATION REPORT ===
Skill: engineering/example-skill
Tier: STANDARD
Overall Score: 85/100 (B)
Structure Validation: ✓ PASS
├── SKILL.md: ✓ EXISTS (247 lines)
├── README.md: ✓ EXISTS
├── scripts/: ✓ EXISTS (2 files)
└── references/: ✗ MISSING (recommended)
Documentation Quality: 22/25 (88%)
Code Quality: 20/25 (80%)
Completeness: 18/25 (72%)
Usability: 21/25 (84%)
Recommendations:
• Add references/ directory with documentation
• Improve error handling in main.py
• Include more comprehensive examples
{
"skill_path": "engineering/example-skill",
"timestamp": "2026-02-16T16:41:00Z",
"validation_results": {
"structure_compliance": {
"score": 0.95,
"checks": {
"skill_md_exists": true,
"readme_exists": true,
"scripts_directory": true,
"references_directory": false
}
},
"overall_score": 85,
"letter_grade": "B",
"tier_recommendation": "STANDARD",
"improvement_suggestions": [
"Add references/ directory",
"Improve error handling",
"Include comprehensive examples"
]
}
}
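A CI job can consume this JSON report directly to enforce a quality gate. The field names below match the sample output above, and the 75-point threshold mirrors the --minimum-score 75 flag used later in the workflow example.

```python
import json
import sys

def gate_on_report(report_json: str, minimum: int = 75) -> bool:
    """Parse a skill-tester JSON report and fail the gate when the
    overall score is below the minimum."""
    report = json.loads(report_json)
    score = report["validation_results"]["overall_score"]
    ok = score >= minimum
    if not ok:
        print(f"Quality gate failed: {score} < {minimum}", file=sys.stderr)
    return ok
```

Returning False here would map to a non-zero exit code in a CI wrapper script, blocking the merge.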
#!/bin/bash
# .git/hooks/pre-commit
echo "Running skill validation..."
python engineering/skill-tester/scripts/skill_validator.py engineering/new-skill --tier STANDARD
if [ $? -ne 0 ]; then
  echo "Skill validation failed. Commit blocked."
  exit 1
fi
echo "Validation passed. Proceeding with commit."
name: Skill Quality Gate
on:
  pull_request:
    paths: ['engineering/**']
jobs:
  validate-skills:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Validate Changed Skills
        run: |
          changed_skills=$(git diff --name-only ${{ github.event.pull_request.base.sha }} | grep -E '^engineering/[^/]+/' | cut -d'/' -f1-2 | sort -u)
          for skill in $changed_skills; do
            echo "Validating $skill..."
            python engineering/skill-tester/scripts/skill_validator.py "$skill" --json
            python engineering/skill-tester/scripts/script_tester.py "$skill"
            python engineering/skill-tester/scripts/quality_scorer.py "$skill" --minimum-score 75
          done
#!/bin/bash
# Daily quality report generation
echo "Generating daily skill quality report..."
timestamp=$(date +"%Y-%m-%d")
python engineering/skill-tester/scripts/quality_scorer.py engineering/ \
--batch --json > "reports/quality_report_${timestamp}.json"
echo "Quality trends analysis..."
python engineering/skill-tester/scripts/trend_analyzer.py reports/ \
--days 30 > "reports/quality_trends_${timestamp}.md"
The Skill Tester represents a critical infrastructure component for maintaining the high-quality standards of the claude-skills ecosystem. By providing comprehensive validation, testing, and scoring capabilities, it ensures that all skills meet or exceed the rigorous requirements for their respective tiers.
This meta-skill not only serves as a quality gate but also as a development tool that guides skill authors toward best practices and helps maintain consistency across the entire repository. Through its integration capabilities and comprehensive reporting, it enables both manual and automated quality assurance workflows that scale with the growing claude-skills ecosystem.
The combination of structural validation, runtime testing, and multi-dimensional quality scoring provides unparalleled visibility into skill quality while maintaining the flexibility needed for diverse skill types and complexity levels. As the claude-skills repository continues to grow, the Skill Tester will remain the cornerstone of quality assurance and ecosystem integrity.
Generated Mar 1, 2026
A software development team uses skill-tester to validate new skills before committing to the claude-skills repository. It runs automated structure validation and script testing as a pre-commit hook, ensuring all skills meet tier-specific documentation and code standards, preventing substandard submissions.
An organization integrates skill-tester into their CI/CD pipeline to automatically test and score pull requests for skill updates. It validates directory structures, runs Python script tests, and generates quality scores, serving as a gatekeeper to maintain ecosystem consistency in automated workflows.
A university computer science department uses skill-tester to evaluate student projects that involve creating AI agent skills. It provides multi-dimensional quality scoring with letter grades and improvement recommendations, helping instructors assess documentation, code quality, and completeness efficiently.
An open-source community managing a skill repository employs skill-tester for batch processing of existing skills. It validates compliance with standards, identifies dependencies, and classifies skills into BASIC, STANDARD, or POWERFUL tiers, aiding in repository maintenance and quality audits.
A large enterprise uses skill-tester to enforce internal standards for AI skill development across teams. It checks for proper argparse implementation, output format compliance, and error handling, ensuring skills are robust and maintainable for integration into production systems.
Offer skill-tester as a cloud-based service where users can upload skill packages for automated validation and scoring. Revenue is generated through subscription tiers based on usage volume, with features like detailed reports and API access for integration into custom workflows.
Provide consulting services to help organizations integrate skill-tester into their development pipelines. Revenue comes from project-based fees for setup, customization, and training, ensuring clients meet quality standards and optimize their skill development processes.
Distribute skill-tester as an open-source tool with basic validation features free to use. Generate revenue by offering premium features such as advanced analytics, batch processing for large repositories, and priority support, targeting enterprises and large development teams.
💬 Integration Tip
Integrate skill-tester into CI/CD pipelines using pre-commit hooks to automatically validate skills during development, ensuring consistent quality and reducing manual review effort.