data-cog: Your data has answers. CellCog asks the right questions. #1 on DeepResearch Bench (Feb 2026) + frontier coding agent — upload messy CSVs with minimal prompting and get structured insights back: charts, dashboards, statistical reports, and clean data. Full Python access for data cleaning, exploratory analysis, visualization, hypothesis testing, ML model evaluation, and dataset profiling. Analyzes everything, presents it beautifully.
Install via ClawdBot CLI:
clawdbot install nitishgargiitd/data-cog
Most AI tools return code when you ask about data. CellCog returns answers — actual charts, clean datasets, statistical reports, and visual dashboards. Upload messy CSVs with a minimal prompt, and CellCog's coding agent explores your data, finds the patterns, and presents them beautifully. Full Python access for everything from data cleaning to ML model evaluation.
This skill requires the cellcog skill for SDK setup and API calls.
clawhub install cellcog
Read the cellcog skill first for SDK setup. This skill shows you what's possible.
Quick pattern (v1.0+):

```python
# Fire-and-forget - returns immediately
result = client.create_chat(
    prompt="Analyze this dataset: <SHOW_FILE>/path/to/data.csv</SHOW_FILE>",
    notify_session_key="agent:main:main",
    task_label="data-analysis",
    chat_mode="agent",  # Agent mode for most data work
)
# Daemon notifies you when complete - do NOT poll
```
Other AI tools give you Python code and say "run this." CellCog runs the code for you and delivers the results:
| Other AI Tools | Data-Cog |
|---------------|----------|
| "Here's a pandas script to analyze your data" | Here are your actual insights with charts |
| "Run this matplotlib code to see the chart" | Here's the chart, annotated with findings |
| "This SQL query will find outliers" | Found 23 outliers, here's what they mean |
| "You'll need scikit-learn for this" | Model trained, here's accuracy and feature importance |
You upload data. You get answers. The code runs behind the scenes.
Understand your data fast:
Example prompt:
"Analyze this dataset:
/path/to/customer_data.csv
I don't know much about this data yet. Give me:
- Overview: rows, columns, data types, missing values
- Key distributions and summary statistics
- Most interesting correlations
- Any outliers or data quality issues
- 3-5 insights that jump out
Present findings as an interactive HTML report with charts."
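Behind the scenes, the first pass of a prompt like this is ordinary dataset profiling. A minimal pandas sketch of those steps, using a small inline CSV as a hypothetical stand-in for your file:

```python
import io

import pandas as pd

# Hypothetical data standing in for /path/to/customer_data.csv
raw = io.StringIO(
    "age,region,spend\n"
    "34,North,120.5\n"
    "41,,88.0\n"
    "29,South,\n"
    "34,North,120.5\n"
)
df = pd.read_csv(raw)

# Overview: rows/columns, data types, missing values
print(df.shape)
print(df.dtypes)
print(df.isna().sum())

# Summary statistics and numeric correlations
print(df.describe())
print(df.select_dtypes("number").corr())

# Data quality: exact duplicate rows
print(df.duplicated().sum())
```

The agent runs this kind of profiling automatically and then decides which distributions and correlations are worth charting.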
Wrangle messy data into shape:
Example prompt:
"Clean and transform this dataset:
/path/to/messy_data.csv
Issues I know about:
- Dates are in mixed formats (MM/DD/YYYY and YYYY-MM-DD)
- 'Revenue' column has some values with $ signs and commas
- Duplicate rows exist
- Missing values in 'Region' column
Clean it up and give me back a clean CSV plus a summary of what you changed."
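The four issues listed above map to standard pandas operations. A sketch of what the agent's cleaning code might look like, with a hypothetical inline CSV in place of your file (column names match the prompt; everything else is made up):

```python
import io

import pandas as pd

# Hypothetical messy data standing in for /path/to/messy_data.csv
raw = io.StringIO(
    "date,revenue,region\n"
    '01/15/2024,"$1,200",North\n'
    "2024-01-16,$950,\n"
    '01/15/2024,"$1,200",North\n'
)
df = pd.read_csv(raw)

# Mixed date formats: parse each value individually so both
# MM/DD/YYYY and YYYY-MM-DD are handled
df["date"] = df["date"].apply(pd.to_datetime)

# Strip $ signs and thousands separators, then cast to float
df["revenue"] = df["revenue"].str.replace(r"[$,]", "", regex=True).astype(float)

# Drop exact duplicate rows, flag missing regions
df = df.drop_duplicates().reset_index(drop=True)
df["region"] = df["region"].fillna("Unknown")

df.to_csv("clean_data.csv", index=False)
```

Data-Cog additionally reports what it changed, so the summary of transformations comes back alongside the clean CSV.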
Rigorous analysis with real numbers:
Example prompt:
"I ran an A/B test on our checkout page:
/path/to/ab_test_results.csv
Columns: user_id, variant (A or B), converted (0/1), revenue, timestamp
Tell me:
- Is variant B statistically better? (p-value, confidence interval)
- Conversion rate difference
- Revenue per user difference
- Sample size adequacy check
- My recommendation: ship B or keep testing?
Present with clear charts and a plain-English conclusion."
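The significance question in this prompt reduces to a two-proportion z-test. A stdlib-only sketch with made-up conversion counts (the counts are illustrative, not real results):

```python
import math

# Hypothetical results: conversions / users per variant
conv_a, n_a = 200, 5000   # variant A: 4.0% conversion
conv_b, n_b = 260, 5000   # variant B: 5.2% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b

# Pooled proportion and standard error under H0 (no difference)
p_pool = (conv_a + conv_b) / (n_a + n_b)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))

# z statistic and two-sided p-value via the normal CDF
z = (p_b - p_a) / se
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# 95% confidence interval for the lift (unpooled standard error)
se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
ci = (p_b - p_a - 1.96 * se_diff, p_b - p_a + 1.96 * se_diff)

print(f"lift={p_b - p_a:.3f}, z={z:.2f}, p={p_value:.4f}, 95% CI={ci}")
```

With these numbers the lift is significant (p well below 0.05) and the interval excludes zero, which is the shape of the "ship B or keep testing?" answer the agent returns, alongside the revenue-per-user and sample-size checks.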
Turn data into visual stories:
Applied ML without the setup:
Example prompt:
"Predict customer churn from this dataset:
/path/to/customer_features.csv
Target column: 'churned'
- Train a model, try at least 2 algorithms
- Show feature importance — what drives churn?
- Confusion matrix and ROC curve
- Plain-English summary: 'The top 3 reasons customers churn are...'
- Actionable recommendations based on findings
I want insights, not just metrics."
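A rough sketch of the modeling loop behind such a prompt, assuming scikit-learn and a synthetic dataset in place of your customer_features.csv (column names and the churn-generating rule are invented for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for /path/to/customer_features.csv
rng = np.random.default_rng(0)
n = 1000
tenure = rng.uniform(1, 60, n)
support_tickets = rng.poisson(2, n)
monthly_spend = rng.uniform(10, 200, n)
# Churn is more likely with short tenure and many tickets
logit = -0.05 * tenure + 0.5 * support_tickets - 1
churned = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([tenure, support_tickets, monthly_spend])
X_tr, X_te, y_tr, y_te = train_test_split(X, churned, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

# Feature importance: which columns drive churn?
names = ["tenure", "support_tickets", "monthly_spend"]
for name, imp in zip(names, model.feature_importances_):
    print(f"{name}: {imp:.2f}")
print(f"test AUC: {auc:.2f}")
```

The agent would try at least one more algorithm, add a confusion matrix and ROC curve, and translate the importances into the plain-English "top 3 reasons customers churn" summary the prompt asks for.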
| Format | How to Send |
|--------|-------------|
| CSV | Upload via SHOW_FILE |
| Excel (XLSX) | Upload via SHOW_FILE |
| JSON | Upload via SHOW_FILE |
| Parquet | Upload via SHOW_FILE |
| SQL exports | Upload the dump via SHOW_FILE |
| Inline data | Describe small datasets directly in prompt |
| Format | Best For |
|--------|----------|
| Interactive HTML Dashboard | Explorable charts, filters, drill-downs |
| PDF Report | Shareable analysis reports with charts and findings |
| Clean CSV/XLSX | Cleaned or transformed data files for downstream use |
| Markdown | Quick insights for integration into docs |
| Scenario | Recommended Mode |
|----------|------------------|
| Quick data cleaning, simple charts, basic statistics | "agent" |
| Deep analysis with multiple techniques, ML modeling, comprehensive reports | "agent team" |
Use "agent" for most data work. Data cleaning, EDA, chart generation, and standard statistical analysis execute well in agent mode.
Use "agent team" for complex analytical projects — multi-technique analysis, ML model comparisons, or when you need deep domain reasoning about what the data means.
Minimal prompt, maximum insight:
"Analyze this:
/path/to/data.csv
Tell me everything interesting."
That's it. CellCog's coding agent will profile the data, run exploratory analysis, find patterns, and present findings with charts. You don't need to know what to ask — the agent figures it out.
Business analysis:
"Analyze our e-commerce data:
/path/to/orders.csv
I need:
- Revenue trends (daily, weekly, monthly)
- Best and worst performing products
- Customer purchase frequency distribution
- Average order value trends
- Seasonal patterns
- Top 5 actionable insights for growing revenue
Interactive HTML dashboard with all charts."
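The metrics this prompt asks for are groupby-and-resample work in pandas. A sketch with a tiny hypothetical orders table in place of your file (columns and values are made up):

```python
import pandas as pd

# Hypothetical orders standing in for /path/to/orders.csv
orders = pd.DataFrame(
    {
        "order_date": pd.to_datetime(
            ["2024-01-02", "2024-01-09", "2024-02-05", "2024-02-20"]
        ),
        "product": ["widget", "gadget", "widget", "widget"],
        "revenue": [100.0, 250.0, 120.0, 80.0],
    }
)

# Monthly revenue trend (month-start buckets)
monthly = orders.set_index("order_date")["revenue"].resample("MS").sum()

# Best and worst performing products by total revenue
by_product = orders.groupby("product")["revenue"].sum().sort_values(ascending=False)

# Average order value
aov = orders["revenue"].mean()

print(monthly)
print(by_product)
print(f"AOV: {aov:.2f}")
```

Daily and weekly trends are the same `resample` call with `"D"` or `"W"` frequencies; the agent layers these aggregates into the dashboard charts.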
Research data analysis:
"Analyze this survey data from 500 respondents:
/path/to/survey.csv
Research questions:
1. Is there a significant relationship between age group and product preference?
2. Do satisfaction scores differ by region? (ANOVA)
3. What factors best predict likelihood to recommend? (regression)
Include: statistical tests, p-values, effect sizes, and publication-ready charts.
PDF report format."
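The ANOVA in research question 2 comes down to comparing between-group and within-group variance. A stdlib-only sketch of the F-statistic with made-up satisfaction scores per region:

```python
from statistics import mean

# Hypothetical satisfaction scores per region
groups = {
    "North": [7, 8, 6, 7, 9],
    "South": [5, 6, 5, 4, 6],
    "West": [8, 9, 7, 8, 8],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)
k = len(groups)      # number of groups
n = len(all_scores)  # total observations

# Between-group sum of squares (each group mean vs grand mean)
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within-group sum of squares (each value vs its group mean)
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f"F({k - 1}, {n - k}) = {f_stat:.2f}")
```

The agent pairs the F-statistic with its p-value and an effect size (e.g. eta-squared), then renders the group comparisons as publication-ready charts.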
Generated Mar 1, 2026
An e-commerce company uploads messy customer transaction and interaction CSV data to identify factors leading to churn. Data-Cog cleans the data, performs exploratory analysis to find correlations, builds a classification model to predict at-risk customers, and generates an interactive dashboard with visualizations and actionable insights for retention strategies.
A SaaS company uses Data-Cog to analyze A/B test results from a new feature rollout. The skill processes CSV data with user engagement metrics, conducts statistical hypothesis testing to determine significance, calculates conversion rate differences, and produces a report with charts and plain-English conclusions to guide product decisions.
A retail chain uploads historical sales CSV data across multiple stores. Data-Cog performs time series analysis to identify trends and seasonality, cleans and transforms the data, builds a forecasting model to predict next quarter's sales, and creates presentation-ready charts for executive reporting and inventory planning.
A healthcare provider uploads messy patient records in CSV format to assess data quality. Data-Cog profiles the dataset by analyzing distributions, missing values, outliers, and correlations, cleans inconsistencies like date formats, and generates an HTML report with insights on data integrity for compliance and operational improvements.
A marketing agency uses Data-Cog to analyze customer behavior data from a client. The skill performs clustering to segment customers into natural groups based on purchasing patterns, conducts exploratory analysis to discover trends, and creates interactive dashboards with visualizations to inform targeted marketing campaigns and personalization strategies.
Offer Data-Cog as a monthly or annual subscription service where businesses pay for access to automated data analysis and reporting. This model provides recurring revenue by catering to companies needing regular insights without in-house data science teams, with tiered pricing based on data volume or feature access.
Provide consulting services to integrate Data-Cog into specific business workflows, offering custom analysis, training, and support. This model generates project-based revenue from one-time engagements or retainer contracts, ideal for enterprises with complex data needs requiring tailored solutions and ongoing optimization.
Deploy a freemium model where basic data analysis features are free, but advanced capabilities like ML model evaluation, large dataset processing, or priority support require a paid upgrade. This model drives user adoption through free access while monetizing power users and businesses needing more sophisticated tools.
💬 Integration Tip
Integrate Data-Cog by first installing the cellcog dependency, then use the fire-and-forget pattern with agent chat mode for asynchronous analysis. The daemon's notification signals completion, so there is no need to poll.