data-pipelines
Deep data pipeline workflow: ingestion, orchestration, idempotency, data quality, SLAs, observability, and lineage. Use when building batch/stream pipelines,...
Install via ClawdBot CLI:
clawdbot install mike47512/data-pipelines
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Generated Apr 17, 2026
A retail company needs to ingest daily order data from multiple sources (e.g., website, mobile app) into a data warehouse for analytics. The pipeline must handle schema changes from API updates, ensure idempotency to avoid duplicate orders during backfills, and meet SLAs for freshness to support real-time inventory dashboards.
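A minimal sketch of the idempotent-load idea in this scenario, assuming orders carry a stable `order_id` and using an in-memory SQLite table as a stand-in for the warehouse. The table layout, `load_orders`, and the sample rows are hypothetical and not part of the skill itself.

```python
import sqlite3

def load_orders(conn, orders):
    """Write orders keyed by order_id so a rerun replaces rows instead of adding them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id TEXT PRIMARY KEY, source TEXT, amount REAL)"
    )
    # INSERT OR REPLACE overwrites any existing row with the same primary key.
    conn.executemany(
        "INSERT OR REPLACE INTO orders (order_id, source, amount) VALUES (?, ?, ?)",
        orders,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
batch = [("o-1", "website", 19.99), ("o-2", "mobile_app", 5.00)]  # hypothetical data
load_orders(conn, batch)
load_orders(conn, batch)  # simulated backfill rerun of the same batch
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count)
```

Keying the write on a natural identifier is one common way to make backfills safe to repeat; a real warehouse would typically use its own merge or upsert statement instead.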
A fintech firm builds a streaming pipeline to process real-time transactions from Kafka for fraud detection. It requires observability to debug job failures, data quality checks to flag anomalies in transaction amounts, and lineage tracking for regulatory compliance audits on transaction sources.
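One way to picture the amount-anomaly check is a robust outlier rule on a window of transactions. The sketch below is illustrative only: the function name, the threshold of 5, and the sample amounts are assumptions, and it uses the median absolute deviation (MAD) rather than any method named by the skill.

```python
from statistics import median

def flag_amount_anomalies(amounts, threshold=5.0):
    """Return indexes of amounts whose distance from the median exceeds threshold * MAD."""
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # All typical values are identical; anything different is suspicious.
        return [i for i, a in enumerate(amounts) if a != med]
    return [i for i, a in enumerate(amounts) if abs(a - med) / mad > threshold]

amounts = [12.0, 15.5, 11.0, 14.0, 13.0, 9800.0]  # hypothetical window of transactions
flagged = flag_amount_anomalies(amounts)
print(flagged)
```

In a streaming job the same rule would run per window or per account; flagged records would usually be routed to a review queue rather than dropped.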
A hospital system implements a batch pipeline to aggregate patient records from various EHR systems using Airflow. The workflow must manage schema evolution as new data fields are added, enforce data quality rules for completeness, and document lineage for HIPAA compliance and operational ownership.
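The completeness and schema-evolution checks described here could look roughly like the following sketch. The field names and the `check_record` helper are hypothetical; real EHR schemas and rules would come from the source contracts.

```python
# Hypothetical field sets; real ones would come from the source contract.
REQUIRED_FIELDS = {"patient_id", "encounter_date"}
KNOWN_FIELDS = REQUIRED_FIELDS | {"diagnosis_code"}

def check_record(record):
    """Report required fields that are empty and fields the schema has not seen before."""
    missing = sorted(f for f in REQUIRED_FIELDS if record.get(f) in (None, ""))
    new_fields = sorted(set(record) - KNOWN_FIELDS)
    return {"missing": missing, "new_fields": new_fields}

result = check_record(
    {"patient_id": "p-1", "encounter_date": None, "blood_type": "O+"}
)
print(result)
```

Reporting new fields separately from missing ones lets the pipeline keep running when a source adds a column while still failing records that break a completeness rule.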
A manufacturing company sets up a pipeline to ingest streaming sensor data from factory equipment via Spark. It focuses on orchestration with retries for network failures, monitoring for SLA misses on data freshness, and idempotent sinks to handle late-arriving data without duplication in time-series databases.
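Retries for transient network failures are usually handled by the orchestrator, but the underlying pattern can be sketched in a few lines. Everything here is an assumption for illustration: `with_retries`, the delay values, and the simulated `flaky_fetch` task.

```python
import time

def with_retries(task, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call task, retrying on ConnectionError with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))

calls = {"count": 0}

def flaky_fetch():
    """Hypothetical sensor fetch that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "sensor batch"

outcome = with_retries(flaky_fetch)
print(outcome, calls["count"])
```

Retries are only safe when the sink is idempotent, which is why the scenario pairs the two: a retried write must not duplicate points already stored.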
A streaming service uses a batch pipeline to process user viewing history daily for recommendation algorithms. The pipeline requires source contracts to handle API rate limits from content databases, quality checks for null values in user ratings, and clear DAGs for dependencies to ensure timely model updates.
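The "clear DAGs for dependencies" point can be illustrated with the standard library's `graphlib` (Python 3.9+), which orders tasks so each runs after the tasks it depends on. The task names below are hypothetical; an orchestrator would express the same graph in its own DAG syntax.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
dag = {
    "extract_viewing_history": set(),
    "check_rating_nulls": {"extract_viewing_history"},
    "build_features": {"check_rating_nulls"},
    "update_model": {"build_features"},
}

run_order = list(TopologicalSorter(dag).static_order())
print(run_order)
```

Placing the null-value check between extraction and feature building means a failed quality check blocks the model update instead of feeding it bad ratings.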
Companies offer curated datasets to clients via subscription, using pipelines to ingest, transform, and deliver data with SLAs on freshness. Revenue comes from monthly or annual fees based on data volume and quality guarantees, leveraging idempotency and monitoring to ensure reliable service.
Firms provide custom pipeline development and optimization services for enterprises, charging project-based or retainer fees. Revenue is generated by designing workflows for specific use cases like ETL/ELT, with a focus on lineage and operations to reduce client downtime and improve data reliability.
Software vendors integrate data pipeline capabilities into their products (e.g., CRM or marketing tools), enabling users to sync external data sources. Revenue models include tiered pricing based on pipeline complexity and data volume, with upsells for advanced features like observability and quality checks.
💬 Integration Tip
Pair this skill with etl-design for batch optimization and message-queues for streaming handoffs to enhance pipeline reliability and performance.
Scored Apr 19, 2026