data-pipelines
Deep data pipeline workflow: ingestion, orchestration, idempotency, data quality, SLAs, observability, and lineage. Use when building batch/stream pipelines,...
Install via ClawdBot CLI:
clawdbot install mike47512/data-pipelines
Grade: Fair, based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Generated Apr 17, 2026
A retail company needs to ingest daily order data from multiple sources (e.g., website, mobile app) into a data warehouse for analytics. The pipeline must handle schema changes from API updates, ensure idempotency to avoid duplicate orders during backfills, and meet SLAs for freshness to support real-time inventory dashboards.
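The idempotency requirement in the retail scenario can be sketched as a keyed upsert. This is a minimal illustration, not the skill's own implementation: the `orders` table, its columns, and the helper names `load_orders` and `count_orders` are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3


def load_orders(conn, orders):
    """Write orders keyed by order_id so that reruns replace rows instead of duplicating them."""
    # Hypothetical schema: order_id is the natural key shared by all sources.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id TEXT PRIMARY KEY, source TEXT, amount REAL)"
    )
    # INSERT OR REPLACE overwrites any existing row with the same primary key.
    conn.executemany(
        "INSERT OR REPLACE INTO orders (order_id, source, amount) VALUES (?, ?, ?)",
        orders,
    )
    conn.commit()


def count_orders(conn):
    """Return the number of rows currently in the orders table."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


conn = sqlite3.connect(":memory:")
batch = [("o-1", "website", 19.99), ("o-2", "mobile_app", 5.00)]
load_orders(conn, batch)
load_orders(conn, batch)  # simulated backfill rerun of the same day
print(count_orders(conn))  # prints 2
```

Because the write is keyed on `order_id`, replaying a day during a backfill leaves the row count unchanged.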
A fintech firm builds a streaming pipeline to process real-time transactions from Kafka for fraud detection. It requires observability to debug job failures, data quality checks to flag anomalies in transaction amounts, and lineage tracking for regulatory compliance audits on transaction sources.
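One way to flag anomalous transaction amounts, as the fintech scenario describes, is a robust outlier check based on the median absolute deviation (MAD). The sketch below is an assumption about how such a check might look; the function name `flag_anomalous_amounts` and the default threshold are illustrative, and a production fraud pipeline would likely use richer signals.

```python
from statistics import median


def flag_anomalous_amounts(amounts, threshold=5.0):
    """Return the indices of amounts that lie more than `threshold` MADs from the median."""
    if not amounts:
        return []
    med = median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # At least half the values equal the median; flag anything that differs.
        return [i for i, a in enumerate(amounts) if a != med]
    return [i for i, a in enumerate(amounts) if abs(a - med) / mad > threshold]


amounts = [20.0, 25.0, 22.0, 24.0, 21.0, 5000.0]
print(flag_anomalous_amounts(amounts))  # prints [5]
```

The median-based spread is used here instead of the standard deviation because a single very large transaction would otherwise inflate the spread and hide itself.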
A hospital system implements a batch pipeline to aggregate patient records from various EHR systems using Airflow. The workflow must manage schema evolution as new data fields are added, enforce data quality rules for completeness, and document lineage for HIPAA compliance and operational ownership.
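The schema-evolution and completeness requirements in the hospital scenario might be handled by normalizing each record against a known field list. This is only a sketch: the field names in `REQUIRED_FIELDS` and `KNOWN_FIELDS` and the function `normalize_record` are hypothetical and do not come from any real EHR schema.

```python
# Hypothetical field lists, for illustration only.
REQUIRED_FIELDS = ("patient_id", "admitted_at")
KNOWN_FIELDS = REQUIRED_FIELDS + ("ward",)


def normalize_record(raw):
    """Map a raw record onto the known schema and report missing required fields.

    Fields the pipeline does not know yet are kept under "extras" rather than
    dropped, so a newly added upstream field does not break the load.
    """
    record = {name: raw.get(name) for name in KNOWN_FIELDS}
    record["extras"] = {k: v for k, v in raw.items() if k not in KNOWN_FIELDS}
    # Completeness rule: a required field must be present and non-empty.
    missing = [name for name in REQUIRED_FIELDS if record[name] in (None, "")]
    return record, missing


record, missing = normalize_record(
    {"patient_id": "p-1", "admitted_at": "2026-01-01", "blood_type": "O+"}
)
print(record["extras"], missing)  # prints {'blood_type': 'O+'} []
```

Records with a non-empty `missing` list could be routed to a quarantine table for review instead of failing the whole batch.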
A manufacturing company sets up a pipeline to ingest streaming sensor data from factory equipment via Spark. It focuses on orchestration with retries for network failures, monitoring for SLA misses on data freshness, and idempotent sinks to handle late-arriving data without duplication in time-series databases.
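Retries for network failures, as in the manufacturing scenario, are commonly implemented with exponential backoff. The helper below is a minimal sketch under assumed names (`with_retries`, `flaky_fetch`); orchestrators such as Airflow provide their own retry settings, which this example does not model.

```python
import time


def with_retries(fn, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn, retrying on ConnectionError with exponentially growing delays.

    The sleep function is injectable so the backoff schedule can be tested
    without actually waiting.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the orchestrator
            sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_fetch():
    """Hypothetical sensor read that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("sensor gateway unreachable")
    return "reading"


delays = []
print(with_retries(flaky_fetch, sleep=delays.append), delays)  # prints reading [0.5, 1.0]
```

Only `ConnectionError` is retried here; errors such as malformed payloads are left to fail fast, since retrying them would not help.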
A streaming service uses a batch pipeline to process user viewing history daily for recommendation algorithms. The pipeline requires source contracts to handle API rate limits from content databases, quality checks for null values in user ratings, and clear DAGs for dependencies to ensure timely model updates.
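The "clear DAGs for dependencies" point in the streaming-service scenario can be illustrated with Python's standard `graphlib` module (Python 3.9+). The task names below are hypothetical; the sketch only shows how declared dependencies determine a valid execution order.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (hypothetical names).
dependencies = {
    "transform_history": {"ingest_viewing_history"},
    "check_rating_nulls": {"transform_history"},
    "update_recommendation_model": {"check_rating_nulls"},
}


def execution_order(deps):
    """Return tasks in an order that respects every declared dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

`TopologicalSorter` raises `graphlib.CycleError` if the declared dependencies contain a cycle, which surfaces a misconfigured DAG before any task runs.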
Companies offer curated datasets to clients via subscription, using pipelines to ingest, transform, and deliver data with SLAs on freshness. Revenue comes from monthly or annual fees based on data volume and quality guarantees, leveraging idempotency and monitoring to ensure reliable service.
Firms provide custom pipeline development and optimization services for enterprises, charging project-based or retainer fees. Revenue is generated by designing workflows for specific use cases like ETL/ELT, with a focus on lineage and operations to reduce client downtime and improve data reliability.
Software vendors integrate data pipeline capabilities into their products (e.g., CRM or marketing tools), enabling users to sync external data sources. Revenue models include tiered pricing based on pipeline complexity and data volume, with upsells for advanced features like observability and quality checks.
💬 Integration Tip
Pair this skill with etl-design for batch optimization and message-queues for streaming handoffs to enhance pipeline reliability and performance.
Scored Apr 19, 2026