senior-data-engineer
Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka,...
Install via ClawdBot CLI:
clawdbot install alirezarezvani/senior-data-engineer

Grade: Good — based on market validation, documentation quality, package completeness, maintenance status, and authenticity signals.
Generated Mar 1, 2026
A retail company needs to analyze daily sales data from multiple online platforms to track revenue, customer behavior, and inventory trends. This involves building a batch ETL pipeline to ingest data from PostgreSQL databases, transform it using dbt models for dimensional modeling, and load it into Snowflake for BI dashboards, with data quality checks to ensure accuracy.
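As a rough illustration of how the orchestration layer for such a pipeline might look, here is a minimal sketch of an Airflow DAG wiring the extract, dbt transform, and quality-check steps together. It assumes Airflow 2.4+ and a dbt project targeting Snowflake; the DAG id, extract script, and dbt project path are hypothetical placeholders, not part of this skill's actual output.

```python
# Minimal sketch, assuming Airflow 2.4+ and a dbt project targeting Snowflake.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_sales_etl",          # hypothetical name
    start_date=datetime(2026, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Extract: pull the day's orders from the source PostgreSQL databases.
    extract = BashOperator(
        task_id="extract_orders",
        bash_command="python extract_orders.py --date {{ ds }}",  # hypothetical script
    )

    # Transform: dbt builds the dimensional models inside Snowflake.
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/sales",  # hypothetical path
    )

    # Quality gate: dbt tests must pass before BI dashboards see new data.
    test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/sales",
    )

    extract >> transform >> test
```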
A financial services firm requires a streaming data pipeline to monitor transactions in real-time for fraudulent activities. Using Kafka for event ingestion and Spark for processing, the system analyzes patterns and triggers alerts, with Airflow orchestrating batch jobs for historical data reconciliation and compliance reporting.
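A minimal sketch of the streaming leg of such a system, assuming Spark Structured Streaming with the Kafka connector on the classpath. The broker address, topic name, event schema, and the single amount threshold are all hypothetical; a production system would score patterns across windows and accounts rather than apply one rule.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("fraud-monitor").getOrCreate()

# Hypothetical transaction event schema.
schema = (
    StructType()
    .add("account_id", StringType())
    .add("amount", DoubleType())
    .add("merchant", StringType())
)

# Ingest raw transaction events from Kafka and parse the JSON payload.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "transactions")               # hypothetical topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("tx"))
    .select("tx.*")
)

# Naive rule for illustration only: flag unusually large transactions.
alerts = events.filter(F.col("amount") > 10_000)

# Console sink stands in for a real alerting channel.
alerts.writeStream.format("console").outputMode("append").start().awaitTermination()
```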
A healthcare provider aims to consolidate patient records from various sources like EHR systems and IoT devices into a unified data lakehouse. This scenario involves designing an ELT pipeline with data quality frameworks to ensure HIPAA compliance, using dbt for transformations and Airflow for scheduling incremental loads to support analytics on patient outcomes.
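One piece of the data quality framework could be a pre-load gate on each incremental batch. The sketch below assumes records arrive as Python dicts from the extraction step; the field names and rules are hypothetical, and actual HIPAA controls (encryption, access auditing, de-identification) sit at the platform layer rather than in a check like this.

```python
REQUIRED_FIELDS = {"patient_id", "source_system", "updated_at"}

def validate_batch(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split an incremental batch into loadable rows and quarantined rows."""
    good, quarantined = [], []
    seen_ids = set()
    for rec in records:
        # Quarantine rows missing required fields or duplicated within the batch.
        if not REQUIRED_FIELDS.issubset(rec) or rec["patient_id"] in seen_ids:
            quarantined.append(rec)
            continue
        seen_ids.add(rec["patient_id"])
        good.append(rec)
    return good, quarantined

good, bad = validate_batch([
    {"patient_id": "p1", "source_system": "ehr", "updated_at": "2026-03-01"},
    {"patient_id": "p1", "source_system": "iot", "updated_at": "2026-03-01"},  # duplicate
    {"patient_id": "p2"},                                                      # incomplete
])
print(len(good), len(bad))  # 1 2
```

Quarantined rows would typically be routed to a dead-letter table for review rather than silently dropped.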
A logistics company seeks to optimize supply chain operations by processing real-time sensor data from shipments. The pipeline uses Kafka for streaming IoT events, Spark for aggregating metrics like delivery times, and batch ETL with dbt to model data in a data warehouse, enabling predictive analytics for route planning and inventory management.
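For the aggregation step, a windowed Spark Structured Streaming job is one plausible shape. As before, the broker, topic, schema, window size, and watermark are hypothetical placeholders chosen for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("shipment-metrics").getOrCreate()

# Hypothetical shipment sensor event schema.
schema = (
    StructType()
    .add("route_id", StringType())
    .add("transit_minutes", DoubleType())
    .add("event_time", TimestampType())
)

# Ingest and parse shipment sensor events from Kafka.
shipments = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
    .option("subscribe", "shipment-events")             # hypothetical topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Average transit time per route over 15-minute tumbling windows; the
# watermark bounds state for late-arriving sensor events.
metrics = (
    shipments
    .withWatermark("event_time", "30 minutes")
    .groupBy(F.window("event_time", "15 minutes"), "route_id")
    .agg(F.avg("transit_minutes").alias("avg_transit_minutes"))
)

metrics.writeStream.format("console").outputMode("update").start().awaitTermination()
```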
A subscription-based service offering data engineering tools and managed pipelines to businesses, generating revenue through tiered pricing based on data volume and features like real-time processing or advanced analytics. This model leverages the skill's expertise in scalable infrastructure to provide turnkey solutions for clients.
Providing expert data engineering consulting to enterprises for designing and implementing custom data architectures, such as building ETL pipelines or setting up DataOps practices. Revenue comes from project-based contracts or hourly rates, utilizing the skill's workflows for pipeline development and troubleshooting.
Creating and selling proprietary data products, like pre-built analytics dashboards or data quality frameworks, that integrate with clients' existing systems. This model monetizes the skill's capabilities in data modeling and pipeline orchestration to deliver value-added insights and tools.
💬 Integration Tip
Integrate this skill with your existing CI/CD pipelines so that data workflows are tested and deployed automatically, and confirm compatibility with cloud platforms such as AWS or GCP before relying on it for scalable infrastructure management. A sketch of one such CI gate follows.
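Assuming the workflows are Airflow DAGs checked into the repository (the dags/ path is a hypothetical placeholder), a minimal CI check can fail the build whenever any DAG no longer imports cleanly, before anything reaches AWS or GCP:

```python
from airflow.models import DagBag

def test_dags_import_cleanly():
    # Load every DAG file the way the scheduler would, without running tasks.
    dag_bag = DagBag(dag_folder="dags/", include_examples=False)  # hypothetical path
    assert not dag_bag.import_errors, f"Broken DAGs: {dag_bag.import_errors}"
```

Run under pytest as a pipeline step, this catches syntax and import regressions before deployment.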
Scored Apr 15, 2026
Related skills
Use the @steipete/oracle CLI to bundle a prompt plus the right files and get a second-model review (API or browser) for debugging, refactors, design checks, or cross-validation.
Local search/indexing CLI (BM25 + vectors + rerank) with MCP mode.
Design data models for construction projects. Create entity-relationship diagrams, define schemas, and generate database structures.
MarkItDown is a Python utility from Microsoft for converting various files (PDF, Word, Excel, PPTX, Images, Audio) to Markdown. Useful for extracting structu...
Connect to Supabase for database operations, vector search, and storage. Use for storing data, running SQL queries, similarity search with pgvector, and managing tables. Triggers on requests involving databases, vector stores, embeddings, or Supabase specifically.
Use when designing database schemas, writing migrations, optimizing SQL queries, fixing N+1 problems, creating indexes, setting up PostgreSQL, configuring EF Core, implementing caching, partitioning tables, or any database performance question.