tiktok-crawling
Use for TikTok crawling, content retrieval, and analysis.
Install via ClawdBot CLI:
clawdbot install RomneyDa/tiktok-crawling
yt-dlp is a CLI for downloading video/audio from TikTok and many other sites.
# macOS
brew install yt-dlp ffmpeg
# pip (any platform)
pip install yt-dlp
# Also install ffmpeg separately for merging/post-processing
yt-dlp "https://www.tiktok.com/@handle/video/1234567890"
yt-dlp "https://www.tiktok.com/@handle" \
-P "./tiktok/data" \
-o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
--write-info-json
Creates:
tiktok/data/
  handle/
    20260220-7331234567890/
      video.mp4
      video.info.json
for handle in handle1 handle2 handle3; do
  yt-dlp "https://www.tiktok.com/@$handle" \
    -P "./tiktok/data" \
    -o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
    --write-info-json \
    --download-archive "./tiktok/downloaded.txt"
done
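When the list of creators grows, keeping handles in a file is easier than editing the loop. The sketch below is one way to do that; the function name `read_handles` and the file name `handles.txt` are hypothetical and not part of the skill or of yt-dlp.

```bash
#!/usr/bin/env bash
# read_handles FILE: print one cleaned handle per line.
# Skips blank lines, drops anything after '#', and strips a leading '@'.
read_handles() {
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    line="${line%%#*}"                                  # remove comments
    line="$(printf '%s' "$line" | tr -d '[:space:]')"   # remove whitespace
    [ -n "$line" ] || continue                          # skip empty lines
    printf '%s\n' "${line#@}"                           # strip leading '@'
  done < "$1"
}

# Example use (handles.txt is a hypothetical file, one handle per line):
#   read_handles handles.txt | while IFS= read -r handle; do
#     yt-dlp "https://www.tiktok.com/@$handle" \
#       -P "./tiktok/data" \
#       --download-archive "./tiktok/downloaded.txt"
#   done
```

Normalizing the handles first means entries such as `@handle1` and `handle1` both produce the same profile URL.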
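# Search by keyword (assumes your yt-dlp version provides a "tiktoksearch:" prefix;
# check with: yt-dlp --list-extractors | grep -i tiktok)
yt-dlp "tiktoksearch:cooking recipes" --playlist-end 20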
# Hashtag page
yt-dlp "https://www.tiktok.com/tag/booktok" --playlist-end 50
# Videos using a specific sound
yt-dlp "https://www.tiktok.com/music/original-sound-1234567890" --playlist-end 30
# List available formats
yt-dlp -F "https://www.tiktok.com/@handle/video/1234567890"
# Download a specific format ("best" selects the highest quality; pass an ID from -F to pick a non-watermarked variant if one is listed)
yt-dlp -f "best" "https://www.tiktok.com/@handle/video/1234567890"
# On or after a date
--dateafter 20260215
# On or before a date
--datebefore 20260220
# Exact date
--date 20260215
# Date range
--dateafter 20260210 --datebefore 20260220
# Relative dates (macOS / Linux)
--dateafter "$(date -u -v-7d +%Y%m%d)" # macOS: last 7 days
--dateafter "$(date -u -d '7 days ago' +%Y%m%d)" # Linux: last 7 days
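The two relative-date forms above differ only in the `date` syntax, so a script that has to run on both macOS and Linux can wrap them in one helper. This is a minimal sketch; `days_ago` is a hypothetical name, and it assumes that a `date` without GNU's `-d` support exits with an error so the BSD form is tried next.

```bash
#!/usr/bin/env bash
# days_ago N: print the UTC date N days ago as YYYYMMDD.
# Tries GNU date (-d) first, then falls back to BSD/macOS date (-v).
days_ago() {
  local n="$1"
  date -u -d "${n} days ago" +%Y%m%d 2>/dev/null \
    || date -u -v-"${n}"d +%Y%m%d
}

# Example use:
#   yt-dlp "https://www.tiktok.com/@handle" --dateafter "$(days_ago 7)"
```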
# 100k+ views
--match-filters "view_count >= 100000"
# Duration between 30-60 seconds
--match-filters "duration >= 30 & duration <= 60"
# Title contains "recipe" (case-insensitive)
--match-filters "title ~= (?i)recipe"
# Combine: 50k+ views from Feb 2026
yt-dlp "https://www.tiktok.com/@handle" \
--match-filters "view_count >= 50000" \
--dateafter 20260201
yt-dlp "https://www.tiktok.com/@handle" \
--simulate \
--print "%(upload_date)s | %(view_count)s views | %(title)s"
# Single JSON document for the whole profile (videos are listed under .entries)
yt-dlp "https://www.tiktok.com/@handle" --simulate --dump-single-json > handle_videos.json
# JSONL (one object per line, better for large datasets)
yt-dlp "https://www.tiktok.com/@handle" --simulate -j > handle_videos.jsonl
yt-dlp "https://www.tiktok.com/@handle" \
--simulate \
--print-to-file "%(uploader)s,%(id)s,%(upload_date)s,%(view_count)s,%(like_count)s,%(webpage_url)s" \
"./tiktok/analysis/metadata.csv"
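`--print-to-file` appends one line per video and does not write a header row, so the CSV above has no column names. A small helper can create the file with a header before the first run. This is a sketch: `init_csv` is a hypothetical name, and the column list simply mirrors the template used above.

```bash
#!/usr/bin/env bash
# init_csv FILE: create FILE with a header row unless it already has content.
init_csv() {
  local file="$1"
  mkdir -p "$(dirname "$file")"          # make sure the target directory exists
  [ -s "$file" ] || printf '%s\n' \
    "uploader,id,upload_date,view_count,like_count,webpage_url" > "$file"
}

# Example use, before running yt-dlp with --print-to-file:
#   init_csv "./tiktok/analysis/metadata.csv"
```

Because the helper leaves a non-empty file alone, it is safe to call at the start of every scheduled run.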
# Top 10 videos by views from downloaded .info.json files
# (the glob matches the layout tiktok/data/<uploader>/<date>-<id>/video.info.json)
jq -s 'sort_by(.view_count) | reverse | .[:10] | .[] | {title, view_count, url: .webpage_url}' \
  tiktok/data/*/*/*.info.json
# Total views across all videos
jq -s 'map(.view_count) | add' tiktok/data/*/*/*.info.json
# Videos grouped by upload date
jq -s 'group_by(.upload_date) | map({date: .[0].upload_date, count: length})' \
  tiktok/data/*/*/*.info.json
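The same jq approach extends to derived metrics. The sketch below ranks videos by likes per view; `top_by_engagement` is a hypothetical helper, "engagement rate" is defined here simply as `like_count / view_count`, and files without a positive `view_count` are skipped. It uses `find` rather than a glob so it works at any directory depth.

```bash
#!/usr/bin/env bash
# top_by_engagement DIR [N]: print the N videos (default 10) with the highest
# like/view ratio as tab-separated rows: id, view_count, like_count, rate.
top_by_engagement() {
  local dir="$1" n="${2:-10}"
  find "$dir" -type f -name '*.info.json' -exec cat {} + \
    | jq -rs --argjson n "$n" '
        map(select((.view_count // 0) > 0)
            | {id, view_count, like_count: (.like_count // 0)}
            | .rate = (.like_count / .view_count))
        | sort_by(.rate) | reverse | .[:$n] | .[]
        | [.id, .view_count, .like_count, .rate] | @tsv'
}

# Example use:
#   top_by_engagement ./tiktok/data 5
```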
Tip: For deeper analysis and visualization, load JSONL/CSV exports into Python with pandas. Useful for engagement scatter plots, posting frequency charts, or comparing metrics across creators.
The --download-archive flag tracks downloaded videos, enabling incremental updates:
yt-dlp "https://www.tiktok.com/@handle" \
-P "./tiktok/data" \
-o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
--write-info-json \
--download-archive "./tiktok/downloaded.txt"
Run the same command later; it skips videos already in downloaded.txt.
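The archive is a plain text file, so other scripts can consult it before doing extra work. The sketch below assumes each line has the form `<extractor> <id>` (for example `tiktok 7331234567890`); inspect your own downloaded.txt to confirm. `in_archive` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# in_archive FILE ID: succeed if the TikTok video ID is recorded in the archive.
# Assumes archive lines look like "tiktok <id>".
in_archive() {
  local archive="$1" id="$2"
  [ -f "$archive" ] && grep -qxF "tiktok ${id}" "$archive"
}

# Example use:
#   if in_archive ./tiktok/downloaded.txt 7331234567890; then
#     echo "already downloaded"
#   fi
```

Matching the whole line with `grep -x` avoids false positives when one ID is a prefix of another.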
# Use cookies from browser (recommended)
yt-dlp --cookies-from-browser chrome "https://www.tiktok.com/@handle"
# Or export cookies to a file first
yt-dlp --cookies tiktok_cookies.txt "https://www.tiktok.com/@handle"
# crontab -e
# Run daily at 2 AM, log output
0 2 * * * cd /path/to/project && ./scripts/scrape-tiktok.sh >> ./tiktok/logs/cron.log 2>&1
Example scripts/scrape-tiktok.sh:
#!/bin/bash
set -e
HANDLES="handle1 handle2 handle3"
DATA_DIR="./tiktok/data"
ARCHIVE="./tiktok/downloaded.txt"
# Date 7 days ago in UTC: try GNU date (Linux) first, then BSD date (macOS)
SINCE="$(date -u -d '7 days ago' +%Y%m%d 2>/dev/null || date -u -v-7d +%Y%m%d)"
for handle in $HANDLES; do
  echo "[$(date)] Scraping @$handle"
  yt-dlp "https://www.tiktok.com/@$handle" \
    -P "$DATA_DIR" \
    -o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
    --write-info-json \
    --download-archive "$ARCHIVE" \
    --cookies-from-browser chrome \
    --dateafter "$SINCE" \
    --sleep-interval 2 \
    --max-sleep-interval 5
done
echo "[$(date)] Done"
| Problem | Solution |
| ---------------------------------------- | --------------------------------------------------------------------------- |
| Empty results / no videos found          | Add --cookies-from-browser chrome; TikTok rate-limits anonymous requests    |
| 403 Forbidden errors | Rate limited. Wait 10-15 min, or use cookies/different IP |
| "Video unavailable" | Region-locked. Try --geo-bypass or a VPN |
| Watermarked videos | Check -F for alternative formats; some may lack watermark |
| Slow downloads                           | Add --concurrent-fragments 4 (speeds up fragmented HLS/DASH formats only)   |
| Profile shows fewer videos than expected | TikTok API limits. Use --playlist-end N explicitly, try with cookies |
# Verbose output to diagnose issues
yt-dlp -v "https://www.tiktok.com/@handle" 2>&1 | tee debug.log
| Option | Description |
| ----------------------------- | ------------------------------------------- |
| -o TEMPLATE | Output filename template |
| -P PATH | Base download directory |
| --dateafter DATE | Videos on/after date (YYYYMMDD) |
| --datebefore DATE | Videos on/before date |
| --playlist-end N | Stop after N videos |
| --match-filters EXPR | Filter by metadata (views, duration, title) |
| --write-info-json | Save metadata JSON per video |
| --download-archive FILE | Track downloads, skip duplicates |
| --simulate / -s | Dry run, no download |
| -j / --dump-json | Output metadata as JSON |
| --cookies-from-browser NAME | Use cookies from browser |
| --sleep-interval SEC | Wait between downloads (avoid rate limits) |
| Variable | Example Output |
| ----------------- | ----------------------- |
| %(id)s | 7331234567890 |
| %(uploader)s | handle |
| %(upload_date)s | 20260215 |
| %(title).50s | First 50 chars of title |
| %(view_count)s | 1500000 |
| %(like_count)s | 250000 |
| %(ext)s | mp4 |
Generated Mar 1, 2026
Brands can use this skill to monitor TikTok content related to their products or campaigns. By downloading videos and metadata from specific hashtags or creators, they can analyze engagement metrics like views and likes to measure campaign performance and identify trends.
News organizations can scrape TikTok for viral videos and trending topics to gather user-generated content for reporting. Using date filters and metadata exports, they can quickly compile relevant videos on events like protests or natural disasters, enhancing real-time news coverage.
Influencers and content creators can analyze competitors' TikTok profiles to benchmark performance. By downloading entire profiles with metrics filtering, they can study posting frequency, popular video formats, and engagement patterns to optimize their own content strategy.
Researchers can use this skill to collect TikTok data for studies on digital culture or youth behavior. By exporting metadata to JSON or CSV, they can perform statistical analysis on large datasets, such as tracking hashtag evolution or correlating video metrics with social phenomena.
Law firms or investigators can scrape TikTok videos for evidence in cases involving intellectual property infringement or defamation. The skill allows downloading specific videos with timestamps and metadata, providing verifiable records for legal proceedings or compliance audits.
Build a subscription-based platform that automates TikTok scraping for clients, offering dashboards with insights on engagement and trends. Revenue comes from monthly fees based on data volume and features like automated reports or API access.
Sell curated TikTok datasets to market research firms or advertisers. By scraping profiles, hashtags, and sounds at scale, you can package metadata into industry-specific reports, generating revenue through one-time sales or licensing agreements.
Offer managed services to brands for ongoing TikTok monitoring and analysis. Use scheduled scraping with authentication to track competitors and campaigns, charging clients on a retainer basis for regular updates and strategic recommendations.
Integration Tip
Integrate this skill with Python scripts using subprocess calls to automate yt-dlp commands, and store outputs in cloud storage like AWS S3 for scalable data management.