
# crawl-url

Crawl any website and save pages as local markdown files. Use when you need to download documentation, knowledge bases, or web content for offline access or analysis. No code required - just provide a URL.
"Crawl any website and save pages as local markdown files. Use when you need to download documentation, knowledge bases, or web content for offline access or analysis. No code required - just provide a URL."
## URL Crawler

Crawls websites using the Tavily Crawl API and saves each page as a separate markdown file in a flat directory structure.
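At its core, the crawl is a single authenticated request to Tavily's crawl endpoint. The sketch below shows how such a request might be assembled; the endpoint path and parameter names (`max_depth`, `max_breadth`, `limit`, `instructions`) are assumptions based on Tavily's crawl API and should be verified against the current API reference.

```python
import os

# Assumed endpoint path; check Tavily's API reference for the current URL.
TAVILY_CRAWL_ENDPOINT = "https://api.tavily.com/crawl"

def build_crawl_request(url, instruction=None, depth=2, breadth=50, limit=50):
    """Build the headers and JSON payload for a Tavily crawl request.

    The payload keys mirror the CLI options described in this document;
    the exact names Tavily expects are an assumption here.
    """
    api_key = os.environ.get("TAVILY_API_KEY")
    if not api_key:
        raise RuntimeError("TAVILY_API_KEY is not set; see Prerequisites")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "url": url,
        "max_depth": depth,
        "max_breadth": breadth,
        "limit": limit,
    }
    if instruction:
        payload["instructions"] = instruction
    return headers, payload
```

The returned headers and payload would then be sent with any HTTP client (e.g. a `POST` via `requests`); the response contains the crawled pages to be written out as markdown.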
## Prerequisites

**Tavily API Key Required** - Get your key at https://tavily.com

Add to `~/.claude/settings.json`:

```json
{
  "env": {
    "TAVILY_API_KEY": "tvly-your-api-key-here"
  }
}
```

Restart Claude Code after adding your API key.
## When to Use

Use this skill when the user wants to:

- Crawl and extract content from a website
- Download API documentation, framework docs, or knowledge bases
- Save web content locally for offline access or analysis
## Usage

Execute the crawl script with a URL and an optional instruction:

```bash
python scripts/crawl_url.py <URL> [--instruction "guidance text"]
```
### Required Parameters

- `URL`: The website to crawl (e.g., https://docs.stripe.com/api)

### Optional Parameters

- `--instruction, -i`: Natural language guidance for the crawler (e.g., "Focus on API endpoints only")
- `--output, -o`: Output directory (default: `<repo_root>/crawled_context/<domain>`)
- `--depth, -d`: Max crawl depth (default: 2, range: 1-5)
- `--breadth, -b`: Max links per level (default: 50)
- `--limit, -l`: Max total pages to crawl (default: 50)
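The options above map naturally onto a standard-library `argparse` interface. This is a sketch of how the script's CLI might be defined, with defaults taken from the parameter descriptions; the real `crawl_url.py` may differ in detail.

```python
import argparse

def build_parser():
    """CLI mirroring the documented options (a sketch, not the actual script)."""
    parser = argparse.ArgumentParser(
        description="Crawl a website and save pages as local markdown files")
    parser.add_argument("url",
                        help="The website to crawl, e.g. https://docs.stripe.com/api")
    parser.add_argument("--instruction", "-i", default=None,
                        help="Natural language guidance for the crawler")
    parser.add_argument("--output", "-o", default=None,
                        help="Output directory "
                             "(default: <repo_root>/crawled_context/<domain>)")
    parser.add_argument("--depth", "-d", type=int, default=2,
                        choices=range(1, 6), metavar="1-5",
                        help="Max crawl depth (default: 2)")
    parser.add_argument("--breadth", "-b", type=int, default=50,
                        help="Max links per level (default: 50)")
    parser.add_argument("--limit", "-l", type=int, default=50,
                        help="Max total pages to crawl (default: 50)")
    return parser
```

Constraining `--depth` with `choices=range(1, 6)` makes argparse reject out-of-range values with a usage error instead of failing mid-crawl.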
## Output

The script creates a flat directory structure at `<repo_root>/crawled_context/<domain>/` with one markdown file per crawled page. Filenames are derived from URLs (e.g., `docs_stripe_com_api_authentication.md`).

Each markdown file includes:

- Frontmatter with the source URL and crawl timestamp
- The extracted content in markdown format
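One plausible way to derive the flat filenames and frontmatter described above is sketched below. The exact sanitization rules and frontmatter fields used by the real script may differ; the helper names here are illustrative.

```python
import re
from datetime import datetime, timezone
from urllib.parse import urlparse

def url_to_filename(url):
    """Flatten a URL into a safe markdown filename, e.g.
    https://docs.stripe.com/api/authentication
    -> docs_stripe_com_api_authentication.md
    """
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    # Collapse every run of non-alphanumeric characters into a single underscore.
    safe = re.sub(r"[^A-Za-z0-9]+", "_", raw).strip("_")
    return safe + ".md"

def render_page(url, content):
    """Prepend YAML frontmatter with the source URL and a crawl timestamp."""
    timestamp = datetime.now(timezone.utc).isoformat()
    return f"---\nsource: {url}\ncrawled_at: {timestamp}\n---\n\n{content}"
```

Note that this mapping is lossy: distinct URLs can collapse to the same filename, which is why later files overwrite earlier ones (see Important Notes).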
## Examples

### Basic Crawl

```bash
python scripts/crawl_url.py https://docs.anthropic.com
```

Crawls the Anthropic docs with default settings and saves to `<repo_root>/crawled_context/docs_anthropic_com/`.

### With Instruction

```bash
python scripts/crawl_url.py https://react.dev --instruction "Focus on API reference pages and hooks documentation"
```

Uses a natural language instruction to guide the crawler toward specific content.

### Custom Output Directory

```bash
python scripts/crawl_url.py https://docs.stripe.com/api -o ./stripe-api-docs
```

Saves results to a custom directory.

### Adjust Crawl Parameters

```bash
python scripts/crawl_url.py https://nextjs.org/docs --depth 3 --breadth 100 --limit 200
```

Increases crawl depth, breadth, and page limit for more comprehensive coverage.
## Important Notes

- **API Key Required**: Set the `TAVILY_API_KEY` environment variable (loaded from `.env` if available)
- **Crawl Time**: Deeper crawls take longer (depth 3+ may take many minutes)
- **Filename Safety**: URLs are converted to safe filenames automatically
- **Flat Structure**: All files are saved in the `<repo_root>/crawled_context/<domain>/` directory regardless of the original URL hierarchy
- **Duplicate Prevention**: Files are overwritten if URLs generate identical filenames
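The ".env if available" fallback mentioned in the notes can be approximated without extra dependencies. The real script may use a library such as python-dotenv; this is a minimal, standard-library-only sketch of the same behavior.

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader: set KEY=VALUE pairs into os.environ
    without overwriting variables that are already set.
    Skips blank lines, comments, and lines without '='.
    """
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault keeps any value already exported in the shell.
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

Using `setdefault` means a key exported in the shell (or via `~/.claude/settings.json`) always wins over the `.env` file.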