sql-performance-review

Review and improve Databricks SQL queries for correctness, readability, and performance (joins, filters, aggregations, partition pruning). Use when someone pastes a SQL query, asks why it is slow, or requests a rewrite/optimization in Databricks SQL.

Updated 1/24/2026
name: sql-performance-review
description: Review and improve Databricks SQL queries for correctness, readability, and performance (joins, filters, aggregations, partition pruning). Use when someone pastes a SQL query, asks why it is slow, or requests a rewrite/optimization in Databricks SQL.
version: "1.0"

Databricks SQL performance review

Use this skill when optimizing or reviewing SQL in Databricks SQL.

What to ask for (only if missing)

Ask up to 3 questions total:

  1. The query text (if not provided)
  2. The table(s) involved + their sizes (rough order of magnitude) OR the query profile / execution plan
  3. The constraints on the result (must it be exact or is an approximation acceptable, and what latency SLA applies)

If the user can’t provide sizes/plan, proceed with best-effort heuristics and call out assumptions.

Output format

Use the structure in assets/sql-review-output.md.

Checklist

Use references/sql-checklist.md to ensure you cover the common performance levers:

  • predicate pushdown / partition pruning
  • join strategy and join keys
  • avoid SELECT *
  • minimize shuffles / wide aggregations
  • use correct data types and avoid implicit casts
  • reduce data scanned (pre-filter, semi-joins, EXISTS)
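Several of these levers can be shown in one small rewrite. The sketch below is illustrative only: it uses Python's built-in sqlite3 as a local stand-in so the two queries can be run side by side, and the tables `orders` and `refunds` and their columns are hypothetical. It does not exercise anything specific to Databricks SQL. The rewrite assumes the caller only needs the orders that have at least one refund, not the refund details themselves.

```python
import sqlite3

# sqlite3 is only a local stand-in so both queries can be run and compared;
# the tables and columns (orders, refunds, order_id, ...) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT, amount REAL);
    CREATE TABLE refunds (order_id INTEGER, reason TEXT);
    INSERT INTO orders VALUES
        (1, 10, '2026-01-05', 20.0),
        (2, 11, '2026-01-06', 35.0),
        (3, 10, '2025-12-30', 50.0);
    INSERT INTO refunds VALUES (1, 'damaged'), (1, 'late'), (3, 'damaged');
""")

# Before: SELECT * carries every column of both tables through the join,
# and order 1 is repeated once per matching refund row.
before_sql = """
    SELECT *
    FROM orders o
    JOIN refunds r ON o.order_id = r.order_id
    WHERE o.order_date >= '2026-01-01'
"""

# After: name only the needed columns, filter on the raw date column,
# and use EXISTS (a semi-join) so each order appears at most once.
after_sql = """
    SELECT o.order_id, o.customer_id, o.amount
    FROM orders o
    WHERE o.order_date >= '2026-01-01'
      AND EXISTS (SELECT 1 FROM refunds r WHERE r.order_id = o.order_id)
"""

before_rows = conn.execute(before_sql).fetchall()
after_rows = conn.execute(after_sql).fetchall()
print(len(before_rows), "rows before;", len(after_rows), "rows after")
print(after_rows)
```

Whether the EXISTS form is actually faster on a given Databricks table depends on data sizes and the plan the engine chooses, so treat it as a candidate to verify rather than a guaranteed win.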

Examples

User: “This query is slow in Databricks SQL. Can you optimize it?” (pastes query)
Assistant: Lists the issues found, suggests fixes, and provides a rewritten query, plus next steps (run EXPLAIN, consider ZORDER, etc.).

Edge cases

  • If the query is logically wrong (duplicates from joins, missing filters), fix correctness first.
  • If the tables are Delta tables, suggest partitioning, ZORDER, or OPTIMIZE only when they match the query's filter and join patterns.
  • If the user is in a governed environment: avoid suggestions that require elevated permissions unless noted.
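The first edge case, duplicates from joins, is worth a concrete illustration because it silently changes results. The sketch below is a minimal example under stated assumptions: sqlite3 serves only as a local stand-in for running standard SQL, and the tables `orders` and `shipments` are hypothetical. It shows a one-to-many join inflating a SUM, and one possible fix that pre-aggregates the many side before joining.

```python
import sqlite3

# sqlite3 is a local stand-in only; orders and shipments are hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER, order_id INTEGER);
    INSERT INTO orders VALUES (1, 10, 100.0), (2, 10, 50.0);
    INSERT INTO shipments VALUES (100, 1), (101, 1), (102, 2);
""")

# Wrong: order 1 has two shipments, so its amount is summed twice.
fanout_sql = """
    SELECT o.customer_id,
           SUM(o.amount)        AS revenue,
           COUNT(s.shipment_id) AS shipments
    FROM orders o
    JOIN shipments s ON s.order_id = o.order_id
    GROUP BY o.customer_id
"""

# Fixed: collapse shipments to one row per order before joining,
# so each order's amount is counted exactly once.
fixed_sql = """
    WITH shipments_per_order AS (
        SELECT order_id, COUNT(*) AS shipment_count
        FROM shipments
        GROUP BY order_id
    )
    SELECT o.customer_id,
           SUM(o.amount)          AS revenue,
           SUM(sp.shipment_count) AS shipments
    FROM orders o
    JOIN shipments_per_order sp ON sp.order_id = o.order_id
    GROUP BY o.customer_id
"""

fanout = conn.execute(fanout_sql).fetchall()
fixed = conn.execute(fixed_sql).fetchall()
print("fan-out result:", fanout)
print("fixed result:  ", fixed)
```

Pre-aggregating also tends to shrink the join input, so the correctness fix and the performance goal often point the same way here, though that should be confirmed against the real query plan.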
