explore-run

Rigor Improve / Rigor Explore run leaf skill for bounded exploratory evidence in deep learning research repositories. Use when the researcher explicitly authorizes exploratory runs such as small-subset validation, short-cycle guess-and-check, batch sweeps, idle-GPU search, or quick transfer-learning trials, with fair-comparison caveats and no-overclaim summaries in `explore_outputs/`. Do not use for end-to-end exploration orchestration on top of `current_research`, trusted baseline execution, conservative training verification, default routing, verified SOTA claims, or implicit experimentation.

Updated 6/22/2026

SKILL.md

name

explore-run

description

explore-run

Use this as the Rigor Improve / Rigor Explore run leaf skill. The installed slug
remains explore-run for compatibility.

Use the shared operating principles in
../../references/agent-operating-principles.md; this skill should guide
candidate run planning while preserving model judgment about the active repo.

When to apply

When the researcher explicitly authorizes exploratory runs.
When the task is a small-subset validation, short-cycle training probe, batch sweep, idle-GPU search, or quick transfer-learning trial.
When the output should rank candidate runs rather than certify trusted success.

When not to apply

When the user wants trusted training execution or conservative verification.
When there is no explicit exploratory authorization.
When the task is repository setup, intake, or debugging.

Clear boundaries

This skill owns exploratory execution planning and summary only.
Use ai-research-explore instead when the task spans both current_research coordination and exploratory code changes.
It may hand off actual command execution to minimal-run-and-audit or run-train.
It should keep experiment state isolated from the trusted baseline.
It should prefer small-subset and short-cycle checks before heavier exploratory runs.
It should label run results as bounded evidence and explain when a comparison
is not directly fair.

Ranking Semantics

Pre-execution candidate selection uses three factors: cost, success_rate, and expected_gain.
Default weights should stay conservative unless the researcher explicitly provides selection_weights.
After scoring, budget pruning still applies through max_variants and max_short_cycle_runs.
If runs are executed later, downstream ranking should switch to real execution evidence, not stay purely heuristic.
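
The selection rule above can be sketched as a weighted score followed by a budget cut. This is only an illustration and not the logic of scripts/plan_variants.py: the Candidate class, the score formula (gain and success rate add, cost subtracts), the default weights, and the assumption that all three factors are normalised to [0, 1] are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical defaults; the skill only says defaults should stay conservative.
DEFAULT_WEIGHTS = {"cost": 1.0, "success_rate": 1.0, "expected_gain": 1.0}


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float           # assumed normalised to [0, 1]; higher is more expensive
    success_rate: float   # assumed estimate in [0, 1]
    expected_gain: float  # assumed normalised to [0, 1]


def score(c, weights):
    """Heuristic pre-execution score: reward gain and success, penalise cost."""
    return (weights["expected_gain"] * c.expected_gain
            + weights["success_rate"] * c.success_rate
            - weights["cost"] * c.cost)


def select(candidates, max_variants, selection_weights=None):
    """Score every candidate, then prune to the max_variants budget."""
    weights = {**DEFAULT_WEIGHTS, **(selection_weights or {})}
    ranked = sorted(candidates, key=lambda c: score(c, weights), reverse=True)
    return ranked[:max_variants]


# Illustrative candidates with made-up estimates.
candidates = [
    Candidate("lr_sweep_small", cost=0.2, success_rate=0.8, expected_gain=0.3),
    Candidate("bigger_backbone", cost=0.9, success_rate=0.5, expected_gain=0.6),
    Candidate("aug_tweak", cost=0.1, success_rate=0.9, expected_gain=0.2),
]
selected = select(candidates, max_variants=2)
print([c.name for c in selected])
```

With equal weights the cheap, likely-to-succeed candidates win; passing selection_weights such as {"expected_gain": 5.0} shifts the choice toward the expensive, higher-gain candidate. Once runs have actually executed, a heuristic score like this should give way to measured results.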

Variant Spec Hints

Use variant_axes to define the candidate dimension grid.
Use subset_sizes and short_run_steps to express exploratory run scale.
Use selection_weights to rebalance cost, success_rate, and expected_gain.
Use primary_metric and metric_goal so downstream ranking can order executed candidates consistently.
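
As a rough sketch of how these hints might fit together, the snippet below expands variant_axes into a grid and orders executed runs by primary_metric and metric_goal. Only the key names come from the hints above; the dict layout, the example values, the "maximize"/"minimize" vocabulary, and the helper names expand_grid and rank_executed are assumptions, and the actual schema lives in ../../references/explore-variant-spec.md.

```python
from itertools import product

# Hypothetical spec; only the key names come from the hints above.
spec = {
    "variant_axes": {"lr": [1e-3, 3e-4], "batch_size": [32, 64]},
    "subset_sizes": [1000],
    "short_run_steps": [200],
    "primary_metric": "val_accuracy",
    "metric_goal": "maximize",  # assumed values: "maximize" or "minimize"
}


def expand_grid(variant_axes):
    """Return one dict per point in the Cartesian product of the axes."""
    names = list(variant_axes)
    return [dict(zip(names, values)) for values in product(*variant_axes.values())]


def rank_executed(results, primary_metric, metric_goal):
    """Order executed runs by the measured primary metric, best first."""
    return sorted(
        results,
        key=lambda r: r[primary_metric],
        reverse=(metric_goal == "maximize"),
    )


variants = expand_grid(spec["variant_axes"])
print(len(variants))

# Placeholder measurements, invented purely to show the ordering step.
results = [
    {"variant": variants[0], "val_accuracy": 0.71},
    {"variant": variants[1], "val_accuracy": 0.74},
]
best = rank_executed(results, spec["primary_metric"], spec["metric_goal"])[0]
print(best["variant"])
```

Keeping metric_goal explicit avoids silently ranking a loss-like metric in the wrong direction.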

Output expectations

explore_outputs/CHANGESET.md
explore_outputs/SCIENTIFIC_CHANGELOG.md
explore_outputs/COMPARABILITY_REPORT.md
explore_outputs/TOP_RUNS.md
explore_outputs/status.json
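
A minimal sketch of scaffolding these files is shown below. It is not the behaviour of scripts/write_outputs.py: the write_scaffold helper, the stub headings, and every field in the status dict are hypothetical, since the skill does not specify the status.json schema. The example writes into a temporary directory so it leaves no files behind.

```python
import json
import tempfile
from pathlib import Path

# File names taken from the output list above.
OUTPUT_FILES = [
    "CHANGESET.md",
    "SCIENTIFIC_CHANGELOG.md",
    "COMPARABILITY_REPORT.md",
    "TOP_RUNS.md",
]


def write_scaffold(root, status):
    """Create explore_outputs/ with report stubs and a status.json."""
    out_dir = Path(root) / "explore_outputs"
    out_dir.mkdir(parents=True, exist_ok=True)
    for name in OUTPUT_FILES:
        path = out_dir / name
        if not path.exists():  # never overwrite an existing report
            path.write_text(f"# {name[:-3]}\n", encoding="utf-8")
    (out_dir / "status.json").write_text(
        json.dumps(status, indent=2), encoding="utf-8"
    )
    return out_dir


# Hypothetical status fields; the real schema is not given in this skill.
status = {"mode": "exploratory", "evidence": "bounded", "trusted": False}

with tempfile.TemporaryDirectory() as tmp:
    out_dir = write_scaffold(tmp, status)
    created = sorted(p.name for p in out_dir.iterdir())
    loaded = json.loads((out_dir / "status.json").read_text(encoding="utf-8"))

print(created)
```

Writing under a dedicated explore_outputs/ directory is one way to keep exploratory state separate from the trusted baseline.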

Notes

Reference material: references/execution-policy.md, ../../references/explore-variant-spec.md, and ../../references/deep-learning-experiment-principles.md. Helper scripts: scripts/plan_variants.py and scripts/write_outputs.py.
