
explore-run
Rigor Improve / Rigor Explore run leaf skill for bounded exploratory evidence in deep learning research repositories. Use when the researcher explicitly authorizes exploratory runs such as small-subset validation, short-cycle guess-and-check, batch sweeps, idle-GPU search, or quick transfer-learning trials, with fair-comparison caveats and no-overclaim summaries in `explore_outputs/`. Do not use for end-to-end exploration orchestration on top of `current_research`, trusted baseline execution, conservative training verification, default routing, verified SOTA claims, or implicit experimentation.
Use this as the Rigor Improve / Rigor Explore run leaf skill. The installed slug
remains `explore-run` for compatibility.
Use the shared operating principles in
`../../references/agent-operating-principles.md`; this skill should guide
candidate run planning while preserving model judgment about the active repo.
When to apply
- When the researcher explicitly authorizes exploratory runs.
- When the task is a small-subset validation, short-cycle training probe, batch sweep, idle-GPU search, or quick transfer-learning trial.
- When the output should rank candidate runs rather than certify trusted success.
When not to apply
- When the user wants trusted training execution or conservative verification.
- When there is no explicit exploratory authorization.
- When the task is repository setup, intake, or debugging.
Clear boundaries
- This skill owns exploratory execution planning and summary only.
- Use `ai-research-explore` instead when the task spans both `current_research` coordination and exploratory code changes.
- It may hand off actual command execution to `minimal-run-and-audit` or `run-train`.
- It should keep experiment state isolated from the trusted baseline.
- It should prefer small-subset and short-cycle checks before heavier exploratory runs.
- It should label run results as bounded evidence and explain when a comparison
is not directly fair.
Ranking Semantics
- Pre-execution candidate selection uses three factors: `cost`, `success_rate`, and `expected_gain`.
- Default weights should stay conservative unless the researcher explicitly provides `selection_weights`.
- Budget pruning still applies after scoring through `max_variants` and `max_short_cycle_runs`.
- If runs are executed later, downstream ranking should switch to real execution evidence, not stay purely heuristic.
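The three-factor selection and budget pruning described above can be sketched as follows. This is a minimal illustration, not the logic in `scripts/plan_variants.py`: the weighted-sum formula, the default weight values, the `[0, 1]` normalization, the reading of `max_short_cycle_runs` as a cap on the top-ranked subset, and the `Candidate`, `score`, and `select` names are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical defaults: the skill only says default weights should stay conservative.
DEFAULT_WEIGHTS = {"cost": 0.4, "success_rate": 0.4, "expected_gain": 0.2}


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float           # assumed normalized to [0, 1]; higher means more expensive
    success_rate: float   # assumed estimate in [0, 1]
    expected_gain: float  # assumed normalized to [0, 1]


def score(candidate, weights):
    """Assumed scoring rule: reward success rate and gain, penalize cost."""
    return (
        weights["success_rate"] * candidate.success_rate
        + weights["expected_gain"] * candidate.expected_gain
        - weights["cost"] * candidate.cost
    )


def select(candidates, max_variants, max_short_cycle_runs, selection_weights=None):
    """Score every candidate, then apply budget pruning after scoring."""
    # Researcher-provided selection_weights override the defaults key by key.
    weights = {**DEFAULT_WEIGHTS, **(selection_weights or {})}
    ranked = sorted(candidates, key=lambda c: score(c, weights), reverse=True)
    kept = ranked[:max_variants]
    # Assumption: only the top of the kept list is eligible for short-cycle runs.
    short_cycle = kept[:max_short_cycle_runs]
    return kept, short_cycle


candidates = [
    Candidate("lr-sweep-small", cost=0.2, success_rate=0.8, expected_gain=0.3),
    Candidate("bigger-backbone", cost=0.9, success_rate=0.5, expected_gain=0.9),
    Candidate("aug-toggle", cost=0.1, success_rate=0.6, expected_gain=0.2),
]

kept, short_cycle = select(candidates, max_variants=2, max_short_cycle_runs=1)
print([c.name for c in kept])         # heuristic ranking, not execution evidence
print([c.name for c in short_cycle])
```

With these illustrative defaults the cheap, likely-to-succeed candidates rank first. Passing `selection_weights={"expected_gain": 1.0, "cost": 0.1}` would instead favor the expensive high-gain candidate, which is why such a rebalance should come from the researcher explicitly.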
Variant Spec Hints
- Use `variant_axes` to define the candidate dimension grid.
- Use `subset_sizes` and `short_run_steps` to express exploratory run scale.
- Use `selection_weights` to rebalance `cost`, `success_rate`, and `expected_gain`.
- Use `primary_metric` and `metric_goal` so downstream ranking can order executed candidates consistently.
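As a rough sketch of how these fields might fit together, the example below expands a spec into concrete variants and orders executed results by the primary metric. The field names come from the hints above. The dictionary layout, the example values, the `"maximize"`/`"minimize"` vocabulary for `metric_goal`, and the `expand_variants` and `rank_executed` helpers are assumptions; the authoritative format is whatever `../../references/explore-variant-spec.md` defines.

```python
from itertools import product

# Hypothetical spec: keys follow the hints above, values are illustrative only.
variant_spec = {
    "variant_axes": {"lr": [1e-3, 3e-4], "batch_size": [32, 64]},
    "subset_sizes": [1000],
    "short_run_steps": [200],
    "selection_weights": {"cost": 0.4, "success_rate": 0.4, "expected_gain": 0.2},
    "primary_metric": "val_accuracy",
    "metric_goal": "maximize",  # assumed vocabulary: "maximize" or "minimize"
}


def expand_variants(spec):
    """Build the full grid over variant_axes, crossed with the run-scale settings."""
    axes = spec["variant_axes"]
    names = sorted(axes)  # fixed axis order keeps the expansion deterministic
    variants = []
    for values in product(*(axes[name] for name in names)):
        params = dict(zip(names, values))
        for subset_size in spec["subset_sizes"]:
            for steps in spec["short_run_steps"]:
                variants.append(
                    {
                        "params": params,
                        "subset_size": subset_size,
                        "short_run_steps": steps,
                    }
                )
    return variants


def rank_executed(results, spec):
    """Order executed candidates by the primary metric, best first."""
    best_is_high = spec["metric_goal"] == "maximize"
    return sorted(results, key=lambda r: r[spec["primary_metric"]], reverse=best_is_high)


variants = expand_variants(variant_spec)
print(len(variants))  # 2 lr values x 2 batch sizes x 1 subset size x 1 step budget

# Made-up metric values, used only to show the ordering.
executed = [
    {"name": "a", "val_accuracy": 0.70},
    {"name": "b", "val_accuracy": 0.80},
]
print([r["name"] for r in rank_executed(executed, variant_spec)])
```

Results ordered this way are still bounded evidence: a small-subset, short-cycle score is not directly comparable to a full training run.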
Output expectations
- `explore_outputs/CHANGESET.md`
- `explore_outputs/SCIENTIFIC_CHANGELOG.md`
- `explore_outputs/COMPARABILITY_REPORT.md`
- `explore_outputs/TOP_RUNS.md`
- `explore_outputs/status.json`
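A small completeness check over the expected files could look like the sketch below. It is not part of the skill: the `missing_outputs` helper is hypothetical, and the assumption that `explore_outputs/` sits directly under the repository root is made only for the example. The file names are the ones listed above.

```python
import tempfile
from pathlib import Path

# File names taken from the output expectations listed above.
EXPECTED_OUTPUTS = (
    "CHANGESET.md",
    "SCIENTIFIC_CHANGELOG.md",
    "COMPARABILITY_REPORT.md",
    "TOP_RUNS.md",
    "status.json",
)


def missing_outputs(repo_root):
    """Return the expected files that are absent from <repo_root>/explore_outputs."""
    out_dir = Path(repo_root) / "explore_outputs"
    return [name for name in EXPECTED_OUTPUTS if not (out_dir / name).is_file()]


# Demo in a throwaway directory: only status.json has been written so far.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "explore_outputs"
    out_dir.mkdir()
    (out_dir / "status.json").write_text("{}", encoding="utf-8")
    still_missing = missing_outputs(tmp)

print(still_missing)
```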
Notes
Use `references/execution-policy.md`, `../../references/explore-variant-spec.md`, `../../references/deep-learning-experiment-principles.md`, `scripts/plan_variants.py`, and `scripts/write_outputs.py`.