minimal-run-and-audit


Rigor Run skill for README-first deep learning repo reproduction. Use when the task is specifically to capture or normalize evidence from the selected smoke test or documented inference or evaluation command and write standardized `repro_outputs/` files, including patch notes when repository files changed. Do not use for training execution, initial repo intake, generic environment setup, paper lookup, target selection, hidden scientific-meaning changes, or end-to-end orchestration by itself.


Updated 6/16/2026


SKILL.md


name

minimal-run-and-audit

description

minimal-run-and-audit

Use this as the Rigor Run skill. The installed slug remains
minimal-run-and-audit for compatibility.

Use the shared operating principles in
../../references/agent-operating-principles.md; this skill should make run
evidence auditable without turning every command into a rigid protocol.

When to apply

After a reproduction target and setup plan exist.
When the main skill needs execution evidence and normalized outputs.
When a smoke test, documented inference run, documented evaluation run, or other short non-training verification is appropriate.
When the user already knows what command should be attempted and wants execution plus reporting only.

When not to apply

During initial repo scanning.
When the environment or assets are still too undefined for execution to be meaningful.
When the task is a literature lookup rather than repository execution.
When the user is still deciding which reproduction target should count as the main run.

Clear boundaries

This skill owns normalized reporting for an attempted command.
It may receive execution evidence from the main skill or a thin helper.
It does not choose the overall target on its own.
It does not perform broad paper analysis.
It does not own training startup, resume, or long-running training state.
It should not normalize risky code edits into acceptable practice.
It must not hide changes that alter evaluation, preprocessing, checkpoints,
metrics, or other scientific meaning.
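
One way to keep such changes visible is to flag patched files whose paths suggest they touch evaluation, preprocessing, checkpoints, or metrics. The sketch below is only an illustration: the function name `flag_scientific_changes` and the keyword hints are assumptions, not part of the skill, and path matching is a prompt for human review rather than a reliable detector.

```python
from typing import Dict, List

# Categories come from the boundary list above; the keyword hints are
# illustrative assumptions, not something the skill defines.
SENSITIVE_HINTS: Dict[str, List[str]] = {
    "evaluation": ["eval", "evaluate"],
    "preprocessing": ["preprocess", "transform", "tokeniz"],
    "checkpoints": ["checkpoint", "ckpt", "weights"],
    "metrics": ["metric", "score"],
}


def flag_scientific_changes(changed_paths: List[str]) -> Dict[str, List[str]]:
    """Return {category: [paths]} for changed paths whose names match a hint."""
    flagged: Dict[str, List[str]] = {}
    for path in changed_paths:
        lowered = path.lower()
        for category, hints in SENSITIVE_HINTS.items():
            if any(hint in lowered for hint in hints):
                flagged.setdefault(category, []).append(path)
    return flagged


if __name__ == "__main__":
    # Hypothetical changed files from a patch.
    print(flag_scientific_changes(["src/eval_utils.py", "README.md"]))
```

A non-empty result would be a reason to record the change explicitly in the reports instead of treating the patch as routine.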

Input expectations

selected reproduction goal
runnable commands or smoke commands
environment and asset assumptions
optional patch metadata

Output expectations

execution result summary
standardized repro_outputs/ files
SCIENTIFIC_CHANGELOG.md for changed scientific meaning and evidence status
COMPARABILITY_REPORT.md for README/paper/baseline comparability
clear distinction between verified, partial, and blocked states
PATCHES.md when repo files changed
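
As a rough illustration of the output expectations above, the sketch below classifies an attempted command as verified, partial, or blocked and writes files under repro_outputs/. The `RunEvidence` record, the classification rule, and the `SUMMARY.md` filename are assumptions made for this example; the skill's actual behavior lives in scripts/run_command.py and scripts/write_outputs.py, which are not reproduced here.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class RunEvidence:
    """Hypothetical record of one attempted command (not the skill's real schema)."""
    command: str
    exit_code: Optional[int]        # None means the command never started
    expected_artifacts_found: bool  # e.g. the documented output file exists
    patched_files: List[str] = field(default_factory=list)


def classify(evidence: RunEvidence) -> str:
    """Assumed rule mapping evidence onto verified / partial / blocked."""
    if evidence.exit_code is None:
        return "blocked"
    if evidence.exit_code == 0 and evidence.expected_artifacts_found:
        return "verified"
    return "partial"


def write_outputs(evidence: RunEvidence, out_dir: Path) -> List[Path]:
    """Write a summary file, plus PATCHES.md only when repo files changed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []

    summary = out_dir / "SUMMARY.md"  # hypothetical filename
    summary.write_text(
        f"command: {evidence.command}\n"
        f"exit_code: {evidence.exit_code}\n"
        f"status: {classify(evidence)}\n",
        encoding="utf-8",
    )
    written.append(summary)

    if evidence.patched_files:
        patches = out_dir / "PATCHES.md"
        patches.write_text(
            "".join(f"- {name}\n" for name in evidence.patched_files),
            encoding="utf-8",
        )
        written.append(patches)
    return written
```

Keeping the status decision in one small function makes the verified, partial, and blocked distinction easy to audit, whatever rule a real implementation ends up using.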

Notes

Use references/reporting-policy.md, ../../references/research-rigor-principles.md, scripts/run_command.py, and scripts/write_outputs.py.
