gcloud-usage

This skill should be used when the user asks about "GCloud logs", "Cloud Logging queries", "Google Cloud metrics", "GCP observability", "trace analysis", or "debugging production issues on GCP".

372 stars

40 forks

Updated 1/23/2026

SKILL.md

name: gcloud-usage

description: This skill should be used when the user asks about "GCloud logs", "Cloud Logging queries", "Google Cloud metrics", "GCP observability", "trace analysis", or "debugging production issues on GCP".

GCP Observability Best Practices

Structured Logging

JSON Log Format

Use structured JSON logging for better queryability:

{
  "severity": "ERROR",
  "message": "Payment failed",
  "httpRequest": { "requestMethod": "POST", "requestUrl": "/api/payment" },
  "labels": { "user_id": "123", "transaction_id": "abc" },
  "timestamp": "2025-01-15T10:30:00Z"
}
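
A minimal Python sketch of producing an entry shaped like the one above, assuming the service runs in an environment that collects stdout and parses each line as JSON (Cloud Run does this). The helper name `format_log_entry` and its parameters are illustrative, not part of any library.

```python
import json
from datetime import datetime, timezone


def format_log_entry(severity, message, labels=None, http_request=None, now=None):
    """Return one log entry as a single-line JSON string (hypothetical helper)."""
    # `now` is expected to be timezone-aware; it is normalised to UTC.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    entry = {
        "severity": severity,
        "message": message,
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    if http_request:
        entry["httpRequest"] = http_request
    if labels:
        # Label values are stringified so they can be filtered as text.
        entry["labels"] = {key: str(value) for key, value in labels.items()}
    return json.dumps(entry)


# One JSON object per line on stdout.
print(format_log_entry(
    "ERROR",
    "Payment failed",
    labels={"user_id": 123, "transaction_id": "abc"},
    http_request={"requestMethod": "POST", "requestUrl": "/api/payment"},
))
```

Keeping each entry on a single line matters: a pretty-printed, multi-line object would be read as several unrelated text lines.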

Severity Levels

Use appropriate severity for filtering:

DEBUG: Detailed diagnostic info
INFO: Normal operations, milestones
NOTICE: Normal but significant events
WARNING: Potential issues, degraded performance
ERROR: Failures that don't stop the service
CRITICAL: Failures requiring immediate action
ALERT: A person must take action immediately
EMERGENCY: System is unusable
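
The levels are ordered, which is what makes a comparison such as `severity >= WARNING` meaningful. The sketch below reproduces that ordering locally using the numeric values of the Cloud Logging `LogSeverity` enum; it also includes `DEFAULT` (0), the value an entry has when no severity is set. The `at_least` helper is hypothetical and only illustrates the comparison, it is not a Cloud Logging API.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Numeric values of the Cloud Logging LogSeverity enum."""
    DEFAULT = 0
    DEBUG = 100
    INFO = 200
    NOTICE = 300
    WARNING = 400
    ERROR = 500
    CRITICAL = 600
    ALERT = 700
    EMERGENCY = 800


def at_least(entries, threshold):
    """Keep entries at or above `threshold`, mimicking `severity >= THRESHOLD`."""
    return [
        entry for entry in entries
        if Severity[entry.get("severity", "DEFAULT")] >= threshold
    ]


entries = [
    {"severity": "INFO", "message": "Order created"},
    {"severity": "WARNING", "message": "Retrying payment provider"},
    {"severity": "CRITICAL", "message": "Database unreachable"},
    {"message": "Plain line with no severity"},
]
for entry in at_least(entries, Severity.WARNING):
    print(entry["severity"], entry["message"])
```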

Log Filtering Queries

Common Filters

-- By severity
severity >= WARNING

-- By resource
resource.type="cloud_run_revision"
resource.labels.service_name="my-service"

-- By time
timestamp >= "2025-01-15T00:00:00Z"

-- By text content
textPayload =~ "error.*timeout"

-- By JSON field
jsonPayload.user_id = "123"

-- Combined
severity >= ERROR AND resource.labels.service_name="api"
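
When filters are assembled in scripts, it can help to build them from parts rather than concatenating strings by hand. A minimal sketch, assuming only the fields used in the examples above; `quote` and `build_filter` are hypothetical helpers, not part of any Google client library.

```python
def quote(value):
    """Wrap a value in double quotes, escaping backslashes and embedded quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_filter(min_severity=None, resource_type=None, service_name=None, since=None):
    """Join the clauses that were provided with AND, in a fixed order."""
    clauses = []
    if min_severity:
        clauses.append(f"severity >= {min_severity}")
    if resource_type:
        clauses.append(f"resource.type={quote(resource_type)}")
    if service_name:
        clauses.append(f"resource.labels.service_name={quote(service_name)}")
    if since:
        clauses.append(f"timestamp >= {quote(since)}")
    return " AND ".join(clauses)


print(build_filter(min_severity="ERROR", service_name="api"))
```

The resulting string can be pasted into Logs Explorer or passed as the filter argument to `gcloud logging read`.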

Advanced Queries

-- Regex matching
textPayload =~ "status=[45][0-9]{2}"

-- Substring search
textPayload : "connection refused"

-- Multiple values
severity = (ERROR OR CRITICAL)
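
Cloud Logging regular expressions use RE2 syntax and are unanchored, so `=~` behaves like a search rather than a full-string match. For simple patterns such as the status-code one above, Python's `re` module agrees with RE2, which makes it possible to sanity-check a pattern locally before using it in a query. The sample lines below are invented for illustration.

```python
import re

# Same pattern as in the regex example: any 4xx or 5xx status.
STATUS_4XX_5XX = re.compile(r"status=[45][0-9]{2}")


def matching_lines(pattern, lines):
    """Return the lines the pattern matches anywhere, like an unanchored `=~`."""
    return [line for line in lines if pattern.search(line)]


sample_lines = [
    "GET /api/payment status=200 latency=12ms",
    "POST /api/payment status=503 latency=3010ms",
    "GET /missing status=404 latency=3ms",
]
for line in matching_lines(STATUS_4XX_5XX, sample_lines):
    print(line)
```

Patterns that rely on features RE2 lacks, such as backreferences or lookahead, will pass this local check but be rejected by Cloud Logging.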

Metrics vs Logs vs Traces

When to Use Each

Metrics: Aggregated numeric data over time

Request counts, latency percentiles
Resource utilization (CPU, memory)
Business KPIs (orders/minute)

Logs: Detailed event records

Error details and stack traces
Audit trails
Debugging specific requests

Traces: Request flow across services

Latency breakdown by service
Identifying bottlenecks
Distributed system debugging

Alert Policy Design

Alert Best Practices

Avoid alert fatigue: Only alert on actionable issues
Use multi-condition alerts: Reduce noise from transient spikes
Set appropriate windows: 5-15 min for most metrics
Include runbook links: Help responders act quickly

Common Alert Patterns

Error rate:

Condition: Error rate > 1% for 5 minutes
Good for: Service health monitoring

Latency:

Condition: P99 latency > 2s for 10 minutes
Good for: Performance degradation detection

Resource exhaustion:

Condition: Memory > 90% for 5 minutes
Good for: Capacity planning triggers
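
Each of these conditions has the same shape: a value must stay above a threshold for the whole duration, so a single spike does not fire the alert. The sketch below illustrates that logic, assuming one sample per minute; `breaches_for` is a hypothetical helper and says nothing about how Cloud Monitoring evaluates conditions internally.

```python
def breaches_for(samples, threshold, duration):
    """Return True if the most recent `duration` samples all exceed `threshold`."""
    if duration <= 0 or len(samples) < duration:
        return False  # Not enough data to cover the window.
    return all(value > threshold for value in samples[-duration:])


# Error rate per minute, as fractions (0.01 == 1%).
sustained = [0.002, 0.015, 0.020, 0.030, 0.012, 0.011]
one_spike = [0.002, 0.003, 0.050, 0.004, 0.002, 0.003]

print(breaches_for(sustained, threshold=0.01, duration=5))
print(breaches_for(one_spike, threshold=0.01, duration=5))
```

The first series stays above 1% for the last five minutes and would fire; the second contains one transient spike and would not.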

Cost Optimization

Reducing Log Costs

Exclusion filters: Drop verbose logs at ingestion
Sampling: Log only a percentage of high-volume events
Shorter retention: Keep retention at or below the default 30 days where possible; longer retention is billed separately
Downgrade logs: Route to cheaper storage buckets
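
Sampling can be done in the application before a log line is ever written. A minimal sketch of one approach, assuming each event carries some key such as a request ID: hashing the key makes the decision deterministic, so all lines for a given request are kept or dropped together, and WARNING and above are never sampled out. The names `should_log` and `ALWAYS_KEEP` are illustrative.

```python
import hashlib

# Severities that bypass sampling entirely.
ALWAYS_KEEP = {"WARNING", "ERROR", "CRITICAL", "ALERT", "EMERGENCY"}


def should_log(severity, sample_key, sample_rate):
    """Decide whether to emit a log line; `sample_rate` is a fraction in [0, 1]."""
    if severity in ALWAYS_KEEP:
        return True
    digest = hashlib.sha256(sample_key.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


kept = sum(should_log("INFO", f"request-{i}", 0.1) for i in range(10_000))
print(f"kept {kept} of 10000 INFO lines at a 10% sample rate")
```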

Exclusion Filter Examples

-- Exclude health checks (requestUrl is usually a full URL, so match the path suffix)
resource.type="cloud_run_revision" AND httpRequest.requestUrl=~"/health$"

-- Exclude debug logs in production
severity = DEBUG

Debugging Workflow

1. Start with metrics: Identify when issues started
2. Correlate with logs: Filter logs around problem time
3. Use traces: Follow specific requests across services
4. Check resource logs: Look for infrastructure issues
5. Compare baselines: Check against known-good periods
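
The "correlate with logs" and "compare baselines" steps both come down to querying a time window, once around the incident and once around a known-good period. A small sketch of generating those filters, assuming the incident time has already been read off a metrics chart; `window_filter` and its defaults are hypothetical.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def window_filter(center, before_minutes=15, after_minutes=15, min_severity="WARNING"):
    """Build a filter covering a window around `center` (a timezone-aware datetime)."""
    start = (center - timedelta(minutes=before_minutes)).astimezone(timezone.utc)
    end = (center + timedelta(minutes=after_minutes)).astimezone(timezone.utc)
    return (
        f'timestamp >= "{start.strftime(TIMESTAMP_FORMAT)}" AND '
        f'timestamp <= "{end.strftime(TIMESTAMP_FORMAT)}" AND '
        f"severity >= {min_severity}"
    )


incident = datetime(2025, 1, 15, 10, 30, tzinfo=timezone.utc)

# Logs around the problem time.
print(window_filter(incident))
# Same window one week earlier, as a known-good baseline.
print(window_filter(incident - timedelta(days=7)))
```

Running both queries against the same service and comparing the results makes it easier to tell new errors from background noise that was already present.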
