How Do You Deploy an AI Agent to Production Without Breaking Everything?

AI Skills Team

6/26/2026 10 min

The Deployment Gap: When Your Agent Works Locally but Fails in Production

You have spent weeks building an AI agent. It runs perfectly on your laptop. The evaluation scores look solid. Your team is ready to show it to users. Then you try to deploy it, and everything falls apart.

This is a common scenario for teams building agents with the Google Agent Development Kit (ADK). The local development environment is forgiving. You control the Python version, the environment variables are set in your shell, and there is no network policy blocking outbound calls. Production is different. You need a container image, a service account with the right IAM roles, secrets management, a deployment target that can scale, and a way to roll back if something goes wrong.

The pain is not just technical. It is organizational. A developer might push a Dockerfile that works on their machine but fails in CI because the base image is wrong. A teammate might hardcode an API key in the source code instead of using Secret Manager. Someone else might deploy to the wrong region. Each of these mistakes is avoidable, but they happen because the deployment process is manual, undocumented, and fragile.

Why This Problem Gets Worse Over Time

When you have one agent and one developer, you can get away with ad-hoc deployment. You SSH into a server, pull the latest code, and restart the service. This does not scale.

As your team grows, you need:

  • Reproducible builds. The same code should produce the same container image every time.
  • Secrets isolation. API keys and service account credentials must not live in source control.
  • Environment parity. Dev, staging, and production should be as similar as possible.
  • Rollback capability. If a deployment introduces a regression, you need to revert in seconds, not hours.
  • Audit trails. You need to know who deployed what, when, and to which environment.

Without these, every deployment is a gamble. The cost is not just downtime. It is lost trust from users and wasted engineering time spent debugging infrastructure instead of improving the agent.

What a Good Solution Should Change

A deployment tool for AI agents should do three things:

  1. Abstract away infrastructure boilerplate. You should not have to write Terraform modules from scratch or debug Docker networking issues every time you deploy.
  2. Enforce best practices by default. Secrets should be injected from a secrets manager. Service accounts should follow least-privilege principles. Container images should be built with reproducible tooling.
  3. Support multiple deployment targets with a consistent interface. Whether you choose Cloud Run for simplicity, GKE for full Kubernetes control, or Agent Runtime for managed agent infrastructure, the deployment command should feel the same.

This is where a CLI tool designed specifically for ADK agent deployment can help. One option worth inspecting is the google-agents-cli-deploy skill, which is part of the broader Google Agents CLI ecosystem.

Introducing google-agents-cli-deploy

The google-agents-cli-deploy skill provides deployment workflows for agents built with the Google Agent Development Kit. It wraps Terraform, Docker, and cloud deployment logic into a tested pipeline, exposed through the agents-cli command-line tool.

This is not a general-purpose deployment tool. It is designed specifically for ADK agents and understands their structure: the AdkApp pattern, session management, agent identity, and the deployment metadata that ADK projects carry. If your project was scaffolded with the ADK, this tool knows how to package and deploy it.

What It Can Do

The skill covers several deployment scenarios:

  • Deploy to Agent Runtime. This is the managed deployment target purpose-built for agents. It handles auto-scaling and session state persistence via VertexAiSessionService, and it supports features such as agent identity and private VPC connectivity through PSC interfaces.
  • Deploy to Cloud Run. For teams that need more control over infrastructure, event-driven triggers (Pub/Sub, Eventarc), or custom networking. Cloud Run supports fully configurable scaling, direct VPC egress, and IAP.
  • Deploy to GKE. For teams that require full Kubernetes control, including HPA, VPA, node auto-provisioning, and custom networking. This option has the highest setup complexity but offers the most flexibility.
  • Manage secrets. The --secrets flag lets you inject secrets from Secret Manager into your deployed service as environment variables.
  • Configure CI/CD. The skill includes guidance on setting up full CI/CD pipelines with agents-cli infra cicd, including runner comparison, Workload Identity Federation authentication, and pipeline stage configuration.
  • Handle rollback. Because deployments are versioned, you can revert to a previous version if a deployment introduces issues.

Choosing a Deployment Target

The skill provides a decision matrix to help you choose between Agent Runtime, Cloud Run, and GKE. Here is a simplified version:

  • Agent Runtime is best if you want managed infrastructure with minimal operational overhead. It supports Python agents, has native session state management, and does not bill when idle. However, it does not support event-driven triggers like Pub/Sub or Cloud Scheduler.
  • Cloud Run is best if you need event-driven workloads, custom networking, or more control over scaling behavior. It supports Pub/Sub and Eventarc triggers natively.
  • GKE is best if you need full Kubernetes control, want to run non-Python agents in custom containers, or have existing Kubernetes infrastructure you want to reuse.

If your agent needs OAuth 2.0 user consent flows (for example, to access Google Drive or Calendar on behalf of a user), Agent Runtime with Gemini Enterprise is the recommended path. Cloud Run does not currently support managed OAuth flows.
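The simplified matrix above can be written down as a small shell function. This is only a sketch: choose_target is a hypothetical helper, not part of agents-cli, and its three yes/no inputs and tie-breaking order are assumptions made for illustration. The skill's full decision matrix remains the authority.

```shell
# choose_target NEEDS_OAUTH NEEDS_EVENTS NEEDS_K8S
# Each argument is "yes" or "no". Prints agent_runtime, cloud_run, or gke.
# Hypothetical helper: the rules paraphrase the simplified matrix only.
choose_target() {
  ct_oauth=$1
  ct_events=$2
  ct_k8s=$3

  if [ "$ct_oauth" = yes ]; then
    # Managed OAuth consent flows point to Agent Runtime, which has no
    # event-driven triggers, so a combined requirement needs a human decision.
    if [ "$ct_events" = yes ] || [ "$ct_k8s" = yes ]; then
      echo "conflicting requirements: review the full decision matrix" >&2
      return 1
    fi
    echo agent_runtime
  elif [ "$ct_k8s" = yes ]; then
    # Assumption: Kubernetes control wins over event triggers when both are set.
    echo gke
  elif [ "$ct_events" = yes ]; then
    echo cloud_run
  else
    # Default: managed infrastructure with minimal operational overhead.
    echo agent_runtime
  fi
}
```

For example, choose_target no yes no prints cloud_run, and the printed value has the same spelling that --deployment-target expects.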

How the Deployment Workflow Works

The deployment process with agents-cli follows a structured sequence. Understanding this sequence helps you avoid the most common failure modes.

Step 1: Ensure Your Project Is Ready

If your project was not scaffolded with deployment support, you need to add it first:

agents-cli scaffold enhance . --deployment-target <target>

Replace <target> with agent_runtime, cloud_run, or gke. This adds the necessary Dockerfile, Terraform configuration, and deployment metadata to your project.
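A typo in the target name is easy to make, so a wrapper script might check the value before calling the scaffold command. The function below is a minimal sketch; is_valid_target is a hypothetical name and is not provided by agents-cli.

```shell
# is_valid_target TARGET: succeeds only for the three target names listed above.
is_valid_target() {
  case "$1" in
    agent_runtime|cloud_run|gke)
      return 0
      ;;
    *)
      echo "unknown deployment target: $1" >&2
      return 1
      ;;
  esac
}

# Usage (the scaffold command runs only when the value is valid):
#   target=cloud_run
#   is_valid_target "$target" && agents-cli scaffold enhance . --deployment-target "$target"
```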

Step 2: Get Human Approval

The skill explicitly requires human approval before deploying. This is a safety mechanism. The CLI will prompt you:

"Eval scores meet thresholds and tests pass. Ready to deploy to dev?"

You must wait for explicit approval before proceeding. This prevents accidental deployments and ensures someone has reviewed the agent's readiness.
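If you drive the deployment from your own script, the same gate can be reproduced with a confirmation prompt. The sketch below is an assumption about how such a wrapper might look, not a description of how agents-cli implements its prompt. confirm_deploy is a hypothetical function name.

```shell
# confirm_deploy ENVIRONMENT
# Reads one line from stdin and succeeds only on an explicit yes.
# Anything else, including empty input, counts as "not approved".
confirm_deploy() {
  printf 'Eval scores meet thresholds and tests pass. Ready to deploy to %s? [y/N] ' "$1" >&2
  read -r cd_reply || return 1
  case "$cd_reply" in
    y|Y|yes|YES)
      return 0
      ;;
    *)
      echo "Deployment not approved; stopping." >&2
      return 1
      ;;
  esac
}

# Usage:
#   confirm_deploy dev && agents-cli deploy
```

Defaulting to "no" means a stray Enter key or a closed stdin in CI never counts as approval.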

Step 3: Run the Deploy Command

agents-cli deploy

This single command handles building the container image, pushing it to Artifact Registry, configuring the deployment target, and starting the service. You can pass flags to customize the deployment:

  • --project to specify the GCP project ID
  • --region to specify the deployment region
  • --service-account to specify the service account email
  • --secrets to inject secrets as environment variables
  • --update-env-vars to set additional environment variables
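To keep these flags consistent across environments, one option is to assemble the command from variables and print it for review before running it. The sketch below uses only flags listed above. The DEPLOY_* and DRY_RUN variable names are hypothetical and chosen for this example, not read by agents-cli itself.

```shell
# deploy_command: assembles "agents-cli deploy" from optional settings.
# DEPLOY_PROJECT, DEPLOY_REGION, DEPLOY_SA, DEPLOY_SECRETS and DRY_RUN are
# hypothetical variable names for this sketch, not part of agents-cli.
deploy_command() {
  set -- agents-cli deploy
  if [ -n "${DEPLOY_PROJECT:-}" ]; then set -- "$@" --project "$DEPLOY_PROJECT"; fi
  if [ -n "${DEPLOY_REGION:-}" ]; then set -- "$@" --region "$DEPLOY_REGION"; fi
  if [ -n "${DEPLOY_SA:-}" ]; then set -- "$@" --service-account "$DEPLOY_SA"; fi
  if [ -n "${DEPLOY_SECRETS:-}" ]; then set -- "$@" --secrets "$DEPLOY_SECRETS"; fi

  if [ "${DRY_RUN:-1}" = 1 ]; then
    # Default: print the command for review instead of running it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

With DEPLOY_PROJECT=my-gcp-project and DEPLOY_REGION=us-central1 set, calling deploy_command prints the full command line. Setting DRY_RUN=0 runs it instead.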

Step 4: Verify the Deployment

After deployment, verify that the agent is running correctly. The skill includes testing instructions for each deployment target, including curl examples and load test guidance.
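A freshly deployed service may need a few seconds before it answers, so a verification script usually retries. The following is a minimal sketch using curl. wait_for_healthy is a hypothetical helper, and the URL you pass (including any health path) is an assumption that depends on your target and on what your agent actually exposes.

```shell
# wait_for_healthy URL [ATTEMPTS] [DELAY_SECONDS]
# Retries an HTTP GET until it succeeds or the attempts run out.
wait_for_healthy() {
  wh_url=$1
  wh_attempts=${2:-5}
  wh_delay=${3:-5}
  wh_try=1
  while [ "$wh_try" -le "$wh_attempts" ]; do
    # -f makes curl exit non-zero on HTTP error responses (4xx/5xx).
    if curl -fsS --max-time 10 -o /dev/null "$wh_url"; then
      echo "healthy after $wh_try attempt(s)"
      return 0
    fi
    # Sleep only when another attempt will follow.
    if [ "$wh_try" -lt "$wh_attempts" ]; then
      sleep "$wh_delay"
    fi
    wh_try=$((wh_try + 1))
  done
  echo "not healthy after $wh_attempts attempt(s)" >&2
  return 1
}

# Usage:
#   wait_for_healthy "https://<your-service-url>/health" 10 6
```

Services that require authentication will reject an anonymous request, so the curl call may need credentials added for your setup.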

Handling Timeouts

Agent Runtime deployments can take 5-10 minutes and may exceed command timeouts. If the deploy command is cancelled or times out, the deployment continues server-side. You can check progress with:

agents-cli deploy --status

Poll every 60 seconds until it reports completion or failure.
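The polling loop can be scripted. Note the main assumption in this sketch: the exact text printed by agents-cli deploy --status is not documented here, so the success and failure patterns below are guesses that you should adjust after looking at real output. poll_deploy_status, STATUS_CMD, and the two pattern variables are hypothetical names.

```shell
# ASSUMPTION: the status output contains words like these. Check real output
# from "agents-cli deploy --status" and adjust both patterns before relying on this.
STATUS_CMD=${STATUS_CMD:-"agents-cli deploy --status"}
STATUS_DONE_PATTERN=${STATUS_DONE_PATTERN:-"complete|succeeded"}
STATUS_FAIL_PATTERN=${STATUS_FAIL_PATTERN:-"fail|error"}

# poll_deploy_status [INTERVAL_SECONDS] [MAX_POLLS]
# Returns 0 on completion, 1 on failure, 2 if no final state was seen.
poll_deploy_status() {
  pd_interval=${1:-60}
  pd_max=${2:-20}
  pd_count=0
  while [ "$pd_count" -lt "$pd_max" ]; do
    # STATUS_CMD is intentionally unquoted so it splits into command and arguments.
    pd_output=$($STATUS_CMD 2>&1) || true
    printf '%s\n' "$pd_output"
    if printf '%s\n' "$pd_output" | grep -Eiq "$STATUS_FAIL_PATTERN"; then
      return 1
    fi
    if printf '%s\n' "$pd_output" | grep -Eiq "$STATUS_DONE_PATTERN"; then
      return 0
    fi
    pd_count=$((pd_count + 1))
    sleep "$pd_interval"
  done
  return 2
}
```

Calling poll_deploy_status with no arguments polls every 60 seconds for up to 20 rounds. The cap keeps a script from waiting forever if the status never reaches a final state.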

When Not to Use This Skill

The google-agents-cli-deploy skill has clear boundaries. It is not the right tool for every situation:

  • Non-ADK agents. If your agent is not built with the Google Agent Development Kit, this tool will not work. It relies on ADK-specific project structure and metadata.
  • API code patterns. If you need help writing agent code, use the google-agents-cli-adk-code skill instead.
  • Evaluation. If you need to evaluate agent performance, use the google-agents-cli-eval skill.
  • Project scaffolding. If you need to create a new ADK project from scratch, use the google-agents-cli-scaffold skill.
  • Observability. If you need Cloud Trace, prompt-response logging, or BigQuery analytics, use the google-agents-cli-observability skill.

The skill also does not handle batch inference with BigQuery Remote Function triggers or Pub/Sub event-driven patterns directly. For those, you need to look at the google-agents-cli-adk-code skill and its reference documentation.

Setup Context and Prerequisites

Before using this skill, you need:

  1. agents-cli installed. Install the tool with uv tool install google-agents-cli. If you do not have uv, install it first by following the uv documentation.
  2. An ADK project. Your project should be scaffolded with deployment support. If it is not, use the scaffold skill to add it.
  3. GCP credentials. You need authenticated GCP credentials with permissions to deploy to your chosen target (Cloud Run, GKE, or Agent Runtime).
  4. A GCP project. You need a project with the necessary APIs enabled (Cloud Run, GKE, Artifact Registry, etc., depending on your target).
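The tooling part of this checklist is easy to automate. The sketch below checks that the required commands are on PATH and reports all missing ones at once. require_tools is a hypothetical helper, and which tools you list (for example a cloud SDK for authentication) is an assumption that depends on your setup.

```shell
# require_tools TOOL...
# Reports every missing tool, then fails if any were missing.
require_tools() {
  rt_missing=0
  for rt_tool in "$@"; do
    if ! command -v "$rt_tool" >/dev/null 2>&1; then
      echo "missing required tool: $rt_tool" >&2
      rt_missing=$((rt_missing + 1))
    fi
  done
  [ "$rt_missing" -eq 0 ]
}

# Usage:
#   require_tools uv agents-cli || exit 1
```

This only covers installation. Credentials and enabled APIs still need to be checked against your GCP project.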

Optional: Single-Project Infrastructure Setup

If you need to provision infrastructure in a single GCP project without CI/CD (service accounts, IAM bindings, telemetry resources, Artifact Registry), you can run:

agents-cli infra single-project

This is optional. The agents-cli deploy command works without it. Use it if you need observability features or want to test infrastructure in a single project before going to production.

Note that agents-cli deploy does not automatically use the Terraform-created service account. You need to pass it explicitly with --service-account SA_EMAIL.

Safety Signals and Repository Health

When evaluating whether to adopt this skill, consider these signals:

  • Repository owner. The repository is maintained by Google, which provides some accountability and a reasonable expectation of continued maintenance.
  • License. The skill is licensed under Apache-2.0, which is a permissive open-source license.
  • Stars and community. The repository has over 3,000 stars, indicating active community interest.
  • Security level. The skill is rated as Low risk, meaning it does not require elevated permissions or access to sensitive resources beyond what is needed for deployment.
  • Version. The current version is 0.5.1, a pre-1.0 release, which suggests the tool is still evolving. Breaking changes in future releases are likely, so pin the version you have tested.

What to Inspect Before Using

Before adopting this skill in a production workflow, inspect:

  1. The reference documentation. The skill includes detailed reference files for each deployment target (cloud-run.md, agent-runtime.md, gke.md). Read the one relevant to your target.
  2. The Terraform patterns. If you plan to customize infrastructure, review terraform-patterns.md for IAM, state management, and resource import guidance.
  3. The CI/CD pipeline documentation. If you plan to set up automated deployments, review cicd-pipeline.md for runner comparison, WIF authentication, and pipeline stage configuration.
  4. The testing documentation. Review testing-deployed-agents.md for testing instructions specific to your deployment target.
  5. Your team's Kubernetes expertise. If you are considering GKE, ensure your team has the necessary Kubernetes knowledge. GKE has the highest setup complexity.

Practical Example: Deploying to Cloud Run

Here is a condensed example of deploying an ADK agent to Cloud Run:

  1. Ensure your project has deployment support:

    agents-cli scaffold enhance . --deployment-target cloud_run
    
  2. Review the generated Dockerfile and Terraform configuration in deployment/.

  3. Get human approval for deployment.

  4. Deploy:

    agents-cli deploy \
      --project my-gcp-project \
      --region us-central1 \
      --service-account agent-sa@my-gcp-project.iam.gserviceaccount.com \
      --secrets API_KEY=my-secret:latest
    
  5. Verify the deployment:

    curl https://my-service-abc123-uc.a.run.app/health
    
  6. If something goes wrong, check deployment status:

    agents-cli deploy --status
    

Summary

Deploying AI agents to production is a real engineering challenge. The google-agents-cli-deploy skill provides a structured approach for ADK agents, covering Agent Runtime, Cloud Run, and GKE deployment targets. It enforces human approval before deployment, supports secrets management, and includes CI/CD pipeline guidance.

It is not a universal solution. It only works with ADK projects, it does not handle agent code or evaluation, and the GKE path requires Kubernetes expertise. But if your team is building agents with the Google Agent Development Kit and needs a deployment workflow that reduces manual errors and enforces best practices, it is worth inspecting.

Start by reading the skill page and the reference documentation for your chosen deployment target. Test it on a non-critical project before committing to it for production workloads.
