The Deployment Gap: When Your Agent Works Locally but Fails in Production
You have spent weeks building an AI agent. It runs perfectly on your laptop. The evaluation scores look solid. Your team is ready to show it to users. Then you try to deploy it, and everything falls apart.
This is a common scenario for teams building agents with the Google Agent Development Kit (ADK). The local development environment is forgiving. You control the Python version, the environment variables are set in your shell, and there is no network policy blocking outbound calls. Production is different. You need a container image, a service account with the right IAM roles, secrets management, a deployment target that can scale, and a way to roll back if something goes wrong.
The pain is not just technical. It is organizational. A developer might push a Dockerfile that works on their machine but fails in CI because the base image is wrong. A teammate might hardcode an API key in the source code instead of using Secret Manager. Someone else might deploy to the wrong region. Each of these mistakes is avoidable, but they happen because the deployment process is manual, undocumented, and fragile.
Why This Problem Gets Worse Over Time
When you have one agent and one developer, you can get away with ad-hoc deployment. You SSH into a server, pull the latest code, and restart the service. This does not scale.
As your team grows, you need:
- Reproducible builds. The same code should produce the same container image every time.
- Secrets isolation. API keys and service account credentials must not live in source control.
- Environment parity. Dev, staging, and production should be as similar as possible.
- Rollback capability. If a deployment introduces a regression, you need to revert in seconds, not hours.
- Audit trails. You need to know who deployed what, when, and to which environment.
Without these, every deployment is a gamble. The cost is not just downtime. It is lost trust from users and wasted engineering time spent debugging infrastructure instead of improving the agent.
What a Good Solution Should Change
A deployment tool for AI agents should do three things:
- Abstract away infrastructure boilerplate. You should not have to write Terraform modules from scratch or debug Docker networking issues every time you deploy.
- Enforce best practices by default. Secrets should be injected from a secrets manager. Service accounts should follow least-privilege principles. Container images should be built with reproducible tooling.
- Support multiple deployment targets with a consistent interface. Whether you choose Cloud Run for simplicity, GKE for full Kubernetes control, or Agent Runtime for managed agent infrastructure, the deployment command should feel the same.
This is where a CLI tool designed specifically for ADK agent deployment can help. One option worth inspecting is the google-agents-cli-deploy skill, which is part of the broader Google Agents CLI ecosystem.
Introducing google-agents-cli-deploy
The google-agents-cli-deploy skill provides deployment workflows for agents built with the Google Agent Development Kit. It wraps Terraform, Docker, and cloud deployment logic into a tested pipeline, exposed through the agents-cli command-line tool.
This is not a general-purpose deployment tool. It is designed specifically for ADK agents and understands their structure: the AdkApp pattern, session management, agent identity, and the deployment metadata that ADK projects carry. If your project was scaffolded with the ADK, this tool knows how to package and deploy it.
What It Can Do
The skill covers several deployment scenarios:
- Deploy to Agent Runtime. This is the managed deployment target purpose-built for agents. It handles auto-scaling, persists session state via VertexAiSessionService, and supports features like agent identity and private VPC connectivity through PSC interfaces.
- Deploy to Cloud Run. For teams that need more control over infrastructure, event-driven triggers (Pub/Sub, Eventarc), or custom networking. Cloud Run supports fully configurable scaling, direct VPC egress, and IAP.
- Deploy to GKE. For teams that require full Kubernetes control, including HPA, VPA, node auto-provisioning, and custom networking. This option has the highest setup complexity but offers the most flexibility.
- Manage secrets. The --secrets flag lets you inject secrets from Secret Manager into your deployed service as environment variables.
- Configure CI/CD. The skill includes guidance on setting up full CI/CD pipelines with agents-cli infra cicd, including runner comparison, Workload Identity Federation authentication, and pipeline stage configuration.
- Handle rollback. Because deployments are versioned, you can revert to a previous version if a deployment introduces issues.
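Because injected secrets arrive as environment variables, agent code can read them at startup instead of embedding them in source. The following is a minimal sketch, not part of ADK or agents-cli: the helper name require_secret and the variable name API_KEY are hypothetical and chosen only for illustration.

```python
import os


def require_secret(name, environ=None):
    """Return the value of an injected secret, failing fast if it is absent.

    `name` is a hypothetical environment variable name (for example API_KEY).
    The deployed service is assumed to receive it via the --secrets flag.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        # Fail at startup with a clear message instead of at the first API call.
        raise RuntimeError(
            f"Missing required secret: environment variable {name!r} is not set"
        )
    return value
```

Calling require_secret("API_KEY") once at startup surfaces a misconfigured deployment immediately, rather than on the first user request.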
Choosing a Deployment Target
The skill provides a decision matrix to help you choose between Agent Runtime, Cloud Run, and GKE. Here is a simplified version:
- Agent Runtime is best if you want managed infrastructure with minimal operational overhead. It supports Python agents, has native session state management, and does not bill when idle. However, it does not support event-driven triggers like Pub/Sub or Cloud Scheduler.
- Cloud Run is best if you need event-driven workloads, custom networking, or more control over scaling behavior. It supports Pub/Sub and Eventarc triggers natively.
- GKE is best if you need full Kubernetes control, want to run non-Python agents in custom containers, or have existing Kubernetes infrastructure you want to reuse.
If your agent needs OAuth 2.0 user consent flows (for example, to access Google Drive or Calendar on behalf of a user), Agent Runtime with Gemini Enterprise is the recommended path. Cloud Run does not currently support managed OAuth flows.
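The simplified matrix above can be written down as a small decision function. This is only a sketch of the rules as summarized in this article, not the skill's actual matrix: the Requirements fields and the choose_target helper are hypothetical names, and the ordering of the checks is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical summary of what an agent needs from its deployment target."""
    event_driven_triggers: bool = False    # Pub/Sub, Eventarc, Cloud Scheduler
    custom_networking: bool = False
    full_kubernetes_control: bool = False
    non_python_agent: bool = False
    managed_oauth_consent: bool = False    # OAuth 2.0 user consent flows


def choose_target(req):
    """Return 'agent_runtime', 'cloud_run', or 'gke' per the simplified matrix."""
    if req.managed_oauth_consent:
        if req.event_driven_triggers:
            # Agent Runtime is the recommended OAuth path but lacks event triggers.
            raise ValueError("OAuth consent and event-driven triggers conflict")
        return "agent_runtime"
    if req.full_kubernetes_control or req.non_python_agent:
        return "gke"
    if req.event_driven_triggers or req.custom_networking:
        return "cloud_run"
    # Default: managed infrastructure with minimal operational overhead.
    return "agent_runtime"
```

Treating custom networking as a reason to pick Cloud Run is a simplification, since Agent Runtime also offers private VPC connectivity through PSC interfaces. Consult the skill's full matrix before deciding.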
How the Deployment Workflow Works
The deployment process with agents-cli follows a structured sequence. Understanding this sequence helps you avoid the most common failure modes.
Step 1: Ensure Your Project Is Ready
If your project was not scaffolded with deployment support, you need to add it first:
agents-cli scaffold enhance . --deployment-target <target>
Replace <target> with agent_runtime, cloud_run, or gke. This adds the necessary Dockerfile, Terraform configuration, and deployment metadata to your project.
Step 2: Get Human Approval
The skill explicitly requires human approval before deploying. This is a safety mechanism. The CLI will prompt you:
"Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
You must wait for explicit approval before proceeding. This prevents accidental deployments and ensures someone has reviewed the agent's readiness.
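If you script parts of this workflow yourself, the same principle can be mirrored with a gate that treats anything other than a clear "yes" as a refusal. This is a hypothetical sketch of the idea, not how the skill implements its approval step. Both function names are invented for illustration.

```python
def is_explicit_approval(answer):
    """Treat only a clear 'y' or 'yes' as approval; empty input counts as no."""
    return answer.strip().lower() in {"y", "yes"}


def approval_gate(prompt, ask=input):
    """Ask a human the question and return True only on explicit approval.

    `ask` defaults to the built-in input() and can be swapped out in tests.
    """
    return is_explicit_approval(ask(f"{prompt} [y/N] "))
```

Defaulting to "no" means an accidental Enter key press never triggers a deployment.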
Step 3: Run the Deploy Command
agents-cli deploy
This single command handles building the container image, pushing it to Artifact Registry, configuring the deployment target, and starting the service. You can pass flags to customize the deployment:
- --project to specify the GCP project ID
- --region to specify the deployment region
- --service-account to specify the service account email
- --secrets to inject secrets as environment variables
- --update-env-vars to set additional environment variables
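When deployments are triggered from a script, it can help to assemble the command from the flags listed above in one place. The helper below is a hypothetical sketch that uses only those flags. It passes the --secrets and --update-env-vars values through verbatim, because their exact syntax should be taken from the CLI's own help output rather than from this example.

```python
import shlex
from typing import List, Optional


def build_deploy_command(
    project: Optional[str] = None,
    region: Optional[str] = None,
    service_account: Optional[str] = None,
    secrets: Optional[str] = None,          # passed verbatim to --secrets
    update_env_vars: Optional[str] = None,  # passed verbatim to --update-env-vars
) -> List[str]:
    """Build the argv list for `agents-cli deploy`, omitting unset flags."""
    cmd = ["agents-cli", "deploy"]
    options = [
        ("--project", project),
        ("--region", region),
        ("--service-account", service_account),
        ("--secrets", secrets),
        ("--update-env-vars", update_env_vars),
    ]
    for flag, value in options:
        if value:
            cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    # Print the command for review; running it is left to the caller.
    print(shlex.join(build_deploy_command(project="my-gcp-project", region="us-central1")))
```

Returning a list rather than a single string lets the result be handed to subprocess.run(cmd, check=True) without shell quoting issues.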
Step 4: Verify the Deployment
After deployment, verify that the agent is running correctly. The skill includes testing instructions for each deployment target, including curl examples and load test guidance.
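A scripted smoke check can complement the manual curl test. The sketch below assumes the service exposes a /health path and accepts unauthenticated requests. Both are assumptions, and a service that requires authentication would also need an identity token in the request. The function names are hypothetical.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses still carry a status code.
        return err.code


def is_healthy(base_url, path="/health", fetch=fetch_status):
    """Return True when GET base_url + path answers with a 2xx status."""
    url = base_url.rstrip("/") + path
    try:
        return 200 <= fetch(url) < 300
    except OSError:
        # Covers connection failures and timeouts (URLError subclasses OSError).
        return False
```

The fetch parameter exists so the check can be exercised in tests without network access.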
Handling Timeouts
Agent Runtime deployments can take 5-10 minutes and may exceed command timeouts. If the deploy command is cancelled or times out, the deployment continues server-side. You can check progress with:
agents-cli deploy --status
Poll every 60 seconds until it reports completion or failure.
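The polling advice can be captured in a small loop with a bounded number of attempts. This is a sketch under stated assumptions: the output format of agents-cli deploy --status is not described here, so check is a hypothetical callable that you would implement to run the command and map its result to "running", "succeeded", or "failed".

```python
import time


def poll_until_done(check, interval_s=60.0, max_attempts=30, sleep=time.sleep):
    """Call `check` until it reports a terminal state or attempts run out.

    `check` is a hypothetical zero-argument callable returning one of
    'running', 'succeeded', or 'failed'. Returns the terminal state, or
    'timed_out' if no terminal state was seen within `max_attempts` calls.
    """
    for attempt in range(max_attempts):
        state = check()
        if state in ("succeeded", "failed"):
            return state
        if attempt < max_attempts - 1:
            # Wait between checks, but not after the final attempt.
            sleep(interval_s)
    return "timed_out"
```

Capping the attempts keeps a stuck deployment from blocking a script indefinitely. With the default values the loop gives up after roughly half an hour.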
When Not to Use This Skill
The google-agents-cli-deploy skill has clear boundaries. It is not the right tool for every situation:
- Non-ADK agents. If your agent is not built with the Google Agent Development Kit, this tool will not work. It relies on ADK-specific project structure and metadata.
- API code patterns. If you need help writing agent code, use the google-agents-cli-adk-code skill instead.
- Evaluation. If you need to evaluate agent performance, use the google-agents-cli-eval skill.
- Project scaffolding. If you need to create a new ADK project from scratch, use the google-agents-cli-scaffold skill.
- Observability. If you need Cloud Trace, prompt-response logging, or BigQuery analytics, use the google-agents-cli-observability skill.
The skill also does not handle batch inference with BigQuery Remote Function triggers or Pub/Sub event-driven patterns directly. For those, you need to look at the google-agents-cli-adk-code skill and its reference documentation.
Setup Context and Prerequisites
Before using this skill, you need:
- agents-cli installed. The tool is installed via uv tool install google-agents-cli. If you do not have uv, you need to install it first from the uv documentation.
- An ADK project. Your project should be scaffolded with deployment support. If it is not, use the scaffold skill to add it.
- GCP credentials. You need authenticated GCP credentials with permissions to deploy to your chosen target (Cloud Run, GKE, or Agent Runtime).
- A GCP project. You need a project with the necessary APIs enabled (Cloud Run, GKE, Artifact Registry, etc., depending on your target).
Optional: Single-Project Infrastructure Setup
If you need to provision infrastructure in a single GCP project without CI/CD (service accounts, IAM bindings, telemetry resources, Artifact Registry), you can run:
agents-cli infra single-project
This is optional. The agents-cli deploy command works without it. Use it if you need observability features or want to test infrastructure in a single project before going to production.
Note that agents-cli deploy does not automatically use the Terraform-created service account. You need to pass it explicitly with --service-account SA_EMAIL.
Safety Signals and Repository Health
When evaluating whether to adopt this skill, consider these signals:
- Repository owner. The repository is maintained by Google, which provides a level of accountability and long-term support.
- License. The skill is licensed under Apache-2.0, which is a permissive open-source license.
- Stars and community. The repository has over 3,000 stars, indicating active community interest.
- Security level. The skill is rated as Low risk, meaning it does not require elevated permissions or access to sensitive resources beyond what is needed for deployment.
- Version. The current version is 0.5.1, a pre-1.0 release, which suggests the tool is still evolving. Breaking changes in future releases are possible.
What to Inspect Before Using
Before adopting this skill in a production workflow, inspect:
- The reference documentation. The skill includes detailed reference files for each deployment target (cloud-run.md, agent-runtime.md, gke.md). Read the one relevant to your target.
- The Terraform patterns. If you plan to customize infrastructure, review terraform-patterns.md for IAM, state management, and resource import guidance.
- The CI/CD pipeline documentation. If you plan to set up automated deployments, review cicd-pipeline.md for runner comparison, WIF authentication, and pipeline stage configuration.
- The testing documentation. Review testing-deployed-agents.md for testing instructions specific to your deployment target.
- Your team's Kubernetes expertise. If you are considering GKE, ensure your team has the necessary Kubernetes knowledge. GKE has the highest setup complexity.
Practical Example: Deploying to Cloud Run
Here is a condensed example of deploying an ADK agent to Cloud Run:
- Ensure your project has deployment support:
  agents-cli scaffold enhance . --deployment-target cloud_run
- Review the generated Dockerfile and Terraform configuration in deployment/.
- Get human approval for deployment.
- Deploy:
  agents-cli deploy --project my-gcp-project --region us-central1 --service-account agent-sa@my-gcp-project.iam.gserviceaccount.com --secrets API_KEY=my-secret:latest
- Verify the deployment:
  curl https://my-service-abc123-uc.a.run.app/health
- If something goes wrong, check deployment status:
  agents-cli deploy --status
Summary
Deploying AI agents to production is a real engineering challenge. The google-agents-cli-deploy skill provides a structured approach for ADK agents, covering Agent Runtime, Cloud Run, and GKE deployment targets. It enforces human approval before deployment, supports secrets management, and includes CI/CD pipeline guidance.
It is not a universal solution. It only works with ADK projects, it does not handle agent code or evaluation, and the GKE path requires Kubernetes expertise. But if your team is building agents with the Google Agent Development Kit and needs a deployment workflow that reduces manual errors and enforces best practices, it is worth inspecting.
Start by reading the skill page and the reference documentation for your chosen deployment target. Test it on a non-critical project before committing to it for production workloads.