Google Cloud Agent SDK Master
About
This Claude Skill automatically activates to assist with all Google Cloud Agent Development Kit (ADK) operations. It helps developers build and deploy production-ready agent systems, including multi-agent architectures, containerized deployments, and RAG implementations. Use it for agent scaffolding, orchestration, and CI/CD pipeline setup when working with Google's agent frameworks.
Documentation
Google Cloud Agent SDK Master - Production-Ready Agent Systems
This Agent Skill provides comprehensive mastery of Google's Agent Development Kit (ADK) and Agent Starter Pack for building and deploying production-grade containerized agents.
Core Capabilities
Agent Development Kit (ADK)
Framework Overview:
- Open-source Python framework from Google
- The same framework powering agents in Google Agentspace and Customer Engagement Suite (CES)
- Build production agents in <100 lines of code
- Model-agnostic (optimized for Gemini)
- Deployment-agnostic (local, Cloud Run, GKE, Agent Engine)
Supported Agent Types:
- LLM Agents: Dynamic routing with intelligence
- Workflow Agents (see the sketch after this list):
- Sequential: Linear execution
- Loop: Iterative processing
- Parallel: Concurrent execution
- Custom Agents: User-defined implementations
- Multi-agent Systems: Hierarchical coordination
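The workflow agent types above map onto dedicated classes in the open-source ADK package. The following is a minimal sketch, assuming the SequentialAgent and LoopAgent classes exposed by google.adk.agents and two illustrative sub-agents; ParallelAgent follows the same pattern for concurrent fan-out (note that each agent instance can belong to only one parent):
from google.adk.agents import LlmAgent, SequentialAgent, LoopAgent

# Illustrative sub-agents; names, model, and instructions are placeholders
researcher = LlmAgent(
    name="researcher",
    model="gemini-2.5-pro",
    instruction="Gather facts about the user's topic.",
)
writer = LlmAgent(
    name="writer",
    model="gemini-2.5-pro",
    instruction="Write a short summary from the researcher's notes.",
)

# Sequential: run the researcher first, then the writer
pipeline = SequentialAgent(name="research_pipeline", sub_agents=[researcher, writer])

# Loop: re-run the pipeline for a bounded number of refinement passes
refiner = LoopAgent(name="refine_loop", sub_agents=[pipeline], max_iterations=3)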
Key Features:
- Flexible orchestration (workflow & LLM-driven)
- Tool ecosystem (search, code execution, custom functions; see the sketch after this list)
- Third-party integrations (LangChain, CrewAI)
- Agents-as-tools capability
- Built-in evaluation framework
- Cloud Trace integration
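To illustrate the tool ecosystem and the agents-as-tools capability, here is a minimal sketch assuming the open-source google.adk package; get_ticket_status is a hypothetical custom function, and AgentTool is ADK's wrapper for exposing one agent as a callable tool of another:
from google.adk.agents import Agent
from google.adk.tools.agent_tool import AgentTool

def get_ticket_status(ticket_id: str) -> dict:
    """Hypothetical custom tool; ADK derives the schema from the signature and docstring."""
    return {"ticket_id": ticket_id, "status": "open"}

summarizer = Agent(
    name="summarizer",
    model="gemini-2.5-pro",
    instruction="Summarize the text you are given.",
)

support_agent = Agent(
    name="support_agent",
    model="gemini-2.5-pro",
    instruction="Look up ticket status, then delegate summarization to the summarizer tool.",
    tools=[get_ticket_status, AgentTool(agent=summarizer)],
)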
Agent Starter Pack
Production Templates:
- adk_base - ReAct agent using ADK
- agentic_rag - Document retrieval + Q&A with search
- langgraph_base_react - LangGraph ReAct implementation
- crewai_coding_crew - Multi-agent coding system
- adk_live - Multimodal RAG (audio/video/text)
Infrastructure Automation:
- CI/CD setup with single command
- GitHub Actions or Cloud Build pipelines
- Multi-environment support (dev, staging, prod)
- Automated testing and evaluation
- Deployment rollback mechanisms
Deployment Targets
1. Vertex AI Agent Engine
- Fully managed runtime
- Auto-scaling and load balancing
- Built-in observability
- Serverless architecture
- Best for: Production-scale agents
2. Cloud Run
- Containerized serverless
- Pay-per-use pricing
- Custom domain support
- Traffic splitting
- Best for: Web-facing agents
3. Google Kubernetes Engine (GKE)
- Full container orchestration
- Advanced networking
- Resource management
- Multi-cluster support
- Best for: Complex multi-agent systems
4. Local/Docker
- Development and testing
- Custom infrastructure
- On-premises deployment
- Best for: POC and debugging
Technical Implementation
Installation:
# Agent Starter Pack (recommended)
pip install agent-starter-pack
# or run without installing, via uvx
uvx agent-starter-pack create my-agent
# ADK with Agent Engine support
pip install "google-cloud-aiplatform[adk,agent_engines]>=1.111"
Create Agent (ADK):
from google.cloud.aiplatform import agent
from vertexai.preview.agents import ADKAgent

# Simple ReAct agent
@agent.adk_agent
class MyAgent(ADKAgent):
    def __init__(self):
        super().__init__(
            model="gemini-2.5-pro",
            tools=[search_tool, code_exec_tool]
        )

    def run(self, query: str):
        return self.generate(query)

# Multi-agent orchestration
class OrchestratorAgent(ADKAgent):
    def __init__(self):
        self.research_agent = ResearchAgent()
        self.analysis_agent = AnalysisAgent()
        self.writer_agent = WriterAgent()

    def run(self, task: str):
        research = self.research_agent.run(task)
        analysis = self.analysis_agent.run(research)
        output = self.writer_agent.run(analysis)
        return output
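Note that the published open-source ADK (the google-adk package) exposes a plainer API than the decorator style above; as a point of comparison, here is a minimal sketch assuming the google.adk.agents module and a hypothetical get_weather tool function:
from google.adk.agents import Agent

def get_weather(city: str) -> dict:
    """Hypothetical tool; ADK wraps plain Python functions as callable tools."""
    return {"city": city, "forecast": "sunny"}

root_agent = Agent(
    name="weather_agent",
    model="gemini-2.5-pro",
    instruction="Answer weather questions using the available tools.",
    tools=[get_weather],
)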
Using Agent Starter Pack:
# Create project with template
uvx agent-starter-pack create my-rag-agent \
--template agentic_rag \
--deployment cloud_run
# Generates complete structure:
my-rag-agent/
├── src/
│   ├── agent.py              # Agent implementation
│   ├── tools/                # Custom tools
│   └── config.py             # Configuration
├── deployment/
│   ├── Dockerfile
│   ├── cloudbuild.yaml
│   └── terraform/
├── tests/
│   ├── unit_tests.py
│   └── integration_tests.py
└── .github/workflows/        # CI/CD pipelines
Deploy to Cloud Run:
# Using ADK CLI
adk deploy \
  --target cloud_run \
  --region us-central1 \
  --service-account agent-sa@PROJECT_ID.iam.gserviceaccount.com
# Manual with Docker
docker build -t gcr.io/PROJECT/agent:latest .
docker push gcr.io/PROJECT/agent:latest
gcloud run deploy agent \
--image gcr.io/PROJECT/agent:latest \
--region us-central1 \
--allow-unauthenticated
Deploy to Agent Engine:
# Using Agent Starter Pack
asp deploy \
--env production \
--target agent_engine
# Manual deployment
from google.cloud.aiplatform import agent_engines
agent_engines.deploy_agent(
    agent_id="my-agent",
    project="PROJECT_ID",
    location="us-central1"
)
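Depending on the SDK version, the documented route for deploying an ADK agent to Agent Engine goes through vertexai.agent_engines rather than the helper above; a minimal sketch, assuming root_agent is the ADK agent from the earlier google.adk sketch and BUCKET is a staging bucket you own:
import vertexai
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

vertexai.init(
    project="PROJECT_ID",
    location="us-central1",
    staging_bucket="gs://BUCKET",
)

# Wrap the ADK agent (root_agent, defined earlier) in an Agent Engine-compatible app
app = reasoning_engines.AdkApp(agent=root_agent, enable_tracing=True)

# Create the managed deployment
remote_app = agent_engines.create(
    agent_engine=app,
    requirements=["google-cloud-aiplatform[adk,agent_engines]"],
)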
RAG Agent Implementation
Vector Search Integration:
from vertexai.preview.rag import VectorSearchTool
from google.cloud import aiplatform
# Set up vector search
vector_search = VectorSearchTool(
    index_endpoint="projects/PROJECT/locations/LOCATION/indexEndpoints/INDEX_ID",
    deployed_index_id="deployed_index"
)

# RAG agent with ADK
class RAGAgent(ADKAgent):
    def __init__(self):
        super().__init__(
            model="gemini-2.5-pro",
            tools=[vector_search, web_search_tool]
        )

    def run(self, query: str):
        # Retrieves relevant docs automatically
        response = self.generate(
            f"Answer this using retrieved context: {query}"
        )
        return response
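The same retrieve-then-generate pattern can also be written directly against Gemini with the google-genai SDK when you want retrieval outside the agent framework; a minimal sketch, where retrieve_docs is a hypothetical helper backed by Vector Search or Vertex AI Search:
from google import genai

client = genai.Client(vertexai=True, project="PROJECT_ID", location="us-central1")

def retrieve_docs(query: str) -> list[str]:
    """Hypothetical retrieval helper; replace with a Vector Search or Vertex AI Search query."""
    return ["(retrieved passage 1)", "(retrieved passage 2)"]

def answer(query: str) -> str:
    # Ground the generation in whatever the retriever returned
    context = "\n\n".join(retrieve_docs(query))
    response = client.models.generate_content(
        model="gemini-2.5-pro",
        contents=f"Answer using only this context:\n{context}\n\nQuestion: {query}",
    )
    return response.text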
Vertex AI Search Integration:
from vertexai.preview.search import VertexAISearchTool
# Enterprise search integration
vertex_search = VertexAISearchTool(
    data_store_id="DATA_STORE_ID",
    project="PROJECT_ID"
)

agent = ADKAgent(
    model="gemini-2.5-pro",
    tools=[vertex_search]
)
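In the open-source ADK, the equivalent enterprise-search integration is exposed as a built-in tool; a minimal sketch, assuming the VertexAiSearchTool wrapper and a full data store resource name:
from google.adk.agents import Agent
from google.adk.tools import VertexAiSearchTool

search_tool = VertexAiSearchTool(
    data_store_id="projects/PROJECT_ID/locations/global/collections/default_collection/dataStores/DATA_STORE_ID"
)

search_agent = Agent(
    name="enterprise_search_agent",
    model="gemini-2.5-pro",
    instruction="Answer questions using the company data store.",
    tools=[search_tool],
)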
CI/CD Automation
GitHub Actions (auto-generated):
name: Deploy Agent
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Test Agent
        run: pytest tests/
      - name: Deploy to Cloud Run
        run: |
          gcloud run deploy agent \
            --source . \
            --region us-central1
Cloud Build Pipeline:
steps:
  # Build container
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/agent', '.']
  # Run tests
  - name: 'gcr.io/$PROJECT_ID/agent'
    args: ['pytest', 'tests/']
  # Deploy to Cloud Run
  - name: 'gcr.io/cloud-builders/gcloud'
    args:
      - 'run'
      - 'deploy'
      - 'agent'
      - '--image=gcr.io/$PROJECT_ID/agent'
      - '--region=us-central1'
Multi-Agent Orchestration
Hierarchical Agents:
# Coordinator agent with specialized sub-agents
class ProjectManagerAgent(ADKAgent):
    def __init__(self):
        self.researcher = ResearchAgent()
        self.analyst = AnalysisAgent()
        self.writer = WriterAgent()
        self.reviewer = ReviewAgent()

    def run(self, project_brief: str):
        # Coordinate multiple agents
        research = self.researcher.run(project_brief)
        analysis = self.analyst.run(research)
        draft = self.writer.run(analysis)
        final = self.reviewer.run(draft)
        return final
Parallel Agent Execution:
import asyncio
class ParallelResearchAgent(ADKAgent):
    async def research_topic(self, topics: list[str]):
        # Run multiple agents concurrently
        tasks = [
            self.specialized_agent(topic)
            for topic in topics
        ]
        results = await asyncio.gather(*tasks)
        return self.synthesize(results)
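A hypothetical invocation of the class above, assuming specialized_agent returns an awaitable per topic and synthesize merges the partial results:
import asyncio

agent = ParallelResearchAgent()
summary = asyncio.run(
    agent.research_topic(["vector databases", "agent evaluation", "prompt caching"])
)
print(summary)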
Evaluation & Monitoring
Built-in Evaluation:
from google.cloud.aiplatform import agent_evaluation
# Define evaluation metrics
eval_config = agent_evaluation.EvaluationConfig(
    metrics=["accuracy", "relevance", "safety"],
    test_dataset="gs://bucket/eval_data.jsonl"
)
# Run evaluation
results = agent.evaluate(eval_config)
print(f"Accuracy: {results.accuracy}")
print(f"Relevance: {results.relevance}")
Cloud Trace Integration:
from google.cloud import trace_v1
# Automatic tracing
@traced_agent
class MonitoredAgent(ADKAgent):
    def run(self, query: str):
        # All calls automatically traced
        with self.trace_span("retrieval"):
            docs = self.retrieve(query)
        with self.trace_span("generation"):
            response = self.generate(query, docs)
        return response
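If the decorator-based helpers above are not available in your SDK version, equivalent spans can be emitted with OpenTelemetry, which Cloud Trace can ingest; a minimal sketch that assumes a trace exporter is already configured and uses hypothetical retrieve and generate helpers:
from opentelemetry import trace

tracer = trace.get_tracer("agent.monitoring")

def answer_query(query: str) -> str:
    # Each span appears as a child of the current trace in Cloud Trace
    with tracer.start_as_current_span("retrieval"):
        docs = retrieve(query)            # hypothetical retrieval helper
    with tracer.start_as_current_span("generation"):
        response = generate(query, docs)  # hypothetical generation helper
    return response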
Security & Best Practices
1. Service Account Management:
# Create minimal-permission service account
gcloud iam service-accounts create agent-sa \
--display-name "Agent Service Account"
# Grant only required permissions
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:agent-sa@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
2. Secret Management:
from google.cloud import secretmanager
def get_api_key():
    client = secretmanager.SecretManagerServiceClient()
    name = "projects/PROJECT/secrets/api-key/versions/latest"
    response = client.access_secret_version(name=name)
    return response.payload.data.decode('UTF-8')
3. VPC Service Controls:
# Enable VPC SC for data security
gcloud access-context-manager perimeters create agent-perimeter \
  --title="Agent Perimeter" \
  --policy=POLICY_ID \
  --resources=projects/PROJECT_NUMBER \
  --restricted-services=aiplatform.googleapis.com
Cost Optimization
Strategies:
- Use Gemini 2.5 Flash for most operations
- Cache embeddings for RAG systems (see the sketch after this list)
- Implement request batching
- Use preemptible GKE nodes
- Monitor token usage in Cloud Monitoring
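For the embedding-caching strategy noted above, here is a minimal sketch, assuming the text-embedding-004 model via vertexai.language_models; the cache is an in-process memo rather than a production store:
from functools import lru_cache
from vertexai.language_models import TextEmbeddingModel

_model = TextEmbeddingModel.from_pretrained("text-embedding-004")

@lru_cache(maxsize=10_000)
def embed(text: str) -> tuple[float, ...]:
    # Repeated texts skip the paid embedding call entirely
    embedding = _model.get_embeddings([text])[0]
    return tuple(embedding.values)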
Pricing Examples (indicative; verify current rates):
- Cloud Run: $0.00024/GB-second
- Agent Engine: Pay-per-request pricing
- GKE: Standard cluster costs
- Gemini API: $3.50/1M tokens (Pro)
Reference Architecture
Production Agent System:
┌───────────────────┐
│   Load Balancer   │
└─────────┬─────────┘
          │
     ┌────▼────┐
     │Cloud Run│  (Agent containers)
     └────┬────┘
          │
   ┌──────▼────────┐
   │ Agent Engine  │  (Orchestration)
   └──────┬────────┘
          │
 ┌────────▼────────────┐
 │  Vertex AI Search   │  (RAG)
 │  Vector Search      │
 │  Gemini 2.5 Pro     │
 └─────────────────────┘
Best Practices for Jeremy
1. Start with Templates:
# Use Agent Starter Pack templates
uvx agent-starter-pack create my-agent --template agentic_rag
2. Local Development:
# Test locally first
adk serve --port 8080
curl http://localhost:8080/query -d '{"question": "test"}'
3. Gradual Deployment:
# Deploy to dev β staging β prod
asp deploy --env dev
# Test thoroughly
asp deploy --env staging
# Final production push
asp deploy --env production
4. Monitor Everything:
- Enable Cloud Trace
- Set up error reporting
- Track token usage
- Monitor response times
- Set up alerting
Official Documentation
Core Resources:
- ADK Docs: https://google.github.io/adk-docs/
- Agent Starter Pack: https://github.com/GoogleCloudPlatform/agent-starter-pack
- Agent Engine: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview
- Agent Builder: https://cloud.google.com/products/agent-builder
Tutorials:
- Building AI Agents: https://codelabs.developers.google.com/devsite/codelabs/building-ai-agents-vertexai
- Multi-agent Systems: https://cloud.google.com/blog/products/ai-machine-learning/build-and-manage-multi-system-agents-with-vertex-ai
When This Skill Activates
This skill automatically activates when you mention:
- Agent development, ADK, or Agent Starter Pack
- Multi-agent systems or orchestration
- Containerized agent deployment
- Cloud Run, GKE, or Agent Engine deployment
- RAG agents or ReAct agents
- Agent templates or scaffolding
- CI/CD for agents
- Production agent systems
Integration with Other Services
Google Cloud:
- Vertex AI (Gemini, Search, Vector Search)
- Cloud Storage (data storage)
- Cloud Functions (triggers)
- Cloud Scheduler (automation)
- Cloud Logging & Monitoring
Third-party:
- LangChain integration (see the sketch after this list)
- CrewAI orchestration
- Custom tool frameworks
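For the LangChain integration, ADK ships an adapter that wraps LangChain tools as ADK tools; a minimal sketch, assuming the LangchainTool adapter and the community Wikipedia tool:
from google.adk.agents import Agent
from google.adk.tools.langchain_tool import LangchainTool
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

wiki = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

research_agent = Agent(
    name="wiki_agent",
    model="gemini-2.5-pro",
    instruction="Answer questions using Wikipedia lookups.",
    tools=[LangchainTool(tool=wiki)],
)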
Success Metrics
Track:
- Agent response time (target: <2s)
- Evaluation scores (target: >85% accuracy)
- Deployment frequency (target: daily)
- System uptime (target: 99.9%)
- Cost per query (target: <$0.01)
This skill makes Jeremy a Google Cloud agent architecture expert with instant access to ADK, Agent Starter Pack, and production deployment patterns.
Quick Install
/plugin add https://github.com/jeremylongshore/claude-code-plugins-plus/tree/main/agent-sdk-master
Copy and paste this command in Claude Code to install this skill.
GitHub Repository
Related Skills
sglang (Meta)
SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.
llamaguard (Other)
LlamaGuard is Meta's 7-8B parameter model for moderating LLM inputs and outputs across six safety categories like violence and hate speech. It offers 94-95% accuracy and can be deployed using vLLM, Hugging Face, or Amazon SageMaker. Use this skill to easily integrate content filtering and safety guardrails into your AI applications.
evaluating-llms-harness (Testing)
This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.
langchain (Meta)
LangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers, offers 500+ integrations, and includes features like tool calling and memory management. Use it for rapid prototyping and deploying production systems like chatbots, autonomous agents, and question-answering services.
