azure-aigateway

Name: azure-aigateway
Author: microsoft

Popular

Configure Azure API Management as an AI Gateway for AI models, MCP tools, and agents. WHEN: semantic caching, token limit, content safety, load balancing, AI model governance, MCP rate limiting, jailbreak detection, add Azure OpenAI backend, add AI Foundry model, test AI gateway, LLM policies, configure AI backend, token metrics, AI cost control, convert API to MCP, import OpenAPI to gateway.

1.2Kstars

0forks

Updated 6/9/2026

Get Skill Source Code

SKILL.md

readonlyread-only

name

azure-aigateway

description

"Configure Azure API Management as an AI Gateway for AI models, MCP tools, and agents. WHEN: semantic caching, token limit, content safety, load balancing, AI model governance, MCP rate limiting, jailbreak detection, add Azure OpenAI backend, add AI Foundry model, test AI gateway, LLM policies, configure AI backend, token metrics, AI cost control, convert API to MCP, import OpenAPI to gateway."

version

"3.1.1"

Azure AI Gateway

Configure Azure API Management (APIM) as an AI Gateway for governing AI models, MCP tools, and agents.

To deploy APIM, use the azure-prepare skill. See APIM deployment guide.

When to Use This Skill

Category	Triggers
Model Governance	"semantic caching", "token limits", "load balance AI", "track token usage"
Tool Governance	"rate limit MCP", "protect my tools", "configure my tool", "convert API to MCP"
Agent Governance	"content safety", "jailbreak detection", "filter harmful content"
Configuration	"add Azure OpenAI backend", "configure my model", "add AI Foundry model"
Testing	"test AI gateway", "call OpenAI through gateway"

Quick Reference

Policy	Purpose	Details
`azure-openai-token-limit`	Cost control	Model Policies
`azure-openai-semantic-cache-lookup/store`	60-80% cost savings	Model Policies
`azure-openai-emit-token-metric`	Observability	Model Policies
`llm-content-safety`	Safety & compliance	Agent Policies
`rate-limit-by-key`	MCP/tool protection	Tool Policies

Get Gateway Details

# Get gateway URL
az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv

# List backends (AI models)
az apim backend list --service-name <apim-name> --resource-group <rg> \
  --query "[].{id:name, url:url}" -o table

# Get subscription key
az apim subscription keys list \
  --service-name <apim-name> --resource-group <rg> --subscription-id <sub-id>

Test AI Endpoint

GATEWAY_URL=$(az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv)

curl -X POST "${GATEWAY_URL}/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: <key>" \
  -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}'

Common Tasks

Add AI Backend

See references/patterns.md for full steps.

# Discover AI resources
az cognitiveservices account list --query "[?kind=='OpenAI']" -o table

# Create backend
az apim backend create --service-name <apim> --resource-group <rg> \
  --backend-id openai-backend --protocol http --url "https://<aoai>.openai.azure.com/openai"

# Grant access (managed identity)
az role assignment create --assignee <apim-principal-id> \
  --role "Cognitive Services User" --scope <aoai-resource-id>

Apply AI Governance Policy

Recommended policy order in <inbound>:

Authentication - Managed identity to backend
Semantic Cache Lookup - Check cache before calling AI
Token Limits - Cost control
Content Safety - Filter harmful content
Backend Selection - Load balancing
Metrics - Token usage tracking

See references/policies.md for complete example.

Troubleshooting

Issue	Solution
Token limit 429	Increase `tokens-per-minute` or add load balancing
No cache hits	Lower `score-threshold` to 0.7
Content false positives	Increase category thresholds (5-6)
Backend auth 401	Grant APIM "Cognitive Services User" role

See references/troubleshooting.md for details.

References

Detailed Policies - Full policy examples
Configuration Patterns - Step-by-step patterns
Troubleshooting - Common issues
AI-Gateway Samples
GenAI Gateway Docs

SDK Quick References

Content Safety: Python | TypeScript
API Management: Python | .NET

Related Skills

caveman-compress

73Kbackend-api

juliusbrussee

Get

hyperframes-media

29Kbackend-api

Asset preprocessing for HyperFrames compositions — multi-provider TTS (HeyGen / ElevenLabs / Kokoro local), multi-provider BGM (Google Lyria / local MusicGen), Whisper transcription, background removal, and caption authoring. Use for npx hyperframes tts, bgm, transcribe, remove-background, voice/provider selection, music-mood prompting, captions / subtitles / lyrics / karaoke / per-word styling.

heygen-com

Get

lark-base

14Kbackend-api

飞书多维表格（Base）操作：建表、字段、记录、视图、统计、公式/lookup、表单、仪表盘、workflow、角色权限；遇到 Base/多维表格/bitable 或 /base/ 链接时使用。文件导入转 lark-drive，认证/授权转 lark-shared。

larksuite

Get

azure-resource-visualizer

1.2Kbackend-api

Analyze Azure resource groups and generate detailed Mermaid architecture diagrams showing the relationships between individual resources. WHEN: create architecture diagram, visualize Azure resources, show resource relationships, generate Mermaid diagram, analyze resource group, diagram my resources, architecture visualization, resource topology, map Azure infrastructure.

microsoft

Get

firebase-ai-logic-basics

357backend-api

Official skill for integrating Firebase AI Logic (Gemini API) into web applications. Covers setup, multimodal inference, structured output, and security.

firebase

Get

wonda-cli

127backend-api

Using the Wonda CLI to generate images, videos, music, and audio from the terminal — plus LinkedIn, Reddit, and X/Twitter research and automation

degausai

Get

azure-aigateway

Azure AI Gateway

When to Use This Skill

Quick Reference

Get Gateway Details

Test AI Endpoint

Common Tasks

Add AI Backend

Apply AI Governance Policy

Troubleshooting

References

SDK Quick References

You Might Also Like

Related Skills

caveman-compress

hyperframes-media

lark-base

azure-resource-visualizer

firebase-ai-logic-basics

wonda-cli