This blog is part of the ADK Masterclass - Hands-On Series. In this blog, we will set up and run ADK agents using three different methods: CLI, Web, and Programmatic (Python).
Starting agents with the CLI is great for quick testing and automation; the Web interface offers a visual way to build agents as well as chat with them; and the programmatic approach gives us full control for integration and custom workflows.
1. Setting Up via CLI
The command-line interface is the fastest way to get started with ADK agents. We can use the adk run command to interact with our agent directly from the terminal.
Running an Agent
The adk run command requires the path to the agent source code folder. We can run it from any directory:
adk run simple_agent
This starts an interactive CLI session where we can chat with our agent. The command will:
- Set up logging (logs are saved to a temporary directory)
- Display experimental warnings (these are normal and can be ignored)
- Start an interactive session where we can type questions
Here's what the session looks like:
❯ adk run simple_agent
Log setup complete: /var/folders/.../agent.20251119_051759.log
Running agent root_agent, type exit to exit.
[user]: what is deep learning ?
[root_agent]: Deep learning is a subfield of machine learning...
[user]:
Type exit to exit the interactive session.
Session Management
The adk run command supports several options for managing sessions:
1. Save Session
Save the session to a JSON file when exiting. When you type exit, you'll be prompted to enter a session ID:
❯ adk run simple_agent --save_session
Running agent root_agent, type exit to exit.
[user]: exit
Session ID to save: demo
Session saved to /path/to/simple_agent/demo.session.json
We can also specify the session ID directly using --session_id:
adk run simple_agent --save_session --session_id demo
2. Resume Session
Resume a previously saved session:
adk run simple_agent --resume demo.session.json
3. Replay Session
Replay queries from a saved session (read-only):
adk run simple_agent --replay demo.session.json
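Since a saved session is plain JSON, we can inspect it with the standard library before resuming or replaying it. Here's a minimal sketch; the exact field names depend on the ADK version, so treat the keys shown in the comment as illustrative:

```python
import json

def summarize_session(path):
    """Load a saved ADK session file and list its top-level keys."""
    with open(path) as f:
        session = json.load(f)
    return sorted(session.keys())

# Example: inspect the file written by `adk run simple_agent --save_session`
# print(summarize_session("demo.session.json"))
```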
2. Setting Up via Web Interface
The ADK Web interface provides a visual way to run and interact with agents through a browser. It starts a FastAPI server with a Web UI where we can chat with our agents and monitor their execution.
Starting the Web Interface
To start the web interface, run:
adk web
If we need to specify a directory containing our agents (where each sub-directory is a single agent containing at least __init__.py and agent.py files), we can provide it as an argument:
adk web path/to/agents_dir
Open a browser and navigate to the URL shown in the terminal (usually http://127.0.0.1:8000).
Common Options
The adk web command supports several useful options:
Custom Port: Specify a different port:
adk web --port 8080 path/to/agents_dir
Custom Host: Bind to a specific host:
adk web --host 0.0.0.0 --port 8000 path/to/agents_dir
Auto Reload: Enable auto-reload for development (not supported for Cloud Run):
adk web --reload path/to/agents_dir
Live Agent Reload: Enable live reload for agent changes:
adk web --reload_agents path/to/agents_dir
Using the Web UI
The web interface allows us to:
- Chat with the agent: Send messages and see responses in real-time
- View agent configuration: See the current agent setup, model, and tools
- Test different inputs: Experiment with various prompts and scenarios
- Monitor execution: Watch the agent's thought process and tool usage
3. Setting Up Programmatically (Python)
The programmatic approach gives us full control over our agent's behavior. We can create, configure, and run agents directly in Python code, making it ideal for integration into larger applications.
Running an Agent with Runner
To run an agent programmatically, we use ADK's Runner class with a session service. First, let's create a Python file (e.g., simple-app.py):
import asyncio
import os
from google.genai import types
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

async def main():
    # Create session service
    session_service = InMemorySessionService()

    # Create a session
    session = await session_service.create_session(
        state={},
        app_name='demo_app',
        user_id='demo_user'
    )

    # Create agent
    agent = Agent(
        model=os.getenv("GEMINI_MODEL", "gemini-2.5-flash"),
        name='root_agent',
        description='A helpful assistant for user questions.',
        instruction='Answer user questions to the best of your knowledge',
    )

    # Create runner
    runner = Runner(
        app_name='demo_app',
        agent=agent,
        session_service=session_service,
    )

    # Format user input
    query = "What are Transformers in AI?"
    content = types.Content(
        role='user',
        parts=[types.Part(text=query)]
    )

    # Run agent and process events
    events_async = runner.run_async(
        session_id=session.id,
        user_id=session.user_id,
        new_message=content
    )

    # Process events and extract response text
    async for event in events_async:
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    print(part.text)

if __name__ == "__main__":
    asyncio.run(main())
Set your API key and run the script:
# Set API key
export GOOGLE_API_KEY=your_api_key_here
# Run the script
python simple-app.py
The output will show the agent's response. Here's an example:
The **Transformer** is a groundbreaking neural network architecture introduced in 2017 by Google Brain researchers in their paper "Attention Is All You Need." It has since become the cornerstone of modern Artificial Intelligence, particularly in Natural Language Processing (NLP), but also increasingly in other domains like computer vision.
Here's a breakdown of what makes Transformers revolutionary:
1. **Addressing Limitations of Previous Models (RNNs/LSTMs):**
Before Transformers, Recurrent Neural Networks (RNNs) and their variants like LSTMs were dominant for sequential data like text. However, they had key drawbacks:
- **Sequential Processing:** They process data word-by-word, which is slow and hinders parallelization
- **Difficulty with Long-Range Dependencies:** While LSTMs improved this, they still struggled to effectively capture relationships between words that are far apart
2. **The Core Innovation: The Attention Mechanism:**
Transformers completely discard recurrence and convolution layers. Instead, they rely entirely on an "attention mechanism" that allows the model to:
- Weigh the importance of every other word in the input sequence when processing a particular word
- Dynamically focus on the most relevant parts of the input, regardless of their position
[... rest of the response ...]
Programmatic Approach for Custom UI or Integration
When building a custom frontend application (React, Vue, Angular, etc.), we can create a backend API using FastAPI or Flask that integrates with ADK. This allows us to have full control over the API design while leveraging ADK's agent capabilities.
Setting Up the Backend
First, install the required dependencies:
pip install fastapi uvicorn python-dotenv
Create a new file called main.py (or app.py) with the following code:
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Optional
import os
from google.genai import types
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from dotenv import load_dotenv

load_dotenv()

app = FastAPI()

# Enable CORS for frontend
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Configure appropriately for production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Initialize session service and runner
session_service = InMemorySessionService()
agent = Agent(
    model=os.getenv("GEMINI_MODEL", "gemini-2.5-flash"),
    name='root_agent',
    description='A helpful assistant for user questions.',
    instruction='Answer user questions to the best of your knowledge',
)
runner = Runner(
    app_name='custom_frontend_app',
    agent=agent,
    session_service=session_service,
)

# Request/Response models
class ChatRequest(BaseModel):
    message: str
    session_id: Optional[str] = None
    user_id: str = "default_user"

class ChatResponse(BaseModel):
    response: str
    session_id: str

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    try:
        # Create or retrieve session
        if request.session_id:
            session = await session_service.get_session(
                app_name='custom_frontend_app',
                user_id=request.user_id,
                session_id=request.session_id
            )
            if not session:
                raise HTTPException(status_code=404, detail="Session not found")
        else:
            session = await session_service.create_session(
                state={},
                app_name='custom_frontend_app',
                user_id=request.user_id
            )

        # Format user input
        content = types.Content(
            role='user',
            parts=[types.Part(text=request.message)]
        )

        # Run agent
        events_async = runner.run_async(
            session_id=session.id,
            user_id=session.user_id,
            new_message=content
        )

        # Collect response text
        response_text = ""
        async for event in events_async:
            if event.content and event.content.parts:
                for part in event.content.parts:
                    if part.text:
                        response_text += part.text

        return ChatResponse(
            response=response_text,
            session_id=session.id
        )
    except HTTPException:
        # Re-raise HTTP errors (e.g. the 404 above) unchanged
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/health")
async def health():
    return {"status": "healthy"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8081)
Running the Server
Set your API key and run the FastAPI server:
# Set API key
export GOOGLE_API_KEY=your_api_key_here
# Run the server
uvicorn main:app --reload --port 8081
The server will start on http://localhost:8081. We can verify it's running by visiting:
- http://localhost:8081/health - Health check endpoint
- http://localhost:8081/docs - Interactive API documentation (Swagger UI)
Testing the API
We can test the /chat endpoint using curl or any HTTP client:
curl -X POST "http://localhost:8081/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What are Transformers in AI?",
    "user_id": "test_user"
  }'
This approach gives us:
- Full API control: Design endpoints that match our frontend needs
- Session management: Maintain conversation context across requests
- Custom authentication: Integrate with our existing auth system
- Flexible frontend: Use any frontend framework or technology
- Scalability: Deploy the backend independently and scale as needed
4. Choosing the Right Method
Here's a quick guide to help us choose the best method for our needs:
- CLI: Best for quick testing, debugging, and automation scripts
- Web Interface: Ideal for prototyping, demos, and non-technical users
- Programmatic: Perfect for production applications, integrations, and custom workflows
We can also combine methods—for example, using the Web interface for development and the Programmatic approach for custom UI or integration.
Next Steps
Now that we know how to set up and run agents, let's explore what we can build:
- Build agent with Visual Builder: Create agents without writing code using the visual interface.
- Building LLM Agents: Learn how to build intelligent, model-powered agents.
- Workflow Agents: Create dynamic task flows with sequential, loop, and parallel workflows.