2. Setting Up ADK Agents: CLI, Web, and Programmatic Methods

This blog is part of the ADK Masterclass - Hands-On Series. In this blog, we will set up and run ADK agents using three different methods: CLI, Web, and Programmatic (Python).

The CLI is great for quick testing and automation, the Web interface offers a visual way to build agents and chat with them, and the Programmatic approach gives us full control for integration and custom workflows.

View Code on GitHub

Table of Contents

  1. Setting Up via CLI
  2. Setting Up via Web Interface
  3. Setting Up Programmatically (Python)
  4. Choosing the Right Method

1. Setting Up via CLI

The command-line interface is the fastest way to get started with ADK agents. We can use the adk run command to interact with our agent directly from the terminal.

Running an Agent

The adk run command requires the path to the agent source code folder. We can run it from any directory:

adk run simple_agent

This starts an interactive CLI session where we can chat with our agent. The command will:

  • Set up logging (logs are saved to a temporary directory)
  • Display experimental warnings (these are normal and can be ignored)
  • Start an interactive session where we can type questions

Here's what the session looks like:

❯ adk run simple_agent

Log setup complete: /var/folders/.../agent.20251119_051759.log

Running agent root_agent, type exit to exit.

[user]: what is deep learning ?

[root_agent]: Deep learning is a subfield of machine learning...

[user]:

Type exit to exit the interactive session.

Session Management

The adk run command supports several options for managing sessions:

1. Save Session

Save the session to a JSON file when exiting. When you type exit, you'll be prompted to enter a session ID:

❯ adk run simple_agent --save_session

Running agent root_agent, type exit to exit.

[user]: exit

Session ID to save: demo

Session saved to /path/to/simple_agent/demo.session.json

We can also specify the session ID directly using --session_id:

adk run simple_agent --save_session --session_id demo

2. Resume Session

Resume a previously saved session:

adk run simple_agent --resume demo.session.json

3. Replay Session

Replay queries from a saved session (read-only):

adk run simple_agent --replay demo.session.json

2. Setting Up via Web Interface

The ADK Web interface provides a visual way to run and interact with agents through a browser. It starts a FastAPI server with a Web UI where we can chat with our agents and monitor their execution.

Starting the Web Interface

To start the web interface, run:

adk web

If we need to specify a directory containing our agents (where each sub-directory is a single agent containing at least __init__.py and agent.py files), we can provide it as an argument:

adk web path/to/agents_dir
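
For reference, such an agents directory would look something like this (agent names are illustrative):

```
agents_dir/
├── simple_agent/
│   ├── __init__.py
│   └── agent.py
└── another_agent/
    ├── __init__.py
    └── agent.py
```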

Open a browser and navigate to the URL shown in the terminal (usually http://127.0.0.1:8000).

Common Options

The adk web command supports several useful options:

Custom Port: Specify a different port:

adk web --port 8080 path/to/agents_dir

Custom Host: Bind to a specific host:

adk web --host 0.0.0.0 --port 8000 path/to/agents_dir

Auto Reload: Enable auto-reload for development (not supported for Cloud Run):

adk web --reload path/to/agents_dir

Live Agent Reload: Enable live reload for agent changes:

adk web --reload_agents path/to/agents_dir

Using the Web UI

The web interface allows us to:

  • Chat with the agent: Send messages and see responses in real-time
  • View agent configuration: See the current agent setup, model, and tools
  • Test different inputs: Experiment with various prompts and scenarios
  • Monitor execution: Watch the agent's thought process and tool usage

3. Setting Up Programmatically (Python)

The programmatic approach gives us full control over our agent's behavior. We can create, configure, and run agents directly in Python code, making it ideal for integration into larger applications.

Running an Agent with Runner

To run an agent programmatically, we use ADK's Runner class with a session service. First, let's create a Python file (e.g., simple-app.py):

import asyncio
import os
from google.genai import types
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

async def main():
    # Create session service
    session_service = InMemorySessionService()
    
    # Create a session
    session = await session_service.create_session(
        state={}, 
        app_name='demo_app', 
        user_id='demo_user'
    )
    
    # Create agent
    agent = Agent(
        model=os.getenv("GEMINI_MODEL", "gemini-2.5-flash"),
        name='root_agent',
        description='A helpful assistant for user questions.',
        instruction='Answer user questions to the best of your knowledge',
    )
    
    # Create runner
    runner = Runner(
        app_name='demo_app',
        agent=agent,
        session_service=session_service,
    )
    
    # Format user input
    query = "What are Transformers in AI?"
    content = types.Content(
        role='user', 
        parts=[types.Part(text=query)]
    )
    
    # Run agent and process events
    events_async = runner.run_async(
        session_id=session.id,
        user_id=session.user_id,
        new_message=content
    )
    
    # Process events and extract response text
    async for event in events_async:
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    print(part.text)

if __name__ == "__main__":
    asyncio.run(main())

Set your API key and run the script:

# Set API key
export GOOGLE_API_KEY=your_api_key_here

# Run the script
python simple-app.py
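
Since the script calls load_dotenv(), the same variables can instead live in a .env file next to the script (values below are placeholders):

```
GOOGLE_API_KEY=your_api_key_here
GEMINI_MODEL=gemini-2.5-flash
```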

The output will show the agent's response. Here's an example:

The **Transformer** is a groundbreaking neural network architecture introduced in 2017 by Google Brain researchers in their paper "Attention Is All You Need." It has since become the cornerstone of modern Artificial Intelligence, particularly in Natural Language Processing (NLP), but also increasingly in other domains like computer vision.

Here's a breakdown of what makes Transformers revolutionary:

1. **Addressing Limitations of Previous Models (RNNs/LSTMs):**
   Before Transformers, Recurrent Neural Networks (RNNs) and their variants like LSTMs were dominant for sequential data like text. However, they had key drawbacks:
   - **Sequential Processing:** They process data word-by-word, which is slow and hinders parallelization
   - **Difficulty with Long-Range Dependencies:** While LSTMs improved this, they still struggled to effectively capture relationships between words that are far apart

2. **The Core Innovation: The Attention Mechanism:**
   Transformers completely discard recurrence and convolution layers. Instead, they rely entirely on an "attention mechanism" that allows the model to:
   - Weigh the importance of every other word in the input sequence when processing a particular word
   - Dynamically focus on the most relevant parts of the input, regardless of their position

[... rest of the response ...]
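
The async for loop at the end of simple-app.py keeps only the text parts of each event. That filtering can be factored into a small helper; here is a minimal sketch with stand-in event classes (the real events come from runner.run_async):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Part:
    text: Optional[str] = None

@dataclass
class Content:
    parts: List[Part] = field(default_factory=list)

@dataclass
class Event:
    content: Optional[Content] = None

def extract_text(events):
    """Concatenate text parts from a stream of events, skipping events without content."""
    chunks = []
    for event in events:
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    chunks.append(part.text)
    return "".join(chunks)

events = [
    Event(Content([Part("Deep learning ")])),
    Event(None),  # e.g. an intermediate event carrying no content
    Event(Content([Part("is a subfield of ML.")])),
]
print(extract_text(events))  # Deep learning is a subfield of ML.
```

The same helper drops straight into the async version by switching the loop to `async for`.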

Programmatic Approach for Custom UI or Integration

When building a custom frontend application (React, Vue, Angular, etc.), we can create a backend API using FastAPI or Flask that integrates with ADK. This allows us to have full control over the API design while leveraging ADK's agent capabilities.

Setting Up the Backend

First, install the required dependencies:

pip install fastapi uvicorn python-dotenv

Create a new file called main.py (or app.py) with the following code:

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Optional
import os
from google.genai import types
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from dotenv import load_dotenv

load_dotenv()

app = FastAPI()

# Enable CORS for frontend
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Configure appropriately for production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Initialize session service and runner
session_service = InMemorySessionService()
agent = Agent(
    model=os.getenv("GEMINI_MODEL", "gemini-2.5-flash"),
    name='root_agent',
    description='A helpful assistant for user questions.',
    instruction='Answer user questions to the best of your knowledge',
)

runner = Runner(
    app_name='custom_frontend_app',
    agent=agent,
    session_service=session_service,
)

# Request/Response models
class ChatRequest(BaseModel):
    message: str
    session_id: Optional[str] = None
    user_id: str = "default_user"

class ChatResponse(BaseModel):
    response: str
    session_id: str

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    try:
        # Create or retrieve session
        if request.session_id:
            session = await session_service.get_session(
                app_name='custom_frontend_app',
                user_id=request.user_id,
                session_id=request.session_id
            )
            if not session:
                raise HTTPException(status_code=404, detail="Session not found")
        else:
            session = await session_service.create_session(
                state={},
                app_name='custom_frontend_app',
                user_id=request.user_id
            )
        
        # Format user input
        content = types.Content(
            role='user',
            parts=[types.Part(text=request.message)]
        )
        
        # Run agent
        events_async = runner.run_async(
            session_id=session.id,
            user_id=session.user_id,
            new_message=content
        )
        
        # Collect response text
        response_text = ""
        async for event in events_async:
            if event.content and event.content.parts:
                for part in event.content.parts:
                    if part.text:
                        response_text += part.text
        
        return ChatResponse(
            response=response_text,
            session_id=session.id
        )
    
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/health")
async def health():
    return {"status": "healthy"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8081)

Running the Server

Set your API key and run the FastAPI server:

# Set API key
export GOOGLE_API_KEY=your_api_key_here

# Run the server
uvicorn main:app --reload --port 8081

The server will start on http://localhost:8081. We can verify it's running by visiting:

  • http://localhost:8081/health - Health check endpoint
  • http://localhost:8081/docs - Interactive API documentation (Swagger UI)

Testing the API

We can test the /chat endpoint using curl or any HTTP client:

curl -X POST "http://localhost:8081/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What are Transformers in AI?",
    "user_id": "test_user"
  }'

This approach gives us:

  • Full API control: Design endpoints that match our frontend needs
  • Session management: Maintain conversation context across requests
  • Custom authentication: Integrate with our existing auth system
  • Flexible frontend: Use any frontend framework or technology
  • Scalability: Deploy the backend independently and scale as needed
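
On the frontend side, session management amounts to echoing the session_id from each response back in the next request. A minimal sketch of that payload handling, assuming the /chat contract above (build_chat_payload and the sample values are illustrative):

```python
import json

def build_chat_payload(message, session_id=None, user_id="default_user"):
    """Build the JSON body for POST /chat; session_id is omitted on the first turn."""
    body = {"message": message, "user_id": user_id}
    if session_id is not None:
        body["session_id"] = session_id
    return json.dumps(body)

# First turn: no session_id, so the server creates a new session
first = json.loads(build_chat_payload("What are Transformers in AI?"))
assert "session_id" not in first

# The response carries a session_id; echo it back to keep conversation context
reply = {"response": "Transformers are ...", "session_id": "abc123"}
follow_up = json.loads(build_chat_payload("Summarize that in one line", reply["session_id"]))
print(follow_up["session_id"])  # abc123
```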

4. Choosing the Right Method

Here's a quick guide to help us choose the best method for our needs:

  • CLI: Best for quick testing, debugging, and automation scripts
  • Web Interface: Ideal for prototyping, demos, and non-technical users
  • Programmatic: Perfect for production applications, integrations, and custom workflows

We can also combine methods—for example, using the Web interface for development and the Programmatic approach for custom UI or integration.

Next Steps

Now that we know how to set up and run agents, let's explore what we can build:

Resources
