4. Building LLM Agents

This blog is part of the ADK Masterclass - Hands-On Series. In this tutorial, we'll dive into the core concept of LLM Agents, autonomous systems that use Large Language Models as their reasoning engine.

Unlike traditional software where logic is hard-coded, LLM Agents can dynamically reason about a task, decide which steps to take, and determine when a goal has been achieved. This flexibility makes them highly effective for complex, unstructured problems.


1. What is an LLM Agent?

An LLM Agent utilizes a Large Language Model (like Gemini) as its "brain." Instead of following a rigid if-then script, the agent:

  • Receives an input: A user query or an event.
  • Reasons: Uses the LLM to understand the intent and context.
  • Plans: Decides on a sequence of actions to fulfill the request.
  • Acts: Generates a response or calls tools (which we'll cover in later modules).

graph LR
    In[User Query] --> LLM
    subgraph LLM [LLM Brain]
        direction TB
        Re[Reason] --> Pl[Plan]
    end
    LLM --> Ac[Act: Tool / Response]
    style In fill:#e3f2fd,stroke:#0288d1,stroke-width:2px
    style LLM fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style Re fill:#ffffff,stroke:#7b1fa2,stroke-width:1px
    style Pl fill:#ffffff,stroke:#7b1fa2,stroke-width:1px
    style Ac fill:#fff3e0,stroke:#f57c00,stroke-width:2px

This architecture allows LLM Agents to handle ambiguity and adapt to user needs on the fly, making them suitable for chatbots, research assistants, and dynamic problem solvers.
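To make the loop above concrete, here is a toy sketch of the receive → reason → plan → act cycle in plain Python. The helper functions are hypothetical and are not part of ADK; in a real agent, the "reason" and "plan" steps are delegated to the LLM rather than hard-coded.

```python
def reason(query: str) -> str:
    # Toy intent detection standing in for the LLM's understanding step.
    return "lookup" if "what" in query.lower() else "chat"

def plan(intent: str) -> list[str]:
    # Decide a sequence of actions for the detected intent.
    return ["search", "summarize"] if intent == "lookup" else ["respond"]

def act(steps: list[str], query: str) -> str:
    # Execute the plan; here each step is just recorded.
    return f"Executed {steps} for: {query}"

query = "What is the PE ratio of GOOG?"
print(act(plan(reason(query)), query))
```

The key difference from this sketch: an LLM agent's `reason` and `plan` are not branches we wrote, but inferences the model makes from its instruction and the input.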

2. Defining an Agent

In ADK, the core abstraction is the Agent class. When we define an agent, we are essentially configuring its "brain" and its "personality." An agent isn't just a model; it's a model wrapped with specific directives and context.

Instructions and Persona

The most critical part of an LLM Agent is its instruction (often called a system prompt). This is the invariant set of rules that the agent must follow throughout its lifecycle. It defines:

  • Identity: "You are a helpful coding assistant named CodeBot."
  • Mission: "Your goal is to help users write clean, efficient Python code."
  • Constraints: "Answer only in Python. Do not provide explanations unless asked."
  • Tone/Style: "Be concise, professional, and encouraging."

A well-crafted instruction is the difference between a generic chatbot and a specialized, reliable agent.
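One practical pattern is to assemble the instruction from those four components explicitly, so each part can be reviewed and tuned on its own. This is just an illustrative sketch using the example strings above, not an ADK requirement:

```python
# Build the system instruction from its four components.
identity = "You are a helpful coding assistant named CodeBot."
mission = "Your goal is to help users write clean, efficient Python code."
constraints = "Answer only in Python. Do not provide explanations unless asked."
tone = "Be concise, professional, and encouraging."

instruction = " ".join([identity, mission, constraints, tone])
print(instruction)
```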

Model Configuration

We also need to specify which model powers the agent. ADK supports various models, but we typically use Google's Gemini models:

  • gemini-2.5-flash: Fast and cost-effective, ideal for simple tasks or high-throughput agents.
  • gemini-2.5-pro: More capable reasoning, better for complex analysis or creative writing.

We select the model based on the complexity of the agent's assigned task.
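That selection can even be made in code. The routing rule below is a hypothetical helper, not part of ADK, but it captures the trade-off: flash for simple or high-throughput work, pro for complex reasoning.

```python
def pick_model(task_complexity: str) -> str:
    # Route complex tasks to the stronger model, everything else to flash.
    return "gemini-2.5-pro" if task_complexity == "complex" else "gemini-2.5-flash"

print(pick_model("simple"))   # high-throughput default
print(pick_model("complex"))  # deeper reasoning
```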

3. Tutorial

Let's build a practical agent designed to act as a junior financial analyst. This demonstrates how instructions shape the agent's reasoning and output for business use cases.

Prerequisites: a working Python 3 installation and a Google API key.

Step 1: Setup Environment

First, set up your Python environment and install the necessary packages.

python3 -m venv .venv
source .venv/bin/activate
pip install google-adk python-dotenv

# Set your API Key
export GOOGLE_API_KEY=your_api_key_here
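Since the script below loads variables with `python-dotenv`, you can alternatively keep them in a `.env` file in the project root instead of exporting them in your shell:

```shell
# .env (picked up by load_dotenv() in the script)
GOOGLE_API_KEY=your_api_key_here
GEMINI_MODEL=gemini-2.5-flash
```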

Step 2: Initialize Project

Create a new agent project using the ADK CLI:

adk create financial_analyst
cd financial_analyst

Step 3: Define the Agent

We'll create a file named financial_analyst.py:

import asyncio
import os
from google.genai import types
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from dotenv import load_dotenv

# Load API keys
load_dotenv()

async def main():
    # 1. Define the Agent
    # We give it a specific persona via the 'instruction' parameter
    analyst_agent = Agent(
        model=os.getenv("GEMINI_MODEL", "gemini-2.5-flash"),
        name="fin_analyst",
        description="A professional financial analyst who interprets market data.",
        instruction=(
            "You are a Junior Financial Analyst at a major investment bank. "
            "Your job is to interpret financial data, 10-K filings, and earnings reports for clients. "
            "When asked about a stock (e.g., share price, PE ratio), provide a concise summary. "
            "If asked about earnings or 10-K risks, highlight the top 3 key takeaways. "
            "Maintain a professional, objective tone and always include a standard disclaimer that this is not investment advice."
        )
    )
    
    # 2. Set up the runtime environment
    session_service = InMemorySessionService()
    runner = Runner(
        app_name="finance_app",
        agent=analyst_agent,
        session_service=session_service
    )
    
    # 3. Create a session and run
    session = await session_service.create_session(
        state={}, 
        app_name="finance_app", 
        user_id="client_01"
    )
    
    # The client asks about a company's risks
    query = "What are the key risk factors mentioned in Google's latest 10-K filing?"
    
    print(f"Client: {query}\n")
    print("Analyst is analyzing...\n")
    
    # Format the input
    content = types.Content(
        role="user", 
        parts=[types.Part(text=query)]
    )
    
    # Run the agent
    events = runner.run_async(
        session_id=session.id,
        user_id=session.user_id,
        new_message=content
    )
    
    # Stream the response
    async for event in events:
        if event.content and event.content.parts:
            for part in event.content.parts:
                if part.text:
                    print(part.text, end="", flush=True)
    print("\n")

if __name__ == "__main__":
    asyncio.run(main())

Step 4: Run the Agent

To run this, ensure your environment variables are set, then execute:

python financial_analyst.py

4. How It Works

When we run this code, the following happens:

graph LR
    A[User Query] -->|Input| B[Runner]
    B -->|Context + Query| C[Agent]
    C -->|Instruction + LLM| D{Reasoning}
    D -->|Output| E[Response]
    style A fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    style B fill:#fff9c4,stroke:#fbc02d,stroke-width:2px
    style C fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
    style D fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style E fill:#ffebee,stroke:#c62828,stroke-width:2px

  1. Initialization: The Agent is initialized with the system instruction defining the "Financial Analyst" persona.
  2. Input Processing: The client's query ("What are the key risk factors...") is sent to the Runner.
  3. Reasoning Loop: The Agent processes the input. Because of its instructions, it knows to look for risks in a 10-K context (using its internal knowledge base) and format the output as "top 3 key takeaways" with a professional tone.
  4. Generation: The model generates a structured, professional response and appends the required disclaimer.

This illustrates the fundamental nature of an LLM Agent: it's a software entity whose behavior is programmed via natural language instructions rather than explicit code logic.
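The contrast can be seen side by side. In the sketch below, the first function is explicit code logic; the second expresses the same policy as an instruction the LLM interprets. The helper names are hypothetical, and the "LLM call" is omitted; only the prompt assembly is shown.

```python
# Explicit code logic: every behavior is a branch we wrote.
def hardcoded_analyst(query: str) -> str:
    if "10-K" in query:
        return "Top 3 risks: ..."
    return "I can only answer 10-K questions."

# Instruction-driven: the behavior lives in a prompt the model interprets.
INSTRUCTION = (
    "You are a Junior Financial Analyst. Highlight the top 3 takeaways "
    "and always include a disclaimer that this is not investment advice."
)

def build_prompt(query: str) -> str:
    # A real agent would send this combined prompt to the LLM.
    return f"{INSTRUCTION}\n\nUser: {query}"

print(build_prompt("What are Google's 10-K risks?"))
```

The hard-coded version can only do what its branches anticipate; the instruction-driven version generalizes to phrasings and requests the author never enumerated.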

Next Steps

Now that we've built our first reasoning agent, we're ready to explore more advanced patterns:

  • Workflow Agents: Chain multiple steps together (Sequential, Loop, Parallel) for deterministic control.
  • Multi-Agent Systems: Compose teams of specialized agents to solve complex problems.
