9. Built-in Tools - Vertex AI Search

This blog is part of the ADK Masterclass - Hands-On Series. In this tutorial, we'll explore the Vertex AI Search built-in tool, which enables our agent to search across unstructured enterprise data.

Vertex AI Search (formerly Gen App Builder) is a Google Cloud service that lets us build search engines for our websites and document repositories. By integrating it with ADK, our agents can answer questions grounded in our specific organizational knowledge.

View Code on GitHub

1. What is Vertex AI Search?

```mermaid
graph TD
    subgraph ADK ["ADK Agent"]
        User[User Query]
        LLM[LLM Generation]
        Response["Answer with Citations"]
    end
    subgraph ManagedService ["Vertex AI Search - Fully Managed"]
        Sources["Data Sources<br/>(Web, GCS, BigQuery)"] --> Indexer[Indexer]
        Indexer --> Index[("Search Index")]
        Index --> Retrieval["Retrieval Engine"]
    end
    User --> Retrieval
    Retrieval -->|"Context + Grounding"| LLM
    LLM --> Response
    style ADK fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
    style ManagedService fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,stroke-dasharray: 5 5
    style Index fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
```

Vertex AI Search (formerly Enterprise Search on Gen App Builder) lets us build Google-quality search apps on our own data. It combines deep information retrieval with state-of-the-art natural language processing to understand user intent.

It supports indexing and searching across:

  • Unstructured Data: PDFs, HTML, Docs in Cloud Storage.
  • Websites: Crawling and indexing public or private sites.
  • Structured Data: BigQuery tables.

The VertexAiSearchTool in ADK allows our agent to query these data stores naturally.

Key benefits include:

  • Fully Managed RAG: Handles ingestion, embedding, indexing, and retrieval automatically.
  • Semantic Search: Understands synonyms, misspellings, and natural-language queries out of the box.
  • Grounding & Attribution: Provides citations to reduce hallucinations and increase trust.
  • Multi-Modal: Capable of searching across text, images, and structured data.

2. Tutorial

Building an Enterprise Search Agent

We will build an agent that can answer questions from a document repository stored in Vertex AI Search.

Prerequisites

  • Python 3.11 or higher
  • Google Cloud Project with "Vertex AI Search and Conversation API" enabled
  • gcloud CLI installed and authenticated

Step 1: Create the Project

Initialize a new agent project:

adk create vertex_search_agent
cd vertex_search_agent

Install dependencies:

pip install google-adk

Step 2: Create Data Store

We need a Data Store in Google Cloud. The easiest way is via the console:

  1. Go to the Vertex AI Search & Conversation console.
  2. Click New App -> Search.
  3. Choose Generic (or specific type if applicable).
  4. Create a Data Store (e.g., Cloud Storage) and upload your docs.
  5. Once created, find your Data Store ID in the Data Stores list. It will look like `projects/.../locations/.../collections/.../dataStores/YOUR_DATA_STORE_ID`.
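A mistyped resource name is the most common cause of a failing search tool, so it can help to assemble the full Data Store path from its parts and sanity-check its shape before wiring it into the agent. This is an illustrative sketch; `my-project` and `my-docs-store` are placeholder values:

```python
# Sketch: build the full Data Store resource name from its components and
# verify it matches the expected shape before passing it to the agent.
import re

def build_data_store_path(project_id: str, data_store_id: str,
                          location: str = "global",
                          collection: str = "default_collection") -> str:
    path = (f"projects/{project_id}/locations/{location}"
            f"/collections/{collection}/dataStores/{data_store_id}")
    # Basic sanity check on the resource-name shape.
    pattern = r"^projects/[^/]+/locations/[^/]+/collections/[^/]+/dataStores/[^/]+$"
    if not re.match(pattern, path):
        raise ValueError(f"Malformed data store path: {path}")
    return path

print(build_data_store_path("my-project", "my-docs-store"))
# projects/my-project/locations/global/collections/default_collection/dataStores/my-docs-store
```

Catching a malformed path at startup is much faster to debug than a permission or not-found error surfacing mid-conversation.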

Step 3: Configure the Agent

Edit agent.py to use the VertexAiSearchTool:

from google.adk.agents import LlmAgent
from google.adk.tools import VertexAiSearchTool

# Replace with your actual Data Store ID
# Format: projects/<PROJECT_ID>/locations/<REGION>/collections/default_collection/dataStores/<DATASTORE_ID>
DATA_STORE_ID = "projects/YOUR_PROJECT_ID/locations/global/collections/default_collection/dataStores/YOUR_DATA_STORE_ID"

search_tool = VertexAiSearchTool(data_store_id=DATA_STORE_ID)

root_agent = LlmAgent(
    model="gemini-2.5-flash",
    name="enterprise_search_agent",
    instruction="You are a helpful assistant. Use the search tool to find information in the company documents.",
    tools=[search_tool],
)

Configure .env for our project:

GOOGLE_GENAI_USE_VERTEXAI=True
GOOGLE_CLOUD_PROJECT="your-project-id"
GOOGLE_CLOUD_LOCATION="us-central1"
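Before launching, it can be worth a quick pre-flight check that the variables from .env are actually set in the environment. A minimal sketch (the variable names match the .env above; the check itself is our addition, not part of ADK):

```python
# Sketch: verify the required configuration variables are present and
# non-empty before starting the agent.
import os

REQUIRED = ("GOOGLE_GENAI_USE_VERTEXAI", "GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")

def check_env() -> list[str]:
    """Return the names of any required variables that are unset or empty."""
    return [name for name in REQUIRED if not os.environ.get(name)]

missing = check_env()
if missing:
    print("Missing configuration:", ", ".join(missing))
```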

Step 4: Run the Agent

Authenticate:

gcloud auth application-default login

Run:

adk web

3. Understanding Grounding

Unlike standard LLM responses which rely solely on training data (which might be outdated), Vertex AI Search "grounds" the model in our own data. This means the agent retrieves facts from our Data Store to construct the answer, significantly reducing hallucinations.

How it Works

When a user asks a question, the following process occurs:

  1. Retrieval: The agent searches our Vertex AI Data Store for relevant document chunks.
  2. Context Injection: These chunks are fed into the model as context.
  3. Generation: The model generates an answer based only on the provided context.
  4. Attribution: The response includes metadata linking the answer back to the specific source documents.
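The four steps above can be sketched in miniature. This is a toy simulation in pure Python, where a hard-coded list and naive keyword matching stand in for the real Data Store and semantic retrieval, and string concatenation stands in for the model call:

```python
# Miniature sketch of the retrieve -> inject -> generate -> attribute loop.
# The "data store" and "model" here are stand-ins; Vertex AI Search performs
# the real retrieval and Gemini the real generation.
DATA_STORE = [
    {"title": "HR Handbook", "uri": "gs://docs/hr.pdf",
     "text": "Employees accrue 20 vacation days per year."},
    {"title": "IT Policy", "uri": "gs://docs/it.pdf",
     "text": "Laptops are refreshed every three years."},
]

def retrieve(query: str) -> list[dict]:
    # 1. Retrieval: naive keyword match standing in for semantic search.
    terms = query.lower().split()
    return [d for d in DATA_STORE if any(t in d["text"].lower() for t in terms)]

def answer(query: str) -> dict:
    chunks = retrieve(query)
    # 2. Context injection: retrieved chunks become the model's context.
    context = "\n".join(c["text"] for c in chunks)
    # 3. Generation (simulated): answer only from the provided context.
    text = context if context else "I don't know."
    # 4. Attribution: link the answer back to its sources.
    sources = [{"title": c["title"], "uri": c["uri"]} for c in chunks]
    return {"answer": text, "sources": sources}

print(answer("How many vacation days do employees get?"))
```

Note that when retrieval returns nothing, the sketch answers "I don't know." rather than inventing something: that refusal-over-fabrication behavior is exactly what grounding buys us in the real system.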

Grounding Metadata

The agent's response object contains a groundingMetadata field. This is crucial for verifying the answer:

  • groundingChunks: A list of the actual source documents (titles, URIs) used.
  • groundingSupports: A mapping that connects specific sentences in the generated answer to the chunks, serving as proof or citations.
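To make the relationship between the two fields concrete, here is a sketch that renders citation lines from metadata shaped like the JSON above. The field names follow the camelCase JSON casing; the document titles, URIs, and answer text are invented for illustration:

```python
# Sketch: turning groundingMetadata into human-readable citations.
# Each support entry ties a span of the answer to chunk indices.
metadata = {
    "groundingChunks": [
        {"retrievedContext": {"title": "HR Handbook", "uri": "gs://docs/hr.pdf"}},
        {"retrievedContext": {"title": "IT Policy", "uri": "gs://docs/it.pdf"}},
    ],
    "groundingSupports": [
        {"segment": {"text": "You get 20 vacation days."},
         "groundingChunkIndices": [0]},
    ],
}

def citations(meta: dict) -> list[str]:
    """Render 'sentence -- source' citation lines from grounding metadata."""
    lines = []
    for support in meta["groundingSupports"]:
        for i in support["groundingChunkIndices"]:
            doc = meta["groundingChunks"][i]["retrievedContext"]
            lines.append(f'{support["segment"]["text"]} -- {doc["title"]} ({doc["uri"]})')
    return lines

for line in citations(metadata):
    print(line)
```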

We can access this metadata to display citations in our application:

# Example: checking sources in the response event.
# For Vertex AI Search results, each grounding chunk carries a
# retrieved_context with the source document's title and URI.
if event.grounding_metadata:
    chunks = event.grounding_metadata.grounding_chunks or []
    print(f"Found {len(chunks)} sources.")

    for chunk in chunks:
        ctx = chunk.retrieved_context
        if ctx:
            print(f"Source: {ctx.title} ({ctx.uri})")

Next Steps

Now that we've mastered built-in tools, the next tutorial covers building custom tools of our own.
