# Tool Selection Evaluator

### Getting Started
This sample demonstrates how to use the Tool Selection evaluator on agent data. The supported input formats include:
- simple data such as strings and `dict` objects describing tool calls;
- user-agent conversations in the form of a list of agent messages.

Before you begin:
```bash
pip install azure-ai-evaluation
```
Set these environment variables with your own values:
1) **MODEL_DEPLOYMENT_NAME** - The deployment name of the model for this AI-assisted evaluator, as found under the "Name" column in the "Models + endpoints" tab in your Azure AI Foundry project.
2) **AZURE_OPENAI_ENDPOINT** - The Azure OpenAI endpoint to be used for evaluation.
3) **AZURE_OPENAI_API_KEY** - The Azure OpenAI API key to be used for evaluation.
4) **AZURE_OPENAI_API_VERSION** - The Azure OpenAI API version to be used for evaluation.
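
Before running the cells below, it can help to confirm that all four variables are set. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and not part of `azure-ai-evaluation`.

```python
import os

# The four variables listed above.
REQUIRED_VARS = (
    "MODEL_DEPLOYMENT_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
)


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for this sample; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these variables before running the samples:", ", ".join(missing))
```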


The Tool Selection evaluator assesses the appropriateness and efficiency of tool choices made by an AI agent by examining:
- Relevance of selected tools to the conversation
- Completeness of tool selection according to task requirements
- Efficiency in avoiding unnecessary or redundant tools

The evaluator uses a binary scoring system:

- Score 0 (Fail): The selected tools are irrelevant or incorrect, or essential tools are missing
- Score 1 (Pass): All needed tools are selected, even if some redundant tools are also selected

This evaluation focuses on measuring whether the right tools were chosen for the task, regardless of how those tools were executed or their parameter correctness.
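
To make the pass/fail rubric concrete, here is a toy sketch of the scoring rule expressed as set comparisons. It is not the evaluator's implementation: the real evaluator is AI-assisted and judges relevance from the conversation, whereas this hypothetical function (`toy_tool_selection_score`) assumes the set of needed tools is already known.

```python
def toy_tool_selection_score(selected, needed, available):
    """Toy stand-in for the binary rubric; NOT the evaluator's actual logic.

    selected:  names of tools the agent called
    needed:    names of tools the task requires (assumed known here)
    available: names of tools present in the tool definitions
    """
    selected, needed, available = set(selected), set(needed), set(available)
    if not selected <= available:
        return 0  # a call to a tool that was never defined is incorrect
    if not needed <= selected:
        return 0  # an essential tool is missing
    return 1  # everything needed was selected; redundant extras are tolerated


available = ["fetch_weather", "send_email", "get_calendar"]
print(toy_tool_selection_score(["fetch_weather"], ["fetch_weather"], available))
print(toy_tool_selection_score(["get_calendar"], ["fetch_weather"], available))
```

Note that arguments never enter the comparison, which mirrors the point above: only the choice of tools is scored, not how they were called.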

Tool Selection requires the following inputs:
- Query - The original task request from the user. This can be a single query string or a list of messages (the conversation history with the agent).
- Tool Calls - (Optional) Tool call(s) made by the agent to answer the query. If not provided, the evaluator looks for tool calls in the response.
- Response - (Optional) Response from the agent (or any GenAI app). This can be a single text response or a list of messages generated as part of the agent response. It is the source of tool calls when Tool Calls is not provided.
- Tool Definitions - Definition(s) of the tools available to the agent. Required so the evaluator can understand the available tools and their purposes.
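
When tool calls are embedded in the response, they appear as `tool_call` items inside assistant messages. The sketch below shows one way such items could be collected from a message list in the format used by the samples in this notebook. The helper `extract_tool_calls` is hypothetical; the evaluator's own parsing is internal and may differ.

```python
def extract_tool_calls(messages):
    """Collect tool_call items from assistant messages, in order.

    Hypothetical helper for illustration; not the evaluator's parser.
    """
    calls = []
    for message in messages:
        if message.get("role") != "assistant":
            continue  # tool results and user turns carry no tool calls
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content has no structured items
        calls.extend(
            item
            for item in content
            if isinstance(item, dict) and item.get("type") == "tool_call"
        )
    return calls


messages = [
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_call",
                "tool_call_id": "call_1",
                "name": "fetch_weather",
                "arguments": {"location": "Seattle"},
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "call_1",
        "content": [{"type": "tool_result", "tool_result": {"weather": "Rainy"}}],
    },
    {"role": "assistant", "content": [{"type": "text", "text": "It is rainy."}]},
]

calls = extract_tool_calls(messages)
print([call["name"] for call in calls])
```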


### Initialize Tool Selection Evaluator


In [None]:
import os
from azure.ai.evaluation._evaluators._tool_selection import _ToolSelectionEvaluator
from azure.ai.evaluation import AzureOpenAIModelConfiguration
from pprint import pprint

model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    azure_deployment=os.environ["MODEL_DEPLOYMENT_NAME"],
)


tool_selection = _ToolSelectionEvaluator(model_config=model_config)

### Samples

#### Evaluating Single Tool Selection

In [None]:
query = "How is the weather in Seattle?"
tool_call = {
    "type": "tool_call",
    "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
    "name": "fetch_weather",
    "arguments": {"location": "Seattle"},
}

tool_definitions = [
    {
        "id": "fetch_weather",
        "name": "fetch_weather",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string", "description": "The location to fetch weather for."}},
        },
    },
    {
        "id": "send_email",
        "name": "send_email",
        "description": "Sends an email with the specified subject and body to the recipient.",
        "parameters": {
            "type": "object",
            "properties": {
                "recipient": {"type": "string", "description": "Email address of the recipient."},
                "subject": {"type": "string", "description": "Subject of the email."},
                "body": {"type": "string", "description": "Body content of the email."},
            },
        },
    },
]

response = tool_selection(query=query, tool_calls=tool_call, tool_definitions=tool_definitions)
pprint(response)

#### Multiple Tool Selections for Complex Task

In [None]:
query = "Can you send me an email with weather information for Seattle?"
tool_calls = [
    {
        "type": "tool_call",
        "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
        "name": "fetch_weather",
        "arguments": {"location": "Seattle"},
    },
    {
        "type": "tool_call",
        "tool_call_id": "call_iq9RuPxqzykebvACgX8pqRW2",
        "name": "send_email",
        "arguments": {
            "recipient": "user@example.com",
            "subject": "Weather Information for Seattle",
            "body": "Weather data will be included here.",
        },
    },
]

tool_definitions = [
    {
        "id": "fetch_weather",
        "name": "fetch_weather",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string", "description": "The location to fetch weather for."}},
        },
    },
    {
        "id": "send_email",
        "name": "send_email",
        "description": "Sends an email with the specified subject and body to the recipient.",
        "parameters": {
            "type": "object",
            "properties": {
                "recipient": {"type": "string", "description": "Email address of the recipient."},
                "subject": {"type": "string", "description": "Subject of the email."},
                "body": {"type": "string", "description": "Body content of the email."},
            },
        },
    },
    {
        "id": "get_calendar",
        "name": "get_calendar",
        "description": "Retrieves calendar events for a specified date range.",
        "parameters": {
            "type": "object",
            "properties": {
                "start_date": {"type": "string", "description": "Start date for calendar events."},
                "end_date": {"type": "string", "description": "End date for calendar events."},
            },
        },
    },
]

response = tool_selection(query=query, tool_calls=tool_calls, tool_definitions=tool_definitions)
pprint(response)

#### Tool Calls passed as part of `Response` (common for agent case)

In [None]:
query = "Can you send me an email with weather information for Seattle?"
response = [
    {
        "createdAt": "2025-03-26T17:27:35Z",
        "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
        "role": "assistant",
        "content": [
            {
                "type": "tool_call",
                "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
                "name": "fetch_weather",
                "arguments": {"location": "Seattle"},
            }
        ],
    },
    {
        "createdAt": "2025-03-26T17:27:37Z",
        "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
        "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
        "role": "tool",
        "content": [{"type": "tool_result", "tool_result": {"weather": "Rainy, 14°C"}}],
    },
    {
        "createdAt": "2025-03-26T17:27:38Z",
        "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
        "role": "assistant",
        "content": [
            {
                "type": "tool_call",
                "tool_call_id": "call_iq9RuPxqzykebvACgX8pqRW2",
                "name": "send_email",
                "arguments": {
                    "recipient": "your_email@example.com",
                    "subject": "Weather Information for Seattle",
                    "body": "The current weather in Seattle is rainy with a temperature of 14°C.",
                },
            }
        ],
    },
    {
        "createdAt": "2025-03-26T17:27:41Z",
        "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
        "tool_call_id": "call_iq9RuPxqzykebvACgX8pqRW2",
        "role": "tool",
        "content": [
            {"type": "tool_result", "tool_result": {"message": "Email successfully sent to your_email@example.com."}}
        ],
    },
    {
        "createdAt": "2025-03-26T17:27:42Z",
        "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
        "role": "assistant",
        "content": [
            {
                "type": "text",
                "text": "I have successfully sent you an email with the weather information for Seattle. The current weather is rainy with a temperature of 14°C.",
            }
        ],
    },
]

tool_definitions = [
    {
        "name": "fetch_weather",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string", "description": "The location to fetch weather for."}},
        },
    },
    {
        "name": "send_email",
        "description": "Sends an email with the specified subject and body to the recipient.",
        "parameters": {
            "type": "object",
            "properties": {
                "recipient": {"type": "string", "description": "Email address of the recipient."},
                "subject": {"type": "string", "description": "Subject of the email."},
                "body": {"type": "string", "description": "Body content of the email."},
            },
        },
    },
    {
        "name": "get_calendar",
        "description": "Retrieves calendar events for a specified date range.",
        "parameters": {
            "type": "object",
            "properties": {
                "start_date": {"type": "string", "description": "Start date for calendar events."},
                "end_date": {"type": "string", "description": "End date for calendar events."},
            },
        },
    },
]

result = tool_selection(query=query, response=response, tool_definitions=tool_definitions)
pprint(result)

#### Query as Conversation History (List of Messages)
The evaluator also supports query as a list of messages representing conversation history. This helps determine if the Agent selected appropriate tools based on the conversation context.

In [None]:
# Query as conversation history instead of a single string
query_as_conversation = [
    {
        "role": "system",
        "content": "You are a helpful assistant that can fetch weather information and send emails."
    },
    {
        "role": "user", 
        "content": "Hi, can you check the weather in Seattle for me?"
    },
    {
        "role": "user",
        "content": "Actually, could you also send me an email with that weather information to john@example.com?"
    }
]

tool_calls = [
    {
        "type": "tool_call",
        "tool_call_id": "call_weather_123",
        "name": "fetch_weather",
        "arguments": {"location": "Seattle"},
    },
    {
        "type": "tool_call", 
        "tool_call_id": "call_email_456",
        "name": "send_email",
        "arguments": {
            "recipient": "john@example.com",
            "subject": "Weather Information for Seattle",
            "body": "Here is the weather information you requested."
        },
    },
]

tool_definitions = [
    {
        "name": "fetch_weather",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string", "description": "The location to fetch weather for."}},
        },
    },
    {
        "name": "send_email",
        "description": "Sends an email with the specified subject and body to the recipient.",
        "parameters": {
            "type": "object",
            "properties": {
                "recipient": {"type": "string", "description": "Email address of the recipient."},
                "subject": {"type": "string", "description": "Subject of the email."},
                "body": {"type": "string", "description": "Body content of the email."},
            },
        },
    },
    {
        "name": "get_calendar",
        "description": "Retrieves calendar events for a specified date range.",
        "parameters": {
            "type": "object",
            "properties": {
                "start_date": {"type": "string", "description": "Start date for calendar events."},
                "end_date": {"type": "string", "description": "End date for calendar events."},
            },
        },
    },
]

result = tool_selection(query=query_as_conversation, tool_calls=tool_calls, tool_definitions=tool_definitions)
pprint(result)

#### Example of Poor Tool Selection

In [None]:
query = "How is the weather in Seattle?"
# Using irrelevant tool for the task
poor_tool_calls = [
    {
        "type": "tool_call",
        "tool_call_id": "call_calendar_123",
        "name": "get_calendar",
        "arguments": {"start_date": "2025-01-01", "end_date": "2025-01-31"},
    },
]

# This should fail (score 0) because get_calendar is not relevant to a weather query.
# tool_definitions is reused from the previous cell.
result = tool_selection(query=query, tool_calls=poor_tool_calls, tool_definitions=tool_definitions)
pprint(result)

#### Response as String (str)

In [None]:
query = "What's the weather in New York?"

# Response as a simple string (not commonly used, but supported by the API)
# When using string response, tool_calls should be provided separately
response_str = "The current weather in New York is sunny with a temperature of 22°C."

tool_calls = {
    "type": "tool_call",
    "tool_call_id": "call_ny_weather_789",
    "name": "fetch_weather",
    "arguments": {"location": "New York"},
}

tool_definitions = [
    {
        "name": "fetch_weather",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string", "description": "The location to fetch weather for."}},
        },
    },
    {
        "name": "send_email",
        "description": "Sends an email with the specified subject and body to the recipient.",
        "parameters": {
            "type": "object",
            "properties": {
                "recipient": {"type": "string", "description": "Email address of the recipient."},
                "subject": {"type": "string", "description": "Subject of the email."},
                "body": {"type": "string", "description": "Body content of the email."},
            },
        },
    },
]

result = tool_selection(query=query, response=response_str, tool_calls=tool_calls, tool_definitions=tool_definitions)
pprint(result)

#### Tool Definition as Single Dict

In [None]:
query = "How is the weather in Seattle?"

tool_definition_dict = {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
    }
}

tool_calls_dict = {
    "type": "tool_call",
    "tool_call_id": "call_abc123",
    "name": "get_weather",
    "arguments": {"location": "Seattle, WA", "unit": "celsius"}
}

result = tool_selection(query=query, tool_definitions=tool_definition_dict, tool_calls=tool_calls_dict)

pprint(result)