# Tool Call Accuracy Evaluator

### Getting Started

This sample demonstrates how to use the Tool Call Accuracy evaluator.

Before running the sample, install the required packages:
```bash
pip install azure-ai-projects azure-identity azure-ai-evaluation
```
Set these environment variables with your own values:
1) **PROJECT_CONNECTION_STRING** - The project connection string, as found in the overview page of your Azure AI Foundry project.
2) **MODEL_DEPLOYMENT_NAME** - The deployment name of the AI model, as found under the "Name" column in the "Models + endpoints" tab in your Azure AI Foundry project.
3) **AZURE_OPENAI_ENDPOINT** - The Azure OpenAI endpoint to use for evaluation.
4) **AZURE_OPENAI_API_KEY** - The Azure OpenAI API key to use for evaluation.
5) **AZURE_OPENAI_API_VERSION** - The Azure OpenAI API version to use for evaluation.
6) **AZURE_SUBSCRIPTION_ID** - The Azure subscription ID of the Azure AI project.
7) **PROJECT_NAME** - The Azure AI project name.
8) **RESOURCE_GROUP_NAME** - The resource group name of the Azure AI project.
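
Before creating the evaluator, it can help to confirm that these variables are set. The following is a minimal sketch; the helper `find_missing_env_vars` and the `REQUIRED_ENV_VARS` list are illustrative names introduced here, not part of any Azure SDK.

```python
import os

# Variable names copied from the list above; the helper itself is illustrative.
REQUIRED_ENV_VARS = [
    "PROJECT_CONNECTION_STRING",
    "MODEL_DEPLOYMENT_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_SUBSCRIPTION_ID",
    "PROJECT_NAME",
    "RESOURCE_GROUP_NAME",
]


def find_missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = find_missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Failing early with a readable message is usually easier to debug than a `KeyError` raised from `os.environ[...]` in a later cell.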

The Tool Call Accuracy evaluator assesses how accurately an AI uses tools by examining:
- Relevance to the conversation
- Parameter correctness according to tool definitions
- Parameter value extraction from the conversation
- Potential usefulness of the tool call

The evaluator uses a binary scoring system (0 or 1):
- Score 0: The tool call is irrelevant or contains information not present in the conversation or the tool definition.
- Score 1: The tool call is relevant, with parameters properly extracted from the conversation.

This evaluation measures whether tool calls meaningfully contribute to addressing the query while following the tool definitions and using information present in the conversation history.
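
To make the "parameter correctness" criterion concrete, here is a rough, deterministic pre-check that compares a tool call's argument names against a tool definition. It is not how the evaluator itself scores (that is model-based); the function name `check_tool_call_arguments` is hypothetical, and the `required` key is assumed from JSON Schema conventions and does not appear in the definitions used in this sample.

```python
def check_tool_call_arguments(tool_call, tool_definitions):
    """Return a list of problems found; an empty list means none were detected."""
    definitions = {d["name"]: d for d in tool_definitions}
    definition = definitions.get(tool_call["name"])
    if definition is None:
        return [f"unknown tool: {tool_call['name']}"]

    parameters = definition.get("parameters", {})
    properties = parameters.get("properties", {})
    # "required" is an assumption borrowed from JSON Schema; it defaults to empty here.
    required = parameters.get("required", [])
    arguments = tool_call.get("arguments", {})

    problems = [f"unexpected argument: {name}" for name in arguments if name not in properties]
    problems += [f"missing required argument: {name}" for name in required if name not in arguments]
    return problems


# Illustrative data in the same shape as the samples in this notebook.
example_definitions = [
    {
        "name": "fetch_weather",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
        },
    }
]
example_call = {"type": "tool_call", "name": "fetch_weather", "arguments": {"location": "Seattle"}}
print(check_tool_call_arguments(example_call, example_definitions))
```

A check like this only catches structural mismatches. Whether an argument value is actually grounded in the conversation still needs the evaluator.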

Tool Call Accuracy requires the following inputs:
- Query - A single query or a list of messages (the conversation history with the agent). The latter helps determine whether the agent used information from the history to make the right tool calls.
- Tool Calls - (Optional) The tool call(s) made by the agent to answer the query. If not provided, the evaluator looks for tool calls in the response.
- Response - (Optional) The response from the agent (or any GenAI app). This can be a single text response or a list of messages generated as part of the agent's response. If tool calls are not provided, the evaluator looks for tool calls in the response.
- Tool Definitions - The definition(s) of the tool(s) used by the agent to answer the query.
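
The input rules above can be sketched as a small helper that assembles the keyword arguments and rejects the case where neither tool calls nor a response is supplied. `build_evaluator_inputs` is a hypothetical name used only for illustration, and wrapping a single definition in a list is a choice made in this sketch rather than a requirement stated by the evaluator.

```python
def build_evaluator_inputs(query, tool_definitions, tool_calls=None, response=None):
    """Assemble keyword arguments, requiring tool calls directly or via the response."""
    if tool_calls is None and response is None:
        raise ValueError("Provide tool_calls, response, or both.")

    # Accept a single definition dict as well as a list of definitions.
    if isinstance(tool_definitions, dict):
        tool_definitions = [tool_definitions]

    inputs = {"query": query, "tool_definitions": tool_definitions}
    if tool_calls is not None:
        inputs["tool_calls"] = tool_calls
    if response is not None:
        inputs["response"] = response
    return inputs


inputs = build_evaluator_inputs(
    query="How is the weather in Seattle?",
    tool_definitions={"name": "fetch_weather"},
    tool_calls=[{"type": "tool_call", "name": "fetch_weather", "arguments": {"location": "Seattle"}}],
)
print(sorted(inputs))
```

Once the evaluator has been initialized, the resulting dictionary could be passed as `tool_call_accuracy(**inputs)`, using the same keyword names as the calls in the samples.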


### Initialize Tool Call Accuracy Evaluator


In [None]:
import os
from azure.ai.evaluation import ToolCallAccuracyEvaluator, AzureOpenAIModelConfiguration
from pprint import pprint

model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    azure_deployment=os.environ["MODEL_DEPLOYMENT_NAME"],
)


tool_call_accuracy = ToolCallAccuracyEvaluator(model_config=model_config)

### Samples

#### Evaluating Single Tool Call

In [None]:
query = "How is the weather in Seattle?"

# A single tool call made by the agent.
tool_call = {
    "type": "tool_call",
    "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
    "name": "fetch_weather",
    "arguments": {
        "location": "Seattle"
    }
}

# The definition of the tool that was called.
tool_definition = {
    "id": "fetch_weather",
    "name": "fetch_weather",
    "description": "Fetches the weather information for the specified location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location to fetch weather for."
            }
        }
    }
}

In [None]:
response = tool_call_accuracy(query=query, tool_calls=tool_call, tool_definitions=tool_definition)
pprint(response)

#### Multiple Tool Calls used by Agent to respond

In [None]:
query = "How is the weather in Seattle?"

tool_calls = [
    {
        "type": "tool_call",
        "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
        "name": "fetch_weather",
        "arguments": {
            "location": "Seattle"
        }
    },
    {
        # This call targets London, which the query never mentions.
        "type": "tool_call",
        "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
        "name": "fetch_weather",
        "arguments": {
            "location": "London"
        }
    }
]

tool_definition = {
    "id": "fetch_weather",
    "name": "fetch_weather",
    "description": "Fetches the weather information for the specified location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location to fetch weather for."
            }
        }
    }
}

In [None]:
response = tool_call_accuracy(query=query, tool_calls=tool_calls, tool_definitions=tool_definition)
pprint(response)

#### Tool Calls passed as part of `Response` (common for agent case)
- Tool Call Accuracy Evaluator extracts tool calls from response
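
As a rough illustration of what extracting tool calls from a response involves, the sketch below walks a list of messages and collects the `tool_call` content items from assistant messages. The evaluator's actual extraction logic is internal to `azure-ai-evaluation` and may differ; `extract_tool_calls` and the `call_1` identifier are hypothetical names used only here.

```python
def extract_tool_calls(response):
    """Collect every content item of type "tool_call" from assistant messages."""
    tool_calls = []
    for message in response:
        if message.get("role") != "assistant":
            continue
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-text content carries no tool calls
        for item in content:
            if isinstance(item, dict) and item.get("type") == "tool_call":
                tool_calls.append(item)
    return tool_calls


# Small illustrative message list in the same shape as the sample in this section.
sample_response = [
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_call",
                "tool_call_id": "call_1",  # hypothetical identifier
                "name": "fetch_weather",
                "arguments": {"location": "Seattle"},
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "call_1",
        "content": [{"type": "tool_result", "tool_result": {"weather": "Rainy"}}],
    },
    {"role": "assistant", "content": [{"type": "text", "text": "It is rainy in Seattle."}]},
]

print([call["name"] for call in extract_tool_calls(sample_response)])
```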

In [None]:
query = "Can you send me an email with weather information for Seattle?"
response = [
        {
            "createdAt": "2025-03-26T17:27:35Z",
            "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
            "role": "assistant",
            "content": [
                {
                    "type": "tool_call",
                    "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
                    "name": "fetch_weather",
                    "arguments": {
                        "location": "Seattle"
                    }
                }
            ]
        },
        {
            "createdAt": "2025-03-26T17:27:37Z",
            "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
            "tool_call_id": "call_CUdbkBfvVBla2YP3p24uhElJ",
            "role": "tool",
            "content": [
                {
                    "type": "tool_result",
                    "tool_result": {
                        "weather": "Rainy, 14\u00b0C"
                    }
                }
            ]
        },
        {
            "createdAt": "2025-03-26T17:27:38Z",
            "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
            "role": "assistant",
            "content": [
                {
                    "type": "tool_call",
                    "tool_call_id": "call_iq9RuPxqzykebvACgX8pqRW2",
                    "name": "send_email",
                    "arguments": {
                        "recipient": "your_email@example.com",
                        "subject": "Weather Information for Seattle",
                        "body": "The current weather in Seattle is rainy with a temperature of 14\u00b0C."
                    }
                }
            ]
        },
        {
            "createdAt": "2025-03-26T17:27:41Z",
            "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
            "tool_call_id": "call_iq9RuPxqzykebvACgX8pqRW2",
            "role": "tool",
            "content": [
                {
                    "type": "tool_result",
                    "tool_result": {
                        "message": "Email successfully sent to your_email@example.com."
                    }
                }
            ]
        },
        {
            "createdAt": "2025-03-26T17:27:42Z",
            "run_id": "run_zblZyGCNyx6aOYTadmaqM4QN",
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "I have successfully sent you an email with the weather information for Seattle. The current weather is rainy with a temperature of 14\u00b0C."
                }
            ]
        }
    ]

tool_definitions = [
    {
        "name": "fetch_weather",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The location to fetch weather for."
                }
            }
        }
    },
    {
        "name": "send_email",
        "description": "Sends an email with the specified subject and body to the recipient.",
        "parameters": {
            "type": "object",
            "properties": {
                "recipient": {
                    "type": "string",
                    "description": "Email address of the recipient."
                },
                "subject": {
                    "type": "string",
                    "description": "Subject of the email."
                },
                "body": {
                    "type": "string",
                    "description": "Body content of the email."
                }
            }
        }
    }
]

In [None]:
# Store the result under a new name so the `response` input is not overwritten if the cell is re-run.
result = tool_call_accuracy(query=query, response=response, tool_definitions=tool_definitions)
pprint(result)