# Azure Cognitive Search client library for Python
[Azure Cognitive Search](https://docs.microsoft.com/azure/search/) is a
search-as-a-service cloud solution that gives developers APIs and tools
for adding a rich search experience over private, heterogeneous content
in web, mobile, and enterprise applications.
The Azure Cognitive Search service is well suited for the following
application scenarios:
* Consolidate varied content types into a single searchable index.
To populate an index, you can push JSON documents that contain your content,
or if your data is already in Azure, create an indexer to pull in data
automatically.
* Attach skillsets to an indexer to create searchable content from images
and large text documents. A skillset leverages AI from Cognitive Services
for built-in OCR, entity recognition, key phrase extraction, language
detection, text translation, and sentiment analysis. You can also add
custom skills to integrate external processing of your content during
data ingestion.
* In a search client application, implement query logic and user experiences
similar to commercial web search engines.
Use the Azure.Search.Documents client library to:
* Submit queries for simple and advanced query forms that include fuzzy
search, wildcard search, and regular expressions.
* Implement filtered queries for faceted navigation, geospatial search,
or to narrow results based on filter criteria.
* Create and manage search indexes.
* Upload and update documents in the search index.
* Create and manage indexers that pull data from Azure into an index.
* Create and manage skillsets that add AI enrichment to data ingestion.
* Create and manage analyzers for advanced text analysis or multi-lingual content.
* Optimize results through scoring profiles to factor in business logic or freshness.
[Source code](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/search/azure-search-documents) |
[Package (PyPI)](https://pypi.org/project/azure-search-documents/) |
[API reference documentation](https://azuresdkdocs.blob.core.windows.net/$web/python/azure-search-documents/latest/index.html) |
[Product documentation](https://docs.microsoft.com/azure/search/search-what-is-azure-search) |
[Samples](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/samples)
## _Disclaimer_
_Azure SDK Python packages' support for Python 2.7 ended on 01 January 2022. For more information and questions, please refer to https://github.com/Azure/azure-sdk-for-python/issues/20691_
## Getting started
### Install the package
Install the Azure Cognitive Search client library for Python with [pip](https://pypi.org/project/pip/):
```bash
pip install azure-search-documents
```
### Prerequisites
* Python 3.7 or later is required to use this package.
* You need an [Azure subscription][azure_sub] and an
[Azure Cognitive Search service][search_resource] to use this package.
To create a new search service, you can use the [Azure portal][create_search_service_docs], [Azure PowerShell][create_search_service_ps], or the [Azure CLI][create_search_service_cli].
```Powershell
az search service create --name <mysearch> --resource-group <mysearch-rg> --sku free --location westus
```
See [choosing a pricing tier](https://docs.microsoft.com/azure/search/search-sku-tier)
for more information about available options.
### Authenticate the client
All requests to a search service need an api-key that was generated specifically
for your service. [The api-key is the sole mechanism for authenticating access to
your search service endpoint.](https://docs.microsoft.com/azure/search/search-security-api-keys)
You can obtain your api-key from the
[Azure portal](https://portal.azure.com/) or via the Azure CLI:
```Powershell
az search admin-key show --service-name <mysearch> --resource-group <mysearch-rg>
```
There are two types of keys used to access your search service: **admin**
*(read-write)* and **query** *(read-only)* keys. Restricting access and
operations in client apps is essential to safeguarding the search assets on your
service. Always use a query key rather than an admin key for any query
originating from a client app.
*Note: The example Azure CLI snippet above retrieves an admin key so it's easier
to get started exploring APIs, but it should be managed carefully.*
We can use the api-key to create a new `SearchClient`.
```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
index_name = "nycjobs"
# Get the service endpoint and API key from the environment
endpoint = os.environ["SEARCH_ENDPOINT"]
key = os.environ["SEARCH_API_KEY"]
# Create a client
credential = AzureKeyCredential(key)
client = SearchClient(endpoint=endpoint,
                      index_name=index_name,
                      credential=credential)
```
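The distinction between admin and query keys described above can be enforced in code by choosing the key according to where the code runs. The following is a minimal sketch, not part of the library: the `select_api_key` helper and the `SEARCH_QUERY_KEY` and `SEARCH_ADMIN_KEY` variable names are assumptions made for illustration.

```python
def select_api_key(environ, client_facing=True):
    """Return the key a component should use (hypothetical helper).

    Client-facing code only ever gets the read-only query key; the
    read-write admin key is reserved for trusted back-end jobs.
    """
    # Variable names are assumptions for this sketch, not SDK conventions.
    name = "SEARCH_QUERY_KEY" if client_facing else "SEARCH_ADMIN_KEY"
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError("Set the {} environment variable".format(name)) from None


# A plain dict stands in for os.environ so the sketch runs anywhere.
example_env = {"SEARCH_QUERY_KEY": "query-key", "SEARCH_ADMIN_KEY": "admin-key"}
print(select_api_key(example_env))                       # query-key
print(select_api_key(example_env, client_facing=False))  # admin-key
```

In an application you would pass `os.environ` and wrap the returned value in `AzureKeyCredential` before creating the client.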
## Key concepts
An Azure Cognitive Search service contains one or more indexes that provide
persistent storage of searchable data in the form of JSON documents. _(If
you're brand new to search, you can make a very rough analogy between
indexes and database tables.)_ The Azure.Search.Documents client library
exposes operations on these resources through two main client types.
* `SearchClient` helps with:
* [Searching](https://docs.microsoft.com/azure/search/search-lucene-query-architecture)
your indexed documents using
[rich queries](https://docs.microsoft.com/azure/search/search-query-overview)
and [powerful data shaping](https://docs.microsoft.com/azure/search/search-filters)
* [Autocompleting](https://docs.microsoft.com/rest/api/searchservice/autocomplete)
partially typed search terms based on documents in the index
* [Suggesting](https://docs.microsoft.com/rest/api/searchservice/suggestions)
the most likely matching text in documents as a user types
* [Adding, updating, or deleting](https://docs.microsoft.com/rest/api/searchservice/addupdate-or-delete-documents)
documents in an index
* `SearchIndexClient` allows you to:
* [Create, delete, update, or configure a search index](https://docs.microsoft.com/rest/api/searchservice/index-operations)
* [Declare custom synonym maps to expand or rewrite queries](https://docs.microsoft.com/rest/api/searchservice/synonym-map-operations)
* Most of the `SearchServiceClient` functionality is not yet available in our current preview
* `SearchIndexerClient` allows you to:
* [Start indexers to automatically crawl data sources](https://docs.microsoft.com/rest/api/searchservice/indexer-operations)
* [Define AI powered Skillsets to transform and enrich your data](https://docs.microsoft.com/rest/api/searchservice/skillset-operations)
_The `Azure.Search.Documents` client library (v1) is a brand new offering for
Python developers who want to use search technology in their applications. There
is an older, fully featured `Microsoft.Azure.Search` client library (v10) with
many similar looking APIs, so please be careful to avoid confusion when
exploring online resources._
## Examples
The following examples all use a simple [Hotel data set](https://github.com/Azure-Samples/azure-search-sample-data/blob/master/README.md)
that you can [import into your own index from the Azure portal.](https://docs.microsoft.com/azure/search/search-get-started-portal#step-1---start-the-import-data-wizard-and-create-a-data-source)
These are just a few of the basics - please [check out our Samples](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/samples) for
much more.
* [Querying](#querying)
* [Creating an index](#creating-an-index)
* [Adding documents to your index](#adding-documents-to-your-index)
* [Retrieving a specific document from your index](#retrieving-a-specific-document-from-your-index)
* [Async APIs](#async-apis)
### Querying
Let's start by importing our namespaces.
```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
```
We'll then create a `SearchClient` to access our hotels search index.
```python
index_name = "hotels"
# Get the service endpoint and API key from the environment
endpoint = os.environ["SEARCH_ENDPOINT"]
key = os.environ["SEARCH_API_KEY"]
# Create a client
credential = AzureKeyCredential(key)
client = SearchClient(endpoint=endpoint,
                      index_name=index_name,
                      credential=credential)
```
Let's search for a "luxury" hotel.
```python
results = client.search(search_text="luxury")
for result in results:
    print("{}: {}".format(result["hotelId"], result["hotelName"]))
```
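Queries can also be narrowed with an OData filter and limited to a few fields. The sketch below builds the keyword arguments separately so they can be inspected before being passed to `client.search`. `build_query` is a hypothetical helper, and the `rating` and `Category` field names are assumed to exist and to be filterable in your index.

```python
def build_query(text, min_rating=None, category=None, top=5):
    """Build keyword arguments for SearchClient.search (hypothetical helper)."""
    clauses = []
    if min_rating is not None:
        clauses.append("rating ge {}".format(min_rating))
    if category is not None:
        # OData string literals use single quotes; embedded quotes are doubled.
        clauses.append("Category eq '{}'".format(category.replace("'", "''")))
    query = {"search_text": text, "top": top, "select": ["hotelId", "hotelName"]}
    if clauses:
        query["filter"] = " and ".join(clauses)
    return query


query = build_query("luxury", min_rating=4, category="Resort and Spa")
print(query["filter"])  # rating ge 4 and Category eq 'Resort and Spa'
# With a SearchClient in hand: results = client.search(**query)
```

Keeping the filter construction in one place makes the quoting rule easy to test without calling the service.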
### Creating an index
You can use the `SearchIndexClient` to create a search index. Fields can be
defined using convenient `SimpleField`, `SearchableField`, or `ComplexField`
models. Indexes can also define suggesters, lexical analyzers, and more.
```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    ComplexField,
    CorsOptions,
    SearchIndex,
    ScoringProfile,
    SearchFieldDataType,
    SimpleField,
    SearchableField
)
endpoint = os.environ["SEARCH_ENDPOINT"]
key = os.environ["SEARCH_API_KEY"]
# Create a service client
client = SearchIndexClient(endpoint, AzureKeyCredential(key))
# Create the index
name = "hotels"
fields = [
    SimpleField(name="hotelId", type=SearchFieldDataType.String, key=True),
    SimpleField(name="baseRate", type=SearchFieldDataType.Double),
    SearchableField(name="description", type=SearchFieldDataType.String),
    ComplexField(name="address", fields=[
        SimpleField(name="streetAddress", type=SearchFieldDataType.String),
        SimpleField(name="city", type=SearchFieldDataType.String),
    ])
]
cors_options = CorsOptions(allowed_origins=["*"], max_age_in_seconds=60)
scoring_profiles = []
index = SearchIndex(
    name=name,
    fields=fields,
    scoring_profiles=scoring_profiles,
    cors_options=cors_options)
result = client.create_index(index)
```
### Adding documents to your index
You can `Upload`, `Merge`, `MergeOrUpload`, and `Delete` multiple documents from
an index in a single batched request. There are
[a few special rules for merging](https://docs.microsoft.com/rest/api/searchservice/addupdate-or-delete-documents#document-actions)
to be aware of.
```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
index_name = "hotels"
endpoint = os.environ["SEARCH_ENDPOINT"]
key = os.environ["SEARCH_API_KEY"]
DOCUMENT = {
    'Category': 'Hotel',
    'hotelId': '1000',
    'rating': 4.0,
    'rooms': [],
    'hotelName': 'Azure Inn',
}
search_client = SearchClient(endpoint, index_name, AzureKeyCredential(key))
result = search_client.upload_documents(documents=[DOCUMENT])
print("Upload of new document succeeded: {}".format(result[0].succeeded))
```
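A batched request can succeed for some documents and fail for others, so it is worth checking every result rather than only the first. This sketch assumes each result exposes the `key`, `succeeded`, and `error_message` attributes of the library's `IndexingResult` model. `failed_documents` is a hypothetical helper, and `SimpleNamespace` objects stand in for real results so the snippet runs without a service.

```python
from types import SimpleNamespace


def failed_documents(results):
    """Map the key of each failed document to its error message."""
    return {r.key: r.error_message for r in results if not r.succeeded}


# Stand-ins for the objects returned by upload_documents (illustration only)
sample_results = [
    SimpleNamespace(key="1000", succeeded=True, error_message=None),
    SimpleNamespace(key="1001", succeeded=False, error_message="Invalid document"),
]
for key, message in failed_documents(sample_results).items():
    print("Document {} failed: {}".format(key, message))
```

With a live client, the same helper could be applied to the list returned by `search_client.upload_documents(documents=[...])`.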
### Authenticate in a National Cloud
To authenticate in a [National Cloud](https://docs.microsoft.com/azure/active-directory/develop/authentication-national-cloud), you will need to make the following additions to your client configuration:
- Set the `authority` in the credential options or via the `AZURE_AUTHORITY_HOST` environment variable
- Set the `audience` in `SearchClient`, `SearchIndexClient`, or `SearchIndexerClient`
```python
# Create a SearchClient that will authenticate through AAD in the China national cloud.
import os
from azure.identity import DefaultAzureCredential, AzureAuthorityHosts
from azure.search.documents import SearchClient
index_name = "hotels"
endpoint = os.environ["SEARCH_ENDPOINT"]
key = os.environ["SEARCH_API_KEY"]
credential = DefaultAzureCredential(authority=AzureAuthorityHosts.AZURE_CHINA)
search_client = SearchClient(endpoint, index_name, credential=credential, audience="https://search.azure.cn")
```
### Retrieving a specific document from your index
In addition to querying for documents using keywords and optional filters,
you can retrieve a specific document from your index if you already know the
key. You could get the key from a query, for example, and want to show more
information about it or navigate your customer to that document.
```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
index_name = "hotels"
endpoint = os.environ["SEARCH_ENDPOINT"]
key = os.environ["SEARCH_API_KEY"]
client = SearchClient(endpoint, index_name, AzureKeyCredential(key))
result = client.get_document(key="1")
print("Details for hotel '1' are:")
print(" Name: {}".format(result["HotelName"]))
print(" Rating: {}".format(result["Rating"]))
print(" Category: {}".format(result["Category"]))
```
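If only a few fields are needed, `get_document` also accepts a `selected_fields` list. The following is a minimal sketch under stated assumptions: `describe_hotel` is a hypothetical helper, the field names follow the snippet above, and `FakeClient` is an offline stand-in for `SearchClient` that exists only so the example runs without a service.

```python
def describe_hotel(client, key, fields=("HotelName", "Rating")):
    """Fetch one document and format the requested fields (hypothetical helper)."""
    document = client.get_document(key=key, selected_fields=list(fields))
    return ", ".join("{}={}".format(name, document[name]) for name in fields)


class FakeClient:
    """Offline stand-in for SearchClient, used only to make this sketch runnable."""

    def get_document(self, key, selected_fields=None):
        stored = {"HotelId": key, "HotelName": "Azure Inn", "Rating": 4.0}
        return {name: stored[name] for name in selected_fields or stored}


print(describe_hotel(FakeClient(), "1"))  # HotelName=Azure Inn, Rating=4.0
```

Passing a real `SearchClient` in place of `FakeClient()` would issue the same call against your index.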
### Async APIs
This library includes a complete async API. To use it, you must
first install an async transport, such as [aiohttp](https://pypi.org/project/aiohttp/).
See
[azure-core documentation](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md#transport)
for more information.
```py
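import asyncio
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.aio import SearchClient

index_name = "hotels"
endpoint = os.environ["SEARCH_ENDPOINT"]
key = os.environ["SEARCH_API_KEY"]

async def main():
    client = SearchClient(endpoint, index_name, AzureKeyCredential(key))
    # The async context manager closes the underlying transport on exit
    async with client:
        results = await client.search(search_text="hotel")
        async for result in results:
            print("{}: {}".format(result["hotelId"], result["hotelName"]))

asyncio.run(main())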
```
## Troubleshooting
### General
The Azure Cognitive Search client will raise exceptions defined in [Azure Core][azure_core].
### Logging
This library uses the standard [logging][python_logging] library for logging.
Basic information about HTTP sessions (URLs, headers, etc.) is logged at INFO
level.
Detailed DEBUG level logging, including request/response bodies and unredacted
headers, can be enabled on a client with the `logging_enable` keyword argument:
```python
import sys
import logging
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
# Create a logger for the 'azure' SDK
logger = logging.getLogger('azure')
logger.setLevel(logging.DEBUG)
# Configure a console output
handler = logging.StreamHandler(stream=sys.stdout)
logger.addHandler(handler)
# This client will log detailed information about its HTTP sessions, at DEBUG level
client = SearchClient("<service endpoint>", "<index_name>", AzureKeyCredential("<api key>"), logging_enable=True)
```
Similarly, `logging_enable` can enable detailed logging for a single operation,
even when it isn't enabled for the client:
```python
result = client.search(search_text="spa", logging_enable=True)
```
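Because the configuration above relies on the standard `logging` hierarchy, a handler attached to the `azure` logger also receives records from loggers beneath it. The sketch below demonstrates this with the standard library alone, writing to an in-memory buffer. The child logger name `azure.example` is made up for illustration and is not a logger the SDK is known to use.

```python
import io
import logging

buffer = io.StringIO()
handler = logging.StreamHandler(stream=buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("azure")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

# A record emitted by a child logger propagates up to the 'azure' handler.
logging.getLogger("azure.example").debug("request sent")

captured = buffer.getvalue()
print(captured, end="")

# Detach the handler so later logging is not captured by this sketch.
logger.removeHandler(handler)
```

Swapping `io.StringIO()` for an open file or using `logging.FileHandler` would send the same records to disk instead.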
## Next steps
* Go further with Azure.Search.Documents and our [samples](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/samples)
* Watch a [demo or deep dive video](https://azure.microsoft.com/resources/videos/index/?services=search)
* Read more about the [Azure Cognitive Search service](https://docs.microsoft.com/azure/search/search-what-is-azure-search)
## Contributing
See our [Search CONTRIBUTING.md][search_contrib] for details on building,
testing, and contributing to this library.
This project welcomes contributions and suggestions. Most contributions require
you to agree to a Contributor License Agreement (CLA) declaring that you have
the right to, and actually do, grant us the rights to use your contribution. For
details, visit [cla.microsoft.com][cla].
This project has adopted the [Microsoft Open Source Code of Conduct][code_of_conduct].
For more information see the [Code of Conduct FAQ][coc_faq]
or contact [opencode@microsoft.com][coc_contact] with any
additional questions or comments.

## Related projects
* [Microsoft Azure SDK for Python](https://github.com/Azure/azure-sdk-for-python)
<!-- LINKS -->

[azure_cli]: https://docs.microsoft.com/cli/azure
[azure_core]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md
[azure_sub]: https://azure.microsoft.com/free/
[search_resource]: https://docs.microsoft.com/azure/search/search-create-service-portal
[azure_portal]: https://portal.azure.com
[create_search_service_docs]: https://docs.microsoft.com/azure/search/search-create-service-portal
[create_search_service_ps]: https://docs.microsoft.com/azure/search/search-manage-powershell#create-or-delete-a-service
[create_search_service_cli]: https://docs.microsoft.com/cli/azure/search/service?view=azure-cli-latest#az-search-service-create
[search_contrib]: https://github.com/Azure/azure-sdk-for-python/blob/main/CONTRIBUTING.md
[python_logging]: https://docs.python.org/3.5/library/logging.html
[cla]: https://cla.microsoft.com
[code_of_conduct]: https://opensource.microsoft.com/codeofconduct/
[coc_faq]: https://opensource.microsoft.com/codeofconduct/faq/
[coc_contact]: mailto:opencode@microsoft.com