File: sample_extract_key_phrases.py

# -------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for
# license information.
# --------------------------------------------------------------------------

"""
FILE: sample_extract_key_phrases.py

DESCRIPTION:
    This sample demonstrates how to extract key talking points from a batch of documents.

    In this sample, we want to go over articles and read the ones that mention Microsoft.
    We're going to use the SDK to create a rudimentary search algorithm to find these articles.

USAGE:
    python sample_extract_key_phrases.py

    Set the environment variables with your own values before running the sample:
    1) AZURE_LANGUAGE_ENDPOINT - the endpoint to your Language resource.
    2) AZURE_LANGUAGE_KEY - the subscription key for your Language resource.
"""
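
# A minimal sketch, not part of the original sample. The search in
# sample_extract_key_phrases() tests `"Microsoft" in doc.key_phrases`, which is
# exact, case-sensitive list membership, so a key phrase such as
# "Microsoft campus" would not count as a match. The helper below is
# hypothetical (the name `mentions_term` and its parameters are assumptions
# made for illustration); it does a case-insensitive substring match over the
# key phrases instead, and it does not call the service.
from typing import Iterable


def mentions_term(key_phrases: Iterable[str], term: str) -> bool:
    """Return True if any key phrase contains `term`, ignoring case."""
    needle = term.casefold()
    return any(needle in phrase.casefold() for phrase in key_phrases)


# Possible use inside the loop, under the same assumptions:
#     if mentions_term(doc.key_phrases, "Microsoft"): ...
# The looser match trades precision for recall, which may or may not suit a
# given search.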


def sample_extract_key_phrases() -> None:
    print(
        "In this sample, we want to find the articles that mention Microsoft to read."
    )
    articles_that_mention_microsoft = []
    # [START extract_key_phrases]
    import os
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics import TextAnalyticsClient

    endpoint = os.environ["AZURE_LANGUAGE_ENDPOINT"]
    key = os.environ["AZURE_LANGUAGE_KEY"]

    text_analytics_client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    articles = [
        """
        Washington, D.C. Autumn in DC is a uniquely beautiful season. The leaves fall from the trees
        in a city chock-full of forests, leaving yellow leaves on the ground and a clearer view of the
        blue sky above...
        """,
        """
        Redmond, WA. In the past few days, Microsoft has decided to further postpone the start date of
        its United States workers, due to the pandemic that rages with no end in sight...
        """,
        """
        Redmond, WA. Employees at Microsoft can be excited about the new coffee shop that will open on campus
        once workers no longer have to work remotely...
        """
    ]

    result = text_analytics_client.extract_key_phrases(articles)
    for idx, doc in enumerate(result):
        if not doc.is_error:
            print("Key phrases in article #{}: {}".format(
                idx + 1,
                ", ".join(doc.key_phrases)
            ))
    # [END extract_key_phrases]
            if "Microsoft" in doc.key_phrases:
                articles_that_mention_microsoft.append(str(idx + 1))

    print(
        "The articles that mention Microsoft are articles number: {}. Those are the ones I'm interested in reading.".format(
            ", ".join(articles_that_mention_microsoft)
        )
    )


if __name__ == '__main__':
    sample_extract_key_phrases()