# Guide for migrating to azure-schemaregistry-avroencoder v1.0.x from azure-schemaregistry-avroserializer v1.0.0b4

This guide is intended to assist in the migration to azure-schemaregistry-avroencoder v1.0.x from azure-schemaregistry-avroserializer v1.0.0b4.

It will focus on side-by-side comparisons for similar operations between the two packages.

Familiarity with the `azure-schemaregistry-avroserializer` v1.0.0b4 package is assumed.
For those new to the Schema Registry Avro Encoder library for Python, please refer to the [README for `azure-schemaregistry-avroencoder`](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/schemaregistry/azure-schemaregistry-avroencoder/README.md) rather than this guide.

## Table of contents

* [Migration benefits](#migration-benefits)
  * [New Features](#new-features)
* [Important changes](#important-changes)
  * [The AvroEncoder](#the-avroencoder)
    * [Encode](#encode)
    * [Decode](#decode)
  * [Content type](#content-type)
    * [Backward Compatibility](#backward-compatibility)
  * [MessageType models](#messagetype-models)
    * [Encoding with EventData](#encoding-with-eventdata)
    * [Decoding with EventData](#decoding-with-eventdata)
* [Additional samples](#additional-samples)

## Migration benefits

A natural question to ask when considering whether or not to adopt a new version or library is what
the benefits of doing so would be. As Azure has matured and been embraced by a more diverse group of developers,
we have been focused on learning the patterns and practices to best support developer productivity and
to understand the gaps that the Python client libraries have.

There were several areas of consistent feedback expressed across the Azure client library ecosystem.
One of the most important is that the client libraries for different Azure services have not had a
consistent approach to organization, naming, and API structure. Additionally, many developers have felt
that the learning curve was difficult, and the APIs did not offer a good, approachable,
and consistent onboarding story for those learning Azure or exploring a specific Azure service.

To try and improve the development experience across Azure services,
a set of uniform [design guidelines](https://azure.github.io/azure-sdk/general_introduction.html) was created
for all languages to drive a consistent experience with established API patterns for all services.
A set of [Python-specific guidelines](https://azure.github.io/azure-sdk/python/guidelines/index.html) was also introduced to ensure
that Python clients have a natural and idiomatic feel with respect to the Python ecosystem.
Further details are available in the guidelines for those interested.

### New Features

Version 1.0.x of the Schema Registry Avro Encoder library introduces the following new features.

- `AvroEncoder` sync and async classes provide the functionality to encode and decode content which follows a schema with the RecordSchema format, as defined by the Apache Avro specification. The Apache Avro library is used as the implementation for encoding and decoding.
- Refer to the [CHANGELOG.md](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/schemaregistry/azure-schemaregistry-avroencoder/CHANGELOG.md) for more new features, changes, and bug fixes.

## Important changes

### The AvroEncoder

The `AvroSerializer` has been renamed to `AvroEncoder`, which has the following methods:

* `encode`, which replaces `serialize`
* `decode`, which replaces `deserialize`

#### Encode

The positional parameter `value` has been renamed to `content`. If the `value` parameter name was specified as a keyword when calling `serialize`, it must be updated to `content` when calling `encode`.

Using the `AvroSerializer`:

```python
with serializer:
    dict_data_ben = {"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
    payload_ben = serializer.serialize(value=dict_data_ben, schema=SCHEMA_STRING)
    print(payload_ben)  # prints bytes: b'<record format indicator><schema ID><Avro-encoded data>'
```

Using the `AvroEncoder`:

```python
with encoder:
    dict_content = {"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
    message_content_ben = encoder.encode(content=dict_content, schema=SCHEMA_STRING)
    print(message_content_ben)  # prints {"content": <Avro-encoded content>, "content_type": "<Avro MIME type>+<Schema ID>"}
```

#### Decode

The positional parameter `value` has been renamed to `content`. If the `value` parameter name was specified as a keyword when calling `deserialize`, it must be updated to `content` when calling `decode`.

Using the `AvroSerializer`:

```python
with serializer:
    decoded_value = serializer.deserialize(value=encoded_payload)
```

Using the `AvroEncoder`:

```python
with encoder:
    decoded_content = encoder.decode(content=encoded_content)
```

### Content type

Previously, `serialize` returned Avro-encoded content as bytes, with the record format indicator and the schema ID prepended to it. Now, `encode` returns by default a TypedDict of the form `{"content": <Avro-encoded content>, "content_type": "<Avro MIME type>+<schema ID>"}`.
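
As a rough illustration of the new return shape, the sketch below splits a content type string into its MIME type and schema ID. It assumes the content type looks like `avro/binary+<schema ID>`; the helper name `split_content_type` and the sample values are hypothetical and not part of the library.

```python
from typing import Tuple


def split_content_type(content_type: str) -> Tuple[str, str]:
    """Split '<MIME type>+<schema ID>' into (mime_type, schema_id).

    Hypothetical helper, not part of azure-schemaregistry-avroencoder.
    """
    mime_type, sep, schema_id = content_type.partition("+")
    if not sep or not schema_id:
        raise ValueError(f"no schema ID found in content type: {content_type!r}")
    return mime_type, schema_id


# Stand-in for the dict returned by `encoder.encode(...)`; the values are made up.
message_content = {
    "content": b"\x06Ben\x0e\x06red",
    "content_type": "avro/binary+0123456789abcdef0123456789abcdef",
}

mime_type, schema_id = split_content_type(message_content["content_type"])
print(mime_type)  # avro/binary
print(schema_id)  # 0123456789abcdef0123456789abcdef
```

Because the schema ID now travels in the content type rather than in the payload bytes, the `content` value holds only the Avro-encoded data.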

#### Backward compatibility

To decode content that was encoded (or "serialized") by the `AvroSerializer`, use `azure-schemaregistry-avroencoder` v1.0.0b2.
> Note: This backward compatibility was removed starting with `azure-schemaregistry-avroencoder` v1.0.0b3.

1. Install `azure-schemaregistry-avroencoder` v1.0.0b2, which automatically detects the preamble.

```pwsh
pip install azure-schemaregistry-avroencoder==1.0.0b2
```

2. Receive and decode events from Event Hubs that were "serialized" by the `AvroSerializer`.

```python
import os
from azure.eventhub import EventHubConsumerClient
from azure.schemaregistry import SchemaRegistryClient
from azure.schemaregistry.encoder.avroencoder import AvroEncoder
from azure.identity import DefaultAzureCredential

token_credential = DefaultAzureCredential()
fully_qualified_namespace = os.environ['SCHEMAREGISTRY_FULLY_QUALIFIED_NAMESPACE']
group_name = os.environ['SCHEMAREGISTRY_GROUP']
eventhub_connection_str = os.environ['EVENT_HUB_CONN_STR']
eventhub_name = os.environ['EVENT_HUB_NAME']

schema_registry_client = SchemaRegistryClient(fully_qualified_namespace, token_credential)
avro_encoder = AvroEncoder(client=schema_registry_client, group_name=group_name)

eventhub_consumer = EventHubConsumerClient.from_connection_string(
    conn_str=eventhub_connection_str,
    consumer_group='$Default',
    eventhub_name=eventhub_name,
)

def on_event(partition_context, event):
    decoded_content = avro_encoder.decode(event)

with eventhub_consumer, avro_encoder:
    eventhub_consumer.receive(on_event=on_event, starting_position="-1")
```
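
For readers who cannot pin v1.0.0b2, one possible alternative is to strip the preamble manually and rebuild the dict shape that newer versions work with. The sketch below is only an illustration under assumptions: it assumes the legacy payload consists of a 4-byte zero record format indicator, followed by a 32-character schema ID, followed by the Avro-encoded data, and that the new content type has the form `avro/binary+<schema ID>`. The function name and constants are hypothetical; verify the layout against your own data before relying on it.

```python
from typing import Dict, Union

# Assumed legacy layout (not confirmed by this guide):
#   4-byte record format indicator + 32-character schema ID + Avro data
RECORD_FORMAT_INDICATOR = b"\x00\x00\x00\x00"
SCHEMA_ID_LENGTH = 32
DATA_START = len(RECORD_FORMAT_INDICATOR) + SCHEMA_ID_LENGTH


def legacy_payload_to_message_content(payload: bytes) -> Dict[str, Union[bytes, str]]:
    """Convert an AvroSerializer-style payload into a content/content_type dict.

    Hypothetical helper, not part of any Azure package.
    """
    if len(payload) < DATA_START or not payload.startswith(RECORD_FORMAT_INDICATOR):
        raise ValueError("payload does not start with the expected preamble")
    # The schema ID sits between the format indicator and the Avro data.
    schema_id = payload[len(RECORD_FORMAT_INDICATOR):DATA_START].decode("utf-8")
    return {
        "content": payload[DATA_START:],
        "content_type": f"avro/binary+{schema_id}",
    }


# Made-up payload: preamble, a fake 32-character schema ID, then fake Avro bytes.
legacy = RECORD_FORMAT_INDICATOR + b"a" * SCHEMA_ID_LENGTH + b"\x06Ben"
converted = legacy_payload_to_message_content(legacy)
print(converted["content_type"])
```

If the assumptions hold, the resulting dict has the same shape that `encode` returns, so it could presumably be handed to a newer `AvroEncoder` for decoding.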

### MessageType models

In order to integrate more easily with messaging libraries, a `MessageType` protocol has been added. It specifies the methods that need to be supported by message models so that the `AvroEncoder` can automatically set/get Avro-encoded content and respective content type on the message.

Message models that currently support the `MessageType` protocol are:

* `azure.eventhub.EventData` from `azure-eventhub>=5.9.0`
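
To give a feel for what supporting the protocol might involve, here is a minimal sketch of a custom message model. The method names `from_message_content` and `__message_content__` reflect my understanding of the protocol and should be checked against the package's API reference; the `SimpleMessage` class and the local `MessageContent` type are hypothetical stand-ins defined only for this example.

```python
from typing import Any, TypedDict


class MessageContent(TypedDict):
    """Local stand-in for the content/content_type dict shape."""

    content: bytes
    content_type: str


class SimpleMessage:
    """Hypothetical message model; not part of any Azure package."""

    def __init__(self, body: bytes, content_type: str) -> None:
        self.body = body
        self.content_type = content_type

    @classmethod
    def from_message_content(
        cls, content: bytes, content_type: str, **kwargs: Any
    ) -> "SimpleMessage":
        # Builds a message from already-encoded content and its content type.
        return cls(body=content, content_type=content_type)

    def __message_content__(self) -> MessageContent:
        # Exposes the stored content and content type for decoding.
        return {"content": self.body, "content_type": self.content_type}


# Round trip with made-up values, without involving the encoder itself.
message = SimpleMessage.from_message_content(b"\x06Ben", "avro/binary+<schema ID>")
print(message.__message_content__())
```

Under these assumptions, such a class could be passed as `message_type` to `encode`, and its instances passed to `decode`, in the same way as `EventData`.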

#### Encoding with EventData

Previously, the Avro-encoded payload would need to be passed in as the `EventData` body.

```python
with serializer:
    dict_data = {"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
    payload_bytes = serializer.serialize(value=dict_data, schema=SCHEMA_STRING)
    event_data = EventData(body=payload_bytes)  # pass the bytes data to the body of an EventData
```

Now, the `EventData` class can be passed to the `AvroEncoder` as the `message_type`, and `encode` returns an `EventData` object with the Avro-encoded content and content type already set.

```python
with encoder:
    dict_content = {"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
    event_data = encoder.encode(content=dict_content, schema=SCHEMA_STRING, message_type=EventData)
```

#### Decoding with EventData

Previously, the body of the received `EventData` had to be joined into bytes manually and then passed to the `AvroSerializer` for decoding.

```python
def on_event(partition_context, event):
    bytes_payload = b"".join(b for b in event.body)
    deserialized_data = avro_serializer.deserialize(value=bytes_payload)
```

Now, the `EventData` can be passed directly to the `AvroEncoder` for automatic decoding.

```python
def on_event(partition_context, event):
    decoded_content = avro_encoder.decode(content=event)
```

## Additional samples

More examples can be found in the [Samples for azure-schemaregistry-avroencoder](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/schemaregistry/azure-schemaregistry-avroencoder/samples).