File: GEN1_GEN2_MAPPING.md

<h1>Mapping from ADLS Gen1 API -> ADLS Gen2 API</h1>
<table style="background:white">
<thead>
<tr>
<th>ADLS Gen1 API</th>
<th>Note for Gen1 API</th>
<th>ADLS Gen2 API</th>
<th>Note for API Mapping</th>
</tr>
</thead>
<tbody>
<tr>
<td>access/exists</td>
<td>To check if file/directory exists.</td>
<td>N/A</td>
<td>Use the Gen2 API <strong>create_file(if_none_match=&#39;*&#39;)</strong> or <strong>create_directory(if_none_match=&#39;*&#39;)</strong>, so that the operation fails if the path already exists.</td>
</tr>
<tr>
<td>touch</td>
<td>Create empty file</td>
<td><strong>create_file</strong></td>
<td>The API serves the same main purpose in Gen1 and Gen2. However, the Gen2 <strong>create_file</strong> API accepts additional parameters at creation time.</td>
</tr>
<tr>
<td>mkdir</td>
<td>Make new directory</td>
<td><strong>create_directory</strong></td>
<td>The API serves the same main purpose in Gen1 and Gen2. However, the Gen2 <strong>create_directory</strong> API accepts additional parameters at creation time.</td>
</tr>
<tr>
<td rowspan="2">stat/info</td>
<td rowspan="2">File information for path</td>
<td><strong>get_file_properties</strong></td>
<td rowspan="2">The Gen1 API is split into two separate ones in ADLS Gen2.</td>
</tr>
<tr>
<td><strong>get_directory_properties</strong></td>
</tr>
<tr>
<td rowspan="2">unlink/remove/rm</td>
<td rowspan="2">Remove a file or directory</td>
<td><strong>delete_file</strong></td>
<td rowspan="2">The Gen1 API is split into two separate ones in ADLS Gen2.</td>
</tr>
<tr>
<td><strong>delete_directory</strong></td>
</tr>
<tr>
<td>rmdir</td>
<td>Remove empty directory</td>
<td><strong>delete_directory</strong></td>
<td>Delete directory</td>
</tr>
<tr>
<td>ls/listdir</td>
<td>List all elements under directory specified with path</td>
<td rowspan="2"><strong>get_paths</strong></td>
<td><strong>get_paths(recursive=False)</strong> is equivalent to <strong>ls/listdir</strong>.</td>
</tr>
<tr>
<td>walk</td>
<td>Walk a path recursively and return a list of files and directories (if the parameter is set)</td>
<td><strong>get_paths()</strong> or <strong>get_paths(recursive=True)</strong> is equivalent to <strong>walk</strong>; <strong>recursive</strong> is <strong>True</strong> by default.</td>
</tr>
<tr>
<td>put</td>
<td>Stream data from local filename to file at path.</td>
<td><strong>append_data</strong> together with <strong>flush_data</strong></td>
<td><strong>append_data</strong> only stages the data; it does not write it to the file. Follow it with <strong>flush_data</strong> to actually write the staged data into the file.</td>
</tr>
<tr>
<td>cat</td>
<td>Return contents of file</td>
<td rowspan="4"><strong>download_file</strong></td>
<td rowspan="4">Passing the appropriate range parameters to the Gen2 API achieves the same result as each of the four Gen1 APIs.</td>
</tr>
<tr>
<td>head</td>
<td>Return first bytes of file</td>
</tr>
<tr>
<td>tail</td>
<td>Return last bytes of file</td>
</tr>
<tr>
<td><a href="https://learn.microsoft.com/python/API/azure-datalake-store/azure.datalake.store.core.azuredlfilesystem?view=azure-python#read-block-fn--offset--length--delimiter-none-"><strong>read_block</strong></a></td>
<td>Read a block of bytes from an ADL file</td>
</tr>
<tr>
<td>get</td>
<td>Stream data from file at path to local filename</td>
<td><strong>download_file</strong></td>
<td>Passing a <strong>stream</strong> parameter to <strong>download_file</strong> should do the same thing as the Gen1 <strong>get</strong> API.</td>
</tr>
<tr>
<td rowspan="2">rename/mv</td>
<td rowspan="2">Move file between locations on ADL</td>
<td><strong>rename_file</strong></td>
<td rowspan="2">Currently ADLS Gen2 only supports rename. Move isn&#39;t supported yet.</td>
</tr>
<tr>
<td><strong>rename_directory</strong></td>
</tr>
<tr>
<td>chown</td>
<td>Change owner and/or owning group</td>
<td rowspan="4"><strong>set_access_control</strong></td>
<td rowspan="4">Users can set the owner, group, ACL, etc. using the same API.</td>
</tr>
<tr>
<td>chmod</td>
<td>Change access mode of path</td>
</tr>
<tr>
<td>set_acl</td>
<td>Set the Access Control List (ACL) for a file or folder.</td>
</tr>
<tr>
<td><a href="https://learn.microsoft.com/python/API/azure-datalake-store/azure.datalake.store.core.azuredlfilesystem?view=azure-python#modify-acl-entries-path--acl-spec--recursive-false--number-of-sub-process-none-"><strong>modify_acl_entries</strong></a></td>
<td>Modify existing Access Control List (ACL) entries on a file or folder. If the entry does not exist it is added, otherwise it is updated based on the spec passed in. No entries are removed by this process (unlike set_acl).</td>
</tr>
<tr>
<td>get_acl_status</td>
<td>Gets Access Control List (ACL) entries for the specified file or directory.</td>
<td><strong>get_access_control</strong></td>
<td>The result includes the owner, group, ACL, etc.</td>
</tr>
<tr>
<td>remove_acl_entries</td>
<td>Remove existing, named Access Control List (ACL) entries on a file or folder. If an entry does not already exist, it is ignored. Default entries cannot be removed this way; use remove_default_acl for that. Unnamed entries cannot be removed this way; use remove_acl for that. Note: this is not recursive by default and applies only to the file or folder specified.</td>
<td rowspan="3">N/A</td>
<td rowspan="3">Users can probably achieve the same result by calling set_access_control with the relevant parameters.</td>
</tr>
<tr>
<td><a href="https://learn.microsoft.com/python/API/azure-datalake-store/azure.datalake.store.core.azuredlfilesystem?view=azure-python#remove-acl-path-"><strong>remove_acl</strong></a></td>
<td>Remove the entire, non default, ACL from the file or folder, including unnamed entries. Default entries cannot be removed this way, please use remove_default_acl for that. Note: this is not recursive, and applies only to the file or folder specified.</td>
</tr>
<tr>
<td>remove_default_acl</td>
<td>Remove the entire default ACL from the folder. Default entries do not exist on files, if a file is specified, this operation does nothing. Note: this is not recursive, and applies only to the folder specified.</td>
</tr>
<tr>
<td><a href="https://learn.microsoft.com/python/API/azure-datalake-store/azure.datalake.store.core.azuredlfilesystem?view=azure-python#open-path--mode--rb---blocksize-33554432--delimiter-none-"><strong>open</strong></a></td>
<td>Open a file for reading or writing to.</td>
<td>N/A</td>
<td>There is no open-file operation in ADLS Gen2. However, users can operate on the file directly, e.g. <strong>append_data, flush_data, download_file</strong>.</td>
</tr>
<tr>
<td>concat/merge</td>
<td>Concatenate a list of files into one new file</td>
<td>N/A</td>
<td>N/A</td>
</tr>
<tr>
<td>cp</td>
<td>Not implemented. Copy file between locations on ADL</td>
<td>N/A</td>
<td>N/A</td>
</tr>
<tr>
<td>current</td>
<td>Return the most recently created AzureDLFileSystem</td>
<td>N/A</td>
<td>N/A</td>
</tr>
<tr>
<td>df</td>
<td>Resource summary of path, e.g. file count, directory count</td>
<td>N/A</td>
<td>get_paths could be a helpful API, but users need to do further processing of its results.</td>
</tr>
<tr>
<td>du</td>
<td>Bytes in keys at path</td>
<td>N/A</td>
<td>get_paths could be a helpful API, but users need to do further processing of its results.</td>
</tr>
<tr>
<td>glob</td>
<td>Find files (not directories) by glob-matching.</td>
<td>N/A</td>
<td>get_paths could be a helpful API, but users need to do further processing of its results.</td>
</tr>
<tr>
<td>set_expiry</td>
<td>Set or remove the expiration time on the specified file. This operation can only be executed against files.</td>
<td>N/A</td>
<td>N/A</td>
</tr>
</tbody>
</table>
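
The table notes that <strong>df</strong>, <strong>du</strong> and <strong>glob</strong> have no direct Gen2 counterpart and that the output of <strong>get_paths</strong> needs further processing. The following is a minimal sketch of what that processing might look like. It assumes each listed item exposes `name`, `is_directory` and `content_length` attributes; the helper names `summarize` and `glob_files` are hypothetical, and the sample list is a stand-in for a real `get_paths` result so that the snippet runs without an Azure account.

```python
from fnmatch import fnmatchcase
from types import SimpleNamespace


def summarize(paths):
    """Rough df/du stand-in: count files and directories, sum file sizes."""
    summary = {"files": 0, "directories": 0, "bytes": 0}
    for p in paths:
        if p.is_directory:
            summary["directories"] += 1
        else:
            summary["files"] += 1
            # content_length may be missing/None for some entries; treat as 0.
            summary["bytes"] += p.content_length or 0
    return summary


def glob_files(paths, pattern):
    """Rough glob stand-in: names of files (not directories) matching pattern.

    Note: fnmatchcase lets '*' match '/' as well, so this is looser than a
    shell-style glob that stops at directory separators.
    """
    return [
        p.name
        for p in paths
        if not p.is_directory and fnmatchcase(p.name, pattern)
    ]


# Stand-in data (assumption). With the real SDK this list would come from
# something like list(file_system_client.get_paths(path="logs")).
sample = [
    SimpleNamespace(name="logs", is_directory=True, content_length=0),
    SimpleNamespace(name="logs/a.csv", is_directory=False, content_length=120),
    SimpleNamespace(name="logs/b.txt", is_directory=False, content_length=30),
]

print(summarize(sample))
print(glob_files(sample, "logs/*.csv"))
```

Because the helpers only read attributes, the same functions should work on whatever iterable the listing call yields. For very large directories it may be preferable to pass the iterator straight to `summarize` rather than building a list first.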