Copying files and directories
=============================

This page documents the expected behavior of the ``fsspec`` file and directory copying functions.
There are three functions of interest here: :meth:`~fsspec.spec.AbstractFileSystem.copy`,
:meth:`~fsspec.spec.AbstractFileSystem.get` and :meth:`~fsspec.spec.AbstractFileSystem.put`.
Each of these copies files and/or directories from a ``source`` to a ``target`` location.
If we refer to our filesystem of interest, derived from :class:`~fsspec.spec.AbstractFileSystem`,
as the remote filesystem (even though it may be local) then the difference between the three
functions is:

    - :meth:`~fsspec.spec.AbstractFileSystem.copy` copies from a remote ``source`` to a remote ``target``
    - :meth:`~fsspec.spec.AbstractFileSystem.get` copies from a remote ``source`` to a local ``target``
    - :meth:`~fsspec.spec.AbstractFileSystem.put` copies from a local ``source`` to a remote ``target``
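
As a minimal sketch of this distinction, the snippet below treats the in-memory filesystem as the
"remote" side and a temporary directory as the local side. The ``/copying-demo`` prefix and the
file names are hypothetical and chosen only for this illustration.

```python
import os
import tempfile

import fsspec

# The in-memory filesystem plays the role of the "remote" filesystem.
fs = fsspec.filesystem("memory")
fs.pipe_file("/copying-demo/remote/a.txt", b"hello")

# A temporary directory plays the role of the local filesystem.
local_dir = tempfile.mkdtemp()
local_copy = os.path.join(local_dir, "a.txt")

# copy: remote source -> remote target
fs.copy("/copying-demo/remote/a.txt", "/copying-demo/remote/b.txt")

# get: remote source -> local target
fs.get("/copying-demo/remote/a.txt", local_copy)

# put: local source -> remote target
fs.put(local_copy, "/copying-demo/remote/c.txt")

with open(local_copy, "rb") as f:
    local_bytes = f.read()

print(fs.cat_file("/copying-demo/remote/b.txt"))
print(local_bytes)
print(fs.cat_file("/copying-demo/remote/c.txt"))
```

All three destinations should end up holding the same bytes as the original file.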

The ``source`` and ``target`` are the first two arguments passed to these functions, and each
consists of one or more files, directories and/or ``glob`` (wildcard) patterns.
The behavior of the ``fsspec`` copy functions is intended to be the same as that obtained using
POSIX command line ``cp`` but ``fsspec`` functions have extra functionality because:

    - They support more than one ``target`` whereas command line ``cp`` is restricted to one.
    - They can create new directories, either automatically or via the ``auto_mkdir=True`` keyword
      argument, whereas command line ``cp`` only does this as part of a recursive copy.
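
The directory-creation point can be sketched with the local filesystem, where ``auto_mkdir`` is
passed to the ``LocalFileSystem`` constructor. The temporary directory layout and file names are
assumptions made for this example only.

```python
import os
import tempfile

from fsspec.implementations.local import LocalFileSystem

# Build a tiny source tree and an empty target directory on disk.
root = tempfile.mkdtemp()
os.makedirs(f"{root}/source")
os.makedirs(f"{root}/target")
with open(f"{root}/source/file1", "wb") as f:
    f.write(b"data")

# With auto_mkdir=True, missing parent directories are created on copy.
fs = LocalFileSystem(auto_mkdir=True)

# "newdir" does not exist yet; the trailing slash marks it as a directory.
fs.cp(f"{root}/source/file1", f"{root}/target/newdir/")

print(os.path.isfile(f"{root}/target/newdir/file1"))
```

Plain ``cp`` would need an explicit ``mkdir`` before the equivalent command could succeed.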

Expected behavior
-----------------

The following is a comprehensive list of the expected behavior of the ``fsspec`` copying
functions. It also forms the basis of a set of tests that any class derived from
:class:`~fsspec.spec.AbstractFileSystem` can be run against to check that it conforms.
For all scenarios the ``source`` filesystem contains the following directories and files::

    📁 source
    ├── 📄 file1
    ├── 📄 file2
    └── 📁 subdir
        ├── 📄 subfile1
        ├── 📄 subfile2
        └── 📁 nesteddir
            └── 📄 nestedfile

and before each scenario the ``target`` directory exists and is empty unless otherwise noted::

    📁 target

All example code uses :meth:`~fsspec.spec.AbstractFileSystem.cp` which is an alias of
:meth:`~fsspec.spec.AbstractFileSystem.copy`; equivalent behavior is expected by
:meth:`~fsspec.spec.AbstractFileSystem.get` and :meth:`~fsspec.spec.AbstractFileSystem.put`.
Forward slashes are used for directory separators throughout.
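
To try the scenarios below interactively, one possible setup is to recreate the ``source`` tree
on the in-memory filesystem. This is only a sketch: the ``/copying-fixture`` prefix is a
hypothetical name used to keep the demo isolated, and the file contents are placeholders.

```python
import fsspec

fs = fsspec.filesystem("memory")
root = "/copying-fixture"  # hypothetical prefix, not part of the scenarios

# Recreate the source tree described above; parent directories are implied.
source_files = [
    "source/file1",
    "source/file2",
    "source/subdir/subfile1",
    "source/subdir/subfile2",
    "source/subdir/nesteddir/nestedfile",
]
for rel in source_files:
    fs.pipe_file(f"{root}/{rel}", b"data")

# The target directory exists and is empty before each scenario.
fs.makedirs(f"{root}/target", exist_ok=True)

# Scenario 1a: copy a single file into the existing target directory.
fs.cp(f"{root}/source/subdir/subfile1", f"{root}/target/")

print(fs.find(f"{root}/target"))
```

The remaining scenarios can be run the same way by prefixing their paths with ``root``.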

1. Single source to single target
---------------------------------

.. dropdown:: 1a. File to existing directory

    .. code-block:: python

        cp("source/subdir/subfile1", "target/")

    results in::

        📁 target
        └── 📄 subfile1

    The trailing slash on ``"target/"`` is optional but recommended as it explicitly indicates that
    the target is a directory.

.. dropdown:: 1b. File to new directory

    .. code-block:: python

        cp("source/subdir/subfile1", "target/newdir/")

    results in::

        📁 target
        └── 📁 newdir
            └── 📄 subfile1

    This fails if the ``target`` file system is not capable of creating the directory, for example
    if it is read-only or if ``auto_mkdir=False``. There is no command line equivalent of this
    scenario without an explicit ``mkdir`` to create the new directory.

    The trailing slash is required on the new directory otherwise it is interpreted as a filename
    which is a different scenario (1c. File to file in existing directory).

.. dropdown:: 1c. File to file in existing directory

    .. code-block:: python

        cp("source/subdir/subfile1", "target/newfile")

    results in::

        📁 target
        └── 📄 newfile

    The target cannot have a trailing slash as ``"newfile/"`` is interpreted as a new directory
    which is a different scenario (1b. File to new directory).

.. dropdown:: 1d. File to file in new directory

    .. code-block:: python

        cp("source/subdir/subfile1", "target/newdir/newfile")

    creates the new directory and copies the file into it::

        📁 target
        └── 📁 newdir
            └── 📄 newfile

    This fails if the ``target`` file system is not capable of creating the directory, for example
    if it is read-only or if ``auto_mkdir=False``. There is no command line equivalent of this
    scenario without an explicit ``mkdir`` to create the new directory.

    If there is a trailing slash on the target ``target/newdir/newfile/`` then it is interpreted as
    a new directory which is a different scenario (1b. File to new directory).

.. dropdown:: 1e. Directory to existing directory

    .. code-block:: python

        cp("source/subdir/", "target/", recursive=True)

    results in::

        📁 target
        ├── 📄 subfile1
        ├── 📄 subfile2
        └── 📁 nesteddir
            └── 📄 nestedfile

    The ``recursive=True`` keyword argument is required otherwise the call does nothing. The depth
    of recursion can be controlled using the ``maxdepth`` keyword argument, for example:

    .. code-block:: python

        cp("source/subdir/", "target/", recursive=True, maxdepth=1)

    results in::

        📁 target
        ├── 📄 subfile1
        └── 📄 subfile2

    The trailing slash on ``"target/"`` is optional but recommended as it explicitly indicates that
    the target is a directory.

    If the trailing slash is omitted from ``"source/subdir"`` then the ``subdir`` is also copied,
    not just its contents:

    .. code-block:: python

        cp("source/subdir", "target/", recursive=True)

    results in::

        📁 target
        └── 📁 subdir
            ├── 📄 subfile1
            ├── 📄 subfile2
            └── 📁 nesteddir
                └── 📄 nestedfile

    Again the depth of recursion can be controlled using the ``maxdepth`` keyword argument, for
    example:

    .. code-block:: python

        cp("source/subdir", "target/", recursive=True, maxdepth=1)

    results in::

        📁 target
        └── 📁 subdir
            ├── 📄 subfile1
            └── 📄 subfile2

.. dropdown:: 1f. Directory to new directory

    .. code-block:: python

        cp("source/subdir/", "target/newdir/", recursive=True)

    results in::

        📁 target
        └── 📁 newdir
            ├── 📄 subfile1
            ├── 📄 subfile2
            └── 📁 nesteddir
                └── 📄 nestedfile

    Trailing slashes on both ``source`` and ``target`` are optional and do not affect the result.
    They are recommended to explicitly indicate both are directories.

    The ``recursive=True`` keyword argument is required otherwise the call does nothing. The depth
    of recursion can be controlled using the ``maxdepth`` keyword argument, for example:

    .. code-block:: python

        cp("source/subdir/", "target/newdir/", recursive=True, maxdepth=1)

    results in::

        📁 target
        └── 📁 newdir
            ├── 📄 subfile1
            └── 📄 subfile2

.. dropdown:: 1g. Glob to existing directory

    Nonrecursive

    .. code-block:: python

        cp("source/subdir/*", "target/")

    copies files from the top-level directory only and results in::

        📁 target
        ├── 📄 subfile1
        └── 📄 subfile2

    Recursive

    .. code-block:: python

        cp("source/subdir/*", "target/", recursive=True)

    results in::

        📁 target
        ├── 📄 subfile1
        ├── 📄 subfile2
        └── 📁 nesteddir
            └── 📄 nestedfile

    The trailing slash on ``"target/"`` is optional but recommended as it explicitly indicates that
    the target is a directory.

    The depth of recursion can be controlled by the ``maxdepth`` keyword argument, for example:

    .. code-block:: python

        cp("source/subdir/*", "target/", recursive=True, maxdepth=1)

    results in::

        📁 target
        ├── 📄 subfile1
        └── 📄 subfile2

.. dropdown:: 1h. Glob to new directory

    Nonrecursive

    .. code-block:: python

        cp("source/subdir/*", "target/newdir/")

    copies files from the top-level directory only and results in::

        📁 target
        └── 📁 newdir
            ├── 📄 subfile1
            └── 📄 subfile2

    Recursive

    .. code-block:: python

        cp("source/subdir/*", "target/newdir/", recursive=True)

    results in::

        📁 target
        └── 📁 newdir
            ├── 📄 subfile1
            ├── 📄 subfile2
            └── 📁 nesteddir
                └── 📄 nestedfile

    The trailing slash on the ``target`` is optional but recommended as it explicitly indicates that
    it is a directory.

    The depth of recursion can be controlled by the ``maxdepth`` keyword argument, for example:

    .. code-block:: python

        cp("source/subdir/*", "target/newdir/", recursive=True, maxdepth=1)

    results in::

        📁 target
        └── 📁 newdir
            ├── 📄 subfile1
            └── 📄 subfile2

    These calls fail if the ``target`` file system is not capable of creating the directory, for
    example if it is read-only or if ``auto_mkdir=False``. There is no command line equivalent of
    this scenario without an explicit ``mkdir`` to create the new directory.
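
The effect of the trailing slash on a directory ``source`` and of ``maxdepth`` (scenario 1e) can
be checked with a short script. This is a sketch against the in-memory filesystem; the
``/copying-slash-demo`` prefix, the target names ``t1`` to ``t3`` and the ``copy_into`` helper
are all hypothetical.

```python
import fsspec

fs = fsspec.filesystem("memory")
root = "/copying-slash-demo"  # hypothetical prefix for this demo

# Start from a clean slate, since the in-memory store is shared process-wide.
if fs.exists(root):
    fs.rm(root, recursive=True)

for rel in ["subfile1", "subfile2", "nesteddir/nestedfile"]:
    fs.pipe_file(f"{root}/source/subdir/{rel}", b"data")


def copy_into(name, source, **kwargs):
    """Recursively copy ``source`` into an existing, empty target directory.

    Returns the copied file paths relative to that target directory.
    """
    target = f"{root}/{name}"
    fs.makedirs(target, exist_ok=True)
    fs.cp(source, target + "/", recursive=True, **kwargs)
    return sorted(p.split(f"/{name}/", 1)[1] for p in fs.find(target))


# Trailing slash on the source: only the contents of subdir are copied.
with_slash = copy_into("t1", f"{root}/source/subdir/")

# No trailing slash on the source: subdir itself is copied as well.
without_slash = copy_into("t2", f"{root}/source/subdir")

# maxdepth=1 limits the copy to the top level of subdir.
shallow = copy_into("t3", f"{root}/source/subdir/", maxdepth=1)

print(with_slash)
print(without_slash)
print(shallow)
```

Each target is created before the copy so that the calls match the "existing directory"
scenario rather than the "new directory" one.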

2. Multiple source to single target
-----------------------------------

.. dropdown:: 2a. List of files to existing directory

    .. code-block:: python

        cp(["source/file1", "source/file2", "source/subdir/subfile1"], "target/")

    results in::

        📁 target
        ├── 📄 file1
        ├── 📄 file2
        └── 📄 subfile1

    All of the files are copied to the target directory regardless of their relative paths in the
    source filesystem. The trailing slash on the ``target`` is optional but recommended as it
    explicitly indicates that it is a directory.

.. dropdown:: 2b. List of files to new directory

    .. code-block:: python

        cp(["source/file1", "source/file2", "source/subdir/subfile1"], "target/newdir/")

    results in::

        📁 target
        └── 📁 newdir
            ├── 📄 file1
            ├── 📄 file2
            └── 📄 subfile1

    All of the files are copied to the target directory regardless of their relative paths in the
    source filesystem.

    The trailing slash is required on the new directory otherwise it is interpreted as a filename
    rather than a directory.
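
The flattening of a list of sources into a single new directory (scenario 2b) can be sketched as
follows, again using the in-memory filesystem. The ``/copying-list-demo`` prefix is a
hypothetical name used only to isolate the example.

```python
import fsspec

fs = fsspec.filesystem("memory")
root = "/copying-list-demo"  # hypothetical prefix for this demo

relative_sources = ["file1", "file2", "subdir/subfile1"]
for rel in relative_sources:
    fs.pipe_file(f"{root}/source/{rel}", b"data")
fs.makedirs(f"{root}/target", exist_ok=True)

# A list of sources copied to one target directory; the trailing slash
# marks "newdir" as a directory rather than a file name.
sources = [f"{root}/source/{rel}" for rel in relative_sources]
fs.cp(sources, f"{root}/target/newdir/")

# Paths relative to newdir: the "subdir" component is not preserved.
copied = sorted(p.split("/newdir/", 1)[1] for p in fs.find(f"{root}/target"))
print(copied)
```

Because only the file names are kept, two sources with the same name in different directories
would map to the same target path.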