Guide
=====

The PyFilesystem interface simplifies most aspects of working with files and directories. This guide covers what you need to know about working with FS objects.

Why use PyFilesystem?
~~~~~~~~~~~~~~~~~~~~~

If you are comfortable using the Python standard library, you may be wondering: *why learn another API for working with files?*

The :ref:`interface` is generally simpler than the ``os`` and ``io`` modules -- there are fewer edge cases and fewer ways to shoot yourself in the foot. This may be reason alone to use it, but there are other compelling reasons you should use ``import fs`` for even straightforward filesystem code.

The abstraction offered by FS objects means that you can write code that is agnostic to where your files are physically located. For instance, if you wrote a function that searches a directory for duplicate files, it will work unaltered with a directory on your hard drive, in a zip file, on an FTP server, on Amazon S3, etc.

As long as an FS object exists for your chosen filesystem (or any data store that resembles a filesystem), you can use the same API. This means that you can defer the decision regarding where you store data to later. If you decide to store configuration in the *cloud*, it could be a single line change and not a major refactor.


PyFilesystem can also be beneficial for unit testing; by swapping the OS filesystem with an in-memory filesystem, you can write tests without having to manage (or mock) file IO. And because FS paths and methods behave the same on every platform, your code should work on Linux, macOS, and Windows.

Opening Filesystems
~~~~~~~~~~~~~~~~~~~

There are two ways you can open a filesystem. The first and most natural way is to import the appropriate filesystem class and construct it.

Here's how you would open a :class:`~fs.osfs.OSFS` (Operating System File System), which maps to the files and directories of your hard-drive::

    >>> from fs.osfs import OSFS
    >>> home_fs = OSFS("~/")

This constructs an FS object which manages the files and directories under a given system path. In this case, the path is ``'~/'``, which is a shortcut for your home directory.

Here's how you would list the files/directories in your home directory::

    >>> home_fs.listdir('/')
    ['world domination.doc', 'paella-recipe.txt', 'jokes.txt', 'projects']

Notice that the parameter to ``listdir`` is a single forward slash, indicating that we want to list the *root* of the filesystem. This is because from the point of view of ``home_fs``, the root is the directory we used to construct the ``OSFS``.

Also note that it is a forward slash, even on Windows. This is because FS paths are in a consistent format regardless of the platform. Details such as the separator and encoding are abstracted away. See :ref:`paths` for details.

Other filesystem interfaces may have other requirements for their constructor. For instance, here is how you would open an FTP filesystem::

    >>> from fs.ftpfs import FTPFS
    >>> debian_fs = FTPFS('ftp.mirror.nl')
    >>> debian_fs.listdir('/')
    ['debian-archive', 'debian-backports', 'debian', 'pub', 'robots.txt']

The second, and more general, way of opening filesystem objects is via an *opener*, which opens a filesystem from a URL-like syntax. Here's an alternative way of opening your home directory::

    >>> from fs import open_fs
    >>> home_fs = open_fs('osfs://~/')
    >>> home_fs.listdir('/')
    ['world domination.doc', 'paella-recipe.txt', 'jokes.txt', 'projects']

The opener system is particularly useful when you want to store the physical location of your application's files in a configuration file.

If you don't specify the protocol in the FS URL, then PyFilesystem will assume you want an OSFS relative to the current working directory. So, if your current working directory is your home directory, the following would be an equivalent way of opening it::

    >>> from fs import open_fs
    >>> home_fs = open_fs('.')
    >>> home_fs.listdir('/')
    ['world domination.doc', 'paella-recipe.txt', 'jokes.txt', 'projects']

Tree Printing
~~~~~~~~~~~~~

Calling :meth:`~fs.base.FS.tree` on an FS object will print a plain-text tree view of your filesystem. Here's an example::

    >>> from fs import open_fs
    >>> my_fs = open_fs('.')
    >>> my_fs.tree()
    ├── locale
    │   └── readme.txt
    ├── logic
    │   ├── content.xml
    │   ├── data.xml
    │   ├── mountpoints.xml
    │   └── readme.txt
    ├── lib.ini
    └── readme.txt

This can be a useful debugging aid!


Closing
~~~~~~~

FS objects have a :meth:`~fs.base.FS.close` method which will perform any required clean-up actions. For many filesystems (notably :class:`~fs.osfs.OSFS`), the ``close`` method does very little. Other filesystems may only finalize files or release resources once ``close()`` is called.

You can call ``close`` explicitly once you are finished using a filesystem. For example::

    >>> home_fs = open_fs('osfs://~/')
    >>> home_fs.writetext('reminder.txt', 'buy coffee')
    >>> home_fs.close()

If you use FS objects as a context manager, ``close`` will be called automatically. The following is equivalent to the previous example::

    >>> with open_fs('osfs://~/') as home_fs:
    ...    home_fs.writetext('reminder.txt', 'buy coffee')

Using FS objects as a context manager is recommended as it will ensure every FS is closed.

Directory Information
~~~~~~~~~~~~~~~~~~~~~

Filesystem objects have a :meth:`~fs.base.FS.listdir` method which is similar to ``os.listdir``; it takes a path to a directory and returns a list of file names. Here's an example::

    >>> home_fs.listdir('/projects')
    ['fs', 'moya', 'README.md']

An alternative method exists for listing directories; :meth:`~fs.base.FS.scandir` returns an *iterable* of :ref:`info` objects. Here's an example::

    >>> directory = list(home_fs.scandir('/projects'))
    >>> directory
    [<dir 'fs'>, <dir 'moya'>, <file 'README.md'>]

Info objects have a number of advantages over just a filename. For instance, you can tell if an info object references a file or a directory with the :attr:`~fs.info.Info.is_dir` attribute, without an additional system call. Info objects may also contain information such as size, modified time, etc. if you request it in the ``namespaces`` parameter.


.. note::

    The reason that ``scandir`` returns an iterable rather than a list is that it can be more efficient to retrieve directory information in chunks if the directory is very large, or if the information must be retrieved over a network.

Additionally, FS objects have a :meth:`~fs.base.FS.filterdir` method which extends ``scandir`` with the ability to filter directory contents by wildcard(s). Here's how you might find all the Python files in a directory::

    >>> code_fs = OSFS('~/projects/src')
    >>> directory = list(code_fs.filterdir('/', files=['*.py']))

By default, the resource information objects returned by ``scandir`` and ``filterdir`` will contain only the file name and the ``is_dir`` flag. You can request additional information with the ``namespaces`` parameter. Here's how you can request additional details (such as file size and file modified times)::

    >>> directory = code_fs.filterdir('/', files=['*.py'], namespaces=['details'])

This will add ``size`` and ``modified`` properties (and others) to the resource info objects, which makes code such as this work::

    >>> sum(info.size for info in directory)

See :ref:`info` for more information.

Sub Directories
~~~~~~~~~~~~~~~

PyFilesystem has no notion of a *current working directory*, so you won't find a ``chdir`` method on FS objects. Fortunately you won't miss it; working with sub-directories is a breeze with PyFilesystem.

You can always specify a directory with methods which accept a path. For instance, ``home_fs.listdir('/projects')`` would get the directory listing for the ``projects`` directory. Alternatively, you can call :meth:`~fs.base.FS.opendir` which returns a new FS object for the sub-directory.

For example, here's how you could list the directory contents of a ``projects`` folder in your home directory::


    >>> home_fs = open_fs('~/')
    >>> projects_fs = home_fs.opendir('/projects')
    >>> projects_fs.listdir('/')
    ['fs', 'moya', 'README.md']

When you call ``opendir``, the FS object returns an instance of a :class:`~fs.subfs.SubFS`. If you call any of the methods on a ``SubFS`` object, it will be as though you called the same method on the parent filesystem with a path relative to the sub-directory.

The :meth:`~fs.base.FS.makedir` and :meth:`~fs.base.FS.makedirs` methods also return ``SubFS`` objects for the newly created directory. Here's how you might create a new directory in ``~/projects`` and initialize it with a couple of files::

    >>> home_fs = open_fs('~/')
    >>> game_fs = home_fs.makedirs('projects/game')
    >>> game_fs.touch('__init__.py')
    >>> game_fs.writetext('README.md', "Tetris clone")
    >>> game_fs.listdir('/')
    ['__init__.py', 'README.md']

Working with ``SubFS`` objects means that you can generally avoid writing much path manipulation code, which tends to be error-prone.

Working with Files
~~~~~~~~~~~~~~~~~~

You can open a file from an FS object with :meth:`~fs.base.FS.open`, which is very similar to ``io.open`` in the standard library. Here's how you might open a file called "reminder.txt" in your home directory::

    >>> with open_fs('~/') as home_fs:
    ...     with home_fs.open('reminder.txt') as reminder_file:
    ...        print(reminder_file.read())
    buy coffee

In the case of an ``OSFS``, a standard file-like object will be returned. Other filesystems may return a different object supporting the same methods. For instance, :class:`~fs.memoryfs.MemoryFS` will return an ``io.BytesIO`` object.

PyFilesystem also offers a number of shortcuts for common file-related operations. For instance, :meth:`~fs.base.FS.readbytes` will return the file contents as bytes, and :meth:`~fs.base.FS.readtext` will read unicode text. These methods are generally preferable to explicitly opening files, as the FS object may have an optimized implementation.

Other *shortcut* methods are :meth:`~fs.base.FS.download`, :meth:`~fs.base.FS.upload`, :meth:`~fs.base.FS.writebytes`, and :meth:`~fs.base.FS.writetext`.

Walking
~~~~~~~

Often you will need to scan the files in a given directory, and any sub-directories. This is known as *walking* the filesystem.

Here's how you would print the paths to all your Python files in your home directory::

    >>> from fs import open_fs
    >>> home_fs = open_fs('~/')
    >>> for path in home_fs.walk.files(filter=['*.py']):
    ...     print(path)

The ``walk`` attribute on FS objects is an instance of a :class:`~fs.walk.BoundWalker`, which should be able to handle most directory walking requirements.

See :ref:`walking` for more information on walking directories.

Globbing
~~~~~~~~

Closely related to walking a filesystem is *globbing*, which is a slightly higher level way of scanning filesystems. Paths can be filtered by a *glob* pattern, which is similar to a wildcard (such as ``*.py``), but can match multiple levels of a directory structure.

Here's an example of globbing, which removes all the ``.pyc`` files in your project directory::

    >>> from fs import open_fs
    >>> open_fs('~/project').glob('**/*.pyc').remove()
    62

See :ref:`globbing` for more information.


Moving and Copying
~~~~~~~~~~~~~~~~~~

You can move and copy file contents with the :meth:`~fs.base.FS.move` and :meth:`~fs.base.FS.copy` methods, and with the equivalent :meth:`~fs.base.FS.movedir` and :meth:`~fs.base.FS.copydir` methods, which operate on directories rather than files.

These move and copy methods are optimized where possible, and depending on the implementation, they may be more performant than reading and writing files.

To move and/or copy files *between* filesystems (as opposed to within the same filesystem), use the :mod:`~fs.move` and :mod:`~fs.copy` modules. The functions in these modules accept both FS objects and FS URLs. For instance, the following will compress the contents of your projects folder::

    >>> from fs.copy import copy_fs
    >>> copy_fs('~/projects', 'zip://projects.zip')

This is equivalent to the following, more verbose, code::

    >>> from fs.copy import copy_fs
    >>> from fs.osfs import OSFS
    >>> from fs.zipfs import ZipFS
    >>> with OSFS('~/projects') as src_fs, ZipFS('projects.zip', write=True) as zip_fs:
    ...     copy_fs(src_fs, zip_fs)

The :func:`~fs.copy.copy_fs` and :func:`~fs.copy.copy_dir` functions also accept a :class:`~fs.walk.Walker` parameter, which you can use to filter the files that will be copied. For instance, if you only wanted to back up your Python files, you could use something like this::

    >>> from fs.copy import copy_fs
    >>> from fs.walk import Walker
    >>> copy_fs('~/projects', 'zip://projects.zip', walker=Walker(filter=['*.py']))

An alternative to copying is *mirroring*, which will copy a filesystem and then keep it up to date by copying only changed files / directories. See :func:`~fs.mirror.mirror`.