***************************
**VFS**: Manipulating Files
***************************
.. highlight:: ada
Ada was designed from the beginning to be a very portable language across
architectures. As a result, most of the code you write on one machine has
a good chance of working as is on other machines. There remain, however,
some areas that are somewhat system specific. The Ada run-time, the
GNAT-specific run-time and GNATColl all try to abstract some of those
operations to help you make your code more portable.
One of these areas is the way files are represented and manipulated.
Reading or writing to a file is system independent, and taken care of by
the standard run-time. The differences between systems lie elsewhere, for
instance in the way file names are represented (can a given file be
accessed through various casings, are directories separated with a
backslash, a forward slash or some other means, and so on). The GNAT
run-time does a good job of providing subprograms that work on most types
of filesystems, but the relevant subprograms are split between several
packages and are not always easy to locate. GNATColl groups all these
functions into a single convenient tagged type hierarchy. In addition, it
provides the framework for transparently manipulating files on other
machines.
Another difficulty is specific to the application code: sometimes a
subprogram needs to manipulate the base name (no directory information) of
a file, whereas sometimes the full file name is needed. It is somewhat hard
to document this in the API, and it certainly fills the code with lots of
conversions from full name to base name, and sometimes the reverse (which,
of course, might be an expensive computation). To make this easier,
GNATColl provides a type that encapsulates the notion of a file,
and removes the need for the application to indicate whether it needs a
full name, a base name, or any other part of the file name.
Filesystems abstraction
=======================
Many different filesystems exist. These include such things as FAT, VFAT,
NTFS, ext2 and VMS. However, they can all be grouped into three families
of filesystems:

* Windows-based filesystems

  On such filesystems, the full name of a file is split into three parts:
  the name of the drive (c:, d:,...), the directories, which are separated
  by a backslash, and the base name. Such filesystems are sometimes
  inaccurately said to be case insensitive: by that, one means that the
  same file can be accessed through various casings. However, a user
  generally expects a specific casing when a file name is displayed, and
  the application should strive to preserve that casing (as opposed to, for
  instance, systematically converting the file name to lower case).

  A special case of a Windows-based filesystem is the one emulated by the
  Cygwin development environment. In this case, the filesystem is seen as
  if it were Unix-based (see below), with one special quirk to indicate the
  drive letter (the file name starts with "/cygdrive/c/").

* Unix-based filesystems

  On such filesystems, directories are separated by forward slashes. File
  names are case sensitive, that is, a directory can contain both "foo" and
  "Foo", which is not possible on Windows-based filesystems.

* VMS filesystem

  This filesystem represents paths differently from the other two, using
  brackets to indicate parent directories.

A given machine can actually use several filesystems in parallel, for
instance when a remote disk is mounted through NFS or Samba. There is
generally no easy way to detect this automatically, and it generally does
not matter, since the system converts from the native filesystem to that
of the remote host transparently (for instance, if you mount a Windows
disk on a Unix machine, you access its files through directory names
separated by forward slashes).
GNATColl abstracts the differences between these filesystems through
a set of tagged types in the `GNATCOLL.Filesystem` package and its
children. Such a type has primitive operations to manipulate the names of
files (retrieving the base name from a full name for instance), to check
various attributes of the file (is this a directory, a symbolic link, is the
file readable or writable), or to
manipulate the file itself (copying, deleting, reading and writing).
It provides similar operations for directories (creating or deleting paths,
reading the list of files in a directory,...).
It also provides information on the system itself (the list of available
drives on a Windows machine for instance).
The root type `Filesystem_Record` is abstract, and is specialized in
various child types. A convenient factory is provided to return the
filesystem appropriate for the local machine (`Get_Local_Filesystem`), but
you might choose to create your own factory in your application if you have
specialized needs (see :ref:`Remote_filesystems`).
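
The design described above can be pictured with a minimal sketch. This is
not GNATColl's implementation: the types `Filesystem`, `Unix_Filesystem`
and `Windows_Filesystem` and the function `Dir_Separator` are hypothetical
names made up for the example, and drive names are deliberately ignored.
It only shows how a single class-wide `Base_Name` can rely on dispatching
to handle each family of filesystems::

    with Ada.Text_IO; use Ada.Text_IO;

    procedure Filesystem_Sketch is

       package Filesystems is
          --  Hypothetical root type, standing in for the idea behind
          --  Filesystem_Record (not the real GNATColl type).
          type Filesystem is abstract tagged null record;

          --  Each family of filesystems provides its own separator
          function Dir_Separator (FS : Filesystem) return Character
             is abstract;

          --  Class-wide operation: works for any family via dispatching
          function Base_Name
            (FS : Filesystem'Class; Full_Name : String) return String;

          type Unix_Filesystem is new Filesystem with null record;
          overriding function Dir_Separator
            (FS : Unix_Filesystem) return Character;

          type Windows_Filesystem is new Filesystem with null record;
          overriding function Dir_Separator
            (FS : Windows_Filesystem) return Character;
       end Filesystems;

       package body Filesystems is

          function Base_Name
            (FS : Filesystem'Class; Full_Name : String) return String
          is
             Sep : constant Character := Dir_Separator (FS);
          begin
             --  Return whatever follows the last directory separator
             for J in reverse Full_Name'Range loop
                if Full_Name (J) = Sep then
                   return Full_Name (J + 1 .. Full_Name'Last);
                end if;
             end loop;

             --  No separator: the name is already a base name
             return Full_Name;
          end Base_Name;

          overriding function Dir_Separator
            (FS : Unix_Filesystem) return Character is
          begin
             return '/';
          end Dir_Separator;

          overriding function Dir_Separator
            (FS : Windows_Filesystem) return Character is
          begin
             return '\';
          end Dir_Separator;

       end Filesystems;

       use Filesystems;

       Unix : Unix_Filesystem;
       Win  : Windows_Filesystem;
    begin
       Put_Line (Base_Name (Unix, "/home/user/main.adb"));
       Put_Line (Base_Name (Win, "c:\src\main.adb"));
    end Filesystem_Sketch;

The calling code never tests which kind of filesystem it is dealing with;
it only holds a filesystem object and lets the primitive operation decide.
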
File names encoding
-------------------
One delicate part when dealing with filesystems is handling files whose
names cannot be described in ASCII. This includes names in Asian languages
for instance, or names with accented letters.
There is unfortunately no way, in general, to know what the encoding is for
a filesystem. In fact, there might not even be such an encoding (on Linux,
for instance, one can happily create a file with a Chinese name and another
one with a French name in the same directory). As a result, GNATColl
always treats file names as a series of bytes, and does not try to assume
any specific encoding for them. This works fine as long as you are
interfacing with the system (since the same series of bytes that was
returned by it is also used to access the file later on).
However, this becomes a problem when the time comes to display the name for
the user (for instance in a graphical interface). At that point, you need to
convert the file name to a specific encoding, generally UTF-8 but not
necessarily (it could be ISO-8859-1 in some cases for instance).
Since GNATColl cannot guess whether the file names have a specific
encoding on the filesystem, or what encoding you might wish in the end, it
lets you take care of the conversion. To do so, you can use either of the
two subprograms `Locale_To_Display` and
`Set_Locale_To_Display_Encoder`.
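
As an illustration of what such a conversion might look like, here is a
minimal sketch of an encoder that assumes the file names are stored as
ISO-8859-1 and must be displayed as UTF-8. Both the function name
`Latin_1_To_UTF8` and its profile (a `String` in, a `String` out) are
assumptions made for the example; check the specification of
`Set_Locale_To_Display_Encoder` for the profile it actually expects::

    with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
    with Ada.Text_IO;

    procedure Encoder_Sketch is

       --  Hypothetical encoder: the input is assumed to be ISO-8859-1
       function Latin_1_To_UTF8 (Name : String) return String is
          Result : Unbounded_String;
          Code   : Natural;
       begin
          for J in Name'Range loop
             Code := Character'Pos (Name (J));

             if Code < 128 then
                --  ASCII characters are encoded as themselves
                Append (Result, Name (J));
             else
                --  Code points 128 .. 255 take two bytes in UTF-8:
                --  110000xx followed by 10xxxxxx
                Append (Result, Character'Val (192 + Code / 64));
                Append (Result, Character'Val (128 + Code mod 64));
             end if;
          end loop;

          return To_String (Result);
       end Latin_1_To_UTF8;

       --  "cafe" with an acute accent on the last letter, in ISO-8859-1
       Name : constant String := "caf" & Character'Val (233);
    begin
       Ada.Text_IO.Put_Line (Latin_1_To_UTF8 (Name));
    end Encoder_Sketch;

The series of bytes used to access the file is left untouched; only the
copy shown to the user is converted.
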
.. _Remote_filesystems:
Remote filesystems
==================
Once the abstraction for filesystems exists, it is tempting to use it to
access files on remote machines. There are of course lots of differences
with filesystems on the local machine: file names are manipulated
similarly (although you need to somehow indicate on which host the files
are to be found), but any operation on the file itself needs to be done on
the remote host, as it can't be done through calls to the local system's
standard C library.
Note that when we speak of disks on a remote machine, we mean disks
that cannot be accessed locally, for instance through NFS mounts or Samba.
When a disk is mounted that way, its files are accessed transparently as
if they were local, and all this is taken care of by the system itself: no
special layer is needed at the application level.
GNATColl provides an extensive framework for manipulating such
remote files. It knows what commands need to be run on the remote host to
perform the operations ("cp" or "copy", "stat" or "dir /a-d",...) and
will happily perform these operations when you try to manipulate such
files.
There are however two operations that your own application needs to take
care of to take full advantage of remote files.
Filesystem factory
------------------
GNATColl cannot know in advance what filesystem is running on the
remote host, so it does not try to guess it. As a result, your application
should have a factory that creates the proper instance of a
`Filesystem_Record` depending on the host. Something like::

    type Filesystem_Type is (Windows, Unix);

    function Filesystem_Factory
      (Typ  : Filesystem_Type;
       Host : String)
      return Filesystem_Access
    is
       FS : Filesystem_Access;
    begin
       if Host = "" then
          --  Local files: no transport layer is needed
          case Typ is
             when Unix =>
                FS := new Unix_Filesystem_Record;
             when Windows =>
                FS := new Windows_Filesystem_Record;
          end case;
       else
          --  Remote files: the filesystem needs a host and a transport
          case Typ is
             when Unix =>
                FS := new Remote_Unix_Filesystem_Record;
                Setup (Remote_Unix_Filesystem_Record (FS.all),
                       Host      => Host,
                       Transport => ...);  --  see below
             when Windows =>
                FS := new Remote_Windows_Filesystem_Record;
                Setup (Remote_Windows_Filesystem_Record (FS.all),
                       Host      => Host,
                       Transport => ...);
          end case;
       end if;

       --  Display file names as UTF-8, whatever the filesystem
       Set_Locale_To_Display_Encoder
         (FS.all, Encode_To_UTF8'Access);
       return FS;
    end Filesystem_Factory;

Transport layer
---------------
Many protocols exist to communicate with a remote machine and perform
operations on it. These include `rsh`, `ssh` and `telnet`. In most of
these cases, a user name and a password are needed (and the user will
likely be prompted for them).
Furthermore, you might not want to use the same protocol to connect to
different machines.
GNATColl does not try to second-guess your intentions here. It
performs all its remote operations through a tagged type defined in
`GNATCOLL.Filesystem.Transport`. This type is abstract, and must be
extended in your application. For instance, GPS has full support for
choosing which protocol to use on which host, for specifying what kind of
filesystem is running on that host, and for recognizing password queries
from the transport protocol. All of this can be encapsulated in the
transport layer.
Once you have created one or more children of
`Filesystem_Transport_Record`, you associate them with your
instance of the filesystem through a call to the `Setup` primitive
operation of the filesystem. See the factory example above.
Virtual files
=============
As we have seen, the filesystem type abstracts all the operations for
manipulating files and their names. There is however another aspect when
dealing with file names in an application: it is often unclear whether a
full name (with directories) is expected, or whether the base name itself
is sufficient. There are also some aspects about a file that can be cached
to improve the efficiency.
For these reasons, GNATColl provides a new type
`GNATCOLL.VFS.Virtual_File` which abstracts the notion of file. It
provides lots of primitive operations to manipulate such files (which
are of course implemented on top of the filesystem abstraction, and so
support files on remote hosts among other advantages), and encapsulates
the base name and the full name of a file so that your API becomes clearer
(you are not expecting just any string, but really a file).
This type is reference counted: it takes care of memory management on
its own, and will free its internal data (file name and cached data)
automatically when the file is no longer needed. This has of course a
slight efficiency cost, due to controlled types, but we have found in
the context of GPS that the added flexibility was well worth it.
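
To make the benefit for an API more concrete, here is a minimal sketch of
the idea using only the standard Ada library. The type `File_Handle` and
its operations are hypothetical and much simpler than `Virtual_File` (no
reference counting, no caching, no remote hosts); the point is only that a
subprogram receiving such a value can ask for whichever part of the name
it needs::

    with Ada.Directories;
    with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
    with Ada.Text_IO;           use Ada.Text_IO;

    procedure File_Handle_Sketch is

       package Files is
          --  Hypothetical stand-in for the concept of a virtual file
          type File_Handle is private;

          function Create (Name : String) return File_Handle;
          function Full_Name (File : File_Handle) return String;
          function Base_Name (File : File_Handle) return String;
       private
          type File_Handle is record
             Full : Unbounded_String;
          end record;
       end Files;

       package body Files is

          function Create (Name : String) return File_Handle is
          begin
             return (Full => To_Unbounded_String (Name));
          end Create;

          function Full_Name (File : File_Handle) return String is
          begin
             return To_String (File.Full);
          end Full_Name;

          function Base_Name (File : File_Handle) return String is
          begin
             --  Simple_Name strips the directory part of the name
             return Ada.Directories.Simple_Name (To_String (File.Full));
          end Base_Name;

       end Files;

       --  The profile says "a file", not "some string": the caller does
       --  not have to know which form of the name is used internally.
       procedure Report (File : Files.File_Handle) is
       begin
          Put_Line
            (Files.Base_Name (File)
             & " (" & Files.Full_Name (File) & ")");
       end Report;

    begin
       Report (Files.Create ("/tmp/project/main.adb"));
    end File_Handle_Sketch;

With plain strings, `Report` would have to document whether it expects a
base name or a full name, and every caller would have to convert
accordingly.
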
GtkAda support for virtual files
================================
If you are programming a graphical interface to your application, and the
latter is using the `Virtual_File` abstraction all over the place,
it might be a problem to convert back to a string when you store a file
name in a graphical element (for instance in a tree model if you display
an explorer-like interface in your application).
Thus, GNATColl provides the `GNATCOLL.VFS.GtkAda` package, which is
only built if `GtkAda` was detected when GNATColl was compiled. It
allows you to encapsulate a `Virtual_File` into a `GValue`, and
therefore to store it in a tree model.