<html>
<head>
<title>dir_it.3</title>
</head>
<body BGCOLOR="white" LINK="0000FF" VLINK="800080">
<p> </p>
<h1><img src="../../c++boost.gif" alt="c++boost.gif (8819 bytes)" align="center" WIDTH="277" HEIGHT="86">dir_it
iterator</h1>
<table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
<tr>
<td WIDTH="109" VALIGN="TOP"></td>
<td><font SIZE="3" FACE="Arial"><spacer TYPE="VERTICAL" SIZE="40"> <h1>dir_it</h1>
<h2>iterator to get all files in a directory</h2>
<h1>Abstract</h1>
<p>The Standard C++ Library provides no way to access the directory structure of a
computer, because some C++ target platforms have no notion of directories at all.
Many important platforms do have directories, but their system interfaces differ
widely. This class provides a standard interface which can be extended to suit
platform-specific needs, such as access to file attributes. <a NAME="Synopsis"></p>
<h1>Synopsis</a> </h1>
<table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
<tr>
<td WIDTH="30" VALIGN="TOP"></td>
<td><pre>
#include <boost/directory.h>
std::string dirname(...);
boost::filesystem::dir_it begin(dirname);
boost::filesystem::dir_it end;
boost::filesystem::dir_it it(begin);
it = begin
*it
++it
*it++
it == end
it != end
prop::value_type v = boost::filesystem::get<prop>(it)
boost::filesystem::set<prop>(it, value)
</pre>
</td>
</tr>
</table>
<a NAME="Description"><h1>Description</a> </h1>
<p>The class <tt>boost::filesystem::dir_it</tt> (<tt>dir_it</tt> for short) is an input
iterator which iterates over the entries in a directory. A begin iterator is constructed
from a valid directory name in the platform-specific notation; an end iterator is
constructed using the default constructor of the class. The two functions <tt>boost::filesystem::get()</tt>
and <tt>boost::filesystem::set()</tt> are used to access specific properties of a file.
The exact list of available properties depends on the system. Below is a list of common
properties and lists of properties supported on specific systems. </p>
<p>Since the file properties differ between systems, an extensible interface was chosen
to allow different sets of properties to be accessed. It is even possible for the user to
add special properties. To define a new file property, a <tt>struct</tt> is defined which
gives the property its name and its type. It is also necessary to define the
<tt>get()</tt> and/or <tt>set()</tt> functions for it. Details are given below. <a NAME="Basic Functionality"></p>
<h1>Basic Functionality</a></h1>
<p>The main functionality of the class <tt>dir_it</tt> is to iterate over the entries in a
directory. Here is an example of how the class can be used to print the files in a directory:
</p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
<tr>
<td WIDTH="30" VALIGN="TOP"></td>
<td><pre>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <boost/directory.h>

int main(int ac, char *av[])
{
    if (ac == 2)
    {
        typedef boost::filesystem::dir_it InIt;
        typedef std::ostream_iterator<std::string> OutIt;
        // Copy every entry name in the directory av[1] to standard output.
        std::copy(InIt(av[1]), InIt(), OutIt(std::cout, "\n"));
    }
    return 0;
}
</pre>
</td>
</tr>
</table>
<p>Of course, it is also possible to write this loop manually: the class <tt>dir_it</tt> is
just an input iterator. Note that the post increment operator only returns a proxy object
which can be used for dereferencing (using <tt>operator*()</tt>) as required by the input
iterator specification. The proxy object cannot be used to access file
attributes other than the name. <a NAME="dir_it Members"></p>
<h1>dir_it Members</a> <a NAME="Lifecycle"></h1>
<h3>Lifecycle</a> </h3>
<dl>
<dt>Default Constructor </dt>
<dd>The default constructor is used to create the "past the end" iterator. This
construction never fails and the resulting iterator cannot be dereferenced. </dd>
<dt>Constructor taking a std::string </dt>
<dd>A <tt>std::string</tt> naming a directory can be used to construct a "begin"
iterator. If the argument does not name an accessible directory, the resulting iterator
compares equal to the past the end iterator constructed with the default constructor. On
most systems this way of indicating failure is unambiguous, because even an empty directory
has entries, e.g. on POSIX systems the entries "." (the directory itself)
and ".." (the parent directory). </dd>
<dt>Copy Constructor </dt>
<dd>The copy constructor creates a new instance which is always positioned on the same
current entry as the original <tt>dir_it</tt> instance. This means that advancing either
the original or the newly created iterator will advance both iterators. It is not possible
to copy a <tt>dir_it</tt> in order to iterate over the same directory entries twice. To do this,
two objects of type <tt>dir_it</tt> have to be constructed from the directory name. </dd>
<dt>Destructor </dt>
<dd>The destructor releases the resources associated with the <tt>dir_it</tt>. However, if
the <tt>dir_it</tt> was copied, associated system resources are released when the last
copy is destroyed. This is because the various copies share the same system resources. </dd>
<dt>Assignment </dt>
<dd>The assigned <tt>dir_it</tt> is always positioned on the same entry as the original
iterator. Thus, the same restrictions apply to the assigned iterator as to iterators
created with the copy constructor. </dd>
</dl>
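<p>The sharing described for copies can be illustrated with a minimal sketch. The class
<tt>shared_cursor_it</tt> below is a hypothetical stand-in and not part of the library: it
iterates over an in-memory list of names instead of a real directory, and it assumes that
the shared state can be modelled with a <tt>std::shared_ptr</tt>. How <tt>dir_it</tt>
actually shares its system resources is not specified here. </p>

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for dir_it (not the library class): all copies
// share one cursor, the way copies of a directory iterator might share
// one open directory handle.
class shared_cursor_it {
    struct state {
        std::vector<std::string> names;  // pretend directory contents
        std::size_t pos;                 // index of the current entry
    };
    std::shared_ptr<state> rep_;         // empty for a default-constructed iterator

    bool at_end() const { return !rep_ || rep_->pos >= rep_->names.size(); }

public:
    // Default constructor: the "past the end" iterator.
    shared_cursor_it() {}

    // "Begin" iterator over the given names.
    explicit shared_cursor_it(std::vector<std::string> const& names) {
        if (!names.empty()) {
            rep_ = std::make_shared<state>();
            rep_->names = names;
            rep_->pos = 0;
        }
    }

    // Only valid while the iterator is not past the end.
    std::string const& operator*() const { return rep_->names[rep_->pos]; }

    // Advancing moves the shared cursor, so every copy moves as well.
    shared_cursor_it& operator++() { ++rep_->pos; return *this; }

    // Equal if both indicate a current entry or both are past the end.
    friend bool operator==(shared_cursor_it const& a, shared_cursor_it const& b) {
        return a.at_end() == b.at_end();
    }
    friend bool operator!=(shared_cursor_it const& a, shared_cursor_it const& b) {
        return !(a == b);
    }
};
```

<p>In this sketch the shared state is released automatically when the last copy is
destroyed, which mirrors the destructor behaviour described above. </p>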
<a NAME="Operations"><h3>Operations</a> </h3>
<dl>
<dt>Dereference (<tt>operator*()</tt>) </dt>
<dd>Dereferencing a <tt>dir_it</tt> returns the name of the current directory entry as a <tt>std::string</tt>.
It is only possible to dereference a <tt>dir_it</tt> if it does not compare equal to the
past the end iterator. </dd>
<dt>Pre Increment (<tt>operator++()</tt>) </dt>
<dd>The primary way to advance a <tt>dir_it</tt> is the pre increment operator. This
operation moves the object to the next directory entry, if there is another entry.
Otherwise, the <tt>dir_it</tt> object compares equal to the past the end iterator after
the pre increment. The pre increment operator returns the object itself. </dd>
<dt>Post Increment (<tt>operator++(int)</tt>) </dt>
<dd>The post increment advances the <tt>dir_it</tt> to the next entry and returns a proxy
object which can be dereferenced as if it were an object of type <tt>dir_it</tt>. However,
nothing else can be done with this object. This method of advancing the iterator is
normally less efficient, so the pre increment operator should be used where possible. </dd>
<dt>Equals Operator (<tt>operator==()</tt>) </dt>
<dd>The equals operator determines whether two objects of type <tt>dir_it</tt> are either
both indicating a current directory entry, or both past the end iterators.
Because every <tt>dir_it</tt> turns into a past the end iterator once all entries in the
directory have been seen, this can be used to test whether there are any more entries.
However, the operator cannot be used to determine whether a <tt>dir_it</tt> is positioned on a
specific directory entry; this can be done by comparing the results of the dereference
operator instead. </dd>
<dt>Not Equal Operator (<tt>operator!=()</tt>) </dt>
<dd>The not equal operator returns the exact negation of the equals operator. Thus, this
operator returns <tt>true</tt> if one of the two iterators indicates a current directory
entry while the other iterator is a past the end iterator. </dd>
</dl>
<a NAME="File Properties"><h1>File Properties</a></h1>
<p>Using the functions <tt>get()</tt> and <tt>set()</tt> it is possible to access file
properties. Here is an example which prints the file sizes in addition to the name: </p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
<tr>
<td WIDTH="30" VALIGN="TOP"></td>
<td><pre>
#include <iomanip>
#include <iostream>
#include <boost/directory.h>

int main(int ac, char *av[])
{
    if (ac == 2)
    {
        using namespace boost::filesystem;
        // Print the size (right-aligned in 10 columns) and the name of each entry.
        for (dir_it it(av[1]); it != dir_it(); ++it)
            std::cout << std::setw(10) << get<size>(it)
                      << " " << *it << "\n";
    }
    return 0;
}
</pre>
</td>
</tr>
</table>
<p>Each property consists of two major components: <ul>
<li>A <tt>struct</tt> which gives the name to the property and which defines the type
accessed using the property. The type of the property is defined using a <tt>typedef</tt>
defining the type <tt>value_type</tt> in the corresponding <tt>struct</tt>. For the
standard properties, the corresponding <tt>struct</tt>s are defined in the namespace <tt>boost::filesystem</tt>.
</li>
<li>Access functions which are just specializations of the functions <tt>boost::filesystem::get()</tt>
and <tt>boost::filesystem::set()</tt>. Of course, if the property can only be read or only
be written, only the corresponding access function is defined. </li>
</ul>
<a NAME="Example Property"><h3>Example Property</a></h3>
<p>The <tt>size</tt> property used in the above example might be defined as follows: </p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
<tr>
<td WIDTH="30" VALIGN="TOP"></td>
<td><pre>
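namespace boost {
namespace filesystem {
    // Tag type which names the property and fixes its value type.
    struct size
    {
        typedef size_t value_type;
    };

    // Read access: a specialization of get() for the size property.
    template <>
    size::value_type get<size>(dir_it const &it)
    {
        return ... /* environment specific code */
    }
}
}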
</pre>
</td>
</tr>
</table>
<p>The properties which are already provided by the implementation normally access some
data structure internal to the <tt>dir_it</tt> objects to avoid multiple system calls. <a NAME="Details"></p>
<h3>Details</a> </h3>
<dl>
<dt>Property Selection </dt>
<dd>The file property to be accessed is selected using a template argument to the <tt>get()</tt>
or <tt>set()</tt> function. The template argument is a type which defines the type <tt>value_type</tt>
as a nested type. The <tt>get()</tt> and <tt>set()</tt> functions are specialized for the
properties provided by the system. By specializing additional versions of these functions,
the user may extend the set of accessible properties. </dd>
<dt>Property Type </dt>
<dd>The type of a file property is determined from a <tt>typedef</tt> called <tt>value_type</tt>
in the type selecting the property. </dd>
<dt>Reading a Property </dt>
<dd>To read a file property, a <tt>dir_it</tt> is passed as argument to the template
function <tt>boost::filesystem::get()</tt>. The template argument <tt>prop</tt> selecting
the file property to be accessed is explicitly specified. The return type of
the <tt>get()</tt> function is <tt>prop::value_type</tt>. </dd>
<dt>Setting a Property </dt>
<dd>To set a file property, a <tt>dir_it</tt> and the new value of the property are passed
to the template function <tt>boost::filesystem::set()</tt>. The template argument <tt>prop</tt>
selecting the file property to be accessed is explicitly specified. The type of the second
argument to the <tt>set()</tt> function is <tt>prop::value_type const &</tt>. </dd>
</dl>
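<p>The selection mechanism described above can be sketched without the library. In the
following example the namespace <tt>sketch</tt> and the type <tt>entry</tt> are
assumptions: <tt>entry</tt> is a hypothetical stand-in for <tt>dir_it</tt> holding data
which a real implementation would obtain from the system. The example only illustrates
how a tag type with a nested <tt>value_type</tt> selects a specialization of
<tt>get()</tt> or <tt>set()</tt>. </p>

```cpp
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical namespace, not boost::filesystem

// Hypothetical stand-in for dir_it: holds the data a real iterator
// would obtain from the operating system.
struct entry {
    std::string name;
    std::size_t bytes;
    bool        hidden;
};

// Property tags: each names a property and fixes its type.
struct size      { typedef std::size_t value_type; };
struct is_hidden { typedef bool        value_type; };

// Primary templates are only declared. Using a property without a
// matching specialization therefore fails at link time.
template <class Prop>
typename Prop::value_type get(entry const& e);

template <class Prop>
void set(entry& e, typename Prop::value_type const& value);

// size is read only: only get() is specialized.
template <>
inline size::value_type get<size>(entry const& e) { return e.bytes; }

// is_hidden is read/write in this sketch: both functions are specialized.
template <>
inline is_hidden::value_type get<is_hidden>(entry const& e) { return e.hidden; }

template <>
inline void set<is_hidden>(entry& e, is_hidden::value_type const& value)
{
    e.hidden = value;
}

} // namespace sketch
```

<p>A property is then read with <tt>sketch::get&lt;sketch::size&gt;(e)</tt> and written with
<tt>sketch::set&lt;sketch::is_hidden&gt;(e, true)</tt>, the property being named explicitly
as the template argument in both cases. </p>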
<a NAME="Standard Properties"><h3>Standard Properties</a></h3>
<p>The organization of files differs considerably between systems. As a result, the
sets of file properties defined on different systems vary. The property interface is
chosen such that the way a property is accessed is always the same; only the names
and the exact types of the properties are left open. To enhance portability, some common file properties
are always defined: <dl>
<dt>is_directory </dt>
<dd>A boolean read only property which can be used to determine whether a directory entry is
itself a directory. </dd>
<dt>is_hidden </dt>
<dd>A boolean property indicating whether the file is "hidden". By default, hidden
files are not shown to the user. However, with appropriate options, these files may be
shown anyway. On some systems, there is a special flag for the files which indicates that
the file is hidden. On such systems this flag is a read/write property. On other systems,
e.g. on POSIX systems, files starting with a dot (".") are considered to be
hidden. On such systems this flag is a read only property. </dd>
<dt>size </dt>
<dd>A read only property of type <tt>size_t</tt> returning the size in bytes of a file. Note
that the size returned is not necessarily identical to the number of characters retrieved
from an <tt>ifstream</tt> created for this file: In text mode, some character sequences
are replaced by single characters during reading. However, the number of characters in
binary mode should normally match the size of the file. </dd>
<dt>mtime </dt>
<dd>A property of type <tt>time_t</tt> returning the last modification time of the
file. It is normally read only, but on some systems, e.g. POSIX, it is possible to write
this property to set the modification time to an arbitrary value. </dd>
</dl>
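<p>On a POSIX system, the common properties listed above could be obtained roughly as in the
following sketch, which uses <tt>stat(2)</tt> and the leading dot convention for hidden
files. The names <tt>entry_info</tt>, <tt>is_hidden_name</tt> and <tt>read_info</tt> are
assumptions introduced for illustration; this is not the code used by the library, and it
only compiles on systems providing <tt>sys/stat.h</tt>. </p>

```cpp
#include <cstddef>
#include <ctime>
#include <string>
#include <sys/stat.h>   // POSIX only

// Hypothetical record of the common properties of one directory entry.
struct entry_info {
    bool        is_directory;
    bool        is_hidden;
    std::size_t size;    // size in bytes
    std::time_t mtime;   // last modification time
};

// POSIX convention: a name starting with a dot is considered hidden.
inline bool is_hidden_name(std::string const& name)
{
    return !name.empty() && name[0] == '.';
}

// Fills 'info' for the entry 'name' in directory 'dir' using stat(2).
// Returns false and leaves 'info' untouched if stat() fails.
inline bool read_info(std::string const& dir, std::string const& name,
                      entry_info& info)
{
    struct stat st;
    if (::stat((dir + "/" + name).c_str(), &st) != 0)
        return false;
    info.is_directory = S_ISDIR(st.st_mode) != 0;
    info.is_hidden    = is_hidden_name(name);
    info.size         = static_cast<std::size_t>(st.st_size);
    info.mtime        = st.st_mtime;
    return true;
}
```

<p>Fetching all values with a single <tt>stat()</tt> call is one way to avoid the repeated
system calls that separate property lookups would otherwise cause. </p>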
<a NAME="POSIX Properties"><h3>POSIX Properties</a> <a NAME="WinNT Properties"></h3>
<h3>WinNT Properties</a> <a NAME="Future Directions"></h3>
<h1>Future Directions</a></h1>
<p>In computer systems there are structures other than the system's directory which can
also be viewed as directories. An obvious example is archive files which store copies of
directory hierarchies, like ZIP or tar files. It might be useful to extend the class <tt>dir_it</tt>
to treat such structures as directories too and to add support for iterating over
them. </p>
<p>A potential approach might be the definition of a CORBA interface which is used
internally by the class <tt>dir_it</tt> to determine directory entries and to find out
whether an entry is itself a directory. This way it would even be possible to extend what is
considered to be a directory and have the same class iterate over very different
structures. </p>
<p>Whether this approach is reasonable will have to be evaluated in the future.
Personally, I think this is an interesting direction and I hope that I will find time to
test this in the near future. <a NAME="See Also"></p>
<h1>See Also</a></h1>
<p>POSIX: opendir(3), readdir(3), closedir(3), stat(2) <br>
Standard Template Library: Input Iterator Requirements </p>
<hr>
<p><a HREF="http://www.claas-solutions.de/kuehl">Dietmar K&uuml;hl</a>
<dietmar.kuehl@claas-solutions.de><br>
</font></td>
</tr>
</table>
</body>
</html>