<html>

<head>
<title>dir_it.3</title>
</head>

<body BGCOLOR="white" LINK="0000FF" VLINK="800080">

<p>&nbsp;</p>

<h1><img src="../../c++boost.gif" alt="c++boost.gif (8819 bytes)" align="center" WIDTH="277" HEIGHT="86">dir_it
iterator</h1>

<table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
  <tr>
    <td WIDTH="109" VALIGN="TOP"></td>
    <td><font SIZE="3" FACE="Arial"><spacer TYPE="VERTICAL" SIZE="40"> <h1>dir_it</h1>
    <h2>iterator to get all files in a directory</h2>
    <h1>Abstract</h1>
    <p>The Standard C++ Library offers no way to access the directory structure of a
    computer, because some C++ target platforms have no notion of directories at all.
    Many important platforms do have directories, but their system interfaces differ
    considerably. This class provides a standard interface that can be extended to suit
    platform-specific needs, in particular the need to access file attributes. <a NAME="Synopsis"></p>
    <h1>Synopsis</a> </h1>
    <table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
      <tr>
        <td WIDTH="30" VALIGN="TOP"></td>
        <td><pre>
#include &lt;boost/directory.h&gt;

std::string dirname(...);

boost::filesystem::dir_it begin(dirname);
boost::filesystem::dir_it end;
boost::filesystem::dir_it it(begin);

it = begin
*it
++it
*it++
it == end
it != end

prop::value_type v = boost::filesystem::get&lt;prop&gt;(it)
boost::filesystem::set&lt;prop&gt;(it, value)
    </pre>
        </td>
      </tr>
    </table>
    <a NAME="Description"><h1>Description</a> </h1>
    <p>The class <tt>boost::filesystem::dir_it</tt> (<tt>dir_it</tt> for short) is an input
    iterator which iterates over the entries in a directory. A begin iterator is constructed
    from a valid directory name using the platform specific notation; an end iterator is
    constructed using the default constructor of the class. The two functions <tt>boost::filesystem::get()</tt>
    and <tt>boost::filesystem::set()</tt> are used to access specific properties of a file.
    The exact list of available properties depends on the system. Below is a list of common
    properties and lists of properties supported on specific systems. </p>
    <p>Since the file properties differ between systems, an extensible interface was chosen
    to allow different sets of properties to be accessed. It is even possible for the user to
    add special properties. To define a new file property, a <tt>struct</tt> is defined which
    gives the property its name and its type. It is also necessary to define the corresponding
    <tt>get()</tt> and/or <tt>set()</tt> functions. Details for this are given below. <a NAME="Basic Functionality"></p>
    <h1>Basic Functionality</a></h1>
    <p>The main functionality of the class <tt>dir_it</tt> is to iterate over the entries in a
    directory. Here is an example of how the class can be used to print the names of the entries
    in a directory:
    </p>
    <table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
      <tr>
        <td WIDTH="30" VALIGN="TOP"></td>
        <td><pre>
#include &lt;iterator&gt;
#include &lt;iostream&gt;
#include &lt;algorithm&gt;
#include &lt;string&gt;
#include &lt;boost/directory.h&gt;

int main(int ac, char *av[])
{
  if (ac == 2)  // expect exactly one argument: the directory name
  {
    typedef boost::filesystem::dir_it        InIt;
    typedef std::ostream_iterator&lt;std::string&gt; OutIt;

    // copy every entry name to std::cout, one name per line
    std::copy(InIt(av[1]), InIt(), OutIt(std::cout, &quot;\n&quot;));
  }
  return 0;
}
    </pre>
        </td>
      </tr>
    </table>
    <p>Of course, it is also possible to write this loop manually: the class <tt>dir_it</tt> is
    just an input iterator. Note that the post increment operator only returns a proxy object
    which can be used for dereferencing (using <tt>operator*()</tt>) as required by the input
    iterator specification. The proxy object cannot be used to access file attributes other
    than the name. <a NAME="dir_it Members"></p>
    <h1>dir_it Members</a> <a NAME="Lifecycle"></h1>
    <h3>Lifecycle</a> </h3>
    <dl>
      <dt>Default Constructor </dt>
      <dd>The default constructor is used to create the &quot;past the end&quot; iterator. This
        construction never fails and the resulting iterator cannot be dereferenced. </dd>
      <dt>Constructor taking a std::string </dt>
      <dd>A <tt>std::string</tt> naming a directory can be used to construct a &quot;begin&quot;
        iterator. If the argument does not name an accessible directory, the resulting iterator
        compares equal to the past the end iterator constructed with the default constructor. On
        most systems this way of indicating failure is not ambiguous, because even an empty directory
        has entries, e.g. on POSIX systems the directories &quot;.&quot; (the directory itself)
        and &quot;..&quot; (the parent directory). </dd>
      <dt>Copy Constructor </dt>
      <dd>The copy constructor creates a new instance which is always positioned on the same
        current entry as the original <tt>dir_it</tt> instance. This means that advancing either
        the original or the newly created iterator advances both iterators. It is therefore not
        possible to copy a <tt>dir_it</tt> in order to iterate over the same directory entries twice.
        To do this, two objects of type <tt>dir_it</tt> have to be constructed from the directory name. </dd>
      <dt>Destructor </dt>
      <dd>The destructor releases the resources associated with the <tt>dir_it</tt>. However, if
        the <tt>dir_it</tt> was copied, associated system resources are released when the last
        copy is destroyed. This is because the various copies share the same system resources. </dd>
      <dt>Assignment </dt>
      <dd>The assigned <tt>dir_it</tt> is always positioned on the same entry as the original
        iterator. Thus, the same restrictions apply to the assigned iterator as to iterators
        created with the copy constructor. </dd>
    </dl>
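    <p>The copy semantics described above can be illustrated with a minimal sketch. It assumes a
    POSIX system (<tt>opendir()</tt>, <tt>readdir()</tt>, <tt>closedir()</tt>) and C++11. The class
    name <tt>mini_dir_it</tt> and all of its members are hypothetical; this is not the library's
    implementation, and the post increment operator is left out. All copies hold a shared pointer
    to one directory stream, so advancing one copy advances every copy, and the stream is closed
    when the last copy is destroyed:</p>

```cpp
#include <cassert>
#include <dirent.h>
#include <memory>
#include <string>

// Hypothetical stand-in for dir_it; an illustration, not the library's code.
class mini_dir_it
{
  // State shared by all copies of one iterator.
  struct state
  {
    DIR*        dir;
    std::string current;
    bool        done;

    explicit state(std::string const& name)
      : dir(opendir(name.c_str())), done(false)
    {
      advance();  // position on the first entry, or mark as finished
    }
    ~state() { if (dir) closedir(dir); }  // runs when the last copy goes away
    state(state const&) = delete;
    state& operator=(state const&) = delete;

    void advance()
    {
      dirent* e = dir ? readdir(dir) : nullptr;
      if (e) current = e->d_name;
      else   done = true;  // no more entries, or the directory could not be opened
    }
  };

  std::shared_ptr<state> st_;

  bool at_end() const { return !st_ || st_->done; }

public:
  mini_dir_it() {}  // "past the end" iterator
  explicit mini_dir_it(std::string const& name)
    : st_(std::make_shared<state>(name)) {}

  // Dereferencing is only valid while the iterator is not past the end.
  std::string const& operator*() const
  {
    assert(!at_end());
    return st_->current;
  }

  mini_dir_it& operator++()
  {
    assert(!at_end());
    st_->advance();
    return *this;
  }

  // Equal if both indicate a current entry or both are past the end.
  friend bool operator==(mini_dir_it const& a, mini_dir_it const& b)
  {
    return a.at_end() == b.at_end();
  }
  friend bool operator!=(mini_dir_it const& a, mini_dir_it const& b)
  {
    return !(a == b);
  }
};
```

    <p>With this sketch, after <tt>mini_dir_it b(a); ++a;</tt> both <tt>a</tt> and <tt>b</tt> refer
    to the same entry, because the increment acts on the shared state.</p>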
    <a NAME="Operations"><h3>Operations</a> </h3>
    <dl>
      <dt>Dereference (<tt>operator*()</tt>) </dt>
      <dd>Dereferencing a <tt>dir_it</tt> returns the name of the current directory entry as a <tt>std::string</tt>.
        It is only possible to dereference a <tt>dir_it</tt> if it does not compare equal to the
        past the end iterator. </dd>
      <dt>Pre Increment (<tt>operator++()</tt>) </dt>
      <dd>The major means to advance a <tt>dir_it</tt> is the pre increment operator. This
        operation moves the object to the next directory entry, if there is another entry.
        Otherwise, the <tt>dir_it</tt> object compares equal to the past the end iterator after
        the pre increment. The pre increment operator returns the object itself. </dd>
      <dt>Post Increment (<tt>operator++(int)</tt>) </dt>
      <dd>The post increment advances the <tt>dir_it</tt> to the next entry and returns a proxy
        object which can be dereferenced as if it were an object of type <tt>dir_it</tt>. However,
        nothing else can be done with this object. This method of advancing the iterator is
        normally less efficient, so the pre increment operator should be used where possible. </dd>
      <dt>Equals Operator (<tt>operator==()</tt>) </dt>
      <dd>The equals operator returns <tt>true</tt> if both objects of type <tt>dir_it</tt>
        indicate a current directory entry, or if both objects are past the end iterators.
        Because every <tt>dir_it</tt> turns into a past the end iterator once all entries in the
        directory have been seen, this can be used to test whether there are any more entries.
        However, the operator cannot determine whether a <tt>dir_it</tt> is positioned on a
        specific directory entry (this can be done by comparing the results of the dereference
        operator). </dd>
      <dt>Not Equal Operator (<tt>operator!=()</tt>) </dt>
      <dd>The not equal operator returns the exact negation of the equals operator. Thus, this
        operator returns <tt>true</tt> if one of the two iterators indicates a current directory
        entry while the other iterator is a past the end iterator. </dd>
    </dl>
    <a NAME="File Properties"><h1>File Properties</a></h1>
    <p>Using the functions <tt>get()</tt> and <tt>set()</tt> it is possible to access file
    properties. Here is an example which prints the file sizes in addition to the name: </p>
    <table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
      <tr>
        <td WIDTH="30" VALIGN="TOP"></td>
        <td><pre>
#include &lt;iomanip&gt;
#include &lt;iostream&gt;
#include &lt;boost/directory.h&gt;

int main(int ac, char *av[])
{
  if (ac == 2)  // expect exactly one argument: the directory name
  {
    using namespace boost::filesystem;

    // print the size, right-aligned in 10 columns, followed by the entry name
    for (dir_it it(av[1]); it != dir_it(); ++it)
      std::cout &lt;&lt; std::setw(10) &lt;&lt; get&lt;size&gt;(it)
                &lt;&lt; &quot; &quot; &lt;&lt; *it &lt;&lt; &quot;\n&quot;;
  }
  return 0;
}
    </pre>
        </td>
      </tr>
    </table>
    <p>Each property consists of two major components: <ul>
      <li>A <tt>struct</tt> which gives the name to the property and which defines the type
        accessed using the property. The type of the property is defined using a <tt>typedef</tt>
        defining the type <tt>value_type</tt> in the corresponding <tt>struct</tt>. For the
        standard properties, the corresponding <tt>struct</tt>s are defined in the namespace <tt>boost::filesystem</tt>.
      </li>
      <li>Access functions which are just specializations of the functions <tt>boost::filesystem::get()</tt>
        and <tt>boost::filesystem::set()</tt>. Of course, if the property can only be read or only
        be written, only the corresponding access function is defined. </li>
    </ul>
    <a NAME="Example Property"><h3>Example Property</a></h3>
    <p>The <tt>size</tt> property used in the above example might be defined as follows: </p>
    <table BORDER="0" CELLSPACING="0" CELLPADDING="0" COLS="2">
      <tr>
        <td WIDTH="30" VALIGN="TOP"></td>
        <td><pre>
namespace boost {
  namespace filesystem {
    struct size
    {
      typedef size_t value_type;
    };

    template &lt;&gt;
    size::value_type get&lt;size&gt;(dir_it const &amp;it)
    {
      return ... /* environment specific code */
    }
  }
}
      </pre>
        </td>
      </tr>
    </table>
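    <p>On a POSIX system, the environment specific code elided above might be based on
    <tt>stat(2)</tt>. The following is only a sketch under that assumption: the names
    <tt>posix_info</tt> and <tt>read_posix_info</tt> are hypothetical and are not part of the
    library. It fills one record from a single <tt>stat()</tt> call, so that several properties
    can later be answered without calling the system again:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>
#include <sys/stat.h>

// Hypothetical record holding what a single stat() call reports.
struct posix_info
{
  std::size_t size;          // size of the file in bytes
  std::time_t mtime;         // time of last modification
  bool        is_directory;  // true if the entry is itself a directory
};

// Fills 'out' for 'path'. Returns false, leaving 'out' untouched,
// if stat() fails (e.g. the path does not exist).
bool read_posix_info(std::string const& path, posix_info& out)
{
  struct stat st;
  if (stat(path.c_str(), &st) != 0)
    return false;

  out.size         = static_cast<std::size_t>(st.st_size);
  out.mtime        = st.st_mtime;
  out.is_directory = S_ISDIR(st.st_mode);
  return true;
}
```

    <p>For example, <tt>read_posix_info(".", info)</tt> is expected to succeed and to report a
    directory, while a path that does not exist makes the function return <tt>false</tt>.</p>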
    <p>The properties which are already provided by the implementation normally access some
    data structure internal to the <tt>dir_it</tt> objects to avoid multiple system calls. <a NAME="Details"></p>
    <h3>Details</a> </h3>
    <dl>
      <dt>Property Selection </dt>
      <dd>The file property to be accessed is selected using a template argument to the <tt>get()</tt>
        or <tt>set()</tt> function. The template argument is a type which defines the type <tt>value_type</tt>
        as a nested type. The <tt>get()</tt> and <tt>set()</tt> functions are specialized for the
        properties provided by the system. By specializing additional versions of these functions,
        the user may extend the set of accessible properties. </dd>
      <dt>Property Type </dt>
      <dd>The type of a file property is determined from a <tt>typedef</tt> called <tt>value_type</tt>
        in the type selecting the property. </dd>
      <dt>Reading a Property </dt>
      <dd>To read a file property, a <tt>dir_it</tt> is passed as argument to the template
        function <tt>boost::filesystem::get()</tt>. The template argument <tt>prop</tt> selecting
        the file property to be accessed is explicitly specified. The return type of
        the <tt>get()</tt> function is <tt>prop::value_type</tt>. </dd>
      <dt>Setting a Property </dt>
      <dd>To set a file property, a <tt>dir_it</tt> and the new value of the property are passed
        to the template function <tt>boost::filesystem::set()</tt>. The template argument <tt>prop</tt>
        selecting the file property to be accessed is explicitly specified. The type of the second
        argument to the <tt>set()</tt> function is <tt>prop::value_type const &amp;</tt>. </dd>
    </dl>
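    <p>The selection mechanism described above can be sketched in a self-contained form.
    Everything in the namespace <tt>sketch</tt> is hypothetical: the type <tt>entry</tt> merely
    stands in for <tt>dir_it</tt>, and the properties read and write plain data members instead
    of asking the system. The sketch only shows how a property <tt>struct</tt> with a nested
    <tt>value_type</tt> selects a specialization of <tt>get()</tt> or <tt>set()</tt>:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {
  // Hypothetical stand-in for a directory entry (not dir_it).
  struct entry
  {
    std::string name;
    std::size_t bytes;
    bool        hidden_flag;
  };

  // Property tags: each names a property and defines its value_type.
  struct size      { typedef std::size_t value_type; };
  struct is_hidden { typedef bool        value_type; };

  // Primary templates are only declared. Using a property without a
  // matching specialization therefore fails at link time.
  template <class Prop>
  typename Prop::value_type get(entry const& e);

  template <class Prop>
  void set(entry& e, typename Prop::value_type const& v);

  // 'size' can only be read, so only get() is specialized.
  template <>
  inline size::value_type get<size>(entry const& e)
  {
    return e.bytes;
  }

  // 'is_hidden' can be read and written in this sketch.
  template <>
  inline is_hidden::value_type get<is_hidden>(entry const& e)
  {
    return e.hidden_flag;
  }

  template <>
  inline void set<is_hidden>(entry& e, is_hidden::value_type const& v)
  {
    e.hidden_flag = v;
  }
}
```

    <p>A call such as <tt>sketch::get&lt;sketch::size&gt;(e)</tt> then has the return type
    <tt>sketch::size::value_type</tt>, and <tt>sketch::set&lt;sketch::is_hidden&gt;(e, true)</tt>
    takes its second argument as <tt>bool const &amp;</tt>.</p>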
    <a NAME="Standard Properties"><h3>Standard Properties</a></h3>
    <p>The organization of files differs considerably between systems. As a result, the
    sets of file properties defined on different systems vary. The property interface is
    chosen such that it is obvious how specific properties are accessed; only the names
    and the exact types remain system dependent. To enhance portability, some common file
    properties are always defined: <dl>
      <dt>is_directory </dt>
      <dd>A boolean read only property which can be used to determine whether a directory entry is
        itself a directory. </dd>
      <dt>is_hidden </dt>
      <dd>A boolean property indicating whether the file is &quot;hidden&quot;. By default, hidden
        files are not shown to the user. However, with appropriate options, these files may be
        shown anyway. On some systems, there is a special flag for the files which indicates that
        the file is hidden. On such systems this flag is a read/write property. On other systems,
        e.g. on POSIX systems, files starting with a dot (&quot;.&quot;) are considered to be
        hidden. On such systems this flag is a read only property. </dd>
      <dt>size </dt>
      <dd>A read only property of type <tt>size_t</tt> returning the size in bytes of a file. Note
        that the size returned is not necessarily identical to the number of characters retrieved
        from an <tt>ifstream</tt> created for this file: In text mode, some character sequences
        are replaced by single characters during reading. However, the number of characters in
        binary mode should normally match the size of the file. </dd>
      <dt>mtime </dt>
      <dd>A property of type <tt>time_t</tt> returning the last modification time of the
        file. It is at least readable on every system. On some systems, e.g. POSIX, it is also
        possible to write this property to set the modification time to an arbitrary value. </dd>
    </dl>
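    <p>The POSIX convention for <tt>is_hidden</tt> mentioned above depends only on the name of the
    entry, which is why the property is read only there. A minimal sketch of that rule, using the
    hypothetical helper name <tt>posix_is_hidden</tt> (not part of the library):</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: an entry counts as hidden if its name starts with a dot.
// Under this rule "." and ".." are reported as hidden as well.
inline bool posix_is_hidden(std::string const& name)
{
  return !name.empty() && name[0] == '.';
}
```

    <p>Changing the result would require renaming the file, so no <tt>set()</tt> counterpart is
    sketched here.</p>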
    <a NAME="POSIX Properties"><h3>POSIX Properties</a> <a NAME="WinNT Properties"></h3>
    <h3>WinNT Properties</a> <a NAME="Future Directions"></h3>
    <h1>Future Directions</a></h1>
    <p>Computer systems contain structures other than the system's directories which can
    also be viewed as directories. An obvious example is an archive file which stores a copy of
    a directory hierarchy, like a ZIP or tar file. It might be useful to extend the class <tt>dir_it</tt>
    to treat such structures as directories too and to add support for iterating over
    them. </p>
    <p>A potential approach might be the definition of a CORBA interface which is used
    internally by the class <tt>dir_it</tt> to determine directory entries and to figure out
    whether an entry is itself a directory. This way it would even be possible to extend what is
    considered to be a directory and have the same class iterate over very different
    structures. </p>
    <p>Whether this approach is reasonable will have to be evaluated in the future.
    Personally, I think this is an interesting direction and I hope that I will find time to
    test this in the near future. <a NAME="See Also"></p>
    <h1>See Also</a></h1>
    <p>POSIX: opendir(3), readdir(3), closedir(3), stat(2) <br>
    Standard Template Library: Input Iterator Requirements </p>
    <hr>
    <p><a HREF="http://www.claas-solutions.de/kuehl">Dietmar K&uuml;hl</a>
    &lt;dietmar.kuehl@claas-solutions.de&gt;<br>
    </font></td>
  </tr>
</table>
</body>
</html>