File: paths.hh

#ifndef __PATHS_HH__
#define __PATHS_HH__

// Copyright (C) 2005 Nathaniel Smith <njs@pobox.com>
//
// This program is made available under the GNU GPL version 2.0 or
// greater. See the accompanying file COPYING for details.
//
// This program is distributed WITHOUT ANY WARRANTY; without even the
// implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
// PURPOSE.

// safe, portable, fast, simple path handling -- in that order.
// but they all count.
//
// this file defines the vocabulary we speak in when dealing with the
// filesystem.  this is an extremely complex problem by the time one worries
// about normalization, security issues, character sets, and so on;
// furthermore, path manipulation has historically been a performance
// bottleneck in monotone.  so the goal here is the efficient implementation
// of a design that makes it hard or impossible to introduce as many classes
// of bugs as possible.
//
// Our approach is to have three different types of paths:
//   -- system_path
//      this is a path to anywhere in the fs.  it is in native format.  it is
//      always absolute.  when constructed from a string, it interprets the
//      string as being relative to the directory that monotone was run in.
//      (note that this may be different from monotone's current directory,
//      because when run in a workspace, monotone chdir's to the project root.)
//
//      one can also construct a system_path from one of the below two types
//      of paths.  this is intelligent, in that it knows that these sorts of
//      paths are considered to be relative to the project root.  thus
//        system_path(file_path_internal("foo"))
//      is not, in general, the same as
//        system_path("foo")
//
//   -- file_path
//      this is a path representing a versioned file.  it is always
//      a fully normalized relative path, that does not escape the project
//      root.  it is always relative to the project root.
//      you cannot construct a file_path directly from a string; you must pick
//      a constructor:
//        file_path_internal: use this for strings that come from
//          "monotone-internal" places, e.g. parsing revisions.  this turns on
//          stricter checking -- the string must already be normalized -- and
//          is extremely fast.  such strings are interpreted as being relative
//          to the project root.
//        file_path_external: use this for strings that come from the user.
//          these strings are normalized before being checked, and if there is
//          a problem, trigger N() invariants rather than I() invariants.  if in
//          a workspace, such strings are interpreted as being
//          _relative to the user's original directory_.
//          if not in a workspace, strings are treated as referring to some
//          database object directly.
//      file_path's also provide optimized splitting and joining
//      functionality.
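//
//      a usage sketch of the two constructors (illustrative only; it
//      assumes monotone was started in the hypothetical subdirectory "src"
//      of a workspace):
//        file_path_internal("src/foo.c")     // already relative to root
//        file_path_external(utf8("foo.c"))   // as typed by the user in "src"
//      under that assumption, both should name the same versioned file.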
//
//   -- bookkeeping_path
//      this is a path representing something in the _MTN/ directory of a
//      workspace.  it has the same format restrictions as a file_path,
//      except instead of being forbidden to point into the _MTN directory, it
//      is _required_ to point into the _MTN directory.  the one constructor is
//      strict, and analogous to file_path_internal.  however, the normal way
//      to construct bookkeeping_path's is to use the global constant
//      'bookkeeping_root', which points to the _MTN directory.  Thus to
//      construct a path pointing to _MTN/options, use:
//          bookkeeping_root / "options"
//
// All path types should always be constructed from utf8-encoded strings.
//
// All path types provide an "operator /" which allows one to construct new
// paths pointing to things underneath a given path.  E.g.,
//     file_path_internal("foo") / "bar" == file_path_internal("foo/bar")
//
// All path types subclass 'any_path', which provides:
//    -- emptiness checking with .empty()
//    -- a method .as_internal(), which returns the utf8-encoded string
//       representing this path for internal use.  for instance, this is the
//       string that should be embedded into the text of revisions.
//    -- a method .as_external(), which returns a std::string suitable for
//       passing to filesystem interface functions.  in practice, this means
//       that it is recoded into an appropriate character set, etc.
//    -- an operator<< for ostreams.  this should always be used when writing
//       out paths for display to the user.  at the moment it just calls one
//       of the above functions, but this is _not_ correct.  there are
//       actually 3 different logical character sets -- internal (utf8),
//       user (locale-specific), and filesystem (locale-specific, except
//       when it's not, i.e., on OS X).  so we need three distinct operations,
//       and you should use the correct one.
//
//       all this means that when you want to print out a path, you usually
//       want to just say:
//           F("my path is %s") % my_path
//       i.e., nothing fancy necessary; for purposes of F() just treat it as
//       if it were a string.
//
//
// There is also one "not really a path" type, 'split_path'.  This is a vector
// of path_component's, and semantically equivalent to a file_path --
// file_path's can be split into split_path's, and split_path's can be joined
// into file_path's.
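//
// A sketch of that round trip, using only the declarations below (the
// variable names are hypothetical):
//     split_path sp;
//     file_path_internal("foo/bar").split(sp);
//     file_path rejoined(sp);
// 'rejoined' should then compare equal to the original file_path.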


#include <iosfwd>
#include <string>
#include <vector>
#include <set>

#include "vocab.hh"

typedef std::vector<path_component> split_path;

const path_component the_null_component;

inline bool
null_name(path_component pc)
{
  return pc == the_null_component;
}

bool
workspace_root(split_path const & sp);

template <> void dump(split_path const & sp, std::string & out);

// It's possible this will become a proper virtual interface in the future,
// but since the implementation is exactly the same in all cases, there isn't
// much point ATM...
class any_path
{
public:
  // converts to native charset and path syntax
  // this is a path that you can pass to the operating system
  std::string as_external() const;
  // leaves as utf8
  std::string const & as_internal() const
  { return data(); }
  bool empty() const
  { return data().empty(); }
protected:
  utf8 data;
  any_path() {}
  any_path(any_path const & other)
    : data(other.data) {}
  any_path & operator=(any_path const & other)
  { data = other.data; return *this; }
};

std::ostream & operator<<(std::ostream & o, any_path const & a);
std::ostream & operator<<(std::ostream & o, split_path const & s);

class file_path : public any_path
{
public:
  file_path() {}
  // join a file_path out of pieces
  file_path(split_path const & sp);

  // this currently doesn't do any normalization or anything.
  file_path operator /(std::string const & to_append) const;

  void split(split_path & sp) const;

  bool operator ==(const file_path & other) const
  { return data == other.data; }

  bool operator <(const file_path & other) const
  { return data < other.data; }

private:
  typedef enum { internal, external } source_type;
  // input is always in utf8, because everything in our world is always in
  // utf8 (except interface code itself).
  // external paths:
  //   -- are converted to internal syntax (/ rather than \, etc.)
  //   -- normalized
  //   -- assumed to be relative to the user's cwd, and munged
  //      to become relative to root of the workspace instead
  // both types of paths:
  //   -- are confirmed to be normalized and relative
  //   -- not to be in _MTN/
  file_path(source_type type, std::string const & path);
  friend file_path file_path_internal(std::string const & path);
  friend file_path file_path_external(utf8 const & path);
};

// these are the public file_path constructors
inline file_path file_path_internal(std::string const & path)
{
  return file_path(file_path::internal, path);
}
inline file_path file_path_external(utf8 const & path)
{
  return file_path(file_path::external, path());
}

class bookkeeping_path : public any_path
{
public:
  bookkeeping_path() {}
  // path _should_ contain the leading _MTN/
  // and _should_ look like an internal path
  // usually you should just use the / operator as a constructor!
  bookkeeping_path(std::string const & path);
  bookkeeping_path operator /(std::string const & to_append) const;
  // exposed for the use of walk_tree
  static bool is_bookkeeping_path(std::string const & path);
  bool operator ==(const bookkeeping_path & other) const
  { return data == other.data; }

  bool operator <(const bookkeeping_path & other) const
  { return data < other.data; }
};

extern bookkeeping_path const bookkeeping_root;
extern path_component const bookkeeping_root_component;
// for migration
extern file_path const old_bookkeeping_root;

// this will always be an absolute path
class system_path : public any_path
{
public:
  system_path() {}
  system_path(system_path const & other) : any_path(other) {}
  // the optional argument takes some explanation.  this constructor takes a
  // path relative to the workspace root.  the question is how to interpret
  // that path -- since it's possible to have multiple workspaces over the
  // course of the program's execution (e.g., if someone runs 'checkout'
  // while already in a workspace).  if 'true' is passed (the default),
  // then monotone will trigger an invariant if the workspace changes after
  // we have already interpreted the path relative to some other working
  // copy.  if 'false' is passed, then the path is taken to be relative to
  // whatever the current workspace is, and will continue to reference it
  // even if the workspace later changes.
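  // a usage sketch (illustrative only; 'fp' is a hypothetical file_path):
  //   system_path strict(fp);          // default: invariant if it changes
  //   system_path lenient(fp, false);  // no invariant on workspace change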
  explicit system_path(any_path const & other,
                       bool in_true_workspace = true);
  // this path can contain anything, and it will be absolutified and
  // tilde-expanded.  it will be considered to be relative to the directory
  // monotone started in.  it should be in utf8.
  system_path(std::string const & path);
  system_path(utf8 const & path);
  system_path operator /(std::string const & to_append) const;
};

void
dirname_basename(split_path const & sp,
                 split_path & dirname, path_component & basename);

void
save_initial_path();

system_path
current_root_path();

// returns true if workspace found, in which case cwd has been changed
// returns false if workspace not found
bool
find_and_go_to_workspace(system_path const & search_root);

// this is like change_current_working_dir, but also initializes the various
// root paths that are needed to interpret paths
void
go_to_workspace(system_path const & new_workspace);

typedef std::set<split_path> path_set;

// equivalent to file_path_internal(path).split(sp) but more efficient.
void
internal_string_to_split_path(std::string const & path, split_path & sp);

// Local Variables:
// mode: C++
// fill-column: 76
// c-file-style: "gnu"
// indent-tabs-mode: nil
// End:
// vim: et:sw=2:sts=2:ts=2:cino=>2s,{s,\:s,+s,t0,g0,^-2,e-2,n-2,p2s,(0,=s:

#endif