File: HACKING

package info (click to toggle)
monotone 1.1-4+deb8u2
  • links: PTS, VCS
  • area: main
  • in suites: jessie
  • size: 20,664 kB
  • ctags: 8,113
  • sloc: cpp: 86,443; sh: 6,906; perl: 924; makefile: 838; python: 517; lisp: 379; sql: 118; exp: 91; ansic: 52
file content (598 lines) | stat: -rw-r--r-- 26,288 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
Contributing to monotone
========================

This file gives a number of basic guidelines and tips for hacking on
monotone.  For more specific topics (for instance, documentation on
writing tests, or the release process), see also the files in the
notes/ directory.


Coding standards
----------------

Code is largely formatted according to the GNU coding standards, but there
are minor deviations.  Where the coding style differs from the standard
please follow the coding style of the particular file you're making changes
to so that formatting consistency is retained within that file.

All source indentation should use two space characters per indentation
level.

Never use tab characters to indent code.  Always use spaces.  Teach
your editor to do the same.

The appropriate Emacs modeline to use for editing source is:

-*- mode: C++; c-file-style: "gnu"; indent-tabs-mode: nil; c-basic-offset: 2 -*-

And something close (but not perfect) for VIM (requires "set cindent"):

vim: et:sw=2:sts=2:ts=2:cino=>2s,{s,\:s,+s,t0,g0,^-2,e-2,n-2,p2s,(0,=s:

The appropriate astyle options are:

astyle --style=gnu -s2 -p

monotone's source includes third party code and libraries.  These have
been imported from upstream, and should retain the original coding style of
the particular library in all cases.  This makes life easier when a
developer needs to send our fixes upstream or wants to import new upstream
versions into monotone.

There is a special header file, src/base.hh, that should be the very
first #include in every .cc file, without exception.  It makes a small
number of inclusions and declarations that we want to be globally
visible.  Do not include this header in any other header file;
however, in header files, assume the contents of base.hh are already
visible.  "make distcheck" will verify that these rules are followed.
If you modify src/base.hh, make sure you keep the "util/audit-includes"
script consistent with it.  Try not to add things to base.hh; it's intended
to be a minimal set of declarations that really do need to be visible
everywhere.


Formatting messages
-------------------

For short messages (single sentence error, warning, info and progress
messages), we follow the GNU coding standards.  However, those are not
quite enough, and we have developed an extended set of rules to follow
for messages in english:

  - put single quotes ('') aroung file names and paths, option names,
    tag names, branch names, command names and commands, as well as
    any %s that would be expanded to one of those.
    There is an exception, though: if the quoted item is cited after a
    colon, either as the last part of the message or on its own line.
    For example, the following doesn't look very good:

        To apply the changes relative to one of its parents, use:
          'mtn pluck -r PARENT -r 06c3...'

    The following looks better

        To apply the changes relative to one of its parents, use:
          mtn pluck -r PARENT -r 06c3...

  - use a Capital first letter in command abstract and description
    strings.  Do not put a period at the end of the abstract, but do
    so with the description.  The following would be very wrong:

        CMD(foo, "foo", "", CMD_REF(foos), "",
            N_("does the foo thing."),
            N_("in order to do the foo thing, this allows you to join "
               "the foo fighters"),

    The following is what it's supposed to look like:

        CMD(foo, "foo", "", CMD_REF(foos), "",
            N_("Does the foo thing"),
            N_("In order to do the foo thing, this allows you to join "
               "the foo fighters."),

  - just as with error messages, use a lowercase first letter in
    descriptions of options and do not end it with a period.  The
    following would be wrong:

        SIMPLE_OPTION(bar, "bar", bool,
                      gettext_noop("Have the remaining figures go to "
                                   "the nearest bar."))

    The following is better:

        SIMPLE_OPTION(bar, "bar", bool,
                      gettext_noop("have the remaining figures go to "
                                   "the nearest bar"))

  - for multi-sentence error messages, use a lowercase first letter,
    but have the second sentence and on start with a Capital letter,
    and punctuate between sentences.  The following is an example that
    does not follow this rule, and is quite unreadable:

      E(!candidates.empty(), origin::user,
        F("your request matches no descendents of the current revision\n"
          "in fact, it doesn't even match the current revision\n"
          "maybe you want something like --revision=h:%s")
        % app.opts.branch);

    The proper way to write this message is:

      E(!candidates.empty(), origin::user,
        F("your request matches no descendents of the current revision.\n"
          "In fact, it doesn't even match the current revision.\n"
          "Maybe you want something like --revision=h:%s")
        % app.opts.branch);

  - proper names, acronyms, function names and variable names must
    follow their proper capitalization rules.  For example, "Botan"
    and "I/O" must be written that way.

  - "monotone" and "mtn" are proper names with a lower case character,
    and should always be written with a lower case first character,
    even though language rules or the rules written here would say
    otherwise.  If you find that troublesome, you might want to
    consider rephrasing so the sentence doesn't start with "monotone"
    or "mtn".

  - Please consider that the user has to understand the messages, even
    though they might not know all the inner details of how monotone
    works, and might not have a programmer's mind.  For example, the
    following message was hard to understand:

        could not query default database locations

    It has been replaced with the following (which will at least make
    it clear that this is likely to be a configuration problem and not
    some inner weirdness in monotone):

        no default database location configured

PLEASE APPLY COMMON SENSE.  There are message that will deviate from
these rules.  They should be very few, however.


Translation
-----------

For translators, their language may have a different set of rules for
things such as quoting (in the french translation, «_ and _» are used
instead of single quotes, for example) and capitalization.  Please use
your best judgement and be consistent.  Still, do not capitalize
"monotone" or "mtn".

When working on your languages, please do not update any language file
but your own, as it can be quite annoying for other translators to
have to deal with merging the changes generated with their current
work.  The best way to deal with this is by updating your language
files using one of the following methods:

    make ALL_LINGUAS={lang} update-po update-gmo

    make po/{lang}.po-update po/{lang}.gmo

(replace {lang} with your language code)


Dialect issues
--------------

C++ is a big language with a lot of dialects spoken in different
projects.  monotone is no exception; we would prefer submissions
continued to adhere to the dialect we write in. in particular:


  - try to stick to simple functions and data types; don't make big
    complicated class hierarchies with a lot of interdependencies.

  - avoid pointers whenever possible. if you need to use a handle to a
    heap allocated object, use a shared_ptr, scoped_ptr, weak_ptr, or
    the like.

  - if a value has a clearly defined well-formedness condition,
    encapsulate the value in an object and make the constructors
    and mutators check the condition. see vocab.hh for string-valued
    examples.

  - in general, try to express both issues and semantically rich
    names in the type system. use typedefs to make names more
    meaningful.

  - avoid returning values, especially large ones. instead, pass a
    value as a parameter, by reference, clear it and assign into
    it. this style generally produces fewer invocations of copy
    constructors, and makes it natural to add multiple, named output
    parameters from a function.

  - use invariants and logging liberally.

  - make everything as const-correct as possible. make query methods
    on objects const. make the input parameters to a function const.
    the word "const" comes after the thing which is constant; for
    example write "int const" rather than "const int".

  - separate pointer and reference types with space. write "int * foo"
    rather than "int *foo" or "int* foo".

  - if you have a const, put that _after_ the main type, but before
    any * or &.  Example: "set<revision_id> const & foo".

  - if you are declaring a static array, be aware of a quirk of the
    language: "foo const * arrayname[]" declares a _non-constant_
    array of pointers to constant foo.  if the array itself will never
    be modified (true in most cases) you should put a second "const"
    modifier _after_ the *.  this most often comes up with arrays of
    string constants: write "char const * const arrayname[]" unless
    you really mean the array to be modifiable.

  - when defining a function, put a carriage return right before the
    function name, so that the visibility and return type go on the
    preceeding line. this makes it possible to grep for "^functionname"
    to find the function definition without its uses.

  - use enums rather than magic constants, if you can.

  - magic constants, when needed, go in constants.{cc,hh}.

  - generally avoid the preprocessor unless you have a matter which is
    very difficult to express using inlines, reference args,
    templates, etc.

  - avoid indexing into an array or vector with []. use idx(vec,i),
    and you will get a nicely informative runtime error when you
    overflow rather than a crash.

  - avoid recursion, so that you will not overflow the stack on large
    work sets. use a heap-allocated std::deque<> for breadth-first
    recursion, std::stack<> for depth-first.

  - generally avoid anything which can generate a SEGV, as it's an
    uninformative error. prefer errors which say what went wrong.

  - do not use "using namespace <foo>" anywhere.  especially do not
    use "using namespace std".

  - in .cc files, it is preferred to put "using std::foo" at the top
    for each foo used in the file rather than put "std::" in front of
    all the uses.  this is also the preferred style for symbols from
    other namespaces when they are used frequently.

  - do not put any "using" declarations in .hh files; use the fully
    qualified name everywhere necessary.

  - .hh files should include everything that is necessary to parse all
    of their declarations, but strenuous efforts should be made to
    keep the number of nested includes to a minimum.  wherever
    possible, use forward declarations (struct foo;) and similar
    techniques.

  - <iostream> deserves special mention.  including this file causes
    the compiler to emit static constructors to ensure that the
    standard streams are initialized before their first use.
    therefore, do not include <iostream> unless you actually refer to
    one of the standard streams (cin, cout, cerr, clog).  use <iosfwd>,
    <istream>, <ostream>, <fstream>, etc instead, as appropriate.  do not,
    under any circumstances, refer to the standard streams in a header file.

  - it is almost always a mistake to use std::endl instead of '\n'.
    std::endl writes a newline to the stream *and flushes it*.
    in monotone this is basically only appropriate when one needs to
    resynchronize clog and cout (ui.cc:clear_line()), or recover from
    disabling terminal echo (read_password()).  note that it is never
    necessary to use endl with cerr, as cerr is unbuffered.

  - prefer character constants ('x') to single-character strings ("x") in
    formatted output sequences.


Interfacing with Lua
--------------------

monotone is extended with hooks written in the Lua language. Code that
interfaces with Lua must obey the following rules:

  - Lua arrays (tables with numeric indices) are 1-indexed. This is not
    mandated by the language per se, but the standard libraries assume that
    and using the arrays otherwise may break hooks that use standard
    functions.


Test suites, and writing test cases
-----------------------------------

monotone includes a number of unit and integration tests.  These can be run
easily by initiating a 'make check'.  The test suite (or, at least, any
tests potentially affected by your change) should be run before you
distribute your changes.

Automated build bots run the complete test suite on a regular basis, so
any problems will be noticed quickly, but it is still faster to find and fix
any problems locally rather than waiting for the build bots to alert the
development team of a problem.

All changes to monotone that alter monotone's behaviour should include a new
test.  This includes most changes, but use your judgment about adding tests
for very small changes.  The tests are located in the source tree in the
test/ directory, documentation on writing tests is available in
notes/README.testing.

When fixing a bug, check for an existing test case for that bug and
carefully observe the test case's behaviour before and after your fix. If no
test case exists, it is strongly recommended to write a test case before
attempting to fix the bug.

Tip: if the unit tests are failing, the quickest way to find the
problem is to search the output for the regexp \([0-9]+\) -- i.e., a
number in parentheses.  Or, if using gdb, try setting a breakpoint on
theboost::unit_test::first_failed_assertion function (see
results_collector.hpp).


Documenting large, user visible or otherwise important changes
--------------------------------------------------------------

There are changes that are more than mere bug fixes.  Those should be
documented briefly in the NEWS file.  Please follow the examples from
earlier releases in that file.
Please add the entries in NEWS as part of the change in question, or
as soon after the change has been committed as possible.  That will
save the release master a lot of hassle at the time of release.


Single-character macros
-----------------------

These are very convenient once you get used to them, but when you
first look at the code, it can be a bit of a shock to see all these
bare capital letters scattered around.  Here's a quick guide.

   Formatting macros:
      F("foo %s"):
         create a formatting object, for display to the user.
         Translators will translate this string, and F() runs
         gettext() on its argument.  NB: this string should usually
         _not_ end in a newline.
      FP("%d foo", "%d foos", n) % n:
         create a formatting object, with plural handling.
         Same comments apply as to F().
      FL("foo %s"):
         create a raw ("literal") formatting object, mostly for
         use in logging.  This string will _not_ be translated, and
         gettext() is _not_ called.  This is almost always an argument
         to the L() macro.

   Informational macros:
      L(FL("foo")):
         log "foo".  Log messages are generally not seen by the
         user, and are used to debug problems with monotone.
      P(F("foo")):
         print "foo".  For generic informative messages to the user.
      W(F("foo")):
         warn "foo".  For warnings to the user.

   Assertion macros (see also the next section).  These all cause
   monotone to exit if their condition is false:
      I(x != y):
         "invariant" -- if the condition is not true, then there
         is a bug in monotone.
      E(x != y, origin::user, F("x and y are not equal")):
         "error" -- a general error. The second argument is used to
         give the system a hint what kind of error that is. One of the
         origin types in origin_type.hh is possible.

   Tracing macros:
      MM(x):
         Mark the given variable as one of the things we are looking
         at right now.  On its own, this statement has no visible
         effect; but if monotone crashes (that is, an I() check fails)
         while MM(x) is in scope, then the value of x will be printed
         to the debug log.  This is quite cheap, so feel free to
         scatter them through your code; this information is _very_
         useful when trying to understand crashes, especially ones
         reported by users, that cannot necessarily be reproduced.
         There are some limitations:
          -- the object passed to MM must remain in scope as long as the
             MM does.  Code like
                MM(get_cool_value())
             will probably crash!  Instead say
                cool_t my_cool_value = get_cool_value();
                MM(my_cool_value);"
          -- You can only use MM() once per line.  If you say
                MM(x); MM(y);
             you will get a compile error.  Instead say
                MM(x);
                MM(y);
          -- The object passed to MM() must have a defined "dump"
             function.  You can easily add an overload to "dump" to
             support new types.


"Application state" objects
---------------------------

There are nine object types which hold a substantial portion of the
overall state of the program.  You will see them frequently in
argument lists.  Most, but not all, of these are allocated only once
for the entire program.

Because many functions take some of these objects as arguments, we
have a convention for their position and order: all such arguments
appear first within the overall argument list, and in the order of
the list below.

 * "app_state" is being phased out; it used to be an umbrella object
   carrying almost all the state of the program, with sub-objects of
   the types listed below.  Most of those are now allocated
   separately, but the options and lua_hooks objects are still under
   the umbrella.  Also, there are a very few operations that are still
   app_state methods.  Do not introduce new functions which take an
   app_state argument, unless there is no alternative.

 * "options" holds information from all of the command-line options.
   It does *not* record the non-option command line arguments.  Some
   of its fields may default to other information sources as well.

   To the maximum extent practical, "options" objects should not
   appear in function arguments.  Instead, pass down specific fields
   that are relevant to the lower-level code.

 * When adding new options, separate words with dash "-" not
   underscore "_"; dash is easier to type.

 * "lua_hooks" holds the Lua interpreter handle and all the associated
   state, in particular all the hook functions that the user may
   override.  It is, unfortunately, not possible to pass around single
   hook functions, so any C++ function that (transitively) calls some
   hook must get the lua_hooks object somehow.

 * There are three types that encapsulate the database of revisions at
   different levels of abstraction.  No function should take more than
   one of the following types.

     - "project_t" represents a development project within the
       database, that is, a database plus a set of branch names and
       trust decisions.

     - "database" represents the database as a whole, at a level where
       trust decisions are irrelevant.  At present, the database
       object does do some trust checking and has responsibility for
       all public key operations (signature checks and nonce
       encryption); these may be moved to the project object in the
       future.

     - "sqlite3" is the raw SQLite library handle.  Some very
       low-level internal functions use this instead of a database
       object.  Introducing more of them is to be avoided.

 * "node_id_source" is not really a top-level state object, but if a
   function takes one of them, it goes right after the database in the
   argument list.

 * "key_store" holds the user's private keys, and is responsible for all
   private key operations (creating signatures and decrypting nonces).

 * "workspace" is responsible for manipulating checked-out source trees.


Reporting errors to the user
----------------------------

monotone has a number of assertion macros available for different
situations.  These assertion macros are divided into two categories:
invariants and general errors.

Invariants assert that monotone's internal state is in the expected state.
An invariant failure indicates that there is a bug in monotone.  e.g.

    I(r_working.edges.size() == 1);

Error conditions handle most other error cases, where monotone is unable to
complete an operation due to an error, but that error is not caused by a bug
in monotone,  e.g.

    E(converted != NULL, origin::system,
      F("failed to convert string from %s to %s: '%s'") 
        % src_charset % dst_charset % src);

Each of these assertion macros are fatal and will cause an exception to be
thrown that will ultimately cause the monotone process to exit.  Exceptions
are used (rather than C-style abort() based assertions) so that the stack
will unwind and cause the destructors for objects allocated on the stack to
run.  This allows monotone to leave the user's database and workspace in
as logical a state as possible.  Any in-flight uncommitted database
transactions will be aborted.  Operations occurring on a workspace may
leave the workspace in an inconsistent state, as we do not have a way to
atomically update the workspace.


Patch submission guidelines
---------------------------

Register an account on our code forge at <https://code.monotone.ca> and go to
the "Issues" view of the "monotone" project listed there.  Then create a
separate issue for the patch you want to send; describe it briefly in the
issue's title and explain the reasoning more in detail in the description
text.  Afterwards append the patch file to the issue so we can review and
comment on it.  Please do not paste or upload it somewhere else, external
resources tend to expire from time to time and we would like to preserve
the information as they are as long as possible. Please do also not include
your patch inline in the issue's description, so its formatting is not
destroyed.

If you prefer old-style mail, send your patches to 'monotone-devel@nongnu.org'
with a subject beginning with '[PATCH]', again followed by a brief description of
the patch.  The body of the message should then also contain more information
with reasoning for why the changes are required.  Patches may be included inline
in a message, or attached as a text/plain, text/x-diff, or text/x-patch attachment.
Make sure your mailer does not mangle the patch (e.g. by wrapping lines in the
patch) before sending your patch to the list.

All changes to the monotone source require an accompanying commit message, so
please add a prepared commit message to your submission, as it will make our
lives a little easier.

Any changes that affect the user interface (e.g. adding command-line
options, changing the output format) or affect the documented behaviour of
monotone must include appropriate changes to the documentation.  Please also
think about adding unit and / or functional tests for bug fixes or new features
in your patch.

Finally, please review your patch prior to submission, to not include
accidental white-space-only changes or changes to the language
files. Usually you should revert po/*.po files before generating a
patch - unfortunately these are often changed when you build but do
not contain any reasonable changes. Alternatively restrict mtn diff to
the files you've actually changed.

The monotone development team will review and comment on all patches on a
best-efforts basis.  Patches are never ignored, but a patch may not be
discussed or applied immediately according to the amount of spare time the
developers have.  Don't be discouraged if you don't receive an immediate
response, and if you feel that your patch has slipped through the cracks,
post a follow up reminder message with a pointer to the original issue
or message to the mailing list archives.


Third-party code
----------------

monotone used to include a number of libraries directly in its source
tree for a long time.  Today we prefer to keep big external sources
outside of our tree, with the exception of our network library "Netxx"
which has been abandoned by its original author long ago and which has
also seen minor modifications from our side in the meantime.

So if you write new code that depends on third-party code or libraries,
then try to rely on packaged versions of this code as much as possible.
If this is impractical, for example because its only one file or even
a single method you need, check the copyrights of the code in question for
compatibility with the GPL that is used for monotone and add individual
copyrights to the AUTHORS file before you copy it into monotone's
source tree.

Also from time to time, bug fixes to third-party code are required.  These
fixes should be made in the monotone versions of the code first to solve the
immediate problem.  In all cases that make sense (i.e. for general bug
fixes), the change should also be sent to the upstream developers for
inclusion in the next release.

In a small number of cases, a change made to our local version of the
third-party code may not make sense to send upstream.  In this case,
make a note of this in the file you're changing and in your commit
message so that this permanent deviation is documented.


Compiling Hints
---------------

  - monotone's compilation time can be improved significantly by compiling
  with 'CXXFLAGS=-O0'. Note that disabling optimisation makes the resultant
  binary significantly slower - don't bother using it for performance
  profiling.

  - precompiled headers can be enabled by running 'configure' with --enable-pch
  This should give shorter compile times, given boost's extensive use of
  templates.  Some versions of gcc have issues with precompiled headers, so if
  you get strange compilation errors, try disabling them.

  - ccache (http://ccache.samba.org/) is quite effective for speeding up
  repeated compiles of the same source