1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444
|
% -*- Mode: C++; tab-width: 2; -*-
% vi: set ts=2:
%
\documentclass[a4paper,10pt]{article}
\usepackage{a4wide}
\usepackage{times}
\title{BALL coding conventions}
\author{Oliver Kohlbacher}
\date{$ $Revision: 1.8.20.2 $ $}
\newtheorem{example}{Example}
\begin{document}
\maketitle
\section{General}
\subsection{Formatting}
{\bf All BALL files use a tab width of two.} Use the command {\tt set tabstop=2} in
{\tt vi} or {\tt set-variable tab-width 2} if you are using {\tt emacs}. The
standard file header should automatically set these values for {\tt vi} or
{\tt emacs} (see below).
\noindent
Curly braces should be put on their own line. {\bf Matching pairs of opening and closing curly braces should be
set to the same column. }
\begin{verbatim}
while (continue == true)
{
for (int i = 0; i < 10; i++)
{
...
}
...
if (x < 7)
{
....
}
}
\end{verbatim}
\noindent
{\bf Always use braces around a block.} The main reason for this rule is to avoid constructions like:
\begin{verbatim}
if (isValid(a))
return 0;
\end{verbatim}
\noindent
which might be changed to something like
\begin{verbatim}
if (isValid(a))
error = 0;
return 0;
\end{verbatim}
\noindent
The resulting errors are hard to find. There are two ways to avoid these
problems: (a) always use braces around a block (b) write everything in a single
line. We recommend method (a).
However, this is mainly a question of personal style, so no explicit checking
is performed to enforce this rule.
\section{Class requirements}
Each BALL class has to provide the following minimal interface:
\begin{verbatim}
/** Class documentation...
*/
class Test
{
public:
// clone method (implemented automatically with the CREATE macro)
CREATE(Test)
// default ctor
Test();
// copy ctor
Test(const Test& test);
// destructor
virtual ~Test();
// assignment operator
Test& operator = (const Test& test);
};
\end{verbatim}
\section{Naming conventions}
\subsection{Variable names}
{\bf Variable names are all lower case letters.} Distinguished parts of the name are
separated using underscores ``{\tt \_}''. If parts of the name are derived
from common acronyms (e.g. PDB) they should be in upper case.\\
Private or protected member variables of classes are suffixed by an
underscore.\\
No prefixing or suffixing is allowed to identify the variable type - it
leads to completely illegible documentation and overly long variable names.\\
\begin{example}\hspace*{2mm}\\
{\tt String PDB\_atom\_name;} (using an abbreviation)\\
{\tt int counter\_;} (private class member)\\
{\tt bool is\_found;} (ordinary variable)
\end{example}
\subsection{Class names/type names}
{\bf Class names and type names always start with a capital letter.} Different parts
of the name are separated by capital letters at the beginning of the word. No
underscores are allowed in type names and class names, except for the names of
protected types and nested classes which are suffixed by an underscore.
\begin{example}\hspace*{2mm}\\
{\tt class PDBFile}\\
{\tt class ForwardIteratorTraits\_} (private nested class definition)\\
\end{example}
\subsection{File names}
{\bf Header and source files should be named after the class they contain.}
Each file should contain a single class only, although exceptions are possible
for extreme light-weight classes. The file names start with a lower-case
letter (for historical reasons). An exception to the rule are file names starting
with common acronyms. Hence, the file containing the {\tt AtomContainer} class
would be named {\tt atomContainer.C}. The file containing the {\tt PDBRecord} class
would be named {\tt PDBRecord.C}.
\subsection{Function names/method names}
{\bf Function names (including class method names) always start with a lower case
letter.} Parts of the name are separated using capital letters (as are class
and type names). The argument of {\tt void} functions is omitted in
the declaration and the definition. If function arguments are pointers or
references, the pointer or reference qualifier is appended to the variable
type (without separating blanks). It should not prefix the variable name.\\
The variable names used in the declaration have to be the same as in the
definition. They should be as short as possible, but nevertheless
comprehensible. If arguments are not used in the implementation of the
function, they have to be commented out (to avoid compiler warnings).
\begin{example}\hspace*{2mm}\\
{\tt int countAtoms()}\\
{\tt bool isBondedTo(const Atom\& atom)}\\
{\tt bool doSomething(int i, String\& /* name */)}
\end{example}
\section{Documentation}
\subsection{Doxygen}
{\bf Each BALL class has to be documented using Doxygen.} The documentation is
inserted in Doxygen format in the header file in which the class is defined.
Documentation includes the description of the class, of each method, type
declaration, enum declaration, constant, and member variable.
\subsection{Usage of groups in the documentation}
In order to obtain a structured and comprehensible documentation, methods are
grouped by their function. Each class usually uses one or more of the
following categories to group its methods:
\begin{itemize}
\item {\bf Type Definitions:}\\
This group should contain all {\tt typedef}s of the class
(except for private or protected ones).
\item {\bf Exceptions:}\\
All class specific exceptions are defined as nested classes in this group.
\item {\bf Enums:}\\
This group usually contains enums or constants defined via enums such as
properties.
\item {\bf Constructors and Destructor:}\\
This group includes all constructors and the destructor as well as related
methods such as {\tt clear} or {\tt destroy}.
\item {\bf Persistence:}\\
This group is present in each class derived from {\tt PersistentObject}
and always contains the two methods {\tt persistentWrite} and {\tt
persistentRead}.
\item {\bf Converters:}\\
If an object is convertible into another class an explicit
converter should be defined inside this group.
\item {\bf Accessors:}\\
Accessors are all methods that access or modify attributes of the
class.
\item {\bf Predicates:}\\
Predicates are functions returning a boolean value. The names of
predicates should start with {\tt is} or {\tt has} if applicable.
Comparison operators are also predicates and should appear inside
this group.
\item {\bf Debugging and Diagnostics:}\\
This group should contain all methods used for the debugging of the class
(for example the {\tt dump} methods).
\item {\bf Attributes:}\\
This group should contain all public class attributes. Protected and
private attributes should be documented in the corresponding groups
{\bf Protected Attributes} or {\bf Private Attributes}. However, private
members do not occur in the documentation by default.
\end{itemize}
These section names are intended as a rule-of-thumb only. There are many cases
where other section headings are more intuitive. However, if any of the above
sections apply, they should be used. The sections should also appear in this
order.
\subsection{Commenting code}
The code for each .C file has to be commented. Each piece of code in BALL has
to contain at least {\bf 5\% of comments}. The use of {\tt //} instead of C style
comments ({\tt /* */}) is recommended to avoid problems arising from nested
comments. Comments should be written in plain English and describe the
functionality of the next few lines.
To check whether a header or source file fulfills this requirement, use the
script {\tt check\_coding} in {\tt source/config}. Invoke {\tt check\_coding} with the
filename to check (or a list of files) and it will print the percentage of
comments (including Doxygen comments) for each file.
\subsection{Standard header}
{\bf BALL uses the Concurrent Version System (CVS)} to manage different
version of the
source files. For easier identification, {\bf each BALL source file starts
with a header containing the CVS ID string in a comment.} Any new BALL header
or source file should start with this header:
\begin{verbatim}
// -*- Mode: C++; tab-width: 2; -*-
// vi: set ts=2:
//
// $Id: coding_style.tex,v 1.8.20.2 2007/08/08 07:17:22 toussaint Exp $
//
\end{verbatim}
In non-C++ files (Makefiles, TeX-Files, etc.) the C++ comments are replaced
by the respective comment characters (e.g. ``\#'' for Makefiles).
\subsection{Exception handling}
No BALL class method or function should lead to the termination of a program
in case of a detectable error. In particular, if BALL classes are embedded
for scripting, robustness is an important topic. The recommended
procedure to handle even fatal errors is to throw an exception. Uncaught
exceptions will result in a call to {\tt abort} thereby terminating the
program while still giving the programmer a chance to handle them graciously.\\
All exceptions used in BALL are derived from {\tt GeneralException}
defined in {\bf COMMON/exception.h}. A default constructor should not be
implemented for these exceptions. Instead, the constructor of all derived
exceptions should have the following signature:
\begin{verbatim}
AnyException(const char* file, unsigned long line);
\end{verbatim}
Additional arguments are possible but should provide default values (see
{\tt IndexOverflow} for an example).\\
The {\tt throw} directive for each exception should be of the form
\begin{verbatim}
throw AnyException(__FILE__, __LINE__);
\end{verbatim}
to simplify debugging. {\tt \_\_FILE\_\_} and {\tt \_\_LINE\_\_} are preprocessor
macros filled in by the compiler. {\bf GeneralException} provides two methods
({\tt getFile} and {\tt getLine}) that allow the localization of the
exception's cause.\\
\subsection{Persistence}
Object persistence is a serious problem in C++. There are mainly three
concepts that are responsible for the trouble:
\begin{enumerate}
\item multiple inheritance from a common base class leading to virtual
inheritance
\item static member variables
\item void pointers
\end{enumerate}
To start at the end, {\em void pointers} are very simple to handle: avoid them!
Pointers are only valid inside a single instance of a program, they have no
meaning for a different instance or even another machine. Any code dealing
with void pointers cannot be made persistent.
{\em Static member} variables can cause trouble, if the behaviour of a class depends
on the state of this variable. There is no general rule on how to handle these
variables. However, in most cases everything works fine if you simply ignore
them. The only case where a static member variable had to be considered when
implementing object persistence for the BALL kernel classes, was the internal
counter of the class {\tt Object}. This counter is incremented by one for each
instance of {\tt Object} created. It is used to uniquely identify objects and
to define a linear order on these objects (this is the object's handle).
If this handle had been treated as other variables when reading/writing persistent objects,
reading an object could have lead to object handles occuring twice.
Persistent reading for objects of type {\tt Object}
was therefore implemented to ignore the value. Objects that are read get handles according
to the order in which they are read and created. This leads to a different order
of the handles but prevents handles from occuring twice.
Whether static variables have to be read/written or not has to be
decided on a case-by-case basis.
{\em Multiple inheritance} is the most serious problem.
All persistent objects have to be derived from exactly {\em one} persistent
object class - in this case {\tt Composite}. The support classes like {\tt
PropertyManager} or {\tt Selectable} are not derived from {\tt
PersistentObject}, but are a {\em model of storable}.
A class which is a model of storable cannot be read from a persistent stream
if it is not part or base class of a persistent object. This is due to the fact
that the persistence manager needs a well-defined interface to handle the
objects it stores and retrieves. This interface is implemented in the class
{\tt PersistentObject}.
Thus, in order to provide for object persistence
the definition of two methods in the storable class' interface is required:
{\tt read(PersistenceManager\& pm)} and {\tt write(PersistenceManager\& pm)}.
These methods are quite similar to {\tt persistentWrite} and {\tt
persistentRead}.
\section{Testing}
\subsection{General}
Testing is crucial to verify the correctness of the library -- especially when
using C++. But why has it to be {\em so} complicated, using all these macros
and stuff? C++ compilers are strange beasts and there are no two accepting
the same code. Since one of the main concerns of BALL is portability,
we have to ensure that every single line of
code compiles on all platforms. Due to the long compilation times and the
(hopefully in the future) large number of different platforms, tests to verify the
correct behaviour of all classes have to be carried out automatically. This
implies a well defined interface for all tests, which is the reason for all
these strange macros. This fixed format also enforces the writing of complete
class tests. Usually a programmer writes a few lines of code to test the parts
of the code he wrote for correctness. Of the methods tested after the
introduction of the test macros, about a tenth of all functions/methods showed
severe errors after thorough testing. Most of these errors did not occur an
all platforms or did not show up on trivial input.
Writing tests for {\em each} method of a class also ensures that each line of code is
compiled. When using class templates the compiler only compiles the methods
called. Thus it is possible that a code segment contains syntactical errors
but the compiler accepts the code happily -- it simply ignores most of the
code. This is quickly discovered in a complete test of all methods. The same
is true for configuration dependend preprocessor directives that stem from
platform dependencies. Often untested code also hides inside the {\tt const}
version of a method, when there is a non-const method with the same name and
arguments (for example most of the {\tt getName} methods in BALL). In most
cases, the non-const version is preferred by the compiler and it is usually
not clear to the user which version is taken. Again, explicit testing of each
single method provides help for this problem.
The ideal method to tackle the problem of untested code is the complete
coverage analysis of a class. Unfortunately, this is only supported for very
few compilers, so it is not used for testing BALL.
Writing the test program is a wonderful opportunity to verify and complete the
documentation! Often enough implementation details are not clear at the time
the documentation is written. A lot of side effects or special cases that were
added later do not appear in the documentation. Going through the
documentation and the implementation in parallel is the best way to verify the
documentation for consistence and (strange coincidence?!) the best way to
implement a test program, too!
\subsection{Structure of a test program}
{\bf Each BALL class has to provide its own test program.} This test program has to check
each method of the class. The test programs reside in the directory
{\tt BALL/source/TEST}. To create a new test program, rename a copy of the file
{\tt Skeleton\_test.C} to the new class test name (usually {\tt
<classname>\_test.C}). The test program has to be coded using the class test
macros as described in the BALL online reference. These macros are
defined in {\tt CONCEPT/classTest.h}. Special care should be taken
to cover all special cases (e.g. what happens, if a method is called with
empty strings, negative values, zero, null pointers etc.).
\subsection{Macros to start, finish and evaluate tests}
\begin{itemize}
\item {\tt START\_TEST(class\_name, version)}\\
{\tt END\_TEST}\\
The {\tt START\_TEST} macro starts the tests for class {\tt class\_name}. This should occur just
\emph{once} in a test program, as it contains the definition of the {\tt main}
function. Similarly, {\tt END\_TEST} defines the terminal section of the test program.
\item {\tt CHECK(name)}\\
{\tt RESULT}\\
The {\tt CHECK} macro starts a subtest for a specific method of the class. The full
signature of the class should be given as {\tt name}.
Variables defined after this macro are scoped and valid until the next
{\tt RESULT} only. This {\tt RESULT} also evaluates success or failure of the test.
\item {\tt ABORT\_IF(condition)}\\
This macro aborts the subtest, i.e. it jumps forward to the next {\tt
RESULT} macro and marks the subtest as failed. This macro is usually employed
to prevent the execution of certain tests if a necessary precondition was
not met and the execution would hence result in a crash.
\item {\tt STATUS(...)}\\
This macro can be used to print the current status from within a subtest
and is useful for debugging tests. The macro will print its argument to the
console if the test is run with "-V" as an argument and stay silent
otherwise. Example: {\tt STATUS("result: " << result)}.
\end{itemize}
\subsection{Comparison macros}
\begin{itemize}
\item {\tt TEST\_EQUAL(a, b)}\\
Check that {\tt a} and {\tt b} are equal (using an appropriate {\tt operator ==}).
\item {\tt TEST\_NOT\_EQUAL(a, b)}\\
Check that {\tt a} and {\tt b} are \emph{not} equal (again, using an
appropriate {\tt operator ==}, not the {\tt operator !=}). This is helpful, if
you know what a method returns upon \emph{unsuccessful} execution, e.g.,
if you want to make sure it did not return a NULL pointer.
\item {\tt TEST\_REAL\_EQUAL(a, b)}\\
{\tt PRECISION(delta)}\\
The {\tt TEST\_REAL\_EQUAL(a, b)} macro checks if {\tt a} and {\tt b}
are equal within a defined tolerance {\tt delta}.
The tolerance is defined via the {\tt PRECISION} macro.
{\tt a} and {\tt b} have to be convertible to double.
\item{\tt NEW\_TMPFILE(filename)}\\
{\tt TEST\_FILE()}\\
{\tt TEST\_FILE\_REGEX()}\\
The {\tt NEW\_TMPFILE} macro creates a new temporary filename, which is then assigned
to {\tt filename}. Using these temporary filenames is recommended, as they
are deleted after running the test (except if run in verbose mode).
After having written something to a file (e.g. while testing output methods), the
contents of this file can be compared to a template file using the {\tt
TEST\_FILE} and {\tt TEST\_FILE\_REGEX} macros.
\item {\tt TEST\_EXCEPTION(exception\_type, expression)}\\
Tests whether an expression throws the exception it is supposed to throw.
\end{itemize}
\end{document}
|