File: coding_style.tex

package info (click to toggle)
ball 1.5.0%2Bgit20180813.37fc53c-11
links: PTS, VCS
area: main
in suites: bookworm
size: 239,924 kB
sloc: cpp: 326,149; ansic: 4,208; python: 2,303; yacc: 1,778; lex: 1,099; xml: 958; sh: 322; javascript: 164; makefile: 88
file content (444 lines) | stat: -rw-r--r-- 19,291 bytes
parent folder | download | duplicates (8)
% -*- Mode: C++; tab-width: 2; -*-
% vi: set ts=2:
%
                                                                                                                                                             
\documentclass[a4paper,10pt]{article}
\usepackage{a4wide}
\usepackage{times}

\title{BALL coding conventions}
\author{Oliver Kohlbacher}

\date{$ $Revision: 1.8.20.2 $ $}

\newtheorem{example}{Example}

\begin{document}
\maketitle

\section{General}

\subsection{Formatting}

{\bf All BALL files use a tab width of two.} Use the command {\tt set tabstop=2} in
{\tt vi} or {\tt set-variable tab-width 2} if you are using {\tt emacs}. The
standard file header should automatically set these values for {\tt vi} or
{\tt emacs} (see below).

\noindent
Curly braces should be put on their own line. {\bf Matching pairs of opening and closing curly braces should be 
set to the same column. }

\begin{verbatim}
while (continue == true)
{
  for (int i = 0; i < 10; i++)
  {
    ...
  }

 	...

  if (x < 7)
  {
    ....
  }
}
\end{verbatim}

\noindent
{\bf Always use braces around a block.} The main reason for this rule is to avoid constructions like:

\begin{verbatim}
if (isValid(a))
  return 0;
\end{verbatim}

\noindent
which might be changed to something like

\begin{verbatim}
if (isValid(a))
  error = 0;
  return 0;
\end{verbatim}

\noindent
The resulting errors are hard to find. There are two ways to avoid these
problems: (a) always use braces around a block (b) write everything in a single
line. We recommend method (a).
However, this is mainly a question of personal style, so no explicit checking
is performed to enforce this rule.



\section{Class requirements}

Each BALL class has to provide the following minimal interface:
\begin{verbatim}
  /** Class documentation... 
  */
  class Test
  {
    public:

    // clone method (implemented automatically with the CREATE macro)
    CREATE(Test)    

    // default ctor
    Test();

    // copy ctor 
    Test(const Test& test);

    // destructor 
    virtual ~Test();
 
    // assignment operator
    Test& operator = (const Test& test);
  };
\end{verbatim}



\section{Naming conventions}


\subsection{Variable names}

{\bf Variable names are all lower case letters.} Distinguished parts of the name are
separated using underscores ``{\tt \_}''. If parts of the name are derived
from common acronyms (e.g. PDB) they should be in upper case.\\
Private or protected member variables of classes are suffixed by an
underscore.\\
No prefixing or suffixing is allowed to identify the variable type - it 
leads to completely illegible documentation and overly long variable names.\\
\begin{example}\hspace*{2mm}\\
{\tt String PDB\_atom\_name;} (using an abbreviation)\\
{\tt int counter\_;} (private class member)\\
{\tt bool is\_found;} (ordinary variable)
\end{example}

\subsection{Class names/type names}

{\bf Class names and type names always start with a capital letter.} Different parts
of the name are separated by capital letters at the beginning of the word. No
underscores are allowed in type names and class names, except for the names of
protected types and nested classes which are suffixed by an underscore.
\begin{example}\hspace*{2mm}\\
{\tt class PDBFile}\\
{\tt class ForwardIteratorTraits\_} (private nested class definition)\\
\end{example}

\subsection{File names}

{\bf Header and source files should be named after the class they contain.}
Each file should contain a single class only, although exceptions are possible
for extreme light-weight classes. The file names start with a lower-case
letter (for historical reasons). An exception to the rule are file names starting 
with common acronyms. Hence, the file containing the {\tt AtomContainer} class 
would be named {\tt atomContainer.C}. The file containing the {\tt PDBRecord} class
would be named {\tt PDBRecord.C}.

\subsection{Function names/method names}

{\bf Function names (including class method names) always start with a lower case
letter.} Parts of the name are separated using capital letters (as are class
and type names). The argument of {\tt void} functions is omitted in
the declaration and the definition. If function arguments are pointers or
references, the pointer or reference qualifier is appended to the variable
type (without separating blanks). It should not prefix the variable name.\\
The variable names used in the declaration have to be the same as in the
definition. They should be as short as possible, but nevertheless
comprehensible.  If arguments are not used in the implementation of the
function, they have to be commented out (to avoid compiler warnings).
\begin{example}\hspace*{2mm}\\
{\tt int countAtoms()}\\
{\tt bool isBondedTo(const Atom\& atom)}\\
{\tt bool doSomething(int i, String\& /* name */)}
\end{example}

\section{Documentation}

\subsection{Doxygen}
{\bf Each BALL class has to be documented using Doxygen.} The documentation is
inserted in Doxygen format in the header file in which the class is defined.
Documentation includes the description of the class, of each method, type
declaration, enum declaration, constant, and member variable.

\subsection{Usage of groups in the documentation}

In order to obtain a structured and comprehensible documentation,  methods are
grouped by their function. Each class usually uses one or more of the
following categories to group its methods:
\begin{itemize}
	\item {\bf Type Definitions:}\\
		This group should contain all {\tt typedef}s of the class 
    (except for private or protected ones).
	\item {\bf Exceptions:}\\
		All class specific exceptions are defined as nested classes in this group.
	\item {\bf Enums:}\\
		This group usually contains enums or constants defined via enums such as
		properties.
	\item {\bf Constructors and Destructor:}\\
		This group includes all constructors and the destructor as well as related
		methods such as {\tt clear} or {\tt destroy}.
	\item {\bf Persistence:}\\
		This group is present in each class derived from {\tt PersistentObject}
		and always contains the two methods {\tt persistentWrite} and {\tt
		persistentRead}.
	\item {\bf Converters:}\\
		If an object is convertible into another class an explicit
		converter should be defined inside this group.
	\item {\bf Accessors:}\\
		Accessors are all methods that access or modify attributes of the
		class.
	\item {\bf Predicates:}\\
		Predicates are functions returning a boolean value. The names of
		predicates should start with {\tt is} or {\tt has} if applicable.
		Comparison operators are also predicates and should appear inside
		this group.
	\item {\bf Debugging and Diagnostics:}\\
		This group should contain all methods used for the debugging of the class
		(for example the {\tt dump} methods).
	\item {\bf Attributes:}\\	
		This group should contain all public class attributes. Protected and
		private attributes should be documented in the corresponding groups
		{\bf Protected Attributes} or {\bf Private Attributes}. However, private 
		members do not occur in the documentation by default.
\end{itemize}

These section names are intended as a rule-of-thumb only. There are many cases
where other section headings are more intuitive. However, if any of the above
sections apply, they should be used. The sections should also appear in this
order.

\subsection{Commenting code}

The code for each .C file has to be commented. Each piece of code in BALL has
to contain at least {\bf 5\% of comments}. The use of {\tt //} instead of C style
comments ({\tt /* */}) is recommended to avoid problems arising from nested
comments. Comments should be written in plain English and describe the
functionality of the next few lines.

To check whether a header or source file fulfills this requirement, use the
script {\tt check\_coding} in {\tt source/config}. Invoke {\tt check\_coding} with the
filename to check (or a list of files) and it will print the percentage of
comments (including Doxygen comments) for each file.

\subsection{Standard header}
                                                                                                                                                             
{\bf BALL uses the Concurrent Version System (CVS)} to manage different
version of the
source files. For easier identification, {\bf each BALL source file starts
with a header containing the CVS ID string in a comment.} Any new BALL header
or source file should start with this header:
\begin{verbatim}
  // -*- Mode: C++; tab-width: 2; -*-
  // vi: set ts=2:
  //
  // $Id: coding_style.tex,v 1.8.20.2 2007/08/08 07:17:22 toussaint Exp $
  //                                                                                                                                                             
\end{verbatim}
In non-C++ files (Makefiles, TeX-Files, etc.) the C++ comments are replaced
by the respective comment characters (e.g. ``\#'' for Makefiles).

\subsection{Exception handling}

No BALL class method or function should lead to the termination of a program
in case of a detectable error. In particular, if BALL classes are embedded 
for scripting, robustness is an important topic. The recommended
procedure to handle even fatal errors is to throw an exception. Uncaught
exceptions will result in a call to {\tt abort} thereby terminating the
program while still giving the programmer a chance to handle them graciously.\\
All exceptions used in BALL are derived from {\tt GeneralException}
defined in {\bf COMMON/exception.h}. A default constructor should not be
implemented for these exceptions. Instead, the constructor of all derived
exceptions should have the following signature:
\begin{verbatim}
  AnyException(const char* file, unsigned long line);
\end{verbatim}
Additional arguments are possible but should provide default values (see
{\tt IndexOverflow} for an example).\\
The {\tt throw} directive for each exception should be of the form
\begin{verbatim}
  throw AnyException(__FILE__, __LINE__);
\end{verbatim}
to simplify debugging. {\tt \_\_FILE\_\_} and {\tt \_\_LINE\_\_} are preprocessor
macros filled in by the compiler. {\bf GeneralException} provides two methods
({\tt getFile} and {\tt getLine}) that allow the localization of the
exception's cause.\\


\subsection{Persistence}

Object persistence is a serious problem in C++. There are mainly three
concepts that are responsible for the trouble:
\begin{enumerate}
	\item multiple inheritance from a common base class leading to virtual
		inheritance
	\item static member variables
	\item void pointers
\end{enumerate}

To start at the end, {\em void pointers} are very simple to handle: avoid them!
Pointers are only valid inside a single instance of a program, they have no
meaning for a different instance or even another machine. Any code dealing
with void pointers cannot be made persistent.

{\em Static member} variables can cause trouble, if the behaviour of a class depends
on the state of this variable. There is no general rule on how to handle these
variables. However, in most cases everything works fine if you simply ignore
them. The only case where a static member variable had to be considered when
implementing object persistence for the BALL kernel classes, was the internal
counter of the class {\tt Object}. This counter is incremented by one for each
instance of {\tt Object} created. It is used to uniquely identify objects and
to define a linear order on these objects (this is the object's handle).
If this handle had been treated as other variables when reading/writing persistent objects,
reading an object could have lead to object handles occuring twice. 
Persistent reading for objects of type {\tt Object} 
was therefore implemented to ignore the value. Objects that are read get handles according
to the order in which they are read and created. This leads to a different order
of the handles but prevents handles from occuring twice.
Whether static variables have to be read/written or not has to be
decided on a case-by-case basis.

{\em Multiple inheritance} is the most serious problem. 
All persistent objects have to be derived from exactly {\em one} persistent
object class - in this case {\tt Composite}. The support classes like {\tt
PropertyManager} or {\tt Selectable} are not derived from {\tt
PersistentObject}, but are a {\em model of storable}. 
A class which is a model of storable cannot be read from a persistent stream
if it is not part or base class of a persistent object. This is due to the fact
that the persistence manager needs a well-defined interface to handle the
objects it stores and retrieves. This interface is implemented in the class
{\tt PersistentObject}. 
Thus, in order to provide for object persistence 
the definition of two methods in the storable class' interface is required:
{\tt read(PersistenceManager\& pm)} and {\tt write(PersistenceManager\& pm)}.
These methods are quite similar to {\tt persistentWrite} and {\tt
persistentRead}.

\section{Testing}

\subsection{General}

Testing is crucial to verify the correctness of the library -- especially when
using C++. But why has it to be {\em so} complicated, using all these macros
and stuff? C++ compilers are strange beasts and there are no two accepting 
the same code. Since one of the main concerns of BALL is portability, 
we have to ensure that every single line of
code compiles on all platforms. Due to the long compilation times and the
(hopefully in the future) large number of different platforms, tests to verify the
correct behaviour of all classes have to be carried out automatically. This
implies a well defined interface for all tests, which is the reason for all
these strange macros. This fixed format also enforces the writing of complete
class tests. Usually a programmer writes a few lines of code to test the parts
of the code he wrote for correctness. Of the methods tested after the
introduction of the test macros, about a tenth of all functions/methods showed
severe errors after thorough testing. Most of these errors did not occur an
all platforms or did not show up on trivial input.

Writing tests for {\em each} method of a class also ensures that each line of code is
compiled. When using class templates the compiler only compiles the methods
called. Thus it is possible that a code segment contains syntactical errors
but the compiler accepts the code happily -- it simply ignores most of the
code. This is quickly discovered in a complete test of all methods. The same
is true for configuration dependend preprocessor directives that stem from
platform dependencies. Often untested code also hides inside the {\tt const}
version of a method, when there is a non-const method with the same name and
arguments (for example most of the {\tt getName} methods in BALL). In most
cases, the non-const version is preferred by the compiler and it is usually
not clear to the user which version is taken. Again, explicit testing of each
single method provides help for this problem.
The ideal method to tackle the problem of untested code is the complete
coverage analysis of a class. Unfortunately, this is only supported for very
few compilers, so it is not used for testing BALL.

Writing the test program is a wonderful opportunity to verify and complete the
documentation! Often enough implementation details are not clear at the time
the documentation is written. A lot of side effects or special cases that were
added later do not appear in the documentation. Going through the
documentation and the implementation in parallel is the best way to verify the
documentation for consistence and (strange coincidence?!) the best way to
implement a test program, too!


\subsection{Structure of a test program}

{\bf Each BALL class has to provide its own test program.} This test program has to check
each method of the class. The test programs reside in the directory
{\tt BALL/source/TEST}. To create a new test program, rename a copy of the file
{\tt Skeleton\_test.C} to the new class test name (usually {\tt
<classname>\_test.C}). The test program has to be coded using the class test
macros as described in the BALL online reference. These macros are
defined in {\tt CONCEPT/classTest.h}. Special care should be taken
to cover all special cases (e.g. what happens, if a method is called with
empty strings, negative values, zero, null pointers etc.). 

\subsection{Macros to start, finish and evaluate tests}
\begin{itemize}
	\item {\tt START\_TEST(class\_name, version)}\\
	{\tt END\_TEST}\\
		The {\tt START\_TEST} macro starts the tests for class {\tt class\_name}. This should occur just
		\emph{once} in a test program, as it contains the definition of the {\tt main}
		function. Similarly, {\tt END\_TEST} defines the terminal section of the test program.
	
	\item {\tt CHECK(name)}\\
	{\tt RESULT}\\
		The {\tt CHECK} macro starts a subtest for a specific method of the class. The full
		signature of the class should be given as {\tt name}.
		Variables defined after this macro are scoped and valid until the next
		{\tt RESULT} only. This {\tt RESULT} also evaluates success or failure of the test.

	\item {\tt ABORT\_IF(condition)}\\
		This macro aborts the subtest, i.e. it jumps forward to the next {\tt
		RESULT} macro and marks the subtest as failed. This macro is usually employed
		to prevent the execution of certain tests if a necessary precondition was
		not met and the execution would hence result in a crash.

	\item {\tt STATUS(...)}\\
		This macro can be used to print the current status from within a subtest
		and is useful for debugging tests. The macro will print its argument to the
		console if the test is run with "-V" as an argument and stay silent
		otherwise. Example: {\tt STATUS("result: " << result)}.

\end{itemize}

\subsection{Comparison macros}
\begin{itemize}
	\item {\tt TEST\_EQUAL(a, b)}\\
		Check that {\tt a} and {\tt b} are equal (using an appropriate {\tt operator ==}).

	\item {\tt TEST\_NOT\_EQUAL(a, b)}\\
		Check that {\tt a} and {\tt b} are \emph{not} equal (again, using an
		appropriate {\tt operator ==}, not the {\tt operator !=}). This is helpful, if
		you know what a method returns upon \emph{unsuccessful} execution, e.g.,
		if you want to make sure it did not return a NULL pointer.

	\item {\tt TEST\_REAL\_EQUAL(a, b)}\\
	{\tt PRECISION(delta)}\\
		The {\tt TEST\_REAL\_EQUAL(a, b)} macro checks if {\tt a} and {\tt b} 
    are equal within a defined tolerance {\tt delta}. 
    The tolerance is defined via the {\tt PRECISION} macro. 
    {\tt a} and {\tt b} have to be convertible to double. 

	\item{\tt NEW\_TMPFILE(filename)}\\
	{\tt TEST\_FILE()}\\
	{\tt TEST\_FILE\_REGEX()}\\
		The {\tt NEW\_TMPFILE} macro creates a new temporary filename, which is then assigned
		to {\tt filename}. Using these temporary filenames is recommended, as they
		are deleted after running the test (except if run in verbose mode).
		After having written something to a file (e.g. while testing output methods), the
		contents of this file can be compared to a template file using the {\tt
		TEST\_FILE} and {\tt TEST\_FILE\_REGEX} macros.

	\item {\tt TEST\_EXCEPTION(exception\_type, expression)}\\
		Tests whether an expression throws the exception it is supposed to throw.
	
	
\end{itemize}



\end{document}