File: README.md

package info (click to toggle)
libcifpp 9.0.5-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 15,528 kB
  • sloc: cpp: 33,979; sh: 105; makefile: 12
file content (221 lines) | stat: -rw-r--r-- 7,410 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
[![github CI](https://github.com/pdb-redo/libcifpp/actions/workflows/cmake-multi-platform.yml/badge.svg)](https://github.com/pdb-redo/libcifpp/actions)
[![GitHub License](https://img.shields.io/github/license/pdb-redo/libcifpp)](https://github.com/pdb-redo/libcifpp/LICENSE)

# libcifpp

As the name implies, this library was originally written to work with mmCIF files
using C++ as programming language. The design of this library leanes heavily on
the structure of CIF files. These files can be thought of as a text dump of a
relational databank with, often but not always, a very strict schema describing
the data. These schema's are called dictionaries.

Using information from the content of a mmCIF file and an optional schema,
libcifpp allows you to access the data in the file as a collection of datablock
each containing a collection of categories with rows of data. The categories can
be searched for data using queries written in regular C++ syntax. When a dictionary
was specified, inserted data is checked for validity. Likewise removal of data
may result in cascaded removal of linked data in other categories using
parent/child relationship information.

Since there were still many programs using the legacy PDB format at the time
development started, a layer was added that converts data to and from PDB format
into mmCIF format. This means you can manipulate PDB files as if they were
normal mmCIF files.

Apart from this basic functionality, libcifpp also offers code to help with
symmetry calculations, 3d manipulations and obtaining information from the CCD
[Chemical Component Dictionary](https://www.wwpdb.org/data/ccd).

## Documentation

The documentation can be found at [github.io](https://pdb-redo.github.io/libcifpp/)

## Synopsis

```cpp
// A simple program counting residues with an OXT atom

#include <filesystem>
#include <iostream>

#include <cif++.hpp>

namespace fs = std::filesystem;

int main(int argc, char *argv[])
{
    if (argc != 2)
        exit(1);

    // Read file, can be PDB or mmCIF and can even be compressed with gzip.
    cif::file file = cif::pdb::read(argv[1]);

    if (file.empty())
    {
        std::cerr << "Empty file\n";
        exit(1);
    }

    // Take the first datablock in the file
    auto &db = file.front();

    // Use the atom_site category
    auto &atom_site = db["atom_site"];

    // Count the atoms with atom-id "OXT"
    auto n = atom_site.count(cif::key("label_atom_id") == "OXT");

    std::cout << "File contains " << atom_site.size() << " atoms of which "
              << n << (n == 1 ? " is" : " are") << " OXT\n"
              << "residues with an OXT are:\n";

    // Loop over all atoms with atom-id "OXT" and print out some info.
    // That info is extracted using structured binding in C++
    for (const auto &[asym, comp, seqnr] :
            atom_site.find<std::string, std::string, int>(
                cif::key("label_atom_id") == "OXT",
                "label_asym_id", "label_comp_id", "label_seq_id"))
    {
        std::cout << asym << ' ' << comp << ' ' << seqnr << '\n';
    }

    return 0;
}
```

## Installation

You might be able to use libcifpp from a package manager used by your
OS distribution. But most likely this package will be out-of-date.
Therefore it is recommended to build *libcifpp* from code. It is not
hard to do. But it is recommended to read the following instructions
carefully.

### Requirements

The code for this library was written in C++17. You therefore need a
recent compiler to build it. For the development gcc >= 9.4 and clang >= 9.0
have been used as well as MSVC version 2019.

The other requirement you really need to have installed on your computer
is a version of [CMake](https://cmake.org). For now the minimum version
is 3.16 but that may soon change into a higher version. You should also
install the gui version of CMake to set build options easily, on Debian
I prefer to use the curses version installed with `cmake-curses-gui`.

It is very useful to have [mrc](https://github.com/mhekkel/mrc) available.
However, this is only an option if you use Windows or an operating system
using the ELF executable format (i.e. Linux or FreeBSD). MRC is a resource
compiler that allows including data files into the executable making them
easier to install.

Other libraries you might want to install beforehand are:

- [libeigen](https://eigen.tuxfamily.org/index.php?title=Main_Page), a
  library to do amongst others matrix calculations. This usually can be
  installed using your package manager, in Debian/Ubuntu it is called
  `libeigen3-dev`
- [zlib](https://github.com/madler/zlib), the development version of this
  library. On Debian/Ubuntu this is the package `zlib1g-dev`.
- [pcre2](https://www.pcre.org/), the Perl Compatible Regular Expression
  library. On Debian/Ubuntu this is the package `libpcre2-dev`.

### Building

First you need to download the code:

```console
 git clone https://github.com/PDB-REDO/libcifpp.git
 cd libcifpp
```

You should start by considering where to install libcifpp. If you have
sufficient permissions on your computer you perhaps should use the
default but libcifpp can be configured to be installed anywhere
including e.g. *$HOME/.local*.

Next step is to configure, for this use the CMake gui application. If you
installed the curses version of cmake you can type `ccmake`. On Windows
you can use `cmake-gui.exe`.

To install in the default location:

```console
ccmake -S . -B build
```

To install elsewhere, e.g. *$HOME/.local*:

```console
ccmake -S . -B build -DCMAKE_INSTALL_PREFIX=$HOME/.local
```

In the cmake window, start the configure command (use button or press 'c').
After the first configure step you will see a list of settable options.
Alter these to match your preferences. Most options are self explaining
and contain a description. Some may need a bit more explanation:

- CIFPP_DATA_DIR, this directory will be used to store initial versions
  of the mmcif_pdbx dictionary as well as the optional CCD file.

- CIFPP_DOWNLOAD_CCD

  The CCD file is huge and perhaps you think you don't
  need it. In that case you can leave this OFF. But that will limit the
  use cases.

- CIFPP_INSTALL_UPDATE_SCRIPT
  
  The files in CIFPP_DATA_DIR are quickly becoming out of date. On
  FreeBSD and Linux you can install a script that updates these files
  on a weekly basis.

- CIFPP_CRON_DIR
  
  The directory where the update script is to be installed.

- CIFPP_ETC_DIR
  
  The update script will only work if the file called *libcifpp.conf*
  in this *etc* directory will contain an uncommented line with

```console
update=true
```

- CIFPP_CACHE_DIR

  When you installed and enabled the update script, new files are
  written to this directory.

- CIFPP_RECREATE_SYMOP_DATA
  
  If you had CCP4 sourced into your environment, this option allows
  you to recreate the symop data file.

- BUILD_FOR_CCP4
  
  Build a special version of libcifpp to be installed in the CCP4
  environment.

After setting these options you can run the configure step again and
then use generate to create the makefiles.

Building and installing is then as simple as:

```console
cmake --build build
cmake --install build
```

If this fails due to lack of permissions, you can try:

```console
sudo cmake --install build
```

Tests are created by default, and to test the code you can run:

```console
ctest --test-dir build
```