# architecture.pod: The Torrus internals
# Copyright (C) 2016 Stanislav Sinyagin
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
# Stanislav Sinyagin <ssinyagin@k-open.com>
#
#
=head1 Torrus Architecture
=head2 Configuration processing
The XML configuration is compiled into the database representation at
the operator's manual request.
The compiled configuration is not a one-to-one representation of the
XML version: all templates are expanded. The XML can be restored from
the database with the snapshot utility.
A template defines a piece of configuration which can be used in
multiple places. Templates can be nested.
The configuration consists of multiple XML files. They are processed in
the order specified in the tree configuration. Each new file is
treated as additive information to the existing tree.
The XML configuration compiler validates all the mandatory parameters.
=head2 Data storage
The following data stores are used in Torrus:
=over
=item * Git for storing the tree data;
=item * Redis for storing run-time data, sending notifications, and locking;
=back
=head2 Tree configuration
The configuration consists of multiple trees. A tree consists of nodes,
and each node can be of type "leaf" or "subtree". Subtrees contain child
subtrees or leaves, and a leaf does not contain any child elements. Each
node has an arbitrary number of parameters. Some parameters can prohibit
recursion, but most parameters are calculated by traversing the tree
upwards until a value is found.
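
The upward lookup can be illustrated with a minimal Python sketch. The
node layout (a dict with C<parent> and C<params> keys), the function name
C<get_param>, the tokens, and the parameter values are assumptions made
for this example only. Parameters that prohibit recursion are not
modelled.

```python
def get_param(nodes, token, name):
    """Return the nearest value of `name`, walking from `token` up to the root."""
    while token:
        node = nodes[token]
        if name in node["params"]:
            return node["params"][name]
        token = node["parent"]  # an empty string marks the top of the tree
    return None


# Hypothetical three-node tree, keyed by made-up tokens.
nodes = {
    "t_root": {"parent": "", "params": {"collector-period": "300"}},
    "t_sub": {"parent": "t_root", "params": {}},
    "t_leaf": {"parent": "t_sub", "params": {"ds-type": "collector"}},
}

# The leaf does not define "collector-period", so the root value is used.
period = get_param(nodes, "t_leaf", "collector-period")
```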
Each node has a path within a tree. A subtree path always ends with a
slash, and a leaf path ends with a word character. The top of the tree
is identified by a single slash symbol. Node names in the path may
contain alphanumeric characters, dashes, and underscores.
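
The path rules above can be expressed as two regular expressions. This is
a sketch, not the validation code used by the compiler: the names
C<node_kind>, C<SUBTREE_RE>, and C<LEAF_RE> are hypothetical, and it is
assumed that node names are non-empty.

```python
import re

# Assumption: a node name is a non-empty run of letters, digits, "-" and "_".
NAME = r"[A-Za-z0-9_-]+"
SUBTREE_RE = re.compile(r"/(?:" + NAME + r"/)*")
# A leaf path must end with a word character, so the last name cannot end in "-".
LEAF_RE = re.compile(r"/(?:" + NAME + r"/)*[A-Za-z0-9_-]*[A-Za-z0-9_]")


def node_kind(path):
    """Return "subtree", "leaf", or None for a malformed path."""
    if SUBTREE_RE.fullmatch(path):
        return "subtree"
    if LEAF_RE.fullmatch(path):
        return "leaf"
    return None
```

With these patterns, C</> and C</Routers/> are classified as subtrees,
C</Routers/rtr-1/ifInOctets> as a leaf, and a path containing a space is
rejected.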
Each node is identified by a token. A node token is a 40-character SHA-1
checksum calculated from the string formed by the tree name, a colon,
and the path.
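
A minimal sketch of the token calculation follows. The function name
C<node_token> and the tree name C<main> are hypothetical, and the UTF-8
encoding of the input string is an assumption: the text only states that
the digest covers the tree name, a colon, and the path.

```python
import hashlib


def node_token(tree, path):
    """Return the 40-character hexadecimal SHA-1 digest of "<tree>:<path>"."""
    # Assumption: the string is encoded as UTF-8 before hashing.
    return hashlib.sha1(f"{tree}:{path}".encode("utf-8")).hexdigest()


token = node_token("main", "/Routers/rtr-1/ifInOctets")
```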
=head2 C<ConfigTree> objects
The C<ConfigTree> Perl module provides an API for accessing the
configuration trees, as well as other types of data. Each data element
is referred to by a token, as follows:
=over 4
=item * A tree node token is a 40-character SHA-1 checksum of a node, as
described above.
=item * A tokenset name starts with the letter I<S>. The rest is an
arbitrary sequence of word characters.
=item * The special token I<SS> is reserved for the list of tokensets.
Tokenset parameters are also inherited from this token's parameters.
=item * View and monitor names must be unique, and must start with a
lowercase letter.
=back
=head2 Git storage
A Torrus instance uses a number of Git branches for each tree
configuration. The XML compiler is the only writer, and consumers only
read from the Git repositories. Both writing and reading are done
directly on local Git repositories, so working directories are not
needed. If the writer and reader are on the same host, they use the same
Git repository. Otherwise, the writer pushes its commits to a remote
repository, and the reader pulls from it. The reader sets an exclusive
lock on the repository for the time of fetching and merging, so that
other readers don't try to pull at the same time.
Each tree has its own set of branches. The C<I<TREE>_configtree> branch
contains a full hierarchy of objects, so all parameters that are defined
in the input XML are retrievable. Typically, the Web UI renderer
consumes this data.
The C<I<TREE>_srcfiles> branch contains the XML files that are used by
the XML compiler. While processing the XML files, the compiler adds them
to this branch in order to track changes in the XML sources.
There are also a number of agent branches, one per agent instance
(collectors and monitors are typical agents):
C<I<TREE>_I<DAEMON>_I<INSTANCE>>. Each such branch contains only the
information and parameters needed by the agent, so that the agent
process can start and update its data as fast as possible.
A Git reference C<refs/heads/I<TREE>_agents_ref> is used to indicate the
commit in C<I<TREE>_configtree> branch that corresponds to the current
heads in agent branches. This reference is moved when the agent branches
are updated.
The C<I<TREE>_agent_tokens> branch records which agent branches contain
which tokens. It is needed for deleting tokens from agent branches when
they are deleted from the configtree branch. The branch contains a
two-level 256-way directory hierarchy with one JSON file per token. Each
file contains an array of the agent branch names where that token is
used.
The C<I<TREE>_configtree> branch has subdirectories as follows. The
directories C<nodes> and C<children> contain JSON files named after the
node tokens, arranged into a two-level 256-way tree structure. For
example, the file for token C<b7ba0d88a0b14a4e6c3c61f5446aa619a537098f>
is stored as C<b7/ba/0d88a0b14a4e6c3c61f5446aa619a537098f>.
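
The mapping from a token to its file path can be sketched in a few lines
of Python. The function name C<token_to_path> is hypothetical. The first
two pairs of hexadecimal digits become directory names, which gives 256
possible directories at each of the two levels.

```python
import re


def token_to_path(token):
    """Map a 40-character hexadecimal token to its two-level file path."""
    if not re.fullmatch(r"[0-9a-f]{40}", token):
        raise ValueError("not a 40-character lowercase hexadecimal token")
    return f"{token[:2]}/{token[2:4]}/{token[4:]}"


# Reproduces the example given above.
path = token_to_path("b7ba0d88a0b14a4e6c3c61f5446aa619a537098f")
```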
=over
=item * C<nodes/>: the JSON content defines the node's type, name,
parent's token, and parameters.
=item * C<children/>: for each subtree node, there is a JSON file
defining a hash with child tokens as keys and "1" as values.
=item * C<srcrefs>: a JSON hash representing dependencies of nodes on
XML files. It's a two-level hash: the first key is the source file
name, the second key is the token of the topmost dependent node, and the
value is "1".
=item * C<srcglobaldeps>: a JSON hash representing the XML source files
which define parameter properties, definitions, or templates. The keys
are file names, and values are "1".
=item * C<srcrev>: A file containing a JSON scalar referring to the
commit in XML sources.
=item * C<srcincludes>: A file containing a JSON hash with source XML
files as keys and arrays of included file names as values. The order
in the array is the same as the order of "include" statements in the
XML file. The key C<__ROOT__> indicates the XML files where the
compilation started. Every source XML file is listed here, and those
which do not include other files, have empty arrays.
=item * C<nodeid/>: a two-level 256-way directory structure. Each file is
named after the SHA-1 digest of a I<nodeid> value. The content of the
file is a JSON array of the nodeid value and the node token.
=item * C<nodeidpx/>: I<nodeid> prefix searching database. Each
I<nodeid> value is split by standard delimiters (two consecutive
slashes), and each resulting prefix is used to build a key in this
hierarchy. A two-level 256-way directory structure is built from the
SHA-1 digests of these keys. The directories contain zero-length files
representing the SHA-1 digests of I<nodeid> values.
=item * C<definitions/>: each file is named after a definition, and the
content is a JSON scalar holding the definition value.
=item * C<other/>: JSON objects for view, monitor, and action
definitions. Files are named after the view, monitor, or action name,
and the content is a hash with parameters defining each object. The
following special files are JSON hashes of object names and "1" as
values of corresponding types: C<__VIEWS__>, C<__MONITORS__>,
C<__ACTIONS__>.
=item * C<paramprops>: a single JSON file defining parameter properties
in a two-level hash.
=back
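
The prefix splitting used for C<nodeidpx/> can be sketched as follows.
This is an illustration under stated assumptions: the function name
C<nodeid_prefixes> and the sample I<nodeid> value are hypothetical, and
the sketch assumes that the keys are the proper prefixes without a
trailing delimiter. The exact key format is not specified in this
document.

```python
def nodeid_prefixes(nodeid):
    """Return the proper prefixes of a nodeid, split on "//"."""
    parts = nodeid.split("//")
    # Assumption: the full nodeid itself is not stored as a prefix key.
    return ["//".join(parts[:i]) for i in range(1, len(parts))]


# Hypothetical nodeid value; a single slash is not a delimiter.
prefixes = nodeid_prefixes("if//rtr01//Gi0/1//inbytes")
```

For this sample value, the prefixes are C<if>, C<if//rtr01>, and
C<if//rtr01//Gi0/1>.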
The C<srcdef> structure is mainly required for recursive deletion of
nodes if a corresponding XML file is changed or deleted.
Each Git commit refers to a complete and consistent tree structure. If
the compiler finds an error, it does not create a new commit, and rolls
back to the latest HEAD.
The JSON files within C<nodes> hierarchy are hashes with the following
keys and values:
=over 4
=item * C<is_subtree>: 1 for subtree, 0 for a leaf.
=item * C<parent>: token of the parent node, or empty string if this is
the top of the tree.
=item * C<path>: the full node name. Subtree names must end with slash,
and leaf names should end with alphanumeric characters.
=item * C<params>: hash with parameter names and values.
=item * C<vars>: hash with variable values (used in setvar, iftrue and
iffalse XML statements).
=item * C<src>: optional hash of source XML file names as keys and "1"
as values. It's only defined on the topmost node that is affected by a
given XML. If an XML file updates a previously defined node, the
C<src> content is copied from the nearest parent where it's defined.
=back
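
As an illustration of the node hash layout, the following Python sketch
checks a decoded node file against the keys listed above. The function
name C<check_node>, the sample values, and the decision to treat every
key except C<src> as mandatory are assumptions made for this example.

```python
import json

# Assumption: all keys except the optional "src" are always present.
REQUIRED_KEYS = {"is_subtree", "parent", "path", "params", "vars"}


def check_node(node):
    """Return a list of problems found in a decoded node hash."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - node.keys())]
    if "path" in node and "is_subtree" in node:
        ends_with_slash = node["path"].endswith("/")
        if node["is_subtree"] and not ends_with_slash:
            problems.append("subtree path must end with a slash")
        if not node["is_subtree"] and ends_with_slash:
            problems.append("leaf path must not end with a slash")
    return problems


# Hypothetical content of a file under nodes/ for the top of a tree.
sample = json.loads(
    '{"is_subtree": 1, "parent": "", "path": "/", "params": {}, "vars": {}}'
)
problems = check_node(sample)
```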
The JSON files within C<other> are hashes with the following keys and
values:
=over 4
=item * C<params>: hash with parameter names and values.
=back
The agent branches contain JSON files named after
the node tokens, arranged into a two-level 256-way tree structure. Each
daemon that needs quick access to a subset of leaf nodes (primarily
I<collector>, and also I<monitor>) retrieves the node configurations from
this structure. The instance number is a 4-digit lowercase hexadecimal
number. The JSON files are hashes defining all parameter values needed
by the daemon. These files are populated by the XML compiler after the
tree is processed.
An optional C<searchdb> branch is used for indexing the node parameters
in order to provide search in the GUI. It consists of the following
directories:
=over 4
=item * C<words/I<TREE>/> contains zero-length files in the following
hierarchy: C<I<KEYWORD>/I<TOKEN>/I<PARAM>>. If a keyword is matched in
the subtree or leaf name, the file name is C<__NODENAME__>.
=item * C<wordsglobal/> is the same as above, but for global search. In
addition, a file called C<__TREENAME__> contains a JSON scalar with
the tree name where this token is defined.
=item * C<tokens/> is a two-level 256-way hierarchy of directories based
on token IDs. These directories contain zero-length files named after
keywords.
=item * C<configtree_ref/> is a directory containing files named after
the config tree names. Each file is a JSON scalar indicating the
commit ID in the corresponding configtree branch.
=back
=head2 Redis database
Redis is an in-memory database, supporting key/value hashes and linear
arrays, with periodic saving to disk storage. Torrus keeps all run-time
and dynamic information in Redis.
All Redis keys that are used within a single Torrus installation are
prefixed with a configurable prefix ("torrus:" by default), thus
allowing multiple Torrus installations to use the same Redis
instance. Further in this document, the prefix is omitted for easier
reading.
=over 4
=item * C<gitlock:I<REPOPATH>> -- this key is used as a mutex that
protects a local Git repository from simultaneous initialization by
multiple processes. Before accessing the repository in writer mode,
the writer sets this Redis key to the current UNIX timestamp.
=item * C<writer:I<REPOPATH>> -- this is a hash representing active
Torrus::ConfigTree::Writer objects. Each key is the process ID, and
values are the UNIX timestamps when the writer objects were
created. Entries older than 24 hours are automatically removed. This
hash aims to prevent Git garbage collector from running while there
are active compiler processes.
=item * C<githeads> -- this is a hash containing commit IDs written
by the compiler. The keys are branch names, and the values are the Git
commit IDs at the heads of the corresponding branches. The consumer
process compares these with the commits it currently knows and pulls the
updates if needed.
=item * C<tsets:I<TREE>> -- hash of tokenset names as keys and "1" as
values.
=item * C<tset:I<TREE>:I<TSET>> -- hash of tokenset members. Tokens are
the keys, and the values indicate the origins. Currently known origins
are "static" and "monitor".
=item * C<tsetparam:I<TREE>:I<TSET>> -- a hash of tokenset parameters.
=item * C<users> -- a hash containing users and groups, as described below.
=item * C<acl> -- a hash containing the access privileges for groups, as
described below.
=item * C<monitor_alarms:I<TREE>> is a hash that keeps alarm status
information from previous runs of Monitor, with the keys and values as
described below.
=item * C<scheduler_stats:I<TREE>> is a hash which stores the runtime
statistics of Scheduler tasks. Each key has the structure
C<I<TYPE>:I<TASKNAME>:I<INSTANCE>:I<PERIOD>:I<OFFSET>#I<VARIABLE>>,
and the value is a number representing the current value of the
variable. Depending on the variable's purpose, the number is a floating
point number or an integer.
=item * C<serviceid_params> is a hash containing properties for each
Service ID (exported collector information, usually stored in an SQL
database). The keys are Service IDs, and values are JSON hashes
describing the properties. Known parameters are: C<trees>, C<token>,
C<dstype>, C<units>.
=item * C<serviceid_tokens> is a hash with tokens as keys and Service IDs
as values.
=item * C<snmp_failures:I<TREE>> -- a hash listing SNMP failures in the
collector, as described below.
=back
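
The comparison that a consumer performs against C<githeads> can be
sketched without a Redis connection by treating the hash as a plain
dictionary. The function name C<stale_branches>, the tree name C<main>,
the branch names, and the shortened commit IDs are all hypothetical. How
the consumer stores its known commits is an assumption.

```python
def stale_branches(githeads, known_heads):
    """Return branch names whose published head differs from the known commit."""
    return sorted(
        branch
        for branch, commit in githeads.items()
        if known_heads.get(branch) != commit
    )


# Hypothetical content of the "githeads" hash, as written by the compiler.
githeads = {"main_configtree": "c0ffee1", "main_collector_0000": "deadbe2"}
# Commits that this consumer has already fetched.
known_heads = {"main_configtree": "c0ffee1", "main_collector_0000": "0ldc0de"}

to_pull = stale_branches(githeads, known_heads)
```

Only C<main_collector_0000> needs to be pulled in this example, because
its published head differs from the commit the consumer already has.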
PubSub channels:
=over 4
=item * C<treecommits:I<TREE>> -- the value of every new Git commit in
C<I<TREE>_configtree> branch is published to this channel.
=back
=head3 C<users> contents
=over 4
=item * C<ua:I<UID>:I<ATTR>> => C<I<VALUE>>
User attributes, such as C<cn> (Common name) or C<userPassword>, are
stored here. For each user, there is a record consisting of the
attribute C<uid>, with the value equal to the user identifier.
=item * C<uA:I<UID>> => C<I<ATTR>,...>
Comma-separated list of attribute names for the given user.
=item * C<gm:I<UID>> => C<I<group>,...>
For each user ID, stores the comma-separated list of groups it belongs to.
=item * C<ga:I<GROUP>:I<ATTR>> => C<I<VALUE>>
Group attributes, such as group description.
=item * C<gA:I<GROUP>> => C<I<ATTR>,...>
Comma-separated list of attribute names for the given group.
=item * C<G:> => C<I<GROUP>,...>
Comma-separated list of all groups.
=back
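
The key layout above allows a user record to be assembled with a few
hash lookups. The sketch below uses a plain dictionary in place of the
Redis hash. The function names C<split_list> and C<user_record> and the
sample user are hypothetical.

```python
def split_list(value):
    """Split a comma-separated value, ignoring empty items."""
    return [item for item in value.split(",") if item]


def user_record(users, uid):
    """Return (attributes, groups) for a user, read from the "users" hash."""
    names = split_list(users.get(f"uA:{uid}", ""))
    attrs = {name: users.get(f"ua:{uid}:{name}") for name in names}
    groups = split_list(users.get(f"gm:{uid}", ""))
    return attrs, groups


# Hypothetical content of the "users" hash.
users = {
    "ua:jsmith:uid": "jsmith",
    "ua:jsmith:cn": "John Smith",
    "uA:jsmith": "uid,cn",
    "gm:jsmith": "staff,admin",
    "G:": "staff,admin",
}

attrs, groups = user_record(users, "jsmith")
```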
=head3 C<acl> contents
=over 4
=item * C<I<GROUP>:I<OBJECT>:I<PRIVILEGE>> => C<1>
The entry exists if and only if the group members have this privilege
over the given object. The most common privilege is C<DisplayTree>, where
the object is the tree name.
=back
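
Because an entry exists only when the privilege is granted, a privilege
check reduces to a key-existence test for each of the user's groups. In
this sketch the C<acl> hash is modelled as a dictionary. The function
name C<has_privilege>, the group names, and the tree name C<main> are
hypothetical.

```python
def has_privilege(acl, groups, obj, privilege):
    """True if any of the groups holds `privilege` over `obj`."""
    return any(f"{group}:{obj}:{privilege}" in acl for group in groups)


# Hypothetical content of the "acl" hash.
acl = {"staff:main:DisplayTree": "1"}

allowed = has_privilege(acl, ["staff", "admin"], "main", "DisplayTree")
```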
=head3 C<monitor_alarms> contents
=over 4
=item * C<I<MNAME>:I<TOKEN>> =>
C<I<T_SET>:I<T_EXPIRES>:I<STATUS>:I<T_LAST_CHANGE>
[:I<ESCALATION>[:I<ESCALATION>...]]>
The key consists of the monitor name and the leaf token. In the value,
C<T_SET> is the time when the alarm was raised. If two subsequent runs
of Monitor raise the same alarm, C<T_SET> does not change. C<T_EXPIRES>
is the timestamp until which the entry must be kept after the alarm is
cleared. C<STATUS> is 1 if the alarm is active, and 0 otherwise.
C<T_LAST_CHANGE> is the timestamp of the last status change. The values
that follow are optional escalation times, present if escalation events
were fired.
If C<STATUS> is 1, the record is kept regardless of timestamps. If
C<STATUS> is 0, and the current time is more than C<T_EXPIRES>, the
record is not reliable and may be deleted by Monitor.
=back
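
The value format and the retention rule can be sketched as a small
parser. The names C<AlarmRecord>, C<parse_alarm>, and C<may_delete> are
hypothetical, and the sketch assumes that all fields are integers (UNIX
timestamps in seconds).

```python
from typing import NamedTuple, Tuple


class AlarmRecord(NamedTuple):
    t_set: int
    t_expires: int
    status: int
    t_last_change: int
    escalations: Tuple[int, ...]


def parse_alarm(value):
    """Parse "T_SET:T_EXPIRES:STATUS:T_LAST_CHANGE[:ESCALATION...]"."""
    fields = [int(field) for field in value.split(":")]
    if len(fields) < 4:
        raise ValueError("an alarm value needs at least four fields")
    return AlarmRecord(*fields[:4], tuple(fields[4:]))


def may_delete(record, now):
    """A record may be deleted only when the alarm is cleared and expired."""
    return record.status == 0 and now > record.t_expires


# Made-up values: a cleared alarm and an active alarm with one escalation.
cleared = parse_alarm("1500000000:1500003600:0:1500000300")
active = parse_alarm("1500000000:1500003600:1:1500000000:1500000600")
```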
=head3 C<serviceid_params> contents
=over 4
=item * C<a:> => C<I<SERVICEID>,...>
Lists all known service IDs.
=item * C<t:I<TREE>> => C<I<SERVICEID>,...>
Lists service IDs exported by a given datasource tree.
=item * C<p:I<SERVICEID>:I<PARAM>> => C<I<VALUE>>
Parameter value for a given service ID. Mandatory parameters are:
C<tree>, C<token>, C<dstype>. Optional: C<units>.
=item * C<P:I<serviceid>> => C<I<PARAM>,...>
List of parameter names for a service ID.
=back
=head3 C<snmp_failures> contents
=over 4
=item * C<c:I<counter>> => C<I<number>>
A counter with a name. Known names: C<unreachable>, C<deleted>, C<mib_errors>.
=item * C<h:I<hosthash>> => C<I<failure>:I<timestamp>>
SNMP host failure information. Hosthash is a concatenation of hostname,
UDP port, and SNMP community, separated by "|". Known failures:
C<unreachable>, C<deleted>. Timestamp is a UNIX time of the event.
=item * C<m:I<TOKEN>> => C<I<timestamp>>
MIB failures (I<noSuchObject>, I<noSuchInstance>, and I<endOfMibView>)
for a given host, with the tree path of their occurrence and the UNIX timestamp.
=item * C<M:I<hosthash>> => C<I<number>>
Count of MIB failures per SNMP host.
=back
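
The C<hosthash> key and the host failure value can be built and parsed
as sketched below. The function names and the sample host are
hypothetical. Splitting on the first two "|" characters is an assumption
that keeps a community string containing "|" intact.

```python
def make_hosthash(host, port, community):
    """Join hostname, UDP port, and SNMP community with "|"."""
    return f"{host}|{port}|{community}"


def split_hosthash(hosthash):
    """Split a hosthash back into (host, port, community)."""
    host, port, community = hosthash.split("|", 2)
    return host, int(port), community


def parse_host_failure(value):
    """Parse a "failure:timestamp" value into (failure, timestamp)."""
    failure, timestamp = value.rsplit(":", 1)
    return failure, int(timestamp)


# Hypothetical host and event.
hosthash = make_hosthash("rtr01.example.net", 161, "public")
failure = parse_host_failure("unreachable:1500000000")
```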
=head2 Search and indexing service
Searching within trees is implemented in a standalone service,
consisting of two parts:
=over 4
=item 1. a daemon that subscribes to C<treecommits:*> channels and
updates its database after every commit;
=item 2. a RESTful API service for retrieving the search results.
=back
=head1 Author
Copyright (c) 2016-2017 Stanislav Sinyagin ssinyagin@k-open.com