.. _example-tips-masteraggregation:

==================================
 Multiple master data aggregation
==================================

This example describes a way to have multiple masters collecting
different information, and to show all the data in a single presentation.

Once you reach a certain size (probably several hundred nodes and
several thousand plugins), 5 minutes is no longer enough for a single
master to connect to all hosts and gather their data, and you end up
with holes in your graphs.

Requirements
============

This example requires shared NFS space for the munin data, shared
between the masters.

Before going down that road, make sure to check other options first,
such as changing the number of update threads or using rrdcached.

Another option you might consider is munin-async. It requires
modifications on all nodes, so it might not be an option for you, but I
felt compelled to mention it. If you can't easily set up shared NFS, or
if you might have connectivity issues between the master and some nodes,
async would probably be a better approach.

Because some rrd path merging is required, it is highly recommended
to have **all** nodes in groups.

Overview
========

The Munin master runs several scripts from its cron script
(``munin-cron``).

``munin-update``
	is the only part that actually connects to the nodes. It gathers
	information and updates the rrd files (you'll probably need
	rrdcached, especially over NFS).

``munin-limits``
	checks the collected data against the configured limits and
	raises warnings and criticals.

``munin-html``
	takes the information gathered by update and limits, and
	generates the actual html files (if you don't use cgi-html).
	It currently still generates some data needed by the cgi.

``munin-graph``
	generates the graphs. If you are thinking about running several
	masters, you probably have a lot of graphs and don't want to
	generate them all every 5 minutes, so you would rather use
	cgi-graph.

The trick to having multiple masters running the update is:

- run ``munin-update`` on different masters (called update-masters
  hereafter), with ``dbdir`` on NFS
- run ``munin-limits`` either on each of the update-masters or on the
  html-master (see next line)
- run ``munin-html`` on a single master (html-master), after merging
  some data generated by the update processes
- have the graphs (cgi) and html (from files or cgi) served either by
  the html-master or by specific presentation hosts.

Of course, all these hosts must have access to the shared NFS directory.

The examples assume the shared folder is ``/nfs/munin``.

Running munin-update
====================

Change ``munin-cron`` to run only ``munin-update`` (and
``munin-limits``, if you have alerts you want to be managed directly on
those masters). The cron should NOT launch ``munin-html`` or
``munin-graph``.
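
As a rough illustration, a reduced cron script for an update-master
could look like the sketch below. The install directory
(``MUNIN_LIBDIR``) and the ``RUN_LIMITS`` switch are assumptions made
for this example, not settings provided by munin; adjust the path to
wherever your distribution installs the munin programs.

```shell
#!/bin/sh
# Sketch of a reduced munin-cron for an update-master.
# MUNIN_LIBDIR is an assumption: point it at the directory that
# holds munin-update and munin-limits on your system.
MUNIN_LIBDIR="${MUNIN_LIBDIR:-/usr/share/munin}"
# Hypothetical switch: set RUN_LIMITS=0 if alerts are handled on
# the html-master instead of on this update-master.
RUN_LIMITS="${RUN_LIMITS:-1}"

run_update_master() {
    # Collect data from the nodes; stop here if the update fails.
    "$MUNIN_LIBDIR/munin-update" "$@" || return 1

    # Optionally evaluate the limits on this master.
    if [ "$RUN_LIMITS" = "1" ]; then
        "$MUNIN_LIBDIR/munin-limits" "$@" || return 1
    fi

    # munin-html and munin-graph are deliberately not run here.
    return 0
}
```

To use it as a cron script, add ``run_update_master "$@"`` as the last
line and run it as the munin user.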

Change your ``munin.conf`` to use a ``dbdir`` within the shared NFS
(e.g. ``/nfs/munin/db/<hostname>``).

To make the configuration easier to see, you can also add an
``includedir`` on NFS and declare all your nodes there
(e.g. ``/nfs/munin/etc/<hostname>.d/``).
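
The two settings above can be prepared with a small helper such as the
one below. This is only a sketch: the function name and the
``MUNIN_ROOT`` variable are made up for this example. It creates the
per-master directories and prints the two lines to add to
``munin.conf``.

```shell
#!/bin/sh
# Sketch: prepare the shared directories for one update-master and
# print the matching munin.conf lines.
# MUNIN_ROOT is an assumption matching the /nfs/munin example path.
MUNIN_ROOT="${MUNIN_ROOT:-/nfs/munin}"

print_update_master_conf() {
    master="$1"   # name of this update-master, e.g. its hostname

    # Create the per-master database and node-definition directories.
    mkdir -p "$MUNIN_ROOT/db/$master" "$MUNIN_ROOT/etc/$master.d" || return 1

    # dbdir and includedir are the munin.conf directives named above.
    printf 'dbdir %s/db/%s\n' "$MUNIN_ROOT" "$master"
    printf 'includedir %s/etc/%s.d\n' "$MUNIN_ROOT" "$master"
}
```

For example, ``print_update_master_conf update1`` prints the ``dbdir``
and ``includedir`` lines for a master named ``update1``. The
directories must be writable by the user that runs ``munin-update``.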

Once you have configured at least one node,
``/nfs/munin/db/<hostname>`` should start getting populated with
subdirectories (groups) and a few files, including ``datafile`` and
``datafile.storable`` (and ``limits`` if you also have ``munin-limits``
running here).

Merging data
============

Each update-master updates its own dbdir, which includes:

- ``datafile`` and ``datafile.storable``, which contain information
  about the collected plugins and the graphs to generate
- a directory tree with the rrd files

For ``munin-html`` to run correctly, we need to merge those dbdirs
into one.

Merging files
-------------

``datafile`` is just plain text with lines of ``key value``, so
concatenating all the files is enough.

``datafile.storable`` is a binary representation of the data as loaded
by munin. Merging these files requires some knowledge of munin's
internal structures.

If you also run ``munin-limits`` on the update-masters, each one
generates a ``limits`` file. Those are also plain text.

To make that part easier, a ``munin-mergedb.pl`` script is provided in
contrib.
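
For the plain-text files only, the concatenation can be sketched as
below. The function name and its arguments are assumptions made for
this example, and it does not handle ``datafile.storable``.

```shell
#!/bin/sh
# Sketch: concatenate one plain-text file ("datafile" or "limits")
# from several update-master dbdirs into the html-master dbdir.
merge_plain_files() {
    name="$1"     # file to merge: "datafile" or "limits"
    target="$2"   # dbdir of the html-master
    shift 2       # remaining arguments: dbdirs of the update-masters

    tmp="$target/$name.tmp.$$"
    : > "$tmp" || return 1

    for dbdir in "$@"; do
        # Skip masters that have not produced the file yet.
        if [ -f "$dbdir/$name" ]; then
            cat "$dbdir/$name" >> "$tmp" || return 1
        fi
    done

    # Rename into place so readers never see a half-written file.
    mv "$tmp" "$target/$name"
}
```

With the example paths, ``merge_plain_files datafile /nfs/munin/db/html
/nfs/munin/db/update1 /nfs/munin/db/update2`` would build the merged
``datafile`` for the html-master.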

Merging rrd tree
----------------

The main trick is about the rrd files. As we are using shared NFS, we
can use symlinks to make the trees point to one another, and not have to
duplicate them. (Copies would be hell to keep in sync, which is why we
really need shared NFS storage.)

As we deal with groups, we can just link the top-level groups to a
common rrd tree.

For example, if you have two updaters (update1 and update2) and 4 groups
(customer1, customer2, customer3, customer4), you could set up something
like this::

    /nfs/munin/db/shared-rrd/customer1/
    /nfs/munin/db/shared-rrd/customer2/
    /nfs/munin/db/shared-rrd/customer3/
    /nfs/munin/db/shared-rrd/customer4/
    /nfs/munin/db/update1/customer1 -> ../shared-rrd/customer1
    /nfs/munin/db/update1/customer2 -> ../shared-rrd/customer2
    /nfs/munin/db/update1/customer3 -> ../shared-rrd/customer3
    /nfs/munin/db/update1/customer4 -> ../shared-rrd/customer4
    /nfs/munin/db/update2/customer1 -> ../shared-rrd/customer1
    /nfs/munin/db/update2/customer2 -> ../shared-rrd/customer2
    /nfs/munin/db/update2/customer3 -> ../shared-rrd/customer3
    /nfs/munin/db/update2/customer4 -> ../shared-rrd/customer4
    /nfs/munin/db/html/customer1 -> ../shared-rrd/customer1
    /nfs/munin/db/html/customer2 -> ../shared-rrd/customer2
    /nfs/munin/db/html/customer3 -> ../shared-rrd/customer3
    /nfs/munin/db/html/customer4 -> ../shared-rrd/customer4

At some point, an option to keep the rrd tree separate from the dbdir
may become available, which should remove the need for such links.
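
Until then, the symlink layout shown above can be created with a helper
like the following sketch. The function name and the ``MUNIN_DB``
variable are assumptions for this example. It leaves any existing entry
untouched, so a real group directory that already holds rrd files has to
be moved into ``shared-rrd`` by hand first.

```shell
#!/bin/sh
# Sketch: link the top-level groups of each dbdir to a common rrd tree.
# MUNIN_DB is an assumption matching the /nfs/munin/db example path.
MUNIN_DB="${MUNIN_DB:-/nfs/munin/db}"

link_shared_rrd() {
    master_list="$1"   # space-separated dbdir names, e.g. "update1 update2 html"
    group_list="$2"    # space-separated top-level group names

    # Create the real directories that will hold the rrd files.
    for group in $group_list; do
        mkdir -p "$MUNIN_DB/shared-rrd/$group" || return 1
    done

    for master in $master_list; do
        mkdir -p "$MUNIN_DB/$master" || return 1
        for group in $group_list; do
            link="$MUNIN_DB/$master/$group"
            # Do not touch anything that already exists at this path.
            if [ -e "$link" ] || [ -L "$link" ]; then
                continue
            fi
            # Relative target, as in the layout above.
            ln -s "../shared-rrd/$group" "$link" || return 1
        done
    done
}
```

The layout of the example would be produced by ``link_shared_rrd
"update1 update2 html" "customer1 customer2 customer3 customer4"``.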

Running munin-html
==================

Once you have your update-masters running and a merge ready to go, you
should set up a cron job on the html-master to:

- merge the data as described above
- launch ``munin-limits``, if it is not launched on the update-masters
  and its output merged
- launch ``munin-html`` (required, even if you use cgi)
- launch ``munin-graph``, unless you use cgi-graph
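
The steps above can be sketched as a cron script for the html-master.
Everything configurable here is an assumption made for this example:
``MUNIN_LIBDIR`` is the directory holding the munin programs,
``MERGE_CMD`` is a hypothetical wrapper around your own merge step, and
``RUN_LIMITS`` and ``RUN_GRAPH`` are switches invented for this sketch.

```shell
#!/bin/sh
# Sketch of a cron script for the html-master.
# MUNIN_LIBDIR is an assumption: adjust it to your installation.
MUNIN_LIBDIR="${MUNIN_LIBDIR:-/usr/share/munin}"
# MERGE_CMD is a hypothetical script wrapping your merge step.
MERGE_CMD="${MERGE_CMD:-/usr/local/bin/munin-merge}"
# Hypothetical switches: run limits here only if the update-masters
# do not, and run munin-graph only if you do not use cgi-graph.
RUN_LIMITS="${RUN_LIMITS:-0}"
RUN_GRAPH="${RUN_GRAPH:-0}"

run_html_master() {
    # 1. Merge the data produced by the update-masters.
    "$MERGE_CMD" || return 1

    # 2. Evaluate limits here if the update-masters did not.
    if [ "$RUN_LIMITS" = "1" ]; then
        "$MUNIN_LIBDIR/munin-limits" || return 1
    fi

    # 3. Always run munin-html, even when the pages are served by cgi.
    "$MUNIN_LIBDIR/munin-html" || return 1

    # 4. Generate graphs from cron only when cgi-graph is not used.
    if [ "$RUN_GRAPH" = "1" ]; then
        "$MUNIN_LIBDIR/munin-graph" || return 1
    fi
    return 0
}
```

To use it as a cron script, add ``run_html_master`` as the last line.
Stopping at the first failing step avoids generating pages from a
partial merge.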