File: conf.pod

package info (click to toggle)
ganglia 3.6.0-7
  • links: PTS, VCS
  • area: main
  • in suites: bullseye, buster, sid, stretch
  • size: 6,484 kB
  • ctags: 3,880
  • sloc: ansic: 27,874; sh: 11,052; python: 6,695; makefile: 565; perl: 366; php: 126; xml: 28
file content (651 lines) | stat: -rw-r--r-- 23,477 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
=pod

=head1 NAME

B<gmond.conf> - configuration file for ganglia monitoring
daemon (gmond)

=head1 DESCRIPTION

The gmond.conf file is used to configure the ganglia
monitoring daemon (gmond) which is part of the B<Ganglia
Distributed Monitoring System>.  

=head1 SECTIONS AND ATTRIBUTES

All sections and attributes are case-insensitive.  For example,
B<name> or B<NAME> or B<Name> or B<NaMe> are all equivalent.

Some sections can be included in the configuration file multiple
times and some sections are singular.  For example, you can
have only one B<cluster> section to define the attributes of
the cluster being monitored; however, you can have multiple
B<udp_recv_channel> sections to allow gmond to receive message
on multiple UDP channels.

=head2 cluster

There should only be one B<cluster> section defined.  This
section controls how gmond reports the attributes of the
cluster that it is part of.

The B<cluster> section has four attributes: B<name>,
B<owner>, B<latlong> and B<url>.

For example,

  cluster {
    name = "Millennium Cluster"
    owner = "UC Berkeley CS Dept."
    latlong = "N37.37 W122.23"
    url = "http://www.millennium.berkeley.edu/"
  }

The B<name> attributes specifies the name of the cluster of 
machines.  The B<owner> tag specifies the administrators of 
the cluster.  The pair B<name>/B<owner> should be unique
to all clusters in the world.

The B<latlong> attribute is the latitude and longitude GPS 
coordinates of this cluster on earth.  Specified to 1 mile 
accuracy with two decimal places per axis in decimal.

The B<url> for more information on the B<cluster>. 
Intended to give purpose, owner, administration, and account details 
for this cluster.

There directives directly control the XML output of gmond.  For
example, the cluster configuration example above would translate
into the following XML.

  <CLUSTER NAME="Millennium Cluster" OWNER="UC Berkeley CS Dept."
           LATLONG="N37.37 W122.23" URL="http://www.millennium.berkeley.edu/">
  ...
  </CLUSTER>

=head2 host

The B<host> section provides information about the host running this
instance of B<gmond>. Currently only the B<location> string attribute is
supported. Example:

 host {
   location = "1,2,3"
 }

The numbers represent Rack, Rank and Plane respectively.

=head2 globals

The B<globals> section controls general characteristics of gmond
such as whether is should daemonize, what user it should run as,
whether is should send/receive date and such.  The B<globals>
section has the following attributes: B<daemonize>, B<setuid>, B<user>,
B<debug_level>, B<mute>, B<deaf>, B<allow_extra_data>, B<host_dmax>,
B<host_tmax>, B<cleanup_threshold>, B<gexec>, B<send_metadata_interval>
and B<module_dir>.

For example,

  globals {
    daemonize = true
    setuid = true
    user = nobody
    host_dmax = 3600
    host_tmax = 40
  }

The B<daemonize> attribute is a boolean.  When true, B<gmond> will 
daemonize.  When false, B<gmond> will run in the foreground.

The B<setuid> attribute is a boolean.  When true, B<gmond> will
set its effective UID to the uid of the user specified by the B<user>
attribute.  When false, B<gmond> will not change its effective user.

The B<debug_level> is an integer value.  When set to zero (0), B<gmond>
will run normally.  A B<debug_level> greater than zero will result in
B<gmond> running in the foreground and outputting debugging information.
The higher the B<debug_level> the more verbose the output.

The B<mute> attribute is a boolean.  When true, B<gmond> will not 
send data regardless of any other configuration directives.

The B<deaf> attribute is a boolean.  When true, B<gmond> will not 
receive data regardless of any other configuration directives.

The B<allow_extra_data> attribute is a boolean.  When false, B<gmond> will
not send out the EXTRA_ELEMENT and EXTRA_DATA parts of the XML.  This might
be useful if you are using your own frontend to the metric data and will
like to save some bandwith.

The B<host_dmax> value is an integer with units in seconds.  When set 
to zero (0), B<gmond> will never delete a host from its list even when 
a remote host has stopped reporting.  If B<host_dmax> is set to a
positive number then B<gmond> will flush a host after it has not heard
from it for B<host_dmax> seconds.  By the way, dmax means "delete max".

The B<host_tmax> value is an integer with units in seconds. This value
represents the maximum amount of time that B<gmond> should wait between
updates from a host. As messages may get lost in the network, B<gmond>
will consider the host as being down if it has not received any messages
from it after 4 times this value. For example, if B<host_tmax> is set 
to 20, the host will appear as down after 80 seconds with no messages
from it. By the way, tmax means "timeout max".

The B<cleanup_threshold> is the minimum amount of time before gmond
will cleanup any hosts or metrics where B<tn> > B<dmax> a.k.a. expired
data.

The B<gexec> boolean allows you to specify whether gmond will announce
the hosts availability to run gexec jobs.  B<Note>: this requires
that B<gexecd> is running on the host and the proper keys have been
installed.  

The B<send_metadata_interval> establishes an interval in which gmond
will send or resend the metadata packets that describe each enabled 
metric. This directive by default is set to 0 which means that gmond will
only send the metadata packets at startup and upon request from other 
gmond nodes running remotely. If a new machine running gmond is added
to a cluster, it needs to announce itself and inform all other nodes of the
metrics that it currently supports. In multicast mode, this isn't a problem
because any node can request the metadata of all other nodes in the cluster.
However in unicast mode, a resend interval must be established. The interval
value is the minimum number of seconds between resends.

The B<override_hostname> and B<override_ip> parameters allow an arbitrary
hostname and/or IP (hostname can be optionally specified without IP) to
use when identifying metrics coming from this host.

The B<module_dir> is an optional parameter indicating the directory where
the DSO modules are to be located.  If absent, the value to use is set at
configure time with the --with-moduledir option which will default if omitted
to the a subdirectory named "ganglia" in the directory where libganglia will
be installed.

For example, in a 32-bit Intel compatible Linux host that is usually:

  /usr/lib/ganglia

=head2 udp_send_channel

You can define as many B<udp_send_channel> sections as you like within
the limitations of memory and file descriptors.  If B<gmond> is configured
as B<mute> this section will be ignored.

The B<udp_send_channel> has a total of seven attributes: B<mcast_join>,
B<mcast_if>, B<host>, B<port>, B<ttl>, B<bind> and B<bind_hostname>.
B<bind> and B<bind_hostname> are mutually exclusive.

For example, the 2.5.x version gmond would send on the following single channel
by default...

  udp_send_channel {
    mcast_join = 239.2.11.71
    port       = 8649
  }

The B<mcast_join> and B<mcast_if> attributes are optional.  When specified
B<gmond> will create the UDP socket and join the B<mcast_join> multicast group
and send data out the interface specified by B<mcast_if>.

You can use the B<bind> attribute to bind to a particular local address to
be used as the source for the multicast packets sent or let gmond resolve the
default hostname if B<bind_hostname> = yes.

If only a B<host> and B<port> are specified then B<gmond> will send unicast UDP
messages to the hosts specified.

You could specify multiple unicast hosts for redundancy as B<gmond> will send
UDP messages to all UDP channels.

Be careful though not to mix multicast and unicast attributes in the same
udp_send_channel definition.

For example...

  udp_send_channel {
    host = host.foo.com
    port = 2389
  }
  udp_send_channel {
    host = 192.168.3.4
    port = 2344
  }

would configure gmond to send messages to two hosts.  The B<host> specification
can be an IPv4/IPv6 address or a resolvable hostname.

The B<ttl> attribute lets you modify the Time-To-Live (TTL) of outgoing messages
(unicast or multicast).

=head2 udp_recv_channel

You can specify as many B<udp_recv_channel> sections as you like within the 
limits of memory and file descriptors.  If B<gmond> is configured B<deaf>
this attribute will be ignored.

The B<udp_recv_channel> section has following attributes:
B<mcast_join>, B<bind>, B<port>, B<mcast_if>, B<family>, B<retry_bind>
and B<buffer>.
The B<udp_recv_channel> can also have an B<acl> definition (see
ACCESS CONTROL LISTS below).

For example, the 2.5.x gmond ran with a single udp receive channel...

  udp_recv_channel {
    mcast_join = 239.2.11.71
    bind       = 239.2.11.71
    port       = 8649
  }

The B<mcast_join> and B<mcast_if> should only be used if you want to 
have this UDP channel receive multicast packets the multicast
group B<mcast_join> on interface B<mcast_if>.  If you do not specify
multicast attributes then B<gmond> will simply create a UDP server
on the specified B<port>.

You can use the B<bind> attribute to bind to a particular local address.

The family address is set to B<inet4> by default.  If you want to bind
the port to an B<inet6> port, you need to specify that in the family
attribute.  Ganglia will not allow IPV6=>IPV4 mapping (for portability
and security reasons).  If you want to listen on both B<inet4> and
B<inet6> for a particular port, explicitly state it with the following:

  udp_recv_channel {
    port = 8666
    family = inet4
  }
  udp_recv_channel {
    port = 8666
    family = inet6
  }

If you specify a bind address, the family of that address takes precedence.
f your IPv6 stack doesn't support IPV6_V6ONLY, a warning will be issued
but gmond will continue working (this should rarely happen).

Multicast Note: for multicast, specifying a B<bind> address with the same
value used for B<mcast_join> will prevent unicast UDP messages to the same
B<port> from being processed.

The sFlow protocol (see http://www.sflow.org) can be used to collect
a standard set of performance metrics from servers. For servers that
don't include embedded sFlow agents, an open source sFlow agent is available
on SourceForge (see http://host-sflow.sourceforge.net).

To configure B<gmond> to receive sFlow datagrams, simply
add a B<udp_recv_channel> with the B<port> set to 6343 (the IANA registered
port for sFlow):

  udp_recv_channel {
    port = 6343
  }

Note: sFlow is unicast protocol, so don't include B<mcast_join> join.
Note: To use some other port for sFlow, set it here and then specify the port
in an B<sflow> section (see below).

B<gmond> will fail to run if it can't bind to all defined
B<udp_recv_channel>s.  Sometimes, on machines configured by DHCP,
for example, the B<gmond> daemon starts before a network address is
assigned to the interface.  Consequently, the bind fails and the 
B<gmond> daemon does not run.  To assist in this situation, the
boolean parameter B<retry_bind> can be set to the value B<true>
and then the daemon will not abort on failure, it will enter a
loop and repeat the bind attempt every 60 seconds:

  udp_recv_channel {
    port = 6343
    retry_bind = true
  }

If you have a large system with lots of metrics, you might experience
UDP drops. This happens when B<gmond> is not able to process the UDP
fast enough from the network. In this case you might consider changing
your setup into a more distributed setup using aggregator B<gmond> hosts.
Alternatively you can choose to create a bigger receive B<buffer>:

  udp_recv_channel {
    port = 6343
    buffer = 10485760
  }
B<buffer> is specified in bytes, i.e.: 10485760 will allow 10MB UDP 
to be buffered in memory.

Note: increasing buffer size will increase memory usage by B<gmond>

=head2 tcp_accept_channel

You can specify as many B<tcp_accept_channel> sections as you like
within the limitations of memory and file descriptors.  If B<gmond>
is configured to be B<mute>, then these sections are ignored.

The B<tcp_accept_channel> has the following attributes: B<bind>, B<port>, 
B<interface>, B<family> and B<timeout>.  A B<tcp_accept_channel> may also have
an B<acl> section specified (see ACCESS CONTROL LISTS below).

For example, 2.5.x gmond would accept connections on a single TCP
channel.

  tcp_accept_channel {
    port = 8649
  }

The B<bind> address is optional and allows you to specify which 
local address B<gmond> will bind to for this channel.

The B<port> is an integer than specifies which port to answer 
requests for data.

The B<family> address is set to B<inet4> by default.  If you want to bind
the port to an B<inet6> port, you need to specify that in the family
attribute.  Ganglia will not allow IPV6=>IPV4 mapping (for portability
and security reasons).  If you want to listen on both B<inet4> and
B<inet6> for a particular port, explicitly state it with the following:

  tcp_accept_channel {
    port = 8666
    family = inet4
  }
  tcp_accept_channel {
    port = 8666
    family = inet6
  }

If you specify a bind address, the family of that address takes precedence.
If your IPv6 stack doesn't support IPV6_V6ONLY, a warning will be issued
but gmond will continue working (this should rarely happen).

The B<timeout> attribute allows you to specify how many microseconds to block
before closing a connection to a client.  The default is set to -1 (blocking
IO) and will never abort a connection regardless of how slow the client is
in fetching the report data.

The B<interface> is not implemented at this time (use B<bind>).

=head2 collection_group

You can specify as many B<collection_group> section as you like
within the limitations of memory.  A B<collection_group> has
the following attributes: B<collect_once>, B<collect_every>
and B<time_threshold>.  A B<collection_group> must also contain one
or more B<metric> sections.

The B<metric> section has the following attributes: (one of B<name> 
or B<name_match>; B<name_match> is only permitted if pcre support is
compiled in), B<value_threshold> and B<title>.  For a list of 
available metric names, run the following command:

  % gmond -m

Here is an example of a collection group for a static metric...

  collection_group {
    collect_once   = yes
    time_threshold = 1800
    metric {
     name = "cpu_num"
     title = "Number of CPUs"
    }
  }

This B<collection_group> entry would cause gmond to collect the 
B<cpu_num> metric once at startup (since the number of CPUs will not 
change between reboots).  The metric B<cpu_num> would be send
every 1/2 hour (1800 seconds).  The default value for the B<time_threshold>
is 3600 seconds if no B<time_threshold> is specified.

The B<time_threshold> is the maximum amount of time that can pass before
gmond sends all B<metric>s specified in the B<collection_group> to all
configured B<udp_send_channel>s.  A B<metric> may be sent before this
B<time_threshold> is met if during collection the value surpasses the
B<value_threshold> (explained below).

Here is an example of a collection group for a volatile metric...

  collection_group {
    collect_every = 60
    time_threshold = 300
    metric {
      name = "cpu_user"
      value_threshold = 5.0
      title = "CPU User"
    }
    metric {
      name = "cpu_idle"
      value_threshold = 10.0
      title = "CPU Idle"
    }
  }

This collection group would collect the B<cpu_user> and B<cpu_idle> metrics
every 60 seconds (specified in B<collect_every>).  If B<cpu_user> varies by
5.0% or B<cpu_idle> varies by 10.0%, then the entire B<collection_group>
is sent.  If no B<value_threshold> is triggered within B<time_threshold>
seconds (in this case 300), the entire B<collection_group> is sent.

Each time the metric value is collected the new value is compared with
the old value collected.  If the difference between the last value and
the current value is greater than the B<value_threshold>, the entire
collection group is send to the B<udp_send_channel>s defined.

It's important to note that all metrics in a collection group are sent
even when only a single B<value_threshold> is surpassed.

In addition a user friendly title can be substituted for the metric name
by including a B<title> within the B<metric> section.

By using the B<name_match> parameter instead of B<name>, it is possible
to use a single definition to configure multiple metrics that match a
regular expression.  The perl compatible regular expression (pcre) syntax
is used.  This approach is particularly useful for a series of metrics
that may vary in number between reboots (e.g. metric names that
are generated for each individual NIC or CPU core).

Here is an example of using the B<name_match> directive to enable
the multicpu metrics:

  metric {
    name_match = "multicpu_([a-z]+)([0-9]+)"
    value_threshold = 1.0
    title = "CPU-\\2 \\1"
  }

Note that in the example above, there are two matches: the alphabetical
match matches the variations of the metric name (e.g. B<idle>, B<system>)
while the numeric match matches the CPU core number.  The second thing
to note is the use of substitutions within the argument to B<title>.

If both B<name> and B<name_match> are specified, then B<name> is ignored.

=head2 Modules

A B<modules> section contains the parameters that are necessary to load a
metric module. A metric module is a dynamically loadable module that 
extends the available metrics that gmond is able to collect. Each B<modules>
section contains at least one B<module> section.  Within a B<module> section
are the directives B<name>, B<language>, B<enabled>, B<path> and B<params>.  
The module B<name> is the name of the module as determined by the module 
structure if the module was developed in C/C++.  Alternatively, the 
B<name> can be the name of the source file if the module has been 
implemented in a interpreted language such as python.  A B<language> 
designation must be specified as a string value for each module.  The 
B<language> directive must correspond to the source code language in 
which the module was implemented (ex. language = "python").  If a 
B<language> directive does not exist for the module, the assumed 
language will be "C/C++". The B<enabled> directive allows a metric module
to be easily enabled or disabled through the configuration file. If the
B<enabled> directive is not included in the module configuration, the 
enabled state will default to "yes". One thing to note is that if a 
module has been disabled yet the metric which that module implements 
is still listed as part of a collection group, gmond will produce a 
warning message.  However gmond will continue to function normally 
by simply ignoring the metric. The B<path> is the path from which 
gmond is expected to load the  module (C/C++ compiled dynamically 
loadable module only).  The B<params> directive can be used to pass 
a single string parameter directly to the module initialization 
function (C/C++ module only). Multiple parameters can be passed to 
the module's initialization function by including one or more 
B<param> sections. Each B<param> section must be named and contain 
a B<value> directive. Once a module has been loaded, the additional 
metrics can be discovered by invoking B<gmond -m>.

   modules {
     module {
       name = "example_module"
       language = "C/C++"
       enabled = yes
       path = "modexample.so"
       params = "An extra raw parameter"
       param RandomMax {
         value = 75
       }
       param ConstantValue {
         value = 25
       }
     }
   }

=head2 sFlow

The B<sflow> group is optional and has the following optional
attributes: B<udp_port>, B<accept_vm_metrics>, B<accept_http_metrics>,
B<accept_memcache_metrics>, B<accept_jvm_metrics>,
B<multiple_http_instances>,B<multiple_memcache_instances>,
B<multiple_jvm_instances>. By default, a
B<udp_recv_channel> on port 6343 (the IANA registered port for
sFlow) is all that is required to accept and process sFlow
datagrams.  To receive sFlow on some other port requires both
a B<udp_recv_channel> for the other port and a B<udp_port>
setting here. For example:

   udp_recv_channel {
     port = 7343
   }

   sflow {
     udp_port = 7343
   }

An sFlow agent running on a hypervisor may also be sending
metrics for its local virtual machines.  By default these
metrics are ignored, but the B<accept_vm_metrics> flag can
be used to accept those metrics too,  and prefix them with
an identifier for each virtual machine.

   sflow {
     accept_vm_metrics = yes
   }

The sFlow feed may also contain metrics sent from HTTP or memcached
servers,  or from Java VMs.  Extra options can be used to ignore
or accept these metrics,  and to indicate that there may be multiple
instances per host.  For example:

    sflow {
      accept_http_metrics = yes
      multiple_http_instances = yes
    }

will allow the HTTP metrics, and also mark them with a distinguishing
identifier so that each instance can be trended separately.  (If multiple
instances are reporting and this flag is not set,  the results are likely
to be garbled.)

=head2 Include

This directive allows the user to include additional configuration files
rather than having to add all gmond configuration directives to the
gmond.conf file.  The following example includes any file with the extension
of .conf contained in the directory conf.d as if the contents of the included 
configuration files were part of the original gmond.conf file. This allows 
the user to modularize their configuration file.  One usage example might 
be to load individual metric modules by including module specific .conf files.

include ('/etc/ganglia/conf.d/*.conf')

=head1 ACCESS CONTROL

The B<udp_recv_channel> and B<tcp_accept_channel> directives
can contain an Access Control List (ACL).  This ACL allows you to specify
exactly which hosts gmond process data from.

An example of an B<acl> entry looks like

  acl {
    default = "deny"
    access {
      ip = 192.168.0.4
      mask = 32
      action = "allow"
    }
  }

This ACL will by default reject all traffic that is not specifically from
host 192.168.0.4 (the mask size for an IPv4 address is 32, the mask size
for an IPv6 address is 128 to represent a single host).

Here is another example

  acl {
    default = "allow"
    access {
      ip = 192.168.0.0
      mask = 24
      action = "deny"
    }
    access {
      ip = ::ff:1.2.3.0
      mask = 120
      action = "deny"
    }
  }

This ACL will by default allow all traffic unless it comes from the 
two subnets specified with action = "deny".

=head1 EXAMPLE

The default behavior for a 2.5.x gmond would be specified as...

  udp_recv_channel {
    mcast_join = 239.2.11.71
    bind       = 239.2.11.71
    port       = 8649
  }
  udp_send_channel {
    mcast_join = 239.2.11.71
    port       = 8649
  }
  tcp_accept_channel {
    port       = 8649
  }

To see the complete default configuration for gmond simply run:

  % gmond -t

gmond will print out its default behavior in a configuration file
and then exit.  Capturing this output to a file can serve as
a useful starting point for creating your own custom configuration.

  % gmond -t > custom.conf

edit B<custom.conf> to taste and then

  % gmond -c ./custom.conf

=head1 SEE ALSO

gmond(1).

=head1 NOTES

The ganglia web site is at http://ganglia.info/.

=head1 COPYRIGHT

Copyright (c) 2005 The University of California, Berkeley

=cut