1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520
|
<!--
title: "Netdata daemon"
date: 2020-04-29
custom_edit_url: https://github.com/netdata/netdata/edit/master/daemon/README.md
-->
# Netdata daemon
## Starting netdata
- You can start Netdata by executing it with `/usr/sbin/netdata` (the installer will also start it).
- You can stop Netdata by killing it with `killall netdata`. You can stop and start Netdata at any point. When
exiting, the [database engine](/database/engine/README.md) saves metrics to `/var/cache/netdata/dbengine/` so that
it can continue when started again.
Access to the web site, for all graphs, is by default on port `19999`, so go to:
```sh
http://127.0.0.1:19999/
```
You can get the running config file at any time, by accessing `http://127.0.0.1:19999/netdata.conf`.
### Starting Netdata at boot
In the `system` directory you can find scripts and configurations for the
various distros.
#### systemd
The installer already installs `netdata.service` if it detects a systemd system.
To install `netdata.service` by hand, run:
```sh
# stop Netdata
killall netdata
# copy netdata.service to systemd
cp system/netdata.service /etc/systemd/system/
# let systemd know there is a new service
systemctl daemon-reload
# enable Netdata at boot
systemctl enable netdata
# start Netdata
systemctl start netdata
```
#### init.d
In the system directory you can find `netdata-lsb`. Copy it to the proper place according to your distribution
documentation. For Ubuntu, this can be done via running the following commands as root.
```sh
# copy the Netdata startup file to /etc/init.d
cp system/netdata-lsb /etc/init.d/netdata
# make sure it is executable
chmod +x /etc/init.d/netdata
# enable it
update-rc.d netdata defaults
```
#### openrc (gentoo)
In the `system` directory you can find `netdata-openrc`. Copy it to the proper
place according to your distribution documentation.
#### CentOS / Red Hat Enterprise Linux
For older versions of RHEL/CentOS that don't have systemd, an init script is included in the system directory. This can
be installed by running the following commands as root.
```sh
# copy the Netdata startup file to /etc/init.d
cp system/netdata-init-d /etc/init.d/netdata
# make sure it is executable
chmod +x /etc/init.d/netdata
# enable it
chkconfig --add netdata
```
_There have been some recent work on the init script, see PR
<https://github.com/netdata/netdata/pull/403>_
#### other systems
You can start Netdata by running it from `/etc/rc.local` or equivalent.
## Command line options
Normally you don't need to supply any command line arguments to netdata.
If you do though, they override the configuration equivalent options.
To get a list of all command line parameters supported, run:
```sh
netdata -h
```
The program will print the supported command line parameters.
The command line options of the Netdata 1.10.0 version are the following:
```sh
^
|.-. .-. .-. .-. . netdata
| '-' '-' '-' '-' real-time performance monitoring, done right!
+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+--->
Copyright (C) 2016-2022, Netdata, Inc. <info@netdata.cloud>
Released under GNU General Public License v3 or later.
All rights reserved.
Home Page : https://netdata.cloud
Source Code: https://github.com/netdata/netdata
Docs : https://learn.netdata.cloud
Support : https://github.com/netdata/netdata/issues
License : https://github.com/netdata/netdata/blob/master/LICENSE.md
Twitter : https://twitter.com/linuxnetdata
LinkedIn : https://linkedin.com/company/netdata-cloud/
Facebook : https://facebook.com/linuxnetdata/
SYNOPSIS: netdata [options]
Options:
-c filename Configuration file to load.
Default: /etc/netdata/netdata.conf
-D Do not fork. Run in the foreground.
Default: run in the background
-h Display this help message.
-P filename File to save a pid while running.
Default: do not save pid to a file
-i IP The IP address to listen to.
Default: all IP addresses IPv4 and IPv6
-p port API/Web port to use.
Default: 19999
-s path Prefix for /proc and /sys (for containers).
Default: no prefix
-t seconds The internal clock of netdata.
Default: 1
-u username Run as user.
Default: netdata
-v Print netdata version and exit.
-V Print netdata version and exit.
-W options See Advanced options below.
Advanced options:
-W stacksize=N Set the stacksize (in bytes).
-W debug_flags=N Set runtime tracing to debug.log.
-W unittest Run internal unittests and exit.
-W createdataset=N Create a DB engine dataset of N seconds and exit.
-W set section option value
set netdata.conf option from the command line.
-W buildinfo Print the version, the configure options,
a list of optional features, and whether they
are enabled or not.
-W buildinfojson Print the version, the configure options,
a list of optional features, and whether they
are enabled or not, in JSON format.
-W simple-pattern pattern string
Check if string matches pattern and exit.
-W "claim -token=TOKEN -rooms=ROOM1,ROOM2 url=https://api.netdata.cloud"
Connect the agent to the workspace rooms pointed to by TOKEN and ROOM*.
Signals netdata handles:
- HUP Close and reopen log files.
- USR1 Save internal DB to disk.
- USR2 Reload health configuration.
```
You can send commands during runtime via [netdatacli](/cli/README.md).
## Log files
Netdata uses 3 log files:
1. `error.log`
2. `access.log`
3. `debug.log`
Any of them can be disabled by setting it to `/dev/null` or `none` in `netdata.conf`. By default `error.log` and
`access.log` are enabled. `debug.log` is only enabled if debugging/tracing is also enabled (Netdata needs to be compiled
with debugging enabled).
Log files are stored in `/var/log/netdata/` by default.
### error.log
The `error.log` is the `stderr` of the `netdata` daemon and all external plugins
run by `netdata`.
So if any process, in the Netdata process tree, writes anything to its standard error,
it will appear in `error.log`.
For most Netdata programs (including standard external plugins shipped by netdata), the following lines may appear:
| tag | description |
|:-:|:----------|
| `INFO` | Something important the user should know. |
| `ERROR` | Something that might disable a part of netdata.<br/>The log line includes `errno` (if it is not zero). |
| `FATAL` | Something prevented a program from running.<br/>The log line includes `errno` (if it is not zero) and the program exited. |
So, when auto-detection of data collection fail, `ERROR` lines are logged and the relevant modules are disabled, but the
program continues to run.
When a Netdata program cannot run at all, a `FATAL` line is logged.
### access.log
The `access.log` logs web requests. The format is:
```txt
DATE: ID: (sent/all = SENT_BYTES/ALL_BYTES bytes PERCENT_COMPRESSION%, prep/sent/total PREP_TIME/SENT_TIME/TOTAL_TIME ms): ACTION CODE URL
```
where:
- `ID` is the client ID. Client IDs are auto-incremented every time a client connects to netdata.
- `SENT_BYTES` is the number of bytes sent to the client, without the HTTP response header.
- `ALL_BYTES` is the number of bytes of the response, before compression.
- `PERCENT_COMPRESSION` is the percentage of traffic saved due to compression.
- `PREP_TIME` is the time in milliseconds needed to prepared the response.
- `SENT_TIME` is the time in milliseconds needed to sent the response to the client.
- `TOTAL_TIME` is the total time the request was inside Netdata (from the first byte of the request to the last byte
of the response).
- `ACTION` can be `filecopy`, `options` (used in CORS), `data` (API call).
### debug.log
See [debugging](#debugging).
## Netdata process scheduling policy
By default Netdata versions prior to 1.34.0 run with the `idle` process scheduling policy, so that it uses CPU
resources, only when there is idle CPU to spare. On very busy servers (or weak servers), this can lead to gaps on
the charts.
Starting with version 1.34.0, Netdata instead uses the `batch` scheduling policy by default. This largely eliminates
issues with gaps in charts on busy systems while still keeping the impact on the rest of the system low.
You can set Netdata scheduling policy in `netdata.conf`, like this:
```conf
[global]
process scheduling policy = idle
```
You can use the following:
| policy | description |
| :-----------------------: | :---------- |
| `idle` | use CPU only when there is spare - this is lower than nice 19 - it is the default for Netdata and it is so low that Netdata will run in "slow motion" under extreme system load, resulting in short (1-2 seconds) gaps at the charts. |
| `other`<br/>or<br/>`nice` | this is the default policy for all processes under Linux. It provides dynamic priorities based on the `nice` level of each process. Check below for setting this `nice` level for netdata. |
| `batch` | This policy is similar to `other` in that it schedules the thread according to its dynamic priority (based on the `nice` value). The difference is that this policy will cause the scheduler to always assume that the thread is CPU-intensive. Consequently, the scheduler will apply a small scheduling penalty with respect to wake-up behavior, so that this thread is mildly disfavored in scheduling decisions. |
| `fifo` | `fifo` can be used only with static priorities higher than 0, which means that when a `fifo` threads becomes runnable, it will always immediately preempt any currently running `other`, `batch`, or `idle` thread. `fifo` is a simple scheduling algorithm without time slicing. |
| `rr` | a simple enhancement of `fifo`. Everything described above for `fifo` also applies to `rr`, except that each thread is allowed to run only for a maximum time quantum. |
| `keep`<br/>or<br/>`none` | do not set scheduling policy, priority or nice level - i.e. keep running with whatever it is set already (e.g. by systemd). |
For more information see `man sched`.
### scheduling priority for `rr` and `fifo`
Once the policy is set to one of `rr` or `fifo`, the following will appear:
```conf
[global]
process scheduling priority = 0
```
These priorities are usually from 0 to 99. Higher numbers make the process more
important.
### nice level for policies `other` or `batch`
When the policy is set to `other`, `nice`, or `batch`, the following will appear:
```conf
[global]
process nice level = 19
```
## scheduling settings and systemd
Netdata will not be able to set its scheduling policy and priority to more important values when it is started as the
`netdata` user (systemd case).
You can set these settings at `/etc/systemd/system/netdata.service`:
```sh
[Service]
# By default Netdata switches to scheduling policy idle, which makes it use CPU, only
# when there is spare available.
# Valid policies: other (the system default) | batch | idle | fifo | rr
#CPUSchedulingPolicy=other
# This sets the maximum scheduling priority Netdata can set (for policies: rr and fifo).
# Netdata (via [global].process scheduling priority in netdata.conf) can only lower this value.
# Priority gets values 1 (lowest) to 99 (highest).
#CPUSchedulingPriority=1
# For scheduling policy 'other' and 'batch', this sets the lowest niceness of netdata.
# Netdata (via [global].process nice level in netdata.conf) can only increase the value set here.
#Nice=0
```
Run `systemctl daemon-reload` to reload these changes.
Now, tell Netdata to keep these settings, as set by systemd, by editing
`netdata.conf` and setting:
```conf
[global]
process scheduling policy = keep
```
Using the above, whatever scheduling settings you have set at `netdata.service`
will be maintained by netdata.
### Example 1: Netdata with nice -1 on non-systemd systems
On a system that is not based on systemd, to make Netdata run with nice level -1 (a little bit higher to the default for
all programs), edit `netdata.conf` and set:
```conf
[global]
process scheduling policy = other
process nice level = -1
```
then execute this to [restart Netdata](/docs/configure/start-stop-restart.md):
```sh
sudo systemctl restart netdata
```
#### Example 2: Netdata with nice -1 on systemd systems
On a system that is based on systemd, to make Netdata run with nice level -1 (a little bit higher to the default for all
programs), edit `netdata.conf` and set:
```conf
[global]
process scheduling policy = keep
```
edit /etc/systemd/system/netdata.service and set:
```sh
[Service]
CPUSchedulingPolicy=other
Nice=-1
```
then execute:
```sh
sudo systemctl daemon-reload
sudo systemctl restart netdata
```
## Virtual memory
You may notice that netdata's virtual memory size, as reported by `ps` or `/proc/pid/status` (or even netdata's
applications virtual memory chart) is unrealistically high.
For example, it may be reported to be 150+MB, even if the resident memory size is just 25MB. Similar values may be
reported for Netdata plugins too.
Check this for example: A Netdata installation with default settings on Ubuntu
16.04LTS. The top chart is **real memory used**, while the bottom one is
**virtual memory**:

### Why does this happen?
The system memory allocator allocates virtual memory arenas, per thread running. On Linux systems this defaults to 16MB
per thread on 64 bit machines. So, if you get the difference between real and virtual memory and divide it by 16MB you
will roughly get the number of threads running.
The system does this for speed. Having a separate memory arena for each thread, allows the threads to run in parallel in
multi-core systems, without any locks between them.
This behaviour is system specific. For example, the chart above when running
Netdata on Alpine Linux (that uses **musl** instead of **glibc**) is this:

### Can we do anything to lower it?
Since Netdata already uses minimal memory allocations while it runs (i.e. it adapts its memory on start, so that while
repeatedly collects data it does not do memory allocations), it already instructs the system memory allocator to
minimize the memory arenas for each thread. We have also added [2 configuration
options](https://github.com/netdata/netdata/blob/5645b1ee35248d94e6931b64a8688f7f0d865ec6/src/main.c#L410-L418) to allow
you tweak these settings: `glibc malloc arena max for plugins` and `glibc malloc arena max for netdata`.
However, even if we instructed the memory allocator to use just one arena, it
seems it allocates an arena per thread.
Netdata also supports `jemalloc` and `tcmalloc`, however both behave exactly the
same to the glibc memory allocator in this aspect.
### Is this a problem?
No, it is not.
Linux reserves real memory (physical RAM) in pages (on x86 machines pages are 4KB each). So even if the system memory
allocator is allocating huge amounts of virtual memory, only the 4KB pages that are actually used are reserving physical
RAM. The **real memory** chart on Netdata application section, shows the amount of physical memory these pages occupy(it
accounts the whole pages, even if parts of them are actually used).
## Debugging
When you compile Netdata with debugging:
1. compiler optimizations for your CPU are disabled (Netdata will run somewhat slower)
2. a lot of code is added all over netdata, to log debug messages to `/var/log/netdata/debug.log`. However, nothing is
printed by default. Netdata allows you to select which sections of Netdata you want to trace. Tracing is activated
via the config option `debug flags`. It accepts a hex number, to enable or disable specific sections. You can find
the options supported at [log.h](https://raw.githubusercontent.com/netdata/netdata/master/libnetdata/log/log.h).
They are the `D_*` defines. The value `0xffffffffffffffff` will enable all possible debug flags.
Once Netdata is compiled with debugging and tracing is enabled for a few sections, the file `/var/log/netdata/debug.log`
will contain the messages.
> Do not forget to disable tracing (`debug flags = 0`) when you are done tracing. The file `debug.log` can grow too
> fast.
### compiling Netdata with debugging
To compile Netdata with debugging, use this:
```sh
# step into the Netdata source directory
cd /usr/src/netdata.git
# run the installer with debugging enabled
CFLAGS="-O1 -ggdb -DNETDATA_INTERNAL_CHECKS=1" ./netdata-installer.sh
```
The above will compile and install Netdata with debugging info embedded. You can now use `debug flags` to set the
section(s) you need to trace.
### debugging crashes
We have made the most to make Netdata crash free. If however, Netdata crashes on your system, it would be very helpful
to provide stack traces of the crash. Without them, is will be almost impossible to find the issue (the code base is
quite large to find such an issue by just observing it).
To provide stack traces, **you need to have Netdata compiled with debugging**. There is no need to enable any tracing
(`debug flags`).
Then you need to be in one of the following 2 cases:
1. Netdata crashes and you have a core dump
2. you can reproduce the crash
If you are not on these cases, you need to find a way to be (i.e. if your system does not produce core dumps, check your
distro documentation to enable them).
### Netdata crashes and you have a core dump
> you need to have Netdata compiled with debugging info for this to work (check above)
Run the following command and post the output on a github issue.
```sh
gdb $(which netdata) /path/to/core/dump
```
### you can reproduce a Netdata crash on your system
> you need to have Netdata compiled with debugging info for this to work (check above)
Install the package `valgrind` and run:
```sh
valgrind $(which netdata) -D
```
Netdata will start and it will be a lot slower. Now reproduce the crash and `valgrind` will dump on your console the
stack trace. Open a new github issue and post the output.
|