File: README

package info (click to toggle)
pgbadger 13.2-1
  • links: PTS, VCS
  • area: main
  • in suites: sid
  • size: 8,960 kB
  • sloc: perl: 22,613; javascript: 22,274; sh: 4; makefile: 2
file content (816 lines) | stat: -rw-r--r-- 41,121 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
NAME
    pgBadger - a fast PostgreSQL log analysis report

SYNOPSIS
    Usage: pgbadger [options] logfile [...]

        PostgreSQL log analyzer with fully detailed reports and graphs.

    Arguments:

        logfile can be a single log file, a list of files, or a shell command
        returning a list of files. If you want to pass log content from stdin
        use - as filename. Note that input from stdin will not work with csvlog.

    Options:

        -a | --average minutes : number of minutes to build the average graphs of
                                 queries and connections. Default 5 minutes.
        -A | --histo-average min: number of minutes to build the histogram graphs
                                 of queries. Default 60 minutes.
        -b | --begin datetime  : start date/time for the data to be parsed in log
                                 (either a timestamp or a time)
        -c | --dbclient host   : only report on entries for the given client host.
        -C | --nocomment       : remove comments like /* ... */ from queries.
        -d | --dbname database : only report on entries for the given database.
        -D | --dns-resolv      : client ip addresses are replaced by their DNS name.
                                 Be warned that this can really slow down pgBadger.
        -e | --end datetime    : end date/time for the data to be parsed in log
                                 (either a timestamp or a time)
        -E | --explode         : explode the main report by generating one report
                                 per database. Global information not related to a
                                 database is added to the postgres database report.
        -f | --format logtype  : possible values: syslog, syslog2, stderr, jsonlog,
                                 csv, pgbouncer, logplex, rds and redshift. Use this
                                 option when pgBadger is not able to detect the log
                                 format.
        -G | --nograph         : disable graphs on HTML output. Enabled by default.
        -h | --help            : show this message and exit.
        -H | --html-outdir path: path to directory where HTML report must be written
                                 in incremental mode, binary files stay on directory
                                 defined with -O, --outdir option.
        -i | --ident name      : programname used as syslog ident. Default: postgres
        -I | --incremental     : use incremental mode, reports will be generated by
                                 days in a separate directory, --outdir must be set.
        -j | --jobs number     : number of jobs to run at same time for a single log
                                 file. Run as single by default or when working with
                                 csvlog format.
        -J | --Jobs number     : number of log files to parse in parallel. Process
                                 one file at a time by default.
        -l | --last-parsed file: allow incremental log parsing by registering the
                                 last datetime and line parsed. Useful if you want
                                 to watch errors since last run or if you want one
                                 report per day with a log rotated each week.
        -L | --logfile-list file:file containing a list of log files to parse.
        -m | --maxlength size  : maximum length of a query, it will be restricted to
                                 the given size. Default truncate size is 100000.
        -M | --no-multiline    : do not collect multiline statements to avoid garbage
                                 especially on errors that generate a huge report.
        -N | --appname name    : only report on entries for given application name
        -o | --outfile filename: define the filename for the output. Default depends
                                 on the output format: out.html, out.txt, out.bin,
                                 or out.json. This option can be used multiple times
                                 to output several formats. To use json output, the
                                 Perl module JSON::XS must be installed, to dump
                                 output to stdout, use - as filename.
        -O | --outdir path     : directory where out files must be saved.
        -p | --prefix string   : the value of your custom log_line_prefix as
                                 defined in your postgresql.conf. Only use it if you
                                 aren't using one of the standard prefixes specified
                                 in the pgBadger documentation, such as if your
                                 prefix includes additional variables like client IP
                                 or application name. MUST contain escape sequences
                                 for time (%t, %m or %n) and processes (%p or %c).
                                 See examples below.
        -P | --no-prettify     : disable SQL queries prettify formatter.
        -q | --quiet           : don't print anything to stdout, not even a progress
                                 bar.
        -Q | --query-numbering : add numbering of queries to the output when using
                                 options --dump-all-queries or --normalized-only.
        -r | --remote-host ip  : set the host where to execute the cat command on
                                 remote log file to parse the file locally.
        -R | --retention N     : number of weeks to keep in incremental mode. Defaults
                                 to 0, disabled. Used to set the number of weeks to
                                 keep in output directory. Older weeks and days
                                 directories are automatically removed.
        -s | --sample number   : number of query samples to store. Default: 3.
        -S | --select-only     : only report SELECT queries.
        -t | --top number      : number of queries to store/display. Default: 20.
        -T | --title string    : change title of the HTML page report.
        -u | --dbuser username : only report on entries for the given user.
        -U | --exclude-user username : exclude entries for the specified user from
                                 report. Can be used multiple time.
        -v | --verbose         : enable verbose or debug mode. Disabled by default.
        -V | --version         : show pgBadger version and exit.
        -w | --watch-mode      : only report errors just like logwatch could do.
        -W | --wide-char       : encode html output of queries into UTF8 to avoid
                                 Perl message "Wide character in print".
        -x | --extension       : output format. Values: text, html, bin or json.
                                 Default: html
        -X | --extra-files     : in incremental mode allow pgBadger to write CSS and
                                 JS files in the output directory as separate files.
        -z | --zcat exec_path  : set the full path to the zcat program. Use it if
                                 zcat, bzcat or unzip is not in your path.
        -Z | --timezone +/-XX  : Set the number of hours from GMT of the timezone.
                                 Use this to adjust date/time in JavaScript graphs.
                                 The value can be an integer, ex.: 2, or a float,
                                 ex.: 2.5.
        --pie-limit num        : pie data lower than num% will show a sum instead.
        --exclude-query regex  : any query matching the given regex will be excluded
                                 from the report. For example: "^(VACUUM|COMMIT)"
                                 You can use this option multiple times.
        --exclude-file filename: path of the file that contains each regex to use
                                 to exclude queries from the report. One regex per
                                 line.
        --include-query regex  : any query that does not match the given regex will
                                 be excluded from the report. You can use this
                                 option multiple times. For example: "(tbl1|tbl2)".
        --include-file filename: path of the file that contains each regex to the
                                 queries to include from the report. One regex per
                                 line.
        --disable-error        : do not generate error report.
        --disable-hourly       : do not generate hourly report.
        --disable-type         : do not generate report of queries by type, database
                                 or user.
        --disable-query        : do not generate query reports (slowest, most
                                 frequent, queries by users, by database, ...).
        --disable-session      : do not generate session report.
        --disable-connection   : do not generate connection report.
        --disable-lock         : do not generate lock report.
        --disable-temporary    : do not generate temporary report.
        --disable-checkpoint   : do not generate checkpoint/restartpoint report.
        --disable-autovacuum   : do not generate autovacuum report.
        --charset              : used to set the HTML charset to be used.
                                 Default: utf-8.
        --csv-separator        : used to set the CSV field separator, default: ,
        --exclude-time  regex  : any timestamp matching the given regex will be
                                 excluded from the report. Example: "2013-04-12 .*"
                                 You can use this option multiple times.
        --include-time  regex  : only timestamps matching the given regex will be
                                 included in the report. Example: "2013-04-12 .*"
                                 You can use this option multiple times.
        --exclude-db name      : exclude entries for the specified database from
                                 report. Example: "pg_dump". Can be used multiple
                                 times.
        --exclude-appname name : exclude entries for the specified application name
                                 from report.  Example: "pg_dump".  Can be used
                                 multiple times.
        --exclude-line regex   : exclude any log entry that will match the given
                                 regex. Can be used multiple times.
        --exclude-client name  : exclude log entries for the specified client ip.
                                 Can be used multiple times.
        --anonymize            : obscure all literals in queries, useful to hide
                                 confidential data.
        --noreport             : no reports will be created in incremental mode.
        --log-duration         : force pgBadger to associate log entries generated
                                 by both log_duration = on and log_statement = 'all'
        --enable-checksum      : used to add an md5 sum under each query report.
        --journalctl command   : command to use to replace PostgreSQL logfile by
                                 a call to journalctl. Basically it might be:
                                    journalctl -u postgresql-9.5
        --pid-dir path         : set the path where the pid file must be stored.
                                 Default /tmp
        --pid-file file        : set the name of the pid file to manage concurrent
                                 execution of pgBadger. Default: pgbadger.pid
        --rebuild              : used to rebuild all html reports in incremental
                                 output directories where there's binary data files.
        --pgbouncer-only       : only show PgBouncer-related menus in the header.
        --start-monday         : in incremental mode, calendar weeks start on
                                 Sunday. Use this option to start on a Monday.
        --iso-week-number      : in incremental mode, calendar weeks start on
                                 Monday and respect the ISO 8601 week number, range
                                 01 to 53, where week 1 is the first week that has
                                 at least 4 days in the new year.
        --normalized-only      : only dump all normalized queries to out.txt
        --log-timezone +/-XX   : Set the number of hours from GMT of the timezone
                                 that must be used to adjust date/time read from
                                 log file before beeing parsed. Using this option
                                 makes log search with a date/time more difficult.
                                 The value can be an integer, ex.: 2, or a float,
                                 ex.: 2.5.
        --prettify-json        : use it if you want json output to be prettified.
        --month-report YYYY-MM : create a cumulative HTML report over the specified
                                 month. Requires incremental output directories and
                                 the presence of all necessary binary data files
        --day-report YYYY-MM-DD: create an HTML report over the specified day.
                                 Requires incremental output directories and the
                                 presence of all necessary binary data files
        --noexplain            : do not process lines generated by auto_explain.
        --command CMD          : command to execute to retrieve log entries on
                                 stdin. pgBadger will open a pipe to the command
                                 and parse log entries generated by the command.
        --no-week              : inform pgbadger to not build weekly reports in
                                 incremental mode. Useful if it takes too much time.
        --explain-url URL      : use it to override the url of the graphical explain
                                 tool. Default: https://explain.depesz.com/
        --tempdir DIR          : set directory where temporary files will be written
                                 Default: File::Spec->tmpdir() || '/tmp'
        --no-process-info      : disable changing process title to help identify
                                 pgbadger process, some system do not support it.
        --dump-all-queries     : dump all queries found in the log file replacing
                                 bind parameters included in the queries at their
                                 respective placeholders positions.
        --keep-comments        : do not remove comments from normalized queries. It
                                 can be useful if you want to distinguish between
                                 same normalized queries.
        --no-progressbar       : disable progressbar.
        --dump-raw-csv         : parse the log and dump the information into CSV
                                 format. No further processing is done, no report.
        --include-pid PID      : only report events related to the session pid (%p).
                                 Can be used multiple time.
        --include-session ID   : only report events related to the session id (%c).
                                 Can be used multiple time.
       --histogram-query VAL   : use custom inbound for query times histogram.
                                 Default inbound in milliseconds:
                             0,1,5,10,25,50,100,500,1000,10000
       --histogram-session VAL : use custom inbound for session times histogram.
                                 Default inbound in milliseconds:
                             0,500,1000,30000,60000,600000,1800000,3600000,28800000
       --no-fork               : do not fork any process, for debugging purpose.

    pgBadger is able to parse a remote log file using a passwordless ssh
    connection. Use -r or --remote-host to set the host IP address or
    hostname. There are also some additional options to fully control the
    ssh connection.

        --ssh-program ssh        path to the ssh program to use. Default: ssh.
        --ssh-port port          ssh port to use for the connection. Default: 22.
        --ssh-user username      connection login name. Defaults to running user.
        --ssh-identity file      path to the identity file to use.
        --ssh-timeout second     timeout to ssh connection failure. Default: 10 sec.
        --ssh-option  options    list of -o options to use for the ssh connection.
                                 Options always used:
                                     -o ConnectTimeout=$ssh_timeout
                                     -o PreferredAuthentications=hostbased,publickey
        --ssh-sudo               Runs all commands on remote host as sudo, will not work
                                 with wildcard filenames

    Log file to parse can also be specified using an URI, supported
    protocols are http[s] and [s]ftp. The curl command will be used to
    download the file, and the file will be parsed during download. The ssh
    protocol is also supported and will use the ssh command like with the
    remote host use. See examples bellow.

    Return codes:

        0: on success
        1: die on error
        2: if it has been interrupted using ctr+c for example
        3: the pid file already exists or can not be created
        4: no log file was given at command line

    Examples:

        pgbadger /var/log/postgresql.log
        pgbadger /var/log/postgres.log.2.gz /var/log/postgres.log.1.gz /var/log/postgres.log
        pgbadger /var/log/postgresql/postgresql-2012-05-*
        pgbadger --exclude-query="^(COPY|COMMIT)" /var/log/postgresql.log
        pgbadger -b "2012-06-25 10:56:11" -e "2012-06-25 10:59:11" /var/log/postgresql.log
        cat /var/log/postgres.log | pgbadger -
        # Log line prefix with stderr log output
        pgbadger --prefix '%t [%p]: user=%u,db=%d,client=%h' /pglog/postgresql-2012-08-21*
        pgbadger --prefix '%m %u@%d %p %r %a : ' /pglog/postgresql.log
        # Log line prefix with syslog log output
        pgbadger --prefix 'user=%u,db=%d,client=%h,appname=%a' /pglog/postgresql-2012-08-21*
        # Use my 8 CPUs to parse my 10GB file faster, much faster
        pgbadger -j 8 /pglog/postgresql-10.1-main.log

    Use URI notation for remote log file:

        pgbadger http://172.12.110.1//var/log/postgresql/postgresql-10.1-main.log
        pgbadger ftp://username@172.12.110.14/postgresql-10.1-main.log
        pgbadger ssh://username@172.12.110.14:2222//var/log/postgresql/postgresql-10.1-main.log*

    You can use together a local PostgreSQL log and a remote pgbouncer log
    file to parse:

        pgbadger /var/log/postgresql/postgresql-10.1-main.log ssh://username@172.12.110.14/pgbouncer.log

    Reporting errors every week by cron job:

        30 23 * * 1 /usr/bin/pgbadger -q -w /var/log/postgresql.log -o /var/reports/pg_errors.html

    Generate report every week using incremental behavior:

        0 4 * * 1 /usr/bin/pgbadger -q `find /var/log/ -mtime -7 -name "postgresql.log*"` -o /var/reports/pg_errors-`date +\%F`.html -l /var/reports/pgbadger_incremental_file.dat

    This supposes that your log file and HTML report are also rotated every
    week.

    Or better, use the auto-generated incremental reports:

        0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 -O /var/www/pg_reports/

    will generate a report per day and per week.

    In incremental mode, you can also specify the number of weeks to keep in
    the reports:

        /usr/bin/pgbadger --retention 2 -I -q /var/log/postgresql/postgresql.log.1 -O /var/www/pg_reports/

    If you have a pg_dump at 23:00 and 13:00 each day during half an hour,
    you can use pgBadger as follow to exclude these periods from the report:

        pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log

    This will help avoid having COPY statements, as generated by pg_dump, on
    top of the list of slowest queries. You can also use --exclude-appname
    "pg_dump" to solve this problem in a simpler way.

    You can also parse journalctl output just as if it was a log file:

        pgbadger --journalctl 'journalctl -u postgresql-9.5'

    or worst, call it from a remote host:

        pgbadger -r 192.168.1.159 --journalctl 'journalctl -u postgresql-9.5'

    you don't need to specify any log file at command line, but if you have
    other PostgreSQL log files to parse, you can add them as usual.

    To rebuild all incremental html reports after, proceed as follow:

        rm /path/to/reports/*.js
        rm /path/to/reports/*.css
        pgbadger -X -I -O /path/to/reports/ --rebuild

    it will also update all resource files (JS and CSS). Use -E or --explode
    if the reports were built using this option.

    pgBadger also supports Heroku PostgreSQL logs using logplex format:

        heroku logs -p postgres | pgbadger -f logplex -o heroku.html -

    this will stream Heroku PostgreSQL log to pgbadger through stdin.

    pgBadger can auto detect RDS and cloudwatch PostgreSQL logs using rds
    format:

        pgbadger -f rds -o rds_out.html rds.log

    Each CloudSQL Postgresql log is a fairly normal PostgreSQL log, but
    encapsulated in JSON format. It is autodetected by pgBadger but in case
    you need to force the log format use `jsonlog`:

        pgbadger -f jsonlog -o cloudsql_out.html cloudsql.log

    This is the same as with the jsonlog extension, the json format is
    different but pgBadger can parse both formats.

    pgBadger also supports logs produced by CloudNativePG Postgres operator
    for Kubernetes:

        pgbadger -f jsonlog -o cnpg_out.html cnpg.log

    To create a cumulative report over a month use command:

        pgbadger --month-report 2919-05 /path/to/incremental/reports/

    this will add a link to the month name into the calendar view in
    incremental reports to look at report for month 2019 May. Use -E or
    --explode if the reports were built using this option.

DESCRIPTION
    pgBadger is a PostgreSQL log analyzer built for speed providing fully
    detailed reports based on your PostgreSQL log files. It's a small
    standalone Perl script that outperforms any other PostgreSQL log
    analyzer.

    It is written in pure Perl and uses a JavaScript library (flotr2) to
    draw graphs so that you don't need to install any additional Perl
    modules or other packages. Furthermore, this library gives us more
    features such as zooming. pgBadger also uses the Bootstrap JavaScript
    library and the FontAwesome webfont for better design. Everything is
    embedded.

    pgBadger is able to autodetect your log file format (syslog, stderr,
    csvlog or jsonlog) if the file is long enough. It is designed to parse
    huge log files as well as compressed files. Supported compressed formats
    are gzip, bzip2, lz4, xz, zip and zstd. For the xz format you must have
    an xz version higher than 5.05 that supports the --robot option. lz4
    files must be compressed with the --content-size option for pgbadger to
    determine the uncompressed file size. For the complete list of features,
    see below.

    All charts are zoomable and can be saved as PNG images.

    You can also limit pgBadger to only report errors or remove any part of
    the report using command-line options.

    pgBadger supports any custom format set in the log_line_prefix directive
    of your postgresql.conf file as long as it at least specifies the %t, %m
    or %n and %p or %c patterns.

    pgBadger allows parallel processing of a single log file or multiple
    files through the use of the -j option specifying the number of CPUs.

    If you want to save system performance you can also use log_duration
    instead of log_min_duration_statement to have reports on duration and
    number of queries only.

FEATURE
    pgBadger reports everything about your SQL queries:

            Overall statistics.
            The most frequent waiting queries.
            Queries that waited the most.
            Queries generating the most temporary files.
            Queries generating the largest temporary files.
            The slowest queries.
            Queries that took up the most time.
            The most frequent queries.
            The most frequent errors.
            Histogram of query times.
            Histogram of sessions times.
            Users involved in top queries.
            Applications involved in top queries.
            Queries generating the most cancellation.
            Queries most cancelled.
            The most time consuming prepare/bind queries

    The following reports are also available with hourly charts divided into
    periods of five minutes:

            SQL queries statistics.
            Temporary file statistics.
            Checkpoints statistics.
            Autovacuum and autoanalyze statistics.
            Cancelled queries.
            Error events (panic, fatal, error and warning).
            Error class distribution.

    There are also some pie charts about distribution of:

            Locks statistics.
            Queries by type (select/insert/update/delete).
            Distribution of queries type per database/application
            Sessions per database/user/client/application.
            Connections per database/user/client/application.
            Autovacuum and autoanalyze per table.
            Queries per user and total duration per user.

    All charts are zoomable and can be saved as PNG images. SQL queries
    reported are highlighted and beautified automatically.

    pgBadger is also able to parse PgBouncer log files and to create the
    following reports:

            Request Throughput
            Bytes I/O Throughput
            Average Query Duration
            Simultaneous sessions
            Histogram of sessions times
            Sessions per database
            Sessions per user
            Sessions per host
            Established connections
            Connections per database
            Connections per user
            Connections per host
            Most used reserved pools
            Most Frequent Errors/Events

    You can also have incremental reports with one report per day and a
    cumulative report per week. Two multiprocess modes are available to
    speed up log parsing, one using one core per log file, and the second
    using multiple cores to parse a single file. These modes can be
    combined.

    Histogram granularity can be adjusted using the -A command-line option.
    By default, they will report the mean of each top queries/errors
    occurring per hour, but you can specify the granularity down to the
    minute.

    pgBadger can also be used in a central place to parse remote log files
    using a passwordless SSH connection. This mode can be used with
    compressed files and in the multiprocess per file mode (-J), but cannot
    be used with the CSV log format.

    Examples of reports can be found here:
    https://pgbadger.darold.net/#reports

REQUIREMENT
    pgBadger comes as a single Perl script - you do not need anything other
    than a modern Perl distribution. Charts are rendered using a JavaScript
    library, so you don't need anything other than a web browser. Your
    browser will do all the work.

    If you plan to parse PostgreSQL CSV log files, you might need some Perl
    Modules:

            Text::CSV_XS - to parse PostgreSQL CSV log files.

    This module is optional, if you don't have PostgreSQL log in the CSV
    format, you don't need to install it.

    If you want to export statistics as JSON file, you need an additional
    Perl module:

            JSON::XS - JSON serialising/deserialising, done correctly and fast

    This module is optional, if you don't select the json output format, you
    don't need to install it. You can install it on a Debian-like system
    using:

            sudo apt-get install libjson-xs-perl

    and on RPM-like system using:

            sudo yum install perl-JSON-XS

    Compressed log file format is autodetected from the file extension. If
    pgBadger finds a gz extension, it will use the zcat utility; with a bz2
    extension, it will use bzcat; with lz4, it will use lz4cat; with zst, it
    will use zstdcat; if the file extension is zip or xz, then the unzip or
    xz utility will be used.

    If those utilities are not found in the PATH environment variable, then
    use the --zcat command-line option to change this path. For example:

            --zcat="/usr/local/bin/gunzip -c" or --zcat="/usr/local/bin/bzip2 -dc"
            --zcat="C:\tools\unzip -p"

    By default, pgBadger will use the zcat, bzcat, lz4cat, zstdcat and unzip
    utilities following the file extension. If you use the default
    autodetection of compression format, you can mix gz, bz2, lz4, xz, zip
    or zstd files. Specifying a custom value of --zcat option will remove
    the possibility of mixed compression format.

    Note that multiprocessing cannot be used with compressed files or CSV
    files as well as under Windows platform.

INSTALLATION
    Download the tarball from GitHub and unpack the archive as follow:

            tar xzf pgbadger-11.x.tar.gz
            cd pgbadger-11.x/
            perl Makefile.PL
            make && sudo make install

    This will copy the Perl script pgbadger to /usr/local/bin/pgbadger by
    default and the man page into /usr/local/share/man/man1/pgbadger.1.
    Those are the default installation directories for 'site' install.

    If you want to install all under /usr/ location, use INSTALLDIRS='perl'
    as an argument of Makefile.PL. The script will be installed into
    /usr/bin/pgbadger and the manpage into /usr/share/man/man1/pgbadger.1.

    For example, to install everything just like Debian does, proceed as
    follows:

            perl Makefile.PL INSTALLDIRS=vendor

    By default, INSTALLDIRS is set to site.

POSTGRESQL CONFIGURATION
    You must enable and set some configuration directives in your
    postgresql.conf before starting.

    You must first enable SQL query logging to have something to parse:

            log_min_duration_statement = 0

    Here every statement will be logged, on a busy server you may want to
    increase this value to only log queries with a longer duration. Note
    that if you have log_statement set to 'all', nothing will be logged
    through the log_min_duration_statement directive. See the next chapter
    for more information.

    pgBadger supports any custom format set in the log_line_prefix directive
    of your postgresql.conf file as long as it at least specifies a time
    escape sequence (%t, %m or %n) and a process-related escape sequence (%p
    or %c).

    For example, with 'stderr' log format, log_line_prefix must be at least:

            log_line_prefix = '%t [%p]: '

    Log line prefix could add user, database name, application name and
    client ip address as follows:

            log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '

    or for syslog log file format:

            log_line_prefix = 'user=%u,db=%d,app=%a,client=%h '

    Log line prefix for stderr output could also be:

            log_line_prefix = '%t [%p]: db=%d,user=%u,app=%a,client=%h '

    or for syslog output:

            log_line_prefix = 'db=%d,user=%u,app=%a,client=%h '

    You need to enable other parameters in postgresql.conf to get more
    information from your log files:

            log_checkpoints = on
            log_connections = on
            log_disconnections = on
            log_lock_waits = on
            log_temp_files = 0
            log_autovacuum_min_duration = 0
            log_error_verbosity = default

    Do not enable log_statement as its log format will not be parsed by
    pgBadger.

    Of course your log messages should be in English with or without locale
    support:

            lc_messages='en_US.UTF-8'
            lc_messages='C'

    pgBadger parser does not support other locales, like 'fr_FR.UTF-8' for
    example.

LOG STATEMENTS
    Considerations about log_min_duration_statement, log_duration and
    log_statement configuration directives.

    If you want the query statistics to include the actual query strings,
    you must set log_min_duration_statement to 0 or more milliseconds.

    If you just want to report duration and number of queries and don't want
    all details about queries, set log_min_duration_statement to -1 to
    disable it and enable log_duration in your postgresql.conf file. If you
    want to add the most common query report, you can either choose to set
    log_min_duration_statement to a higher value or to enable log_statement.

    Enabling log_min_duration_statement will add reports about slowest
    queries and queries that took up the most time. Take care that if you
    have log_statement set to 'all', nothing will be logged with
    log_min_duration_statement.

    Warning: Do not enable both log_min_duration_statement, log_duration and
    log_statement all together, this will result in wrong counter values.
    Note that this will also increase drastically the size of your log.
    log_min_duration_statement should always be preferred.

PARALLEL PROCESSING
    To enable parallel processing you just have to use the -j N option where
    N is the number of cores you want to use.

    pgBadger will then proceed as follow:

            for each log file
                chunk size = int(file size / N)
                look at start/end offsets of these chunks
                fork N processes and seek to the start offset of each chunk
                    each process will terminate when the parser reach the end offset
                    of its chunk
                    each process write stats into a binary temporary file
                wait for all children processes to terminate
            All binary temporary files generated will then be read and loaded into
            memory to build the html output.

    With that method, at start/end of chunks pgBadger may truncate or omit a
    maximum of N queries per log file, which is an insignificant gap if you
    have millions of queries in your log file. The chance that the query
    that you were looking for is lost is near 0, this is why I think this
    gap is livable. Most of the time the query is counted twice but
    truncated.

    When you have many small log files and many CPUs, it is speedier to
    dedicate one core to one log file at a time. To enable this behavior,
    you have to use option -J N instead. With 200 log files of 10MB each,
    the use of the -J option starts being really interesting with 8 cores.
    Using this method you will be sure not to lose any queries in the
    reports.

    Here is a benchmark done on a server with 8 CPUs and a single file of
    9.5GB.

             Option |  1 CPU  | 2 CPU | 4 CPU | 8 CPU
            --------+---------+-------+-------+------
               -j   | 1h41m18 | 50m25 | 25m39 | 15m58
               -J   | 1h41m18 | 54m28 | 41m16 | 34m45

    With 200 log files of 10MB each, so 2GB in total, the results are
    slightly different:

             Option | 1 CPU | 2 CPU | 4 CPU | 8 CPU
            --------+-------+-------+-------+------
               -j   | 20m15 |  9m56 |  5m20 | 4m20
               -J   | 20m15 |  9m49 |  5m00 | 2m40

    So it is recommended to use -j unless you have hundreds of small log
    files and can use at least 8 CPUs.

    IMPORTANT: when you are using parallel parsing, pgBadger will generate a
    lot of temporary files in the /tmp directory and will remove them at the
    end, so do not remove those files unless pgBadger is not running. They
    are all named with the following template tmp_pgbadgerXXXX.bin so they
    can be easily identified.

INCREMENTAL REPORTS
    pgBadger includes an automatic incremental report mode using option -I
    or --incremental. When running in this mode, pgBadger will generate one
    report per day and a cumulative report per week. Output is first done in
    binary format into the mandatory output directory (see option -O or
    --outdir), then in HTML format for daily and weekly reports with a main
    index file.

    The main index file will show a dropdown menu per week with a link to
    each week report and links to daily reports of each week.

    For example, if you run pgBadger as follows based on a daily rotated
    file:

        0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 -O /var/www/pg_reports/

    you will have all daily and weekly reports for the full running period.

    In this mode, pgBadger will create an automatic incremental file in the
    output directory, so you don't have to use the -l option unless you want
    to change the path of that file. This means that you can run pgBadger in
    this mode each day on a log file rotated each week, and it will not
    count the log entries twice.

    To save disk space, you may want to use the -X or --extra-files
    command-line option to force pgBadger to write JavaScript and CSS to
    separate files in the output directory. The resources will then be
    loaded using script and link tags.

  Rebuilding reports
    Incremental reports can be rebuilt after a pgbadger report fix or a new
    feature to update all HTML reports. To rebuild all reports where a
    binary file is still present, proceed as follow:

        rm /path/to/reports/*.js
        rm /path/to/reports/*.css
        pgbadger -X -I -O /path/to/reports/ --rebuild

    it will also update all resource files (JS and CSS). Use -E or --explode
    if the reports were built using this option.

  Monthly reports
    By default, pgBadger in incremental mode only computes daily and weekly
    reports. If you want monthly cumulative reports, you will have to use a
    separate command to specify the report to build. For example, to build a
    report for August 2019:

        pgbadger -X --month-report 2019-08 /var/www/pg_reports/

    this will add a link to the month name into the calendar view of
    incremental reports to look at monthly report. The report for a current
    month can be run every day, it is entirely rebuilt each time. The
    monthly report is not built by default because it could take a lot of
    time following the amount of data.

    If reports were built with the per-database option ( -E | --explode ),
    it must be used too when calling pgbadger to build monthly report:

        pgbadger -E -X --month-report 2019-08 /var/www/pg_reports/

    This is the same when using the rebuild option ( -R | --rebuild ).

BINARY FORMAT
    Using the binary format it is possible to create custom incremental and
    cumulative reports. For example, if you want to refresh a pgBadger
    report each hour from a daily PostgreSQL log file, you can proceed by
    running the following commands each hour:

        pgbadger --last-parsed .pgbadger_last_state_file -o sunday/hourX.bin /var/log/pgsql/postgresql-Sun.log

    to generate the incremental data files in binary format. And to generate
    the fresh HTML report from that binary file:

        pgbadger sunday/*.bin

    Or as another example, if you generate one log file per hour and you
    want reports to be rebuilt each time the log file is rotated, proceed as
    follows:

            pgbadger -o day1/hour01.bin /var/log/pgsql/pglog/postgresql-2012-03-23_10.log
            pgbadger -o day1/hour02.bin /var/log/pgsql/pglog/postgresql-2012-03-23_11.log
            pgbadger -o day1/hour03.bin /var/log/pgsql/pglog/postgresql-2012-03-23_12.log
            ...

    When you want to refresh the HTML report, for example, each time after a
    new binary file is generated, just do the following:

            pgbadger -o day1_report.html day1/*.bin

    Adjust the commands to suit your particular needs.

JSON FORMAT
    JSON format is good for sharing data with other languages, which makes
    it easy to integrate pgBadger result into other monitoring tools, like
    Cacti or Graphite.

AUTHORS
    pgBadger is an original work from Gilles Darold.

    The pgBadger logo is an original creation of Damien Cazeils.

    The pgBadger v4.x design comes from the "Art is code" company.

    This web site is a work of Gilles Darold.

    pgBadger is maintained by Gilles Darold and everyone who wants to
    contribute.

    Many people have contributed to pgBadger, they are all quoted in the
    Changelog file.

LICENSE
    pgBadger is free software distributed under the PostgreSQL Licence.

    Copyright (c) 2012-2026, Gilles Darold

    A modified version of the SQL::Beautify Perl Module is embedded in
    pgBadger with copyright (C) 2009 by Jonas Kramer and is published under
    the terms of the Artistic License 2.0.