# Output and formatting

The plaso tools `psort.py` and `psteal.py` can output events in multiple
formats using several output modules.

## Output modules

Plaso supports several output formats:

Name | Description
--- | ---
dynamic | Output events to a delimiter-separated values format (comma by default) that supports dynamic selection of fields.
elastic | Output events to an Elasticsearch database. Requires elasticsearch-py.
json | Output events to JSON format.
json_line | Output events to JSON line format.
l2tcsv | Output events to log2timeline.pl legacy CSV format, with 17 fixed fields. Also see: [l2tcsv output format](Output-format-l2tcsv.md)
l2ttln | Output events to log2timeline.pl extended TLN format, with 7 fixed fields in pipe-delimited output. Also see: [TLN](https://forensicswiki.xyz/wiki/index.php?title=TLN).
null | Do not output events.
rawpy | Output events in "raw" (or native) Python format.
timesketch | Output events to a Timesketch Elasticsearch database. Requires elasticsearch-py.
tln | Output events to TLN format, with 5 fixed fields. Also see: [TLN](https://forensicswiki.xyz/wiki/index.php?title=TLN).

### Dynamic output module fields

The dynamic output module defines two command line options to specify which
fields should be represented in the output: `--fields` and
`--additional_fields`. The names of the fields typically map 1-to-1 to the
names of attributes of the event data. However, there are "special" fields that
are composed at runtime.

Name | Description
--- | ---
date | The date of the event
datetime | The date and time of the event in ISO 8601 format
description | The event message string as defined by the message formatter
description_short | The short event message string as defined by the message formatter
display_name | Human readable representation of the path specification
filename | The "filename" attribute if present in the event data, otherwise derived from the path specification
host | The hostname derived by pre-processing
hostname | The hostname derived by pre-processing
inode | The "inode" attribute if present in the event data, otherwise derived from the file system identifier (such as inode, MFT entry) in the path specification
macb | MACB (Modification, Access, Change, Birth) group representation
message | The event message string as defined by the message formatter
message_short | The short event message string as defined by the message formatter
source | The short event source as defined by the message formatter
sourcetype | The event source as defined by the message formatter
source_long | The event source as defined by the message formatter
tag | The labels defined by event tags
time | The time of the event
timestamp_desc | Indication of what the event time represents such as Creation Time or Program Execution Duration
timezone | Time zone indicator
type | Indication of what the event time represents such as Creation Time or Program Execution Duration
user | The username derived by pre-processing
username | The username derived by pre-processing
zone | Time zone indicator
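
As a rough illustration of what "dynamic selection of fields" means, the sketch
below builds one delimited output line from a list of field names. It is not
Plaso's implementation: `format_dynamic_row` is a hypothetical helper, the
event is a plain dictionary, and the `-` placeholder for missing values is an
assumption.

```python
import csv
import io


def format_dynamic_row(event, fields, delimiter=","):
    """Return one delimited line holding the requested fields, in order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter)
    # Assumption: a field that is absent from the event is written as "-".
    writer.writerow([str(event.get(name, "-")) for name in fields])
    # csv.writer appends a line terminator; strip it to return a bare line.
    return buffer.getvalue().rstrip("\r\n")


# Hypothetical event values, keyed by the field names listed above.
event = {
    "datetime": "2020-10-07T12:34:56+00:00",
    "timestamp_desc": "Creation Time",
    "message": "Command executed: ls -la",
}
fields = ["datetime", "timestamp_desc", "message", "hostname"]

print(format_dynamic_row(event, fields))
```

Changing the `fields` list changes the columns, which is the effect the
`--fields` option has on the dynamic output module.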

The following output fields are not part of the event data but of the data
stream from which the event data originates.

Name | Description
--- | ---
file_entropy | Byte entropy of the data stream content. This is a value ranging from 0.0 to 8.0, where 8.0 indicates the distribution of byte values is highly random.
md5_hash | MD5 hash of the data stream content.
sha1_hash | SHA-1 hash of the data stream content.
sha256_hash | SHA-256 hash of the data stream content.
yara_match | Names of the Yara rules that matched the data stream content.
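
The 0.0 to 8.0 range of `file_entropy` is consistent with Shannon entropy
measured in bits per byte. The sketch below assumes that definition;
`byte_entropy` is a hypothetical helper written for illustration and is not
taken from Plaso.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte values in data, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        probability = count / total
        entropy -= probability * math.log2(probability)
    return entropy


# A single repeated byte value carries no information.
print(byte_entropy(b"\x00" * 1024))
# Every byte value occurring equally often gives the maximum of 8 bits.
print(byte_entropy(bytes(range(256))))
```

Compressed or encrypted content tends toward the upper end of the range, while
plain text and sparse files score lower.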

## Output field formatting

### Source fields

As of Plaso 20200916 the values of the long and short source fields are defined
in `data/sources.config`. Each line of this file contains 3 tab-separated
values:

* data_type; event data type.
* short_source; short source identifier that corresponds with the l2tcsv and tln source field.
* source; source identifier that corresponds with the l2tcsv sourcetype field.
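
A minimal sketch of reading such a file into a lookup table follows. The sample
rows are illustrative and not copied from `data/sources.config`, skipping blank
lines and lines starting with `#` is an assumption, and `parse_sources` is a
hypothetical helper.

```python
# Illustrative content: data_type, short_source and source, tab separated.
SAMPLE = (
    "# data_type\tshort_source\tsource\n"
    "bash:history:command\tLOG\tBash History\n"
    "syslog:cron:task_run\tLOG\tCron log\n"
)


def parse_sources(text):
    """Map each data type to a (short_source, source) tuple."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and "#" comment lines carry no mapping.
        if not line or line.startswith("#"):
            continue
        data_type, short_source, source = line.split("\t")
        mapping[data_type] = (short_source, source)
    return mapping


sources = parse_sources(SAMPLE)
print(sources["bash:history:command"])
```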

## Message formatting

In log2timeline.pl the l2tcsv format introduced the `desc` and `short` fields
that provide a description of the field, the interpreted results or the content
of the corresponding log line.

In Plaso the dynamic format extended the idea of the `desc` field to provide
a formatted `message` field. This allows for more extensive formatting,
such as [supporting Windows Event Log message strings](http://blog.kiddaland.net/2015/04/windows-event-log-message-strings.html).

### Formatter configuration file format

As of version 20200227 Plaso supports formatter configuration files.

**Note that the format of these configuration files is subject to change.**

An event formatter is defined as a set of attributes:

* "data_type"; required event data type.
* "enumeration_helpers"; optional enumeration helpers.
* "flags_helpers"; optional flags helpers.
* "message"; required formatter message string, for a basic type, or list of message string pieces, for a conditional type.
* "separator"; optional conditional message string piece separator, the default is a single space.
* "short_message"; required formatter short message string, for a basic type, or list of short message string pieces, for a conditional type.
* "type"; required event formatter type, either "basic" or "conditional".

For example:

```
---
type: 'basic'
data_type: 'bash:history:command'
message: 'Command executed: {command}'
short_message: '{command}'
---
type: 'conditional'
data_type: 'syslog:cron:task_run'
message:
- 'Cron ran: {command}'
- 'for user: {username}'
- 'pid: {pid}'
separator: ', '
short_message:
- '{body}'
```
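
The difference between the two formatter types can be sketched in a few lines
of Python. This is an illustration under assumptions, not Plaso's code: events
are plain dictionaries, `format_basic` and `format_conditional` are hypothetical
helpers, and a conditional piece is assumed to be dropped when any attribute it
references is missing from the event.

```python
import string


def format_basic(message, event):
    """Fill a single message string with event attribute values."""
    return message.format(**event)


def format_conditional(pieces, event, separator=" "):
    """Join only the pieces whose referenced attributes are all present."""
    parser = string.Formatter()
    kept = []
    for piece in pieces:
        # Collect the attribute names used in this piece, such as "pid".
        names = [name for _, name, _, _ in parser.parse(piece) if name]
        if all(event.get(name) is not None for name in names):
            kept.append(piece.format(**event))
    return separator.join(kept)


# Hypothetical cron event without a "pid" attribute.
event = {"command": "/usr/bin/backup", "username": "root"}
pieces = ["Cron ran: {command}", "for user: {username}", "pid: {pid}"]

print(format_basic("Command executed: {command}", {"command": "ls"}))
print(format_conditional(pieces, event, separator=", "))
```

With the `pid` attribute absent, the third piece and its separator are left
out, so the message does not end in a dangling `pid:` label.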

#### Enumeration helpers

Enumeration helpers can be defined to map a value of an event attribute to
a more descriptive value, for example mapping 100 to BEGIN_SYSTEM_CHANGE in
the example below.

```
type: 'conditional'
data_type: 'windows:restore_point:info'
enumeration_helpers:
- input_attribute: 'restore_point_event_type'
  output_attribute: 'restore_point_event_type'
  default_value: 'UNKNOWN'
  values:
    100: 'BEGIN_SYSTEM_CHANGE'
    101: 'END_SYSTEM_CHANGE'
    102: 'BEGIN_NESTED_SYSTEM_CHANGE'
    103: 'END_NESTED_SYSTEM_CHANGE'
- input_attribute: 'restore_point_type'
  output_attribute: 'restore_point_type'
  default_value: 'UNKNOWN'
  values:
    0: 'APPLICATION_INSTALL'
    1: 'APPLICATION_UNINSTALL'
    10: 'DEVICE_DRIVER_INSTALL'
    12: 'MODIFY_SETTINGS'
    13: 'CANCELLED_OPERATION'
message:
- '{description}'
- 'Event type: {restore_point_event_type}'
- 'Restore point type: {restore_point_type}'
short_message:
- '{description}'
```

Enumeration helpers are defined as a set of attributes:

* "input_attribute"; required name of the attribute which the value that needs to be mapped is read from.
* "output_attribute"; required name of the attribute which the mapped value is written to.
* "default_value"; optional default value if there is no corresponding mapping in "values".
* "values"; required value mappings, contains key value pairs.
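
The mapping described by these attributes amounts to a dictionary lookup with
a fallback. The sketch below is an illustration only: the helper is a plain
dictionary mirroring the configuration keys, `apply_enumeration_helper` is a
hypothetical function, and keeping the original value when no "default_value"
is configured is an assumption.

```python
def apply_enumeration_helper(helper, event):
    """Return a copy of event with the mapped value in the output attribute."""
    result = dict(event)
    value = event.get(helper["input_attribute"])
    if value is None:
        return result
    # Assumption: without a "default_value" the unmapped value is kept as-is.
    default = helper.get("default_value", value)
    result[helper["output_attribute"]] = helper["values"].get(value, default)
    return result


helper = {
    "input_attribute": "restore_point_event_type",
    "output_attribute": "restore_point_event_type",
    "default_value": "UNKNOWN",
    "values": {100: "BEGIN_SYSTEM_CHANGE", 101: "END_SYSTEM_CHANGE"},
}

print(apply_enumeration_helper(helper, {"restore_point_event_type": 100}))
print(apply_enumeration_helper(helper, {"restore_point_event_type": 999}))
```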

#### Flags helpers

Flags helpers can be defined to map a value of an event attribute to a more
descriptive value, for example mapping 0x00000040 to FinderInfoModified in
the example below.

```
type: 'conditional'
data_type: 'macos:fseventsd:record'
flags_helpers:
- input_attribute: 'flags'
  output_attribute: 'flag_values'
  # The include header sys/fsevents.h defines various FSE constants, e.g.
  # #define FSE_CREATE_FILE          0
  # The flag values correspond to: FLAG = 1 << CONSTANT
  values:
    0x00000000: 'None'
    0x00000001: 'Created'
    0x00000002: 'Removed'
    0x00000004: 'InodeMetadataModified'
    0x00000008: 'Renamed'
    0x00000010: 'Modified'
    0x00000020: 'Exchange'
    0x00000040: 'FinderInfoModified'
    0x00000080: 'DirectoryCreated'
    0x00000100: 'PermissionChanged'
    0x00000200: 'ExtendedAttributeModified'
    0x00000400: 'ExtendedAttributeRemoved'
    0x00001000: 'DocumentRevision'
    0x00004000: 'ItemCloned'
    0x00080000: 'LastHardLinkRemoved'
    0x00100000: 'IsHardLink'
    0x00400000: 'IsSymbolicLink'
    0x00800000: 'IsFile'
    0x01000000: 'IsDirectory'
    0x02000000: 'Mount'
    0x04000000: 'Unmount'
    0x20000000: 'EndOfTransaction'
message:
- '{path}'
- 'Flag Values: {flag_values}'
- 'Flags: 0x{flags:08x}'
- 'Event Identifier: {event_identifier}'
short_message:
- '{path}'
- '{flag_values}'
```
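
Unlike an enumeration, a flags value can match several entries at once, since
each entry is tested as a bit mask. The sketch below illustrates the idea with
a subset of the values above. It is not Plaso's implementation:
`apply_flags_helper` is a hypothetical function and joining the matched names
with `', '` is an assumption.

```python
def apply_flags_helper(helper, event):
    """Return a copy of event with the names of all set flags."""
    result = dict(event)
    flags = event.get(helper["input_attribute"])
    if flags is None:
        return result
    # Keep every name whose bit mask overlaps the input value.
    names = [name for mask, name in helper["values"].items() if mask & flags]
    # Assumption: matched names are joined with a comma and a space.
    result[helper["output_attribute"]] = ", ".join(names)
    return result


helper = {
    "input_attribute": "flags",
    "output_attribute": "flag_values",
    "values": {
        0x00000001: "Created",
        0x00000010: "Modified",
        0x00800000: "IsFile",
    },
}

event = apply_flags_helper(helper, {"flags": 0x00800010})
print("Flag Values: {flag_values}".format(**event))
print("Flags: 0x{flags:08x}".format(**event))
```

Because the test is a bitwise AND, an entry with the value 0x00000000 can never
match in this sketch and would need separate handling.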

#### Change log

* 20200227 Added support for formatter configuration files.
* 20200822 Added support for enumeration helpers.
* 20200904 Added support for flags helpers.
* 20200916 Removed source types from formatters.