# How to write a parser
## Introduction
This page is an introduction to developing a parser for Plaso.

* First, a step-by-step example shows how to create a simple binary parser
for the Safari Cookies.binarycookies file.
* At the bottom are some common troubleshooting tips for problems that others
have run into before you.

This page assumes you have at least a basic understanding of programming in
Python and of using git.
## Format
Before you can write a binary file parser you will need to have a good
understanding of the file format. A description of the
Safari Cookies.binarycookies format can be found
[here](https://github.com/libyal/dtformats/blob/master/documentation/Safari%20Cookies.asciidoc).
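
As a rough illustration of how a format description translates into code, the sketch below reads only the file header. It assumes the layout commonly described for this format: a 4-byte `cook` signature, a big-endian 32-bit page count, then one big-endian 32-bit size per page. Verify these details against the linked documentation before relying on them. `ReadFileHeader` is a hypothetical helper and not part of Plaso.

```python
import io
import struct

# Assumed layout (check against the format documentation): a 4-byte "cook"
# signature, a big-endian 32-bit page count, then one big-endian 32-bit
# size per page.
_SIGNATURE = b'cook'


def ReadFileHeader(file_object):
  """Reads the signature, page count and page sizes (hypothetical helper).

  Args:
    file_object (file): file-like object positioned at the start of the file.

  Returns:
    list[int]: size of every page, in bytes.

  Raises:
    ValueError: if the signature does not match or the header is truncated.
  """
  header = file_object.read(8)
  if len(header) != 8 or header[:4] != _SIGNATURE:
    raise ValueError('Not a Cookies.binarycookies file.')

  number_of_pages, = struct.unpack('>I', header[4:8])

  page_sizes_data = file_object.read(4 * number_of_pages)
  if len(page_sizes_data) != 4 * number_of_pages:
    raise ValueError('Truncated page sizes array.')

  format_string = '>{0:d}I'.format(number_of_pages)
  return list(struct.unpack(format_string, page_sizes_data))


# Synthetic header with two pages, built only for this illustration.
example_data = b'cook' + struct.pack('>I', 2) + struct.pack('>II', 256, 512)
page_sizes = ReadFileHeader(io.BytesIO(example_data))
print(page_sizes)
```

Rejecting a file as soon as the signature does not match is the same idea as raising `UnableToParseFile` in a Plaso parser: it lets other parsers try the file.
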
## Parsers vs. Plugins
Before starting work on a parser, check if Plaso already has a parser that
handles the underlying format of the file you're parsing. Plaso currently
supports plugins for the following file formats:
* Bencode
* Compound zip files
* Web Browser Cookies
* ESEDB
* OLECF
* Plist
* SQLite
* [Syslog](How-to-write-a-Syslog-plugin.md)
* Windows Registry

If the artifact you're trying to parse is in one of these formats, you need to
write a plugin of the appropriate type, rather than a parser.

For our example, however, the Safari Cookies.binarycookies file is in its own
binary format, so a separate parser is appropriate.
## Test data
First, create a representative test file and add it to the `test_data/`
directory. In our example:
```
test_data/Cookies.binarycookies
```
**Make sure that the test file does not contain sensitive or copyrighted
material.**
## Parsers, formatters, events and event data
* parser; a subclass of [FileObjectParser](../api/plaso.parsers.html#plaso.parsers.interface.FileObjectParser)
that extracts events from the content of a file.
* formatter (or event formatter); a subclass of
[EventFormatter](../api/plaso.formatters.html#plaso.formatters.interface.EventFormatter) that generates a human-readable
description of the event data.
* event; a subclass of [EventObject](../api/plaso.containers.html#plaso.containers.events.EventObject) that represents
[an event](Scribbles-about-events.md#what-is-an-event).
* event data; a subclass of [EventData](../api/plaso.containers.html#plaso.containers.events.EventData) that represents
data related to the event.
### Writing the parser
#### Registering the parser
Add an import for the parser to:
```
plaso/parsers/__init__.py
```
It should look like this:
~~~~python
from plaso.parsers import safari_cookies
~~~~
When `plaso.parsers` is imported, this loads the `safari_cookies` module
(`safari_cookies.py`).

The parser class `BinaryCookieParser` is registered by calling
`manager.ParsersManager.RegisterParser(BinaryCookieParser)` at the end of the
parser module:
```
plaso/parsers/safari_cookies.py
```
~~~~python
# -*- coding: utf-8 -*-
"""Parser for Safari Binary Cookie files."""

from plaso.parsers import interface
from plaso.parsers import manager


class BinaryCookieParser(interface.FileObjectParser):
  """Parser for Safari Binary Cookie files."""

  NAME = 'binary_cookies'
  DATA_FORMAT = 'Safari Binary Cookie file'

  def ParseFileObject(self, parser_mediator, file_object, **kwargs):
    """Parses a Safari binary cookie file-like object.

    Args:
      parser_mediator (ParserMediator): parser mediator.
      file_object (dfvfs.FileIO): file-like object to be parsed.

    Raises:
      UnableToParseFile: when the file cannot be parsed, this will signal
          the event extractor to apply other parsers.
    """
    ...


manager.ParsersManager.RegisterParser(BinaryCookieParser)
~~~~
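
To make the import-time registration pattern concrete, here is a minimal, self-contained sketch. `ExampleParsersManager`, `GetParserNames` and `ExampleCookieParser` are hypothetical stand-ins written for this illustration. They are not Plaso's actual implementation, which may differ in its details.

```python
class ExampleParsersManager(object):
  """Simplified stand-in for a parser registry (not Plaso's implementation)."""

  # Maps the lower case parser name to the parser class.
  _parser_classes = {}

  @classmethod
  def RegisterParser(cls, parser_class):
    """Registers a parser class under its NAME.

    Raises:
      KeyError: if a parser with the same name is already registered.
    """
    parser_name = parser_class.NAME.lower()
    if parser_name in cls._parser_classes:
      raise KeyError('Parser class already set for name: {0:s}.'.format(
          parser_class.NAME))
    cls._parser_classes[parser_name] = parser_class

  @classmethod
  def GetParserNames(cls):
    """Returns the sorted names of all registered parsers."""
    return sorted(cls._parser_classes)


class ExampleCookieParser(object):
  """Hypothetical parser that only carries the name used for registration."""

  NAME = 'binary_cookies'


# A module-level call like this runs once, when the module is first imported.
# This is why adding the import to the package __init__.py is enough to make
# the parser known to the manager.
ExampleParsersManager.RegisterParser(ExampleCookieParser)

parser_names = ExampleParsersManager.GetParserNames()
print(parser_names)
```

Keeping the registration call at module level means no central list of parsers has to be maintained by hand; the import itself performs the registration.
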
### Writing the message formatter
The event message format is defined in `data/formatters/*.yaml`.
For more information about the configuration file format see:
[message formatting](../user/Output-and-formatting.html#message-formatting)