1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
|
.. _user_guide:
**********
User Guide
**********
PPM, Prediction by partial matching, is a wellknown compression technique
based on context modeling and prediction. PPM models use a set of previous
symbols in the uncompressed symbol stream to predict the next symbol in the
stream.
PPMd is an implementation of PPMII by Dmitry Shkarin.
The ppmd-cffi package uses core C files from p7zip.
The library has a bare function and no metadata/header handling functions.
This means you should know compression parameters and input/output data
sizes.
Getting started
===============
Install
-------
The ppmd-cffi is written by Python and C language bound with CFFI, and can be downloaded
from PyPI(aka. Python Package Index) using standard 'pip' command as like follows;
.. code-block:: bash
$ pip install ppmd-cffi
Command line
============
`ppmd-cffi` provide command line script to hande `.ppmd` file.
To compress file
.. code-block:: bash
$ ppmd target.dat
To decompress ppmd file
.. code-block:: bash
$ ppmd -x target.ppmd
To decompress to STDOUT
.. code-block:: bash
$ ppmd -x -c target.ppmd
Programming Interfaces
======================
`.ppmd` file comression/decompression
=====================================
ppmd-cffi project provide two functions which compress and decompress `.ppmd` archive file.
`PpmdCompressor` class provide compress function `compress()` and `PpmdDecompressor` class
provide extraction function `decompress()`.
Both classes takes `version=` argument which default is 8, means PPMd Ver. I.
Also classes takes `target`, `fname` and `ftime` arguments which is a target file and its properties.
`target` should be a file-like object which has `write()` method.
`fname` and `ftime` is a file property which is stored in archive as meta data.
`fname` should be string, and `ftime` should be a datetime object.
`order` and `mem_in_mb` parameters will be vary.
Compression with PPMd ver. H
----------------------------
.. code-block:: python
targetfile = pathlib.Path('target.dat')
fname = 'target.dat'
ftime = datetime.utcfromtimestamp(targetfile.stat().st_mtime)
archivefile = 'archive.ppmd'
order = 6
mem_in_mb = 16
with archivefile.open('wb') as target:
with targetfile.open('rb') as src:
with PpmdCompressor(target, fname, ftime, order, mem_in_mb, version=7) as compressor:
compressor.compress(src)
Compression with PPMd ver. I
----------------------------
.. code-block:: python
targetfile = pathlib.Path('target.dat')
fname = 'target.dat'
ftime = datetime.utcfromtimestamp(targetfile.stat().st_mtime)
archivefile = 'archive.ppmd'
order = 6
mem_in_mb = 8
with archivefile.open('wb') as target:
with targetfile.open('rb') as src:
with PpmdCompressor(target, fname, ftime, order, mem_in_mb, version=8) as compressor:
compressor.compress(src)
Decompression
-------------
When construct `PpmdDecompressor` object, it read header from specified archive file.
The header hold a compression parameters such as PPMd version, order and memory size.
It also has a `filename` and `timestamp`.
`PpmdDecompressor` select a proper decoder based on header data.
You need to handle `filename` and `timestamp` by your self.
A decompressor method will write data to specified file-like object, which should have
`write()` method.
.. code-block:: python
targetfile = pathlib.Path('target.ppmd')
with targetfile.open('rb') as target:
with PpmdDecompressor(target, target_size) as decompressor:
extractedfile = pathlib.Path(parent.joinpath(decompressor.filename))
with extractedfile.open('wb') as ofile:
decompressor.decompress(ofile)
timestamp = datetime_to_timestamp(decompressor.ftime)
os.utime(str(extractedfile), times=(timestamp, timestamp))
Bare encoding/decoding PPMd data
================================
There are several classes to handle bare PPMd data. Note: mem parameter should be
as bytes not MB.
* Ppmd7Encoder(dst, order, mem)
* Ppmd7Decoder(src, order, mem)
* Ppmd8Encoder(det, order, mem, restore)
* Ppmd8Decoder(src, order, mem, restore)
|