Introduction:
-------------
rsbep is a tool that protects the data stream from stdin with
Reed-Solomon FEC (forward error correction). Additionally, it
spreads the bytes of the resulting blocks out to give some
protection against burst errors (e.g. from tape recordings).
The Reed-Solomon code is from Phil Karn (see README.RS32). It is
a special version for i386 (and compatible) to get enough performance
for realtime encoding/decoding, making it suitable for use with the tool
"dvbackup", which requires 3.6 MByte/sec of data to feed the
MiniDV camcorder.
It is hardwired for (255,223)-RS.
The burst error protection is done by encoding ERR_BURST_LEN blocks
of 255 bytes and placing the bytes of each RS block at a distance of
ERR_BURST_LEN within a buffer of ERR_BURST_LEN*255 bytes.
(This is the minimum output size of rsbep.)
During decoding the blocks are reassembled and then decoded by RS32.
If any error occurred in the data, it is corrected if possible; otherwise the
data is written as is. The number of corrected failures and uncorrectable blocks
is printed if >0 or according to commandline options (v,q).
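The spreading described above can be sketched in a few lines of Python.
This is only an illustration of the idea, not the code rsbep uses: the
function names are made up, ERR_BURST_LEN is assumed to be 255 (as in the
header example), and the header line is ignored.

```python
RS_BLOCK = 255        # bytes per Reed-Solomon block (from the text)
ERR_BURST_LEN = 255   # blocks per buffer (assumed, taken from the header example)

def interleave(blocks):
    """Place byte j of block i at buffer position j * ERR_BURST_LEN + i."""
    assert len(blocks) == ERR_BURST_LEN
    out = bytearray(ERR_BURST_LEN * RS_BLOCK)
    for i, block in enumerate(blocks):
        assert len(block) == RS_BLOCK
        # Every ERR_BURST_LEN-th byte, starting at offset i, belongs to block i.
        out[i::ERR_BURST_LEN] = block
    return bytes(out)

def deinterleave(buf):
    """Reassemble the original blocks from an interleaved buffer."""
    assert len(buf) == ERR_BURST_LEN * RS_BLOCK
    return [buf[i::ERR_BURST_LEN] for i in range(ERR_BURST_LEN)]
```

With this layout, neighbouring bytes in the buffer always belong to
different RS blocks, so a run of damaged bytes is shared out among many
blocks instead of destroying a single one.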
The first line of each encoded output block contains
"rsbep 255 223 255 MAGIC_NUM"
which is the identifier for rsbep, the Reed-Solomon block-size, data-size and
the ERR_BURST_LEN (that is a number >=block-size and exact multiple of
block-size).
MAGIC_NUM is used to resynchronize if the stream is completely broken, which
makes it possible to recover later files in your (tar-)archive.
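As a rough illustration, the header fields could be picked apart like
this. It assumes the line is plain whitespace-separated text exactly as
shown above; parse_header and the dictionary keys are made-up names, and
MAGIC_NUM is treated as an opaque token because its real format is not
described here.

```python
def parse_header(line):
    """Split an 'rsbep <block> <data> <burst> <magic>' line into its fields."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "rsbep":
        raise ValueError("not an rsbep header line")
    block_size, data_size, burst_len = (int(f) for f in fields[1:4])
    return {
        "block_size": block_size,    # Reed-Solomon block size, e.g. 255
        "data_size": data_size,      # data bytes per block, e.g. 223
        "err_burst_len": burst_len,  # ERR_BURST_LEN
        "magic": fields[4],          # kept as an opaque string (assumption)
    }
```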
---
You can get the latest version at http://www.s.netic.de/gfiala/rsbep.html
Error correction:
-----------------
Now, how many errors can it correct?
RS(255,223) can correct up to (255-223)/2=16 incorrect bytes in unknown
positions.
However, the problem is that typical communication and storage errors are
also burst errors, that is, a lot of consecutive bytes are lost.
Without spreading, that would mean that if more than 16 consecutive bytes are
damaged, the data are irreparably damaged.
The method above spreads those bytes so far apart that up to 16*255=4080
consecutive bytes can get garbled without losing valuable data at all!
Tests with dvbackup and the LongPlay mode of my camcorder showed that this is more than enough
to use it: it corrected kilobytes of errors in gigabytes of data.
Of course there is a probability that the failures occur at just the rate at which
they always hit the bytes of the same block. I leave it to the mathematicians
among you to calculate it ;-)
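The 4080-byte figure can be checked with a small sketch. It assumes that
the byte at buffer position p belongs to block p % ERR_BURST_LEN, that
ERR_BURST_LEN is 255, and that every byte inside the burst is actually
wrong (the worst case). The function names are only for illustration.

```python
N, K = 255, 223        # RS(255,223), from the text
ERR_BURST_LEN = 255    # assumed, as in the header example
T = (N - K) // 2       # correctable byte errors per block: 16

def errors_per_block(burst_start, burst_len):
    """Count how many damaged bytes each RS block receives from one burst."""
    counts = [0] * ERR_BURST_LEN
    for pos in range(burst_start, burst_start + burst_len):
        counts[pos % ERR_BURST_LEN] += 1
    return counts

def burst_is_correctable(burst_start, burst_len):
    """True if no block gets more than T damaged bytes."""
    return max(errors_per_block(burst_start, burst_len)) <= T
```

A burst of T * ERR_BURST_LEN = 4080 bytes puts exactly 16 damaged bytes
into every block, which is still correctable; one byte more pushes some
block to 17 and that block is lost.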
Limitations:
------------
-It doesn't make use of "erasures"; that would double the recovery rate.
-It is hardcoded for RS(255,223)*(<=255) bytes. If you change the code you may not be
able to recover old data written with different RS values.
-The resynchronisation takes so much CPU that it will soon fill up the buffer and everything
will fall apart. It needs a lot of work to be really of use...
-Failures in the header line might have drastic consequences. Mathematically speaking: those bytes contain more information.
Compiling and Install:
----------------------
Type "make" and, as root, "make install" (or have a look at the Makefile first)
or
make -f Makefile_plain_C if you don't use an i386 compatible. It then uses the very
slow C version of the library; you may need a gigahertz CPU to encode in realtime
for use with dvbackup!!!
Usage: (and that is what I use it for)
--------------------------------------
It's intended for use in a pipe, e.g.
tar cvv something|rsbep|dvbackup|dvconnect -s
or
tar cvv something|rsbep >something.rsbep
To restore the protected data:
dvconnect|dvbackup -d|rsbep -d|tar xvv
or
cat something.rsbep|rsbep -d|tar xvv
(Remember that the output of rsbep is always a multiple of 255*255 bytes, so it's
not recommended to do the following:
cat some.txt|rsbep >some.txt.rsbep
cat some.txt.rsbep|rsbep -d >some.txt
because your text is now longer than before (padded with \0 chars) ...)
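If you need the exact original back, one possible workaround is to note
the original length yourself and cut the padding off after decoding.
The sketch below is not part of rsbep; the function names are made up,
and it assumes you stored original_len somewhere safe on your own.

```python
BUFFER_OUT = 255 * 255   # 65025 bytes, the unit rsbep output comes in (from the text)

def is_whole_buffers(encoded_len):
    """Check that an encoded length is a non-empty multiple of 255*255."""
    return encoded_len > 0 and encoded_len % BUFFER_OUT == 0

def strip_padding(decoded, original_len):
    """Cut decoded data back to the length recorded before encoding."""
    if original_len > len(decoded):
        raise ValueError("decoded data is shorter than the recorded length")
    return decoded[:original_len]
```

Formats like tar carry their own end marker, which is why the trailing
padding does no harm in the tar pipelines above.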
Utils:
------
tar2dv.sh - shell wrapper to tar to dv-tape (comment out dvr if not used)
dv2tar.sh - shell wrapper to untar from dv-tape
dv2ver.sh - shell wrapper to verify a tar from dv-tape
dvr - utility to start DVin record via RS232 (ttyS3, change in code)
TODO:
-----
-find a secure way around the limitations
-find a way to avoid the problem of the decoded file being larger than the input
-make the C version work as fast as the asm32 version to keep it more portable
-also: this is far from being perfect. For that, one would need the information
where the errors are in the buffer, and the resync mechanism would have to be deterministic
-Although I am not a mathematician, I can dream up a streaming format with perfect distribution
of _all_ information (header, data, parity) where no bit has a higher importance than any other,
which would still contain all the information to restore the original data stream - mathemagically ;-)
Disclaimer:
-----------
The program is supplied as is, with no warranties. It might do anything to your data or system -
use at your own risk or let it be!
I may not and cannot give any help to rescue your data, so be sure you understand what it does before
use!
Thanks to:
----------
Swen Thuemmler for fixing segfault+incomplete decoding bug