File: rdiff.md

# rdiff command {#page_rdiff}

Introduction
============

*rdiff* is a program to compute and apply network deltas. An *rdiff
delta* is a delta between binary files, describing how a *basis* (or
*old*) file can be automatically edited to produce a *result* (or *new*)
file.

Unlike most diff programs, rdiff does not require access to both of
the files when the diff is computed. Computing a delta requires just a
short "signature" of the old file and the complete contents of the new
file. The signature contains checksums for blocks of the old file. Using
these checksums, rdiff finds matching blocks in the new file, and then
computes the delta.

rdiff deltas are usually less compact and also slower to produce than
xdeltas or regular text diffs. If it is possible to have both the old
and new files present when computing the delta,
[xdelta](http://www.xcf.berkeley.edu/~jmacd/xdelta.html) will generally
produce a much smaller file. If the files being compared are plain text,
then GNU [diff](http://www.gnu.org/software/diffutils/diffutils.html) is
usually a better choice, as the diffs can be viewed by humans and
applied as inexact matches.

rdiff comes into its own when it is not convenient to have both files
present at the same time. One example is when the two files are
on separate machines and you want to transfer only the differences.
Another is when one of the files has been moved to archive or
backup media, leaving only its signature.

Symbolically:

> signature(*basis-file*) -> *sig-file*
>
> delta(*sig-file*, *new-file*) -> *delta-file*
>
> patch(*basis-file*, *delta-file*) -> *recreated-file*

rdiff signatures and deltas are binary files in a format specific to
rdiff. Signatures consist of a header, followed by a list of checksums
for successive fixed-size blocks. Deltas consist of a header followed by
an instruction stream, which when executed produces the output file.
There are instructions to insert new data specified in the patch, or to
copy data from the basis file.

Unlike regular text diffs, rdiff deltas can describe sections of the
input file which have been reordered or copied.
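The signature, delta and patch steps can be illustrated with a toy model in Python. This is only a sketch and not rdiff's real file format or matching algorithm: it assumes a 4-byte block size, uses MD5 as the only block checksum, and rechecks every offset of the new file by brute force. The names `BLOCK`, `signature`, `delta` and `patch` are invented for the illustration.

```python
import hashlib

BLOCK = 4  # toy block size, chosen only for this example


def signature(basis: bytes) -> dict:
    """Map the checksum of each fixed-size basis block to its offset."""
    sig = {}
    for off in range(0, len(basis), BLOCK):
        digest = hashlib.md5(basis[off:off + BLOCK]).digest()
        sig.setdefault(digest, off)  # keep the first offset of duplicate blocks
    return sig


def delta(sig: dict, new: bytes) -> list:
    """Build ("copy", offset, length) and ("literal", data) instructions."""
    ops, literal, i = [], bytearray(), 0
    while i < len(new):
        chunk = new[i:i + BLOCK]
        off = sig.get(hashlib.md5(chunk).digest())
        if off is not None:
            if literal:  # flush pending new data before the copy
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", off, len(chunk)))
            i += len(chunk)
        else:
            literal.append(new[i])  # no block matches here: emit one new byte
            i += 1
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops


def patch(basis: bytes, ops: list) -> bytes:
    """Execute the instruction stream against the basis."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += basis[off:off + length]
        else:
            out += op[1]
    return bytes(out)


basis = b"AAAABBBBCCCCDDDD"
new = b"CCCCxxAAAABBBB"  # blocks reordered, two new bytes inserted
ops = delta(signature(basis), new)
print(ops)
print(patch(basis, ops) == new)
```

Note that `delta` never sees `basis`, only its signature, and that the reordered blocks are expressed as copies from different basis offsets.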

Because block checksums are used to find identical sections, rdiff
cannot find common sections smaller than one block, and it may not
exactly identify common sections near changed sections. Changes that
touch every block of the file, such as changing newlines to CRLF, are
likely to cause no blocks to match at all.
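A small experiment makes the block-matching limitation concrete. The sketch below is a toy model and not rdiff itself: it assumes an 8-byte block size and MD5 block checksums, and the helper names `block_sums` and `matching_offsets` are hypothetical. It compares a local edit with a newline-to-CRLF conversion of the same data.

```python
import hashlib

BLOCK = 8  # toy block size, chosen only for this example


def block_sums(data: bytes) -> set:
    """Checksums of successive fixed-size blocks, as a toy signature."""
    return {hashlib.md5(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)}


def matching_offsets(sums: set, new: bytes) -> int:
    """Count offsets in `new` where a full block matches some basis block."""
    return sum(hashlib.md5(new[i:i + BLOCK]).digest() in sums
               for i in range(len(new) - BLOCK + 1))


basis = b"alpha\nbeta\ngamma\ndelta\nomega\n" * 4
local_edit = basis.replace(b"gamma", b"GAMMA", 1)  # touches one place
crlf = basis.replace(b"\n", b"\r\n")               # touches every line

sums = block_sums(basis)
print("matches after local edit:", matching_offsets(sums, local_edit))
print("matches after CRLF conversion:", matching_offsets(sums, crlf))
```

Because every line here is shorter than one block, every block of the converted file contains a carriage return that no basis block has, so nothing can be copied from the basis.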

rdiff does not deal with file metadata or structure, such as filenames,
permissions, or directories. To rdiff, a file is just a stream of bytes.
Higher-level tools, such as
[rdiff-backup](http://rdiff-backup.stanford.edu/), can deal with these
issues in a way appropriate to their users.

Use patterns
============

A typical application of the rsync algorithm is to transfer a file *A2*
from a machine A to a machine B which has a similar file *A1*. This can
be done as follows:

1.  B generates the rdiff signature of *A1*. Call this *S1*. B sends the
    signature to A. (The signature is usually much smaller than the file
    it describes.)
2.  A computes the rdiff delta between *S1* and *A2*. Call this delta
    *D*. A sends the delta to B.
3.  B applies the delta to recreate *A2*.

In cases where *A1* and *A2* contain runs of identical bytes, rdiff
should give a significant space saving.
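The three steps map onto three rdiff invocations. The Python sketch below only builds the command lines, following the synopses under "Invoking rdiff"; it does not run them or transfer anything between machines. The helper `rdiff_argv` and the file names `S1.sig`, `D.delta` and `A2.recreated` are hypothetical.

```python
def rdiff_argv(mode: str, *files: str, options: tuple = ()) -> list:
    """Build an rdiff command line; options go before the mode name."""
    expected = {"signature": 2, "delta": 3, "patch": 3}
    if mode not in expected or len(files) != expected[mode]:
        raise ValueError(f"bad arguments for rdiff {mode!r}")
    return ["rdiff", *options, mode, *files]


# Step 1, on B: signature of the old file A1.
step1 = rdiff_argv("signature", "A1", "S1.sig")
# Step 2, on A: delta from the signature to the new file A2.
step2 = rdiff_argv("delta", "S1.sig", "A2", "D.delta", options=("--statistics",))
# Step 3, on B: apply the delta to A1 to recreate A2.
step3 = rdiff_argv("patch", "A1", "D.delta", "A2.recreated")

for argv in (step1, step2, step3):
    print(" ".join(argv))
    # To actually run a step: subprocess.run(argv, check=True)
```

Only `S1.sig` travels from B to A and only `D.delta` travels back, which is where the saving comes from.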

Invoking rdiff
==============

There are three distinct modes of operation: *signature*, *delta* and
*patch*. The mode is selected by the first command argument.

signature
---------

> rdiff \[OPTIONS\] signature INPUT SIGNATURE

**rdiff signature** generates a signature file from an input file. The
signature can later be used to generate a delta relative to the old
file.

delta
-----

> rdiff \[OPTIONS\] delta SIGNATURE NEWFILE DELTA

**rdiff delta** reads in a signature describing a basis file. It then
calculates and writes a delta that transforms the basis into the
new file.

patch
-----

> rdiff \[OPTIONS\] patch BASIS DELTA OUTPUT

**rdiff patch** applies a delta to a basis file and writes out the result.

rdiff cannot update files in place: the output file must not be the same
as the basis file.

rdiff does not currently check that the delta is being applied to the
correct file. If a delta is applied to the wrong basis file, the results
will be garbage.

The basis file must allow random access. This means it must be a regular
file rather than a pipe or socket.
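When the goal is to end up with the new contents under the basis file's name, one possible approach is to patch into a separate file and then rename it over the basis. The Python sketch below assumes `rdiff` is on the `PATH` and that both paths are regular files. The helper name `patch_replacing_basis` and its `run` parameter (which exists so the command runner can be swapped out) are hypothetical.

```python
import os
import subprocess
import tempfile


def patch_replacing_basis(basis: str, delta: str, run=subprocess.run) -> None:
    """Run `rdiff patch` into a fresh file, then rename it over the basis."""
    directory = os.path.dirname(os.path.abspath(basis))
    # A temporary directory next to the basis keeps the rename on one filesystem.
    with tempfile.TemporaryDirectory(dir=directory) as tmpdir:
        output = os.path.join(tmpdir, "patched")
        run(["rdiff", "patch", basis, delta, output], check=True)
        os.replace(output, basis)  # only reached if rdiff succeeded


# Example call (requires rdiff to be installed):
# patch_replacing_basis("A1", "D.delta")
```

If rdiff fails, the exception propagates and the basis file is left untouched.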

Global Options
--------------

These options are available for all commands.

`--version` Show program version and copyright.

`--help` Show brief help message.

`--statistics` Show counts of internal operations.

`--debug` Write debugging information to stderr.

Options must be specified before the command name.

Return Value
============

0:   Successful completion.

1:   Environmental problems (file not found, invalid options, IO
    error, etc.).

2:   Corrupt signature or delta file.

3:   Internal error or unhandled situation in librsync or rdiff.
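A calling script can turn these statuses into messages. The following is a minimal Python sketch; the table simply restates the list above, and the names `RDIFF_EXIT_MEANINGS` and `describe_exit` are hypothetical.

```python
RDIFF_EXIT_MEANINGS = {
    0: "successful completion",
    1: "environmental problem (file not found, invalid options, IO error, ...)",
    2: "corrupt signature or delta file",
    3: "internal error or unhandled situation in librsync or rdiff",
}


def describe_exit(code: int) -> str:
    """Return the documented meaning of an rdiff exit status."""
    return RDIFF_EXIT_MEANINGS.get(code, f"undocumented exit status {code}")


# Typical use with a finished process:
#   result = subprocess.run(["rdiff", "patch", "A1", "D.delta", "A2.recreated"])
#   print(describe_exit(result.returncode))
print(describe_exit(2))
```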

Bugs
====

Unlike text patches, rdiff deltas can only be usefully applied to the
exact basis file that they were generated from. rdiff does not protect
against trying to apply a delta to the wrong file, though this will
produce garbage output. It may be useful to store a hash of the file to
which the delta is meant to be applied.
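One way to act on that suggestion is to record a hash of the basis file when the signature is made and to check it before patching. The sketch below uses SHA-256 from Python's standard library. The choice of hash and the helper names `file_sha256` and `check_basis` are assumptions for illustration and not something rdiff provides.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_basis(path: str, expected_hex: str) -> None:
    """Raise ValueError if `path` is not the file the delta was made for."""
    actual = file_sha256(path)
    if actual != expected_hex:
        raise ValueError(f"{path}: basis hash {actual} != expected {expected_hex}")


# Record file_sha256("A1") alongside the signature, ship it with the delta,
# and call check_basis("A1", recorded_hash) before running `rdiff patch`.
```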

Author
======

rdiff was written by Martin Pool. The original rsync algorithm was
discovered by Andrew Tridgell.

This program is part of the [librsync](http://librsync.sourcefrog.net/)
package.