File: EXAMPLES

Copyright (c) 2002-2008 by Heinz-Josef Claes (see README)
Published under the GNU General Public License v3 or any later version


Before going through the examples, here are some important aspects of
how storeBackup works. (The following describes the basic mechanism;
for performance reasons the actual implementation differs a little.
There are several waiting queues, parallel operations and a tiny
scheduler inside which are not described here.)

storeBackup uses two internal flat files in each generated backup:
.md5CheckSums.info     - general information about the backup
.md5CheckSums[.bz2]    - information about every file (dir, etc.) saved
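
If you are curious, you can have a look at these files in any existing
backup with standard tools. The backup path below is only a placeholder,
and whether .md5CheckSums is bzip2 compressed depends on your configuration:

# cd /backup/jim/<some-backup-dir>
# cat .md5CheckSums.info
# bzcat .md5CheckSums.bz2 | head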

When starting storeBackup.pl, it will basically do (besides some other things):
1) read the contents of the previous .md5CheckSums[.bz2] file and store it
   in two dbm databases: dbm(md5sum) and dbm(filename)
   (dbm(md5sum) means that md5sum is the key)
2) read the contents of the other .md5CheckSums[.bz2] files (otherBackupDirs)
   and store them in dbm(md5sum). If two different files (e.g. from
   different backup series) are identical, the one with the lowest
   inode number is always stored in the dbm file. This ensures that
   multiple copies of the same file in different backups are unified
   in the future.

A) (describes a backup without sharing of files, examples 1 and 2 below)
   In a loop over all files to back up it will:
   1) look into dbm(filename) -- which contains all files from the previous
      backup -- to see whether the exact same file exists and has not
      changed. In this case, the needed information is taken from the
      values of dbm(filename).
      If it existed in the previous backup(s), make a hard link and go to 3)
   2) calculate the md5 sum of the file to back up and
      look it up in dbm(md5sum):
      if it exists there, make a hard link
      if it doesn't exist, copy or compress the file
   3) write the information about the new file to the corresponding
      .md5CheckSums[.bz2] file
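
As an illustration of the decision in step 2) -- this is not storeBackup's
actual code, only a sketch of the principle in plain sh. The file 'seen'
plays the role of dbm(md5sum) and holds lines of the form
'<md5sum> <path already saved in the backup>'; all paths are hypothetical:

#! /bin/sh
# sketch only: decide whether to hard link or to copy one file
# ($1 is a path relative to the source tree)
file="$1"
backup=/backup/current      # hypothetical target of the running backup
seen=/tmp/seen.md5          # hypothetical stand-in for dbm(md5sum)

sum=$(md5sum "$file" | cut -d' ' -f1)
old=$(grep "^$sum " "$seen" 2>/dev/null | head -n1 | cut -d' ' -f2-)
mkdir -p "$backup/$(dirname "$file")"
if [ -n "$old" ]; then
    ln "$old" "$backup/$file"         # same content already saved: hard link
else
    cp -p "$file" "$backup/$file"     # new content: copy (storeBackup may compress)
    echo "$sum $backup/$file" >> "$seen"
fi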

B) (describes a backup with sharing of files, examples 3 and 4 below)
   In a loop over all files to back up it will:
   1) look into dbm(filename) -- which contains all files from the previous
      backup -- to see whether the exact same file exists and has not
      changed. In this case, the needed information is taken from the
      values of dbm(filename).
      (Because the backups are independent, it is possible that a file
      with the same contents exists in another backup series. So we
      have to look into dbm(md5sum) to ensure linking to the same file
      from all the different backup series.)
   2) calculate the md5 sum of the file to back up if it is not known from
      step 1) and look it up in dbm(md5sum):
      if it exists there, make a hard link
      if it doesn't exist, copy or compress the file
   3) write the information about the new file to the corresponding
      .md5CheckSums[.bz2] file

C) (describes using option --lateLinks, example 6 below)
   If you write your backup to a server via NFS, most of the time will
   be spent setting hard links. Setting a single hard link is very fast, but
   if you have many thousands of them it takes some time.
   You can avoid waiting for the hard linking if you use the option --lateLinks:
   1) make a backup with storeBackup and set --lateLinks (or set
      lateLinks = yes
      in the configuration file). It will not create a single hard link;
      only a file is written with the information about what has to be linked.
   2) as a separate step, call storeBackupUpdateBackup to set all the
      required hard links and turn these incomplete backups into full
      backups. Please see also "how does it work with 'latelinks'" in
      the README file for a more detailed explanation.

Conclusions:
1) Everything depends on the existence of valid .md5CheckSums files!
   Keep this in mind when making backups with otherBackupDirs.
2) Do not delete a backup to which the hard links have not yet been generated.
   Use storeBackupUpdateBackup to set the hard links and check consistency.
   It's a good idea to only use storeBackup or storeBackupDel for the
   deletion of old backups.
3) All sharing of data between the backups is done via hard links. This means:
   - A backup series cannot be split over different partitions.
   - If you want to share data between different backup series, all backups
     must reside on the same partition.
   (See the small check shown after this list.)
4) All information about a backup in .md5CheckSums is stored with relative
   paths. It does not matter if you change the absolute path to the backup
   or make the backup from a different machine (the server backs up the
   client via NFS -- or the client backs up to the server via NFS).
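
Two quick checks related to point 3), using GNU stat; the paths and the
<date> placeholders below are only examples:

# stat -c %d /backup/seriesA /backup/seriesB
  (the same device number means both series are on the same partition,
   so sharing via hard links is possible)
# stat -c '%h %i %n' /backup/seriesA/<date>/some/file /backup/seriesB/<date>/some/file
  (the same inode number and a link count greater than 1 show that the
   file is stored only once and shared between the backups)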

If you have additional ideas or any questions, feel free to contact me
(hjclaes@web.de).



The examples are explained with command line parameters, but it is a
good idea to use a configuration file!
Simply call:
# storeBackup.pl --generate <configFile>

Edit the configuration file and call storeBackup in the following way:
# storeBackup.pl -f <configFile>

You can override settings in the configuration file via command line
(see EXAMPLE 6).
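
For illustration only, a fragment of such a configuration file could look
like the lines below. The key names are taken from the template that
--generate writes; always check your own generated template for the exact
names and the full list of options:

sourceDir = /home/jim
backupDir = /backup/jim
logFile = /tmp/storeBackup.log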



EXAMPLE 1:
==========
simple backup without any special requirement
---------------------------------------------

backup source tree '/home/jim' to '/backup/jim':

# storeBackup.pl -s /home/jim --backupDir /backup/jim \
      -l /tmp/storeBackup.log

will do the job and write the log to '/tmp/storeBackup.log'



EXAMPLE 2:
==========
backup of more than one directory at the same time
--------------------------------------------------

Unfortunately, for historical reasons, storeBackup can only handle one
source directory per backup, but there is a simple mechanism to overcome
this limitation:

you want to back up '/home/jim', '/etc' and '/home/greg/important'
to '/backup/stbu'

1) make a special directory, e.g. mkdir '/opt/stbu'
2) cd /opt/stbu
3) ln -s . stbu
4) ln -s /home/jim home_jim
5) ln -s /etc etc
6) ln -s /home/greg/important home_greg_important
7) write a short script 'backup.sh':
#! /bin/sh
<PATH>/storeBackup.pl -s /opt/stbu --backupDir /backup/stbu \
    -l /tmp/storeBackup.log --followLinks 1
8) chmod 755 backup.sh

Whenever you start this script, you will back up the wanted directories
and your short script. You need to be root to have the required
permissions to read the directories in this example.
(Step 3 will result in a directory identical to stbu in your backup.)
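
If you want to run this automatically, a root crontab entry can start the
script; the schedule below (every night at 02:00) is only an example:

0 2 * * *  /opt/stbu/backup.sh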



EXAMPLE 3:
==========
make a backup of the whole machine once a week and small backups every day
-------------------------------------------------------------------------

1) your machine mounts files from other servers at '/net' (you don't
   want to back this up)
2) you don't want to save '/tmp' and '/var/tmp'
3) you want to save the whole machine once a week to
   '/net/server/backup/weekly' (which takes some time)
4) you want to save '/home/jim' and '/home/tom/texts' to
   '/net/server/backup/daily' more quickly after you finish your work
5) naturally, you want to share the data between the two backup series
6) you should not start both backup scripts at the same time! This can
   result in less than 100% sharing of files between the two backups,
   but this is corrected automatically over time and does not cause any
   problems.

To perform the steps described above, you need to do the following:

1) for the daily backup, you make a special directory:
mkdir /opt/small-backup
cd /opt/small-backup
ln -s . small-backup
ln -s /home/jim home_jim
ln -s /home/tom/texts home_tom_texts
and write a backup script 'myBackup.sh':
#! /bin/sh
<PATH>/storeBackup.pl -s /opt/small-backup --backupDir /net/server/backup \
    -S daily -l /tmp/storeBackup.log --followLinks 1 0:weekly

2) script for weekly backup:
#! /bin/sh
<PATH>/storeBackup.pl -s / --backupDir /net/server/backup -S weekly \
    -l /tmp/storeBackup.log --exceptDirs /net -e /tmp -e /var/tmp \
    -e /proc -e /sys -e /dev 0:daily

The '0' before a series name (as in '0:weekly') means that the last
(most recent) backup of that other backup series is used to check for
identical files.
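
A possible root crontab for this setup: the weekly backup runs on Sunday
night, the daily backup on the other evenings, so the two never start at
the same time (see point 6 above). The times and the name 'weekly.sh' for
the script from step 2) are only examples:

30 1  * * 0    /opt/small-backup/weekly.sh
30 19 * * 1-6  /opt/small-backup/myBackup.sh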



EXAMPLE 4:
==========
make backups from different machines (not coordinated) and share the data
-------------------------------------------------------------------------

1) you have a server called 'server' with a separate disk mounted at
   '/disk1'; you back up the server itself to '/disk1/server'
2) you want to back up machine 'client1', which mounts disk1 of the server at
   '/net/server/disk1', to '/net/server/disk1/client1'
3) you want to back up machine 'client2', which mounts disk1 of the server at
   '/net/server/disk1', to '/net/server/disk1/client2'
4) the backup of the server runs nightly, independently of the other backups
5) the backups of the clients run uncoordinated, which means perhaps at the
   same time
6) you want to share all the data in the backups
7) you can also make small backups of parts of the source (with data sharing),
   but that's the same mechanism and not detailed in this example

1) script for the server:
#! /bin/sh
<PATH>/storeBackup.pl -s / --backupDir /disk1 -S server -l /tmp/storeBackup.log \
    -e /tmp -e /var/tmp -e /disk1 -e /sys -e /dev -e /proc 0:client1 0:client2


2) script for client1:
#! /bin/sh
<PATH>/storeBackup.pl -s / --backupDir /net/server/disk1 -S client1 \
    -l /tmp/storeBackup.log -e /tmp -e /var/tmp -e /disk1 -e /sys -e /dev \
    -e /proc 0:server 0:client2


3) script for client2:
#! /bin/sh
<PATH>/storeBackup.pl -s / --backupDir /net/server/disk1 -S client2 \
    -l /tmp/storeBackup.log -e /tmp -e /var/tmp -e /disk1 -e /sys -e /dev \
    -e /proc  0:server 0:client1



EXAMPLE 5:
==========
make a backup with different keepTimes for some directories
-----------------------------------------------------------

You can do this very easily with a trick already known from the
previous examples. Let's say you want to keep your backups for 60 days,
but all files in the directory 'notimportant' for only 7 days.
Simply make two backups: one with --keepAll 60d that excludes the
directory 'notimportant', and a second one with --keepAll 7d for the
missing directory. As described in EXAMPLE 3, create a relationship
between the backups. Then, if you move or copy a file between
'notimportant' and the rest of your saved directories, you will not
use additional space for the file.
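
A sketch of what the two calls could look like. All names here are only
examples: the data is assumed to live in /home/jim with the short-lived
files in /home/jim/notimportant, the backups go to /backup with the series
names 'main' and 'shortterm', and the exceptDirs value is written relative
to the source directory (check storeBackup.pl --help for the exact form):

# storeBackup.pl -s /home/jim --backupDir /backup -S main --keepAll 60d \
      -e notimportant -l /tmp/storeBackup.log 0:shortterm
# storeBackup.pl -s /home/jim/notimportant --backupDir /backup -S shortterm \
      --keepAll 7d -l /tmp/storeBackup.log 0:main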


EXAMPLE 6:
==========
make a backup via NFS as fast as possible (with lateLinks)
----------------------------------------------------------

1) Configure storeBackup to make a backup to your backup directory via
   NFS. You configure all options in the configuration file <cf1>
   and among others you set:
   lateLinks = yes
   lateCompress = yes
   doNotDelete = yes
   If you have low bandwidth, there is no point in setting lateCompress to
   'yes' (compressing on the client first reduces the amount of data sent
   over NFS).
   Because of 'doNotDelete = yes' you will not have to wait for the
   deletion of old backups.
2) Make your backup(s). (Like always, the very first backup will be slow.)
   You do not have to do anything more from your client (NFS client).
3) Start (via cron) on the server (NFS server = backup server):
   storeBackupUpdateBackup.pl -f <cf1> --topLevel <topLevelDir> \
          -l /tmp/stbuUpdate.log
   This overrides the topLevel path from the configuration file, which
   will probably be different on the server.
4) Start (via cron) on the server:
   storeBackupDel.pl -f <cf1> --topLevel <topLevelDir> \
          --unset doNotDelete
   This also overrides (unsets) the doNotDelete flag from the configuration
   file.
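
Steps 3) and 4) can also be started from a root crontab on the backup
server. The times, the location of the configuration file and the top
level directory below are only examples; use full paths to the scripts if
they are not in cron's PATH:

15 3 * * *  storeBackupUpdateBackup.pl -f /etc/storebackup/cf1 --topLevel /disk1 -l /tmp/stbuUpdate.log
45 4 * * *  storeBackupDel.pl -f /etc/storebackup/cf1 --topLevel /disk1 --unset doNotDelete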