Before turning to the examples, here are some important aspects of
how storeBackup works. (The following explains the basic mechanism;
for performance reasons the actual implementation differs a little:
there are several wait queues, parallel operations and a tiny
scheduler, which are not described here.)
storeBackup uses two internal flat files in each generated backup:
.md5CheckSums.info - general information about the backup
.md5CheckSums[.bz2] - information about every file (dir, etc.) saved
When storeBackup.pl starts, it basically does the following (besides
some other things):
1) read the contents of the previous .md5CheckSums[.bz2] file and store it
   in two dbm databases: dbm(md5sum) and dbm(filename)
   (dbm(md5sum) means that md5sum is the key)
2) read the contents of other .md5CheckSums[.bz2] files (otherBackupDirs)
   and store them in dbm(md5sum). If two different files (e.g. from
   different backup series) are identical, always store the one with the
   lowest inode number in the dbm file. This ensures that multiple
   copies of the same file in different backups are unified over time
   (see the sketch below).
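A minimal sketch of the 'lowest inode number wins' rule (this is not
storeBackup code, and the one-line-per-file index layout
'<md5sum> <inode> <path>' is invented for this illustration):

#! /bin/sh
# merge index lines collected from several backups; for identical
# contents (same md5 sum) keep only the entry with the lowest inode
sort -k1,1 -k2,2n all-backups.index |
awk '!seen[$1]++' > merged.index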
A) (describes backup without sharing of files, examples 1 and 2 below)
In a loop over all files to back up, it does:
1) look into dbm(filename) -- which contains all files from the previous
   backup -- to see whether the exact same file exists and has not
   changed. In this case, the needed information is taken from the
   values in dbm(filename).
   If the file existed in the previous backup(s), make a hard link and
   go to 3)
2) calculate the md5 sum of the file to back up and
   look up that md5 sum in dbm(md5sum):
   if it exists there, make a hard link;
   if it doesn't exist, copy or compress the file
3) write the information about the new file to the corresponding
   .md5CheckSums[.bz2] file
(A sketch of this loop follows part B below.)
B) (describes backup with sharing of files, examples 3 and 4 below)
In a loop over all files to back up, it does:
1) look into dbm(filename) -- which contains all files from the previous
   backup -- to see whether the exact same file exists and has not
   changed. In this case, the needed information is taken from the
   values in dbm(filename).
   (Because the backups are now independent, a file with the same
   contents may also exist in another backup series, so we have to look
   into dbm(md5sum) to ensure that all backup series link to the same
   file.)
2) calculate the md5 sum of the file to back up, if it is not known
   from step 1), and
   look up that md5 sum in dbm(md5sum):
   if it exists there, make a hard link;
   if it doesn't exist, copy or compress the file
3) write the information about the new file to the corresponding
   .md5CheckSums[.bz2] file
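To make the loop in A) and B) concrete, here is a minimal sketch. It
is not storeBackup code: it replaces the dbm databases with a flat
index file using the invented layout '<md5sum> <path-in-backup>',
skips the dbm(filename) shortcut and the compression, and assumes
paths without blanks:

#! /bin/sh
SRC=/home/jim                    # example paths, invented
BACKUP=/backup/jim/current
INDEX=/backup/jim/index
touch "$INDEX"

find "$SRC" -type f | while read f
do
    sum=`md5sum "$f" | cut -d' ' -f1`
    target="$BACKUP${f#$SRC}"    # mirror the source path in the backup
    dir=`dirname "$target"`
    mkdir -p "$dir"
    known=`grep "^$sum " "$INDEX" | head -n 1 | cut -d' ' -f2`
    if [ -n "$known" ]
    then
        ln "$known" "$target"    # contents already saved: hard link
    else
        cp -p "$f" "$target"     # first occurrence: real copy
        echo "$sum $target" >> "$INDEX"
    fi
done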
Conclusions:
1) Everything depends on the existence of valid .md5CheckSums files!
   You have to bear this in mind when making backups with otherBackupDirs.
2) All sharing of data in the backups is done via hard links (see the
   small demonstration below). This means:
   - A backup series cannot be split over different partitions.
   - If you want to share data between different backup series, all
     backups must reside on the same partition.
3) All information about a backup in the .md5CheckSums file is stored
   with relative paths. So it does not matter if you change the absolute
   path to the backup or back up via a different machine (server makes
   the backup of the client via NFS --- client writes the backup to the
   server via NFS).
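A quick demonstration of conclusion 2 (file names invented; assume
'/other' lies on a different partition than the current directory):

#! /bin/sh
echo hello > a
ln a b          # hard link on the same partition: works
ls -i a b       # both names show the same inode number
ln a /other/c   # fails: a hard link cannot cross partition boundaries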
If you have additional ideas or any questions, feel free to contact me
(hjclaes@web.de).
EXAMPLE 1:
==========
simple backup without any special requirement
---------------------------------------------
backup source tree '/home/jim' to '/backup/jim':
# storeBackup.pl -s /home/jim -t /backup/jim -l /tmp/storeBackup.log
will do the job and write the log to '/tmp/storeBackup.log'
EXAMPLE 2:
==========
backup of more than one directory at the same time
--------------------------------------------------
Unfortunately, for historical reasons, storeBackup can only handle one
directory to back up, but there is a mechanism to overcome this
limitation. Say you want to back up '/home/jim', '/etc' and
'/home/greg/important' to '/backup/stbu':
1) make a special directory, e.g. mkdir '/opt/stbu'
2) cd /opt/stbu
3) ln -s . stbu
4) ln -s /home/jim home_jim
5) ln -s /etc etc
6) ln -s /home/greg/important home_greg_important
7) write a short script 'backup.sh':
#! /bin/sh
PATH/storeBackup.pl -s /opt/stbu -t /backup/stbu -l /tmp/storeBackup.log --followLinks 1
8) chmod 755 backup.sh
Whenever you start this script, you back up the wanted directories
plus the script itself. In this example you need to be root to have
the required permissions to read all the directories.
(The symlink created in step 3 results in a copy of the directory
'/opt/stbu' itself -- and with it 'backup.sh' -- in your backup; a
sketch of the resulting layout follows below.)
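With this setup, a single backup generated by 'backup.sh' will look
roughly like this (timestamp invented; the exact naming of the backup
directory may differ between storeBackup versions):

/backup/stbu/2004.01.30_10.30.00/
    stbu/                  <- from the 'stbu -> .' symlink; contains backup.sh
    home_jim/              <- contents of /home/jim
    etc/                   <- contents of /etc
    home_greg_important/   <- contents of /home/greg/important
    .md5CheckSums.info
    .md5CheckSums.bz2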
EXAMPLE 3:
==========
make a backup of the whole machine once a week and small backups every day
-------------------------------------------------------------------------
1) your machine mounts files from other servers at '/net'
2) you don't want to save '/tmp' and '/var/tmp'
3) you want to save the whole machine once a week to
'/net/server/backup/weekly' (which takes some time)
4) you want to save '/home/jim' and '/home/tom/texts' to
   '/net/server/backup/daily' more quickly after you finish your work
5) naturally, you want to share the data between the two backup series
6) you should not start both backup scripts at the same time! Doing so
   can result in less than 100% sharing of files between the two
   backups, but this is corrected automatically over time and does not
   cause any problems.
To set up the backups described above, do the following:
1) for the daily backup, you make a special directory:
mkdir /opt/small-backup
cd /opt/small-backup
ln -s . small-backup
ln -s /home/jim home_jim
ln -s /home/tom/texts home_tom_texts
and write a backup script 'myBackup.sh':
#! /bin/sh
PATH/storeBackup.pl -s /opt/small-backup -t /net/server/backup/daily \
-l /tmp/storeBackup.log --followLinks 1 0:/net/server/backup/weekly
2) script for weekly backup:
#! /bin/sh
PATH/storeBackup.pl -s / -t /net/server/backup/weekly -l /tmp/storeBackup.log \
--exceptDirs /net,/tmp,/var/tmp 0:/net/server/backup/daily
The '0' before a path (like '0:/net/server/backup/weekly') means that
the last (newest) backup of that other backup series is used when
checking for identical files.
EXAMPLE 4:
==========
make backups from different machines (not coordinated) and share the data
-------------------------------------------------------------------------
1) you have a server called 'server' with a separate disk mounted at
   '/disk1'; the server's own backup goes to '/disk1/server'
2) you want to back up machine 'client1', which mounts disk1 of the
   server at '/net/server/disk1', to '/net/server/disk1/client1'
3) you want to back up machine 'client2', which mounts disk1 of the
   server at '/net/server/disk1', to '/net/server/disk1/client2'
4) the backup of the server runs nightly, independent of the other backups
5) the backups of the clients run uncoordinated, which means possibly
   at the same time
6) you want to share all the data between the backups
7) you can also make small backups of parts of the source (with data
   sharing), but that works by the same mechanism and is not detailed
   in this example
1) script for the server:
#! /bin/sh
storeBackup.pl -s / -t /disk1/server -l /tmp/storeBackup.log \
--exceptDirs /tmp,/var/tmp,/disk1 \
0:/disk1/client1 0:/disk1/client2
2) script for client1:
#! /bin/sh
storeBackup.pl -s / -t /net/server/disk1/client1 -l /tmp/storeBackup.log \
--exceptDirs /net,/tmp,/var/tmp \
0:/net/server/disk1/server 0:/net/server/disk1/client2
3) script for client2:
#! /bin/sh
storeBackup.pl -s / -t /net/server/disk1/client2 -l /tmp/storeBackup.log \
--exceptDirs /net,/tmp,/var/tmp \
0:/net/server/disk1/server 0:/net/server/disk1/client1
EXAMPLE 5:
==========
make a backup with different keepTimes for some directories
-----------------------------------------------------------
You can do this quite easily with the trick known from the previous
examples. Let's say you want to keep your backup for 60 days, but all
files in the directory 'notimportant' for only 7 days.
Simply make two backups: one with --keepAll 60d that excludes the
directory 'notimportant', and a second one with --keepAll 7d for the
missing directory. As described in EXAMPLE 3, create a relationship
between the two backups (see the sketch below). Then, if you move or
copy a file between 'notimportant' and the rest of your saved
directories, it will not use additional space.
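A minimal sketch of the two scripts, assuming the source is '/data'
with the short-lived directory '/data/notimportant', and that both
backup series go to '/backup/main' and '/backup/notimportant' on the
same partition (all paths invented; check the documentation of your
storeBackup version for the exact --exceptDirs semantics):

1) script for the long-lived backup:
#! /bin/sh
storeBackup.pl -s /data -t /backup/main -l /tmp/storeBackup.log \
    --keepAll 60d --exceptDirs /data/notimportant \
    0:/backup/notimportant

2) script for 'notimportant':
#! /bin/sh
storeBackup.pl -s /data/notimportant -t /backup/notimportant \
    -l /tmp/storeBackup.log --keepAll 7d 0:/backup/main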