|
Here follow the main features of the dar/libdar tool. For each feature, an overview is presented with pointers you are welcome to follow for more detailed information. |
HARD LINK CONSIDERATION | |
Hard links are properly saved in any case and properly restored if possible. For example, if restoring across
a mounted file system, hard linking will fail, but dar will then
duplicate the inode and file contents, issuing a warning. Hard link
support includes the following inode types: plain files, char devices,
block devices, symlinks (Yes, you can hard link symbolic links! Thanks to Wesley Leggette for the info ;-) ) |
SPARSE FILES |
references: man dar |
--sparse-file-min-size, -ah | |
By default, dar takes care of sparse files, even if the underlying filesystem does
not support sparse files(!). When a long sequence of zeroed bytes is
met in a file during backup, those bytes are not stored in the archive;
only their count is stored instead (a structure known as a "hole"). When the time comes to
restore that file, dar restores the normal data, but when a hole is met
in the archive, dar skips directly to the position of the data following
that hole. If the underlying filesystem supports sparse files,
this will (re)create a hole in the restored file, making it a sparse file.
Sparse files can report a size of several hundred gigabytes while
needing only a few bytes of disk space. Being able to properly save and restore them
avoids wasting disk space at restoration time and in archives. |
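The hole mechanism described above can be illustrated with a minimal Python sketch. It is not dar's implementation or archive format: the function names, the chunk representation and the `min_hole` threshold are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Union

# A chunk is either literal data (bytes) or the length of a hole (int).
Chunk = Union[bytes, int]


def encode_holes(data: bytes, min_hole: int = 15) -> Iterator[Chunk]:
    """Yield literal byte runs, replacing each run of at least
    min_hole zeroed bytes by its length (a "hole").
    min_hole is an illustrative threshold, not a dar default."""
    i, start, n = 0, 0, len(data)
    while i < n:
        if data[i] == 0:
            j = i
            while j < n and data[j] == 0:
                j += 1
            if j - i >= min_hole:
                if start < i:
                    yield data[start:i]   # literal data before the hole
                yield j - i               # the hole: only its size is kept
                start = j
            i = j
        else:
            i += 1
    if start < n:
        yield data[start:]                # trailing literal data


def decode_holes(chunks: Iterable[Chunk]) -> bytes:
    """Rebuild the original content. A real restoration would seek past
    the hole in the output file instead of writing zeroed bytes."""
    out = bytearray()
    for chunk in chunks:
        out.extend(bytes(chunk) if isinstance(chunk, int) else chunk)
    return bytes(out)
```

Short runs of zeros stay inside the literal data, since recording a hole for them would cost more than it saves.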
EXTENDED ATTRIBUTES (EA) |
references: man dar |
MacOS X FILE FORKS / ACL |
keywords: -u -U -am -ae --alter=list-ea |
Dar is able to
save and restore EA, either all of them or just those matching a given pattern.
File Forks (MacOS X) are implemented over
EA, as are Linux's ACL; they are thus transparently saved, tested,
compared and restored by dar.
Note that ACL under MacOS seem not to rely on EA; thus, while they are
marginally used, they are ignored by dar.
|
FILESYSTEM SPECIFIC ATTRIBUTES (FSA) |
references: man dar |
MacOSX/FreeBSD Birthdate, Linux FS attributes |
keyword: --fsa-family |
Since release 2.5.0, dar is able to take care of filesystem specific
attributes. These are grouped by family, each family being strongly linked to the
filesystem the attributes have been read from, but, orthogonally, each FSA is
also designated by a function. This way it is possible to translate an FSA
from one filesystem to another when there is an equivalence
in role.
Currently two families are present:
|
DIRTY FILES |
references: man dar |
keywords: --dirty-behavior , --retry-on-change |
|
At
backup time, dar checks that each saved file has not changed while
it was being read. If a file has changed in that situation, dar retries
saving it up to three times (by default), and if it is still changing, it is
flagged as "dirty" in the archive and handled differently from other
files at restoration time. The dirty file handling is either to warn
the user before restoring, to skip such files and not restore them, or to ignore
the dirty flag and restore them normally. Note that dar's precision when reading/writing inode dates (atime, ctime, mtime, birthtime) is the microsecond. Thus a file is seen as having changed even when it only receives very small modifications at a very high frequency. |
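As a rough illustration of the retry logic described above, here is a minimal Python sketch. It is an assumption-laden model, not dar's code: the function name `read_stable` is hypothetical, and comparing only the modification time and size before and after reading is a simplification chosen for the example.

```python
import os
from typing import Tuple


def read_stable(path: str, max_retries: int = 3) -> Tuple[bytes, bool]:
    """Read a file and return (data, dirty).

    The file is read once, then re-read up to max_retries times if it
    changed while being read. If it is still changing after the last
    retry, the last data read is returned with dirty=True.
    """
    data = b""
    for _ in range(max_retries + 1):
        before = os.stat(path)
        with open(path, "rb") as f:
            data = f.read()
        after = os.stat(path)
        # Assumption: "changed" means a different mtime or size.
        if (before.st_mtime_ns, before.st_size) == (after.st_mtime_ns, after.st_size):
            return data, False
    return data, True
```

A caller would store the `dirty` flag next to the saved data so that the restoration step can warn about, skip or restore the file according to the chosen policy.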
FILTERS |
references: man dar / command line usage notes |
keywords: -I -X -P -g -[ -] -am --exclude-by-ea |
|
dar
is able to back up anything from a whole filesystem down to a single file, thanks to
its filter mechanism. This mechanism is dual headed: the first head lets you
decide which part of a directory tree to consider for the operation
(backup, restoration, etc.), while the second head defines which type of
file to consider (a filter based only on filename, for example on the
extension of the file). For backup operations, files and directories can also be filtered out if they have been set with a given user-defined EA. |
NODUMP FLAG | references: man dar |
keywords: --nodump | |
Many filesystems, like ext2/3/4, provide for each inode a set of flags, among which is the "nodump" flag. You can instruct dar to avoid saving files that have this flag set, as does the so-called
dump backup program. |
ONE FILESYSTEM | references: man dar |
keywords: -M | |
By default dar
does not stop at filesystem boundaries, unless the filtering mechanism
described above excludes a directory that is the mount point of another
filesystem. But you can also ask dar to avoid crossing filesystem boundaries
without the burden of finding and listing the directories to be
excluded from the backup: dar will manage on its own to save only the files of
the current filesystem. |
CACHE DIRECTORY TAGGING STANDARD |
references: man dar |
keywords: --cache-directory-tagging |
|
Many software applications use cache directories (the mozilla web browser for example), directories where temporary data is stored that is not worth backing up. The Cache Directory Tagging Standard
provides a standard way for software applications to identify this type
of data, which lets dar (like some other backup software) take it into account and avoid saving it. |
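To make the tagging idea concrete, here is a small Python sketch of how a backup tool might recognize a tagged cache directory. It does not show dar's own code. The file name `CACHEDIR.TAG` and the signature string are quoted from the Cache Directory Tagging Standard as recalled here and should be checked against the specification; the function name is hypothetical.

```python
import os

# Signature expected at the very start of a CACHEDIR.TAG file
# (assumption: taken from the Cache Directory Tagging Standard).
CACHEDIR_SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"


def is_tagged_cache_dir(path: str) -> bool:
    """Return True if the directory holds a regular CACHEDIR.TAG file
    starting with the expected signature."""
    tag = os.path.join(path, "CACHEDIR.TAG")
    if os.path.islink(tag) or not os.path.isfile(tag):
        return False
    try:
        with open(tag, "rb") as f:
            return f.read(len(CACHEDIR_SIGNATURE)) == CACHEDIR_SIGNATURE
    except OSError:
        return False
```

A directory walker would call this check on each directory and skip its contents when it returns True.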
DIFFERENTIAL BACKUP | references: man dar/TUTORIAL |
keywords: -A | |
When making a backup with dar, you can make either a full backup or a differential backup. A full backup, as expected, saves all files specified on the command line (with or without filters). A differential backup, in addition to applying the filter mechanism, saves only files that have changed since a given reference backup. Additionally, files that existed in the reference backup and no longer exist at the time of the differential backup are recorded in the backup as "been removed". At recovery time (unless you deactivate it), restoring a differential backup will update changed files and add new files, but also remove files that have been recorded as "been removed". Note that the reference backup can be a full backup or another differential backup (this second method is usually called incremental backup). This way you can, for example, make a first full backup, then many incremental backups, each taking as reference the last backup made. |
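The comparison against a reference backup can be sketched as follows. This is a simplified model, not dar's algorithm: each backup is reduced to a hypothetical mapping from path to modification time, and a file counts as changed when that single value differs.

```python
from typing import Dict, List, Tuple


def diff_against_reference(
    reference: Dict[str, float], current: Dict[str, float]
) -> Tuple[List[str], List[str]]:
    """Return (to_save, removed).

    to_save: paths that are new or whose recorded mtime differs from
             the reference.
    removed: paths present in the reference but gone from the
             filesystem, to be recorded as removed.
    """
    to_save = sorted(p for p, mtime in current.items() if reference.get(p) != mtime)
    removed = sorted(p for p in reference if p not in current)
    return to_save, removed
```

For an incremental chain, the `current` mapping of one run simply becomes the `reference` of the next one.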
DECREMENTAL BACKUP | references: man dar / Decremental backup |
keywords: -+ -ad |
|
As
opposed to incremental backups, where the oldest one is a full backup
and each subsequent backup contains only the changes from the previous
backup, decremental backup lets the full backup be the most recent one, while
the older ones contain only the changes compared to the next more recent one. This
has the advantage of providing a single archive to use to restore a whole
system in its latest known state, while reducing the overall amount
of data needed to retain older versions of files (same amount required as with
differential backup). It also has the advantage of not requiring you to keep
several sets of backups, as you just need to delete the oldest backup when
you need storage space. However, it has the drawback of requiring, at each
new cycle, the creation of a full backup, then the transformation of
the previous full backup into a so-called decremental backup. Yes, everything has
a cost! |
DELTA BINARY |
references: man dar |
keywords: --delta sig, --include-delta-sig, --exclude-delta-sig, --delta-sig-min-size, --delta no-patch |
|
Since
release 2.6.0, for incremental and decremental backups, instead of
saving a whole file when it has changed, dar/libdar provides
the ability to save only the part of it that has changed. This feature,
called binary delta, relies on the librsync library. It is not activated by
default, considering the non-zero probability of collision between two
different versions of a file. This is also the choice of the dar user
community. |
PREVENTING ROOTKITS AND OTHER MALWARES |
references: man dar |
keywords: -asecu | |
At backup time, when a differential, incremental or decremental backup is
done, dar compares the status of each inode on the filesystem to the
status it had at the time of the last backup. If the ctime of a file
has changed while no other inode field has changed, dar issues a warning,
considering that file as suspicious. This does not mean that your
system has been compromised, but you are strongly advised to check
whether the file concerned has been recently updated (some package
managers may lead to that situation) or has had its Extended Attributes
changed since the last backup was made. In a normal situation this type of
warning does not show often (false positives are rare but possible).
However, in case your system has been infected by a virus or compromised
by a rootkit, dar will signal the problem if the intruder tried to hide
its misdeed. |
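The rule "ctime changed while nothing else did" can be written down in a few lines. The sketch below is illustrative only: the `InodeStatus` record and the set of fields it compares are assumptions, not the list of fields dar actually inspects.

```python
from typing import NamedTuple


class InodeStatus(NamedTuple):
    """Hypothetical subset of inode fields recorded at backup time."""
    size: int
    mode: int
    uid: int
    gid: int
    mtime: float
    ctime: float


def is_suspicious(old: InodeStatus, new: InodeStatus) -> bool:
    """True when ctime differs but every other recorded field is identical."""
    return old.ctime != new.ctime and old._replace(ctime=new.ctime) == new
```

The reasoning behind the rule is that a tool can reset mtime after tampering with a file, but doing so itself updates ctime, so a lone ctime change is worth a warning.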
DIRECTORY TREE SNAPSHOT | references: man dar |
keywords: -A + |
|
Dar can make a snapshot of a directory tree and its files, recording the inode status of files. This may be used to detect changes in the filesystem by "diffing" the resulting archive with the filesystem at a later time. The resulting archive can also be used as a reference to save files that have changed since the snapshot was made. A snapshot archive is very small compared to the corresponding full backup, but it cannot be used to restore any data. |
SLICES | references: man dar/TUTORIAL |
keywords: -s -S -p -aSI -abinary |
|
Dar
stands for Disk
ARchive. From the beginning it was designed to be able to split an
archive over several removable media, whatever their number and
whatever their size. To restore from such a split archive, dar
will directly fetch the requested data from the correct slice(s). Thus
dar is able to save and restore using old floppy disks,
CD-R, DVD-R, CD-RW, DVD-RW, Zip, Jaz, etc. However, dar will not
un/mount removable media because it is independent of hardware.
Given a size, it will split the archive into several files (called
SLICES), optionally pausing before creating the next one, allowing
the user to un/mount a medium, burn the file on CD-R, or send it by
email (if your mail system does not allow huge files in emails, dar can
help you here also.. but OK, doing so is bad :-)). By default (no
size specified), dar will make one
slice whatever its size is. Additionally, the size of the first slice
can be specified separately, if for example you first want to fill up a
partially filled disk before starting to use empty ones. Last, at
restoration time, dar will pause and prompt the user for a
slice only if it is missing, so you can choose to have more than one
slice per medium without penalty from dar. Note that all these
operations can be
automated using the "user command between slices" feature (presented
below), which lets dar do all you want it to do once a slice is created
or before reading a slice. |
COMPRESSION | references: man dar |
keywords: -z |
|
dar can use compression. By default no compression is used. Currently the gzip, bzip2, lzo and xz/lzma algorithms are available, and there is still room for other compression algorithms. Note that compression is done before slicing, which means that using compression together with slices will not make slices smaller, but will probably result in fewer slices in the backup. |
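The effect of compressing before slicing can be shown with a toy Python model. This is only a sketch: it compresses a whole buffer in memory with zlib as a stand-in for whichever algorithm is selected, and the function name is hypothetical.

```python
import zlib
from typing import List


def compress_then_slice(data: bytes, slice_size: int) -> List[bytes]:
    """Compress the whole stream first, then cut it into fixed-size slices.

    Every slice except possibly the last one has exactly slice_size
    bytes, so compression reduces the number of slices, not their size.
    """
    compressed = zlib.compress(data)
    return [
        compressed[i:i + slice_size]
        for i in range(0, len(compressed), slice_size)
    ]
```

Joining the slices back in order and decompressing the result gives the original data, which is why a slice cannot be decompressed on its own in this toy model.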
SELECTIVE COMPRESSION | references: man dar/samples |
keywords: -Y -Z -m -am |
|
dar can be given a special filter that determines which files will be compressed and which will not. This way you can speed up the backup operation by not trying to compress *.mp3, *.mpg, *.zip, *.gz and other already compressed files, for example. Moreover, another mechanism allows you to specify that files under a given size (whatever their name is) will not be compressed. |
STRONG ENCRYPTION | references: man dar |
keywords: -K -J -# -* blowfish, twofish, aes256, serpent256, camellia256 |
|
Dar can use blowfish, twofish, aes256, serpent256 and camellia256 algorithms to encrypt the whole archive. Two "elastic buffers" are inserted and encrypted with the rest of the data, one at the beginning and one at the end of the archive to prevent a clear text attack or codebook attack. |
PUBLIC KEY ENCRYPTION |
references: man dar |
keywords: -K, --key-length |
|
Encryption
based on GPG public keys is available. A given archive can be encrypted
for a recipient (or several recipients without visible overhead) using
their public key(s). Only the recipient(s) will be able to read such an
encrypted archive. |
PRIVATE KEY SIGNATURE |
references: man dar |
keywords: --sign |
|
When
using encryption with a public key, it is possible in addition to sign an
archive with your own private key(s). Your recipients can then be sure
the archive has been generated by you: dar will check the signature
validity against the corresponding public key(s) each time the archive
is used (restoration, testing, etc.) and a
warning is issued if the signature does not match or if the key needed to
verify
the signature is missing. You can also get the list of signatories of the archive
while listing the archive content. |
SLICE HASHING |
references: man dar |
--hash, md5, sha1, sha512 |
|
When
creating an archive, dar can compute an md5, sha1 or sha512 hash before the
archive is written to disk and produce a small file compatible with
md5sum, sha1sum or sha512sum that lets you verify that the medium has not
corrupted the archive slices. |
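For readers unfamiliar with the checksum file format, the following Python sketch builds a line of the kind md5sum, sha1sum or sha512sum can check: the hexadecimal digest, two spaces, then the file name. It reads an existing file after the fact, unlike dar, which computes the hash while producing the slice; the function name is hypothetical.

```python
import hashlib
import os


def checksum_line(path: str, algo: str = "sha512") -> str:
    """Return one "<hex digest>  <file name>" line for the given file."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        # Read in 1 MiB blocks so large slices do not need to fit in memory.
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return f"{h.hexdigest()}  {os.path.basename(path)}\n"
```

Writing that line to a file placed next to the slice lets the matching `*sum -c` tool recompute the digest later and report whether the slice still matches.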
DATA PROTECTION | references: man dar/Parchive integration |
keywords: -al |
|
Dar is able to detect corruption in any part of a dar archive, but it cannot fix it. Dar relies on the Parchive program for data protection against media errors. Thanks to dar's ability to run a user command or script, and thanks to the ad hoc scripts provided, dar can use Parchive as simply as adding a word (par2) on the command line. Depending on the context (archive creation, archive testing, ...), dar will by this means create parity data for each slice, verify the archive slices and, if necessary, repair them. Without Parchive, dar can work around a corruption by not restoring the file concerned. For some more vital parts of the archive, like the "catalog", which is the table of contents, dar has the ability to use an isolated catalog as a backup of the internal catalog of an archive. It can also make use of the tape marks that are used inside the archive for sequential reading as a way to overcome catalog corruption. The other vital information is the slice layout, which is replicated in each slice and lets dar overcome data corruption of that part too. As a last resort, dar also proposes a "lax" mode in which the user is asked questions (like the compression algorithm used, ...) to help dar recover very corrupted archives, and in which many sanity checks are turned into warnings instead of aborting the operation. However, this does not replace using Parchive: the "lax" mode has to be considered as the last resort option. |
TRUNCATED ARCHIVE REPARATION |
reference: man dar |
keyword: -y |
|
Since
version 2.6.0, a truncated archive (due to lack of disk space, power
outage, or any other reason) can be repaired. A truncated archive lacks
the table of contents, which is located at the end of the archive. Without
it you cannot know which files are saved and where to fetch their data from,
unless you use the sequential reading mode, which is slow as it implies
reading the whole archive even for restoring just one file. To allow
sequential reading of an archive, which is suitable for tape media,
some metadata is by default inserted all along the archive. This
metadata is globally the same information that the
missing table of contents should contain, but spread in pieces all along the archive.
Repairing an archive consists of gathering this inlined metadata and
adding it at the end of the repaired archive to allow direct access
mode (the default mode), which is fast and efficient. |
DIRECT ACCESS | |
Even when using compression and/or encryption, dar does not have
to read the whole backup to extract one file. This way, if you just want
to restore one file from a huge backup, the process will be much faster
than using tar. Dar first reads the catalogue (i.e. the contents of the
backup), then it goes directly to the location of the saved file(s) you
want to restore and then proceeds to restoration. In particular, when using slices,
dar will ask only for the slice(s) containing the file(s) to restore. Since version 2.6.0 dar can also read an archive from a remote host by means of FTP or SFTP. Here too dar can leverage its direct access ability to download only what is necessary in order to restore some files from a large archive, list the archive content, or even compare a set of files with the live filesystem. |
SEQUENTIAL ACCESS |
references: man dar |
(suitable for tapes) |
--sequential-read, -at |
The
direct access feature seen above is well adapted to random access media
like disks, but not to tapes. Since release 2.4.0, dar provides a
sequential mode in which dar sequentially reads and writes archives. It
has the advantage of being efficient with tapes but suffers from the same
drawback as tar archives: it is slow to restore a single file from a
huge archive. The second advantage is the ability to repair a truncated
archive (lack of disk space, power outage, ...) as described above. |
MULTI-VOLUME TAPES |
references: man dar_split |
keywords: --sequential-read | |
The independent dar_split program
provides a means to output dar archives, but also tar archives, to several tapes.
It takes care of splitting the archive when writing to tapes and of gathering
the pieces of the archive from several tapes, for dar/tar to work as if it were a
single-piece archive. |
ARCHIVE TESTING | references: man dar/TUTORIAL/Good Backup Practice |
keywords: -t |
|
Thanks to CRC (cyclic redundancy checks), dar is able to detect data corruption in an archive. Only the file in which data corruption occurred cannot be restored; dar will restore the others, even when compression or encryption (or both) is used. |
ISOLATION | references: man dar |
keywords: -C -A -@ |
|
The catalogue (i.e. the contents of an
archive) can be copied (this operation is called isolation)
to a small file, which
can in turn be used as reference for a differential archive. There is
then no need to provide an archive to be able to create a differential
backup based on it: just its catalogue
can be used instead. Such an isolated catalogue
can also be used to rescue the archive it has been isolated from, in case the archive's internal catalogue has been corrupted. Such an isolated catalogue can be created at the same time as the archive (an operation called on-fly isolation) or as a separate operation (called isolation). |
FLAT RESTORATION | references: man dar |
keywords: -f | |
It is possible to restore any
file without restoring the directories and subdirectories it was in at
the time of the backup. If this option is activated, all files will be
restored in the (-R) root directory, whatever their real position recorded inside the archive. |
USER COMMAND BETWEEN SLICES | references: man dar dar_slave dar_xform/command line usage notes |
keywords: -E -F -~ |
|
Several hooks are provided for dar to call a given command once a slice has been written or before reading a slice. Several macros allow the user command or script to know the requested slice number, path and archive basename. |
USER COMMAND BEFORE AND AFTER SAVING A DIRECTORY OR A FILE |
references: man dar/command line usage notes |
keywords: -< -> -= |
|
It
is possible to define a set of files that will have a command executed
before dar starts saving them and once dar has completed saving them.
This is especially intended for backing up live databases. Before
entering a directory, dar will call the specified user command, then it
will proceed to the backup of that directory. Once the whole directory
has been saved, dar will call the same user command again (with
slightly different arguments) and then continue the backup
process. Such a user command may, for example, stop the database and
reactivate it afterward. |
CONFIGURATION FILE | references: man dar, conditional syntax and user targets |
keywords: -B |
|
dar can read parameters from a
file. This is a way to extend the limited length of the command-line
input. A configuration file can ask dar to read (or to include) other
configuration files. A simple but efficient mechanism forbids a file from
including itself, directly or not, and there is no limitation in the
degree of recursion for the inclusion of configuration files. Two special configuration files, $HOME/.darrc and /etc/darrc, are read if they exist. They share the same syntax as any configuration file, which is the syntax used on the command line, optionally supplemented by newlines and comments. Any configuration file can also contain conditional statements, which describe which options are to be used in different conditions. Conditions are: "extract", "listing", "test", "diff", "create", "isolate", "merge", "reference", "auxiliary", "all", "default" (which may be useful in case of recursive inclusion of files) ... more about their meaning and use cases in the dar man page. |
REMOTE OPERATIONS | references: command line usage notes, man dar/dar_slave/dar_xform |
|
keywords: -i -o - -afile-auth |
dar is able to read and write an archive to a remote server in three different ways:
1 - dar is able to produce an archive to its standard output or to a named pipe, and is able to read an archive from its standard input or from a named pipe.
2 - While the previous approach is fine for writing an archive over the network (through an ssh session for example), reading an archive from a remote server that way (using a single pipe) requires dar to read the whole archive, which may be inefficient for restoring just a single file. For that reason, dar is also able to read an archive through a pair of pipes (or named pipes) using dar_slave at the other side of the pipes. Of the pair of pipes, one lets dar tell dar_slave which portion of the archive it has to send through the other pipe. This makes a remote restoration much more efficient and still allows these bidirectional exchanges to be encrypted over the network, simply by running dar_slave through an ssh session.
3 - Last, since release 2.6.0 dar can make use of the FTP or SFTP protocols to read or write an archive from or to a remote server. This method does not rely on anonymous or named pipes, is as efficient as option 2 for reading a remote archive, and is compatible with slicing and slice hashing. However, this option is restricted to these two network protocols: FTP (low CPU usage but insecure) and SFTP (secure). |
DAR MANAGER | references: man dar_manager |
The advantage of differential
backup is that it takes much less space to store and less time to complete
than always making a full backup. But, on the other hand, it may lead you to have a
lot of them due to the reduced space requirements. Then, if you want to restore a particular file, you may
spend time figuring out which backup holds the most recent version.
To solve this, dar_manager gathers content information about all your backups. At restoration
time, it will call dar for you to restore the requested file(s) from the
proper backup. |
RE-SHAPE SLICES OF AN EXISTING ARCHIVE | references: man dar_xform |
|
|
The
provided program
named "dar_xform" is able to change the size of slices of a given
archive. The resulting archive is totally identical to archives
directly created by dar. The source archive can be taken from a set of
slices, from standard input or even from a named pipe. Note that dar_xform
can work on encrypted and/or compressed data without having to
decompress or even decrypt it. |
ARCHIVE MERGING | references: man dar |
keywords: -+ -ak -A -@ |
|
From version 2.3.0, dar supports the merging of two
existing archives into a single one. This merging operation comes with
the same filtering mechanism used for archive creation. This lets the
user define which files will be part of the resulting archive. By extension, archive merging can also take a single source archive as input. This may sound a bit strange at first, but it lets you make a subset of a given archive without having to extract any file to disk. In particular, if your filesystem does not support Extended Attributes (EA), thanks to this feature you can still clean up an archive from files you do not want to keep anymore, without losing any EA or performing any change to standard file attributes (like modification dates for example) of the files that will stay in the resulting archive. Last, this merging feature also gives you the opportunity to change the compression level or algorithm used, as well as the encryption algorithm and passphrase. Of course, from a pair of source archives you can use all these sub-features at the same time: filtering out files you do not want in the resulting archive, using a different compression level and algorithm or encryption password and algorithm than the source archive(s). You may also have a different archive slicing or no slicing at all (though dar_xform is more efficient for this feature only, see above "RE-SHAPE SLICES OF AN EXISTING ARCHIVE" for details). |
ARCHIVE SUBSETTING |
references: man dar |
keywords: -+ -ak |
|
As seen above under the "archive merging" feature description, it is possible to define a
subset of files from an archive and put them into a new archive without
having to really extract these files to disk. To speed up the process, it is also possible to avoid
uncompressing/recompressing files that are kept in the resulting archive, or to change
their compression, as well as to change the encryption scheme used. Last, you
may manipulate files and their EA this way even when you don't have EA
support available on your system. |
DRY-RUN EXECUTION |
references: man dar |
keywords: -e |
|
You
can run any feature without actually performing the action. Dar will
report any problem but will not create, remove or modify any file. |
ARCHIVE USER COMMENTS |
references: man dar |
keywords: --user-comment, -l -v, -l -q |
|
The
archive header can encompass a message from the user. This message is
never ciphered nor compressed and is always available to anyone listing
the archive summary (-l and -q options). Several macros are available to
make this option more convenient to use, like the current date, the uid and gid
used for archive creation, the hostname, and the command line used for the
archive creation. |
PADDED ZEROS TO SLICE NUMBER |
references: man dar |
keywords: --min-digits |
|
Dar
slices are numbered by integers starting at 1, which gives filenames of
the following form: archive.1.dar, archive.2.dar, ..., archive.10.dar,
etc. However, the lexicographical order used by many directory listing
tools is not suited to showing the slices in order. For that reason, dar
lets the user define how many zeros to add to the slice numbers so that
usual file browsers list slices as expected. For example, with 3 as the
minimum number of digits, the slice names would become: archive.001.dar,
archive.002.dar, ... archive.010.dar. |
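The ordering problem and its fix are easy to reproduce. The short Python sketch below builds slice names with and without padding and sorts them lexicographically; `slice_name` is a hypothetical helper written for this illustration, not part of dar.

```python
def slice_name(basename: str, number: int, min_digits: int = 1) -> str:
    """Build a slice file name, left-padding the number with zeros
    up to min_digits digits."""
    return f"{basename}.{number:0{min_digits}d}.dar"


numbers = [1, 2, 10]

# Without padding, a lexicographical sort puts slice 10 before slice 2.
unpadded = sorted(slice_name("archive", n) for n in numbers)

# With a minimum of 3 digits, lexicographical and numerical order agree.
padded = sorted(slice_name("archive", n, 3) for n in numbers)
```

Here `unpadded` comes out as archive.1.dar, archive.10.dar, archive.2.dar, while `padded` comes out as archive.001.dar, archive.002.dar, archive.010.dar.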