File: DataManagement.ref.txt

= Do not use Edit(GUI) button. =

[[TableOfContents(4)]]

Copyright 2007, 2008  Osamu Aoki GPL, (Please agree to GPL, GPL2, and any version of GPL which is compatible with DFSG if you update any part of this wiki page)

Generated HTML is at "[http://people.debian.org/~osamu/pub/getwiki/html/ch11.en.html Debian Reference: Chapter 11. Data management]".

I welcome your contributions to update this wiki page. You must follow these rules:
 * Do not use Edit(GUI) button of MoinMoin.
 * You can update anytime for:
  * grammar errors
  * spelling errors
  * moved URL location
  * package name transition adjustment (emacs23 etc.)
  * clearly broken script.
 * Before updating this wiki content:
  * Read "[http://wiki.debian.org/DebianReference/Test Guide for contributing to Debian Reference]".

= Data management =

== Sharing, copying, and archiving ==

The security of the data and its controlled sharing have several aspects:
 * the creation of data archive,
 * the remote storage access,
 * the duplication,
 * the tracking of the modification history,
 * the facilitation of data sharing,
 * the prevention of unauthorized file access, and
 * the detection of unauthorized file modification.

These can be realized by using some combination of:
 * the archive and compression tools,
 * the copy and synchronization tools,
 * the network file system,
 * the removable storage media,
 * the secure shell,
 * the authentication system,
 * the version control system tools, and
 * hash and cryptographic encryption tools.

=== Archive and compression tools ===

Here is a summary of archive and compression tools available on the Debian system:

|| List of archive and compression tools. || 1 || 2 || 3 || || ||
|| '''package''' || '''popcon''' || '''size''' || '''command''' || '''comment''' || '''extension''' ||
|| {{{tar}}} || 29915 || - || {{{tar}}}(1) || the standard archiver (de facto standard) || {{{.tar}}} ||
|| {{{cpio}}} || 15940 || - || {{{cpio}}}(1) || Unix System V style archiver, use with {{{find}}} command || {{{.cpio}}} ||
|| {{{binutils}}} || 15167 || - || {{{ar}}}(1) || archiver for the creation of static libraries || {{{.ar}}} ||
|| {{{fastjar}}} || 2307 || - || {{{fastjar}}}(1) || archiver for Java (zip like) || {{{.jar}}} ||
|| {{{pax}}} || 530 || - || {{{pax}}}(1) || new POSIX standard archiver, compromise between {{{tar}}} and {{{cpio}}} || {{{.pax}}} ||
|| {{{afio}}} || 308 || - || {{{afio}}}(1) || extended {{{cpio}}} with per-file compression etc. || {{{.afio}}} ||
|| {{{gzip}}} || 38002 || - || {{{gzip}}}(1), {{{zcat}}}(1), ... || GNU [http://en.wikipedia.org/wiki/LZ77_and_LZ78 LZ77] compression utility (de facto standard) || {{{.gz}}} ||
|| {{{bzip2}}} || 25807 || - || {{{bzip2}}}(1), {{{bzcat}}}(1), ... || [http://en.wikipedia.org/wiki/Burrows-Wheeler_transform Burrows-Wheeler block-sorting compression] utility with higher compression ratio than {{{gzip}}}(1) (slower than {{{gzip}}} with similar syntax) || {{{.bz2}}} ||
|| {{{lzma}}} || - || - || {{{lzma}}}(1) || [http://en.wikipedia.org/wiki/Lempel-Ziv-Markov_chain_algorithm LZMA] compression utility with higher compression ratio than {{{gzip}}}(1)  (slower than {{{gzip}}} with similar syntax) || {{{.lzma}}} ||
|| {{{p7zip}}} || - || - || {{{7zr}}}(1), {{{p7zip}}}(1) || [http://en.wikipedia.org/wiki/7-Zip 7-Zip] file archiver with high compression ratio ([http://en.wikipedia.org/wiki/Lempel-Ziv-Markov_chain_algorithm LZMA] compression) || {{{.7z}}} ||
|| {{{p7zip-full}}} || - || - || {{{7z}}}(1), {{{7za}}}(1) || [http://en.wikipedia.org/wiki/7-Zip 7-Zip] file archiver with high compression ratio ([http://en.wikipedia.org/wiki/Lempel-Ziv-Markov_chain_algorithm LZMA] compression and others) || {{{.7z}}} ||
|| {{{lzop}}} || - || - || {{{lzop}}}(1) || [http://en.wikipedia.org/wiki/Lempel-Ziv-Oberhumer LZO] compression utility with higher compression and decompression speed than {{{gzip}}}(1) (lower compression ratio than {{{gzip}}} with similar syntax) || {{{.lzo}}} ||
|| {{{zip}}} || - || - || {{{zip}}}(1) || [http://en.wikipedia.org/wiki/Info-ZIP InfoZIP]: DOS archive and compression tool || {{{.zip}}} ||
|| {{{unzip}}} || - || - || {{{unzip}}}(1) ||  [http://en.wikipedia.org/wiki/Info-ZIP InfoZIP]: DOS unarchive and decompression tool || {{{.zip}}} ||

(!) The gzipped {{{.tar}}} archive sometimes uses the file extension {{{.tgz}}}.

(!) The {{{cp}}}, {{{scp}}} and {{{tar}}} commands may have some limitations for special files.  The {{{cpio}}} and {{{afio}}} commands are the most versatile.

(!) The {{{cpio}}} and {{{afio}}} commands are designed to be used with {{{find}}} and other commands, and are suitable for creating backup scripts since the file selection part of the script can be tested independently.

(!) {{{afio}}} compresses each file in the archive individually.  This makes {{{afio}}} much safer against file corruption than the globally compressed {{{tar}}} or {{{cpio}}} archives, and makes it '''the best archive engine''' for the backup script.

(!) The internal structure of OpenOffice data files is that of a {{{.jar}}} file.

=== Copy and synchronization tools ===

Here is a summary of simple copy and backup tools available on the Debian system:

|| List of copy and synchronization tools. || 1 || 2 || 3 || ||
|| '''package''' || '''popcon''' || '''size''' || '''tool''' || '''function''' ||
|| {{{coreutils}}} || 37945 || - || GNU cp || Locally copy files and directories ("-a" for recursive). ||
|| {{{openssh-client}}} || 29037 || - || scp || Remotely copy files and directories (client). "-r" for recursive. ||
|| {{{openssh-server}}} || 22918 || - || sshd || Remotely copy files and directories (remote server). ||
|| {{{rsync}}} || 6383 || - || Rsync || 1-way remote synchronization and backup. ||
|| {{{unison}}} || 634 || - || Unison || 2-way remote synchronization and backup. ||
|| {{{pdumpfs}}} || 51 || - || pdumpfs || Daily local backup using hardlinks, similar to Plan9's {{{dumpfs}}}. ||

##|| Remote VCS || {{{cvs}}}, {{{subversion}}}, ... || 4265, 5276, ... || - || multi-way remote synchronization and backup (server-based). ||
##|| Distributed VCS || {{{git-core}}}, ... || 512, ... || - || multi-way  remote and local synchronization and backup (distributed). ||

{i} Running the {{{bkup}}} script mentioned in @{@acopyscriptforthedatabackup@}@ with the "{{{-gl}}}" option under {{{cron}}}(8) should provide functionality very similar to {{{pdumpfs}}} for the static data archive.

{i} Version control system (VCS) tools in @{@listofversioncontrolsystemtools@}@ can function as the multi-way copy and synchronization tools.

=== Idioms for the archive ===

Here are several ways to archive and unarchive the entire contents of the directory {{{/source}}}.

With GNU {{{tar}}}:
{{{
$ tar cvzf archive.tar.gz /source
$ tar xvzf archive.tar.gz
}}}

With {{{cpio}}}:
{{{
$ find /source -xdev -print0 | cpio -ov --null > archive.cpio; gzip archive.cpio
$ zcat archive.cpio.gz | cpio -i
}}}

With {{{afio}}}:
{{{
$ find /source -xdev -print0 | afio -ovZ0 archive.afio
$ afio -ivZ archive.afio
}}}
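Before relying on an archive, it may help to test and list it.  The following is a minimal sketch of that idea using GNU {{{tar}}} and {{{gzip}}} on a throwaway tree.  The directory names under the temporary directory are illustrative assumptions, and "{{{-C}}}" is used so that nothing is written outside of it.

```shell
#!/bin/sh
set -eu
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small sample tree standing in for /source (hypothetical content).
mkdir -p "$work/source/sub"
echo "hello" > "$work/source/a.txt"
echo "world" > "$work/source/sub/b.txt"

# Create the archive with paths relative to $work (-C changes directory first).
tar czf "$work/archive.tar.gz" -C "$work" source

# Check the compressed stream and list the members without extracting.
gzip -t "$work/archive.tar.gz"
listing=$(tar tzf "$work/archive.tar.gz")
echo "$listing"

# Extract into a separate directory and compare with the original tree.
mkdir "$work/restore"
tar xzf "$work/archive.tar.gz" -C "$work/restore"
diff -r "$work/source" "$work/restore/source" && echo "restore matches source"
```

Archiving with relative paths, as done here, makes it easier to restore into a scratch directory and compare before overwriting anything.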

=== Idioms for the copy ===

Here are several ways to copy the entire contents of the directory
 * from {{{/source}}} to {{{/dest}}} , and
 * from {{{/source}}} at local to {{{/dest}}} at {{{user@host.dom}}}.

With GNU {{{cp}}} and openSSH {{{scp}}}:
{{{
# cp -a /source /dest
# scp -pr /source user@host.dom:/dest
}}}

With GNU {{{tar}}}:
{{{
# (cd /source && tar cf - . ) | (cd /dest && tar xvfp - )
# (cd /source && tar cf - . ) | ssh user@host.dom '(cd /dest && tar xvfp - )'
}}}

With {{{cpio}}}:
{{{
# cd /source; find . -print0 | cpio -pvdm --null --sparse /dest
}}}

With {{{afio}}}:
{{{
# cd /source; find . -print0 | afio -pv0a /dest
}}}

The {{{scp}}} command can even copy files between remote hosts:
{{{
# scp -pr user1@host1.dom:/source user2@host2.dom:/dest
}}}
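The {{{tar}}} pipe idiom can be tried safely on a scratch tree first.  The sketch below assumes GNU {{{tar}}} and GNU {{{stat}}}, and uses made-up file names under a temporary directory.  It checks that a symbolic link and a non-default permission survive the copy.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Sample tree (hypothetical names) with a symlink and a non-default mode.
mkdir -p "$work/source/dir" "$work/dest"
echo "data" > "$work/source/dir/file.txt"
ln -s dir/file.txt "$work/source/link"
chmod 640 "$work/source/dir/file.txt"

# The tar pipe: pack on the left, unpack (keeping permissions) on the right.
(cd "$work/source" && tar cf - .) | (cd "$work/dest" && tar xpf -)

# Compare contents, then check that the link is still a link.
diff -r "$work/source" "$work/dest"
[ -L "$work/dest/link" ] && echo "symlink preserved"
mode=$(stat -c %a "$work/dest/dir/file.txt")   # GNU stat: octal permission bits
echo "mode of copied file: $mode"
```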

=== Idioms for the selection of files ===

The {{{find}}}(1) command is used to select files for archive and copy commands (see @{@idiomsforthearchive@}@ and @{@idiomsforthecopy@}@) or for the {{{xargs}}}(1) command (see @{@repeatingacommandloopingoverfiles@}@).  This can be enhanced by using its command arguments.

Basic syntax of {{{find}}}(1) can be summarized as:
 * Its conditional arguments are evaluated from left to right.
 * This evaluation stops once its outcome is determined.
 * "Logical '''OR'''" (specified by "{{{-o}}}" between conditionals) has lower precedence than "logical '''AND'''" (specified by "{{{-a}}}" or nothing between conditionals).
 * "Logical '''NOT'''" (specified by "{{{!}}}" before a conditional) has higher precedence than "logical '''AND'''".
 * "{{{-prune}}}" always returns logical '''TRUE''' and, if the current file is a directory, {{{find}}} does not descend into it.
 * "{{{-name}}}" matches the base of the filename with shell glob (see @{@shellglob@}@) but it also matches its initial "." with metacharacters such as "{{{*}}}" and "{{{?}}}". (New [http://en.wikipedia.org/wiki/POSIX POSIX] feature)
 * "{{{-regex}}}" matches the full path with emacs style '''BRE''' (see @{@regularexpressions@}@) as default.
 * "{{{-size}}}" matches the file based on the file size (value preceded by "{{{+}}}" for larger, preceded by "{{{-}}}" for smaller).
 * "{{{-newer}}}" matches the file newer than the one specified in its argument.
 * "{{{-print0}}}" always returns logical '''TRUE''' and prints the full filename (null-terminated) on the standard output.

This {{{find}}}(1) command is often used with an idiomatic style.  For example:
{{{
# find /path/to \
    -xdev -regextype posix-extended \
    -type f -regex ".*\.afio|.*~" -prune -o \
    -type d -regex ".*/\.git" -prune -o \
    -type f -size +99M -prune -o \
    -type f -newer /path/to/timestamp -print0
}}}

This means to do the following actions:
 * search all files starting from "{{{/path/to}}}",
 * globally limit the search to the starting filesystem and use '''ERE''' (see @{@regularexpressions@}@) instead of the default,
 * exclude files matching the regex "{{{.*\.afio}}}" or "{{{.*~}}}" from the search by stopping processing,
 * exclude directories matching the regex "{{{.*/\.git}}}" from the search by stopping processing,
 * exclude files larger than 99 Megabytes (units of 1048576 bytes) from the search by stopping processing, and
 * print filenames which satisfy the above search conditions and are newer than "{{{/path/to/timestamp}}}".

Please note the idiomatic use of "{{{-prune -o}}}" to exclude files in the above example.

(!) On non-Debian Unix-like systems, some options may not be supported by {{{find}}}(1). In such a case, consider adjusting the matching methods and replacing "{{{-print0}}}" with "{{{-print}}}".  You may need to adjust related commands too.
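The "{{{-prune -o}}}" idiom is easiest to understand by running it on a small tree.  The following sketch builds a hypothetical project directory and uses "{{{-name}}}" instead of "{{{-regex}}}" so that it does not depend on GNU-only options.  The file names are assumptions made for illustration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical tree: a VCS directory, an editor backup file, two real files.
mkdir -p "$work/proj/.git" "$work/proj/src"
echo x > "$work/proj/.git/config"
echo x > "$work/proj/src/main.c"
echo x > "$work/proj/src/main.c~"
echo x > "$work/proj/notes.txt"

# Each "test -prune -o" group drops matching entries; since -o has lower
# precedence than the implicit AND, only what survives every group reaches
# the final -print.
selected=$(cd "$work/proj" && find . \
    -type d -name ".git" -prune -o \
    -type f -name "*~" -prune -o \
    -type f -print | sort)
echo "$selected"
```

Only {{{./notes.txt}}} and {{{./src/main.c}}} are listed: the {{{.git}}} directory is never entered, and the backup file is dropped by the second group.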

=== Backup and recovery ===

We all know that computers fail sometimes and that human errors cause system and data damage.  Backup and recovery operations are an essential part of successful system administration.  All possible failure modes will hit you some day.

There are 3 key factors which determine actual backup and recovery policy:
 1. Knowing what to backup and recover.
  * Data files directly created by you: data in {{{$HOME/}}}
  * Data files created by applications used by you: data in {{{/var/}}} (except {{{/var/cache/}}}, {{{/var/run/}}}, and {{{/var/tmp/}}}).
  * System configuration files: data in {{{/etc/}}}
  * Local software: data in {{{/usr/local/}}} or {{{/opt/}}}
  * System installation information: a memo in plain text on key steps (partition, ...).
  * Proven set of data: experimenting with recovery operations in advance.
 2. Knowing how to backup and recover.
  * Secure storage of data: protection from overwrite and system failure.
  * Frequent backup: scheduled backup.
  * Redundant backup: data mirroring.
  * Foolproof process: easy single command backup.
 3. Assessing risks and costs involved.
  * Failure modes and their likelihood.
  * Value of data when lost.
  * Required resources for backup: human, hardware, software, ...

For secure storage, data should be kept at least on a different disk partition, and preferably on different disks and machines, to withstand filesystem corruption.  Important data are best stored on write-once media such as CD/DVD-R to prevent overwrite accidents.  (See @{@thebinarydata@}@ for how to write to the storage media from the shell command line.  The Gnome desktop GUI environment gives you easy access via the menu: "Places->CD/DVD Creator".)

(!) You may wish to stop some application daemons such as MTA (see @{@mta@}@) while backing up data.

(!) You should take extra care with the backup and restoration of identity related data files such as {{{/etc/ssh/ssh_host_dsa_key}}}, {{{/etc/ssh/ssh_host_rsa_key}}}, {{{$HOME/.gnupg/*}}}, {{{$HOME/.ssh/*}}}, {{{/etc/passwd}}}, {{{/etc/shadow}}}, {{{/etc/fetchmailrc}}}, {{{/etc/popularity-contest.conf}}}, {{{/etc/ppp/pap-secrets}}}, and {{{/etc/exim4/passwd.client}}}.  Some of these data cannot be regenerated by entering the same input string to the system.

(!) If you run a cron job as a user process, you need to restart it after the system restoration.  See @{@scheduletasksregularly@}@ for {{{cron}}}(8) and {{{crontab}}}(1).

=== Backup utility suites ===

Here is a select list of notable backup utility suites available on the Debian system:

|| List of backup suite utilities. || 1 || 2 || 3 ||
|| '''package''' || '''popcon''' || '''size''' || '''description''' ||
|| {{{sbackup}}} || - || - || Simple Backup Suite for Gnome desktop ||
|| {{{keep}}} || - || - || backup system for KDE ||
|| {{{rdiff-backup}}} || - || - || remote incremental backup ||
|| {{{backupninja}}} || - || - || lightweight, extensible '''meta-backup''' system ||
|| {{{mondo}}} || - || - || [http://en.wikipedia.org/wiki/Mondo_Rescue Mondo Rescue]: disaster recovery backup suite ||
|| {{{bacula-common}}} || - || - || [http://en.wikipedia.org/wiki/Bacula Bacula]: network backup, recovery and verification - common support files ||
|| {{{bacula-client}}} || - || - || [http://en.wikipedia.org/wiki/Bacula Bacula]: network backup, recovery and verification - client meta-package ||
|| {{{bacula-console}}} || - || - || [http://en.wikipedia.org/wiki/Bacula Bacula]: network backup, recovery and verification - text console ||
|| {{{bacula-server}}} || - || - || [http://en.wikipedia.org/wiki/Bacula Bacula]: network backup, recovery and verification - server meta-package ||
|| {{{amanda-common}}} || - || - || [http://en.wikipedia.org/wiki/Advanced_Maryland_Automatic_Network_Disk_Archiver Amanda]: Advanced Maryland Automatic Network Disk Archiver (Libs) ||
|| {{{amanda-client}}} || - || - || [http://en.wikipedia.org/wiki/Advanced_Maryland_Automatic_Network_Disk_Archiver Amanda]: Advanced Maryland Automatic Network Disk Archiver (Client) ||
|| {{{amanda-server}}} || - || - || [http://en.wikipedia.org/wiki/Advanced_Maryland_Automatic_Network_Disk_Archiver Amanda]: Advanced Maryland Automatic Network Disk Archiver (Server) ||
|| {{{cdrw-taper}}} || - || - || taper replacement for [http://en.wikipedia.org/wiki/Advanced_Maryland_Automatic_Network_Disk_Archiver Amanda] to support backups to CD-RW or DVD+RW ||
|| {{{backuppc}}} || - || - || [http://en.wikipedia.org/wiki/Backuppc BackupPC] is a high-performance, enterprise-grade system for backing up PCs (disk based) ||
|| {{{backup-manager}}} || - || - || command-line backup tool ||
|| {{{backup2l}}} || - || - || low-maintenance backup/restore tool for mountable media (disk based) ||
|| {{{faubackup}}} || - || - || backup system using a filesystem for storage (disk based) ||

The {{{sbackup}}} and {{{keep}}} packages provide easy GUI access to regular backups of user data for desktop users. An equivalent function can be realized by a simple script (@{@anexamplescriptforthesystembackup@}@) and {{{cron}}}(8).

[http://en.wikipedia.org/wiki/Mondo_Rescue Mondo Rescue] facilitates restoration of a complete system from backup CD/DVD etc. without going through the normal system installation process.

[http://en.wikipedia.org/wiki/Bacula Bacula], [http://en.wikipedia.org/wiki/Advanced_Maryland_Automatic_Network_Disk_Archiver Amanda], and [http://en.wikipedia.org/wiki/Backuppc BackupPC] are full featured backup utility suites which focus on regular backups over the network.

=== An example script for the system backup ===

For a personal Debian desktop system running the {{{unstable}}} suite, I only need to protect personal and critical data.  I reinstall the system once a year anyway.  Thus I see no reason to back up the whole system or to install a full featured backup utility.

I use a simple script to make a backup archive and burn it onto CD/DVD using a GUI.  Here is an example script for this.
{{{
#!/bin/sh -e
# Copyright (C) 2007-2008 Osamu Aoki <osamu@debian.org>, Public Domain
BUUID=1000; USER=osamu # UID and name of a user who accesses backup files
BUDIR="/var/backups"
XDIR0=".+/Mail|.+/Desktop"
XDIR1=".+/\.thumbnails|.+/\.?Trash|.+/\.?[cC]ache|.+/\.gvfs|.+/sessions"
XDIR2=".+/CVS|.+/\.git|.+/\.svn|.+/Downloads|.+/Archive|.+/Checkout|.+/tmp"
XSFX=".+\.iso|.+\.tgz|.+\.tar\.gz|.+\.tar\.bz2|.+\.afio|.+\.tmp|.+\.swp|.+~"
SIZE="+99M"
DATE=$(date --utc +"%Y%m%d-%H%M")
[ -d "$BUDIR" ] || mkdir -p "$BUDIR"
umask 077
dpkg --get-selections \* > /var/lib/dpkg/dpkg-selections.list
debconf-get-selections > /var/cache/debconf/debconf-selections

{
find /etc /usr/local /opt /var/lib/dpkg/dpkg-selections.list \
     /var/cache/debconf/debconf-selections -xdev -print0
find /home/$USER /root -xdev -regextype posix-extended \
  -type d -regex "$XDIR0|$XDIR1" -prune -o -type f -regex "$XSFX" -prune -o \
  -type f -size  "$SIZE" -prune -o -print0
find /home/$USER/Mail/Inbox /home/$USER/Mail/Outbox -print0
find /home/$USER/Desktop  -xdev -regextype posix-extended \
  -type d -regex "$XDIR2" -prune -o -type f -regex "$XSFX" -prune -o \
  -type f -size  "$SIZE" -prune -o -print0
} | cpio -ov --null -O $BUDIR/BU$DATE.cpio
chown $BUUID $BUDIR/BU$DATE.cpio
touch $BUDIR/backup.stamp
}}}

This is meant to be an example script executed as root:
 * Edit this script to cover all your important data (see @{@idiomsfortheselectionoffiles@}@ and @{@backupandrecovery@}@).
 * Replace "{{{find ... -print0}}}" with "{{{find ... -newer $BUDIR/backup.stamp -print0}}}" to make a differential backup.
 * Transfer backup files to the remote host using {{{scp}}}(1) or {{{rsync}}}(1) or burn them to CD/DVD for extra data security.  (I use Gnome desktop GUI for burning CD/DVD. See @{@shellscriptexamplewithzenity@}@ for extra redundancy.)
 * Keep it simple!
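The differential backup idea (selecting only files newer than a stamp file) can be tried on scratch data as follows.  This is a sketch under assumptions: it uses GNU {{{tar}}} with "{{{--null -T -}}}" as a stand-in for {{{cpio}}}, and the file names and dates are invented for the demonstration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/data"
echo old > "$work/data/old.txt"
echo new > "$work/data/new.txt"

# Give the files and the stamp explicit modification times so the
# comparison does not depend on how fast this script runs.
touch -t 202001010000 "$work/data/old.txt"
touch -t 202001020000 "$work/backup.stamp"    # time of the previous backup
touch -t 202001030000 "$work/data/new.txt"

# Select only regular files modified after the stamp and archive them.
# "--null -T -" (GNU tar) reads the null-terminated names from stdin.
(cd "$work" && find data -type f -newer backup.stamp -print0 |
    tar --null -T - -czf diff.tar.gz)

members=$(tar tzf "$work/diff.tar.gz")
echo "$members"
```

Only {{{data/new.txt}}} ends up in the archive.  In a real script the stamp file would be touched after each successful backup, as the example script does with {{{backup.stamp}}}.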

{i} You can recover debconf configuration data with "{{{debconf-set-selections debconf-selections}}}" and dpkg selection data with "{{{dpkg --set-selections <dpkg-selections.list}}}".

=== A copy script for the data backup  ===

For the set of data under a directory tree, the copy with "{{{cp -a}}}" provides the normal backup.

For the set of large non-overwritten static data under a directory tree such as the data under the {{{/var/cache/apt/archives/}}} directory, hardlinks with "{{{cp -al}}}" provide an alternative to the normal backup with efficient use of the disk space.
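The behaviour of a hardlink snapshot can be observed with a small experiment.  The sketch below assumes GNU {{{cp}}} and GNU {{{stat}}}.  The directory and file names are hypothetical stand-ins for a package cache.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/cache"
echo "package payload" > "$work/cache/pkg.deb"

# -a preserves attributes, -l makes hardlinks instead of copying file data.
cp -al "$work/cache" "$work/cache.snapshot"

# Both names should report the same inode number and a link count of 2.
inode_orig=$(ls -i "$work/cache/pkg.deb" | awk '{print $1}')
inode_snap=$(ls -i "$work/cache.snapshot/pkg.deb" | awk '{print $1}')
links=$(stat -c %h "$work/cache/pkg.deb")      # GNU stat: hardlink count
echo "inodes: $inode_orig $inode_snap, links: $links"

# Replacing the original (remove + recreate) leaves the snapshot untouched;
# editing it in place would change the data seen under both names.
rm "$work/cache/pkg.deb"
echo "newer payload" > "$work/cache/pkg.deb"
cat "$work/cache.snapshot/pkg.deb"
```

This is why the technique suits data that is replaced rather than overwritten: the snapshot shares disk blocks with the original until a file is removed and recreated.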

Here is a copy script, which I named {{{bkup}}}, for the data backup. This script copies all (non-VCS) files under the current directory to a dated directory in the parent directory or on a remote host.

{{{
#!/bin/sh -e
# Copyright (C) 2007-2008 Osamu Aoki <osamu@debian.org>, Public Domain
fdot(){ find . -type d \( -iname ".?*" -o -iname "CVS" \) -prune -o -print0;}
fall(){ find . -print0;}
mkdircd(){ mkdir -p "$1";chmod 700 "$1";cd "$1">/dev/null;}
FIND="fdot";OPT="-a";MODE="CPIOP";HOST="localhost";EXTP="$(hostname -f)"
BKUP="$(basename $(pwd)).bkup";TIME="$(date  +%Y%m%d-%H%M%S)";BU="$BKUP/$TIME"
while getopts gcCsStrlLaAxe:h:T f; do case $f in
g)  MODE="GNUCP";; # cp (GNU)
c)  MODE="CPIOP";; # cpio -p
C)  MODE="CPIOI";; # cpio -i
s)  MODE="CPIOSSH";; # cpio/ssh
S)  MODE="AFIOSSH";; # afio/ssh
t)  MODE="TARSSH";; # tar/ssh
r)  MODE="RSYNCSSH";; # rsync/ssh
l)  OPT="-alv";; # hardlink (GNU cp)
L)  OPT="-av";;  # copy (GNU cp)
a)  FIND="fall";; # find all
A)  FIND="fdot";; # find non CVS/ .???/
x)  set -x;; # trace
e)  EXTP="${OPTARG}";; # hostname -f
h)  HOST="${OPTARG}";; # user@remotehost.example.com
T)  MODE="TEST";; # test find mode
\?) echo "use -x for trace."
esac; done
shift $(expr $OPTIND - 1)
if [ $# -gt 0 ]; then
  for x in "$@"; do cp $OPT "$x" "$x.$TIME"; done
elif [ $MODE = GNUCP ]; then
  mkdir -p "../$BU";chmod 700 "../$BU";cp $OPT . "../$BU/"
elif [ $MODE = CPIOP ]; then
  mkdir -p "../$BU";chmod 700 "../$BU"
  $FIND|cpio --null --sparse -pvd ../$BU
elif [ $MODE = CPIOI ]; then
  $FIND|cpio -ov --null | ( mkdircd "../$BU"&&cpio -i )
elif [ $MODE = CPIOSSH ]; then
  $FIND|cpio -ov --null|ssh -C $HOST "( mkdircd \"$EXTP/$BU\"&&cpio -i )"
elif [ $MODE = AFIOSSH ]; then
  $FIND|afio -ov -0 -|ssh -C $HOST "( mkdircd \"$EXTP/$BU\"&&afio -i - )"
elif [ $MODE = TARSSH ]; then
  (tar cvf - . )|ssh -C $HOST "( mkdircd \"$EXTP/$BU\"&& tar xvfp - )"
elif [ $MODE = RSYNCSSH ]; then
  rsync -rlpt ./ "${HOST}:${EXTP}-${BKUP}-${TIME}"
else
  echo "Any other idea to backup?"
  $FIND |xargs -0 -n 1 echo
fi
}}}

These are meant to be command examples.  Please read the script and test it yourself.

{i} I keep this {{{bkup}}} in my {{{/usr/local/bin/}}} directory.  I issue the {{{bkup}}} command without any option in the working directory whenever I need a temporary snapshot backup.

{i} For making snapshot history of a source file tree or a configuration file tree, it is easier and space efficient to use {{{git}}}(7) (see @{@gitforrecordingcigurationhistory@}@).

=== Removable mass storage device ===

Removable mass storage devices may be any one of
 * harddisk,
 * any format of flash memory devices, or
 * digital camera,
which are connected via [http://en.wikipedia.org/wiki/Universal_Serial_Bus USB], [http://en.wikipedia.org/wiki/IEEE_1394_interface IEEE 1394 / Firewire], [http://en.wikipedia.org/wiki/PC_card PC Card], etc.

These removable mass storage devices can be automatically mounted as a user under a modern desktop environment, such as Gnome using {{{gnome-mount}}}(1).
 * The mount point under Gnome is chosen as {{{/media/<disk_label>}}}, which can be customized
  * by the {{{mlabel}}}(1) command for the FAT filesystem,
  * by the {{{genisoimage}}}(1) command with the "{{{-V}}}" option for the ISO9660 filesystem, and
  * by the {{{tune2fs}}}(8) command with the "{{{-L}}}" option for the ext2/ext3 filesystem.
 * The choice of encoding may need to be provided as a mount option (see @{@filenameencoding@}@).
 * The ownership of the mounted filesystem may need to be adjusted for use by the normal user.

{i} If a wrong mount option causes a problem, erase its corresponding setting under {{{/system/storage/}}} via {{{gconf-editor}}}(1).

(!) Automounting under a modern desktop environment happens only when those removable media devices are not listed in {{{/etc/fstab}}}.

|| List of packages which permit normal users to mount removable devices without a matching {{{/etc/fstab}}} entry. || 1 || 2 || 3 ||
|| '''package''' || '''popcon''' || '''size''' || '''description''' ||
|| {{{gnome-mount}}} || - || - || wrapper for (un)mounting and ejecting storage devices (used by Gnome) ||
|| {{{pmount}}} || - || - || mount removable devices as normal user (used by KDE) ||
|| {{{cryptmount}}} || - || - || Management and user-mode mounting of encrypted file systems ||
|| {{{usbmount}}} || - || - || automatically mount and unmount USB mass storage devices ||

When sharing data with another system via a removable mass storage device, you should format it with a common [http://en.wikipedia.org/wiki/File_system filesystem] supported by both systems. Here is a list of filesystem choices.

|| List of filesystem choices for removable storage devices with typical usage scenarios. || ||
|| '''filesystem''' || '''typical usage scenario''' ||
|| [http://en.wikipedia.org/wiki/File_allocation_table FAT12] || Cross platform sharing of data on the floppy disk. (<=32MiB) ||
|| [http://en.wikipedia.org/wiki/File_allocation_table FAT16] || Cross platform sharing of data on the small harddisk like device. (<=2GiB) ||
|| [http://en.wikipedia.org/wiki/File_allocation_table FAT32] || Cross platform sharing of data on the large harddisk like device. (<=8TiB, supported by MS Windows95 OSR2 and newer) ||
|| [http://en.wikipedia.org/wiki/NTFS NTFS] || Cross platform sharing of data on the large harddisk like device. (supported natively on [http://en.wikipedia.org/wiki/Windows_NT MS Windows NT] and later version, and supported by [http://en.wikipedia.org/wiki/NTFS-3G NTFS-3G] via [http://en.wikipedia.org/wiki/Filesystem_in_Userspace FUSE] on Linux) ||
|| [http://en.wikipedia.org/wiki/ISO_9660 ISO9660] || Cross platform sharing of static data on CD-R and DVD+/-R ||
|| [http://en.wikipedia.org/wiki/Universal_Disk_Format UDF] || Incremental data writing on CD-R and DVD+/-R (new) ||
|| [http://en.wikipedia.org/wiki/Minix_file_system MINIX filesystem] || Space efficient unix file data storage on the floppy disk. ||
|| [http://en.wikipedia.org/wiki/Ext2 ext2 filesystem] || Sharing of data on the harddisk like device with older Linux systems. ||
|| [http://en.wikipedia.org/wiki/Ext3 ext3 filesystem] || Sharing of data on the harddisk like device with current Linux systems. (Journaling file system) ||

{i} See @{@removablediskencnwithdmcryptluks@}@ for cross platform sharing of data using device level encryption.

The FAT filesystem is supported by almost all modern operating systems and is quite useful for data exchange via removable harddisk like media.

When formatting removable harddisk like devices for cross platform sharing of data with the FAT filesystem, the following should be safe choices:
 * Partitioning them with {{{fdisk}}}, {{{cfdisk}}} or {{{parted}}} command (see @{@partitionconfiguration@}@) into a single primary partition and to mark it as:
  * type-"6" for FAT16 for media smaller than 2GB or
  * type-"c" for FAT32 (LBA) for larger media.
 * Formatting the primary partition with the {{{mkfs.vfat}}} command
  * with just its device name, e.g. "{{{/dev/sda1}}}" for FAT16, or
  * with the explicit option and its device name, e.g. "{{{-F 32 /dev/sda1}}}" for FAT32.

When using the FAT or ISO9660 filesystems for sharing data, the following precautions should be safe:
 * Archive files into an archive file first using the {{{tar}}}(1), {{{cpio}}}(1), or {{{afio}}}(1) command to retain long filenames, symbolic links, the original Unix file permissions, and the owner information.
 * Split the archive file into chunks smaller than 2 GiB with the "{{{split}}}(1)" command to protect it from the file size limitation.
 * Encrypt the archive file to secure its contents from unauthorized access.
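
The archive-and-split precautions above can be sketched as a small round trip. This is only a minimal sketch under stated assumptions: the directory name {{{demo_tree}}} and the chunk prefix {{{backup.tar.part-}}} are made up for illustration, the 4 KiB chunk size stands in for the {{{2000m}}} you would use on real media, and the encryption step is left out.

```shell
# Sketch: archive a tree with tar, split it into chunks, then reassemble.
# All names (demo_tree, backup.tar.part-) are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small tree with a long file name and a symbolic link.
mkdir demo_tree
echo "hello" > "demo_tree/a file with a long name.txt"
ln -s "a file with a long name.txt" demo_tree/link

# tar records names, links, permissions, and ownership inside the archive.
tar -cf backup.tar demo_tree

# Use "-b 2000m" for real media; 4k keeps this demonstration small.
split -b 4k backup.tar backup.tar.part-

# Reassemble on the receiving side and compare with the original archive.
cat backup.tar.part-* > restored.tar
if cmp -s backup.tar restored.tar; then roundtrip=ok; else roundtrip=failed; fi

# The symbolic link target survives inside the archive listing.
link_target=$(tar -tvf restored.tar | sed -n 's/.* -> //p')
echo "roundtrip: $roundtrip, link target: $link_target"

cd / && rm -rf "$workdir"
```

Only the chunk files need to fit within the limits of the target filesystem; the receiving side needs nothing more than {{{cat}}} and {{{tar}}}.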

(!) By design, the maximum file size on FAT filesystems is {{{(2^32 - 1) bytes = (4GiB - 1 byte)}}}. For some applications on older 32 bit OSs, the maximum file size was even smaller: {{{(2^31 - 1) bytes = (2GiB - 1 byte)}}}.  Debian does not suffer from the latter problem.
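
As a quick check of these limits, the byte counts can be computed with shell arithmetic. This sketch assumes a shell with 64 bit integer arithmetic; the variable names are illustrative only.

```shell
# Sketch: the two file size limits in bytes.
fat_max=$(( (1 << 32) - 1 ))      # FAT design limit: 4 GiB - 1 byte
old_app_max=$(( (1 << 31) - 1 ))  # signed 32 bit offset limit: 2 GiB - 1 byte
echo "FAT limit:        $fat_max bytes"
echo "old 32 bit limit: $old_app_max bytes"
```

A chunk size of {{{2000m}}} for {{{split}}}(1) stays below both values.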

(!) Microsoft itself does not recommend using FAT for drives or partitions of over 200 MB.  Microsoft highlights its shortcomings, such as inefficient disk space usage, in their "[http://support.microsoft.com/kb/100108/EN-US/ Overview of FAT, HPFS, and NTFS File Systems]".  Of course, on Linux we should normally use the ext3 filesystem.

{i} For more on filesystems and accessing filesystems, please read "[http://tldp.org/HOWTO/Filesystems-HOWTO.html Filesystems HOWTO]".

=== Sharing data via network ===

When sharing data with another system via the network, you should use a common service. Here are some hints.

|| List of network services to choose from, with typical usage scenarios. || ||
|| '''network service''' || '''typical usage scenario''' ||
|| [http://en.wikipedia.org/wiki/Server_Message_Block SMB/CIFS] network mounted filesystem with [http://en.wikipedia.org/wiki/Samba_(software) Samba] || Sharing files via "Microsoft Windows Network". See {{{smb.conf}}}(5) and [http://www.samba.org/samba/docs/man/Samba-HOWTO-Collection/ The Official Samba 3.2.x HOWTO and Reference Guide] or the {{{samba-doc}}} package. ||
|| [http://en.wikipedia.org/wiki/Network_File_System_(protocol) NFS] network mounted filesystem with the Linux kernel || Sharing files via "Unix/Linux Network". See {{{exports}}}(5) and [http://tldp.org/HOWTO/NFS-HOWTO/index.html Linux NFS-HOWTO]. ||
|| [http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol HTTP] service || Sharing files between the web server/client. ||
|| [http://en.wikipedia.org/wiki/Https HTTPS] service || Sharing files between the web server/client with encrypted Secure Sockets Layer (SSL) or [http://en.wikipedia.org/wiki/Transport_Layer_Security Transport Layer Security] (TLS). ||
|| [http://en.wikipedia.org/wiki/File_Transfer_Protocol FTP] service || Sharing files between the FTP server/client. ||

Although these network-mounted filesystems and network file transfer methods are quite convenient for sharing data, they may be insecure.  Their network connection must be secured by:
 * encrypting it with [http://en.wikipedia.org/wiki/Transport_Layer_Security SSL/TLS],
 * tunneling it via [http://en.wikipedia.org/wiki/Secure_Shell SSH],
 * tunneling it via [http://en.wikipedia.org/wiki/Virtual_private_network VPN] or,
 * limiting it behind the secure firewall. 
See also @{@othernetworkapplicationservers@}@ and @{@othernetworkapplicationclients@}@.

=== Archive media ===

When choosing [http://en.wikipedia.org/wiki/Computer_data_storage computer data storage media] for archiving important data, you should be careful about their limitations.  For small personal data backups, I use CD-R and DVD-R media from a brand name company and store them in a cool, dry, clean environment.  (Tape archive media seem to be popular for professional use.)

(!) [http://en.wikipedia.org/wiki/Safe A fire-resistant safe] is usually meant for paper documents.  Most computer data storage media have less temperature tolerance than paper. I usually rely on multiple securely encrypted copies stored in multiple secure locations.

Optimistic storage life of archive media seen on the net (mostly from vendor info):
 *  100+ years : acid free paper with ink
 *  100  years : optical storage  (CD/DVD, CD/DVD-R)
 *   30  years : magnetic storage (tape, floppy)
 *   20  years : phase change optical storage (CD-RW)
These figures do not account for mechanical failures due to handling etc.

Optimistic write cycle of archive media seen on the net (mostly from vendor info):
 *  250,000+ cycles : Harddisk drive
 *   10,000+ cycles : Flash memory
 *    1,000  cycles : CD/DVD-RW
 *        1  cycle  : CD/DVD-R, paper

<!> Figures of storage life and write cycle here should not be used for decisions on any critical data storage.   Please consult the specific product information provided by the manufacturer.

{i} Since CD/DVD-R and paper have only 1 write cycle, they inherently prevent accidental data loss by overwriting.  This is an advantage!

{i} If you need fast and frequent backup of a large amount of data, a harddisk on a remote host linked by a fast network connection may be the only realistic option.

== The binary data ==

Here, we discuss direct manipulation of the binary data on storage media.  See @{@datastoragetips@}@, too.

=== Make the disk image file ===

The disk image file, {{{disk.img}}}, of an unmounted device, e.g., the second SCSI drive {{{/dev/sdb}}}, can be made using {{{cp}}}(1) or {{{dd}}}(1):
{{{
# cp /dev/sdb disk.img
# dd if=/dev/sdb of=disk.img
}}}

The disk image of the traditional PC's [http://en.wikipedia.org/wiki/Master_boot_record master boot record (MBR)] (see @{@partitionconfiguration@}@), which resides on the first sector of the primary IDE disk, can be made in whole or in part by using {{{dd}}}(1):
{{{
# dd if=/dev/hda of=mbr.img bs=512 count=1
# dd if=/dev/hda of=mbr-nopart.img bs=446 count=1
# dd if=/dev/hda of=mbr-part.img skip=446 bs=1 count=66
}}}
 * {{{mbr.img}}} : the MBR with the partition table.
 * {{{mbr-nopart.img}}} : the MBR without the partition table.
 * {{{mbr-part.img}}} : the partition table of the MBR only.

If you have a SCSI device (including the new serial ATA drive) as the boot disk, substitute "{{{/dev/hda}}}" with "{{{/dev/sda}}}".

If you are making an image of a disk partition of the original disk, substitute "{{{/dev/hda}}}" with "{{{/dev/hda1}}}" etc.
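
The MBR extraction commands can be tried safely on an ordinary file before touching a real device. The following is a minimal sketch: the 8 zero-filled sectors are a hypothetical stand-in for a real disk, so only the resulting sizes are meaningful, not the contents.

```shell
# Sketch: run the MBR extraction against a dummy image instead of /dev/hda.
workdir=$(mktemp -d)
img="$workdir/disk.img"

# Hypothetical stand-in for a real disk: 8 zero-filled 512 byte sectors.
dd if=/dev/zero of="$img" bs=512 count=8 2>/dev/null

# Same dd invocations as for a real disk, with the image file as input.
dd if="$img" of="$workdir/mbr.img"        bs=512 count=1 2>/dev/null
dd if="$img" of="$workdir/mbr-nopart.img" bs=446 count=1 2>/dev/null
dd if="$img" of="$workdir/mbr-part.img"   bs=1 skip=446 count=66 2>/dev/null

# 512 = 446 (boot code) + 66 (remainder of the sector with the partition table).
mbr_size=$(( $(wc -c < "$workdir/mbr.img") ))
nopart_size=$(( $(wc -c < "$workdir/mbr-nopart.img") ))
part_size=$(( $(wc -c < "$workdir/mbr-part.img") ))
echo "$mbr_size = $nopart_size + $part_size"

rm -rf "$workdir"
```

Because no device node is involved, this runs without root privileges.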

=== Writing directly to the disk ===

The disk image file, {{{disk.img}}}, can be written to an unmounted device, e.g., the second SCSI drive {{{/dev/sdb}}} with matching size, by {{{dd}}}(1):
{{{
# dd if=disk.img of=/dev/sdb
}}}

Similarly, the disk partition image file, {{{disk.img}}}, can be written to an unmounted partition, e.g., the first partition of the second SCSI drive {{{/dev/sdb1}}} with matching size, by {{{dd}}}(1):
{{{
# dd if=disk.img of=/dev/sdb1
}}}

=== View and edit binary data ===

The most basic viewing method of binary data is to use "{{{od -t x1}}}" command.
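
As a small illustration, the following sketch dumps four known bytes. The "{{{-A x}}}" option, which prints offsets in hexadecimal, is an addition here and not part of the command quoted above.

```shell
# Sketch: dump the bytes of "ABC" plus a newline as hexadecimal values.
# The bytes appear as 41 42 43 0a after the offset column.
dump=$(printf 'ABC\n' | od -A x -t x1)
echo "$dump"
```

Each output line starts with the offset of its first byte, followed by one two-digit hexadecimal value per byte.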

|| List of packages which view and edit binary data. || 1 || 2 || 3 ||
|| '''package''' || '''popcon''' || '''size''' || '''description''' ||
|| {{{coreutils}}} || - || - || This basic package has {{{od}}}(1) command to dump files in octal and other formats. ||
|| {{{bsdmainutils}}} || - || - || This utility package has {{{hd}}}(1) command to dump files in ASCII, decimal, hexadecimal, and octal formats. ||
|| {{{hexedit}}} || - || - || View and edit files in hexadecimal or in ASCII ||
|| {{{bless}}} || - || - || Full featured hexadecimal editor (Gnome) ||
|| {{{khexedit}}} || - || - || Full featured hexadecimal editor (KDE). ||
|| {{{ncurses-hexedit}}} || - || - || Edit files/disks in HEX, ASCII and EBCDIC ||
|| {{{lde}}} || - || - || Linux Disk Editor ||
|| {{{beav}}} || - || - || Binary editor and viewer for HEX, ASCII, EBCDIC, OCTAL, DECIMAL, and BINARY formats. ||
|| {{{hexcat}}} || - || - || Hexadecimal dumping utility ||
|| {{{hex}}} || - || - || Hexadecimal dumping tool for Japanese ||

{i} HEX is used as an acronym for hexadecimal format.

=== Mount the disk image file ===

If {{{disk.img}}} contains an image of the whole disk contents and the first partition of the original disk starts at the byte offset xxxx = (bytes/sector) * (start sector of the partition, typically 63 on traditional PC disks), then the following will mount that partition to {{{/mnt}}}:

{{{
# mount -o loop,offset=xxxx disk.img /mnt
}}}

Note that most hard disks have 512 bytes/sector.  This offset is to skip the MBR of the hard disk. You can omit the offset option in the above example if {{{disk.img}}} contains
 * only an image of a disk partition of the original hard disk, or
 * only an image of the original floppy disk.
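
As a worked example of the offset, the following sketch assumes 512 byte sectors and a first partition starting at sector 63. Both numbers are assumptions for illustration; the real start sector should be read from the partition table, e.g. with "{{{fdisk -lu disk.img}}}".

```shell
# Sketch: compute the byte offset of the first partition inside a disk image.
sector_size=512    # bytes per sector (assumed)
start_sector=63    # start sector of the partition (assumed)
offset=$(( sector_size * start_sector ))

# Print the resulting mount command instead of running it (mounting needs root).
echo "mount -o loop,offset=$offset disk.img /mnt"
```

Printing the command first makes it easy to check the number before running it as root.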

=== Manipulating files without mounting disk ===

There are tools to read and write files without mounting the disk.

|| List of packages to manipulate files without mounting. || 1 || 2 || 3 ||
|| '''package''' || '''popcon''' || '''size''' || '''description''' ||
|| {{{mtools}}} || - || - || Utilities for MSDOS files without mounting them. ||
|| {{{hfsutils}}} || - || - || Utilities for HFS and HFS+ files without mounting them. ||

=== Make the ISO9660 image file ===

The [http://en.wikipedia.org/wiki/ISO_9660 ISO9660] image file, {{{cd.iso}}}, from the source directory tree at {{{source_directory}}} can be made using {{{genisoimage}}}(1) command:
{{{
#  genisoimage -r -J -T -V volume_id -o cd.iso source_directory
}}}

Similarly, the bootable ISO9660 image file, {{{cdboot.iso}}}, can be made from a {{{debian-installer}}}-like directory tree at {{{source_directory}}}:
{{{
#  genisoimage -r -o cdboot.iso -V volume_id \
   -b isolinux/isolinux.bin -c isolinux/boot.cat \
   -no-emul-boot -boot-load-size 4 -boot-info-table source_directory
}}}
Here [http://en.wikipedia.org/wiki/SYSLINUX Isolinux boot loader] (see @{@stagecthebootloader@}@) is used for booting.

Making the disk image directly from the CD-ROM device using {{{cp}}}(1) or {{{dd}}}(1) has a few problems. The first run of the {{{dd}}}(1) command may cause an error message and may yield a shorter disk image with a lost tail-end. The second run of the {{{dd}}}(1) command may yield a larger disk image with garbage data attached at the end on some systems if the data size is not specified. Only the second run of the {{{dd}}}(1) command with the correct data size specified, and without ejecting the CD after an error message, seems to avoid these problems. If, for example, the image size displayed by {{{df}}}(1) is 46301184 blocks, use the following command twice to get the right image (this is my empirical information):
{{{
# dd if=/dev/cdrom of=cd.iso bs=2048 count=$((46301184/2))
}}}

=== Writing directly to the CD/DVD-R/RW ===

{i} DVD is only a large CD to {{{wodim}}}(1).

You can find a usable device by:
{{{
# wodim --devices
}}}
Then insert a blank CD-R into the device and write the ISO9660 image file, {{{cd.iso}}}, to this device, e.g., {{{/dev/hda}}}, with {{{wodim}}}(1):
{{{
# wodim -v -eject dev=/dev/hda cd.iso
}}}

If CD-RW is used instead of CD-R, do this instead:
{{{
# wodim -v -eject blank=fast dev=/dev/hda cd.iso
}}}

{i} If your desktop system mounts the CD automatically, unmount it before issuing the {{{wodim}}}(1) command by "{{{sudo umount /dev/hda}}}".

=== Mount the ISO9660 image file ===

If {{{cd.iso}}} contains an ISO9660 image, then the following will manually mount it to {{{/cdrom}}}:

{{{
# mount -t iso9660 -o ro,loop cd.iso /cdrom
}}}

{i} Modern desktop system mounts removable media automatically (see @{@removablemassstoragedevice@}@).

=== Split a large file into small files ===

When a file is too big to back up as a single file, you can split it into, e.g., 2000MiB chunks for backup and merge those chunks back into the original file later.
{{{
$ split -b 2000m large_file
$ cat x* >large_file
}}}
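
A variation that avoids the default "{{{x}}}" prefix is sketched below. The prefix {{{large_file.part-}}} and the small sizes are assumptions chosen to keep the demonstration quick; for real backups use "{{{-b 2000m}}}".

```shell
# Sketch: split with an explicit prefix, merge, and verify the result.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 10 KiB of random data stands in for the large file.
dd if=/dev/urandom of=large_file bs=1k count=10 2>/dev/null

# The second argument is the prefix of the chunk names (part-aa, part-ab, ...).
split -b 4k large_file large_file.part-
pieces=$(( $(ls large_file.part-* | wc -l) ))

# The shell expands the glob in sorted order, so cat restores the byte order.
cat large_file.part-* > large_file.restored
if cmp -s large_file large_file.restored; then merged=identical; else merged=different; fi
echo "$pieces pieces, merged copy is $merged"

cd / && rm -rf "$workdir"
```

With a distinctive prefix, the glob used for merging cannot pick up unrelated files in the same directory.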

<!> Please make sure you do not have any other file starting with "{{{x}}}" to avoid a file name clash.

=== Clear file contents ===

In order to clear the contents of a file such as a log file, do not use {{{rm}}} to delete the file and then create a new empty file, because the file may still be accessed in the interval between commands.  The following is the safe way to clear the contents of the file.
{{{
$ :>file_to_be_cleared
}}}
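
The reason this is safe is that truncating keeps the same inode, so a process that already has the file open keeps writing to the same file. A minimal sketch follows; the file name {{{demo.log}}} is hypothetical.

```shell
# Sketch: truncation with ":>" keeps the inode, unlike rm plus re-creation.
workdir=$(mktemp -d)
log="$workdir/demo.log"
echo "old entry" > "$log"

inode_before=$(ls -i "$log" | awk '{print $1}')
: > "$log"                       # truncate in place
inode_after=$(ls -i "$log" | awk '{print $1}')
size_after=$(( $(wc -c < "$log") ))

echo "inode $inode_before -> $inode_after, size now $size_after"
rm -rf "$workdir"
```

The inode number is unchanged while the size drops to zero.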

=== Dummy files ===

The following commands will create dummy or empty files:
{{{
$ dd if=/dev/zero    of=5kb.file bs=1k count=5
$ dd if=/dev/urandom of=7mb.file bs=1M count=7
$ touch zero.file
$ : > alwayszero.file
}}}
 * {{{5kb.file}}} is 5KB of zeros.
 * {{{7mb.file}}} is 7MB of random data.
 * {{{zero.file}}} is 0 byte file (if file exists, the file contents are kept while updating mtime.)
 * {{{alwayszero.file}}} is always 0 byte file (if file exists, the file contents are not kept while updating mtime.)
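
The difference between {{{touch}}} and "{{{: >}}}" on an existing file, as well as the size produced by {{{dd}}}, can be checked with the following sketch. The names {{{kept.file}}} and {{{emptied.file}}} are made up for illustration.

```shell
# Sketch: compare resulting file sizes of dd, touch, and ":>".
workdir=$(mktemp -d)
cd "$workdir" || exit 1

dd if=/dev/zero of=5kb.file bs=1k count=5 2>/dev/null
echo data > kept.file       # 5 bytes: "data" plus a newline
echo data > emptied.file

touch kept.file             # updates mtime, keeps the contents
: > emptied.file            # truncates the contents

size_dd=$(( $(wc -c < 5kb.file) ))
size_touch=$(( $(wc -c < kept.file) ))
size_trunc=$(( $(wc -c < emptied.file) ))
echo "dd: $size_dd, touch: $size_touch, truncate: $size_trunc"

cd / && rm -rf "$workdir"
```

Note that "{{{bs=1k}}}" means 1024 bytes, so the first file is 5120 bytes long.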

=== Erase entire harddisk ===

There are several ways to completely erase data from an entire harddisk-like device, e.g., USB memory stick at {{{/dev/sda}}}.

<!> Check your USB memory stick location with the "{{{mount}}}" command first before executing commands here.  The device pointed by {{{/dev/sda}}} may be SCSI harddisk or serial-ATA harddisk where your entire system resides.

 * Erase all by resetting data to 0:
{{{
# dd if=/dev/zero of=/dev/sda
}}}
 * Erase all by overwriting random data:
{{{
# dd if=/dev/urandom of=/dev/sda
}}}
 * Erase all by overwriting random data very efficiently (fast):
{{{
# shred -v -n 1 /dev/sda
}}}

Since the {{{dd}}} command is available from the shell of many bootable Linux CDs such as Debian installer CD, you can erase your installed system completely by running an erase command from such media on the system hard disk, e.g., {{{/dev/hda}}}, {{{/dev/sda}}}, etc.

### $ sudo time dd if=/dev/urandom of=/dev/sdd; sudo time dd if=/dev/zero of=/dev/sdd; sudo time shred -v -n 1 /dev/sdd
### [sudo] password for osamu: 
### dd: writing to `/dev/sdd': No space left on device
### 126849+0 records in
### 126848+0 records out
### 64946176 bytes (65 MB) copied, 237.358 s, 274 kB/s
### Command exited with non-zero status 1
### 0.06user 25.18system 3:57.36elapsed 10%CPU (0avgtext+0avgdata 0maxresident)k
### 126848inputs+126856outputs (0major+250minor)pagefaults 0swaps
### dd: writing to `/dev/sdd': No space left on device
### 126849+0 records in
### 126848+0 records out
### 64946176 bytes (65 MB) copied, 202.87 s, 320 kB/s
### Command exited with non-zero status 1
### 0.02user 1.68system 3:22.87elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
### 126848inputs+126848outputs (0major+249minor)pagefaults 0swaps
### 0.00user 12.04system 1:30.57elapsed 13%CPU (0avgtext+0avgdata 0maxresident)k
### 0inputs+126848outputs (0major+226minor)pagefaults 0swaps

=== Undelete deleted but still open file ===

Even if you have accidentally deleted a file, as long as that file is still being used by some application (read or write mode), it is possible to recover such a file.

 * On one terminal:
{{{
$ echo foo > bar
$ less bar
}}}

 * Then on another terminal:
{{{
$ ps aux | grep ' less[ ]'
osamu    4775  0.0  0.0  92200   884 pts/8    S+   00:18   0:00 less bar
$ rm bar
$ ls -l /proc/4775/fd | grep bar
lr-x------ 1 osamu osamu 64 2008-05-09 00:19 4 -> /home/osamu/bar (deleted)
$ cat /proc/4775/fd/4 >bar
$ ls -l
-rw-r--r-- 1 osamu osamu 4 2008-05-09 00:25 bar
$ cat bar
foo
}}}

 * Alternatively, when you have the {{{lsof}}} command installed, on another terminal:
{{{
$ ls -li bar
2228329 -rw-r--r-- 1 osamu osamu 4 2008-05-11 11:02 bar
$ lsof |grep bar|grep less
less 4775 osamu 4r REG 8,3 4 2228329 /home/osamu/bar
$ rm bar
$ lsof |grep bar|grep less
less 4775 osamu 4r REG 8,3 4 2228329 /home/osamu/bar (deleted)
$ cat /proc/4775/fd/4 >bar
$ ls -li bar
2228302 -rw-r--r-- 1 osamu osamu 4 2008-05-11 11:05 bar
$ cat bar
foo
}}}
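
The same recovery can be scripted in a single shell, which may help when experimenting. This sketch assumes a Linux system with {{{/proc}}} mounted. It uses {{{/proc/self/fd/3}}} instead of a numeric process ID, and the name {{{bar.recovered}}} is made up.

```shell
# Sketch: recover a deleted file through a still-open file descriptor.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo foo > bar
exec 3< bar     # keep bar open for reading on file descriptor 3
rm bar          # the name is gone, the data is still held open

# /proc/self is resolved by the child process (ls, cat), which inherits
# descriptor 3 from this shell.
link=$(ls -l /proc/self/fd/3)
cat /proc/self/fd/3 > bar.recovered
exec 3<&-       # close the descriptor; only now is the old data released

recovered=$(cat bar.recovered)
echo "$link"
echo "recovered contents: $recovered"

cd / && rm -rf "$workdir"
```

The link shown by {{{ls}}} ends with "(deleted)", matching the listing above.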

=== Searching all hardlinks ===

Files with hardlinks can be identified by "{{{ls -li}}}", e.g.:
{{{
$ ls -li
total 0
2738405 -rw-r--r-- 1 root root 0 2008-09-15 20:21 bar
2738404 -rw-r--r-- 2 root root 0 2008-09-15 20:21 baz
2738404 -rw-r--r-- 2 root root 0 2008-09-15 20:21 foo
}}}
Both "{{{baz}}}" and "{{{foo}}}" have a link count of "2" (>1), showing that they have hardlinks.  They share the inode number "2738404".  This means they are the same hardlinked file.  If you cannot find all hardlinked files by chance, you can search for them by the inode number, e.g., "2738404":
{{{
# find /path/to/mount/point -xdev -inum 2738404 
}}}
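
The search can be tried in a scratch directory without root privileges. In this sketch the directory layout ({{{sub/baz}}}) is hypothetical, and the inode number is read from {{{ls -i}}} rather than typed in.

```shell
# Sketch: find every name that shares an inode with a given file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir sub
: > foo
ln foo sub/baz      # second name for the same inode
: > bar             # unrelated file with its own inode

inode=$(ls -i foo | awk '{print $1}')
find . -xdev -inum "$inode"
link_count=$(( $(find . -xdev -inum "$inode" | wc -l) ))
echo "names sharing inode $inode: $link_count"

cd / && rm -rf "$workdir"
```

The "{{{-xdev}}}" option matters because inode numbers are only unique within one filesystem.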

=== Invisible disk space consumption ===

All deleted but still open files consume disk space although they are not visible to the normal {{{du}}}(1).  They can be listed with their sizes by:
{{{
# lsof -s -X / |grep deleted
}}}

== Data security infrastructure ==

The data security infrastructure is provided by the combination of data encryption tools, message digest tools, and signature tools.

|| List of data security infrastructure tools. || 1 || 2 || 3 ||
|| '''package''' || '''popcon''' ||'''size''' || '''function''' ||
|| {{{gnupg}}} || - || - || [http://en.wikipedia.org/wiki/GNU_Privacy_Guard GNU privacy guard] -  OpenPGP encryption and signing tool. {{{gpg}}}(1) ||
|| {{{gnupg-doc}}} || - || - || GNU Privacy Guard documentation ||
|| {{{gpgv}}} || - || - || GNU privacy guard - signature verification tool ||
|| {{{cryptsetup}}} || - || - || Utilities for [http://en.wikipedia.org/wiki/Dm-crypt dm-crypt] block device encryption supporting [http://en.wikipedia.org/wiki/Linux_Unified_Key_Setup LUKS] ||
|| {{{ecryptfs-utils}}} || - || - || Utilities for [http://ecryptfs.sourceforge.net/ ecryptfs] stacked filesystem encryption ||
|| {{{coreutils}}} || - || - || The {{{md5sum}}} command computes and checks MD5 message digest ||
|| {{{coreutils}}} || - || - || The  {{{sha1sum}}} command computes and checks SHA1 message digest ||
|| {{{openssl}}} || - || - || The "{{{openssl dgst}}}" command computes message digest (OpenSSL). {{{dgst}}}(1ssl) ||

See @{@dataencryptiontips@}@ on [http://en.wikipedia.org/wiki/Dm-crypt dm-crypt] and [http://ecryptfs.sourceforge.net/ ecryptfs] which implement automatic data encryption infrastructure via Linux kernel modules.

=== Key management for Gnupg ===

Here are [http://en.wikipedia.org/wiki/GNU_Privacy_Guard GNU Privacy Guard] commands for the basic key management:

|| List of GNU Privacy Guard commands for the key management ||
|| '''command''' || '''effects''' ||
|| gpg --gen-key || generate a new key ||
|| gpg --gen-revoke my_user_ID || generate revocation certificate for my_user_ID ||
|| gpg --edit-key user_ID               || "help" for help, interactive ||
|| gpg -o file --export                 || export all keys to file ||
|| gpg --import file                    || import all keys from file ||
|| gpg --send-keys user_ID              || send key of user_ID to keyserver ||
|| gpg --recv-keys user_ID              || recv. key of user_ID from keyserver ||
|| gpg --list-keys user_ID              || list keys of user_ID ||
|| gpg --list-sigs user_ID              || list sig. of user_ID ||
|| gpg --check-sigs user_ID             || check sig. of user_ID ||
|| gpg --fingerprint user_ID            || check fingerprint of "user_ID" ||
|| gpg --refresh-keys                   || update local keyring ||

##Following does not work any more
##|| host -l pgp.net | grep www|less || figure out pgp keyservers ||

Here is the meaning of trust code:

|| List of the meaning of trust code. || ||
|| '''code''' || '''trust''' ||
|| - || No owner trust assigned / not yet calculated. ||
|| e || Trust calculation has failed. ||
|| q || Not enough information for calculation. ||
|| n || Never trust this key. ||
|| m || Marginally trusted. ||
|| f || Fully trusted. ||
|| u || Ultimately trusted. ||

The following will upload my key "A8061F32" to the popular keyserver {{{hkp://subkeys.pgp.net}}}:

{{{
$ gpg --keyserver hkp://subkeys.pgp.net --send-keys A8061F32
}}}

A good default keyserver set up in {{{$HOME/.gnupg/gpg.conf}}} (or old location {{{$HOME/.gnupg/options}}}) contains:
{{{
keyserver hkp://subkeys.pgp.net
}}}

The following will obtain unknown keys from the keyserver:
{{{
$ gpg --list-sigs | \
  sed -n '/^sig.*\[User ID not found\]/s/^sig..........\(\w\w*\)\W.*/\1/p' |\
  sort | uniq | xargs gpg --recv-keys
}}}

There was a bug in [http://sourceforge.net/projects/pks/ OpenPGP Public Key Server] (pre version 0.9.6) which corrupted key with more than 2 sub-keys.  The newer {{{gnupg}}} (>1.2.1-2) can handle these corrupted subkeys.  See {{{gpg}}}(1) manpage under {{{--repair-pks-subkey-bug}}} option.

=== Using GnuPG with files ===

File handling:

|| List of gnu privacy guard commands on files ||
|| '''command''' || '''effects''' ||
|| gpg -a -s file || sign file into ascii armored file.asc ||
|| gpg --armor --sign file || , , ||
|| gpg --clearsign file          || clear-sign message ||
|| gpg --clearsign --not-dash-escaped patchfile  || clear-sign patchfile ||
|| gpg --verify file                 || verify clear-signed file ||
|| gpg -o file.sig -b file || create detached signature ||
|| gpg -o file.sig --detach-sig file || , , ||
|| gpg --verify file.sig file        || verify file with file.sig ||
|| gpg -o crypt_file.gpg -r name -e file || public-key encryption intended for name from file to binary crypt_file.gpg ||
|| gpg -o crypt_file.gpg --recipient name --encrypt file || , , ||
|| gpg -o crypt_file.asc -a -r name -e file || public-key encryption intended for name from file to ASCII armored crypt_file.asc ||
|| gpg -o crypt_file.gpg -c file || symmetric encryption from file to crypt_file.gpg ||
|| gpg -o crypt_file.gpg --symmetric file || , , ||
|| gpg -o crypt_file.asc -a -c file || symmetric encryption from file to ASCII armored crypt_file.asc ||
|| gpg -o file -d crypt_file.gpg -r name || decryption ||
|| gpg -o file --decrypt crypt_file.gpg  || , , ||

=== Using GnuPG with Mutt ===

Add the following to {{{~/.muttrc}}} to keep a slow GnuPG from automatically
starting, while allowing it to be used by typing "{{{S}}}" at the index menu.

{{{
macro index S ":toggle pgp_verify_sig\n"
set pgp_verify_sig=no
}}}


=== Using GnuPG with Vim ===

The {{{gnupg}}} plugin lets you run GnuPG transparently for files with extension {{{.gpg}}}, {{{.asc}}}, and {{{.pgp}}}.

{{{
# aptitude install vim-scripts vim-addon-manager
$ vim-addons install gnupg
}}}

=== The MD5 sum ===

The {{{md5sum}}} program provides a utility to make a digest file using the method in [http://tools.ietf.org/html/rfc1321 rfc1321] and to verify each file with it.
{{{
$ md5sum foo bar >baz.md5
$ cat baz.md5
d3b07384d113edec49eaa6238ad5ff00  foo
c157a79031e1c40f85931829bc5fc552  bar
$ md5sum -c baz.md5
foo: OK
bar: OK
}}}
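
The following sketch shows what the check reports when a file changes after the digest file was made. The appended text and the variable names are illustrative only.

```shell
# Sketch: md5sum -c detects a file modified after the digest was recorded.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo foo > foo
echo bar > bar
md5sum foo bar > baz.md5

# Digest recorded for "foo" (the text before the two separating spaces).
foo_digest=$(sed -n 's/  foo$//p' baz.md5)

echo tampered >> bar            # modify one file afterwards

# md5sum -c exits with a non-zero status if any file fails the check.
if md5sum -c baz.md5 > check.log 2>&1; then verdict=intact; else verdict=modified; fi
cat check.log
echo "verdict: $verdict"

cd / && rm -rf "$workdir"
```

The exit status makes the check easy to use from scripts without parsing the per-file output.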

(!) The computation for the MD5 sum is less CPU intensive than the one for the cryptographic signature by GnuPG. Usually, only the top level digest file is cryptographically signed to ensure data integrity.

== Source code merge tools ==

There are many merge tools for source code.  The following commands caught my eye:

|| List of source code merge tools. || 2 || 3 || 4 ||
|| '''command''' || '''package''' || '''popcon''' || '''size''' || '''description''' ||
|| {{{diff}}}(1) || {{{diff}}} || 37745 || - || This compares files line by line. ||
|| {{{diff3}}}(1) || {{{diff}}} || 37745 || - || This compares and merges three files line by line. ||
|| {{{vimdiff}}}(1) || {{{vim}}} || 15655 || - || This compares 2 files side by side in vim. ||
|| {{{patch}}}(1) || {{{patch}}} || 8068 || - || This applies a diff file to an original. ||
|| {{{dpatch}}}(1) || {{{dpatch}}} || 1446 || - || This manages series of patches for Debian package. ||
|| {{{diffstat}}}(1) || {{{diffstat}}} || 1008 || - || This produces a histogram of changes by the diff. ||
|| {{{combinediff}}}(1) || {{{patchutils}}} || 759 || - || This creates a cumulative patch from two incremental patches. ||
|| {{{dehtmldiff}}}(1) || {{{patchutils}}} || x || - || This extracts a diff from an HTML page. ||
|| {{{filterdiff}}}(1) || {{{patchutils}}} || x || - || This extracts or excludes diffs from a diff file. ||
|| {{{fixcvsdiff}}}(1) || {{{patchutils}}} || x || - || This fixes diff files created by CVS that "patch" mis-interprets. ||
|| {{{flipdiff}}}(1) || {{{patchutils}}} || x || - || This exchanges the order of two patches. ||
|| {{{grepdiff}}}(1) || {{{patchutils}}} || x || - || This shows which files are modified by a patch matching a regex. ||
|| {{{interdiff}}}(1) || {{{patchutils}}} || x || - || This shows differences between two unified diff files. ||
|| {{{lsdiff}}}(1) || {{{patchutils}}} || x || - || This shows which files are modified by a patch. ||
|| {{{recountdiff}}}(1) || {{{patchutils}}} || x || - || This recomputes counts and offsets in unified context diffs. ||
|| {{{rediff}}}(1) || {{{patchutils}}} || x || - || This fixes offsets and counts of a hand-edited diff. ||
|| {{{splitdiff}}}(1) || {{{patchutils}}} || x || - || This separates out incremental patches. ||
|| {{{unwrapdiff}}}(1) || {{{patchutils}}} || x || - || This demangles patches that have been word-wrapped. ||
|| {{{wiggle}}}(1) || {{{wiggle}}} || 451 || - || This applies rejected patches. ||
|| {{{quilt}}}(1) || {{{quilt}}} || 430 || - || This manages series of patches. ||
|| {{{meld}}}(1) || {{{meld}}} || 256 || - || This is a GTK graphical file comparator and merge tool. ||
|| {{{xxdiff}}}(1) || {{{xxdiff}}} || 182 || - || This is a plain X graphical file comparator and merge tool. ||
|| {{{dirdiff}}}(1) || {{{dirdiff}}} || 61 || - || This displays and merges changes between directory trees. ||
|| {{{docdiff}}}(1) || {{{docdiff}}} || 38 || - || This compares two files word by word / char by char. ||
|| {{{imediff2}}}(1) || {{{imediff2}}} || 24 || - ||  This is an interactive full screen 2-way merge tool. ||
|| {{{makepatch}}}(1) || {{{makepatch}}} || 20 || - || This generates extended patch files. ||
|| {{{applypatch}}}(1) || {{{makepatch}}} || 20 || - || This applies extended patch files. ||
|| {{{wdiff}}}(1) || {{{wdiff}}} || 16 || - || This displays word differences between text files. ||

=== Extract differences for source files ===

Following one of these procedures will extract differences between two source files and create unified diff files {{{file.patch0}}} or {{{file.patch1}}} depending on the file location:

{{{
$ diff -u file.old file.new > file.patch0
$ diff -u old/file new/file > file.patch1
}}}

=== Merge updates for source files ===

The diff file (alternatively called patch file) is used to send a program update.  The receiving party will apply this update to another file by:

{{{
$ patch -p0 file < file.patch0
$ patch -p1 file < file.patch1
}}}
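
A complete round trip of extracting and applying a unified diff can be sketched as follows. The file contents are made up, and only the "{{{-p0}}}" variant is shown.

```shell
# Sketch: create a unified diff and apply it to a copy of the old file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'one\ntwo\nthree\n' > file.old
printf 'one\n2\nthree\n'   > file.new

# diff exits with status 1 when the files differ; that is not an error here.
diff -u file.old file.new > file.patch0 || [ $? -eq 1 ]

cp file.old file                 # the receiving party only has the old version
patch -p0 file < file.patch0     # apply the update to "file"

if cmp -s file file.new; then result=identical; else result=different; fi
echo "patched file is $result to file.new"

cd / && rm -rf "$workdir"
```

Naming the target file on the {{{patch}}} command line makes the result independent of the file names recorded in the diff header.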

=== 3 way merge updates ===

If you have three versions of source code, you can merge them more effectively using {{{diff3}}}:

{{{
$ diff3 -m file.mine file.old file.yours > file
}}}
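
The following sketch shows a merge where the two sides changed different lines, so no conflict arises. The five-line file contents are made up for illustration.

```shell
# Sketch: 3 way merge of two independent changes with diff3 -m.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'one\ntwo\nthree\nfour\nfive\n' > file.old     # common ancestor
printf 'ONE\ntwo\nthree\nfour\nfive\n' > file.mine    # my change: first line
printf 'one\ntwo\nthree\nfour\nFIVE\n' > file.yours   # your change: last line

# Argument order: my version, common ancestor, your version.
diff3 -m file.mine file.old file.yours > file
merged=$(cat file)
echo "$merged"

cd / && rm -rf "$workdir"
```

When both sides change the same lines, {{{diff3 -m}}} marks the conflict in the output and exits with status 1, so the result then needs manual editing.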

== Version control systems ==

Here is a summary of the [http://en.wikipedia.org/wiki/Revision_control version control systems (VCS)] on the Debian system:

(!) If you are new to VCSs, you should start learning with '''Git''', which is growing fast in popularity.

|| List of version control system tools. || 1 || 2 || 3 ||
|| '''package''' || '''popcon''' || '''size''' || '''tool''' || '''VCS type''' || '''comment''' ||
|| {{{cssc}}} || 7 || - || [http://cssc.sourceforge.net/ CSSC] || local || Clone of the [http://en.wikipedia.org/wiki/Source_Code_Control_System Unix SCCS] (deprecated)  ||
|| {{{rcs}}} || 1658 || - || [http://en.wikipedia.org/wiki/Revision_Control_System RCS] || local || "[http://en.wikipedia.org/wiki/Source_Code_Control_System Unix SCCS] done right" ||
|| {{{cvs}}} || 4265 || - || [http://en.wikipedia.org/wiki/Concurrent_Versions_System CVS] || remote || The previous standard remote VCS ||
|| {{{subversion}}} || 5276 || - || [http://en.wikipedia.org/wiki/Subversion_(software) Subversion] || remote || "CVS done right", the new de facto standard remote VCS ||
|| {{{git-core}}} || 512 || - || [http://en.wikipedia.org/wiki/Git_(software) Git] || distributed || fast DVCS in C (used by the Linux kernel and others) ||
|| {{{mercurial}}} || 256 || - || [http://en.wikipedia.org/wiki/Mercurial_(software) Mercurial] || distributed || DVCS in python and some C. ||
|| {{{bzr}}} || 158 || - || [http://en.wikipedia.org/wiki/Bazaar_(software) Bazaar] || distributed || DVCS influenced by {{{tla}}} written in python (used by [http://www.ubuntu.com/ Ubuntu]) ||
|| {{{darcs}}} || - || - || [http://en.wikipedia.org/wiki/Darcs Darcs] || distributed || DVCS with smart algebra of patches (slow). ||
|| {{{tla}}} || - || - || [http://en.wikipedia.org/wiki/GNU_arch GNU arch] || distributed || DVCS mainly by Tom Lord. (Historic) ||
|| {{{monotone}}} || 88 || - || [http://en.wikipedia.org/wiki/Monotone_(software) Monotone] || distributed || DVCS in C++ ||

### || {{{bazaar}}} || - || - || [http://en.wikipedia.org/wiki/Bazaar_(software) Bazaar] || distributed || DVCS based on {{{tla}}} implemented in C. (Historic) ||

A VCS is sometimes known as a revision control system (RCS) or a software configuration management (SCM) system.

Distributed VCS such as Git is the tool of choice these days.  CVS and Subversion may still be useful to join some existing open source program activities.

Debian provides free VCS services via [http://alioth.debian.org/ Debian Alioth service].  It supports practically all VCSs. Its documentation can be found at http://wiki.debian.org/Alioth .

<!> The {{{git}}} package is "GNU Interactive Tools" which is not the DVCS.

=== Native VCS commands ===

Here is an oversimplified comparison of native VCS commands to provide the big picture. The typical command sequence may require options and arguments.

|| Comparison of native VCS commands. || || || ||
|| '''CVS''' || '''Subversion''' || '''Git''' || '''function''' ||
|| {{{cvs init}}} || {{{svnadmin create}}} || {{{git init}}} || create the (local) repository  ||
|| {{{cvs login}}} || - || - || login to the remote repository ||
|| {{{cvs co}}} || {{{svn co}}} || {{{git clone}}} || check out the remote repository as the working tree ||
|| {{{cvs up}}} || {{{svn up}}} || {{{git pull}}} || update the working tree by merging the remote repository ||
|| {{{cvs add}}} || {{{svn add}}} || {{{git add .}}} || add file(s) in the working tree to the VCS ||
|| {{{cvs rm}}} || {{{svn rm}}} || {{{git rm}}} || remove file(s) in working tree from the VCS ||
|| {{{cvs ci}}} || {{{svn ci}}} || - || commit changes to the remote repository ||
|| - || - || {{{git commit -a}}} || commit changes to the local repository ||
|| - || - || {{{git push}}} || update the remote repository by the local repository ||
|| {{{cvs status}}} || {{{svn status}}} || {{{git status}}} || display the working tree status from the VCS ||
|| {{{cvs diff}}} || {{{svn diff}}} || {{{git diff}}} || diff <reference_repository> <working_tree> ||
|| - || - || {{{git repack -a -d; git prune}}} || repack the local repository into single pack. ||
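
The local part of the Git column can be tried in a scratch directory. This is a minimal sketch, assuming Git is installed; the repository name {{{demo}}} and the user name and e-mail address passed with "{{{-c}}}" are hypothetical and apply only to this one commit.

```shell
# Sketch: create a local Git repository and commit one file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

git init -q demo                 # create the (local) repository
cd demo || exit 1
echo "hello" > README
git add .                        # add file(s) in the working tree to the VCS

# Commit to the local repository with a throwaway identity.
git -c user.name="Demo User" -c user.email="demo@example.org" \
    commit -q -a -m "Initial import"

commits=$(git rev-list --count HEAD)   # number of commits reachable from HEAD
pending=$(git status --porcelain)      # empty when the working tree is clean
echo "commits: $commits, pending changes: '${pending}'"

cd / && rm -rf "$workdir"
```

Nothing here contacts a remote repository; {{{git clone}}}, {{{git pull}}}, and {{{git push}}} are the commands that do.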

<!> Invoking a {{{git}}} subcommand as "{{{git-xyzzy}}}" from the command line has been deprecated since early 2006.

## The "{{{git-command}}}" may be typed as "{{{git command}}}".

## === Git commands to work directly with different VCS repositories ===


## Here is an oversimplified summary of commands to provide the big picture.
## The typical command sequence may require options and arguments.
##
## Nice to have table here.
##
{i} Git can work directly with different VCS repositories such as ones provided by CVS and Subversion, and provides the local repository for local changes with the {{{git-cvs}}} and {{{git-svn}}} packages.  See [http://www.kernel.org/pub/software/scm/git/docs/gitcvs-migration.html git for CVS users], [http://live.gnome.org/GitForGnomeDevelopers Git for GNOME developers] and
@{@git@}@.

## Following URLs are interesting.
## [http://www.mantisbt.org/wiki/doku.php/mantisbt:git_svn Using Git with SVN]
## [http://andy.delcambre.com/2008/3/4/git-svn-workflow Git SVN Workflow]
## [http://www.gnome.org/~federico/misc/git-cheat-sheet.txt GIT for mortals]
## [http://kerneltrap.org/mailarchive/git/2007/6/26/250068 GIT + CVS workflow query]

(!) Git has commands which have no equivalents in CVS and Subversion, such as "fetch", "rebase", and "cherry-pick".
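
As an illustration of one such command, here is a minimal sketch of a cherry-pick in a throwaway repository.  The function name {{{demo_cherry_pick}}}, the file names, and the branch name {{{topic}}} are made up for this example; it only assumes that {{{git}}} and {{{mktemp}}} are installed.

```shell
# Sketch: cherry-pick one commit from a topic branch in a throwaway repository.
demo_cherry_pick() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        # Isolate the demo from any personal or system Git configuration.
        export HOME="$repo" GIT_CONFIG_NOSYSTEM=1
        export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
        export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
        unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL
        git init -q .
        echo base > base.txt
        git add base.txt
        git commit -q -m "base"
        main=$(git symbolic-ref --short HEAD)   # name of the initial branch
        git checkout -q -b topic
        echo fix > fix.txt
        git add fix.txt
        git commit -q -m "fix on topic"
        git checkout -q "$main"
        git cherry-pick topic >/dev/null        # copy the tip commit of "topic"
        git log --format=%s                     # commit subjects, newest first
    )
    rm -rf "$repo"
}
demo_cherry_pick
```

The commit made on {{{topic}}} is re-applied on the initial branch as a new commit, so its subject shows up first in the log.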

## http://lwn.net/Articles/210045/

=== CVS ===

Check
 * "{{{sensible-browser file:///usr/share/doc/cvs/html-cvsclient}}}",
 * "{{{sensible-browser file:///usr/share/doc/cvs/html-info}}}",
 * "{{{sensible-browser file:///usr/share/doc/cvsbook}}}",
 * "{{{info cvs}}}", and
 * "{{{man cvs}}}"
for detailed information.

==== Installing a CVS server ====

The following setup will allow commits to the CVS repository only by a member
of the "src" group, and administration of CVS only by a member of the "staff"
group, thus reducing the chance of shooting oneself in the foot.

{{{
# cd /var/lib; umask 002; mkdir cvs
# export CVSROOT=/var/lib/cvs
# cd $CVSROOT
# chown root:src .
# chmod 2775 .
# cvs -d $CVSROOT init
# cd CVSROOT
# chown -R root:staff .
# chmod 2775 .
# touch val-tags
# chmod 664 history val-tags
# chown root:src history val-tags
}}}
You may restrict creation of new projects by changing the owner of the {{{$CVSROOT}}} directory to "{{{root:staff}}}" and its permission to "{{{3775}}}".
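
To see what the "{{{2775}}}" mode used above looks like, the following sketch applies it to a scratch directory instead of the real repository.  The function name {{{show_setgid_mode}}} is made up for this example.

```shell
# Sketch: show the permission string that "chmod 2775" produces on a scratch directory.
show_setgid_mode() {
    dir=$(mktemp -d) || return 1
    chmod 2775 "$dir"
    # Keep only the 10-character permission string from "ls -ld".
    ls -ld "$dir" | cut -c1-10
    rmdir "$dir"
}
show_setgid_mode
```

On a typical GNU/Linux system the group execute position is shown as "s", which marks the set-group-ID bit: files created in the directory inherit its group.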

==== Use local CVS server ====

The following will set up shell environments for the local access to the CVS repository:

{{{
$ export CVSROOT=/var/lib/cvs
}}}

==== Use remote CVS pserver ====

The following will set up shell environments for the remote access to the CVS repository through the password-authenticated pserver protocol of {{{cvs}}} (no SSH or RSH involved):

{{{
$ export CVSROOT=:pserver:account@cvs.foobar.com:/var/lib/cvs
$ cvs login
}}}

This is prone to eavesdropping attacks because the pserver protocol does not encrypt the connection.

==== Anonymous CVS (download only) ====

The following will set up shell environments for the read-only remote access to the CVS repository:

{{{
$ export CVSROOT=:pserver:anonymous@cvs.sf.net:/cvsroot/qref
$ cvs login
$ cvs -z3 co qref
}}}

==== Use remote CVS through ssh ====

The following will set up shell environments for the read-only remote access to the CVS repository with SSH:

{{{
$ export CVSROOT=:ext:account@cvs.foobar.com:/var/lib/cvs
}}}

or for SourceForge:

{{{
$ export CVSROOT=:ext:account@cvs.sf.net:/cvsroot/qref
}}}

You can also use public key authentication for SSH which eliminates the password prompt.
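
The {{{CVSROOT}}} forms used in the sections above differ only in their prefix.  The following sketch collects them in one hypothetical helper function, {{{cvsroot}}}, which is not part of CVS; the method names "local", "pserver" and "ssh" are chosen for this example only.

```shell
# Sketch: build a CVSROOT string for the access methods described above.
# Usage: cvsroot local PATH
#        cvsroot pserver|ssh ACCOUNT HOST PATH
cvsroot() {
    case "$1" in
        local)   printf '%s\n' "$2" ;;
        pserver) printf ':pserver:%s@%s:%s\n' "$2" "$3" "$4" ;;
        ssh)     printf ':ext:%s@%s:%s\n' "$2" "$3" "$4" ;;
        *)       echo "cvsroot: unknown method: $1" >&2; return 1 ;;
    esac
}

CVSROOT=$(cvsroot ssh account cvs.foobar.com /var/lib/cvs)
export CVSROOT
echo "$CVSROOT"
```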

==== Create a new CVS archive ====

Assume the following for the new CVS archive:

|| Assumption for the CVS archive. || || ||
|| '''ITEM'''   || '''VALUE'''           || '''MEANING''' ||
|| source tree  || {{{~/project-x}}}     || All source code ||
|| Project name || {{{project-x}}}       || Name for this project ||
|| Vendor Tag   || {{{Main-branch}}}     || Tag for the entire branch ||
|| Release Tag  || {{{Release-initial}}} || Tag for a specific release ||

Then import the source tree:

{{{
$ cd ~/project-x
}}}
 * create a source tree ...
{{{
$ cvs import -m "Start project-x" project-x Main-branch Release-initial
$ cd ..; rm -R ~/project-x
}}}

==== Work with CVS ====

To work with {{{project-x}}} using the local CVS repository:

{{{
$ mkdir -p /path/to; cd /path/to
$ cvs co project-x
}}}
 * get sources from CVS to local
{{{
$ cd project-x
}}}
 * make changes to the content ...
{{{
$ cvs diff -u
}}}
 * similar to "{{{diff -u repository/ local/}}}"
{{{
$ cvs up -C modified_file
}}}
 * undo changes to a file
{{{
$ cvs ci -m "Describe change"
}}}
 * save local sources to CVS
{{{
$ vi newfile_added
$ cvs add newfile_added
$ cvs ci -m "Added newfile_added"
$ cvs up
}}}
 * merge latest version from CVS.
 * To create all newly created subdirectories from CVS, use "{{{cvs up -d -P}}}" instead.
 * Watch out for lines starting with "{{{C filename}}}" which indicates conflicting changes.
 * your pre-merge copy of the file is saved as "{{{.#filename.version}}}".
 * search for "<<<<<<<" and ">>>>>>>" in the files for conflicting changes.
 * edit file to fix conflicts.
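
The search for conflict markers mentioned above can be sketched as follows.  The function name {{{list_conflicts}}} is made up for this example, and the pattern assumes that the markers start a line and are followed by a space, as CVS writes them.

```shell
# Sketch: list files under a directory that still contain CVS conflict markers.
list_conflicts() {
    # -r: recurse, -l: print file names only, -E: extended regular expression.
    grep -rlE '^(<<<<<<<|>>>>>>>) ' "${1:-.}"
}
```

Run "{{{list_conflicts .}}}" in the working tree before committing; no output means no markers were found.
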
{{{
$ cvs tag Release-1
}}}
 * add release tag
 * edit further ...
{{{
$ cvs tag -d Release-1
}}}
 * remove release tag
{{{
$ cvs ci -m "more comments"
$ cvs tag Release-1
}}}
 * re-add release tag
{{{
$ cd /path/to
$ cvs co -r Release-initial -d old project-x
}}}
 * get original version to "{{{/path/to/old}}}" directory
{{{
$ cd old
$ cvs tag -b Release-initial-bugfixes
}}}
 * create branch (-b) tag "{{{Release-initial-bugfixes}}}"
 * now you can work on the old version (Tag is sticky)
{{{
$ cvs update -d -P
}}}
 * don't create empty directories
 * source tree now has sticky tag "Release-initial-bugfixes"
 * work on this branch ... while someone else making changes too
{{{
$ cvs up -d -P
}}}
 * sync with files modified by others on this branch
{{{
$ cvs ci -m "check into this branch"
$ cvs update -kk -A -d -P
}}}
 * remove sticky tag and forget contents
 * update from main trunk without keyword expansion
{{{
$ cvs update -kk -d -P -j Release-initial-bugfixes
}}}
 * merge from Release-initial-bugfixes branch into the main trunk without keyword expansion.  Fix conflicts with editor.
{{{
$ cvs ci -m "merge Release-initial-bugfixes"
$ cd /path/to
$ tar -cvzf old-project-x.tar.gz old
}}}
 * make archive.  Use "{{{-j}}}" instead of "{{{-z}}}" if you want {{{.tar.bz2}}}.
{{{
$ cvs release -d old
}}}
 * remove local source (optional)

|| Notable options for {{{CVS}}} commands  (use as first argument(s) to {{{cvs}}}). || ||
|| '''option''' || '''meaning''' ||
|| {{{-n}}} || dry run, no effect ||
|| {{{-t}}} || display messages showing steps of cvs activity ||

==== Export files from CVS ====

To get the latest version from CVS, use "tomorrow":

{{{
$ cvs export -D tomorrow module_name
}}}

==== Administer CVS ====

Add alias to a project (local server):

{{{
$ export CVSROOT=/var/lib/cvs
$ cvs co CVSROOT/modules
$ cd CVSROOT
$ echo "px -a project-x" >>modules
$ cvs ci -m "Now px is an alias for project-x"
$ cvs release -d .
$ cvs co -d project px
}}}
 * check out project-x (alias:px) from CVS to directory project
{{{
$ cd project
}}}
 * make changes to the content ...

To perform the above procedure, you need the appropriate file permissions.

==== File permissions in repository ====

CVS does not overwrite a repository file in place but replaces it with a new one.  Thus, write permission to the repository directory is critical.  For each new repository, run the following if needed to ensure this.

{{{
# cd /var/lib/cvs
# chown -R root:src repository
# chmod -R ug+rwX   repository
# chmod    2775     repository
}}}
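
The capital "{{{X}}}" in "{{{ug+rwX}}}" adds execute permission only to directories and to files that are already executable.  The following sketch shows this on a scratch tree; the function name {{{demo_rwX}}} and the file name are made up for this example.

```shell
# Sketch: show the effect of "chmod -R ug+rwX" on a scratch tree.
demo_rwX() {
    top=$(mktemp -d) || return 1
    mkdir "$top/repository"
    echo data > "$top/repository/file,v"
    chmod 400 "$top/repository/file,v"     # file: read-only for the owner
    chmod 500 "$top/repository"            # directory: r-x for the owner only
    chmod -R ug+rwX "$top/repository"
    # Print the permission strings of the directory and of the file.
    ls -ld "$top/repository" "$top/repository/file,v" | cut -c1-10
    rm -rf "$top"
}
demo_rwX
```

The directory becomes searchable and writable by user and group, while the plain file gains read and write permission but no execute bit.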

==== Execution bit ====

A file's execution bit is retained when checked out.  Whenever you see execution permission problems in checked-out files, change permissions of the file in the CVS repository with the following command.

{{{
# chmod ugo-x filename
}}}
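
To find candidates for the above command, the following sketch lists the RCS files ("{{{*,v}}}") in a repository which have the owner execute bit set.  The function name {{{list_executable_rcs}}} is made up for this example, and the default path is the repository location assumed earlier in this document.

```shell
# Sketch: list RCS files (*,v) in a CVS repository that carry the owner execute bit.
list_executable_rcs() {
    find "${1:-/var/lib/cvs}" -type f -name '*,v' -perm -100
}
```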

=== Subversion ===

Subversion is a "next-generation" version control system, intended to replace CVS, so it has most of CVS's features. Generally, Subversion's interface to a particular feature is similar to CVS's, except where there's a compelling reason to do otherwise.

==== Installing a Subversion server ====

You need to install the {{{subversion}}}, {{{libapache2-svn}}} and {{{subversion-tools}}} packages to set up a server.

==== Setting up a repository ====

Currently, the {{{subversion}}} package does not set up a repository, so one must be set up manually.  One possible location for a repository is in {{{/var/local/repos}}}.

Create the directory:
{{{
# mkdir -p /var/local/repos
}}}

Create the repository database:

{{{
# svnadmin create /var/local/repos
}}}

Make the repository writable by the WWW server:

{{{
# chown -R www-data:www-data /var/local/repos
}}}

==== Configuring Apache2 ====

To allow access to the repository via user authentication, add (or uncomment) the following in {{{/etc/apache2/mods-available/dav_svn.conf}}}:

{{{
<Location /repos>
  DAV svn
  SVNPath /var/local/repos
  AuthType Basic
  AuthName "Subversion repository"
  AuthUserFile /etc/subversion/passwd
  <LimitExcept GET PROPFIND OPTIONS REPORT>
    Require valid-user
  </LimitExcept>
</Location>
}}}

Then, create a user authentication file with the command:

{{{
# htpasswd2 -c /etc/subversion/passwd some-username
}}}

Restart Apache2, and your new Subversion repository will be accessible with the URL {{{http://hostname/repos}}}.

==== Subversion usage examples ====

The following sections teach you how to use different commands in Subversion.

==== Create a new Subversion archive ====

To create a new Subversion archive, type the following:

{{{
$ cd ~/your-project         # go to your source directory
$ svn import . http://localhost/repos/project-name -m "initial project import"
}}}

This creates a directory named project-name in your Subversion repository which contains your project files.  Look at {{{http://localhost/repos/}}} to see if it's there.

==== Working with Subversion ====

Working with project-y using Subversion:

{{{
$ mkdir -p /path/to; cd /path/to
$ svn co http://localhost/repos/project-y
}}}
 * Check out sources
{{{
$ cd project-y
}}}
 * do some work ...
{{{
$ svn diff
}}}
 * similar to "{{{diff -u repository/ local/}}}"
{{{
$ svn revert modified_file
}}}
 * undo changes to a file
{{{
$ svn ci -m "Describe changes"
}}}
 * check in your changes to the repository
{{{
$ vi newfile_added
$ svn add newfile_added
$ svn add new_dir
}}}
 * recursively add all files in new_dir
{{{
$ svn add -N new_dir2
}}}
 * non recursively add the directory
{{{
$ svn ci -m "Added newfile_added, new_dir, new_dir2"
$ svn up
}}}
 * merge in latest version from repository
{{{
$ svn log
}}}
 * shows all changes committed
{{{
$ svn copy http://localhost/repos/project-y \
      http://localhost/repos/project-y-branch \
      -m "creating my branch of project-y"
}}}
 * branching project-y
{{{
$ svn copy http://localhost/repos/project-y \
      http://localhost/repos/project-y-release1.0 \
      -m "project-y 1.0 release"
}}}
 * added release tag.
 * note that branching and tagging are the same operation.  The only difference is one of convention: branches receive further commits whereas tags do not.
 * make changes to branch ...
{{{
$ svn merge http://localhost/repos/project-y \
   http://localhost/repos/project-y-branch
}}}
 * merge branched copy back to main copy
{{{
$ svn co -r 4 http://localhost/repos/project-y
}}}
 * get revision 4

=== Git ===

Git can do everything for both local and remote source code management.  This means that you can record the source code changes without needing network connectivity to the remote repository.

==== Before using Git ====

You may wish to set several global configuration values in {{{~/.gitconfig}}}, such as the name and email address used by Git:
{{{
$ git config --global user.name "Name Surname"
$ git config --global user.email yourname@example.com
}}}

If you are used to CVS or Subversion commands, you may wish to set several command aliases:
{{{
$ git config --global alias.ci "commit -a"
$ git config --global alias.co checkout
}}}

You can check your global configuration by:
{{{
$ git config --global --list
}}}
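
To try these settings without touching your real {{{~/.gitconfig}}}, the following sketch applies them with {{{HOME}}} pointing to a scratch directory and prints the resulting file.  The function name {{{demo_gitconfig}}} is made up for this example.

```shell
# Sketch: apply the settings above in a scratch HOME and show the resulting file.
demo_gitconfig() {
    scratch=$(mktemp -d) || return 1
    (
        export HOME="$scratch"
        # Make sure no other location is used for the global configuration.
        unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL
        git config --global user.name "Name Surname"
        git config --global user.email yourname@example.com
        git config --global alias.ci "commit -a"
        git config --global alias.co checkout
        cat "$HOME/.gitconfig"
    )
    rm -rf "$scratch"
}
demo_gitconfig
```

The printed file contains a "{{{[user]}}}" section and an "{{{[alias]}}}" section, which you may also edit by hand.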

==== Git references ====

There are good references for Git.

 * [http://www.kernel.org/pub/software/scm/git/docs/git.html manpage: git(1)]
 * [http://www.kernel.org/pub/software/scm/git/docs/user-manual.html Git User's Manual]
 * [http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html A tutorial introduction to git]
 * [http://www.kernel.org/pub/software/scm/git/docs/gittutorial-2.html A tutorial introduction to git: part two]
 * [http://www.kernel.org/pub/software/scm/git/docs/everyday.html Everyday GIT With 20 Commands Or So]
 * [http://www.kernel.org/pub/software/scm/git/docs/gitcvs-migration.html git for CVS users] : This also describes  how to set up server like CVS and extract old data from CVS into there.
 * [http://git.or.cz/course/svn.html Git - SVN Crash Course]
 * [http://git.or.cz/course/stgit.html StGit Crash Course]

The {{{git-gui}}} and {{{gitk}}} commands make using Git very easy.

/!\ Do not use tag strings containing spaces even if some tools such as {{{gitk}}} allow you to create them.  They will choke some other {{{git}}} commands.
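
A tag name can be checked before use with "{{{git check-ref-format}}}", which rejects names containing spaces among other things.  The wrapper name {{{valid_tag_name}}} below is made up for this example.

```shell
# Sketch: succeed only if the argument is acceptable to Git as a tag name.
valid_tag_name() {
    git check-ref-format "refs/tags/$1"
}

if valid_tag_name "Release 1"; then
    echo "tag name is fine"
else
    echo "tag name is not usable"
fi
```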

==== Git commands ====

Even if your upstream uses a different VCS, it is a good idea to use {{{git}}}(1) for local activity since you can manage your local copy of the source tree without a network connection to the upstream.  Here are commands used with {{{git}}}(1).

|| List of git packages and commands. || || || || ||
|| '''command''' || '''package''' || '''popcon''' || '''size''' || '''description''' ||
|| N/A || {{{git-doc}}} || *862 || - || This provides the documentation for Git. ||
|| {{{git}}}(7) || {{{git-core}}} || 512 || - || The main command for Git. ||
|| {{{gitk}}}(1) || {{{gitk}}} || 94 || - || The GUI Git repository browser with history. ||
|| {{{git-gui}}}(1) || {{{git-gui}}} || 28 || - || The GUI for Git. (No history) ||
|| {{{git-svnimport}}}(1) || {{{git-svn}}} || 68 || - || This imports the data out of Subversion into Git. ||
|| {{{git-svn}}}(1) || {{{git-svn}}} || 68 || - || This provides bidirectional operation between Subversion and Git. ||
|| {{{git-cvsimport}}}(1) || {{{git-cvs}}} || 49 || - || This imports the data out of CVS into Git. ||
|| {{{git-cvsexportcommit}}}(1) || {{{git-cvs}}} || 49 || - || This exports a commit to a CVS checkout from Git. ||
|| {{{git-cvsserver}}}(1) || {{{git-cvs}}} || 49 || - || A CVS server emulator for Git. ||
|| {{{git-send-email}}}(1) || {{{git-email}}} || 37 || - || This sends a collection of patches as email from Git. ||
|| {{{stg}}}(1) || {{{stgit}}} || 31 || - || This is quilt on top of git. (Python) ||
|| {{{git-buildpackage}}}(1) || {{{git-buildpackage}}} || 17 || - || This automates the Debian packaging with Git. ||
|| {{{guilt}}}(7) || {{{guilt}}} || 9 || - || This is quilt on top of git. (SH/AWK/SED/...) ||

==== Git for recording configuration history ====

You can manually record the chronological history of configuration using [http://en.wikipedia.org/wiki/Git_(software) Git] tools.  Here is a simple example for practice, recording the contents of {{{/etc/apt/}}}:

## sudo environment is assumed for realistic scenario.
## Please do not complain...
{{{
$ cd /etc/apt/
$ sudo git init
$ sudo chmod 700 .git
$ sudo git add .
$ sudo git commit -a
}}}
 * commit configuration with description.
 * make modification to the configuration files
{{{
$ cd /etc/apt/
$ sudo git commit -a
}}}
 * commit configuration with description.
 * ... continue your life ...
{{{
$ cd /etc/apt/
$ sudo gitk --all
}}}
 * you have full configuration history with you.

(!) The {{{sudo}}}(8) command is needed to work with permissions of configuration data.  For user configuration data, you may skip the {{{sudo}}}(8) command.

(!) The "{{{chmod 700 .git}}}" command in the above example is needed to protect archive data from unauthorized read access.

{i} For more complete setup for recording configuration history, please look for the {{{etckeeper}}} package: @{@recordingchangesinconfigurationfiles@}@.
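
If you prefer to rehearse the procedure before touching {{{/etc/apt/}}}, the following sketch repeats the same steps in a scratch directory without {{{sudo}}}(8).  The function name {{{demo_config_history}}}, the file contents, and the commit messages are made up for this example.

```shell
# Sketch: record two configuration states in a scratch directory and list them.
demo_config_history() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        # Isolate the demo from any personal or system Git configuration.
        export HOME="$dir" GIT_CONFIG_NOSYSTEM=1
        export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
        export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
        unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL
        git init -q .
        chmod 700 .git                          # keep the archive private
        echo "deb http://example.org/debian stable main" > sources.list
        git add .
        git commit -q -m "initial configuration"
        echo "# edited" >> sources.list         # modify the configuration
        git commit -q -a -m "edit sources.list"
        git log --format=%s                     # commit subjects, newest first
    )
    rm -rf "$dir"
}
demo_config_history
```

Each commit corresponds to one recorded state of the configuration, listed with the newest first.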