Projects:
Bacula Projects Roadmap
Status updated 14 Jun 2009
Summary:
* => item complete
Item 1: Ability to restart failed jobs
*Item 2: 'restore' menu: enter a JobId, automatically select dependents
Item 3: Scheduling syntax that permits more flexibility and options
Item 4: Data encryption on storage daemon
*Item 5: Deletion of disk Volumes when pruned (partial -- truncate when pruned)
*Item 6: Implement Base jobs
Item 7: Add ability to Verify any specified Job.
Item 8: Improve Bacula's tape and drive usage and cleaning management
Item 9: Allow FD to initiate a backup
*Item 10: Restore from volumes on multiple storage daemons
Item 11: Implement Storage daemon compression
Item 12: Reduction of communications bandwidth for a backup
Item 13: Ability to reconnect a disconnected comm line
Item 14: Start spooling even when waiting on tape
*Item 15: Enable/disable compression depending on storage device (disk/tape)
Item 16: Include all conf files in specified directory
Item 17: Multiple threads in file daemon for the same job
Item 18: Possibility to schedule Jobs on last Friday of the month
Item 19: Include timestamp of job launch in "stat clients" output
*Item 20: Cause daemons to use a specific IP address to source communications
Item 21: Message mailing based on backup types
Item 22: Ability to import/export Bacula database entities
*Item 23: "Maximum Concurrent Jobs" for drives when used with changer device
Item 24: Implementation of running Job speed limit.
Item 25: Add an override in Schedule for Pools based on backup types
Item 26: Automatic promotion of backup levels based on backup size
Item 27: Allow inclusion/exclusion of files in a fileset by creation/mod times
Item 28: Archival (removal) of User Files to Tape
Item 29: An option to operate on all pools with update vol parameters
Item 30: Automatic disabling of devices
*Item 31: List InChanger flag when doing restore.
Item 32: Ability to defer Batch Insert to a later time
Item 33: Add MaxVolumeSize/MaxVolumeBytes statement to Storage resource
Item 34: Enable persistent naming/number of SQL queries
*Item 35: Port bat to Win32
Item 36: Bacula Dir, FD and SD to support proxies
Item 37: Add Minimum Spool Size directive
Item 38: Backup and Restore of Windows Encrypted Files using Win raw encryption
Item 39: Implement an interface between Bacula and Amazon's S3.
Item 40: Convert Bacula existing tray monitor on Windows to a stand alone program
Item 1: Ability to restart failed jobs
Date: 26 April 2009
Origin: Kern/Eric
Status:
What: Often jobs fail because of a communications line drop, max run time,
cancel, or some other non-critical problem. Currently any data
saved is lost. This implementation should modify the Storage daemon
so that it saves all the files that it knows are completely backed
up to the Volume.
The jobs should then be marked as incomplete and a subsequent
Incremental Accurate backup will then take into account all the
previously saved data.
Why: Avoids backing up data already saved.
Notes: Requires Accurate to restart correctly. A job must have a minimum
volume of data or files stored on the Volume before this is enabled.
Item 2: 'restore' menu: enter a JobId, automatically select dependents
Origin: Graham Keeling (graham@equiinet.com)
Date: 13 March 2009
Status: Done in 3.0.2
What: Add to the bconsole 'restore' menu the ability to select a job
by JobId, and have bacula automatically select all the
dependent jobs.
Why: Currently, you either have to...
a) laboriously type in a date that is greater than the date of the
backup that you want and is less than the subsequent backup (bacula
then figures out the dependent jobs), or
b) manually figure out all the JobIds that you want and laboriously
type them all in. It would be extremely useful (in a programmatical
sense, as well as for humans) to be able to just give it a single JobId
and let bacula do the hard work (work that it already knows how to do).
Notes (Kern): I think this should either be modified to have Bacula
print a list of dates that the user can choose from as is done in
bwx-console and bat or the name of this command must be carefully
chosen so that the user clearly understands that the JobId is being
used to specify what Job and the date to which he wishes the restore to
happen.
Item 3: Scheduling syntax that permits more flexibility and options
Date: 15 December 2006
Origin: Gregory Brauer (greg at wildbrain dot com) and
Florian Schnabel <florian.schnabel at docufy dot de>
Status:
What: Currently, Bacula only understands how to deal with weeks of the
month or weeks of the year in schedules. This makes it impossible
to do a true weekly rotation of tapes. There will always be a
discontinuity that will require disruptive manual intervention at
least monthly or yearly because week boundaries never align with
month or year boundaries.
A solution would be to add a new syntax that defines (at least)
a start timestamp, and repetition period.
There should also be an easy option to skip a certain job on a certain date.
Why: Rotated backups done at weekly intervals are useful, and Bacula
cannot currently do them without extensive hacking.
You could then easily skip tape backups on holidays. This would be
especially handy if you have no autochanger and can only fit one
backup on a tape: other jobs could proceed normally
and you would not get errors.
Notes: Here is an example syntax showing a 3-week rotation where full
Backups would be performed every week on Saturday, and an
incremental would be performed every week on Tuesday. Each
set of tapes could be removed from the loader for the following
two cycles before coming back and being reused on the third
week. Since the execution times are determined by intervals
from a given point in time, there will never be any issues with
having to adjust to any sort of arbitrary time boundary. In
the example provided, I even define the starting schedule
as crossing both a year and a month boundary, but the run times
would be based on the "Repeat" value and would therefore happen
weekly as desired.
Schedule {
  Name = "Week 1 Rotation"
  #Saturday. Would run Dec 30, Jan 20, Feb 10, etc.
  Run {
    Options {
      Type = Full
      Start = 2006-12-30 01:00
      Repeat = 3w
    }
  }
  #Tuesday. Would run Jan 2, Jan 23, Feb 13, etc.
  Run {
    Options {
      Type = Incremental
      Start = 2007-01-02 01:00
      Repeat = 3w
    }
  }
}
Schedule {
  Name = "Week 2 Rotation"
  #Saturday. Would run Jan 6, Jan 27, Feb 17, etc.
  Run {
    Options {
      Type = Full
      Start = 2007-01-06 01:00
      Repeat = 3w
    }
  }
  #Tuesday. Would run Jan 9, Jan 30, Feb 20, etc.
  Run {
    Options {
      Type = Incremental
      Start = 2007-01-09 01:00
      Repeat = 3w
    }
  }
}
Schedule {
  Name = "Week 3 Rotation"
  #Saturday. Would run Jan 13, Feb 3, Feb 24, etc.
  Run {
    Options {
      Type = Full
      Start = 2007-01-13 01:00
      Repeat = 3w
    }
  }
  #Tuesday. Would run Jan 16, Feb 6, Feb 27, etc.
  Run {
    Options {
      Type = Incremental
      Start = 2007-01-16 01:00
      Repeat = 3w
    }
  }
}
Notes: Kern: I have merged the previously separate project of skipping
jobs (via Schedule syntax) into this.
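As a rough illustration of the proposed Start/Repeat semantics (this is
not an existing Bacula feature), the Python sketch below computes run
times from a start timestamp and a repetition period. The function names
parse_repeat and run_times, and the set of unit suffixes accepted for
"Repeat", are assumptions made only for this example.

```python
from datetime import datetime, timedelta

# Assumed unit suffixes for the proposed "Repeat" value (e.g. "3w").
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_repeat(text):
    """Turn a value like '3w' into a timedelta (hypothetical syntax)."""
    count, unit = int(text[:-1]), text[-1]
    return timedelta(**{_UNITS[unit]: count})


def run_times(start, repeat, count):
    """Return the first `count` run times: start, start + repeat, ..."""
    first = datetime.strptime(start, "%Y-%m-%d %H:%M")
    step = parse_repeat(repeat)
    return [first + i * step for i in range(count)]


# "Week 1 Rotation" Full job from the example above.
for when in run_times("2006-12-30 01:00", "3w", 3):
    print(when.isoformat(" "))
```

Because each run time is derived only from the start point and the
interval, month and year boundaries play no role in the calculation.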
Item 4: Data encryption on storage daemon
Origin: Tobias Barth <tobias.barth at web-arts.com>
Date: 04 February 2009
Status: new
What: The storage daemon should be able to do the data encryption that can
currently be done by the file daemon.
Why: This would have 2 advantages:
1) one could encrypt the data of unencrypted tapes by doing a
migration job
2) the storage daemon would be the only machine that would have
to keep the encryption keys.
Notes from Landon:
As an addendum to the feature request, here are some crypto
implementation details I wrote up regarding SD-encryption back in Jan
2008:
http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg28860.html
Item 5: Deletion of disk Volumes when pruned
Date: Nov 25, 2005
Origin: Ross Boylan <RossBoylan at stanfordalumni dot org> (edited
by Kern)
Status: Truncate operation implemented in 3.1.4
What: Provide a way for Bacula to automatically remove Volumes
from the filesystem, or optionally to truncate them.
Obviously, the Volume must be pruned prior to removal.
Why: This would allow users more control over their Volumes and
prevent disk based volumes from consuming too much space.
Notes: The following two directives might do the trick:
Volume Data Retention = <time period>
Remove Volume After = <time period>
The migration project should also remove a Volume that is
migrated. This might also work for tape Volumes.
Notes: (Kern). The data fields to control this have been added
to the new 3.0.0 database table structure.
Item 6: Implement Base jobs
Date: 28 October 2005
Origin: Kern
Status:
What: A base job is sort of like a Full save except that you
will want the FileSet to contain only files that are
unlikely to change in the future (i.e. a snapshot of
most of your system after installing it). After the
base job has been run, when you are doing a Full save,
you specify one or more Base jobs to be used. All
files that have been backed up in the Base job/jobs but
not modified will then be excluded from the backup.
During a restore, the Base jobs will be automatically
pulled in where necessary.
Why: This is something none of the competition does, as far as
we know (except perhaps BackupPC, which is a Perl program that
saves to disk only). It is a big win for the user, and it
makes Bacula stand out as offering a unique
optimization that immediately saves time and money.
Basically, imagine that you have 100 nearly identical
Windows or Linux machines containing the OS and user
files. Now for the OS part, a Base job will be backed
up once, and rather than making 100 copies of the OS,
there will be only one. If one or more of the systems
have some files updated, no problem, they will be
automatically restored.
Notes: Huge savings in tape usage even for a single machine.
Will require more resources because the DIR must send
FD a list of files/attribs, and the FD must search the
list and compare it for each file to be saved.
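The comparison described in the Notes can be sketched in Python as
follows. This is only an illustration under stated assumptions: the
function name files_to_back_up is hypothetical, and representing each
file's attributes as an (mtime, size) pair is a simplification chosen
for the example, not the attribute set Bacula uses.

```python
def files_to_back_up(current, base):
    """Return the paths that a Full save must still store.

    Both arguments map path -> (mtime, size). A file is skipped only if
    the Base job holds it with identical attributes (assumed criterion).
    """
    return sorted(
        path for path, attrs in current.items() if base.get(path) != attrs
    )


# Attributes recorded by a hypothetical Base job.
base_job = {
    "/bin/ls": (1130457600, 92376),
    "/etc/hosts": (1130457600, 187),
}
# Attributes seen by the file daemon during the later Full save.
on_disk = {
    "/bin/ls": (1130457600, 92376),      # unchanged, covered by the Base job
    "/etc/hosts": (1136073600, 201),     # modified since the Base job
    "/home/user/notes.txt": (1136073600, 1024),  # not in the Base job
}

print(files_to_back_up(on_disk, base_job))
```

The lookup per file is what the Notes refer to as the extra resource
cost on the FD side.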
Item 7: Add ability to Verify any specified Job.
Date: 17 January 2008
Origin: portrix.net Hamburg, Germany.
Contact: Christian Sabelmann
Status: 70% of the required code has been part of the Verify function since v. 2.x
What:
The ability to tell Bacula which Job to verify instead of
automatically verifying just the last one.
Why:
It is sad that such a powerful feature as Verify Jobs
(VolumeToCatalog) is restricted to the last backup Job
of a client. Users who do daily Backups are currently forced to
also do daily Verify Jobs in order to take advantage of this useful
feature. Running a Verify after every daily Backup is not always
desired, and Verify Jobs sometimes have to be scheduled (not
necessarily in Bacula). With this feature, Admins can verify Jobs once a
week or less often, selecting the Jobs they want to verify. This
feature is also not too difficult to implement, taking into account older bug
reports about this feature and the selection of the Job to be verified.
Notes: For the Verify Job, the user could select the Job to be verified
from a list of the latest Jobs of a client. It would also be possible to
verify a certain volume. All of these would naturally apply only to
Jobs whose file information is still in the catalog.
Item 8: Improve Bacula's tape and drive usage and cleaning management
Date: 8 November 2005, November 11, 2005
Origin: Adam Thornton <athornton at sinenomine dot net>,
Arno Lehmann <al at its-lehmann dot de>
Status:
What: Make Bacula manage tape life cycle information, tape reuse
times and drive cleaning cycles.
Why: All three parts of this project are important when operating
backups.
We need to know which tapes need replacement, and we need to
make sure the drives are cleaned when necessary. While many
tape libraries and even autoloaders can handle all this
automatically, support by Bacula can be helpful for smaller
(older) libraries and single drives. Limiting the number of
times a tape is used might prevent tape errors when using
tapes until the drives can't read them any more. Also, checking
drive status during operation can prevent some failures (as I
[Arno] had to learn the hard way...)
Notes: First, Bacula could (and even does, to some limited extent)
record tape and drive usage. For tapes, the number of mounts,
the amount of data, and the time the tape has actually been
running could be recorded. Data fields for Read and Write
time and Number of mounts already exist in the catalog (I'm
not sure if VolBytes is the sum of all bytes ever written to
that volume by Bacula). This information can be important
when determining which media to replace. The ability to mark
Volumes as "used up" after a given number of write cycles
should also be implemented so that a tape is never actually
worn out. For the tape drives known to Bacula, similar
information is interesting to determine the device status and
expected life time: Time it's been Reading and Writing, number
of tape Loads / Unloads / Errors. This information is not yet
recorded as far as I [Arno] know. A new volume status would
be necessary for the new state, like "Used up" or "Worn out".
Volumes with this state could be used for restores, but not
for writing. These volumes should be migrated first (assuming
migration is implemented) and, once they are no longer needed,
could be moved to a Trash pool.
The next step would be to implement a drive cleaning setup.
Bacula already has knowledge about cleaning tapes. Once it
has some information about cleaning cycles (measured in drive
run time, number of tapes used, or calendar days, for example)
it can automatically execute tape cleaning (with an
autochanger, obviously) or ask for operator assistance loading
a cleaning tape.
The final step would be to implement TAPEALERT checks not only
when changing tapes and only sending the information to the
administrator, but rather checking after each tape error,
checking on a regular basis (for example after each tape
file), and also before unloading and after loading a new tape.
Then, depending on the drive's TAPEALERT state and the known
drive cleaning state Bacula could automatically schedule later
cleaning, clean immediately, or inform the operator.
Implementing this would perhaps require another catalog change
and perhaps major changes in SD code and the DIR-SD protocol,
so I'd only consider this worth implementing if it would
actually be used or even needed by many people.
Implementation of these projects could happen in three distinct
sub-projects: Measuring Tape and Drive usage, retiring
volumes, and handling drive cleaning and TAPEALERTs.
Item 9: Allow FD to initiate a backup
Origin: Frank Volf (frank at deze dot org)
Date: 17 November 2005
Status:
What: Provide some means, possibly by a restricted console that
allows a FD to initiate a backup, and that uses the connection
established by the FD to the Director for the backup so that
a Director that is firewalled can do the backup.
Why: Makes backup of laptops much easier.
Item 10: Restore from volumes on multiple storage daemons
Origin: Graham Keeling (graham@equiinet.com)
Date: 12 March 2009
Status: Done in 3.0.2
What: The ability to restore from volumes held by multiple storage daemons
would be very useful.
Why: It is useful to be able to backup to any number of different storage
daemons. For example, your first storage daemon may run out of space,
so you switch to your second and carry on. Bacula will currently let
you do this. However, once you come to restore, bacula cannot cope
when volumes on different storage daemons are required.
Notes: The director knows that more than one storage daemon is needed,
as bconsole outputs something like the following table.
The job will require the following
   Volume(s)          Storage(s)         SD Device(s)
=====================================================================
   backup-0001        Disk 1             Disk 1.0
   backup-0002        Disk 2             Disk 2.0
However, the bootstrap file that it creates gets sent to the first
storage daemon only, which then stalls for a long time, 'waiting for a
mount request' for the volume that it doesn't have. The bootstrap file
contains no knowledge of the storage daemon.
Under the current design:
The director connects to the storage daemon, and gets an sd_auth_key.
The director then connects to the file daemon, and gives it the
sd_auth_key with the 'jobcmd'.
(restoring of files happens)
The director does a 'wait_for_storage_daemon_termination()'.
The director waits for the file daemon to indicate the end of the job.
With my idea:
The director connects to the file daemon.
Then, for each storage daemon in the .bsr file... {
The director connects to the storage daemon, and gets an sd_auth_key.
The director then connects to the file daemon, and gives it the
sd_auth_key with the 'storaddr' command.
(restoring of files happens)
The director does a 'wait_for_storage_daemon_termination()'.
The director waits for the file daemon to indicate the end of the
work on this storage.
}
The director tells the file daemon that there are no more storages to
contact.
The director waits for the file daemon to indicate the end of the job.
As you can see, each restore between the file daemon and
storage daemon is handled in the same way that it is currently handled,
using the same method for authentication, except that the sd_auth_key
is moved from the 'jobcmd' to the 'storaddr' command - where it
logically belongs.
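The proposed sequence can be sketched in Python as below. The classes
and method names are hypothetical stand-ins that only record the order
of operations; they do not correspond to Bacula's actual code, and the
generated key string merely stands in for the sd_auth_key.

```python
class StorageDaemon:
    """Hypothetical stand-in for one storage daemon."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    def connect(self):
        self.log.append(f"connect SD {self.name}")
        return f"key-{self.name}"  # stands in for the sd_auth_key

    def wait_for_termination(self):
        self.log.append(f"wait SD {self.name}")


class FileDaemon:
    """Hypothetical stand-in for the file daemon."""

    def __init__(self, log):
        self.log = log

    def connect(self):
        self.log.append("connect FD")

    def storaddr(self, sd_name, sd_auth_key):
        # In the proposal the key travels with 'storaddr', not 'jobcmd'.
        self.log.append(f"storaddr {sd_name} {sd_auth_key}")

    def wait_storage_done(self, sd_name):
        self.log.append(f"FD done with {sd_name}")

    def end_of_storages(self):
        self.log.append("no more storages")

    def wait_job_done(self):
        self.log.append("FD job done")


def restore(storages_in_bsr):
    """Walk the storages named in the .bsr file, one at a time."""
    log = []
    fd = FileDaemon(log)
    fd.connect()
    for name in storages_in_bsr:
        sd = StorageDaemon(name, log)
        key = sd.connect()
        fd.storaddr(name, key)
        sd.wait_for_termination()
        fd.wait_storage_done(name)
    fd.end_of_storages()
    fd.wait_job_done()
    return log


for step in restore(["Disk 1", "Disk 2"]):
    print(step)
```

The file daemon connection is opened once and kept for the whole job,
while each storage daemon is contacted in turn inside the loop.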
Item 11: Implement Storage daemon compression
Date: 18 December 2006
Origin: Vadim A. Umanski, e-mail umanski@ext.ru
Status:
What: The ability to compress backup data on the SD receiving data
instead of doing that on client sending data.
Why: The need is practical. I've got some machines that can send
data to the network 4 or 5 times faster than they can compress
it (I've measured that). They're using fast enough SCSI/FC
disk subsystems but rather slow CPUs (e.g. UltraSPARC II).
And the backup server has quite fast CPUs (e.g. Dual P4
Xeons) and quite a low load. When you have 20, 50 or 100 GB
of raw data, running a job 4 to 5 times faster
really matters. On the other hand, the data can be
compressed 50% or better, so using twice as much space for
disk backup is not good at all. And the network is all mine
(I have a dedicated management/provisioning network) and I
can get as much bandwidth as I need - 100Mbps, 1000Mbps...
That's why the server-side compression feature is needed!
Notes:
Item 12: Reduction of communications bandwidth for a backup
Date: 14 October 2008
Origin: Robin O'Leary (Equiinet)
Status:
What: Using rdiff techniques, Bacula could significantly reduce
the network data transfer volume to do a backup.
Why: Faster backup across the Internet
Notes: This requires retaining certain data on the client during a Full
backup that will speed up subsequent backups.
Item 13: Ability to reconnect a disconnected comm line
Date: 26 April 2009
Origin: Kern/Eric
Status:
What: Often jobs fail because of a communications line drop. In that
case, Bacula should be able to reconnect to the other daemon and
resume the job.
Why: Avoids backing up data already saved.
Notes: *Very* complicated from a design point of view because of authentication.
Item 14: Start spooling even when waiting on tape
Origin: Tobias Barth <tobias.barth@web-arts.com>
Date: 25 April 2008
Status:
What: If a job can be spooled to disk before writing it to tape, it should
be spooled immediately. Currently, bacula waits until the correct
tape is inserted into the drive.
Why: It could save hours. When bacula waits on the operator who must insert
the correct tape (e.g. a new tape or a tape from another media
pool), bacula could already prepare the spooled data in the spooling
directory and immediately start despooling when the tape was
inserted by the operator.
2nd step: Use 2 or more spooling directories. When one directory is
currently despooling, the next (on different disk drives) could
already be spooling the next data.
Notes: I am using bacula 2.2.8, which has none of those features
implemented.
Item 15: Enable/disable compression depending on storage device (disk/tape)
Origin: Ralf Gross ralf-lists@ralfgross.de
Date: 2008-01-11
Status: Done
What: Add a new option to the storage resource of the director. Depending
on this option, compression will be enabled/disabled for a device.
Why: If different devices (disks/tapes) are used for full/diff/incr
backups, software compression will be enabled for all backups
because of the FileSet compression option. For backup to tapes
which are able to do hardware compression this is not desired.
Notes:
http://news.gmane.org/gmane.comp.sysutils.backup.bacula.devel/cutoff=11124
It must be clear to the user that the FileSet compression option
must still be enabled to use compression for a backup job at all.
Thus a name for the new option in the director must be
well-defined.
Notes: KES I think the Storage definition should probably override what
is in the Job definition or vice-versa, but in any case, it must
be well defined.
Item 16: Include all conf files in specified directory
Date: 18 October 2008
Origin: Database, Lda. Maputo, Mozambique
Contact: Cameron Smith / cameron.ord@database.co.mz
Status: New request
What: A directive something like "IncludeConf = /etc/bacula/subconfs" Every
time Bacula Director restarts or reloads, it will walk the given
directory (non-recursively) and include the contents of any files
therein, as though they were appended to bacula-dir.conf
Why: Permits simplified and safer configuration for larger installations with
many client PCs. Currently, through judicious use of JobDefs and
similar directives, it is possible to reduce the client-specific part of
a configuration to a minimum. The client-specific directives can be
prepared according to a standard template and dropped into a known
directory. However it is still necessary to add a line to the "master"
(bacula-dir.conf) referencing each new file. This exposes the master to
unnecessary risk of accidental mistakes and makes automating the addition
of new client confs more difficult (it is easier to automate dropping a
file into a directory than rewriting an existing file). Kern has previously
made a convincing argument for NOT including Bacula's core configuration
in an RDBMS, but I believe that the present request is a reasonable
extension to the current "flat-file-based" configuration philosophy.
Notes: There is NO need for any special syntax to these files. They should
contain standard directives which are simply "inlined" to the parent
file as already happens when you explicitly reference an external file.
Notes: (kes) this can already be done with scripting
From: John Jorgensen <jorgnsn@lcd.uregina.ca>
The bacula-dir.conf at our site contains these lines:
#
# Include subfiles associated with configuration of clients.
# They define the bulk of the Clients, Jobs, and FileSets.
#
@|"sh -c 'for f in /etc/bacula/clientdefs/*.conf ; do echo @${f} ; done'"
and when we get a new client, we just put its configuration into
a new file called something like:
/etc/bacula/clientdefs/clientname.conf
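The same include lines could also be produced by a small Python script
instead of the shell loop. This is a minimal sketch: the function name
include_lines is hypothetical, the directory is the one from the example
above, and the script would presumably be run through the same @|"..."
mechanism, which is an assumption rather than something tested here.

```python
import glob
import os


def include_lines(conf_dir):
    """Return one '@/path/file.conf' line per .conf file, sorted by name."""
    pattern = os.path.join(conf_dir, "*.conf")
    return ["@" + path for path in sorted(glob.glob(pattern))]


if __name__ == "__main__":
    # Directory taken from the example above; adjust for your site.
    for line in include_lines("/etc/bacula/clientdefs"):
        print(line)
```

Sorting the file names keeps the include order stable between reloads,
and the walk is non-recursive, as the request asks.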
Item 17: Multiple threads in file daemon for the same job
Date: 27 November 2005
Origin: Ove Risberg (Ove.Risberg at octocode dot com)
Status:
What: I want the file daemon to start multiple threads for a backup
job so the fastest possible backup can be made.
The file daemon could parse the FileSet information and start
one thread for each File entry located on a separate
filesystem.
A configuration option in the job section should be used to
enable or disable this feature. The configuration option could
specify the maximum number of threads in the file daemon.
If the threads could spool the data to separate spool files
the restore process will not be much slower.
Why: Multiple concurrent backups of a large fileserver with many
disks and controllers will be much faster.
Item 18: Possibility to schedule Jobs on last Friday of the month
Origin: Carsten Menke <bootsy52 at gmx dot net>
Date: 02 March 2008
Status:
What: Currently if you want to run your monthly Backups on the last
Friday of each month, this is only possible with workarounds (e.g.
scripting), as some months have 4 Fridays and some have 5.
The same is true if you plan to run your yearly Backups on the
last Friday of the year. It would be nice to have the ability to
use the builtin scheduler for this.
Why: In many companies the last working day of the week is Friday (or
Saturday), so to get the most data of the month onto the monthly
tape, the employees are advised to insert the tape for the
monthly backups on the last Friday of the month.
Notes: To give this a complete functionality it would be nice if the
"first" and "last" Keywords could be implemented in the
scheduler, so it is also possible to run monthly backups on the
first Friday of the month and many things more. So if the syntax
would expand to this {first|last} {Month|Week|Day|Mo-Fri} of the
{Year|Month|Week} you would be able to run really flexible jobs.
To get a certain Job to run on the last Friday of the Month, for example,
one could then write
Run = pool=Monthly last Fri of the Month at 23:50
## Yearly Backup
Run = pool=Yearly last Fri of the Year at 23:50
## Certain Jobs the last Week of a Month
Run = pool=LastWeek last Week of the Month at 23:50
## Monthly Backup on the last day of the month
Run = pool=Monthly last Day of the Month at 23:50
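The date arithmetic behind a "last Fri of the Month" rule can be sketched as follows. This is only an illustration in Python, not scheduler code; the function names are hypothetical and nothing here reflects how the Bacula scheduler is actually implemented.

```python
import calendar
from datetime import date, timedelta


def last_weekday_of_month(year, month, weekday=calendar.FRIDAY):
    """Return the date of the last given weekday (default Friday) in a month."""
    # monthrange() returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, last_day)
    # Step back 0..6 days until we land on the requested weekday.
    offset = (d.weekday() - weekday) % 7
    return d - timedelta(days=offset)


def last_weekday_of_year(year, weekday=calendar.FRIDAY):
    """The last given weekday of a year is the last one in December."""
    return last_weekday_of_month(year, 12, weekday)
```

Because the function counts back from the end of the month, it does not matter whether the month has 4 or 5 Fridays.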
Item 19: Include timestamp of job launch in "stat clients" output
Origin: Mark Bergman <mark.bergman@uphs.upenn.edu>
Date: Tue Aug 22 17:13:39 EDT 2006
Status:
What: The "stat clients" command doesn't include any detail on when
the active backup jobs were launched.
Why: Including the timestamp would make it much easier to decide whether
a job is running properly.
Notes: It may be helpful to have the output from "stat clients" formatted
more like that from "stat dir" (and other commands), in a column
format. The per-client information that's currently shown (level,
client name, JobId, Volume, pool, device, Files, etc.) is good, but
somewhat hard to parse (both programmatically and visually),
particularly when there are many active clients.
Item 20: Cause daemons to use a specific IP address to source communications
Origin: Bill Moran <wmoran@collaborativefusion.com>
Date: 18 Dec 2006
Status: Done in 3.0.2
What: Cause Bacula daemons (dir, fd, sd) to always use the IP address
specified in the [DIR|FD|SD]Addr directive as the source IP
for initiating communication.
Why: On complex networks, as well as extremely secure networks, it's
not unusual to have multiple possible routes through the network.
Often, each of these routes is secured by different policies
(effectively, firewalls allow or deny different traffic depending
on the source address)
Unfortunately, it can sometimes be difficult or impossible to
represent this in a system routing table, as the result is
excessive subnetting that quickly exhausts available IP space.
The best available workaround is to provide multiple IPs to
a single machine that are all on the same subnet. In order
for this to work properly, applications must support the ability
to bind outgoing connections to a specified address, otherwise
the operating system will always choose the first IP that
matches the required route.
Notes: Many other programs support this. For example, the following
can be configured in BIND:
query-source address 10.0.0.1;
transfer-source 10.0.0.2;
Which means queries from this server will always come from
10.0.0.1 and zone transfers will always originate from
10.0.0.2.
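What "binding outgoing connections to a specified address" means at the socket level can be sketched in Python. The helper name connect_from is hypothetical; it only illustrates the general technique and is not Bacula code.

```python
import socket


def connect_from(source_ip, dest_host, dest_port, timeout=10.0):
    """Open a TCP connection whose local (source) address is source_ip.

    Port 0 in the source address lets the operating system pick an
    ephemeral source port while still forcing the source IP.
    """
    return socket.create_connection(
        (dest_host, dest_port),
        timeout=timeout,
        source_address=(source_ip, 0),
    )
```

Without the explicit source address, the operating system chooses the local IP according to its routing table, which is exactly the behaviour this item wants to be able to override.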
Item 21: Message mailing based on backup types
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 6, 2006
Status:
What: In the "Messages" resource definitions, allowing messages
to be mailed based on the type (backup, restore, etc.) and level
(full, differential, etc) of job that created the originating
message(s).
Why: It would, for example, allow someone's boss to be emailed
automatically only when a Full Backup job runs, so he can
retrieve the tapes for offsite storage, even if the IT dept.
doesn't (or can't) explicitly notify him. At the same time, his
mailbox wouldn't be filled by notifications of Verifies, Restores,
or Incremental/Differential Backups (which would likely be kept
onsite).
Notes: One way this could be done is through additional message types, for
example:
Messages {
# email the boss only on full system backups
Mail = boss@mycompany.com = full, !incremental, !differential, !restore,
!verify, !admin
# email us only when something breaks
MailOnError = itdept@mycompany.com = all
}
Notes: Kern: This should be rather trivial to implement.
Item 22: Ability to import/export Bacula database entities
Date: 26 April 2009
Origin: Eric
Status:
What: Create a Bacula ASCII SQL database independent format that permits
importing and exporting database catalog Job entities.
Why: For archival, database clustering, and transfer to other databases
of any SQL engine.
Notes: Job selection should be by Job, time, Volume, Client, Pool and possibly
other criteria.
Item 23: "Maximum Concurrent Jobs" for drives when used with changer device
Origin: Ralf Gross ralf-lists <at> ralfgross.de
Date: 2008-12-12
Status: Done in 3.0.3
What: respect the "Maximum Concurrent Jobs" directive in the _drives_
Storage section in addition to the changer section
Why: I have a 3 drive changer where I want to be able to let 3 concurrent
jobs run in parallel. But only one job per drive at the same time.
Right now I don't see how I could limit the number of concurrent jobs
per drive in this situation.
Notes: Using different priorities for these jobs leads to problems because other
jobs are blocked. On the user list I got the advice to use the
"Prefer Mounted Volumes" directive, but Kern advised against using
"Prefer Mounted Volumes" in another thread:
http://article.gmane.org/gmane.comp.sysutils.backup.bacula.devel/11876/
In addition I'm not sure if this would be the same as respecting the
drive's "Maximum Concurrent Jobs" setting.
Example:
Storage {
Name = Neo4100
Address = ....
SDPort = 9103
Password = "wiped"
Device = Neo4100
Media Type = LTO4
Autochanger = yes
Maximum Concurrent Jobs = 3
}
Storage {
Name = Neo4100-LTO4-D1
Address = ....
SDPort = 9103
Password = "wiped"
Device = ULTRIUM-TD4-D1
Media Type = LTO4
Maximum Concurrent Jobs = 1
}
[2 more drives]
The "Maximum Concurrent Jobs = 1" directive in the drive's section is
ignored.
Item 24: Implementation of running Job speed limit.
Origin: Alex F, alexxzell at yahoo dot com
Date: 29 January 2009
What: I noticed the need for an integrated bandwidth limiter for
running jobs. It would be very useful just to specify another
field in bacula-dir.conf, like speed = how much speed you wish
for that specific job to run at
Why: Because of a couple of reasons. First, it's very hard to implement a
traffic shaping utility and also make it reliable. Second, it is very
uncomfortable to have to implement these apps to, let's say 50 clients
(including desktops, servers). This would also be unreliable because you
have to make sure that the apps are properly working when needed; users
could also disable them (accidentally or not). It would be very useful
to provide Bacula this ability. All information would be centralized,
you would not have to go to 50 different clients in 10 different
locations for configuration; eliminating 3rd party additions help in
establishing efficiency. Would also avoid bandwidth congestion,
especially where there is little available.
Item 25: Add an override in Schedule for Pools based on backup types
Date: 19 Jan 2005
Origin: Chad Slater <chad.slater@clickfox.com>
Status:
What: Adding a FullStorage=BigTapeLibrary in the Schedule resource
would help those of us who use different storage devices for different
backup levels cope with the "auto-upgrade" of a backup.
Why: Assume I add several new devices to be backed up, i.e. several
hosts with 1TB RAID. To avoid tape switching hassles, incrementals are
stored in a disk set on a 2TB RAID. If you add these devices in the
middle of the month, the incrementals are upgraded to "full" backups,
but they try to use the same storage device as requested in the
incremental job, filling up the RAID holding the differentials. If we
could override the Storage parameter for full and/or differential
backups, then the Full job would use the proper Storage device, which
has more capacity (i.e. an 8TB tape library).
Item 26: Automatic promotion of backup levels based on backup size
Date: 19 January 2006
Origin: Adam Thornton <athornton@sinenomine.net>
Status:
What: Other backup programs have a feature whereby it estimates the space
that a differential, incremental, and full backup would take. If
the difference in space required between the scheduled level and the
next level up is beneath some user-defined critical threshold, the
backup level is bumped to the next type. Doing this minimizes the
number of volumes necessary during a restore, with a fairly minimal
cost in backup media space.
Why: I know at least one (quite sophisticated and smart) user for whom the
absence of this feature is a deal-breaker in terms of using Bacula;
if we had it it would eliminate the one cool thing other backup
programs can do and we can't (at least, the one cool thing I know
of).
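The promotion rule described under "What" can be sketched as a small decision function. This is a hedged illustration only: the level names, the estimates mapping and the function choose_level are assumptions made for the example, and the sketch bumps the level by at most one step, as the description suggests.

```python
# Backup levels ordered from smallest to largest (assumed ordering).
LEVELS = ["Incremental", "Differential", "Full"]


def choose_level(scheduled, estimates, threshold_bytes):
    """Return the level to run, promoting by one step when it is cheap.

    estimates maps each level name to its estimated size in bytes.
    If the next level up costs less than threshold_bytes more than the
    scheduled level, the next level is chosen instead.
    """
    index = LEVELS.index(scheduled)
    if index + 1 == len(LEVELS):
        return scheduled  # a Full backup cannot be promoted further
    next_level = LEVELS[index + 1]
    extra = estimates[next_level] - estimates[scheduled]
    return next_level if extra < threshold_bytes else scheduled
```

For example, with estimates of 10 GB (Incremental), 12 GB (Differential) and 100 GB (Full) and a 5 GB threshold, a scheduled Incremental would run as a Differential, while a scheduled Differential would stay a Differential.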
Item 27: Allow inclusion/exclusion of files in a fileset by creation/mod times
Origin: Evan Kaufman <evan.kaufman@gmail.com>
Date: January 11, 2006
Status:
What: In the vein of the Wild and Regex directives in a Fileset's
Options, it would be helpful to allow a user to include or exclude
files and directories by creation or modification times.
You could factor the Exclude=yes|no option in much the same way it
affects the Wild and Regex directives. For example, you could exclude
all files modified before a certain date:
Options {
Exclude = yes
Modified Before = ####
}
Or you could exclude all files created/modified since a certain date:
Options {
Exclude = yes
Created Modified Since = ####
}
The format of the time/date could be done several ways, say the number
of seconds since the epoch:
1137008553 = Jan 11 2006, 1:42:33PM # result of `date +%s`
Or a human readable date in a cryptic form:
20060111134233 = Jan 11 2006, 1:42:33PM # YYYYMMDDhhmmss
Why: I imagine a feature like this could have many uses. It would
allow a user to do a full backup while excluding the base operating
system files, so if I installed a Linux snapshot from a CD yesterday,
I'll *exclude* all files modified *before* today. If I need to
recover the system, I use the CD I already have, plus the tape backup.
Or if, say, a Windows client is hit by a particularly corrosive
virus, and I need to *exclude* any files created/modified *since* the
time of infection.
Notes: Of course, this feature would work in concert with other
in/exclude rules, and wouldn't override them (or each other).
Notes: The directives I'd imagine would be along the lines of
"[Created] [Modified] [Before|Since] = <date>".
So one could compare against 'ctime' and/or 'mtime', but ONLY 'before'
or 'since'.
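A minimal sketch of how such a time-based test might behave, written in Python for illustration. The function names are hypothetical, and the sketch assumes that the YYYYMMDDhhmmss form is interpreted as UTC; a real implementation would have to decide which time zone applies.

```python
from datetime import datetime, timezone


def parse_cutoff(value):
    """Parse either epoch seconds or a 14-digit YYYYMMDDhhmmss string.

    Assumption for this sketch: the 14-digit form is taken as UTC.
    """
    if len(value) == 14 and value.isdigit():
        parsed = datetime.strptime(value, "%Y%m%d%H%M%S")
        return parsed.replace(tzinfo=timezone.utc).timestamp()
    return float(value)


def matches(file_time, cutoff, mode):
    """Return True if file_time falls 'before' or 'since' the cutoff."""
    if mode == "before":
        return file_time < cutoff
    if mode == "since":
        return file_time >= cutoff
    raise ValueError("mode must be 'before' or 'since'")
```

With Exclude = yes, a file would be skipped whenever matches() returns True for its ctime and/or mtime, depending on which of Created and Modified were given.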
Item 28: Archival (removal) of User Files to Tape
Date: Nov. 24/2005
Origin: Ray Pengelly [ray at biomed dot queensu dot ca]
Status:
What: The ability to archive data to storage based on certain parameters
such as age, size, or location. Once the data has been written to
storage and logged it is then pruned from the originating
filesystem. Note! We are talking about user's files and not
Bacula Volumes.
Why: This would allow fully automatic storage management which becomes
useful for large datastores. It would also allow for auto-staging
from one media type to another.
Example 1) Medical imaging needs to store large amounts of data.
They decide to keep data on their servers for 6 months and then put
it away for long term storage. The server then finds all files
older than 6 months and writes them to tape. The files are then removed
from the server.
Example 2) All data that hasn't been accessed in 2 months could be
moved from high-cost, fibre-channel disk storage to a low-cost
large-capacity SATA disk storage pool which doesn't have as quick of
access time. Then after another 6 months (or possibly as one
storage pool gets full) data is migrated to Tape.
Item 29: An option to operate on all pools with update vol parameters
Origin: Dmitriy Pinchukov <absh@bossdev.kiev.ua>
Date: 16 August 2006
Status: Patch made by Nigel Stepp
What: When I do update -> Volume parameters -> All Volumes
from Pool, then I have to select pools one by one. I'd like
console to have an option like "0: All Pools" in the list of
defined pools.
Why: I have many pools and therefore unhappy with manually
updating each of them using update -> Volume parameters -> All
Volumes from Pool -> pool #.
Item 30: Automatic disabling of devices
Date: 2005-11-11
Origin: Peter Eriksson <peter at ifm.liu dot se>
Status:
What: After a configurable amount of fatal errors with a tape drive
Bacula should automatically disable further use of a certain
tape drive. There should also be "disable"/"enable" commands in
the "bconsole" tool.
Why: On a multi-drive jukebox there is a possibility of tape drives
going bad during large backups (needing a cleaning tape run,
tapes getting stuck). It would be advantageous if Bacula would
automatically disable further use of a problematic tape drive
after a configurable amount of errors has occurred.
An example: I have a multi-drive jukebox (6 drives, 380+ slots)
where tapes occasionally get stuck inside the drive. Bacula will
notice that the "mtx-changer" command will fail and then fail
any backup jobs trying to use that drive. However, it will still
keep on trying to run new jobs using that drive and fail -
forever, and thus failing lots and lots of jobs... Since we have
many drives Bacula could have just automatically disabled
further use of that drive and used one of the other ones
instead.
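The bookkeeping this item asks for can be sketched as follows. The class and method names are hypothetical and only model the idea of a configurable error threshold plus manual "disable"/"enable" commands; they do not correspond to existing Bacula code.

```python
class DriveMonitor:
    """Track fatal errors per drive and disable drives past a threshold."""

    def __init__(self, max_fatal_errors):
        self.max_fatal_errors = max_fatal_errors
        self.error_counts = {}
        self.disabled = set()

    def record_fatal_error(self, drive):
        """Count one fatal error; disable the drive at the threshold."""
        self.error_counts[drive] = self.error_counts.get(drive, 0) + 1
        if self.error_counts[drive] >= self.max_fatal_errors:
            self.disabled.add(drive)

    def disable(self, drive):
        """Manual 'disable' command."""
        self.disabled.add(drive)

    def enable(self, drive):
        """Manual 'enable' command; also resets the error counter."""
        self.disabled.discard(drive)
        self.error_counts[drive] = 0

    def usable_drives(self, drives):
        """Return the drives that new jobs may still be assigned to."""
        return [d for d in drives if d not in self.disabled]
```

With a threshold of 3, a drive with a stuck tape would be skipped after its third failed job, and new jobs would go to the remaining drives until an operator re-enables it.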
Item 31: List InChanger flag when doing restore.
Origin: Jesper Krogh<jesper@krogh.cc>
Date: 17 Oct 2008
Status: Done in version 3.0.2
What: When doing a restore the restore selection dialog ends by telling
stuff like this:
The job will require the following
Volume(s) Storage(s) SD Device(s)
===========================================================================
000741L3 LTO-4 LTO3
000866L3 LTO-4 LTO3
000765L3 LTO-4 LTO3
000764L3 LTO-4 LTO3
000756L3 LTO-4 LTO3
001759L3 LTO-4 LTO3
001763L3 LTO-4 LTO3
001762L3 LTO-4 LTO3
001767L3 LTO-4 LTO3
With an autochanger, it would be really nice to have an InChanger
column so the operator knows whether this restore job will stop and wait for
operator intervention. This is done just by selecting the InChanger flag
from the catalog and printing it in a separate column.
Why: This would help getting large restores through minimizing the
time spent waiting for operator to drop by and change tapes in the library.
Notes: [Kern] I think it would also be good to have the Slot as well,
or some indication that Bacula thinks the volume is in the autochanger
because it depends on both the InChanger flag and the Slot being
valid.
Item 32: Ability to defer Batch Insert to a later time
Date: 26 April 2009
Origin: Eric
Status:
What: Instead of doing a Job Batch Insert at the end of the Job,
which might create resource contention with lots of Jobs,
defer the insert to a later time.
Why: Permits to focus on getting the data on the Volume and
putting the metadata into the Catalog outside the backup
window.
Notes: Will use the proposed Bacula ASCII database import/export
format (i.e. dependent on the import/export entities project).
Item 33: Add MaxVolumeSize/MaxVolumeBytes statement to Storage resource
Origin: Bastian Friedrich <bastian.friedrich@collax.com>
Date: 2008-07-09
Status: -
What: SD has a "Maximum Volume Size" statement, which is deprecated and
superseded by the Pool resource statement "Maximum Volume Bytes".
It would be good if either statement could be used in Storage
resources.
Why: Pools do not have to be restricted to a single storage type/device;
thus, it may be impossible to define Maximum Volume Bytes in the
Pool resource. The old MaxVolSize statement is deprecated, as it
is SD side only. I am using the same pool for different devices.
Notes: State of idea currently unknown. Storage resources in the dir
config currently translate to very slim catalog entries; these
entries would require extensions to implement what is described
here. Quite possibly, numerous other statements that are currently
available in Pool resources could be used in Storage resources too
quite well.
Item 34: Enable persistent naming/number of SQL queries
Date: 24 Jan, 2007
Origin: Mark Bergman
Status:
What:
Change the parsing of the query.sql file and the query command so that
queries are named/numbered by a fixed value, not their order in the
file.
Why:
One of the real strengths of bacula is the ability to query the
database, and the fact that complex queries can be saved and
referenced from a file is very powerful. However, the choice
of query (both for interactive use, and by scripting input
to the bconsole command) is completely dependent on the order
within the query.sql file. The descriptive labels are helpful for
interactive use, but users become used to calling a particular
query "by number", or may use scripts to execute queries. This
presents a problem if the number or order of queries in the file
changes.
If the query.sql file used the numeric tags as a real value (rather
than a comment), then users could have a higher confidence that they
are executing the intended query, that their local changes wouldn't
conflict with future bacula upgrades.
For scripting, it's very important that the intended query is
what's actually executed. The current method of parsing the
query.sql file discourages scripting because the addition or
deletion of queries within the file will require corresponding
changes to scripts. It may not be obvious to users that deleting
query "17" in the query.sql file will require changing all
references to higher numbered queries. Similarly, when new
bacula distributions change the number of "official" queries,
user-developed queries cannot simply be appended to the file
without also changing any references to those queries in scripts
or procedural documentation, etc.
In addition, using fixed numbers for queries would encourage more
user-initiated development of queries, by supporting conventions
such as:
queries numbered 1-50 are supported/developed/distributed
with official bacula releases
queries numbered 100-200 are community contributed, and are
related to media management
queries numbered 201-300 are community contributed, and are
related to checksums, finding duplicated files across
different backups, etc.
queries numbered 301-400 are community contributed, and are
related to backup statistics (average file size, size per
client per backup level, time for all clients by backup level,
storage capacity by media type, etc.)
queries numbered 500-999 are locally created
Notes:
Alternatively, queries could be called by keyword (tag), rather
than by number.
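To illustrate the idea of fixed query numbers, here is a small Python sketch of a parser that keys queries by an explicit numeric tag instead of by position. The header format ":<number> <description>" is purely an assumption made for this example and is not the current query.sql syntax.

```python
import re

# Hypothetical header line: ":17 List Volumes likely to need replacement"
HEADER = re.compile(r"^:(\d+)\s+(.*)$")


def parse_queries(text):
    """Return {number: (description, sql)} keyed by the fixed tag."""
    queries = {}
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = int(match.group(1))
            if current in queries:
                raise ValueError("duplicate query number %d" % current)
            queries[current] = [match.group(2).strip(), []]
        elif current is not None and line.strip() and not line.startswith("#"):
            # Any non-empty, non-comment line belongs to the current query.
            queries[current][1].append(line)
    return {n: (desc, "\n".join(sql)) for n, (desc, sql) in queries.items()}
```

With such a scheme, deleting or reordering entries in the file would not change the number under which the remaining queries are found, and a duplicate number would be reported instead of silently shadowing another query.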
Item 35: Port bat to Win32
Date: 26 April 2009
Origin: Kern/Eric
Status:
What: Make bat run on Win32/64.
Why: To have GUI on Windows
Notes:
Item 36: Bacula Dir, FD and SD to support proxies
Origin: Karl Grindley @ MIT Lincoln Laboratory <kgrindley at ll dot mit dot edu>
Date: 25 March 2009
Status: proposed
What: Support alternate methods for nailing up a TCP session such
as SOCKS5, SOCKS4 and HTTP (CONNECT) proxies. Such a feature
would allow tunneling of bacula traffic in and out of proxied
networks.
Why: Currently, bacula is architected to only function on a flat network, with
no barriers or limitations. Due to the large configuration states of
any network and the infinite configuration where file daemons and
storage daemons may sit in relation to one another, bacula often is
not usable on a network where filtered or air-gapped networks exist.
While solutions such as ACL modifications to firewalls or port
redirection via SNAT or DNAT will often solve the issue, in many cases
these solutions are not adequate or not allowed by hard policy.
In an air-gapped network where only highly locked-down proxy services
are provided (SOCKS4/5 and/or HTTP and/or SSH outbound), ACLs or
iptables rules will not work.
Notes: Director resource tunneling: This configuration option to utilize a
proxy to connect to a client should be specified in the client
resource.
Client resource tunneling: should this be configured in the client
resource in the director config file? Or configured in the bacula-fd
configuration file on the fd host itself? If the latter, this would
allow only certain clients to use a proxy, where others do not when
establishing the TCP connection to the storage server.
Also worth noting, there are other 3rd party, light weight apps that
could be utilized to bootstrap this. Instead of socksifying bacula
itself, use an external program to broker proxy authentication, and
connection to the remote host. OpenSSH does this by using the
"ProxyCommand" syntax in the client configuration and uses stdin and
stdout to the command. Connect.c is a very popular one.
(http://bent.latency.net/bent/darcs/goto-san-connect-1.85/src/connect.html).
One could also possibly use stunnel, netcat, etc.
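For the HTTP (CONNECT) case, the handshake a daemon would perform against the proxy before speaking its own protocol can be sketched as below. This is a minimal Python illustration with hypothetical function names; it shows only the request and the status check, not authentication or error handling.

```python
def build_connect_request(host, port):
    """Build the HTTP CONNECT request sent to the proxy."""
    target = "%s:%d" % (host, port)
    request = "CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (target, target)
    return request.encode("ascii")


def tunnel_established(status_line):
    """Return True if the proxy answered with a 2xx status."""
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        return False
    return parts[1].isdigit() and 200 <= int(parts[1]) < 300
```

After a 2xx reply, the same TCP socket carries the tunneled traffic unchanged, so the daemons' own protocol would not need to know about the proxy.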
Item 37: Add Minimum Spool Size directive
Date: 20 March 2008
Origin: Frank Sweetser <fs@wpi.edu>
What: Add a new SD directive, "minimum spool size" (or similar). This
directive would specify a minimum level of free space available for
spooling. If the unused spool space is less than this level, any
new spooling requests would be blocked as if the "maximum spool
size" threshold had been reached. Already spooling jobs would be
unaffected by this directive.
Why: I've been bitten by this scenario a couple of times:
Assume a maximum spool size of 100M. Two concurrent jobs, A and B,
are both running. Due to timing quirks and previously running jobs,
job A has used 99.9M of space in the spool directory. While A is
busy despooling to disk, B is happily using the remaining 0.1M of
spool space. This ends up in a spool/despool sequence every 0.1M of
data. In addition to fragmenting the data on the volume far more
than was necessary, in larger data sets (ie, tens or hundreds of
gigabytes) it can easily produce multi-megabyte report emails!
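The admission check proposed here can be written down in a few lines. This Python sketch uses hypothetical names and byte counts; it only illustrates the rule that a new job may start spooling when at least the minimum amount of spool space is still free.

```python
def may_start_spooling(max_spool_size, spool_in_use, min_spool_size):
    """Return True if a new job is allowed to begin spooling.

    All arguments are byte counts. Jobs that are already spooling are
    not affected; this check applies only to new spooling requests.
    """
    free = max_spool_size - spool_in_use
    return free >= min_spool_size
```

In the scenario above, with a 100M maximum, 99.9M in use by job A and a minimum of, say, 10M, job B would be held back instead of spooling and despooling in 0.1M steps.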
Item 38: Backup and Restore of Windows Encrypted Files using Win raw encryption
Origin: Michael Mohr, SAG Mohr.External@infineon.com
Date: 22 February 2008
Origin: Alex Ehrlich (Alex.Ehrlich-at-mail.ee)
Date: 05 August 2008
Status:
What: Make it possible to back up and restore Encrypted Files from and to
Windows systems without the need to decrypt them by using the raw
encryption functions API (see:
http://msdn2.microsoft.com/en-us/library/aa363783.aspx)
that is provided for that reason by Microsoft.
Whether a file is encrypted can be determined by evaluating the
FILE_ATTRIBUTE_ENCRYPTED flag returned by the GetFileAttributes
function.
For each file backed up or restored by FD on Windows, check if
the file is encrypted; if so then use OpenEncryptedFileRaw,
ReadEncryptedFileRaw, WriteEncryptedFileRaw,
CloseEncryptedFileRaw instead of BackupRead and BackupWrite
API calls.
Why: Without the usage of this interface the fd-daemon running
under the system account can't read encrypted Files because
it lacks the key needed for decryption. As a result,
encrypted files are currently not backed up
by bacula, and no error is shown for the files that are missed.
Notes: Using xxxEncryptedFileRaw API would allow to backup and
restore EFS-encrypted files without decrypting their data.
Note that such files cannot be restored "portably" (at least,
easily) but they would be restorable to a different (or
reinstalled) Win32 machine; the restore would require setup
of a EFS recovery agent in advance, of course, and this shall
be clearly reflected in the documentation, but this is the
normal Windows SysAdmin's business.
When "portable" backup is requested the EFS-encrypted files
shall be clearly reported as errors.
See MSDN on the "Backup and Restore of Encrypted Files" topic:
http://msdn.microsoft.com/en-us/library/aa363783.aspx
Maybe the EFS support requires a new flag in the database for
each file, too?
Unfortunately, the implementation is not as straightforward as
1-to-1 replacement of BackupRead with ReadEncryptedFileRaw,
requiring some FD code rewrite to work with
encrypted-file-related callback functions.
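The per-file decision described above can be sketched as follows. This Python fragment only models the selection logic: the function names are hypothetical, the attribute value would come from GetFileAttributes on Windows, and the returned strings merely name the API family that the FD would use.

```python
# Value of the Windows FILE_ATTRIBUTE_ENCRYPTED attribute bit.
FILE_ATTRIBUTE_ENCRYPTED = 0x4000


def is_efs_encrypted(attributes):
    """Return True if the attribute bitmask marks the file as encrypted."""
    return bool(attributes & FILE_ATTRIBUTE_ENCRYPTED)


def choose_read_api(attributes, portable=False):
    """Pick the read API for a file, or fail for portable backups."""
    if is_efs_encrypted(attributes):
        if portable:
            # Portable backups cannot carry raw EFS data; report an error.
            raise ValueError("EFS-encrypted file in a portable backup")
        return "ReadEncryptedFileRaw"
    return "BackupRead"
```

The restore side would make the mirror-image choice between WriteEncryptedFileRaw and BackupWrite.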
Item 39: Implement an interface between Bacula and Storage clouds like Amazon's S3.
Date: 25 August 2008
Origin: Soren Hansen <soren@ubuntu.com>
Status: Not started.
What: Enable the storage daemon to store backup data on Amazon's
S3 service.
Why: Amazon's S3 is a cheap way to store data off-site.
Notes: If we configure the Pool to put only one job per volume (they don't
support append operation), and the volume size isn't too big (100MB?),
it should be easy to adapt the disk-changer script to add get/put
procedure with curl. So, the data would be safely copied during the
Job.
Cloud should be only used with Copy jobs, users should always have
a copy of their data on their site.
We should also think about having our own cache, always trying to keep
the cloud volume on the local disk. (I don't know if users want to store
100GB on cloud, so it shouldn't be a disk size problem). For example,
if bacula wants to recycle a volume, it will start by downloading the
file only to truncate it a few seconds later; it would be good to avoid that.
Item 40: Convert Bacula existing tray monitor on Windows to a stand alone program
Date: 26 April 2009
Origin: Kern/Eric
Status:
What: Separate Win32 tray monitor to be a separate program.
Why: Vista does not allow SYSTEM services to interact with the
desktop, so the current tray monitor does not work on Vista
machines.
Notes: Requires communicating with the FD via the network (simulate
a console connection).
========= End items voted on May 2009 ==================
========= New items after last vote ====================
Item 1: Relabel disk volume after recycling
Origin: Pasi Kärkkäinen <pasik@iki.fi>
Date: 07 May 2009.
Status: Not implemented yet, no code written.
What: The ability to relabel the disk volume (and thus rename the file on the
disk) after it has been recycled. Useful when you have a single job
per disk volume, and you use a custom Label format, for example:
Label Format =
"${Client}-${Level}-${NumVols:p/4/0/r}-${Year}_${Month}_${Day}-${Hour}_${Minute}"
Why: Disk volumes in Bacula get the label/filename when they are used for the
first time. If you use recycling and custom label format like above,
the disk volume name doesn't match the contents after it has been
recycled. This feature makes it possible to keep the label/filename
in sync with the content and thus makes it easy to check/monitor the
backups from the shell and/or normal file management tools, because
the filenames of the disk volumes match the content.
Notes: The configuration option could be "Relabel after Recycling = Yes".
Item n: Command that releases all drives in an autochanger
Origin: Blake Dunlap (blake@nxs.net)
Date: 10/07/2009
Status: Request
What: It would be nice if there was a release command that
would release all drives in an autochanger instead of having to
do each one in turn.
Why: It can take some time for a release to occur, and the
commands must be given for each drive in turn, which can quickly
add up if there are several drives in the library. (Having to
watch the console to give each command can waste a good bit of
time when you start getting into the 16 drive range, when the
tapes can take up to 3 minutes each to eject.)
Notes: Due to the way some autochangers/libraries work, you
cannot assume that new tapes inserted will go into slots that are
not currently believed to be in use by bacula (the tape from that
slot is in a drive). This would make any changes in
configuration quicker/easier, as all drives need to be released
before any modifications to slots.
Item n: Run bscan on a remote storage daemon from within bconsole.
Date: 07 October 2009
Origin: Graham Keeling <graham@equiinet.com>
Status: Proposing
What: The ability to be able to run bscan on a remote storage daemon from
within bconsole in order to populate your catalog.
Why: Currently, it seems you have to:
a) log in to a console on the remote machine
b) figure out where the storage daemon config file is
c) figure out the storage device from the config file
d) figure out the catalog IP address
e) figure out the catalog port
f) open the port on the catalog firewall
g) configure the catalog database to accept connections from the
remote host
h) build a 'bscan' command from (b)-(e) above and run it
It would be much nicer to be able to type something like this into
bconsole:
*bscan storage=<storage> device=<device> volume=<volume>
or something like:
*bscan storage=<storage> all
It seems to me that the scan could also do a better job than the
external bscan program currently does. It would possibly be able to
deduce some extra details, such as the catalog StorageId for the
volumes.
Notes: (Kern). If you need to do a bscan, you have done something wrong,
so this functionality should not need to be integrated into
the Storage daemon. However, I am not opposed to someone implementing
this feature providing that all the code is in a shared object (or dll)
and does not add significantly to the size of the Storage daemon. In
addition, the code should be written in a way such that the same source
code is used in both the bscan program and the Storage daemon to avoid
adding a lot of new code that must be maintained by the project.
Item n: Implement a Migration job type that will create a reverse
incremental (or decremental) backup from two existing full backups.
Date: 05 October 2009
Origin: Griffith College Dublin. Some sponsorship available.
Contact: Gavin McCullagh <gavin.mccullagh@gcd.ie>
Status:
What: The ability to take two full backup jobs and derive a reverse
incremental backup from them. The older full backup data may then
be discarded.
Why: Long-term backups based on keeping full backups can be expensive in
media. In many cases (eg a NAS), as the client accumulates files
over months and years, the same file will be duplicated unchanged,
across many media and datasets. Eg, Less than 10% (and
shrinking) of our monthly full mail server backup is new files,
the other 90% is also in the previous full backup.
Regularly converting the oldest full backup into a reverse
incremental backup allows the admin to keep access to old backup
jobs, but remove all of the duplicated files, freeing up media.
Notes: This feature was previously discussed on the bacula-devel list
here: http://www.mail-archive.com/bacula-devel@lists.sourceforge.net/msg04962.html
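The core of deriving a reverse incremental from two full backups can be sketched with plain dictionaries. This is an illustrative Python model under stated assumptions: each full backup is represented as a mapping from file path to checksum, and the function name is hypothetical.

```python
def reverse_incremental(older_full, newer_full):
    """Derive what must be kept from the older full backup.

    Returns (kept, absent):
      kept   - files from the older backup whose content is not
               identical in the newer backup (changed or since deleted)
      absent - paths that exist only in the newer backup and must be
               removed again to reconstruct the older state
    """
    kept = {
        path: checksum
        for path, checksum in older_full.items()
        if newer_full.get(path) != checksum
    }
    absent = sorted(set(newer_full) - set(older_full))
    return kept, absent
```

Restoring the older state would then mean restoring the newer full backup, overlaying the kept files and removing the absent paths. Everything in the older full backup that is not in kept is duplicated in the newer one and could be discarded.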
Item n: Job migration between different SDs
Origin: Mariusz Czulada <manieq AT wp DOT eu>
Date: 07 May 2007
Status: NEW
What: Allow specifying, in a migration job, devices on a Storage Daemon other than
the one used for the migrated jobs (possibly on a different/distant host)
Why: Sometimes we have more than one system which requires backup
implementation. Often, these systems are functionally unrelated and
placed in different locations. Having a big backup device (a tape
library) in each location is not cost-effective. It would be much
better to have one powerful enough tape library which could handle
backups from all systems, assuming relatively fast and reliable WAN
connections. In such architecture backups are done in service windows
on local bacula servers, then migrated to central storage off the peak
hours.
Notes: If migration to different SD is working, migration to the same SD, as
now, could be done the same way (i mean 'localhost') to unify the
whole process
Item n: Concurrent spooling and despooling within a single job.
Date: 17 nov 2009
Origin: Jesper Krogh <jesper@krogh.cc>
Status: NEW
What: When a job has spooling enabled and the spool area size is
less than the total volumes size the storage daemon will:
1) Spool to spool-area
2) Despool to tape
3) Go to 1 if more data to be backed up.
Typical disks will serve data at a speed of 100MB/s when
dealing with large files; the network is typically capable of 115MB/s
(GbitE). Tape drives will despool at 50-90MB/s (LTO3) or 70-120MB/s
(LTO4) depending on compression and data.
As bacula currently works it'll hold back data from the client until
de-spooling is done, no matter if the spool area can handle another
block of data. Say given a FileSet of 4TB and a spool-area of 100GB and
a Maximum Job Spool Size set to 50GB then above sequence could be
changed to allow to spool to the other 50GB while despooling the first
50GB and not holding back the client while doing it. As above numbers
show, depending on tape-drive and disk-arrays this potentially leads to
a cut of the backup-time of 50% for the individual jobs.
Real-world example, backing up 112.6GB (large files) to LTO4 tapes
(despools with ~75MB/s, data is gzipped on the remote filesystem).
Maximum Job Spool Size = 8GB
Current:
Size: 112.6GB
Elapsed time (total time): 46m 15s => 2775s
Despooling time: 25m 41s => 1541s (55%)
Spooling time: 20m 34s => 1234s (45%)
Reported speed: 40.58MB/s
Spooling speed: 112.6GB/1234s => 91.25MB/s
Despooling speed: 112.6GB/1541s => 73.07MB/s
So disk + net can "keep up" with the LTO4 drive (in this test).
The proposed change would effectively make the backup run in the
"despooling time" of 1541s, reducing the total run time to 55% of its
current value.
In the situation where an individual job cannot keep up with the LTO
drive, spooling enables efficient multiplexing of multiple concurrent
jobs onto the same drive.
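The timing argument above can be checked with a small model. The
following is only a sketch, not Bacula code: the function names and the
two-slot spool layout are assumptions made for illustration, and the
rates are the ones measured in the example (91.25MB/s spooling,
73.07MB/s despooling, 8GB chunks, 1GB taken as 1000MB).

```python
def sequential_seconds(total_mb, spool_mb_s, despool_mb_s):
    """Current behaviour: the client waits while each chunk is despooled,
    so spooling and despooling times simply add up."""
    return total_mb / spool_mb_s + total_mb / despool_mb_s


def overlapped_seconds(total_mb, chunk_mb, spool_mb_s, despool_mb_s, slots=2):
    """Hypothetical proposed behaviour: the spool area is divided into
    `slots` chunks, and one chunk is spooled while another is despooled."""
    # Split the job into chunks of at most chunk_mb.
    chunks = []
    remaining = total_mb
    while remaining > 0:
        size = min(chunk_mb, remaining)
        chunks.append(size)
        remaining -= size

    spool_done = []    # time at which chunk i is fully spooled
    despool_done = []  # time at which chunk i is fully on tape
    for i, size in enumerate(chunks):
        # Spooling is serial, and needs a free slot in the spool area.
        start = spool_done[i - 1] if i > 0 else 0.0
        if i >= slots:
            start = max(start, despool_done[i - slots])
        spool_done.append(start + size / spool_mb_s)

        # Despooling is serial, and needs the chunk to be fully spooled.
        dstart = spool_done[i]
        if i > 0:
            dstart = max(dstart, despool_done[i - 1])
        despool_done.append(dstart + size / despool_mb_s)
    return despool_done[-1]


if __name__ == "__main__":
    total, chunk = 112600, 8000  # MB
    print(round(sequential_seconds(total, 91.25, 73.07)))
    print(round(overlapped_seconds(total, chunk, 91.25, 73.07)))
```

Under these assumptions the sequential figure reproduces the measured
2775s, while the overlapped figure comes out at about 1630s. That is
slightly above the 1541s despooling time, because the first chunk has to
be spooled before despooling can start.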
Why: When dealing with larger volumes, it is important to maximize the
general utilization of the network/disk in order to be able to run a
full backup over a weekend. The current work-around is to split the
FileSet into smaller FileSets and Jobs, but that leads to more
configuration management and is harder to review for completeness. It
also makes restores more complex.
Item 1: Extend the verify code to make it possible to verify
older jobs, not only the last one that has finished
Date: 10 April 2009
Origin: Ralf Gross (Ralf-Lists <at> ralfgross.de)
Status: not implemented or documented
What: At the moment a VolumeToCatalog job compares only the
last job with the data in the catalog. It's not possible
to compare the data (md5sums) of an older volume with the
data in the catalog.
Why: If a verify job fails, one has to immediately check the
source of the problem, fix it and rerun the verify job.
This has to happen before the next backup of the
verified backup job starts.
More important: It's not possible to check jobs that are
kept for a long time (archive). If a jobid could be
specified for a verify job, older backups/tapes could be
checked on a regular basis.
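The difference between the current and the requested selection rule can
be sketched as follows. This is an illustration only: the function name
select_job_to_verify, the dictionary keys and the sample records are
hypothetical and do not reflect Bacula's catalog schema or its code.

```python
def select_job_to_verify(jobs, job_name, jobid=None):
    """Pick the backup job that a verify job should check.

    Without jobid this mimics the current rule (the last job with that
    name that ran); with jobid it models the requested extension.
    """
    candidates = [j for j in jobs if j["name"] == job_name]
    if not candidates:
        raise LookupError("no job named %r" % job_name)
    if jobid is not None:
        for job in candidates:
            if job["jobid"] == jobid:
                return job
        raise LookupError("job %r has no jobid %r" % (job_name, jobid))
    # Timestamps in 'YYYY-MM-DD HH:MM:SS' form sort correctly as strings.
    return max(candidates, key=lambda j: j["end_time"])


# Hypothetical catalog records, for illustration only.
jobs = [
    {"jobid": 10, "name": "ServerXXX-Vol1", "end_time": "2009-03-01 02:10:00"},
    {"jobid": 12, "name": "ServerXXX-Vol1", "end_time": "2009-04-18 02:15:00"},
    {"jobid": 11, "name": "ServerYYY-Vol1", "end_time": "2009-04-19 02:05:00"},
]
print(select_job_to_verify(jobs, "ServerXXX-Vol1")["jobid"])
print(select_job_to_verify(jobs, "ServerXXX-Vol1", jobid=10)["jobid"])
```

With an explicit jobid, an archived backup such as job 10 in the sample
data could be re-verified long after newer jobs of the same name have
run.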
Notes: verify documentation:
VolumeToCatalog: This level causes Bacula to read the file
attribute data written to the Volume from the last Job [...]
Verify Job = <Job-Resource-Name> If you run a verify job
without this directive, the last job run will be compared
with the catalog, which means that you must immediately
follow a backup by a verify command. If you specify a Verify
Job Bacula will find the last job with that name that ran [...]
example bconsole verify dialog:
Run Verify job
JobName: VerifyServerXXX
Level: VolumeToCatalog
Client: ServerXXX-fd
FileSet: ServerXXX-Vol1
Pool: Full (From Job resource)
Storage: Neo4100 (From Pool resource)
Verify Job: ServerXXX-Vol1
Verify List:
When: 2009-04-20 09:03:04
Priority: 10
OK to run? (yes/mod/no): m
Parameters to modify:
1: Level
2: Storage
3: Job
4: FileSet
5: Client
6: When
7: Priority
8: Pool
9: Verify Job
Item n: Separate "Storage" and "Device" in the bacula-dir.conf
Date: 29 April 2009
Origin: "James Harper" <james.harper@bendigoit.com.au>
Status: not implemented or documented
What: Separate "Storage" and "Device" in the bacula-dir.conf
The resulting config would look something like:
Storage {
  Name = name_of_server
  Address = hostname/IP address
  SDPort = 9103
  Password = shh_its_a_secret
  Maximum Concurrent Jobs = 7
}
Device {
  Name = name_of_device
  Storage = name_of_server
  Device = name_of_device_on_sd
  Media Type = media_type
  Maximum Concurrent Jobs = 1
}
Maximum Concurrent Jobs would be specified with a server and a device
maximum, which would both be honoured by the director. Almost everything
that mentions a 'Storage' would need to be changed to 'Device', although
perhaps a 'Storage' would just be a synonym for 'Device' for backwards
compatibility...
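How the director might honour both maxima can be sketched as below.
This is a minimal model under stated assumptions, not the director's
implementation: the Storage and Device classes, their fields and
try_start_job are hypothetical names chosen to mirror the proposed
resources.

```python
from dataclasses import dataclass


@dataclass
class Storage:
    """Models the proposed Storage resource (one storage daemon)."""
    name: str
    max_concurrent_jobs: int
    running: int = 0


@dataclass
class Device:
    """Models the proposed Device resource, attached to one Storage."""
    name: str
    storage: Storage
    max_concurrent_jobs: int
    running: int = 0


def try_start_job(device):
    """Start a job only if both the device and its server have capacity."""
    storage = device.storage
    if device.running >= device.max_concurrent_jobs:
        return False  # device limit reached
    if storage.running >= storage.max_concurrent_jobs:
        return False  # server-wide limit reached
    device.running += 1
    storage.running += 1
    return True


if __name__ == "__main__":
    sd = Storage("name_of_server", max_concurrent_jobs=2)
    drives = [Device("drive%d" % i, sd, max_concurrent_jobs=1) for i in range(3)]
    # The third drive is refused because the server limit of 2 is reached.
    print([try_start_job(d) for d in drives])
```

Keeping one counter per server and one per device lets a single check
cover both limits, whichever is reached first.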
Why: If you have multiple Storage definitions pointing to different
Devices in the same Storage daemon, the "status storage" command
prompts for each different device, but they all give the same
information.
Notes:
========= Add new items above this line =================
============= Empty Feature Request form ===========
Item n: One line summary ...
Date: Date submitted
Origin: Name and email of originator.
Status:
What: More detailed explanation ...
Why: Why it is important ...
Notes: Additional notes or features (omit if not used)
============== End Feature Request form ==============
========== Items put on hold by Kern ============================