<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>debdelta-upgrade service</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REL="HOME"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="a delta"
HREF="x54.html"><LINK
REL="NEXT"
TITLE="Goals, tricks, ideas and issues"
HREF="x182.html"></HEAD
><BODY
CLASS="section"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
></TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="x54.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="x182.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="section"
><H1
CLASS="section"
><A
NAME="AEN65"
>3. debdelta-upgrade service</A
></H1
><P
>In June 2006 I set up a delta-upgrading framework, so that people
may upgrade their Debian box using <B
CLASS="command"
>debdelta-upgrade</B
> (which downloads
package 'deltas').
This section introduces the framework behind
'debdelta-upgrade', which is also used by 'cupt'.
In the following, I will simplify (in places, quite a lot).
</P
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN69"
>3.1. The framework</A
></H2
><P
> The framework is organized as follows: I maintain some servers where I run the
program 'debdeltas' to create all the deltas, whereas end users run the
client 'debdelta-upgrade' to download the deltas and apply them to
produce the debs needed to upgrade their boxes.
On my server, I mirror some repositories, and then I invoke
'debdeltas' to generate the deltas. I use the
scripts <TT
CLASS="filename"
>/usr/share/debdelta/debmirror-delta-security</TT
>
and <TT
CLASS="filename"
>/usr/share/debdelta/debmirror-marshal-deltas</TT
> for this.
This generates any delta that may be needed for upgrades
in squeeze, squeeze-security, wheezy, sid and experimental,
for the architectures i386 and amd64 (as of Mar 2011); the generated repository of deltas takes
roughly 10GB.
</P
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN74"
>3.2. The goals</A
></H2
><P
>There are two ultimate goals in designing this framework:
<P
></P
><OL
TYPE="1"
><LI
><P
> SMALL) reduce the size of downloads
(useful for people who pay per megabyte);
</P
></LI
><LI
><P
> FAST) speed up the upgrade.
</P
></LI
></OL
>
Unfortunately, the two goals are only marginally compatible. For
example, bsdiff can produce very small deltas, but it is quite slow (in
particular with very large files); so currently (from 2009 on) I use 'xdelta3'
as the backend diffing tool for 'debdeltas' on my server.
Another example is debs that contain archives (.gz, tar.gz,
etc.): I have methods and code to peek inside them, so
the deltas become smaller, but applying them gets slower.
</P
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN82"
>3.3. The repository structure</A
></H2
><P
> The repository of deltas is just an HTTP archive, laid out like the pool of packages; that is, if
<TT
CLASS="filename"
>foobar_1_all.deb</TT
> is stored in
<TT
CLASS="filename"
>pool/main/f/foobar/</TT
> in the repository of debs, then the
delta to upgrade it to version 2 is stored in <TT
CLASS="filename"
>pool/main/f/foobar/foobar_1_2_all.debdelta</TT
>
in the repository of deltas. Unlike the repository of debs, a repository of deltas
has no indexes; see <A
HREF="x65.html#no_indexes"
>Section 3.7.2</A
>. The delta repository is at
<TT
CLASS="filename"
>http://debdeltas.debian.net/debian-deltas</TT
>.
</P
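><P
>As an illustration of this layout, here is a minimal Python sketch that derives the path of a delta from the path of the old deb and the new version. The function name delta_path is hypothetical (it is not part of debdelta), and the sketch assumes that versions are written in the delta filename exactly as they are written in the deb filename.</P
><PRE
CLASS="programlisting"
>import posixpath

def delta_path(old_deb_path, new_version):
    """Return the pool path of the delta that upgrades old_deb_path to new_version."""
    directory, filename = posixpath.split(old_deb_path)
    # A deb filename has the form name_version_architecture.deb
    stem = filename[:-len(".deb")]
    name, old_version, arch = stem.split("_")
    delta_name = "%s_%s_%s_%s.debdelta" % (name, old_version, new_version, arch)
    return posixpath.join(directory, delta_name)

print(delta_path("pool/main/f/foobar/foobar_1_all.deb", "2"))
</PRE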
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="delta_creation"
>3.4. The repository creation</A
></H2
><P
> Suppose that the unstable archive, on 1st Mar, contains
<TT
CLASS="filename"
>foobar_1_all.deb</TT
> (and it is in
<TT
CLASS="filename"
>pool/main/f/foobar/</TT
>); then on 2nd Mar,
<TT
CLASS="filename"
>foobar_2_all.deb</TT
> is uploaded, but this
version has a flaw (e.g. FTBFS), and so on 3rd Mar
<TT
CLASS="filename"
>foobar_3_all.deb</TT
> is uploaded.
On 2nd Mar, the delta server generates
<TT
CLASS="filename"
>pool/main/f/foobar/foobar_1_2_all.debdelta</TT
>.
On 3rd Mar, the server generates both
<TT
CLASS="filename"
>pool/main/f/foobar/foobar_1_3_all.debdelta</TT
> and
<TT
CLASS="filename"
>pool/main/f/foobar/foobar_2_3_all.debdelta</TT
>.
So, if the end user Ann upgrades her system on both 2nd and 3rd Mar,
then she uses both <TT
CLASS="filename"
>foobar_1_2_all.debdelta</TT
> (on 2nd Mar) and
<TT
CLASS="filename"
>foobar_2_3_all.debdelta</TT
> (on 3rd Mar). If the end user Boe does not
upgrade his system on 2nd Mar, and upgrades on 3rd Mar, then on
3rd Mar he uses <TT
CLASS="filename"
>foobar_1_3_all.debdelta</TT
>.
</P
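><P
>The example above can be replayed with a short Python sketch. It assumes that, whenever a new version is uploaded, the server still holds every older version and generates one delta from each of them; the function name deltas_for_upload is hypothetical, and the real 'debdeltas' does much more than this.</P
><PRE
CLASS="programlisting"
>def deltas_for_upload(name, arch, old_versions, new_version):
    """List the delta filenames generated when new_version is uploaded."""
    return ["%s_%s_%s_%s.debdelta" % (name, old, new_version, arch)
            for old in old_versions]

history = []  # versions of foobar seen so far in the archive
for day, version in [("1st Mar", "1"), ("2nd Mar", "2"), ("3rd Mar", "3")]:
    print(day, deltas_for_upload("foobar", "all", history, version))
    history.append(version)
</PRE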
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN102"
>3.5. size limit</A
></H2
><P
> Note that currently the server rejects deltas that exceed 70% of the deb
size: the size gain would be too small, and no time would be
saved once you sum the time to download the delta and the time to apply
it (granted, these two steps run in parallel as much as possible, yet...).
</P
><P
> Also, the server does not generate deltas for packages that are smaller than 10KB.
</P
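><P
>The two rules above can be sketched as a small Python predicate. The function name is hypothetical, and reading "10KB" as 10 * 1024 bytes is an assumption; the real server may count differently.</P
><PRE
CLASS="programlisting"
>MAX_PERCENT = 70           # deltas above 70% of the deb size are rejected
MIN_DEB_SIZE = 10 * 1024   # assumption: "10KB" means 10 * 1024 bytes

def delta_is_worthwhile(deb_size, delta_size):
    """Tell whether a delta of delta_size bytes is kept for a deb of deb_size bytes."""
    if MIN_DEB_SIZE > deb_size:
        return False  # the package is too small: no delta is generated
    # Integer arithmetic avoids rounding surprises at the 70% boundary.
    return MAX_PERCENT * deb_size >= 100 * delta_size

print(delta_is_worthwhile(1000000, 400000))
print(delta_is_worthwhile(1000000, 800000))
print(delta_is_worthwhile(5000, 1000))
</PRE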
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN106"
>3.6. /etc/debdelta/sources.conf</A
></H2
><P
> Consider a package that is currently installed. It is characterized by
<SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
> name installed_version architecture</I
></SPAN
>
(unfortunately there is no way to tell which archive it came
from, but this does not seem to be a problem currently).
Suppose now that a newer version is available somewhere in an archive,
and that the user wishes to upgrade to that version.
The archive Release file contains this information:
<SPAN
CLASS="QUOTE"
>"Origin, Label, Site, Archive"</SPAN
>.
(Note that Archive is called Suite in the Release file.)
Example for the security archive:
<PRE
CLASS="programlisting"
> Origin=Debian
Label=Debian-Security
Archive=stable
Site=security.debian.org
</PRE
>
The file <TT
CLASS="filename"
>/etc/debdelta/sources.conf</TT
>, given the above information, determines
the host that should contain the delta for upgrading the package. This
information is called "delta_uri" in that file.
The complete URL for the delta is built by appending to the delta_uri a
directory path that mimics the "pool" structure used in Debian
archives, followed by a filename of the form
<TT
CLASS="filename"
>name_oldversion_newversion_architecture.debdelta</TT
>.
All this is implemented in the example script contrib/findurl.py.
If the delta is not available at that URL, but
<TT
CLASS="filename"
>name_oldversion_newversion_architecture.debdelta-too-big</TT
>
is available, then the delta is too big to be useful.
If neither is present, then either the delta has not yet been
generated, or it will never be generated; it is difficult to
know which.
</P
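><P
>As a rough illustration of how a delta_uri may be selected, the following Python sketch matches the Release information against the sections of a configuration text. It assumes an INI-like layout in which each section lists some Release keys together with a delta_uri; the sample text, the second URI (deltas.example.org) and the function name find_delta_uri are invented for illustration, and the actual logic is the one in debdelta and in contrib/findurl.py.</P
><PRE
CLASS="programlisting"
>import configparser

# Invented sample; only the first delta_uri appears in this document.
SAMPLE = """
[main archive]
Origin=Debian
Label=Debian
delta_uri=http://debdeltas.debian.net/debian-deltas

[security archive]
Origin=Debian
Label=Debian-Security
delta_uri=http://deltas.example.org/security
"""

def find_delta_uri(conf_text, release_info):
    """Return the delta_uri of the first section whose keys all match release_info."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # configparser lowercases option names, so lowercase the Release keys too.
    wanted = {key.lower(): value for key, value in release_info.items()}
    for section in parser.sections():
        options = dict(parser[section])
        uri = options.pop("delta_uri", None)
        if uri and all(wanted.get(key) == value for key, value in options.items()):
            return uri
    return None

release = {"Origin": "Debian", "Label": "Debian-Security",
           "Archive": "stable", "Site": "security.debian.org"}
print(find_delta_uri(SAMPLE, release))
</PRE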
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN115"
>3.7. indexes</A
></H2
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="AEN117"
>3.7.1. indexes of debs in APT</A
></H3
><P
> Let's start by examining the situation for debs and APT.
Using indexes for debs is a no-brainer: the client
(i.e. the end user) does not know the list of debs available on the
server and, even knowing the current list, cannot foresee future
changes.
So indexes provide the needed information: the packages' descriptions,
versions, dependencies, etc.; this information is used by apt and the
other frontends.
</P
></DIV
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="no_indexes"
>3.7.2. no indexes of deltas in debdelta</A
></H3
><P
> If you then think of deltas, you realize that none of the requirements above
apply. Firstly, deltas have no descriptions and no dependencies.
<A
NAME="AEN123"
HREF="#FTN.AEN123"
><SPAN
CLASS="footnote"
>[1]</SPAN
></A
>
Of course 'debdelta-upgrade' needs some information to determine if a delta
exists, and to download it; but this information is already available:
<PRE
CLASS="programlisting"
> the name of the package P
the old version O
the new version N
the architecture A
</PRE
>
Once these are known, the URL of the file F can be algorithmically
determined as
<TT
CLASS="filename"
>URI/POOL/P_O_N_A.debdelta</TT
>
where URI is determined from
<TT
CLASS="filename"
>/etc/debdelta/sources.conf</TT
>
and POOL is the directory of the package P in the pool.
This algorithm is also implemented (quite verbosely) in
contrib/findurl.py in the sources of debdelta.
This is why there is currently no "index of deltas", and
'debdelta-upgrade' nonetheless works fine (as does 'cupt').
Adding an index of files would only increase downloads (time and size)
and disk usage, with negligible benefit, if any.
</P
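><P
>The formula above, together with the "-too-big" marker described in Section 3.6, can be sketched in Python as follows. The names delta_url and delta_status and the "exists" callback are hypothetical; a real client would replace the callback with an HTTP request, and versions are used verbatim here (epochs get no special handling).</P
><PRE
CLASS="programlisting"
>def delta_url(uri, pool, name, old, new, arch):
    """Build URI/POOL/P_O_N_A.debdelta from its components."""
    filename = "%s_%s_%s_%s.debdelta" % (name, old, new, arch)
    return "%s/%s/%s" % (uri.rstrip("/"), pool.strip("/"), filename)

def delta_status(url, exists):
    """Classify a delta, given a callable that tells whether a URL exists."""
    if exists(url):
        return "available"
    if exists(url + "-too-big"):
        return "too big"
    return "unknown"  # not yet generated, or never will be

url = delta_url("http://debdeltas.debian.net/debian-deltas",
                "pool/main/f/foobar", "foobar", "1", "2", "all")
print(url)
# Pretend that only the "-too-big" marker is present on the server.
print(delta_status(url, lambda candidate: candidate.endswith("-too-big")))
</PRE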
></DIV
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="no_incremental"
>3.8. no incremental deltas</A
></H2
><P
> Let me add another point that may be unclear. There are no incremental
deltas (and IMHO never will be).
</P
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="AEN131"
>3.8.1. What "incremental" would be, and why it is not</A
></H3
><P
> Please recall <A
HREF="x65.html#delta_creation"
>Section 3.4</A
>.
What <SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
>does not happen</I
></SPAN
> currently is the following:
on 3rd Mar, Boe decides to upgrade, and invokes 'debdelta-upgrade';
then 'debdelta-upgrade' finds <TT
CLASS="filename"
>foobar_1_2_all.debdelta</TT
> and
<TT
CLASS="filename"
>foobar_2_3_all.debdelta</TT
>, uses the former to generate
<TT
CLASS="filename"
>foobar_2_all.deb</TT
>, and in turn uses this and the second delta to
generate <TT
CLASS="filename"
>foobar_3_all.deb</TT
>.
This is not implemented, and it will not be, for the following reasons.
<P
></P
><UL
><LI
><P
> The delta size is, on average, 40% of the size of the deb (and this
is getting worse, for various reasons; see <A
HREF="x277.html#getting_worse"
>Section 5.2</A
>); so two deltas amount to about 80% of the
target deb, and this is too much.
</P
></LI
><LI
><P
> It takes time to apply a delta; applying two deltas to produce one
deb takes too much time.</P
></LI
><LI
><P
> The server does generate the direct delta
<TT
CLASS="filename"
>foobar_1_3_all.debdelta</TT
>
:-) so why make things complex when they are easy? :-)</P
></LI
><LI
><P
> Note also that incremental deltas would
need some index system to be implemented: indeed, Boe
would have no way to know on 3rd Mar that the intermediate
version of foobar between "1" and "3" is "2". But since
incremental deltas do not exist, there is no need to
have indexes.</P
></LI
></UL
>
</P
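><P
>The size argument in the first item can be checked with a few lines of Python. The sketch assumes that every delta in a chain has the average size of 40% of the target deb; the comparison with the 70% limit of Section 3.5 is added only for orientation, since that limit applies to single deltas.</P
><PRE
CLASS="programlisting"
>AVERAGE_DELTA_PERCENT = 40   # average delta size relative to the deb
SERVER_LIMIT_PERCENT = 70    # single deltas above this are rejected (Section 3.5)

def chained_download_percent(steps):
    """Percentage of the target deb downloaded when chaining `steps` average deltas."""
    return steps * AVERAGE_DELTA_PERCENT

for steps in (1, 2):
    percent = chained_download_percent(steps)
    print(steps, "delta(s):", "%d%%" % percent,
          "over the limit:", percent > SERVER_LIMIT_PERCENT)
</PRE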
></DIV
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="repo_howto"
>3.9. Repository howto</A
></H2
><P
>There are (at least) two ways to manage a repository and run a server that creates the deltas.
</P
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="AEN154"
>3.9.1. debmirror --debmarshal</A
></H3
><P
> The first way is what I currently use. It is implemented in the script
<TT
CLASS="filename"
>/usr/share/debdelta/debmirror-marshal-deltas</TT
>
(a simpler version, much more primitive but more readable, is
<TT
CLASS="filename"
>/usr/share/debdelta/debmirror-delta-security</TT
>).
Currently I use the complex script, which creates deltas for amd64 and
i386, and for lenny, squeeze, sid and experimental; and the simpler one for
lenny-security.
Let me start by outlining how the simple script generates deltas. It is a three-step
process.
Let's say that $secdebmir is the directory containing the mirror of the
repository security.debian.org.
<P
></P
><OL
TYPE="1"
><LI
><PRE
CLASS="programlisting"
> # --- 1st step
# make a copy of the current stable-security lists of packages
olddists="${TMPDIR:-/tmp}/oldsecdists-$(date +'%F_%H-%M-%S')"
mkdir "$olddists"
cp -a "$secdebmir/dists" "$olddists"
</PRE
></LI
><LI
><P
> --- 2nd step
call 'debmirror' to update the mirror; note that I apply a patch to
debmirror so that old debs are not deleted, but moved to a /old_deb
directory.
</P
></LI
><LI
><P
> --- 3rd step
call 'debdeltas' to generate deltas, from the state of packages in
$olddists to the current state in $secdebmir, and also with respect to what is in
stable.
Note that, for any package that was deleted from the archive,
'debdeltas' will go fishing for it inside /old_deb.
</P
></LI
></OL
>
The more complex script uses the new <SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
>debmirror --debmarshal</I
></SPAN
>,
so it keeps 40 old snapshots of the deb archives, and it generates deltas of the current
package version (the "new" version) against the versions in snapshots -10, -20, -30 and -40.
</P
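><P
>The choice of snapshots can be illustrated by a short Python sketch. It assumes that "-10" means the tenth most recent snapshot, and so on; the function and snapshot names are hypothetical, and snapshots that do not exist yet are simply skipped.</P
><PRE
CLASS="programlisting"
>def snapshots_to_diff(snapshots, offsets=(10, 20, 30, 40)):
    """Pick the snapshots lying 10, 20, 30 and 40 steps back from the current state.

    `snapshots` is ordered from oldest to newest and excludes the current state.
    """
    return [snapshots[-offset] for offset in offsets if len(snapshots) >= offset]

# 40 old snapshots, named so that snap-40 is the most recent one.
snapshots = ["snap-%02d" % number for number in range(1, 41)]
print(snapshots_to_diff(snapshots))
</PRE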
></DIV
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="AEN167"
>3.9.2. hooks and repository of old_debs</A
></H3
><P
>
I wrote the skeleton for some commands.
<P
><B
CLASS="command"
>debdelta_repo</B
> [--add name version arch filename disttoken]</P
>
This first command is to be called by the archive management tool (e.g. DAK) when a new package enters
a part of the archive (let's say,
package="foobar" version="2" arch="all" and filename="pool/main/f/foobar/foobar_2_all.deb" just entered
disttoken="testing/main/amd64"). The command adds the package to a delta queue, so that the
appropriate deltas will be generated; this command returns almost immediately.
<P
><B
CLASS="command"
>debdelta_repo</B
> [--delta]</P
>
This creates all the deltas.
<P
><B
CLASS="command"
>debdelta_repo</B
> [--sos filename]</P
>
This will be called by DAK just before it deletes a package from the archive;
this command will save that old deb somewhere (indeed, it may be needed to generate deltas at some time in the future).
(It will be up to some piece of <SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
>debdelta_repo</I
></SPAN
> code to manage the repository of old debs and
delete excess copies.)
</P
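><P
>To make the intended division of labour concrete, here is a toy Python model of the queue behind --add and --delta. It is only a sketch under stated assumptions: the class DeltaQueue and its methods are hypothetical, nothing is stored on disk, and the result is just a list of (old deb, new deb) filename pairs that a real implementation would hand to 'debdeltas'.</P
><PRE
CLASS="programlisting"
>from collections import deque

class DeltaQueue:
    """Toy in-memory model of the --add and --delta commands (hypothetical)."""

    def __init__(self):
        self.pending = deque()
        self.seen = {}  # (name, arch, disttoken) maps to the filenames seen so far

    def add(self, name, version, arch, filename, disttoken):
        # Only record the request, so that the caller returns almost immediately.
        self.pending.append((name, version, arch, filename, disttoken))

    def delta(self):
        """Drain the queue; return the (old deb, new deb) pairs to be diffed."""
        pairs = []
        while self.pending:
            name, version, arch, filename, disttoken = self.pending.popleft()
            older = self.seen.setdefault((name, arch, disttoken), [])
            pairs.extend((old, filename) for old in older)
            older.append(filename)
        return pairs

queue = DeltaQueue()
queue.add("foobar", "1", "all", "pool/main/f/foobar/foobar_1_all.deb", "testing/main/amd64")
queue.add("foobar", "2", "all", "pool/main/f/foobar/foobar_2_all.deb", "testing/main/amd64")
print(queue.delta())
</PRE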
><P
><SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
>TODO: that skeleton does not handle 'security', where some old versions of the packages are in
a different DISTTOKEN</I
></SPAN
></P
></DIV
></DIV
></DIV
><H3
CLASS="FOOTNOTES"
>Notes</H3
><TABLE
BORDER="0"
CLASS="FOOTNOTES"
WIDTH="100%"
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN123"
HREF="x65.html#AEN123"
><SPAN
CLASS="footnote"
>[1]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>Deltas have an "info" section, but that is, so to say, standalone.</P
></TD
></TR
></TABLE
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="x54.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="x182.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>a delta</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
> </TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Goals, tricks, ideas and issues</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>