
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>debdelta-upgrade service</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REL="HOME"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="a delta"
HREF="x54.html"><LINK
REL="NEXT"
TITLE="Goals, tricks, ideas and issues"
HREF="x182.html"></HEAD
><BODY
CLASS="section"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
></TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="x54.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="x182.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="section"
><H1
CLASS="section"
><A
NAME="AEN65"
>3. debdelta-upgrade service</A
></H1
><P
>In June 2006 I set up a delta-upgrading framework, so that people
can upgrade their Debian boxes using <B
CLASS="command"
>debdelta-upgrade</B
> (which downloads
package 'deltas').
This section introduces the framework behind
'debdelta-upgrade', which is also used by 'cupt'.
In what follows I simplify, in places quite a lot.
</P
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN69"
>3.1. The framework</A
></H2
><P
> The framework is organized as follows: I maintain some servers where I use the
program 'debdeltas' to create all the deltas, whereas end users use the
client 'debdelta-upgrade' to download the deltas and apply them to
produce the debs needed to upgrade their boxes.
On my server, I mirror some repositories, and then I invoke
'debdeltas' to make the deltas between them. I use the
scripts <TT
CLASS="filename"
>/usr/share/debdelta/debmirror-delta-security</TT
>
and <TT
CLASS="filename"
>/usr/share/debdelta/debmirror-marshal-deltas</TT
> for this.
This generates any delta that may be needed for upgrades
in squeeze, squeeze-security, wheezy, sid and experimental,
for architectures i386 and amd64 (as of Mar 2011); the generated repository of deltas
takes roughly 10GB.
</P
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN74"
>3.2. The goals</A
></H2
><P
>There are two ultimate goals in designing this framework:
<P
></P
><OL
TYPE="1"
><LI
><P
> SMALL) reduce the size of downloads
(useful for people who pay by the megabyte);
</P
></LI
><LI
><P
> FAST) speed up the upgrade.
</P
></LI
></OL
>
The two goals are unfortunately only marginally compatible. An
example: bsdiff can produce very small deltas, but is quite slow (in
particular with very large files); so currently (since 2009) I use 'xdelta3'
as the backend diffing tool for 'debdeltas' on my server.
Another example is debs that contain archives (.gz, tar.gz,
etc.): I have methods and code to peek inside them, so
the deltas become smaller, but applying them gets slower.
</P
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN82"
>3.3. The repository structure</A
></H2
><P
> The repository of deltas is just an HTTP archive, laid out like the pool of packages: that is, if
<TT
CLASS="filename"
>foobar_1_all.deb</TT
> is stored in
<TT
CLASS="filename"
>pool/main/f/foobar/</TT
> in the repository of debs, then the
delta to upgrade it will be stored in <TT
CLASS="filename"
>pool/main/f/foobar/foobar_1_2_all.debdelta</TT
>
in the repository of deltas. Unlike the repository of debs, a repository of deltas
has no indexes; see <A
HREF="x65.html#no_indexes"
>Section 3.7.2</A
>. The delta repository is at
<TT
CLASS="filename"
>http://debdeltas.debian.net/debian-deltas</TT
>.
</P
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="delta_creation"
>3.4. The repository creation</A
></H2
><P
> Suppose that the unstable archive, on 1st Mar, contains
<TT
CLASS="filename"
>foobar_1_all.deb</TT
> (and it is in
<TT
CLASS="filename"
>pool/main/f/foobar/</TT
>); then on 2nd Mar,
<TT
CLASS="filename"
>foobar_2_all.deb</TT
> is uploaded, but it
has a flaw (e.g. FTBFS), and so on 3rd Mar
<TT
CLASS="filename"
>foobar_3_all.deb</TT
> is uploaded.
On 2nd Mar, the delta server generates
<TT
CLASS="filename"
>pool/main/f/foobar/foobar_1_2_all.debdelta</TT
>.
On 3rd Mar, the server generates both
<TT
CLASS="filename"
>pool/main/f/foobar/foobar_1_3_all.debdelta</TT
> and
<TT
CLASS="filename"
>pool/main/f/foobar/foobar_2_3_all.debdelta</TT
>.
So, if the end user Ann upgrades the system on both 2nd and 3rd Mar,
then she uses both <TT
CLASS="filename"
>foobar_1_2_all.debdelta</TT
> (on 2nd Mar) and
<TT
CLASS="filename"
>foobar_2_3_all.debdelta</TT
> (on 3rd Mar). If the end user Boe did not
upgrade the system on 2nd Mar, and upgrades only on 3rd Mar, then
he uses <TT
CLASS="filename"
>foobar_1_3_all.debdelta</TT
>.
</P
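><P
>As a rough illustration of the rule above (this is not code taken from
debdelta; the function name and the idea of passing the known old versions
as a list are assumptions made for the example), the set of deltas
generated at each upload could be sketched in Python like this:</P
><PRE
CLASS="programlisting"
>def deltas_for_upload(name, old_versions, new_version, arch):
    """Return the delta filenames to generate when new_version is uploaded.

    One direct delta is produced from every older version still known
    to the server to the new version; deltas are never chained.
    """
    return [
        "{}_{}_{}_{}.debdelta".format(name, old, new_version, arch)
        for old in old_versions
    ]


# 2nd Mar: only version 1 is known when version 2 arrives.
print(deltas_for_upload("foobar", ["1"], "2", "all"))
# 3rd Mar: versions 1 and 2 are known when version 3 arrives.
print(deltas_for_upload("foobar", ["1", "2"], "3", "all"))
</PRE
><P
>The second call lists both the delta that Ann needs (from "2") and the
one that Boe needs (from "1").</P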
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN102"
>3.5. size limit</A
></H2
><P
> Note that the server currently rejects deltas that exceed 70% of the deb
size: the size gain would be too small, and time would be
wasted once you sum the time to download the delta and the time to apply
it (OK, these run in parallel as much as possible, and yet...).
</P
><P
> Also, the server does not generate deltas for packages that are smaller than 10KB.
</P
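><P
>The two rules can be written down as a small check. This is only a
sketch under stated assumptions: the names are invented for the example,
and "10KB" is read here as 10 * 1024 bytes, which the text does not
specify.</P
><PRE
CLASS="programlisting"
>MAX_DELTA_RATIO = 0.70      # deltas above 70% of the deb size are rejected
MIN_DEB_SIZE = 10 * 1024    # assumption: "10KB" read as 10 * 1024 bytes


def delta_is_worth_keeping(deb_size, delta_size):
    """Apply the two size rules described above to one candidate delta."""
    if MIN_DEB_SIZE > deb_size:
        return False          # package too small: no delta is generated
    if delta_size > MAX_DELTA_RATIO * deb_size:
        return False          # delta too large: the size gain is too small
    return True


print(delta_is_worth_keeping(1000000, 400000))   # a 40% delta is kept
print(delta_is_worth_keeping(1000000, 800000))   # an 80% delta is rejected
print(delta_is_worth_keeping(8000, 2000))        # the deb itself is too small
</PRE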
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN106"
>3.6. /etc/debdelta/sources.conf</A
></H2
><P
> Consider a package that is currently installed. It is characterized by
<SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
> name installed_version architecture</I
></SPAN
>
(unfortunately there is no way to tell which archive it came
from, but this does not seem to be a problem currently).
Suppose now that a newer version is available somewhere in an archive,
and that the user wishes to upgrade to that version.
The archive's Release file contains this information:
<SPAN
CLASS="QUOTE"
>"Origin, Label, Site, Archive"</SPAN
>.
(Note that Archive is called Suite in the Release file.)
Example for the security archive:
<PRE
CLASS="programlisting"
> Origin=Debian
Label=Debian-Security
Archive=stable
Site=security.debian.org
</PRE
>
The file <TT
CLASS="filename"
>/etc/debdelta/sources.conf</TT
>
uses the above information to determine
the host that should contain the delta for upgrading the package. This
information is called "delta_uri" in that file.
The complete URL for the delta is built by adding to the delta_uri a
directory path that mimics the "pool" structure used in Debian
archives, and appending to it a filename of the form
<TT
CLASS="filename"
>name_oldversion_newversion_architecture.debdelta</TT
>.
All this is implemented in the example script contrib/findurl.py.
If the delta is not available at that URL, but
<TT
CLASS="filename"
>name_oldversion_newversion_architecture.debdelta-too-big</TT
>
is available, then the delta is too big to be useful.
If neither is present, then either the delta has not yet been
generated, or it will never be generated; it is difficult to tell
which.
</P
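><P
>The lookup can be sketched as follows. Please take it as an
illustration, not as the actual implementation (for that, see
contrib/findurl.py): the INI-like layout assumed for
sources.conf, the section names, the second URI and the function names are
all assumptions made for the example.</P
><PRE
CLASS="programlisting"
>import configparser

# Assumption: an INI-like layout for sources.conf, one section per archive.
# The section names and the second URI below are made up for the example.
EXAMPLE_SOURCES_CONF = """
[main debian archive]
Origin=Debian
Label=Debian
delta_uri=http://debdeltas.debian.net/debian-deltas

[debian security archive]
Origin=Debian
Label=Debian-Security
delta_uri=http://deltas.example.org/debian-security-deltas
"""


def find_delta_uri(conf_text, release_info):
    """Return the delta_uri of the first section whose fields all match.

    release_info maps Release fields (Origin, Label, Site, Archive) to
    their values; a section matches when every field it lists, other
    than delta_uri, is equal to the corresponding Release value.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    wanted = {key.lower(): value for key, value in release_info.items()}
    for section in parser.sections():
        fields = dict(parser[section])      # option names are lower-cased
        uri = fields.pop("delta_uri", None)
        if uri and all(wanted.get(k) == v for k, v in fields.items()):
            return uri
    return None


def classify(delta_url, exists):
    """Interpret what the server holds for one delta URL.

    exists is any callable that tells whether a URL is present (for
    instance a wrapper around an HTTP HEAD request).
    """
    if exists(delta_url):
        return "available"
    if exists(delta_url + "-too-big"):
        return "too big to be useful"
    return "not generated (yet, or ever)"


security = {"Origin": "Debian", "Label": "Debian-Security",
            "Archive": "stable", "Site": "security.debian.org"}
print(find_delta_uri(EXAMPLE_SOURCES_CONF, security))
</PRE
><P
>Passing the existence test as a callable keeps the sketch independent
of any particular HTTP library.</P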
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="AEN115"
>3.7. indexes</A
></H2
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="AEN117"
>3.7.1. indexes of debs in APT</A
></H3
><P
> Let us start by examining the situation for debs and APT.
Using indexes for debs is a no-brainer: the client
(i.e. the end user) does not know the list of debs available on the
server and, even knowing the current list, cannot foresee future
changes.
So indexes provide needed information: the packages' descriptions,
versions, dependencies, etc.; this information is used by apt and the
other frontends.
</P
></DIV
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="no_indexes"
>3.7.2. no indexes of deltas in debdelta</A
></H3
><P
> If you then think of deltas, you realize that none of the requirements above
apply. Firstly, deltas have no descriptions and no dependencies.
<A
NAME="AEN123"
HREF="#FTN.AEN123"
><SPAN
CLASS="footnote"
>[1]</SPAN
></A
>
Of course 'debdelta-upgrade' needs some information to determine whether a delta
exists, and to download it; but this information is already available:
<PRE
CLASS="programlisting"
> the name of the package P
the old version O
the new version N
the architecture A
</PRE
>
Once these are known, the URL of the file F can be algorithmically
determined as
<TT
CLASS="filename"
>URI/POOL/P_O_N_A.debdelta</TT
>
where URI is determined from
<TT
CLASS="filename"
>/etc/debdelta/sources.conf</TT
>
and POOL is the directory in the pool of the package P.
This algorithm is also implemented (quite verbosely) in
contrib/findurl.py in the sources of debdelta.
This is the reason why there is currently no "index of deltas", and
nonetheless 'debdelta-upgrade' works fine (and "cupt" as well).
Adding an index file would only increase downloads (time and size)
and disk usage, with negligible benefit, if any.
</P
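><P
>A minimal sketch of this construction follows. It is an illustration
and not the code of contrib/findurl.py: the function names are made up,
the pool directory is derived with the usual Debian convention (first
letter of the source package name, or a four-letter prefix for names
starting with "lib"), and any escaping that real version strings may
need is ignored.</P
><PRE
CLASS="programlisting"
>def pool_dir(source_name, component="main"):
    """Directory of a source package in a Debian-style pool.

    Packages whose name starts with "lib" are grouped under a
    four-letter prefix (libf/), all others under their first letter.
    """
    prefix = source_name[:4] if source_name.startswith("lib") else source_name[:1]
    return "pool/{}/{}/{}".format(component, prefix, source_name)


def delta_url(delta_uri, pool, package, old, new, arch):
    """Build URI/POOL/P_O_N_A.debdelta from its ingredients."""
    filename = "{}_{}_{}_{}.debdelta".format(package, old, new, arch)
    return "{}/{}/{}".format(delta_uri.rstrip("/"), pool, filename)


print(delta_url("http://debdeltas.debian.net/debian-deltas",
                pool_dir("foobar"), "foobar", "1", "2", "all"))
</PRE
><P
>Everything the function needs is already known on the client, which is
the point of this section: no index has to be downloaded first.</P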
></DIV
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="no_incremental"
>3.8. no incremental deltas</A
></H2
><P
> Let me add another point that may be unclear. There are no incremental
deltas (and IMHO never will be).
</P
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="AEN131"
>3.8.1. What "incremental" would be, and why it is not</A
></H3
><P
> Please recall <A
HREF="x65.html#delta_creation"
>Section 3.4</A
>.
What <SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
>does not happen</I
></SPAN
> currently is what follows:
on 3rd Mar, Boe decides to upgrade, and invokes 'debdelta-upgrade';
then 'debdelta-upgrade' finds <TT
CLASS="filename"
>foobar_1_2_all.debdelta</TT
> and
<TT
CLASS="filename"
>foobar_2_3_all.debdelta</TT
>, uses the former to generate
<TT
CLASS="filename"
>foobar_2_all.deb</TT
>, and in turn uses this and the second delta to generate
<TT
CLASS="filename"
>foobar_3_all.deb</TT
>.
This is not implemented, and will not be, for the following reasons.
<P
></P
><UL
><LI
><P
> The delta size is, on average, 40% of the size of the deb (and this
is getting worse, for different reasons, see <A
HREF="x277.html#getting_worse"
>Section 5.2</A
>); so two deltas are about 80% of the
target deb, and this is too much.
</P
></LI
><LI
><P
> It takes time to apply a delta; applying two deltas to produce one
deb takes too much time.</P
></LI
><LI
><P
> The server does generate the direct delta
<TT
CLASS="filename"
>foobar_1_3_all.debdelta</TT
>
:-) so why make things complex when they are easy? :-)</P
></LI
><LI
><P
> Note also that incremental deltas would
need some index system to be implemented: indeed, Boe
would have no way to know on 3rd Mar that the intermediate
version of foobar between "1" and "3" is "2". Since
incremental deltas do not exist, there is no need to
have indexes.</P
></LI
></UL
>
</P
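><P
>The arithmetic of the first point can be spelled out as below. It is
only a back-of-the-envelope sketch: the names are invented, the 40% figure
is the average quoted above, and comparing a whole chain with the 70%
limit of Section 3.5 is just a yardstick (the server applies that limit
to single deltas).</P
><PRE
CLASS="programlisting"
>AVERAGE_DELTA_RATIO = 0.40   # average delta size / deb size, from the text
REJECT_RATIO = 0.70          # the server's rejection threshold (Section 3.5)


def chain_download_ratio(steps, ratio=AVERAGE_DELTA_RATIO):
    """Download size of a chain of deltas, as a fraction of the target deb."""
    return steps * ratio


for steps in (1, 2):
    fraction = chain_download_ratio(steps)
    print(steps, "delta(s):", "{:.0%}".format(fraction), "of the deb;",
          "above the 70% limit:", fraction > REJECT_RATIO)
</PRE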
></DIV
></DIV
><DIV
CLASS="section"
><H2
CLASS="section"
><A
NAME="repo_howto"
>3.9. Repository howto</A
></H2
><P
>There are (at least) two ways to manage a repository, and run a server that creates the deltas.
</P
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="AEN154"
>3.9.1. debmirror --debmarshal</A
></H3
><P
> The first way is what I currently use. It is implemented in the script
<TT
CLASS="filename"
>/usr/share/debdelta/debmirror-marshal-deltas</TT
>
(a simpler version, much more primitive but more readable, is
<TT
CLASS="filename"
>/usr/share/debdelta/debmirror-delta-security</TT
>).
Currently I use the complex script to create deltas for amd64 and
i386, and for lenny, squeeze, sid and experimental; and the simpler one for
lenny-security.
Let me start by outlining how the simple script generates deltas. It is a three-step
process.
Let us say that $secdebmir is the directory containing the mirror of the
repository security.debian.org.
<P
></P
><OL
TYPE="1"
><LI
><PRE
CLASS="programlisting"
> # --- 1st step
# make a copy of the current stable-security lists of packages
olddists="${TMPDIR:-/tmp}/oldsecdists-$(date +'%F_%H-%M-%S')"
mkdir "$olddists"
cp -a "$secdebmir/dists" "$olddists"
</PRE
></LI
><LI
><P
> --- 2nd step
call 'debmirror' to update the mirror; note that I apply a patch to
debmirror so that old debs are not deleted, but moved to a /old_deb
directory
</P
></LI
><LI
><P
> --- 3rd step
call 'debdeltas' to generate deltas, from the state of packages in
$olddists to the current state in $secdebmir, and also with respect to what is in
stable.
Note that, for any package that was deleted from the archive,
'debdeltas' will go fishing for it inside /old_deb.
</P
></LI
></OL
>
The more complex script uses the new <SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
>debmirror --debmarshal</I
></SPAN
>
so it keeps 40 old snapshots of the deb archives, and it generates deltas to the current
package version (the "new" version) from the versions in snapshots -10, -20, -30 and -40.
</P
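><P
>To make the choice of snapshots concrete, here is a small sketch. It
is not taken from the script: the function name, the snapshot labels and
the convention that the list is ordered oldest first and excludes the
current state are all assumptions made for the example.</P
><PRE
CLASS="programlisting"
>def snapshots_to_diff(snapshots, offsets=(10, 20, 30, 40)):
    """Pick the old snapshots used as delta sources.

    snapshots is ordered oldest first and does not include the current
    state; offset 10 means "the 10th snapshot counting back from now".
    Offsets beyond the available history are skipped.
    """
    return [snapshots[-n] for n in offsets if len(snapshots) >= n]


history = ["snap-{:02d}".format(i) for i in range(1, 41)]   # 40 snapshots
print(snapshots_to_diff(history))
</PRE
><P
>For each selected snapshot, a delta is then generated from the package
version found there to the current version.</P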
></DIV
><DIV
CLASS="section"
><H3
CLASS="section"
><A
NAME="AEN167"
>3.9.2. hooks and repository of old_debs</A
></H3
><P
>
I wrote the skeleton for some commands.
<P
><B
CLASS="command"
>debdelta_repo</B
> [--add name version arch filename disttoken]</P
>
This first one is to be called by the archive management tool (e.g. DAK) when a new package enters
a part of the archive (let us say,
package="foobar" version="2" arch="all" and filename="pool/main/f/foobar/foobar_2_all.deb" just entered
disttoken="testing/main/amd64"). The command adds the package to a delta queue, so that
appropriate deltas will be generated; it returns almost immediately.
<P
><B
CLASS="command"
>debdelta_repo</B
> [--delta]</P
>
This creates all the deltas.
<P
><B
CLASS="command"
>debdelta_repo</B
> [--sos filename]</P
>
This will be called by DAK just before it deletes a package from the archive;
the command saves that old deb somewhere (it may be needed to generate deltas at some point in the future).
(It will be up to some piece of <SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
>debdelta_repo</I
></SPAN
> code to manage the repository of old debs, and
delete excess copies.)
</P
><P
><SPAN
CLASS="emphasis"
><I
CLASS="emphasis"
>TODO: that skeleton does not handle 'security', where some old versions of the packages are in
a different DISTTOKEN</I
></SPAN
></P
></DIV
></DIV
></DIV
><H3
CLASS="FOOTNOTES"
>Notes</H3
><TABLE
BORDER="0"
CLASS="FOOTNOTES"
WIDTH="100%"
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN123"
HREF="x65.html#AEN123"
><SPAN
CLASS="footnote"
>[1]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>Deltas have an "info" section, but it is, so to speak, standalone.</P
></TD
></TR
></TABLE
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="x54.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="x182.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>a delta</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
> </TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Goals, tricks, ideas and issues</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>