1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259
|
## This will eventually do to wiki.debian.org/DebTestFramework
* '''Created''': <<Date(2010-10-07)>>
* '''Contributors''': MichaelHanke, YaroslavHalchenko
* '''Packages affected''':
* '''See also''':
== Summary ==
This specification describes DebTest -- a framework with conventions and tools
that allow Debian to distribute test batteries developed by upstream or Debian
developers. DebTest aims to enable developers and users to perform extensive
testing of a deployed Debian system or a particular software of interest in a
uniform fashion.
== Rationale ==
Ideally software packaged for Debian comes with an exhaustive test suite that
can be used to determine whether this particular software works as expected on
the Debian platform. However, especially for complex software, these test
suites are often resource hungry (CPU time, memory, disk space, network
bandwidth) and cannot be ran at package build time by buildds. Consequently,
test suites are typically utilized manually and only by the respective packager
on a particular machine, before uploading a new version to the archive.
However, Debian is an integrated system and packaged software typically relies
on functionality provided by other Debian packages (e.g. shared libraries)
instead of shipping duplicates with different versions in every package -- for
many good reasons. Unfortunately, there is also a downside to this: Debian
packages often use versions of 3rd-party tools that are different from those
tested by upstream, and moreover, the actual versions of dependencies might
change frequently between subsequent uploads of a dependent package. Currently
a change in a dependency that introduces an incompatibility cannot be detected
reliably even if upstream provides a test suite that would have caught the
breakage. Therefore integration testing heavily relies on users to detect
incorrect functioning and file bug reports. Although there are archive-wide QA
efforts (e.g. constantly rebuilding all packages) these tests can only detect
API/ABI breakage or functionality tested during build-time checks -- they are
not exhaustive for the aforementioned reasons.
This is a proposal to, first of all, package upstream test suites in a way that
they can be used to run expensive archive-wide QA tests. However, this is also
a proposal to establish means to test interactions between software from
multiple Debian packages to provide more thorough continued integration and
regression testing for the Debian systems.
== Use Cases ==
* Moritz is a member of the security team. Whenever he applies a patch to fix
a security issue he wants to make sure that the generic behavior of the software
remains unchanged. However, in general he only has access to test cases that
are included in the source package (if any). In the absence of proper tests
he can only either assume that is would work (bad by design), or rely on the
respective package maintainer to run the appropriate tests (introduces
delays). A packaged exhaustive regression test suite would allow Moritz to
perform comprehensive testing on his own and release the fixed package as
soon as the tests pass.
* Michael is a Debian package maintainer that takes care of three
packages each providing a data format conversion utility. While
all three tools have their merits there is also lots of
overlap. For example, given a particular data file they should all
generate identical output. With a DebTest framework, Michael can
write and package cross-package test suites to ensure that this
promise is fulfilled at any time. Moreover, Michael can also
develop/package "pipeline" tests that ensure proper functioning of
multi-stage/package processing pipelines (from raw data format
conversion to visualization), where some stages could be
(re)processed using alternative tools from different software
packages promising to provide the same functionality. By testing
a whole processing stream while changing the alternative
implementations, breakage of the compatibility compliance could be
detected.
* Yarik is a Debian maintainer of a package where upstream provides
a complete analysis pipeline which was used for an article
publication. Such analysis requires relatively large array of
data and a range of tools from other packages to acquire
publication-ready summary of the results. Therefor such analysis
cannot be carried out at package build time. Upstream aims to
assure the reproducibility of the published results and encourages
Yarik to promise correct functioning of the research product on
Debian systems. Within the DebTest framework, Yarik can package
upstream analysis pipeline along with the target results to assure
reproducibility of the scientific findings.
* Albert is a scientist using Debian for his research activities. The
developers of his favorite software tell him to rather use the GreenPants
distribution, because they cannot guarantee that their software works
properly on Debian. They reason that Debian has a different
version of a numerical library that hasn't been "tested" by the authors.
With packaged regression test suites Albert can install and run, at any given point,
a complete test of his Debian system to ensure that everything is working
properly given the exact set of base libraries installed at this very moment.
This includes the test suite of the authors of his favorite software, but
also all distribution test suites provided by Debian developers (see above).
* Sylvestre maintains a core computational library in Debian.
A new version (or other modification) of this library promises performance
advantages. Using DebTest he could not only verify the absence of
regressions but also to obtain direct performance comparison
against the previous version across a range of applications.
* Joerg maintains a repository of backports of Debian packages to be
installed in a stable environment. He wants to assure that
backporting of the packages has not caused a deviation in their
intended functioning. By using existing DebTest tests suites he
could verify that backported versions of the packages do not break
the stability and function as promised within the stable
environment.
* Mark wants to create a Debian-derived distribution and needs to
modify a number of essential packages in order to achieve the desired
improvements. He hopes that these changes do not break other Debian
packages, but he is not really sure. A comprehensive test battery for the
whole Debian system would offer him a way to verify proper functioning
of his modified snapshot of Debian -- without having to manually replicate
the testing efforts done by thousands of Debian contributors.
* Linus is an upstream developer. He just loves the fact that he can tell any
of his Debian-based users to just 'apt-get install' something and send him
the output of a debtest command, whenever they claim that his software
doesn't work properly. It pleases him to see his carefully developed test
suite to be conveniently accessible for users.
* Finally, Lucas has access to a powerful computing facility and
likes to run all kinds of tests on all packages in the Debian archive.
A Debian-wide regression test framework would allow Lucas to execute
complex test collections (suites for individual packages,
interoperability tests, or comparative) in an automated fashion,
and file bug reports against the respective packages whenever a
malfunction is detected. Some of Lucas friends are not brave enough to file
bugs, but still want to contribute. They simply run (selected) tests
on their local machines that in turn report results/logs to a Debian
dashboard server, where interested parties can get a weather report of
Debian's status.
== Scope ==
This specification is applicable to all Debian packages, and Debian as a whole.
== Design ==
A specification should be built with the following considerations:
* The person implementing it may not be the person writing it. Specification should be
* clear enough for someone to be able to read it and have a clear path
* towards implementing it. If it is not straightforward, it needs more detail.
* Use cases covered in the specification should be practical
* situations, not contrived issues.
* Limitations and issues discovered during the creation of a specification
* should be clearly pointed out so that they can be dealt with explicitly.
* If you don't know enough to be able to competently write a spec, you should
* either get help or research the problem further. Avoid spending time making
* up a solution: base yourself on your peers' opinions and prior work.
Specific issues related to particular sections are described further below.
=== Core components ===
* Organization of the framework
- packages might register ways to run basic tests against installed
versions
register:
- executable?
==== Packaged tests ====
* Metainformation:
* duration: ....
* resources:
* suites:
* Debug symbols: ....
* do not strip symbols from test binary
* Packages that register tests might provide a virtual package
'test-<packagename>' to allow easy test discovery and retrival via
debtest tools.
==== debtest tools ====
* Invocation::
* single package tests
* all (with -f to force even if resources are not sufficient)
* tests of dependent packages (discovered via rdepends,
"rrecommends" and "rsuggests")
* given specific resources demands, just run
the ones matching those
* Customization/Output::
plugins::
* job resources requirement adjustments
. manual customization
. request from dashboard for the system (or alike)
* executioners
. local execution (monitor resources)
. submit to cluster/cloud
* output/reports
. some structured output
. interfaces to dashboards
==== Maintainer helpers ====
Helpers:
- assess resources/performance:
=== Supplementary infrastructure ===
==== Dashboard server ====
=== Implementation Plan ===
This section is usually broken down into subsections, such as the packages
being affected, data and system migration where necessary, user interface
requirements and pictures (photographs of drawings on paper work well).
== Implementation ==
To implement a specification, the developer should observe the use cases
carefully, and follow the design specified. He should make note of places in
which he has strayed from the design section, adding rationale describing why
this happened. This is important so that next iterations of this specification
(and new specifications that touch upon this subject) can use the specification
as a reference.
The implementation is very dependent on the type of feature to be implemented.
Refer to the team leader for further suggestions and guidance on this topic.
* Implementation language:
- Python unless someone takes the burden to develop
and maintain for upcoming years.
== Outstanding Issues ==
The specification process requires experienced people to drive it. More
documentation on the process should be produced.
The drafting of a specification requires english skills and a very good
understanding of the problem. It must also describe things to an extent that
someone else could implement. This is a difficult set of conditions to ensure
throughout all the specifications added.
There is a lot of difficulty in gardening obsolete, unwanted and abandoned
specifications in the Wiki.
== BoF agenda and discussion ==
Possible meetings where this specification will be discussed.
----
CategorySpec
|