1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
|
File loaders work with untrusted data and need defensive programming.
They also need to be efficient as they are invoked for previews.
And maintainable because file formats tend to evolve.
1. Does the loader release all resources? Namely:
- file contents from g_file_get_contents() or gwy_file_get_contents()
- structures describing the file
- own references on data fields, curves, spectra or metadata containers
- own references on SI units
- strings
- other stuff
2. I do not mean usually. I mean, does it release them on *all possibe code
paths*? Especially when it fails to load the file?
3. When it reads values from some buffer, does it always know that it contains
enough data? Use err_SIZE_MISMATCH().
4. When it reads the size or offset of something, does it validates the value
before using it? At least that it does not point beyond the end of the
file? Did you calculate the length of the thing? Use err_DIMENSION().
5. Does it re-check the file type in the loader? Users can try to load a file
as a specific type that does not match the actual file type.
6. Does it validate data field dimensions? They should not be zero. And they
should not be 2**30 either. And they definitely cannot be negative.
7. If it uses string functions (strFOO instead of memFOO), does it ensure
the search cannot run away past the end of the buffer?
8. If it parses numbers in text format, does it use g_ascii_strtod() or other
means of locale-independent parsing? Some locales use decimal comma, but
data files (normally) always use decimal points.
9. Does it work on platforms with reverse byte order?
A. Does it work on platforms with different structure packing?
B. Does it work on platforms with different data type sizes?
C. Does it avoid implementation-dependent types such as long or time_t?
D. Really? Does it use functions such as gwy_get_uint16_le() or
GINT32_FROM_LE() to obtain/convert data, or does it foolishly attempt to
read binary data directly into variables and structures? See the module
tutorial.
E. Does it trust any information from the file? It must not. In particular,
it must not assume that some field is always present in the file. It might
be missing and then the loader must fail gracefully. And what if some
enumerated value falls outside the set of known values?
F. Does it report errors correctly? I mean, if the loader returns NULL, it
*must* give a reason in the error argument.
G. Is the module information, such as the authors, description and version
correct? Or did you just copied that from another module and left some
some bogus values there?
H. Is the comment defining Freedesktop MIME type correct? If you cannot write
a good magic or glob rule, it is better to omit the type than to break other
files. By omitting it I mean *delete* the comment.
I. Does it ensure physical data field and line dimensions are positive (not
negative, not zero) numbers?
J. Does it take into account that SI units are objects and have their identity?
If you need to set both lateral and value units to meters, you must create
*two* SI unit objects (both representing meters) and set one as the lateral
units and another as the value units.
K. What does it do when the file is valid yet does not contain anything
importable? Use err_NO_DATA().
L. Does it convert text descriptions to UTF-8 (typically, from Latin 1)?
M. Does it fill the data field with gwy_data_field_set_val()? Why do you
think its documentation says `do not set data with this function inside
inner loops, it's slow'? Is it necessary to call pow10() for each value
filled?
N. Does it name variables, structures and functions in comprehensible English?
The format specification may be in German, French, Italian or whatever.
This is irrelevant. Would you prefer if I named all functions and types in
Czech? And is it really necessary to introduce yet another different
coding conventions? It is Gwyddion code so it should follow the Gwyddion
conventions devel-docs/CODING-STANDARDS.
|