# Contributing to simdzone

The simdzone library is open source and made available under the permissive
3-clause BSD license.

Contributions are very welcome!

> The original specification in [RFC1035][1] is rather ambiguous and does not
> cover additions from later RFCs. See [SYNTAX.md](SYNTAX.md) for a quick
> summary of the format and interpretation in simdzone.

[1]: https://datatracker.ietf.org/doc/html/rfc1035#section-5

## Reference data

1. [Zone Data for .se and .nu][2] can be obtained via a DNS zone transfer.

2. The [Centralized Zone Data Service (CZDS)][3] provides access to zone data
   for participating gTLDs.

   > Downloading zone data via the browser can be problematic. The
   > [CZDS API client in Java][4] can be used as a workaround.

3. The *Hint and Zone Files* can be obtained from the Internet Assigned Numbers
   Authority (IANA) [Root Zone Management][5] page.

[2]: https://internetstiftelsen.se/en/zone-data/
[3]: https://czds.icann.org/
[4]: https://github.com/icann/czds-api-client-java/
[5]: https://www.iana.org/domains/root

## Source layout

`include` contains only headers required for consumption of the library.

`src` contains the implementation and internal headers.

The layout of `src` is (of course) inspired by the layout in simdjson. The
structure is intentionally simple and without (too much) hierarchy, but as
simdzone has highly architecture-specific code to maximize performance, there
are some caveats.

Processors may support multiple instruction sets, e.g. x86\_64 may support
SSE4.2, AVX2 and AVX-512 instruction sets depending on the processor family.
The preferred implementation is automatically selected at runtime. As a result,
code may need to be compiled more than once. To improve code reuse, shared
logic resides in headers rather than source files and is declared static to
avoid name clashes. Bits and pieces are then mixed and matched in a
`src/<arch>/parser.c` compilation target to allow for multiple implementations
to co-exist.
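
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not simdzone's actual code: `count_newlines`, `parse_fallback`, `parse_haswell`, `struct kernel` and `select_kernel` are hypothetical names, everything is collapsed into one file for brevity, and the AVX2 check assumes a GCC-compatible compiler on x86\_64.

```c
#include <stddef.h>

/* Hypothetical stand-in for logic that would live in a shared header.
 * It is declared static so that every compilation target that includes
 * it gets a private copy and no symbol names clash at link time. */
static size_t count_newlines(const char *data, size_t length)
{
  size_t count = 0;
  for (size_t i = 0; i < length; i++)
    count += (data[i] == '\n');
  return count;
}

/* Hypothetical per-target entry points. In a real build each one would
 * be defined in its own src/<arch>/parser.c and compiled with different
 * instruction set flags; here both reuse the same scalar logic. */
static size_t parse_fallback(const char *data, size_t length)
{
  return count_newlines(data, length);
}

static size_t parse_haswell(const char *data, size_t length)
{
  return count_newlines(data, length); /* an AVX2 build in practice */
}

struct kernel {
  const char *name;
  int (*supported)(void);
  size_t (*parse)(const char *, size_t);
};

static int always_supported(void)
{
  return 1;
}

/* Assumption: GCC-compatible compiler. Reports no AVX2 elsewhere. */
static int avx2_supported(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx2") != 0;
#else
  return 0;
#endif
}

/* Ordered from most to least preferred; the fallback always matches. */
static const struct kernel kernels[] = {
  { "haswell", avx2_supported, parse_haswell },
  { "fallback", always_supported, parse_fallback }
};

/* Pick the first implementation that the running processor supports. */
static const struct kernel *select_kernel(void)
{
  const size_t count = sizeof(kernels) / sizeof(kernels[0]);
  for (size_t i = 0; i < count; i++)
    if (kernels[i].supported())
      return &kernels[i];
  return &kernels[count - 1];
}
```

A caller would invoke `select_kernel()->parse(data, length)` without knowing which implementation was chosen.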

Sources and headers common to all architectures that do not implement parsing
for a specific data-type reside directly under `src`. Code specific to an
architecture resides in a directory under `src`, e.g. `haswell` or `fallback`.
`src/generic` contains scanner and parser code common to all implementations,
but leans towards code shared by SIMD implementations.

For example, SIMD-optimized scanner code resides in `src/generic/scanner.h`,
abstractions for intrinsics reside in e.g. `src/haswell/simd.h` and `lex(...)`,
which is used by all implementations, is implemented in `src/lexer.h`.
A fallback scanner is implemented in `src/fallback/scanner.h`.
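
To illustrate why the abstraction for intrinsics is kept separate from the scanner, here is a rough sketch in which generic code is written once against a small vector interface. All names (`vector_t`, `vector_load`, `vector_find`, `count_bytes`) are hypothetical, and the "vector" operations are emulated with plain C so the example runs anywhere; an architecture-specific header would implement the same interface with real intrinsics.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for an architecture's simd.h (hypothetical interface).
 * Emulated with scalar code here; an AVX2 or SSE4.2 header would
 * provide the same functions using intrinsics. */
#define VECTOR_SIZE 16

typedef struct {
  uint8_t bytes[VECTOR_SIZE];
} vector_t;

static void vector_load(vector_t *vector, const uint8_t *address)
{
  memcpy(vector->bytes, address, VECTOR_SIZE);
}

/* Bit i of the result is set if byte i of the vector equals key. */
static uint64_t vector_find(const vector_t *vector, uint8_t key)
{
  uint64_t mask = 0;
  for (size_t i = 0; i < VECTOR_SIZE; i++)
    mask |= (uint64_t)(vector->bytes[i] == key) << i;
  return mask;
}

/* Stand-in for generic scanner code: it only uses the interface above,
 * so it can be compiled unchanged against any implementation of it. */
static size_t count_bytes(const uint8_t *data, size_t length, uint8_t key)
{
  size_t count = 0, offset = 0;
  vector_t vector;

  /* Process whole blocks, one bitmask per block. */
  for (; offset + VECTOR_SIZE <= length; offset += VECTOR_SIZE) {
    vector_load(&vector, data + offset);
    /* Clear the lowest set bit until the mask is empty. */
    for (uint64_t mask = vector_find(&vector, key); mask; mask &= mask - 1)
      count++;
  }

  /* Handle the remaining tail byte by byte. */
  for (; offset < length; offset++)
    count += (data[offset] == key);

  return count;
}
```

Swapping the top half for an intrinsics-based version would leave `count_bytes` untouched, which is the kind of reuse the layout aims for.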

A SIMD-optimized type parser is implemented in `src/generic/type.h`, a fallback
type parser is implemented in `src/fallback/type.h`. Future versions are
expected to add more optimized parsers for specific data types, even parsers
that are tied to a specific instruction set. The layout accommodates these
scenarios, e.g. an AVX2-optimized parser may reside in `src/haswell/<type>.h`,
an SSE4.2-optimized parser may reside in `src/westmere/<type>.h`, etc.

## Symbol visibility

All exported symbols, identifiers, etc. must be prefixed with `zone_`, or
`ZONE_` for macros. Non-exported symbols are generally not prefixed, e.g.
`lex(...)` and `scan(...)` are declared static and as such are not required to
be prefixed.
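
As a small sketch of the convention, assuming made-up names that are not part of the simdzone API (`ZONE_EXAMPLE_LABEL_MAX`, `zone_example_check_label`, and a toy `scan`):

```c
#include <stddef.h>

/* Exported macro (hypothetical): carries the ZONE_ prefix. */
#define ZONE_EXAMPLE_LABEL_MAX 63

/* Exported function (hypothetical): carries the zone_ prefix. The
 * prototype is what would appear in a public header under include. */
int zone_example_check_label(const char *data, size_t length);

/* Internal helper: declared static, so it is invisible outside this
 * compilation unit and needs no prefix. Returns the number of bytes
 * before the first '.' or the end of the input. */
static size_t scan(const char *data, size_t length)
{
  size_t i = 0;
  while (i < length && data[i] != '.')
    i++;
  return i;
}

/* Returns 1 if the first label is non-empty and within the limit. */
int zone_example_check_label(const char *data, size_t length)
{
  const size_t label = scan(data, length);
  return label > 0 && label <= ZONE_EXAMPLE_LABEL_MAX;
}
```

Only `zone_example_check_label` and `ZONE_EXAMPLE_LABEL_MAX` would be visible to users of the library, so only they need a prefix to avoid clashing with application code.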