File: HACKING.md

## Installing Eio from Git

If you want to run the latest development version from Git, run these commands:

```
git clone https://github.com/ocaml-multicore/eio.git
cd eio
opam pin -yn .
opam install eio_main
```

## Layout of the code

`lib_eio/core` contains the core logic about fibers, promises, switches, etc.
`lib_eio` extends this with e.g. streams, buffered readers, buffered writers,
and a load of types for OS resources (files, networks, etc).

There is one directory for each backend (e.g. `eio_linux`).
Each backend provides a scheduler that integrates with a particular platform,
and implements some or all of the cross-platform resource APIs.
For example, `eio_linux` implements the network interface using `io_uring` to send data.

`lib_main` just selects an appropriate backend for the current system.

## Writing a backend

It's best to start by reading `lib_eio/mock/backend.ml`, which implements a mock backend with no actual IO.
You can then read one of the real backends to see how to integrate this with the OS.
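
To give a feel for what a scheduler with no IO involves, here is a minimal sketch using OCaml 5 effects. It is not the code in `lib_eio/mock/backend.ml`: the `Yield` and `Fork` effects, the `run` function, and the logging helpers are all invented for this illustration, and a real backend also handles cancellation, errors, and waiting for the OS.

```ocaml
(* A toy round-robin scheduler using OCaml 5 effects.
   Hypothetical names throughout; this is not Eio's implementation. *)
open Effect
open Effect.Deep

type _ Effect.t +=
  | Yield : unit Effect.t                    (* let other fibers run *)
  | Fork : (unit -> unit) -> unit Effect.t   (* start a new fiber *)

let yield () = perform Yield
let fork f = perform (Fork f)

let run main =
  (* Fibers that are ready to run, in FIFO order. *)
  let run_q : (unit -> unit) Queue.t = Queue.create () in
  let schedule () =
    match Queue.take_opt run_q with
    | Some resume -> resume ()
    | None -> ()  (* Nothing runnable: a real backend would wait for IO here. *)
  in
  let rec spawn f =
    match_with f ()
      { retc = (fun () -> schedule ());   (* fiber finished: run the next one *)
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield ->
            Some (fun (k : (a, unit) continuation) ->
              (* Suspend this fiber at the back of the queue. *)
              Queue.add (fun () -> continue k ()) run_q;
              schedule ())
          | Fork child ->
            Some (fun (k : (a, unit) continuation) ->
              (* Queue the parent and run the child immediately. *)
              Queue.add (fun () -> continue k ()) run_q;
              spawn child)
          | _ -> None) }
  in
  spawn main

(* Usage: record the order in which the fibers make progress. *)
let log = ref []
let say s = log := s :: !log

let () =
  run (fun () ->
    fork (fun () -> say "a1"; yield (); say "a2");
    fork (fun () -> say "b1"; yield (); say "b2");
    say "main")
```

With this particular queueing policy the recorded order is `a1`, `b1`, `a2`, `main`, `b2`. The point to take away is that the main loop is ordinary OCaml code sitting inside the effect handler, which is the shape the real backends extend.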

Most backends are built in two layers:

- A "low-level" module directly wraps the platform's own API, just adding support for suspending fibers for concurrency
  and basic safety features (such as wrapping `Unix.file_descr` to prevent use-after-close races).

- An implementation of the cross-platform API (as defined in the `eio` package) that uses the low-level API internally.
  This should ensure that errors are reported using the `Eio.Io` exception.
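
As a rough illustration of the "basic safety features" a low-level module might add, the sketch below wraps a raw resource so that it cannot be used after it has been closed. It is an assumption-laden simplification: the `Handle` module and its functions are invented here, the type parameter `'a` stands in for `Unix.file_descr` so that the example needs no extra libraries, and it ignores the harder case of a close racing with an operation in another domain.

```ocaml
(* Sketch only: ['a] stands in for [Unix.file_descr]. The [Handle] module and
   its function names are hypothetical and are not part of Eio's API. *)
module Handle : sig
  type 'a t
  val of_raw : close:('a -> unit) -> 'a -> 'a t
  val use : 'a t -> ('a -> 'b) -> 'b   (* raises [Invalid_argument] if closed *)
  val close : 'a t -> unit             (* raises [Invalid_argument] if already closed *)
  val is_open : 'a t -> bool
end = struct
  type 'a state = Open of 'a | Closed

  type 'a t = {
    mutable state : 'a state;
    release : 'a -> unit;              (* how to free the raw resource *)
  }

  let of_raw ~close raw = { state = Open raw; release = close }

  (* Every operation goes through [use], so a closed handle is detected
     instead of silently acting on a recycled descriptor number. *)
  let use t fn =
    match t.state with
    | Open raw -> fn raw
    | Closed -> invalid_arg "Handle.use: already closed"

  let close t =
    match t.state with
    | Open raw ->
      t.state <- Closed;               (* mark closed before releasing *)
      t.release raw
    | Closed -> invalid_arg "Handle.close: already closed"

  let is_open t =
    match t.state with
    | Open _ -> true
    | Closed -> false
end
```

The cross-platform layer would then only ever see `Handle.t` values, and would be the place to translate failures into the `Eio.Io` exception.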

`eio_posix` is the best one to look at first:

- `lib_eio_posix/sched.ml` is similar to the mock scheduler, but extended to interact with the OS kernel.
- `lib_eio_posix/low_level.ml` provides fairly direct wrappers of the standard POSIX functions,
  but using `sched.ml` to suspend and resume instead of blocking the whole domain.
- `lib_eio_posix/net.ml` implements the cross-platform API using the low-level API.
  For example, it converts Eio network addresses to Unix ones.
  Likewise, `fs.ml` implements the cross-platform file-system APIs, etc.
- `lib_eio_posix/eio_posix.ml` provides the main `run` function.
  It runs the scheduler, passing to the user's `main` function an `env` object for the cross-platform API functions.

When writing a backend, it's best to write the main loop in OCaml rather than delegate that to a C function.
Some particular things to watch out for:

- If a system call returns `EINTR`, you must switch back to OCaml
  (`caml_leave_blocking_section`) so that the signal can be handled. Some C
  libraries just restart the function immediately and this will break signal
  handling (on systems that have signals).

- If C code installs a signal handler, it *must* use the alt stack (`SA_ONSTACK`).
  Otherwise, signal handlers will run on the fiber stack, which is too small and will result in memory corruption.

- Effects cannot be performed across a C function call (that is, with C frames on the stack between the `perform` and the handler).
  So, if the user installs an effect handler and then calls a C mainloop, and the C code invokes a callback,
  the callback cannot use the effect handler.
  This isn't a problem for Eio itself (Eio's effect handler is installed inside the mainloop),
  but it can break programs using effects in other ways.

## Tests

Eio has tests in several places, described below.

### Cross-platform unit tests

These are in the top-level `tests` directory.
They are run against whichever backend `Eio_main.run` selects, and therefore must get the same result for all backends.

### Concurrency primitives

`lib_eio/tests` tests some internal data structures, such as the lock-free cells abstraction.
The `.md` files in that directory provide a simple walk-through to demonstrate the basic operation,
while `lib_eio/tests/dscheck` uses [dscheck][] to perform exhaustive testing of all atomic interleavings.

At the time of writing, dscheck has some performance problems that make it unusable by default, so
you must use the version in https://github.com/ocaml-multicore/dscheck/pull/22 instead.

### Benchmarks

The `bench` directory contains various speed tests.
`make bench` is a convenient way to run all of them.
This is useful to check for regressions.

If you want to contribute an optimisation, please add a benchmark so that we can measure the improvement.
If you are changing something, make sure the benchmark doesn't get significantly worse.

### Stress and fuzz testing

The `fuzz` directory uses afl-fuzz to search for bugs.

Using it properly requires an instrumented version of the OCaml compiler
(see https://v2.ocaml.org/manual/afl-fuzz.html for instructions).
The `dune` build rules don't use afl-fuzz; they just do a few random tests and then stop.
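
The idea of "a few random tests" checked against a model can be sketched with the standard library alone. This is a hypothetical example and not one of Eio's fuzz tests: `lines_of_chunks`, `model`, and the generators are invented here, and `Random` takes the place that a fuzzing library or afl-fuzz would normally fill.

```ocaml
(* Hypothetical model-based random test, stdlib only. *)

(* Code under test: reassemble lines from input that arrives in chunks. *)
let lines_of_chunks chunks =
  let buf = Buffer.create 16 in
  let out = ref [] in
  List.iter
    (String.iter (fun c ->
       if c = '\n' then begin
         out := Buffer.contents buf :: !out;
         Buffer.clear buf
       end else Buffer.add_char buf c))
    chunks;
  List.rev (Buffer.contents buf :: !out)

(* Model: the obviously correct answer, computed on the whole input. *)
let model input = String.split_on_char '\n' input

(* Split [s] into non-empty chunks of random sizes. *)
let rec random_chunks s =
  let len = String.length s in
  if len = 0 then []
  else begin
    let n = 1 + Random.int len in
    String.sub s 0 n :: random_chunks (String.sub s n (len - n))
  end

(* A short random string with frequent newlines. *)
let random_input () =
  String.init (Random.int 20) (fun _ ->
    if Random.int 4 = 0 then '\n' else 'x')

let () =
  Random.init 42;  (* fixed seed, so any failure is reproducible *)
  for _i = 1 to 1000 do
    let input = random_input () in
    let chunks = random_chunks input in
    (* The chunking itself must not lose data... *)
    assert (String.concat "" chunks = input);
    (* ...and the code under test must agree with the model. *)
    assert (lines_of_chunks chunks = model input)
  done
```

Under afl-fuzz the random generators would be replaced by bytes read from the fuzzer's input file, so that coverage feedback can steer the search.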

To run e.g. the `fuzz_buf_read` tests with afl-fuzz:

```
mkdir input
date > input/seed
afl-fuzz -m 1000 -i input -o output ./_build/default/fuzz/fuzz_buf_read.exe @@
```

- `Fork server handshake failed` indicates that you are not using an AFL-enabled version of OCaml.
- `The current memory limit (75.0 MB) is too restrictive` means you forgot to use `-m`.

The `stress` directory contains stress tests (that try to trigger races by brute force).

### Backend-specific tests

There are also backend-specific tests, e.g.

- `lib_eio_linux/tests`
- `lib_eio_luv/tests`

Use these for tests that only make sense for one platform.

### Formal verification

Some parts of Eio have been formally verified:

- https://github.com/addap/master-thesis/tree/main/documents [[video](https://discuss.ocaml.org/t/video-verifying-an-effect-based-cooperative-concurrency-scheduler-in-iris-by-adrian-dapprich/13825)]
- https://github.com/clef-men/zebre/tree/main/theories/eio

## Code formatting

Eio's code is indented using ocp-indent.
When making PRs, please do not apply other formatting tools to existing code unrelated to your PR.
Try to avoid making unnecessary changes; this makes review harder and clutters up the Git history.
`ocamlformat` may be useful to get badly messed up code to a baseline unformatted state,
from which human formatting can be added where needed.

## AI-generated Code

Contributing to Eio should not be done _solely_ using "AI tools" such as ChatGPT. This is for a few reasons:

1. **It obfuscates how you think**. Purely AI-generated code tells us little about how you think and the problems you might be having. This makes it harder to provide good feedback on PRs and issues.
2. **It is often more work to review**. Particularly for the OCaml ecosystem and libraries like Eio, it seems that these tools are not very good and generate a lot of believable code that is in actual fact completely wrong. PR comments and the code submitted with them can say completely different things.
3. **It is a grey area for licensing**. Models like ChatGPT have been trained on lots of code with different licenses and have been known to simply copy code as an answer to a prompt. We would like to avoid this headache as best we can.

Use AI tools, if you wish, to help you understand OCaml and Eio. Do not offload all of the work of a PR or a comment to these tools.

[dscheck]: https://github.com/ocaml-multicore/dscheck