1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149
|
## Installing Eio from Git
If you want to run the latest development version from Git, run these commands:
```
git clone https://github.com/ocaml-multicore/eio.git
cd eio
opam pin -yn .
opam install eio_main
```
## Layout of the code
`lib_eio/core` contains the core logic about fibers, promises, switches, etc.
`lib_eio` extends this with e.g. streams, buffered readers, buffered writers,
and a load of types for OS resources (files, networks, etc).
There is one directory for each backend (e.g. `eio_linux`).
Each backend provides a scheduler that integrates with a particular platform,
and implements some or all of the cross-platform resource APIs.
For example, `eio_linux` implements the network interface using `io_uring` to send data.
`lib_main` just selects an appropriate backend for the current system.
## Writing a backend
It's best to start by reading `lib_eio/mock/backend.ml`, which implements a mock backend with no actual IO.
You can then read one of the real backends to see how to integrate this with the OS.
Most backends are built in two layers:
- A "low-level" module directly wraps the platform's own API, just adding support for suspending fibers for concurrency
and basic safety features (such wrapping `Unix.file_descr` to prevent use-after-close races).
- An implementation of the cross-platform API (as defined in the `eio` package) that uses the low-level API internally.
This should ensure that errors are reported using the `Eio.Io` exception.
`eio_posix` is the best one to look at first:
- `lib_eio_posix/sched.ml` is similar to the mock scheduler, but extended to interact with the OS kernel.
- `lib_eio_posix/low_level.ml` provides fairly direct wrappers of the standard POSIX functions,
but using `sched.ml` to suspend and resume instead of blocking the whole domain.
- `lib_eio_posix/net.ml` implements the cross-platform API using the low-level API.
For example, it converts Eio network addresses to Unix ones.
Likewise, `fs.ml` implements the cross-platform file-system APIs, etc.
- `lib_eio_posix/eio_posix.ml` provides the main `run` function.
It runs the scheduler, passing to the user's `main` function an `env` object for the cross-platform API functions.
When writing a backend, it's best to write the main loop in OCaml rather than delegate that to a C function.
Some particular things to watch out for:
- If a system call returns `EINTR`, you must switch back to OCaml
(`caml_leave_blocking_section`) so that the signal can be handled. Some C
libraries just restart the function immediately and this will break signal
handling (on systems that have signals).
- If C code installs a signal handler, it *must* use the alt stack (`SA_ONSTACK`).
Otherwise, signals handlers will run on the fiber stack, which is too small and will result in memory corruption.
- Effects cannot be performed over a C function.
So, if the user installs an effect handler and then calls a C mainloop, and the C code invokes a callback,
the callback cannot use the effect handler.
This isn't a problem for Eio itself (Eio's effect handler is installed inside the mainloop),
but it can break programs using effects in other ways.
## Tests
Eio has tests in many places...
### Cross-platform unit tests
These are in the top-level `tests` directory.
They are run against whichever backend `Eio_main.run` selects, and therefore must get the same result for all backends.
### Concurrency primitives
`lib_eio/tests` tests some internal data structures, such as the lock-free cells abstraction.
The `.md` files in that directory provide a simple walk-through to demonstrate the basic operation,
while `lib_eio/tests/dscheck` uses [dscheck][] to perform exhaustive testing of all atomic interleavings.
At the time of writing, dscheck has some performance problems that make it unusable by default, so
you must use the version in https://github.com/ocaml-multicore/dscheck/pull/22 instead.
### Benchmarks
The `bench` directory contains various speed tests.
`make bench` is a convenient way to run all of them.
This is useful to check for regressions.
If you want to contibute an optimisation, please add a benchmark so that we can measure the improvement.
If you are changing something, make sure the benchmark doesn't get significantly worse.
### Stress and fuzz testing
The `fuzz` directory uses afl-fuzz to search for bugs.
Using it properly requires an instrumented version of the OCaml compiler
(see https://v2.ocaml.org/manual/afl-fuzz.html for instructions).
The `dune` build rules don't use afl-fuzz; they just do a few random tests and then stop.
To run e.g. the `fuzz_buf_read` tests with afl-fuzz:
```
mkdir input
date > input/seed
afl-fuzz -m 1000 -i input -o output ./_build/default/fuzz/fuzz_buf_read.exe @@
```
- `Fork server handshake failed` indicates that you are not using an AFL-enabled version of OCaml.
- `The current memory limit (75.0 MB) is too restrictive` means you forgot to use `-m`.
The `stress` directory contains stress tests (that try to trigger races by brute force).
### Backend-specific tests
There are also backend-specific tests, e.g.
- `lib_eio_linux/tests`
- `lib_eio_luv/tests`
Use these for tests that only make sense for one platform.
### Formal verification
Some parts of Eio have been formally verified:
- https://github.com/addap/master-thesis/tree/main/documents [[video](https://discuss.ocaml.org/t/video-verifying-an-effect-based-cooperative-concurrency-scheduler-in-iris-by-adrian-dapprich/13825)]
- https://github.com/clef-men/zebre/tree/main/theories/eio
## Code formatting
Eio's code is indented using ocp-indent.
When making PRs, please do not apply other formatting tools to existing code unrelated to your PR.
Try to avoid making unnecessary changes; this makes review harder and clutters up the Git history.
`ocamlformat` may be useful to get badly messed up code to a baseline unformatted state,
from which human formatting can be added where needed.
## AI-generated Code
Contributing to Eio should not be done _solely_ using "AI tools" such as ChatGPT. This is for a few reasons:
1. **It obfuscates how you think**. Purely AI-generated code tells us little about how you think and the problems you might be having. This makes it harder to provide good feedback on PRs and issues.
2. **It is often more work to review**. Particularly for the OCaml ecosystem and libraries like Eio, it seems that these tools are not very good and generate a lot of believable code that is in actual fact completely wrong. PR comments and the code submitted with them can say completely different things.
3. **It is a grey area for licensing**. Models like ChatGPT have been trained on lots of code with different licenses and has been known to simply copy code as an answer to a prompt. We would like to avoid this headache as best we can.
Use AI tools, if you wish, to help you understand OCaml and Eio. Do not offload all of the work of a PR or a comment to these tools.
[dscheck]: https://github.com/ocaml-multicore/dscheck
|