1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204
|
# Path Facts
<!-- cargo-rdme start -->
Thanks for subscribing to PATH FACTS. You'll get a fun fact every \<15 minutes\> reply
"TWO ROADS DIVERGED" to unsubscribe!
### What
Unlike the world-famous `cat` FACTS, path facts only appear when you ask them and are best right
after a `std::io::Error` from a filesystem operation.
The purpose of path facts is to deliver maximum information about your system on disk in a tidy,
easy-to-understand package. The idea is that it gives you enough information to debug an unexpected
error without requiring you to run around calling `ls` and `cat` until you find the problem.
Tired of seeing this:
```text
No such file or directory
```
When you could be seeing this?
```text
cannot access `/path/to/directory/a/b/c/does_not_exist.txt`
- Prior path is not a directory
- Prior path exists `/path/to/directory/a`
- `/path/to/directory`
└── `a` file [✅ read, ✅ write, ❌ execute]
```
Then start using path facts today!
### No, but really. Why?
I designed this library to be used by other libraries that extend or mimic `std::fs` so they can
create beautiful errors. Of course, you don't have to be a library author to get your PATH FACTS
fix.
The philosophy behind PATH FACTS is that our systems shouldn't just tell us when something goes
wrong, but they should be able to describe WHY it went wrong so that you (the developer) can do
something about it. That's the same philosophy I used to introduce a parsing-agnostic [
syntax suggesting algorithm into Ruby core](https://github.com/ruby/syntax_suggest).
### Use
The API is small. Construct a `PathFact` with a path and `Display` it as you like:
```rust
use path_facts::PathFacts;
let path = std::path::Path::new("doesnotexist.txt");
std::fs::read_to_string(&path)
.map_err(|error| format!("{error}. Path {}", PathFacts::new(&path)))
.unwrap();
```
For an operation with multiple paths you can use multiple PATH FACTS structs. For example:
```rust
use path_facts::PathFacts;
use indoc::formatdoc;
let from = std::path::Path::new("doesnotexist.txt");
let to = std::path::Path::new("also_does_not_exist.txt");
std::fs::rename(&from, to).map_err(|error| formatdoc! {"
cannot rename from `{}` to `{}` due to: {error}.
From path {from_facts}
To path {to_facts}
",
from.display(),
to.display(),
from_facts = PathFacts::new(&from),
to_facts = PathFacts::new(&to)
});
```
### Async support
This library uses syncronous file system calls. If you're using this library in an async context,
you'll want to use an async wrapper like [`tokio::task::block_in_place`](https://docs.rs/tokio/latest/tokio/task/fn.block_in_place.html).
### Actual path facts
Here are a few facts about paths that some people might find interesting. If you're staring at
path facts and an error message, maybe one of these tidbits could help you connect the dots:
- Fact: Failing to access a path does not guarantee it doesn't exist. It might mean you don't have
permission even to know whether it exists.
- Fact: If a parent directory of a path does not exist, the path does not exist!
- Fact: It is possible to read file permissions of a file you cannot "read" (if the directory it's
in has effective `execute` permissions).
- Fact: Files in a directory can be readable and modifiable, but if the directory is missing the
`execute` permission, you're not allowed to view the metadata (such as permission info).
- Fact: Deleting and creating files depends on whether the directory they're in has the `write`
permission or not!
- Fact: Parent directories cannot be files (unless they're valid symlinks or hard links to a valid
directory)!
- Fact: Making an infinitely recursive structure of paths using symlinks is possible. (FYI, this
is why we don't try to follow broken symlinks to completion)
- Fact: Permissions of a path depend not just on the permissions of the specific file/directory
but also on other things, such as inherited permissions from parent directories.
- This means that to know the "effective" permissions of a file, you need to know the permissions
of all its parent directories (we use the `faccess` crate for this)
- More permissions info at <https://www.redhat.com/sysadmin/linux-file-permissions-explained>
and <https://www.redhat.com/sysadmin/suid-sgid-sticky-bit>
- Fact: Different operating systems have different permissions models. Even on Linux, there are
additional ways to restrict file capabilities, such as Access Control Lists (ACLs).
- This library is OS independent but prioritizes posix systems (Linux, Mac) and, to a lesser
degree, Windows.
- Fact: The first paths were made by animals. Source: [top 10 facts about ~~paths~~ roads](https://www.funkidslive.com/learn/top-10-facts/top-ten-facts-about-roads/)
### Usage considerations
If you're considering adopting this library, here are some things you should consider:
- TOC/TOU: This stands for "time of check/time of use," meaning there are race conditions when
dealing with files. Files on disk are effectively a distributed system there's no way for this
library to guarantee that the disk was not modified between when your original command failed
and when this library ran. This library tries to gather information on disk as atomically as it
can, but ultimately, it must assume that the prior information collected is still valid. Does that
mean that our path facts could be path fiction? Anything we print was true at some time. It's the
user's responsibility to be aware of how their system is accessed and modified and act accordingly.
- Not everyone likes facts: If you add this to your library, consider adding a feature to
enable/disable it. We recommend the feature name `path_facts`.
- Facts don't come cheap: This library will make system calls. If performance is a concern, don't
call `PathFacts::new` on a hot code path. Instead, you could store the path and lazily call
`PathFacts` only when the error is rendered. We assume that computers do stuff fast and developers
do stuff slowly. You'll be trading off some compute time to reduce end developer debugging time.
- Top secret facts: If your errors end up being displayed to a user and they can manipulate the
input, they can already guess and check what files are on your system. If you introduce this library,
an attacker could gain more information (such as specific file permissions) and make it easier to
list directory contents. We recommend using this library in contexts where developer logs are kept
separate from user-facing errors or where the user already has access to the entire disk (such as
in a Cloud Native Buildpack).
- Stranger than fiction: Facts provided by this library make an effort to be as correct as possible
but might provide incomplete or conflicting information. What does that mean? The best way to know
if you can do something on disk is to try it and see if it succeeds. If your system uses a custom
file permission restriction system, it might show a path with `read` permission without read access.
Consider the information provided by path facts as a good starting point on where to focus your
investigation rather than as immutable and indisputable truth.
### Why not path recommendations instead of path facts?
Initial efforts on this library were geared towards giving concrete recommendations such as telling
people to `mkdir -p <directory>` or running a specific `chmod` command. That could still be a
worthwhile effort, but that requires that we know both the facts on disk as well as the intent of
the programmer.
To explain: Suggestions and recommendations are better the more true they are. If an error message
says "please try again" and trying again doesn't fix it, but the message continues to assert "please
try again," it's not so much a valid suggestion but more wishful thinking on the error message
author. If a path doesn't exist, that's usually bad, so we suggest you create it via the `touch`
command. Except if you're trying to create a file, then that path not existing is good, and
suggesting that you create it would introduce an error where none existed before. So, any suggestions
must be context-aware and task-aware.
Complicating things further: implementation details matter when determining the disk's state
**should** be. For example, Rust's [std::fs::rename](https://doc.rust-lang.org/std/fs/fn.rename.html)
function will error if the "to" path exists unless it's on Unix and it's a directory that is empty
and the "from" path is also a directory. But on Windows, the "to" path cannot be a directory.
That's a lot of caveats to consider!
Effectively, the only way to deliver truly accurate recommendations for some operations would be
to reverse engineer them. That's fine in moderation. We do that a little here, traversing directories
in a parent chain to see which one doesn't exist. However, if your implementation is overly coupled
to internally described logic, it can be difficult to maintain it if the reference implementation
changes without warning. Then suddenly, previously valid suggestions are no longer true!
There are some ways around this inside-out implementation-coupling problem. For example,
[synax_suggest](https://github.com/ruby/syntax_suggest) tries to know as little as possible about
Ruby grammar and parsing. It tells the user things that are true, such as "if you take these lines of
code together, they're invalid Ruby." Then it presents that truthy information in a way the user can
consume and actualize. It's not always a perfect result, but it's correct enough to be helpful more
often than it's harmful. A more scholarly way to frame this would be looking at it through the lens
of [soundeness versus completeness and precision](https://cacm.acm.org/blogcacm/soundness-and-completeness-defined-with-precision/).
We aim for "more precise" i.e., "it reports fewer non-errors." It also means we could possibly stop
at some point earlier than "completely reverse-engineer rust stdlib behavior" and somewhere further
than "simply state the facts". However, we're here. We have the facts, we might as well show those.
<!-- cargo-rdme end -->
## Development
Output style and contents are tested via snapshot testing using [insta](https://docs.rs/insta/latest/insta/). If the output changes and you want to update the snapshots, you can generate new pending output by running tests `cargo test` this will produce `*.pending-snap` files. These can be reviewed and accepted by using `cargo insta`:
```term
$ cargo insta review
```
Or via using env vars:
```term
$ INSTA_UPDATE=always cargo test
```
To run on linux:
```
$ docker build -f Dockerfile.test -t path_facts_test .
$ docker run --rm path_facts_test
```
|