.\" Automatically generated by Pandoc 2.5
.\"
.TH "muchsync" "1" "" "" ""
.hy
.SH NAME
.PP
muchsync \- synchronize maildirs and notmuch databases
.SH SYNOPSIS
.PP
muchsync \f[I]options\f[R]
.PD 0
.P
.PD
muchsync \f[I]options\f[R] \f[I]server\-name\f[R]
\f[I]server\-options\f[R]
.PD 0
.P
.PD
muchsync \f[I]options\f[R] \-\-init \f[I]maildir\f[R]
\f[I]server\-name\f[R] \f[I]server\-options\f[R]
.SH DESCRIPTION
.PP
muchsync synchronizes the contents of maildirs and notmuch tags across
machines.
Any given execution runs pairwise between two replicas, but the system
scales to an arbitrary number of replicas synchronizing in arbitrary
pairs.
For efficiency, version vectors and logical timestamps are used to limit
synchronization to items a peer may not yet know about.
.PP
To use muchsync, both muchsync and notmuch should be installed someplace
in your PATH on two machines, and you must be able to access the remote
machine via ssh.
.PP
In its simplest usage, you have a single notmuch database on some server
\f[C]SERVER\f[R] and wish to start replicating that database on a
client, where the client currently does not have any mailboxes.
You can initialize a new replica in \f[C]$HOME/inbox\f[R] by running the
following command:
.IP
.nf
\f[C]
muchsync \-\-init $HOME/inbox SERVER
\f[R]
.fi
.PP
This command may take some time, as it transfers the entire contents of
your maildir from the server to the client and creates a new notmuch
index on the client.
Depending on your setup, you may be either bandwidth limited or CPU
limited.
(Sadly, the notmuch library on which muchsync is built is non\-reentrant
and forces all indexing to happen on a single core at a rate of about
10,000 messages per minute.)
.PP
From then on, to synchronize the client with the server, just run:
.IP
.nf
\f[C]
muchsync SERVER
\f[R]
.fi
.PP
Since muchsync replicates the tags in the notmuch database itself, you
should consider disabling maildir flag synchronization by executing:
.IP
.nf
\f[C]
notmuch config set maildir.synchronize_flags false
\f[R]
.fi
.PP
The reason is that the synchronize_flags feature only works on a small
subset of pre\-defined flags and so is not all that useful.
Moreover, it marks flags by renaming files, which is not particularly
efficient.
muchsync was largely motivated by the need for better flag
synchronization.
If you are satisfied with the synchronize_flags feature, you might
consider a tool such as offlineimap as an alternative to muchsync.
.SS Synchronization algorithm
.PP
muchsync separately synchronizes two classes of information: the
message\-to\-directory mapping (henceforth link counts) and the
message\-id\-to\-tag mapping (henceforth tags).
Using logical timestamps, it can detect update conflicts for each type
of information.
We describe link count and tag synchronization in turn.
.PP
Link count synchronization consists of ensuring that any given message
(identified by its collision\-resistant content hash) appears the same
number of times in the same subdirectories on each replica.
Generally a message will appear only once in a single subdirectory.
However, if the message is moved or deleted on one replica, this will
propagate to other replicas.
.PP
If two replicas move or copy the same file between synchronization
events (or one moves the file and the other deletes it), this
constitutes an update conflict.
Update conflicts are resolved by storing in each subdirectory a number
of copies equal to the maximum of the number of copies in that
subdirectory on the two replicas.
This is conservative, in the sense that a file will never be deleted
after a conflict, though you may get extra copies of files.
(muchsync uses hard links, so at least these copies will not use too
much disk space.)
.PP
For example, if one replica moves a message to subdirectory .box1/cur
and another moves the same message to subdirectory .box2/cur, the
conflict will be resolved by placing two links to the message on each
replica, one in .box1/cur and one in .box2/cur.
To respect the structure of maildirs, subdirectories ending
\f[C]new\f[R] and \f[C]cur\f[R] are special\-cased; conflicts between
sibling \f[C]new\f[R] and \f[C]cur\f[R] subdirectories are resolved in
favor of \f[C]cur\f[R] without creating additional copies of messages.
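.PP
As a rough illustration of the maximum rule, the example above can be
worked through in shell.
This is a sketch only, not muchsync\[cq]s actual code: the function name
\f[C]max_links\f[R] is hypothetical, and the special case for
\f[C]new\f[R] and \f[C]cur\f[R] is not modeled.
.IP
.nf
\f[C]
```shell
#!/bin/sh
# Illustrative only: resolve a link-count conflict for one message in
# one subdirectory by keeping the larger of the two copy counts.
max_links() {
    if [ "$1" -ge "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Copy counts per subdirectory (replica A, replica B) after A moved
# the message to .box1/cur and B moved it to .box2/cur.
echo ".box1/cur $(max_links 1 0)"
echo ".box2/cur $(max_links 0 1)"
```
\f[R]
.fi
.PP
Each subdirectory ends up with one link on both replicas, matching the
resolution described above.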
.PP
Message tags are synchronized based on notmuch\[cq]s message\-ID
(usually the Message\-ID header of a message), rather than message
contents.
On conflict, tags are combined as follows.
Any tag in the notmuch configuration parameter
\f[C]muchsync.and_tags\f[R] is removed from the message unless it
appears on both replicas.
Any other tag is added if it appears on any replica.
In other words, tags in \f[C]muchsync.and_tags\f[R] are logically anded,
while all other tags are logically ored.
(This approach will give the most predictable results if
\f[C]muchsync.and_tags\f[R] has the same value in all your replicas.
The \f[C]\-\-init\f[R] option ensures uniform configurations initially,
but subsequent changes to \f[C]muchsync.and_tags\f[R] must be manually
propagated.)
.PP
If your configuration file does not specify a value for
\f[C]muchsync.and_tags\f[R], the default is to use the set of tags
specified in the \f[C]new.tags\f[R] configuration option.
This should give intuitive results unless you use a two\-pass tagging
system such as the afew tool, in which case \f[C]new.tags\f[R] is used
to flag input to the second pass while you likely want
\f[C]muchsync.and_tags\f[R] to reflect the output of the second pass.
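.PP
The conflict rule for tags can be sketched in shell as follows.
This is an illustration under simplifying assumptions, not
muchsync\[cq]s implementation: the helper names \f[C]has_tag\f[R] and
\f[C]merge_tags\f[R] are hypothetical, and tags are assumed to contain
no spaces.
.IP
.nf
\f[C]
```shell
#!/bin/sh
# Illustrative only: combine the tags of one message after a conflict.
# Arguments are space-separated tag lists.

# has_tag LIST TAG: succeed if TAG is a word in LIST.
has_tag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# merge_tags AND_TAGS TAGS_A TAGS_B: print merged tags, one per line.
merge_tags() {
    for tag in $2 $3; do
        if has_tag "$1" "$tag"; then
            # Tags in and_tags survive only if both replicas have them.
            if has_tag "$2" "$tag" && has_tag "$3" "$tag"; then
                echo "$tag"
            fi
        else
            # Any other tag survives if either replica has it.
            echo "$tag"
        fi
    done | sort -u
}

# and_tags is "unread"; replica A has read nothing, B has replied.
merge_tags "unread" "inbox unread" "inbox replied"
```
\f[R]
.fi
.PP
Here \f[C]unread\f[R] is dropped because only one replica still carries
it, while \f[C]inbox\f[R] and \f[C]replied\f[R] are both kept.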
.SS File deletion
.PP
Because publishing software that actually deletes people\[cq]s email is
a scary prospect, muchsync for the moment never actually deletes mail
files.
Though this may change in the future, for the moment muchsync moves any
deleted messages to the directory \f[C].notmuch/muchsync/trash\f[R]
under your mail directory (naming deleted messages by their content
hash).
If you really want to delete mail to reclaim disk space or for privacy
reasons, you will need to run the following on each replica:
.IP
.nf
\f[C]
cd \[dq]$(notmuch config get database.path)\[dq]
rm \-rf .notmuch/muchsync/trash
\f[R]
.fi
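.PP
Before removing the trash, it may be worth checking how much it holds.
The following is a minimal sketch; the function name
\f[C]trash_count\f[R] is hypothetical, and it only assumes the trash
location given above.
.IP
.nf
\f[C]
```shell
#!/bin/sh
# Illustrative only: count the files in the muchsync trash directory
# under a given mail directory, without deleting anything.
trash_count() {
    trash="$1/.notmuch/muchsync/trash"
    if [ -d "$trash" ]; then
        # Arithmetic expansion strips padding some wc versions add.
        echo $(( $(find "$trash" -type f | wc -l) ))
    else
        echo 0
    fi
}
```
\f[R]
.fi
.PP
Calling
\f[C]trash_count \[dq]$(notmuch config get database.path)\[dq]\f[R]
then prints the number of trashed files on the current replica.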
.SH OPTIONS
.TP
.B \-C \f[I]file\f[R], \-\-config \f[I]file\f[R]
Specify the path of the notmuch configuration file to use.
If none is specified, the default is to use the contents of the
environment variable $NOTMUCH_CONFIG, or if that variable is unset, the
value $HOME/.notmuch\-config.
(These are the same defaults as the notmuch command itself.)
.TP
.B \-F
Check for modified files.
Without this option, muchsync assumes that files in a maildir are never
edited.
\-F disables certain optimizations so as to make muchsync at least check
the timestamp on every file, which will detect modified files at the
cost of a longer startup time.
If muchsync dies with the error \[lq]message received does not match
hash,\[rq] you likely need to run it with the \-F option.
.RS
.PP
Note that if your software regularly modifies the contents of mail files
(e.g., because you are running offlineimap with \[lq]synclabels =
yes\[rq]), then you will need to use \-F each time you run muchsync.
Specify it as a server option (after the server name) if the editing
happens server\-side.
.RE
.TP
.B \-r /path/to/muchsync
Specifies the path to muchsync on the server.
Ordinarily, muchsync should be in the default PATH on the server so this
option is not required.
However, this option is useful if you have to install muchsync in a
non\-standard place or wish to test development versions of the code.
.TP
.B \-s ssh\-cmd
Specifies a command line to pass to /bin/sh to execute a command on
another machine.
The default value is \[lq]ssh \-CTaxq\[rq].
Note that because this string is passed to the shell, special characters
including spaces may need to be escaped.
.TP
.B \-v
The \-v option increases verbosity.
The more times it is specified, the more verbose muchsync will become.
.TP
.B \-\-help
Print a brief summary of muchsync\[cq]s command\-line options.
.TP
.B \-\-init \f[I]maildir\f[R]
This option clones an existing mailbox on a remote server into
\f[I]maildir\f[R] on the local machine.
Neither \f[I]maildir\f[R] nor your notmuch configuration file (see
\f[C]\-\-config\f[R] above) should exist when you run this command, as
both will be created.
The configuration file is copied from the server (adjusted to reflect
the local maildir), while \f[I]maildir\f[R] is created as a replica of
the maildir you have on the server.
.TP
.B \-\-nonew
Ordinarily, muchsync begins by running \[lq]notmuch new\[rq].
This option says not to run \[lq]notmuch new\[rq] before starting the
muchsync operation.
It can be passed as either a client or a server option.
For example, the command \[lq]\f[C]muchsync myserver \-\-nonew\f[R]\[rq]
will run \[lq]\f[C]notmuch new\f[R]\[rq] locally but not on myserver.
.TP
.B \-\-noup, \-\-noupload
Transfer files from the server to the client, but not vice versa.
.TP
.B \-\-upbg
Transfer files from the server to the client in the foreground.
Then fork into the background to upload any new files from the client to
the server.
This option is useful when checking new mail, if you want to begin
reading your mail as soon as it has been downloaded while the upload
continues.
.TP
.B \-\-self
Print the 64\-bit replica ID of the local maildir replica and exit.
Potentially useful in higher\-level scripts, such as the emacs
notmuch\-poll\-script variable for identifying on which replica one is
running, particularly if network file systems allow a replica to be
accessed from multiple machines.
.TP
.B \-\-newid
Muchsync requires every replica to have a unique 64\-bit identifier.
If you ever copy a notmuch database to another machine, including the
muchsync state, bad things will happen if both copies use muchsync, as
they will both have the same identifier.
Hence, after making such a copy and before running muchsync to synchronize
mail, run \f[C]muchsync \-\-newid\f[R] to change the identifier of one
of the copies.
.TP
.B \-\-version
Print the muchsync version number.
.SH EXAMPLES
.PP
To initialize the muchsync database, you can run:
.IP
.nf
\f[C]
muchsync \-vv
\f[R]
.fi
.PP
This first executes \[lq]\f[C]notmuch new\f[R]\[rq], then builds the
initial muchsync database from the contents of your maildir (the
directory specified as \f[C]database.path\f[R] in your notmuch
configuration file).
This command may take several minutes the first time it is run, as it
must compute a content hash of every message in the database.
Note that you do not need to run this command, as muchsync will
initialize the database the first time a client tries to synchronize
anyway.
.IP
.nf
\f[C]
muchsync \-\-init \[ti]/maildir myserver
\f[R]
.fi
.PP
This first runs \[lq]notmuch new\[rq] on myserver, then creates a directory
\f[C]\[ti]/maildir\f[R] containing a replica of your mailbox on
myserver.
Note that neither your configuration file (by default
\f[C]\[ti]/.notmuch\-config\f[R]) nor \f[C]\[ti]/maildir\f[R] should
exist before running this command, as both will be created.
.PP
To create a \f[C]notmuch\-poll\f[R] script that fetches mail from a
remote server \f[C]myserver\f[R], but on that server just runs
\f[C]notmuch new\f[R], do the following: First, run
\f[C]muchsync \-\-self\f[R] on the server to get the replica ID.
Then take the ID returned (e.g., \f[C]1968464194667562615\f[R]) and
embed it in a shell script as follows:
.IP
.nf
\f[C]
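#!/bin/sh
# Ask the local muchsync which replica this is; give up if that fails.
self=$(\[dq]$HOME/muchsync\[dq] \-\-self) || exit 1
if [ \[dq]$self\[dq] = 1968464194667562615 ]; then
    # On the server itself: just index newly delivered mail.
    exec notmuch new
else
    # On a client: download in the foreground, upload in the background.
    exec \[dq]$HOME/muchsync\[dq] \-r ./muchsync \-\-upbg myserver
fi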
\f[R]
.fi
.PP
The path of such a script is a good candidate for the emacs
\f[C]notmuch\-poll\-script\f[R] variable.
.PP
Alternatively, to have the command \f[C]notmuch new\f[R] on a client
automatically fetch new mail from server \f[C]myserver\f[R], you can
place the following in the file \f[C].notmuch/hooks/post\-new\f[R] under
your mail directory:
.IP
.nf
\f[C]
#!/bin/sh
muchsync \-\-nonew \-\-upbg myserver
\f[R]
.fi
.SH FILES
.PP
The default notmuch configuration file is
\f[C]$HOME/.notmuch\-config\f[R].
.PP
muchsync keeps all of its state in a subdirectory of your top maildir
called \f[C].notmuch/muchsync\f[R].
.SH SEE ALSO
.PP
notmuch(1).
.SH BUGS
.PP
muchsync expects initially to create replicas from scratch.
If you have created a replica using another tool such as offlineimap and
you try to use muchsync to synchronize them, muchsync will assume every
file has an update conflict.
This is okay if the two replicas are identical; if they are not, it will
result in artifacts such as files deleted in only one replica
reappearing.
Ideally muchsync needs an option like \f[C]\-\-clobber\f[R] that makes a
local replica identical to the remote one without touching the remote
one, so that an old version of a mail directory can be used as a
disposable cache to bootstrap initialization.
.PP
muchsync never deletes directories.
If you want to remove a subdirectory completely, you must manually
execute rmdir on all replicas.
Even if you manually delete a subdirectory, it will live on in the
notmuch database.
.PP
To synchronize deletions and re\-creations properly, muchsync never
deletes content hashes and their message IDs from its database, even
after the last copy of a message has disappeared.
Such stale hashes should not consume an inordinate amount of disk space,
but could conceivably pose a privacy risk if users believe deleting a
message removes all traces of it.
.PP
Message tags are synchronized based on notmuch\[cq]s message\-ID
(usually the Message\-ID header of a message), rather than based on
message contents.
This is slightly strange because very different messages can have the
same Message\-ID header, meaning the user will likely only read one of
many messages bearing the same Message\-ID header.
It is conceivable that an attacker could suppress a message from a
mailing list by sending another message with the same Message\-ID.
This bug is in the design of notmuch, and hence not something that
muchsync can work around.
muchsync itself does not assume Message\-ID equivalence, relying instead
on content hashes to synchronize link counts.
Hence, any tools used to work around the problem should work on all
replicas.
.PP
Because notmuch and Xapian do not keep any kind of modification time on
database entries, every invocation of muchsync requires a complete scan
of all tags in the Xapian database to detect any changed tags.
Fortunately muchsync heavily optimizes the scan so that it should take
well under a second for 100,000 mail messages.
However, this means that interfaces such as those used by notmuch\-dump
are not efficient enough (see the next paragraph).
.PP
muchsync makes certain assumptions about the structure of notmuch\[cq]s
private types \f[C]notmuch_message_t\f[R] and
\f[C]notmuch_directory_t\f[R].
In particular, it assumes that the Xapian document ID is the second
field of these data structures.
Sadly, there is no efficient and clean way to extract this information
from the notmuch library interface.
muchsync also makes other assumptions about how tokens are named in the
Xapian database.
These assumptions are necessary because the notmuch library interface
and the notmuch dump utility are too slow to support synchronization
every time you check mail.
.SH AUTHORS
David Mazieres.