CalendarServer: Attachments
===========================
# Introduction
This document describes the support for attachments in calendar events in CalendarServer. First some background. The initial CalendarServer attachment support was implemented using the "dropbox" protocol. In that protocol, the server advertised a "dropbox" collection on each principal resource (the dropbox was always located as a top-level collection inside a principal's calendar home). To add an attachment to an event, the client would:
1. Create a child collection inside the dropbox collection (usually clients use the UID of the event for that collection name).
2. Set the permissions on the dropbox child collection so that the owner and any attendees on the event could access its contents.
3. Store the actual attachment data in the dropbox child collection.
4. Add an X-APPLE-DROPBOX property to the event with a value set to the URI of the dropbox child collection.
5. Add an ATTACH property to the event with a URI value pointing to the attachment resource in the dropbox child collection.
When an event with an X-APPLE-DROPBOX property is opened, the client scans the dropbox child collection (using PROPFIND) to determine what attachment resources are present, and the client reconciles the list of attachments in the event. In addition, attendees of events were given permission to add attachments to the dropbox child collection by simply storing them there (which would cause them to appear the next time the event attachment list was reconciled).
There were a number of problems with this approach, notably that clients often did not manage the ACLs properly. Other vendors were also not keen on this approach. Instead, the [Calendaring and Scheduling Consortium](https://calconnect.org) developed a new specification for "managed attachments" that puts the server in control of managing the permissions for access to the attachment data, and that provides a single-request mode for adding attachments to an event.
In the managed attachment protocol, a client adds an attachment as follows:
1. The client sends a POST request on the event calendar object resource using an `action=add-attachment` query parameter, and includes the attachment data in the body of the request, in addition to `Content-Type` and `Content-Disposition` HTTP header fields that describe the attachment.
2. On receipt of the POST request, the server stores the attachment and adds an appropriate `ATTACH` property to the calendar event data, with a `MANAGED-ID` property parameter whose value is a unique ID generated by the server (and used to track the attachment from that time on).
3. The client can refresh the event to get details of the `ATTACH` property added.
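
The client side of step 1 can be sketched with the Python standard library. This is a minimal illustration rather than code from any real client: the helper name `build_add_attachment_request`, the example URL, and the file name are all hypothetical, and the request is only constructed, not sent. It assumes the file name contains no double quotes.

```python
from urllib.request import Request


def build_add_attachment_request(event_url, filename, content_type, data):
    """Build (but do not send) a POST that adds an attachment to an event.

    Hypothetical helper: only the query parameter and the two header
    fields come from the protocol description above.
    """
    # The action is selected with a query parameter on the event's own URL.
    url = event_url + "?action=add-attachment"
    headers = {
        # These two header fields describe the attachment carried in the body.
        "Content-Type": content_type,
        "Content-Disposition": 'attachment; filename="%s"' % filename,
    }
    return Request(url, data=data, headers=headers, method="POST")


# Example values are made up for illustration.
request = build_add_attachment_request(
    "https://calendar.example.com/calendars/users/alice/calendar/event.ics",
    "agenda.txt",
    "text/plain",
    b"1. Review minutes\n2. Plan release\n",
)
print(request.get_method(), request.full_url)
```

After the server accepts such a request, the client would re-fetch the event to see the `ATTACH` property and its `MANAGED-ID` parameter, as described in step 3.
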
In order to provide backwards compatibility with the dropbox protocol, CalendarServer stores managed attachments in a child collection in the dropbox collection. Whilst dropbox is still supported, the server defaults to supporting managed attachments (and once managed attachments are enabled the server cannot go back to supporting dropbox).
# Implementation Details
Key Python modules to examine are:
* `twistedcaldav/storebridge.py`, classes:
    * `DropboxCollection` (legacy).
    * `CalendarObjectDropbox` (legacy).
    * `AttachmentsCollection`.
    * `AttachmentsChildCollection`.
    * `CalendarAttachment`.
    * `CalendarObjectResource` - specifically the `POST_handler_attachment` method.
* `txdav/caldav/datastore/sql.py`, classes:
    * `CalendarObject` - specifically `addAttachment`, `updateAttachment`, `removeAttachment` etc.
* `txdav/caldav/datastore/sql_attachment.py`
The relevant SQL schema:
* `CALENDAR_OBJECT` table - and the `ATTACHMENTS_MODE` and `DROPBOX_ID` columns in particular.
* `ATTACHMENT` table.
* `ATTACHMENT_CALENDAR_OBJECT` table.
It is important to note that the attachment data is NOT stored in the database. Instead, attachments are stored in the file system inside the directory specified by the `config.AttachmentsRoot` setting. They are stored in a two-level deep hashed directory structure: the `attachmentID` of the attachment is MD5-hashed, and the first two pairs of characters of the hex-encoded value are used as the names of the two directories in the path. The hex-encoded MD5 hash is then used for the name of the file stored in that location (e.g., `<AttachmentsRoot>/01/23/0123456789`).
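
The path layout can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the server's own code: the function name `attachment_path` and the root directory are hypothetical, and it assumes the `attachmentID` is converted to text and UTF-8 encoded before hashing, which the description above does not specify.

```python
import hashlib
import os.path


def attachment_path(attachments_root, attachment_id):
    """Return the hashed on-disk location for an attachment (illustrative)."""
    # Assumption: the ID is stringified and UTF-8 encoded before hashing.
    digest = hashlib.md5(str(attachment_id).encode("utf-8")).hexdigest()
    # The first two pairs of hex characters name the two directory levels;
    # the full hex digest names the file itself.
    return os.path.join(attachments_root, digest[0:2], digest[2:4], digest)


# Hypothetical root directory and attachment ID.
path = attachment_path("/var/attachments", 42)
print(path)
```

Spreading files over two hashed directory levels keeps any single directory from accumulating a very large number of entries.
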
Information about attachments is stored in the `ATTACHMENT` table in the database. Each attachment has a unique `ATTACHMENT_ID`, a reference to the calendar home collection of the owner (the person against whose quota this attachment will count), a `DROPBOX_ID` (which is always `.` for managed attachments), and then metadata about the attachment itself (name, type, etc.). The `Attachment` class models this table, with the `ManagedAttachment` class being the specialization of that for managed attachments.
Since the same attachment can be used in multiple events for the same owner (e.g., a recurring event that is split), there is a one-to-many mapping between an attachment and calendar object resources. That mapping is managed by the `ATTACHMENT_CALENDAR_OBJECT` table, which also holds the value of the `ManagedID` used to represent the attachment in the calendar data `ATTACH` property.
When an attachment is created by the `POST` request on the calendar object resource, the server streams the attachment data to disk via the `AttachmentStorageTransport` class, then creates the necessary database table entries and updates the calendar object resource `ATTACH` property data. Note that this can trigger scheduling messages to any attendees of the event.
Whenever an event is created or updated on the server, the server must reconcile the `ATTACH` properties that have `MANAGED-ID` property parameters with any existing data for that resource, so that it can detect removal of attachments (which clients do by simply removing the associated `ATTACH` property). That reconciliation is done in the `CalendarObject.resourceCheckAttachments` method. Attachments are reference counted based on the entries in the `ATTACHMENT_CALENDAR_OBJECT` table; that is, once all references to a specific `ATTACHMENT_ID` in the `ATTACHMENT` table have been removed from the `ATTACHMENT_CALENDAR_OBJECT` table, the server will delete the `ATTACHMENT` table entry and delete the attachment file on disk.
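
The reconcile-then-reference-count idea can be modelled in memory as follows. This is a simplified sketch, not the logic of `resourceCheckAttachments`: the `AttachmentStore` class, its attributes, and the example IDs and paths are hypothetical stand-ins for the two tables, and the regular expression only handles unfolded `ATTACH` lines with unquoted parameter values.

```python
import re

# Finds the MANAGED-ID parameter on an ATTACH property line.
# Simplified: folded lines and quoted parameter values are not handled.
_MANAGED_ID = re.compile(
    r"^ATTACH(?=[;:])[^:\r\n]*?;MANAGED-ID=([^;:\r\n]+)", re.MULTILINE
)


def managed_ids(ical_text):
    """Return the set of MANAGED-ID values found on ATTACH properties."""
    return set(_MANAGED_ID.findall(ical_text))


class AttachmentStore:
    """In-memory stand-in for the two tables (hypothetical, for illustration)."""

    def __init__(self):
        self.attachments = {}  # attachment_id -> file path ("ATTACHMENT")
        self.links = set()     # (attachment_id, resource_id, managed_id) rows

    def add(self, attachment_id, path, resource_id, managed_id):
        self.attachments[attachment_id] = path
        self.links.add((attachment_id, resource_id, managed_id))

    def reconcile(self, resource_id, ical_text):
        """Drop links whose MANAGED-ID left the event; return deleted paths."""
        keep = managed_ids(ical_text)
        stale = {
            link for link in self.links
            if link[1] == resource_id and link[2] not in keep
        }
        self.links -= stale
        deleted = []
        for attachment_id in {link[0] for link in stale}:
            # Delete only when no resource references the attachment any more.
            if not any(link[0] == attachment_id for link in self.links):
                deleted.append(self.attachments.pop(attachment_id))
        return sorted(deleted)


store = AttachmentStore()
store.add(1, "/att/aa/bb/aabb", "event-1", "id-1")
store.add(1, "/att/aa/bb/aabb", "event-2", "id-1")  # shared after a split
store.add(2, "/att/cc/dd/ccdd", "event-1", "id-2")

# event-1 is rewritten with only the second attachment left in it.
updated = "BEGIN:VEVENT\nATTACH;MANAGED-ID=id-2:https://example.com/a\nEND:VEVENT\n"
first = store.reconcile("event-1", updated)

# event-2 is rewritten with no ATTACH properties at all.
second = store.reconcile("event-2", "BEGIN:VEVENT\nEND:VEVENT\n")
```

In this model the first reconcile removes a link but deletes nothing, because `event-2` still references attachment 1; only the second reconcile drops the last reference and so releases the stored file.
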
Whenever the client uses a `GET` request to read an attachment via its URI, the server resolves the attachment URI into an `sql_attachment.Attachment` object, then streams the resource data back to the client in the response, with the appropriate HTTP response header fields set. Attachment resources cannot be directly updated (via `PUT` requests) or removed (via `DELETE` requests).
Permission to access an attachment is managed by the `storebridge.AttachmentsChildCollection.accessControlList` method. That method examines the attendee list of the associated event, any sharees of the owner's calendar, and any proxies of the owner, and makes sure they have the appropriate permissions to that collection and all the attachments it contains.