File: AnnotatedSchema.md

package info (click to toggle)
calendarserver 9.2%2Bdfsg-1
  • links: PTS, VCS
  • area: main
  • in suites: buster
  • size: 25,688 kB
  • sloc: python: 195,037; sql: 78,794; xml: 16,936; sh: 2,502; ansic: 66; makefile: 26
file content (260 lines) | stat: -rw-r--r-- 19,503 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
CalendarServer: Annotated Schema
================================

# Introduction

This document serves as a description of the [CalendarServer SQL schema](https://github.com/apple/ccs-calendarserver/blob/master/txdav/common/datastore/sql_schema/current.sql). The goal is to document the tables, their relationships, how they map to app-layer (CalDAV) actions, how they interact with the directory system, and how they are used for scheduling and other internal operations. This document will concentrate on the calendaring specific schema elements and skip discussion of the contacts elements, but the contacts schema is a simplified version of calendar, so the same basic concepts apply.

The CalendarServer database stores the calendaring data associated with a user. The actual details of a user are not stored in the database, instead they are assumed to come from the directory service configured for use with CalendarServer. To relate data in the database to directory users, the directory service GUID or UUID record entry for each calendar user is used as a "key" in the database. The directory service is assumed to be read-only, but some user-specific data can be modified on the server - in particular calendaring delegate relationships - and we do maintain separate tables in the database for that. Additionally, the server needs to detect changes to group membership so group membership information is also cached in the database.

Each key piece of data in the database will have an associated unique-id, derived from a database `SEQUENCE` value, and these are used for the foreign key relationships between tables.
 
# CALENDAR\_HOME

```sql
create table CALENDAR_HOME (
  RESOURCE_ID      integer      primary key default nextval('RESOURCE_ID_SEQ'),
  OWNER_UID        varchar(255) not null,
  STATUS           integer      default 0 not null,
  DATAVERSION      integer      default 0 not null
);
```

This table provides the main mapping between a directory record GUID/UUID and the calendar data associated with a user. The `OWNER_UID` column is the directory record GUID/UUID value. The `RESOURCE_ID` is used as the foreign key in other tables to define ownership of calendar and calendar object data. A `CALENDAR_HOME` table entry is provisioned the first time a calendar user logs in and makes a CalDAV request, or the first time a calendar user is referred to as an attendee of an event or a sharee of a calendar. The server always checks the directory record first to ensure a calendar user exists and is active before creating the `CALENDAR_HOME` entry for that user.

The `STATUS` column is used to represent different internal modes for the corresponding associated calendar data. Details TBD.

The `DATAVERSION` column is used to support on-demand data upgrades. In some cases it may be necessary to upgrade data on the server - that could be either calendar data or database related metadata. Rather than lock the database and upgrade all the data at once, we have the option to upgrade data on demand or in the background. The `DATAVERSION` column is used to track the status of that. The value is an integer representing the current version of the data. If that version is less than the expected version, then an upgrade is needed and the appropriate tools can key of that to initiate background or on-demand processing.

# CALENDAR\_HOME\_METADATA

```sql
create table CALENDAR_HOME_METADATA (
  RESOURCE_ID              integer     primary key references CALENDAR_HOME on delete cascade,
  QUOTA_USED_BYTES         integer     default 0 not null,
  TRASH                    integer     default null references CALENDAR on delete set null,
  DEFAULT_EVENTS           integer     default null references CALENDAR on delete set null,
  DEFAULT_TASKS            integer     default null references CALENDAR on delete set null,
  DEFAULT_POLLS            integer     default null references CALENDAR on delete set null,
  ALARM_VEVENT_TIMED       text        default null,
  ALARM_VEVENT_ALLDAY      text        default null,
  ALARM_VTODO_TIMED        text        default null,
  ALARM_VTODO_ALLDAY       text        default null,
  AVAILABILITY             text        default null,
  CREATED                  timestamp   default timezone('UTC', CURRENT_TIMESTAMP),
  MODIFIED                 timestamp   default timezone('UTC', CURRENT_TIMESTAMP)
);
```

This table has one row for every row in `CALENDAR_HOME`. It maintains data about a user's calendar home that changes, potentially fairly frequently. Rather than store that in the `CALENDAR_HOME` table, it was found to be better to have an associated table to maintain the data - primarily to avoid lock contention issues. Data is this table can change as a direct result of a user action (e.g., a `PROPPATCH` of properties on the user's calendar home resource), or as the side-effect of other changes (e.g., the `QUOTA_USED_BYTES` is changed each time an attachment is added, changed, or removed).

# CALENDAR

```sql
create table CALENDAR (
  RESOURCE_ID integer   primary key default nextval('RESOURCE_ID_SEQ')
);
```

This table represents a calendar - basically a container for calendar objects. The `CALENDAR_OBJECT` table has a single `CALENDAR` reference for each entry - i.e., there is a one-to-many relationship between `CALENDAR` and `CALENDAR_OBJECT` entries. However, the relationship between calendars and calendar homes is more complicated: typically a calendar will be owned by a specific user, so that represents a one-to-many relationship between calendar homes and calendars. However, calendars can also be shared to other users such that they appear in those other users' calendar homes. So that gives rise to a many-to-many relationship between calendar homes and calendars. That many-to-many relationship is managed via the `CALENDAR_BIND` table, described below. Each row in that table defines a relationship between one calendar home and one calendar.

# CALENDAR\_METADATA

```sql
create table CALENDAR_METADATA (
  RESOURCE_ID           integer      primary key references CALENDAR on delete cascade,
  SUPPORTED_COMPONENTS  varchar(255) default null,
  CHILD_TYPE            integer      default 0 not null,
  TRASHED               timestamp    default null,
  IS_IN_TRASH           boolean      default false not null, -- collection is in the trash
  CREATED               timestamp    default timezone('UTC', CURRENT_TIMESTAMP),
  MODIFIED              timestamp    default timezone('UTC', CURRENT_TIMESTAMP)
);
```

This table plays a similar role to `CALENDAR_HOME_METADATA` in that it maintains mutable state for calendars, again separated from the main table in order to reduce lock contention issues.

# CALENDAR\_BIND

```sql
create table CALENDAR_BIND (
  CALENDAR_HOME_RESOURCE_ID integer      not null references CALENDAR_HOME,
  CALENDAR_RESOURCE_ID      integer      not null references CALENDAR on delete cascade,
  CALENDAR_RESOURCE_NAME    varchar(255) not null,
  BIND_MODE                 integer      not null,
  BIND_STATUS               integer      not null,
  BIND_REVISION             integer      default 0 not null,
  BIND_UID                  varchar(36)  default null,
  MESSAGE                   text,
  TRANSP                    integer      default 0 not null,
  ALARM_VEVENT_TIMED        text         default null,
  ALARM_VEVENT_ALLDAY       text         default null,
  ALARM_VTODO_TIMED         text         default null,
  ALARM_VTODO_ALLDAY        text         default null,
  TIMEZONE                  text         default null,

  primary key (CALENDAR_HOME_RESOURCE_ID, CALENDAR_RESOURCE_ID),
  unique (CALENDAR_HOME_RESOURCE_ID, CALENDAR_RESOURCE_NAME)
);
```

This table maintains the mapping between a calendar and the calendar home it is meant to appear in. Each row has a column for the referenced calendar home and calendar entry defining the "bind". The type of "bind" is defined by the `BIND_MODE` column. There are several values for `BIND_MODE` but the primary purpose is to distinguish between "owned" and "shared" calendars. For any pair of calendar home and calendar, there can (must be) only one owned calendar: `BIND_MODE == 0`, but there can be many shared calendars in various states (invited, accepted, declined etc) as indicated by the `BIND_STATUS` value.

This table also maintains per-user calendar state, such as alarms and associated time zone, which may differ between the various users who have access to the calendar.

# CALENDAR\_OBJECT

```sql
create table CALENDAR_OBJECT (
  RESOURCE_ID          integer      primary key default nextval('RESOURCE_ID_SEQ'),
  CALENDAR_RESOURCE_ID integer      not null references CALENDAR on delete cascade,
  RESOURCE_NAME        varchar(255) not null,
  ICALENDAR_TEXT       text         not null,
  ICALENDAR_UID        varchar(255) not null,
  ICALENDAR_TYPE       varchar(255) not null,
  ATTACHMENTS_MODE     integer      default 0 not null,
  DROPBOX_ID           varchar(255),
  ORGANIZER            varchar(255),
  RECURRANCE_MIN       date,
  RECURRANCE_MAX       date,
  ACCESS               integer      default 0 not null,
  SCHEDULE_OBJECT      boolean      default false,
  SCHEDULE_TAG         varchar(36)  default null,
  SCHEDULE_ETAGS       text         default null,
  PRIVATE_COMMENTS     boolean      default false not null,
  MD5                  char(32)     not null,
  TRASHED              timestamp    default null,
  ORIGINAL_COLLECTION  integer      default null,
  CREATED              timestamp    default timezone('UTC', CURRENT_TIMESTAMP),
  MODIFIED             timestamp    default timezone('UTC', CURRENT_TIMESTAMP),
  DATAVERSION          integer      default 0 not null
);
```

This table maintains the data and metadata for calendar objects. Each entry references a single calendar, in which the calendar object is meant to appear. The actual iCalendar data is stored as text in the `ICALENDAR_TEXT` column. The other columns maintain associated metadata used by the business logic for various purposes. The `DATAVERSION` column is used in a similar manner to the column with the same name on the `CALENDAR_HOME` to support incremental, or on-demand, data upgrades. `RECURRANCE_MIN` and `RECURRANCE_MAX` are used to define the range of validity for expanded time range instance data for the calendar object, as stored in the `TIME_RANGE` table, and used during time range and freebusy queries. Other columns such as `ICALENDAR_UID`, `ACCESS`, and `PRIVATE_COMMENTS`, provide quick access to frequently used data that would otherwise have to be read and parsed from the actual iCalendar data.

# TIME\_RANGE

```sql
create table TIME_RANGE (
  INSTANCE_ID                 integer        primary key default nextval('INSTANCE_ID_SEQ'),
  CALENDAR_RESOURCE_ID        integer        not null references CALENDAR on delete cascade,
  CALENDAR_OBJECT_RESOURCE_ID integer        not null references CALENDAR_OBJECT on delete cascade,
  FLOATING                    boolean        not null,
  START_DATE                  timestamp      not null,
  END_DATE                    timestamp      not null,
  FBTYPE                      integer        not null,
  TRANSPARENT                 boolean        not null
);
```

This table maintains a list of expanded recurrence instances for the associated calendar object. We want time range and freebusy queries to be fast, which means we don't want to have to read, parse, and carry out recurrence expansion on every calendar object in a calendar each time one of those queries is done (since they happen frequently). Instead, we do the instance expansion once and cache the result in the `TIME_RANGE` table so that the queries can directly target those tables to return results. However, this is an on-demand cache, so typically we don't expand the instances when the calendar object is created or updated, but instead wait for the first time range or freebusy query to occur and then do the expansion. The reason for that is that sometimes a calendar object can be updated multiple times in rapid succession, without a time range or freebusy query occurring in between, so the time range expansion work would be wasted for all but the last update. Also, the server does not cache all instances of an event, but instead only caches over a period of past/future time to limit the amount of data that needs to be stored since some recurring events can be unbounded or long lived, yet most time range or freebusy queries are done over short periods of time centered on the current time.

# PERUSER

```sql
create table PERUSER (
  TIME_RANGE_INSTANCE_ID      integer      not null references TIME_RANGE on delete cascade,
  USER_ID                     varchar(255) not null,
  TRANSPARENT                 boolean      not null,
  ADJUSTED_START_DATE         timestamp    default null,
  ADJUSTED_END_DATE           timestamp    default null
);
```

Another aspect of time range data is that there is a per-user component to it: specifically, a single calendar object appears in a single calendar, but that calendar could be shared with multiple users, each of which can set their own travel time and transparency status on the event. As a result, we need to maintain the per-user data in such a way that we can pick the appropriate items to use for a time range of freebusy query depending on which user is carrying out the query request. The `PERUSER` table is used for that purpose. If a user with access to a calendar object has per-user data that differs from the default for any instance of the data, then they will have an associated entry in the `PERUSER` table that relates their directory GUID/UUID - via the `USER_ID` column - to the specific instance in the `TIME_RANGE` table. The time range and freebusy queries then include this table as a join to ensure the required information is extracted.

# ATTACHMENT

```sql
create table ATTACHMENT (
  ATTACHMENT_ID               integer           primary key default nextval('ATTACHMENT_ID_SEQ'),
  CALENDAR_HOME_RESOURCE_ID   integer           not null references CALENDAR_HOME,
  DROPBOX_ID                  varchar(255),
  CONTENT_TYPE                varchar(255)      not null,
  SIZE                        integer           not null,
  MD5                         char(32)          not null,
  CREATED                     timestamp default timezone('UTC', CURRENT_TIMESTAMP),
  MODIFIED                    timestamp default timezone('UTC', CURRENT_TIMESTAMP),
  PATH                        varchar(1024)     not null
);
```

This table maintains information about attachments stored on the server. Each attachment has a specific user as its owner (identified by the `CALENDAR_HOME_RESOURCE_ID` column). That is the user whose quota will be impacted by the attachment. The attachment data is stored as a file on disk - not in the database.

# ATTACHMENT\_CALENDAR\_OBJECT

This table maintains a mapping between attachments and calendar objects and is needed because a single attachment can appear in multiple calendar objects.

# RESOURCE\_PROPERTY

```sql
create table RESOURCE_PROPERTY (
  RESOURCE_ID integer      not null,
  NAME        varchar(255) not null,
  VALUE       text         not null,
  VIEWER_UID  varchar(255)
);
```

Each WebDAV resource at the application layer can have associated WebDAV "properties", which can be server or user defined metadata. Server specific metadata is usually stored in specific columns in the relevant calendar home, calendar, or calendar object data. User defined properties might be stored in those tables if we need fast access to the data, or it will be stored in the `RESOURCE_PROPERTY` table, with the value being the XML "blob" used by WebDAV.

# CALENDAR\_OBJECT\_REVISIONS

```sql
create table CALENDAR_OBJECT_REVISIONS (
  CALENDAR_HOME_RESOURCE_ID integer      not null references CALENDAR_HOME,
  CALENDAR_RESOURCE_ID      integer      references CALENDAR,
  CALENDAR_NAME             varchar(255) default null,
  RESOURCE_NAME             varchar(255),
  REVISION                  integer      default nextval('REVISION_SEQ') not null,
  DELETED                   boolean      not null,
  MODIFIED                  timestamp    default timezone('UTC', CURRENT_TIMESTAMP) not null
);
```

To allow clients to incrementally update their cache of data from the server, the server supports a WebDAV "sync" REPORT. In order to carry out that query the server needs to track changes and deletions of all resources: calendars and calendar objects. The `CALENDAR_OBJECT_REVISIONS` table is used to track that state and to provide results for the sync query, by tracking the changes to both calendars and calendar objects in one place (so that a single query can be scoped to just or calendar or an entire calendar home). When a calendar or calendar object is deleted, its entry in the `CALENDAR_BIND` or `CALENDAR_OBJECT` table is removed, however the sync query needs to still maintain at least the name of that resource to report the deletion back to the client, so this table needs to maintain such "tomb stones".

# APN\_SUBSCRIPTIONS

```sql
create table APN_SUBSCRIPTIONS (
  TOKEN                         varchar(255) not null,
  RESOURCE_KEY                  varchar(255) not null,
  MODIFIED                      integer      not null,
  SUBSCRIBER_GUID               varchar(255) not null,
  USER_AGENT                    varchar(255) default null,
  IP_ADDR                       varchar(255) default null
);
```

This table tracks client subscriptions to APNS push notifications, so that when a change on the server occurs, the server can determine which clients need to be sent push notifications. We store MODIFIED (seconds since the Epoch) because we purge subscriptions older than config.SubscriptionPurgeSeconds.  Clients are expected to re-subscribe at least every config.SubscriptionRefreshIntervalSeconds.  SUBSCRIBER_GUID, USER_AGENT, and IP_ADDR are used purely for diagnosing problems via the push command utility.

# IMIP\_TOKENS

```sql
create table IMIP_TOKENS (
  TOKEN                         varchar(255) not null,
  ORGANIZER                     varchar(255) not null,
  ATTENDEE                      varchar(255) not null,
  ICALUID                       varchar(255) not null,
  ACCESSED                      timestamp    default timezone('UTC', CURRENT_TIMESTAMP)
);
```

When scheduling messages are sent via email to calendar users not hosted on the server, the server needs a way to be able to track replies coming back from the attendees based on information in the iCalendar reply message. This table is used to track sufficient information for the server to be able to relate the iCalendar data in an email reply with the calendar object that generated the original request.

# *\_MIGRATION

The set of tables with the `_MIGRATION` suffix are used by the incremental cross-pod migration process to track the status of migrating data.

# CALENDARSERVER

```sql
create table CALENDARSERVER (
  NAME                          varchar(255) primary key, -- implicit index
  VALUE                         varchar(255)
);
```

This table maintains global state about the overall database using a "key-value" data mode. The `VERSION` key represents the current SQL schema version present in the database, and is checked on server startup to determine whether a schema upgrade is needed. Other `*-DATAVERSION` keys are used to determine whether data upgrades are needed.