File: performance.md

# Tessera Storage Performance

Tessera is designed to scale to meet the needs of most currently envisioned workloads in a cost-effective manner.

All storage backends have been tested to sustain the write throughput of CT-scale loads without issue.
The read API of Tessera-based logs scales extremely well due to the immutable-resource-based approach used, which allows for:
1. Aggressive caching to be applied, e.g. via a CDN
2. Horizontal scaling of read infrastructure (e.g. object storage)[^1]

[^1]: The MySQL storage backend differs from the others in that reads must be served via the personality rather than directly.
      However, due to changes in how MySQL is used compared to Trillian v1, read performance should be far better, and _could_ still
      be scaled horizontally with additional MySQL read replicas & read-only personality instances.

Below are some indicative figures which show the rough scale of performance we've seen from deploying Tessera conformance
binaries in various environments.

## Performance factors

### Resources

Exact performance numbers are highly dependent on the infrastructure being used (e.g. storage type & locality, host resources
of the machine(s) running the personality binary, network speed and weather, etc.). If in doubt, you should run your own performance
tests on infrastructure which is as close as possible to that which will ultimately be used to run the log in production.
The [conformance binaries](/cmd/conformance) and [hammer tool](/internal/hammer) are designed for this kind of performance testing.

### Antispam

Antispam is a feature which performs best-effort deduplication of incoming entries. While cheaper than _strong atomic_ deduplication
would be, it is still a relatively expensive operation in terms of both storage and throughput.
Not all personality designs will require it, so Tessera is built such that you only incur these costs if they are necessary
for your design.

Leaving antispam disabled will greatly increase the throughput of the log, and decrease CPU and storage costs.
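
As a rough sketch of how this opt-in works in practice, a personality might only wire antispam into its append options when its design
calls for it, along the following lines. The identifiers used here (`tessera.NewAppendOptions`, `WithAntispam`, the in-memory cache size
of 256, and the driver/antispam arguments) are assumptions based on one reading of the Tessera API and may not match your release
exactly; treat the godoc for the version you are using as authoritative.

```go
// Hedged sketch, not a verbatim Tessera example: identifiers below are
// assumed from the Tessera godoc and may differ between releases.
package personality

import (
	"context"

	tessera "github.com/transparency-dev/tessera"
	"golang.org/x/mod/sumdb/note"
)

// newAppender builds an Appender, paying the antispam storage/CPU cost only
// when the personality actually wants best-effort deduplication.
func newAppender(ctx context.Context, driver tessera.Driver, signer note.Signer, as tessera.Antispam) (*tessera.Appender, error) {
	opts := tessera.NewAppendOptions().WithCheckpointSigner(signer)
	if as != nil {
		// 256 is an illustrative in-memory dedup cache size; the second
		// argument is a backend-specific persistent antispam implementation.
		opts = opts.WithAntispam(256, as)
	}
	appender, _, _, err := tessera.NewAppender(ctx, driver, opts)
	return appender, err
}
```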


## Backends

The currently supported storage backends are listed below, with a rough idea of the expected performance figures.
Individual storage implementations may have more detailed information about performance in their respective directories.

### GCP

The main lever for cost vs performance on GCP is Spanner, in the form of "Processing Units" (PUs).
PUs can be allocated in blocks of 100, and 1,000 PUs is equivalent to one Spanner node.

The table below shows some rough measured performance figures:

| Spanner PUs | Frontend instances | QPS (no antispam) | QPS (antispam) |
|-------------|--------------------|-------------------|----------------|
| 100         | 1                  | > 3,000           | > 800          |
| 200         | 1                  | not measured      | > 1,500        |
| 300         | 1                  | not measured      | > 3,000        |
| 300         | 2                  | not measured      | > 5,000        |


### POSIX

Performance of the POSIX storage backend is highly dependent on the underlying infrastructure; some representative examples
of the performance on different types of infrastructure are given below.

#### Local storage

##### NVMe

The log and hammer were both run on the same VM, with the log using a ZFS subvolume backed by an NVMe mirror.
With antispam enabled, it was able to sustain around 10,000 writes/s, using up to 7 cores for the server.

```
┌───────────────────────────────────────────────────────────────────────────┐
│Read (8 workers): Current max: 20/s. Oversupply in last second: 0          │
│Write (30000 workers): Current max: 10000/s. Oversupply in last second: 0  │
│TreeSize: 5042936 (Δ 10567qps over 30s)                                    │
│Time-in-queue: 1889ms/2990ms/3514ms (min/avg/max)                          │
│Observed-time-to-integrate: 2255ms/3103ms/3607ms (min/avg/max)             │
├───────────────────────────────────────────────────────────────────────────┤
```


##### SAS 12Gb/s HDD

A single local instance on a 12-core VM with 8GB of RAM, writing to a local filesystem stored on a mirrored pair of SAS disks.

Without antispam, it was able to sustain around 2,900 writes/s.

```
┌────────────────────────────────────────────────────────────────────────────────────┐
│Read (8 workers): Current max: 20/s. Oversupply in last second: 0                   │
│Write (3000 workers): Current max: 3000/s. Oversupply in last second: 0             │
│TreeSize: 1470460 (Δ 2927qps over 30s)                                              │
│Time-in-queue: 136ms/1110ms/1356ms (min/avg/max)                                    │
│Observed-time-to-integrate: 583ms/6019ms/6594ms (min/avg/max)                       │
├────────────────────────────────────────────────────────────────────────────────────┤
```

With antispam enabled (Badger), it was able to sustain around 1,600 writes/s.

```
┌────────────────────────────────────────────────────────────────────────────────────┐
│Read (8 workers): Current max: 20/s. Oversupply in last second: 0                   │
│Write (1800 workers): Current max: 1800/s. Oversupply in last second: 0             │
│TreeSize: 2041087 (Δ 1664qps over 30s)                                              │
│Time-in-queue: 0ms/112ms/448ms (min/avg/max)                                        │
│Observed-time-to-integrate: 593ms/3232ms/5754ms (min/avg/max)                       │
├────────────────────────────────────────────────────────────────────────────────────┤
```


#### Network storage

A 4-node CephFS cluster (1 admin node, 3 storage nodes) running on E2 VMs sustained > 1,000 writes/s.

#### GCP Free Tier VM Instance

A small `e2-micro` free-tier VM is able to sustain > 1,500 writes/s using a mounted Persistent Disk to store the log.

> [!NOTE]
> Virtual CPUs (vCPUs) in virtualized environments often share physical CPU cores with other vCPUs, which introduces variability
> and can impact performance.

```
┌───────────────────────────────────────────────────────────────────────┐
│Read (184 workers): Current max: 0/s. Oversupply in last second: 0     │
│Write (600 workers): Current max: 1758/s. Oversupply in last second: 0 │
│TreeSize: 1882477 (Δ 1587qps over 30s)                                 │
│Time-in-queue: 149ms/371ms/692ms (min/avg/max)                         │
│Observed-time-to-integrate: 569ms/1191ms/1878ms (min/avg/max)          │
└───────────────────────────────────────────────────────────────────────┘
```

More details on Tessera POSIX performance can be found [here](/storage/posix/PERFORMANCE.md).


### MySQL

Figures below were measured using VMs on GCP in order to provide an idea of the size of machine required to
achieve the results.

> [!NOTE]
> For Tessera deployments on GCP, we **strongly recommend** using the Tessera GCP storage implementation instead.


#### GCP free-tier + CloudSQL

Tessera running on an `e2-micro` free-tier VM instance on GCP, using CloudSQL for storage, can sustain around 2,000 writes/s.

```
┌───────────────────────────────────────────────────────────────────────┐
│Read (8 workers): Current max: 0/s. Oversupply in last second: 0       │
│Write (512 workers): Current max: 2571/s. Oversupply in last second: 0 │
│TreeSize: 2530480 (Δ 2047qps over 30s)                                 │
│Time-in-queue: 41ms/120ms/288ms (min/avg/max)                          │
│Observed-time-to-integrate: 568ms/636ms/782ms (min/avg/max)            │
└───────────────────────────────────────────────────────────────────────┘
```

#### GCP free-tier VM only

Tessera + MySQL both running on an `e2-micro` free-tier VM instance on GCP can sustain around 300 writes/s.

```
┌──────────────────────────────────────────────────────────────────────┐
│Read (8 workers): Current max: 0/s. Oversupply in last second: 0      │
│Write (256 workers): Current max: 409/s. Oversupply in last second: 0 │
│TreeSize: 240921 (Δ 307qps over 30s)                                  │
│Time-in-queue: 86ms/566ms/2172ms (min/avg/max)                        │
│Observed-time-to-integrate: 516ms/1056ms/2531ms (min/avg/max)         │
└──────────────────────────────────────────────────────────────────────┘
```

More details on Tessera MySQL performance can be found [here](/storage/mysql/PERFORMANCE.md).


### AWS

Coming soon.