# SQL disk cache backend

This directory contains an experimental SQL-based implementation of the disk
cache (`disk_cache::Backend`). It uses SQLite to store cache entries and is
designed to be more robust and performant than the default cache backends
(block file backend on Windows and simple cache backend on other OSes),
especially in scenarios with a large number of small entries.

The implementation is sharded to improve concurrency, with each shard managing
its own SQLite database file.

## Key Classes and Components

### Core Backend Logic

*   **`SqlBackendImpl`**: This is the main entry point and public-facing class
    for the SQL-based disk cache. It implements the `disk_cache::Backend`
    interface, handling requests from the network stack to open, create, doom,
    and enumerate cache entries. It owns and coordinates the
    `SqlPersistentStore`.

*   **`SqlEntryImpl`**: Implements the `disk_cache::Entry` interface,
    representing a single entry in the cache. It manages the data streams
    (header and body) and metadata (`last_used` time, key, etc.) for an entry.
    All I/O operations are passed to the `SqlBackendImpl`, which then delegates
    them to the persistence layer.

### Persistence Layer

*   **`SqlPersistentStore`**: This class serves as the primary interface to the
    persistence layer, abstracting away the details of database sharding and
    asynchronous operations. It owns multiple `BackendShard` instances and
    distributes operations among them based on the cache entry's key.

*   **`SqlPersistentStore::BackendShard`**: Manages a single shard of the cache.
    Each shard has its own SQLite database file, a dedicated background task
    runner for database operations, and an in-memory index of its entries. It
    owns a `SequenceBound<SqlPersistentStore::Backend>`.

*   **`SqlPersistentStore::Backend`**: This class encapsulates all direct
    interactions with a single SQLite database. It runs entirely on a
    background sequence to avoid blocking the main network thread. Its
    responsibilities include:
    *   Executing all SQL statements for create, read, update, and delete
        operations.
    *   Managing the database schema and transactions.
    *   Handling database initialization and error recovery.
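
As a rough illustration of key-based sharding, the sketch below routes a cache
key to one of several per-shard SQLite files. The hash function (SHA-256
truncated to 64 bits), the shard count, and the file naming are all assumptions
made for this example; this README does not say what the real implementation
uses.

```python
import hashlib

NUM_SHARDS = 3  # assumption: the README only mentions 3 as an example


def shard_index_for_key(cache_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a cache key to a shard index in [0, num_shards)."""
    # A stable hash is needed so the same key always lands on the same shard
    # across process restarts (Python's built-in hash() is salted per run).
    digest = hashlib.sha256(cache_key.encode("utf-8")).digest()
    key_hash = int.from_bytes(digest[:8], "little")
    return key_hash % num_shards


def shard_db_filename(shard_index: int) -> str:
    """Hypothetical per-shard database file name."""
    return f"shard_{shard_index}.sqlite"


if __name__ == "__main__":
    for key in ("https://example.com/a.js", "https://example.com/b.css"):
        print(key, "->", shard_db_filename(shard_index_for_key(key)))
```

Because the mapping depends only on the key, every operation on a given entry
is handled by the same shard and therefore by the same background sequence.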

### Data Structures and Utilities

*   **`CacheEntryKey`**: A memory-efficient wrapper for the cache key string.
    Since cache keys can be long and are stored in multiple in-memory data
    structures, this class uses a `scoped_refptr<base::RefCountedString>` to
    share the underlying string data, reducing memory overhead.

*   **`SqlPersistentStoreInMemoryIndex`**: An in-memory index that maps a hash
    of a `CacheEntryKey` to its `ResId` (the primary key in the database). This
    allows for fast, synchronous checks to see if an entry is likely to exist in
    the cache without needing to perform a slow, asynchronous database query. It
    is highly optimized for memory usage.

*   **`ExclusiveOperationCoordinator`**: A synchronization primitive that
    serializes access to resources. It ensures that "exclusive" operations (like
    cache-wide eviction or cleanup) do not run concurrently with "normal"
    operations (like reading or writing a single entry). Normal operations on
    *different* cache keys can run in parallel.

*   **`EvictionCandidateAggregator`**: A thread-safe helper class used during
    cache eviction. Each shard independently generates a list of its least
    recently used entries as eviction candidates. This class aggregates these
    lists from all shards, performs a final sort, and selects the global set of
    entries to be evicted to bring the cache size back under its limit.

*   **`InFlightEntryModification`**: A mechanism to queue metadata updates
    (`last_used` time, headers, body size) for a cache entry that is not
    currently active (i.e., not held open as a `SqlEntryImpl` object). When an
    operation modifies an entry, it records the change as an in-flight
    modification. If the entry is opened again before the background database
    write completes, these queued modifications are applied to the entry's data
    as it is read from disk. This ensures that the in-memory representation of
    an entry is always consistent with pending operations, even with fully
    asynchronous database writes.
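
The scheduling rule behind `ExclusiveOperationCoordinator` can be modelled in a
few lines. The following is a single-threaded sketch, not the real class: the
method names are hypothetical, completion is reported explicitly by the caller,
and it assumes that normal operations on the *same* key are serialized. It also
blocks strictly on the head of the queue, which is simpler than what a
production coordinator would likely do.

```python
from collections import deque
from typing import Callable, Deque, Optional, Set, Tuple

Task = Callable[[], None]


class ExclusiveOperationCoordinatorSketch:
    """Toy model: tasks are started in FIFO order when the rules allow it."""

    def __init__(self) -> None:
        # A key of None marks an exclusive operation.
        self._queue: Deque[Tuple[Optional[str], Task]] = deque()
        self._active_keys: Set[str] = set()
        self._exclusive_running = False

    def post_normal(self, key: str, task: Task) -> None:
        self._queue.append((key, task))
        self._pump()

    def post_exclusive(self, task: Task) -> None:
        self._queue.append((None, task))
        self._pump()

    def finish_normal(self, key: str) -> None:
        self._active_keys.discard(key)
        self._pump()

    def finish_exclusive(self) -> None:
        self._exclusive_running = False
        self._pump()

    def _pump(self) -> None:
        # Start as many queued operations as the rules permit.
        while self._queue and not self._exclusive_running:
            key, task = self._queue[0]
            if key is None:
                if self._active_keys:
                    return  # exclusive op waits for normal ops to drain
                self._queue.popleft()
                self._exclusive_running = True
                task()
                return
            if key in self._active_keys:
                return  # assumed: same-key operations do not overlap
            self._queue.popleft()
            self._active_keys.add(key)
            task()
```

Posting reads for keys `"a"` and `"b"` starts both immediately. An exclusive
task posted afterwards starts only once both reads have been reported as
finished, and anything queued behind it waits until `finish_exclusive()` is
called.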

## How It Works

1.  **Initialization**: `SqlBackendImpl` creates a `SqlPersistentStore`, which
    in turn creates a number of `BackendShard` instances (e.g., 3), each with
    its own background task runner. Each shard initializes its SQLite database.

2.  **Entry Operations (Create/Open)**:
    *   A request to open or create an entry arrives at `SqlBackendImpl`.
    *   The entry's key is hashed to determine which `BackendShard` is
        responsible for it.
    *   The operation is posted to the shard's background task runner.
    *   The `SqlPersistentStore::Backend` for that shard executes the necessary
        SQL commands to find or create the entry in its database.
    *   The result (a `SqlEntryImpl` or an error) is returned to the main thread
        via a callback.

3.  **Data I/O**: Reading and writing data to a `SqlEntryImpl` follows a similar
    pattern, with operations being posted to the appropriate shard's background
    task runner.

4.  **Eviction**:
    *   When the cache size exceeds a certain threshold, `SqlBackendImpl`
        initiates eviction.
    *   It posts an exclusive "start eviction" task to the `SqlPersistentStore`.
    *   Each shard queries its database for a list of its least recently used
        entries.
    *   The `EvictionCandidateAggregator` collects these lists, selects the
        entries to be removed, and sends the list of doomed entries back to each
        shard to be deleted from the database.

5.  **Coordination**: The `ExclusiveOperationCoordinator` ensures that
    operations like eviction, which affect the entire cache, do not conflict
    with ongoing reads and writes to individual entries. When an exclusive
    operation is requested, the coordinator waits for all active normal
    operations to complete, runs the exclusive operation, and then resumes
    queued normal operations.
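
The eviction step (step 4) can be sketched as a merge of per-shard candidate
lists. This is an illustration only: the `Candidate` tuple and function names
are hypothetical, and the sketch uses a k-way merge of lists that are already
sorted oldest-first, rather than the final sort described above. The two give
the same order when each shard's list is sorted.

```python
import heapq
from typing import Dict, Iterable, List, NamedTuple


class Candidate(NamedTuple):
    last_used: int    # LRU timestamp; smaller means older
    shard: int        # index of the shard that owns the entry
    res_id: int       # primary key inside that shard's database
    bytes_usage: int  # bytes reclaimed if the entry is evicted


def select_evictions(per_shard: Iterable[List[Candidate]],
                     bytes_to_free: int) -> List[Candidate]:
    """Pick the globally oldest entries until enough bytes are freed."""
    selected: List[Candidate] = []
    freed = 0
    # Tuples compare field by field, so the merge orders by last_used first.
    for cand in heapq.merge(*per_shard):
        if freed >= bytes_to_free:
            break
        selected.append(cand)
        freed += cand.bytes_usage
    return selected


def group_by_shard(selected: Iterable[Candidate]) -> Dict[int, List[int]]:
    """Build the per-shard lists of res_ids to delete."""
    doomed: Dict[int, List[int]] = {}
    for cand in selected:
        doomed.setdefault(cand.shard, []).append(cand.res_id)
    return doomed


if __name__ == "__main__":
    shard0 = [Candidate(10, 0, 1, 100), Candidate(40, 0, 2, 300)]
    shard1 = [Candidate(20, 1, 7, 200), Candidate(30, 1, 8, 50)]
    picked = select_evictions([shard0, shard1], bytes_to_free=320)
    print(group_by_shard(picked))
```

Each shard then only needs to delete the `res_id` values in its own list.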

## Database Schema

Each shard of the SQL disk cache uses a SQLite database with the following
schema:

### Tables

*   **`resources`**: Stores the main metadata for each cache entry.
    *   `res_id` (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique ID for the
        resource.
    *   `last_used` (INTEGER): Timestamp for LRU.
    *   `body_end` (INTEGER): End offset of the body.
    *   `bytes_usage` (INTEGER): Total bytes consumed by the entry.
    *   `doomed` (INTEGER): Flag for entries pending deletion (0 for live, 1 for
        doomed).
    *   `check_sum` (INTEGER): The checksum `crc32(head + cache_key_hash)`.
    *   `cache_key_hash` (INTEGER): The hash of `cache_key`.
    *   `cache_key` (TEXT): The full cache key string.
    *   `head` (BLOB): Serialized response headers.

*   **`blobs`**: Stores the data chunks of the cached body.
    *   `blob_id` (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique ID for the blob.
    *   `res_id` (INTEGER): Foreign key to `resources.res_id`.
    *   `start` (INTEGER): Start offset of this blob chunk.
    *   `end` (INTEGER): End offset of this blob chunk.
    *   `check_sum` (INTEGER): The checksum `crc32(blob + cache_key_hash)`.
    *   `blob` (BLOB): The actual data chunk.
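
To make the table layout concrete, here is a minimal sketch that creates both
tables in an in-memory SQLite database and stores one entry whose body is split
into chunks. Only the column names and types come from the list above.
Everything else is an assumption for illustration: the helper names, the way
`bytes_usage` is computed, the absence of `NOT NULL` or `REFERENCES`
constraints, and leaving `check_sum` unset.

```python
import sqlite3
from typing import List

SCHEMA = """
CREATE TABLE resources(
  res_id INTEGER PRIMARY KEY AUTOINCREMENT,
  last_used INTEGER,
  body_end INTEGER,
  bytes_usage INTEGER,
  doomed INTEGER,
  check_sum INTEGER,
  cache_key_hash INTEGER,
  cache_key TEXT,
  head BLOB
);
CREATE TABLE blobs(
  blob_id INTEGER PRIMARY KEY AUTOINCREMENT,
  res_id INTEGER,   -- refers to resources.res_id
  start INTEGER,
  "end" INTEGER,    -- quoted because END is an SQL keyword
  check_sum INTEGER,
  blob BLOB
);
"""


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def write_entry(db: sqlite3.Connection, cache_key: str, key_hash: int,
                head: bytes, chunks: List[bytes], now: int) -> int:
    """Insert one live entry plus its body chunks; returns the new res_id."""
    body_end = sum(len(c) for c in chunks)
    # Assumed accounting: the real bytes_usage formula is not given here.
    bytes_usage = len(cache_key) + len(head) + body_end
    cur = db.execute(
        "INSERT INTO resources(last_used, body_end, bytes_usage, doomed,"
        " cache_key_hash, cache_key, head) VALUES (?, ?, ?, 0, ?, ?, ?)",
        (now, body_end, bytes_usage, key_hash, cache_key, head))
    res_id = cur.lastrowid
    offset = 0
    for chunk in chunks:
        db.execute(
            'INSERT INTO blobs(res_id, start, "end", blob) VALUES (?, ?, ?, ?)',
            (res_id, offset, offset + len(chunk), chunk))
        offset += len(chunk)
    return res_id


def read_body(db: sqlite3.Connection, res_id: int) -> bytes:
    """Reassemble the body by concatenating chunks in offset order."""
    rows = db.execute(
        "SELECT blob FROM blobs WHERE res_id = ? ORDER BY start", (res_id,))
    return b"".join(row[0] for row in rows)
```

Calling `write_entry(db, key, key_hash, head, [b"hello ", b"world"], now)`
followed by `read_body(db, res_id)` gives back the concatenated body bytes.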

### Indexes

*   **`index_resources_cache_key_hash_doomed`**:
    `ON resources(cache_key_hash,doomed)`
    *   Speeds up lookups for live entries (`doomed=0`) by `cache_key_hash`.
        Crucial for `OpenEntry` and similar operations.

*   **`index_live_resources_last_used_bytes_usage`**:
    `ON resources(last_used, bytes_usage) WHERE doomed=0`
    *   A partial index over live entries that covers `last_used` and
        `bytes_usage`. Eviction can scan it in LRU order and read both columns
        from the index alone, without touching the `resources` table rows.

*   **`index_blobs_res_id_start`**: (`UNIQUE`) `ON blobs(res_id, start)`
    *   A unique index on `(res_id, start)` in the `blobs` table. Ensures quick
        retrieval of data blobs for a given entry at a specific offset and
        guards data integrity by rejecting a second blob that starts at the
        same offset of the same entry.
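
The sketch below shows how queries could line up with these three indexes. The
index definitions follow the list above, while the trimmed-down tables, the
helper names, and the exact query text are assumptions for illustration. Note
that the unique index by itself only rejects a second blob with the same
`(res_id, start)`; a chunk that starts at a different offset is accepted even
if its range overlaps an existing one, so avoiding partial overlaps would be up
to the writer.

```python
import sqlite3

DDL = """
CREATE TABLE resources(
  res_id INTEGER PRIMARY KEY AUTOINCREMENT, last_used INTEGER,
  bytes_usage INTEGER, doomed INTEGER, cache_key_hash INTEGER,
  cache_key TEXT);
CREATE TABLE blobs(
  blob_id INTEGER PRIMARY KEY AUTOINCREMENT, res_id INTEGER,
  start INTEGER, "end" INTEGER, blob BLOB);
CREATE INDEX index_resources_cache_key_hash_doomed
  ON resources(cache_key_hash, doomed);
CREATE INDEX index_live_resources_last_used_bytes_usage
  ON resources(last_used, bytes_usage) WHERE doomed=0;
CREATE UNIQUE INDEX index_blobs_res_id_start ON blobs(res_id, start);
"""


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript(DDL)
    return db


def find_live_entry(db, key_hash: int, cache_key: str):
    """Look up a live entry by hash, then confirm the full key."""
    row = db.execute(
        "SELECT res_id FROM resources"
        " WHERE cache_key_hash = ? AND doomed = 0 AND cache_key = ?",
        (key_hash, cache_key)).fetchone()
    return row[0] if row else None


def lru_candidates(db, limit: int):
    """Oldest live entries first; only indexed columns and res_id are read."""
    return db.execute(
        "SELECT res_id, last_used, bytes_usage FROM resources"
        " WHERE doomed = 0 ORDER BY last_used LIMIT ?", (limit,)).fetchall()


def add_blob(db, res_id: int, start: int, data: bytes) -> bool:
    """Returns False when the unique index rejects the chunk."""
    try:
        db.execute(
            'INSERT INTO blobs(res_id, start, "end", blob) VALUES (?, ?, ?, ?)',
            (res_id, start, start + len(data), data))
        return True
    except sqlite3.IntegrityError:
        return False
```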