File: design.txt

package info (click to toggle)
fstransform 0.9.4-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, bullseye, forky, sid, trixie
  • size: 1,888 kB
  • sloc: cpp: 11,917; sh: 2,225; xml: 125; makefile: 95
file content (394 lines) | stat: -rw-r--r-- 16,953 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
fstransform idea comes from now-abandoned program convertfs (http://tzukanov.narod.ru/convertfs)
and from the author's desire to return to C/C++ programming after a long hiatus.

fstransform is designed and implemented from scratch with safety in mind:
every reasonable precaution is taken in order not to lose/corrupt data.

Anyway, the operations performed by fstransform, and especially the
goal NOT to need or use a backup, are intrinsically VERY RISKY.

For this reason, no matter how well (or badly) designed fstransform
actually is, there are chances that by using it you will LOSE YOUR DATA.

Please understand that the risk is only yours, and that you take full
responsibility for using fstransform. In other words, the following standard
disclaimer applies:

THE AUTHOR DECLINES ALL RESPONSIBILITIES FOR ANY DAMAGE THAT MAY DERIVE
FROM USING THE PROGRAMS AND PROCEDURES DESCRIBED IN THIS DOCUMENT

Having said that, let's move to the requirements that led to fstransform:

############################### Requirements ##################################

1) be able function with very limited free disk space

2) be able to remap in-place a block device from any filesystem to any other,
   preserving file, directories, soft links, hard links, special devices,
   timestamps and owner/group permissions if the following prerequisites
   are satisfied:

   a) source filesystem supports sparse files (holes) and ioctl(FIEMAP or FIBMAP)

   b) both source and target filesystems are POSIX-like,
      i.e. they support links, special devices, timestamps and UNIX-like permissions

3) be reasonably fast

   operations on disk must be O(N), i.e. linear with device size,
   and perform large contiguous reads and writes as much as possible.

#################### Requirements not yet implemented ########################

4) be SAFE.

   If relevant hardware and drivers are working correctly,
   data must NEVER be lost or corrupted, not even in case of program crash
   or sudden power loss.

   Additionally, relevant hardware and drivers must be checked for known defects
   at program start, I/O problems must be detected as soon as possible,
   both using S.M.A.R.T. and by detecting I/O errors,
   and every reasonable effort must be made not to lose or corrupt data
   even in case of defective/failing disks or other relevant hardware
   and their drivers.

5) store work progress status on disk,
   and be able to resume work or completely undo it when program is started again.


################################# Reminders ###################################

* instantiate template classes only T = ft_uint and T = ft_uoff

* components must be emulable, especially OS I/O

* fstatfs(loop_fd) to stat device file-system, including used space (=S) and total inodes (=I)

* create loop file-system with number of inodes >= I

* Immediately check on created loop file-system that available space >= S

* move files inside loop file-system, monitoring available space on BOTH
  original device and loop file-system

* at fsremap start, check that device is mounted read-only

############################## fsremap algorithm #################################


================== Example initial situation =================

device
+-----------------------------------------------------------------------------+
|01|02|  |04|  |  |07|08|  |  |  |  |  |  |15|  |  |  |19|20|21|  |  |  |  |26|
+-----------------------------------------------------------------------------+
loop file inside device
+-----------------------------------------------------------------------------+
|  |  |  |  |21|26|  |  |19|17|10|15|08|11|  |07|09|  |  |  |  |02|01|  |06|  |
+-----------------------------------------------------------------------------+


============== Example final situation (target) ==============

loop-file (physical blocks)
+-----------------------------------------------------------------------------+
|01|02|  |  |  |06|07|08|09|10|11|  |  |  |15|  |17|  |19|  |21|  |  |  |  |26|
+-----------------------------------------------------------------------------+
device backup inside loop-file (physical blocks) - somewhere in these blocks
+-----------------------------------------------------------------------------+
|  |  |**|**|**|  |  |  |  |  |  |**|**|**|  |**|  |**|  |**|  |**|**|**|**|  |
+-----------------------------------------------------------------------------+


================ Algorithm simulation ================

0) compute LOOP-FILE extents, DEVICE extents and FREE-SPACE extents

1) find LOOP-FILE (logical) holes, i.e. LOOP-HOLES
+-----------------------------------------------------------------------------+
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
+------03-04-05-------------------12-13-14----16----18----20----22-23-24-25---+

2) re-number used DEVICE blocks, setting ->logical to values
   from loop-file holes (LOOP-HOLES). do not greedily use low hole numbers:
   a) prefer holes with ->logical numbers equal to DEVICE ->physical block number:
      they produce an INVARIANT block, already in its final destination
      (marked with @@)
   b) spread the remaining ->logical across rest of holes (use best-fit allocation)

device renumbered blocks - original numbers are above
+01-02----04-------07-08-------------------15----------19-20-21-------------26+
|03|05|  |04|  |  |12|13|  |  |  |  |  |  |16|  |  |  |18|20|23|  |  |  |  |25|
+---------@@----------------------------------------------@@------------------+

2.1) mark as INVARIANT (with @@) the LOOP-FILE (logical) blocks already in their
   final destination and forget them (no work on those).
   also compute total length of blocks remaining in LOOP-FILE.

   in this case, none
+01-02----04-------07-08-------------------15----------19-20-21-------------26+
|03|05|  |04|  |  |12|13|  |  |  |  |  |  |16|  |  |  |18|20|23|  |  |  |  |25|
+---------@@----------------------------------------------@@------------------+

3) merge renumbered DEVICE blocks with remaining LOOP-FILE blocks (remember who's who)

merged blocks
+01-02----04-------07-08-------------------15----------19-20-21-------------26+
|03|05|  |04|21|26|12|13|19|17|10|15|08|11|16|07|09|  |18|20|23|02|01|  |06|25|
+---------@@----------------------------------------------@@------------------+

4) compute (physical) intersection of FREE-SPACE and LOOP-HOLES,
   and mark it as INVARIANT FREE-SPACE (with !!).
   we can use these blocks as partial or total replacement for STORAGE - see 5)
   if there are sections with enough consecutive blocks (>= blocks to relocate / 1024)

   in this case, only 24
+01-02----04-------07-08-------------------15----------19-20-21-------------26+
|03|05|  |04|21|26|12|13|19|17|10|15|08|11|16|07|09|  |18|20|23|02|01|  |06|25|
+---------@@----------------------------------------------@@----------!!------+


5) sort

   Favor in-order disk reads, even if they cause out-of-order disk writes
   Reason: exploit read-ahead and write-back to speed up execution

   Reads and writes will happen in moderately large in-order chunks, such that:
   a) chunk size must fit into RAM
   b) storage must not overflow

   storage is the contatenation of primary-storage and secondary-storage.
   primary-storage is the intersection of FREE-SPACE and LOOP-HOLES,
   aligned to system PAGE_SIZE.
   secondary storage is the file ${dir}/.fsremap/job.$$.storage

+==============+
|  |  |  |  |  |  storage (SMALL!)
+==============+

5.0) start with position P=1

5.1) if storage has N free blocks (in this case N=5),
     lookup the contents of the consecutive positions P..P+N-1
     on disk (in this case 01..05).

5.2) note the blocks in such positions (in this case 03|05|  |04|21).
     ignore both free blocks and blocks already in their final destination:
     in case we find either kind, go back to 5.1 and increase N accordingly
     (in some cases we could loop between 5.1 and 5.2 many times)

     (in this case, one block is free and another - 04 - is in its final
     destination so we set N=7, extend the lookup to 01..07 and find
     03|05|  |@@|21|26|12 - again, remember to ignore 04, it's marked @@)

5.3) sort the found blocks by their (logical) number (in this case 03 05 12 21 26)

5.4) for each block, check the contents of its final destination
     (in this case, pos[03]=free, pos[05]=21, pos[12]=15, pos[21]=23, pos[26]=25).

     if a final destination is free, move the block there and mark
     its old position as free, otherwise put the block in a "move queue"

5.5) for each non-free final destination, examine the "lookup" block it contains
     and check if such block final destination is free or not.
     in case it is free, directly move there the "lookup" block,
     else move the "lookup" block into storage.

     also remember to update the "move queue", as some "lookup" blocks
     may also belong to it

     (in this case, final destinations are 21 15 23 25.
     none of them is free, so blocks 03 05 12 21 26 are moved to storage)
+======21=26===+
|21|15|23|25|  |  storage
+==============+

5.6) mark as free the blocks moved away from final destinations checked at 5.4)

+01-02----04-------07-08-------------------15----------19-20------------------+
|03|05|  |04|  |26|12|13|19|17|10|  |08|11|16|07|09|  |18|20|  |02|01|  |06|  |
+---------@@----------------------------!!----------------@@----!!----!!------+

5.7) move blocks in the "move queue" to their final destinations,
     if they were not moved already to their final destination by step 5.5
     (note: step 5.5 could have moved them to storage)

     remember to read and write them in disk order, and to mark their initial
     position as free AFTER they have been written in their final destination,
     and recorded as such.
     (in this case: read 03|05  21|26|12, write 03 05 12 21 26)
     and to mark them with @@
+------01-04-02-------08----------07-------15----------19-20------------------+
|  |  |03|04|05|  |  |13|19|17|10|12|08|11|16|07|09|  |18|20|21|02|01|  |06|26|
+------@@-@@-@@-------------------@@----!!----------------@@-@@-!!----!!----@@+

5.8) in storage, mark as free any block that was just written (to its
     final destination) during previous step (in this case, 21).
+======21=26===+
|  |15|23|25|  |  storage
+==============+

5.9) scan storage for blocks that can be directly written to their
     final destination (i.e. if it's free) and move them
     (in this case none)

5.10) if storage is at least 50% full then perform extra pass 6,
     else set P = P+N and restart from 5.1 if new P is less than device size


6) extra pass if storage is at least 50% full

6.1) find at least N/2 and up to N blocks that can be directly moved
     to their final destination
     note: since storage has at least N/2 blocks, there are at least N/2
     free final destinations, and thus at least N/2 blocks that could be moved
     to such destinations.

     how to find: get the position number of free blocks not marked '!!'
     note: the blocks could be in storage if we arrived here from step 6.2)
     (in this case 01 02 06 07 18)

     read them in disk order (in this case 07 18 02 01 06)
     and write them into their final destination in disk order (in this case
     01 02 07 07 18)
+------01-04-02-------08----------07-------15-------19----20------------------+
|01|02|03|04|05|06|07|13|19|17|10|12|08|11|16|  |09|18|  |20|21|  |  |  |  |26|
+@@-@@-@@-@@-@@-@@-@@-------------@@----!!----------@@----@@-@@-!!----!!----@@+

6.2) repeat 6.1) until storage is 25% full or less,
     or until work is finished,
     or until less than N/2 blocks can be moved (whatever comes first)
     (in this case move 16 19 23 25)
+------01-04-02-------08----------07----------15----19----20-------21----26---+
|01|02|03|04|05|06|07|13|  |17|10|12|08|11|  |16|09|18|19|20|21|  |23|  |25|26|
+@@-@@-@@-@@-@@-@@-@@-------------@@----!!----@@----@@-@@-@@-@@-!!-@@-!!-@@-@@+
+==============+
|  |15|  |  |  |  storage
+==============+

6.3) OBSOLETE - free positions marked as '!!' are primary-storage

     move as many blocks as possible from storage to free positions
     marked '!!'
     (in this case 15)
+------01-04-02-------08----------07----------15----19----20-------21----26---+
|01|02|03|04|05|06|07|13|  |17|10|12|08|11|  |16|09|18|19|20|21|15|23|  |25|26|
+@@-@@-@@-@@-@@-@@-@@-------------@@----!!----@@----@@-@@-@@-@@-!!-@@-!!-@@-@@+
+==============+
|  |  |  |  |  |  storage
+==============+

6.4) set P = P + N and restart from 5.1 if new P is less than device size
     (in this case N = 7 and new P = 8)

[continued]

[5.1] N=5, lookup positions 08..12

[5.2] pos[09]=free, pos[12]=@@, so increase N=7 and try again

[5.1] N=7, lookup positions 08..14, found blocks 13 17 10 08 11 (ignore 12,
      is marked '@@')

[5.2] 5 positions are used (= storage size), continue with [5.3]

[5.3] order blocks by their number, 08 10 11 13 17

[5.4] for each block, check the contents of its final destination
     pos[08]=13, pos[10]=17, pos[11]=10, pos[13]=08, pos[17]=09

     read blocks in final destinations, 13 17 10 08 09

[5.5] check the contents of such blocks (13 17 10 08 09) final destinations.
     for each such block, in case its own final destination is free move it there,
     else move it into storage
     only 09 can be moved directly (17 requires 09 to be moved first,
     13 requires 08 to be moved first),
     so put 13 17 10 08 to storage.

+08============+
|13|17|10|08|  |  storage
+==============+

[5.6] mark the blocks moved away from final destinations as free

+------01-04-02-------------------07----------15----19----20-------21----26---+
|01|02|03|04|05|06|07|  |09|  |  |12|  |11|  |16|  |18|19|20|21|15|23|  |25|26|
+@@-@@-@@-@@-@@-@@-@@----@@-------@@----!!----@@----@@-@@-@@-@@-!!-@@-!!-@@-@@+

[5.7] move the blocks selected at step [5.2] into their final destinations,
     if they were not moved already in their final destination by step 5.5
     (note: step 5.5 could have moved them to storage)
     read/write in disk order, update marks.

     read 13 17 10 08 11 (11 from main device, others from storage)
     write 08 10 11 13 17 to their final destinations

+------01-04-02-------------------07-08-------15----19----20-------21----26---+
|01|02|03|04|05|06|07|08|09|10|11|12|13|  |  |16|17|18|19|20|21|15|23|  |25|26|
+@@-@@-@@-@@-@@-@@-@@-@@-@@-@@-@@-@@-@@-!!----@@-@@-@@-@@-@@-@@-!!-@@-!!-@@-@@+

[5.8] in storage, mark as free any block that was just written (to its
     final destination) during previous step (in this case, all).
+==============+
|  |  |  |  |  |  storage
+==============+

[5.9] scan storage for blocks that can be directly written to their
     final destination (i.e. if it's free) and move them
     (in this case none)

[5.10] if storage is more than 50% full then perform extra pass 6,
     else set P = P+N and restart from 5.1 if new P is less than device size
     (in this case storage is empty, so since N=7, set new P=15 and
     restart from 5.1)

[5.1] N=5, lookup positions 15..19

[5.2] pos[15]=free, all other = @@, so increase N=10 and try again

[5.1] N=10, lookup positions 15..24

[5.2] ignoring free and @@ positions, we find pos[22] = 15
      so increase N=14 and try again

[5.1] N=14, lookup positions 15..28 -> truncate to device length (26)
      so N=12, lookup 15..26

[5.2] ignoring free and @@ positions, we still find only pos[22] = 15
      and cannot increase N more, we are at end of device

[5.3] order found blocks by their number, 15

[5.4] for each block, check the contents of its final destination
     pos[15]=free

     read blocks in final destinations, none

[5.5] check the contents of such blocks (none) final destinations
     and for each such block -> empty loop

[5.6] mark the blocks moved away from final destinations (none) as free

[5.7] move the blocks selected at step [5.2] into their final destinations,
     if they were not moved already in their final destination by step 5.5
     (note: step 5.5 could have moved them to storage)
     read/write in disk order, update marks.

     we move block 15 to its destination
+------01-04-02-------------------07-08-------15----19----20-------21----26---+
|01|02|03|04|05|06|07|08|09|10|11|12|13|  |15|16|17|18|19|20|21|  |23|  |25|26|
+@@-@@-@@-@@-@@-@@-@@-@@-@@-@@-@@-@@-@@-!!-@@-@@-@@-@@-@@-@@-@@-!!-@@-!!-@@-@@+

[5.8] in storage, mark as free any block that was just written (to its
     final destination) during previous step (in this case, none).

[5.9] scan storage for blocks that can be directly written to their
     final destination (i.e. if it's free) and move them
     (in this case none)

[5.10] if storage is more than 50% full then perform extra pass 6,
     else set P = P+N and restart from 5.1 if new P is less than device size
     (in this case storage is empty, so since N=12, set new P=26,
     is not less than device size -> we finished)