File: ReservedFileSpace.html

package info (click to toggle)
hdf5 1.8.13%2Bdocs-15
  • links: PTS, VCS
  • area: main
  • in suites: jessie-kfreebsd
  • size: 171,520 kB
  • sloc: ansic: 387,158; f90: 35,195; sh: 20,035; xml: 17,780; cpp: 13,516; makefile: 1,487; perl: 1,299; yacc: 327; lex: 178; ruby: 37
file content (93 lines) | stat: -rw-r--r-- 5,279 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
<html>
   <head>
      <title>Reserved File Address Space</title>
   </head>


<!--
  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  * Copyright by The HDF Group.                                               *
  * Copyright by the Board of Trustees of the University of Illinois.         *
  * All rights reserved.                                                      *
  *                                                                           *
  * This file is part of HDF5.  The full HDF5 copyright notice, including     *
  * terms governing use, modification, and redistribution, is contained in    *
  * the files COPYING and Copyright.html.  COPYING can be found at the root   *
  * of the source code distribution tree; Copyright.html can be found at the  *
  * root level of an installed copy of the electronic HDF5 document set and   *
  * is linked from the top-level documents page.  It can also be found at     *
  * http://hdfgroup.org/HDF5/doc/Copyright.html.  If you do not have          *
  * access to either file, you may request a copy from help@hdfgroup.org.     *
  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 -->



   <body>
      <hl>Reserved File Address Space</hl>

      <p><b>Note: this tech note is no longer accurate for HDF5 1.8.  While users can 
      still adjust the size of HDF5 addresses, HDF5 no longer uses lazy allocations 
      due to the consequences of doing so in parallel when using the new metadata 
      cache. -JML 2006-09-28</b></p>
 
      <p>HDF5 files use 8-byte addresses by default, but users can change this to 2, 
      4, or even 16 bytes.  This means that it is possible to have files that only 
      address 64 KB of space, and thus that HDF must handle the case of files that 
      have enough space on disk but not enough internal address space to be written.</p>

      <p>Thus, every time space is allocated in a file, HDF needs to check that this 
      allocation is within the files address space.  If not, HDF should output an 
      error and ensure that all the data currently in the file (everything that is 
      still addressable) is successfully written to disk.</p>

      <p>Unfortunately, some structures are stored in memory and do not allocate space 
      for themselves until the file is actually flushed to disk (object headers and the 
      local heap).  This is good for efficiency, since these structures can grow without 
      creating the fragmentation that would result from frequent allocation and deallocation, 
      but means that if the library runs out of addressable space while allocating memory, 
      these structures will not be present in the file.  Without them, HDF5 does not know 
      how to parse the data in the file, rendering it unreadable.</p>

      <p>Thus, HDF keeps track of the space reserved for allocation in the file 
      (H5FD_t struct).  When a function tries to allocate space in the file, it first 
      checks that the allocation would not overflow the address space, taking the reserved 
      space into account.  When object headers or the heap finally allocate the space they 
      have reserved, they free the reserved space before allocating file space.</p>

      <p>A given object header is only flushed to disk once, but the heap can be flushed 
      to disk multiple times over the life of the file and will require contiguous space 
      every time.  To handle this, the heap keeps track of how much space it has reserved.  
      This allows it to reserve space only when it grows (when it is dirty and needs to be 
      re-written to disk).</p>

      <p>For instance, if the heap is flushed to disk, it frees its reserved space.  If new 
      data is inserted into the heap in memory, the heap may need to flush to disk again in 
      a new, larger section of memory.  Thus, not only does it reserve space in the file 
      for this new data, but also for all of the previously-existing data in the heap to 
      be re-written.  The next insert, however, will only need to reserve space for its new 
      data, since the rest of the heap already has space reserved for it.</p>

      <p>Potential issues:
      <ol>
         <li>This system does not take into accout deleted space.  Deleted space can still 
         be allocated as usual, but "reserved" space is always taken off the end of a file.  
         This means that a file that was filled to capacity but then had a significant number 
         of objects deleted will still throw errors if more data is written.  This occurs 
         because the file's free space is in the middle of the file, not at the end.  A 
         more complete space-reservation system would test if the reserved data can fit 
         into the file's free list, but this would be significantly more complicated to 
         implement.</li>

         <li>HDF5 currently claims to support 16-byte addresses, but a number of platforms 
         do not support 16-byte integers, so addresses of this size cannot be represented 
         in memory.  This solution does not attempt to address this issue.</li>
      </ol>
      </p>

<hr>
<address>
Last modifoed: 28 September 2006
</address>
   </body>
</html>