\name{AutoBlock-global-settings}

\alias{AutoBlock-global-settings}

\alias{getAutoBlockSize}
\alias{setAutoBlockSize}
\alias{get_type_size}
\alias{getAutoBlockLength}
\alias{getAutoBlockShape}
\alias{setAutoBlockShape}

\title{Control the geometry of automatic blocks}

\description{
  A family of utilities to control the automatic block size (or
  length) and shape.
}

\usage{
getAutoBlockSize()
setAutoBlockSize(size=1e8)

getAutoBlockLength(type)

getAutoBlockShape()
setAutoBlockShape(shape=c("hypercube",
                          "scale",
                          "first-dim-grows-first",
                          "last-dim-grows-first"))
}

\arguments{
  \item{size}{
    The \emph{auto block size} (automatic block size) in bytes. Note that,
    except when the type of the array data is \code{"character"} or
    \code{"list"}, the size of a block is its length multiplied by the
    size of an array element. For example, a block of 500 x 1000 x 500
    doubles has a length of 250 million elements and a size of 2 GB (each
    double occupies 8 bytes of memory).

    The \emph{auto block size} is set to 100 MB (1e8 bytes) at package
    startup and can be reset anytime to this value by calling
    \code{setAutoBlockSize()} with no argument.
  }
  \item{type}{
    A string specifying the type of the array data.
  }
  \item{shape}{
    A string specifying the \emph{auto block shape} (automatic block shape).
    See \code{\link{makeCappedVolumeBox}} for a description of the
    supported shapes.

    The \emph{auto block shape} is set to \code{"hypercube"} at
    package startup and can be reset anytime to this value by calling
    \code{setAutoBlockShape()} with no argument.
  }
}

\details{
  \emph{block size} != \emph{block length}

  \emph{block length} = number of array elements in a block
  (i.e. \code{prod(dim(block))}).

  \emph{block size} = \emph{block length} * size of the individual elements
  in memory.

  For example, for an integer array, \emph{block size} (in bytes) is
  going to be 4 x \emph{block length}. For a numeric array \code{x}
  (i.e. \code{type(x) == "double"}), it's going to be 8 x \emph{block length}.
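
  As a rough illustration in plain R (a sketch only: the element sizes
  below are the usual in-memory sizes of R's atomic types, hard-coded here
  as an assumption rather than obtained from the package, and the variable
  names are hypothetical), the size of a dense block of 500 x 1000 x 500
  doubles can be worked out like this:
  \preformatted{
## Assumed element sizes in bytes for some common atomic types.
elt_size <- c(logical=4, integer=4, double=8, complex=16, raw=1)

block_dim <- c(500, 1000, 500)

## Block length: number of array elements in the block.
block_length <- prod(block_dim)

## Block size: block length times the size of one element, in bytes.
block_size <- block_length * elt_size[["double"]]
}
  Here \code{block_length} is 2.5e8 elements and \code{block_size} is 2e9
  bytes.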

  In its current form, block processing in the \pkg{DelayedArray} package
  must decide the geometry of the blocks before starting the walk on the
  blocks. It does this based on several criteria. Two of them are:
  \itemize{
    \item The \emph{auto block size}: maximum size (in bytes) of a block
          once loaded in memory.
    \item The \code{type()} of the array (e.g. \code{integer}, \code{double},
          \code{complex}, etc.)
  }

  The \emph{auto block size} setting and \code{type(x)} control the maximum
  length of the blocks. Other criteria control their shape. For example,
  if you set the \emph{auto block size} to 8 GB (8e9 bytes), this will cap
  the length of the blocks to 2e9 if your DelayedArray object \code{x} is
  of type \code{integer}, and to 1e9 if it's of type \code{double}.
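
  The arithmetic behind this example can be sketched in plain R. The helper
  \code{max_block_length()} is hypothetical and not part of the package; it
  only mirrors the division described above, and rounding down is an
  assumption made for the sketch:
  \preformatted{
## Hypothetical helper: largest number of elements that fit in
## 'auto_block_size' bytes when each element takes 'elt_size' bytes.
max_block_length <- function(auto_block_size, elt_size)
    floor(auto_block_size / elt_size)

len_integer <- max_block_length(8e9, 4)   # 4 bytes per integer
len_double  <- max_block_length(8e9, 8)   # 8 bytes per double
}
  With an 8e9-byte cap, \code{len_integer} is 2e9 and \code{len_double} is
  1e9, matching the figures quoted above.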

  Note that this simple relationship between \emph{block size} and
  \emph{block length} assumes that blocks are loaded in memory as
  ordinary (a.k.a. dense) matrices or arrays. With sparse blocks,
  all bets are off. However, the maximum block length is always taken to
  be the \emph{auto block size} divided by \code{get_type_size(type(x))},
  whether the blocks are going to be loaded as dense or sparse arrays.
  If they are loaded as sparse arrays, their memory footprint is very
  likely to be smaller than if they were loaded as dense arrays, so this
  is safe (although probably not optimal).

  It's important to keep in mind that the \emph{auto block size} setting
  is a simple way for the user to put a cap on the memory footprint of
  the blocks. Nothing more. In particular, it doesn't control the maximum
  amount of memory used by the block processing algorithm. Other factors
  can dramatically impact memory usage, such as parallelization (where more
  than one block is loaded in memory at any given time), what the algorithm
  is doing with the blocks (e.g. something like
  \code{blockApply(x, identity)} will actually load the entire array data
  in memory), what delayed operations are on \code{x}, etc. It would be
  awesome to have a way to control the maximum amount of memory used by a
  block processing algorithm as a whole, but we don't know how to do that.
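
  As a back-of-the-envelope illustration only, the following plain R
  sketch estimates how much memory the blocks alone could occupy when
  several workers each hold one dense block at the same time. The variable
  names and the number of workers are hypothetical, and the estimate
  ignores everything else the algorithm allocates (results, temporary
  copies, delayed operations being realized, etc.):
  \preformatted{
auto_block_size <- 1e8   # the default auto block size, in bytes
nworkers <- 4            # assumed number of parallel workers

## Memory taken by the blocks themselves, one full-size block per worker.
blocks_footprint <- nworkers * auto_block_size
}
  In this scenario \code{blocks_footprint} is 4e8 bytes, i.e. four times
  the cap that applies to a single block.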
}

\value{
  \code{getAutoBlockSize}: The current \emph{auto block size} in bytes
  as a single numeric value.

  \code{setAutoBlockSize}: The new \emph{auto block size} in bytes as an
  invisible single numeric value.

  \code{getAutoBlockLength}: The \emph{auto block length} as a single
  integer value.

  \code{getAutoBlockShape}: The current \emph{auto block shape} as a
  single string.

  \code{setAutoBlockShape}: The new \emph{auto block shape} as an invisible
  single string.
}

\seealso{
  \itemize{
    \item \code{\link{defaultAutoGrid}} and family to create automatic
          grids to use for block processing of array-like objects.

    \item \code{\link{blockApply}} and family for convenient block
          processing of an array-like object.

    \item The \code{\link{makeCappedVolumeBox}} utility to make
          \emph{capped volume boxes}.
  }
}

\examples{
getAutoBlockSize()

getAutoBlockLength("double")
getAutoBlockLength("integer")
getAutoBlockLength("logical")
getAutoBlockLength("raw")

m <- matrix(runif(600), ncol=12)
setAutoBlockSize(140)
getAutoBlockLength(type(m))
defaultAutoGrid(m)
lengths(defaultAutoGrid(m))
dims(defaultAutoGrid(m))

getAutoBlockShape()
setAutoBlockShape("scale")
defaultAutoGrid(m)
lengths(defaultAutoGrid(m))
dims(defaultAutoGrid(m))
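
## Sanity check (a sketch, not an official test): the blocks of the
## automatic grid should not hold more elements than the current
## auto block length for the type of 'm'. 'grid' is just a local name.
grid <- defaultAutoGrid(m)
max(lengths(grid)) <= getAutoBlockLength(type(m))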

## Reset the auto block size and shape to factory settings:
setAutoBlockSize()
setAutoBlockShape()
}
\keyword{utilities}