/* $Id: gdal_datamodel.dox 25494 2013-01-13 12:55:17Z etourigny $ */

/*!
\page gdal_datamodel GDAL Data Model

This document describes the GDAL data model: the types of information
that a GDAL data store can contain, and their semantics.<p>

\section gdal_datamodel_dataset Dataset

A dataset (represented by the GDALDataset class) is an assembly of 
related raster bands and some information common to them all.  In particular
the dataset has a concept of the raster size (in pixels and lines) that
applies to all the bands.  The dataset is also responsible for the 
georeferencing transform and coordinate system definition of all bands.  The
dataset itself can also have associated metadata, a list of name/value
pairs in string form. 

Note that the GDAL dataset and raster band data model is loosely 
based on the OpenGIS Grid Coverages specification. 

\subsection gdal_datamodel_dataset_cs Coordinate System

Dataset coordinate systems are represented as OpenGIS Well Known Text
strings.  Such a string can contain:

<ul>
<li> An overall coordinate system name. 
<li> A geographic coordinate system name. 
<li> A datum identifier. 
<li> An ellipsoid name, semi-major axis, and inverse flattening. 
<li> A prime meridian name and offset from Greenwich. 
<li> A projection method type (e.g. Transverse Mercator). 
<li> A list of projection parameters (e.g. central_meridian). 
<li> A units name, and conversion factor to meters or radians. 
<li> Names and ordering for the axes. 
<li> Codes for most of the above in terms of predefined coordinate systems
from authorities such as EPSG. 
</ul>

For more information on OpenGIS WKT coordinate system definitions, and 
mechanisms to manipulate them, refer to the <a href="ogr/osr_tutorial.html">
osr_tutorial</a> document and/or the OGRSpatialReference class documentation.

The coordinate system returned by GDALDataset::GetProjectionRef() 
describes the georeferenced coordinates implied by the affine georeferencing
transform returned by GDALDataset::GetGeoTransform().  The coordinate
system returned by GDALDataset::GetGCPProjection() describes the 
georeferenced coordinates of the GCPs returned by GDALDataset::GetGCPs().

Note that a returned coordinate system string of "" indicates that nothing
is known about the georeferencing coordinate system.  

\subsection gdal_datamodel_dataset_gtm Affine GeoTransform

GDAL datasets have two ways of describing the relationship between
raster positions (in pixel/line coordinates) and georeferenced coordinates.
The first, and most commonly used, is the affine transform (the other is
GCPs).  

The affine transform consists of six coefficients returned by 
GDALDataset::GetGeoTransform() which map pixel/line coordinates into 
georeferenced space using the following relationship:

<pre>
    Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2)
    Ygeo = GT(3) + Xpixel*GT(4) + Yline*GT(5)
</pre>

For north-up images, the GT(2) and GT(4) coefficients are zero, GT(1) is the
pixel width, and GT(5) is the pixel height (normally negative, since lines
are counted downward from the top).  The (GT(0),GT(3)) position is the top
left corner of the top left pixel of the raster.

Note that the pixel/line coordinates in the above are from (0.0,0.0) at the
top left corner of the top left pixel to (width_in_pixels,height_in_pixels)
at the bottom right corner of the bottom right pixel.  The pixel/line location
of the center of the top left pixel would therefore be (0.5,0.5). 
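
The relationship above can be sketched in a few lines of standalone C++.  This
is only an illustration of the arithmetic, not GDAL code: the GeoTransform
alias and the ApplyGeoTransform(), PixelCenter() and InvertGeoTransform()
helpers are hypothetical names introduced here.

```cpp
#include <array>
#include <cassert>
#include <utility>

// The six coefficients GT(0)..GT(5), in the order used above.
using GeoTransform = std::array<double, 6>;

// Map a pixel/line position to georeferenced (Xgeo, Ygeo).
std::pair<double, double> ApplyGeoTransform(const GeoTransform &gt,
                                            double pixel, double line)
{
    const double xGeo = gt[0] + pixel * gt[1] + line * gt[2];
    const double yGeo = gt[3] + pixel * gt[4] + line * gt[5];
    return std::make_pair(xGeo, yGeo);
}

// Georeferenced position of the center of the pixel in column col, row row.
std::pair<double, double> PixelCenter(const GeoTransform &gt, int col, int row)
{
    assert(col >= 0 && row >= 0);
    return ApplyGeoTransform(gt, col + 0.5, row + 0.5);
}

// Build the inverse mapping (georeferenced to pixel/line).
// Returns false when the transform is degenerate and cannot be inverted.
bool InvertGeoTransform(const GeoTransform &gt, GeoTransform &inv)
{
    const double det = gt[1] * gt[5] - gt[2] * gt[4];
    if (det == 0.0)
        return false;
    inv[1] = gt[5] / det;
    inv[2] = -gt[2] / det;
    inv[4] = -gt[4] / det;
    inv[5] = gt[1] / det;
    inv[0] = (gt[2] * gt[3] - gt[0] * gt[5]) / det;
    inv[3] = (gt[0] * gt[4] - gt[1] * gt[3]) / det;
    return true;
}
```

Because the inverse is itself an affine transform, the same
ApplyGeoTransform() helper can be reused with it to go from georeferenced
coordinates back to pixel/line coordinates.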

\subsection gdal_datamodel_dataset_gcp GCPs

A dataset can have a set of control points relating one or more positions
on the raster to georeferenced coordinates.  All GCPs share a georeferencing
coordinate system (returned by GDALDataset::GetGCPProjection()).  Each GCP
(represented as the GDAL_GCP structure) contains the following:

<pre>
typedef struct
{
    char	*pszId; 
    char	*pszInfo;
    double 	dfGCPPixel;
    double	dfGCPLine;
    double	dfGCPX;
    double	dfGCPY;
    double	dfGCPZ;
} GDAL_GCP;
</pre>

The pszId string is intended to be a unique (and often, but not always,
numerical) identifier for the GCP within the set of GCPs on this dataset.
The pszInfo string is usually empty, but can contain any user defined
text associated with the GCP.  Potentially this can also contain machine 
parsable information on GCP status, though that isn't done at this time.

The (Pixel,Line) position is the GCP location on the raster.  The (X,Y,Z)
position is the associated georeferenced location with the Z often being
zero. 

The GDAL data model does not prescribe the transformation that is
generated from the GCPs; this is left to the application.  However,
1st to 5th order polynomials are common. 

Normally a dataset will contain either an affine geotransform, GCPs or
neither.  It is uncommon to have both, and it is undefined which is 
authoritative.
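
As an illustration of the simplest case mentioned above, a 1st order
polynomial can be determined exactly from three GCPs.  The sketch below is
standalone C++ and not part of GDAL: the Gcp structure is a hypothetical,
trimmed-down stand-in for GDAL_GCP, and FirstOrderFromGcps() is a name
invented here.  Real applications usually have more than three GCPs and would
use a least squares fit instead.

```cpp
#include <array>
#include <cassert>

// Hypothetical, trimmed-down stand-in for GDAL_GCP (no id, info or Z).
struct Gcp
{
    double pixel, line; // position on the raster
    double x, y;        // georeferenced position
};

// Determinant of the 3x3 matrix [a b c; d e f; g h i].
static double Det3(double a, double b, double c,
                   double d, double e, double f,
                   double g, double h, double i)
{
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
}

// Solve v = c0 + c1*pixel + c2*line through three points with Cramer's rule.
// det is the determinant of the system and must be non-zero.
static void SolvePlane(const std::array<Gcp, 3> &g, double v0, double v1,
                       double v2, double det, double *out)
{
    assert(det != 0.0);
    out[0] = Det3(v0, g[0].pixel, g[0].line,
                  v1, g[1].pixel, g[1].line,
                  v2, g[2].pixel, g[2].line) / det;
    out[1] = Det3(1.0, v0, g[0].line,
                  1.0, v1, g[1].line,
                  1.0, v2, g[2].line) / det;
    out[2] = Det3(1.0, g[0].pixel, v0,
                  1.0, g[1].pixel, v1,
                  1.0, g[2].pixel, v2) / det;
}

// Derive six coefficients, ordered like GT(0)..GT(5), from exactly three GCPs.
// Returns false when the three raster positions are collinear.
bool FirstOrderFromGcps(const std::array<Gcp, 3> &g, std::array<double, 6> &gt)
{
    const double det = Det3(1.0, g[0].pixel, g[0].line,
                            1.0, g[1].pixel, g[1].line,
                            1.0, g[2].pixel, g[2].line);
    if (det == 0.0)
        return false;
    SolvePlane(g, g[0].x, g[1].x, g[2].x, det, &gt[0]);
    SolvePlane(g, g[0].y, g[1].y, g[2].y, det, &gt[3]);
    return true;
}
```

The result has the same layout as the affine geotransform, so a 1st order GCP
fit and a geotransform can be applied with the same arithmetic.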

\subsection gdal_datamodel_dataset_metadata Metadata

GDAL metadata is auxiliary, format- and application-specific textual data
kept as a list of name/value pairs.  The names are required to be well 
behaved tokens (no spaces or odd characters).  The values can be of
any length, and contain anything except an embedded null (ASCII zero).  

The metadata handling system is not well tuned to handling very large bodies
of metadata.  Handling of more than 100K of metadata for a dataset is likely
to lead to performance degradation. 

Some formats will support generic (user defined) metadata, while other
format drivers will map specific format fields to metadata names.  For
instance the TIFF driver returns a few information tags as metadata
including the date/time field which is returned as:

<pre>
TIFFTAG_DATETIME=1999:05:11 11:29:56
</pre>

Metadata is split into named groups called domains, with the default
domain having no name (NULL or "").  Some specific domains exist for 
special purposes.  Note that currently there is no way to enumerate all
the domains available for a given object, but applications can "test" for
any domains they know how to interpret.
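
One way to picture this structure is as a map from domain name to a list of
"NAME=VALUE" strings.  The following standalone C++ sketch is an assumption
made for illustration, not the way GDAL stores metadata internally; the
MetadataStore alias and the FetchValue() helper are hypothetical, the default
domain is modelled as "", and names are matched exactly.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using StringList = std::vector<std::string>;             // "NAME=VALUE" items
using MetadataStore = std::map<std::string, StringList>; // keyed by domain

// Look up name in the given domain ("" is the default domain).
// Returns false if the domain or the item does not exist.
bool FetchValue(const MetadataStore &store, const std::string &domain,
                const std::string &name, std::string &value)
{
    assert(!name.empty());
    const MetadataStore::const_iterator it = store.find(domain);
    if (it == store.end())
        return false;
    const std::string prefix = name + "=";
    for (const std::string &item : it->second)
    {
        if (item.compare(0, prefix.size(), prefix) == 0)
        {
            value = item.substr(prefix.size());
            return true;
        }
    }
    return false;
}
```

Probing a domain the application knows about then amounts to a lookup that
may simply fail, which matches the "test" approach described above.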

The following metadata items have well defined semantics in the default
domain:

<ul>
<li> AREA_OR_POINT: May be either "Area" (the default) or "Point".  Indicates 
whether a pixel value should be assumed to represent a sampling over the 
region of the pixel or a point sample at the center of the pixel.  This is not 
intended to influence interpretation of georeferencing which remains area 
oriented.
<li> NODATA_VALUES: The value is a list of space separated pixel values matching
the number of bands in the dataset that can be collectively used to identify
pixels that are nodata in the dataset.  With this style of nodata a pixel
is considered nodata in all bands if and only if all bands match the 
corresponding value in the NODATA_VALUES tuple.  This metadata is not widely
honoured by GDAL drivers, algorithms or utilities at this time. 
<li> MATRIX_REPRESENTATION: This value, used for Polarimetric SAR datasets, contains the matrix representation that this data is provided in. The following are acceptable values:
    <ul>
    <li>SCATTERING
    <li>SYMMETRIZED_SCATTERING
    <li>COVARIANCE
    <li>SYMMETRIZED_COVARIANCE
    <li>COHERENCY
    <li>SYMMETRIZED_COHERENCY
    <li>KENNAUGH
    <li>SYMMETRIZED_KENNAUGH
    </ul>
<li> POLARIMETRIC_INTERP: This metadata item is defined for Raster Bands for polarimetric SAR data. It indicates which entry in the specified matrix representation of the data this band represents. For a dataset provided as a scattering matrix, for example, acceptable values for this metadata item are HH, HV, VH, VV. When the dataset is a covariance matrix, for example, this metadata item will be one of Covariance_11, Covariance_22, Covariance_33, Covariance_12, Covariance_13, Covariance_23 (since the matrix is Hermitian, these entries are all that is required to describe it).
</ul>
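
The NODATA_VALUES rule described in the list above (a pixel is nodata only
when every band matches its entry in the tuple) can be sketched as follows.
This is standalone C++ for illustration only; ParseNoDataValues() and
IsNoDataPixel() are hypothetical helper names, not GDAL functions.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split the space separated NODATA_VALUES string into one value per band.
std::vector<double> ParseNoDataValues(const std::string &text)
{
    std::vector<double> values;
    std::istringstream in(text);
    double v;
    while (in >> v)
        values.push_back(v);
    return values;
}

// pixel holds the value of one pixel in each band, in band order.
// The pixel is nodata if and only if all bands match the tuple.
bool IsNoDataPixel(const std::vector<double> &pixel,
                   const std::vector<double> &nodata)
{
    assert(pixel.size() == nodata.size());
    for (std::size_t i = 0; i < pixel.size(); ++i)
    {
        if (pixel[i] != nodata[i])
            return false;
    }
    return true;
}
```

With a tuple of "0 0 255", a pixel of (0, 0, 255) is nodata while (0, 10, 255)
is valid data in all three bands.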

\subsubsection gdal_datamodel_subdatasets SUBDATASETS Domain

The SUBDATASETS domain holds a list of child datasets.  Normally this is
used to provide pointers to a list of images stored within a single multi
image file. 

For example, an NITF with two images might have the following subdataset list.

<pre>
  SUBDATASET_1_NAME=NITF_IM:0:multi_1b.ntf
  SUBDATASET_1_DESC=Image 1 of multi_1b.ntf
  SUBDATASET_2_NAME=NITF_IM:1:multi_1b.ntf
  SUBDATASET_2_DESC=Image 2 of multi_1b.ntf
</pre>

The value of the _NAME is the string that can be passed to GDALOpen() to 
access the file.  The _DESC value is intended to be a more user friendly
string that can be displayed to the user in a selector. 
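
An application that wants to offer such a selector has to walk the numbered
items.  The standalone C++ sketch below shows one way to do this, assuming the
domain contents are available as a list of "NAME=VALUE" strings;
SubdatasetName() and SubdatasetNames() are hypothetical helpers introduced
here.

```cpp
#include <cassert>
#include <string>
#include <vector>

using StringList = std::vector<std::string>;

// Return the _NAME value of subdataset n (numbered from 1), or "" if absent.
std::string SubdatasetName(const StringList &items, int n)
{
    assert(n >= 1);
    const std::string prefix = "SUBDATASET_" + std::to_string(n) + "_NAME=";
    for (const std::string &item : items)
    {
        if (item.compare(0, prefix.size(), prefix) == 0)
            return item.substr(prefix.size());
    }
    return std::string();
}

// Collect the names of subdatasets 1, 2, ... until the first missing number.
StringList SubdatasetNames(const StringList &items)
{
    StringList names;
    for (int n = 1;; ++n)
    {
        const std::string name = SubdatasetName(items, n);
        if (name.empty())
            break;
        names.push_back(name);
    }
    return names;
}
```

Each returned string is what would be passed to GDALOpen() to open the
corresponding child dataset.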

Drivers which support subdatasets advertise the DMD_SUBDATASETS capability. 
This information is reported when the \-\-format and \-\-formats options are 
passed to the command line utilities.

Currently, drivers which support subdatasets are: 
ADRG, ECRGTOC, GEORASTER, GTiff, HDF4, HDF5, netCDF, NITF, NTv2, OGDI, PDF, 
PostGISRaster, Rasterlite, RPFTOC, RS2, WCS, and WMS.
 
\subsubsection gdal_datamodel_image_structure IMAGE_STRUCTURE Domain

Metadata in the default domain is intended to be related to the 
image, and not particularly related to the way the image is stored on
disk.  That is, it is suitable for copying with the dataset when it is
copied to a new format.  Some information of interest is closely tied
to a particular file format and storage mechanism.  In order to prevent
this getting copied along with datasets it is placed in a special domain
called IMAGE_STRUCTURE that should not normally be copied to new formats. 

Currently the following items are defined by 
<a href="http://trac.osgeo.org/gdal/wiki/rfc14_imagestructure">RFC 14</a> 
as having specific semantics in the IMAGE_STRUCTURE domain. 

<ul>
<li> COMPRESSION: The compression type used for this dataset or band. There is no fixed catalog of compression type names, but where a given format includes a COMPRESSION creation option, the same list of values should be used here as there. 
<li> NBITS: The actual number of bits used for this band, or the bands of this dataset. Normally only present when the number of bits is non-standard for the datatype, such as when a 1 bit TIFF is represented through GDAL as GDT_Byte.
<li> INTERLEAVE: This applies only to datasets, and the value should be one of PIXEL, LINE or BAND. It can be used as a data access hint.
<li> PIXELTYPE: This may appear on a GDT_Byte band (or the corresponding dataset) and have the value SIGNEDBYTE to indicate that the unsigned byte values between 128 and 255 should be interpreted as being values between -128 and -1 for applications that recognise the SIGNEDBYTE type.
</ul>
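
The PIXELTYPE and NBITS items above boil down to simple arithmetic on the
stored byte values.  The following standalone C++ sketch illustrates it;
InterpretSignedByte() and MaxValueForNBits() are hypothetical names used only
for this example.

```cpp
#include <cassert>

// Reinterpret a stored GDT_Byte value (0..255) when PIXELTYPE=SIGNEDBYTE:
// 128..255 become -128..-1, while 0..127 are unchanged.
int InterpretSignedByte(int raw)
{
    assert(raw >= 0 && raw <= 255);
    return raw >= 128 ? raw - 256 : raw;
}

// Largest unsigned value representable with the given NBITS on a byte band.
int MaxValueForNBits(int nbits)
{
    assert(nbits >= 1 && nbits <= 8);
    return (1 << nbits) - 1;
}
```

For example, a 1 bit TIFF exposed as GDT_Byte with NBITS=1 only ever holds the
values 0 and 1, even though the band datatype could hold up to 255.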

\subsubsection gdal_datamodel_rpc RPC Domain

The RPC metadata domain holds metadata describing the Rational Polynomial
Coefficient geometry model for the image if present.  This geometry model can
be used to transform between pixel/line and georeferenced locations.  The
items defining the model are:

<ul>
<li>  ERR_BIAS: Error - Bias. The RMS bias error in meters per horizontal axis of all points in the image (-1.0 if unknown)
<li>  ERR_RAND: Error - Random. RMS random error in meters per horizontal axis of each point in the image (-1.0 if unknown)
<li>  LINE_OFF: Line Offset
<li>  SAMP_OFF: Sample Offset
<li>  LAT_OFF: Geodetic Latitude Offset
<li>  LONG_OFF: Geodetic Longitude Offset
<li>  HEIGHT_OFF: Geodetic Height Offset
<li>  LINE_SCALE: Line Scale
<li>  SAMP_SCALE: Sample Scale
<li>  LAT_SCALE: Geodetic Latitude Scale
<li>  LONG_SCALE: Geodetic Longitude Scale
<li>  HEIGHT_SCALE: Geodetic Height Scale
<li>  LINE_NUM_COEFF (1-20): Line Numerator Coefficients. Twenty coefficients for the polynomial in the Numerator of the rn equation. (space separated)
<li>  LINE_DEN_COEFF (1-20): Line Denominator Coefficients. Twenty coefficients for the polynomial in the Denominator of the rn equation. (space separated)
<li>  SAMP_NUM_COEFF (1-20): Sample Numerator Coefficients. Twenty coefficients for the polynomial in the Numerator of the cn equation. (space separated)
<li> SAMP_DEN_COEFF (1-20): Sample Denominator Coefficients. Twenty coefficients for the polynomial in the Denominator of the cn equation. (space separated)
</ul>

These fields are directly derived from the prospective GeoTIFF RPC document (http://geotiff.maptools.org/rpc_prop.html), which in turn is closely modelled on the NITF RPC00B definition.
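
The list above does not spell out how the offset and scale items are used.
Assuming the usual RPC00B convention, in which each quantity is normalized as
(value - offset) / scale before the polynomials are evaluated and the result
is denormalized as normalized * scale + offset, the handling of these items
can be sketched in standalone C++ as below.  The OffsetScale structure and the
helper names are hypothetical, and the polynomial evaluation itself is
deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <sstream>
#include <string>

// One offset/scale pair, e.g. LAT_OFF and LAT_SCALE.
struct OffsetScale
{
    double offset;
    double scale;
};

// Assumed RPC00B convention: normalized = (value - offset) / scale.
double Normalize(double value, const OffsetScale &os)
{
    assert(os.scale != 0.0);
    return (value - os.offset) / os.scale;
}

// Inverse of Normalize(): value = normalized * scale + offset.
double Denormalize(double normalized, const OffsetScale &os)
{
    return normalized * os.scale + os.offset;
}

// Parse one of the *_COEFF items: exactly twenty space separated numbers.
bool ParseCoefficients(const std::string &text, std::array<double, 20> &coeffs)
{
    std::istringstream in(text);
    for (double &c : coeffs)
    {
        if (!(in >> c))
            return false; // fewer than twenty values
    }
    double extra;
    return !(in >> extra); // reject more than twenty values
}
```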

\subsubsection gdal_datamodel_xml xml: Domains

Any domain name prefixed with "xml:" is not normal name/value metadata. 
It is a single XML document stored in one big string. 


\section gdal_datamodel_rasterband Raster Band

A raster band is represented in GDAL with the GDALRasterBand class.  It 
represents a single raster band/channel/layer.  It does not necessarily
represent a whole image.  For instance, a 24bit RGB image would normally
be represented as a dataset with three bands, one for red, one for green
and one for blue. 

A raster band has the following properties:

<ul>

<li> A width and height in pixels and lines.  This is the same as that 
defined for the dataset, if this is a full resolution band. 

<li> A datatype (GDALDataType).  One of Byte, UInt16, Int16, UInt32, Int32, 
Float32, Float64, and the complex types CInt16, CInt32, CFloat32, and CFloat64.

<li> A block size.  This is a preferred (efficient) access chunk size.  For 
tiled images this will be one tile.  For scanline oriented images this will 
normally be one scanline. 

<li> A list of name/value pair metadata in the same format as the dataset,
but of information that is potentially specific to this band.

<li> An optional description string. 

<li> An optional single nodata pixel value (see also NODATA_VALUES metadata on 
the dataset for multi-band style nodata values).

<li> An optional nodata mask band marking pixels as nodata or in some cases 
transparency as discussed in <a 
href="http://trac.osgeo.org/gdal/wiki/rfc15_nodatabitmask">RFC 15: Band 
Masks</a>.

<li> An optional list of category names (effectively class names in a 
thematic image).  

<li> An optional minimum and maximum value. 

<li> An optional offset and scale for transforming raster values into 
meaningful values (e.g. translating height to meters).

<li> An optional raster unit name.  For instance, this might indicate linear 
units for elevation data.

<li> A color interpretation for the band.  This is one of:

  <ul> 
  <li> GCI_Undefined: the default, nothing is known.
  <li> GCI_GrayIndex: this is an independent grayscale image
  <li> GCI_PaletteIndex: this raster acts as an index into a color table
  <li> GCI_RedBand: this raster is the red portion of an RGB or RGBA image
  <li> GCI_GreenBand: this raster is the green portion of an RGB or RGBA image
  <li> GCI_BlueBand: this raster is the blue portion of an RGB or RGBA image
  <li> GCI_AlphaBand: this raster is the alpha portion of an RGBA image
  <li> GCI_HueBand: this raster is the hue of an HLS image
  <li> GCI_SaturationBand: this raster is the saturation of an HLS image
  <li> GCI_LightnessBand: this raster is the lightness of an HLS image
  <li> GCI_CyanBand: this band is the cyan portion of a CMY or CMYK image
  <li> GCI_MagentaBand: this band is the magenta portion of a CMY or CMYK image
  <li> GCI_YellowBand: this band is the yellow portion of a CMY or CMYK image
  <li> GCI_BlackBand: this band is the black portion of a CMYK image. 
  </ul>

<li> A color table, described in more detail later. 

<li> Knowledge of reduced resolution overviews (pyramids) if available. 

</ul>
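
Two of the properties listed above lend themselves to a short worked example:
the offset and scale, and the block size.  The standalone C++ sketch below
assumes the usual convention that the meaningful value is
raw * scale + offset; UnscaleValue() and BlocksAcross() are hypothetical
helper names, not GDAL functions.

```cpp
#include <cassert>

// Convert a raw raster value into a meaningful value (for example meters),
// assuming the convention value = raw * scale + offset.
double UnscaleValue(double raw, double scale, double offset)
{
    return raw * scale + offset;
}

// Number of blocks needed to cover rasterSize pixels (or lines) when the band
// is accessed in chunks of blockSize; the last block may be partial.
int BlocksAcross(int rasterSize, int blockSize)
{
    assert(rasterSize > 0 && blockSize > 0);
    return (rasterSize + blockSize - 1) / blockSize;
}
```

For a 1000 pixel wide band stored in 256 pixel tiles this gives four blocks
across, the last of which is only partly filled.  For a scanline oriented band
the block is as wide as the raster, so there is one block across.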

\section gdal_datamodel_rasterband_ct Color Table

A color table consists of zero or more color entries described in C by the
following structure:

<pre>
typedef struct
{
    /- gray, red, cyan or hue -/
    short      c1;      

    /- green, magenta, or lightness -/    
    short      c2;      

    /- blue, yellow, or saturation -/
    short      c3;      

    /- alpha or blackband -/
    short      c4;      
} GDALColorEntry;
</pre>

The color table also has a palette interpretation value (GDALPaletteInterp)
which is one of the following values, and indicates how the c1/c2/c3/c4 values
of a color entry should be interpreted. 

<ul>
<li> GPI_Gray: Use c1 as grayscale value. 
<li> GPI_RGB: Use c1 as red, c2 as green, c3 as blue and c4 as alpha.
<li> GPI_CMYK: Use c1 as cyan, c2 as magenta, c3 as yellow and c4 as black. 
<li> GPI_HLS: Use c1 as hue, c2 as lightness, and c3 as saturation. 
</ul>

To associate a color with a raster pixel, the pixel value is used as a
subscript into the color table.  That means that the colors are always
applied starting at zero and ascending.  There is no provision for indicating
a prescaling mechanism before looking up in the color table. 
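
The lookup described above is a plain array subscript.  The standalone C++
sketch below illustrates it with a ColorEntry structure that mirrors the four
fields of GDALColorEntry; the structure name and the BuildGrayRamp() and
LookupColor() helpers are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same four fields as the GDALColorEntry structure shown above.
struct ColorEntry
{
    short c1, c2, c3, c4;
};

// Build a gray ramp: entry i has c1 == i (to be read with GPI_Gray).
std::vector<ColorEntry> BuildGrayRamp(int count)
{
    assert(count > 0 && count <= 256);
    std::vector<ColorEntry> table;
    for (int i = 0; i < count; ++i)
    {
        const ColorEntry entry = {static_cast<short>(i), 0, 0, 0};
        table.push_back(entry);
    }
    return table;
}

// Use the pixel value directly as a subscript into the color table.
// Returns false when the value has no entry in the table.
bool LookupColor(const std::vector<ColorEntry> &table, int pixelValue,
                 ColorEntry &entry)
{
    if (pixelValue < 0 ||
        static_cast<std::size_t>(pixelValue) >= table.size())
        return false;
    entry = table[static_cast<std::size_t>(pixelValue)];
    return true;
}
```

How the c1 to c4 fields of the returned entry are read depends on the palette
interpretation of the table, as listed above.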

\section gdal_datamodel_rasterband_overviews Overviews

A band may have zero or more overviews.  Each overview is represented as
a "free standing" GDALRasterBand.  The size (in pixels and lines) of the 
overview will be different than the underlying raster, but the geographic
region covered by overviews is the same as the full resolution band.  

Overviews are used to display reduced resolution views of the raster more
quickly than could be done by reading all the full resolution data and
downsampling. 
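
A display application typically picks the smallest overview that still has
enough pixels for the requested output.  The standalone C++ sketch below shows
one possible policy; it is an assumption made for illustration and not
necessarily the policy used inside GDAL, and ChooseOverview() and
OverviewPixelSize() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return the index of the smallest overview that is at least requestedWidth
// pixels wide, or -1 when only the full resolution band is large enough.
int ChooseOverview(const std::vector<int> &overviewWidths, int requestedWidth)
{
    assert(requestedWidth > 0);
    int best = -1;
    for (std::size_t i = 0; i < overviewWidths.size(); ++i)
    {
        const int width = overviewWidths[i];
        if (width >= requestedWidth &&
            (best < 0 || width < overviewWidths[static_cast<std::size_t>(best)]))
            best = static_cast<int>(i);
    }
    return best;
}

// An overview covers the same region with fewer pixels, so each of its
// pixels is larger by the ratio of the two sizes.
double OverviewPixelSize(double fullPixelSize, int fullSize, int overviewSize)
{
    assert(fullSize > 0 && overviewSize > 0);
    return fullPixelSize * fullSize / overviewSize;
}
```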

Bands also have a HasArbitraryOverviews property which is TRUE if the
raster can be read at any resolution efficiently but with no distinct
overview levels.  This applies to some FFT encoded images, or images pulled
through gateways (like OGDI) where downsampling can be done efficiently
at the remote point. 

*/