<html>
<body>
<h1>DAP to Netcdf Translation Rules</h1>
Two translations are currently available.
<ul>
<li><a href="#ncdap3">DAP 2 Protocol to netCDF-3</a>
<li><a href="#ncdap4">DAP 2 Protocol to netCDF-4</a>
</ul>
<h2><a name="ncdap3">netCDF-3 Translation Rules</a></h2>
The current set of translation rules to convert an OPeNDAP
DAP protocol version 2 DDS to netCDF-3 is designed to mimic
as closely as possible those currently used by the libnc-dap
system.  Please note that the translation is still subject
to change to respond to unforeseen problems and user
suggestions.
<p>
For illustrative purposes, the following example will be used.
<pre>
Dataset {
  Int32 f1;
  Structure {
    Int32 f11;        
    Structure {
      Int32 f1[3];
      Int32 f2;
    } FS2[2]; 
  } S1; 
  Structure {
    Grid {
      Array:
        Float32 temp[lat=2][lon=2];
      Maps:
        Int32 lat[lat=2];
        Int32 lon[lon=2];
    } G1;
  } S2;
  Grid {
      Array:
        Float32 G2[lat=2][lon=2];
      Maps:
        Int32 lat[2];
        Int32 lon[2];
  } G2;
  Int32 lat[lat=2];
  Int32 lon[lon=2];
} D1;
</pre>
<h3>Variable Definition</h3>
The set of variables is defined by the fields with
primitive base types as they occur in
Sequences, Grids, and Structures.
The field names are modified to be fully qualified initially.
For the above example, the variables are as follows.
<ol>
<li>f1
<li>S1.f11
<li>S1.FS2.f1
<li>S1.FS2.f2
<li>S2.G1.temp
<li>S2.G1.lat
<li>S2.G1.lon
<li>G2.G2
<li>G2.lat
<li>G2.lon
<li>lat
<li>lon
</ol>
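<p>
The collection step can be sketched in Python as follows. This is an illustration only: the tuple-based tree model and the function name qualified_variables are assumptions made for the sketch, not part of the translation code, and only a fragment of the example DDS is modelled.

```python
# Hypothetical model of a DDS node: (name, kind, children).
# kind is "prim" for a primitive field; anything else is a container.
def qualified_variables(node, prefix=""):
    """Return the fully qualified names of the primitive fields under node."""
    name, kind, children = node
    path = prefix + "." + name if prefix else name
    if kind == "prim":
        return [path]
    names = []
    for child in children:
        names.extend(qualified_variables(child, path))
    return names

# Fragment of the example DDS (the Dataset name is not part of the names).
fields = [
    ("f1", "prim", []),
    ("S1", "struct", [
        ("f11", "prim", []),
        ("FS2", "struct", [("f1", "prim", []), ("f2", "prim", [])]),
    ]),
    ("S2", "struct", [
        ("G1", "grid", [("temp", "prim", []), ("lat", "prim", []), ("lon", "prim", [])]),
    ]),
]
variables = [v for field in fields for v in qualified_variables(field)]
print(variables)
```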
<h3>Variable Dimension Translation</h3>
A variable's rank is determined from three sources.
<ol>
<li>
The variable has the dimensions associated with the field
it represents (e.g. S1.FS2.f1[3] in the above example).
<li>
The variable inherits the dimensions associated with any containing
structure that has a rank greater than zero.
These dimensions precede those of case 1.
Thus, we have in our example, f1[2][3], where the first dimension
comes from the containing Structure FS2[2].
<li>
The variable's set of dimensions is altered
if any of its containers is a DAP DDS Sequence.
This is discussed more fully below.
</ol>
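<p>
The first two sources can be illustrated with a short Python sketch. The function name inherited_shape and the list-of-pairs representation are assumptions for illustration; the Sequence case (source 3) is deliberately left out here.

```python
def inherited_shape(path):
    """path lists (name, sizes) pairs from the outermost container to the field."""
    shape = []
    for _name, sizes in path:
        shape.extend(sizes)  # container dimensions precede the field's own
    return shape

# S1 has rank 0, FS2 is declared as FS2[2], and f1 as f1[3].
shape = inherited_shape([("S1", []), ("FS2", [2]), ("f1", [3])])
print(shape)
```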
<h3>Dimension translation</h3>
For dimensions, the rules are as follows.
<ol>
<li> Fields in dimensioned structures inherit the dimension
of the structure; thus the above list would have the following
dimensioned variables.
<ul>
<li>S1.FS2.f1 -> S1.FS2.f1[2][3]
<li>S1.FS2.f2 -> S1.FS2.f2[2]
<li>S2.G1.temp -> S2.G1.temp[lat=2][lon=2]
<li>S2.G1.lat -> S2.G1.lat[lat=2]
<li>S2.G1.lon -> S2.G1.lon[lon=2]
<li>G2.G2 -> G2.G2[lat=2][lon=2]
<li>G2.lat -> G2.lat[2]
<li>G2.lon -> G2.lon[2]
<li>lat -> lat[lat=2]
<li>lon -> lon[lon=2]
</ul>
<p>
<li>
Collect all of the dimension specifications from the DDS, both
named and anonymous (unnamed).
For each unique anonymous dimension with value NN,
create a netCDF dimension of the form "&lt;array&gt;_&lt;i&gt;=NN",
where &lt;array&gt; is the fully qualified name of the variable and i is the
index of the (inherited) dimension of the array where the anonymous dimension occurs.
For our example, this would create the following dimensions.
<ul>
<li>S1.FS2.f1_0 = 2 ;
<li>S1.FS2.f1_1 = 3 ;
<li>S1.FS2.f2_0 = 2 ;
<li>G2.lat_0 = 2 ;
<li>G2.lon_0 = 2 ;
</ul>
<p>
<li>
If, however, the anonymous dimension is the single dimension
of a MAP vector in a Grid, then the dimension is given the
same name as the map vector. This leads to the following.
<ul>
<li>G2.lat_0 -> G2.lat
<li>G2.lon_0 -> G2.lon
</ul>
<p>
<li>
For each unique named dimension "&lt;name&gt;=NN",
create a netCDF dimension of the form "&lt;name&gt;=NN",
where name has the qualifications removed.
If this leads to duplicates (i.e. same name and same value),
then the duplicates are ignored.
This produces the following.
<ul>
<li>G2.lat -> lat
<li>G2.lon -> lon
</ul>
Note that this produces duplicates.
<p>
<li>
At this point the only dimensions left to process should be named
dimensions with the same name as some dimension from step number 3,
but with a different value.  For those dimensions create a dimension
of the form "&lt;name&gt;M=NN" where M is a counter starting at 1.
The example has no instances of this.
<p>
<li>
Finally, if needed, define a single UNLIMITED dimension named "unlimited"
with value zero.
</ol>
This leads to the following set of dimensions.
<pre>
dimensions:
  unlimited = UNLIMITED;
  lat = 2 ;
  lon = 2 ;
  S1.FS2.f1_0 = 2 ;
  S1.FS2.f1_1 = 3 ;
  S1.FS2.f2_0 = 2 ;
</pre>
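<p>
The naming steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: the function name netcdf3_dimensions and the input layout are hypothetical, each variable is given with its inherited dimensions already attached, and the UNLIMITED dimension is not modelled.

```python
def netcdf3_dimensions(variables):
    """variables: list of (qualified_name, is_map, dims);
    dims is a list of (name_or_None, size) including inherited dimensions."""
    dims = {}  # dimension name -> size
    for qname, is_map, vdims in variables:
        for i, (dname, size) in enumerate(vdims):
            if dname is None:
                if is_map and len(vdims) == 1:
                    # Map vector: the dimension takes the (unqualified) map name.
                    dname = qname.split(".")[-1]
                else:
                    # Other anonymous dimensions are named after variable and index.
                    dims.setdefault(qname + "_" + str(i), size)
                    continue
            # Same name, different size: append a counter starting at 1.
            base, counter = dname, 0
            while dname in dims and dims[dname] != size:
                counter += 1
                dname = base + str(counter)
            dims.setdefault(dname, size)  # exact duplicates are ignored
    return dims

example = [
    ("S1.FS2.f1", False, [(None, 2), (None, 3)]),
    ("S1.FS2.f2", False, [(None, 2)]),
    ("S2.G1.temp", False, [("lat", 2), ("lon", 2)]),
    ("G2.lat", True, [(None, 2)]),
    ("G2.lon", True, [(None, 2)]),
]
dimensions = netcdf3_dimensions(example)
print(dimensions)
```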
<h3>Variable Name Translation</h3>
The steps for variable name translation are as follows.
<p>
<ol>
<li>Take the set of variables captured above.
Thus for the above DDS, the following fields would be collected.
<ul>
<li>f1
<li>S1.f11
<li>S1.FS2.f1
<li>S1.FS2.f2
<li>S2.G1.temp
<li>S2.G1.lat
<li>S2.G1.lon
<li>G2.G2
<li>G2.lat
<li>G2.lon
<li>lat
<li>lon
</ul>
<p>
<li>All grid array variables are renamed to be the same as the containing
grid and the grid prefix is removed.
In the above DDS, this results in the following changes.
<ol>
<li> G1.temp -> G1
<li> G2.G2 -> G2
</ol>
Note that, for example, the G1.lon keeps that name.
Also note that libnc-dap just drops the grid map variables,
so this is one place where the translation differs from
libnc-dap, but in a compatible way.
</ol>
<p>
Note that this process can produce duplicate
variables (i.e. with the same name); in that case they are all assumed
to have the same content and the duplicates are ignored.
If it turns out that the duplicates have different content, then
the translation will not detect this. YOU HAVE BEEN WARNED.
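<p>
As a rough Python illustration of the renaming and duplicate handling, assuming the caller already knows which qualified names are grid arrays (the function name rename_grid_arrays is hypothetical):

```python
def rename_grid_arrays(variables, grid_arrays):
    """Rename each grid array to its containing grid and drop duplicate names."""
    result = []
    for name in variables:
        if name in grid_arrays:
            name = name.rsplit(".", 1)[0]  # e.g. S2.G1.temp becomes S2.G1
        if name not in result:  # duplicates are assumed identical and ignored
            result.append(name)
    return result

renamed = rename_grid_arrays(
    ["f1", "S2.G1.temp", "S2.G1.lat", "G2.G2", "G2.lat"],
    {"S2.G1.temp", "G2.G2"},
)
print(renamed)
```

As in the text, the sketch compares names only, so duplicates with different content would go unnoticed.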
<p>
The final netCDF-3 schema (minus attributes) is then as follows.
<pre>
netcdf t {
dimensions:
        unlimited = UNLIMITED ;
        lat = 2 ;
        lon = 2 ;
        S1.FS2.f1_0 = 2 ;
        S1.FS2.f1_1 = 3 ;
        S1.FS2.f2_0 = 2 ;
variables:
        int f1 ;
        int lat(lat) ;
        int lon(lon) ;
        int S1.f11 ;
        int S1.FS2.f1(S1.FS2.f1_0, S1.FS2.f1_1) ;
        int S1.FS2.f2(S1.FS2.f2_0) ;
        float S2.G1(lat, lon) ;
        int S2.G1.lat(lat) ;
        int S2.G1.lon(lon) ;
        float G2(lat, lon) ;
        int G2.lat(lat) ;
        int G2.lon(lon) ;
}
</pre>
In actuality, the unlimited dimension is dropped because
it is unused.
<p>
There are differences with the original libnc-dap here
because libnc-dap technically was incorrect.  The original
would have said this, for example.
<pre>
int S1.FS2.f1(lat, lat) ;
</pre>
Note that this is incorrect because it dimensions
S1.FS2.f1(2,2) rather than S1.FS2.f1(2,3).
<h3>Translating DAP DDS Sequences</h3>
Any variable (as determined above) that is contained
directly or indirectly by a Sequence is subject to revision
of its rank using the following rules.
<ol>
<li> 
Let the variable be contained in Sequence Q1, where Q1 is the
innermost containing sequence. If Q1 is itself contained
(directly or indirectly) in a sequence,
or Q1 is contained (again directly or indirectly)
in a structure that has rank greater than 0,
then the variable will have an initial UNLIMITED
dimension.  However, all dimensions coming from "above" and including (in
the containment sense) the innermost Sequence, Q1, will be
removed and replaced by the single UNLIMITED dimension.  The
size associated with that UNLIMITED is zero, which means
that its contents are inaccessible through the netcdf-3 API.
Again, this differs from libnc-dap, which leaves out such variables.
Again, however, this difference is compatible.
<p>
<li> 
If the variable is contained in a single Sequence (i.e. not nested)
and all containing structures have rank 0, then the variable will
have an initial dimension whose size is the record count for that
Sequence. The name of the new dimension will be the name of the
Sequence.
</ol>
<p>
Consider this example.
<pre>
Dataset {
  Structure {
    Sequence {
      Int32 f1[3];
      Int32 f2;
    } SQ1;
  } S1[2]; 
  Sequence {
    Structure {
      Int32 x1[7];
    } S2[5];
  } Q2;
} D;
</pre>
The corresponding netcdf-3 translation is roughly as follows
(the value for dimension Q2 may differ).
<pre>
dimensions:
    unlimited = UNLIMITED ; // (0 currently)
    S1.SQ1.f1_0 = 2 ;
    S1.SQ1.f1_1 = 3 ;
    S1.SQ1.f2_0 = 2 ;
    Q2.S2.x1_0 = 5 ;
    Q2.S2.x1_1 = 7 ;
    Q2 = 5 ;
variables:
    int S1.SQ1.f1(unlimited, S1.SQ1.f1_1) ;
    int S1.SQ1.f2(unlimited) ;
    int Q2.S2.x1(Q2, Q2.S2.x1_0, Q2.S2.x1_1) ;
</pre>
Note that for example S1.SQ1.f1_0
is not actually used because it has been folded
into the unlimited dimension.
<p>
Note that there is a performance cost
because the translation code has to walk the data to determine
how many records are associated with the sequence.
Since libnc-dap did something similar, it can be assumed that
the cost is not prohibitive.
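<p>
The two Sequence rules can be sketched in Python as below. This is a sketch under assumptions: the function name sequence_dims and the path representation are hypothetical, anonymous dimensions are named by their position in the full inherited list, and the record count itself is not computed.

```python
def sequence_dims(path):
    """path: list of (name, kind, sizes) from the outermost container to the
    field, with kind one of "struct", "seq" or "prim".
    Returns the dimension names of the translated variable."""
    qname = ".".join(name for name, _kind, _sizes in path)
    names, owners = [], []
    for level, (_name, _kind, sizes) in enumerate(path):
        for _size in sizes:
            names.append(qname + "_" + str(len(names)))
            owners.append(level)
    seq_levels = [i for i, (_n, kind, _s) in enumerate(path) if kind == "seq"]
    if not seq_levels:
        return names  # no Sequence involved: plain inheritance
    q1 = seq_levels[-1]  # innermost containing Sequence
    below = [n for n, level in zip(names, owners) if level > q1]
    # Rule 1 applies if Q1 sits inside another Sequence or a dimensioned Structure.
    nested = any(kind == "seq" or len(sizes) > 0 for _n, kind, sizes in path[:q1])
    lead = "unlimited" if nested else path[q1][0]
    return [lead] + below

f1 = sequence_dims([("S1", "struct", [2]), ("SQ1", "seq", []), ("f1", "prim", [3])])
x1 = sequence_dims([("Q2", "seq", []), ("S2", "struct", [5]), ("x1", "prim", [7])])
print(f1, x1)
```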
<h2><a name="ncdap4">netCDF-4 Translation Rules</a></h2>
The DAP to netCDF-4 translation is enabled if the
"--enable-netcdf-4" option is specified at configure time.
This translation includes some elements of the libnc-dap
translation, but attempts to provide a simpler (but not,
unfortunately, simple) set of translation rules than is used
for the netCDF-3 translation.
Please note that the translation is still subject
to change to respond to unforeseen problems or to
suggested improvements.
<p>
This text will use this running example.
<pre>
Dataset {
  Int32 f1[fdim=10];
  Structure {
    Int32 f11;        
    Structure {
      Int32 f1[3];
      Int32 f2;
    } FS2[2]; 
  } S1; 
  Grid {
    Array:
      Float32 temp[lat=2][lon=2];
    Maps:
      Int32 lat[2];
      Int32 lon[2];
  } G1;
  Sequence {
    Float64 depth;
  } Q1;
} D;
</pre>
<h3>Variable Definition</h3>
The rules for choosing variables are as follows.
<ol>
<li> Start with the names of the top-level fields of the DDS.
The term top-level means that the object is a direct subnode
of the Dataset object. In our example, this produces the set
[f1, S1, G1, Q1].
<p>
<li>
Replace all Grid objects with the fully qualified list of array
and map fields of the grid.
Our variable set then becomes [f1, S1, G1.temp, G1.lat, G1.lon, Q1].
Note that the libnc-dap practice of re-naming the array variable to be
that of the Grid is not used.
<p>
<li>
Attempt to remove the prefix Grid name from the top-level Grid array and map
variables. If that eventually conflicts with some other name,
then leave the conflicting Grids alone.
Our variable set then becomes [f1, S1, temp, lat, lon, Q1].
<li>
If the Grid array name is the same as the Grid name, then
remove the prefix Grid name (not shown).
</ol>
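<p>
A Python sketch of the first three rules follows. The function name netcdf4_variables and the input layout are assumptions, and the conflict handling shows one possible reading of "leave the conflicting Grids alone": if any field of a Grid clashes with another name, all fields of that Grid stay qualified.

```python
def netcdf4_variables(top_level):
    """top_level: list of (name, kind, grid_fields); grid_fields lists the
    array and map names of a Grid and is empty for other kinds."""
    # Names that would exist if every Grid prefix were removed.
    names = []
    for name, kind, fields in top_level:
        names.extend(fields if kind == "grid" else [name])
    result = []
    for name, kind, fields in top_level:
        if kind != "grid":
            result.append(name)
        elif any(names.count(f) > 1 for f in fields):
            result.extend(name + "." + f for f in fields)  # conflict: keep prefix
        else:
            result.extend(fields)
    return result

variables = netcdf4_variables([
    ("f1", "prim", []),
    ("S1", "struct", []),
    ("G1", "grid", ["temp", "lat", "lon"]),
    ("Q1", "seq", []),
])
print(variables)
```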
<h3>Dimension Definition</h3>
The rules for choosing and defining dimensions are as follows.
<ol>
<li>
Collect the set of dimensions (named and anonymous) directly
associated with the variables as defined above.
This means that dimensions
within user-defined types are ignored.  From our example,
the dimension set is [fdim=10,lat=2,lon=2,2,2].  Note that the
unqualified names are used.
<p>
<li>
If an anonymous dimension is associated with a Grid Map variable,
then give the dimension the name of the map.
Our dimension set now becomes 
[fdim=10,lat=2,lon=2,lat=2,lon=2].
<p>
<li>
All remaining anonymous dimensions are given the
name "&lt;var&gt;_NN", where "&lt;var&gt;" is the
unqualified name of the variable in which the anonymous
dimension appears and NN is the relative position of that
dimension in the dimensions associated with that array.
No instances of this rule occur in the running example.
<p>
<li>
Remove duplicate dimensions (those with same name and value).
Our dimension set now becomes 
[fdim=10,lat=2,lon=2].
<p>
<li>
The final case occurs when there are dimensions with the same
name but with different values. For this case,
the size of the dimension is appended to the dimension name.
</ol>
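<p>
Rules 1 through 4 can be sketched in Python as follows. The function name netcdf4_dimensions and the input layout are assumptions, positions are assumed to be zero-based, and the final rule (same name, different value) is left out because the text does not spell out the exact resulting name.

```python
def netcdf4_dimensions(variables):
    """variables: list of (unqualified_name, is_map, dims);
    dims is a list of (name_or_None, size)."""
    result = []
    for vname, is_map, vdims in variables:
        for pos, (dname, size) in enumerate(vdims):
            if dname is None:
                # Map variables lend their name; others use variable and position.
                dname = vname if is_map else vname + "_" + str(pos)
            if (dname, size) not in result:  # drop exact duplicates
                result.append((dname, size))
    return result

dimensions = netcdf4_dimensions([
    ("f1", False, [("fdim", 10)]),
    ("temp", False, [("lat", 2), ("lon", 2)]),
    ("lat", True, [(None, 2)]),
    ("lon", True, [(None, 2)]),
])
print(dimensions)
```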
<h3>Type Definition</h3>
The rules for choosing user-defined types are as follows.
<ol>
<li>For every Structure, Sequence, and non-top-level Grid,
a netCDF-4 compound type is created whose fields are the fields
of the Structure, Sequence, or Grid. The name of the type is the
same as the Structure or Grid name suffixed with "_t".
However, the compound types derived from Sequences are instead 
suffixed with "_record_t".
<p>
The types of the fields are the types of the corresponding field
of the Structure, Sequence, or Grid. Note that this type
might be itself a user-defined type.
<p>
From the example, we get the following compound types.
<pre>
compound FS2_t {
    int f1[3];
    int f2;
};
compound S1_t {
    int f11;
    FS2_t FS2[2];  
};
compound Q1_record_t {
    double depth;
};
</pre>
<p>
<li>
For all sequences of name X,
also create this type.
<pre>
    X_record_t (*) X_t
</pre>
In our example, this produces the following type.
<pre>
    Q1_record_t (*) Q1_t
</pre>
<p>
<li>
If a Sequence Q has a single field F
whose type is a primitive type T
(e.g., int, float, string), then
do not apply the previous rule, but instead replace the whole
sequence with the following field.
<pre>
    T (*) Q.F
</pre>
<p>
<li>
Attempt to maximally shorten the type names as long as there is no conflict.
</ol>
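<p>
The naming convention of the first two rules can be sketched in Python as below. The tree model and the function name user_types are assumptions for illustration; the single-field Sequence rule and the name-shortening rule are not modelled.

```python
def user_types(node, top_level=True):
    """node: (name, kind, children). Returns the user-defined type names
    implied by rules 1 and 2, inner types first."""
    name, kind, children = node
    names = []
    for child in children:
        names.extend(user_types(child, top_level=False))
    if kind == "struct" or (kind == "grid" and not top_level):
        names.append(name + "_t")
    elif kind == "seq":
        # Compound record type plus the variable-length type built on it.
        names.extend([name + "_record_t", name + "_t"])
    return names

s1 = ("S1", "struct", [
    ("f11", "prim", []),
    ("FS2", "struct", [("f1", "prim", []), ("f2", "prim", [])]),
])
q1 = ("Q1", "seq", [("depth", "prim", [])])
types = user_types(s1) + user_types(q1)
print(types)
```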
<h2>Choosing a Translation</h2>
The decision about whether to translate to netCDF-3
(libnc-dap) or netCDF-4 is determined by applying the
following rules in order.
<ol>
<li>
If the NC_CLASSIC_MODEL flag is set on nc_open(), then netcdf-3
(i.e. libnc-dap) translation is used.
<li>
If the NC_NETCDF4 flag is set on nc_open(), then netCDF-4
translation is used.
<li>
If the URL is prefixed with the string
"[mode=netcdf3]" or "[mode=libnc-dap]",
then the libnc-dap translation is used.
<li>
If the URL is prefixed with the string "[mode=netcdf4]",
then the netCDF-4 translation described above is used.
<li>
If none of the above is specified, then the default
is "[mode=libnc-dap]".
</ol>
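<p>
The ordering of these rules can be sketched in Python as follows. The function name choose_translation is hypothetical, the two boolean arguments stand in for the NC_CLASSIC_MODEL and NC_NETCDF4 flags passed to nc_open(), and the URL is a placeholder.

```python
def choose_translation(url, classic_model=False, netcdf4=False):
    """Apply the rules in order and return "netcdf3" or "netcdf4"."""
    if classic_model:
        return "netcdf3"
    if netcdf4:
        return "netcdf4"
    if url.startswith(("[mode=netcdf3]", "[mode=libnc-dap]")):
        return "netcdf3"
    if url.startswith("[mode=netcdf4]"):
        return "netcdf4"
    return "netcdf3"  # default is the libnc-dap translation

choice = choose_translation("[mode=netcdf4]http://example.org/data")
print(choice)
```

Because the flags are tested first, a flag passed to nc_open() takes precedence over a mode given in the URL.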
<h2>Defined Client Parameters</h2>
Currently, a limited set of client parameters is recognized.
Parameters not listed here are ignored, but no error is signalled.
<table borderwidth=1 cellpadding=2>
<tr><th>Parameter Name<th>Legal Values<th>Semantics
<tr><td>[mode=...]<td>libnc‑dap|netcdf3|netcdf4
<td>Specify the translation to be applied to the DAP data source on the
client side.
<tr><td>[show=...]<td>das|dds|url
<td>This causes information to appear as specific global attributes.  The
tags may be combined using comma with no spaces
(e.g. "show=dds,url").  The currently recognized tags are "dds" to
display the underlying DDS, "das" similarly, and "url" to display
the url used to retrieve the data.
</table>
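<p>
A minimal Python sketch of reading such parameters from a URL follows. It assumes that several bracketed parameters may be placed one after another in front of the URL, which the text above does not state explicitly; the function name parse_client_params and the example URL are hypothetical.

```python
RECOGNIZED = {"mode", "show"}

def parse_client_params(url):
    """Split leading [key=value] groups from url.
    Returns (params, remaining_url); unrecognized keys are silently dropped."""
    params = {}
    while url.startswith("["):
        end = url.find("]")
        if end == -1:
            break  # unbalanced bracket: stop and leave the rest untouched
        key, _sep, value = url[1:end].partition("=")
        if key in RECOGNIZED:
            params[key] = value.split(",")  # tags are comma-separated
        url = url[end + 1:]
    return params, url

params, rest = parse_client_params(
    "[mode=netcdf4][show=dds,url][foo=1]http://example.org/data"
)
print(params, rest)
```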
</body>
</html>
 |