.TH "CFA" "1" "1.3.1" "2016-09-09" "cfa"
.
.
.
.SH NAME
cfa \- create aggregated CF datasets
.
.
.
.SH SYNOPSIS
.
cfa [\-d dir] [\-f format] [\-h] [\-i] [\-n] [\-o file] [\-u] [\-v] [\-x] [OPTIONS] INPUTS
.
.
.SH DESCRIPTION
.
.
The cfa tool creates and writes to disk the CF fields contained in the
files specified by the
.ft B
INPUTS
.ft P
(which may include directories if the
.ft B
\-\-recursive
.ft P
option is set).

Accepts CF\-netCDF and CFA\-netCDF files (or URLs if DAP access is
enabled), Met Office (UK) PP files and Met Office (UK) fields files as
input. Multiple input files in a mixture of formats may be given and
normal UNIX file globbing rules apply.

Output files are in CF\-netCDF or CFA\-netCDF format (see the
.ft B
\-f
.ft P
option).
Both output types are available in netCDF3 and netCDF4 formats. Note
that the netCDF3 formats are generally slower to write than the
netCDF4 formats, by several orders of magnitude if files with many
data variables are involved. However, not all software can read
netCDF4, so it is advisable to check before writing in this format.

By default the contents of each input file are aggregated
(i.e. combined) into as few multi\-dimensional CF fields as
possible. Unaggregatable fields in the input files may be omitted from
the output (see the
.ft B
\-x
.ft P
option). Information on which fields are unaggregatable, and why, may
be displayed (see the
.ft B
\-\-info
.ft P
option). All aggregation may be turned off with the
.ft B
\-n
.ft P
option, in which case all input fields are output without
modification.

See the AGGREGATION section for details on the aggregation process and
unaggregatable fields.

By default one output file is created per input file. In this case
there is no inter\-file aggregation and the contents of each file are
aggregated independently of the others. Output file names are created
by removing the suffix \.pp, \.nc or \.nca, if there is one, from each
input file name and then adding a new suffix of \.nc or \.nca for
CF\-netCDF and CFA\-netCDF output formats respectively. If the
.ft B
\-d
.ft P
option is set then all output files will be written to the specified
directory, otherwise each output file will be written to the same
directory as its input file.
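
The naming rule above can be sketched in Python 3 as follows. This is
only an illustration of the rule as described here, not the code used
by cfa; the function name output_name and its arguments are
hypothetical.
.RS
.nf
```python
import os

def output_name(infile, cfa_format=False, directory=None):
    """Hypothetical helper: derive an output file name from an input name."""
    head, tail = os.path.split(infile)
    root, ext = os.path.splitext(tail)
    # Only the suffixes .pp, .nc and .nca are removed; any other is kept.
    if ext not in (".pp", ".nc", ".nca"):
        root = tail
    suffix = ".nca" if cfa_format else ".nc"
    # With a directory given (as for the -d option) every output goes
    # there, otherwise it goes next to its input file.
    target = directory if directory is not None else head
    return os.path.join(target, root + suffix)
```
.fi
.RE
Under this rule an input file called a.nc, written as CF\-netCDF
without the
.ft B
\-d
.ft P
option, maps to its own name.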

Alternatively, all of the input files may be treated collectively as a
single CF dataset and written to a single output file (see the
.ft B
\-o
.ft P
option). In this case aggregation is attempted within and between the
input files.

An error occurs if an output file has the same full name as any of the
input files or any other output file.
.
.
.
.
.SH AGGREGATION
.
.
.
Aggregation of input fields into as few multi\-dimensional CF fields
as possible is carried out according to the aggregation rules
documented in CF ticket #78 (http://kitt.llnl.gov/trac/ticket/78). For
each input field, the aggregation process creates a
.ft I
structural signature
.ft P
which is essentially a subset of the metadata of the field, including
coordinate metadata and other domain information, but which contains
no data values. The structural signature accounts for the following
standard CF properties:
  
.RS
add_offset, calendar, cell_methods, _FillValue, flag_masks,
flag_meanings, flag_values, missing_value, scale_factor,
standard_error_multiplier, standard_name, units, valid_max, valid_min,
valid_range
.RE

Aggregation is then attempted on each group of fields with the same,
well defined structural signature, and will succeed where the
coordinate data values imply a safe combination into a single dataset.

Not all fields are aggregatable. Unaggregatable fields are those
without a well defined structural signature; or those with the same
structural signature when at least two of them 1) can't be
unambiguously distinguished by coordinates or other domain information
or 2) contain coordinate reference fields or ancillary variable fields
which themselves can't be unambiguously aggregated.
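
The grouping step can be pictured with the following Python 3 sketch.
It is a deliberate simplification and not the cf\-python
implementation: a real structural signature also covers coordinate
metadata and other domain information, and the names signature and
group_by_signature, as well as the rule that a missing standard_name
gives no well defined signature, are assumptions made for
illustration.
.RS
.nf
```python
from collections import defaultdict

# The CF properties listed above as part of the structural signature.
SIGNATURE_PROPERTIES = (
    "add_offset", "calendar", "cell_methods", "_FillValue", "flag_masks",
    "flag_meanings", "flag_values", "missing_value", "scale_factor",
    "standard_error_multiplier", "standard_name", "units", "valid_max",
    "valid_min", "valid_range",
)

def signature(properties):
    """Return a hashable stand-in signature, or None if not well defined."""
    if "standard_name" not in properties:
        return None  # assumption: no identity, so no usable signature
    return tuple(
        (key, str(properties[key]))
        for key in SIGNATURE_PROPERTIES
        if key in properties
    )

def group_by_signature(fields):
    """Map each signature to the names of the fields which share it."""
    groups = defaultdict(list)
    for name, properties in fields.items():
        groups[signature(properties)].append(name)
    return dict(groups)
```
.fi
.RE
Only fields that land in the same group are candidates for
aggregation; fields grouped under None correspond to those without a
well defined structural signature.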
.
.
.
.SH EXAMPLES
.
.
Create a new netCDF3 classic file containing the aggregatable fields
in all of the input files:

.RS
cfa \-o newfile.nc *.nc
.RE

Create, in an existing directory and overwriting any existing files,
new netCDF3 classic files containing the aggregatable fields in each
input file:

.RS
cfa \-d directory \-\-overwrite *.pp
.RE

Create a new netCDF4 file containing all fields in all of the input
files:

.RS
cfa \-f NETCDF4 \-o newfile.nc *.nc
.RE

Create a new CFA-netCDF4 file containing all fields in all of the
input files and allow long names or netCDF variable names to identify
fields and their components:

.RS
cfa \-i \-f CFA4 \-o newfile.nc *.nc
.RE
.
.
.
.SH OPTIONS
.
.
.
.TP
.B \-\-axis=property
Aggregation configuration: Create a new axis for each input field
which has the given property. If an input field has the property then,
prior to aggregation, a new axis is created with an auxiliary
coordinate whose data array is the property's value. This allows for
the possibility of aggregation along the new axis. The property itself
is deleted from that field. No axis is created for input fields which
do not have the specified property.

Multiple axes may be created by specifying more than one
.ft B
\-\-axis
.ft P
option.

For example, if you wish to aggregate an ensemble of model
experiments that are distinguished by the source property, you can use
.ft B
\-\-axis=source
.ft P
to create an ensemble axis which has an auxiliary coordinate variable
containing the source property values.
.
.
.TP
.B \-\-cfa_base=[value]
For output CFA\-netCDF files only. File names referenced by an output
CFA\-netCDF file have relative, as opposed to absolute, paths or URL
bases. This may be useful when relocating a CFA\-netCDF file together
with the datasets referenced by it.
.PP
.RS
If set with no value (\-\-cfa_base=) or the value is empty then file
names are given relative to the directory or URL base containing the
output CFA\-netCDF file. If set with a non\-empty value then file
names are given relative to the directory or URL base described by the
value.
.PP
By default, file names within CFA\-netCDF files are stored with
absolute paths. Ignored for output files of any other format.
.RE
.
.
.TP
.B \-\-compress=N
Regulate the speed and efficiency of compression. Must be an integer
between 0 and 9. By default N is 0, meaning no compression; 1 is the
fastest but has the lowest compression ratio; 9 is the slowest but has
the best compression ratio.
.
.
.TP
.B \-\-contiguous
Aggregation configuration: Requires that aggregated fields have
adjacent dimension coordinate cells which partially overlap or share
common boundary values. Ignored if the dimension coordinates do not
have bounds.
.
.
.TP
.B \-d dir, \-\-directory=dir
Specify the output directory for all output files.
.
.
.TP
.B \-\-double
Write 32-bit floats as 64-bit floats and 32-bit integers as 64-bit
integers. By default, input data types are preserved.
.
.
.TP
.B \-\-equal=property
Aggregation configuration: Require that an input field may only be
aggregated with other fields if they all have the given CF property
(standard or non-standard) with equal values. Ignored for any input
field which does not have this property, or if the property is already
accounted for in the structural signature.

Supersedes the behaviour for the given property that may be implied by
the
.ft B
\-\-exist_all
.ft P
option.

Multiple properties may be set by specifying more than one
.ft B
\-\-equal
.ft P
option.
.
.
.TP
.B \-\-equal_all
Aggregation configuration: Require that an input field may only be
aggregated with other fields that have the same set of CF properties
(excluding those already accounted for in the structural signature)
with equal sets of values.

The behaviour for individual properties may be overridden by the
.ft B
\-\-exist
.ft P
and
.ft B
\-\-ignore
.ft P
options.

For example, to insist that a group of aggregated input fields must
all have the same CF properties (other than those accounted for in the
structural signature) with matching values, but allowing the long_name
properties to have unequal values, you can use
.ft B
\-\-equal_all \-\-exist=long_name
.ft P
.
.
.TP
.B \-\-exist=property
Aggregation configuration: Require that an input field may only be
aggregated with other fields if they all have the given CF property
(standard or non-standard), but not requiring the values to be the
same. Ignored for any input field which does not have this property,
or if the property is already accounted for in the structural
signature.

Supersedes the behaviour for the given property that may be implied by
the
.ft B
\-\-equal_all
.ft P
option.

Multiple properties may be set by specifying more than one
.ft B
\-\-exist
.ft P
option.
.
.
.TP
.B \-\-exist_all
Aggregation configuration: Require that an input field may only be
aggregated with other fields that have the same set of CF properties
(excluding those already accounted for in the structural signature),
but not requiring the values to be the same.

The behaviour for individual properties may be overridden by the
.ft B
\-\-equal
.ft P
and
.ft B
\-\-ignore
.ft P
options.

For example, to insist that a group of aggregated input fields must
all have the same CF properties (other than those accounted for in the
structural signature), regardless of their values, but also insisting
that the long_name properties have equal values, you can use
.ft B
\-\-exist_all \-\-equal=long_name
.ft P
.
.
.TP
.B \-f format, \-\-format=format
Set the format of the output file(s). Valid choices are
NETCDF3_CLASSIC, NETCDF3_64BIT, NETCDF4 and NETCDF4_CLASSIC for
outputting CF\-netCDF files in those netCDF formats, and CFA3 or CFA4
for outputting CFA\-netCDF files in NETCDF3_CLASSIC or NETCDF4 formats
respectively. By default, NETCDF3_CLASSIC is assumed.
.PP
.RS
Note that the netCDF3 formats are generally slower to write than the
netCDF4 formats, by several orders of magnitude if files with many
data variables are involved. However, not all software can read
netCDF4, so it is advisable to check before writing in this format.
.RE
.
.
.TP
.B \-h, \-\-help
Display this man page.
.
.
.TP
.B \-i, \-\-relaxed_identities
Aggregation configuration: In the absence of standard names, allow
fields and their components (such as coordinates) to be identified by
their long_name CF properties or else their netCDF file variable
names.
.
.
.TP
.B \-\-ignore=property
Aggregation configuration: An input field may be aggregated with other
fields regardless of whether or not they have the given CF property
(standard or non-standard) and regardless of its values. Ignored for
any input field which does not have this property, or if the property
is already accounted for in the structural signature.

This is the default behaviour in the absence of all the
.ft B
\-\-exist \-\-equal \-\-exist_all \-\-equal_all
.ft P
options and supersedes the behaviour for the given property that may
be implied if any of these options are set.

Multiple properties may be set by specifying more than one
.ft B
\-\-ignore
.ft P
option.

For example, to insist that a group of aggregated input fields must
all have the same CF properties (other than those accounted for in the
structural signature) with the same values, but with no restrictions
on the existence or values of the long_name property you can use
.ft B
\-\-equal_all \-\-ignore=long_name
.ft P
.
.
.TP
.B \-\-fletcher32
Activate the Fletcher-32 HDF5 checksum algorithm to detect compression
errors. Ignored if there is no compression (see the
.ft B
\-\-compress
.ft P
option).
.
.
.TP
.B \-\-follow_symlinks
In combination with
.ft B
\-\-recursive
.ft P
also search for files in directories which are reached through
symbolic links. Files specified by the
.ft B
INPUTS
.ft P
which are symbolic links are always followed. Note that setting
.ft B
\-\-recursive \-\-follow_symlinks
.ft P
can lead to infinite recursion if a symbolic link points to a parent
directory of itself.
.
.
.TP
.B \-\-ignore_read_error
Ignore, without failing, any input file which causes an error whilst
being read, as would be the case for an empty file, unknown file
format, etc. By default an error occurs in this case.
.
.
.TP
.B \-\-info=N
Aggregation configuration: Print information about the aggregation
process. If N is 0 then no information is displayed. If N is 1 or more
then display information on which fields are unaggregatable, and
why. If N is 2 or more then display the field structural signatures
and, when there is more than one field with the same structural
signature, their canonical first and last coordinate values. If N is 3
or more then display the complete aggregation metadata of each field.

By default N is 0.
.
.
.TP
.B \-\-least_sig_digit=N
Truncate the input field data arrays. For a positive integer N the
precision that is retained in the compressed data is '10 to the power
-N'. For example, if N is 2 then a precision of 0.01 is retained. In
conjunction with compression this produces 'lossy', but significantly
more efficient compression (see the
.ft B
\-\-compress
.ft P
option).
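
As a rough illustration of what a retained precision of 10 to the
power \-N means, the following Python 3 sketch rounds values to N
decimal places. This is an assumption made for illustration only: the
function name truncate is hypothetical, and the library that writes
the file may implement the truncation differently.
.RS
.nf
```python
def truncate(values, n):
    """Round each value to n decimal places (illustrative stand-in)."""
    return [round(value, n) for value in values]

# With n = 2 no value moves by more than about 0.005.
original = [1.23456, 2.71828, 100.0]
kept = truncate(original, 2)
largest_change = max(abs(a - b) for a, b in zip(original, kept))
```
.fi
.RE
Values which differ only beyond the retained digits become identical,
which is what makes the data easier to compress.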
.
.
.TP
.B \-\-ncvar_identities
Aggregation configuration: Force fields and their components (such as
coordinates) to be identified by their netCDF file variable names.
.
.
.TP
.B \-n, \-\-no_aggregation
Aggregation configuration: Do not aggregate fields. Writes the input
fields as they exist in the input files.
.
.
.TP
.B \-\-no_overlap
Aggregation configuration: Requires that aggregated fields have
adjacent dimension coordinate cells which do not overlap (but they may
share common boundary values). Ignored if the dimension coordinates do
not have bounds.
.
.
.TP
.B \-\-no_shuffle
Turn off the HDF5 shuffle filter, which de\-interlaces a block of data
before compression by reordering its bytes: the first byte of each of
a variable's values in the chunk is stored contiguously, followed by
all the second bytes, and so on. By default the filter is applied
because, if the data array values are not all wildly different, it can
make the data more easily compressible. Ignored if
there is no compression (see the
.ft B
\-\-compress
.ft P
option).
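
The byte reordering described above can be sketched in Python 3 as
follows. It is a toy model of the idea for 32\-bit big\-endian
integers, not the HDF5 filter itself, and the function name shuffle is
hypothetical.
.RS
.nf
```python
import struct

def shuffle(values):
    """Byte-shuffle a list of 32-bit integers (illustration only)."""
    size = 4  # bytes per value
    raw = struct.pack(">%di" % len(values), *values)
    # Byte 0 of every value, then byte 1 of every value, and so on.
    return b"".join(raw[i::size] for i in range(size))

shuffled = shuffle([1, 2, 3])
```
.fi
.RE
For small integers such as these, the three high\-order bytes of every
value are zero, so the shuffled block begins with a long run of zero
bytes that a compressor can encode compactly.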
.
.
.TP
.B \-o file, \-\-outfile=file
Treat all input files collectively as a single CF dataset. In this
case aggregation is attempted within and between the input files and
all outputs are written to the specified file.
.
.
.TP
.B \-\-overwrite
Allow pre\-existing output files to be overwritten.
.
.
.TP
.B \-\-promote=component
Promote field components to independent top-level fields. If component
is ancillary then ancillary data fields are promoted. If component is
auxiliary then auxiliary coordinate variables are promoted. If
component is measure then cell measure variables are promoted. If
component is reference then fields pointed to from formula_terms
attributes are promoted. If component is field then all component
fields are promoted.

Multiple component types may be promoted by specifying more than one
.ft B
\-\-promote
.ft P
option.

For example, to promote ancillary data fields and cell measure
variables to independent, top\-level fields you can use
.ft B
\-\-promote=ancillary \-\-promote=measure
.ft P
.
.
.TP
.B \-\-recursive
Allow directories to be specified by the
.ft B
INPUTS
.ft P
and recursively search the directories for actual files to read. Set
the
.ft B
\-\-ignore_read_error
.ft P
option to bypass any unreadable files and the
.ft B
\-\-follow_symlinks
.ft P
option to allow directories to be symbolic links.
.
.
.TP
.B \-\-reference_datetime=datetime
Set the reference date-time of time coordinate units to an ISO
8601-like date-time. Changing the reference date-time does not change
the absolute date-times of the coordinates. Ignored for non-reference
date-time coordinates. Some examples of valid date-times: 1830-12-1,
"1830-12-09 2:34:45Z".
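
The statement that absolute date\-times are unchanged can be checked
with a small Python 3 sketch. It assumes units of days and the
calendar of the Python datetime module, and the function name rebase
is hypothetical; it is not how cfa performs the conversion.
.RS
.nf
```python
from datetime import datetime, timedelta

def rebase(values, old_ref, new_ref):
    """Re-express days since old_ref as days since new_ref."""
    shift = (old_ref - new_ref) / timedelta(days=1)
    return [value + shift for value in values]

old_ref = datetime(2000, 1, 1)
new_ref = datetime(1999, 12, 31)
old_values = [0.0, 1.5]
new_values = rebase(old_values, old_ref, new_ref)

# The numbers change, but each one still denotes the same date-time.
same = all(
    old_ref + timedelta(days=a) == new_ref + timedelta(days=b)
    for a, b in zip(old_values, new_values)
)
```
.fi
.RE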
.
.
.TP
.B \-\-respect_valid
Aggregation configuration: Take into account the CF properties
valid_max, valid_min and valid_range during aggregation. By default
they are ignored for the purposes of aggregation and deleted from any
aggregated output CF fields.
.
.
.TP
.B \-\-shared_nc_domain
Aggregation configuration: Match axes between a field and its
contained ancillary variable and coordinate reference fields via their
netCDF dimension names and not via their domains.
.
.
.TP
.B \-\-single
Write 64-bit floats as 32-bit floats and 64-bit integers as 32-bit
integers. By default, input data types are preserved.
.
.
.TP
.B \-\-squeeze
Remove size 1 axes from the output field data arrays. If a size one
axis has any one dimensional coordinates then these are converted to
CF scalar coordinates.
.
.
.TP
.B \-u, \-\-relaxed_units
Aggregation configuration: Assume that fields or their components
(such as coordinates) with the same standard name (or other
identifiers, see the
.ft B
\-i
.ft P
option) but missing units all have equivalent (but unspecified) units,
so that aggregation may occur. This is the default for Met Office (UK)
PP files and Met Office (UK) fields files, but not for other formats.
.
.
.TP
.B \-\-unsqueeze
Include size 1 axes in the output field data arrays. If a size one
axis has any CF scalar coordinates then these are converted to one
dimensional coordinates.
.
.
.TP
.B \-\-um_version=version
For Met Office (UK) PP files and Met Office (UK) fields files only,
the Unified Model (UM) version to be used when decoding the
header. Valid versions are, for example, 4.2, 6.6.3 and 8.2. The
default version is 4.5. In general, the given version is ignored if it
can be inferred from the header (which is usually the case for files
created by the UM at versions 5.3 and later). The exception to this is
when the given version has a third element (such as the 3 in 6.6.3),
in which case any version in the header is ignored. This option is
ignored for input files which are not Met Office (UK) PP files or Met
Office (UK) fields files.
.
.
.TP
.B \-\-unlimited=axis
Create an unlimited dimension (a dimension that can be appended to). A
dimension is identified by either a standard name; one of T, Z, Y, X
denoting time, height or horizontal axes (as defined by the CF
conventions); or the value of an arbitrary CF property preceded by
the property name and a colon.

Multiple unlimited axes may be defined by specifying more than one
.ft B
\-\-unlimited
.ft P
option. Note, however, that only netCDF4 formats support multiple
unlimited dimensions. For example, to set the time and Z dimensions to
be unlimited you could use
.ft B
\-\-unlimited=time \-\-unlimited=Z 
.ft P

An example of defining an axis by an arbitrary CF property could be
.ft B
\-\-unlimited=long_name:pseudo_level
.ft P
.
.
.TP
.B \-v, \-\-verbose
Display a one\-line summary of each output CF field.
.
.
.TP
.B \-x, \-\-exclude
Aggregation configuration: Omit unaggregatable fields from the
output. Ignored if the
.ft B
\-n
.ft P
option is set. See the AGGREGATION section for the definition of an
unaggregatable field.
.
.
.
.SH SEE ALSO
cfdump(1)
.
.
.
.SH LIBRARY
cf\-python library version 1.3.1
.
.
.
.SH BUGS
Reports of bugs are welcome at http://cfpython.bitbucket.org/
.
.
.
.SH LICENSE
Open Source Initiative MIT License
.
.
.
.SH AUTHOR
David Hassell