3.1.2 Missing Data

It is possible to represent missing data explicitly in Octave using NA (short for “Not Available”). This is helpful in distinguishing between a property of the data (i.e., some of it was not recorded) and calculations on the data which generated an error (i.e., created NaN values). In short, if you do not get the result you expect, is the cause your data or your algorithm?

The missing data marker is a special case of the representation of NaN. Because of that, it can only be used with data represented by floating point numbers; integer, logical, and char values cannot hold NA.

In general, use NA and the test isna to describe the dataset or to reduce the dataset to only valid entries. Numerical calculations with NA will generally "poison" the results and conclude with an output of NA. However, this cannot be guaranteed on all platforms, and NA may be replaced by NaN.

Example 1: Describing the dataset

data = [1, NA, 3];
percent_missing = 100 * sum (isna (data(:))) / numel (data);
printf ('%2.0f%% of the dataset is missing\n', percent_missing);
-| 33% of the dataset is missing

Example 2: Restrict calculations to valid data

raw_data = [1, NA, 3];
printf ('mean of raw data is %.1f\n', mean (raw_data));
-| mean of raw data is NA
valid_data = raw_data (! isna (raw_data));
printf ('mean of valid data is %.1f\n', mean (valid_data));
-| mean of valid data is 2.0
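
Logical indexing as in Example 2 flattens the result, which is inconvenient when the data is a matrix and a statistic is wanted per column. The following is a minimal sketch of one possible approach, not the only one; the variable names (data, valid, filled, col_mean) and the sample values are assumptions chosen for illustration.

```octave
# Hypothetical dataset: one variable per column, NA marks a missing reading.
data = [1, 10; NA, 20; 3, NA];

# Logical mask of the entries that were actually recorded.
valid = ! isna (data);

# Zero out the missing entries so they do not contribute to the column sums.
filled = data;
filled(! valid) = 0;

# Per-column mean over valid entries only.
col_mean = sum (filled, 1) ./ sum (valid, 1);
```

A column that contains no valid entries at all would produce a division of zero by zero, so such columns may need separate handling.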
 
val = NA
val = NA (n)
val = NA (n, m)
val = NA (n, m, k, …)
val = NA (…, "like", var)
val = NA (…, class)

Return a scalar, matrix, or N-dimensional array whose elements are all equal to the special constant NA (Not Available) used to designate missing values.

Note that NA always compares not equal to NA (NA != NA). To find NA values, use the isna function.
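
The difference between comparing against NA and testing with isna can be illustrated with a short sketch. The variable names (x, eq_mask, na_mask, idx) and the sample values are assumptions chosen for illustration.

```octave
# Hypothetical vector with two missing entries.
x = [4, NA, 7, NA];

# Comparing with == never matches, because NA compares not equal to NA.
eq_mask = (x == NA);    # every element is false

# isna identifies the missing entries.
na_mask = isna (x);     # true at positions 2 and 4
idx = find (na_mask);   # indices of the missing entries
```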

When called with no arguments, return a scalar with the value ‘NA’.

When called with a single argument, return a square matrix with the dimension specified.

When called with more than one scalar argument, the first two arguments are taken as the number of rows and columns, and any further arguments specify additional matrix dimensions.

If a variable var is specified after "like", the output val will have the same data type, complexity, and sparsity as var.

The optional argument class specifies the return type and may be either "double" or "single".
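
As an illustration of the calling forms described above, the following sketch creates NA arrays of different sizes and types. The variable names (a, b, ref, c) are assumptions chosen for illustration.

```octave
# 2x3 matrix of NA values; the default type is double.
a = NA (2, 3);

# A single dimension gives a square matrix; here a 2x2 matrix of type single.
b = NA (2, "single");

# Hypothetical reference variable whose type the output should follow.
ref = single ([1, 2]);
c = NA (1, 4, "like", ref);   # 1x4 array with the same data type as ref
```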

Programming Note: The missing data marker NA is a special case of the representation of NaN. Numerical calculations with NA will generally "poison" the results and conclude with an output of NA. However, this cannot be guaranteed on all platforms, and NA may be replaced by NaN. See Missing Data.

See also: isna.

 
tf = isna (x)

Return a logical array which is true where the elements of x are NA (missing) values and false where they are not.

For example:

isna ([13, Inf, NA, NaN])
     ⇒ [ 0, 0, 1, 0 ]

See also: isnan, isinf, isfinite.
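
Because NA is a special case of NaN, isnan reports true for NA as well as for ordinary NaN values, whereas isna reports true for NA only. The sketch below shows one way to separate the two kinds of entries; the variable names (x, missing, any_nan, computed_nan) are assumptions chosen for illustration.

```octave
# Hypothetical vector mixing a missing value with a NaN from a calculation.
x = [13, Inf, NA, NaN];

missing = isna (x);     # true only for the NA entry
any_nan = isnan (x);    # true for both the NA and the NaN entries

# Entries that are NaN but were not marked as missing.
computed_nan = any_nan & ! missing;
```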