/* This is part of the netCDF package.
Copyright 2018 University Corporation for Atmospheric Research/Unidata.
See COPYRIGHT file for conditions of use.
This is a very simple example which tests NFC normalization of
Unicode names encoded with UTF-8.
$Id: tst_norm.c 2792 2014-10-27 06:02:59Z wkliao $
*/
#include <config.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <netcdf.h>
#ifdef USE_PARALLEL
#include <netcdf_par.h>
#endif
#include <nc_tests.h>
#include "err_macros.h"
/* The data file we will create. */
#define FILE7_NAME "tst_norm.nc"
#define UNITS "units"
#define NDIMS 1
#define NX 18
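/* Illustrative sketch, not part of the original test and not called by it:
 * a hypothetical helper, utf8_codepoints(), that counts Unicode code points
 * in a NUL-terminated UTF-8 string by skipping continuation bytes (10xxxxxx).
 * Assuming well-formed UTF-8, it would report 32 code points for the
 * unnormalized name below (16 base letters plus 16 combining marks) and 16
 * for the NFC-normalized name, which is one way to see what normalization
 * changes. It does not validate its input. */
#include <stddef.h>
static size_t
utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80) /* not a continuation byte */
            n++;
    return n;
}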
int
main(int argc, char **argv)
{
    int ncid, dimid, varid;
    int dimids[NDIMS];
    /* Unnormalized (decomposed) UTF-8 encoding of 16 accented Latin
     * capital letters: each base letter is followed by a combining mark. */
    unsigned char uname_utf8[] = {
        0x41,       /* LATIN CAPITAL LETTER A */
        0xCC, 0x80, /* COMBINING GRAVE ACCENT */
        0x41,       /* LATIN CAPITAL LETTER A */
        0xCC, 0x81, /* COMBINING ACUTE ACCENT */
        0x41,       /* LATIN CAPITAL LETTER A */
        0xCC, 0x82, /* COMBINING CIRCUMFLEX ACCENT */
        0x41,       /* LATIN CAPITAL LETTER A */
        0xCC, 0x83, /* COMBINING TILDE */
        0x41,       /* LATIN CAPITAL LETTER A */
        0xCC, 0x88, /* COMBINING DIAERESIS */
        0x41,       /* LATIN CAPITAL LETTER A */
        0xCC, 0x8A, /* COMBINING RING ABOVE */
        0x43,       /* LATIN CAPITAL LETTER C */
        0xCC, 0xA7, /* COMBINING CEDILLA */
        0x45,       /* LATIN CAPITAL LETTER E */
        0xCC, 0x80, /* COMBINING GRAVE ACCENT */
        0x45,       /* LATIN CAPITAL LETTER E */
        0xCC, 0x81, /* COMBINING ACUTE ACCENT */
        0x45,       /* LATIN CAPITAL LETTER E */
        0xCC, 0x82, /* COMBINING CIRCUMFLEX ACCENT */
        0x45,       /* LATIN CAPITAL LETTER E */
        0xCC, 0x88, /* COMBINING DIAERESIS */
        0x49,       /* LATIN CAPITAL LETTER I */
        0xCC, 0x80, /* COMBINING GRAVE ACCENT */
        0x49,       /* LATIN CAPITAL LETTER I */
        0xCC, 0x81, /* COMBINING ACUTE ACCENT */
        0x49,       /* LATIN CAPITAL LETTER I */
        0xCC, 0x82, /* COMBINING CIRCUMFLEX ACCENT */
        0x49,       /* LATIN CAPITAL LETTER I */
        0xCC, 0x88, /* COMBINING DIAERESIS */
        0x4E,       /* LATIN CAPITAL LETTER N */
        0xCC, 0x83, /* COMBINING TILDE */
        0x00
    };
    /* NFC normalized UTF-8 encoding for same Unicode string: */
    unsigned char nname_utf8[] = {
        0xC3, 0x80, /* LATIN CAPITAL LETTER A WITH GRAVE */
        0xC3, 0x81, /* LATIN CAPITAL LETTER A WITH ACUTE */
        0xC3, 0x82, /* LATIN CAPITAL LETTER A WITH CIRCUMFLEX */
        0xC3, 0x83, /* LATIN CAPITAL LETTER A WITH TILDE */
        0xC3, 0x84, /* LATIN CAPITAL LETTER A WITH DIAERESIS */
        0xC3, 0x85, /* LATIN CAPITAL LETTER A WITH RING ABOVE */
        0xC3, 0x87, /* LATIN CAPITAL LETTER C WITH CEDILLA */
        0xC3, 0x88, /* LATIN CAPITAL LETTER E WITH GRAVE */
        0xC3, 0x89, /* LATIN CAPITAL LETTER E WITH ACUTE */
        0xC3, 0x8A, /* LATIN CAPITAL LETTER E WITH CIRCUMFLEX */
        0xC3, 0x8B, /* LATIN CAPITAL LETTER E WITH DIAERESIS */
        0xC3, 0x8C, /* LATIN CAPITAL LETTER I WITH GRAVE */
        0xC3, 0x8D, /* LATIN CAPITAL LETTER I WITH ACUTE */
        0xC3, 0x8E, /* LATIN CAPITAL LETTER I WITH CIRCUMFLEX */
        0xC3, 0x8F, /* LATIN CAPITAL LETTER I WITH DIAERESIS */
        0xC3, 0x91, /* LATIN CAPITAL LETTER N WITH TILDE */
        0x00
    };
    /* Unnormalized name used for dimension, variable, and attribute value */
#define UNAME ((char *) uname_utf8)
#define UNAMELEN (sizeof uname_utf8)
    /* Normalized name */
#define NNAME ((char *) nname_utf8)
#define NNAMELEN (sizeof nname_utf8)
    char name_in[UNAMELEN + 1], strings_in[UNAMELEN + 1];
    nc_type att_type;
    size_t att_len;
    int res;
    int dimid_in, varid_in, attnum_in;
    int attvals[] = {42};
#define ATTNUM ((sizeof attvals)/(sizeof attvals[0]))
#ifdef TEST_PNETCDF
    MPI_Init(&argc, &argv);
#endif
    printf("\n*** testing UTF-8 normalization...");
#ifdef TEST_PNETCDF
    if ((res = nc_create_par(FILE7_NAME, NC_CLOBBER, MPI_COMM_WORLD, MPI_INFO_NULL, &ncid)))
#else
    if ((res = nc_create(FILE7_NAME, NC_CLOBBER, &ncid)))
#endif
        ERR;
    /* Define dimension with unnormalized Unicode UTF-8 encoded name */
    if ((res = nc_def_dim(ncid, UNAME, NX, &dimid)))
        ERR;
    dimids[0] = dimid;
    /* Define variable with same name */
    if ((res = nc_def_var(ncid, UNAME, NC_CHAR, NDIMS, dimids, &varid)))
        ERR;
    /* Create string attribute with same value */
    if ((res = nc_put_att_text(ncid, varid, UNITS, UNAMELEN, UNAME)))
        ERR;
    /* Create int attribute with same name */
    if ((res = nc_put_att_int(ncid, varid, UNAME, NC_INT, ATTNUM, attvals)))
        ERR;
    /* Try to create dimension and variable with NFC-normalized
     * version of same name. These should fail, as unnormalized name
     * should have been normalized in library, so these are attempts to
     * create duplicate netCDF objects. */
    if ((res = nc_def_dim(ncid, NNAME, NX, &dimid)) != NC_ENAMEINUSE)
        ERR;
    if ((res = nc_def_var(ncid, NNAME, NC_CHAR, NDIMS, dimids, &varid)) != NC_ENAMEINUSE)
        ERR;
    if ((res = nc_enddef(ncid)))
        ERR;
    /* Write string data, UTF-8 encoded, to the file */
    if ((res = nc_put_var_text(ncid, varid, UNAME)))
        ERR;
    if ((res = nc_close(ncid)))
        ERR;
    /* Reopen the file and check that the names were stored in
     * NFC-normalized form and can be looked up by either spelling. */
#ifdef TEST_PNETCDF
    if ((res = nc_open_par(FILE7_NAME, NC_NOWRITE, MPI_COMM_WORLD, MPI_INFO_NULL, &ncid)))
#else
    if ((res = nc_open(FILE7_NAME, NC_NOWRITE, &ncid)))
#endif
        ERR;
    if ((res = nc_inq_varid(ncid, UNAME, &varid)))
        ERR;
    if ((res = nc_inq_varname(ncid, varid, name_in)))
        ERR;
    /* The name read back should be the normalized one. */
    if ((res = strncmp(NNAME, name_in, NNAMELEN)))
        ERR;
    if ((res = nc_inq_varid(ncid, NNAME, &varid_in)) || varid != varid_in)
        ERR;
    if ((res = nc_inq_dimid(ncid, UNAME, &dimid_in)) || dimid != dimid_in)
        ERR;
    if ((res = nc_inq_dimid(ncid, NNAME, &dimid_in)) || dimid != dimid_in)
        ERR;
    /* Attribute values, unlike names, are stored as written (unnormalized). */
    if ((res = nc_inq_att(ncid, varid, UNITS, &att_type, &att_len)))
        ERR;
    if (att_type != NC_CHAR || att_len != UNAMELEN)
        ERR;
    if ((res = nc_get_att_text(ncid, varid, UNITS, strings_in)))
        ERR;
    strings_in[UNAMELEN] = '\0';
    if ((res = strncmp(UNAME, strings_in, UNAMELEN)))
        ERR;
    /* The int attribute was the second attribute defined on the variable,
     * so its zero-based attribute number is 1, which equals ATTNUM here. */
    if ((res = nc_inq_attid(ncid, varid, UNAME, &attnum_in)) || ATTNUM != attnum_in)
        ERR;
    if ((res = nc_inq_attid(ncid, varid, NNAME, &attnum_in)) || ATTNUM != attnum_in)
        ERR;
    if ((res = nc_close(ncid)))
        ERR;
    SUMMARIZE_ERR;
#ifdef TEST_PNETCDF
    MPI_Finalize();
#endif
    FINAL_RESULTS;
    return 0;
}