/*!
\if MANPAGES
\page dcm2xml Convert DICOM file and data set to XML
\else
\page dcm2xml dcm2xml: Convert DICOM file and data set to XML
\endif
\section dcm2xml_synopsis SYNOPSIS
\verbatim
dcm2xml [options] dcmfile-in [xmlfile-out]
\endverbatim
\section dcm2xml_description DESCRIPTION
The \b dcm2xml utility converts the contents of a DICOM file (file format or
raw data set) to XML (Extensible Markup Language). There are two output
formats. The first one is specific to DCMTK with its DTD (Document Type
Definition) described in the file <em>dcm2xml.dtd</em>. The second one refers
to the "Native DICOM Model" which is specified for the DICOM Application
Hosting service found in DICOM part 19.
If \b dcm2xml reads a raw data set (DICOM data without a file format
meta-header), it will attempt to guess the transfer syntax by examining the
first few bytes of the file. It is not always possible to guess the transfer
syntax correctly, so it is better to convert a data set to a file format
whenever possible (using the \b dcmconv utility). It is also possible to use
the \e -f and <em>-t[ieb]</em> options to force \b dcm2xml to read a data set
with a particular transfer syntax.
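
When \b dcm2xml is called from a script, the forced-read options can be
assembled programmatically. The following Python sketch only builds an argument
list from options documented on this page. The helper name
<em>dcm2xml_command</em>, the dictionary keys and the file names are
hypothetical, and nothing is executed unless the resulting list is passed to,
e.g., <em>subprocess.run</em>.

```python
# Sketch: build a dcm2xml command line that forces a transfer syntax for a
# raw data set. Only options documented on this page are used; the helper
# name, the dictionary keys and the file names are illustrative assumptions.

# Short options for the three transfer syntaxes that can be forced.
XFER_OPTIONS = {
    "implicit-little": "-ti",   # --read-xfer-implicit
    "explicit-little": "-te",   # --read-xfer-little
    "explicit-big": "-tb",      # --read-xfer-big
}


def dcm2xml_command(infile, outfile=None, xfer=None):
    """Return the argument list for a dcm2xml call.

    If xfer is given, the input is read as a data set without meta-header
    (-f) using the requested transfer syntax; otherwise the defaults apply.
    """
    args = ["dcm2xml"]
    if xfer is not None:
        args += ["-f", XFER_OPTIONS[xfer]]
    args.append(infile)
    if outfile is not None:
        args.append(outfile)   # omitted: XML is written to stdout
    return args


command = dcm2xml_command("dataset.dcm", "dataset.xml", xfer="implicit-little")
print(command)
```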
\section dcm2xml_parameters PARAMETERS
\verbatim
dcmfile-in   DICOM input filename to be converted ("-" for stdin)

xmlfile-out  XML output filename (default: stdout)
\endverbatim
\section dcm2xml_options OPTIONS
\subsection dcm2xml_general_options general options
\verbatim
  -h    --help
          print this help text and exit

        --version
          print version information and exit

        --arguments
          print expanded command line arguments

  -q    --quiet
          quiet mode, print no warnings and errors

  -v    --verbose
          verbose mode, print processing details

  -d    --debug
          debug mode, print debug information

  -ll   --log-level  [l]evel: string constant
          (fatal, error, warn, info, debug, trace)
          use level l for the logger

  -lc   --log-config  [f]ilename: string
          use config file f for the logger
\endverbatim
\subsection dcm2xml_input_options input options
\verbatim
input file format:

  +f    --read-file
          read file format or data set (default)

  +fo   --read-file-only
          read file format only

  -f    --read-dataset
          read data set without file meta information

input transfer syntax:

  -t=   --read-xfer-auto
          use TS recognition (default)

  -td   --read-xfer-detect
          ignore TS specified in the file meta header

  -te   --read-xfer-little
          read with explicit VR little endian TS

  -tb   --read-xfer-big
          read with explicit VR big endian TS

  -ti   --read-xfer-implicit
          read with implicit VR little endian TS

long tag values:

  +M    --load-all
          load very long tag values (e.g. pixel data)

  -M    --load-short
          do not load very long values (default)

  +R    --max-read-length  [k]bytes: integer (4..4194302, default: 4)
          set threshold for long values to k kbytes
\endverbatim
\subsection dcm2xml_processing_options processing options
\verbatim
specific character set:

  +Cr   --charset-require
          require declaration of extended charset (default)

  +Ca   --charset-assume  [c]harset: string
          assume charset c if no extended charset declared

  +Cc   --charset-check-all
          check all data elements with string values
          (default: only PN, LO, LT, SH, ST, UC and UT)

          # this option is only used for the extended check whether
          # the Specific Character Set (0008,0005) attribute should be
          # present, but not for the conversion of unaffected element
          # values to UTF-8 (e.g. element values with a VR of CS)

  +U8   --convert-to-utf8
          convert all element values that are affected
          by Specific Character Set (0008,0005) to UTF-8

          # requires support from an underlying character encoding
          # library (see output of --version on which one is available)
\endverbatim
\subsection dcm2xml_output_options output options
\verbatim
general XML format:

  -dtk  --dcmtk-format
          output in DCMTK-specific format (default)

  -nat  --native-format
          output in Native DICOM Model format (part 19)

  +Xn   --use-xml-namespace
          add XML namespace declaration to root element

DCMTK-specific format (not with --native-format):

  +Xd   --add-dtd-reference
          add reference to document type definition (DTD)

  +Xe   --embed-dtd-content
          embed document type definition into XML document

  +Xf   --use-dtd-file  [f]ilename: string
          use specified DTD file (only with +Xe)
          (default: /usr/local/share/dcmtk-<VERSION>/dcm2xml.dtd)

  +Wn   --write-element-name
          write name of the DICOM data elements (default)

  -Wn   --no-element-name
          do not write name of the DICOM data elements

  +Wb   --write-binary-data
          write binary data of OB and OW elements
          (default: off, be careful with --load-all)

encoding of binary data:

  +Eh   --encode-hex
          encode binary data as hex numbers
          (default for DCMTK-specific format)

  +Eu   --encode-uuid
          encode binary data as a UUID reference
          (default for Native DICOM Model)

  +Eb   --encode-base64
          encode binary data as Base64 (RFC 2045, MIME)
\endverbatim
\section dcm2xml_dcmtk_format DCMTK Format
The basic structure of the DCMTK-specific XML output created from a DICOM file
looks like the following:
\verbatim
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE file-format SYSTEM "dcm2xml.dtd">
<file-format xmlns="http://dicom.offis.de/dcmtk">
  <meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
    <element tag="0002,0000" vr="UL" vm="1" len="4"
             name="MetaElementGroupLength">
      166
    </element>
    ...
    <element tag="0002,0013" vr="SH" vm="1" len="16"
             name="ImplementationVersionName">
      OFFIS_DCMTK_353
    </element>
  </meta-header>
  <data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
    <element tag="0008,0005" vr="CS" vm="1" len="10"
             name="SpecificCharacterSet">
      ISO_IR 100
    </element>
    ...
    <sequence tag="0028,3010" vr="SQ" card="2" name="VOILUTSequence">
      <item card="3">
        <element tag="0028,3002" vr="xs" vm="3" len="6"
                 name="LUTDescriptor">
          256\0\8
        </element>
        ...
      </item>
      ...
    </sequence>
    ...
    <element tag="7fe0,0010" vr="OW" vm="1" len="262144"
             name="PixelData" loaded="no" binary="hidden">
    </element>
  </data-set>
</file-format>
\endverbatim
The "file-format" and "meta-header" tags are absent for DICOM data sets.
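
The structure above can be read with any XML parser. The following Python
sketch uses only the standard library and handles both cases, i.e. output with
and without the "file-format" wrapper. The sample document is hand-written and
shortened for illustration (it is not real tool output), and the helper names
are hypothetical. XML namespaces are ignored so that the code does not depend
on whether \e --use-xml-namespace was given.

```python
# Sketch: read DCMTK-specific dcm2xml output with the standard library.
# SAMPLE is a shortened, hand-written document modelled on the structure
# shown above; it is not real tool output.
import xml.etree.ElementTree as ET

SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<file-format xmlns="http://dicom.offis.de/dcmtk">
<meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0002,0000" vr="UL" vm="1" len="4" name="MetaElementGroupLength">166</element>
</meta-header>
<data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
<element tag="0008,0005" vr="CS" vm="1" len="10" name="SpecificCharacterSet">ISO_IR 100</element>
<element tag="7fe0,0010" vr="OW" vm="1" len="262144" name="PixelData" loaded="no" binary="hidden"></element>
</data-set>
</file-format>
"""


def local_name(element):
    """Return the tag name without a leading "{namespace}" prefix."""
    return element.tag.rsplit("}", 1)[-1]


def find_data_set(root):
    """Return the data-set element for file format or raw data set input."""
    if local_name(root) == "data-set":   # raw data set: no file-format wrapper
        return root
    for child in root:
        if local_name(child) == "data-set":
            return child
    return None


def top_level_elements(data_set):
    """Map tag -> stripped text for the direct "element" children."""
    return {child.get("tag"): (child.text or "").strip()
            for child in data_set if local_name(child) == "element"}


root = ET.fromstring(SAMPLE)
data_set = find_data_set(root)
print(data_set.get("xfer"))
print(top_level_elements(data_set))
```

Sequences and items are skipped here; a complete reader would descend into
"sequence" and "item" elements recursively.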
\subsection dcm2xml_xml_encoding XML Encoding
Attributes with very large value fields (e.g. pixel data) are not loaded by
default. They can be identified by the additional attribute "loaded" with a
value of "no" (see example above). The command line option \e --load-all
forces all value fields to be loaded, including the very long ones.

Furthermore, binary data of OB and OW attributes are not written to the XML
output file by default. These elements can be identified by the additional
attribute "binary" with a value of "hidden" (default is "no"). The command line
option \e --write-binary-data causes binary value fields to be printed as well
(the attribute value is then "yes" or "base64"). However, be careful when using
this option together with \e --load-all because of the large amounts of pixel
data that might be printed to the output. Please note that in this context
element values with a VR of OD, OF, OL and OV are not regarded as "binary data".
Multiple values (i.e. where the DICOM value multiplicity is greater than 1)
are separated by a backslash "\" (except for Base64 encoded data). The "len"
attribute indicates the number of bytes for the particular value field as
stored in the DICOM data set, i.e. it might deviate from the XML encoded value
length e.g. because of non-significant padding that has been removed. If this
attribute is missing in "sequence" or "item" start tags, the corresponding
DICOM element has been stored with undefined length.
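
The attribute conventions described in this section can be applied when
post-processing the XML output. The following Python sketch is one possible
interpretation and not part of DCMTK: the helper name <em>describe</em> and
the layout of the returned dictionary are assumptions, and the two sample
elements are copied from the example above.

```python
# Sketch: interpret the attributes of a DCMTK-format <element> as described
# in this section. Helper name and result layout are illustrative assumptions.
import xml.etree.ElementTree as ET


def describe(element):
    """Summarise one <element> node."""
    binary = element.get("binary", "no")       # "no" is the documented default
    text = (element.text or "").strip()
    if element.get("loaded") == "no" or binary == "hidden":
        values = None                           # value is not present in the XML
    elif binary == "base64":
        values = [text]                         # Base64 data has no backslash separators
    else:
        values = text.split("\\") if text else []
    return {
        "tag": element.get("tag"),
        "values": values,
        # Number of bytes as stored in the DICOM data set; this may differ
        # from the length of the XML encoded value.
        "stored_length": element.get("len"),
    }


lut = ET.fromstring('<element tag="0028,3002" vr="xs" vm="3" len="6" '
                    'name="LUTDescriptor">256\\0\\8</element>')
pixels = ET.fromstring('<element tag="7fe0,0010" vr="OW" vm="1" len="262144" '
                       'name="PixelData" loaded="no" binary="hidden"></element>')
print(describe(lut))
print(describe(pixels))
```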
\section dcm2xml_native_format Native DICOM Model Format
The description of the Native DICOM Model format can be found in the DICOM
standard, part 19 ("Application Hosting").
\subsection dcm2xml_bulk_data Bulk Data
Binary data, i.e. DICOM element values with Value Representations (VR) of OB
or OW, as well as OD, OF, OL, OV and UN values are by default not written to the
XML output because of their size. Instead, for each element, a new Universally
Unique Identifier (UUID) is generated and written as an attribute of a
\<BulkData\> XML element. So far, it is not possible to write an additional
file holding the binary data for each of these chunks. This is not required by
the standard; however, it might be useful for implementing an Application
Hosting interface, so this feature may become available in future versions of
\b dcm2xml.

In addition, Supplement 163 (Store Over the Web by Representational State
Transfer Services) introduces a new \<InlineBinary\> XML element that allows
for encoding binary data as Base64. Currently, the command line option
\e --encode-base64 enables this encoding for the following VRs: OB, OD, OF, OL,
OV, OW and UN.
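
A consumer of the Native DICOM Model output therefore has to handle three
kinds of content: regular values, a \<BulkData\> reference and
\<InlineBinary\> data. The following Python sketch illustrates this. The
sample document is hand-written; the element and attribute names other than
\<BulkData\> and \<InlineBinary\> (i.e. "NativeDicomModel", "DicomAttribute",
"Value", "keyword" and "uuid") follow the author's reading of DICOM part 19 and
should be checked against the standard and against actual \b dcm2xml output.

```python
# Sketch: distinguish values, BulkData references and InlineBinary data in a
# Native DICOM Model document. SAMPLE is hand-written for illustration; the
# element and attribute names are assumptions based on DICOM part 19.
import base64
import xml.etree.ElementTree as ET

SAMPLE = """<NativeDicomModel>
<DicomAttribute tag="00080005" vr="CS" keyword="SpecificCharacterSet">
<Value number="1">ISO_IR 100</Value>
</DicomAttribute>
<DicomAttribute tag="00282000" vr="OB" keyword="ICCProfile">
<InlineBinary>AAEC</InlineBinary>
</DicomAttribute>
<DicomAttribute tag="7FE00010" vr="OW" keyword="PixelData">
<BulkData uuid="00000000-0000-0000-0000-000000000000"/>
</DicomAttribute>
</NativeDicomModel>
"""


def local_name(element):
    """Return the tag name without a leading "{namespace}" prefix."""
    return element.tag.rsplit("}", 1)[-1]


def read_attribute(attribute):
    """Return a (kind, payload) pair for one DicomAttribute element."""
    for child in attribute:
        kind = local_name(child)
        if kind == "BulkData":
            return ("bulk", child.get("uuid"))       # reference only, no data
        if kind == "InlineBinary":
            return ("inline", base64.b64decode((child.text or "").strip()))
    values = [child.text or "" for child in attribute
              if local_name(child) == "Value"]
    return ("values", values)


root = ET.fromstring(SAMPLE)
result = {attr.get("keyword"): read_attribute(attr) for attr in root}
print(result)
```

Person names and sequence items are not handled in this sketch.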
\subsection dcm2xml_known_issues Known Issues
In addition to what is written in the above section on "Bulk Data", there are
further known issues with the current implementation of the Native DICOM Model
format. For example, large element values with a VR other than OB, OD, OF, OL,
OV, OW or UN are currently never written as bulk data, although it might be
useful, e.g. for very long text elements (especially UT) or very long numeric
fields (of various VRs).
\section dcm2xml_notes NOTES
\subsection dcm2xml_character_encoding Character Encoding
The XML character encoding is determined automatically from the DICOM attribute
(0008,0005) "Specific Character Set" using the following mapping:
\verbatim
ASCII         "ISO_IR 6"    =>  "UTF-8"
UTF-8         "ISO_IR 192"  =>  "UTF-8"
ISO Latin 1   "ISO_IR 100"  =>  "ISO-8859-1"
ISO Latin 2   "ISO_IR 101"  =>  "ISO-8859-2"
ISO Latin 3   "ISO_IR 109"  =>  "ISO-8859-3"
ISO Latin 4   "ISO_IR 110"  =>  "ISO-8859-4"
ISO Latin 5   "ISO_IR 148"  =>  "ISO-8859-9"
ISO Latin 9   "ISO_IR 203"  =>  "ISO-8859-15"
Cyrillic      "ISO_IR 144"  =>  "ISO-8859-5"
Arabic        "ISO_IR 127"  =>  "ISO-8859-6"
Greek         "ISO_IR 126"  =>  "ISO-8859-7"
Hebrew        "ISO_IR 138"  =>  "ISO-8859-8"
\endverbatim
If this DICOM attribute is needed but missing from the input file, option
\e --charset-assume can be used to specify an appropriate character set
manually (using one of the DICOM defined terms). For reasons of backward
compatibility with previous versions of this tool, the following terms are also
supported and mapped automatically to the associated DICOM defined terms:
latin-1, latin-2, latin-3, latin-4, latin-5, latin-9, cyrillic, arabic, greek,
hebrew.
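
For scripts that need to predict the encoding declaration of the XML output,
the mapping can be written as a lookup table. The following Python sketch
copies the table from this section. The pairing of the legacy terms with DICOM
defined terms is inferred from the character set names in that table and is
therefore an assumption, as are the identifiers used in the code.

```python
# Sketch: look up the XML encoding for a Specific Character Set value.
# XML_ENCODING is copied from the table in this section; LEGACY_TERMS is
# inferred from the character set names and is an assumption.
XML_ENCODING = {
    "ISO_IR 6": "UTF-8",          # ASCII
    "ISO_IR 192": "UTF-8",        # UTF-8
    "ISO_IR 100": "ISO-8859-1",   # ISO Latin 1
    "ISO_IR 101": "ISO-8859-2",   # ISO Latin 2
    "ISO_IR 109": "ISO-8859-3",   # ISO Latin 3
    "ISO_IR 110": "ISO-8859-4",   # ISO Latin 4
    "ISO_IR 148": "ISO-8859-9",   # ISO Latin 5
    "ISO_IR 203": "ISO-8859-15",  # ISO Latin 9
    "ISO_IR 144": "ISO-8859-5",   # Cyrillic
    "ISO_IR 127": "ISO-8859-6",   # Arabic
    "ISO_IR 126": "ISO-8859-7",   # Greek
    "ISO_IR 138": "ISO-8859-8",   # Hebrew
}

LEGACY_TERMS = {
    "latin-1": "ISO_IR 100", "latin-2": "ISO_IR 101", "latin-3": "ISO_IR 109",
    "latin-4": "ISO_IR 110", "latin-5": "ISO_IR 148", "latin-9": "ISO_IR 203",
    "cyrillic": "ISO_IR 144", "arabic": "ISO_IR 127", "greek": "ISO_IR 126",
    "hebrew": "ISO_IR 138",
}


def xml_encoding(charset):
    """Return the XML encoding name, or None if no mapping is defined."""
    term = LEGACY_TERMS.get(charset, charset)
    return XML_ENCODING.get(term)


print(xml_encoding("ISO_IR 148"))
print(xml_encoding("greek"))
print(xml_encoding("ISO 2022 IR 87"))
```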
Multiple character sets using code extension techniques are not supported. If
needed, option \e --convert-to-utf8 can be used to convert the DICOM file or
data set to UTF-8 encoding prior to the conversion to XML format. This is also
useful for DICOMDIR files where each directory record can have a different
character set.
If no mapping is defined and option \e --convert-to-utf8 is not used, non-ASCII
characters and those below #32 are stored as "&#nnn;" where "nnn" refers to the
numeric character code. This might lead to invalid character entity references
(such as "&#27;" for ESC) and will cause most XML parsers to reject the document.
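
The following Python sketch illustrates why such references are a problem. The
function <em>escape_numeric</em> is a hypothetical approximation of the
behavior described above and not the code used by \b dcm2xml (for instance, it
does not treat XML markup characters). The well-formedness check relies on the
XML parser of the Python standard library.

```python
# Sketch: numeric character references for non-ASCII and control characters,
# and their effect on an XML parser. escape_numeric() only approximates the
# behavior described above; it is not DCMTK code.
import xml.etree.ElementTree as ET


def escape_numeric(text):
    """Replace non-ASCII characters and those below 32 by "&#nnn;"."""
    return "".join(
        "&#%d;" % ord(ch) if ord(ch) < 32 or ord(ch) > 127 else ch
        for ch in text)


def is_well_formed(document):
    """Return True if the standard library parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


umlaut = escape_numeric("M\xfcller")     # Latin-1 character: valid reference
escape = escape_numeric("\x1b$B")        # ESC from a code extension sequence

print(umlaut, is_well_formed("<v>%s</v>" % umlaut))
print(escape, is_well_formed("<v>%s</v>" % escape))
```

The first document is accepted, whereas the second one is rejected because
character number 27 is not allowed in XML 1.0, even as a character reference.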
\section dcm2xml_logging LOGGING
The level of logging output of the various command line tools and underlying
libraries can be specified by the user. By default, only errors and warnings
are written to the standard error stream. With option \e --verbose,
informational messages such as processing details are also reported. Option
\e --debug can be used to get more details on the internal activity, e.g. for
debugging purposes. Other logging levels can be selected using option
\e --log-level. In \e --quiet mode only fatal errors are reported. In such
very severe error events, the application will usually terminate. For more
details on the different logging levels, see documentation of module "oflog".
If the logging output should be written to a file (optionally with logfile
rotation), to syslog (Unix) or to the event log (Windows), option
\e --log-config can be used. This configuration file also allows for directing only certain
messages to a particular output stream and for filtering certain messages
based on the module or application where they are generated. An example
configuration file is provided in <em>\<etcdir\>/logger.cfg</em>.
\section dcm2xml_command_line COMMAND LINE
All command line tools use the following notation for parameters: square
brackets enclose optional values (0-1), three trailing dots indicate that
multiple values are allowed (1-n), a combination of both means 0 to n values.
Command line options are distinguished from parameters by a leading '+' or '-'
sign. Usually, order and position of command line options are arbitrary (i.e.
they can appear anywhere). However, if options are mutually exclusive, the
rightmost appearance is used. This behavior conforms to the
standard evaluation rules of common Unix shells.
In addition, one or more command files can be specified using an '@' sign as a
prefix to the filename (e.g. <em>\@command.txt</em>). Such a command argument
is replaced by the content of the corresponding text file (multiple
whitespace characters are treated as a single separator unless they appear
between two quotation marks) prior to any further evaluation. Please note that a command
file cannot contain another command file. This simple but effective approach
allows one to summarize common combinations of options/parameters and avoids
longish and confusing command lines (an example is provided in file
<em>\<datadir\>/dumppat.txt</em>).
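
The expansion of command files can be pictured with the following Python
sketch. It is an approximation and not the DCMTK implementation: the function
name is hypothetical, and <em>shlex.split</em> is used as a stand-in for the
splitting rule described above although it also interprets backslash escapes
and single quotes, which the tools may treat differently.

```python
# Sketch: expand "@file" arguments once, without nesting. This approximates
# the rule described above; it is not the DCMTK implementation.
import os
import shlex
import tempfile


def expand_command_files(args):
    """Replace each "@file" argument by the arguments stored in that file."""
    expanded = []
    for arg in args:
        if arg.startswith("@") and len(arg) > 1:
            with open(arg[1:], encoding="utf-8") as handle:
                # Runs of whitespace separate arguments; quoted text stays
                # together. Entries read from the file are not expanded
                # again, so a command file cannot include another one.
                expanded.extend(shlex.split(handle.read()))
        else:
            expanded.append(arg)
    return expanded


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "command.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('+M  +Wb\n+Ca "ISO_IR 100"\n')
    demo = expand_command_files(["dcm2xml", "@" + path, "in.dcm"])

print(demo)
```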
\section dcm2xml_environment ENVIRONMENT
The \b dcm2xml utility will attempt to load DICOM data dictionaries specified
in the \e DCMDICTPATH environment variable. By default, i.e. if the
\e DCMDICTPATH environment variable is not set, the file
<em>\<datadir\>/dicom.dic</em> will be loaded unless the dictionary is built
into the application (default for Windows).
The default behavior should be preferred and the \e DCMDICTPATH environment
variable only used when alternative data dictionaries are required. The
\e DCMDICTPATH environment variable has the same format as the Unix shell
\e PATH variable in that a colon (":") separates entries. On Windows systems,
a semicolon (";") is used as a separator. The data dictionary code will
attempt to load each file specified in the \e DCMDICTPATH environment variable.
It is an error if no data dictionary can be loaded.
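
The platform-dependent separator can be illustrated with a short Python sketch.
The function name and the example paths are hypothetical, and skipping empty
entries is an assumption rather than documented behavior.

```python
# Sketch: split a DCMDICTPATH value into dictionary file names. The function
# name and the example paths are illustrative; skipping empty entries is an
# assumption.
def split_dict_path(value, windows=False):
    """Split on ";" (Windows) or ":" (other systems)."""
    separator = ";" if windows else ":"
    return [entry for entry in value.split(separator) if entry]


unix_entries = split_dict_path("/opt/dicts/dicom.dic:/opt/dicts/private.dic")
windows_entries = split_dict_path(r"C:\dicts\dicom.dic;C:\dicts\private.dic",
                                  windows=True)
print(unix_entries)
print(windows_entries)
```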
Depending on the command line options specified, the \b dcm2xml utility will
attempt to load character set mapping tables. This happens when DCMTK was
compiled with the oficonv library (which is the default) and the mapping tables
are not built into the library (default when DCMTK uses shared libraries).
The mapping table files are expected in DCMTK's <em>\<datadir\></em>.
The \e DCMICONVPATH environment variable can be used to specify a different
location. If a different location is specified, those mapping tables also
replace any built-in tables.
\section dcm2xml_files FILES
<em>\<datadir\>/dcm2xml.dtd</em> - Document Type Definition (DTD) file
\section dcm2xml_see_also SEE ALSO
<b>xml2dcm</b>(1), <b>dcmconv</b>(1)
\section dcm2xml_copyright COPYRIGHT
Copyright (C) 2002-2024 by OFFIS e.V., Escherweg 2, 26121 Oldenburg, Germany.
*/