python-hl7 - Message Accessor
=============================
reproduced from: http://wiki.medical-objects.com.au/index.php/Hl7v2_parsing
**Warning: Indexes in this API are from 1, not 0. This is to align with the HL7 documentation.**
Example HL7 Fragment:
.. doctest::
>>> message = 'MSH|^~\&|\r'
>>> message += 'PID|Field1|Component1^Component2|Component1^Sub-Component1&Sub-Component2^Component3|Repeat1~Repeat2\r\r'
>>> import hl7
>>> h = hl7.parse(message)
The resulting parse tree with values in quotes:
| Segment = "PID"
|     F1
|         R1 = "Field1"
|     F2
|         R1
|             C1 = "Component1"
|             C2 = "Component2"
|     F3
|         R1
|             C1 = "Component1"
|             C2
|                 S1 = "Sub-Component1"
|                 S2 = "Sub-Component2"
|             C3 = "Component3"
|     F4
|         R1 = "Repeat1"
|         R2 = "Repeat2"
| Legend
|
|     F   Field
|     R   Repeat
|     C   Component
|     S   Sub-Component
A tree has leaf values and nodes. Only the leaves of the tree can have a value.
All data items in the message will be in a leaf node.
After parsing, the data items in the message are in position in the parse tree, but
they remain in their escaped form. To extract a value from the tree you start at the
root of the Segment and specify the details of which field value you want to extract.
The minimum specification is the field number and repeat number. If you are after a
component or sub-component value you also have to specify these values.
For instance, to read the value "Sub-Component2" from the example HL7 fragment, you
need to specify Field 3, Repeat 1, Component 2, Sub-Component 2 (PID.F3.R1.C2.S2).
Reading values from the tree structure in this manner is the only safe way to read data
from a message.
.. doctest::
>>> h['PID.F1.R1']
u'Field1'
>>> h['PID.F2.R1.C1']
u'Component1'
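
As a rough illustration of how such a path string breaks down, the sketch below splits it
into the segment name and its 1-based positions. The ``parse_path`` helper and its regular
expression are assumptions made for this example and are not part of python-hl7; the sketch
also assumes the minimum specification described above (field and repeat).

.. code-block:: python

    import re

    # Hypothetical pattern: segment name, then F and R (required),
    # then optional C and S positions.
    _PATH = re.compile(
        r'^([A-Z][A-Z0-9]{2})\.F(\d+)\.R(\d+)(?:\.C(\d+)(?:\.S(\d+))?)?$'
    )

    def parse_path(path):
        """Split 'PID.F3.R1.C2.S2' into ('PID', 3, 1, 2, 2)."""
        match = _PATH.match(path)
        if match is None:
            raise ValueError('not a valid accessor path: %r' % path)
        groups = match.groups()
        # Levels that were not given stay as None.
        positions = [int(g) if g is not None else None for g in groups[1:]]
        return (groups[0],) + tuple(positions)

    print(parse_path('PID.F3.R1.C2.S2'))  # ('PID', 3, 1, 2, 2)
    print(parse_path('PID.F1.R1'))        # ('PID', 1, 1, None, None)

All positions are counted from 1, matching the warning at the top of this page.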
You can also access values using :py:class:`hl7.Accessor`, or by directly calling
:py:meth:`hl7.Message.extract_field`. The following are all equivalent:
.. doctest::
>>> h['PID.F2.R1.C1']
u'Component1'
>>> h[hl7.Accessor('PID', 1, 2, 1, 1)]
u'Component1'
>>> h.extract_field('PID', 1, 2, 1, 1)
u'Component1'
All values should be accessed in this manner. Even if a field is marked as
non-repeating, a repeat of "1" should be specified, as later message versions
could have a repeating value.
To enable backward and forward compatibility, there are rules for reading values when the
tree does not match the specification (e.g. PID.F1.R1.C2.S2). The common example of this is
expanding an HL7 "IS" value into a Coded Value ("CE"). Systems reading an "IS" value would
read the Identifier field of a message with a "CE" value, and systems expecting a "CE" value
would see a Coded Value with only the identifier specified. A common Australian example of
this is the OBX Units field, which was an "IS" value previously and became a "CE" value
in later versions.
| Old Version: "\|mmol/l\|" New Version: "\|mmol/l^^ISO+\|"
Systems expecting a simple "IS" value would read "OBX.F6.R1". For an old message this yields
a value in the tree, but for a message with a Coded Value that tree node would not have a
value; instead it would have 3 child Components, with the "mmol/l" value in the first
component. To resolve this issue, where the tree is deeper than the specified path, the
first child of each node is traversed until a leaf node is found, and that value is
returned.
.. doctest::
>>> h['PID.F3.R1.C2']
u'Sub-Component1'
This is a general rule for reading values: **If the parse tree is deeper than the specified
path continue following the first child branch until a leaf of the tree is encountered
and return that value (which could be blank).**
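
The first rule can be sketched on a toy tree. The nested-list representation (lists are
nodes, strings are leaves) and the ``first_leaf`` helper are assumptions made for this
illustration; they are not the data structures python-hl7 uses internally.

.. code-block:: python

    # Toy version of PID field 3, repeat 1 from the example fragment:
    # C1 is a leaf, C2 is a node with two sub-components, C3 is a leaf.
    field3_repeat1 = [
        'Component1',
        ['Sub-Component1', 'Sub-Component2'],
        'Component3',
    ]

    def first_leaf(node):
        """Follow the first child of each node until a leaf is reached."""
        while isinstance(node, list):
            if not node:
                return ''  # an empty node has no value
            node = node[0]
        return node

    # The path stops at C2, which is a node, so its first child is returned.
    print(first_leaf(field3_repeat1[1]))  # Sub-Component1
    # Stopping at the repeat yields the first component.
    print(first_leaf(field3_repeat1))     # Component1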
Systems expecting a Coded Value ("CE"), but reading a message with a simple "IS" value in it,
have the opposite problem. They have a deeper specification but have reached a leaf node and
cannot follow the path any further. Reading a "CE" value requires a separate read for each
component, but for the "Identifier" in this example the specification would be "OBX.F6.R1.C1".
The tree would stop at R1, so C1 would not exist. In this case the unsatisfied path elements
(C1 in this case) can be examined, and if every one is position 1 then they can be ignored and
the leaf of the tree that was reached is returned. If any of the unsatisfied path elements is
not in position 1 then this cannot be done and the result is a blank string.
This is the second rule for reading values: **If the parse tree terminates before the full path
is satisfied, check each of the remaining path elements, and if every one is specified at
position 1 then the leaf value reached can be returned as the result.**
.. doctest::
>>> h['PID.F1.R1.C1.S1']
u'Field1'
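
A minimal sketch of the second rule, assuming the walk down the tree has already stopped at
a leaf and the leftover positions are available as a list. The ``resolve_past_leaf`` name is
hypothetical and only serves this example.

.. code-block:: python

    def resolve_past_leaf(leaf_value, unsatisfied):
        """Apply the second rule once a leaf has been reached early.

        `unsatisfied` holds the 1-based positions left over in the path.
        """
        if all(position == 1 for position in unsatisfied):
            return leaf_value
        return ''

    # An old-style "IS" units value is a leaf at OBX.F6.R1.
    units = 'mmol/l'
    print(resolve_past_leaf(units, [1]) == 'mmol/l')  # OBX.F6.R1.C1 -> True
    print(resolve_past_leaf(units, [3]) == '')        # OBX.F6.R1.C3 -> True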
In the second example every value that makes up the Coded Value, other than the identifier
has a component position greater than one and when reading a message with a simple "IS"
value in it, every value other than the identifier would return a blank string.
Following these rules will result in excellent backward and forward compatibility. It is
important to allow the reading of values that do not exist in the parse tree by simply
returning a blank string. The two rules detailed above, along with the full tree specification
for all values being read from a message will eliminate many of the errors seen when
handling earlier and later message versions.
.. doctest::
>>> h['PID.F10.R1']
u''
At this point the desired value has either been located, or is absent, in which case a blank
string is returned.
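
Putting the pieces together, the lookup can be sketched as a single function over a toy
nested-list tree (lists are nodes, strings are leaves). This representation and the
``read_value`` helper are assumptions for illustration only, not python-hl7's
implementation.

.. code-block:: python

    def read_value(node, positions):
        """Read a value using 1-based positions, applying both rules."""
        positions = list(positions)
        while positions:
            if not isinstance(node, list):
                # Second rule: a leaf was reached before the path ran out.
                return node if all(p == 1 for p in positions) else ''
            position = positions.pop(0)
            if position > len(node):
                return ''  # the value is absent from the tree
            node = node[position - 1]
        # First rule: the tree is deeper than the path.
        while isinstance(node, list):
            if not node:
                return ''
            node = node[0]
        return node

    # Fields of the example PID segment: field -> repeat -> component -> sub-component.
    pid = [
        ['Field1'],
        [['Component1', 'Component2']],
        [['Component1', ['Sub-Component1', 'Sub-Component2'], 'Component3']],
        ['Repeat1', 'Repeat2'],
    ]

    print(read_value(pid, [3, 1, 2, 2]))        # Sub-Component2
    print(read_value(pid, [3, 1, 2]))           # Sub-Component1
    print(read_value(pid, [1, 1, 1, 1]))        # Field1
    print(read_value(pid, [10, 1]) == '')       # True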
Assignments
-----------
The accessors also support item assignments. However, the Message object must exist and the
separators must be validly assigned.
Create a response message.
.. doctest::
>>> SEP = '|^~\&'
>>> CR_SEP = '\r'
>>> MSH = hl7.Segment(SEP[0], [hl7.Field(SEP[1], ['MSH'])])
>>> MSA = hl7.Segment(SEP[0], [hl7.Field(SEP[1], ['MSA'])])
>>> response = hl7.Message(CR_SEP, [MSH, MSA])
>>> response['MSH.F1.R1'] = SEP[0]
>>> response['MSH.F2.R1'] = SEP[1:]
>>> unicode(response)
u'MSH|^~\\&|\rMSA'
Assign values into the message. You can only assign a string into the message (i.e. a leaf
of the tree).
.. doctest::
>>> response['MSH.F9.R1.C1'] = 'ORU'
>>> response['MSH.F9.R1.C2'] = 'R01'
>>> response['MSH.F9.R1.C3'] = ''
>>> response['MSH.F12.R1'] = '2.4'
>>> response['MSA.F1.R1'] = 'AA'
>>> response['MSA.F3.R1'] = 'Application Message'
>>> unicode(response)
u'MSH|^~\\&|||||||ORU^R01^|||2.4\rMSA|AA||Application Message'
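
The empty fields in the output above appear because assigning to a position beyond the
current end of a segment has to fill the gap. The sketch below illustrates that idea on a
plain Python list; ``set_padded`` is a hypothetical helper written for this example and does
not claim to reflect how python-hl7 implements assignment.

.. code-block:: python

    def set_padded(items, position, value):
        """Set the 1-based `position` in `items`, padding any gap with blanks."""
        while len(items) < position:
            items.append('')
        items[position - 1] = value

    msa_fields = []  # the fields that follow the segment name
    set_padded(msa_fields, 1, 'AA')
    set_padded(msa_fields, 3, 'Application Message')
    print('|'.join(['MSA'] + msa_fields))  # MSA|AA||Application Message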
You can also assign values using :py:class:`hl7.Accessor`, or by directly calling
:py:meth:`hl7.Message.assign_field`. The following are all equivalent:
.. doctest::
>>> response['MSA.F1.R1'] = 'AA'
>>> response[hl7.Accessor('MSA', 1, 1, 1)] = 'AA'
>>> response.assign_field('AA', 'MSA', 1, 1, 1)
Escaping Content
----------------
HL7 messages are transported using the 7-bit ASCII character set. Only characters between
ASCII 32 and 127 are used. Characters which cannot be transported using this range
of values must be 'escaped', that is, replaced by a sequence of characters for transmission.
The library stores values internally in the escaped format. When the message is composed using
'unicode', the escaped value is returned.
.. doctest::
>>> message = 'MSH|^~\&|\r'
>>> message += 'PID|Field1|\F\|\r\r'
>>> h = hl7.parse(message)
>>> unicode(h['PID'][0][2])
u'\\F\\'
>>> h.unescape(unicode(h['PID'][0][2]))
u'|'
When the accessor is used to reference the field, the field is automatically unescaped.
.. doctest::
>>> h['PID.F2.R1']
u'|'
The escape/unescape mechanism supports replacing separator characters with their escaped
version and replacing non-ASCII characters with hexadecimal versions.
The escape method returns a 'str' object. The unescape method returns a unicode object.
.. doctest::
>>> h.unescape('\\F\\')
u'|'
>>> h.unescape('\\R\\')
u'~'
>>> h.unescape('\\S\\')
u'^'
>>> h.unescape('\\T\\')
u'&'
>>> h.unescape('\\X202020\\')
u'   '
>>> h.escape('|~^&')
u'\\F\\\\R\\\\S\\\\T\\'
>>> h.escape('áéíóú')
u'\\Xc3\\\\Xa1\\\\Xc3\\\\Xa9\\\\Xc3\\\\Xad\\\\Xc3\\\\Xb3\\\\Xc3\\\\Xba\\'
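
To make the mapping concrete, here is a minimal stand-alone sketch of the separator and
hexadecimal escape sequences. It assumes the default delimiters (``|^~\&``) and well-formed
hex data, treats each hex pair as one character code, and leaves any other sequence
untouched. The function names are local to this example; this is not python-hl7's own
``escape``/``unescape`` code.

.. code-block:: python

    import re

    # Default delimiters and their HL7 escape codes; E is the escape character itself.
    ESCAPES = {'|': 'F', '~': 'R', '^': 'S', '&': 'T', '\\': 'E'}
    CODES = {code: char for char, code in ESCAPES.items()}

    def escape(text):
        """Replace each delimiter with its escape sequence, e.g. '|' -> '\\F\\'."""
        return ''.join(
            '\\%s\\' % ESCAPES[c] if c in ESCAPES else c for c in text
        )

    def _decode(match):
        body = match.group(1)
        if body in CODES:
            return CODES[body]
        if body.startswith('X'):
            # Each pair of hex digits is decoded as one character code.
            return ''.join(
                chr(int(body[i:i + 2], 16)) for i in range(1, len(body), 2)
            )
        return match.group(0)  # unknown sequence: leave as-is

    def unescape(text):
        """Reverse escape() and decode \\X..\\ hexadecimal sequences."""
        return re.sub(r'\\([^\\]+)\\', _decode, text)

    print(escape('|~^&') == '\\F\\\\R\\\\S\\\\T\\')     # True
    print(unescape('\\X202020\\') == '   ')             # True
    print(unescape(escape('a|b\\c')) == 'a|b\\c')       # True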
**Presentation Characters**
HL7 defines a protocol for encoding presentation characters. These include highlighting
and rich text functionality. The API does not currently allow for easy access to the
escape/unescape logic. You must override the message class escape and unescape methods
after parsing the message.