HL7v2 message parsing

Zato parses raw ER7 strings into typed Python objects with named segment and field access. Use parse_hl7 to convert a pipe-delimited ER7 string into a structured HL7Message instance, then access fields by name, serialize the message back to ER7, or convert it to a dict or JSON.

Parsing a message

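from zato.hl7v2 import parse_hl7

# Each segment ends with a carriage return (\r). In the Python source, '\\&' is
# an escaped backslash, so the encoding characters in MSH-2 are ^~\&
raw = (
    'MSH|^~\\&|SENDER|FACILITY|RECEIVER|FAC|20260315||ADT^A01^ADT_A01|CTL001|P|2.9\r'
    'EVN|A01|20260315\r'
    'PID|||12345^^^HOSP^MR||SMITH^JOHN^A||19800115|M\r'
    'PV1|1|I|WARD^101^BED1\r'
)

# Skip schema validation for this example (see Validation)
message = parse_hl7(raw, validate=False)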

The returned object is an instance of the specific message type, which is ADT_A01 for the example above. Every segment defined in the HL7 2.9 specification is available as a typed attribute.
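
To make the ER7 layout concrete, the standard-library sketch below splits a message into segments and fields by hand. It is not how Zato parses messages, and the helper name split_er7 is hypothetical. It only illustrates that segments end with a carriage return and that fields are separated by |.

```python
def split_er7(raw: str) -> dict[str, list[str]]:
    """Map each segment ID to its list of pipe-separated fields (illustrative only).

    A repeated segment ID would overwrite the earlier one; a real parser keeps both.
    """
    segments = {}
    for line in raw.split('\r'):
        if not line:
            continue  # the trailing \r leaves one empty string behind
        fields = line.split('|')
        segments[fields[0]] = fields
    return segments


raw = (
    'MSH|^~\\&|SENDER|FACILITY|RECEIVER|FAC|20260315||ADT^A01^ADT_A01|CTL001|P|2.9\r'
    'PID|||12345^^^HOSP^MR||SMITH^JOHN^A||19800115|M\r'
)

segments = split_er7(raw)

# For PID, the list index equals the HL7 field number: index 5 is PID-5
family_name = segments['PID'][5].split('^')[0]

# For MSH, the field separator itself counts as MSH-1, so MSH-10 sits at index 9
control_id = segments['MSH'][9]
```

The off-by-one in MSH is one of the details that a typed parser hides behind names such as message.msh.message_control_id.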

Validation

Pass validate=True (the default) to check the message structure against the HL7 schema. Invalid messages raise an exception. Pass validate=False to skip validation and accept any message that can be parsed, which is useful for real-world messages that do not strictly follow the specification.
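
One way to combine both modes is to try strict parsing first and fall back to lenient parsing when validation fails. The sketch below assumes only what is stated above: a parse function that accepts validate and raises on invalid input. The exception class is not named in this section, so the sketch catches Exception. Both parse_with_fallback and the stand-in fake_parse are hypothetical names, used so that the example runs without Zato installed.

```python
from typing import Any, Callable


def parse_with_fallback(parse: Callable[..., Any], raw: str) -> tuple[Any, bool]:
    """Return (message, was_valid). `parse` is expected to behave like parse_hl7."""
    try:
        return parse(raw, validate=True), True
    except Exception:
        # Assumption: the specific exception type is not documented in this section
        return parse(raw, validate=False), False


def fake_parse(raw: str, validate: bool = True) -> list[str]:
    """Hypothetical stand-in for parse_hl7 with a trivial structural check."""
    segments = [line for line in raw.split('\r') if line]
    if validate and not segments[0].startswith('MSH|'):
        raise ValueError('message must start with MSH')
    return segments


message, was_valid = parse_with_fallback(fake_parse, 'PID|||12345\r')
```

In real code, parse_hl7 would be passed in place of fake_parse, and the was_valid flag could be logged so that non-conformant senders are visible.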

Accessing segments and fields

Segments and fields are accessed by their semantic Python names:

control_id    = message.msh.message_control_id
family_name   = message.pid.patient_name.family_name
patient_class = message.pv1.patient_class

Every field descriptor carries its HL7 position, data type, and repeatability metadata. See Field access for the full list of naming conventions.

Building segments from scratch

Create a segment instance and assign fields directly:

from zato_hl7v2.v2_9.segments import PID

segment = PID()
segment.set_id_pid = '1'
segment.patient_identifier_list = '12345^^^HOSP^MR'
segment.patient_name = 'SMITH^JOHN^A'
segment.date_time_of_birth = '19800115'
segment.administrative_sex = 'M'

er7 = segment.serialize()

The serialize() method returns the segment as an ER7 string (e.g. PID|1||12345^^^HOSP^MR||SMITH^JOHN^A||19800115|M).

Repeatable fields

Some HL7 fields are repeatable - they can carry multiple values separated by ~ in ER7 format. When building a segment from scratch, assign repeatable fields as a list[str]:

from zato_hl7v2.v2_9.segments import NTE

segment = NTE()
segment.comment = ['arm', 'left leg', 'head']

er7 = segment.serialize()
# NTE|||arm~left leg~head

A single value can also be assigned directly as a string:

segment.comment = 'just one comment'

Repeatable fields accept any of: a single data type instance, a str, a list[str], or a list of data type instances.
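
The mapping from a Python list to the ~ separator can be sketched with the standard library alone. The helper serialize_segment below is hypothetical and is not Zato's serializer. It only shows how a list placed at field position 3 ends up as NTE|||arm~left leg~head.

```python
from typing import Union


def serialize_segment(segment_id: str, fields: dict[int, Union[str, list[str]]]) -> str:
    """Build an ER7 segment from {field position: value}; lists are joined with '~'."""
    parts = [segment_id]
    for position in range(1, max(fields) + 1):
        value = fields.get(position, '')  # unset positions stay empty
        if isinstance(value, list):
            value = '~'.join(value)
        parts.append(value)
    return '|'.join(parts)


many = serialize_segment('NTE', {3: ['arm', 'left leg', 'head']})
one = serialize_segment('NTE', {3: 'just one comment'})
```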

Composite data types

For fields with composite data types (e.g. XPN for patient name, CWE for coded values), construct the data type and assign it:

from zato_hl7v2.v2_9.segments import PID
from zato_hl7v2.v2_9.datatypes import XPN

name = XPN(
    family_name='SMITH',
    given_name='JOHN',
    second_and_further_given_names_or_initials_thereof='A',
)

segment = PID()
segment.patient_name = name
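
As a rough mental model, a composite data type is a set of named components that serialize to a ^-separated string. The dataclass below is a hypothetical stand-in with only three of the XPN components, not the Zato class. Trimming trailing empty components is an assumption of this sketch.

```python
from dataclasses import astuple, dataclass


@dataclass
class NameSketch:
    """Hypothetical three-component stand-in for XPN (the real type has more)."""

    family_name: str = ''
    given_name: str = ''
    second_and_further_given_names_or_initials_thereof: str = ''

    def serialize(self) -> str:
        # Join components in declaration order, then drop trailing separators
        return '^'.join(astuple(self)).rstrip('^')


full = NameSketch('SMITH', 'JOHN', 'A').serialize()
partial = NameSketch(family_name='SMITH').serialize()
```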

Serialization

Every message, segment, and data type has a serialize() method that returns the ER7 string:

message = parse_hl7(raw, validate=False)
er7 = message.serialize()

Parsing and then serializing a message produces an ER7 string that can be parsed again to yield an identical structure (round-trip fidelity).
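
A round-trip check can be written as a small helper that takes the parse function as an argument. The sketch below relies only on serialize() and to_dict() as described in this document. Comparing the dict forms is an assumption about what counts as an identical structure, and FakeMessage and fake_parse are hypothetical stand-ins so that the example runs without Zato.

```python
class FakeMessage:
    """Hypothetical stand-in exposing serialize() and to_dict() like a parsed message."""

    def __init__(self, raw: str) -> None:
        self.segments = [line.split('|') for line in raw.split('\r') if line]

    def serialize(self) -> str:
        return ''.join('|'.join(fields) + '\r' for fields in self.segments)

    def to_dict(self) -> dict:
        return {'segments': self.segments}


def fake_parse(raw: str, validate: bool = True) -> FakeMessage:
    return FakeMessage(raw)


def is_round_trip_stable(parse, raw: str) -> bool:
    """Parse, serialize, parse again, and compare the two dict forms."""
    first = parse(raw, validate=False)
    second = parse(first.serialize(), validate=False)
    return first.to_dict() == second.to_dict()


raw = 'MSH|^~\\&|SENDER|FACILITY\rPID|||12345^^^HOSP^MR\r'
stable = is_round_trip_stable(fake_parse, raw)
```

With Zato installed, the same helper could be called as is_round_trip_stable(parse_hl7, raw) in a test suite.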

Dict and JSON conversion

Convert any message, segment, or data type to a Python dict or JSON string:

message = parse_hl7(raw, validate=False)

message_dict = message.to_dict()
message_json = message.to_json()
message_json_pretty = message.to_json(indent=2)

The include_empty parameter controls whether fields with no value are included in the output. By default, all fields are included (include_empty=True). Pass include_empty=False to exclude empty fields and produce a more compact representation.

The dict output includes metadata keys like _structure_id for messages, _segment_id for segments, and _group_name for groups.
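
To picture the effect of include_empty=False, the sketch below removes empty values from a nested dict by hand. The helper drop_empty is hypothetical and the sample dict shape is made up for illustration. Only the _segment_id key is taken from the description above, and Zato may treat edge cases differently.

```python
from typing import Any

_EMPTY = (None, '', [], {})


def drop_empty(value: Any) -> Any:
    """Recursively remove None, '', [] and {} from nested dicts and lists."""
    if isinstance(value, dict):
        cleaned = {key: drop_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if item not in _EMPTY}
    if isinstance(value, list):
        cleaned = [drop_empty(item) for item in value]
        return [item for item in cleaned if item not in _EMPTY]
    return value


# Illustrative shape only; the real to_dict() output may be structured differently
segment = {
    '_segment_id': 'PID',
    'set_id_pid': '',
    'patient_name': {'family_name': 'SMITH', 'given_name': 'JOHN', 'suffix': ''},
    'patient_address': {},
}

compact = drop_empty(segment)
```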

Path-based navigation

Use message.get() with a dot-separated path to reach any field, component, or subcomponent:

message = parse_hl7(raw, validate=False)

# Field by position
result = message.get('PID.5')

# Component by position
result = message.get('PID.5.1')

# Repetition by index
result = message.get('PID.3[0]')

See Path expressions for the full path syntax.
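
The three path forms shown above can be resolved against a raw ER7 string with a few lines of standard-library code, which may help when reasoning about what a path points at. The function get_by_path is a hypothetical sketch, not Zato's message.get(). It returns plain strings, assumes 0-based repetition indexes and 1-based field and component positions, ignores subcomponents and escape sequences, and rejects MSH because its field numbering is offset by the separator.

```python
import re
from typing import Optional

_PATH = re.compile(
    r'^(?P<segment>[A-Z][A-Z0-9]{2})\.(?P<field>\d+)'
    r'(?:\[(?P<rep>\d+)\])?(?:\.(?P<component>\d+))?$'
)


def get_by_path(raw: str, path: str) -> Optional[str]:
    """Resolve SEG.field, SEG.field[rep] or SEG.field.component in the first matching segment."""
    match = _PATH.match(path)
    if match is None or match['segment'] == 'MSH':
        raise ValueError(f'unsupported path: {path!r}')
    field_no = int(match['field'])
    for line in raw.split('\r'):
        fields = line.split('|')
        if fields[0] != match['segment']:
            continue
        if not 1 <= field_no < len(fields):
            return None
        value = fields[field_no]
        if match['rep'] is not None or match['component'] is not None:
            # A component path without [n] reads the first repetition
            repetitions = value.split('~')
            rep_index = int(match['rep'] or 0)
            if rep_index >= len(repetitions):
                return None
            value = repetitions[rep_index]
        if match['component'] is not None:
            components = value.split('^')
            component_no = int(match['component'])
            if not 1 <= component_no <= len(components):
                return None
            value = components[component_no - 1]
        return value
    return None


raw = 'PID|||12345^^^HOSP^MR~67890^^^HOSP^PI||SMITH^JOHN^A||19800115|M\r'

whole_field = get_by_path(raw, 'PID.5')
first_component = get_by_path(raw, 'PID.5.1')
second_repetition = get_by_path(raw, 'PID.3[1]')
```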


