HL7 Parsing for Amalga Validation–the code (snips)

Well dang.  I had expected to just post up the code I wrote for the EAV/HL7 database but as I started to simplify the the code, it became clear that like much of HL7, our various feeds had anomalies that required pre-processing and special handling that would not make sense without the messages.  And of course, I cannot include messages.  So let me post up the DB scripts and talk through how the code works with the salient code snips.  Hopefully it will be enough to get you started; if not, post a comment and I’ll try to answer.

First, an overview.  We take a raw HL7 message and first we use the MSH (the header) to pull out metadata about the message such as Control ID, time of the message, etc.  Using these data, we run the “IsMessageDuplicate” stored-procedure to determine if we’ve already stored this parsed message.
Here’s the code:


If this is a new message, we then store the basic message information but leaving a place for MRN and Encounter which will be filled in later.

Next we break apart the other segments and using the new MessageId we received from saving the Message, we add one row for each segment or sub-segment that has data.  Note that this is still part of the SQL Transaction that saved the message so we either succeed with everything or fail back to nothing.

We have an array of text lines from the HL7.  Each line begins with the segment type such as PID.  So, we find out what type it is:


Since the segment are ordinal; that is PID.1, PID.2, we can use an iterator to walk the segments to get their AttributeId from the lookup table in the database:


We have the primary attribute ID and now we need to see if this Primary Attribute ID type has a Sub-Type associated with it.  This is per the HL7 Spec.  This is a bit tricky so let’s use an example before we show the code.

Let’s say that we are parsing the Patient ID segment, “PID”.  Our AttributeId for PID.11 (in our database table) is “18” and it has an AttributeTypeId of 13.  Since this is greater than our magical (and somewhat arbitrary) number of “9”, we know that this is a sub-type.  NOTE:  the first nine are somewhat native types such as date or string and so are not really sub-types so we make this a rule in our code.

With our AttributeTypeId of 13, we really mean that this is the sub-type ID.  To find out what this is, open the AttributeType table and you will see that this is an “XAD” type or address type with an HL7 configuration of:

<Street Address (SAD)> ^ <Other Designation (ST)> ^ <City (ST)> ^ <State or Province (ST)> ^ <Zip or Postal Code (ST)> ^ <Country (ID)> ^ <Address Type (ID)> ^ <Other Geographic Designation (ST)> ^ <County/Parish Code (IS)> ^ <Census Tract (IS)> ^ <Address Representation Code (ID)> ^ <Address Validity Range (DR)> ^ <Effective Date (TS)> ^ <Expiration Date (TS)>

So what we want to do here is keep the Primary Attribute ID of “Patient Address” but also to keep each individual element such as “Census Track” if we have one.  The sub-types let us do that; plus, we only save a value if we have one.  Very efficient.

Now, this is where we get a bit creative so listen up.  The sub-types are in the HL7 spec but they really aren’t valid HL7 segments.  But, we treat them like they are!  That way, we can use the same code logic and it just works.

So, we have an “XAD” type (which for our purposes is like a “PID” or any other HL7 segment type).  But, we need to define the names of each of the elements.  If you open the lkupAttribute table again and go down to AttributeId number 621, you will find that it is of HL7 type (wink, wink) “XAD” and our XAD.1 is “Street Address”.  Surprise!  If you look at the description above, you will see the first sub-type is “Street Address (SAD)”.  Now we would get a bit crazy and have a sub-sub-type of SAD, but enough is enough.  However, the same logic applied here could be applied to that as well if you choose.
We will get back to the code in a minute but let’s review where we are.  If I want to find the Street Address for a Patient, I can simply query the EAV table for the record for our patient (which we’ll get to in a minute) that has a Primary Attribute ID of 18 (“Patient Address”) and a SubAttributeId of 621 (“Street Address”).  This is very fast because we use integers to index.

OK, so back to our code.  We need to find the sub-id and so we take the AttributeID and look up the AttributeTypeId to see if we need to do a sub-parse:


If we have a sub-type, we again use an iterator to walk through the values to see if we have something that needs to be stored, just as we did with the Primary Attribute Ids.


OK, so now we should have enough information to store this row, for this value along with it’s Message ID, the Primary Attribute Id, and if it is a sub-type – the SubAttributeID:


Once all these loops complete, we will have a row in the Message table and many rows associated in the EAV table; but, we need to get the MRN and Encounter to make this useful.   But this is pretty easy now.  We set up a switch statement for each message type assuming that every message type might have the Encounter and MRN in a different location; and then we find the AttributeIds that hold the value we are interested in (for example, the MRN in the PID.3 segment and the Encounter in the PV1.19 segment):


Using the Attribute (and/or sub-attribute) ids, we get the MRN and Encounter numbers from the EAV for that message (and yes, we are still in a transaction), and using those values, we update the Message table with the correct MRN and Encounter:


And our last step is to commit the transaction; or roll it back if something untoward occurred.

I’ve attached the DB scripts that include a View that pivots this stuff around so you can search by (Amalga) Message ID; or other variables in the message such as MRN or Encounter/VisitID.

For our purposes, this tool has been useful to help us validate Amalga parsing and to do some limited ad-hoc reporting.

Leave a Reply

Your email address will not be published. Required fields are marked *

one × two =