Sunday, January 23, 2011

MedConnector.Com –what, why, how

Over at http://www.medconnector.com I created a site (more or less a “labor of love”) where I can organize links to useful health information on the web.  I also wanted a way to organize prescription medications to make it easy to print a list before a doctor visit; or, to quickly retrieve my list using the web and so there’s a section for medication management.

I recently re-developed the site to use open source JavaScript libraries and as a result, the site now works pretty well on smartphones such as the IPhone.  In fact, I have used the site to manage my medication list, saved it as a .PDF and easily stored it as an IPhone “IBook” so it’s very simple to retrieve the list when needed.

I posted a YouTube video here that shows how the site works including the new feature: the ability to quickly link to the National Library of Medicine’s Consumer Health site: MedLinePlus.  Here’s a sample.

To address how I developed the site, I started with the FDA’s Orange Book, their drug list of “Approved Drug Products with Therapeutic Equivalence”. In order to process the files, I created a series of .NET entities (collections) to hold the metadata about each prescription drug.  After the current FDA drug list is downloaded, it is processed into an xml “database” file using the entity classes.  Here’s a sample.

This “database” is used for the subsequent AJAX calls and  it supports the cascading options for presenting the medication list.  For example:  start with a drug name, then the way it is delivered (orally, injection, etc.), next the packaging (such as liquid or capsule), and finally the associated strengths.  These are cascading options so a user selection from a list will change the contents of the subsequent lists. For example, by selecting a drug name, the “How Taken” list will only show valid options. This reduces user errors since they cannot select an invalid medication and makes it quick and easy to detail the prescription.

image

After each selection, the web form makes an asynchronous AJAX call to the web server to get the contents for the next list to present to the user.  This is very fast and improves the user experience.

When a user first visits the site, they can either put in some type of identification that is easy to remember (such as a phone number); or, they can accept the GUID that the system generates which can be a one-time key which guarantees privacy if they care.

After the list has been completed, the user can save it one of several formats.  For example, when I used this on my iPhone, I navigated to the medconnector web site, input my personal ID which retrieved my medication list and pressed the “save as pdf” option.  When the pdf web page returned, I saved it as a PDF to the iPhone IBooks library and then bookmarked the URL so I can always retrieve the most current version if I need at a doctor visit.  Here’s a rough iPhone screen shot:

image

The user can also save it in Word or Excel format so they can edit or add over-the-counter medications to their list.  Their list is saved (as an XML blob) on the server to make it easy to return later to re-print the list or to make additional edits.

As I mentioned, this was a “labor of love” and it is a continuing work in progress as I add additional tools to help me and others manage their health and wellness.

Monday, January 3, 2011

HL7 Parsing for Amalga Validation–the code (snips)

Well dang.  I had expected to just post up the code I wrote for the EAV/HL7 database but as I started to simplify the the code, it became clear that like much of HL7, our various feeds had anomalies that required pre-processing and special handling that would not make sense without the messages.  And of course, I cannot include messages.  So let me post up the DB scripts and talk through how the code works with the salient code snips.  Hopefully it will be enough to get you started; if not, post a comment and I’ll try to answer.

First, an overview.  We take a raw HL7 message and first we use the MSH (the header) to pull out metadata about the message such as Control ID, time of the message, etc.  Using these data, we run the “IsMessageDuplicate” stored-procedure to determine if we’ve already stored this parsed message.

Here’s the code:

image

If this is a new message, we then store the basic message information but leaving a place for MRN and Encounter which will be filled in later.

image

Next we break apart the other segments and using the new MessageId we received from saving the Message, we add one row for each segment or sub-segment that has data.  Note that this is still part of the SQL Transaction that saved the message so we either succeed with everything or fail back to nothing.

We have an array of text lines from the HL7.  Each line begins with the segment type such as PID.  So, we find out what type it is:

image

Since the segment are ordinal; that is PID.1, PID.2, we can use an iterator to walk the segments to get their AttributeId from the lookup table in the database:

image

We have the primary attribute ID and now we need to see if this Primary Attribute ID type has a Sub-Type associated with it.  This is per the HL7 Spec.  This is a bit tricky so let’s use an example before we show the code.

Let’s say that we are parsing the Patient ID segment, “PID”.  Our AttributeId for PID.11 (in our database table) is “18” and it has an AttributeTypeId of 13.  Since this is greater than our magical (and somewhat arbitrary) number of “9”, we know that this is a sub-type.  NOTE:  the first nine are somewhat native types such as date or string and so are not really sub-types so we make this a rule in our code.

With our AttributeTypeId of 13, we really mean that this is the sub-type ID.  To find out what this is, open the AttributeType table and you will see that this is an “XAD” type or address type with an HL7 configuration of:

<Street Address (SAD)> ^ <Other Designation (ST)> ^ <City (ST)> ^ <State or Province (ST)> ^ <Zip or Postal Code (ST)> ^ <Country (ID)> ^ <Address Type (ID)> ^ <Other Geographic Designation (ST)> ^ <County/Parish Code (IS)> ^ <Census Tract (IS)> ^ <Address Representation Code (ID)> ^ <Address Validity Range (DR)> ^ <Effective Date (TS)> ^ <Expiration Date (TS)>

So what we want to do here is keep the Primary Attribute ID of “Patient Address” but also to keep each individual element such as “Census Track” if we have one.  The sub-types let us do that; plus, we only save a value if we have one.  Very efficient.

Now, this is where we get a bit creative so listen up.  The sub-types are in the HL7 spec but they really aren’t valid HL7 segments.  But, we treat them like they are!  That way, we can use the same code logic and it just works.

So, we have an “XAD” type (which for our purposes is like a “PID” or any other HL7 segment type).  But, we need to define the names of each of the elements.  If you open the lkupAttribute table again and go down to AttributeId number 621, you will find that it is of HL7 type (wink, wink) “XAD” and our XAD.1 is “Street Address”.  Surprise!  If you look at the description above, you will see the first sub-type is “Street Address (SAD)”.  Now we would get a bit crazy and have a sub-sub-type of SAD, but enough is enough.  However, the same logic applied here could be applied to that as well if you choose.

We will get back to the code in a minute but let’s review where we are.  If I want to find the Street Address for a Patient, I can simply query the EAV table for the record for our patient (which we’ll get to in a minute) that has a Primary Attribute ID of 18 (“Patient Address”) and a SubAttributeId of 621 (“Street Address”).  This is very fast because we use integers to index.

OK, so back to our code.  We need to find the sub-id and so we take the AttributeID and look up the AttributeTypeId to see if we need to do a sub-parse:

image

If we have a sub-type, we again use an iterator to walk through the values to see if we have something that needs to be stored, just as we did with the Primary Attribute Ids.

image

OK, so now we should have enough information to store this row, for this value along with it’s Message ID, the Primary Attribute Id, and if it is a sub-type – the SubAttributeID:

image

Once all these loops complete, we will have a row in the Message table and many rows associated in the EAV table; but, we need to get the MRN and Encounter to make this useful.   But this is pretty easy now.  We set up a switch statement for each message type assuming that every message type might have the Encounter and MRN in a different location; and then we find the AttributeIds that hold the value we are interested in (for example, the MRN in the PID.3 segment and the Encounter in the PV1.19 segment):

image

Using the Attribute (and/or sub-attribute) ids, we get the MRN and Encounter numbers from the EAV for that message (and yes, we are still in a transaction), and using those values, we update the Message table with the correct MRN and Encounter:

image

And our last step is to commit the transaction; or roll it back if something untoward occurred.

I’ve attached the DB scripts that include a View that pivots this stuff around so you can search by (Amalga) Message ID; or other variables in the message such as MRN or Encounter/VisitID.

For our purposes, this tool has been useful to help us validate Amalga parsing and to do some limited ad-hoc reporting.

Saturday, January 1, 2011

HL7 Entity-Attribute-Value Database for Data Validation

In my day job, I work with Microsoft Amalga which, for our implementation, consumes (parses) a number of HL7 feeds.  As with most HL7 consumers, Amalga parses the text elements it expects out of the the HL7 feed message for storage in the Amalga SQL Server tables.  However, there are application and implementation specific business rules that are applied and require us to validate that the parsers have acted correctly.  In this post, I’m going to discuss a tool I authored to help us validate the Amalga parsers.  Our tool is fairly simple and has no business rules associated with it.  It simply reads the HL7 message and parses out the various segments whose values are then placed in a “flat” Entity-Attribute-Value (“EAV”) database. 

Let’s start with the database. 

As it turns out, the database has proved to be useful for both validation and for ad-hoc reporting.  I enabled SQL Server’s Full-Text-Search engine on the Value column of data which holds every data element across all of our feeds.  Even with several billion rows of data, our searches return in under a few seconds.  Therefore, this eliminates the need to try and figure out where in the Amalga tables a particular data item might be stored. For example, we wanted to get an idea of the number of patients who had come in with symptoms of the H1N1 flu before the diagnosis code was widely used. Using a handful of symptoms, we ran a query in the EAV and in a matter of seconds, we received a count of unique MRNs spread across the dates of visit which could easily be displayed in a chart.

As another example, Amalga recently hit an error and stopped parsing messages when it encountered a new event type that was unanticipated. A quick search of the EAV found the message which was only the second message out of many millions of that type. The first had processed without error (which was an error in itself) but the second had stopped the system. Using the EAV, we found both messages and after working with our Interface team, we identified the problem and quickly corrected it and resumed parsing.

OK, enough background, let’s talk about the design of the database.

There are some great articles (many by IBM – so filter your Google with IBM hosted info) about how EAV works. The thing that confused me most when I started working with EAV was thinking of it as Name/Value or a one-to-one match type architecture. This is far from correct. Let me illustrate with our design.

For our database, our Entity is the HL7 message such as an ADT message. So the Entity is the unique message ID we assign when we first process the message; that is, the Primary Key in the Message Table which is an identity field.

The Attributes of our Message are also held in the Message table along with the key. For example, we store: message Type (ADT), message Event (A05), MRN (derived after the message parse), Encounter Id (derived after the message parse), and for our Amalga validation, the Amalga Message Id store in the Amalga database.

The Value(s) are stored in the EAV table with a Value field, which has specific HL7 attributes associated with the value and is linked to the Message with its attributes via the Message Id.

To illustrate with a single value, let’s say we receive an ADT message for a patient who recently arrived at our urgent care facility in Alaska (all of this is contrived and is created for illustration). The “point-of-care” value in our HL7 message might be “UCAK” and so we would have one row in our EAV with the “Value” segment holding “UCAK”. Therefore, “UCAK” is the value for an HL7 ADT Entity (message) with a number of attributes including: Patient Identifier, Visit Identifier, Message Type, etc.

Now that you understand the logic of our database, let’s talk about the design.

The database is normalized so that we use lookup tables. For example, we have an Event Type table with an HL7 type of A14 with the corresponding value of “pending admit”.

 

lkupAttribute Table

The key lookup table is the lkupAttribute table holding the HL7 segments (which in a very early version of this was the attribute for our EAV). Let’s take one row to describe what this table does:

AttributeId – is a database assigned unique identifier for this row. Note that this can never change since it is heavily used in the other tables. Our example, this value is 12.

HL7 – the HL7 segment type, for our example this is PID.5 In our parser code, we expect to have this text in this format so do not change it. For those astute in HL7, the Set and Field Ids are handled later so for this implementation, it contains the segment and position only.

AttributeName – in our case “Patient Name”. This is the typical “friendly name” for the value and the text we will use later when we pull out the values for display. Note that this is the HL7 suggested descriptor and the implementation may or may not abide by the suggestion.

AttributeTypeId – Each HL7 segment has a type. In our example, the type is “Name” and this type can have its own list of segments such as “first name” and “last name”. For our purposes, this type id is customized for our application and this Id value is a foreign key for the AttributeType table. For our current example, this value is of type 12.

 

AttributeType table

The AttributeType table is the lookup table used by the application to parse out the EAV values. Continuing our example:

Id – is the database assigned value for the type. Note that the values in this table should never change either since they are used in the code to determine how to parse. Generally, the first 10 items are native types and the rest are custom types. For example, id #1 is a “string” type; whereas id #12 is an “xpn” type which is a name type.

HL7Type – the type defined by the HL7 spec, such as “xpn”

Description – this is not used by the application but is copied from the spec to help in programming the parsers. For our example, it looks like this:

<Family Name (FN)> ^ <Given Name (ST)> ^ <Second and Further Given Names or Initials Thereof (ST)> ^ <Suffix (e.g., JR or III) (ST)> ^ <Prefix (e.g., DR) (ST)> ^ <Degree (e.g., MD) (IS)> ^ <Name Type Code (ID)> ^ <Name Representation Code (ID)> ^ <Name Context (CE)> ^ <Name Validity Range (DR)> ^ <Name Assembly Order (ID)> ^ <Effective Date (TS)> ^ <Expiration Date (TS)> ^ <Professional Suffix (ST)>

Message Table

The Message table has the following columns:

MessageId – This is the database assigned unique row identifier.

RawMessageId – Used by Amalga

MRN – This is the MRN for the patient for which this message is reporting. This value is calculated by the parser code and the value is pulled from the appropriate place in the various message types. For example, this may be in one location in the ADT messages but another in the Lab Results feed.

VisitID – or Encounter Id, or sometimes called the Account, is the identifier that the hospital systems assign for a patient visit or encounter.

MessageDateTime – when the message was sent or received

EventType – such as A31 or A05. This makes it easy to find and sort the HL7 messages by message type.

HL7FeedType- the name of the HL7 feed such as: ADT, or Lab Orders.

MessageControlId – this optional value is sent by the interface gateway such as Egate or SeeBeyond.

 

EAV TABLE

And the last table is the EAV table with the following columns:

EAVId – the row identifier, assigned by the database

MessageId – the foreign key from the Message table, allows us to join rows in the EAV table to the Message table.

PrimaryAttributeId - the foreign key from the lkupAttribute table. This tells us what HL7 segment the value belongs to.

SubAttributeId – the foreign key from the lkupAttribute table for the sub-attributes. For example, the PrimaryAttributeId might be the number for “Patient Name”, and the SubAttributeId value might be “Last Name”. This allows us to customize how we want to search or re-assemble values later and makes our “where” clauses extremely accurate.

FieldId – some HL7 segments are lists of entries and can have an almost unlimited number of entries. For example, a Patient Identifier might have several entries, one from each hospital system. This will allow us to store the data in a very granular fashion; plus providing future flexibility as we add new systems.

SetId – some HL7 segments have sets of data. For example, the Insurance segment may have a different set for each policy the patient has. This field identifies the SetId from the HL7 message.

Wow, this was much longer than I expected. Next time I’ll talk about the code and how it uses this database and post up the code and db scripts.

image