Encrypting Personal Health Data – Part 1

I work at a local children’s hospital with Microsoft’s Amalga platform and other clinical intelligence tools, most of them Microsoft-based.  Like most hospitals, we need to protect digital Protected Health Information (“PHI”) from disclosure to anyone without a “need to know”, while still making it quick and easy for clinicians to view the data they need to deliver quality healthcare.

This blog entry will discuss one solution we are in the process of evaluating to encrypt the critical PHI elements.

Requirements

  1. All PHI must be encrypted in flight and at rest.
  2. All hospital users can view the de-identified data without special permissions, but all access to both de-identified and identified data will be recorded in audit logs.
  3. In order to do their jobs, certain users will be given the “keys” to unlock one or more PHI elements.  This can happen through the database, or through special decryption tools we provide.  However, access must be approved by the appropriate committees and will be tightly controlled.
  4. Encryption should be very fast and use public key encryption backed by our Public Key Infrastructure (“PKI”), with certificates managed by our internal Certificate Authority (“CA”).

High Level Design Discussion

There are several ways we could implement our solution.  Here are a few that we rejected and the reasons why:

  • Symmetric Key Encryption with a Shared Password –  This solution has the advantage of being very fast.  However, it has several disadvantages.  First, we plan to use .NET, and a password embedded in a .NET assembly is easy to recover by decompiling the assembly with a tool such as Reflector.  Second, if all data were encrypted with a single password, anyone holding that password could see all the data.  Finally, we may want a source system to encrypt data and send it to us without being able to decrypt and see other data.  If we gave that system the password, it could both encrypt and decrypt all of our hidden data.
  • PKI Everywhere – Public key encryption would solve the problems listed above for Symmetric Key Encryption (which is one reason it is so widely used), but it is relatively slow for large amounts of data.
  • Database Encryption with Transport Encryption (SSL) – This would be an ideal solution except that all decryption would have to occur at the database.  That means we could not send a subset of the data to, say, our trusted Research group and let them “noodle the data” on their own servers.  In addition, SQL Server does not currently perform certificate checks (expiration, revocation, etc.) on PKI certificates used for internal data.

Our solution combines both PKI and Symmetric Key encryption and will give us the best of both worlds.  This is very close to the way that SSL operates.  That is, the relatively slow PKI system is used to encrypt the “session key” (our symmetric password) which is then used to encrypt the data.

Our Implementation – Overview

The following are the high-level steps we will use to meet our requirements.  In the next blog entry, I will drill deeper, provide code samples, and discuss some of the issues we found.

  1. Using our corporate CA, we issue a certificate to the SQL Server and include the Private Key.
  2. We export the Public Key and share it out to any source system (“SS”) that sends us PHI such as Amalga.
  3. Let’s say that a SS would like to add a row to our database that contains the patient’s medical record number (“MRN”), which is a PHI element under HIPAA.  The SS would generate a symmetric encryption key (“Password”) and use it to encrypt the MRN.  Note that each row would use a different Password, so a compromised Password exposes only a single row.
  4. Next, the SS would use our Public Key to encrypt the Password and concatenate the encrypted Password to the front of the encrypted MRN and store both in a column such as MRN_X in our database.
  5. Since we use the MRN to join across tables, and our users like to see something in their data grids, we create a cryptographically secure one-way hash of the MRN and store it in our MRN column, Base64-encoded so that it is more readable by humans.  In other words, the hashed MRN is de-identified, but the same MRN always produces the same hash, so it can still be used for table joins and for display in our de-identified views.
  6. At this point, our data are de-identified and we can allow users to query our database without fear of inadvertent disclosure of PHI.  Plus, we can ship our database backups offsite without fear of disclosure of PHI.
  7. For viewers of de-identified data, our query might be:  Select MRN from MyTable
  8. For viewers of re-identified data, our query might be:  Select ReId(MRN_X) as MRN from MyTable
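
To make steps 4 and 5 a little more concrete, here is a minimal sketch in Python of how the two column values might be built.  It is not our implementation.  The function names are hypothetical, the keyed hash (HMAC-SHA-256 with a separate hash_key) is an assumption since the steps above only call for a cryptographically secure one-way hash, and the 256-byte length of the encrypted Password assumes a 2048-bit RSA certificate.  The encryption itself is out of scope here, so the encrypted Password and encrypted MRN are treated as opaque bytes.

```python
import base64
import hashlib
import hmac
from typing import Tuple

# Assumption: a 2048-bit RSA key, so every encrypted Password is 256 bytes long.
WRAPPED_KEY_LEN = 256


def hash_mrn(mrn: str, hash_key: bytes) -> str:
    """Return a Base64 token that is always the same for a given MRN.

    Uses HMAC-SHA-256 (an assumed choice) so the token cannot be
    recomputed without hash_key.
    """
    digest = hmac.new(hash_key, mrn.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def pack_mrn_x(wrapped_key: bytes, encrypted_mrn: bytes) -> bytes:
    """Concatenate the encrypted Password in front of the encrypted MRN."""
    if len(wrapped_key) != WRAPPED_KEY_LEN:
        raise ValueError("unexpected encrypted Password length")
    return wrapped_key + encrypted_mrn


def unpack_mrn_x(mrn_x: bytes) -> Tuple[bytes, bytes]:
    """Split an MRN_X value back into (encrypted Password, encrypted MRN)."""
    if len(mrn_x) <= WRAPPED_KEY_LEN:
        raise ValueError("MRN_X value is too short")
    return mrn_x[:WRAPPED_KEY_LEN], mrn_x[WRAPPED_KEY_LEN:]


if __name__ == "__main__":
    hash_key = b"demo-only-hash-key"   # hypothetical; a real key would live outside the database
    wrapped = bytes(WRAPPED_KEY_LEN)   # placeholder for the encrypted Password
    cipher = b"\x01\x02\x03"           # placeholder for the encrypted MRN
    row = {
        "MRN": hash_mrn("12345678", hash_key),   # de-identified join column
        "MRN_X": pack_mrn_x(wrapped, cipher),    # encrypted column
    }
    print(row["MRN"], len(row["MRN_X"]))
```

Because the encrypted Password has a fixed length in this sketch, the decrypting side can split MRN_X without any extra delimiter.  A keyed hash is used because MRNs are short and guessable, so an unkeyed hash could be reversed simply by hashing every possible MRN.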

Summary

One of the best ways to protect Personal Health Data is encryption.  Our solution provides a robust encryption mechanism that uses a certificate from our corporate Certificate Authority to encrypt our passwords.  We can share our Public Key with any source system, which can then encrypt data and send it to us.  However, re-identification is limited to applications running on our internal network that have access to our Private Key.

Passwords change with every row added to the database and with every modification to the data in a row.  In addition, the Private Key needed to decrypt those passwords is not stored in the database, so an offsite database backup cannot be decrypted on its own.

Our data can be made readily available to any user because all PHI is encrypted.  This will allow users to create custom reports and views using their favorite reporting tools.  When re-identified data is needed, we can provide it through custom functions that are protected both directly and through limits on access to the Private Key.

In the next entry, I’ll provide a technical description and include snippets of source code.
