Decrypting Our Cerner “BLOB” Data

This was a “no brainer”.  I am intrigued by data compression, particularly in a medical environment (which is why I co-authored two patents on medical image compression), and I love puzzles.  So this post (and the next) will discuss the path I took to figuring out how Cerner compresses BLOBs. If you don’t know what a Binary Large OBject is and just want the code, see below, but please don’t ask me for support.

We are working on several “quality improvement” projects in my day job (isn’t every hospital?) and part of that is to mine (as in extract) the valuable medical insights in the notes — those semi-structured repositories of information in the medical record that are so difficult to organize.

Most of our notes and reports are contained in our Transcription system and Cerner is updated via an HL7 feed from Transcription.  However, I found that our Pathology reports were transcribed directly into Cerner and those were “compressed” and pretty much non-readable.  Moving forward, we have a way to intercept this interaction; but we had years of notes that we wanted to work with that were stored in the notorious CE_BLOB table (cue the scary music).

Extracting the other notes was relatively simple (see previous blog entries), but then I ran into the dreaded “Cerner Blob”, the “Billy the Kid” of healthcare data: a fearsome reputation that may be deserved, but probably is not.

I’m an ex-softie (Microsoft alumnus), so I wanted to do this in C# and .NET, which meant I needed a way to call the Cerner DLL from .NET.  Once I had the function signature and found the correct library, the rest was pretty straightforward C#.

For those interested, the uar_ocf_uncompress function is in their shrccluar.dll library, and if you have licensed this library from Cerner, then you are welcome to use it to decompress the BLOBs.  The parameters are the same in the compress and the decompress functions, so here are the .NET interop snippets:

using System;
using System.Runtime.InteropServices;

// P/Invoke declarations (place inside a class); adjust the path to your own Cerner directory.
[DllImport(@"C:\myCernerDirectory\shrccluar.DLL", EntryPoint = "uar_ocf_uncompress")]
public extern static int uar_ocf_uncompress([In, Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] Byte[] Buffer, ref int BufSize,
    [In, Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] Byte[] Buffer2, ref int BufSize2, ref int ilen);

[DllImport(@"C:\myCernerDirectory\shrccluar.DLL", EntryPoint = "uar_ocf_compress")]
public extern static int uar_ocf_compress([In, Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] Byte[] Buffer, ref int BufSize,
    [In, Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] Byte[] Buffer2, ref int BufSize2, ref int ilen);

So the next step was to figure out how to decompress the BLOBs directly from the Oracle database, without the Cerner DLL.  Fortunately, folks have mentioned in various forums that the compression algorithm Cerner uses is LZW.  In my next blog post, I’ll talk about how I determined how they implemented the algorithm, but the good news is that Cerner didn’t try to hide the data (thank you, Cerner).  In other words, they did not try to make this difficult to decompress but instead simply deployed a well-known lossless compression algorithm, LZW, using standard procedures.
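For readers who have not met LZW before, here is a minimal sketch of the textbook algorithm in C#. It is only an illustration of the general technique, not Cerner's exact variant: the class name LzwSketch is made up for this example, codes are kept as plain integers rather than packed into bits, and details such as code width or dictionary resets are assumptions left out entirely.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of textbook LZW; NOT the Cerner implementation.
public static class LzwSketch
{
    // Encoder: emits one integer code for each longest match found in the dictionary.
    public static List<int> Compress(byte[] input)
    {
        var dict = new Dictionary<string, int>();
        for (int i = 0; i < 256; i++) dict.Add(((char)i).ToString(), i);

        var codes = new List<int>();
        string current = "";
        foreach (byte b in input)
        {
            string candidate = current + (char)b;
            if (dict.ContainsKey(candidate))
            {
                current = candidate;               // keep extending the match
            }
            else
            {
                codes.Add(dict[current]);          // emit the longest known match
                dict.Add(candidate, dict.Count);   // learn the new sequence
                current = ((char)b).ToString();
            }
        }
        if (current.Length > 0) codes.Add(dict[current]);
        return codes;
    }

    // Decoder: rebuilds the same dictionary while reading the codes.
    public static byte[] Decompress(IList<int> codes)
    {
        var table = new List<byte[]>();
        for (int i = 0; i < 256; i++) table.Add(new[] { (byte)i });

        var output = new List<byte>();
        if (codes.Count == 0) return output.ToArray();
        if (codes[0] < 0 || codes[0] > 255) throw new ArgumentException("Invalid first code.");

        byte[] previous = table[codes[0]];
        output.AddRange(previous);
        for (int i = 1; i < codes.Count; i++)
        {
            int code = codes[i];
            byte[] entry;
            if (code >= 0 && code < table.Count)
                entry = table[code];
            else if (code == table.Count)
                entry = Append(previous, previous[0]);  // code defined by this very step
            else
                throw new ArgumentException("Invalid LZW code.");

            output.AddRange(entry);
            table.Add(Append(previous, entry[0]));
            previous = entry;
        }
        return output.ToArray();
    }

    private static byte[] Append(byte[] prefix, byte last)
    {
        var result = new byte[prefix.Length + 1];
        Array.Copy(prefix, result, prefix.Length);
        result[prefix.Length] = last;
        return result;
    }
}
```

Calling LzwSketch.Decompress(LzwSketch.Compress(bytes)) should hand back the original bytes. A real BLOB decoder would also have to unpack the codes from the byte stream, which this sketch deliberately skips.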

OK, back to the code.  I have attached the simple decompression library here that you are welcome to download, compile and use.   It is so simple that it would be a snap to convert this code to Java, or even to JavaScript running in a web browser. Again, compliments to Cerner for not trying to do something fancy and for using standard programming constructs.

To test this, I created a simple program that utilized Microsoft LINQ to iterate through a few thousand records from the CE_BLOB table. I first decompressed them using the Cerner API, and then again using the .NET library I created.  After decompression using both APIs, I ran a string comparison.  For these sample records, the final strings all compared successfully.
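The comparison step could look roughly like the sketch below. It is not my actual test program: the names BlobComparison and FindMismatches are hypothetical, and the two decoders are passed in as delegates so that you can plug in the Cerner API wrapper and the stand-alone library yourself. Reading the rows from CE_BLOB is assumed to happen elsewhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical harness: compares two decoders over the same set of blobs.
public static class BlobComparison
{
    // Returns the positions of the blobs for which the two decoders disagree.
    public static List<int> FindMismatches(
        IEnumerable<byte[]> blobs,
        Func<byte[], string> referenceDecoder,   // e.g. a wrapper around the Cerner DLL
        Func<byte[], string> candidateDecoder)   // e.g. the stand-alone .NET library
    {
        return blobs
            .Select((blob, index) => new { blob, index })
            .Where(x => !string.Equals(
                referenceDecoder(x.blob),
                candidateDecoder(x.blob),
                StringComparison.Ordinal))       // exact, culture-independent comparison
            .Select(x => x.index)
            .ToList();
    }
}
```

An empty result means every record decoded to the same string with both decoders.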

Disclaimer:  This code is not supported by me or by Cerner, so if you compile and use it, you do so at your own risk.

That’s it for now.  In the next post, I’ll discuss the steps to develop the stand-alone solution and talk a bit more about how to use the library.

Again, here’s the C# class.
