I first became aware of this practice and problem from my own clinical notes. A few years ago, I had a temporary inner ear problem and my primary care physician referred me to an ENT doctor in his office. A couple of years after that, my audiologist referred me to the same guy for evaluation for hearing aids. Subsequently, I ordered copies of my medical record and notes and what I found astounded me.
Although I was in my forties at the time of my first visit, the ENT doc's note declared that I was an eighty-year-old patient suffering from something completely unrelated to my visit. When I saw the same doc a couple of years later for something completely different, he simply copied the content from my first visit into the new note. In both visits, my age was years off, the notes were identical and completely wrong, and for each encounter he charged my insurance company hundreds of dollars for a visit that lasted less than five minutes.
Fortunately, CMS has gotten wise to this practice and offers guidance for copy-and-paste documentation policies. I understand that if auditors find offenses, they can deny payment for the encounter. Ouch. The Joint Commission has also weighed in, as have others.
In a recent project with a teaching hospital, I was tasked with working with the HIM department to determine whether they had a problem with this and, if so, to what extent. This was such a cool project that I wanted to share it with others who may be thinking of heading down this same path.
One thing I must emphasize is that this problem can present in a number of ways, and I think it's extremely important to evaluate its scope by looking at each and every way it may appear.
For example, in our little project, it was first thought that the core problem might be a Resident or Fellow copying content from notes they had written during an earlier exam of the same patient. Computer code to find and identify these offenses was easy to write and performant, since we only had to search for duplication within a single patient and encounter. Indeed, we found a few offenses.
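As an illustration of this first, simplest case, here is a minimal Python sketch (the production tool was C#, so all names and thresholds here are hypothetical) that flags long verbatim runs shared between an earlier and a later note for the same patient:

```python
from difflib import SequenceMatcher

# Hypothetical minimum copied-text length (in characters) worth flagging
MIN_MATCH_LEN = 50

def find_copy_forward(earlier_note, later_note, min_len=MIN_MATCH_LEN):
    """Return substrings of the later note that also appear verbatim
    in the earlier note and meet the minimum length threshold."""
    matcher = SequenceMatcher(None, earlier_note, later_note, autojunk=False)
    return [
        later_note[block.b : block.b + block.size]
        for block in matcher.get_matching_blocks()
        if block.size >= min_len
    ]
```

Run over every ordered pair of a patient's notes, this surfaces candidate copy-forwards for human review; `difflib.SequenceMatcher` does the work of locating the longest shared runs.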
The next iteration of the tool proved more interesting and informative. During a manual chart review of the notes, an HIM employee found that a provider had copied key sections of clinical notes from one patient to another and the tool needed to discover this too. This made the tool much more complicated since it now needed to search across all patients for duplication.
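One way to make that cross-patient search tractable, sketched below in Python with hypothetical names (again, not the actual C# implementation), is a shingle index: every fixed-length window of text maps to the set of patients whose notes contain it, and any shingle seen under more than one patient is a candidate offense.

```python
from collections import defaultdict

SHINGLE_LEN = 40  # hypothetical fixed window size, in characters

def build_shingle_index(notes):
    """notes: iterable of (patient_id, note_id, text) triples.
    Map every fixed-length character window ("shingle") to the
    set of (patient, note) pairs in which it occurs."""
    index = defaultdict(set)
    for patient_id, note_id, text in notes:
        for i in range(len(text) - SHINGLE_LEN + 1):
            index[text[i : i + SHINGLE_LEN]].add((patient_id, note_id))
    return index

def cross_patient_hits(index):
    """Shingles that occur in the notes of more than one patient."""
    return {
        shingle: sources
        for shingle, sources in index.items()
        if len({patient for patient, _ in sources}) > 1
    }
```

The memory cost of indexing every window is substantial, which is consistent with the author's note later about loading a large amount of text into RAM on a beefy box.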
So, reviewing our new requirements, the tool needed to identify the following:
- Provider "A" copying sections of notes they had written for the same patient during an earlier exam, when one would expect the narrative to change over time.
- Provider "A" writing a note for a patient who had previously been seen by Provider "B", where Provider "A" copied key clinically relevant text authored by Provider "B" into the newer note for the patient.
- Provider "A" storing a "library" of clinical text locally on their machine that is NOT patient specific, and pasting from it into a variety of patient charts.
- And the very worst offense: Provider "A" copying Provider "B"'s (an Attending's, perhaps?) notes into a local library and copying and pasting from there into a variety of patient charts.
Next, let me describe how I wrote the suite of tools to handle these business requirements and summarize what we found.
Tool Suite Requirements
Here is a list of requirements we used to develop the tools:
- Minor copies are OK and, frankly, expected. Therefore, the tool used a configurable minimum text length to define a potentially problematic copy-forward; any copied text under that threshold was ignored.
- We needed to exclude terms that were auto-populated by the EMR, since one would expect to find a large number of instances of those across all patients and encounters.
- We wanted to exclude sections of the notes that were less clinically problematic for the copy-forward offense, such as patient history.
- We wanted to be able to group our "hits" by provider, patient, encounter, and note type.
- We also wanted to report on (group by) the top hits by two metrics: the length of the copied text and the frequency of the copies.
- Because we wanted to group by maximum length, it was important to take the first hit and grow the search string until the maximum match length could be determined, and then omit the shorter hits contained within that longer string.
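That last requirement, growing a seed hit to its maximal match length and then suppressing the shorter hits it contains, might be sketched as follows (a Python illustration with hypothetical names, not the production C# code):

```python
def grow_match(a, b, i, j, seed_len):
    """Given a seed hit a[i:i+seed_len] == b[j:j+seed_len], extend it
    left and right while the two texts continue to agree, returning
    the maximal matching span in each string."""
    start_a, start_b = i, j
    while start_a > 0 and start_b > 0 and a[start_a - 1] == b[start_b - 1]:
        start_a -= 1
        start_b -= 1
    end_a, end_b = i + seed_len, j + seed_len
    while end_a < len(a) and end_b < len(b) and a[end_a] == b[end_b]:
        end_a += 1
        end_b += 1
    return (start_a, end_a), (start_b, end_b)

def drop_contained(spans):
    """Discard any span wholly contained in a longer kept span,
    so only maximal hits are reported."""
    kept = []
    for span in sorted(spans, key=lambda s: s[1] - s[0], reverse=True):
        if not any(k[0] <= span[0] and span[1] <= k[1] for k in kept):
            kept.append(span)
    return sorted(kept)
```

Seed-and-extend is the same basic move used in sequence-alignment tools, which foreshadows the genomics comparison made later in this post.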
What We Found
- As expected, we found a large number of cases where a provider would copy key sections of their notes for the same patient and encounter from one exam to the next. When refined, it was clear that many (most?) of these were in areas of the notes where one would expect to see patient improvement over time; but instead, identical observations were made over and over.
- A lesser but still rather egregious problem was the set of cases where a provider copied text written by another provider during an earlier exam into the notes for the same patient and encounter. Again, one would expect to see some change in the observations (hopefully for the better) over time; instead, the exact same observations were documented, calling into question whether the provider actually saw the patient.
- What I think surprised all of us were the rather rare, but still glaringly problematic, cases where a provider would obviously keep a library of observations and simply copy and paste from it from patient to patient and encounter to encounter. It should be noted that when we did a "deep dive" into specific offenders, we found that they did this broadly across their notes and with a variety of strings of text.
Thoughts
Although I used a multi-threaded C# application running on a beefy box with a bunch of RAM, and was able to load a substantial amount of text into memory and do memory compares directly on the CPU, the problem set is remarkably similar to that of pattern matching across the human genome, which I discussed previously. Also, the solution as written used massively parallel code and would benefit from the power of CUDA and GPU processing.
One would expect that, with a detection program run regularly in production, the provider community would "wise up" and begin to insert small, insignificant edits into their text to defeat the detection routine. However, as with genomic processing that identifies inserts and deletes ("indels"), a robust application should be able to handle these minor edits with relative ease. Couple that with some level of natural language processing ("NLP"), and it's conceivable that a powerful forensic tool to identify copy-forward policy infractions could be developed and operated on standard commodity hardware.
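As a rough illustration of that indel tolerance, a similarity ratio can stand in for exact matching, so a few inserted or deleted words no longer defeat detection. The sketch below uses Python's `difflib`; the 0.9 threshold is a hypothetical tuning parameter, not a value from the project.

```python
from difflib import SequenceMatcher

def near_duplicate(a, b, threshold=0.9):
    """Treat two passages as near-duplicates even when a few words
    have been inserted or deleted to defeat exact matching.
    The threshold is a hypothetical tuning parameter."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio() >= threshold
```

A production tool would likely want something faster than pairwise ratios, such as locality-sensitive hashing over shingles, but the principle is the same: score similarity rather than demand identity.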