Database Meets Revision Control
Any developer who has worked on HIPAA compliance knows that the law is murky at best and that the federal government doesn't publish a programmers' guide to make your life any easier. However, one of the cardinal rules is the requirement to keep track of who sees HIPAA data, who modifies it and when. Failure to do so can subject a company to some pretty draconian penalties.
This creates a challenge on the database side because a SQL UPDATE obliterates a record's history. There are a few potential solutions, such as maintaining log tables that are written to by triggers on the tables containing personally identifying patient information. I did this for one application, but the log of atomic changes grew to an immense size. It was also very difficult to reconstruct a large record from potentially dozens of changes.
In a headbanging session with the CTO of Children's Health Fund we determined that what we needed was a hybrid of a relational database and a revision control system, where a SQL UPDATE would keep a copy of the pre-update record and freeze it from further changes. RCS does its work by storing just the changes, or diffs, made to a document. While it would be technically possible to do this with a database record -- for instance, using a BLOB in a sibling table -- there's a simpler and more practical method that also maintains relational integrity.
What we settled on was adding an integer column named entity_status to each table, where entity_status = 1 is the current live record and entity_status = 0 is an archived row that should never show up in queries for live data. We added a few more statuses, such as "locked", which also freezes the current record from further edits, and "incomplete", which is a live record that hasn't been completed or approved, but that's not important here.
The problem is that if you have a normalized database you can't have multiple records sharing the same primary key. In other words, a clinic encounter might identify the patient internally as patient_id=100. What happens in this system if you update Patient 100's data? A new primary key is created and now the encounter record is pointing at an inactive patient record.
The solution was creating companion _detail tables for HIPAA tables. The _detail table contains all the volatile data you want to archive. Here's an example:
CREATE TABLE patient (
    patient_id SERIAL NOT NULL UNIQUE,
    created_by VARCHAR NOT NULL,
    update_time TIMESTAMP WITH TIME ZONE,
    PRIMARY KEY (patient_id)
);
CREATE TABLE patient_detail (
    patient_detail_id SERIAL NOT NULL UNIQUE,
    patient_id INTEGER NOT NULL,
    first_name VARCHAR,
    last_name VARCHAR,
    dob DATE NOT NULL,
    -- add more HIPAA columns here
    created_by VARCHAR NOT NULL,
    create_date TIMESTAMP WITH TIME ZONE,
    -- 1 = current live record, 0 = archived row
    entity_status SMALLINT DEFAULT 1 CHECK (entity_status IN (0, 1)),
    PRIMARY KEY (patient_detail_id, patient_id)
);
ALTER TABLE patient_detail ADD FOREIGN KEY (patient_id) REFERENCES patient (patient_id) ON UPDATE RESTRICT ON DELETE RESTRICT;
With the patient table holding the permanent patient_id and the patient_detail table referencing it as a foreign key, you always have a consistent patient_id. You just have to join the patient table with its companion patient_detail table in your queries, keeping only the live detail row (entity_status = 1) and filtering out the archived ones (entity_status = 0).
SELECT
P.*,
PD.*
FROM patient P
LEFT JOIN patient_detail AS PD ON (PD.patient_id = P.patient_id AND PD.entity_status = 1);
Because, logically, every record update on a _detail table is an insert we can use a single stored procedure to handle both cases. In either case, I check for any existing _detail record that has an entity_status=1 and update that to entity_status=0, which hides it from the queries. Then I insert the new record, which defaults to entity_status=1.
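As a rough sketch of that two-step save, here is the same logic in Python, with the standard sqlite3 module standing in for the production database. The save_patient_detail helper, the trimmed-down schema and the sample names are assumptions made for illustration; this is not the stored procedure from the live system.

```python
import sqlite3

# Assumption: SQLite replaces the real database, and the schema is a
# trimmed-down version of the patient / patient_detail tables.
SCHEMA = """
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    created_by TEXT NOT NULL
);
CREATE TABLE patient_detail (
    patient_detail_id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL REFERENCES patient (patient_id),
    first_name TEXT,
    last_name TEXT,
    dob TEXT NOT NULL,
    created_by TEXT NOT NULL,
    create_date TEXT DEFAULT CURRENT_TIMESTAMP,
    entity_status INTEGER DEFAULT 1 CHECK (entity_status IN (0, 1))
);
"""


def save_patient_detail(conn, patient_id, first_name, last_name, dob, user):
    """Hypothetical helper: archive the live row, if any, then insert a new one."""
    with conn:  # one transaction: both statements commit, or neither does
        # Step 1: hide the current live row (a no-op on the first save).
        conn.execute(
            "UPDATE patient_detail SET entity_status = 0 "
            "WHERE patient_id = ? AND entity_status = 1",
            (patient_id,),
        )
        # Step 2: insert the new version; entity_status defaults to 1.
        conn.execute(
            "INSERT INTO patient_detail "
            "(patient_id, first_name, last_name, dob, created_by) "
            "VALUES (?, ?, ?, ?, ?)",
            (patient_id, first_name, last_name, dob, user),
        )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    with conn:
        conn.execute(
            "INSERT INTO patient (patient_id, created_by) VALUES (100, 'admin')"
        )
    # The same call handles the initial insert and a later "update".
    save_patient_detail(conn, 100, "Ann", "Smith", "2001-05-04", "nurse1")
    save_patient_detail(conn, 100, "Ann", "Jones", "2001-05-04", "nurse2")
    return conn
```

After demo() runs, patient 100 has two patient_detail rows: the original with entity_status = 0 and the edited one with entity_status = 1. Wrapping both statements in a single transaction matters, because a failure between them would otherwise leave the patient with no live row at all.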
It's a very simple system which has been working flawlessly on a busy health records system for the past 18 months. Periodically, you may want to spool very old inactive records to offline storage and delete them from the live database. That will maintain your HIPAA compliance while freeing up disk space and increasing query speed.
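The spooling step could look something like the sketch below, again using Python's sqlite3 module as a stand-in database. The archive_inactive helper, the CSV output format and the cutoff date are all assumptions for illustration. How long archived rows must be retained before they leave the live system is a policy question the code does not answer.

```python
import csv
import io
import sqlite3


def archive_inactive(conn, cutoff, out):
    """Hypothetical helper: write archived rows older than `cutoff` to `out`
    as CSV, then delete them from the live table. Returns the row count."""
    where = "entity_status = 0 AND create_date < ?"
    with conn:  # the DELETE is only reached if the export raised no error
        cur = conn.execute(
            f"SELECT * FROM patient_detail WHERE {where} "
            "ORDER BY patient_detail_id",
            (cutoff,),
        )
        writer = csv.writer(out)
        writer.writerow([col[0] for col in cur.description])  # header row
        rows = cur.fetchall()
        writer.writerows(rows)
        conn.execute(f"DELETE FROM patient_detail WHERE {where}", (cutoff,))
    return len(rows)


def demo():
    conn = sqlite3.connect(":memory:")
    # Assumption: a cut-down patient_detail table with ISO-8601 date strings,
    # which compare correctly as plain text.
    conn.execute(
        "CREATE TABLE patient_detail ("
        " patient_detail_id INTEGER PRIMARY KEY,"
        " patient_id INTEGER NOT NULL,"
        " last_name TEXT,"
        " create_date TEXT NOT NULL,"
        " entity_status INTEGER DEFAULT 1 CHECK (entity_status IN (0, 1)))"
    )
    conn.executemany(
        "INSERT INTO patient_detail"
        " (patient_id, last_name, create_date, entity_status)"
        " VALUES (?, ?, ?, ?)",
        [
            (100, "Smith", "2015-01-10", 0),  # old and archived: spooled out
            (100, "Jones", "2023-06-01", 0),  # archived but recent: stays
            (100, "Brown", "2024-02-15", 1),  # live row: never touched
        ],
    )
    conn.commit()
    out = io.StringIO()  # stands in for a file on offline storage
    moved = archive_inactive(conn, "2020-01-01", out)
    return conn, out.getvalue(), moved
```

Only rows that are both archived and older than the cutoff are exported and removed. The live row and the recent history stay available to queries.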
