The Learning Healthcare (Data) System: Virtual Data Warehouse Data Capture Revisited
data capture, population-based research
Background/Aims: At the 2014 HCSRN annual meeting, Bachman and colleagues presented an excellent investigation into rates of encounters and drug fills at Virtual Data Warehouse (VDW) sites in order to evaluate (among other things) the VDW enrollment file’s “OUTSIDE_UTILIZATION” field, which purported to flag periods during which capture of pharmacy or encounter data was suspected to be incomplete. That investigation revealed serious problems with the flag, calling its usefulness into question. Taking this to heart, the VDW enrollment workgroup proposed removing this field and adding a suite of six new flags intended to express confidence in the capture of pharmacy, laboratory, outpatient encounter, inpatient encounter, tumor and electronic medical record data individually. These flags are assigned by local VDW analysts on the basis of their knowledge of data capture limitations at their respective sites for identifiable subgroups of patients. VDW programs to create these new data incompleteness variables were written and tested. All HCSRN sites were invited to run these programs and share their results.
Methods: Following Bachman et al.’s approach, we calculated rates of pharmacy fills, lab results, encounters, tumor records and vital signs, stratified by the appropriate new flag. We then plotted these rates over time to see whether the people and periods flagged as having suspect data capture did in fact have lower rates than those not flagged.
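The rate comparison described in the Methods can be illustrated with a minimal sketch. It is not the VDW program the authors ran (the abstract does not show it). The record layouts, the flag values "complete" and "suspect", the function name `rates_by_flag`, and the choice of fills per 1,000 person-months as the rate are all assumptions made for illustration, and the sketch further assumes one flag value per person per year.

```python
from collections import defaultdict


def rates_by_flag(enrollment, fills):
    """Return {(year, flag): fills per 1,000 person-months}.

    enrollment: iterable of (person_id, year, flag, months_enrolled)
    fills:      iterable of (person_id, year), one tuple per pharmacy fill
    Both layouts are hypothetical, not the actual VDW file specifications.
    """
    person_months = defaultdict(int)  # denominator per (year, flag)
    flag_of = {}                      # (person_id, year) -> flag
    for person_id, year, flag, months in enrollment:
        person_months[(year, flag)] += months
        flag_of[(person_id, year)] = flag

    counts = defaultdict(int)         # numerator per (year, flag)
    for person_id, year in fills:
        flag = flag_of.get((person_id, year))
        if flag is not None:          # skip fills with no matching enrollment
            counts[(year, flag)] += 1

    return {
        key: 1000 * counts[key] / months
        for key, months in person_months.items()
        if months > 0
    }


# Made-up example data: two people flagged "complete", one flagged "suspect".
enrollment = [
    ("a", 2014, "complete", 12),
    ("b", 2014, "complete", 12),
    ("c", 2014, "suspect", 12),
]
fills = [("a", 2014)] * 6 + [("b", 2014)] * 6 + [("c", 2014)] * 3

rates = rates_by_flag(enrollment, fills)
for (year, flag), rate in sorted(rates.items()):
    print(year, flag, rate)
```

If a flag works as intended, the rate for the suspect group should sit visibly below the rate for the complete group in each period, which is the pattern the plots were meant to reveal.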
Results: At the sites that implemented the flags, data capture rates generally varied in line with expectations: groups with suspected incomplete capture had markedly lower rates. Of the six flags, “incomplete_rx” was implemented most successfully, with all seven implementing sites showing clear distinctions between people whose data capture was suspect and those for whom it was not. “Incomplete_tumor” had the most variable implementations, with clear distinctions at some sites but not others.
Conclusion: On balance, the new flags stand to improve the quality of data-based research in the HCSRN. Projects needing to define populations at risk of exposure to particular pharmacy fills, tumors or lab result values, for example, would do well to use the new flags to screen out people for whom exposure risk may not be completely captured.
Pardee RE, Bachman D, Hornbrook MC, Cleveland CR, Mathur P, Ng D, Aumer SM, Harding WH, Jordan C, Meier J, Wong CC, Hoch BA. The learning healthcare (data) system: Virtual Data Warehouse data capture revisited. J Patient Cent Res Rev. 2016;3:226-7.