Article Title

Direction of Type 2 Diabetes Mellitus Progression

Publication Date



data mining, predictive model, progression of type 2 diabetes mellitus


Background/Aims: Because type 2 diabetes mellitus (T2DM) is progressive, deterioration in a patient’s general health is inevitable, and the overall health of T2DM patients tends to be poor. Markers of worsening overall health, such as metabolic abnormalities, dominate the progression to every T2DM complication. They mask the patterns specific to individual complications, making it difficult to predict which complication a patient will progress to (i.e., the direction of progression). We aim to unmask these patterns and identify the most distinctive and predictive markers of progression to each individual complication.

Methods: We take a two-step modeling approach using 10 years of de-identified data on 94,536 T2DM patients from the OptumLabs™ Data Warehouse, a database that includes retrospective administrative claims data on commercially insured and Medicare Advantage enrollees, as well as electronic medical record data from a nationwide network of provider groups. First, we build a “general progression” model that predicts the development of any new complication and captures markers of general health deterioration. Next, we develop “differential progression” models that predict progression to individual complications. We use least absolute shrinkage and selection operator (LASSO)-penalized Cox regressions to select variables, and we penalize the “differential progression” models against the “general progression” model so that “differential progression” patterns become unmasked. The significance of markers is determined via permutation tests, and predictive performance is evaluated through bootstrap estimation.
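
The two-step idea can be illustrated with a minimal sketch. The abstract does not give the exact form of the penalty, so this sketch assumes one plausible reading: the general model is an ordinary LASSO-penalized Cox regression, and each differential model carries an L1 penalty on the difference between its coefficients and the general model's coefficients. The function names, the fixed-step proximal gradient solver, the tuning values, and the synthetic data are all hypothetical and are not taken from the study; tied event times are ignored for simplicity.

```python
import numpy as np

def cox_neg_loglik_grad(beta, X, time, event):
    """Average negative Cox partial log-likelihood and gradient (ties ignored)."""
    order = np.argsort(-time)              # descending time: risk set is a prefix
    X, event = X[order], event[order]
    eta = X @ beta
    shifted = eta - eta.max()              # shift for numerical stability
    w = np.exp(shifted)
    cum_w = np.cumsum(w)                   # sum of exp(eta) over each risk set
    cum_wx = np.cumsum(w[:, None] * X, axis=0)
    n = len(event)
    loss = -np.sum(event * (shifted - np.log(cum_w))) / n
    grad = -np.sum(event[:, None] * (X - cum_wx / cum_w[:, None]), axis=0) / n
    return loss, grad

def soft_threshold(z, t):
    """Proximal operator of t * ||z||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fit_penalized_cox(X, time, event, lam, beta_ref=None, step=0.1, n_iter=500):
    """Proximal gradient descent for: Cox loss + lam * ||beta - beta_ref||_1.

    With beta_ref = 0 this is a plain LASSO Cox fit. With beta_ref set to the
    general model's coefficients (an assumption of this sketch), coefficients
    are shrunk toward the general model instead of toward zero.
    """
    p = X.shape[1]
    beta_ref = np.zeros(p) if beta_ref is None else np.asarray(beta_ref, float)
    beta = beta_ref.copy()
    for _ in range(n_iter):
        _, grad = cox_neg_loglik_grad(beta, X, time, event)
        z = beta - step * grad
        beta = beta_ref + soft_threshold(z - beta_ref, step * lam)
    return beta

# Synthetic, purely illustrative data (not from the study).
rng = np.random.default_rng(0)
n, p = 400, 3
X = rng.standard_normal((n, p))
# Hypothetical truth: feature 0 is a general deterioration marker, feature 1
# is specific to complication A, and feature 2 is specific to complication B.
time_a = rng.exponential(1.0 / np.exp(X[:, 0] + X[:, 1]))
time_b = rng.exponential(1.0 / np.exp(X[:, 0] + X[:, 2]))
time_any = np.minimum(time_a, time_b)      # time to the first complication
event = np.ones(n)                         # no censoring in this toy example

# Step 1: general progression model (any new complication).
beta_general = fit_penalized_cox(X, time_any, event, lam=0.02)
# Step 2: differential model for complication A, shrunk toward the general model.
beta_diff_a = fit_penalized_cox(X, time_a, event, lam=0.02, beta_ref=beta_general)
# Nonzero entries mark where complication A departs from general progression.
deviation_a = beta_diff_a - beta_general
```

Under this assumed formulation, a coefficient of a differential model differs from the general model only when the data for that complication support the difference strongly enough to overcome the penalty, which is one way the complication-specific markers could be separated from markers of general deterioration.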

Results: Over a 6-year follow-up period, 31,968 (34%) of all patients progressed to at least one complication: chronic kidney disease (12%), ischemic heart disease (11%), cerebrovascular disease (8%), congestive heart failure (5%), peripheral vascular disease (5%), and renal failure (1%). In this ongoing research, we anticipate that our predictive models will improve the interpretability of progression to each individual complication without reducing predictive performance.

Conclusion: We study the progression of T2DM to multiple complications systematically and comprehensively by applying data mining techniques to the large volume of claims and clinical data in the OptumLabs Data Warehouse. By unmasking previously masked patterns and identifying markers that suggest progression to a specific target complication, our findings are expected to assist clinicians in the management and treatment of T2DM.