Article Title

Direct Adjustment of Obesity Estimates in the Colorado BMI Monitoring System

Publication Date



Keywords

geographic information systems, observational studies, biostatistics, epidemiology


Background: In the Colorado BMI Monitoring System, electronic health record body mass index (BMI) data from participating health care organizations are provided to the Colorado Department of Public Health and Environment (CDPHE) and combined to produce BMI estimates at the census tract level. This system provides estimates with more geographic specificity than national surveys; however, population representativeness is a limitation. Because the data are drawn only from members of five health care organizations, selection bias is possible.

Methods: We applied direct adjustment based on gender, race/ethnicity, and age to estimate BMI and overweight/obesity prevalence. To avoid restricting the analysis to complete cases, missing race/ethnicity data were imputed using hot decking within classes defined by census tract, gender, and age. Raking, an iterative method of marginal weight adjustment, was used to create the sample weights for the directly adjusted estimates by census tract. Hot decking and raking were performed using modified SAS macros developed by Abt Associates (Cambridge, MA).
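The raking and weighted-estimation steps described above can be illustrated with a minimal Python sketch. It is not the modified Abt Associates SAS macro used in the study; the function names (`rake`, `weighted_prevalence`), the convergence settings, the two raking variables, and the toy tract data and population margins are all assumptions made for illustration.

```python
from collections import defaultdict


def rake(records, margins, max_iter=100, tol=1e-8):
    """Rake per-record weights to population margins (hypothetical helper).

    records: list of dicts with one key per raking variable.
    margins: {variable: {category: population total}}; every variable's
             totals must sum to the same population size.
    """
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            totals = defaultdict(float)
            for rec, w in zip(records, weights):
                totals[rec[var]] += w
            # Rescale so this variable's weighted margin matches its target.
            for i, rec in enumerate(records):
                factor = targets[rec[var]] / totals[rec[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all margins matched within tolerance
            break
    return weights


def weighted_prevalence(records, weights, flag):
    """Weighted share of records for which rec[flag] is true."""
    return sum(w for rec, w in zip(records, weights) if rec[flag]) / sum(weights)


# Toy sample for a single census tract (illustrative values only).
sample = [
    {"gender": "F", "age": "18-44", "obese": False},
    {"gender": "F", "age": "18-44", "obese": False},
    {"gender": "F", "age": "45+", "obese": True},
    {"gender": "M", "age": "18-44", "obese": False},
    {"gender": "M", "age": "45+", "obese": True},
    {"gender": "M", "age": "45+", "obese": True},
]
# Assumed population margins for the tract (illustrative values only).
tract_margins = {
    "gender": {"F": 500, "M": 500},
    "age": {"18-44": 600, "45+": 400},
}

weights = rake(sample, tract_margins)
crude = sum(rec["obese"] for rec in sample) / len(sample)
adjusted = weighted_prevalence(sample, weights, "obese")
print(f"crude={crude:.3f} adjusted={adjusted:.3f}")
```

Because raking only needs the marginal totals for each variable, the sketch never requires a population count for every gender-by-age cell. In the toy sample the older age group is over-represented relative to the assumed margins, so the adjusted prevalence falls below the crude one.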

Results: Processes were developed using one site’s 2012–2014 data (N = 479,960) and are being tested on data from all sites at CDPHE. Missing race/ethnicity was imputed for 13.2% of members, and imputation failed for only 0.1%. Individual weights were generated through raking for 99.5% of individuals with a recent BMI measurement. For 658 of 668 Denver-metro census tracts, crude and adjusted estimates were similar (Pearson r = 0.981, P < 0.01), and the median absolute difference between crude and adjusted adult obesity prevalence estimates was 0.05 (interquartile range: -0.54–0.80).

Conclusion: It is feasible to apply direct standardization to large data systems with many geographic units. Imputation via hot decking is appealing because it has been used in large government and public health surveys, it is effective with a limited set of demographic variables, and it provides a reasonable estimate of a variable's distribution by drawing from observed values. Raking is an advantageous weighting method in direct adjustment because it avoids empty or small cells and requires only marginal population estimates for each demographic group. Overall, adjustment changed census tract obesity prevalence estimates only slightly (most absolute differences between crude and adjusted estimates were within 1% in either direction), and adjusted estimates had more conservative confidence limits.
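The idea of hot decking as drawing from observed values can be sketched in Python as follows. This is an illustrative simplification, not the modified SAS macro used in the study: the `hot_deck` function, the random donor selection, the rule that imputation fails when a class has no donor, and the toy records are all assumptions.

```python
import random
from collections import defaultdict


def hot_deck(records, target, class_vars, seed=0):
    """Fill missing `target` values from donors in the same class.

    Hypothetical helper: a donor is any record in the same imputation
    class (same values of class_vars) with an observed target value.
    Returns (imputed copies of the records, count of failed imputations).
    """
    rng = random.Random(seed)
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[tuple(rec[v] for v in class_vars)].append(rec[target])

    imputed, failed = [], 0
    for rec in records:
        new = dict(rec)  # leave the input records unchanged
        if new[target] is None:
            pool = donors.get(tuple(new[v] for v in class_vars))
            if pool:
                new[target] = rng.choice(pool)  # draw an observed value
                new["imputed"] = True
            else:
                failed += 1  # no donor in this class (assumed failure rule)
        imputed.append(new)
    return imputed, failed


# Toy records (illustrative values only).
people = [
    {"tract": "A", "gender": "F", "age": "18-44", "race": "Hispanic"},
    {"tract": "A", "gender": "F", "age": "18-44", "race": "White"},
    {"tract": "A", "gender": "F", "age": "18-44", "race": None},
    {"tract": "B", "gender": "M", "age": "45+", "race": None},
]
filled, n_failed = hot_deck(people, "race", ["tract", "gender", "age"])
print(filled[2]["race"], n_failed)
```

Every imputed value is one that was actually observed in the same class, so the imputed distribution stays within the range of the data. Records in classes with no donor remain missing and are counted as failures in this sketch.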