Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond

MJ Pencina, RB D'Agostino Sr… - Statistics in …, 2008 - Wiley Online Library
Statistics in Medicine, 2008, Wiley Online Library
Abstract
Identification of key factors associated with the risk of developing cardiovascular disease and quantification of this risk using multivariable prediction algorithms are among the major advances made in preventive cardiology and cardiovascular epidemiology in the 20th century. The ongoing discovery of new risk markers by scientists presents opportunities and challenges for statisticians and clinicians to evaluate these biomarkers and to develop new risk formulations that incorporate them. One of the key questions is how best to assess and quantify the improvement in risk prediction offered by these new models. Demonstration of a statistically significant association of a new biomarker with cardiovascular risk is not enough. Some researchers have advanced that the improvement in the area under the receiver‐operating‐characteristic curve (AUC) should be the main criterion, whereas others argue that better measures of performance of prediction models are needed. In this paper, we address this question by introducing two new measures, one based on integrated sensitivity and specificity and the other on reclassification tables. These new measures offer incremental information over the AUC. We discuss the properties of these new measures and contrast them with the AUC. We also develop simple asymptotic tests of significance. We illustrate the use of these measures with an example from the Framingham Heart Study. We propose that scientists consider these types of measures in addition to the AUC when assessing the performance of newer biomarkers. Copyright © 2007 John Wiley & Sons, Ltd.
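The two measures described above are commonly known as the net reclassification improvement (NRI, based on reclassification tables) and the integrated discrimination improvement (IDI, based on integrated sensitivity and specificity). The abstract gives no formulas, so the following is only a minimal sketch using the commonly cited definitions. The function names, the toy probabilities, and the risk cutoffs of 6% and 20% are illustrative assumptions, not values taken from the text above.

```python
from bisect import bisect_right
from statistics import mean


def risk_category(p, cutoffs):
    """Return the index of the risk category that probability p falls into."""
    return bisect_right(cutoffs, p)


def nri(p_old, p_new, event, cutoffs):
    """Category-based NRI: net upward movement among events
    minus net upward movement among non-events."""
    up_e = down_e = up_n = down_n = 0
    n_events = sum(event)
    n_nonevents = len(event) - n_events
    for po, pn, y in zip(p_old, p_new, event):
        move = risk_category(pn, cutoffs) - risk_category(po, cutoffs)
        if y:
            up_e += move > 0
            down_e += move < 0
        else:
            up_n += move > 0
            down_n += move < 0
    return (up_e - down_e) / n_events - (up_n - down_n) / n_nonevents


def idi(p_old, p_new, event):
    """IDI: mean change in predicted risk among events
    minus mean change in predicted risk among non-events."""
    gain_events = mean(pn - po for po, pn, y in zip(p_old, p_new, event) if y)
    gain_nonevents = mean(pn - po for po, pn, y in zip(p_old, p_new, event) if not y)
    return gain_events - gain_nonevents


# Hypothetical toy data: predicted risks from a model without and with a new marker.
p_old = [0.05, 0.15, 0.25, 0.10, 0.30, 0.04]
p_new = [0.08, 0.25, 0.30, 0.04, 0.15, 0.02]
event = [1, 1, 1, 0, 0, 0]      # 1 = developed the outcome, 0 = did not
cutoffs = (0.06, 0.20)          # assumed category boundaries, for illustration only

print("NRI:", nri(p_old, p_new, event, cutoffs))
print("IDI:", idi(p_old, p_new, event))
```

Positive values of either measure indicate that the model including the new marker assigns higher risk to those who experience events and lower risk to those who do not. The asymptotic significance tests mentioned in the abstract are not reproduced in this sketch.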