Area under the receiver operating characteristic (ROC) curve (AUC): A measure of the discrimination of a logistic regression model. The ROC curve is the plot of sensitivity versus one minus specificity over all possible thresholds of predicted probability. The area under the ROC curve is numerically equivalent to the c-statistic for a binary outcome.
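As a minimal sketch of the definition above, the following traces the ROC curve over all thresholds and applies the trapezoidal rule; the labels and predicted probabilities are illustrative:

```python
def roc_auc(y_true, scores):
    """Compute AUC by tracing sensitivity (TPR) versus one minus
    specificity (FPR) over all thresholds, then applying the
    trapezoidal rule to the resulting piecewise-linear curve."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # (FPR, TPR), starting at the origin
    for t in sorted(set(scores), reverse=True):
        tpr = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t) / n_pos
        fpr = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t) / n_neg
        points.append((fpr, tpr))
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2  # trapezoid for each segment
    return auc

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```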
Bayesian nonparametric methods: An approach to model selection that allows the data to determine the complexity of the model. In an infinite-dimensional parameter space, a Bayesian nonparametric model uses only a finite subset of the available dimensions to explain a sample of observations, with the complexity of the model adapting to the sample data.
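One way to see complexity adapting to the data is the Chinese restaurant process, the clustering behavior induced by a Dirichlet process prior: each observation joins an existing cluster with probability proportional to its size or opens a new one, so only finitely many of the infinitely many potential clusters are ever used. A minimal sketch with illustrative parameters:

```python
import random

def crp(n, alpha, rng):
    """Simulate Chinese restaurant process cluster assignments for n
    observations; alpha controls the tendency to open new clusters."""
    counts = []  # size of each cluster used so far
    for _ in range(n):
        r = rng.random() * (sum(counts) + alpha)
        acc = 0.0
        for k in range(len(counts)):
            acc += counts[k]
            if r < acc:
                counts[k] += 1  # join an existing cluster
                break
        else:
            counts.append(1)  # open a new cluster
    return counts

counts = crp(100, alpha=2.0, rng=random.Random(0))
# Only a small, finite number of clusters is used for 100 observations.
print(len(counts), sum(counts))
```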
C-statistic: A measure of the discriminative ability of a logistic regression model. The concordance (or c) statistic is a unitless index denoting the probability that a randomly selected subject who experienced the outcome will have a higher predicted probability of the outcome than a randomly selected subject who did not experience it.
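The concordance probability can be computed directly over all (case, non-case) pairs, with ties counted as one half; on the same illustrative data as above it matches the trapezoidal AUC, reflecting the numerical equivalence noted in the AUC entry:

```python
def c_statistic(y_true, probs):
    """Fraction of (case, non-case) pairs in which the case received
    the higher predicted probability; ties contribute one half."""
    cases = [p for y, p in zip(y_true, probs) if y == 1]
    controls = [p for y, p in zip(y_true, probs) if y == 0]
    concordant = 0.0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                concordant += 1.0
            elif pc == pn:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))

print(c_statistic([1, 1, 0, 0], [0.8, 0.6, 0.5, 0.7]))  # → 0.75
```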
Effect modification: Occurs when the magnitude of the effect of the primary treatment or exposure on an outcome differs depending on the level of a third variable (e.g., patient characteristics). In the presence of effect modification, the use of an overall effect estimate is inappropriate.
Ensemble learning: A type of machine learning approach that combines multiple learning algorithms or models to predict an outcome to obtain better model performance than any of the individual models.
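A simple averaging ensemble illustrates the idea; because squared error is convex, the mean squared error of the averaged prediction can never exceed the average of the individual models' errors. The targets and base-model predictions below are hypothetical:

```python
def mse(preds, targets):
    """Mean squared error of a set of predictions."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

targets = [3.0, -1.0, 2.0, 0.0]
model_preds = [
    [2.5, -0.5, 3.0, 0.5],   # hypothetical model A
    [3.5, -2.0, 1.0, -0.5],  # hypothetical model B
    [2.0, -1.5, 2.5, 1.0],   # hypothetical model C
]

# Averaging ensemble: mean of the base-model predictions at each point
ensemble = [sum(ps) / len(ps) for ps in zip(*model_preds)]

avg_individual_mse = sum(mse(p, targets) for p in model_preds) / len(model_preds)
ensemble_mse = mse(ensemble, targets)

# By convexity of squared error (Jensen's inequality), the ensemble's MSE
# never exceeds the average MSE of its members.
print(ensemble_mse <= avg_individual_mse)  # → True
```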
Genome-wide association study (GWAS): An observational study of a genome-wide set of genetic variants in different individuals to examine associations of variants with an outcome or trait.
Heterogeneous treatment effects (HTE): Nonrandom variability in the direction or magnitude of a treatment effect, measured using clinical outcomes. HTE is fundamentally a scale-dependent concept and therefore, for clarity, the scale should generally be specified.
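The scale dependence can be shown with simple arithmetic: a treatment effect that is perfectly homogeneous on the relative scale is heterogeneous on the absolute scale whenever baseline risks differ. The subgroup risks below are hypothetical:

```python
# The treatment halves the outcome risk in every subgroup (constant RR),
# but absolute risk reductions differ because baseline risks differ.
relative_risk = 0.5  # hypothetical, identical in both subgroups
baseline_risks = {"high_risk": 0.40, "low_risk": 0.10}

for group, p_control in baseline_risks.items():
    p_treated = p_control * relative_risk
    arr = p_control - p_treated  # absolute risk reduction
    print(f"{group}: RR = {relative_risk}, ARR = {arr:.2f}")

# On the relative scale the effect is homogeneous (RR = 0.5 in both groups);
# on the absolute scale it is heterogeneous (ARR 0.20 vs. 0.05).
```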
Net benefit: A decision analytic measure that puts benefits and harms on the same scale. This is achieved by specifying an exchange rate based on the relative value of the benefits and harms associated with interventions. The exchange rate is related to the probability threshold used to determine whether a patient is classified as positive or negative for a model outcome, or (when applied to trial analysis) as treatment-favorable versus treatment-unfavorable.
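In the standard formulation, net benefit at probability threshold pt is the true-positive rate minus the false-positive rate weighted by the exchange rate pt / (1 − pt). A sketch with hypothetical counts:

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit at a given probability threshold: true positives
    minus false positives weighted by the exchange rate
    threshold / (1 - threshold), per patient."""
    exchange_rate = threshold / (1 - threshold)
    return tp / n - (fp / n) * exchange_rate

# Hypothetical counts: 100 patients, 20 true positives and 10 false
# positives when classifying at a threshold of 0.2
print(net_benefit(tp=20, fp=10, n=100, threshold=0.2))  # → 0.175
```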
Overfitting: A key threat to the validity of a model when predictions do not generalize to new subjects outside the sample under study. Overfitting occurs when a model conforms too closely to the idiosyncrasies or “noise” of the limited data sample on which it is derived.
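A one-nearest-neighbor classifier makes the failure mode concrete: it conforms exactly to the sample it was derived on (perfect apparent accuracy) yet generalizes poorly to new subjects because it has memorized the noise. The simulated data below are illustrative:

```python
import random

def one_nn_predict(train, x):
    """Predict using the label of the single nearest training point."""
    return min(train, key=lambda pt: abs(pt[0] - x))[1]

def make_data(n, rng):
    """Noisy binary labels: y = 1 when x > 0.5, flipped 30% of the time."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.3:
            y = 1 - y  # label noise
        data.append((x, y))
    return data

rng = random.Random(0)
train, test = make_data(50, rng), make_data(200, rng)

train_acc = sum(one_nn_predict(train, x) == y for x, y in train) / len(train)
test_acc = sum(one_nn_predict(train, x) == y for x, y in test) / len(test)

# Apparent (training) accuracy is perfect: each point is its own nearest
# neighbor, so the model reproduces the sample's noise exactly.
# Accuracy on new subjects is substantially lower.
print(train_acc, test_acc)
```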
Penalized regression: A set of regression methods, developed to prevent overfitting, in which the coefficients assigned to covariates are penalized for model complexity. Penalized regression is sometimes referred to as shrinkage or regularization. Examples of penalized regression include LASSO, ridge, and elastic net regularization.
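The shrinkage effect is easiest to see in the one-predictor, no-intercept case, where the ridge estimate has a closed form and the penalty pulls the coefficient toward zero. A minimal sketch with hypothetical data:

```python
def ridge_1d(xs, ys, lam):
    """Ridge estimate for a single no-intercept predictor: minimizes
    sum((y - b*x)^2) + lam * b^2, giving b = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.8]

ols = ridge_1d(xs, ys, lam=0.0)     # no penalty: ordinary least squares
shrunk = ridge_1d(xs, ys, lam=5.0)  # penalized coefficient, closer to zero
print(round(ols, 3), round(shrunk, 3))
```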
Predictive analytics: A field encompassing a variety of statistical methods, including prediction modeling, machine learning, and data mining, that make use of existing data to predict future events.
Reference class: A group of similar cases that is used to make predictions for an individual case of interest. The “reference class problem” refers to the fact that there are an indefinite number of different ways to define similarity.
Regression tree-based methods: Algorithms that use a recursive partitioning approach to predict categorical (classification tree) or continuous (regression tree) outcomes.
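A single step of recursive partitioning for a continuous outcome can be sketched as follows: scan candidate thresholds on one predictor and keep the split that minimizes the combined sum of squared errors of the two resulting groups. The data are hypothetical:

```python
def sse(ys):
    """Sum of squared errors around the group mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys):
    """One recursive-partitioning step: try midpoints between adjacent
    x values and return the threshold with the lowest combined SSE."""
    best_cost, best_t = float("inf"), None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_cost, best_t = cost, t
    return best_t

xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
print(best_split(xs, ys))  # → 3.5, separating the low and high groups
```

A full tree applies this step recursively within each resulting group until a stopping rule is met, predicting the group mean in each terminal leaf.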
Subgroup analysis: An analysis that examines whether specific patient characteristics modify the effects of treatment on an outcome.