Conditional validation sampling for consistent risk estimation with binary outcome data subject to misclassification
Purpose: Misclassification of a binary outcome can introduce bias in estimation of the odds-ratio associated with an exposure of interest in pharmacoepidemiology research. It has been previously demonstrated that utilizing information from an internal randomly selected validation sample can help mitigate this bias.

Methods: Using a Monte Carlo simulation-based approach, we study the properties of misclassification bias-adjusted odds-ratio estimators in a contingency table setting. We consider two methods of internal validation sampling; namely, simple random sampling and sampling conditional on the original (possibly incorrect) outcome status. Additional simulation studies are conducted to investigate these sampling approaches in a multi-table setting.

Results: We demonstrate that conditional validation sampling, across a range of subsampling fractions, can produce better estimates than those based on an unconditional simple random sample. This approach allows for greater flexibility in the chosen categorical composition of the validation data, as well as the potential for obtaining a more efficient estimator of the odds-ratio. We further demonstrate that this relationship holds for the Mantel-Haenszel misclassification bias-adjusted odds-ratio in stratified samples. Recommendations for the choice of validation subsampling fraction are also provided.

Conclusions: Careful consideration when choosing the sampling scheme used to draw internal validation samples can improve the properties of the outcome misclassification bias-adjusted odds-ratio estimator in a (multiple) contingency table.
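To make the setting concrete, the following is a minimal Python sketch of one simulated replicate with validation sampling conditional on the observed outcome. It is not the authors' code or necessarily their estimator: the adjustment shown is a generic predictive-value correction, and all names and parameter values (`simulate_cohort`, `adjusted_or_conditional`, the sensitivity of 0.8, the specificity of 0.95, the validation size of 4000 per observed status) are assumptions chosen for illustration.

```python
import numpy as np

def simulate_cohort(n, true_or, p0, sens, spec, rng):
    """Draw exposure, true outcome, and a misclassified (observed) outcome."""
    exposure = rng.random(n) < 0.5
    odds1 = true_or * p0 / (1 - p0)
    p1 = odds1 / (1 + odds1)
    y_true = rng.random(n) < np.where(exposure, p1, p0)
    # Nondifferential misclassification (an assumption of this sketch).
    y_obs = rng.random(n) < np.where(y_true, sens, 1 - spec)
    return exposure, y_true, y_obs

def odds_ratio(p1, p0):
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

def adjusted_or_conditional(exposure, y_true, y_obs, m_per_status, rng):
    """Predictive-value adjustment using a validation sample drawn
    conditionally on the observed outcome status."""
    validated = np.zeros(exposure.size, dtype=bool)
    for s in (False, True):
        idx = np.flatnonzero(y_obs == s)
        chosen = rng.choice(idx, size=min(m_per_status, idx.size), replace=False)
        validated[chosen] = True
    risk = []
    for e in (False, True):
        est_cases = 0.0
        for s in (False, True):
            cell = (exposure == e) & (y_obs == s)
            # The true outcome is consulted only for validated subjects.
            frac_true = y_true[cell & validated].mean()
            est_cases += cell.sum() * frac_true
        risk.append(est_cases / (exposure == e).sum())
    return odds_ratio(risk[1], risk[0])

rng = np.random.default_rng(2019)
exposure, y_true, y_obs = simulate_cohort(
    n=200_000, true_or=2.0, p0=0.1, sens=0.8, spec=0.95, rng=rng
)
naive_or = odds_ratio(y_obs[exposure].mean(), y_obs[~exposure].mean())
adjusted_or = adjusted_or_conditional(exposure, y_true, y_obs, 4000, rng)
print(f"naive OR: {naive_or:.2f}, adjusted OR: {adjusted_or:.2f}")
```

Under these assumed settings the naive odds-ratio is attenuated toward the null, while the adjusted estimate should fall closer to the simulated value of 2.0. A Monte Carlo study would repeat this over many replicates and over a range of validation sizes.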
|Keywords||contingency tables, misclassification bias, misclassified binary data, pharmacoepidemiology, validation sampling|
|Journal||Pharmacoepidemiology and Drug Safety|
Gravel, C.A. (Christopher A.), Farrell, P., & Krewski, D. (Daniel). (2019). Conditional validation sampling for consistent risk estimation with binary outcome data subject to misclassification. Pharmacoepidemiology and Drug Safety, 28(2), 227–233. doi:10.1002/pds.4701