Logistic Regression With Misclassified Covariates Using Auxiliary Data

Date

2009-09-16T18:19:11Z

Publisher

Mathematics

Abstract

When standard regression methods are used, measurement errors bias the parameter estimates. For discrete covariates, such measurement errors are known as misclassification, and the affected covariate is said to be misclassified. Even when the collected data are not fully reliable, it may be possible to collect a subset with full precision, called auxiliary data; the remaining data, called primary data, may contain misclassification. In this paper, we propose a maximum likelihood method that uses the auxiliary data to improve predictions from the misclassified primary data and to correct the bias in the parameter estimates. For the simplified model, we consider a primary data sample with binary response Y and discrete covariate W, which is a misclassified version of the true latent variable X. In addition, an auxiliary data set in which both W and X are observed is used to adjust for the bias due to misclassification in W. First, the model parameters are shown to be non-identifiable. To resolve this problem, we substitute the estimated misclassification probabilities between X and W into the model. Under some regularity conditions, the resulting estimator is shown to be consistent and asymptotically normal. Monte Carlo simulations indicate that the method works well in finite samples.
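The plug-in idea described in the abstract can be illustrated with a small simulation. This is only a sketch under assumptions that the abstract does not state and should not be read as the authors' estimator: X and W are both taken to be binary, misclassification is assumed non-differential (Y is independent of W given X), the auxiliary sample is assumed to come from the same population as the primary sample, and the plugged-in quantity is taken to be P(X = 1 | W = w), estimated by simple proportions. All function names, sample sizes, and parameter values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(0)

def simulate(n, beta, p_x=0.4, sens=0.8, fpr=0.1):
    """Draw (Y, W, X): X is the true binary covariate, W its misclassified version.
    p_x, sens and fpr are hypothetical values chosen for illustration."""
    x = rng.binomial(1, p_x, size=n)
    w = rng.binomial(1, np.where(x == 1, sens, fpr))   # non-differential error
    y = rng.binomial(1, expit(beta[0] + beta[1] * x))  # logistic model in X
    return y, w, x

def estimate_q(w_aux, x_aux):
    """Estimate q[w] = P(X = 1 | W = w) from auxiliary data where both are seen."""
    return np.array([x_aux[w_aux == k].mean() for k in (0, 1)])

def neg_loglik(beta, y, w, q):
    """Average negative log-likelihood of Y given W, with q held fixed.
    P(Y=1 | W=w) = (1 - q[w]) * P(Y=1 | X=0) + q[w] * P(Y=1 | X=1)."""
    p0 = expit(beta[0])
    p1 = expit(beta[0] + beta[1])
    p_y1 = (1.0 - q[w]) * p0 + q[w] * p1
    p_y1 = np.clip(p_y1, 1e-12, 1.0 - 1e-12)  # guard the logarithms
    return -np.mean(y * np.log(p_y1) + (1 - y) * np.log(1.0 - p_y1))

def fit(y, w, q):
    """Maximize the likelihood over beta for a given set of plugged-in q."""
    result = minimize(neg_loglik, x0=np.zeros(2), args=(y, w, q), method="BFGS")
    return result.x

beta_true = np.array([-1.0, 2.0])

# Primary data: only (Y, W) are used. Auxiliary data: only (W, X) are used.
y_prim, w_prim, _ = simulate(50_000, beta_true)
_, w_aux, x_aux = simulate(5_000, beta_true)

# Naive fit treats W as if it were X, i.e. q = (0, 1).
beta_naive = fit(y_prim, w_prim, np.array([0.0, 1.0]))

# Corrected fit plugs in the probabilities estimated from the auxiliary data.
q_hat = estimate_q(w_aux, x_aux)
beta_corrected = fit(y_prim, w_prim, q_hat)

print("true     :", beta_true)
print("naive    :", beta_naive)
print("corrected:", beta_corrected)
```

In this binary setting the primary data supply only two observable proportions, P(Y = 1 | W = 0) and P(Y = 1 | W = 1), so the regression coefficients and the misclassification probabilities cannot all be recovered from the primary data alone. Fixing the latter at their auxiliary-data estimates leaves two unknowns for two proportions, which is one way to see why a plug-in step restores identifiability.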
