We developed a flexible non-parametric Bayesian model for regional disease-prevalence estimation based on cross-sectional data that are obtained from several subpopulations or clusters such as villages, cities, or herds. The subpopulation prevalences are modeled with a mixture distribution that allows for zero prevalence. The distribution of prevalences among diseased subpopulations is modeled as a mixture of finite Polya trees. Inferences can be obtained for (1) the proportion of diseased subpopulations in a region, (2) the distribution of regional prevalences, (3) the mean and median prevalence in the region, (4) the prevalence of any sampled subpopulation, and (5) predictive distributions of prevalences for regional subpopulations not included in the study, including the predictive probability of zero prevalence. We focus on prevalence estimation using data from a single diagnostic test, but we also briefly discuss the scenario where two conditionally dependent (or independent) diagnostic tests are used. Simulated data demonstrate the utility of our non-parametric model over parametric analysis. An example involving brucellosis in cattle is presented.
- Disease-prevalence estimation
- Polya trees
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty