Bayesian non-parametric models for regional prevalence estimation

Adam J. Branscum, Timothy E. Hanson, Ian Gardner

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


We developed a flexible non-parametric Bayesian model for regional disease-prevalence estimation based on cross-sectional data that are obtained from several subpopulations or clusters such as villages, cities, or herds. The subpopulation prevalences are modeled with a mixture distribution that allows for zero prevalence. The distribution of prevalences among diseased subpopulations is modeled as a mixture of finite Polya trees. Inferences can be obtained for (1) the proportion of diseased subpopulations in a region, (2) the distribution of regional prevalences, (3) the mean and median prevalence in the region, (4) the prevalence of any sampled subpopulation, and (5) predictive distributions of prevalences for regional subpopulations not included in the study, including the predictive probability of zero prevalence. We focus on prevalence estimation using data from a single diagnostic test, but we also briefly discuss the scenario where two conditionally dependent (or independent) diagnostic tests are used. Simulated data demonstrate the utility of our non-parametric model over parametric analysis. An example involving brucellosis in cattle is presented.
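
The zero-inflated structure described above can be illustrated with a minimal simulation sketch. It is not the authors' model: a Beta distribution stands in for the mixture of finite Polya trees, the diagnostic test is assumed perfect, and the function name, parameter names, and numeric values are all hypothetical choices made for illustration.

```python
import random


def simulate_region(n_clusters, p_diseased, alpha, beta, sample_size, seed=0):
    """Simulate test-positive counts for the subpopulations of one region.

    Assumptions (not from the paper): a Beta(alpha, beta) distribution is a
    parametric stand-in for the non-parametric prevalence distribution, and
    the diagnostic test has perfect sensitivity and specificity.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        # A subpopulation is disease-free with probability 1 - p_diseased.
        diseased = rng.random() < p_diseased
        prevalence = rng.betavariate(alpha, beta) if diseased else 0.0
        # Each sampled individual tests positive with probability `prevalence`.
        positives = sum(rng.random() < prevalence for _ in range(sample_size))
        clusters.append({"prevalence": prevalence, "positives": positives})
    return clusters


# Hypothetical region: 200 herds, 40% of them diseased, 50 animals tested each.
region = simulate_region(
    n_clusters=200, p_diseased=0.4, alpha=2.0, beta=8.0, sample_size=50
)
share_diseased = sum(c["prevalence"] > 0 for c in region) / len(region)
mean_prevalence = sum(c["prevalence"] for c in region) / len(region)
print(f"share of diseased subpopulations: {share_diseased:.2f}")
print(f"mean regional prevalence: {mean_prevalence:.3f}")
```

Data generated this way have a point mass at zero prevalence plus a continuous component, which is the shape the abstract's mixture distribution is designed to capture. Fitting the model itself would require posterior sampling, which this sketch does not attempt.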

Original language: English (US)
Pages (from-to): 567-582
Number of pages: 16
Journal: Journal of Applied Statistics
Issue number: 5
State: Published - May 1 2008


Keywords

  • Disease-prevalence estimation
  • Polya trees
  • Prediction

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

