Modeling and analyzing respondent-driven sampling as a counting process

Yakir Berchenko, Jonathan D. Rosenblatt, Simon D.W. Frost

Research output: Contribution to journal > Article > peer-review

Abstract

Respondent-driven sampling (RDS) is an approach to sampling design and analysis that uses chain-referral to exploit the networks of social relationships connecting members of the target population. RDS typically oversamples participants with many acquaintances. Naïve estimators, such as the sample average, are thus biased towards the state of the most highly connected individuals. Current methodology cannot estimate population size from RDS, and promotes inverse-probability-weighted estimators for population parameters such as HIV prevalence. We propose to use the timing of recruitment, which is typically collected and then discarded, to estimate the population size via a counting process model. Once population size and degree frequencies are available, prevalence can be debiased in a post-stratified framework. We adapt methods developed for inference in epidemiology and software reliability to estimate the population size, degree counts and frequencies. A fundamental advantage of our approach is that it makes the assumptions of the sampling design explicit. This enables verification of the assumptions, maximum likelihood estimation, extension with covariates, and model selection. We develop large-sample theory, proving consistency and asymptotic normality. We further compare our estimators to other estimators in the RDS literature, using both simulation and real-world data. In both cases, we find that our estimators outperform current methods. The likelihood problem in the model we present is separable, and thus efficiently solvable. We implement these estimators in an accompanying R package, chords, available on CRAN.
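
As a rough illustration of the post-stratification step mentioned in the abstract, the Python sketch below reweights per-degree sample prevalences by estimated population degree counts. It is not the paper's estimator and not the chords package API: the function name, the input format, and the toy numbers are all hypothetical, and the degree counts are assumed to have been estimated already by some other means.

```python
from collections import defaultdict


def poststratified_prevalence(sample, degree_counts):
    """Post-stratified prevalence estimate (illustrative sketch only).

    sample: iterable of (degree, outcome) pairs, outcome being 0 or 1.
    degree_counts: dict mapping degree -> estimated number of population
        members with that degree (assumed to be given, hypothetical input).
    """
    sampled = defaultdict(int)    # respondents observed per degree
    positives = defaultdict(int)  # positive outcomes observed per degree
    for degree, outcome in sample:
        sampled[degree] += 1
        positives[degree] += outcome

    weighted_sum = 0.0
    weight_total = 0.0
    for degree, n_sampled in sampled.items():
        n_population = degree_counts.get(degree, 0.0)
        if n_population <= 0:
            continue  # no usable population estimate for this stratum
        stratum_prevalence = positives[degree] / n_sampled
        weighted_sum += n_population * stratum_prevalence
        weight_total += n_population

    if weight_total == 0:
        raise ValueError("no stratum has a positive estimated degree count")
    return weighted_sum / weight_total


# Toy data: high-degree respondents are over-represented in the sample.
sample = [(1, 0), (1, 0), (5, 1), (5, 1), (5, 1), (5, 0)]
degree_counts = {1: 900.0, 5: 100.0}  # hypothetical estimated counts

naive = sum(outcome for _, outcome in sample) / len(sample)
adjusted = poststratified_prevalence(sample, degree_counts)
print(naive, adjusted)
```

In this toy example the naïve sample average is pulled towards the over-sampled high-degree stratum, while the post-stratified estimate weights each stratum by its assumed share of the population.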

Original language: American English
Pages (from-to): 1189-1198
Number of pages: 10
Journal: Biometrics
Volume: 73
Issue number: 4
State: Published - 1 Jan 2017

Keywords

  • Counting process
  • HIV
  • Hidden populations
  • Respondent driven sampling

All Science Journal Classification (ASJC) codes

  • General Immunology and Microbiology
  • Applied Mathematics
  • General Biochemistry, Genetics and Molecular Biology
  • General Agricultural and Biological Sciences
  • Statistics and Probability
