Incomplete contingency tables with censored cells with application to estimating the number of people who inject drugs in Scotland

Antony M. Overstall, Ruth King, Sheila M Bird, Sharon J. Hutchinson, Gordon Hay

Research output: Contribution to journal (Article)

Abstract

Estimating the size of hidden or difficult to reach populations is often of interest for economic, sociological or public health reasons. In order to estimate such populations, administrative data lists are often collated to form multi-list cross-counts and displayed in the form of an incomplete contingency table. Log-linear models are typically fitted to such data to obtain an estimate of the total population size by estimating the number of individuals not observed by any of the data-sources. This approach has been taken to estimate the current number of people who inject drugs (PWID) in Scotland, with the Hepatitis C virus diagnosis database used as one of the data-sources to identify PWID. However, the Hepatitis C virus diagnosis data-source does not distinguish between current and former PWID, which, if ignored, will lead to overestimation of the total population size of current PWID. We extend the standard model-fitting approach to allow for a data-source, which contains a mixture of target and non-target individuals (i.e. in this case, current and former PWID). We apply the proposed approach to data for PWID in Scotland in 2003, 2006 and 2009 and compare with the results from standard log-linear models.
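The standard approach the abstract refers to, fitting a log-linear model to the observed cells and extrapolating to the unobserved one, can be sketched as follows. This is a minimal illustration only: it assumes three lists, an independence (main-effects) model, and made-up counts that are not from the paper. It does not implement the authors' extension for a data-source that mixes target and non-target individuals. The helper name `fit_poisson_loglinear` is hypothetical.

```python
import numpy as np

def fit_poisson_loglinear(X, y, n_iter=50, tol=1e-10):
    """Fit log(E[y]) = X @ beta for Poisson counts by Newton-Raphson."""
    # Start from a least-squares fit on the log scale (0.5 guards against zeros).
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu)            # score vector
        info = X.T @ (X * mu[:, None])   # Fisher information matrix
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical cross-counts for three lists A, B, C (illustrative, not real data).
# Each row is (in A, in B, in C); the cell (0, 0, 0) is never observed.
cells = np.array([
    [1, 0, 0], [0, 1, 0], [0, 0, 1],
    [1, 1, 0], [1, 0, 1], [0, 1, 1],
    [1, 1, 1],
])
counts = np.array([400.0, 300.0, 200.0, 60.0, 40.0, 30.0, 6.0])

# Independence model: an intercept plus one main effect per list.
X = np.column_stack([np.ones(len(cells)), cells])
beta = fit_poisson_loglinear(X, counts)

# The unobserved cell has all indicators equal to 0, so its fitted mean is exp(intercept).
unobserved = np.exp(beta[0])
total = counts.sum() + unobserved
print(f"estimated unobserved: {unobserved:.1f}, estimated total: {total:.1f}")
```

The illustrative counts were constructed to satisfy independence exactly, so the fitted intercept corresponds to 2000 unobserved individuals. In practice, interaction terms between lists are usually considered as well, and model choice can change the estimate substantially.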
Original language: English
Pages (from-to): 1564-1579
Number of pages: 16
Journal: Statistics in Medicine
Volume: 33
Issue number: 9
Early online date: 1 Dec 2013
DOI: 10.1002/sim.6047
Publication status: Published - Apr 2014

Keywords

  • people who inject drugs
  • Scotland
  • hepatitis C virus
  • censoring
  • incomplete contingency table
  • log-linear models
  • population size

Cite this

Overstall, Antony M. ; King, Ruth ; Bird, Sheila M ; Hutchinson, Sharon J. ; Hay, Gordon. / Incomplete contingency tables with censored cells with application to estimating the number of people who inject drugs in Scotland. In: Statistics in Medicine. 2014 ; Vol. 33, No. 9. pp. 1564-1579.
@article{a7e72d04f07e48ddbceb7da11e106db5,
title = "Incomplete contingency tables with censored cells with application to estimating the number of people who inject drugs in Scotland",
abstract = "Estimating the size of hidden or difficult to reach populations is often of interest for economic, sociological or public health reasons. In order to estimate such populations, administrative data lists are often collated to form multi-list cross-counts and displayed in the form of an incomplete contingency table. Log-linear models are typically fitted to such data to obtain an estimate of the total population size by estimating the number of individuals not observed by any of the data-sources. This approach has been taken to estimate the current number of people who inject drugs (PWID) in Scotland, with the Hepatitis C virus diagnosis database used as one of the data-sources to identify PWID. However, the Hepatitis C virus diagnosis data-source does not distinguish between current and former PWID, which, if ignored, will lead to overestimation of the total population size of current PWID. We extend the standard model-fitting approach to allow for a data-source, which contains a mixture of target and non-target individuals (i.e. in this case, current and former PWID). We apply the proposed approach to data for PWID in Scotland in 2003, 2006 and 2009 and compare with the results from standard log-linear models.",
keywords = "people who inject drugs, Scotland, hepatitis C virus, censoring, incomplete contingency table, log-linear models, population size",
author = "Overstall, {Antony M.} and Ruth King and Bird, {Sheila M} and Hutchinson, {Sharon J.} and Gordon Hay",
note = "Date of acceptance: 03/11/2013",
year = "2014",
month = "4",
doi = "10.1002/sim.6047",
language = "English",
volume = "33",
pages = "1564--1579",
number = "9",
journal = "Statistics in Medicine",
}

Incomplete contingency tables with censored cells with application to estimating the number of people who inject drugs in Scotland. / Overstall, Antony M.; King, Ruth; Bird, Sheila M; Hutchinson, Sharon J.; Hay, Gordon.

In: Statistics in Medicine, Vol. 33, No. 9, 04.2014, p. 1564-1579.

TY  - JOUR
T1  - Incomplete contingency tables with censored cells with application to estimating the number of people who inject drugs in Scotland
AU  - Overstall, Antony M.
AU  - King, Ruth
AU  - Bird, Sheila M
AU  - Hutchinson, Sharon J.
AU  - Hay, Gordon
N1  - Date of acceptance: 03/11/2013
PY  - 2014/4
Y1  - 2014/4
AB  - Estimating the size of hidden or difficult to reach populations is often of interest for economic, sociological or public health reasons. In order to estimate such populations, administrative data lists are often collated to form multi-list cross-counts and displayed in the form of an incomplete contingency table. Log-linear models are typically fitted to such data to obtain an estimate of the total population size by estimating the number of individuals not observed by any of the data-sources. This approach has been taken to estimate the current number of people who inject drugs (PWID) in Scotland, with the Hepatitis C virus diagnosis database used as one of the data-sources to identify PWID. However, the Hepatitis C virus diagnosis data-source does not distinguish between current and former PWID, which, if ignored, will lead to overestimation of the total population size of current PWID. We extend the standard model-fitting approach to allow for a data-source, which contains a mixture of target and non-target individuals (i.e. in this case, current and former PWID). We apply the proposed approach to data for PWID in Scotland in 2003, 2006 and 2009 and compare with the results from standard log-linear models.
KW  - people who inject drugs
KW  - Scotland
KW  - hepatitis C virus
KW  - censoring
KW  - incomplete contingency table
KW  - log-linear models
KW  - population size
U2  - 10.1002/sim.6047
DO  - 10.1002/sim.6047
M3  - Article
JO  - Statistics in Medicine
VL  - 33
IS  - 9
SP  - 1564
EP  - 1579
ER  - 