
Electronic data

  • fetchObject.action

    Rights statement: © 2006 Gething et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

    Final published version, 688 KB, PDF document

    Available under license: CC BY



Improving imperfect data from health management information systems in Africa using space-time geostatistics

Research output: Contribution to journal, Journal article, peer-review

  • Peter W. Gething
  • Abdisalan M. Noor
  • Priscilla W. Gikandi
  • Esther A. A. Ogara
  • Simon I. Hay
  • Mark S. Nixon
  • Robert W. Snow
  • Peter M. Atkinson
Article number: e271
Journal publication date: 6/06/2006
Journal: PLoS Medicine
Issue number: 6
Number of pages: 7
Publication status: Published
Original language: English



In order to allocate health-care resources (such as doctors, nurses, hospital beds, and drugs), public health officials need to know when and where in their country people are getting sick with which diseases. In most African countries, a country-wide health management information system (HMIS) compiles records about how many patients are being diagnosed with and treated for certain diseases. The actual data are meant to be collected and reported monthly by the individual health-care facilities. The HMIS compiles and analyzes these records, giving a picture of which patients are being treated across districts, regions, and the entire country. Ideally, all facilities report their data promptly and comprehensively every month. This allows the construction of a matrix that shows which treatments are used across the country through space (where) and time (when). However, many of the facilities operate under difficult circumstances, and keeping detailed records and reporting them every month is not always at the top of the priority list. As a result, data from many of the facilities are missing for any given month, and the overall national picture is inevitably incomplete.

Why Was This Study Done?

Almost any survey has to deal with some missing data, and there are various methods for estimating them. Such estimates get harder the more data are missing. When it comes to reports on the use of health services in Africa, often more than half of the data are missing for a given month. Using sophisticated statistical methods instead of crude estimates is likely to make a big difference when such a large share of the data are missing. The researchers who did this study adopted a statistical method called kriging to estimate missing data on health service usage. Kriging was originally developed in the earth sciences (such as geology and soil science) for estimating mineral concentrations at locations where no sampling had been done. This study was done to see whether kriging could be used to estimate the missing data on malaria cases in the Kenyan public health system. A better estimate of the missing data would be helpful for allocating malaria treatments to the right places.

What Did the Researchers Do and Find?

They obtained the monthly records of diagnoses made at outpatient departments of 2,165 health facilities across Kenya for an 84-month period from January 1996 to December 2002. The records included the number of outpatients and their diagnoses. The researchers chose to focus on malaria, for three reasons: (1) malaria is common (accounting for over one-third of the overall diagnoses in Kenya), (2) there is great variation in where and when it occurs across Kenya, and (3) donors are willing to provide additional support for malaria treatment and prevention but require documentation that such help is needed and reaches patients. The numbers of people diagnosed with malaria at each facility for a given month were matched to an independent database that contains information on where every health-care facility is located. Reporting rates varied from month to month and facility to facility, but the overall reporting rate was only 35%, with 25% of the facilities never reporting. The authors then adopted a version of kriging called space–time kriging to fill in missing data (space–time kriging assumes that for a given month a facility that didn't report is likely to be similar to its neighbors, and likely to be more similar to its own and its neighbors' recent numbers than to those further removed in space or time). The calculations resulted in a number of estimates. To test whether these estimates were accurate, the researchers randomly removed a test set of 10% of the monthly records from the full dataset and repeated the estimates based on the remaining 90% of reports. They found that the real and predicted cases across the country differed by less than 1%. At the district level (which is arguably the most useful for most planning purposes), the researchers found that their method can estimate 95% of the malaria cases within 35% of the true value. For 75% of the districts the estimates would be within 15% of the actual numbers.
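The idea behind the estimation and the hold-out check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the separable exponential covariance model, its range parameters, the function names (`st_covariance`, `krige_one`, `holdout_relative_error`), and the synthetic data are all assumptions made for the example. It uses ordinary kriging, in which each missing facility-month is estimated as a weighted average of reported records, with weights that shrink as records get further away in space or time.

```python
import numpy as np

def st_covariance(d_space, d_time, sill=1.0, range_space=50.0, range_time=3.0):
    # Separable exponential model (an assumption): similarity decays with
    # distance in space (km) and with separation in time (months).
    return sill * np.exp(-d_space / range_space) * np.exp(-d_time / range_time)

def krige_one(obs_xy, obs_t, obs_val, tgt_xy, tgt_t):
    """Ordinary-kriging estimate for one facility-month from reported records."""
    n = len(obs_val)
    d_space = np.linalg.norm(obs_xy[:, None, :] - obs_xy[None, :, :], axis=-1)
    d_time = np.abs(obs_t[:, None] - obs_t[None, :])
    # Kriging system; the extra row and column force the weights to sum to 1.
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = st_covariance(d_space, d_time)
    lhs[n, n] = 0.0
    rhs = np.ones(n + 1)
    rhs[:n] = st_covariance(np.linalg.norm(obs_xy - tgt_xy, axis=1),
                            np.abs(obs_t - tgt_t))
    weights = np.linalg.solve(lhs, rhs)[:n]
    return float(weights @ obs_val)

def holdout_relative_error(xy, t, val, frac=0.1, seed=0):
    """Hide a random fraction of records, predict them, and compare totals."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(val))
    n_test = max(1, int(frac * len(val)))
    test, train = idx[:n_test], idx[n_test:]
    preds = np.array([krige_one(xy[train], t[train], val[train], xy[i], t[i])
                      for i in test])
    return abs(preds.sum() - val[test].sum()) / val[test].sum()

if __name__ == "__main__":
    # Synthetic stand-in for facility records: 20 facilities, 12 months.
    rng = np.random.default_rng(1)
    facility_xy = rng.uniform(0, 200, size=(20, 2))
    months = np.arange(12, dtype=float)
    xy = np.repeat(facility_xy, len(months), axis=0)
    t = np.tile(months, len(facility_xy))
    cases = 100 + 0.2 * xy[:, 0] + 20 * np.sin(2 * np.pi * t / 12)
    print("relative error of held-out total:",
          holdout_relative_error(xy, t, cases))
```

In a real analysis the covariance model and its parameters would be fitted to the data rather than fixed by hand, and solving one full system per missing record would not scale to thousands of facilities without restricting each estimate to a local neighbourhood.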

What Do These Findings Mean?

In this case, space–time kriging provided a more precise estimate of missing data on diagnoses at the district and provincial levels than other estimates. This is likely to be true not just for malaria but for other diagnoses for which the number and the proportion of patients who have the disease and seek treatment vary by place and time of year. One caveat is that space–time kriging requires a detailed map of where exactly a country's health-care facilities are located. A database based on such a map existed for Kenya (and was used in this study) but doesn't exist in all countries that might benefit from a method like the one described here. The authors argue that knowledge about where health services are located is a must for any health planning agency, and that databases with that information should be developed everywhere.
