POPULATION AND HEALTH IN DEVELOPING COUNTRIES

VOLUME 1

Population, Health, and Survival at INDEPTH Sites

INTERNATIONAL DEVELOPMENT RESEARCH CENTRE
Ottawa • Cairo • Dakar • Montevideo • Nairobi • New Delhi • Singapore

CONTENTS

Foreword
Preface
Acknowledgments
Introduction

PART I. DSS CONCEPTS AND METHODS

Chapter 1. Core Concepts of DSS
Introduction
Demographic surveillance systems
Demographic surveillance area
Longitudinality
Primary DSS subjects
Eligibility
Residency and membership
Core DSS events
Episodes
Other events

Chapter 2. DSS-generated Mortality Rates and Measures
Introduction
Rates and ratios
Standardization
Confidence intervals for rates

Chapter 3. DSS Methods of Data Collection
Introduction
Establishing the monitored population
Planning for data collection
Initial census
Update rounds
Recording demographic events
Monitoring mortality
Tracking migrants
Additional rounds of data collection
Geographic information systems
Conclusion

Chapter 4. Processing DSS Data
Introduction
Background
The INDEPTH concept of a data core
The reference data model
The role of the reference data model in maintaining data integrity
Extending the core
Conclusion

Chapter 5. Assessing the Quality of DSS Data
Introduction
Assessing data quality in the field
Assessing data quality at the data centre
Conclusion

PART II. MORTALITY AT INDEPTH SITES

Chapter 6. Comparing Mortality Patterns at INDEPTH Sites
Abstract
Introduction
Age-specific mortality rates and life tables
Crude death rate
Child mortality
Adult mortality
Discussion
Annex: Lifetables

Chapter 7. INDEPTH Mortality Patterns for Africa
Abstract
Mortality models and Africa
Principal-components analysis
Principal components of INDEPTH mortality data
INDEPTH mortality patterns
Demonstration of the HIV–AIDS model life-table system
Conclusion
Annex: AIDS-decremented model life tables

PART III. INDEPTH DSS SITE PROFILES

Introduction
Chapter 8. Butajira DSS, Ethiopia
Chapter 9. Dar es Salaam DSS, Tanzania
Chapter 10. Hai DSS, Tanzania
Chapter 11. Ifakara DSS, Tanzania
Chapter 12. Morogoro DSS, Tanzania
Chapter 13. Rufiji DSS, Tanzania
Chapter 14. Gwembe DSS, Zambia
Chapter 15. Manhiça DSS, Mozambique
Chapter 16. Agincourt DSS, South Africa
Chapter 17. Dikgale DSS, South Africa
Chapter 18. Hlabisa DSS, South Africa
Chapter 19. Nouna DSS, Burkina Faso
Chapter 20. Oubritenga DSS, Burkina Faso
Chapter 21. Farafenni DSS, The Gambia
Chapter 22. Navrongo DSS, Ghana
Chapter 23. Bandim DSS, Guinea-Bissau
Chapter 24. Bandafassi DSS, Senegal
Chapter 25. Mlomp DSS, Senegal
Chapter 26. Niakhar DSS, Senegal
Chapter 27. Matlab DSS, Bangladesh
Chapter 28. ORP DSS, Bangladesh
Chapter 29. FilaBavi DSS, Viet Nam

Appendix 1. Working Examples of DSS Forms
Example 1. DSS Baseline Form (Rufiji DSS)
Example 2. Household Registration Book (HRB) (Rufiji DSS)
Example 3. Pregnancy Outcome / Birth Form (Rufiji DSS)
Example 4. Death Registration Form (Navrongo DSS)
Example 5. Marital Status Form (Butajira DSS)
Example 6. VA Form: Deaths of Children from Day 31 to 5 Years (Morogoro DSS)
Example 7. In-migration Form (Navrongo DSS)
Example 8. Out-migration Form (Navrongo DSS)

Appendix 2. Acronyms and Abbreviations

Appendix 3. Glossary

Appendix 4. Bibliography

FOREWORD

Traditional sources of health information collected from health facilities often serve as the basis for health-services planning and allocation of resources in many parts of the developing world. Yet, health-facility-based data provide only fragmentary and biased information. Not all population groups have geographic or economic access to health facilities. Those that do have such access are usually self-selected and are often those who visit health-care centres only when they suffer from a serious illness. A great majority of poor people may have less access to health-care facilities than those who are better off, and poor people often treat themselves or use nontraditional health care. Women may suffer gender disparities as well, with time and cultural constraints on the use of health-care facilities, particularly in rural settings. Services for children are also severely constrained. Thus, health-facility-based data are not representative of the health problems of all rural and urban communities and do not therefore reflect their health status.

This void of valid health information for a large segment of the world’s population makes it difficult for policymakers to formulate rational health policies to improve the health of these people. As the authors of this book argue, “the need to establish a reliable information base to support health development has never been greater” (INDEPTH Coordinating Committee, this volume, p. 1). Ideally, reliable health information should be population and community based, inclusive of all groups, and collected prospectively and continuously. Such an ideal is best met through demographic and health surveillance systems collecting demographic and health data on selected population samples. Often, randomly selected cross-sectional household surveys every few years complement these methods of research.

Demographic and health surveillance systems serve a number of functions:

The premier example of such a system is the Health and Demographic Surveillance System (formerly known as the Demographic Surveillance System) of Matlab, Bangladesh, which started operations in 1963 as a major component of the field research program of the International Centre for Diarrhoeal Disease Research, Bangladesh. It is recognized as the largest and longest sustained prospective longitudinal demographic and health surveillance of any population in the world. It has made significant contributions to health development in both Bangladesh and the rest of the world. The high cost of running such a system has delayed replication in other parts of the developing world. However, thanks to the fast-paced development of user-friendly computers, this constraint has been partially overcome.

Over the last decade, a growing number of community-based field stations have evolved in Asia and sub-Saharan Africa and started to generate reliable longitudinal population-based health and demographic data. This bodes well for countries with such stations, as it marks the first step toward rational health planning and meaningful health programs for the people of these countries. Recently, these stations joined to form a network called the International Network for the continuous Demographic Evaluation of Populations and Their Health in developing countries (INDEPTH), creating “a trans-continental resource of robust, longitudinal, health and demographic data in some of the most information deprived settings in the world” (INDEPTH Founding Document; http://www.indepth-network.org). In the span of a few years, INDEPTH has matured rapidly, succeeding in strengthening the capabilities of member sites and developing strategies to harness their potential to redress long-standing inequities in health. This development has been possible because of the dedication and hard work of a few individuals, and this monograph is clearly an indication of the high quality of the network’s work.

The emergence of INDEPTH should be welcome news to the donor community, where people often, and rightly, complain that the programs they fund in low-income countries are not usually based on the real needs of the people. By the same token, donors should come out strongly in support of INDEPTH, because they will be investing in an initiative that directly addresses one of the major constraints of development assistance. Researchers in program countries should also take advantage of the INDEPTH sites to promote essential national health research. The domination of health-facility-based biomedical research should give way to policy-relevant research with the likelihood of a more immediate effect on the health of the people in the countries in the program.

Demissie Habte
World Bank
Washington, DC
1 June 2001

PREFACE

This monograph is the first in a series from the International Network for the continuous Demographic Evaluation of Populations and Their Health in developing countries (INDEPTH). It seeks to do several things. First, it seeks to compile, for both easy reference and comparative purposes, and in detailed and summary formats, the essential characteristics of each participating demographic surveillance system (DSS) site. Second, it seeks to present, for the first time, the mortality structure of each of these sites in a coherent and comparative format. Third, based on a network-wide analysis of the African site data, it proposes a methodology to generate, again for the first time, African model life tables that are based on objective empirical data.

The focus of this volume is the structures of populations at INDEPTH sites and the characteristics of their health and survival. The monograph is divided into three parts: Part I discusses core concepts and methods used in DSSs; Part II provides a comparison of mortality patterns in INDEPTH sites; and Part III presents profiles of INDEPTH sites.

As this is the first publication of its kind on DSSs in Africa and Asia, we thought it would be expedient to discuss core concepts and methods commonly used in most of the sites. Among the concepts discussed in Chapter 1 are the DSS area, longitudinality, DSS subjects, residency and membership, and core DSS events. Rates and measures generated using DSS are discussed in Chapter 2, with specific emphasis on the use of person–years lived in calculating rates. Chapter 3 discusses the DSS methods of data collection, starting with the initial census to establish the DSS population. This chapter discusses initial censuses, update rounds, and the vital events-registration system. It also puts emphasis on mortality monitoring and the tracking of migrants. The processing of DSS data is the main focus of Chapter 4. This chapter treats the important issues of quality assurance and control at the data-processing level. In Chapter 5, Part I ends with a discussion of the quality of DSS data, both in the field and at the data centre. This chapter then provides a detailed discussion of statistical and demographic techniques for analysis of DSS data.

Part II presents a comparison of mortality patterns of INDEPTH sites for the 1995–99 period. Chapter 6 starts with a discussion of crude overall mortality at INDEPTH sites. This chapter presents an INDEPTH population-age standard for sub-Saharan Africa (SSA) for the standardization of mortality rates, and it gives the reason for using this new standard instead of the United Nations models.

The INDEPTH age standard for SSA typifies the population in developing countries, with its very young age structure. INDEPTH sites have used this standard to compare mortality in SSA. This comparison highlights age-specific mortality, considering mortality in infancy, childhood, and adulthood. This discussion compares the INDEPTH standard for SSA with the Segi population and the new World Health Organization standard population. The chapter ends with a presentation of basic life-table indicators for INDEPTH sites, based on their age-specific mortality rates over the 1995–99 period. Part II ends with Chapter 7, which analyzes more than 6.4 million person–years of observation at the African INDEPTH sites to identify mortality patterns. The emergent patterns are demonstrated to be substantially different from conventionally used model mortality patterns applied in Africa.

Part III presents profiles of 22 INDEPTH sites. The profiles are listed in alphabetical order, first according to region, and then according to country. These profiles are expected to stand for some time as the main reference source for basic details about INDEPTH sites and their DSS operations. Based on a structured template, each profile provides a site description, including the physical geography and population characteristics. It discusses DSS procedures at the site, including data collection and processing. Finally, each profile presents basic outputs, including demographic indicators. A summary matrix of all the DSS sites, presented in the introduction to Part III, provides the core details for each site.

INDEPTH monograph editorial team for Volume 1:

Osman A. Sankoh (University of Heidelberg, Germany, and Nouna DSS, Burkina Faso)

Kathleen Kahn (Agincourt DSS, South Africa)

Eleuther Mwageni (Rufiji DSS, Tanzania)

Pierre Ngom (Nairobi DSS, Kenya)

Philomena Nyarko (Navrongo DSS, Ghana)

1 June 2001

ACKNOWLEDGMENTS

This volume is an outgrowth of the efforts of many people, both INDEPTH members and its collaborators, who gave of their time and expertise to writing these chapters. We would like to particularly thank the following for their invaluable contributions to the corresponding chapters:

We would also like to thank INDEPTH site members, whose names are mentioned in the site profiles, for coordinating the writing of their site’s profile. Special thanks go to Rose Lusinde and Don de Savigny for producing the map panels for the site locations and particularly to Kathleen Kahn and Don de Savigny for coordinating the formatting and editing of the 22 site-profile chapters making up Part III of the monograph.

The INDEPTH coordinators would like to express their gratitude to the INDEPTH editorial committee, led by Osman A. Sankoh, for its outstanding work in compiling this first monograph. We acknowledge with pleasure the willingness of individual site teams and their leaders to collaborate in sharing such rich data sets and experiences. We also recognize the contributions of all our investment partners (local communities, public-sector services, academic and research institutions, and donors), all of whom, often over prolonged periods, continue to support and sustain our efforts. We express particular thanks and appreciation to the many sponsors of INDEPTH, including the Rockefeller Foundation, the Navrongo Health Research Centre, the Population Council, the World Health Organization, and the Andrew W. Mellon Foundation, for providing the funds needed to enable INDEPTH networking activities to function. We look forward to attracting new partners to join with us in advancing our mission, goals, activities, and products.

1 Based on Benzler, J.; Herbst, A.J.; MacLeod, B. (in alphabetical order): A reference data model for demographic surveillance systems. INDEPTH 1999, http://www.indepth-network.org.

Finally, we thank internal and external reviewers for their invaluable comments, which increased the validity and clarity of many sections of the monograph.

INDEPTH Coordinating Committee

Fred Binka, Chair (Ghana, 1998–2001)

Steve Tollman, Deputy Chair (South Africa, 1998–2001)

Pedro Alonso, Member (Mozambique, 1998–2000)

Yemane Berhane, Member (Ethiopia, 1998–2001)

Chuc N. T.K., Member (Viet Nam, 2000–)

Don de Savigny, Member (Tanzania, 1998–2001)

Bocar Kouyaté, Member (Burkina Faso, 2000–)

Boubakar Sow, Member (Mali, 1998–1999)

Siswanto Wilopo, Member (Indonesia, 1998–2001)

1 June 2001

INTRODUCTION

As we enter the new millennium, with the revolution of the information age still gaining speed, it seems inconceivable that large parts of the Earth’s population remain devoid of vital health information. For 1 billion people living in the world’s poorest countries, where the burden of disease is highest, no one registers those who are born or who die or ascertains the causes of their deaths. From the limited data available, the health profile of these populations can be likened to an iceberg: the bulk of reliable data on trends in age, gender, geographic variations, and burden of disease remains hidden. This great void in population-based information constitutes a major and long-standing constraint on the articulation of effective policies and programs to improve the health of the poor and thus perpetuates profound inequities in health. The need to establish a reliable information base to support health development has never been greater.

Recently, experience has emerged from a growing number of community-based field stations that have continuous monitoring systems for geographically defined populations. These field stations generate high-quality, population-based, longitudinal health and demographic data with the potential to fill this information void in the developing world. Since 1997 a number of organizations have made a systematic effort to harness and make more readily available the products of these disparate initiatives. A series of meetings was convened by the University of the Witwatersrand (South Africa) (Agincourt Health and Population Programme); Department of Tropical Hygiene and Public Health, University of Heidelberg (Germany); the Rockefeller Foundation (Bellagio, Italy); and the Ministry of Health (Navrongo, Ghana) to examine the potential for harnessing these sites through a network. These activities culminated in a meeting convened in Dar es Salaam, Tanzania, 9–12 November 1998, to establish such a network.

Seventeen field sites drawn from 13 countries in Africa and Asia participated in this founding meeting. The name adopted for the network was the International Network for the continuous Demographic Evaluation of Populations and Their Health in developing countries (INDEPTH). Network membership has increased steadily since then and currently stands at 29 health and demographic evaluation sites in 16 countries (the 13 countries whose sites are profiled in this volume are shown in Figure I.1). The network’s founding document and constitution are available on the INDEPTH website (www.INDEPTH-network.org).

Figure I.1 Countries with DSS field sites participating in the INDEPTH network.

The defining characteristics of an INDEPTH field site are the following:

The vision and goals of the network are

To achieve these goals and facilitate the effective interaction of INDEPTH sites, the network has identified the concept of flexible working groups focused on specific scientific issues or topics as a key mechanism. Seven working groups were initially established, with a focus on

Two further working groups have since been formed, focusing on adult health and ethical practice. Thus, through active and concerted efforts, the network is encompassing a critical agenda founded on traditional strengths in research on infectious diseases and nutrition, with a growing emphasis on reproductive health, and the network is extending this emphasis to chronic disease, injury, and related social phenomena such as rapid urbanization. A central objective is to use network sites to train local scientists in research and research management.

This monograph is the foundation for an INDEPTH series on various themes, including model life tables for Africa and Asia; cause-specific mortality in developing countries; migration patterns; trends in fertility; reproductive health (including HIV–AIDS); and health equity.

INDEPTH Coordinating Committee
Accra, Ghana
June 2001

PART I
DSS CONCEPTS AND METHODS

Chapter 1
CORE CONCEPTS OF DSS

Introduction

During the past 30 years, demographic surveillance systems (DSSs) have been established in a number of field research sites in various parts of the developing world where routine vital-registration systems were poorly developed or nonexistent. Although these systems may have been developed differently in terms of their initial rationale, they are all required to track a limited and common set of key variables determining population dynamics and demographic trends. DSSs have similar approaches to defining key variables and their relationships and to developing systems for collection, storage, and analysis of these data. The core concepts presented here draw directly from the ideas and experiences emerging from INDEPTH DSS sites in Africa and Asia. It should be emphasized, however, that even though an effort has been made to standardize the definitions, many DSS sites still define some of the concepts differently.

Demographic surveillance systems

A DSS is a set of field and computing operations to handle the longitudinal follow-up of well-defined entities or primary subjects (individuals, households, and residential units) and all related demographic and health outcomes within a clearly circumscribed geographic area. Unlike a cohort study, a DSS follows up the entire population of such a geographic area.

In such a system, an initial census defines and registers the target population. Regular subsequent rounds of data collection at prescribed intervals make it possible to register all new individuals, households, and residential units and to update key variables and attributes of existing subjects. The core system provides for monitoring of population dynamics through routine collection and processing of information on births, deaths, and migrations — the only demographic events leading to any change in the initial size of the resident population. This core system is often complemented by various other data sets that provide important social and economic correlates of population and health dynamics. These may include information on events such as household formation and dissolution, acquisition and loss of economic assets, and growth or depletion of income.

In many population sites, the DSS may also provide a platform for other studies within the same geographic area. This support varies from one study to another and may include the provision of an initial sampling frame, adjustment for confounding variables, provision of additional explanatory variables, and measurement of the demographic impact of interventions.

Demographic surveillance area

The demographic surveillance area (DSA) is an area with clearly delineated and fairly permanent boundaries, preferably recognizable on the ground (for example, rivers, roads, and clearly demarcated administrative boundaries). The clear delineation of boundaries enables an unambiguous distinction to be made between the individuals, households, and residential units to include in the DSS and those to exclude.

The area of a DSS site depends mainly on the size of the population required for demographic surveillance and related research activities (for a typical example, see “Establishing the monitored population” in Chapter 3). The size is also influenced by pragmatic considerations, such as the cost to the research centre and its capacity to manage the associated logistics and human resources. The DSA may expand or shrink over time in response to changing research needs or sources of funding. These changes usually introduce additional complexity, as they alter eligibility criteria and may make it difficult to maintain consistent definitions of internal and external migrations over the period of transition.

Longitudinality

Longitudinal measurement of demographic and health variables is one of the key characteristics of a DSS. This is achieved through repeated visits at more or less regular intervals to all residential units in the DSA to collect a prescribed set of attribute data on registered subjects, who are consistently and uniquely identified. These repeated observations, together with the recording of events affecting the subjects during the interval between visits, allow one to construct each subject’s history. This differentiates DSS data from data collected in multiround surveys and other prospective studies, which allow comparison over time only at an aggregated level.

Visits

DSSs collect data during rounds, or cycles, of visits to registered residential units in the DSA. The interval between visits depends on the frequency of the changes in the phenomena under study and on the length of recall intervals for the collected data, and thus on the research focus of each field site. However, like the size of the DSA and observed population, it also depends on funding and logistics. This interval varies from one site to another, ranging from 1 week to 1 year. However, for the majority of DSSs, observations are made at 3- or 4-month intervals. This is widely considered an appropriate interval to ensure comprehensive recording of births, deaths, and migrations, which is the minimum requirement for maintaining the coherence of any DSS.

When intervals between visits are long (a year or more), researchers commonly ignore migration events and instead conduct a full census at each new round. In- and out-migration flows are then inferred through reconciliation of unlinked census records after account is taken of births and deaths between censuses.
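The reconciliation described above can be sketched as a set comparison. This is only an illustration under stated assumptions: the function name `infer_migration` is hypothetical, and each person is assumed to carry a matching key (for example, one produced by comparing names and birth dates across the two censuses), which is itself the hard part of reconciling unlinked records.

```python
def infer_migration(previous_census, current_census, births, deaths):
    """Infer migrants by reconciling two census rounds.

    All four arguments are collections of hypothetical matching keys,
    one key per person. Returns (in_migrants, out_migrants) as sets.
    """
    previous, current = set(previous_census), set(current_census)
    # Present before, absent now, and not recorded as dead: left the area.
    out_migrants = previous - current - set(deaths)
    # Absent before, present now, and not recorded as born: entered the area.
    in_migrants = current - previous - set(births)
    return in_migrants, out_migrants


in_migrants, out_migrants = infer_migration(
    previous_census={"A", "B", "C", "D"},
    current_census={"A", "C", "E", "F"},
    births={"E"},
    deaths={"B"},
)
print(sorted(in_migrants))   # ['F']
print(sorted(out_migrants))  # ['D']
```

A comparison of this kind cannot see people who both arrived and left, or were born and died, between the two censuses, which is one reason shorter intervals give more complete event recording.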

Data collected during each fieldwork round are not restricted to key demographic events but may also include the various attributes of the primary subjects. These attributes may be fixed (for example, ethnicity, gender) or changing over time (for example, marital or residential status).

Unique identifiers

Unique identifiers for primary subjects are an indispensable element of DSSs. All systems invariably formulate rules for assigning unique identifiers at the start of the DSS, but their methods for assigning these identifiers to DSS subjects may vary from one site to another. There are two main approaches. One common strategy is to transparently link the subjects in a single residential unit through a hierarchical system of unique numbers. These are built up from a unique number for the residential unit, followed by serial numbers for each of the households within it (where the notion of households applies) and then for each of the enumerated individuals within each household. In this system, the unique number for each individual in the DSS is a composite of the numbers for the residential unit, household, and household member. This may involve creating complex hierarchies, in which the unique number of the residential unit itself is a composite reflecting allocation to regions, areas, and villages (where they exist). This system requires thorough mapping of the DSA before enumeration. It also requires proper training of enumerators to avoid confusion in assigning identifiers. When mapping of the DSA is coupled with georeferencing of residential units, using geographic information system (GIS) technology, global positioning system (GPS) coordinates are assigned as location attributes of the residential units within the database.
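As an illustration of the hierarchical approach, the sketch below builds an individual’s identifier from the numbers of the residential unit, household, and household member. The class name `IndividualId`, the field widths, and the hyphen separator are assumptions made for the example; each site defines its own format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndividualId:
    """Composite identifier: residential unit, household, member."""
    unit: int       # unique number of the residential unit
    household: int  # serial number of the household within the unit
    member: int     # serial number of the individual within the household

    def __str__(self) -> str:
        # Zero-padded widths are an illustrative choice, not a standard.
        return f"{self.unit:05d}-{self.household:02d}-{self.member:02d}"

    @classmethod
    def parse(cls, text: str) -> "IndividualId":
        unit, household, member = (int(part) for part in text.split("-"))
        return cls(unit, household, member)


def same_residential_unit(a: IndividualId, b: IndividualId) -> bool:
    # The link between subjects is readable directly from the identifiers.
    return a.unit == b.unit


mother = IndividualId(unit=1204, household=1, member=2)
child = IndividualId.parse("01204-01-05")
print(mother)                                # 01204-01-02
print(same_residential_unit(mother, child))  # True
```

The same transparency that makes the link between subjects visible also means the identifier reveals where a person lives, which is the trade-off the second strategy addresses.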

The other strategy for assigning identifiers to individuals is to avoid any fixed link to residential units and households. In this system, identifiers for each subject are simply serial numbers incremented each time a new DSS subject is registered. This system requires providing field staff with block allocations of ID numbers with enough latitude to register new subjects. This approach should be coupled with computer generation of the identifiers to safeguard against the assignment of the same ID to multiple subjects on the ground. This strategy helps to preserve people’s anonymity outside their residential units, or when their attribute data are accessed through the database.
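A minimal sketch of block allocation follows, assuming a central allocator that hands each field worker a non-overlapping range of serial numbers. The class names and the block size are hypothetical; the point is only that computer-generated, disjoint blocks prevent the same ID from being assigned twice.

```python
class IdBlockAllocator:
    """Hands out non-overlapping blocks of serial IDs (hypothetical design)."""

    def __init__(self, first_id: int = 1, block_size: int = 500):
        self._next_start = first_id
        self._block_size = block_size

    def allocate(self) -> range:
        block = range(self._next_start, self._next_start + self._block_size)
        self._next_start = block.stop  # the next block starts where this one ends
        return block


class FieldWorkerIds:
    """Draws IDs one at a time from the block given to a field worker."""

    def __init__(self, block: range):
        self._ids = iter(block)

    def next_id(self) -> int:
        try:
            return next(self._ids)
        except StopIteration:
            raise RuntimeError("ID block exhausted; request a new block") from None


allocator = IdBlockAllocator(block_size=3)
worker_a = FieldWorkerIds(allocator.allocate())
worker_b = FieldWorkerIds(allocator.allocate())
print(worker_a.next_id(), worker_a.next_id())  # 1 2
print(worker_b.next_id())                      # 4
```

Choosing the block size is the "latitude" mentioned above: blocks that are too small force frequent requests for new blocks, while blocks that are too large leave long unused gaps in the numbering.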

Primary DSS subjects

DSSs are typically structured around three main subjects (Figure 1.1) within the DSA. These subjects have both a conceptual and a logistical rationale. From a logistical point of view, it is not feasible to interview all individuals directly, and for this reason individuals are put in groups with physical and social meaning, and information is collected from credible and informed respondents within these groups. The conceptual reasons for distinguishing between these subjects are dealt with in greater detail in the following subsections. The three main subjects (Figure 1.1) are residential units, households, and individuals.

Figure 1.1. The three main DSS subjects.

Residential units

All DSSs identify residential units as a primary subject of interest, although they vary in the terms they use for these units (for example, compounds or homesteads) and may also differ slightly in their definition of them. Residency, or physical presence within a DSA at a fixed place of abode and for a sufficiently long period, is an essential prerequisite for the enumeration of individuals at risk for demographic events or disease exposure.

In most systems a distinction is made between places of residence and other structures, such as clinics, schools, churches, and stores. Some systems have nevertheless adopted a unifying term for all these structural units, which has conceptual merit because the units share many characteristics and because a single term simplifies the database hierarchy. In such systems an inclusive term such as bounded structure may be used at the higher level, with compounds (or homesteads) and facilities at the more specific level.

Households

Households may be variably defined in one or more of the following ways:

The definition of household and its applicability both as a concept and as a separate DSS subject may vary greatly from one DSS to another. Households may simply be seen as fixed social subunits within residential units. In more complex systems, they may be seen as independent subjects able to change their place of residence while preserving their social identity, and they may have members who are resident elsewhere. In such a system, a clear distinction would be needed between residency, which defines the state of being physically present in a given residential unit for a defined threshold of time, and membership, which defines the state of belonging to a social group irrespective of physical presence. These concepts have a clear overlap with the related concepts of de facto population (persons who are physically present in a place) and de jure population (persons who usually reside in a given place), respectively. The concepts of residency and membership are discussed later in this chapter.

Individuals

Individuals are people, differing in age, sex, and other personal characteristics, who are residents of DSS residential units or members of DSS households. Their personal characteristics may be fixed (sex, date of birth) or change over time (age, marital status). Unless their changes are predictable (like the yearly increment of age), changing characteristics will need to be recorded repeatedly, or their changes will need to be recorded as events, to produce longitudinal trends.

Eligibility

Every DSS is required to define the population under surveillance. As most individuals within any population have places of residence and attachments to social groups, the task of defining the population begins with the identification of the residential units, households (where applicable), and individuals that will be visited and observed. Thereafter, a set of inclusion criteria must be applied to distinguish eligible from ineligible individuals or subjects within each subject category.

As residential units have fixed geographical positions in all DSSs, there are consistent and simple rules for their inclusion: they are included if they are situated in the DSA. In DSSs that deal with households as distinct (and potentially mobile) subjects, these households are eligible if (and while) they are situated in the DSA. This is what is referred to as household residency.

Rules for individuals, particularly in highly mobile populations, are more complex. The most typical approach is to simply base their eligibility on residence, that is, physical presence. Individuals are eligible if (and while) they are resident at eligible residential units. This is what is referred to as individual residency. Another approach, based on social linkages, rules that individuals are eligible if (and while) they are members of eligible households. This requires careful and consistent definitions of household and membership and can allow individuals who are not resident to remain as members of the household and therefore to qualify for observation.

Residency and membership

Clear geographical boundaries for the DSA and well-defined physical boundaries for residential units are minimal prerequisites for following up DSS subjects consistently and arriving at numerators and denominators for rate calculations. In systems where residential units and households are separate subjects and there is a separate relationship between individuals and each of those subjects — expressed as residency and membership, respectively — these concepts become substantially more complex.

Observing an individual’s presence in, or absence from, a specific residential unit requires clear rules for residency status. The physical presence of an individual for a very short time may not be taken into account when the amount of time spent in the residential unit is computed. Conversely, the noncontinuous presence of an individual, with short periods of absence, may be considered continuous residency if he or she meets a threshold for inclusion.

Residency and membership statuses are assigned at the start of the DSS, based on prescribed eligibility rules. Thereafter, new residency episodes may commence as a result of births or in-migrations exceeding a prescribed threshold of duration, and current residency may end because of deaths or out-migrations, again exceeding a prescribed threshold of duration. New membership episodes may commence as a result of events that initiate a social relationship with a household, such as birth, marriage, adoption, or household formation, and may be terminated by events that end such a relationship, such as death, divorce, or household dissolution.

Core DSS events

To know the size of the registered resident population at any time, a DSS collects information about three core events that alter this size, namely, births, deaths, and migrations. These events are described by the following fundamental demographic equation:

$$P_{t_1} = P_{t_0} + B_{t_0,t_1} - D_{t_0,t_1} + I_{t_0,t_1} - O_{t_0,t_1}$$ [1.1]

where P is the population; B is the number of births; D is the number of deaths; I is the number of in-migrants; O is the number of out-migrants; and t0, t1 is the time interval of their occurrence.
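As a minimal sketch of how this balancing equation can be applied, the following Python function updates a population count over one interval. The function name, argument names, and the figures in the example are illustrative assumptions and do not come from any DSS.

```python
def population_at_end(p_start, births, deaths, in_migrants, out_migrants):
    """Apply the balancing equation [1.1] over one interval (t0, t1).

    All names here are hypothetical; the counts are the events observed
    between t0 and t1 in the registered resident population.
    """
    return p_start + births - deaths + in_migrants - out_migrants


# Made-up figures for one year of surveillance.
p_t1 = population_at_end(p_start=10_000, births=350, deaths=120,
                         in_migrants=400, out_migrants=480)
print(p_t1)  # 10150
```

A check of this kind can also be run in reverse: if the population counted at t1 differs from the value implied by the recorded events, some events have probably been missed or double counted.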

An underlying principle for recording events in a DSS is that of a population at risk. Mortality, fertility, and migration rates are calculated by counting the number of deaths, births, or migrations occurring within a registered population exposed to the risk. For example, an individual who is not resident within the DSA is not considered at risk of dying within the area. Consequently, most DSSs do not observe nonresident individuals or households and do not record their events.

Births and fertility

Pregnancies and their outcomes for all women registered in the DSS are recorded regardless of the place of occurrence of such events. The recording of births has two purposes: for estimating fertility and for identifying a criterion for registering an individual. To estimate fertility, a DSS should record all pregnancy outcomes, including miscarriages (<28 weeks), induced abortions, stillbirths (≥28 weeks), and live births. All live births are then registered as individual members of the DSS, independent of subsequent survival. In some DSSs, fieldworkers take note of live births to visitors to the DSA to alert the data collector in the next round to register the mother (if she becomes eligible) and her child. This procedure is very helpful, as it greatly improves the accuracy of dates of birth of newly born babies and increases reporting of births from eligible mothers with frequent in- and out-migration.

Although most DSSs will report their estimates of the fertility of a specific age group of women, usually 15–49 years, they should also record births to women outside this age group.

The underreporting of pregnancies and their outcomes is a major problem across all DSSs. Some DSSs have used the recording of pregnancies during routine update visits to improve birth coverage. Pregnancy observation has also been used to increase the reporting of other pregnancy outcomes, particularly miscarriages, induced abortions, and stillbirths. However, this requires an update-visit interval of <5 months so that a notification of pregnancy can be obtained in one round, followed by the recording of the pregnancy outcome in the next visit.

Deaths and mortality

Deaths of all registered and eligible individuals are recorded, regardless of the place of death. It may be impossible to record the deaths of previously eligible individuals who then out-migrated. In this case, observation of their survival is censored at the time of migration. Information about the death of visitors to the DSA is sometimes collected, but it is only used in mortality estimates if a de facto population estimate is available for each day.

Underreporting of deaths is typically less of a problem than that of births, because a death is widely known and remembered. Exceptions are the deaths of young (and yet unregistered) infants, particularly perinatal deaths, if cultural beliefs or grief hinders reporting.

Some DSSs collect more detailed information about deaths to establish the cause of death, generally through the so-called verbal autopsies (VAs).

Migrations and mobility

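Two types of migration events occur: external migration, into or out of the DSA, and internal migration, between residential units within the DSA.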

Where nonresident household members are ignored, only external migration affects the size of the population, resulting in either the registration of a new in-migrant or the termination of follow-up of an out-migrant. However, recording internal migration is very important to ensure the accuracy and validity of DSS data. The DSS needs to identify internal migrations and migrants and collect supporting information to avoid double counting of individuals and to ensure that their exposure to the social and physical environment is correctly apportioned. Migrations influence the registration of births and deaths; for example, a death would not be recorded for an individual who out-migrated before his or her death.

Defining the circumstances under which a migration is acknowledged to have occurred is notoriously difficult, not only for DSSs, but even for vital-registration systems and censuses. Different DSSs have different criteria. One approach, generally known as the “50% rule,” considers individuals resident if they have spent most of the time between two data-collection visits within the DSA. Any former resident who has not spent at least 50% of the time in the DSA would be recorded as having out-migrated.
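The "50% rule" can be sketched as a simple test on the time spent in the DSA between two visits. The function below is an illustration only: its name and arguments are hypothetical, and it assumes that the number of days present has already been established in the interview.

```python
from datetime import date


def resident_under_50_percent_rule(previous_visit, current_visit, days_present):
    """Return True if the individual spent at least 50% of the interval
    between two data-collection visits within the DSA (hypothetical helper)."""
    interval_days = (current_visit - previous_visit).days
    if interval_days <= 0:
        raise ValueError("current_visit must be later than previous_visit")
    return days_present / interval_days >= 0.5


# Example: a 121-day interval between two update rounds.
print(resident_under_50_percent_rule(date(2000, 1, 1), date(2000, 5, 1), 70))  # True
print(resident_under_50_percent_rule(date(2000, 1, 1), date(2000, 5, 1), 40))  # False
```

A former resident for whom the function returns False would be recorded as having out-migrated. Sites using other thresholds would change the 0.5 accordingly.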

However, many rural communities have individuals who regularly and predictably change residence for seasonal work, employment, or educational opportunities. The terms circular and pendular migration are often used. In the Hlabisa DSS, a newly established system in an area of very high population mobility, individual residency has been replaced with household residency as a registration criterion. Consequently, although out-migrations are recorded, the fieldworkers do not automatically terminate follow-up observations.

Migration is a repeatable event — an individual may make several migrations over time, both internally and externally. To maintain longitudinal integrity of data concerning individuals, a DSS should establish whether an external in-migrant has previously been registered in the DSS. The individual’s current and previous records should be matched so that he or she is not handled as a new individual in the system but as an individual under observation for several periods.

Episodes

Episodes are a logical complement to events. They are meaningful and identifiable segments of time started and ended by events. The life of an individual, for instance, can be understood as an episode that started with the individual’s birth and ended with his or her death. In the same way, residential units or households can be said to be episodes that start when they are formed and end when they are dissolved.

The usefulness of the concept of episodes is not limited to primary subjects. It applies equally to associations between them and therefore provides a useful framework for handling residency, membership, marital status, and many other concepts. Episodes also make it much easier to formulate and implement validation rules regarding events.
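One way to picture an episode is as a record with a start event and an optional end event. The sketch below is an assumption about how such records might be represented and validated; the class, field names, and the two rules checked (an episode cannot end before it starts, and residency episodes of one subject cannot overlap) are illustrative and are not taken from any particular DSS.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Episode:
    subject_id: str
    start: date
    start_event: str                  # e.g. "birth", "in-migration", "enumeration"
    end: Optional[date] = None        # None while the episode is still open
    end_event: Optional[str] = None   # e.g. "death", "out-migration"


def validate_episodes(episodes: List[Episode]) -> List[str]:
    """Return a list of problems found in one subject's residency episodes."""
    problems = []
    ordered = sorted(episodes, key=lambda e: e.start)
    for i, ep in enumerate(ordered):
        if ep.end is not None and ep.end < ep.start:
            problems.append(f"{ep.subject_id}: episode ends before it starts")
        if i > 0:
            prev = ordered[i - 1]
            # An open or later-ending previous episode overlaps this one.
            if prev.end is None or prev.end > ep.start:
                problems.append(f"{ep.subject_id}: overlapping episodes")
    return problems
```

For example, an individual who out-migrates and later returns would hold two consecutive episodes, and the validation would flag the second one if the first had never been closed.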

Other events

In addition to births, deaths, and migrations, other events are of interest for our understanding of demographic, health, and social dynamics. One event on which data are commonly collected relates to nuptiality or marital status. Most DSSs collect information about events such as marriage, defined as an event that starts a marital relationship, and divorce, that is, an event that ends a marital union. Other events recorded by DSSs depend on their complexity and research interests but may include the change of a head of household, a household’s formation or dissolution, or the construction or destruction of building structures.

Nuptiality and conjugal relationships

DSSs collect data on nuptiality primarily because of the important influence of marital patterns on fertility. Marriage as a start of an episode is easily identified, although a period of sexual union may have preceded marriage. The ending of a conjugal relationship can be less clearly marked, because it may not always be the death of one of the partners or a divorce, but a period of separation. In DSAs where the nonmarital fertility rate is high, other conjugal relationships become important, and the systems record informal relationships as well as formal marriages. However, in taking on this broader approach to sexual relationships, the DSSs must overcome two hurdles:

Construction and disintegration of residential units

At any given time, new residential units may be under construction and other residential units may be at various stages of disrepair following natural disasters or abandonment. The physical state may be distinct from the functionality of the residential unit; that is, it is possible that a residential unit is physically intact but long abandoned, and apparently broken-down units may still have households and individuals living in them. It is also possible that broken-down or destroyed units may subsequently be rebuilt, when the owner returns.

As the state of the residential unit is often, though not always, a good indication of its functionality, a DSS should make provision to track both its physical state and function.

Events occurring in households

Similarly, households can go through important changes affecting their composition and socioeconomic and health conditions. New households may form within an existing residential unit when, for example, a son takes a wife and establishes a family of his own or when a polygynous man takes another wife. Separate households may merge to form a new household, or a complete household may move to settle at another residential unit. Households may lose one or more members over time and decrease in size, or they may completely dissolve through a process of slow attrition or a major environmental or social disaster.

In environments with substantial social flux and instability, it is important to keep track of these events and their effects on the formation and dissolution of households. This is essential if DSSs have conceptualized households as subjects in their own right. Because they also influence patterns of individual presence at a residential unit, these household changes have important implications for the composition of the residential unit as a whole.

Chapter 2
DSS-GENERATED MORTALITY RATES AND MEASURES

Introduction

This chapter provides definitions and explanations of key DSS-generated mortality rates and measures, as well as describing the methodology employed in calculating them. It is intended for readers unfamiliar with these rates and measures. Their calculation is basic, and the various formulas can be found in standard textbooks (see for example, Shryock and Siegel 1976; Kpedekpo 1982; Newell 1994). These measures have been briefly discussed in this chapter for quick reference, as they form the basis for standardizing the results across DSS sites. Perhaps the most important reason for discussing them is the opportunity it affords to discuss the classic controversy over whether to define some of them as rates or ratios (for example, infant mortality, under-five mortality, and maternal mortality). Furthermore, this chapter provides an explanation of the need for a standard population and introduces the INDEPTH standard population for Africa south of the Sahara, discussed in greater detail in Part II.

Rates and ratios

Rates and ratios are frequently used in measuring demographic events. Rate refers to the frequency of events. A rate is estimated by taking the number of events in a given period and dividing it by the population at risk during that period. Pressat (1985, p. 194) stated that the term rate

is also used more loosely to refer to the ratio between a sub-population and the total. . . . In many other uses of rate, the measure in question would be better termed a ratio, proportion, or probability. The term can be justified only when a dynamic process is being measured, not a static description of a population at a given date, although its use in the latter sense is widespread. In general the word ratio is preferable to rate when the measure is not one relating events to a population at risk.

A ratio is the proportion between a numerator and a denominator that are related (for example, under-five child deaths per 1000 live births in a given year).

Crude death rate

The crude death rate (CDR) is defined as the number of deaths in a given period divided by the total population. Although the CDR can be computed for any segment of time, the period usually used is a year, and the denominator used in the rate calculation is the midyear population. The midyear population is the size of the population (or any specified group within the population) at the midpoint of a calendar year. It is often estimated as the arithmetic mean of the size of the population at the beginning and end of the year. Conventionally, the rate is expressed as a number per 1000 individuals.

In the case of a population under continuous surveillance, with possibly high in- and out-migration rates that may yield a strong variation in population size, the use of exact person–years lived is preferred. Person–years is the sum, expressed in years, of the time spent by all individuals in a given category of the population (Pressat 1985). Specifically, these years express the periods that eligible individuals spent in the DSA. Times or periods spent outside the DSA due to migration or death are excluded.
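The calculation of person–years and of a CDR based on them can be sketched as follows. This is a minimal illustration: the function names are hypothetical, each residency episode is assumed to be a (start, end) pair of dates, and a year is taken as 365.25 days, which is a simplifying assumption.

```python
from datetime import date


def person_years(episodes, period_start, period_end):
    """Sum the time, in years, that residency episodes contribute
    within the observation period. Each episode is a (start, end) pair."""
    total_days = 0
    for start, end in episodes:
        lo = max(start, period_start)   # clip the episode to the period
        hi = min(end, period_end)
        if hi > lo:
            total_days += (hi - lo).days
    return total_days / 365.25          # assumed average year length


def crude_death_rate(deaths, pyears):
    """Deaths per 1000 person-years lived."""
    return 1000 * deaths / pyears


# One continuous resident and one in-migrant arriving in mid-2001.
episodes = [(date(2000, 1, 1), date(2003, 1, 1)),
            (date(2001, 7, 2), date(2003, 1, 1))]
print(person_years(episodes, date(2001, 1, 1), date(2002, 1, 1)))
```

In this example the first individual contributes the whole of 2001 and the second only the time after in-migration, so the two together contribute roughly one and a half person–years.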

Age-specific death rate and ratio

Because of the differentials in exposure to the risk of dying, epidemiologists and demographers often use age-specific death rates (ASDRs) and sex-specific death rates, instead of the CDR. ASDRs are the most commonly used. The ASDR for an age group is defined as the number of deaths in the age group in a specific period divided by the total number of person–years lived in that age group during that period and multiplied by 1000. Demographers often use a slightly different denominator: they express the ASDR of a particular age group as the deaths among individuals in that age group in the year, divided by the midyear population of that age group and then multiplied by 1000. Five-year age groups are common, although age categories vary according to the purpose of study.

The following discussion of infant, under-five, and maternal mortality measures highlights the classic controversy over whether to define these measures as rates or ratios. The denominator used in calculating a measure determines whether it is a rate or a ratio. As stated earlier, the measure is a rate when the total number of individuals at risk is used as the denominator, and it is a ratio when some other event is used as the denominator.

Infant mortality

It is usually difficult to estimate the number of person–years lived for children <1 year old (infants). Consequently, the total number of live births is often used as the denominator to calculate the infant mortality rate. The total number of deaths among children <1 year old in a calendar year is divided by the live births in the same year, multiplied by 1000. Calculating the infant mortality rate in this way makes it more appropriately referred to as a ratio.

Infant deaths are unevenly distributed through the first year of life. A high proportion of infant deaths usually occurs in the first month of life. Of these deaths, a high proportion occurs during the first week of life; and of these, a high proportion occurs during the first day. The conventional infant mortality rate or ratio may usefully be broken up into rates or ratios covering the early stages of life and a rate or ratio for the remainder of the year. The one for the first period is called the neonatal mortality rate or ratio, and that for the second period is called the postneonatal mortality rate or ratio. These concepts are briefly defined in the following paragraphs.

Neonatal mortality refers to the deaths of infants <4 weeks old (or <1 month old) during a year. It is calculated by dividing the deaths of infants <28 days old during a year by the live births in the same year and multiplying by 1000. Early neonatal mortality is calculated by dividing the deaths of infants <7 days old during a year by live births in the same year and multiplying by 1000. Late neonatal mortality is calculated by dividing the deaths of infants 7–27 days old in a year by live births in the same year and multiplying by 1000. Postneonatal mortality is calculated by dividing the deaths of infants 4–51 weeks old during a year by live births in the same year and multiplying by 1000.

Infant mortality can also be expressed as a probability of dying before reaching the age of 1 year. Perinatal mortality is calculated by dividing the sum of stillbirths in the year and the deaths of infants <7 days old during the year by the sum of stillbirths in the year and live births in the same year.
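The infant mortality components defined above can be sketched as one calculation from a list of ages at death. The function and its argument names are hypothetical. Two choices are assumptions rather than statements from the text: late neonatal deaths are counted from day 7 up to but not including day 28, so that the age bands do not overlap, and perinatal mortality is scaled per 1000 total births.

```python
def infant_mortality_measures(ages_at_death_days, live_births, stillbirths):
    """Ratios per 1000 live births, from ages at death (completed days)
    of infants who died in the year. Perinatal mortality is expressed
    per 1000 total births (stillbirths plus live births), an assumption."""
    early = sum(1 for a in ages_at_death_days if a < 7)
    late = sum(1 for a in ages_at_death_days if 7 <= a < 28)
    post = sum(1 for a in ages_at_death_days if 28 <= a < 365)

    def per_1000(n):
        return 1000 * n / live_births

    return {
        "early_neonatal": per_1000(early),
        "late_neonatal": per_1000(late),
        "neonatal": per_1000(early + late),
        "postneonatal": per_1000(post),
        "infant": per_1000(early + late + post),
        "perinatal": 1000 * (stillbirths + early) / (stillbirths + live_births),
    }


# Made-up data: seven infant deaths, 1000 live births, 10 stillbirths.
measures = infant_mortality_measures([0, 3, 10, 20, 40, 200, 300], 1000, 10)
print(measures["neonatal"], measures["infant"])  # 4.0 7.0
```

Because the three age bands partition the first year, the neonatal and postneonatal values always add up to the infant value.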

Under-five mortality

Some consider the under-five mortality as a ratio expressing the number of deaths of children <5 years old divided by the number of live births in a year and then multiplied by 1000. Others treat it as a rate, calculating it by dividing the number of deaths of children <5 years old by the total number of person–years of children <5 years old and multiplying by 1000. When under-five mortality is presented as a probability of dying before age 5, it is expressed as 5q0.

Maternal mortality rate and ratio

Most DSSs record all pregnancies and their outcomes as well as deaths. As such, they have the potential to provide accurate, up-to-date estimates of maternal mortality rates and ratios. Maternal mortality is conventionally defined as the number of deaths due to puerperal (pregnancy-related) factors per 100 000 live births. Strictly speaking, this measure is a ratio, because the denominator is not the persons at risk of experiencing the event. The two measures are therefore estimated as follows. The maternal mortality ratio is calculated by dividing the number of pregnancy-related deaths in a specified period by the number of live births in the same period and multiplying by 100 000. The maternal mortality rate is calculated by dividing the number of pregnancy-related deaths in a specified period by the person–years lived by women of childbearing age and multiplying by 1000.

Maternal mortality can also be estimated by relating maternal deaths to women of reproductive age or to all pregnancies, including stillbirths and abortions.
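The difference between the maternal mortality ratio and rate lies only in the denominator and the multiplier, as the following sketch shows. The function names and the figures are illustrative assumptions.

```python
def maternal_mortality_ratio(maternal_deaths, live_births):
    """Pregnancy-related deaths per 100 000 live births."""
    return 100_000 * maternal_deaths / live_births


def maternal_mortality_rate(maternal_deaths, pyears_women):
    """Pregnancy-related deaths per 1000 person-years lived
    by women of childbearing age."""
    return 1000 * maternal_deaths / pyears_women


# Made-up figures: 6 deaths, 1200 live births, 8000 person-years.
print(maternal_mortality_ratio(6, 1200))  # 500.0
print(maternal_mortality_rate(6, 8000))   # 0.75
```

The same number of deaths therefore yields very different values, which is one reason the two measures should be labelled carefully when results are compared across sites.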

Standardization

Age-standardized death rate

Crude mortality rates are inappropriate for comparing different populations within the DSS sites because of the different age structures within the sites. On the other hand, a single parameter is required for simple comparison. Therefore, standardized rates are used, in which the age-specific mortality rates are combined using a standard population. An INDEPTH standard population for sub-Saharan Africa (SSA) has been developed (see Table 6.2). More details on the INDEPTH standard population are provided in Chapter 6. The Segi (1960) and the new World Health Organization (WHO) standard age distributions are also shown in Table 6.2.

Age-standardized rates are weighted averages of the age-specific rates, where each weight is the proportion of the standard population in the respective age group. The summation runs over all age groups.
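Direct standardization of this kind can be sketched in a few lines. The function name is hypothetical, and the age groups and standard-population figures in the example are made up for illustration; they are not the INDEPTH, Segi, or WHO standards.

```python
def age_standardized_rate(deaths, pyears, standard_pop):
    """Directly standardized death rate per 1000.

    deaths, pyears, and standard_pop are dicts keyed by age group.
    Each weight is the share of the standard population in that group.
    """
    total_std = sum(standard_pop.values())
    rate = 0.0
    for age, std in standard_pop.items():
        asdr = deaths[age] / pyears[age]      # age-specific death rate
        rate += (std / total_std) * asdr      # weighted by standard share
    return 1000 * rate


# Illustrative three-group example.
deaths = {"0-14": 20, "15-59": 10, "60+": 15}
pyears = {"0-14": 1000.0, "15-59": 2000.0, "60+": 300.0}
standard = {"0-14": 40, "15-59": 50, "60+": 10}
print(round(age_standardized_rate(deaths, pyears, standard), 2))  # 15.5
```

In this example the crude rate would be 45 deaths per 3300 person–years, or about 13.6 per 1000, so the standardized value differs because the standard population gives more weight to the oldest group than the observed population does.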

Confidence intervals for rates

Estimates of the mean and standard deviation of a population are usually needed if it is impossible to deal with the entire population. The standard deviation of a distribution of sample means is referred to as the standard error of the mean. It measures how precisely the sample mean estimates the population mean. For example, about 95% of the sample means obtained by repeated sampling would lie within two standard errors below or above the population mean. Based on the sample mean and its standard error, a range of likely values can be constructed for a population mean that is not known. This range is referred to as a confidence interval. More precisely, there is a 95% probability that a particular sample mean lies within 1.96 standard errors above or below the population mean, so a 95% confidence interval extends 1.96 standard errors on either side of the sample mean.

Confidence intervals can be calculated for the ASDRs. The variance of the CDRs or the ASDRs is used instead of the means. Estève et al. (1994) discussed the method in detail. For a small number of deaths or for small populations, however, confidence intervals for ASDRs are not reliable, because the formula used to calculate them is too imprecise. The question is then one of how large the numbers of deaths and populations must be to give reliable results. It is difficult to supply a rule of thumb, and as Estève et al. (1994, p. 58) noted,

It is however difficult to tell what “sufficiently large” means in the present context because the numerator of a standardised rate is no longer a Poisson variable. Its variance depends not only on the total number of observed cases but also weighting scheme and the accuracy of the age-specific rates.
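As an illustration only, the sketch below computes approximate 95% confidence intervals for a crude or age-specific rate and for a directly standardized rate. It uses a common normal approximation that treats each death count as a Poisson variable; this is an assumption, it is not necessarily the exact formula given by Estève et al. (1994), and it is subject to the small-numbers caveat discussed above. Function names are hypothetical, and the weights are assumed to sum to 1.

```python
import math


def rate_ci(deaths, pyears, z=1.96, per=1000):
    """Approximate CI for a rate, assuming Poisson-distributed deaths:
    standard error = sqrt(deaths) / person-years."""
    rate = deaths / pyears
    se = math.sqrt(deaths) / pyears
    return per * (rate - z * se), per * (rate + z * se)


def standardized_rate_ci(deaths, pyears, weights, z=1.96, per=1000):
    """Approximate CI for a directly standardized rate.
    Variance = sum over age groups of w^2 * deaths / person-years^2,
    again under the Poisson assumption. Weights must sum to 1."""
    rate = sum(weights[a] * deaths[a] / pyears[a] for a in weights)
    var = sum(weights[a] ** 2 * deaths[a] / pyears[a] ** 2 for a in weights)
    se = math.sqrt(var)
    return per * (rate - z * se), per * (rate + z * se)


# 100 deaths in 10 000 person-years: rate 10 per 1000.
low, high = rate_ci(100, 10_000)
print(round(low, 2), round(high, 2))  # 8.04 11.96
```

With few deaths the lower limit from this approximation can even become negative, which is a practical sign that the numbers are too small for the method.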

Chapter 3
DSS METHODS OF DATA COLLECTION

Introduction

Knowledge of the methods for collecting or compiling data at the DSS sites is essential because these methods influence the ways that data are processed, analyzed, and interpreted. The most common demographic methods used in data collection are censuses, sample surveys, and vital-events registration systems. The last method, however, is nonexistent or only partially applied in many developing countries. Given the paucity of vital-events registration and knowledge on population or health-status trends in such settings, demographic and health surveys have been introduced for health planning, practice, evaluation, and allocation of resources. Demographic estimates undertaken in developing countries have employed both indirect and direct methods, using retrospective single-round surveys and prospective multiround ones (Tablin 1984).

Indirect estimation methods rely on information obtained from subjects not directly at risk of a particular demographic phenomenon. The indirect methods can be used to estimate levels and trends of fertility, mortality, and migration where data sources are defective or incomplete. An example of an indirect method is the estimation of infant and child mortality from proportions of surviving children or the estimation of adult mortality from those orphaned. Indirect estimation methods are also used to assess data collected using conventional methods. Such data are compared with other information to infer a certain pattern, on the basis of certain assumptions. If this pattern is reproduced then data can be further inferred. Indirect estimation may, in addition, involve fitting of demographic models to fragmentary and incomplete data (Pressat 1985). The results obtained are used to estimate a particular parameter.

Direct methods use data on the people at risk to establish a demographic measure and pattern. These methods rely on data obtained from censuses, surveys, and recorded data on the components of change — that is, births, deaths, and migration. Data obtained from these methods are used directly to provide estimates of demographic phenomena, such as fertility, mortality, and migration. An example of a direct method is the use of the number of children born to women of a particular age group to estimate age-specific fertility rates.

In single-round surveys, a population is enumerated once during a survey, and retrospective data are gathered on past events (Kpedekpo 1982; Tablin 1984; Newell 1994), such as a birth or death that occurred in the last year (or a life and maternity history). This method may result in overestimation or underestimation of events, as a result of memory lapse. Respondents may exclude events from the reference period. It has been argued that an underestimation of 30–40% is likely using this method (Tablin 1984). Some examples of single-round surveys are the World Fertility Survey and the Demographic and Health Surveys.

Prospective surveys involve repeat visits (longitudinal data collection) to the same respondents or the same study area (Pressat 1985). All DSS sites employ this method of data collection. This does not mean, however, that the methodological approach is the same across all sites. Sites each have unique features, as shown in the various site chapters of this monograph. The purpose of this chapter is therefore to provide a general description of the data-collection methods used by the DSS sites. The data-collection methods are described to provide a quick reference for the reader, rather than to recount experiences with data collection. Occasionally, specific examples are provided from sites for clarification.

Establishing the monitored population

Selection and establishment of the DSA are prerequisites of any DSS site, but no specific sampling method has to be employed in the selection of an area. Depending on the nature of the study, sites employ probability or nonprobability sampling methods, or both, in drawing their sample population. Once an area has been selected, the community has to be mobilized to prepare it to participate in the research and to ensure its compliance. Mobilization activities involve conducting sensitization meetings with influential opinion leaders, such as councillors and village, hamlet, or religious leaders. During these meetings, the DSS staff presents and clarifies the project’s objectives and expected output and outlines its anticipated activities. Other sensitization methods include drama and sports activities involving the project staff and the community.

As DSSs are longitudinal studies, staff also have to maintain the community’s compliance with DSS activities longitudinally, and this means that mobilization of the community is not limited to the initial stages but has to be a continuous process. Compliance is maintained in a variety of ways across sites, including giving feedback to the community through presentation of results in simple tables or graphics, production and circulation of a newsletter, meetings with the key informants at regular intervals, and presentations of findings to health-management teams.

In terms of the minimum and maximum population size under DSS, there is no consensus. DSS sites can have a variety of population sizes under surveillance. For example, Butajira DSS (Ethiopia) began with a sample of 28 616 people (Berhane et al. 1999), whereas Navrongo DSS (Ghana) and Rufiji DSS (Tanzania) had, respectively, 124 857 and 85 102 people 1 year after they began operations (Binka et al. 1999; Mwageni and Irema 1999). The Adult Morbidity and Mortality Project (AMMP, Tanzania) has three sites and more than 300 000 people under surveillance (TMH 1997). The site chapters give more details on the sample sizes of the various DSS sites.

Planning for data collection

Any data-collection exercise requires advance planning and recruitment and training of field staff, such as enumerators and supervisors. It also involves the designing and printing of DSS forms and the preparation of field or training manuals. DSS enumerators are normally recruited from among those local individuals who meet minimum qualifications set for specific projects. Training focuses on proper ways to use DSS forms, conduct interviews, and handle various field forms. Field or interview manuals are used for training and are eventually provided to all field staff as reference materials during data collection. The training manuals clearly indicate the duties and responsibilities of the field staff. In addition, the staff may receive training on how to use or operate field equipment, such as motorcycles. The field staff are given periodic training on field operations to keep up to date on data-collection techniques.

Initial census

Data collection to establish the baseline population begins with a census, conducted by trained enumerators living in the study area. As stated earlier, they are trained on how to use DSS forms and conduct interviews. The initial census establishes the foundation for a longitudinal surveillance system and helps obtain background data on the subjects. Data are collected using standard questionnaires, with closed- or open-ended questions, or both. Separate questionnaires are used to collect household and individual data. The structured questionnaires comprise at least two sections: the header, for recording the unit of interest; and the main part, for recording basic information (see example 1 in Appendix 1).

The type of data collected during the initial censuses depends on the specific objectives of the site. In many sites, data are collected on variables such as household composition (household head, relation to household head, etc.), culture (religion and ethnicity), demographic data (age, sex, marital status), and socioeconomic data (education, occupation, etc.). In addition, the DSS can collect data on behavioural issues (alcohol consumption, smoking, etc.), housing, health-care use, and environmental conditions (source of drinking water, sanitation facility, etc.).

For identification purposes, each household and individual registered is assigned a unique number within its village and his or her household, respectively. A series of numbers for each individual may be used to identify the village, the household, and the individual within the household. The number allocated to the individual is permanent. In some systems, if an individual moves to a new area, the number is still used to identify that person. In this way, it is possible to monitor migrants, as will be shown.
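A composite identifier of this kind might be built and decoded as sketched below. The format shown (two digits for the village, four for the household, two for the individual, separated by hyphens) is a purely hypothetical example; each site defines its own numbering scheme.

```python
def make_individual_id(village, household, individual):
    """Build a permanent identifier from village, household, and
    individual numbers (hypothetical VV-HHHH-II layout)."""
    return f"{village:02d}-{household:04d}-{individual:02d}"


def parse_individual_id(pid):
    """Split an identifier back into its three numeric parts."""
    village, household, individual = pid.split("-")
    return int(village), int(household), int(individual)


pid = make_individual_id(3, 127, 5)
print(pid)                       # 03-0127-05
print(parse_individual_id(pid))  # (3, 127, 5)
```

Because the number is permanent, it continues to identify the household of first registration after a move, so the individual's current location has to be stored separately from the identifier.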

Update rounds

The longitudinal system of data collection then continues with periodic visits to registered households. The purpose of the visits is to record vital changes or events since the previous visit. These may include births or other pregnancy outcomes, changes in marital status (marriages, divorces, separations, reconciliations), deaths, and migrations. Regular data collection is undertaken to maintain accurate denominators for estimation of age-, sex-, and cause-specific death rates. The DSS approach has no specific interval for periodic visits to the registered households (Indome et al. 1995). Yet, it is important to ensure that the interval chosen between interview rounds is consistent for any given household or area. Provided they are consistent, periodic-visit cycles may range from 1 to 12 months.

During the periodic visits or updates, the status of each individual is verified using the household-registration or -record books (see example 2 in Appendix 1) or forms. The registration books are computer printouts of information on households and their members collected in the initial census. They are systematically arranged by household to facilitate further visits or household contacts. These books can be printed in rows and columns to maintain several rounds of data collection. The information on rows may correspond to individual members, as well as details of a household, whereas the columns have spaces for filling in vital events detected in each DSS round. However, all vital events have to be registered on specific event forms. These forms may include observation of pregnancies, births, deaths, and marital changes (see examples 3–5 in Appendix 1). These are forms used in the Butajira, Navrongo, and Rufiji DSSs.

Interviewers correct any errors they note during update rounds in the respective book and also fill out a changes form. The changes form requires the unique number of the household or individual, the change to be made, the original information, and the correction. Corrections that may require filling in the changes form include those for age, name, sex, missed members of a household, and relationship to the head of household. Eventually, these forms are taken to the data centre for correction of databases. This means that in DSS sites data are collected in conjunction with data-management operations (details on data management are provided later in this monograph). In most cases, the fieldwork and computer cycles coincide. Figure 3.1 summarizes the linkage between field and computer operations in Rufiji DSS. This linkage aims at maintaining the integrity of data, as well as ensuring timely reporting of findings. Upon completion of interviews in the household (during the initial census or updates), the forms are taken to the computer centre for data entry. Errors noted during quality control (for details, see Chapter 5) or data entry are verified, reported to the field staff for diagnosis, and later corrected in both the household-registration book and the computer databases.

Updating of vital events is not the only activity carried out during these periodic visits. During update rounds, enumerators register new people or households. These include the migrants, the newly married, and any individuals missed during the initial census. The longitudinal system allows individuals to enter or exit the DSS at any time. They enter through births or in-migration and exit through deaths or out-migration (Figure 3.2). As these individuals are under surveillance, it is possible to estimate the total time spent by each individual in the study population. This time contribution is called person–years of observation and is used as a denominator to estimate rates of events (such as fertility, mortality, and migration). Details on the uses of person–years of observation appear elsewhere in this monograph.
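As a minimal sketch of how person–years of observation might be computed from entry and exit dates, consider the Python fragment below. The function names, the representation of an episode as a pair of dates, and the approximation of a year as 365.25 days are assumptions made for illustration; sites may count exposure differently.

```python
from datetime import date


def person_years(episodes, period_start, period_end):
    """Sum the time individuals spend under surveillance within a period.

    `episodes` is a list of (entry_date, exit_date) pairs. Entry is a birth
    or in-migration; exit is a death or out-migration, or None when the
    individual is still resident. A year is approximated as 365.25 days.
    """
    total_days = 0
    for entry, exit_ in episodes:
        start = max(entry, period_start)
        end = min(exit_ or period_end, period_end)
        if end > start:
            total_days += (end - start).days
    return total_days / 365.25


def rate_per_1000(events, pyo):
    """Events per 1000 person-years of observation."""
    return 1000 * events / pyo
```

For example, one individual resident throughout a calendar year and another who enters in July and leaves at the end of December together contribute about one and a half person–years to the denominator.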

The periodic visits to registered households make DSS self-checking, allowing data collected in one round to be checked and corrected in successive rounds. This reduces the risk of omitting, forgetting, or misreporting variables or events. During the rounds it is also possible to select subsamples (nested studies) on which to collect data on specific items at marginal extra cost and without disturbing the original purpose of the surveillance. However, where the population is very mobile, a major problem of multiround surveillance is tracking subjects.

Figure 3.1. The linkage between field and computer operations at the Rufiji DSS site, Tanzania.
Source: After Binka et al. (1999). Note: HRB, household-registration book.


Figure 3.2. Prospective monitoring of demographic events.
Source: After Berhane et al. (1999).


Recording demographic events

Monitoring of births and deaths in developing countries is crucial, as these two events are easily omitted from routine statistical records and systems (Binka et al. 1999). This can lead researchers to underestimate their occurrence. A good recording system is needed to capture such events, and key informants can provide one. Key informants are usually senior or respected members of the community (such as village or hamlet leaders) within the DSA. They fill in their registers whenever an event has occurred and report it to the supervisors, who visit them on a regular basis. Ideally, being part of the community themselves, these people should not have to seek out information about pregnancies, births, and deaths but should hear about them in the normal course of life. As an incentive, a common practice is to pay key informants token fees for reporting such events, once they are confirmed by the system. An example of the system for recording events, as practiced in the Rufiji DSS, is summarized in Figure 3.3.

Figure 3.3. Vital-events reporting system at the Rufiji DSS site, Tanzania.
Source: After TEHIP (1996).


In the vital-events reporting system of the Rufiji DSS, key informants observe and record any birth or death occurring in the study area. This information is passed on to the DSS key-informant supervisor (or enumerator, who informs the key-informant supervisor). Within 2 weeks, the key-informant supervisor visits the households where a birth or death has been reported and contacts the data centre for verification of the event. If the information is correct, the key informant is paid a token fee. The key-informant supervisor then administers a verbal autopsy (VA) with one of the deceased’s relatives (who is well informed of the trend of illness of the deceased) for all reported deaths. Enumerators also check births and deaths during fixed enumeration rounds.

Monitoring mortality

Documentation of causes of death has contributed to progress in knowledge of epidemiology and public health. Such documentation allows researchers and policymakers to assess the health status of a population, assign health priorities, study time trends in mortality from specific causes, and evaluate health interventions. Documenting deaths is a common practice in developed countries, where most deaths occur in a medical environment, postmortem autopsies are both feasible and culturally accepted, and vital-events registration is mandatory and complete. In developing countries, however, many deaths occur in the home, with limited or no medical attendance, and postmortem autopsies are rarely possible or complete and vital-events registration is impractical. To assess the cause of death, one must rely on an alternative source of information, that is, an attending relative’s description of symptoms and events preceding death.

The VA is an indirect method employed in DSS sites to ascertain the causes of death from close associates, whom the DSS interviewers question regarding their knowledge of the symptoms, signs, and circumstances leading to the death. Interviewers retrospectively question individuals who were present and can describe what happened during the hours, days, or months preceding a death; a most likely cause of death is then inferred from the sequence and combination of symptoms and events. Specially designed forms (questionnaires) are used to suit the population of interest (TMH 1997). For example, if the study of interest is the mortality patterns of children <5 years old, then a form is designed and structured to cover all signs and symptoms of illnesses that affect mostly children of this age (see example 6 in Appendix 1). There are also special interview forms for deaths of children <31 days old and for deaths of those ≥5 years old. The DSSs use trained medical personnel or laypeople to conduct VAs.

VAs are used in health-care projects involved in research and evaluation of health services. As earlier described, key informants record deaths that occur in their area in a mortality register; this is reported to the interviewers who will conduct the VA. The interviewers make appointments to visit the houses of the bereaved families. On the appointment day, an interviewer visits the house and administers a VA with the caretaker or a close family member of the deceased. The VA questionnaires are designed to suit the settings of the area under surveillance (TMH 1997). Such information as name, age, sex, occupation, and other risk factors is usually collected, in addition to an open history of events leading to the death, previously diagnosed medical conditions, and signs and symptoms that appeared before death. The interviewer can use the questionnaires to record information on use of health facilities before the death, reasons for using or not using a particular health facility, the caretaker’s perception of cause of death, and confirmatory evidence of a cause of death (if available). The cause of death is determined from a combination of these signs and symptoms.

Causes of death can be derived from the VA questionnaires either by physician review or by computer algorithm, depending on the design and structure of the questions. If physicians are asked to do this, then usually two physicians independently code the VA forms and determine the cause of death, using an agreed classification (for example, the WHO International Classification of Diseases [ICD]). In the case of discrepancies, a third physician is asked to code the forms. Computer algorithms rely heavily on the checklist of signs and symptoms recorded on the form. If discrepancies are noted at this level, then the cause of death is categorized as unknown. Discrepant VA forms produced by the algorithm are taken to physicians for diagnosis and coding. Usually, forms with discrepancies are a minority.
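The physician-review procedure can be illustrated with a short Python sketch. The function name reconcile_cause and the cause labels are hypothetical, and the tie-breaking rule (the third coder must agree with one of the first two, otherwise the cause is recorded as unknown) is an assumption made for the example rather than a documented rule of any particular site.

```python
def reconcile_cause(coder_a, coder_b, coder_c=None):
    """Return the agreed cause of death from independent physician codes.

    Two physicians code the form independently. If they agree, that code
    is final. If they disagree, a third physician's code is accepted only
    when it matches one of the first two (an assumed rule); otherwise the
    cause is recorded as "unknown".
    """
    if coder_a == coder_b:
        return coder_a
    if coder_c is not None and coder_c in (coder_a, coder_b):
        return coder_c
    return "unknown"
```

Under this rule a form coded "malaria" and "pneumonia" by the first two physicians is resolved as "pneumonia" if the third physician also codes "pneumonia", and as "unknown" if the third physician assigns yet another cause.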

Tracking migrants

Migration is a complex subject, with a variety of definitions (Pressat 1985; Newell 1994). As such, the definition relies more on the way data are collected and the purpose for which they are collected. Generally, migration refers to movement of people (groups or individuals) that involves a permanent or temporary change of their usual place of residence (Pressat 1985). Migrants are therefore people who change their usual place of residence. According to Kpedekpo (1982), classification of migrants can be based on the following criteria:

Data on migration can be collected in several ways. Censuses, sample surveys, and continuous population registers are the most common (Shryock and Siegel 1976). Censuses and surveys can provide migration data directly (by asking questions about, for example, the number of moves, duration of residence, date of exit or entry, and previous residence) or indirectly (by estimating migration from total counts of population and natural increase of two censuses or counts). The problem with these methods is their failure to detect multiple moves or those that people cannot remember. In addition, past migrants are grouped together with most recent ones. Also, the indirect method requires very accurate data for the two censuses.

A migration history is another way to collect data on migrants. DSS sites collecting migration data employ this method. This is a continuous way of giving data on previous residence of individuals with dates of their moving out and in. In this way, migrants are linked to the database. Special in- or out-migration forms are used to track down migrants (see examples 7 and 8 in Appendix 1). The in-migration form requires more details than that for out-migration. In addition to personal particulars of an in-migrant (sex, date of birth, education, occupation, etc.), information on the date of and reasons for the migration and the place of origin are also gathered. If in-migration involves a household, a household questionnaire is also used to record household characteristics. On the out-migration form, information is recorded on the date of and reasons for the migration and the destination.

DSS sites do not record all the moves but only those within a certain period. For example, the Navrongo DSS considers an individual an in-migrant if this person is in the same place of residence for 3 months (Binka et al. 1994), whereas Rufiji DSS uses a 4-month criterion for the same purpose (TEHIP 1996). The opposite applies to an out-migrant. The purpose of setting these criteria is to find a proxy to determine the residency status of individuals. This status enables estimation of the individual’s overall time contribution to supply denominators for calculation of other demographic measures, such as mortality and fertility.
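A residency criterion of this kind might be applied as in the following Python sketch. The 3- and 4-month thresholds are those reported above for Navrongo and Rufiji; the function names and the convention of counting whole calendar months are assumptions made for illustration.

```python
from datetime import date

# Thresholds in months as reported in the text; the lookup itself is illustrative.
RESIDENCY_MONTHS = {"Navrongo": 3, "Rufiji": 4}


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end (assumed convention)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1
    return max(months, 0)


def qualifies_as_in_migrant(site: str, arrival: date, visit: date) -> bool:
    """True once the person has stayed at least the site's threshold."""
    return months_between(arrival, visit) >= RESIDENCY_MONTHS[site]
```

A person who arrived on 15 January and is found at the same residence on 20 April has stayed three whole months, which would satisfy the Navrongo criterion but not yet the Rufiji one.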

Additional rounds of data collection

The previous sections have focused on collection of data for demographic variables, mainly births, deaths, and migrations. All these can be considered extradynamic events, as they change frequently within a year. Other variables are constant or change slowly, such as socioeconomic aspects like education, occupation, housing conditions (floor, roofing material), health-care use (like vaccination), and environmental conditions (like source of drinking water and sanitation facilities). Such information can be collected once a year, preferably at the beginning of each calendar year.

A DSS can have other nested studies to capitalize on its population database and organizational infrastructure. Such studies employ a variety of designs, such as cohort, cross-sectional, and case referent, depending on the specific primary purpose of each study, and these studies are usually linked to the longitudinal surveillance system. The Butajira DSS, for example, used its database as a sampling frame for a study population and used the routine surveillance to follow subjects in various studies of acute respiratory infections (Berhane et al. 1999). In Tanzania, a new study aimed at monitoring a program for antimalarial combination therapy uses the Ifakara, Morogoro (AMMP), and Rufiji DSAs.

Such nested studies in the DSS sites take advantage of the existing infrastructure and field organization for data collection. Sometimes these new studies may employ supplementary personnel trained to collect information specific to each study. As a result, many DSS sites become pools of trained field staff.

Geographic information systems

A GIS is a computer-assisted information-management system for geographically referenced data. It integrates the management (that is, acquisition, storage), analysis, and display (mapping) of geographic data (Loslier 1995). The GIS contains two integrated databases, namely, spatial (location information) and attribute (characteristics of the spatial features). The spatial database comprises digital coordinates obtained from maps, using GPS. These coordinates can take a variety of forms, such as points (dispensaries, hospitals, schools, households), lines (roads, railways, rivers), or polygons (wards, towns, villages, hamlets). The attribute database can include information such as population size or density and number of health facilities or personnel. The GIS can create a link between spatial data and their associated descriptive information. Its strength lies in its capacity for integration and analysis of data from many sources, such as population, topography, climate, vegetation, transportation network, social services, and epidemiological characteristics.

Many DSS sites use GPS to determine locations and boundaries of phenomena of interest, including boundaries of settlements, households, and villages, and to map health services in terms of access and coverage. For example, Navrongo DSS used GPS coordinates to assess the child-mortality impact of insecticide-treated bednets in 96 clusters of contiguous compounds (Binka et al. 1996). The data collected using GPS are joined to spatial imagery with GIS. In this way, it is possible to combine and analyze the occurrence of features with various locations. Nouna DSS in Burkina Faso has a GIS with data on all households in 49 villages and information on such features as health facilities, sources of water, roads, schools, and religious places (Sauerborn and Kouyaté 2000).1

Conclusion

This chapter has presented a general picture of the major data-collection activities at the DSS sites. The data-collection process has been presented in terms of sequence of events carried out in DSS sites. It discussed the people involved in data collection and the tools used in obtaining information. (Part III will describe specific data-collection methods the DSS sites employ, including sampling procedures, type of information gathered, and key functions and responsibilities of the staff.) This chapter has also shown the potential of DSS sites to contribute reliable demographic and health-related data. Given developing countries’ lack of complete vital-events registration systems and the costs of and long intervals between national censuses, the DSS approach is probably one of the best options for improving the quality of data. The DSS data-collection procedures are linked to data-management and quality-control procedures, which are the items discussed in detail in the next two chapters.

1 Sauerborn, R.; Kouyaté, B., ed. 2000. Nouna Research Centre, a platform for interdisciplinary field research in Burkina Faso, West Africa. Internal report.

Chapter 4
PROCESSING DSS DATA

Introduction

Compiling longitudinal population information poses unique data-management challenges. Projects must maintain changing individual-level information on the composition and household structure of a large, geographically defined population. Events that arise — births, deaths, migrations, etc. — must be linked to individuals and other entities at risk of these events. These events affect not only demographic rates, for instance, but also relationships within and between households. As event histories grow, records of new events must be logically consistent with those of events in the past. Seemingly obvious checks on data to meet minimal standards of integrity can result in hundreds of lines of code.

Relating critically needed auxiliary data to dynamic population registers poses further challenges. Morbidity and cause-of-death data must be entered, linked, and stored. Most DSS projects also maintain socioeconomic data such as on marriage, family relationships, and economic conditions, owing to the strong correlation between health and socioeconomic status. These must be logically consistent with other longitudinal data on the population at risk and relationships among individuals under surveillance. Moreover, projects are often launched to assess the impacts of health technologies, service strategies, or policies, and this necessitates data entry, management, and checking procedures for the internal consistency of service information, as well as procedures to link this information to demographic histories. Variance in exposure to interventions must be monitored at the individual level, in conjunction with precise registration of demographic events and individual risk. Maintaining a detailed record of demographic events, relationships, and exposure to risks or interventions requires complex data-management operations, with a carefully controlled field-operation infrastructure to oversee and support data collection and entry, and a comprehensive computer system for the data-management operation.

Data-management systems required for this operation typically encompass thousands of lines of computer code. A key contribution of the INDEPTH network has been technology-sharing to offset the complexity of developing a data system and creating a reference data model for storage of DSS data. This generic model for data storage facilitates cross-site comparative analyses of the type described in this volume, as it standardizes data rules and concepts across sites. Future work of the network will address the need for generic analytical and data-management software compatible with the reference data model.

This chapter outlines features of this reference data model that pertain to the INDEPTH DSSs. In the not-too-distant past, developing DSS software was difficult, time-consuming, and prone to conceptual and programmatic errors. Software generators and object-oriented tools for software development greatly simplify the task of developing a complex system, once common principles of software structure are instantiated in a common applications framework. The mechanisms of INDEPTH have marshalled these software innovations to meet the collective needs of member stations. The reference data model will facilitate exchange of information, swift formulation of site-specific data management software and common software for data analysis, and simplified technical assistance and capacity-building operations.

Background

The work of the INDEPTH Technical Working Group (TWG) has been informed by the achievements, limitations, and future needs of projects in Bangladesh, Burkina Faso, Ghana, Indonesia, Mali, Senegal, South Africa, Tanzania, and Uganda. One of the earlier systems, the Bangladesh DSS in Matlab District, was developed in the 1960s and has since been used for a wide range of studies of demographic dynamics, family planning, epidemiology, health-services research, and other issues (Rahman and D’Souza 1981; D’Souza 1984). Although the Bangladesh DSS has redeveloped its computer operations several times, its field operations have provided a model for a wide range of DSS applications in developing countries. The Bangladesh DSS precisely defined eligibility rules for members of a population under study; this, combined with a data system with rigorous logical-consistency checks, has provided high-quality data for many research papers. A number of software systems have been written, based on experiences with the Bangladesh DSS, including the Sample Registration System (Leon 1986a, b, 1987; Phillips et al. 1988; Mozumdar et al. 1990) and the Indramayu Child Survival Project of the University of Indonesia (Utomo et al. 1990). The DSS in Niakhar, Senegal, most recently described in Garenne (1997), has also influenced the technical design of a number of systems, including those of PRAPASS in Nouna, Burkina Faso (Sauerborn et al. 1996), and Agincourt, South Africa (Tollman et al. 1995). Garenne (1997) described the concept of entry–exit files (similar to the concept of “episodes” described here) as a means of modeling both intervals of residence at a location and intervals of relationships. Garenne also provided useful observations regarding the implementation of field and software systems for longitudinal population studies.

To develop its data model, TWG synthesized the experience of these disparate applications. The model specifies a demographic “core” common to field stations doing longitudinal research on populations (MacLeod et al. 1991; Phillips et al. 1991). Sites have developed software systems to manage this demographic core, maintain a consistent record of significant demographic events in the population of a fixed geographic region, generate registration books that the fieldworkers use, and compute basic demographic rates, such as birth, mortality, and total fertility. These core capabilities establish a computational framework to which projects add their site-specific data and consistency specifications. The concept of a core also entails some generic principles of data collection and management that apply to all INDEPTH sites.

The INDEPTH concept of a data core

All participating sites in INDEPTH collect and maintain a common core of data. Attempts to standardize data processing have led to the concept of a “core system” that provides many of the common software requirements of field research laboratories and can be extended and modified to tailor software to various specifications. This concept is based on the principle that certain characteristics of households, household members, relationships, and demographic events are common to all longitudinal studies of human populations, and software required to collect, enter, and manage data can therefore be generic to a family of applications. TWG has identified these features of a core system common to all DSS operations. In this framework, the core system maintains a consistent record of baseline and longitudinal data on all households, household members, and their relationships in a geographically defined population, including births, deaths, migrations, and marriages. The core system maintains information on events and observation dates to give each entity in the study corresponding “person-day” counts of risk for demographic events. Core computer operations structure data and maintain logical integrity on the following basic elements of a household unit:

Although these are seemingly trivial items, mundane relationships tend to become complex and unwieldy when arrayed as a logical system of longitudinal population data; and portraying even simple relationships requires rigorous standards to avoid error. For example, to be counted as a death in a resident population, a concerned household member must be resident in the study area at the time of death; a live birth to a woman 5 months after she gave birth to another child would be an inconsistent event. A central contribution of TWG has been to clarify such minimal system logic so that the system prevents errors resulting in violation of business rules and rendering data useless.
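The two consistency rules given as examples above might be expressed as in the Python sketch below. The function names, the representation of residence as (start, end) date pairs, and the 180-day minimum birth interval are assumptions made for illustration; the text states only that a live birth 5 months after a previous birth is inconsistent.

```python
from datetime import date

MIN_BIRTH_INTERVAL_DAYS = 180  # assumed threshold, for illustration only


def death_is_countable(residency_episodes, death_date):
    """A death counts only if it falls within an episode of residence.

    `residency_episodes` is a list of (start, end) dates; end is None for
    an episode that is still open.
    """
    return any(
        start <= death_date and (end is None or death_date <= end)
        for start, end in residency_episodes
    )


def birth_is_consistent(previous_births, new_birth):
    """Reject a live birth too close to another delivery by the same woman.

    Births on the same date (multiple births) are accepted.
    """
    for earlier in previous_births:
        gap = abs((new_birth - earlier).days)
        if 0 < gap < MIN_BIRTH_INTERVAL_DAYS:
            return False
    return True
```

Even these two rules need explicit decisions about open episodes and multiple births, which suggests how quickly seemingly obvious checks grow into substantial code.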

All INDEPTH computer systems maintain standard DSS-processing operations:

Most INDEPTH sites have also developed software for reporting outcomes and managing data:

Tailoring the core system

Given the basic core model for data structure, each site has developed site-specific applications using building blocks of the core framework, which allow software developers to construct additional modules for project-specific data. At nine INDEPTH sites, standard tools of database-management packages have been used for an INDEPTH product known as the household-registration system (HRS) for the core specification.1 Other INDEPTH sites have developed project-specific core capabilities to maintain the logical integrity of birth, death, migration, and marriage data over time and in a format consistent with the reference data model. Each site modifies the core to accommodate new cross-sectional data, special longitudinal modules, or variable classes or labels investigators want to add to field registers, along with logic to maintain the integrity of new variables.

The tools of commercially available database packages greatly facilitate the process of core modification. Standard features of commercially available database systems include those for easily adding data to the core system. For example, the HRS is built from the form menu (data-entry screen) and database builders of the Microsoft FoxPro system. These builders encourag