Please download the Country-level Information and Sources spreadsheet (Microsoft Excel .xlsx file).
The methods used in the GPWv4 data collection, and the improvements made to it, are described in detail in Doxsey-Whitfield et al. (2015).
GPWv4 is a minimally-modeled gridded population data collection that incorporates census population data from the 2010 round of censuses. Population estimates are created by extrapolating the raw census counts to estimates for target years 2000, 2005, 2010, 2015, and 2020. Additionally, a set of estimates that have been nationally adjusted to data from the United Nations World Population Prospects 2015 Revision (UN, 2015) is included in the GPWv4 collection for each of the target years. The development of GPWv4 builds upon previous versions of the data collection (Tobler et al., 1997; Deichmann et al., 2001; Balk et al., 2006).
The two basic inputs of GPW are non-spatial population data (i.e., tabular counts of population listed by administrative area) and spatially-explicit administrative boundary data (administrative or enumeration units). Population input data were collected at the highest resolution available from the results of the 2010 round of censuses, scheduled to occur between 2005 and 2014. Where census results were unavailable or not yet released, official population estimates from national statistical offices were used. Administrative boundary data were collected from a variety of national agencies (e.g., statistics offices, mapping agencies, planning agencies), as well as other organizations. Ideally, the boundaries are from the census geography. The population census counts or official estimates were then matched to digital geographic boundaries. Matching was based on the common identifying codes or the unit names used in the census.
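The matching step can be illustrated with a minimal sketch. It assumes each census record and each boundary unit is a dictionary with hypothetical `code`, `name`, and `population` fields; these field names and the fallback order (code first, then normalized name) are illustrative assumptions, not the procedure actually used for GPWv4.

```python
def match_units(census_rows, boundary_units):
    """Attach census counts to boundary units by code, falling back to name.

    Field names ('code', 'name', 'population') are hypothetical.
    Returns (matched, unmatched): a dict of unit name -> population and a
    list of boundary unit names for which no census record was found.
    """
    # Index census records by identifying code (skipping missing codes)
    # and by a normalized (trimmed, lowercase) unit name.
    by_code = {row["code"]: row for row in census_rows if row.get("code")}
    by_name = {row["name"].strip().lower(): row for row in census_rows}

    matched, unmatched = {}, []
    for unit in boundary_units:
        row = by_code.get(unit.get("code"))
        if row is None:
            row = by_name.get(unit["name"].strip().lower())
        if row is None:
            unmatched.append(unit["name"])
        else:
            matched[unit["name"]] = row["population"]
    return matched, unmatched


census = [
    {"code": "01", "name": "Alpha", "population": 1000},
    {"code": None, "name": "Beta ", "population": 500},
]
boundaries = [
    {"code": "01", "name": "Alpha District"},  # matched by code
    {"code": None, "name": "beta"},            # matched by name
    {"code": "99", "name": "Gamma"},           # no census record
]
matched, unmatched = match_units(census, boundaries)
```

Units left in `unmatched` would need manual review, for example because of boundary changes or spelling variants between the census tables and the boundary file.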
A global framework of international boundaries was used to ensure consistent alignment between countries. The Global Administrative Areas version 2 (GADMv2; www.gadm.org) data set was selected as the framework as it is publicly available and frequently used in the research community. The international boundaries of census geography data sets were adjusted to the GADMv2 framework, although in cases where the resolution of the census geography far exceeded the GADMv2 boundaries, the former were kept (e.g., New Zealand, the United Kingdom, and the United States).
Since countries conduct their censuses at different times, annualized growth rates were used to adjust census counts to the target year of 2010 to allow for global comparison. Growth rates were calculated for each administrative unit by matching the total population from the input data to that from a previous census enumeration or estimate. Annualized rates of change were calculated as follows:
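The equation itself does not appear in this text. A form consistent with the symbol definitions given after the equations, assuming continuous (exponential) growth between the two counts, would be the following; the exponential form is an assumption here rather than something this text confirms.

```latex
r = \frac{\ln\left( P_2 / P_1 \right)}{t}
```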
Population estimates were adjusted to target years as follows:
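This equation is also missing from the text. Under the assumption of continuous exponential growth at the annualized rate r, one consistent form is shown below. The symbol \Delta t is introduced here for illustration only and denotes the number of years from the most recent count P_2 to the target year (negative when the target year precedes the count).

```latex
P_x = P_2 \, e^{\, r \, \Delta t}
```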
where r is the annualized growth rate, P1 and P2 are the census population counts, Px is the population estimate in the target year, and t is the number of years between population counts. In cases where matching at the highest resolution was not possible between the two points in time, censuses were matched and growth rates were calculated at a coarser resolution (e.g., state), and applied to each unit (e.g., municipality) within that state. In some cases we adopted a hybrid approach, matching the highest resolution where possible and coarsening where needed. The 2010 population estimates were then extrapolated to 2000, 2005, 2015, and 2020 using the calculated annualized growth rates.
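As a worked illustration of this step, the sketch below computes a growth rate at a coarser level and applies it to the finer units within it. It assumes exponential growth, and the function names, state totals, and municipality counts are all hypothetical.

```python
import math


def annual_growth_rate(p1, p2, years_between):
    """Annualized growth rate between two counts, assuming exponential growth."""
    return math.log(p2 / p1) / years_between


def extrapolate(population, rate, years_from_count):
    """Project a count forward (positive years) or backward (negative years)."""
    return population * math.exp(rate * years_from_count)


# Hypothetical state with censuses in 2001 and 2011.
state_rate = annual_growth_rate(900_000, 1_000_000, 2011 - 2001)

# Hypothetical municipalities that could not be matched to the 2001 census,
# so they inherit the growth rate of the state that contains them.
municipalities_2011 = {"A": 120_000, "B": 45_000}

estimates = {
    year: {
        name: extrapolate(pop, state_rate, year - 2011)
        for name, pop in municipalities_2011.items()
    }
    for year in (2000, 2005, 2010, 2015, 2020)
}
```

Because every municipality shares the state rate in this sketch, their relative shares of the state total stay constant across target years; unit-level rates, where available, would let those shares change.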
National-level estimates for 2000, 2005, 2010, 2015, and 2020 were further adjusted to match the estimates in the United Nations World Population Prospects (WPP): The 2015 Revision, which often correct for over- or under-reporting in the nationally reported figures (United Nations, 2015).
Adjustment factors for matching national estimates to UN estimates were calculated as follows:
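The equation is not reproduced in this text. One form consistent with the symbols defined after the equations is the relative difference between the UN national estimate and the national total of the unadjusted estimates; this specific form is an assumption.

```latex
a = \frac{P_{UN} - P_x}{P_x}
```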
Adjustment factors were applied at the sub-national level as follows:
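This equation is also missing. Assuming the adjustment factor a is the relative difference between the UN national estimate and the unadjusted national total, each sub-national estimate P_x would be scaled as follows, which is equivalent to multiplying it by the ratio of the UN total to the unadjusted national total.

```latex
P_{adj} = P_x \times (1 + a)
```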
where a is the adjustment factor, Px is the population estimate in the target year, PUN is the UN national estimate, and Padj is the adjusted estimate.
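The adjustment can be sketched in a few lines. The example assumes the factor is the relative difference between the UN estimate and the unadjusted national total; the function names and population figures are hypothetical.

```python
def adjustment_factor(national_estimate, un_estimate):
    """Relative difference between the UN estimate and the national total."""
    return (un_estimate - national_estimate) / national_estimate


def apply_adjustment(unit_estimates, factor):
    """Scale every sub-national estimate by the same national factor."""
    return {name: pop * (1 + factor) for name, pop in unit_estimates.items()}


# Hypothetical country with two sub-national units and a UN estimate
# that is 5% higher than the sum of the unadjusted unit estimates.
units = {"North": 600_000.0, "South": 400_000.0}
factor = adjustment_factor(sum(units.values()), 1_050_000.0)
adjusted = apply_adjustment(units, factor)
```

Applying a single national factor preserves each unit's share of the national population while making the unit estimates sum to the UN total.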
To create the gridded population data set, the population estimates were distributed to a 30 arc-second (~1 km) grid using an areal-weighting method. This method, also known as uniform distribution or proportional allocation, does not make use of any other geographic data in order to spatially disaggregate the census population. Population was allocated into grid cells through the simple assumption that the population of a grid cell is an exclusive function of the land area within that pixel. For grid cells that intersect sub-national or national boundaries, population was allocated based on the proportion of the area of each unit located in the grid cell. A water mask was applied to the data to prevent lakes, rivers, and ice-covered areas from distorting the actual population density.
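A minimal sketch of areal weighting follows. It assumes the overlay of units and grid cells has already been computed and is supplied as hypothetical `(cell_id, unit_id, land_area)` tuples, with water already excluded from `land_area`; this input format is an assumption for illustration, not the GPWv4 processing chain.

```python
from collections import defaultdict


def areal_weighting(unit_population, intersections):
    """Allocate unit populations to grid cells in proportion to land area.

    unit_population: dict of unit_id -> population estimate
    intersections: iterable of (cell_id, unit_id, land_area) tuples, one per
        piece of a unit that falls inside a cell (water already masked out)
    """
    intersections = list(intersections)

    # Total land area of each unit, summed over all cells it touches.
    unit_land_area = defaultdict(float)
    for _, unit_id, land_area in intersections:
        unit_land_area[unit_id] += land_area

    # Each cell receives the fraction of a unit's population equal to the
    # fraction of that unit's land area lying inside the cell.
    cell_population = defaultdict(float)
    for cell_id, unit_id, land_area in intersections:
        if unit_land_area[unit_id] > 0:
            share = land_area / unit_land_area[unit_id]
            cell_population[cell_id] += unit_population[unit_id] * share
    return dict(cell_population)


# Hypothetical example: unit X spans two cells, unit Y shares cell c2 with X.
population = {"X": 300.0, "Y": 50.0}
pieces = [("c1", "X", 1.0), ("c2", "X", 0.5), ("c2", "Y", 0.5)]
grid = areal_weighting(population, pieces)
```

In this sketch the cell totals sum to the total input population, and a cell on a boundary accumulates contributions from every unit that overlaps it.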
Balk, D.L., U. Deichmann, G. Yetman, F. Pozzi, S.I. Hay, and A. Nelson. 2006. Determining global population distribution: methods, applications and data. Advances in Parasitology 62:119-156.
Deichmann, U., D. Balk, and G. Yetman. 2001. Transforming population data for interdisciplinary usages: From census to grid. Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC), CIESIN, Columbia University.
Doxsey-Whitfield, E., K. MacManus, S.B. Adamo, L. Pistolesi, J. Squires, O. Borkovska, and S.R. Baptista. 2015. Taking advantage of the improved availability of census data: A first look at the Gridded Population of the World, Version 4 (GPWv4). Papers in Applied Geography.
Tobler, W., U. Deichmann, J. Gottsegen, and K. Maloy. 1997. World population in a grid of spherical quadrilaterals. International Journal of Population Geography 3:203-225.
United Nations, Department of Economic and Social Affairs, Population Division. 2015. World Population Prospects: The 2015 Revision, DVD Edition. http://esa.un.org/unpd/wpp/DVD/.