2. The data we need: Understanding the past, present and future


Tracking our progress on sustainable development requires huge quantities of data. In March 2016, the UN Statistical Commission approved a set of more than 240 indicators that it proposes be used to monitor national and global progress towards the SDGs by 2030.⁷ These indicators will draw on a wide array of data sources and tools, including censuses, household surveys, civil registration and vital statistics (CRVS) systems, administrative data systems, and environmental data such as geospatial imagery.

Few countries in the world have statistical systems ready and equipped to monitor the breadth of the sustainable development challenge, as an analysis of population data serves to show – see Box 1. As this report will explain, investments in statistical systems, new data partnerships and innovative methods will be essential, but we must also use the resources we already have more effectively. Nearly every country in the world has an existing statistical system that conducts the national census, compiles household surveys or prepares national accounts, even if infrequently. Each of these statistical processes generates data that – even if two, three, five or 10 years out of date – can give us insights into a country’s past and present. However, in a great many cases this historical data is poorly stored, relegated to a report gathering dust on a shelf. In addition, data is often collected in a piecemeal fashion, with little attempt to assimilate complementary data and identify trends across data sources.

But to achieve the SDGs and monitor our progress over time, a historical perspective is essential: countries need to be able to compile a baseline (see Box 1), analyze trends over time and predict future trajectories. Legacy data and data systems are frequently ignored as people look to the data revolution and new, shiny tech-based approaches. But with relatively modest investments in digitizing, cleaning and standardizing data, it is possible to derive value from this historical data. The same process also ensures that new data is consistent with that compiled in the past, enabling us to track progress over time and to assess our trajectories. A case study of the measurement of drinking water supply in rural Bangladesh serves to demonstrate the value that can be derived from utilizing legacy data systems (Box 2).
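As a rough illustration of how a baseline, a trend and a simple projection might be derived from a cleaned legacy series, the sketch below fits a straight line through four data points. The figures are invented for illustration (they are not the Bangladesh data from Box 2), the function name `fit_linear_trend` is hypothetical, and a straight-line extrapolation is a strong simplifying assumption rather than a recommended forecasting method.

```python
# Hypothetical, already-standardized legacy series:
# (year, % of rural population with access to improved drinking water).
# The values are invented for illustration only.
legacy_series = [(2000, 60.0), (2005, 65.0), (2010, 70.0), (2015, 75.0)]


def fit_linear_trend(points):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Baseline: here taken to be the most recent observation in the series.
baseline_year, baseline_value = legacy_series[-1]

# Trend: average change in percentage points per year.
slope, intercept = fit_linear_trend(legacy_series)

# Projection: naive straight-line extrapolation to 2030, capped at 100%.
projected_2030 = min(100.0, intercept + slope * 2030)

print(f"Baseline ({baseline_year}): {baseline_value:.1f}%")
print(f"Trend: {slope:+.2f} percentage points per year")
print(f"Projected 2030 coverage: {projected_2030:.1f}%")
```

Even a crude calculation of this kind is only possible once older records have been digitized and expressed in the same units and definitions as newer ones.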


Fundamentally, data for sustainable development should help governments to effectively manage their resources, services and responsibilities so as to provide the best possible support and protection for their citizens and the natural habitat. Data should serve as an administrative tool supporting governments to make judicious decisions about where and how to direct attention and resources. But to be a helpful tool, data needs to be both timely and relevant. Data that is three or more years out of date cannot help a government make an effective decision between investing in one health clinic over another (though historical data can help governments identify where services have been available over time). Conversely, having access to real-time or near-real-time data can help governments make nimble decisions about how to move capacity and resources, with the potential for huge efficiency gains. This kind of data in support of management and administration is known as administrative data. It is data that tells us how things tick and helps us to run effective and responsive operations, services and businesses.

In a wide array of literature on the data revolution and in national strategies for the development of statistics (NSDSs), administrative data is highlighted as the single greatest area of systematic underinvestment.¹⁶ The returns on investment are immense, but few cash-strapped, low-income countries are inclined to invest in long-term systems-building when there are other competing, urgent needs. Furthermore, few donors are inclined to invest in administrative data collection methods and tools as these systems are managerial and process-oriented, not responding to a specific problem or providing an immediate tangible solution. This makes it harder to explain the returns on investment, both socially and economically. Building robust administrative data systems in this resource-constrained environment is therefore dependent upon two things:

  1. Making available data interoperable, and

  2. Using diverse sources of data or ‘multi-modal’ data collection methods.

Data interoperability is the ability to convert or store data in a format that is easy to use and distribute, so as to facilitate the easy exchange of data. Within governments, interoperable data is essential for data to be shared across line ministries and departments in comparable, useful formats. This enables integrated program design and monitoring that cuts across sectors. Particularly important is mapping data, which, if made available in accessible formats by mapping agencies or ministries of land or environment, can be used by the national statistical office (NSO) to assess the spatial distribution of poverty or wellbeing indicators.

Interoperable data is particularly important when looking to provide services to vulnerable groups. Take, for example, the situation of a vulnerable child. For this child to be recognized by the government and for their welfare to be tracked over time, they first and foremost need to have their birth registered for a record of their identity. Ministries of health most commonly collect this data. They then need to go to school, and the ministry of education should know if this highly vulnerable individual is able to access public schooling. There should be a record of their care situation (whether living in an institution or with a foster family) and an address. And there should be records of their developmental progress, health and wellbeing. All of this information is collected by different sections of government, often with the support of third parties such as UNICEF or the World Health Organization (WHO). If the data is not recorded in systematic administrative data systems within ministries and government departments, it will be nigh on impossible to monitor and track the welfare of that child over time. Furthermore, to ensure the child receives a holistic program of care, that data needs to be shared across departments – requiring that it be interoperable.
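The example of the vulnerable child can be sketched in a few lines of code. The snippet below assumes, purely for illustration, that three ministries export records keyed by a shared `child_id`; the field names, identifiers and values are hypothetical and do not describe any real government schema. It shows how a shared key lets records be combined into a single profile and how gaps in that profile become visible.

```python
# Hypothetical extracts from three ministries. The shared key "child_id",
# the field names and all values are illustrative assumptions.
health_births = [
    {"child_id": "C001", "birth_registered": True},
    {"child_id": "C002", "birth_registered": True},
]
education_enrolment = [
    {"child_id": "C001", "enrolled": True},
]
welfare_care = [
    {"child_id": "C001", "care_setting": "foster family"},
    {"child_id": "C002", "care_setting": "institution"},
]

REQUIRED_FIELDS = ["birth_registered", "enrolled", "care_setting"]


def link_records(*sources):
    """Merge records from several sources into one profile per child_id."""
    profiles = {}
    for source in sources:
        for record in source:
            profile = profiles.setdefault(record["child_id"], {})
            profile.update(record)
    return profiles


def missing_fields(profile, required):
    """List the required fields that no source has supplied."""
    return [name for name in required if name not in profile]


profiles = link_records(health_births, education_enrolment, welfare_care)
gaps = {cid: missing_fields(p, REQUIRED_FIELDS) for cid, p in profiles.items()}

for child_id, missing in gaps.items():
    status = "complete" if not missing else "missing: " + ", ".join(missing)
    print(child_id, status)
```

Without an agreed identifier and agreed field definitions, this kind of simple merge is not possible, which is the practical meaning of interoperability in this context.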

Creating truly integrated and interoperable data systems with real-time data sharing across governments relies on records held electronically; frontline service agents with access to computers and the internet (or the capacity for the department or ministry to digitize records); and agreements across government on data exchange, standards and storage. For third-party data to be integrated into this data architecture, the third parties also need to agree on standards for data collection, storage and ease of use. These processes are not in and of themselves complex, but they are time- and resource-intensive, requiring high-level political commitment to bring about systemic change.

Given the gaps in government administrative data systems, multi-modal data collection can be a useful resource. Multi-modal data collection overlays two or more data sources to offer a fuller picture of a community or geography than any one of them could provide alone, with each source helping to fill gaps in the others. Examples include satellite imagery overlaid with telecommunications data to map population movement, or satellite imagery overlaid with citizen-generated data to carefully map facilities or risks within a given community. For more on the value of multi-modal data collection, see Box 3.
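A minimal sketch of such an overlay is given below. It assumes that facilities detected in satellite imagery and facilities reported by citizens have both already been assigned to cells of a common map grid; the grid references and the function name `overlay_sources` are hypothetical, and real overlays would involve considerably more geospatial processing.

```python
# Hypothetical grid cells (column, row) in which each source indicates
# a facility. All values are invented for illustration.
satellite_detected = {("A", 1), ("A", 2), ("B", 3)}
citizen_reported = {("A", 2), ("B", 3), ("C", 1)}


def overlay_sources(satellite, citizen):
    """Compare two sources cell by cell and group the results."""
    return {
        "confirmed_by_both": satellite & citizen,
        "satellite_only": satellite - citizen,
        "citizen_only": citizen - satellite,
    }


result = overlay_sources(satellite_detected, citizen_reported)

for label, cells in result.items():
    print(label, sorted(cells))
```

Cells confirmed by both sources can be mapped with more confidence, while cells flagged by only one source show where that source fills a gap in the other or where follow-up verification may be needed.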


A key imperative of sustainable development is to protect the Earth’s natural resources for future generations, ensuring that we do not deplete natural stocks faster than they can be replenished. Sadly, this has not happened throughout recent history. With a world population now at 7.2 billion people and an annual gross domestic product (GDP) of nearly USD 90 trillion, the world economy using today’s technologies is already exceeding several of the Earth’s “planetary boundaries.”¹⁹ Without access to sexual and reproductive health services and other targeted responses, the global population will rise to 9 billion people, or possibly more, by 2050, and to 10 billion before 2100.²⁰ Many natural resources and ecosystems essential for human and societal wellbeing are already under threat, and will be further threatened or destroyed if current generations do not consume them sustainably. The world will experience unprecedented crises of food production, public health and natural disaster, among other threats. Food prices will soar, and some parts of the world may be rendered virtually uninhabitable as a result of climate change and water stress.

Managing these risks and the increased incidence of natural disasters arising from extreme weather requires that we use data to analyze past trends and predict future scenarios. The necessity to effectively plan for and manage risk is well articulated in the Sendai Framework, and must be pursued in concert with the SDGs (see Box 4).

The Sendai Framework identified the need for new data capacities that enable governments to project into the future, anticipating our trajectory and changing course as required. It specifically calls for enhanced scientific work in disaster risk reduction and better coordination of existing networks and scientific research institutions, enabling more fluid exchange of data, modeled assessments and policy recommendations.²¹

A key tool for this kind of analysis is forecasting or modeling future scenarios. One such example is The World in 2050, a project assembling leading modeling teams to perform an integrated assessment addressing the full spectrum of sustainable development challenges. The key value of the project is that it maps our trajectories on a range of issues or goal areas and looks at the synergies and tradeoffs among these issues — for example, identifying how rapid electrification might result in excessive non-renewable energy use. This kind of approach allows governments not only to predict future trends, but also to make informed policy decisions that take into account potential sacrifices.

Unfortunately, few countries in the world have this kind of technical capacity within their statistical systems. However, academia and private industries such as insurance often specialize in such methods, and can be powerful partners for governments if they are invited into the data collection, policy development and planning processes. This relies upon governments having a more open and responsive attitude to data partnerships, in which scientific predictions and forecasts are given equal weight to current and past analysis within data-based policy and decision-making processes. International entities like the GPSDD and SDSN can also play a useful role, showcasing examples of successful public and nongovernmental collaborations to help forecast scenarios and design responsive policies and programs.
