Avoiding the Data Colonialism Trap

By TReNDS’ Manager, Hayden Dahmm, and Expert Member, Tom Moultrie

Source: KamiPhuc/Trend hype

Even as the COVID-19 pandemic has heightened awareness of data and statistics, it has highlighted the global divide in statistical capacity and, in many cases, exacerbated it. Across Africa, for example, incomplete death registration systems have hampered efforts to track the virus. And according to recent surveys of national statistical offices (NSOs), nine out of ten NSOs in low- and lower-middle-income countries may not be able to fully meet their international reporting requirements. To fill these gaps, countries are now being presented with an array of private sector data solutions and other non-traditional sources of information. Yet as global institutions seek to empower countries with data, we must be wary of “data colonialism”: the potential for the powerful, data-rich countries and corporations in the Global North to undercut capacity development in the Global South by failing to recognize local contexts. As the global statistical community gathers for the UN Statistical Commission next week, we encourage further consideration and discussion of this issue within the community.

Colonizing Personal Data and Digital Technology

The very origins of scientific knowledge are entangled with colonialist history. From the way Northern scientists can fail to recognize Southern collaborators in “parachute research,” to the ethical concerns raised around health studies conducted in the Global South, many colonialist legacies are still reflected in modern science. These colonialist influences can now be found in the data sciences as well.

The term “data colonialism” has been used to describe the appropriation of big data throughout the Global South, particularly by the major international powers in the data space, the United States and China. In their book, The Costs of Connection, Nick Couldry and Ulises Mejias posit that the emerging state of data-driven capitalism is a continuation of the systems of exploitation that defined previous colonial eras. Paralleling past patterns of colonial resource extraction, many Northern corporations profit from the personal data they gather around the world and undermine the data sovereignty of nations in the Global South. The process of “digital colonialism,” whereby the Global North monopolizes the digital technology supply, can impede the efforts of Southern countries, particularly in Africa, to develop their own digital economies, manufacturing capabilities, and other domestic industries.

Companies often take advantage of these untapped data resources under the pretense of altruism. For example, Facebook’s Free Basics, a free, limited internet service offered throughout the Global South, has been accused of colonialist practices for harvesting metadata and violating net neutrality rules. As of 2020, only 43% of least developed countries had established data and privacy protection legislation (compared to 96% of European countries), leaving them uniquely vulnerable to exploitation.

Furthermore, critics have suggested that U.S. and Chinese multinational corporations are establishing an imperial level of control over digital ecosystems, leading to an increase in surveillance and a disproportionate influence over economics, politics, and culture. Facebook has begun laying a 37,000-km-long undersea cable around Africa to expand internet access, while Google’s sister company, Loon, launched dozens of internet balloons over Kenya in 2020 to provide 4G access, with plans underway to replicate the service in Mozambique. Meanwhile, the Chinese facial recognition company CloudWalk struck a deal with Zimbabwe, potentially enabling enhanced surveillance by the government while allowing CloudWalk to train its AI systems on a wider set of racial features.

Unbalanced Priorities and Local Statistical Credibility at Risk

The increased dependency on the Global North for technical infrastructure, science, and data-based solutions raises concerns about whether Southern countries’ efforts to produce data and statistics to meet their domestic needs are being crowded out. In a 2018 survey of NSO representatives from 140 low and middle-income countries on their perceptions of official statistics, respondents ranked international development partners as the most important users, well above government ministries, local government, and civil society. These priorities may in part reflect how international development partners, overwhelmingly based in the Global North, provide the strongest demand for official statistics. However, these partners are also significant external funders of NSOs, and this relationship can create incentives that channel resources away from domestic needs in favor of donor data requirements. The OECD’s Development Assistance Committee (DAC) has also noted insufficient donor alignment with national priorities as a major bottleneck for statistical systems, as conflicting demands from donors for particular datasets can further erode domestic demand for data and entrench dependence on donors. Even if inadvertent, this dynamic parallels how colonial systems were administered for the primary benefit of Western powers.

Moreover, not only are countries in the Global South prompted to prioritize the statistical needs set by the Global North, but new data methods and systems based in the North risk undermining their statistical credibility altogether. For example, although the “global health metrics enterprise” aims to bring objectivity and accountability to health policy, it has been criticized for transferring power from low- to high-income country institutions. The Institute for Health Metrics and Evaluation (IHME) produces the Global Burden of Disease (GBD) report, which is now the de facto source of health accounting, but concerns have been raised about a lack of transparency in its methods, and GBD figures can diverge noticeably from the statistics countries publish themselves.

If we are not careful, the shift of knowledge and data production to external institutions (primarily based in the Global North) in the field of health statistics could be repeated in other domains. Emerging technologies, like remote sensing and machine learning, can complement our understanding of sustainable development issues, but over-reliance on these tools (again, largely produced in the Global North) could leave marginalized areas underserved or completely overlooked. For example, nighttime light data from the U.S. defense satellite system, Linescan, have been used as part of big-data-derived estimates of poverty, as in Senegal, but an accuracy test of electrification rates calculated from Linescan in Burkina Faso found that up to 57% of the 147 communities sampled were undetectable. Moreover, data solutions based in the Global North might not be able to anticipate or address such shortcomings. As described in a recent MIT Technology Review article, global AI groups often lack geographical diversity, creating the risk of AI standards that will perpetuate biases and fail to account for cultural contexts. And beyond the immediate implications for how we understand the world, there could be wider social ramifications. The introduction of private big data sources in the West has arguably helped fuel the politicization of basic facts and official statistics, and this could further compromise representative democracy throughout the world.

Looking Forward

There are many inspiring examples of communities and governments employing data to advance sustainability. However, if left unchecked, data colonialism will threaten this mission. Governments and citizens of the Global South must establish a shared understanding of the benefits and risks of their rapidly digitizing world and the power of data. The United Nations Statistical Commission and other supra-national bodies provide the means to develop a collective vision that addresses the prevailing imbalances in the data landscape. To begin with, several actions should be taken to support NSOs:

Map data gaps or needs. All supporting partners must work closely with governments to understand their data needs and priorities. They must align their efforts to ensure the most pressing needs are addressed.

Adequately fund NSOs in low-income countries. The OECD’s DAC has also called for greater coordination of donor funding at the country level. The recently launched Bern Network on Financing Data for Development, along with calls to double the funding for statistics, offers hope that we may move in this direction. But crucially, funding must be expanded in a way that empowers statistical systems to prioritize the data needs of domestic policymakers.

Ensure that NSO officials, local academics, and others are actively involved in the management of new data sources, and that the associated methodologies are fully transparent.

International organizations and other data actors should commit to building sustainability in local data production and knowledge generation in low-income countries.

Build on efforts to document current practices, and establish clear guidance and standards for governing public-private data collaborations that protect both the interests of citizens and the integrity of NSOs.

When extra-national data are produced on issues that are monitored by NSOs or other domestic authorities, work to have these indicators complement or build on domestic data, rather than supplant them.

Efforts will also need to extend beyond the limited, technical mandates of NSOs. Wider government policies and initiatives are required to address the full scope of the issue, including:

Support countries in advancing data-driven government strategies bolstered by good digital governance procedures.

Establish effective regulatory policies within the tech sector, similar to how the telecommunications sector is overseen.

Consider “localizing” data, which requires data about citizens to be stored within the country. While this has been objected to by the mobile money industry and free trade advocates for limiting important data flows, it is a step towards establishing a more equal balance of power, and regional-level regulations of data flows could maintain protections while fostering beneficial sector developments.

Examine opportunities around “nationalizing” data, in which data are turned into a national resource that big tech companies would be made to pay for at a level that benefits citizens.

We believe these actions will help mitigate the risk of perpetuating exploitative data practices and longstanding inequities. The “data revolution” will only be revolutionary if it genuinely empowers those who currently lack power.