Data activism is a social practice that uses technology and data. It emerged from existing activism sub-cultures such as the hacker and open-source movements.[1] Data activism is a specific type of activism that is both enabled and constrained by the data infrastructure.[2] It can use the production and collection of digital, volunteered, open data to challenge existing power relations.[3] It is a form of media activism; however, it should not be confused with slacktivism. It uses digital technology and data politically and proactively to foster social change.[4] Forms of data activism include digital humanitarianism[5] and participation in hackathons.
Data activism is becoming better known with the expansion of technology, open-source software and the ability to communicate beyond an individual's immediate community. Its culture emerged from previous forms of media activism, such as hacker movements. A defining characteristic of data activism is that ordinary citizens can participate, whereas previous forms of media activism required elite skill sets.[6] By increasingly involving average users, data activism signals a change in perspective and attitude towards massive data collection emerging within civil society.
Data activism can involve providing data on events or issues that individuals feel have not been properly addressed by those in power. For example, the first deployment of the Ushahidi platform, in Kenya in 2008, visualized the post-electoral violence that had been silenced by the government and the new media.[2] The social practice of data activism revolves around the idea that data is political in nature.[7] By collecting data for a particular purpose, data activists can quantify and expose specific issues.[6] As data infrastructures and data analytics grow, data activists can use evidence from data-driven science to support claims about social issues.[8][2]
Stefania Milan and Miren Gutiérrez have proposed a twofold classification of data activism,[9] later explored in more depth by Milan,[6] based on the type of engagement activists have with data politics. 'Re-active data activism' is motivated by the perception of massive data collection as a threat, for instance when activists seek to resist corporate and government snooping, whereas 'pro-active data activism' sees the increasing availability of data as an opportunity to foster social change.[6] These differentiated approaches to datafication result in different repertoires of action, which are not at odds with each other, since they share a crucial feature: they take information as a constitutive force capable of shaping social reality[10] and contribute to generating new, alternative ways of interpreting it.[11] Examples of re-active data activism include the development and use of encryption and anonymity networks to resist corporate or state surveillance, while instances of pro-active data activism include projects in which data is mobilized to advocate for change and contest established social narratives.
It was discovered that in the United States between 180,000 and 500,000 rape kits had been left unprocessed in storage in forensic warehouses.[12] Registration and entry of criminal DNA had been inconsistent, which caused this large backlog of rape kits. The delay in analysing these DNA samples was approximately six months to two years.[13] The information from rape kits was meant to be entered into the forensic warehouse database, but a disconnect between the warehouse system and the national DNA database, the Combined DNA Index System (CODIS), left these rape kits unexamined. Testing these rape kits is important in identifying and prosecuting offenders, recognizing serial rapists, and providing justice for rape victims.[12] The Ending the Backlog Initiative drew attention to this issue by demanding that the data from these rape kits be processed. It was this initiative that brought the issue to the attention of the United States government, which stated that this was unacceptable and that $79 million in grants would be used to help eliminate the backlog of rape kits.[14] The quantification of this data changed the way the public perceived the process of analysing rape kits. This data was then used to gain the attention of politicians.
DataKind is a digital activism organization that brings together data scientists and people from other organizations and governments to use big data in ways similar to how corporations currently use it, namely to monetize data. Here, however, big data is used to help solve social problems, such as food shortages and homelessness. DataKind was founded in 2011, and today there are chapters in the United Kingdom, India, Singapore and the United States of America.[15] Jake Porway is the founder and executive director of DataKind.[16]
While data activists may have good intentions, one criticism is that by allowing citizens to generate data without training or reliable forms of measurement, the data can be skewed or presented in different forms.[17]
Safecast is an organization established after the Fukushima nuclear disaster in 2011 by a group of citizens who were concerned about high levels of radiation in the area. After receiving conflicting messages about radiation levels from different media sources and scientists, individuals were uncertain which information was the most reliable. This brought about a movement in which citizens used Geiger counters to measure levels of radiation and circulated the readings over the internet so that they were accessible to the public.[18] Safecast was developed as a means of producing multiple sources of data on radiation. It was assumed that if the data was collected in large volume using similar Geiger counter measurements, the data produced was likely to be accurate.[19] Safecast allows individuals to download the raw radiation data, but it also visualizes the data. The data used to create the visual map is processed and categorized by Safecast. Because it has been filtered, this data differs from the raw radiation data and presents it in a different way.[20] The change in presentation may alter the information that individuals take from the data, which can pose a threat if it is misunderstood.
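The difference between raw and processed readings can be illustrated with a minimal Python sketch. It is not Safecast's actual pipeline: the readings, the grid size, the use of a median and the minimum-reading threshold are all assumptions made for illustration. It only shows how grouping many similar measurements and filtering them yields a summary that differs from the raw data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical readings: (latitude, longitude, dose rate in microsieverts per hour).
# These values are invented for illustration; they are not Safecast data.
RAW_READINGS = [
    (37.4211, 141.0328, 0.42),
    (37.4213, 141.0331, 0.45),
    (37.4209, 141.0329, 0.40),
    (37.4212, 141.0330, 3.90),  # a single outlier, e.g. a faulty or misused counter
    (37.7608, 140.4748, 0.12),  # a location with only one reading
]


def grid_cell(lat, lon, precision=2):
    """Assign a reading to a map cell by rounding its coordinates (assumed scheme)."""
    return (round(lat, precision), round(lon, precision))


def summarize(readings, precision=2, min_readings=3):
    """Group readings by cell and keep the median of cells with enough readings."""
    cells = defaultdict(list)
    for lat, lon, dose in readings:
        cells[grid_cell(lat, lon, precision)].append(dose)
    # Cells with too few readings are dropped, so the summary omits some raw data.
    return {
        cell: median(doses)
        for cell, doses in cells.items()
        if len(doses) >= min_readings
    }


summary = summarize(RAW_READINGS)
for cell, dose in summary.items():
    print(cell, round(dose, 3))
```

With these invented values, the summary holds a single cell with a median of about 0.435, while the raw data also contains the 3.90 outlier and the location that was measured only once. A reader who sees only the processed map would therefore not learn about either, which is the kind of change in presentation described above.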