Secondary data refers to data that is collected by someone other than the primary user.[1] Common sources of secondary data for social science include censuses, information collected by government departments, organizational records and data that was originally collected for other research purposes.[2] Primary data, by contrast, is collected by the investigator conducting the research.
Secondary data analysis can save time that would otherwise be spent collecting data and, particularly in the case of quantitative data, can provide larger and higher-quality databases than any individual researcher could feasibly collect on their own. In addition, analysts of social and economic change consider secondary data essential, since a new survey cannot adequately capture past changes or developments. However, secondary data analysis can be less useful in marketing research, as the data may be outdated or inaccurate.
Secondary data can be obtained from many sources:
Government departments and agencies routinely collect information when registering people or carrying out transactions, or for record keeping – usually when delivering a service. This information is called administrative data.[3]
A census is the procedure of systematically acquiring and recording information about the members of a given population. It is a regularly occurring, official count of a particular population. A census is a type of administrative data, but it is collected for the purpose of research at specific intervals. By contrast, most administrative data is collected continuously and for the purpose of delivering a service to the public.
Secondary data is available from other sources and may already have been used in previous research, making it easier to carry out further research. It is time-saving and cost-efficient because the data was collected by someone other than the researcher. Administrative data and census data can cover both larger and much smaller samples of the population in detail. Information collected by the government will also cover parts of the population that may be less likely to respond to the census (in countries where responding is optional).[4]
A clear benefit of using secondary data is that much of the necessary background work, such as literature reviews or case studies, has already been carried out. The data may have been used in published texts and statistics elsewhere, may already have been publicized in the media, and may lead to useful personal contacts. Secondary data generally has a pre-established degree of validity and reliability, which the researcher re-using the data need not re-examine. Secondary data is also key to data enrichment, in which datasets from secondary sources are linked to a research dataset to improve its precision by adding key attributes and values.[5]
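As a rough illustration of data enrichment, the sketch below links a small research dataset to a secondary source on a shared key and copies over the extra attributes. All names and values here (`research_rows`, `census_by_area`, `area_code`, the figures themselves) are hypothetical and chosen only for the example; real linkage usually also has to deal with inconsistent keys and duplicate records.

```python
# Hypothetical research dataset: survey respondents with an area code.
research_rows = [
    {"respondent_id": 1, "area_code": "A1", "income": 32000},
    {"respondent_id": 2, "area_code": "B2", "income": 41000},
    {"respondent_id": 3, "area_code": "Z9", "income": 28000},
]

# Hypothetical secondary dataset: area-level attributes, e.g. from a census extract.
census_by_area = {
    "A1": {"population": 12500, "median_age": 38.2},
    "B2": {"population": 4300, "median_age": 44.7},
}


def enrich(rows, lookup, key):
    """Left join: keep every research row and add secondary attributes
    wherever the key is found in the secondary source."""
    enriched = []
    for row in rows:
        extra = lookup.get(row[key], {})  # empty dict when there is no match
        merged = {**row, **extra, "matched": row[key] in lookup}
        enriched.append(merged)
    return enriched


enriched_rows = enrich(research_rows, census_by_area, "area_code")
for row in enriched_rows:
    print(row)
```

Keeping unmatched rows and flagging them with `matched` (rather than dropping them) makes it visible how much of the research dataset the secondary source actually covers.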
Secondary data can provide a baseline against which the results of primary research can be compared, and it can also be helpful in research design.
However, secondary data can present problems, too. The data may be out of date or inaccurate. If using data collected for different research purposes, it may not cover those samples of the population researchers want to examine, or not in sufficient detail. Administrative data, which is not originally collected for research, may not be available in the usual research formats or may be difficult to get access to.
While 'secondary data' is associated with quantitative databases, analysis focused on verbal or visual materials created for another purpose is a legitimate avenue for the qualitative researcher. Indeed, one could go so far as to claim that qualitative secondary data analysis “can be understood, not so much as the analysis of pre-existing data; rather as involving a process of re-contextualizing, and re-constructing, data.”[6]
In the analysis of secondary qualitative data, the importance of good documentation cannot be overstated, as it provides future researchers with background and context and allows replication.[7]