Open Social Science: Reusing and repurposing secondary data
Re-analysing secondary data (data collected by other researchers) to address new questions is a well-established, economical and time-saving research practice in the social sciences, and one that fits well with open science goals. Key types of data include cross-national surveys run by consortium research projects, standardized data from international organizations, well-respected databases compiled by individual research teams and checked for consistency over time or across areas, metrics data, and one-off datasets deposited in data archives by individual researchers or teams. Yet reusing data collected by others for new and different purposes can entail compromises. Question wordings in surveys, or categories and concepts in official statistics, may only partly capture the phenomena you are interested in. ‘Mashing’ data from different sources, or across different units or periods, can be tricky. And for individually deposited datasets or constructed metrics, clarifying how the data was originally created, and cleaning it so that it can be fully understood, can be non-trivial tasks. This session examines how to develop research questions for reusing and mashing data, how to work with the limitations of secondary data sources, and how novel insights can be gained from bringing datasets together.