IASSIST Quarterly 2020-12-18T11:43:46-07:00 Karsten Boye Rasmussen Open Journal Systems <p class="p1">The <strong>IASSIST Quarterly</strong> is a peer-reviewed, indexed, open access quarterly publication of articles dealing with social science information and data services, including relevant societal, legal, and ethical issues.</p> <p class="p1">The <strong>IASSIST Quarterly</strong> represents an international cooperative effort on the part of individuals managing, operating, or using machine-readable data archives, data libraries, and data services. The&nbsp;<strong>IASSIST Quarterly </strong>reports on activities related to the production, acquisition, preservation, processing, distribution, and use of machine-readable data carried out by its members and others in the international social science community.&nbsp;</p> Sharing open data without risk, and with machine-actionable provenance metadata 2020-12-18T08:52:01-07:00 Karsten Boye Rasmussen <p>tba</p> 2020-12-18T00:00:00-07:00 Copyright (c) 2020 Karsten Boye Rasmussen Sustainability through the liaison with data archive users 2020-12-08T00:10:43-07:00 Michaela Kudrnáčová Ilona Trtíková <p>As a social science data archive, we focus on collecting research data and archiving it. However, there are more responsibilities that come with data archiving: cooperation on international social surveys (ISSP, ESS), supporting secondary data analysis and much more. A significant part of our work is to communicate with students and researchers and to educate them about data management and data analysis. 
Although the relationship we have is functional and seems sufficient, we tend to ask ourselves: who are the data archive users and what do they expect from us?</p> <p>We decided to employ user-centered design methods and tools to define a typical user of our services and to find out what their motivations for using our data archive are and what specific functions they use and (do not) appreciate, so that we would have a better picture of their needs. Moreover, we wondered about the role of open science and its impact on the users’ needs and future requirements arising from the open science environment. The information obtained is a point of departure for redesigning archival services to satisfy new demands our users have regarding more data resources, new techniques of scientific work and better interconnection between different platforms.</p> 2020-12-18T00:00:00-07:00 Copyright (c) 2020 Michaela Kudrnáčová, Ilona Trtíková Mathematics, risk, and messy survey data 2020-08-03T21:52:30-06:00 Kristi Anne Thompson Carolyn Sullivan <p>Research funder mandates, such as those from the U.S. National Science Foundation (2011), the Canadian Tri-Agency (draft, 2018), and the UK Economic and Social Research Council (2018), now often include requirements for data curation, including, where possible, data sharing in an approved archive. Data curators need to be prepared for the possibility that researchers who have not previously shared data will need assistance with cleaning and depositing datasets so that they can meet these requirements and maintain funding. Data de-identification or anonymization is a major ethical concern in cases where survey data is to be shared, and one which data professionals may find themselves ill-equipped to deal with. 
This article provides an accessible and practical introduction to the theory and concepts behind data anonymization and risk assessment, describes a couple of case studies that demonstrate how these methods were carried out on actual datasets requiring anonymization, and discusses some of the difficulties encountered. Much of the literature dealing with statistical risk assessment of anonymized data is abstract and aimed at computer scientists and mathematicians, while material aimed at practitioners often does not consider more recent developments in the theory of data anonymization. We hope that this article will help bridge this gap.</p> 2020-12-18T00:00:00-07:00 Copyright (c) 2020 Kristi Anne Thompson, Carolyn Sullivan Provenance metadata for statistical data: An introduction to Structured Data Transformation Language (SDTL) 2020-09-09T00:33:24-06:00 George Alter Darrell Donakowski Jack Gager Pascal Heus Carson Hunter Sanda Ionescu Jeremy Iverson H.V. Jagadish Carl Lagoze Jared Lyle Alexander Mueller Sigbjørn Revheim Matthew A. 
Richardson Risnes Ørnulf Karunakara Seelam Dan Smith Tom Smith Jie Song Yashas Jaydeep Vaidya Ole Voldsater <p>Structured Data Transformation Language (SDTL) provides structured, machine-actionable representations of data transformation commands found in statistical analysis software.&nbsp; The Continuous Capture of Metadata for Statistical Data Project (C<sup>2</sup>Metadata) created SDTL as part of an automated system that captures provenance metadata from data transformation scripts and adds variable derivations to standard metadata files.&nbsp; SDTL also has potential for auditing scripts and for translating scripts between languages.&nbsp; SDTL is expressed in a set of JSON schemas, which are machine-actionable and easily serialized to other formats.&nbsp; Statistical software languages have a number of special features that have been carried into SDTL.&nbsp; We explain how SDTL handles differences among statistical languages and complex operations, such as merging files and reshaping data tables from “wide” to “long”.&nbsp;</p> 2020-12-18T00:00:00-07:00 Copyright (c) 2020 George Alter, Darrell Donakowski, Jack Gager, Pascal Heus, Carson Hunter, Sanda Ionescu, Jeremy Iverson, H.V. Jagadish, Carl Lagoze, Jared Lyle, Alexander Mueller, Sigbjørn Revheim, Matthew A. Richardson, Risnes Ørnulf, Karunakara Seelam, Dan Smith, Tom Smith, Jie Song, Yashas Jaydeep Vaidya, Ole Voldsater