Academic Data Center Groningen: Any chance for open data?
On 11 July 2017, the University of Groningen signed an agreement with Statistics Netherlands (CBS) marking the launch of the Academic Data Center Groningen. The aim of the agreement is to maximize the use of CBS data in research in the social sciences and economics, with the Center generating new research data and publications. However, the launch of the Center has raised some questions about how open and fair the new research data will be.
Academic Data Center Groningen
CBS is currently implementing a corporate valorization strategy with an eye to enhancing its value to Dutch society. One component of this strategy is to collaborate with academic institutions, thereby making maximum use of CBS data. This collaboration will take the form of Academic Data Centers, the first of which is the Center in Groningen. The CIT (Center for Information Technology) and the RDO (Research Data Office) are important UG players in the Center.
Nowadays, research data is the raw material for academic research, and more is needed to stimulate new innovative research and publications. It is therefore important that such data be collected and made accessible and available. The principles of ‘open data’ and ‘fair data’ are often mentioned in this context.
The cornerstone of the open data movement is that all data should be freely available to everyone. Wessels et al. (2014) provided a more precise definition of ‘everyone’, noting that the proposed users of open data are ‘public and private stakeholders, user communities and citizens’. The idea of open data has since been widely adopted and promoted by public stakeholders and national and supranational authorities. The rationale is that reusing data adds value and promotes economic development.
Open research data
National governments think that academic institutions should embrace the open data movement and make more research data available. However, the suggestion to combine openness with research data is far too simplistic, according to Janssen, Charalabidis & Zuiderwijk (2012), because there are barriers to the openness of research data. The first such barrier is unclear organizational policies on the publication of research data. The second is the lack of benefit for the data provider. There are also ethical and legal barriers. For example, if research data from human subject research is made available to all, this could endanger the privacy of participants, which is unacceptable for ethical and legal reasons. From 2018, legal enforcement is conceivable under the new EU General Data Protection Regulation (GDPR; Dutch: Algemene Verordening Gegevensbescherming, AVG). This new regulation will affect human subject research and thus the findability, accessibility and openness of research data. Could these developments influence the Academic Data Center Groningen and future access to research data? We put this question to a number of stakeholders in the Academic Data Center from the University of Groningen and CBS.
Open research data and the Academic Data Center Groningen
Govert Schoof (Geodienst, CIT) thinks that the cooperation between CBS and the UG will be valuable and produce lots of new research data. However, he is concerned that the data will not be fully open to everyone, because the issue of data protection will mean that CBS will have to limit access to its data. Much of the CBS data relates to human subjects and therefore needs to be protected. Govert does believe that these problems can be solved and has two suggestions. First, grant access to the research data of the Academic Data Center at a high aggregate level. Second, investigate the concept of ‘Differential Privacy’. This is a method for enhancing the privacy of personal data that has been adopted by Apple Inc., among others [https://blog.cryptographyengineering.com/2016/06/15/what-is-differential-privacy/].
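To make these two suggestions concrete, here is a minimal Python sketch. It is not a description of how CBS or the Academic Data Center handles data. It publishes only an aggregate count and adds Laplace noise to that count, which is one standard way of achieving differential privacy for counting queries. The record fields (`municipality`, `employed`), the function names and the value of epsilon are hypothetical, and the sample records are made up for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that match predicate.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon. A smaller epsilon
    means more noise and therefore stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Made-up records; the field names are hypothetical, not CBS variables.
records = [
    {"municipality": "Groningen", "employed": True},
    {"municipality": "Groningen", "employed": False},
    {"municipality": "Groningen", "employed": True},
    {"municipality": "Groningen", "employed": True},
]

# Only the noisy aggregate is released, never the individual records.
noisy = private_count(records, lambda r: r["employed"], epsilon=0.5)
print(f"Noisy number of employed persons: {noisy:.1f}")
```

The released figure differs from run to run, which is the point: no single answer reveals whether a particular person is in the data. A real deployment would need more than this sketch, for example keeping track of the total privacy budget spent across repeated queries.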
Wietske Degen (CIT) believes that the collaboration between CBS and the UG is very important. For instance, a catalogue of CBS microdata will soon be available, and this microdata cannot be found in StatLine or on the CBS Open Data platform. Microdata is linkable data at the level of individuals, companies and addresses that can be made available, under strict conditions, to researchers for statistical research. Wietske thinks that the legal teams at CBS and the UG will need to define more precisely what data protection means and what the legal aspects of their collaboration are. This includes the openness of data produced by the Academic Data Center.
Ronald de Jong (CBS) is glad that the CBS microdata will be available to UG researchers. A new catalogue of the data is already available, but researchers will need some time to become familiar with it, so there will be a learning curve. Furthermore, CBS considers data protection and legal standards for research data to be a top priority. It is therefore unlikely that the research data of the Academic Data Center will be as openly available as in the definition of Wessels et al. (2014), but it is clear that CBS and the UG are seeking possible solutions. However, in turbulent and changing times, treating research data with the utmost care is a very wise move.
Janssen, M., Charalabidis, Y., & Zuiderwijk, A. (2012). Benefits, adoption barriers and myths of open data and open government. Information Systems Management, 29(4), 258-268.
Wessels, B., Finn, R. L., Linde, P., Mazzetti, P., Nativi, S., Riley, S., ... & Wyatt, S. (2014). Issues in the development of open access to research data. Prometheus, 32(1), 49-66.
Last modified: 20 September 2017, 2.18 p.m.