Benefiting from and practising Open Science during a PhD (side) project
Open Research objectives/practices
using open data; pre-registration of secondary data analysis; open code; open access publishing (preprints)
At the start of my PhD I was busy setting up my data collection, which often involved long waiting times (e.g. for ethical approval). I read a lot of papers during that time, including pioneering work by a research team in the US (Himmelstein et al., 2018), who had collected interesting data that could also be used for a research question I had in mind. I contacted them and they were willing to share their data with me. On that basis I started a side project, which has since become one of my thesis chapters. During this project I struggled with a number of things, some of them related to open science, and I would like to reflect on them here.
Many aspects of my project were open science-related.
First of all, I benefited from someone opening up their data to me (our collaborator is willing to make the data fully open after the publication of pending articles). While working with the data, I discovered mistakes in the preprocessing and could help the researcher who shared the data to partly fix them. Our collaborator was a little embarrassed, but also quite happy, as the mistakes would otherwise have affected other papers to be published based on the data.
Before analysing the data, I completed a pre-registration (on OSF) for a rather challenging secondary data analysis. The pre-registration helped me a lot in structuring what I wanted to do and eventually helped in writing my analysis code and the paper. During the research process, however, it also turned out that we needed to adapt our analysis plan for one question and deviate from the pre-registration: a methodological expert made us aware of a more appropriate analysis approach that I had not known about while writing my pre-registration. We tried to document our deviations as transparently as possible and reported the most appropriate analysis in the paper.
I also uploaded our entire analysis code to OSF, which felt particularly daunting as the analyses we did were rather complicated and it was my first time doing them. However, I find it important to give reviewers the chance to check my work and readers the chance to also perform such analyses. I usually try to write my code in such a way that others can follow it, structuring the script and commenting a lot. During this project, I had many breaks (e.g., doing the data collection for my main PhD project; waiting for collaborator or reviewer comments) and the comments also helped me a lot in understanding my code when returning to it after a while.
Lastly, I published a pre-print of the final paper and intend to publish it open access (pending acceptance) so that anyone can read my work.
Open science enables very interesting collaborations and allows collected data to be used efficiently.
It is okay to make mistakes (e.g., in data preprocessing or your pre-registration). Open science is not there to call you out but to make your work better.
URLs, references and further information
The paper that led to this project:
Himmelstein PH, Woods WC, Wright AGC. A comparison of signal- and event-contingent ambulatory assessment of interpersonal behavior and affect in social situations. Psychol Assess. 2019 Jul;31(7):952-960. doi: 10.1037/pas0000718. Epub 2019 Apr 8. PMID: 30958026; PMCID: PMC6591090.
The OSF page of my project including code and preregistration
Last modified: 01 November 2023 12.40 p.m.