Young Academy Groningen recently organised an inspiring event on Open Science and Reproducibility.
What is open science?
And isn't science supposed to be open anyway? Keynote speaker Simine Vazire showed us that this is not always the case. She took us back to the fundamental ideals of science by sharing Merton's norms, which distinguish science from other forms of knowing. The first is universalism: the validity of a scientific claim does not depend on who is making it, so there are no arguments from authority. The second is communality: scientific findings belong to everyone, and everyone can check them. Third, science should be disinterested: all results should be reported without bias, and findings that are unfavorable to the scientist should not be withheld. Finally, nothing is sacred in science, and all claims should be tested.
But is this really how science proceeds in practice? How many scientific findings can truly be checked by everyone? How many scientists are completely unbiased? Studies show that even scientists think that most science does not adhere to these ideals. Science therefore needs to become more open, so that it becomes easier for everyone to check scientific claims.
Challenges in opening up science
A major problem in science is the emphasis on significant results as a precondition for publication. Unfortunately, it is quite easy to obtain significant results with enough p-hacking (trying out many different tests on different subsets of your data) and HARKing (hypothesizing after the results are known, that is, presenting the significant results you obtained as if they were the original hypothesis). Probably because these practices are so common, many studies do not replicate, as was shown most clearly in large-scale replication attempts (e.g., the Reproducibility Project).
Another challenge in opening up your data is that you may not remember the connection between all your graphs and the raw data, or you may feel your data analysis scripts are too messy. Laura Bringmann shared some knowledge about R Markdown, which lets you combine text, analysis code and the resulting output in a single document, so that code, graphs and data no longer live in different places. This also makes revisions much easier, because you can readily reproduce the original analyses that led to specific numbers and plots.
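To make the p-hacking problem concrete, here is a minimal simulation sketch in Python. It was not part of the event materials and rests on simplifying assumptions: the data are pure noise (no true effect), each "subset" is an independent sample, and the test is a simple z-test with a known standard deviation of 1. The function names and parameter values are illustrative only.

```python
import math
import random
import statistics


def two_sided_p(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: mean = 0, assuming a known sigma."""
    z = statistics.fmean(sample) / (sigma / math.sqrt(len(sample)))
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_tests, n_studies=2000, n_per_test=20, alpha=0.05, seed=1):
    """Share of null studies that report at least one 'significant' test."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Every subset is pure noise: there is no true effect anywhere.
        p_values = [
            two_sided_p([rng.gauss(0.0, 1.0) for _ in range(n_per_test)])
            for _ in range(n_tests)
        ]
        # A p-hacker reports the study as a success if any test "works".
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


honest = false_positive_rate(n_tests=1)
hacked = false_positive_rate(n_tests=10)
print(f"one planned test:  {honest:.2f}")
print(f"best of ten tests: {hacked:.2f}")
```

With a single planned test, the false positive rate should stay close to the nominal 5%. With ten independent tests, the chance of at least one false positive is 1 − 0.95^10, or roughly 40%. Real subsets of one dataset overlap and are therefore correlated, so the inflation in practice will differ from this idealized figure.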
How can we improve?
A good step towards opening up science would be to share all materials and data so others can check them. One recommended resource was the Open Science Framework. Candice Morey, for example, keeps the materials and data from various projects there. However, during the data management panel, the Research Data Management Office mentioned that this platform does not (yet) comply with the new European privacy regulations. A better option would be to work with, for example, Dataverse. Moreover, it is important to think carefully about how to de-identify your data, because with current machine learning algorithms it is surprisingly easy to identify someone by combining a few different sources of data.
Of course, even if you decide to open your data, many others may not do so. One practical step individuals can take to enhance openness in science is to join the Peer Reviewers' Openness Initiative, in which you pledge to only review articles that make their data open (or that give a good reason why they cannot).
Another way in which openness can be improved is for universities to consider, in hiring and promotion decisions, the extent to which an individual makes their data and materials open. It also helps to formalize your hypotheses and deposit them somewhere before you collect the data. This procedure is called preregistration, a topic discussed in a keynote by Candice Morey. Preregistration can be done quite easily on the Open Science Framework. Another interesting method is to write up your hypotheses in a Registered Report format (offered by an increasing number of journals), in which reviewers decide on acceptance based on your introduction and methods before you collect the data; you are then guaranteed acceptance (in principle), regardless of how your results turn out. Of course, academic incentives should also change to promote such a process, rewarding these research practices instead of high-impact publications.
A further step in improving the scientific process would be to stop overselling our results and to understand statistics better. Rink Hoekstra talked about common misunderstandings about statistics. Most notably, almost everyone's intuitions about p-values are wrong. A p-value can never tell you that your hypothesis is true; it only provides some evidence against a null hypothesis, and it always carries a certain level of uncertainty. It is therefore never possible to make very strong claims on the basis of a p-value alone, contrary to what journals, and even more so the media, often ask for. An insightful visualization of how little p-values really mean is the Dance of the P-values. Instead of blindly relying on p-values, an easy way to get a better grasp on your data is to make visualizations, for which Gert Stulp provided some useful resources.
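The instability of p-values can be illustrated with a small simulation in the spirit of that visualization. It is not a reproduction of it, nor of any of the speakers' materials. The sketch below assumes a true effect of 0.5 standard deviations, 16 observations per study and a z-test with a known standard deviation of 1; all names and numbers are illustrative choices.

```python
import math
import random
import statistics


def replicate_p_values(effect=0.5, n=16, replications=20, seed=7):
    """Run the same study repeatedly and collect the two-sided p-values."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(replications):
        # Same true effect and same design every time; only the sample differs.
        sample = [rng.gauss(effect, 1.0) for _ in range(n)]
        z = statistics.fmean(sample) * math.sqrt(n)  # known sigma = 1
        p_values.append(math.erfc(abs(z) / math.sqrt(2)))
    return p_values


p_values = replicate_p_values()
for i, p in enumerate(p_values, start=1):
    flag = "*" if p < 0.05 else " "
    print(f"replication {i:2d}: p = {p:.3f} {flag}")
print(f"smallest p = {min(p_values):.4f}, largest p = {max(p_values):.4f}")
```

Under these assumptions the design has a power of roughly 50%, so exact replications of the same study should land on both sides of the 0.05 threshold even though the underlying effect never changes.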
In short, there is still a long way to go to open up your science, but more and more resources are available. The full slides and materials of the meeting can be found below. You can also check out the hashtag #RUGopenScience.
Open Science: advantages & challenges by Simine Vazire
Four stages of embracing pre-registration by Candice Morey
R visualisation by Gert Stulp
R-Markdown by Laura Bringmann
Visualising data in R by Rink Hoekstra
Are you interested in joining our events for Early Career Researchers? Visit our website for more information.