The replication crisis: Implications for myself

Keywords: replication crisis, open science, reproducibility

In this blog post, I reflect on the ways in which learning about the replication crisis in science has affected my own work.

Author: Lukas Sönning (University of Bamberg)
Published: November 21, 2023

Since my research is almost exclusively quantitative, the methodological discourse surrounding the replication crisis has been directly relevant to my work. A recent invitation to take part in an online event by the International Society for the Linguistics of English (ISLE) on “Replication and Replicability” was an opportunity to reflect on the ways in which this “crisis” has affected how I do my job. In this blog post, I summarize these under three headings: (1) workflow and reproducibility, (2) open science, and (3) community discourse.

I would like to start, however, with two preliminary remarks. For one, I consider the discussions, suggestions, and innovations that have arisen in the context of the credibility crisis in science as an opportunity – they should inspire us to improve the way(s) in which we do and communicate research. While there are some who point out that we actually don’t know whether there is a replication crisis in linguistics1, the suggested ways forward enable better science, so it is worth adopting them in any case.

Further, if we decide to change our research routines, we should be patient with ourselves: Many of the suggested improvements, especially those concerning the data analysis workflow, can be quite overwhelming at first. We should avoid setting our immediate aims too high – as I had to find out on numerous occasions, it is all too easy to become frustrated. And this may also be something to keep in mind when making recommendations: The advice we give to others should be calibrated to the person across the table. Nothing is gained if a researcher with a genuine interest in adopting better practices ends up quitting in frustration.

Workflow and reproducibility

A minimal requirement for quantitative research is that the results of a study should be reproducible, which means that they can be re-generated (or repeated) using the original data. I am therefore using the term “reproducibility” in what may be considered the original sense, and distinguish it from “replication”, which uses new data (see this very helpful paper on terminological traditions by Lorena Barba).

When it comes to tools and technical know-how, much progress has been made in the past two decades, and researchers can now make use of an impressive array of software solutions. A key issue, however, is the varying level of technical expertise in the research community. Scholars well-versed in the latest reproducibility tools may forget that not everyone is yet comfortable with the software needed to implement them. A case in point is the use of statistical programming languages such as R or Python.

When I started as a PhD student in 2012, I had some experience with SPSS, which I had used in my final thesis. Pressed for advice, my senior colleague Ole Schützler said that I might want to consider learning R – so I did. Today, I am truly grateful for this input. R’s steep learning curve, however, makes me hesitant to issue this as a universal recommendation. After all, good research can also be done with other, more user-friendly software. Nevertheless, for early-career researchers setting out on an empirical PhD project, I would probably make the same recommendation.

While learning R is a great first step towards a reproducible workflow, it does not guarantee improvement. In fact, in the early stages of learning R, when we are analyzing data while at the same time learning a new (programming) language, it is a relief if we manage to get things to work at all (data import, summary statistics, graphs). In the beginning, there is usually little capacity left for worrying about reproducibility. This was certainly true for myself.

For instance, some time ago I tried to reproduce the results I reported in a 2014 paper. While the script ended up running OK, I had made a bad mistake when using the “Import Dataset” button in RStudio (a great button for my 2013 self, since getting data into R turned out to be quite difficult): I didn’t copy-and-paste the resulting code (which will always appear in the console) into my script, so ten years later I didn’t know which version of the data I had used in the analysis. After a time-consuming trial-and-error process, I realized, to my horror, that the analyses in the paper are in fact based on two different versions of the data (one figure rested on an uncleaned version). Re-running the code with the final version fortunately only led to minor differences in the patterns, which didn’t affect the linguistic conclusions drawn.
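
The lesson I drew from this is simple, and I note it here in case it saves someone else similar trouble: whenever the “Import Dataset” button is used, the code it prints to the console should be pasted into the script. A minimal sketch of what this looks like (the file name is made up):

    # Import code generated by RStudio's "Import Dataset" button,
    # pasted from the console into the script so that the exact data
    # file used in the analysis is documented
    library(readr)
    d <- read_csv("data/data_cleaned_v2.csv")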

For researchers who are using (or have decided to learn) R, it makes sense to immediately learn how to run and document analyses using notebooks such as RMarkdown (or its successor quarto). These allow us to tightly integrate code, text and output (figures and numerical summaries) and they make it easier to structure contents (e.g. using section headings, bullet points and different font types). I think notebooks actually make it easier to work with R. It’s great to see that Bodo Winter’s worthwhile textbook, Statistics for linguists, has an extra chapter introducing the reader to reproducible R workflows. And Martin Schweinberger’s extensive LADAL platform also has a very informative section on reproducible workflows.
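
To give readers who have not worked with notebooks a rough idea, a minimal quarto source file might look something like the following (the file name, headings, and data set are of course made up):

    ---
    title: "Study X: Data analysis"
    author: "Lukas Sönning"
    format: html
    ---

    ## Data import

    ```{r}
    library(readr)
    d <- read_csv("data/data_cleaned_v2.csv")
    ```

    ## Summary statistics

    ```{r}
    summary(d)
    ```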

In my own research, I have profited immensely from switching to RMarkdown (and now quarto). Quarto is an ecosystem for academic communication, and it opens up an impressively broad array of possibilities: You can create presentation slides, write and typeset articles and books, create a personal website and/or blog, and much more.

I would like to stress, however, that notebooks are not a prerequisite to reproducibility – good workflows can also be built by relying on plain (R) scripts. In 2017, I did some work on the English comparative alternation. It was a memorable experience when I contacted Martin Hilpert via email, kindly asking him whether it might be possible to have a look at the background data and analysis code for his 2008 paper (note the 10-year gap). I received a reply within two hours, including an R workspace containing the data and code, which ran smoothly and reproduced all results reported in that paper.

Still, notebooks offer many attractive features when working in R. At the most basic level, they allow us to create PDF or html files documenting our analysis. These are more reader-friendly, and I usually try to upload a PDF or html version of my commented analysis scripts to the OSF project associated with a study. I also used RMarkdown to typeset my dissertation, which I self-published in 2020. The manuscript was originally written using MS Word, but since it contained many graphs, I wanted to align the book with the Tufte layout.
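
In case it is useful to others: producing such a PDF or html file does not require any button-clicking; it can also be done from the R console, for instance along the following lines (this assumes the quarto and rmarkdown packages are installed, and a LaTeX distribution for PDF output):

    # Render a quarto notebook to html (or pdf) from within R
    quarto::quarto_render("analysis.qmd", output_format = "html")

    # The same for an RMarkdown file
    rmarkdown::render("analysis.Rmd", output_format = "pdf_document")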

I have since switched to quarto, which is compatible with RMarkdown files but offers yet further options. For instance, I created my website using quarto, as well as this blog. I have also used it to create presentation slides for a statistics workshop, with the nice feature that I could use my smartphone to run these in a web browser, allowing me to have two projected screens (the other one showing a live RStudio session).
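
For readers curious what this looks like in the source file, the header of such a slide deck might look roughly like this (the title is made up); if I remember correctly, the embed-resources option bundles everything into a single html file, which is what makes it easy to open the slides in any web browser:

    ---
    title: "Statistics workshop: Session 1"
    format:
      revealjs:
        embed-resources: true
    ---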

The tools and possibilities for reproducible science are vast, and we therefore need to become accustomed to the fact that it is always possible to do more and to do better. And it seems that this is never going to change – the more we learn, the more we become aware of yet further possibilities. For instance, I have recently come across a great paper by Julia Strand, where she gives many valuable tips for avoiding errors in research. There, I learned about the TIER protocol, and I have since been making an effort to incorporate it into my workflow.
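
The details are best taken from the protocol documentation itself; purely as an illustration of the general idea, a project organized along these lines might look roughly like this (the folder names are my own shorthand, not the protocol’s official terms):

    project/
      data/
        input/           # original, untouched data files
        analysis/        # cleaned files actually used in the analysis
      scripts/
        01_processing.R  # turns input data into analysis data
        02_analysis.R    # produces the figures and tables in the paper
      output/            # figures, tables, model summaries
      README             # what is where, and how to re-run everything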

With other options, I have (so far) been more hesitant. Thus, it is possible, in principle, to use quarto to author complete articles. Every figure and every number cited in the text is then generated from R code (and not copied in manually), which secures a maximal level of reproducibility. However, I still prefer writing my papers in MS Word, and I still (carefully) copy stuff in manually.
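
For readers who have not seen this in action: in a quarto (or RMarkdown) source file, short pieces of R code can be embedded directly in the running text, so that numbers are computed rather than typed. A purely illustrative sketch (the model m and the variables are made up):

    ```{r}
    m <- lm(rt ~ condition, data = d)
    ```

    Mean reaction times differed by `r round(coef(m)[2], 1)` ms
    between the two conditions (based on `r nobs(m)` observations).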

Open science

I will now talk about some ways in which I have tried to adopt open science practices in my work. Part of the open science movement is the call for open data. In contrast to reproducible workflows, which often require an engagement with programming languages, the call for open data can in principle be answered by anyone.

When asked to serve as a reviewer for a statistics-heavy paper, I make it a habit to kindly ask the editor whether the authors might be willing to share their data and analysis code. This has led to some interesting exchanges, but overall my experiences have been quite positive.

In one case, for instance, I was concerned that a particular decision in the statistical analysis – which did not seem directly motivated on linguistic grounds – might have had an undue effect on the results. Without the data and code, I would have had to explain the concern, and the authors would have had to reply to my comments, and so on. Having the data and code, however, I could simply see for myself. It turned out that this data-analytic choice had virtually no effect on the results, so I did not raise the point but instead suggested that the authors might consider adding this as a reassuring footnote for future readers of the article. In another case, my request for data and code prompted the authors to find an error in their data retrieval process, which apparently affected the results; I received a revised version of the manuscript two weeks later.

Perhaps the easiest way to make data available to the community is to create an OSF project for a study and deposit the files there. Thanks to a straightforward graphical user interface, OSF projects are easy to create and maintain, and files can be uploaded via drag-and-drop.
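
For those who prefer to script this step, files can also be uploaded from within R, for instance using the osfr package. A minimal sketch (the project ID is made up, and authentication with a personal access token via osf_auth() is assumed):

    library(osfr)
    # osf_auth("<personal access token>")   # authenticate once per session
    project <- osf_retrieve_node("abc12")   # OSF project ID (made up)
    osf_upload(project, path = "data/data_cleaned_v2.csv")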

It makes sense, however, to go one step further and publish data properly, in a domain-specific archive. To linguists, I can warmly recommend TROLLing, a language data repository associated with the Dataverse project and based at UiT The Arctic University of Norway. There are several advantages to publishing data using this channel, including: (i) a proper review process that (among other things) ensures alignment with the FAIR principles; (ii) a DOI for persistence and proper citation; (iii) advice on relevant legal and research-ethical aspects; and (iv) a license that clearly states the terms for (re)use.

The TROLLing website provides detailed guides on how to prepare your data and create a TROLLing post. Lots of background on archiving language data can be found in Helene Andreassen’s chapter in The Open Handbook of Linguistic Data Management. She is also a member of the development and curator team at TROLLing. Further information is also available from the slides I prepared for a workshop at META-LING 2023, on Data publication using TROLLing. I feel very honored to have been invited to become a member of TROLLing’s Scientific Advisory Board, and I hope that I will be able to support open data practices in this way.

While data archives such as TROLLing or the Journal of Open Humanities Data, another attractive platform, offer many appealing features to linguists, open data can (initially) also be practiced without such channels. For instance, Benedikt Szmrecsanyi, in my perception an early adopter of open data practices in English linguistics, has long been making data and code for many of his publications available on his website.

I have personally profited from open data practices in several ways. In one particular case, for instance, a data set I intended to submit to TROLLing contained an error that I spotted only when I carefully documented its contents for submission. For this reason, I now always prepare the data set publication before running my final analyses.

On another occasion, Joseph Flanagan contacted me to get some more background on a data set for a methodological study that appeared in Ole Schützler and Julia Schlüter’s edited volume Data and methods in corpus linguistics: Comparative approaches. I found it quite exciting to be contacted by Joe, because I had really enjoyed reading his 2017 VARIENG paper on Reproducible research: Strategies, tools, and workflows. And following our email exchange, I learned that there was in fact a much easier way of retrieving the data we needed than the one we had documented in our data preparation scripts.

Finally, for a recent study on down-sampling methods in corpus research (which has just been accepted for publication in the International Journal of Corpus Linguistics), I used data from Gard Jenset and Barbara McGillivray’s 2017 monograph Quantitative historical linguistics: A corpus framework. In a very friendly email exchange, they told me that they were quite happy to hear that their data could serve as a basis for further research. Incidentally, Barbara McGillivray is also the editor-in-chief of the above-mentioned Journal of Open Humanities Data.

As for other open science practices, I have made it a habit to create an OSF project for every research or teaching project I begin (talks, articles, workshops). In the beginning, I used OSF projects for conference talks, to upload my presentation slides in case anyone was interested. The handy thing about OSF links is that they are very short – they can be scribbled down quickly. In my default template for PowerPoint slides, the relevant OSF link always stays posted (in unobtrusive grey) in the top right corner.

I then gradually started to provide more background material, data and code. For submitted papers, for instance, I use OSF to have control over the license under which my figures are published. I got this idea from Stefan Hartmann’s contribution to the “How to (:) Open linguistics” talk series at BICLCE 2019 (more on that further below). The motivation to gain control over licensing grew out of the self-publication process for my dissertation. Some of the figures that appear in the book are redrawn versions of graphs in the literature, and the legal department at our university advised me to contact the respective publishers for permission. This tedious process can be sidestepped if images are published under a permissive Creative Commons license prior to submitting a paper to (say) a journal. In the resulting publication, I indicate the terms of reuse by adding the CC and BY icons in the figure captions, the first mention including a footnote with a brief explanation (see, e.g., here).

Only much later did I start making my OSF projects into a coherent whole, with keywords and a wiki to provide background on the study and facilitate navigation. For some time I have also used PsyArXiv, a preprint service associated with the Open Science Framework. Thus, when giving a conference talk, I usually try to prepare the manuscript for the eventual article beforehand, so people who are interested can read the study. Journals vary somewhat in their policies for sharing the submitted version of a paper, and Sherpa Romeo is a helpful resource to learn about journal-specific regulations.

Overall, then, I try to make every publication into a bundle: an OSF project with the figures and analysis code, a published dataset, a preprint, and, ideally, an open access link to the paper. An early inspiration for this practice was Timo Roettger, an exemplary figure when it comes to transparency and open science in linguistics.

Community discourse

To spread the word and improve research practices in the broader community, it is also important to stimulate and maintain an active discourse. This process starts at home, with our students, colleagues, and collaborators.

At a larger scale, thematic workshops at conferences can help spread awareness of key issues. At ISLE 5 in 2018, my colleague Valentin Werner and I organized a workshop entitled “The ‘quantitative crisis’, cumulative science, and English linguistics”. The program included a series of six papers on various topics. The workshop drew quite a large audience, which showed that linguists have a genuine interest in the topic. I remember, however, that the broad range of research backgrounds among the participants sometimes made it challenging to find common ground and establish coherence, especially when it came to issues relating to statistical analysis. After all, follow-up communication can be complicated by trivial things such as variation in the use of statistical terminology. Nevertheless, the workshop was a worthwhile and instructive experience. It also led to the publication of a special issue in Linguistics entitled “The replication crisis: Implications for linguistics”.

At BICLCE 2019, we tested another format for stimulating the discourse on implications of the replication crisis. In addition to the general sessions and thematic workshops, the conference included a series of talks (20+10 minutes) under the title “How to (:) Open linguistics”. These were integrated into the program and they targeted issues and better practices from a practical perspective. The series included the following invited talks:

  • Why and how to preregister your study (Timo Roettger)
  • Making your research reproducible: Tools and tips (Stefan Hartmann)
  • Analyzing language data: Recurring issues (Lukas Sönning)
  • Visualizing uncertainty using model-based graphs (Jan Vanhove); includes an online tutorial
  • Bayesian inference as an alternative paradigm (Bruno Nicenboim)

We got lots of positive feedback on this conference feature, and it turned out to be quite a success. Two things were particularly appealing, I think: its practical orientation and the fact that it was woven into the general program.

Another way of promoting more sustainable research practices is via teaching, of course, the most important target group being early-career researchers. I make an effort to teach (small) workshops on a regular basis, and some are listed on my website. Earlier this year, I gave a talk at the weekly linguistics workshop associated with the Bamberg Graduate School of Linguistics, on “Open science: Resources for linguists”, which included some basic tips on how to use R (notebooks) and OSF, and on data publication using TROLLing.

To do’s

While this interim report documents some things that I have done in response to the transparency movement in science, there is also a lot that I haven’t done.

For instance, I have so far never pre-registered a study (to my shame), and this is something I want to pursue in 2024. Since I am currently mostly doing methodological research, I liked to believe that this kind of work is perhaps not quite suited for pre-registration. However, a talk by Elen LeFoll and Vaclav Brezina at this year’s Corpus Linguistics conference in Lancaster (Is multi-dimensional analysis replicable? Variables, parameters and transparency) quite nicely showed that this is not the case. I have also never done a proper replication study, which is something I am hoping to do in the near future.

A workflow feature that I am not yet comfortable with is version control using services such as GitHub. While I am convinced by its benefits, I don’t find it particularly intuitive to work with. So firmly integrating version control into my workflow is also still on my list of things to do.
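
For the record – and very much as a sketch rather than a recommendation from experience – the first steps can be taken from within R using the usethis package (this assumes usethis is installed and a GitHub personal access token has been configured, e.g. via the gitcreds package):

    library(usethis)
    use_git()     # put the current RStudio project under local version control
    use_github()  # create a matching repository on GitHub and push the project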

Footnotes

  1. It seems that we are not particularly eager to find out (see, e.g., this preprint by Kristina Kobrock and Timo Roettger). It would be quite surprising, however, if linguist(ic)s were spared – after all, the same human factors are at work in language research as in neighboring disciplines such as psychology.

Citation

BibTeX citation:
@online{sönning2023,
  author = {Sönning, Lukas},
  title = {The Replication Crisis: {Implications} for Myself},
  date = {2023-11-21},
  url = {https://lsoenning.github.io/posts/2023-11-19_replication_crisis_reflection/},
  langid = {en}
}
For attribution, please cite this work as:
Sönning, Lukas. 2023. “The Replication Crisis: Implications for Myself.” November 21, 2023. https://lsoenning.github.io/posts/2023-11-19_replication_crisis_reflection/.