Content from Introduction


Last updated on 2024-05-12

Why is this important?


The topic of reproducible research is important for several reasons:

  • Making research more reproducible contributes to continuous research improvement
  • Recent discussions of the reproducibility crisis, that is, of many research studies being irreproducible, call for broader awareness of and increased skills in research reproducibility
  • Open and reproducible research is becoming the norm worldwide, following global changes in how academic research is conducted, including broad science reform
  • Transparent, rigorous and reproducible research is an integral part of research integrity and responsible research conduct that researchers should follow

To summarize, more reproducible research means improving research, mitigating the reproducibility crisis, and contributing to ongoing science reform and responsible research conduct.

Content from What is Reproducible Research?


Last updated on 2024-05-13

Overview

Questions

  • What do we mean by reproducibility?
  • When is research reproducible?
  • Does reproducibility mean different things in different disciplines?

Objectives

  • Explain what research reproducibility is
  • Provide examples where reproducibility is not the same as, or does not overlap with, open science
  • Explain how different disciplines define reproducibility differently

Reproducibility: Some Definitions


Reproducibility: Obtaining the same results using the same data.

Replicability: Achieving similar results with new data.

Research is reproduced when results are consistent when following the same method and analysis steps with the same input data

Research is replicated when results are consistent across studies that answer the same research question, each of which has obtained its own data

Research results are generalized when results apply in other contexts or populations that differ from the original one

Source

Based on: https://the-turing-way.netlify.app/reproducible-research/overview/overview-definitions.html

Based on what we went through, we can say that a study has been reproduced when:

  1. Researchers apply similar methods to the original study in a new study
  2. Researchers re-analyze data from the original study and observe the same results
  3. Researchers reuse data from the original study for a new purpose
Answer: 2. Researchers re-analyze data from the original study and observe the same results

Reproducibility: Some Examples


Let’s consider an example: a researcher is tossing a coin 100 times to check if the coin is fair. They register if they have observed heads or tails after each toss when the coin falls on the floor. Heads are registered as 0 and tails as 1. The sample size of this study is N = 100 (since they are tossing the coin 100 times).

Hypothesis: The coin is fair (i.e. not biased)
Sample size: N = 100
Heads = 0
Tails = 1
Analysis method = Student t-test

After all data are collected (i.e. the researcher is done with the tossing), they start the data analysis. They run a simple statistical test in SPSS, a Student t-test, to compare the proportion of tails observed against the chance level (which is 0.5: the coin has two sides, and if it is fair, there should be a 50% chance of getting tails). The researcher observes that the proportion of tails is not significantly different from chance, and so they find support for their original hypothesis. The researcher makes the complete data table and the detailed methods and analysis from the study available to the public.

Another researcher downloads the data table and re-runs the exact same analysis in different software, using the R programming language. They also observe that the proportion of tails is not significantly different from chance. They have reproduced the study!

A third researcher reads about the reproduced study and decides to conduct a new study with a different coin. They apply the exact same methods (i.e. they toss a coin 100 times and register the outcome every time the coin falls on the floor). Just as in the original study, they mark heads as 0 and tails as 1. They also run a Student t-test on the data and they observe that the proportion of tails is not significantly different from chance. They have replicated the study!

Note, however, that in many different disciplines the word “reproduced” could be used in both the second and the third researcher case, that is to mean both reproducing and replicating the study.
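The coin example can be sketched in a few lines of Python. This is only an illustration, not the analysis from the study: the tosses are simulated with fixed seeds (hypothetical data), the function names are made up for this sketch, and the two "software packages" are stood in for by two independent ways of computing the same one-sample t statistic. The critical value 1.984 is the approximate two-sided 5% threshold for 99 degrees of freedom.

```python
import math
import random
import statistics

CHANCE = 0.5        # expected proportion of tails for a fair coin
T_CRITICAL = 1.984  # approximate two-sided 5% critical value, df = 99


def toss_coin(n, seed):
    """Simulate n tosses (0 = heads, 1 = tails). Hypothetical data."""
    rng = random.Random(seed)
    return [1 if rng.random() < 0.5 else 0 for _ in range(n)]


def t_statistic(tosses):
    """One-sample t statistic computed with the statistics module."""
    n = len(tosses)
    mean = statistics.mean(tosses)
    sd = statistics.stdev(tosses)  # sample standard deviation (n - 1)
    return (mean - CHANCE) / (sd / math.sqrt(n))


def t_statistic_by_hand(tosses):
    """The same statistic, from the closed form for 0/1 data."""
    n = len(tosses)
    p = sum(tosses) / n
    sample_variance = n / (n - 1) * p * (1 - p)
    return (p - CHANCE) / math.sqrt(sample_variance / n)


# Original study: one data table, one analysis.
original = toss_coin(100, seed=1)
t_original = t_statistic(original)

# Reproduction: same data table, independent implementation.
t_reproduced = t_statistic_by_hand(original)
reproduced = math.isclose(t_original, t_reproduced, abs_tol=1e-9)

# Replication: same method, newly collected data (a different coin).
replication = toss_coin(100, seed=2)
t_replication = t_statistic(replication)

print("reproduced:", reproduced)
print("original differs from chance:", abs(t_original) > T_CRITICAL)
print("replication differs from chance:", abs(t_replication) > T_CRITICAL)
```

The reproduction step must agree with the original to within rounding error, because the data and the method are identical. The replication step uses new data, so its t statistic will differ; what matters there is whether the conclusion is consistent.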

Discussion

Can you provide additional examples of reproducible studies from various disciplines or research types?

Reproducibility Across Methodologies and Research Disciplines


Quantitative Studies: Computational Reproducibility

Computational reproducibility is defined as “obtaining consistent computational results using the same input data, computational steps, methods, code, and conditions of analysis” (https://www.nap.edu/catalog/25303). In practice, this means re-running the same analyses or code on the same data.
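One way to make "the same input data" and "the same conditions of analysis" checkable is to store a small record next to the results. The sketch below is an assumption about how this could look, not a standard format: the function name `describe_run`, the record fields and the tiny data table are all made up for illustration. It fingerprints the input data with a SHA-256 checksum and notes the Python version and platform.

```python
import hashlib
import json
import platform
import sys


def describe_run(data_bytes, steps):
    """Build a record of the inputs and conditions of one analysis run."""
    return {
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "steps": list(steps),
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


# Hypothetical data table, kept inline so the sketch is self-contained.
data = b"toss,outcome\n1,0\n2,1\n3,1\n"
steps = ["load data table", "one-sample t-test against 0.5"]

record = describe_run(data, steps)
same_data = describe_run(data, steps)
changed_data = describe_run(data + b"4,0\n", steps)

print(json.dumps(record, indent=2))
print("same input data:", record["data_sha256"] == same_data["data_sha256"])
print("changed input data:", record["data_sha256"] == changed_data["data_sha256"])
```

Anyone re-running the analysis can compute the checksum of their copy of the data and compare it with the stored one before comparing results.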

Qualitative Studies: Process Transparency

Here we mean arriving at a similar (consistent) interpretation by following the same analysis process. This could be obtained by following the step-by-step reasoning and interpretation process of the researcher(s).

Discussion

Discuss in pairs: Should we use the term “reproducibility” across different disciplines and research methodologies even though it might mean different things?

Reproducibility and Open Research


Reproducibility is closely associated with transparency.

In order to reproduce others’ studies we need to have access to the methods, data, and analyses that have been conducted. So making data, tools and analyses available is essential for reproducibility.

Reproducible does not (have to) mean fully open.

However, a reproducible project does not have to be fully open. For example, due to privacy or copyright restrictions on methods, data, or analyses, researchers might need to keep parts of the research outputs under controlled access (e.g. available only to other researchers and not publicly available). This should not prevent them, though, from making the project fully reproducible (e.g. internally within their research team).

Open does not mean reproducible.

On the other hand, it is entirely possible to practice open science without following reproducibility principles. Materials, data, tools and code can be made openly available, but if they lack the necessary documentation, instructions on how to use them, error checks, proper versioning and organization, they are most probably not usable, and the project might not be reproducible.

Reproducibility Crisis


Problems with the reproducibility of research have been noticed by many researchers, advisors and policy makers in the past several years, and have led some to claim that there is a “reproducibility crisis”. However, not everyone agrees.

https://www.pnas.org/doi/full/10.1073/pnas.1708272114
https://bmcresnotes.biomedcentral.com/articles/10.1186/s13104-022-05942-3

Reasons for Irreproducibility

  • Unavailability of materials, data and/or analyses
  • Poor data management
  • Unclear analysis specification
  • Lack of documentation
  • Errors in reporting numbers
  • Lack of quality checking procedures
  • Insufficient peer review
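Some of the reasons above, such as errors in reporting numbers and missing quality checks, can be caught with very simple tooling. The following is a minimal sketch, assuming a hypothetical study that reports the proportion of tails to two decimals; the function name and the numbers are made up for illustration.

```python
def agrees_with_report(recomputed, reported, decimals=2):
    """True when the reported value matches the recomputed one after rounding."""
    tolerance = 0.5 * 10 ** (-decimals)
    # Small extra margin so floating-point noise does not cause false alarms.
    return abs(recomputed - reported) <= tolerance + 1e-12


tails, n_tosses = 53, 100       # hypothetical raw counts
recomputed = tails / n_tosses   # value recomputed from the data

print(agrees_with_report(recomputed, 0.53))  # reported correctly
print(agrees_with_report(recomputed, 0.35))  # digits transposed in the report
```

A check like this only compares a reported number with one recomputed from the shared data, so it depends on the data and the analysis specification being available in the first place.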

Discussion

Do you agree that there is a reproducibility crisis in academic research?
How many studies would have to reproduce successfully for the “crisis” to be over?
What could librarians do to help researchers fight the “reproducibility crisis”?

Key Points

  • Reproducibility usually means obtaining the same results with the same data.
  • Across different disciplines and methodologies, the understanding of what reproducibility means can be very different.
  • Reproducible research is not the same as open research - it is important to share research outputs to be able to reproduce others’ studies, but research can be made fully reproducible even if it cannot be made fully open.
  • Recent studies point to many issues with reproducibility across different disciplines, something that has been termed the “reproducibility crisis”

Content from Benefits and Challenges of Reproducibility


Last updated on 2024-05-12

Overview

Questions

  • How can science benefit from more reproducible research?
  • How can students and researchers benefit from more reproducible research?
  • What are the main challenges of making research more reproducible?

Objectives

  • list at least 4 benefits of reproducible research
  • link reproducible research to research integrity/ethical research principles
  • list at least 4 challenges of reproducible research
  • provide one example for how researchers can be helped to overcome some of the challenges

Reproducibility benefits for science


Reproducible research is pivotal for research improvement because it ensures that studies are:

  1. Easier to verify, helping to catch errors and mistakes.
  2. More likely to be correct, because verification increases the chance that issues are caught.
  3. More understandable and reusable, due to the proper organization and thorough documentation of the research process.
  4. Better prepared to be shared and made open, when privacy or copyright restrictions do not apply.

Reproducibility benefits for researchers


But reproducibility also has particular benefits for researchers themselves, and not only for science more broadly. That’s because making studies more reproducible means that researchers:

  1. Are more efficient (although at first implementing reproducible workflows might be time-consuming, it makes them more efficient down the road!)
  2. Are less stressed about making a mistake (because both they and other researchers can check if the study reproduces across different contexts)
  3. Can get credit for producing rigorous research outputs (according to new research assessment criteria that follow global science reform)

Discussion

Think about one way in which more reproducible research could benefit science and one in which it could benefit researchers themselves. Why are these two benefits important?

Challenges in making research more reproducible


We learned that there are many benefits of making research more reproducible. However, this does not come without specific challenges:

  1. Making research reproducible is time-consuming (especially at first when new workflows are being implemented or research documentation is created from scratch)
  2. It requires skills and expertise (for example, researchers might need to know how to properly organize and document research outputs)
  3. It is more difficult when specific restrictions apply (in cases when, due to privacy or copyright restrictions, critical parts of a dataset or analysis cannot be made available for independent reproduction by other researchers)
  4. Research might not reproduce due to technical issues (for example, different analysis software packages might perform certain calculations differently and produce different outputs for the same analysis)
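The last challenge is easy to demonstrate. A common source of mismatch is the default formula for the standard deviation: some tools divide by n (population formula) and others by n - 1 (sample formula). For instance, NumPy's `std` defaults to the population formula, while R's `sd` uses the sample formula. The sketch below uses Python's standard library to show both results for the same made-up numbers.

```python
import math
import statistics

# Made-up measurements, used only to illustrate the difference.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_sd = statistics.pstdev(scores)  # divides by n
sample_sd = statistics.stdev(scores)       # divides by n - 1

print(f"population SD: {population_sd:.3f}")  # 2.000
print(f"sample SD:     {sample_sd:.3f}")      # 2.138
```

Neither value is wrong, but a reader who re-runs "the standard deviation" in a different tool may get a different number unless the documentation states which formula was used.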

Overcoming challenges


These specific challenges can be overcome, but they require that researchers have proper conditions and support for making studies more reproducible:

  1. Because making research reproducible is time-consuming, researchers should be rewarded for preparing reproducible outputs and have appropriate support
  2. Because it requires skills and expertise, institutions should offer training about the tools and solutions for reproducible research and/or trained support staff
  3. Because it is more difficult when specific sharing restrictions apply, reproducibility should be checked by internal staff and/or proper infrastructure and tools with controlled access to research outputs should be in place
  4. Because research might not reproduce due to technical issues, software documentation should be provided and results could be checked across different types of software and operating systems

Discussion

Name one challenge of making research more reproducible.
Discuss your choice with the person next to you and brainstorm ways in which you could help researchers overcome that challenge as a librarian.

Key Points

  • Making research more reproducible contributes to general research improvement, quality and rigor but also to higher efficiency and easier error correction for researchers
  • This does not come without specific challenges, such as time constraints, technical issues and the need for specific skills for making research outputs reproducible
  • To overcome the challenges, researchers need proper tools, solutions, research support staff, infrastructure and training

Content from Tools for Reproducible Research Workflows


Last updated on 2024-05-13

Overview

Questions

  • What are reproducible research workflows?
  • Which areas of the research process can be made more reproducible?
  • What tools can improve reproducibility of research workflows?

Objectives

  • provide examples of reproducible research workflows
  • list at least 4 different tools and practices for increasing research reproducibility
  • demonstrate basic understanding of how to use selected tools for increasing research reproducibility

Starting with Reproducible Research Workflows


When we talk about research workflows we mean the sequence of processes through which researchers have to go to get to specific research outputs such as a dataset, analysis result or a publication. We can distinguish three main areas in the research process where workflows can be made more reproducible:

  1. Data acquisition and processing
  2. Data analyses
  3. Data reports (manuscripts)

Figure: 6 helpful steps for reproducible research

What tools are available out there?


Different tools can be used for increasing reproducibility depending on the specific phase of the research process. Here is a list of some helpful tools for each of the three phases:

Data acquisition and processing

Documentation is one of the most important

Tools for documenting data acquisition and processing:

Data analyses

In the data analysis phase of the research process, the tools for making analyses more reproducible will differ depending on the methodology used, for example depending on whether the researcher applies quantitative or qualitative methods in the study. Here is a list of some helpful tools depending on research methodology:

Quantitative methods:

Qualitative methods:

Data reports (manuscripts)

  • R Markdown (fully reproducible manuscripts)
  • Quarto (fully reproducible manuscripts)
  • HackMD
  • Overleaf
  • Jupyter Notebooks

Exercise

  1. Take a look at the README file template that we listed in this lesson: https://data.research.cornell.edu/data-management/sharing/readme/ How could you help a researcher fill out a template like that? Which elements could you help most with?
  2. What types of tools can be used for making qualitative analyses more reproducible? If you or a researcher don’t have access to these specific tools, could you think of other ways in which one could make qualitative analysis more reproducible using commonly available tools?

Key Points

  • Research workflows are sequences of processes that researchers have to go through to get to specific research outputs
  • Data acquisition, data analysis and manuscript writing are three phases of the research process that can be made more reproducible
  • There are many tools out there that can help make research workflows more reproducible

Content from The Role of Libraries in Supporting Reproducibility


Last updated on 2024-05-13

Overview

Questions

  • What is the role of libraries in research improvement?
  • How can library staff support researchers in improving reproducibility?

Objectives

  • demonstrate understanding of libraries’ central role in supporting reproducibility
  • provide examples of how libraries can support research reproducibility and its improvement

Why Libraries?


Academic libraries are uniquely positioned to provide support in the area of reproducible research. Open science is already at the core of many libraries’ work, and many librarians provide direct and increasingly hands-on support to both early career and senior researchers at their institutions. Because reproducibility is strongly associated with open science/open research, and because funders, journals, and other stakeholders are beginning to implement new requirements for not only open but also reproducible research, librarians can build on their expertise in helping prepare research outputs for sharing to also help make them reproducible.

Reproducibility support from the libraries


Where can libraries help:

  • Awareness raising, teaching, training, and hands-on guidance
  • Help researchers be transparent about their full research workflow: research questions, methods, data, step-by-step procedures and analyses; help make the methods, data and analyses open (if possible in a given project)
  • Assist with making good documentation for all outputs and all stages of the research process
  • Assist with version control to keep track of versions of all outputs
  • Help with the quality check of the methods, data and analyses
  • Verifying researchers’ work: helping with reproducing their own results

Exercise (Reflection)

What do you think is the most important area where libraries can provide support in making research more reproducible?

Key Points

  • Academic libraries are uniquely positioned to provide help and support in the area of open and reproducible research
  • Academic librarians can build on their expertise in making research outputs shareable and reusable to also help make them reproducible
  • Academic libraries can offer training in reproducible research workflows in addition to open science trainings