The repo_check module finds links to research repositories in a paper (OSF, GitHub, ResearchBox, Zenodo), retrieves the list of files in each, and reports what is shared — for example whether there are data and code files, and whether the repository includes a README.
Note
This module makes live network calls to the linked repositories. You need an internet connection to run the code below.
For every repository it finds, repo_check builds a unified file list and then assesses three things:
A README. Each repository should document its contents. The module looks for a file whose name contains “readme” (or “read me”). A repository without one is flagged, because readers have no map of what the shared files are.
Archive (zip) files. Files bundled into a .zip (or similar archive) are flagged, because their contents cannot be inspected, indexed, or reused without downloading and unpacking them. The module reports the archive names and suggests uploading the files individually.
What is shared. It counts the data files, code files, README files, and archives in each repository (the files_data, files_code, files_readme, and files_zip columns of the summary table), so you can see at a glance whether data and code were actually shared.
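The three checks above can be illustrated with a small base R sketch. It is not the module's implementation: the function name classify_repo_files() and the extension lists are assumptions made for illustration. Only the README name rule and the column names files_data, files_code, files_readme, and files_zip come from the description above.

```r
# Hypothetical sketch: classify file names along the lines of the checks above.
# The extension lists are illustrative assumptions, not repo_check's actual rules.
classify_repo_files <- function(file_names) {
  lower <- tolower(file_names)
  ext <- tolower(tools::file_ext(file_names))

  # README: the file name contains "readme" or "read me"
  is_readme <- grepl("read ?me", lower)
  # Archives: .zip or a similar bundle format
  is_zip <- ext %in% c("zip", "tar", "gz", "7z", "rar")
  # Code and data are guessed from the extension; READMEs are not double-counted
  is_code <- ext %in% c("r", "rmd", "qmd", "py", "do", "sps", "m", "jl") & !is_readme
  is_data <- ext %in% c("csv", "tsv", "xlsx", "xls", "sav", "dta", "rds", "json") & !is_readme

  data.frame(
    files_data = sum(is_data),
    files_code = sum(is_code),
    files_readme = sum(is_readme),
    files_zip = sum(is_zip)
  )
}

# A made-up file list for one repository
files <- c("README.md", "data/raw.csv", "analysis.R", "materials.zip")
counts <- classify_repo_files(files)
print(counts)
```

In this made-up repository each count is 1, so it would pass the README check but still be flagged for the archive.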
The traffic light follows from these checks:
green — repositories were found, every one has a README, and there are no archive files.
yellow — at least one repository is missing a README, or one or more archive files were found.
na — no repository links were found in the paper at all.
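The rule above is simple enough to write down as a short function. The sketch below is a hedged illustration only: repo_traffic_light() is a hypothetical name, and it assumes a data frame with one row per repository holding the files_readme and files_zip counts, which may not match how the module stores its results internally.

```r
# Hypothetical sketch of the traffic-light rule described above.
# Assumes `repos` has one row per repository with count columns
# files_readme and files_zip (an assumption about the data layout).
repo_traffic_light <- function(repos) {
  # na: no repository links were found at all
  if (nrow(repos) == 0) return("na")

  missing_readme <- any(repos$files_readme == 0)
  has_archives <- any(repos$files_zip > 0)

  # yellow: any repository lacks a README, or any archive was found
  if (missing_readme || has_archives) "yellow" else "green"
}

clean <- data.frame(files_readme = c(1, 2), files_zip = c(0, 0))
problems <- data.frame(files_readme = c(1, 0, 0), files_zip = c(0, 1, 0))
none <- data.frame(files_readme = numeric(0), files_zip = numeric(0))

light_clean <- repo_traffic_light(clean)       # "green"
light_problems <- repo_traffic_light(problems) # "yellow"
light_none <- repo_traffic_light(none)         # "na"
```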
20.5 A clean example and one with problems
The demo paper links to three repositories, and running repo_check on it produces a yellow light — there are problems to address. The summary text reports exactly what:
#>
#> - We found 14 files in 3 repositories.
#> - We found 1 README file and 2 repositories without READMEs.
#> - We found 1 archive file.
Here the issues are the repositories without a README and an archive file whose contents could not be examined. The per-paper summary table counts each file type; note that files_readme is 0 (no README was shared) and files_zip is 1 (one archive).
By contrast, a clean result requires a README in every linked repository (files_readme ≥ 1 for each) and files shared individually rather than as a zip (files_zip of 0). That produces a green light with no items to address.
20.6 Options
repo_check accepts arguments for working with local files instead of (or in addition to) online repositories:
# also include files from a local folder
module_run(paper, "repo_check", local_path = "path/to/downloaded/files")

# only check a local folder, skip all online lookups
module_run(paper, "repo_check", local_path = "path/to/files", local_only = TRUE)
local_only = TRUE is useful when you cannot or do not want to contact external services — for example when checking a reviewer submission you downloaded as a zip. See the Local Files chapter for details.
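Before passing a folder as local_path, it can help to preview which files it contains. The base R sketch below does that with list.files(); preview_local_files() is a hypothetical helper and not part of repo_check, and the demo folder is created in a temporary directory purely for illustration.

```r
# Hypothetical helper (not part of repo_check): list the files in a local folder.
preview_local_files <- function(local_path) {
  stopifnot(dir.exists(local_path))
  # recursive = TRUE descends into subfolders; paths are relative to local_path
  list.files(local_path, recursive = TRUE)
}

# Demo with a throwaway folder standing in for downloaded files
demo_dir <- file.path(tempdir(), "repo_check_demo")
dir.create(file.path(demo_dir, "data"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_dir, c("README.md", "analysis.R", "data/raw.csv")))

local_files <- preview_local_files(demo_dir)
print(local_files)
```

The folder path used here could then be supplied as local_path in the calls shown above.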
20.7 Related
The companion code_check module analyses the actual code files that repo_check discovers.