27  Creating a Metacheck Report

So far we have looked at individual modules in isolation. In practice you usually want to run several checks on a paper at once and collect the results into a single document you can read, save, or hand to a co-author. This chapter shows how to go from a single paper object to a full Metacheck report.

We start with the all-in-one report() function, then unpack what it does: running one module with module_run(), running several with report_module_run(), and turning the result into a document with report_qmd(). Finally we cover the trickiest part, passing settings through to the modules.

27.1 Loading a paper

Every report starts from a paper object, created with read() (see Reading in a Paper). Throughout this chapter we use the bundled demo paper, which demopaper() returns directly so the examples run without a network connection:

paper <- demopaper()
paper$paper_id
#> [1] "to_err_is_human"

In your own work you would instead read a converted file:

paper <- read("converted/my_manuscript.json")

Everything below works on this single paper object. The same functions also accept a paperlist (a list of papers), so the code scales unchanged to a whole corpus; see Batch processing and Using the Paper Database Corpora.

27.2 The quick way: report()

report() runs a set of modules on a paper and writes a finished report to disk. With no other arguments it runs the default battery of sixteen modules and saves an HTML file named after the paper. The defaults span the whole manuscript: preregistration (prereg_check), funding and conflict-of-interest statements (funding_check, coi_check), statistical power (power), data and code availability (repo_check, code_check), the reporting of statistics (stat_check, stat_p_exact, stat_p_nonsig, stat_effect_size, marginal), and the reference list (ref_accuracy, ref_replication, ref_retraction, ref_pubpeer, ref_summary).

report(paper)
#> "[paper_id]_report.html"

The function returns the path to the file it wrote. By default that is the paper ID followed by "_report.html" (so "to_err_is_human_report.html" for the demo paper), saved in the working directory; set output_file to choose your own name and location.

27.2.1 Choosing modules

You rarely need all the default checks. Pass a modules vector to run exactly the ones you want, in the order you want them reported:

report(paper,
       modules = c("stat_p_exact", "stat_p_nonsig", "marginal"))

Use module_list() to see every built-in module, or module_help("module_name") for the details of one. Custom modules are given by file path instead of name (see Creating Modules).

27.2.2 Choosing the output format

report() can write either a rendered HTML file or the intermediate Quarto source (qmd). The qmd option is useful when you want to edit the report before rendering, or render it yourself with different options:

report(paper,
       modules = c("stat_p_exact", "marginal"),
       output_file = "demo_report.html",
       output_format = "html")   # the default

report(paper,
       output_file = "demo_report.qmd",
       output_format = "qmd")    # write the Quarto source instead

Rendering to HTML requires Quarto to be installed. If rendering fails for any reason, report() falls back to saving the qmd source (and warns you), so you never lose the result of running the modules.
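Because HTML output depends on Quarto, you may prefer to choose the format up front rather than rely on the fallback. The sketch below is one way to do that. pick_format() is a hypothetical helper (not part of metacheck), and it assumes that finding a quarto executable on the PATH is a good enough sign that rendering will work.

```r
# pick_format() is a hypothetical helper, not part of metacheck.
# Assumption: a "quarto" executable on the PATH means HTML rendering can work.
pick_format <- function(quarto_found = nzchar(Sys.which("quarto"))) {
  if (quarto_found) "html" else "qmd"
}

fmt <- pick_format()
output_file <- file.path(tempdir(), paste0("demo_report.", fmt))

# Only call metacheck when it is installed, so the sketch runs anywhere.
if (requireNamespace("metacheck", quietly = TRUE)) {
  paper <- metacheck::demopaper()
  metacheck::report(
    paper,
    modules       = c("stat_p_exact", "marginal"),
    output_file   = output_file,
    output_format = fmt
  )
}
```

Writing to tempdir() keeps the example from cluttering your working directory; in real use you would pass your own output_file.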

27.3 Running a single module

report() is convenient, but it always writes a file. While you are exploring a paper interactively it is often handier to run one module and look at the result in the console. That is what module_run() does:

mo <- module_run(paper, "stat_p_exact")

The object it returns holds everything the module found. The pieces you will use most are the traffic light, the one-line summary, and the full results table:

mo$traffic_light
#> [1] "red"
mo$summary_text
#> [1] "We found 1 imprecise *p* value out of 3 detected *p* values."
head(mo$table) |>
  knitr::kable()
| text | text_id | paragraph_id | section_id | page_number | formatted | paper_id | header | section_type | p_comp | p_value | expanded | imprecise | zero |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| p = 0.005 | 15 | 4 | 3 | NA | NA | to_err_is_human | Procedure | method | = | 0.005 | On average researchers in the experimental (app) condition made fewer mistakes (M = 9.12) than researchers in the control (checklist) condition (M = 10.9), t(97.7) = 2.9, p = 0.005, d =0.59. | FALSE | FALSE |
| p =0.152 | 16 | 5 | 3 | NA | NA | to_err_is_human | Procedure | method | = | 0.152 | On average researchers in the experimental condition found the app marginally significantly more useful (M = 5.06) than researchers in the control condition found the checklist (M = 4.5), t(97.2) = -1.96, p =0.152. | FALSE | FALSE |
| p > .05 | 17 | 6 | 3 | NA | There was no effect of experience on the reduction in errors when using the tool (p > .05), as the correlation was non-significant (Figure 2). | to_err_is_human | Procedure | method | > | 0.050 | There was no effect of experience on the reduction in errors when using the tool (p > .05), as the correlation was non-significant (Figure 2). | TRUE | FALSE |
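Since the results table is an ordinary data frame, you can narrow it down to the flagged rows with base R. The sketch below uses a hand-built data frame as a stand-in for mo$table, reduced to a few of the columns shown above, with values copied from the demo output; with a real module result you would start from mo$table instead.

```r
# Stand-in for mo$table, reduced to a few of the columns shown above.
# Values are copied from the demo output; the real table has more columns.
tbl <- data.frame(
  text      = c("p = 0.005", "p =0.152", "p > .05"),
  p_comp    = c("=", "=", ">"),
  p_value   = c(0.005, 0.152, 0.050),
  imprecise = c(FALSE, FALSE, TRUE),
  zero      = c(FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only the rows flagged as imprecise or reported as exactly zero.
flagged <- tbl[tbl$imprecise | tbl$zero, c("text", "p_comp", "p_value")]
print(flagged)

# Count flagged values against all detected values.
n_flagged <- nrow(flagged)
n_total   <- nrow(tbl)
```

The counts match the summary text: one flagged value out of three detected.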

Printing the object gives a compact summary, and module_report() formats the same output as the Markdown section that would appear in a report:

module_report(mo) |> cat()

27.3.1 ⚠️ Exact P-Values

We found 1 imprecise p value out of 3 detected p values.

View detailed feedback

Reporting p values imprecisely (e.g., p < .05) reduces transparency, reproducibility, and re-use (e.g., in p value meta-analyses). Best practice is to report exact p-values with three decimal places (e.g., p = .032) unless p values are smaller than 0.001, in which case you can use p < .001.

#| echo: false


# table data --------------------------------------
table <- structure(list("P-Value" = "p > .05", Text = "There was no effect of experience on the reduction in errors when using the tool (p > .05), as the correlation was non-significant (Figure 2)."), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))

# display table -----------------------------------
metacheck::report_table(table, c(0.1, 0.9), 2, FALSE)

The APA manual states: Report exact p values (e.g., p = .031) to two or three decimal places. However, report p values less than .001 as p < .001. However, 2 decimals is too imprecise for many use-cases (e.g., a p value meta-analysis), so report p values with three digits.

American Psychological Association (2020). Publication manual of the American Psychological Association, 7 edition. American Psychological Association.

List any p-values reported with insufficient precision (e.g., p < .05 or p = n.s.) or reported as exactly zero (e.g., p = .000).

This module uses regular expressions to identify p-values. It will flag any values reported as p > ? or p < numbers greater than .001. It will also flag p-values reported as exactly zero (e.g., p = .000, p = 0.00), which are mathematically impossible — p-values are never exactly zero and should instead be reported as p < .001.

We try to exclude figure and table notes like “* p < .05”, but may not succeed at excluding all false positives.

This module was developed by Lisa DeBruine

Validation: In a sample of 225 papers containing 405 instances of non-exact p-values, the module correctly detected 269 cases (true positives) and incorrectly identified 78 (false positives). It missed 136 instances of imprecisely reported p-values (false negatives) and correctly identified 4557 cases of precisely reported p-values (true negative). Additionally, 78% of positive detections were correct (positive predictive value).

27.4 Running multiple modules

To run several modules and keep their combined output without writing a file, use report_module_run(). It takes the same modules vector as report() and returns a list of module outputs:

module_output <- report_module_run(
  paper,
  modules = c("stat_p_exact", "stat_p_nonsig", "marginal")
)

The result is a named list — one entry per module — that you can inspect individually:

names(module_output)
#> [1] "stat_p_exact"  "stat_p_nonsig" "marginal"
module_output[["marginal"]]$traffic_light
#> [1] "red"
module_output[["marginal"]]$summary_text
#> [1] "You described 2 effects with terms related to 'marginally significant'."
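With more than a handful of modules, indexing them one at a time gets tedious. One option is to collect the traffic lights and summaries into a single data frame. The sketch below uses a plain list as a stand-in for the result of report_module_run(), holding only the two fields it needs and only the values printed above; it assumes every module output exposes traffic_light and summary_text as single strings.

```r
# Stand-in for the list returned by report_module_run(); only the fields
# used below are included, with values taken from the output shown above.
module_output <- list(
  stat_p_exact = list(
    traffic_light = "red",
    summary_text  = "We found 1 imprecise *p* value out of 3 detected *p* values."
  ),
  marginal = list(
    traffic_light = "red",
    summary_text  = "You described 2 effects with terms related to 'marginally significant'."
  )
)

# One row per module: name, traffic light, and one-line summary.
overview <- data.frame(
  module        = names(module_output),
  traffic_light = vapply(module_output, function(m) m$traffic_light,
                         character(1), USE.NAMES = FALSE),
  summary_text  = vapply(module_output, function(m) m$summary_text,
                         character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(overview)
```

vapply() is used rather than sapply() so that a module returning something other than a single string fails loudly instead of silently changing the shape of the result.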

This is the same intermediate object report() builds internally. To turn it into a finished document yourself, pass it to report_qmd(), which produces the Quarto report text (header, summary section, and one section per module):

report_text <- report_qmd(module_output, paper)
writeLines(report_text, "demo_report.qmd")

Splitting the work this way — run the modules once with report_module_run(), then format with report_qmd() — is useful when the modules are slow (for example reference checks that query external services) and you want to render the report more than once without re-running them.
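If you want the module results to survive between R sessions, one possibility is to cache them on disk. The sketch below shows the pattern with base R's saveRDS() and readRDS(). run_cached() is a hypothetical helper (not part of metacheck), slow_run() stands in for a slow report_module_run() call, and the approach assumes that module output survives a round trip through an .rds file.

```r
# run_cached() is a hypothetical helper, not part of metacheck: it calls
# compute() once, stores the result as an .rds file, and reuses it afterwards.
run_cached <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

cache_file <- tempfile(fileext = ".rds")
n_runs <- 0

# Stand-in for a slow report_module_run() call; counts how often it runs.
slow_run <- function() {
  n_runs <<- n_runs + 1
  list(marginal = list(traffic_light = "red"))
}

first  <- run_cached(cache_file, slow_run)  # runs slow_run() and writes the cache
second <- run_cached(cache_file, slow_run)  # reads the cache; slow_run() is skipped
```

In real use, compute would be something like function() report_module_run(paper, modules = ...), and you would pass the cached result to report_qmd() each time you want to regenerate the report text. Delete the cache file whenever the paper or the module settings change.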

27.5 Passing settings to modules

This is the part that trips people up. Some modules take extra arguments beyond paper. For example:

  • power(paper, seed = 8675309) — the random seed used when sampling text for the LLM.
  • ref_replication(paper, show_outcomes = FALSE) — whether to include replication outcomes.
  • repo_check(paper, local_path = NULL, local_only = FALSE) — where to find the repository, and whether to skip online lookups.

module_help("module_name") shows the arguments a given module accepts:

module_help("power")
#> Power Analysis Check
#> 
#> This module uses uses regular expressions to identify sentences that contain a statistical power analysis. If specified by the user, it also uses a large language module (LLM) to extract information reported in power analyses, including the statistical test, sample size, alpha level, desired level of power, and magnitude and type of effect size.
#> 
#> module_run(paper, "power", seed = 8675309)
#> 
#> - paper: a paper object or paperlist object  
#> - seed: a seed for the LLM  
#> 
#> The Power Analysis Check module uses regular expressions to identify sentences that contain a statistical power analysis. Without the use of an LMM, the module uses regular expressions to classify the power analysis as a-priori, sensitivity or post-hoc. With the use of an LMM, it checks if the power analysis is reported with all required information.
#> 
#> The regular expressions can miss power analyses, or fail to classify them correctly. The type of power analysis is often difficult to classify, which can easily be solved by explicitly specifying the type of power analysis as 'a-priori', 'sensitivity', or 'post-hoc'. Note that 'post-hoc' or 'observed' power is rarely useful. The LMM can fail to identify information in the paper, and will not have access to information in paragraphs in the paper other than those that contain the word 'power'. This package was validated by the Metacheck team on articles in Psychological Science.
#> 
#> <validation>In a sample of 128 papers with 246 instances of power statements, 203 were correctly detected (true positives), 22 were missed (false negatives) and 21 were incorrectly detected (false positives). Overall, among all instances flagged as power statements, 90.6% were correct (positive predictive value).</validation>

How you supply these arguments depends on which function you are using.

27.5.1 One module: pass arguments directly

With module_run(), extra arguments are forwarded through the ... argument, exactly as if you were calling the module function itself:

module_run(paper, "power", seed = 123)
module_run(paper, "ref_replication", show_outcomes = TRUE)

27.5.2 Several modules: the args list

When you run several modules at once with report() or report_module_run(), you cannot use ..., because an argument like seed would be ambiguous — which module is it for? Instead you pass a single args argument: a named list of lists, where each name matches a module name and each value is the list of arguments for that module.

args <- list(
  power           = list(seed = 123),
  ref_replication = list(show_outcomes = TRUE),
  repo_check      = list(local_only = TRUE)
)

report_module_run(
  paper,
  modules = c("power", "ref_replication", "repo_check"),
  args = args
)

The same args list works with report():

report(
  paper,
  modules = c("power", "ref_replication", "repo_check"),
  args    = args
)
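To see why a named list of lists resolves the ambiguity, it can help to look at how such a structure might be dispatched. The sketch below is not metacheck's implementation: toy_power(), toy_exact(), and run_all() are hypothetical stand-ins that only illustrate looking up each module's arguments by name and falling back to defaults when there is no entry.

```r
# Two toy "modules"; real metacheck modules return much richer objects.
toy_power <- function(paper, seed = 8675309) paste("power with seed", seed)
toy_exact <- function(paper) "stat_p_exact with defaults"
toy_modules <- list(power = toy_power, stat_p_exact = toy_exact)

# run_all() is a hypothetical dispatcher, not metacheck's implementation.
run_all <- function(paper, modules, args = list()) {
  out <- lapply(modules, function(name) {
    extra <- args[[name]]                 # NULL when the module has no entry
    if (is.null(extra)) extra <- list()   # so the module runs with its defaults
    do.call(toy_modules[[name]], c(list(paper), extra))
  })
  names(out) <- modules
  out
}

res <- run_all(
  "demo paper",
  modules = c("power", "stat_p_exact"),
  args = list(
    power       = list(seed = 123),
    typo_module = list(x = 1)   # never looked up, so it has no effect
  )
)
print(res)
```

Because arguments are looked up by module name, an entry whose name matches nothing in modules is never consulted.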

A few rules make this easier to get right:

  • You only list the modules that need arguments. Modules you do not mention in args run with their defaults, even if they are in modules. So a module taking no special settings (like stat_p_exact) never needs an entry.
  • The names must match the module names in your modules vector. A mismatched name (a typo, or a module not actually being run) is silently ignored — its settings simply have no effect.
  • Each value is itself a list, even when there is a single argument: power = list(seed = 123), not power = 123.

Putting it together, here is a complete report that runs four modules, two of them with custom settings and two with their defaults:

report(
  paper,
  modules = c("stat_p_exact", "marginal", "power", "ref_replication"),
  args = list(
    power           = list(seed = 123),
    ref_replication = list(show_outcomes = TRUE)
  ),
  output_file = "demo_report.html"
)

27.6 Summary

| You want to… | Use |
|---|---|
| Run modules and write a finished report file | report(paper, modules, output_file, output_format, args) |
| Run one module and inspect the result | module_run(paper, module, ...) |
| Run several modules, keep the result, write no file | report_module_run(paper, modules, args) |
| Format module output as report text | report_qmd(module_output, paper) or module_report(mo) |
| Pass settings to one module | extra arguments to module_run() |
| Pass settings to several modules | the args named-list-of-lists |