Search the text of a paper object or a list of paper objects. Also works on the table returned by a previous text_search() call.
Usage
text_search(
paper,
pattern = ".*",
return = c("sentence", "paragraph", "section", "header", "match", "paper_id"),
ignore.case = TRUE,
fixed = FALSE,
perl = FALSE,
exclude = FALSE,
search_header = FALSE,
include_refs = FALSE
)
search_text(
paper,
pattern = ".*",
return = c("sentence", "paragraph", "section", "header", "match", "paper_id"),
ignore.case = TRUE,
fixed = FALSE,
perl = FALSE,
exclude = FALSE,
search_header = FALSE,
include_refs = FALSE
)
Arguments
- paper
a paper object or a list of paper objects
- pattern
the regex pattern to search for; if a vector of length > 1, each pattern is searched separately and the results are combined
- return
the kind of text to return: the full sentence, paragraph, section, or header that contains the match, just the (regex) match itself, or all body text for a paper (paper_id)
- ignore.case
whether to ignore case when text searching
- fixed
logical. If TRUE, pattern is a string to be matched as is. Overrides all conflicting arguments.
- perl
logical. Should Perl-compatible regexps be used?
- exclude
logical. If FALSE, text that matches the pattern is returned; if TRUE, matching text is excluded from the result
- search_header
whether to also search the header
- include_refs
whether to include the reference section in the search
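The ignore.case, fixed, and perl arguments share their names with arguments of base R's grepl(). Assuming they behave the same way here (this page does not confirm how they are implemented), the base R sketch below shows what each one changes. The sentences vector is made up for illustration, and combining multiple patterns as a union is likewise an assumption about what "combined" means.

```r
# Illustrative sentences (hypothetical, not taken from a real paper object)
sentences <- c(
  "This study tested 20 participants.",
  "The effect was significant (p = .04).",
  "A Study of studies."
)

# ignore.case = TRUE: "study" matches both "study" and "Study"
ci <- grepl("study", sentences, ignore.case = TRUE)

# ignore.case = FALSE: only the lowercase form matches
cs <- grepl("study", sentences, ignore.case = FALSE)

# fixed = TRUE: the pattern is a literal string, so "." means a literal dot
lit <- grepl(".04", sentences, fixed = TRUE)

# perl = TRUE: enables Perl-compatible syntax such as lookahead
pl <- grepl("p(?= =)", sentences, perl = TRUE)

# Several patterns searched separately, then combined
# (assumed here to be a union of the individual matches)
patterns <- c("study", "significant")
hits <- Reduce(`|`, lapply(patterns, grepl, x = sentences, ignore.case = TRUE))
```

Because fixed = TRUE treats the pattern literally, it cannot be meaningfully combined with regex-only options, which fits the note above that fixed overrides conflicting arguments.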
Details
The section argument can take a vector of section names or a Perl-compatible regular expression (use ".*" to match all sections). Possible section types are abstract, intro, method, results, discussion, references, acknowledgment, funding, endnote, footnote, table, figure, and unknown. The default includes all sections except references, tables, and figures.
Examples
paper <- demopaper()
all_text <- text_search(paper)
study <- text_search(paper, "study")
equations <- text_search(paper, "\\b\\S+\\s*(=|<)\\s*[0-9\\.]+", return = "match")
no_numbers <- text_search(paper, "\\d", exclude = TRUE)
