Finds non-ASCII characters in a given text column of a data frame, using the quick_conc() function to locate them. The results can optionally be sorted by the non-ASCII characters.
Arguments
- tbl
A data frame that contains the text data.
- id
A character string indicating the name of the identifier column in tbl (default is NULL).
- text
A character string indicating the name of the text column in tbl (default is "text").
- sort_by_chr
A logical indicating whether to sort the results by the non-ASCII characters (default is FALSE).
- ...
Arguments to pass on to quick_conc(). For example, you can extend the resulting search window with the argument n = 10.
Value
A tibble with the identified non-ASCII characters. Each row represents one instance of a non-ASCII character. If id is not NULL, the tibble also includes the identifier for each instance. If sort_by_chr is TRUE, the tibble is sorted by the non-ASCII characters.
Examples
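if (FALSE) {
  library(tibble)  # provides tibble()
  data <- tibble(id = 1:2,
                 text = c("This is a text with a non-ASCII character: é.", "Another text without."))
  # Report each non-ASCII character together with the id of the row it occurs in
  find_non_ascii(data, id = "id")
}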
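To make concrete what counts as a non-ASCII character here, the following is a minimal base-R sketch. It is not the implementation of find_non_ascii(), which relies on quick_conc(). The helper name has_non_ascii and the regular expression are assumptions made for illustration only.

```r
# Hypothetical helper, not part of the package: flags strings that contain
# at least one character outside the 7-bit ASCII range.
has_non_ascii <- function(x) {
  grepl("[^\\x01-\\x7F]", x, perl = TRUE)
}

texts <- c("This is a text with a non-ASCII character: \u00e9.",
           "Another text without.")

# One logical per input string
has_non_ascii(texts)

# Extract the offending characters themselves, one list element per string
regmatches(texts, gregexpr("[^\\x01-\\x7F]", texts, perl = TRUE))
```

Unlike this sketch, find_non_ascii() returns one row per instance and can carry the identifier column along, which makes it easier to trace a character back to its source row.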