Identify Irregular Spellings and Suggest Corrections to a Table of Text
Source: R/add_spell_suggestions.R
add_spell_suggestions.Rd
This function identifies spelling mistakes in a given text column of a data frame, suggests corrections, and adds them as new columns. It uses the hunspell package for spell checking and for generating suggestions.
The output can be exported to a spreadsheet for easier editing.
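The check-and-suggest step described above can be sketched directly with hunspell. This is only an illustration of the underlying calls, not the function's actual implementation; the variable names and the flagged data frame are assumptions made for the example, and the context window produced via quick_conc is not reproduced here.

```r
library(hunspell)

text <- c("This is a smaple text.", "Another txt with a typo.")

# hunspell() returns one character vector of unrecognised words per input string
typos_per_row <- hunspell(text, dict = dictionary("en_US"))

# Flatten to one row per flagged word, remembering which input row it came from
flagged <- data.frame(
  row  = rep(seq_along(text), lengths(typos_per_row)),
  typo = unlist(typos_per_row),
  stringsAsFactors = FALSE
)

# hunspell_suggest() returns one character vector of candidates per word
suggestions <- hunspell_suggest(flagged$typo, dict = dictionary("en_US"))

# Collapse the candidates into one comma-separated string per flagged word
flagged$alt <- vapply(suggestions, paste, character(1), collapse = ", ")

print(flagged)
```

Keeping the row index alongside each flagged word is one way to trace a suggestion back to the text it came from.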
Arguments
- tbl
A data frame that contains the text data.
- x
A character string giving the name of the text column in tbl (default is "text").
- window
An integer specifying the context window size for the quick_conc function (default is 5).
- dict
A character string indicating the dictionary to be used by hunspell (default is "en_US").
- change_to
A character string, either "alt_01" or "typo", naming the column whose values are used for the change_to column (default is NULL).
Value
A tibble of identified spelling mistakes and their suggested corrections. Each row represents one instance of a spelling mistake, and each cell in the 'alt' column contains a comma-separated list of spelling suggestions.
If change_to is specified, an additional column 'change_to' is included, holding either the original incorrect spelling (when change_to = "typo") or the function's best guess at the correct spelling (when change_to = "alt_01").
This can be useful when exporting to a spreadsheet app for editing.
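The effect of change_to can be sketched in base R as below. The suggestions data frame is a hypothetical stand-in for the function's output, with made-up values and only the two column names mentioned above; fill_change_to is an illustrative helper, not part of the package.

```r
# Hypothetical output: column names follow the description above, values are illustrative
suggestions <- data.frame(
  typo   = c("smaple", "txt"),
  alt_01 = c("sample", "text"),
  stringsAsFactors = FALSE
)

# Illustrative helper: copy the chosen column into a new 'change_to' column
fill_change_to <- function(tbl, change_to = NULL) {
  # With change_to = NULL no extra column is added
  if (is.null(change_to)) return(tbl)
  change_to <- match.arg(change_to, c("alt_01", "typo"))
  tbl$change_to <- tbl[[change_to]]
  tbl
}

out <- fill_change_to(suggestions, change_to = "alt_01")

# Write a CSV that a spreadsheet app can open for manual review
out_file <- tempfile(fileext = ".csv")
write.csv(out, out_file, row.names = FALSE)
```

Pre-filling with "alt_01" suits a review where most first suggestions are expected to be right, while "typo" leaves the original spelling in place so that only genuine errors need to be overwritten.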
Examples
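if (FALSE) {
library(tibble)  # provides tibble()
# Two short texts with deliberate misspellings ("smaple", "txt")
data <- tibble(text = c("This is a smaple text.", "Another txt with a typo."))
# Uses the defaults: x = "text", window = 5, dict = "en_US", change_to = NULL
add_spell_suggestions(data)
}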