This function identifies spelling mistakes in a text column of a data frame, suggests corrections, and adds them as new columns. It uses the hunspell package both to check spelling and to generate suggestions. The output can be exported to a spreadsheet for easier editing.
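
To make the check-then-suggest idea concrete, here is a minimal sketch that calls hunspell directly. It is not the function's actual implementation; the variable names and the typo/alt column layout are assumptions made for illustration.

```r
# Sketch only: illustrates the hunspell calls, not this package's internals.
library(hunspell)

text <- c("This is a smaple text.", "Another txt with a typo.")
dict <- dictionary("en_US")

# hunspell() returns one character vector of unrecognised words per input string
flagged <- hunspell(text, dict = dict)
words <- unique(unlist(flagged))

# hunspell_suggest() returns one character vector of candidates per word
candidates <- hunspell_suggest(words, dict = dict)

# Collapse each set of candidates into a comma-separated string
result <- data.frame(
  typo = words,
  alt = vapply(candidates, paste, character(1), collapse = ", "),
  stringsAsFactors = FALSE
)
print(result)
```

Which words are flagged and which candidates come back depend on the installed dictionary.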

Usage

add_spell_suggestions(
  tbl,
  x = "text",
  window = 5,
  dict = "en_US",
  change_to = NULL
)

Arguments

tbl

A data frame that contains the text data.

x

A character string indicating the name of the text column in tbl (default is "text").

window

An integer specifying the context window size passed to the quick_conc function (default is 5).

dict

A character string indicating the dictionary to be used by hunspell (default is "en_US").

change_to

A character string naming the column ("alt_01" or "typo") whose values are used to fill the change_to column (default is NULL, in which case no change_to column is added).
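
The window argument can be pictured with a small base R sketch. The helper below, context_window, is hypothetical and is not quick_conc; it assumes the window counts whitespace-separated tokens on each side of the flagged word, which may differ from what quick_conc actually does.

```r
# Hypothetical helper, not the package's quick_conc(): returns up to `window`
# whitespace-separated tokens on each side of the first match of `word`.
context_window <- function(text, word, window = 5) {
  tokens <- strsplit(text, "\\s+")[[1]]
  # Strip punctuation before comparing so "smaple," still matches "smaple"
  hit <- which(gsub("[[:punct:]]", "", tokens) == word)[1]
  if (is.na(hit)) {
    return(NULL)
  }
  left <- tail(tokens[seq_len(hit - 1)], window)
  right <- head(tokens[-seq_len(hit)], window)
  list(
    left = paste(left, collapse = " "),
    match = tokens[hit],
    right = paste(right, collapse = " ")
  )
}

ctx <- context_window("This is a smaple text.", "smaple", window = 2)
str(ctx)
```

A larger window gives an editor more surrounding text to judge whether a flagged word is really a mistake, at the cost of wider columns in the output.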

Value

A tibble of identified spelling mistakes and their suggested corrections. Each row represents one instance of a spelling mistake, and each cell in the 'alt' column contains a comma-separated list of spelling suggestions.

If change_to is specified, an additional column 'change_to' is included, holding either the original incorrect spelling (when change_to = "typo") or the function's best guess at the correct spelling (when change_to = "alt_01").

The change_to column can be useful when exporting to a spreadsheet app for editing.
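
A spreadsheet round trip might look like the base R sketch below. The data frame is a hand-built stand-in for the function's output: its column names and suggestion values are invented for illustration and are not captured results.

```r
# Illustrative stand-in for the function's output; column names and values
# are assumptions, not real results.
suggestions <- data.frame(
  typo = c("smaple", "txt"),
  alt = c("sample, maple", "text, tat"),
  stringsAsFactors = FALSE
)

# Analogous to change_to = "typo": start from the original spelling
suggestions$change_to <- suggestions$typo

# Analogous to change_to = "alt_01": start from the first suggestion instead
first_alt <- trimws(sub(",.*$", "", suggestions$alt))

# Write a CSV that any spreadsheet app can open
out_file <- tempfile(fileext = ".csv")
write.csv(suggestions, out_file, row.names = FALSE)

# After editing the change_to column by hand, read the file back
edited <- read.csv(out_file, stringsAsFactors = FALSE)
print(edited)
```

Starting from the original spelling suits cases where most flagged words are actually correct, such as names or jargon; starting from the first suggestion suits texts with many genuine typos.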

Examples

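if (FALSE) {
# tibble() comes from the tibble package
library(tibble)

# Two short texts containing deliberate misspellings
data <- tibble(text = c("This is a smaple text.", "Another txt with a typo."))
add_spell_suggestions(data)
}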