Add the hesitation marker <HSTN> to a vector of tokenized strings.
Arguments
- x
A character vector of tokenized strings. <HSTN> is added to its elements where necessary.
- regex
A regular expression that identifies hesitation tokens (default "\berm?\b|\berm?|\bum\b|\bum"). Matching is case-insensitive by default.
Examples
dtag_hesitation(c("I'm", "not", "sure", ".", "Um", ",", "no"))
#> [1] "I'm" "not" "sure" "." "Um <HSTN>" ","
#> [7] "no"
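
To make the matching behaviour concrete, here is a rough base-R sketch of what the example above appears to do. It is not the package's source: the name add_hstn_sketch is hypothetical, and reading "where necessary" as "skip tokens that already carry the marker" is an assumption.

```r
# Hypothetical illustration only; not the implementation of dtag_hesitation().
add_hstn_sketch <- function(x, regex = "\\berm?\\b|\\berm?|\\bum\\b|\\bum") {
  # Case-insensitive match against each token, mirroring the documented default
  hit <- grepl(regex, x, ignore.case = TRUE, perl = TRUE)
  # Assumption: a token that already contains the marker is left unchanged
  needs_tag <- hit & !grepl("<HSTN>", x, fixed = TRUE)
  x[needs_tag] <- paste(x[needs_tag], "<HSTN>")
  x
}

tokens <- c("I'm", "not", "sure", ".", "Um", ",", "no")
add_hstn_sketch(tokens)
```

Because the default pattern includes the alternatives \berm? and \bum without a trailing word boundary, this sketch would also tag tokens that merely begin with those letters, such as "umbrella". A stricter pattern can be passed through regex if that is not wanted.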