scrubc() removes non-numeric characters from test scores.

swapc() replaces non-numeric characters with numeric ones.

scrubc(c, re = "[^-.0-9]")

swapc(c, re = "[Kk]", repl = "0")



c: A character vector of potentially numeric values.


re: A regular expression specifying which characters to operate on. For best effect, re should begin with "[" and end with "]" so that it specifies a character set. This is not currently enforced; a check may be added at some point.


repl: A single character to be used as a replacement.
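The bracket check mentioned for re does not exist yet. As a rough sketch only, it might look like the following. The name is_char_set() is hypothetical and not part of the package, and the test is purely superficial: it looks at the first and last character and does not validate the regular expression itself.

```r
# Hypothetical helper, not part of the package: TRUE when `re` is a single
# string that starts with "[" and ends with "]", e.g. "[Kk]" or "[^-.0-9]".
is_char_set <- function(re) {
  is.character(re) && length(re) == 1 &&
    startsWith(re, "[") && endsWith(re, "]")
}

ok_default <- is_char_set("[^-.0-9]")  # bracketed character set
ok_swap    <- is_char_set("[Kk]")      # bracketed character set
bad_plain  <- is_char_set("K")         # no brackets
```

A function such as scrubc() could call a check like this at the top and stop with an informative message when it fails.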


scrubc() removes non-numeric characters from test scores, which are often introduced by careless data entry, in anticipation of converting the scores to numeric. It reports the number of characters removed. Characters to be removed are specified by a regular expression (re). The default re is "[^-.0-9]", which removes every character other than digits, the minus sign, and the decimal point in one pass.
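The behaviour described above can be approximated in base R. This is a minimal sketch under stated assumptions, not the package's actual implementation: scrub_sketch() is a hypothetical name, and the wording of its message is illustrative only.

```r
# Hypothetical stand-in for scrubc(); not the package's implementation.
scrub_sketch <- function(x, re = "[^-.0-9]") {
  # Collect every match of `re` across all elements of x.
  hits <- unlist(regmatches(x, gregexpr(re, x)))
  message("Removing ", length(hits), " characters")
  # Delete all matches, leaving only the characters `re` does not match.
  gsub(re, "", x)
}

scores  <- c("0.1", "_0.8", "1.6", "2.1`", "+3. 0", "12.9")
cleaned <- scrub_sketch(scores)
values  <- as.numeric(cleaned)  # conversion now succeeds for every element
```

Counting the matches before deleting them is what makes the report possible; gsub() alone returns the cleaned strings without saying how much was removed.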

A typical use case in interactive work is to remove exactly one specific character at a time, e.g., scrubc(c=c("*K.1", "K.8", "1.6", "2.1", "3.0", ">12.9"), re="[>]"). This alerts the analyst to how many of each non-numeric character were removed.
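Before removing characters one at a time, it can help to see which stray characters are present and how often. The following is a base R sketch for that purpose, not a feature of scrubc(); the variable names are illustrative.

```r
scores <- c("*K.1", "K.8", "1.6", "2.1", "3.0", ">12.9")

# Extract every character that is not a digit, minus sign, or decimal point.
stray <- unlist(regmatches(scores, gregexpr("[^-.0-9]", scores)))

# Tabulate the stray characters so each can be handled deliberately.
counts <- table(stray)
counts
```

Each character that appears in the table is a candidate for its own scrubc() call with a one-character re, or for swapc() when it stands for a digit.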

swapc() replaces one character in a string with another in anticipation of converting to numeric. One use case is replacing "K" with "0" in grade-equivalent test scores, where "K" sometimes appears among otherwise numeric standardized test scores. It reports the number of characters replaced. Characters to be replaced are specified by a regular expression (re).
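As a rough base R illustration of the replacement step, and of why a removal step is usually still needed afterwards, consider the sketch below. swap_sketch() is a hypothetical name, not the package's implementation, and its message wording is illustrative only.

```r
# Hypothetical stand-in for swapc(); not the package's implementation.
swap_sketch <- function(x, re = "[Kk]", repl = "0") {
  # Count the matches first so the number of replacements can be reported.
  n <- length(unlist(regmatches(x, gregexpr(re, x))))
  message("Replacing ", n, " characters matching ", re, " with '", repl, "'")
  gsub(re, repl, x)
}

scores  <- c("<K.1", "K.8", "1.6", "2.1", "3.0", ">12.9")
swapped <- swap_sketch(scores)

# "<" and ">" are still present, so strip them before converting.
values <- as.numeric(gsub("[^-.0-9]", "", swapped))
```

Swapping before scrubbing matters here: removing "K" outright would turn "K.8" into ".8", which happens to parse, but the replacement makes the intended value explicit.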

See also

str_remove_all, cleanNumbers



c <- c("0.1", "_0.8", "1.6", "2.1`", "+3. 0", "12.9")
scrubc(c)
#> Removing 4 characters from "_" "`" "+" " "
#>
#> [1] "0.1" "0.8" "1.6" "2.1" "3.0" "12.9"
c <- c("<K.1", "K.8", "1.6", "2.1", "3.0", ">12.9")
swapc(c)
#> Replacing 2 characters from [Kk] with '0'.
#>
#> [1] "<0.1" "0.8" "1.6" "2.1" "3.0" ">12.9"