trimws: Trim Leading or Trailing Whitespaces#

Description#

Removes whitespace (or other code points specified by the whitespace argument) from the left, right, or both sides of each string.

Usage#

trimws(x, which = "both", whitespace = "\\p{Wspace}")

Arguments#

x

character vector whose elements are to be trimmed

which

single string; one of "both", "left", or "right"; the side(s) from which code points matching the whitespace pattern are to be removed

whitespace

single string; specifies the set of Unicode code points to remove; see ‘Character Classes’ in about_search_regex for more details
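
As an illustration of how the which and whitespace arguments interact, here is a minimal sketch. It assumes the stringx package is installed; the input strings and the "[-=]" character set are made up for the example.

```r
# Illustrative inputs only; requires the stringx package.
x <- "\t  padded  \n"

stringx::trimws(x)                    # default: which = "both"
## [1] "padded"
stringx::trimws(x, which = "left")    # leading whitespace only
## [1] "padded  \n"
stringx::trimws(x, which = "right")   # trailing whitespace only
## [1] "\t  padded"

# A custom set of code points to strip instead of whitespace:
stringx::trimws("--==title==--", whitespace = "[-=]")
## [1] "title"
```

Only code points at the ends of each string are affected; matching characters in the interior are left in place.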

Details#

Not to be confused with strtrim, which truncates strings to a given width rather than removing whitespace.
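
The distinction can be seen in a short comparison. This sketch uses a made-up input and should behave the same with base R alone or with stringx attached:

```r
x <- "  hello world  "

strtrim(x, 7)   # truncates to a display width of 7; whitespace is kept
## [1] "  hello"
trimws(x)       # strips surrounding whitespace; length is otherwise untouched
## [1] "hello world"
```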

Value#

Returns a character vector (in UTF-8).

Differences from Base R#

A replacement for base trimws, implemented with stri_replace_all_regex (rather than stri_trim, which uses a slightly different syntax for pattern specifiers). It addresses the following shortcomings of the base R version:

  • the default whitespace argument in base R does not reflect the ‘contemporary’ definition of whitespace (e.g., it does not include zero-width spaces) [fixed here]

  • the base R implementation is not portable, as it relies on the system PCRE library (e.g., some Unicode classes may be unavailable, or their matching may depend on the current LC_CTYPE category) [fixed here]

  • no sanity checks are performed on the whitespace argument [fixed here]
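
To make the regex-based approach concrete, the following is a rough sketch of how trimming could be built on stri_replace_all_regex with an argument check. It is not the stringx source: trim_sketch is a hypothetical helper, and the exact patterns and checks used by the package may differ. It assumes the stringi package is installed.

```r
library(stringi)

# Hypothetical helper (NOT the actual stringx implementation):
# removes code points matching `whitespace` via anchored ICU regexes.
trim_sketch <- function(x, which = c("both", "left", "right"),
                        whitespace = "\\p{Wspace}") {
    which <- match.arg(which)
    # Sanity check on `whitespace`: must be a single non-missing string.
    stopifnot(is.character(whitespace), length(whitespace) == 1L,
              !is.na(whitespace))
    pattern <- switch(which,
        left  = sprintf("^(?:%s)+", whitespace),
        right = sprintf("(?:%s)+$", whitespace),
        both  = sprintf("^(?:%s)+|(?:%s)+$", whitespace, whitespace))
    stri_replace_all_regex(x, pattern, "")
}

trim_sketch("   :)\v\u00a0 \n\r\t")
## [1] ":)"
```

Because the matching is done by ICU (bundled with stringi) rather than the system PCRE library, the \p{Wspace} class is resolved the same way regardless of platform or locale, and an invalid whitespace value such as NA fails the check instead of being coerced into a pattern.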

Author(s)#

Marek Gagolewski

See Also#

The official online manual of stringx at https://stringx.gagolewski.com/

Related function(s): sub

Examples#

base::trimws("NAAAAANA!!!NANAAAAA", whitespace=NA)  # stringx raises an error
## [1] "NA!!!NA"
x <- "   :)\v\u00a0 \n\r\t"
base::trimws(x)
## [1] ":)\v "
stringx::trimws(x)
## [1] ":)"