sprintf: Format Strings
sprintf creates strings from a given template and the arguments provided. A new function (present in C and many other languages),
printf, displays formatted strings.
sprintf(fmt, ..., na_string = NA_character_)
printf(fmt, ..., file = "", sep = "\n", append = FALSE, na_string = "NA")
fmt: character vector of format strings
...: vectors with data to format (coercible to integer, real, or character)
na_string: single string to represent missing values; if NA, missing values in ... result in missing outputs
Note that the purpose of printf is to display a string, not to create a new one for use elsewhere. Therefore, as an exception, this function treats missing values as "NA" strings.
sprintf returns a character vector (in UTF-8). No attributes are preserved.
printf returns ‘nothing’.
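The handling of missing values described above can be sketched as follows. This is a minimal illustration, not a definitive reference: the calls to stringx rely only on the usage shown above and are guarded (an assumption of this sketch) so that they run only when the stringx package is installed. Only the base R result is stated in a comment.

```r
# Base R turns a missing value into the string "NA".
base_out <- base::sprintf("%d", c(1L, NA_integer_))
print(base_out)  # "1" "NA"

# Guard added for this sketch: call stringx only if it is available.
if (requireNamespace("stringx", quietly = TRUE)) {
    # na_string defaults to NA_character_ in stringx::sprintf ...
    print(stringx::sprintf("%d", c(1L, NA_integer_)))
    # ... and can be set to "NA" explicitly to mimic base R.
    print(stringx::sprintf("%d", c(1L, NA_integer_), na_string = "NA"))
}
```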
Differences from Base R
missing values in ... are treated as "NA" strings [fixed in sprintf, left in printf, but see the na_string argument]
partial recycling results in an error [fixed here – warning given]
input objects’ attributes are not preserved [not fixed, somewhat tricky]
in to-string conversions, field widths and precisions are interpreted as bytes which is of course problematic for text in UTF-8 [fixed by interpreting these as Unicode code point widths]
fmt is limited to 8192 bytes and the number of arguments passed via ... to 99 (note that we can easily exceed this limit by using do.call) [rewritten from scratch, there is no limit anymore]
unused values in ... are evaluated anyway (should not evaluation be lazy?) [not fixed here because this is somewhat questionable; in both base R and our case, a warning is given if this is the case; moreover, the length of the longest argument always determines the length of the output]
coercion of each argument can only be done once [fixed here - can coerce to integer, real, and character]
either width or precision can be fetched from ..., but not both [fixed here - two asterisks are allowed in format specifiers]
NaNs are not prefixed by a sign/space even if we explicitly request this [fixed here - prefixed by a space]
the outputs are implementation-dependent; the format strings are passed down to the system (C library) sprintf function [not fixed here (yet), but the format specifiers are normalised more eagerly]
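A few of the base R behaviours listed above can be explored with base R alone. The sketch below is illustrative only: the variable names are introduced here, the labels are converted with enc2utf8 so that byte counts refer to UTF-8 storage, and the argument-limit check merely reports whether an error occurred rather than assuming a particular message.

```r
# Byte counts versus code point counts for some non-ASCII labels.
l <- enc2utf8(c("e", "e\u00b2", "\u03c0", "\u03c0\u00b2"))
n_bytes <- nchar(l, type = "bytes")  # UTF-8 storage size: 1 3 2 4
n_chars <- nchar(l, type = "chars")  # number of code points: 1 2 1 2
print(data.frame(n_bytes, n_chars))

# A field width fetched from the argument list with a single asterisk.
padded <- base::sprintf("%*d", 5L, 42L)
print(padded)  # "   42"

# Passing many arguments via do.call; per the list above, base R is
# expected to refuse this, so the error (if any) is captured.
many_args <- c(list("%d"), as.list(1:150))
res <- tryCatch(do.call(base::sprintf, many_args), error = function(e) e)
print(inherits(res, "error"))
```

The byte and code point counts differ for every label that contains a non-ASCII character, which is why a byte-based field width misaligns such text.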
The official online manual of stringx at https://stringx.gagolewski.com/
# UTF-8 number of bytes vs. Unicode code point width:
l <- c("e", "e\u00b2", "\u03c0", "\u03c0\u00b2", "\U0001f602\U0001f603")
r <- c(exp(1), exp(2), pi, pi^2, NaN)
cat(base::sprintf("%8s=%+.3f", l, r), sep="\n")
## e=+2.718
## e²=+7.389
## π=+3.142
## π²=+9.870
## 😂😃=NaN
cat(stringx::sprintf("%8s=%+.3f", l, r), sep="\n")
## e=+2.718
## e²=+7.389
## π=+3.142
## π²=+9.870
## 😂😃= NaN

# coercion of the same argument to different types:
stringx::printf(
    c("UNIX time %1$f is %1$s.", "%1$s is %1$f UNIX time."),
    Sys.time()
)
## UNIX time 1630628750.107000 is 2021-09-03T10:25:50+1000.
## 2021-09-03T10:25:50+1000 is 1630628750.107000 UNIX time.