Hot patch release to resolve R CMD check failures.
str_interp() now renders lists consistently, independent of the presence of additional placeholders (@amhrasmussen).
New str_starts() and str_ends() functions to detect patterns at the beginning or end of strings (@jonthegeek, #258).
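A minimal sketch of the two helpers, assuming a stringr version that includes them is installed. The vector x is made up for illustration and is not part of the release notes:

```r
library(stringr)

# Hypothetical input vector, chosen only for illustration
x <- c("apple", "banana", "pear", "pineapple")

# TRUE where the string begins with "p"
starts_p <- str_starts(x, "p")

# TRUE where the string ends with "e"
ends_e <- str_ends(x, "e")
```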
str_subset(), str_detect(), and str_which() gain a negate argument, which is useful when you want the elements that do NOT match (#259, @yutannihilation).
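As a rough illustration of negate, the following sketch inverts the match for each of the three functions. The input vector is hypothetical:

```r
library(stringr)

# Hypothetical input vector; only "banana" contains "an"
x <- c("apple", "banana", "pear", "pineapple")

# Logical vector: TRUE where the pattern does NOT match
no_match <- str_detect(x, "an", negate = TRUE)

# The non-matching elements themselves
kept <- str_subset(x, "an", negate = TRUE)

# Integer positions of the non-matching elements
positions <- str_which(x, "an", negate = TRUE)
```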
New str_to_sentence() function to capitalize with sentence case (@jonthegeek, #202).
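A short sketch of sentence casing on a made-up string, assuming the default locale:

```r
library(stringr)

# Only the first letter of the sentence is upper case; the rest is lowered
sentence <- str_to_sentence("the quick brown FOX.")
```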
str_replace_all() with a named vector now respects modifier functions (#207).
str_trunc() is once again vectorised correctly (#203, @austin3dickey).
str_view() handles NA values more gracefully (#217). I’ve also tweaked the sizing policy so that it should work better in notebooks, while preserving the existing behaviour in knit documents (#232).
During package build, you may see Error : object ‘ignore.case’ is not exported by 'namespace:stringr'. This is because the long deprecated str_join(), ignore.case() and perl() have now been removed.
str_glue() and str_glue_data() provide convenient wrappers around glue() and glue_data() from the glue package (#157).
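The wrappers can be sketched as follows, assuming the glue package is installed alongside stringr. The variable name and the list contents are invented for the example:

```r
library(stringr)

# str_glue() interpolates expressions in {} from the calling environment
name <- "Fred"
greeting <- str_glue("My name is {name}.")

# str_glue_data() looks the names up in the supplied list (or data frame)
records <- list(n = c("a", "b"), v = 1:2)
labels <- str_glue_data(records, "{n} has value {v}")
```

Both calls return glue objects; as.character() converts them to plain character vectors.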
str_flatten() is a wrapper around stri_flatten() and clearly conveys flattening a character vector into a single string (#186).
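A brief sketch of flattening, with and without a separator; the input is illustrative:

```r
library(stringr)

letters3 <- c("a", "b", "c")

# Default: concatenate with no separator
joined <- str_flatten(letters3)

# collapse supplies the separator placed between elements
listed <- str_flatten(letters3, collapse = ", ")
```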
New str_remove() and str_remove_all() functions. These wrap str_replace() and str_replace_all() to remove patterns from strings (@Shians, #178).
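The difference between the two can be sketched on a single made-up string:

```r
library(stringr)

# Removes only the first match of "an"
first_removed <- str_remove("banana", "an")

# Removes every match of "an"
all_removed <- str_remove_all("banana", "an")
```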
str_squish() removes spaces from both the left and right side of strings, and also converts multiple spaces (or space-like characters) to a single space within strings (@stephlocke, #197).
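A small sketch of the trimming and collapsing behaviour, using an invented string that mixes spaces and a tab:

```r
library(stringr)

# Leading/trailing whitespace is dropped; interior runs become one space
squished <- str_squish("  a   b \t c  ")
```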
str_sub() gains an omit_na argument for ignoring NA. Accordingly, str_replace() now ignores NAs and keeps the original strings (@yutannihilation, #164).
str_trunc() now preserves NAs (@ClaytonJY, #162).
str_trunc() now throws an error when width is shorter than ellipsis (@ClaytonJY, #163).
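The two str_trunc() changes can be sketched as below, assuming the default ellipsis of "...". The inputs are made up, and the error is caught with tryCatch() so the example runs to completion:

```r
library(stringr)

# NA stays NA; the long string is cut to 10 characters including "..."
truncated <- str_trunc(c("This is a long string", NA), 10)

# A width of 2 is shorter than the three-character ellipsis, so this errors
result <- tryCatch(
  str_trunc("abcdef", 2),
  error = function(e) "error"
)
```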
Long deprecated str_join(), ignore.case() and perl() have now been removed.
str_match_all() now returns NA if an optional group doesn’t match (previously it returned ""). This is more consistent with str_match() and other match failures (#134).
In str_replace(), replacement can now be a function that is called once for each match and whose return value is used to replace the match.
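A minimal sketch of a function replacement, using base R's toupper() as the replacement function. The str_replace_all() call is included on the assumption that it accepts a function in the same way:

```r
library(stringr)

# The function receives each matched text and returns its replacement
first_vowel <- str_replace("banana", "[aeiou]", toupper)

# Assumed to behave the same way for every match
all_vowels <- str_replace_all("banana", "[aeiou]", toupper)
```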
New str_which() mimics grep() (#129).
A new vignette (vignette("regular-expressions")) describes the details of the regular expressions supported by stringr. The main vignette (vignette("stringr")) has been updated to give a high-level overview of the package.
str_order() and str_sort() gain an explicit numeric argument for sorting mixed numbers and strings.
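The effect of numeric can be sketched on a made-up vector of labels with embedded numbers, assuming the default "en" locale:

```r
library(stringr)

items <- c("item10", "item2", "item1")

# Default: digits are compared character by character
alphabetical <- str_sort(items)

# numeric = TRUE: runs of digits are compared as numbers
natural <- str_sort(items, numeric = TRUE)
```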
str_replace_all() now throws an error if replacement is not a character vector. If replacement is NA_character_ it replaces the complete string with NA (#124).
All functions that take a locale (e.g. str_to_lower() and str_sort()) default to “en” (English) to ensure that the default is consistent across platforms.
Add sample datasets: fruit, words and sentences.
fixed(), regex(), and coll() now throw an error if you use them with anything other than a plain string (#60). I’ve clarified that the replacement for perl() is regex() not regexp() (#61). boundary() has improved defaults when splitting on non-word boundaries (#58, @lmullen).
str_detect() can now detect boundaries (by checking for a str_count() > 0) (#120). str_subset() works similarly.
str_extract() and str_extract_all() now work with boundary(). This is particularly useful if you want to extract logical constructs like words or sentences.
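Extracting words with boundary() might look like the sketch below. The sentence is invented, and the results assume boundary("word") skips punctuation and whitespace by default:

```r
library(stringr)

text <- "The quick brown fox."

# First word only
first_word <- str_extract(text, boundary("word"))

# All words; str_extract_all() returns a list with one element per input
all_words <- str_extract_all(text, boundary("word"))[[1]]
```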
str_extract_all() respects the simplify argument when used with fixed() matches.
str_subset() now respects custom options for fixed() patterns (#79, @gagolews).
str_replace() and str_replace_all() now behave correctly when a replacement string contains $s, \\\\1, etc. (#83, #99).
str_split() gains a simplify argument to match str_extract_all() etc.
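A sketch of simplify in str_split(), on made-up inputs of unequal length. With simplify = TRUE the result is a character matrix rather than a list, and shorter rows are padded with empty strings:

```r
library(stringr)

# One row per input string, one column per piece
parts <- str_split(c("a b", "c d e"), " ", simplify = TRUE)
```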
str_view() and str_view_all() create HTML widgets that display regular expression matches (#96).
word() returns NA for indexes greater than the number of words (#112).
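The word() behaviour can be sketched on two invented strings, one of which has fewer words than the requested index:

```r
library(stringr)

# Second word of a three-word string
second <- word("one two three", 2)

# Asking for the third word of a two-word string gives NA
missing_word <- word("one two", 3)
```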
stringr is now powered by stringi instead of base R regular expressions. This improves Unicode support, and makes most operations considerably faster. If you find stringr inadequate for your string processing needs, I highly recommend looking at stringi in more detail.
stringr gains a vignette, currently a straightforward update of the article that appeared in the R Journal.
str_c() now returns a zero length vector if any of its inputs are zero length vectors. This is consistent with all other functions, and standard R recycling rules. Similarly, using str_c("x", NA) now yields NA. If you want "xNA", use str_replace_na() on the inputs.
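The three cases described for str_c() can be sketched as:

```r
library(stringr)

# A zero-length input gives a zero-length result
empty <- str_c("x", character(0))

# NA is contagious
with_na <- str_c("x", NA)

# Convert NA to the string "NA" first if "xNA" is what you want
literal_na <- str_c("x", str_replace_na(NA))
```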
str_replace_all() gains a convenient syntax for applying multiple pairs of pattern and replacement to the same vector:
input <- c("abc", "def")
str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
str_match() now returns NA if an optional group doesn’t match (previously it returned ""). This is more consistent with str_extract() and other match failures.
New str_subset() keeps values that match a pattern. It’s a convenient wrapper for x[str_detect(x, pattern)] (#21, @jiho).
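The equivalence with subsetting by str_detect() can be sketched on a made-up vector:

```r
library(stringr)

x <- c("apple", "kiwi", "banana")

# Keep the elements that contain "an"
via_subset <- str_subset(x, "an")

# The longer form it wraps
via_detect <- x[str_detect(x, "an")]
```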
New str_order() and str_sort() allow you to sort and order strings in a specified locale.
New str_conv() to convert strings from specified encoding to UTF-8.
New modifier boundary() allows you to count, locate and split by character, word, line and sentence boundaries.
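Counting and splitting with boundary() might look like this sketch; the input strings are invented:

```r
library(stringr)

text <- "one two three"

# Number of words
n_words <- str_count(text, boundary("word"))

# Split into words; str_split() returns a list with one element per input
words_split <- str_split(text, boundary("word"))[[1]]

# Number of characters, counted at character boundaries
n_chars <- str_count("abc", boundary("character"))
```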
The documentation got a lot of love, and very similar functions (e.g. first and all variants) are now documented together. This should hopefully make it easier to locate the function you need.
ignore.case(x) has been deprecated in favour of fixed|regex|coll(x, ignore.case = TRUE), perl(x) has been deprecated in favour of regex(x).
str_join() is deprecated, please use str_c() instead.
fixed path in str_wrap example so it works for more R installations.
remove dependency on plyr
Zero input to str_split_fixed returns 0 row matrix with n columns
Export str_join
new modifier perl that switches to Perl regular expressions
str_match now uses new base function regmatches to extract matches - this should hopefully be faster than my previous pure R algorithm
new str_wrap function which gives strwrap output in a more convenient format
new word function extracts words from a string given a user defined separator (thanks to suggestion by David Cooper)
str_locate now returns consistent type when matching empty string (thanks to Stavros Macrakis)
new str_count counts number of matches in a string.
str_pad and str_trim receive performance tweaks - for large vectors this should give at least a two order of magnitude speed up
str_length returns NA for invalid multibyte strings
fix small bug in internal recyclable function