I expect the regex pattern ab{,2}c to match only an a followed by 0, 1 or 2 b's, followed by a c.
It works that way in lots of languages, for instance Python. However, in R:
grepl("ab{,2}c", c("ac", "abc", "abbc", "abbbc", "abbbbc"))
# [1] TRUE TRUE TRUE TRUE FALSE
I'm surprised by the 4th TRUE. In ?regex, I can read:
{n,m}
The preceding item is matched at least n times, but not more than m times.
So I agree that {,2} should be written {0,2} to be a valid pattern (unlike in Python, where the docs state explicitly that omitting n specifies a lower bound of zero).
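For comparison, here is a minimal sketch of the same call with the lower bound written out. It assumes the default regex engine (no perl = TRUE or fixed = TRUE), and the variable names x and res are only for illustration:

```r
# Same test strings as above; `x` and `res` are illustrative names.
x <- c("ac", "abc", "abbc", "abbbc", "abbbbc")

# Explicit lower bound: "a", then 0 to 2 "b"s, then "c".
res <- grepl("ab{0,2}c", x)
print(res)
# [1]  TRUE  TRUE  TRUE FALSE FALSE
```

With the explicit {0,2}, the 4th element is FALSE, which is the result I was expecting from {,2}.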
But then using {,2} should throw an error instead of returning misleading matches! Am I missing something, or should this be reported as a bug?