I was curious about:
> strsplit("ty,rr", split = ",")
[[1]]
[1] "ty" "rr"
> strsplit("ty|rr", split = "|")
[[1]]
[1] "t" "y" "|" "r" "r"
Why don't I get c("ty","rr") from strsplit("ty|rr", split="|")?
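It's because the split argument is interpreted as a regular expression, and | is a special character in a regex: it is the alternation ("or") operator. A bare "|" therefore means "the empty pattern or the empty pattern", which matches between every character.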
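To see why the result is a character-by-character split, it may help to compare a bare "|" with an empty split pattern. This is only a small sketch; the variable names x, empty_split and pipe_split are made up for illustration.

```r
x <- "ty|rr"

# An empty pattern splits the string into single characters.
empty_split <- strsplit(x, split = "")[[1]]

# A bare "|" is an alternation of two empty patterns, so it should
# behave like the empty pattern rather than like a literal pipe.
pipe_split <- strsplit(x, split = "|")[[1]]

print(pipe_split)
print(identical(empty_split, pipe_split))
```

If the two results are identical, the pipe was never treated as a literal character at all.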
To get round this, you have two options:
Option 1: Escape the |, i.e. split = "\\|" (the backslash must be doubled, because "\|" is not a valid escape in an R string)
strsplit("ty|rr", split = "\\|")
[[1]]
[1] "ty" "rr"
Option 2: Specify fixed = TRUE:
strsplit("ty|rr", split = "|", fixed = TRUE)
[[1]]
[1] "ty" "rr"
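As a quick check that both options agree, here is a minimal sketch run on a small character vector. The vector x and the result names are made up for illustration, and the character-class form "[|]" is a third variant not mentioned above, included only for comparison.

```r
x <- c("ty|rr", "a|b|c")

# Option 1: escape the pipe (double backslash inside an R string).
escaped <- strsplit(x, split = "\\|")

# Option 2: treat the split string literally, with no regex involved.
fixed <- strsplit(x, split = "|", fixed = TRUE)

# Extra variant (not from the answer above): inside a character
# class the pipe has no special meaning.
char_class <- strsplit(x, split = "[|]")

print(fixed)
print(identical(escaped, fixed))
print(identical(fixed, char_class))
```

If the separator is always a literal string, fixed = TRUE is probably the simplest choice, since no regex escaping has to be remembered.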
Please also note the See Also section of ?strsplit, which tells you to read ?"regular expression" for details of the pattern specification.