Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have an interesting question (perhaps interesting only to me :)). I have a string like:

"abbba"

The question is to find all possible substrings of length n in this string. For example, if n = 2, the substrings are

'ab','bb','ba'

and if n = 3, the substrings are

'abb','bbb','bba'

I thought to use something like this:

x <- 'abbba'
m <- matrix(strsplit(x, '')[[1]], nrow=2)
apply(m, 2, paste, collapse='')

But I got a warning, and it doesn't work for n = 3.


1 Answer

We may use

x <- "abbba"
allsubstr <- function(x, n) unique(substring(x, 1:(nchar(x) - n + 1), n:nchar(x)))
allsubstr(x, 2)
# [1] "ab" "bb" "ba"
allsubstr(x, 3)
# [1] "abb" "bbb" "bba"

where substring extracts the part of x between the given start and end positions. Because substring is vectorized, passing 1:(nchar(x) - n + 1) as the start positions and n:nchar(x) as the end positions extracts every window of length n in a single call. unique then drops repeated substrings, such as the second "bb" when n = 2.
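To make the pairing of start and end positions concrete, here is a small sketch that spells out the two index vectors for n = 3. It also includes a guarded variant, under the hypothetical name allsubstr_safe (not part of the original answer), on the assumption that an empty result is wanted when n is longer than the string; in that case 1:(nchar(x) - n + 1) would count downward instead of being empty.

```r
x <- "abbba"
n <- 3

# Start and end positions of every window of width n
starts <- 1:(nchar(x) - n + 1)   # 1 2 3
ends   <- n:nchar(x)             # 3 4 5

# substring() pairs the i-th start with the i-th end
windows <- substring(x, starts, ends)
windows
# [1] "abb" "bbb" "bba"

# Hypothetical guarded variant: returns character(0) when no window fits
allsubstr_safe <- function(x, n) {
  k <- nchar(x) - n + 1                      # number of windows of width n
  if (n < 1 || k < 1) return(character(0))   # assumption: empty result wanted
  starts <- seq_len(k)
  unique(substring(x, starts, starts + n - 1))
}

allsubstr_safe(x, 2)
# [1] "ab" "bb" "ba"
```

The explicit early return is one possible design choice; raising an error with stop() for an out-of-range n would be an equally reasonable alternative.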

