regex - Extract all numbers from a single string in R


asked Oct 17, 2021 in Technique by 深蓝 (71.8m points)

Let's imagine you have a string:

strLine <- "The transactions (on your account) were as follows: 0 3,000 (500) 0 2.25 (1,200)"

Is there a function that strips out the numbers into an array/vector producing the following required solution:

result <- c(0, 3000, -500, 0, 2.25, -1200)?

i.e.

result[3] = -500

Note that the numbers are presented in accounting form, so negative numbers appear between (). You can also assume that only numbers appear to the right of the first occurrence of a number. I am not that good with regular expressions, so I would appreciate help if one is required. I also don't want to assume the string is always the same, so I am looking to strip out all words (and any special characters) before the location of the first number.


1 Answer

深蓝 · Answer 1 · 2021-10-17T02:51:18+0000

library(stringr)
x <- str_extract_all(strLine, "\\(?[0-9,.]+\\)?")[[1]]
x
[1] "0"       "3,000"   "(500)"   "0"       "2.25"    "(1,200)"
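If you would rather not depend on stringr, the same extraction step can be sketched in base R with gregexpr() and regmatches(). This is only an illustration of the step above, using the question's strLine; the variable name pattern is my own. Note that inside an R string literal the backslashes have to be doubled.

```r
strLine <- "The transactions (on your account) were as follows: 0 3,000 (500) 0 2.25 (1,200)"

# Optional "(", then one or more digits, commas or periods, then optional ")"
pattern <- "\\(?[0-9,.]+\\)?"

# gregexpr() locates every match; regmatches() pulls out the matched text
x <- regmatches(strLine, gregexpr(pattern, strLine))[[1]]
x
```

This should give the same six tokens as the stringr call, with the parentheses still attached to the negative amounts.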

Change the parens to negatives:

x <- gsub("\\((.+)\\)", "-\\1", x)
x
[1] "0"      "3,000"  "-500"   "0"      "2.25"   "-1,200"
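A slightly stricter variant of this step, as a sketch: anchoring the pattern with ^ and $ means only tokens that are wrapped in parentheses from start to end get rewritten. The name neg is just for illustration, and the input vector is typed in by hand so the snippet runs on its own.

```r
x <- c("0", "3,000", "(500)", "0", "2.25", "(1,200)")

# ^ and $ anchor the match, so only fully parenthesised tokens are rewritten;
# \\1 refers back to whatever was captured between the parentheses
neg <- sub("^\\((.*)\\)$", "-\\1", x)
neg
```

Tokens without surrounding parentheses pass through unchanged, so the commas are still there and need to be removed before as.numeric().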

And then as.numeric() or taRifx::destring to finish up (the next version of destring will support negatives by default so the keep option won't be necessary):

library(taRifx)
destring( x, keep="0-9.-")
[1]     0.00  3000.00  -500.00     0.00     2.25 -1200.00

OR:

as.numeric(gsub(",","",x))
[1]     0.00  3000.00  -500.00     0.00     2.25 -1200.00
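Putting the steps together, here is a minimal base R sketch wrapped in a function. The name parse_accounting is hypothetical, not from any package, and the pattern is a tightened version of the one above: it requires each token to start with a digit, so stray commas or periods in the leading text are not picked up. It assumes "," is only ever a thousands separator and "." the decimal mark.

```r
# Hypothetical helper: extract accounting-style numbers from one string
parse_accounting <- function(line) {
  # Optional "(", a digit, more digits/commas, optional decimal part, optional ")"
  pattern <- "\\(?[0-9][0-9,]*(\\.[0-9]+)?\\)?"
  tokens <- regmatches(line, gregexpr(pattern, line))[[1]]

  # A token wrapped in parentheses is a negative amount
  is_negative <- grepl("^\\(.*\\)$", tokens)

  # Drop parentheses and thousands separators, then convert
  values <- as.numeric(gsub("[(),]", "", tokens))
  values[is_negative] <- -values[is_negative]
  values
}

strLine <- "The transactions (on your account) were as follows: 0 3,000 (500) 0 2.25 (1,200)"
result <- parse_accounting(strLine)
result[3]  # -500
```

Flagging negatives with grepl() before stripping the punctuation keeps the sign logic separate from the cleanup, which I find easier to adjust if the input format changes.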
