Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I am reading in a bunch of CSVs that have stuff like "sales - thousands" in the title and come into R as "sales...thousands". I'd like to use a regular expression (or other simple method) to clean these up.

I can't figure out why this doesn't work:

#mock data
  a <- data.frame(this.is.fine = letters[1:5],
                  this...one...isnt = LETTERS[1:5])

#column names
  colnames(a)
  # [1] "this.is.fine"  "this...one...isnt"

#function to remove multiple spaces
  colClean <- function(x){
    colnames(x) <- gsub("\.\.+", ".", colnames(x))
  }

#run function
  colClean(a)

#names go unaffected
  colnames(a)
  # [1] "this.is.fine"  "this...one...isnt"

but this code does:

#direct change to names
  colnames(a) <- gsub("\.\.+", ".", colnames(a))

#new names
  colnames(a)
  # [1] "this.is.fine"  "this.one.isnt"

Note that I'm fine leaving one period between words when that occurs.

Thank you.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
1.0k views
Welcome To Ask or Share your Answers For Others

1 Answer

names(a) <- gsub(x = names(a), pattern = "\.", replacement = "#")  

you can use gsub function to replace . with another special character like #.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...