Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

When calculating the sum of two data tables, NA+n=NA.

> dt1 <- data.table(Name=c("Joe","Ann"), "1"=c(0,NA), "2"=c(3,NA))
> dt1
   Name  1  2
1:  Joe  0  3
2:  Ann NA NA
> dt2 <- data.table(Name=c("Joe","Ann"), "1"=c(0,NA), "2"=c(2,3))
> dt2
   Name  1 2
1:  Joe  0 2
2:  Ann NA 3
> dtsum  <- rbind(dt1, dt2)[, lapply(.SD, sum), by=Name]
> dtsum
   Name  1  2
1:  Joe  0  5
2:  Ann NA NA

I don't want to substitute all NA's with 0. What I want is NA+NA=NA and NA+n=n to get the following result:

   Name  1  2
1:  Joe  0  5
2:  Ann NA  3

How is this done in R?

UPDATE: removed typo in dt1

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
468 views
Welcome To Ask or Share your Answers For Others

1 Answer

You can define your own function to act as you want

plus <- function(x) {
 if(all(is.na(x))){
   c(x[0],NA)} else {
   sum(x,na.rm = TRUE)}
 }


rbind(dt1, dt2)[,lapply(.SD, plus), by = Name]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...