Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

A previous post prompted me to ask this question. It would seem like a best practice to reassign == to isTRUE(all.equal()) (and != to !isTRUE(all.equal())). I'm wondering whether others do this in practice. I just realized that I use == and != for numeric equality throughout my codebase. My first reaction was that I need to do a full scrub and convert everything to all.equal. But in fact, every time I use == and != I want to test equality, regardless of the data type. I'm not sure what these operators would test for other than equality, so I'm sure I'm missing some concept here. Can someone enlighten me?

The only argument I see against this approach is that in some cases two non-identical numbers will appear to be identical because of the tolerance of all.equal. But we're told that two numbers that are in fact identical might not pass identical() because of how they are stored in memory. So what's the point of not defaulting to all.equal?


1 Answer

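As @joran alluded to, you'll run into floating point issues with == and != in pretty much any other language too. One important property of these operators in R is that they are vectorized: they compare element by element and return a logical vector, whereas isTRUE(all.equal()) returns a single TRUE or FALSE for the whole object.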
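A minimal sketch of both points, using values chosen here for illustration (they are not from the original post): the first pair shows the floating point issue, the second shows the difference between element-wise and whole-object comparison.

```r
# Floating point representation error: 0.1 + 0.2 is not stored exactly as 0.3
exact_cmp <- 0.1 + 0.2 == 0.3                   # FALSE
fuzzy_cmp <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE

# == is vectorized: one result per element
x <- c(1, 2, 3)
y <- c(1, 2.5, 3)
elementwise <- x == y                           # TRUE FALSE TRUE

# all.equal compares the objects as a whole, so isTRUE() yields one value
whole <- isTRUE(all.equal(x, y))                # FALSE
```

Reassigning == to isTRUE(all.equal()) would therefore change the shape of the result, not just the tolerance, and would break code that relies on element-wise comparison such as x[x == 2].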

It would be much better to define a new function such as almostEqual or fuzzyEqual. It is unfortunate that there is no such base function. all.equal isn't very efficient, since it handles all kinds of objects, and when the objects differ it returns a character vector describing the difference rather than FALSE, when mostly you just want TRUE or FALSE.
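The return shape of all.equal can be checked directly. This is a small sketch with illustrative values; the variable names are made up for the example.

```r
# Within the default tolerance (about 1.5e-8) all.equal returns TRUE
res_same <- all.equal(1, 1 + 1e-10)

# Outside the tolerance it returns a character description, not FALSE
res_diff <- all.equal(1, 1.1)

is_text <- is.character(res_diff)   # TRUE
as_flag <- isTRUE(res_diff)         # FALSE: isTRUE() turns the result into a usable logical
```

This is why all.equal should be wrapped in isTRUE() before it is used as a condition in if().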

Here's an example of such a function. It's vectorized like ==.

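almostEqual <- function(x, y, tolerance=1e-8) {
  diff <- abs(x - y)              # absolute difference, element-wise
  mag <- pmax( abs(x), abs(y) )   # larger magnitude of each pair
  # relative comparison for magnitudes above the tolerance, absolute comparison near zero
  ifelse( mag > tolerance, diff/mag <= tolerance, diff <= tolerance)
}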

almostEqual(1, c(1+1e-8, 1+2e-8)) # [1]  TRUE FALSE

In the timings below it is around 2x faster than all.equal for scalar values, and it is much faster with vectors.

x <- 1
y <- 1+1e-8
system.time(for(i in 1:1e4) almostEqual(x, y)) # 0.44 seconds
system.time(for(i in 1:1e4) all.equal(x, y))   # 0.93 seconds
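For vectors, the difference in usage can be sketched as follows. almostEqual is repeated from above so that the snippet runs on its own. The infix name %~% and the test vectors are assumptions made for illustration and are not part of the original answer. A separate operator like this is one way to get tolerant comparison without overriding == itself.

```r
# Repeated from the answer above so the sketch is self-contained
almostEqual <- function(x, y, tolerance = 1e-8) {
  diff <- abs(x - y)
  mag <- pmax(abs(x), abs(y))
  ifelse(mag > tolerance, diff / mag <= tolerance, diff <= tolerance)
}

# Hypothetical infix wrapper; the name %~% is an assumption for this example
`%~%` <- function(x, y) almostEqual(x, y)

a <- c(0.1 + 0.2, 1, sqrt(2)^2)
b <- c(0.3, 1 + 1e-3, 2)

strict <- a == b    # FALSE FALSE FALSE
fuzzy  <- a %~% b   # TRUE FALSE TRUE

# all.equal needs an explicit loop to give one result per element
looped <- mapply(function(p, q) isTRUE(all.equal(p, q)), a, b)  # TRUE FALSE TRUE
```

The element-wise loop over all.equal calls the function once per pair, while almostEqual handles the whole vector in a single call.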
