I'm playing around with parallellization in R for the first time. As a first toy example, I tried
library(doMC)
registerDoMC()
B<-10000
myFunc<-function()
{
for(i in 1:B) sqrt(i)
}
myFunc2<-function()
{
foreach(i = 1:B) %do% sqrt(i)
}
myParFunc<-function()
{
foreach(i = 1:B) %dopar% sqrt(i)
}
I know that sqrt()
executes too fast for parallellization to matter, but what I didn't expect was that foreach() %do%
would be slower than for()
:
> system.time(myFunc())
user system elapsed
0.004 0.000 0.005
> system.time(myFunc2())
user system elapsed
6.756 0.000 6.759
> system.time(myParFunc())
user system elapsed
6.140 0.524 6.096
In most examples that I've seen, foreach() %dopar%
is compared to foreach() %do%
rather than for()
. Since foreach() %do%
was much slower than for()
in my toy example, I'm now a bit confused. Somehow, I thought that these were equivalent ways of constructing for-loops. What is the difference? Are they ever equivalent? Is foreach() %do%
always slower?
UPDATE: Following @Peter Fines answer, I update myFunc
as follows:
a<-rep(NA,B)
myFunc<-function()
{
for(i in 1:B) a[i]<-sqrt(i)
}
This makes for()
a bit slower, but not much:
> system.time(myFunc())
user system elapsed
0.036 0.000 0.035
> system.time(myFunc2())
user system elapsed
6.380 0.000 6.385
See Question&Answers more detail:os