I've been calling this in OpenMP
#pragma omp parallel for num_threads(totalThreads)
for(unsigned i=0; i<totalThreads; i++)
{
workOnTheseEdges(startIndex[i], endIndex[i]);
}
And this in C++11 std::threads (I believe those are just pthreads)
vector<thread> threads;
for(unsigned i=0; i<totalThreads; i++)
{
threads.push_back(thread(workOnTheseEdges,startIndex[i], endIndex[i]));
}
for (auto& thread : threads)
{
thread.join();
}
But, the OpenMP implementation is 2x the speed--Faster! I would have expected C++11 threads to be faster, as they are more low-level. Note: The code above is being called not just once, but probably 10,000 times in a loop, so maybe that has something to do with it?
Edit: for clarification, in practice, I either use the OpenMP or the C++11 version--not both. When I am using the OpenMP code, it takes 45 seconds and when I am using the the C++11, it takes 100 seconds.
See Question&Answers more detail:os