Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I'm developing a multithreaded program running on Linux (compiled with G++ 4.3) and if you search around for a bit you find a lot of scary stories about std::string not being thread-safe with GCC. This is supposedly due to the fact that internally it uses copy-on-write which wreaks havoc with tools like Helgrind.

I've made a small program that copies one string to another string and if you inspect both strings they both share the same internal _M_p pointer. When one string is modified the pointer changes so the copy-on-write stuff is working fine.

What I'm worried about though is what happens if I share a string between two threads (for instance passing it as an object in a threadsafe dataqueue between two threads). I've already tried compiling with the '-pthread' option but that does not seem to make much difference. So my question:

  • Is there any way to force std::string to be threadsafe? I would not mind if the copy-on-write behaviour was disabled to achieve this.
  • How have other people solved this? Or am I being paranoid?

I can't seem to find a definitive answer so I hope you guys can help me..

Edit:

Wow, that's a whole lot of answers in such a short time. Thank you! I will definitely use Jack's solution when I want to disable COW. But now the main question becomes: do I really have to disable COW? Or is the 'bookkeeping' done for COW thread safe? I'm currently browsing the libstdc++ sources but that's going to take quite some time to figure out...

Edit 2

OK browsed the libstdc++ source code and I find something like this in libstd++-v3/include/bits/basic_string.h:

  _CharT*
   _M_refcopy() throw()
   {
#ifndef _GLIBCXX_FULLY_DYNAMIC_STRING
     if (__builtin_expect(this != &_S_empty_rep(), false))
#endif
            __gnu_cxx::__atomic_add_dispatch(&this->_M_refcount, 1);
     return _M_refdata();
   }  // XXX MT

So there is definitely something there about atomic changes to the reference counter...

Conclusion

I'm marking sellibitze's comment as answer here because I think we've reached the conclusion that this area is still unresolved for now. To circumvent the COW behaviour I'd suggest Jack Lloyd's answer. Thank you everybody for an interesting discussion!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
76 views
Welcome To Ask or Share your Answers For Others

1 Answer

Threads are not yet part of the standard. But I don't think that any vendor can get away without making std::string thread-safe, nowadays. Note: There are different definitions of "thread-safe" and mine might differ from yours. Of course, it makes little sense to protect a container like std::vector for concurrent access by default even when you don't need it. That would go against the "don't pay for things you don't use" spirit of C++. The user should always be responsible for synchronization if he/she wants to share objects among different threads. The issue here is whether a library component uses and shares some hidden data structures that might lead to data races even if "functions are applied on different objects" from a user's perspective.

The C++0x draft (N2960) contains the section "data race avoidance" which basically says that library components may access shared data that is hidden from the user if and only if it activly avoids possible data races. It sounds like a copy-on-write implementation of std::basic_string must be as safe w.r.t. multi-threading as another implementation where internal data is never shared among different string instances.

I'm not 100% sure about whether libstdc++ takes care of it already. I think it does. To be sure, check out the documentation


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...