Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I need a "string pool" object into which I can repeatedly insert a "sequence of chars" (I use this phrase to mean "string" without confusing it with std::string or a C string), obtain a pointer to the sequence, and be guaranteed that the pointer will not become invalidated if/when the pool needs to grow. Using a simple std::string as the pool won't work, because of the possibility for the string to be reallocated when it outgrows its initial capacity, thus invalidating all previous pointers into it.

The pool will not grow without bound -- there are well-defined points at which I will call a clear() method on it -- but I don't want to reserve any maximum capacity on it, either. It should be able to grow, without moving.

One possibility I'm considering is inserting each new sequence of chars into a forward_list<string> and obtaining begin()->c_str(). Another is inserting into an unordered_set<string>, but I'm having a hard time finding out what happens when an unordered_set has to grow. The third possibility I'm considering (less enthusiastically) is rolling my own chain of 1K buffers into which I concatenate the sequence of chars. That has the advantage (I guess) of having the highest performance, which is a requirement for this project.

I'd be interested in hearing how others would recommend approaching this.

UPDATE 1: edited to clarify my use of the phrase "sequence of chars" to be equivalent to the general notion of a "string" without implying either std::string or null-terminated char array.

1 Answer

I've used this approach in the past:

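#include <set>
#include <string>
using namespace std;

using Atom = const char*;

// Returns a pointer to the interned copy of `value`. std::set is node-based,
// so the pointer stays valid as more strings are inserted.
Atom make_atom(string const& value)
{
    static set<string> interned;
    return interned.insert(value).first->c_str();
}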
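This works because `std::set` allocates each element in its own node, so inserting more elements never moves the existing ones. The same holds for `std::unordered_set`: rehashing invalidates iterators, but not pointers or references to elements. A small check along these lines can confirm it (the helper name `first_element_stays_put` is made up for this sketch):

```cpp
#include <set>
#include <string>
#include <unordered_set>

// Inserts `count` distinct strings into a node-based container and reports
// whether the character data of the first element stayed at the same address.
template <typename Container>
bool first_element_stays_put(int count) {
    Container pool;
    const char* first = pool.insert(std::string("atom-0")).first->c_str();
    for (int i = 1; i < count; ++i) {
        // Enough insertions to force rebalancing (set) or rehashing (unordered_set).
        pool.insert("atom-" + std::to_string(i));
    }
    return pool.find(std::string("atom-0"))->c_str() == first;
}
```

Calling it as `first_element_stays_put<std::set<std::string>>(10000)` or with `std::unordered_set<std::string>` should return `true` in both cases.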

Obviously, if you want/need to clear the set, you'd make it available in some wider scope.

For even more efficiency, move or emplace the strings into the set instead of copying them.
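Both suggestions (a set that can be cleared, and moving strings in rather than copying) could be combined in a small class. This is only a sketch: `AtomPool` and its member names are hypothetical, and taking the string by value is one design choice among several.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

using Atom = const char*;

// Hypothetical wrapper around the interning set, so the owner controls its
// lifetime and can clear it at well-defined points.
class AtomPool {
public:
    // Takes the string by value so that rvalue arguments are moved, not
    // copied, into the set. The returned pointer is valid until clear().
    Atom intern(std::string value) {
        return interned_.insert(std::move(value)).first->c_str();
    }

    // Drops every stored string; all Atoms handed out so far dangle afterwards.
    void clear() { interned_.clear(); }

    std::size_t size() const { return interned_.size(); }

private:
    std::set<std::string> interned_;
};
```

Interning the same contents twice returns the same pointer, so Atoms can be compared by address. Note that passing an lvalue still costs one copy, even when the string is already in the pool.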

Update: I've added this approach for completeness. See it Live on Coliru.

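#include <set>
#include <string>
#include <type_traits>  // enable_if, is_constructible
#include <utility>      // forward
using namespace std;

using Atom = const char*;

// One pool shared by every instantiation of emplace_atom. A static local
// inside the template would give each argument-type combination its own set.
// (interned_pool is a helper name added here for illustration.)
inline set<string>& interned_pool()
{
    static set<string> interned;
    return interned;
}

template <typename... Args>
typename enable_if<
    is_constructible<string, Args...>::value, Atom
>::type emplace_atom(Args&&... args)
{
    // Constructs the string in place from whatever arguments string accepts.
    return interned_pool().emplace(forward<Args>(args)...).first->c_str();
}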

#include <iostream>

int main() {
    cout << emplace_atom("Hello World\n");
    cout << emplace_atom(80, '=');
}
