Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

When processing large amounts of data I often find myself doing the following:

HashSet<String> set = new HashSet<String> ();
//Adding elements to the set
ArrayList<String> list = new ArrayList<String> (set);

Something like "dumping" the contents of the set in the list. I usually do this since the elements I add often contain duplicates I want to remove, and this seems like an easy way to remove them.

With only that objective in mind (avoiding duplicates) I could also write:

ArrayList<String> list = new ArrayList<String> ();
// Processing here
if (! list.contains(element)) list.add(element);
//More processing here

And thus no need for "dumping" the set into the list. However, I'd be doing a small check before inserting each element (which I'm assuming HashSet does as well)

Is any of the two possibilities clearly more efficient?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
518 views
Welcome To Ask or Share your Answers For Others

1 Answer

The set will give much better performance (O(n) vs O(n^2) for the list), and that's normal because set membership (the contains operation) is the very purpose of a set.

Contains for a HashSet is O(1) compared to O(n) for a list, therefore you should never use a list if you often need to run contains.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...