The HashMap documentation contains this sentence:
If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
Notice how the documentation says rehash, not resize, even though a rehash only happens when a resize does, that is, when the internal array of buckets doubles in size.
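In other words, a resize (and with it a rehash) is triggered once the number of entries exceeds capacity * loadFactor, the internal threshold. A small sketch of that arithmetic; `threshold` here is a hypothetical helper mirroring HashMap's internal field, not an actual API:

```java
public class ThresholdDemo {

    // hypothetical helper: HashMap keeps an internal "threshold" field
    // computed roughly as capacity * loadFactor; a put that pushes the
    // size past it triggers resize (and thus a rehash of all entries)
    static int threshold(int capacity, double loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // default capacity 16, default load factor 0.75 -> threshold 12,
        // so the 13th entry would double the table to 32 buckets
        System.out.println(threshold(16, 0.75)); // 12
    }
}
```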
And of course HashMap provides a constructor where we can specify this initial capacity:
Constructs an empty HashMap with the specified initial capacity and the default load factor (0.75).
OK, seems easy enough:
// these are NOT chosen randomly...
List<String> list = List.of("DFHXR", "YSXFJ", "TUDDY",
"AXVUH", "RUTWZ", "DEDUC", "WFCVW", "ZETCU", "GCVUR");
int maxNumberOfEntries = list.size(); // 9
double loadFactor = 0.75;
int capacity = (int) (maxNumberOfEntries / loadFactor + 1); // 13
So the capacity is 13 (internally it becomes 16, the next power of two); this way we satisfy the documentation's condition for no rehashes. OK, let's test this, but first introduce a method that will reach into a HashMap via reflection and look at its internals:
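The rounding to the next power of two happens inside HashMap itself (tableSizeFor in the JDK sources). A minimal sketch of that rounding logic, with `nextPowerOfTwo` as a stand-in name:

```java
public class TableSize {

    // a sketch of HashMap's internal tableSizeFor: the requested capacity
    // is rounded up to the next power of two, capped at 2^30
    static int nextPowerOfTwo(int cap) {
        // fill all bits below the highest set bit of (cap - 1)
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= (1 << 30)) ? (1 << 30) : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(13)); // 16
        System.out.println(nextPowerOfTwo(16)); // 16
        System.out.println(nextPowerOfTwo(17)); // 32
    }
}
```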
private static <K, V> void debugResize(Map<K, V> map, K key, V value) throws Throwable {
    Field table = map.getClass().getDeclaredField("table");
    table.setAccessible(true);
    Object[] nodes = (Object[]) table.get(map);
    // first put: the table is initialized lazily, so the first call
    // to resize is not actually a "resize" and is not counted
    if (nodes == null) {
        map.put(key, value);
        return;
    }
    int previous = nodes.length;
    map.put(key, value);
    int current = ((Object[]) table.get(map)).length;
    if (previous != current) {
        ++HashMapResize.currentResizeCalls;
        System.out.println(previous + " " + current);
    }
}
And now let's test this:
static int currentResizeCalls = 0;

public static void main(String[] args) throws Throwable {
    List<String> list = List.of("DFHXR", "YSXFJ", "TUDDY",
            "AXVUH", "RUTWZ", "DEDUC", "WFCVW", "ZETCU", "GCVUR");
    int maxNumberOfEntries = list.size(); // 9
    double loadFactor = 0.75;
    int capacity = (int) (maxNumberOfEntries / loadFactor + 1);
    Map<String, String> map = new HashMap<>(capacity);
    list.forEach(x -> {
        try {
            HashMapResize.debugResize(map, x, x);
        } catch (Throwable throwable) {
            throwable.printStackTrace();
        }
    });
    System.out.println(HashMapResize.currentResizeCalls);
}
Well, resize was called and thus the entries were rehashed, which is not what the documentation says.
As said, the keys were not chosen randomly. They were set up so that they all hash to the same bucket and would trigger the static final int TREEIFY_THRESHOLD = 8; property, the point at which a bucket is converted to a tree. Well, not really, since the table must also reach MIN_TREEIFY_CAPACITY = 64 for the tree to appear; until then, resize happens instead, the table of buckets doubles in size, and the entries are rehashed.
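We can check the collision without reflection by applying the same spreading function HashMap uses on hashCode() (h ^ (h >>> 16)) and masking with the table length; all nine keys land in a single bucket of a 16-bucket table:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BucketCheck {

    // the same spreading function HashMap applies to a key's hashCode()
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        List<String> keys = List.of("DFHXR", "YSXFJ", "TUDDY",
                "AXVUH", "RUTWZ", "DEDUC", "WFCVW", "ZETCU", "GCVUR");
        Set<Integer> buckets = new HashSet<>();
        for (String k : keys) {
            // bucket index in a table of 16 buckets
            buckets.add(spread(k.hashCode()) & (16 - 1));
        }
        // all nine keys collide into one bucket
        System.out.println(buckets.size()); // 1
    }
}
```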
I can only guess why the HashMap documentation is wrong in that sentence: before Java 8, a bucket was never converted to a tree, so the property would hold; from Java 8 onwards that is no longer true. Since I am not sure about this, I am not adding it as an answer.