HashMap Source Code

Overview

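HashMap is a hash table that stores key/value pairs. Each key maps to exactly one value, and lookups and updates take constant time on average; the iteration order of the elements is not guaranteed.
HashMap is not synchronized and is therefore not thread-safe. Both keys and values may be null (at most one null key).
Two parameters of a HashMap instance affect its performance: the initial capacity and the load factor.
The capacity is the number of buckets in the hash table; the initial capacity is the capacity at the time the table is created; the load factor is a measure of how full the table may get before its capacity is automatically increased.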
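
The properties above can be tried out with a small sketch. The class name HashMapBasicsDemo and the values 32 and 0.5f are illustrative assumptions, not taken from the article.

```java
import java.util.HashMap;
import java.util.Map;

final class HashMapBasicsDemo {
    /** Builds a small map that exercises overwriting, a null key and a null value. */
    static Map<String, Integer> build() {
        // Initial capacity 32 and load factor 0.5 are arbitrary illustration values.
        Map<String, Integer> map = new HashMap<>(32, 0.5f);
        map.put("a", 1);
        map.put("a", 2);     // same key: the value is replaced, the size stays 1
        map.put(null, 0);    // a single null key is allowed
        map.put("b", null);  // null values are allowed
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        System.out.println(map.size());           // 3
        System.out.println(map.get("a"));         // 2
        System.out.println(map.get(null));        // 0
        System.out.println(map.containsKey("b")); // true
    }
}
```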

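Inheritance Structure

[Figure: HashMap inheritance structure]

  • HashMap extends AbstractMap<K,V> and implements the Map<K,V> interface (as well as Cloneable and Serializable).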

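Storage Structure

[Figure: HashMap storage structure]

  • HashMap is implemented with a composite structure: array + linked list + red-black tree;
  • When an element is added, a hash is computed from its key, and the hash determines which array slot (bucket) the element goes into;
  • If the target bucket already contains elements, the new element is appended to the tail of the bucket's linked list;
  • If the array length is at least 64 and an element is added to a bucket that already holds at least 8 nodes, the bucket's list is converted into a red-black tree (treeified);
  • Treeifying a long list greatly improves lookup performance;

    Indexing into the array is O(1), searching a linked list is O(n), and searching a red-black tree is O(log n);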

Source Implementation

Fields

/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;

/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
* The bin count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2 and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
*/
static final int TREEIFY_THRESHOLD = 8;

/**
* The bin count threshold for untreeifying a (split) bin during a
* resize operation. Should be less than TREEIFY_THRESHOLD, and at
* most 6 to mesh with shrinkage detection under removal.
*/
static final int UNTREEIFY_THRESHOLD = 6;

/**
* The smallest table capacity for which bins may be treeified.
* (Otherwise the table is resized if too many nodes in a bin.)
* Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
* between resizing and treeification thresholds.
*/
static final int MIN_TREEIFY_CAPACITY = 64;

/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;

/**
* Holds cached entrySet(). Note that AbstractMap fields are used
* for keySet() and values().
*/
transient Set<Map.Entry<K,V>> entrySet;

/**
* The number of key-value mappings contained in this map.
*/
transient int size;

/**
* The number of times this HashMap has been structurally modified
* Structural modifications are those that change the number of mappings in
* the HashMap or otherwise modify its internal structure (e.g.,
* rehash). This field is used to make iterators on Collection-views of
* the HashMap fail-fast. (See ConcurrentModificationException).
*/
transient int modCount;

/**
* The next size value at which to resize (capacity * load factor).
*
* @serial
*/
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;

/**
* The load factor for the hash table.
*
* @serial
*/
final float loadFactor;
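  • The default initial capacity is 16 (1 << 4) and the maximum capacity is 2^30;
  • The default load factor is 0.75; the table is resized once the number of mappings exceeds capacity * load factor (the threshold);
  • A bin is treeified when the table capacity is at least 64 and an element is added to a bin that already holds at least 8 nodes; during a resize, a split tree bin with 6 or fewer nodes is converted back to a linked list (untreeified);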
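
As a quick arithmetic check of the resize threshold, here is a minimal sketch. ThresholdDemo and its threshold method are hypothetical helpers; they mirror only the simple capacity * load factor product and ignore the capping that resize() applies near the maximum capacity.

```java
final class ThresholdDemo {
    /** Simplified threshold: capacity * loadFactor, truncated to int. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // With the default load factor 0.75, a table of 16 buckets is resized
        // when the 13th mapping is added (size 13 > threshold 12).
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24
        System.out.println(threshold(64, 0.75f)); // 48
    }
}
```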

Node Inner Class

Node is the basic bin node; through the next field, the nodes of a bin form a singly linked list.

/**
* Basic hash bin node, used for most entries. (See below for
* TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
*/
static class Node<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
V value;
Node<K,V> next;

Node(int hash, K key, V value, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.value = value;
this.next = next;
}

public final int hashCode() {
return Objects.hashCode(key) ^ Objects.hashCode(value);
}

// getKey(), getValue(), toString(), setValue() and equals() are omitted here
}

Think About It

Why is hashCode computed by XORing the hash codes of key and value? (A neat use of XOR.)
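
One part of the answer is that the Map.Entry.hashCode() contract in the JDK specifies exactly this XOR, so equal entries from different Map implementations get equal hash codes. The sketch below checks this against AbstractMap.SimpleEntry; the class EntryHashDemo and the method contractHash are illustrative names.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.Objects;

final class EntryHashDemo {
    /** The formula used by Node.hashCode(): null contributes 0. */
    static int contractHash(Object key, Object value) {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> e = new AbstractMap.SimpleEntry<>("one", 1);
        // A different Entry implementation yields the same hash code.
        System.out.println(e.hashCode() == contractHash("one", 1)); // true

        Map.Entry<String, Integer> n = new AbstractMap.SimpleEntry<>(null, null);
        System.out.println(n.hashCode()); // 0
    }
}
```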

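TreeNode Inner Class

TreeNode is a tree node. Its prev field links to the previous node in the bin's linked list, so that the predecessor can be found quickly when a node is removed.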

/**
* Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
* extends Node) so can be used as extension of either regular or
* linked node.
*/
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
TreeNode<K,V> parent; // red-black tree links
TreeNode<K,V> left;
TreeNode<K,V> right;
TreeNode<K,V> prev; // needed to unlink next upon deletion
boolean red;
TreeNode(int hash, K key, V val, Node<K,V> next) {
super(hash, key, val, next);
}
/**
* Returns root of tree containing this node.
*/
final TreeNode<K,V> root() {
for (TreeNode<K,V> r = this, p;;) {
if ((p = r.parent) == null)
return r;
r = p;
}
}
}

Constructors

HashMap provides several constructors that let the caller set the initial capacity and the load factor.

/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and load factor.
*
* @param initialCapacity the initial capacity
* @param loadFactor the load factor
* @throws IllegalArgumentException if the initial capacity is negative
* or the load factor is nonpositive
*/
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
this.threshold = tableSizeFor(initialCapacity);
}

/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and the default load factor (0.75).
*
* @param initialCapacity the initial capacity.
* @throws IllegalArgumentException if the initial capacity is negative.
*/
public HashMap(int initialCapacity) {
this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

/**
* Constructs an empty <tt>HashMap</tt> with the default initial capacity
* (16) and the default load factor (0.75).
*/
public HashMap() {
this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

/**
* Constructs a new <tt>HashMap</tt> with the same mappings as the
* specified <tt>Map</tt>. The <tt>HashMap</tt> is created with
* default load factor (0.75) and an initial capacity sufficient to
* hold the mappings in the specified <tt>Map</tt>.
*
* @param m the map whose mappings are to be placed in this map
* @throws NullPointerException if the specified map is null
*/
public HashMap(Map<? extends K, ? extends V> m) {
this.loadFactor = DEFAULT_LOAD_FACTOR;
putMapEntries(m, false);
}

/**
* Returns a power of two size for the given target capacity.
*/
static final int tableSizeFor(int cap) {
int n = cap - 1;
n |= n >>> 1;
n |= n >>> 2;
n |= n >>> 4;
n |= n >>> 8;
n |= n >>> 16;
return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
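  • If the caller specifies neither the initial capacity nor the load factor, the defaults are used;
  • If the caller specifies an initial capacity, tableSizeFor(int cap) rounds it up to the smallest power of two that is greater than or equal to it; the result is kept in threshold until the table is first allocated;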

Think About It

Given a number, how do you find the smallest power of two that is greater than or equal to it?

/**
* Returns a power of two size for the given target capacity.
*/
static final int tableSizeFor(int cap) {
int n = cap - 1;
n |= n >>> 1;
n |= n >>> 2;
n |= n >>> 4;
n |= n >>> 8;
n |= n >>> 16;
return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
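
To see the bit smearing at work, the method can be copied into a standalone class and run on a few inputs. TableSizeDemo is an illustrative wrapper; the method body is the JDK 8 version quoted above, and the sample inputs are arbitrary.

```java
final class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Copy of the JDK 8 method: smallest power of two >= cap. */
    static int tableSizeFor(int cap) {
        int n = cap - 1;   // subtract 1 so that exact powers of two map to themselves
        n |= n >>> 1;      // each step copies the highest set bit further to the right
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;     // now every bit below the highest set bit is 1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] caps = {0, 1, 2, 3, 10, 16, 17, 1000};
        for (int cap : caps) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
        // Prints: 0 -> 1, 1 -> 1, 2 -> 2, 3 -> 4, 10 -> 16, 16 -> 16, 17 -> 32, 1000 -> 1024
    }
}
```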

Operations

put(K key, V value)

/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or
* <tt>null</tt> if there was no mapping for <tt>key</tt>.
* (A <tt>null</tt> return can also indicate that the map
* previously associated <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}

/**
* Computes key.hashCode() and spreads (XORs) higher bits of hash
* to lower. Because the table uses power-of-two masking, sets of
* hashes that vary only in bits above the current mask will
* always collide. (Among known examples are sets of Float keys
* holding consecutive whole numbers in small tables.) So we
* apply a transform that spreads the impact of higher bits
* downward. There is a tradeoff between speed, utility, and
* quality of bit-spreading. Because many common sets of hashes
* are already reasonably distributed (so don't benefit from
* spreading), and because we use trees to handle large sets of
* collisions in bins, we just XOR some shifted bits in the
* cheapest possible way to reduce systematic lossage, as well as
* to incorporate impact of the highest bits that would otherwise
* never be used in index calculations because of table bounds.
*/
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

/**
* Implements Map.put and related methods
*
* @param hash hash for key
* @param key the key
* @param value the value to put
* @param onlyIfAbsent if true, don't change existing value
* @param evict if false, the table is in creation mode.
* @return previous value, or null if none
*/
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
// The table is null or empty: initialize it
if ((tab = table) == null || (n = tab.length) == 0)
// resize() also performs the initial allocation
n = (tab = resize()).length;
// (n - 1) & hash selects the bucket; if the bucket is empty, store the new node directly
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
// The bucket already contains at least one node
Node<K,V> e; K k;
// The first node's key equals the key being inserted: remember it so that its value can be updated below
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
// Tree bin: insert through the tree's own insertion method
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
// Linked list: walk it; stop at an equal key if one exists, otherwise append at the tail and treeify if the rule is met
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
// Triggers the treeify check; the bin is only treeified if the table is large enough
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
// A node with an equal key was found: update its value
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
// Hook for LinkedHashMap (moves the node to the end in access order); a no-op in HashMap
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
// Hook for LinkedHashMap (may evict the eldest entry); a no-op in HashMap
afterNodeInsertion(evict);
return null;
}
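  • Compute the hash of the key: (h = key.hashCode()) ^ (h >>> 16)
  • If the table has not been allocated yet, initialize it; the element's bucket index is (n - 1) & hash, where n is the table length (not the number of mappings)
  • If the key does not exist, insert it: directly into an empty bucket, at the tail of a linked list (then check whether the bin should be treeified), or into a tree (then rebalance); afterwards, check whether the table needs to be resized;
  • If the key exists, update its value;

    Three possibilities:

    1. The key is the first node of the bucket: update the value directly
    2. The key is found in the linked list: update the value
    3. The key is found in the tree: update the value (the tree shape does not change, so no rebalancing is needed)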
Think About It

Why is hash() designed this way?
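
A small experiment illustrates the point made in the hash() Javadoc: with power-of-two masking, hash codes that differ only in their high bits all land in the same bucket unless the high bits are folded down. HashSpreadDemo and its two helpers are illustrative names; the sample values are chosen so that only bits above bit 15 differ.

```java
final class HashSpreadDemo {
    /** Same transform as HashMap.hash() for a non-null key's hashCode. */
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    /** Bucket index for a table of length n (n must be a power of two). */
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int[] hashCodes = {0x10000, 0x20000, 0x30000};
        for (int h : hashCodes) {
            System.out.println(index(h, 16) + " vs " + index(spread(h), 16));
        }
        // Prints "0 vs 1", "0 vs 2", "0 vs 3": without spreading, all three collide in bucket 0.
    }
}
```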

resize()

Resizes (grows) the table.

/**
* Initializes or doubles table size. If null, allocates in
* accord with initial capacity target held in field threshold.
* Otherwise, because we are using power-of-two expansion, the
* elements from each bin must either stay at same index, or move
* with a power of two offset in the new table.
*
* @return the table
*/
final Node<K,V>[] resize() {
Node<K,V>[] oldTab = table;
int oldCap = (oldTab == null) ? 0 : oldTab.length;
int oldThr = threshold;
int newCap, newThr = 0;
if (oldCap > 0) {
if (oldCap >= MAXIMUM_CAPACITY) {
// Already at the maximum capacity: stop growing
threshold = Integer.MAX_VALUE;
return oldTab;
}
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
// The doubled capacity is below the maximum and the old capacity is at least the default (16): double the threshold as well
newThr = oldThr << 1; // double threshold
}
else if (oldThr > 0) // initial capacity was placed in threshold
newCap = oldThr;
else { // zero initial threshold signifies using defaults
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
if (newThr == 0) {
// The new threshold has not been set yet: compute capacity * load factor, capped near the maximum capacity
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);
}
// Store the new resize threshold
threshold = newThr;
@SuppressWarnings({"rawtypes","unchecked"})
// Allocate the new table, then move the nodes across
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;
if (oldTab != null) {
for (int j = 0; j < oldCap; ++j) {
Node<K,V> e;
if ((e = oldTab[j]) != null) {
oldTab[j] = null;
if (e.next == null)
// Single node (neither a list nor a tree): place it directly
newTab[e.hash & (newCap - 1)] = e;
else if (e instanceof TreeNode)
// Tree bin: split it into two trees (or lists, if a part is small enough)
((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
else { // preserve order
// Linked list: split it into a low list and a high list
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if (hiTail == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e;
}
} while ((e = next) != null);

// The bin has been split into two lists
// The low list keeps the same index in the new table
if (loTail != null) {
loTail.next = null;
newTab[j] = loHead;
}

// The high list moves to the old index plus the old capacity
if (hiTail != null) {
hiTail.next = null;
newTab[j + oldCap] = hiHead;
}
}
}
}
}
return newTab;
}

// The split method in the TreeNode class
/**
* Splits nodes in a tree bin into lower and upper tree bins,
* or untreeifies if now too small. Called only from resize;
* see above discussion about split bits and indices.
*
* @param map the map
* @param tab the table for recording bin heads
* @param index the index of the table being split
* @param bit the bit of hash to split on
*/
final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
TreeNode<K,V> b = this;
// Relink into lo and hi lists, preserving order
TreeNode<K,V> loHead = null, loTail = null;
TreeNode<K,V> hiHead = null, hiTail = null;
int lc = 0, hc = 0;
for (TreeNode<K,V> e = b, next; e != null; e = next) {
next = (TreeNode<K,V>)e.next;
e.next = null;
if ((e.hash & bit) == 0) {
if ((e.prev = loTail) == null)
loHead = e;
else
loTail.next = e;
loTail = e;
++lc;
}
else {
if ((e.prev = hiTail) == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e;
++hc;
}
}

if (loHead != null) {
if (lc <= UNTREEIFY_THRESHOLD)
tab[index] = loHead.untreeify(map);
else {
tab[index] = loHead;
if (hiHead != null) // (else is already treeified)
loHead.treeify(tab);
}
}
if (hiHead != null) {
if (hc <= UNTREEIFY_THRESHOLD)
tab[index + bit] = hiHead.untreeify(map);
else {
tab[index + bit] = hiHead;
if (loHead != null)
hiHead.treeify(tab);
}
}
}
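  • The table capacity is always a power of two (2^n), and each resize doubles it;
  • The resize threshold is computed from the load factor; once the number of mappings exceeds the threshold, the table is resized;
  • The capacity never exceeds 2^30;
  • Moving the data during a resize is handled in three cases;
    • If the bin holds a single node (neither a list nor a tree), the node is placed directly into its new bucket;
    • If the bin is a tree, it is split into two trees (or into linked lists when a part has 6 or fewer nodes); the low tree keeps its index from the old table, and the high tree goes to old index + old capacity;
    • If the bin is a linked list, it is split into two lists; the low list keeps its index from the old table, and the high list goes to old index + old capacity;
      In JDK 8 the list is traversed in order and nodes are appended at the tail, so their relative order is preserved. JDK 7 has no red-black trees; it also traverses in order but inserts at the head, which reverses the list. Under concurrent resizing this reversal can produce a circular linked list and hence an infinite loop.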

Think About It

Why does splitting one linked list into two use the test (e.hash & oldCap) == 0?
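
The reason is that doubling the capacity adds exactly one bit to the index mask, and that bit has the value oldCap. If the bit is 0 in the hash, the index is unchanged; if it is 1, the index grows by oldCap. The sketch below checks this against a direct index computation; ResizeSplitDemo and its methods are illustrative names.

```java
final class ResizeSplitDemo {
    /** New index derived the way resize() does it: test the single bit oldCap. */
    static int newIndexBySplit(int hash, int oldCap) {
        int j = hash & (oldCap - 1);                   // index in the old table
        return (hash & oldCap) == 0 ? j : j + oldCap;  // stay, or move by oldCap
    }

    /** New index computed directly with the doubled table's mask. */
    static int newIndexDirect(int hash, int oldCap) {
        return hash & (2 * oldCap - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16;
        boolean allMatch = true;
        for (int hash = 0; hash < 1000; hash++) {
            allMatch &= newIndexBySplit(hash, oldCap) == newIndexDirect(hash, oldCap);
        }
        System.out.println(allMatch);                    // true
        System.out.println(newIndexBySplit(5, oldCap));  // 5  (bit 16 is clear: stays)
        System.out.println(newIndexBySplit(21, oldCap)); // 21 (bit 16 is set: 5 + 16)
    }
}
```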

TreeNode.putTreeVal(HashMap<K,V> map, Node<K,V>[] tab, int h, K k, V v)

Inserts an element into the red-black tree.

/**
* Tree version of putVal.
*/
final TreeNode<K,V> putTreeVal(HashMap<K,V> map, Node<K,V>[] tab,
int h, K k, V v) {
Class<?> kc = null;
// Whether the subtrees have already been searched for an equal key
boolean searched = false;
// Locate the root node
TreeNode<K,V> root = (parent != null) ? root() : this;
for (TreeNode<K,V> p = root;;) {
// dir = direction: whether to go left or right
// ph = p.hash: hash of the current node
// pk = p.key: key of the current node
int dir, ph; K pk;
if ((ph = p.hash) > h)
// The current node's hash is larger than the target hash: go left
dir = -1;
else if (ph < h)
// The current node's hash is smaller than the target hash: go right
dir = 1;
else if ((pk = p.key) == k || (k != null && k.equals(pk)))
// The hashes are equal and the keys are equal: the node has been found
return p;
else if ((kc == null &&
(kc = comparableClassFor(k)) == null) ||
(dir = compareComparables(kc, k, pk)) == 0) {
if (!searched) {
TreeNode<K,V> q, ch;
searched = true;
if (((ch = p.left) != null &&
(q = ch.find(h, k, kc)) != null) ||
((ch = p.right) != null &&
(q = ch.find(h, k, kc)) != null))
return q;
}

// Still undecided: break the tie by class name, then by System.identityHashCode()
dir = tieBreakOrder(k, pk);
}

TreeNode<K,V> xp = p;
if ((p = (dir <= 0) ? p.left : p.right) == null) {
// No node with an equal key exists: create a new node here
Node<K,V> xpn = xp.next;
TreeNode<K,V> x = map.newTreeNode(h, k, v, xpn);
if (dir <= 0)
xp.left = x;
else
xp.right = x;
xp.next = x;
x.parent = x.prev = xp;
if (xpn != null)
((TreeNode<K,V>)xpn).prev = x;
// Rebalance the tree after the insertion,
// then make sure the root is the first node of the bin
moveRootToFront(tab, balanceInsertion(root, x));
return null;
}
}
}
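  • Locate the root node;
  • Start searching from the root;
  • Compare the hash and then the key; if both match, return the node. putVal() then decides whether to replace the value;
  • Otherwise use the hash (and, for equal hashes, the key ordering) to decide whether to continue in the left or the right subtree; if an equal key is found there, return it;
  • If no equal key is found, insert a new node at the appropriate position and rebalance the tree;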
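
The tree path can be exercised from user code by forcing every key into the same bucket. In the sketch below, CollidingKey is a hypothetical key type whose hashCode() is constant; because it is Comparable, compareComparables() can order the keys inside the bin. Whether the bin is actually a tree is an internal detail that this code does not observe; it only shows that the map stays correct with heavy collisions.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysDemo {
    /** Hypothetical key: all instances share one hash code, so they share one bucket. */
    static final class CollidingKey implements Comparable<CollidingKey> {
        final int id;

        CollidingKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; }

        @Override public boolean equals(Object o) {
            return o instanceof CollidingKey && ((CollidingKey) o).id == id;
        }

        @Override public int compareTo(CollidingKey other) {
            return Integer.compare(id, other.id);
        }
    }

    /** Inserts count colliding keys, mapping each key to its id. */
    static Map<CollidingKey, Integer> fill(int count) {
        Map<CollidingKey, Integer> map = new HashMap<>();
        for (int i = 0; i < count; i++) {
            map.put(new CollidingKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<CollidingKey, Integer> map = fill(1000);
        System.out.println(map.size());                        // 1000
        System.out.println(map.get(new CollidingKey(500)));    // 500
        System.out.println(map.get(new CollidingKey(5000)));   // null
    }
}
```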

treeifyBin(Node<K,V>[] tab, int hash)

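Called after an insertion into a bin that already held at least 8 nodes; decides whether to treeify the bin or to resize the table instead.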

/**
* Replaces all linked nodes in bin at index for given hash unless
* table is too small, in which case resizes instead.
*/
final void treeifyBin(Node<K,V>[] tab, int hash) {
int n, index; Node<K,V> e;
if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
// Fewer than 64 buckets: resize instead of treeifying
resize();
else if ((e = tab[index = (n - 1) & hash]) != null) {
TreeNode<K,V> hd = null, tl = null;
// Replace every node with a TreeNode, keeping the list order
do {
TreeNode<K,V> p = replacementTreeNode(e, null);
if (tl == null)
hd = p;
else {
p.prev = tl;
tl.next = p;
}
tl = p;
} while ((e = e.next) != null);

// If the loop produced any nodes, build the tree starting from the head
if ((tab[index] = hd) != null)
hd.treeify(tab);
}
}

TreeNode.treeify(Node<K,V>[] tab)

Builds a red-black tree from the nodes of the bin.

/**
* Forms tree of the nodes linked from this node.
* @return root of tree
*/
final void treeify(Node<K,V>[] tab) {
TreeNode<K,V> root = null;
for (TreeNode<K,V> x = this, next; x != null; x = next) {
next = (TreeNode<K,V>)x.next;
x.left = x.right = null;
// The first node becomes the (black) root; later insertions rebalance the tree
if (root == null) {
x.parent = null;
x.red = false;
root = x;
}
else {
K k = x.key;
int h = x.hash;
Class<?> kc = null;
for (TreeNode<K,V> p = root;;) {
int dir, ph;
K pk = p.key;
if ((ph = p.hash) > h)
dir = -1;
else if (ph < h)
dir = 1;
else if ((kc == null &&
(kc = comparableClassFor(k)) == null) ||
(dir = compareComparables(kc, k, pk)) == 0)
dir = tieBreakOrder(k, pk);

TreeNode<K,V> xp = p;
if ((p = (dir <= 0) ? p.left : p.right) == null) {
x.parent = xp;
if (dir <= 0)
xp.left = x;
else
xp.right = x;
root = balanceInsertion(root, x);
break;
}
}
}
}
moveRootToFront(tab, root);
}

TreeNode.removeTreeNode(HashMap<K,V> map, Node<K,V>[] tab, boolean movable)

/**
* Removes the given node, that must be present before this call.
* This is messier than typical red-black deletion code because we
* cannot swap the contents of an interior node with a leaf
* successor that is pinned by "next" pointers that are accessible
* independently during traversal. So instead we swap the tree
* linkages. If the current tree appears to have too few nodes,
* the bin is converted back to a plain bin. (The test triggers
* somewhere between 2 and 6 nodes, depending on tree structure).
*/
final void removeTreeNode(HashMap<K,V> map, Node<K,V>[] tab,
boolean movable) {
int n;
if (tab == null || (n = tab.length) == 0)
return;
// Index of this node's bucket
int index = (n - 1) & hash;
TreeNode<K,V> first = (TreeNode<K,V>)tab[index], root = first, rl;
// Successor and predecessor in the bin's linked list
TreeNode<K,V> succ = (TreeNode<K,V>)next, pred = prev;
if (pred == null)
// No predecessor: this node is first in the bin's list, so its successor becomes the new first node
tab[index] = first = succ;
else
// Otherwise link the predecessor to the successor, which unlinks this node from the list
pred.next = succ;

// If there is a successor, point its prev at this node's predecessor
if (succ != null)
succ.prev = pred;
// The bin is now empty: nothing more to do
if (first == null)
return;
if (root.parent != null)
root = root.root();
if (root == null || root.right == null ||
(rl = root.left) == null || rl.left == null) {
tab[index] = first.untreeify(map); // too small
return;
}

// Remove the node from the red-black tree
TreeNode<K,V> p = this, pl = left, pr = right, replacement;
if (pl != null && pr != null) {
TreeNode<K,V> s = pr, sl;
while ((sl = s.left) != null) // find successor
s = sl;
boolean c = s.red; s.red = p.red; p.red = c; // swap colors
TreeNode<K,V> sr = s.right;
TreeNode<K,V> pp = p.parent;
if (s == pr) { // p was s's direct parent
p.parent = s;
s.right = p;
}
else {
TreeNode<K,V> sp = s.parent;
if ((p.parent = sp) != null) {
if (s == sp.left)
sp.left = p;
else
sp.right = p;
}
if ((s.right = pr) != null)
pr.parent = s;
}
p.left = null;
if ((p.right = sr) != null)
sr.parent = p;
if ((s.left = pl) != null)
pl.parent = s;
if ((s.parent = pp) == null)
root = s;
else if (p == pp.left)
pp.left = s;
else
pp.right = s;
if (sr != null)
replacement = sr;
else
replacement = p;
}
else if (pl != null)
replacement = pl;
else if (pr != null)
replacement = pr;
else
replacement = p;
if (replacement != p) {
TreeNode<K,V> pp = replacement.parent = p.parent;
if (pp == null)
root = replacement;
else if (p == pp.left)
pp.left = replacement;
else
pp.right = replacement;
p.left = p.right = p.parent = null;
}

TreeNode<K,V> r = p.red ? root : balanceDeletion(root, replacement);

if (replacement == p) { // detach
TreeNode<K,V> pp = p.parent;
p.parent = null;
if (pp != null) {
if (p == pp.left)
pp.left = null;
else if (p == pp.right)
pp.right = null;
}
}
if (movable)
moveRootToFront(tab, r);
}

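  • A TreeNode is both a linked-list node and a red-black tree node;
  • First unlink the node from the linked list;
  • Then remove it from the red-black tree and rebalance;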

get(Object key)

/**
* Returns the value to which the specified key is mapped,
* or {@code null} if this map contains no mapping for the key.
*
* <p>More formally, if this map contains a mapping from a key
* {@code k} to a value {@code v} such that {@code (key==null ? k==null :
* key.equals(k))}, then this method returns {@code v}; otherwise
* it returns {@code null}. (There can be at most one such mapping.)
*
* <p>A return value of {@code null} does not <i>necessarily</i>
* indicate that the map contains no mapping for the key; it's also
* possible that the map explicitly maps the key to {@code null}.
* The {@link #containsKey containsKey} operation may be used to
* distinguish these two cases.
*
* @see #put(Object, Object)
*/
public V get(Object key) {
Node<K,V> e;
return (e = getNode(hash(key), key)) == null ? null : e.value;
}

/**
* Implements Map.get and related methods
*
* @param hash hash for key
* @param key the key
* @return the node, or null if none
*/
final Node<K,V> getNode(int hash, Object key) {
Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
if ((tab = table) != null && (n = tab.length) > 0 &&
(first = tab[(n - 1) & hash]) != null) {
if (first.hash == hash && // always check first node
((k = first.key) == key || (key != null && key.equals(k))))
// The first node in the bucket matches: return it
return first;
if ((e = first.next) != null) {
if (first instanceof TreeNode)
// Tree bin: search the tree
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
do {
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
// Found in the linked list
return e;
} while ((e = e.next) != null);
}
}
return null;
}

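  • Compute the hash of the key;
  • Locate the first node of the bucket the key maps to;
  • If the first node's key equals the key being looked up, return it;
  • Otherwise, if the first node is a tree node, search the tree; if not, walk the linked list;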
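
As the get() Javadoc notes, a null return value is ambiguous: the key may be absent or mapped to null. A minimal sketch of telling the two apart with containsKey(); GetNullDemo is an illustrative name.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class GetNullDemo {
    /** Returns {get(present)==null, get(absent)==null, containsKey(present), containsKey(absent)}. */
    static boolean[] check() {
        Map<String, String> map = new HashMap<>();
        map.put("present", null);  // key exists, value is null
        return new boolean[] {
            map.get("present") == null,
            map.get("absent") == null,
            map.containsKey("present"),
            map.containsKey("absent")
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(check())); // [true, true, true, false]
    }
}
```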

TreeNode.getTreeNode(int h, Object k)

/**
* Calls find for root node.
*/
final TreeNode<K,V> getTreeNode(int h, Object k) {
return ((parent != null) ? root() : this).find(h, k, null);
}

/**
* Finds the node starting at root p with the given hash and key.
* The kc argument caches comparableClassFor(key) upon first use
* comparing keys.
*/
final TreeNode<K,V> find(int h, Object k, Class<?> kc) {
TreeNode<K,V> p = this;
do {
int ph, dir; K pk;
TreeNode<K,V> pl = p.left, pr = p.right, q;
if ((ph = p.hash) > h)
// Continue in the left subtree
p = pl;
else if (ph < h)
// Continue in the right subtree
p = pr;
else if ((pk = p.key) == k || (k != null && k.equals(pk)))
// Found: return the node
return p;
else if (pl == null)
// Same hash but a different key, and no left child: continue in the right subtree
p = pr;
else if (pr == null)
// No right child: continue in the left subtree
p = pl;
else if ((kc != null ||
(kc = comparableClassFor(k)) != null) &&
(dir = compareComparables(kc, k, pk)) != 0)
// The keys are Comparable and differ in order: use the comparison to pick the subtree
p = (dir < 0) ? pl : pr;
else if ((q = pr.find(h, k, kc)) != null)
// The order cannot be decided: search the right subtree recursively first
return q;
else
// Not found on the right: continue in the left subtree
p = pl;
} while (p != null);
return null;
}

  • Red-black tree lookup: compare by hash first, then by key, to decide whether to search the left or the right subtree.

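remove(Object key, Object value)

The listing shows the two-argument overload; remove(Object key) delegates to the same removeNode() method with matchValue set to false.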

@Override
public boolean remove(Object key, Object value) {
return removeNode(hash(key), key, value, true, true) != null;
}

/**
* Implements Map.remove and related methods
*
* @param hash hash for key
* @param key the key
* @param value the value to match if matchValue, else ignored
* @param matchValue if true only remove if value is equal
* @param movable if false do not move other nodes while removing
* @return the node, or null if none
*/
final Node<K,V> removeNode(int hash, Object key, Object value,
boolean matchValue, boolean movable) {
Node<K,V>[] tab; Node<K,V> p; int n, index;
if ((tab = table) != null && (n = tab.length) > 0 &&
(p = tab[index = (n - 1) & hash]) != null) {
Node<K,V> node = null, e; K k; V v;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
// The first node is the target: remember it for the removal below
node = p;
else if ((e = p.next) != null) {
if (p instanceof TreeNode)
// The first node is a tree node: search the tree
node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
else {
// Linked list: walk it to find the node
do {
if (e.hash == hash &&
((k = e.key) == key ||
(key != null && key.equals(k)))) {
node = e;
break;
}
p = e;
} while ((e = e.next) != null);
}
}
// A node was found: remove it if the value condition is satisfied
if (node != null && (!matchValue || (v = node.value) == value ||
(value != null && value.equals(v)))) {
if (node instanceof TreeNode)
// Tree bin: remove through the tree's removal method
((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
else if (node == p)
// The node is the head of the bucket: its successor becomes the new head
tab[index] = node.next;
else
// The node is inside the list: link its predecessor to its successor
p.next = node.next;
++modCount;
--size;
// Post-removal hook (used by LinkedHashMap)
afterNodeRemoval(node);
return node;
}
}
return null;
}
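
The effect of the matchValue flag can be observed through the two public remove overloads. A minimal sketch; RemoveDemo is an illustrative name.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class RemoveDemo {
    /** Returns the four return values followed by the final map size. */
    static Object[] run() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        boolean r1 = map.remove("a", 99); // value does not match: nothing is removed
        boolean r2 = map.remove("a", 1);  // key and value match: removed
        Integer r3 = map.remove("b");     // removes by key only and returns the old value
        Integer r4 = map.remove("b");     // key is no longer present: returns null
        return new Object[] { r1, r2, r3, r4, map.size() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // [false, true, 2, null, 0]
    }
}
```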

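Summary

  • HashMap is a hash table that uses array + linked list + red-black tree as its storage structure;
  • The default initial capacity is 16, the default load factor is 0.75f, and the capacity is always a power of two;
  • Each resize doubles the capacity;
  • A bin is treeified when the table capacity is at least 64 and an element is added to a bin that already holds at least 8 nodes;
  • During a resize, a split tree bin with 6 or fewer nodes is untreeified; removal can also convert a very small tree back into a list;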

FAQ

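Q: Why does HashMap use array + linked list + red-black tree? (Implementation principle)

  • The array determines the bucket position;
    Hash of the key: (h = key.hashCode()) ^ (h >>> 16); bucket index in the array: (n - 1) & hash, where n is the array length
  • The linked list is one way of resolving hash collisions;
    If the element's bucket already contains elements, the new element is appended to the end of the list;
  • The red-black tree addresses the lookup cost of long lists;
    Searching a linked list is O(n), while searching a red-black tree is O(log n); once a list grows beyond 8 nodes, converting it to a red-black tree greatly improves lookup performance;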

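Q: When does HashMap resize? Is the capacity after each resize always a power of two?

  • The table is resized when the number of mappings exceeds (float)capacity * loadFactor;
    The default load factor is 0.75, the maximum capacity is 2^30, each resize doubles the capacity, and the capacity is always 2^n;
  • Yes, the capacity is always a power of two;
    The bucket index is (n - 1) & hash, where n is the array length. When n is a power of two, n - 1 in binary has all of its low bits set to 1, so the bitwise AND gives the same result as taking the hash modulo n, is cheap to compute, and lets every bucket be reached;
  • The spreading function mixes the high bits into the low bits to reduce collisions: (h = key.hashCode()) ^ (h >>> 16)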

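Q: How is hashCode implemented in String?

It evaluates the polynomial below in int arithmetic, relying on natural overflow as an implicit modulo 2^32: s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]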
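
The polynomial can be evaluated with Horner's rule and compared against String.hashCode(). This is a sketch of the formula, not the JDK's actual implementation, which may cache the result and use other optimizations; StringHashDemo and polyHash are illustrative names.

```java
final class StringHashDemo {
    /** Horner evaluation of s[0]*31^(n-1) + ... + s[n-1] with int overflow. */
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);  // overflow wraps around, acting as mod 2^32
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polyHash("abc"));  // 96354
        String text = "a fairly long string that overflows int arithmetic";
        System.out.println(polyHash(text) == text.hashCode()); // true
    }
}
```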

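Q: Does every entry need to have its hash recomputed when HashMap resizes?

No. Each node caches its hash, and because the capacity doubles and the index is (n - 1) & hash, a node either stays at its old index (low part) or moves to old index + old capacity (high part), depending on the single bit hash & oldCap. No rehash is needed;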

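Q: Element positions in HashMap depend on key.hashCode(). What should you watch out for when overriding hashCode and equals?

Four rules (hash collisions are possible):

  • If two objects are equal, their hash codes must be equal
  • If two objects are not equal, their hash codes are not necessarily different
  • If two hash codes are equal, the objects are not necessarily equal
  • If two hash codes differ, the objects must not be equal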
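
A minimal sketch of a key class that follows these rules by deriving equals() and hashCode() from the same fields. Point is a hypothetical class used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class PointKeyDemo {
    /** Hypothetical immutable key: equality and hash code both depend on x and y only. */
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override public int hashCode() {
            return Objects.hash(x, y);  // equal points always produce equal hash codes
        }
    }

    /** Stores a value under one Point and looks it up with a distinct but equal Point. */
    static Integer lookup() {
        Map<Point, Integer> map = new HashMap<>();
        map.put(new Point(1, 2), 7);
        return map.get(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // 7
    }
}
```

If hashCode() were left at the Object default while equals() was overridden, the second Point would usually hash to a different bucket and the lookup would return null.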

Q: How do you get a synchronized HashMap?

Map m = Collections.synchronizedMap(hashMap)
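
A minimal usage sketch of the wrapper, assuming several threads that write disjoint keys. SynchronizedMapDemo and countWithThreads are illustrative names. Individual calls such as put are synchronized by the wrapper, but iteration has to be guarded manually by synchronizing on the map, as the Collections.synchronizedMap documentation requires.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class SynchronizedMapDemo {
    /** Fills a synchronized map from several threads and returns the number of keys. */
    static int countWithThreads(int threads, int perThread) {
        Map<Integer, Integer> map = Collections.synchronizedMap(new HashMap<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;  // each thread writes its own key range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int count = 0;
        synchronized (map) {  // iteration must hold the map's lock
            for (Integer ignored : map.keySet()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000)); // 4000
    }
}
```

For heavily concurrent workloads, java.util.concurrent.ConcurrentHashMap is a commonly used alternative to the synchronized wrapper.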

  • Copyright notice: unless otherwise stated, all articles on this blog are licensed under Apache License 2.0. Please credit the source when reposting!
  • © 2015-2020 Andrew