It's really quite clever code, and is obviously a lot more difficult to understand than a simple naive loop.
For the first line, let's just take a four-bit quantity, and call it abcd
. The code basically does this:
abcd - ((abcd >> 1) & 0101) = abcd - (0abc & 0101) = abcd - 0a0c
So, in each group of two bits, it subtracts the value of the high bit. What does that net us?
11 - 1 -> 10 (two bits set)
10 - 1 -> 01 (one bit set)
01 - 0 -> 01 (one bit set)
00 - 0 -> 00 (zero bits set)
So, that first line sets each consecutive group of two bits to the number of bits contained in the original value -- it counts the bits set in groups of two. Call the resulting four-bit quantity ABCD
.
The next line:
(ABCD & 0011) + ((ABCD>>2) & 0011) = 00CD + (AB & 0011) = 00CD + 00AB
So, it takes the groups of two bits and adds pairs together. Now, each four-bit group contains the number of bits set in the corresponding four bits of the input.
In the next line, v + (v >> 4) & 0xF0F0F0F
(which is parsed as (v + (v >> 4)) & 0xf0f0f0f
) does the same, adding pairs of four-bit groups together so that each eight-bit group (byte) contains the bit-set count of the corresponding input byte. We now have a number like 0x0e0f0g0h
.
Note that multiplying a byte in any position by 0x01010101
will copy that byte up to the most-significant byte (as well as leaving some copies in lower bytes). For example, 0x00000g00 * 0x01010101 = 0x0g0g0g00
. So, by multiplying 0x0e0f0g0h
, we will leave e+f+g+h
in the topmost byte; the >>24
at the end extracts that byte and leaves you with the answer.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…