Bit twiddling

Through bitwise operations it is possible to work with the individual bits of a larger number, even though the architecture doesn’t allow bit-level addressing. Since we can’t use assignment to set individual bits, we have to be clever with whole-word operations and masks. You can think of this as a form of SIMD (Single Instruction, Multiple Data) parallelism, where you can/must operate on all of the bits in an integer at once.

Some important identities

In order to isolate particular bits, it will be important to know, for each of the operations, what value can be safely used to ignore the bits we aren’t interested in, i.e. the operation’s identity element.

  • x & 1 == x

  • x | 0 == x

  • x ^ 0 == x

Looking at the truth table, you can work out what the non-identity element will do when combined with another bit.

  • x & 0 == 0

  • x | 1 == 1

  • x ^ 1 == ~x

Masking

The above identities are per column of a bitwise operation. In order to get meaningful work done bitwise, it is necessary to construct a ‘mask’ specifying what to do in each bit position of a larger value. The conventional way to refer to bit positions is by the index, i.e. bit 0 is the one’s place (\(2^0=1\)), bit 1 is the two’s place, and so on; the most significant bit in a 32-bit integer is bit 31.

To specify a value with a 1 in position \(n\) and 0 everywhere else, use bit shifting. Obviously 1 itself is the desired mask for bit 0. The expression 1 << n shifts that bit into any desired location, selecting bit \(n\). (When \(n=0\), 1 << 0 works and results, as you would hope, in 1, so there is no need for a special case.)

The value you calculate with a 1 in the position(s) you want to affect is called a mask by analogy with a painter’s mask, where one tapes off the parts that shouldn’t get paint on them and leaves gaps where the paint can go through in exactly the desired shape.

Because 0 | 1 == 1, you can combine masks with OR (|) to get a mask with ones wherever any of the input masks had ones. So, to choose both bits 2 and 3, you could use a mask of 1 << 2 | 1 << 3 (the order of operations in C would group this as (1 << 2) | (1 << 3)).

To choose large ranges of bits, there is a more concise way to generate a mask than combining each of the individual bit masks. Just as in decimal, where a number written as a sequence of nines is one less than a power of ten (e.g. \(999 = 10^3 - 1\)), in binary a number written as a sequence of ones is one less than a power of two (e.g. \(111_2 = 2^3 - 1\)). You can use this fact to quickly generate a mask that selects all of the \(n\) consecutive bits up to (not including) \(n\) with the expression (1 << n) - 1. (The parentheses here are required, because 1 << n - 1 would be grouped 1 << (n - 1) by the C order of operations.)

You can also shift a mask into any position you want, so to select 4 bits starting above bit 3 (i.e. bits 4–7) you could use the mask ((1 << 4) - 1) << 3.

These sorts of shifting expressions are ideally suited for computing masks at runtime, when some of the parameters are variables, or to state the programmer’s intention clearly. On the other hand, when the mask is known at compile time, it is common to simply write the number out in hexadecimal. (Writing it out in binary might be even clearer, but hexadecimal is shorter and C didn’t support binary literals until 2014.) So, the equivalent mask that could be computed to mask bits 4–7 could also be written immediately as 0x78. The use of hexadecimal for this is so fundamental to systems programming that although 0x2 == 2, when I see 2 I think of it like an integer, but when I see 0x2 I think of it as a 1 in bit 1 (the two’s place).

Manipulating bits

If you want to calculate x but with some additional bits turned on as indicated by the ones in a mask, choose the operation where the result can be guaranteed to be 1, i.e. OR. You can calculate the result with x | mask, which results in a value that is equal to x in every bit except that the bits set in mask will also be set in the result.

Very often you want to store the result back into x. Although x = x | mask is adequate, C provides augmented assignment operators for most of the binary operators, so it would be more idiomatic to write the following.

/* turn on (set) the bits set in mask */
x |= mask;

Programmers typically get used to x += y as an abbreviation of x = x + y, and may be aware that analogous operators exist for all of the arithmetic operations, e.g. *=; the full list of what C calls ‘compound assignment’ operators is *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, and |=.

To toggle (i.e. invert) the masked bits, use XOR, because x ^ 1 == ~x.

/* toggle the bits set in mask */
x ^= mask;

To turn selected bits off, the operation we need is AND, but the usual mask with ones to select the bits is the exact opposite of what we need: ones where the operation is to leave bits alone, and zeroes where the bits are to be suppressed. Fortunately, ‘exact opposite’ is an operation we have—bitwise NOT.

/* turn off (clear) the bits that are 0 in mask */
x &= mask;

/* clear the bits that are set in mask */
x &= ~mask;

We have discussed combining masks with OR to select the bits set in either mask. You can combine masks before negating them for use with bitwise AND to turn off multiple bits.

/* clear the bits set in mask1 or mask2 */
x &= ~(mask1 | mask2);

If they are already negated, you can simply combine then with AND to produce a mask that has zeroes wherever either mask has a zero.

/* clear the bits set in mask1 or mask2 */
x &= ~mask1 & ~mask2;

The equivalence of ~(mask1 | mask2) and ~mask1 & ~mask2 is an appearance of De Morgan’s law, the duality of AND and OR.

Setting, clearing, and toggling bits are all useful, but in the end, one generally wants to check what the results have been. To test whether a bit is set, you can exploit the usual Boolean truthiness rule since a number with any bits set is true; just turn off all of the bits you aren’t interested in.

/* do something if any of the bits set in mask are set in x */
if (x & mask)
    something();

Any bits remaining set will count as true, i.e. if mask == 0x3 then x & mask will be true when either or both of the least two significant bits of x are set. If you want to check multiple tests, you can combine them with the logical operators.

/* do something if any bits in mask1 are set AND any bits in mask2 are set */
if ((x & mask1) && (x & mask2))
    something();

(The parentheses in this example are optional, given the order of operations, but I prefer to be pretty defensive about extra parentheses when combining Boolean operations.)

In the common special case that you want to check whether all of the bits in a mask are set, you can mask away the other bits and then check whether the result has the same bits set as the mask, i.e. they are equal.

/* do something if all the bits set in mask are set in x */
if (x & mask == mask)
    something();
You have attempted of activities on this page