Field packing¶
So far we’ve focused on treating a multibit integral value as an array of individual bits, and performing Boolean operations on them. However, similar bit hacking can enable us to pack and unpack arbitrary fields in memory, in order to match the requirements of a file format or network protocol, or to trade off between memory usage and processing time. To illustrate what I mean, let us explore different choices about how we might represent playing cards in C.
Suits and ranks¶
A playing card has two attributes, its suit and its rank. The suits,
following French tradition, are spades, hearts, clubs, and diamonds;
there are four of them. There are thirteen ranks: ten numbered cards,
one (‘ace’) through ten, and three ‘face’ cards, jack, queen,
and king. There are fifty-two (
A common first step in representing abstract values is to number them, since we have so many tools at our disposal for representing numbers already. So I will define a four-element array of strings with the names of the suits, and a thirteen-element array of strings with the names of the ranks, and the corresponding indices will be the numbers that represent each idea.
const char *suits[] = { "spades", "hearts", "clubs", "diamonds" };
const char *ranks[] =
{ "A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K" };
So, for example, the 5 of clubs would be rank 4 and suit 2; the ace of
spaces is rank 0 and suit 0; the kind of diamonds is rank 12 and suit
3. These could be integral values in any type big enough to hold the
necessary range. The suits range from 0 through 3, i.e. four non-negative
values, so they can be represented in two bits. The ranks range over
13 non-negative values, and
/* it's the five of clubs! */
uint8_t suit = 2;
uint8_t rank = 4;
printf("it's the %s of %s!\n", suits[suit], ranks[rank]);
Since the smallest type is a byte, and a card has two attributes, each card requires two bytes.
Packing the fields together¶
There are only 52 unique cards, and
uint8_t pack_card(uint8_t suit, uint8_t rank) {
return suit * 13 + rank;
}
uint8_t unpack_suit(uint8_t card) {
return card / 13;
}
uint8_t unpack_rank(uint8_t card) {
return card % 13;
}
With this scheme we can represent cards with one byte for each. So,
a hand of five cards would be five bytes. However, thinking of each
card drawn independently, five cards represent
uint32_t pack_hand(uint8_t cards[5]) {
uint32_t hand = 0;
int i;
for (i = 4; i >= 0; i--)
hand = hand * 52 + cards[i];
return hand;
}
uint8_t unpack_card(uint32_t hand, int index) {
int i;
for (i = 0; i < index; i++)
hand /= 52;
return hand % 52;
}
This is a fairly aggressive space–time tradeoff; saving a byte per card, or six bytes per hand compared to a naïve strategy could be a huge benefit if a huge number of them have to be stored, e.g. to explore a tree of possible moves for a game. However, accessing the components of those packed values requires extensive computation rather than being directly addressable in memory. The implementations I’ve shown could be optimized significantly, but even with significant loop unrolling and precomputation, there’s going to end up being a division for each access.
Bit packing¶
Arranging to use powers of two allows a middle ground in this tradeoff.
The reason to multiply and divide by 13 when combining suits and
ranks is to ensure that all 52 unique combinations are numbered,
with no overlapping encodings. However, any number at least 13 would
work, and 16 is a power of two; choosing 16 corresponds to rounding up
uint8_t pack_card(uint8_t suit, uint8_t rank) {
return suit << 4 | rank;
}
uint8_t unpack_suit(uint8_t card) {
return card >> 4;
}
uint8_t unpack_rank(uint8_t card) {
return card & 0xf;
}
At 6 bits per card (4 for rank and 2 for suit), a hand of five cards would take 30 bits, which while pessimistic compared to less than 29, still fits in a 32-bit integer.
uint32_t pack_hand(uint8_t cards[5]) {
uint32_t hand = 0;
int i;
for (i = 4; i >= 0; i--)
hand = hand << 6 | cards[i];
return hand;
}
uint8_t unpack_card(uint32_t hand, int index) {
return hand >> (6 * index) & 0x3f;
}
(You might have also noticed that while 13 is not a power of two, the number of suits—four—is a power of two, so if I had chosen to pack the suit in the low bits and the rank in the higher bits, rather than the other order as I did, the factor of four would have been easy to optimize into bit operations.)
Conclusion¶
Bit-level field packing is often the ideal point in trading off storage
space for encoding effort. The heavier-encoding strategy above that
employed full-on division can approach the ideal of encoding a card in
Of course, what is true of all optimizations is worth repeating here as well: Don’t try to guess what you will have to optimize. Write the clearest, easiest-to-get-right version of your code first, and then, only if you measure an actual performance problem, is it worth worrying about how to fix it.
Performance aside, packing bit fields is also an essential component of writing code to work at a low level with storage layouts like binary file formats or network protocols that aren’t always aligned to bytes.