Huffman Serialization Techniques

Networking expert Glenn Fielder’s article on network serialization techniques is a great read. It discusses a number of scenarios including serializing a sparse set of items from an array. This blog post discusses Glenn’s implementation of sparse array serialization and my thoughts on it. To be sure, Glenn’s code has the virtue of simplicity, and the concepts I’m going to discuss are more complex. Don’t assume the space or time trade offs of different serialization techniques without measuring them in practice.

Without further ado, the original code:

template <typename Stream> 
bool serialize_object_index_internal( Stream & stream, 
                                      int & previous, 
                                      int & current )
{
    uint32_t difference;
    if ( Stream::IsWriting )
    {
        assert( previous < current );
        difference = current - previous;
        assert( difference > 0 );
    }

    // +1 (1 bit)
    bool plusOne;
    if ( Stream::IsWriting )
       plusOne = difference == 1;
    serialize_bool( stream, plusOne );
    if ( plusOne )
    {
        if ( Stream::IsReading )
            current = previous + 1;
        previous = current;
        return true;
    }

    // [+2,5] -> [0,3] (2 bits)
    bool twoBits;
    if ( Stream::IsWriting )
        twoBits = difference <= 5;
    serialize_bool( stream, twoBits );
    if ( twoBits )
    {
        serialize_int( stream, difference, 2, 5 );
        if ( Stream::IsReading )
            current = previous + difference;
        previous = current;
        return true;
    }

    // [6,13] -> [0,7] (3 bits)
    bool threeBits;
    if ( Stream::IsWriting )
        threeBits = difference <= 13;
    serialize_bool( stream, threeBits );
    if ( threeBits )
    {
        serialize_int( stream, difference, 6, 13 );
        if ( Stream::IsReading )
            current = previous + difference;
        previous = current;
        return true;
    }

    // [14,29] -> [0,15] (4 bits)
    bool fourBits;
    if ( Stream::IsWriting )
        fourBits = difference <= 29;
    serialize_bool( stream, fourBits );
    if ( fourBits )
    {
        serialize_int( stream, difference, 14, 29 );
        if ( Stream::IsReading )
            current = previous + difference;
        previous = current;
        return true;
    }

    // [30,61] -> [0,31] (5 bits)
    bool fiveBits;
    if ( Stream::IsWriting )
        fiveBits = difference <= 61;
    serialize_bool( stream, fiveBits );
    if ( fiveBits )
    {
        serialize_int( stream, difference, 30, 61 );
        if ( Stream::IsReading )
            current = previous + difference;
        previous = current;
        return true;
    }

    // [62,125] -> [0,63] (6 bits)
    bool sixBits;
    if ( Stream::IsWriting )
        sixBits = difference <= 125;
    serialize_bool( stream, sixBits );
    if ( sixBits )
    {
        serialize_int( stream, difference, 62, 125 );
        if ( Stream::IsReading )
            current = previous + difference;
        previous = current;
        return true;
    }

    // [126,MaxObjects+1] 
    serialize_int( stream, difference, 126, MaxObjects + 1 );
    if ( Stream::IsReading )
        current = previous + difference;
    previous = current;
    return true;
}

Intuition and Rules of Thumb

Let’s talk about why this code immediately stood out to me – it triggered an intuition: code structure correlates with entropy.  Entropy in the information theory sense means that repetitive data has lower entropy, and unique data has high entropy. This intuition cues me to think about refactoring along DRY principles or moving to a more data driven approach. These kinds of changes take low entropy code and “compress” it so it has higher entropy – in other words, code that is less repetitive to accomplish the same results.

Consider the following example:

// Choose largest of provided parameters.
int pickHighest(int a, int b, int c)
{
   if(a > b)
      if(a > c)
         return a;
      else
         return c;
   else
      if (b > c)
         return b;
      else
         return c;
}

The repetitive tree structure of the code immediately suggests that it could be written more concisely, especially as you have more arguments from which to pick. A sort algorithm is an obvious choice, e.g.

int pickHighest(vector<int> items)
{
   items.sort(); // Sort items in descending order.
   return items[0]; // Pick the highest item.
}

This version is much shorter, it can handle any number of items, and gets right to the point.

(Caveats: Of course, the astute reader will note that there are situations where for small argument counts the tree can be a faster choice. Rewriting it with ternary operators doesn’t count – we’re concerned with execution structure here, not syntactic structure. Finally, there are plenty of loose ends to tune in the new implementation for correctness and speed.)

Huffman Codes

How does this apply to Glenn’s implementation above? Well, we see it has a repetitive structure. It’s attempting to encode more common numbers in fewer bits, and rarer numbers with more bits. That smells like a Huffman code to me! A Huffman code is a tree-like encoding structure that assigns unique bit strings of varying lengths to different values based on their expected frequency. So the most common value is encoded with a couple of bits, while the least common value might take ten or twenty bits. This lets you get a big win assuming that your assumptions about what’s common are right. As an example, algorithms like deflate/zip rely on Huffman codes. There are also techniques like arithmetic coding that are more powerful and complex expressions of this concept.

A canonical Huffman code is one which generates N codes where the first value is expected to be most frequent, and the last value is least frequent. Implicitly, this means the first value’s code is shortest and the last value’s code is longest. Fortunately, this is exactly the situation we want to deal with! If you knew more about the actual distribution of values you could generate your own tree based on the observed values and get better efficiency in your serialization.

(In the following code, I’m going to assume you have a serialize function to read/write a canonical Huffman code that can represent values between 0 and 127. This is definitely sweeping complexity under the rug, although a Huffman coder is a handy thing to have in general. If I was implementing this, I’d start with the HuffmanProcessor from Torque, although there are surely more efficient implementations.)

Here’s the implementation using Huffman coding:

template  
bool serialize_object_index_internal( Stream & stream, 
                                      int & previous, 
                                      int & current )
{
    uint32_t difference;
    uint32_t cappedDifference;
    if ( Stream::IsWriting )
    {
        assert( previous < current );
        difference = current - previous;
        assert( difference > 0 );

        // Store a capped difference so we know when to 
        // write overflow values.
        cappedDifference = difference > 127 ? 127 : difference;
    }

    // Write huffman encoded delta.
    serialize_huffman127(stream, cappedDifference);

    if(cappedDifference == 127)
    {
        // Highest value keys us to code large gaps.
        serialize_int( stream, difference, 126, MaxObjects + 1 );
    }
    else
    {
        // The serialized valus is the actual value.
        difference = cappedDifference;
    }

    if ( Stream::IsWriting )
        current = previous + difference;
    return true;
}

Notice that there is now no repetitive structure to the code. We have to deal with a couple of details (what if it’s a number bigger than we can Huffman code?), but the basic flow is much simpler. And it achieves a very similar result.

Conclusions

What are the trade offs of this approach? Well, first off, it may not be as bit efficient in the “every item is consecutive” case with the canonical Huffman code because the canonical Huffman code won’t assign a single bit to that most common case. You could add a special case for that (and spend an extra bit per serialization in the non-consecutive case).

Second, Huffman codes are more complex to read and write. If you are doing a LOT of serialization this might affect CPU usage. Most likely it will not be a significant difference – I’ve rarely seen any game be bottleneck on network serialization.

Third, tuning is more indirect. With Glenn’s implementation, you can modify the code to directly affect the bit cost of various scenarios. With this approach, your main tuning option is to regenerate the Huffman code based on actual network traffic. You can also add special cases e.g. for the consecutive item case. Regardless of the approach you use, you always want to check your changes by running your game and measuring actual update sizes in various scenarios to confirm you are really getting a win.

Fourth, there are better coding techniques than Huffman – in practice you probably want a variant of arithmetic coding. Compressing the entire packet with a standard compression algorithm like LZMA is also an option. This article is just a piece in the puzzle of the ultimate network protocol.

Is the Huffman approach a slam dunk win over the manual approach? Unclear. But it gives a more formal perspective on the smart serialization technique Glenn uses, and is a good entree into a larger discussion on coding techniques for network serialization. I hope you found this article interesting!

(Thanks to  @glennfiedler for prompting me to write this.)

Author: Ben Garney

See https://bengarney.com/ and @bengarney for details.