by Steve Taylor and Joanie Wexler

How many bits?

Mar 11, 2004 · 2 mins

* History of data encoding

As discussed last time, one of the fundamental requirements for a code set to be useful in WAN communications is that the sender and the receiver must agree on the meaning of each combination of ones and zeros.  A 2-bit code set, for example, can have only four discrete meanings: one meaning each for the combinations 00, 01, 10, and 11.  Go to three bits and you get eight codes; four bits yield 16, and five bits yield 32.
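The arithmetic above — n bits yield 2^n distinct codes — can be verified in a couple of lines (a quick illustration, not part of the original article):

```python
# An n-bit code set has 2**n distinct combinations of ones and zeros.
def code_count(bits: int) -> int:
    return 2 ** bits

# The progression described above: 2, 3, 4, and 5 bits.
for n in (2, 3, 4, 5):
    print(n, "bits ->", code_count(n), "codes")  # 4, 8, 16, 32
```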

The first widely accepted code set was Baudot code, developed more than 100 years ago.  With five bits – and thus 32 code combinations – there were enough unique codes for each of the 26 letters of the alphabet.

However, 26 letters plus the 10 digits 0 through 9 exceed the 32 available combinations.  Rather than add a sixth bit, Baudot reserves two unique codes to signal a shift between the “letters” interpretation of the code set and the “figures” interpretation.  Since letters and figures each tend to come in groups, this works fine for simple applications.
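The shift mechanism can be sketched with a toy two-mode decoder.  The code values and tables below are made up for illustration – they are not the actual Baudot assignments:

```python
# Toy illustration of Baudot-style mode shifting: two interpretations
# share the same 5-bit codes, with two reserved codes switching between
# them.  All code values here are hypothetical, not real Baudot codes.
LTRS, FIGS = 0b11111, 0b11011  # reserved shift codes (hypothetical)

letters = {0b00001: "A", 0b00010: "B"}  # "letters" interpretation
figures = {0b00001: "1", 0b00010: "2"}  # "figures" view of the same codes

def decode(stream):
    table = letters  # assume the receiver starts in letters mode
    out = []
    for code in stream:
        if code == LTRS:
            table = letters      # switch interpretation, emit nothing
        elif code == FIGS:
            table = figures
        else:
            out.append(table[code])
    return "".join(out)

# "AB", then shift to figures so the same two codes mean "12".
print(decode([0b00001, 0b00010, FIGS, 0b00001, 0b00010]))  # -> AB12
```

Note that a lost or corrupted shift code garbles everything until the next shift – one reason such stateful encodings gave way to stateless ones.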

However, there’s one big problem.  With just five bits, there’s no way to distinguish between UPPERCASE and lowercase letters.  Even a 6-bit code with 64 combinations would be too tight: the letters in both cases plus the digits alone consume 62 combinations, leaving only two codes for punctuation.

Consequently, the minimal practical code set must consist of seven bits, and that’s exactly what the American Standard Code for Information Interchange (ASCII) uses.  This code, which has become the de facto standard for data communications, has 128 combinations, with a unique code for each letter in both uppercase and lowercase.  In fact, the binary code for each uppercase and lowercase letter is the same except for one bit, which is sometimes called the “shift” bit.
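That shift bit has the value 0x20 (binary 0100000) in ASCII, so toggling it converts a letter between cases – for example, “A” is 0x41 and “a” is 0x61.  A quick demonstration:

```python
# In ASCII, uppercase and lowercase letters differ in exactly one bit,
# the 0x20 "shift" bit.  XORing with it flips the case of a letter.
SHIFT_BIT = 0x20

def toggle_case(ch: str) -> str:
    """Flip the case of a single ASCII letter by toggling the shift bit."""
    return chr(ord(ch) ^ SHIFT_BIT)

print(toggle_case("A"))  # -> a  (0x41 ^ 0x20 == 0x61)
print(toggle_case("z"))  # -> Z  (0x7A ^ 0x20 == 0x5A)
```

This single-bit relationship is why simple hardware and software could historically convert case with one logic operation rather than a lookup table.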