que-son-bits-y-bytes-en-binario

What are Bits, Bytes, Char, Words, MSB and LSB

  • 5 min

If you delve into the binary system, there are certain words and terms you will inevitably encounter. Bit, Byte, and to a lesser extent Char and Word.

All of them refer to quantities of information. They are easy to understand, although they are very often mixed up. So let’s explain them briefly, to learn to speak properly.

Bits

The Bit (binary digit) is the smallest unit of information in the binary system. The term “bit” was coined by mathematician John Tukey in 1946.

As its name says, it is each of the digits that make up a binary number. Each bit can have one of two possible values, 0 or 1, and is used to represent a single piece of data or information.

In digital electronics, electronic devices rely on the presence or absence of voltage to indicate whether a bit is 0 or 1.

Bytes

A Byte is a sequence of bits that the computer is able to manage “at once”. The term originated in the 1950s at IBM, where “byte” (bite) referred to the amount of information a computer could “bite” at once.

Currently, on most machines a Byte has a length of 8 bits. But note that this is not always the case. There have been and exist machines where the Byte is 4, 6, 7, or 16 bits.

Choosing 8 bits as a standard was closely related to the need to encode characters, as we will see below. It was also an important factor that 8 is a power of 2 (it’s 2³), so it was a convenient number.

However, nowadays, in general, most of the time when we say Byte we refer to 8 bits grouped together. With them we can represent any number between 0 and 255 (2⁸ - 1).

However, it would be more correct to specifically call the set of 8 bits an Octet, to avoid confusion on other machines. In fact, this is done, for example in some communication texts.

Char

A Char (from the English “character”) is a set of bits that represent a character. It’s not very well known, but the concept of Char and Byte have always been closely related.

Encoding and processing text has always been a requirement for computers, and especially at the beginning, it wasn’t so simple. The fact that there were machines with Bytes of 4, 6, 7, 8 bits is precisely related to having the capacity to store a character. (Interesting, isn’t it?)

In any case, in most systems and “almost always”, the length of a Char is the same as that of a Byte.

Words

A Word is a set of Bytes that the machine uses internally to work “in blocks”. It was called that because a “Word” is a set of Chars. And we’ve already said that Char and Byte are almost the same thing.

Modern machines do not work internally only with one Byte, but they work with groups of several Bytes at once. Internally they are designed to function this way, handling larger blocks.

The word size is defined by the processor architecture and varies from machine to machine. In most modern computers, a word consists of 32 bits or 64 bits.

For example, a 32-bit machine will work with blocks of 4 Bytes at a time, for example, to perform calculations or store memory addresses. A 64-bit machine will work with blocks of 8 Bytes.

However, unless we are doing things at a very low level, in general we won’t have to worry too much about it (but it doesn’t hurt for you to know the term).

MSB and LSB

Two terms you will sometimes encounter are MSB and LSB. They are common acronyms in the context of binary data representation, such as in digital and computing systems.

In this context “significant” is a synonym for “weight” in the number. The bits furthest to the left have greater “weight” because they correspond to higher powers of 2.

  • MSB (Most Significant Bit): This term refers to the most significant bit in a binary number. The most significant bit is the first bit on the left.
  • LSB (Least Significant Bit): This term refers to the least significant bit in a binary number. The least significant bit is the last bit on the right.

For example, in the 8-bit binary number 11010110, the most significant bit (MSB) would be the first bit on the left. While the least significant bit (LSB) would be the 0 on the right.

You will often find the terms MSB and LSB in literature. It is a more precise and rigorous way than referring to the “bit on the left” or the “bit on the right”. It also protects us from the case where (who knows for what reason) someone had the brilliant idea of storing a binary number “backwards”.

However, in this course, I will continue to say “bit on the left” and “bit on the right” because, in my opinion, it’s easier to understand. (but if you encounter the terms, you know what they mean).