What are Bits, Bytes, Char, Words, MSB and LSB

If you get into the binary system, you will inevitably run into certain terms: Bit, Byte and, to a lesser extent, Char and Word.

All of them refer to quantities of information. They are easy to understand, although they are often mixed up with each other. So let’s go through them briefly and learn to use them properly.

Bits

The bit (short for “binary digit”) is the smallest unit of information in the binary system. The term “bit” was coined by the mathematician John Tukey in 1947.

As its name suggests, a bit is each of the digits that make up a binary number. Each bit can have one of two possible values, 0 or 1, and represents a single piece of information.
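To make this concrete, here is a minimal sketch that splits a number into its individual bits. Python is just the language chosen for these examples, and the helper name `to_bits` and its 8-bit default width are assumptions made for illustration.

```python
def to_bits(value: int, width: int = 8) -> list[int]:
    """Return the bits of a non-negative integer, leftmost bit first."""
    if value < 0 or value >= 2 ** width:
        raise ValueError("value does not fit in the requested width")
    # Shift the wanted bit down to position 0, then mask off everything else.
    return [(value >> position) & 1 for position in range(width - 1, -1, -1)]


bits = to_bits(5)
print(bits)  # [0, 0, 0, 0, 0, 1, 0, 1]
```

Each element of the list is one bit, and each one can only ever be 0 or 1.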

In digital electronics, devices typically distinguish a 0 from a 1 by the voltage level on a line, for example low versus high.

Bytes

A byte is a sequence of bits that the computer is capable of handling “at once”. The term was coined in 1956 by Werner Buchholz at IBM, where “byte” referred to the amount of information that a computer could “bite” at once.

Today, in most machines, a Byte is 8 bits long. But make no mistake, this is not always the case. There were, and still are, machines where the Byte is 4, 6, 7, or 16 bits.

The choice of 8 bits as the standard had a lot to do with the need to encode characters, as we will see below. It also helped that 8 is a power of 2 (2³), which makes it a convenient number.

In practice, when we say Byte today, we almost always mean a group of 8 bits. With 8 bits we can represent 256 different values, for example any number between 0 and 255 (2⁸ - 1).
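The arithmetic is easy to check. The following is a small sketch that assumes the usual 8-bit byte; the variable names are invented for this example.

```python
BITS_PER_BYTE = 8  # assumption: the common 8-bit byte

distinct_values = 2 ** BITS_PER_BYTE  # how many different bit patterns exist
largest_value = distinct_values - 1   # counting starts at 0

print(distinct_values)               # 256
print(largest_value)                 # 255
print(format(largest_value, "08b"))  # 11111111

# Python's bytes type enforces the same range for each of its elements.
try:
    bytes([256])
except ValueError as error:
    print("256 does not fit in one byte:", error)
```

The largest value, 255, is the pattern with all eight bits set to 1.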

Strictly speaking, a group of exactly 8 bits is called an Octet, a name that avoids any ambiguity with machines whose Byte has a different size. In fact, this is the term used, for example, in many networking standards.

Char

A Char (short for “character”) is a set of bits that represents a character. It is not very well known, but the concepts of Char and Byte have always been closely related.

Encoding and handling text has always been a requirement of computers and, especially at the beginning, it was not so simple. The reason there were machines with Bytes of 4, 6, 7, or 8 bits is precisely that each one needed enough bits to store a character. Curious, isn’t it?

In any case, in most systems and “almost always”, the length of a Char is the same as that of a Byte.
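As an illustration, here is a minimal sketch of a character stored as a byte. The choice of the ASCII and UTF-8 encodings is an assumption of this example, not something the text above prescribes.

```python
letter = "A"
code = ord(letter)                # numeric code assigned to the character
encoded = letter.encode("ascii")  # the bytes that actually store it

print(code)                 # 65
print(format(code, "08b"))  # 01000001
print(len(encoded))         # 1

# Outside ASCII, a single character may need more than one byte.
enye = "\u00f1"  # the letter ñ
print(len(enye.encode("utf-8")))  # 2
```

The last line is the reason for the “almost always”: in a variable-length encoding such as UTF-8, some characters take more than one Byte.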

Words

A Word is a set of Bytes that the machine uses internally to work “in blocks”. The name comes from an analogy with text: a word is a group of Chars and, as we have seen, Char and Byte are almost the same thing.

Modern machines do not work internally one Byte at a time; they handle groups of several Bytes at once. They are designed to operate on these larger blocks.

The size of the word is defined by the processor’s architecture and varies from one machine to another. In most modern computers, a word consists of 32 bits or 64 bits.

For example, a 32-bit machine works with blocks of 4 Bytes at a time, whether to perform calculations or to store memory addresses. A 64-bit machine works with blocks of 8 Bytes.
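If you are curious about your own machine, the sketch below asks Python for the size of a native pointer, which is a reasonable hint of the word size. Keep in mind that it reports the size used by the Python build you are running, which is not necessarily that of the processor, and that it assumes 8-bit Bytes.

```python
import struct

# "P" is the struct format code for a native pointer (void *).
pointer_bytes = struct.calcsize("P")
pointer_bits = pointer_bytes * 8  # assumption: 8-bit bytes

print(f"{pointer_bytes} bytes = {pointer_bits} bits")

# Largest unsigned value that fits in a 32-bit and in a 64-bit word.
print(2 ** 32 - 1)  # 4294967295
print(2 ** 64 - 1)  # 18446744073709551615
```

On a typical 64-bit installation the first line reports 8 bytes, and on a 32-bit one it reports 4.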

However, unless we are working at a very low level, we generally won’t have to worry too much about it. Still, it doesn’t hurt to know the term.

MSB and LSB

Two terms you will sometimes come across are MSB and LSB. They are common acronyms in the context of binary data representation, such as in digital and computing systems.

In this context, the “significance” of a bit is its “weight” in the number. The bits to the left have greater “weight” because they correspond to higher powers of 2.

  • MSB (Most Significant Bit): the bit with the greatest weight in a binary number. It is the first bit on the left.
  • LSB (Least Significant Bit): the bit with the smallest weight in a binary number. It is the last bit on the right.

For example, in the 8-bit binary number 11010110, the most significant bit (MSB) is the 1 at the far left, while the least significant bit (LSB) is the 0 at the far right.
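The same example can be checked with a couple of bit operations. This is only a sketch, and it assumes the number is treated as exactly 8 bits wide; the variable names are invented for illustration.

```python
value = 0b11010110
width = 8  # assumption: we treat the number as exactly 8 bits wide

lsb = value & 1                   # bit with weight 2**0
msb = (value >> (width - 1)) & 1  # bit with weight 2**7

print(msb)  # 1
print(lsb)  # 0

# Weight of each position, leftmost first: 128, 64, 32, 16, 8, 4, 2, 1.
weights = [2 ** position for position in range(width - 1, -1, -1)]
print(weights[0], weights[-1])  # 128 1
```

The MSB carries a weight of 128 and the LSB a weight of 1, which is exactly what “most” and “least” significant mean.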

You will often find the terms MSB and LSB in the literature. They are a more precise and rigorous way to refer to the “leftmost bit” and the “rightmost bit”. They also protect us from the possibility that, for whatever reason, someone decided to store a binary number “backwards”.

However, in this course, I will continue to say “bit on the left” and “bit on the right” because, in my opinion, it is easier to understand. But if you come across these terms, now you know what they mean.