In this post we are going to look at some **simple checksum functions that we can use in small processors** such as Arduino to check the integrity of data.

When working with communication systems or stored data, there is the possibility that the data becomes corrupted. That is, one or more bytes of the sequence may change or even be completely lost.

We will often need to verify the integrity of the data before taking action. For example, the consequences of moving a robot based on incorrect orders received through wireless communication could be disastrous and potentially dangerous.

To do this, we can add a checksum, **a small value calculated from a larger block of data** whose purpose is to detect errors introduced into the sequence.

In a communication or storage system, the checksum is sent/received or stored along with the data. When processing the data, the checksum is recalculated and **compared with the available checksum** to verify the integrity of the data.

There are multiple checksum functions, each with different error detection rates and calculation requirements. On the other hand, the longer the checksum, the more reliable it is, but the greater the additional communication/storage needs to store the checksum.

Finally, **no system is infallible**. Even if you send the same message 100 times (which would be a kind of huge and inefficient “checksum”), you cannot guarantee that the same error occurred in all 100 transmission batches. As almost always, it’s a matter of **reaching a compromise between reliability, length, and computational load**.

We will implement some of the main checksum functions that are light enough to be executed quickly on a microprocessor like Arduino.

## Checksum XOR

The XOR checksum, also known as LCR (Longitudinal Redundancy Check), is **widely used due to its simplicity and speed of calculation**. It is calculated by processing the data sequence in N-bit fragments and performing the XOR operation on all the fragments.

Usually, the fragments are chosen to be multiples of 8 because the calculations are greatly accelerated, as processors are designed to perform actions on blocks of bytes.

Often, the checksum is returned negated. The reason is to differentiate a complete communication failure, which causes received data to be all 0, from a data transmission consisting of 0’s.

The following code shows an 8-bit XOR checksum.

```
uint8_t XORChecksum8(const byte *data, size_t dataLength)
{
uint8_t value = 0;
for (size_t i = 0; i < dataLength; i++)
{
value ^= (uint8_t)data[i];
}
return ~value;
}
```

The code for a 16-bit XOR checksum is as follows.

```
uint16_t XORChecksum16(const byte *data, size_t dataLength)
{
uint16_t value = 0;
for (size_t i = 0; i < dataLength / 2; i++)
{
value ^= data[2*i] + (data[2*i+1] << 8);
}
if(dataLength%2) value ^= data[dataLength - 1];
return ~value;
}
```

As we say, the XOR checksum is simple and quick to execute, which is why it is very widespread. However, **the error detection rate is quite limited**.

Due to the operation of the XOR function, in reality **the XOR checksum acts as a parity bit for each of the columns** (bits) of the checksum. Therefore, it is only able to detect an odd number of bit changes in each column. As a particular case, **it is not capable of detecting the receipt of two packets in the wrong order**.

The average failure rate is 1/k, where K is the number of bits in the checksum, which means an average failure rate of 12.5% for 8 bits and 6.25% for 16 bits.

## Additive Checksum

Another family of widely used checksums is the additive checksum families. Additive checksums also divide the data into N-bit fragments and perform the accumulated sum of the blocks.

Depending on how the sum of the received bytes is performed, we have different types of additive checksums.

The simplest case is to use the two’s complement sum, in which we simply ignore the carry-overs and let the checksum restart in case of overflow.

Here is an example of an 8-bit two’s complement checksum.

```
uint8_t TwoComplementChecksum8(const byte *data, size_t dataLength)
{
uint8_t value = 0;
for (size_t i = 0; i < dataLength; i++)
{
value += (uint8_t)data[i];
}
return ~value;
}
```

And here is a 16-bit two’s complement checksum.

```
uint16_t TwoComplementChecksum16(const byte *data, size_t dataLength)
{
uint16_t value = 0;
for (size_t i = 0; i < dataLength / 2; i++)
{
value += data[2*i] + (data[2*i+1] << 8);
}
if(dataLength%2) value += data[dataLength - 1];
return value;
}
```

This implementation is simple, but it poses a greater risk of failure in the most significant bytes (MSB) because the carry information does not affect the checksum calculation.

An alternative is to use the one’s complement sum as a sum function. In a processor with two’s complement calculation (the majority), the one’s complement is simulated by using a larger checksum, and finally adding the carry bits to the final value until the carry disappears.

This slightly improves reliability at the cost of a small increase in calculation cost.

```
uint8_t OneComplementChecksum8(const byte *data, size_t dataLength)
{
uint16_t value = 0;
for (size_t i = 0; i < dataLength; i++)
{
value += (uint8_t)data[i];
}
while (sum>>8)
sum = (sum & 0xF) + (sum >> 8);
return (uint8_t)~sum;
}
```

```
uint16_t OneComplementChecksum16(const byte *data, size_t dataLength)
{
uint32_t value = 0;
for (size_t i = 0; i < dataLength / 2; i++)
{
value += data[2*i] + (data[2*i+1] << 8);
}
if(dataLength%2) value += data[dataLength - 1];
while (sum>>16)
sum = (sum & 0xFF) + (sum >> 16);
return sum;
}
```

Additive checksums **exhibit better behavior than XORs**, as bit changes have effects on other columns thanks to the carry bits, while maintaining high efficiency compared to the XOR case (in some cases identical).

The average failure rate for the checksum with one’s complement sum is half (1/2K) that of XOR with the same number of bits. Equivalently, on average, an additive checksum is equally efficient as a XOR of double the bits.

The main cause of failure for the additive checksum is a simultaneous modification of two bits in the same column, whose value is different (0-1 1-0), while the XOR checksum fails in the simultaneous modification of two bits in the same column in all cases (0-0 0-1 1-0 1-1).

That is, the average failure rate for one’s complement is 6.25% for an 8-bit checksum, and 3.125% for 16 bits. In the case of two’s complement, it is slightly worse, approximately 7% for 8 bits and 3.3 for 16 bits.

However, they share a major weakness, **they are not capable of detecting when two entire blocks arrive in the wrong order** as the sum remains unchanged.

Finally, additive checksums exhibit worse behavior when the distribution of 0’s and 1’s is uniform (which penalizes it in benchmarks with random numbers). The best behavior is observed with data that has a majority distribution of 0’s.

## Chechsum Fletcher16

This verification algorithm was developed by John G. Fletcher (1934–2012) at the Lawrence Livermore Labs in the 1970s.

Fletcher’s checksum requires at least 16 bits and **is formed by the concatenation of two sums**. The first sum is simply the sum of the received bytes (with two’s complement or one’s complement). The second sum is the sum of the first sum at each step.

Here is a simple implementation of the Fletcher16 Checksum in 16 bits with two’s complement sum.

```
uint16_t ChecksumFletcher16(byte *data, size_t count )
{
uint8_t sum1 = 0;
uint8_t sum2 = 0;
for(size_t index = 0; index < count; ++index )
{
sum1 = sum1 + (uint8_t)data[index];
sum2 = sum2 + sum1;
}
return (sum2 << 8) | sum1;
}
```

And here is an implementation with one’s complement sum.

```
uint16_t ChecksumFletcher16(byte *data, size_t count )
{
uint16_t sum1 = 0;
uint16_t sum2 = 0;
for(size_t index = 0; index < count; ++index )
{
sum1 = sum1 + (uint8_t)data[index];
while (sum1>>8)
sum1 = (sum1 & 0xF) + (sum1 >> 8);
sum2 = sum2 + sum1;
while (sum2>>8)
sum2 = (sum2 & 0xF) + (sum2 >> 8);
}
return (sum2 << 8) | sum1;
}
```

The detection capability of the Fletcher function is **significantly superior to the XOR or additive equivalent** with the same number of bits (several orders of magnitude) .

Additionally, the Fletcher checksum **is capable of detecting complete exchanges of code fragments**. The reason is that the second sum accumulates each fragment once it appears in the sequence, until the end of the sequence, through the first sum.

If a fragment changes position in the data, the second sum is able to detect the error, even when the first sum is identical.

## CRC Checksum

CRC codes (Cyclic Redundancy Checks) are one of the most widely used checksums in the world of computing. Their calculation is based on **the calculation of a series of polynomials** whose formulation and order depend, among other things, on the number of bits in the checksum.

CRCs have better error detection rates than those presented here, but **their calculation represents a higher load** than the others. Although it is possible to speed it up using precalculated tables, the requirements are usually excessive to work smoothly on a small processor like Arduino.

Therefore, we will only see an example of the CRC8 checksum, based on the formulas distributed by the company Dallas/Maxim under the terms of the GNU GPL 3.0 license.

```
//CRC-8 - based on the CRC8 formulas by Dallas/Maxim
//code released under the therms of the GNU GPL 3.0 license
byte CRC8(const byte *data, size_t dataLength)
{
byte crc = 0x00;
while (dataLength--)
{
byte extract = *data++;
for (byte tempI = 8; tempI; tempI--)
{
byte sum = (crc ^ extract) & 0x01;
crc >>= 1;
if (sum)
{
crc ^= 0x8C;
}
extract >>= 1;
}
}
return crc;
}
```

**CRCs have very good error detection rates**. The reason is the greater mixing of bits that the application of polynomials entails. In general, their behavior is comparable to 1/(2^k), a reference limit in checksum analysis (understanding comparable as being able to be measured with it, not that it reaches it).

Like Fletcher checksums, **CRCs detect complete exchanges of fragments**. In general, the behavior of a CRC is slightly superior to the equivalent Fletcher with the same number of bits.

The main problem is, as we have mentioned, the higher calculation requirements. However, despite its simple formulation, the CRC8 checksum seen as an example has a very good fault detection capability, with moderate processor usage.

There are several libraries for Arduino that allow you to calculate CRC16. However, since the detection capability is superior, but not much higher than Fletcher16, my advice is to use this one for its greater simplicity and speed of calculation.

## Download the Code

All the code from this post is available for download on Github.