Control Characters in the Serial Port on Arduino

We continue to delve into advanced use of the serial port on processors like Arduino. In this post, we will see how to add frame delimiters and control characters to our transmission systems to make them more robust.

Previously, we saw how to send bytes over the serial port as a convenient and “professional” way to communicate. In the previous post, we also saw that we will frequently use one or more structures to define the messages we want to send or receive.

Now we want to expand the frame (the bytes we send in the communication) by surrounding the data bytes with a series of elements that increase the “quality” of the communication. One example is adding a checksum to verify the integrity of the data, something we will see in the next post.

Another example, which is what we will see in this post, is to add frame delimiters. That is, a certain “signal” or mark that identifies the beginning and end of the communication. While we’re at it, we’d also like to be able to add certain control characters that have a special meaning.

As usual, all of this has already been invented: these marks are called, precisely, control characters. In fact, we have been using them ever since the first post on serial communication, every time we use ‘\n’ (line feed) or ‘\r’ (carriage return).

Here is a list of some of the available control characters with their hexadecimal value and their meaning.

Name  Hex   Escape  Meaning
SOH   0x01  -       Start of Heading
STX   0x02  -       Start of Text
ETX   0x03  -       End of Text
EOT   0x04  -       End of Transmission
HT    0x09  \t      Horizontal Tabulation
LF    0x0A  \n      Line Feed
VT    0x0B  \v      Vertical Tabulation
FF    0x0C  \f      Form Feed
CR    0x0D  \r      Carriage Return
SO    0x0E  -       Shift Out
SI    0x0F  -       Shift In
DLE   0x10  -       Data Link Escape
DC1   0x11  -       Device Control One (XON)
DC2   0x12  -       Device Control Two
DC3   0x13  -       Device Control Three (XOFF)
DC4   0x14  -       Device Control Four
NAK   0x15  -       Negative Acknowledge
SYN   0x16  -       Synchronous Idle
ETB   0x17  -       End of Transmission Block
EM    0x19  -       End of Medium
FS    0x1C  -       File Separator
GS    0x1D  -       Group Separator
RS    0x1E  -       Record Separator
US    0x1F  -       Unit Separator

In particular, the conventional control characters for the start and end of a frame are, respectively, 0x02 (STX) and 0x03 (ETX). Of course, we are not required to use these characters. In fact, you will sometimes see code on the Internet that uses ‘H’ (for Header) to mark the start of a frame. Nothing prevents this, but since control characters exist, it is logical (and cleaner) to use the standard ones.

The operation is simple. When we send a frame, we start by sending the STX character and finish by sending ETX. We are increasing the size of the frame by two bytes in exchange for better communication quality. The relative increase in frame size becomes smaller the more data we send.
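To put numbers on the frame layout and its overhead, here is a minimal sketch in desktop C++ (not Arduino code, since it relies on std::vector). The names buildFrame and delimiterOverhead are hypothetical helpers introduced only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

const std::uint8_t STX = 0x02;  // start-of-frame delimiter
const std::uint8_t ETX = 0x03;  // end-of-frame delimiter

// Wraps a payload as STX + payload + ETX (hypothetical helper).
std::vector<std::uint8_t> buildFrame(const std::uint8_t* payload, std::size_t length)
{
  std::vector<std::uint8_t> frame;
  frame.reserve(length + 2);
  frame.push_back(STX);
  frame.insert(frame.end(), payload, payload + length);
  frame.push_back(ETX);
  return frame;
}

// Fraction of the frame taken up by the two delimiter bytes (hypothetical helper).
double delimiterOverhead(std::size_t payloadLength)
{
  return 2.0 / static_cast<double>(payloadLength + 2);
}
```

With a 2-byte payload the delimiters make up half of the frame, while with a 62-byte payload they account for 2 of 64 bytes, roughly 3%.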

Here is an example of sending an array of data with frame delimiters.

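const char STX = '\x02';
const char ETX = '\x03';

const int data[] = {0, 50, 100, 150, 200, 250};
const size_t dataLength = sizeof(data) / sizeof(data[0]);
const size_t bytesLength = dataLength * sizeof(data[0]);

void setup()
{
  Serial.begin(9600);  // 9600 baud assumed; must match the receiver

  Serial.write(STX);                             // start-of-frame delimiter
  Serial.write((const byte*)data, bytesLength);  // payload as raw bytes
  Serial.write(ETX);                             // end-of-frame delimiter
}

void loop()
{
  // Nothing to do: the frame is sent once in setup()
}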

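An example of a receiver would be as follows.

const char STX = '\x02';
const char ETX = '\x03';

const int dataLength = 6;  // must match the number of values the sender transmits
int data[dataLength];
const int bytesLength = dataLength * sizeof(data[0]);

void setup()
{
  Serial.begin(9600);  // 9600 baud assumed; must match the sender
}

void loop()
{
  // Wait until a whole frame (STX + payload + ETX) is buffered
  if (Serial.available() >= bytesLength + 2)
  {
    if (Serial.read() == STX)
    {
      Serial.readBytes((byte*)data, bytesLength);

      if (Serial.read() == ETX)
      {
        // The frame is correctly delimited: process data here
      }
    }
  }
}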

However, control characters are nothing more than bytes. How reliable are these delimiters? That is, could we confuse them with a data byte containing 0x02 or 0x03? And is it possible that, after losing bytes, we misinterpret a data byte as a delimiter?

Indeed, that is the case: no system is completely robust. Adding frame delimiters improves the system, but it does not make it infallible. In fact, we are not even checking the integrity of the data; we are only trying to check whether we maintain a certain degree of “synchronization”.
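As a small illustration of how ordinary data can look like a delimiter, the following sketch counts the payload bytes whose value matches STX or ETX. It is plain C++ and the function name countDelimiterLookalikes is a hypothetical helper, not part of any Arduino library.

```cpp
#include <cstddef>
#include <cstdint>

const std::uint8_t STX = 0x02;  // start-of-frame delimiter
const std::uint8_t ETX = 0x03;  // end-of-frame delimiter

// Counts the payload bytes whose value equals one of the delimiters
// (hypothetical helper, for illustration only).
std::size_t countDelimiterLookalikes(const void* payload, std::size_t length)
{
  const std::uint8_t* bytes = static_cast<const std::uint8_t*>(payload);
  std::size_t count = 0;
  for (std::size_t i = 0; i < length; ++i)
  {
    if (bytes[i] == STX || bytes[i] == ETX)
    {
      ++count;
    }
  }
  return count;
}
```

For instance, the 16-bit value 515 (0x0203) is stored as the bytes 0x02 and 0x03, in one order or the other, so a payload containing it carries both delimiter values.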

For the delimiters to fail, after losing one or more bytes, the byte received in the position where a delimiter is expected must happen to have the delimiter's value. In an environment with many transmission errors, delimiters alone will not be enough to filter out every corrupted frame.

It may seem unlikely, but if we assume the stray byte can take any value with equal probability, the chance of mistaking it for a given control code is 1/256. The combined probability of misinterpreting both the start and the end of the message at the same time is 1/65,536 (1/256 × 1/256).

The real advantage, however, is that delimiters provide a certain capacity for “resynchronization”. In a “normal” environment, when data is occasionally lost, the system can detect the failure and eventually recover synchronization.
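One way to picture this resynchronization is a small state machine that is fed one byte at a time. The sketch below is written in portable C++ under several assumptions: a fixed payload size (PAYLOAD_LENGTH), a hypothetical class name FrameReceiver, and the design choice of treating an STX found where ETX was expected as the start of a new frame. It is an illustration, not the implementation used in the sketches above.

```cpp
#include <cstddef>
#include <cstdint>

const std::uint8_t STX = 0x02;          // start-of-frame delimiter
const std::uint8_t ETX = 0x03;          // end-of-frame delimiter
const std::size_t PAYLOAD_LENGTH = 4;   // assumed fixed payload size

// Hypothetical byte-by-byte frame receiver.
class FrameReceiver
{
public:
  // Feeds one received byte. Returns true when a correctly delimited
  // frame has just been completed; the payload is then in payload().
  bool feed(std::uint8_t b)
  {
    switch (state_)
    {
      case State::WaitStart:
        // Discard everything until a start delimiter shows up
        if (b == STX)
        {
          count_ = 0;
          state_ = State::ReadPayload;
        }
        return false;

      case State::ReadPayload:
        buffer_[count_++] = b;
        if (count_ == PAYLOAD_LENGTH)
        {
          state_ = State::WaitEnd;
        }
        return false;

      case State::WaitEnd:
        if (b == ETX)
        {
          state_ = State::WaitStart;
          return true;  // frame accepted
        }
        if (b == STX)
        {
          // Assumption: bytes were lost, so treat this as a new frame start
          count_ = 0;
          state_ = State::ReadPayload;
        }
        else
        {
          state_ = State::WaitStart;  // drop the frame and hunt for STX again
        }
        return false;
    }
    return false;
  }

  const std::uint8_t* payload() const { return buffer_; }

private:
  enum class State { WaitStart, ReadPayload, WaitEnd };
  State state_ = State::WaitStart;
  std::uint8_t buffer_[PAYLOAD_LENGTH] = {};
  std::size_t count_ = 0;
};
```

On an Arduino, the same logic could be driven by calling feed(Serial.read()) while Serial.available() is greater than zero. Note that a receiver like this can still be fooled when the payload itself contains delimiter values, which is the limitation discussed above.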

Of course, we can greatly improve the transmission process by adding a timeout or a checksum. We will see all of this in upcoming posts.

Download the code

All the code from this post is available for download on GitHub.