Spaces and Line Breaks in Regex

When working with regular expressions, we constantly run into spaces and line breaks, since they are part of almost any text.

So we need to learn how to handle them 👇.

Whitespace

Whitespace characters are blank characters that we use to separate words and other elements in text.

There are several types of whitespace in text, such as:

  • Simple space: " " (a single space character)
  • Tab: \t
  • Line endings: \r (carriage return) and \n (new line)

To match a single whitespace character, we can use the metacharacter \s, which covers spaces, tabs, and line breaks (and, in most regex engines, form feeds and vertical tabs as well).

If we want to find all occurrences of a word that is both preceded and followed by a whitespace character, we can use the following pattern:

\spalabra\s

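This pattern matches every occurrence of the word "palabra" that has exactly one whitespace character immediately before it and one immediately after it. Both whitespace characters are part of the match. To accept one or more whitespace characters on each side, use \s+palabra\s+ instead.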

palabra
aquí espacio palabra y más espacios
sin espacio delantepalabra y palabradetras
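
As a minimal sketch, the pattern can be checked with Python's re module. The choice of Python and the variable names are assumptions made for illustration, and the sample is treated as three lines joined with \n:

```python
import re

# Assumption: the three sample lines form one text, separated by "\n".
text = (
    "palabra\n"
    "aquí espacio palabra y más espacios\n"
    "sin espacio delantepalabra y palabradetras\n"
)

# \s requires exactly one whitespace character on each side of the word.
pattern = re.compile(r"\spalabra\s")

# Collect the matched text, including the surrounding whitespace.
matches = [m.group() for m in pattern.finditer(text)]
print(matches)
```

Only the occurrence on the second line is found. The first "palabra" has nothing in front of it, and on the third line the word is glued to other letters.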

Other symbols related to whitespace:

Symbol   Matches
\s       Any whitespace character (space, tab, line break)
\S       Any character that is NOT a whitespace character
\t       A tab character (tab)
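
To see the three symbols side by side, here is a small sketch in Python. The sample string and the variable names are made up for this example:

```python
import re

# Hypothetical sample: a tab, two spaces, and a trailing line break.
sample = "uno\tdos  tres\n"

whitespace = re.findall(r"\s", sample)  # each whitespace character, one by one
words = re.findall(r"\S+", sample)      # runs of non-whitespace characters
tabs = re.findall(r"\t", sample)        # tab characters only

print(whitespace, words, tabs)
```

Note that \S+ is a convenient way to pull out the "words" of a text without caring which kind of whitespace separates them.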

Line Breaks

Line breaks are characters that indicate the end of a line of text. Depending on the operating system, line breaks can be represented in different ways:

  • Unix/Linux (and modern macOS): \n
  • Windows: \r\n
  • Classic Mac OS: \r

To match a line break, we use \n or \r\n, depending on where the text comes from. When the origin is unknown, \r?\n matches both the Unix and the Windows form.

If we want to match a specific word only when it sits at the end of a line, immediately before the line break, we can use a pattern like the following:

palabra\n

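Let’s try it. Only the first line matches, because that is the only place where "palabra" is immediately followed by a line break: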

con salto linea palabra
otra palabra sin salto linea
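
The same test can be sketched in Python. The optional \r in the pattern is an addition for this example (it is not part of the pattern above) so that one pattern handles both Unix and Windows line endings; the variable names are hypothetical:

```python
import re

# The two sample lines, once with Unix and once with Windows line endings.
unix_text = "con salto linea palabra\notra palabra sin salto linea"
windows_text = "con salto linea palabra\r\notra palabra sin salto linea"

# \r? makes the carriage return optional, so both "\n" and "\r\n" are accepted.
pattern = re.compile(r"palabra\r?\n")

unix_matches = pattern.findall(unix_text)
windows_matches = pattern.findall(windows_text)

print(unix_matches, windows_matches)
```

In both texts only the "palabra" that ends the first line is matched, and the line break itself is included in the match.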

Other symbols related to breaks:

Symbol   Matches
\n       A line break
\r       A carriage return
\v       A vertical tab (vertical line break)
\f       A form feed (page break)
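
These escapes are handy when a text mixes line-ending conventions. A minimal sketch in Python, with an invented sample string, that splits on any of the three forms and also checks that \v and \f count as whitespace for \s:

```python
import re

# Hypothetical text mixing Windows (\r\n), classic Mac (\r) and Unix (\n) endings.
mixed = "uno\r\ndos\rtres\ncuatro"

# \r\n goes first in the alternation so a Windows ending is consumed as one unit
# instead of being split into two separate breaks.
lines = re.split(r"\r\n|\r|\n", mixed)

# In Python, \s also covers the vertical tab and the form feed.
vertical_is_space = re.fullmatch(r"\s+", "\v\f") is not None

print(lines, vertical_is_space)
```

The order of the alternatives matters: with \r listed before \r\n, a Windows line ending would produce an empty extra line.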

Combining Whitespace and Line Breaks

We can combine whitespace and line breaks to create more complex patterns.

For example, if we want to find a specific word together with any whitespace or line breaks that surround it, we can use:

\s*palabra\s*

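This pattern matches the word "palabra" along with any run of whitespace (spaces, tabs, or line breaks) before or after it. Because \s* also accepts zero characters, the word still matches when nothing surrounds it, even if it is glued to other letters.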
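
A short Python sketch of this behaviour, using made-up sample strings and variable names:

```python
import re

pattern = re.compile(r"\s*palabra\s*")

# Hypothetical text where the word is surrounded by line breaks, spaces and a tab.
text = "inicio\n   palabra \t\nfin"
matched = pattern.search(text).group()

# \s* may match zero characters, so a word glued to other letters still matches.
glued = pattern.search("delantepalabra").group()

print(repr(matched), repr(glued))
```

The first match swallows all the surrounding whitespace, so calling matched.strip() is a simple way to recover just the word.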