Literals and Special Characters in Regex

The first step in working with Regex is to learn the two simplest kinds of pattern: literals and special characters.

Let’s see them 👇

What are literals in Regex

Literals are simply exact patterns we search for in a text string (meaning the same characters, in the same order).

When we write a literal in a regular expression, we are indicating that we want to find exactly that text in the text we are evaluating.

For example, if we want to find the word "hello" in a text string, we can write:

hello

This matches every exact occurrence of the character sequence "hello" in the text, even when it is part of a longer word. Let’s see it in action.

hello, how are you?

Hello, world!

hello123 and hello world

In the example, we can observe that:

  • In the first line, the pattern matches the "hello" at the start of the string.
  • In the second line, there is no match, because the literal "hello" does not match "Hello" with a capital "H".
  • In the third line, it matches twice: first within "hello123" and then in "hello world".
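
The same literal search can be reproduced in code. The following is a minimal sketch, assuming Python and its standard `re` module (the article itself does not prescribe a language), with three sample lines modeled on the ones above:

```python
import re

# Sample lines modeled on the example above (an assumption of this sketch).
lines = [
    "hello, how are you?",
    "Hello, world!",
    "hello123 and hello world",
]

# A literal pattern: the characters h, e, l, l, o in exactly that order.
pattern = re.compile(r"hello")

# Count the non-overlapping matches found on each line.
counts = [len(pattern.findall(line)) for line in lines]
print(counts)  # [1, 0, 2]
```

The second count is 0 because the literal is case-sensitive, and the third is 2 because a literal also matches inside longer words such as "hello123".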

If we wanted the search to ignore case, we would use a modifier (we’ll see this in its own article).

Literals are the simplest searches. Things get interesting when they are combined with special characters and quantifiers.

What are special characters in Regex

Special characters are those that can stand for any one of a whole set of characters (for example, any letter or any digit).

These characters allow you to create more complex patterns than simply searching for literals.

Below are some of the most commonly used special characters in Regex:

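Symbol   Matches
.        Any character (except a line break, by default)
\w       Any alphanumeric character or underscore
\W       Any character that is not alphanumeric or an underscore
\d       Any digit
\D       Any character that is not a digit

Alphanumeric means letter or number, i.e., a-z, A-Z, or 0-9. Note that \w also matches the underscore (_).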

Let’s see it with an example.

abc123 123!@#

Date: 2023-09-15

There are 25 people here

  • Using \w finds all alphanumeric characters, while \W finds the non-alphanumeric ones (spaces, punctuation and symbols).
  • On the other hand, \d finds digits, and \D finds anything that is not a digit.
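
To make the difference between the four classes concrete, here is a small sketch, again assuming Python's standard `re` module, that runs each class against the first sample line and collects the characters it matches:

```python
import re

text = "abc123 123!@#"

# Each call returns a list with one entry per matched character.
word_chars = "".join(re.findall(r"\w", text))
non_word_chars = "".join(re.findall(r"\W", text))
digits = "".join(re.findall(r"\d", text))
non_digits = "".join(re.findall(r"\D", text))

print(repr(word_chars))      # 'abc123123'
print(repr(non_word_chars))  # ' !@#'
print(repr(digits))          # '123123'
print(repr(non_digits))      # 'abc !@#'
```

Note that in Python 3, `\w` and `\d` applied to ordinary strings are Unicode-aware by default, so they can also match letters and digits outside the a-z, A-Z and 0-9 ranges. Other Regex engines may behave differently.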

Try each of \w, \W, \d and \D against the three sample lines above.

Escaping special characters

If we want to use a special character as a literal, we need to escape it using the backslash (\).

Escaping means treating a special character as if it were a normal literal, removing its special meaning.

For example, if we want to search for a literal period (.) in a text string, we cannot simply write . because in Regex a period means “any character”.

So we must escape it, like this:

\.

Let’s see it in an example.

file.txt

There is no period here

Version 1.0.3

In this example:

  • In the first line, it matches only the "." character that appears in "file.txt".
  • In the second line, there are no matches, since the text contains no period.
  • In the third line, the pattern matches both periods in "1.0.3".
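
The effect of escaping can be checked with a short sketch, assuming Python's standard `re` module and three sample lines modeled on the ones above. It counts the matches of the escaped period and, for contrast, of the unescaped one:

```python
import re

# Sample lines modeled on the example above (an assumption of this sketch).
lines = ["file.txt", "There is no period here", "Version 1.0.3"]

# Escaped: \. matches only a literal period.
escaped = [len(re.findall(r"\.", line)) for line in lines]

# Unescaped: . matches every character on these single-line strings.
unescaped = [len(re.findall(r".", line)) for line in lines]

print(escaped)  # [1, 0, 2]
print(unescaped == [len(line) for line in lines])  # True
```

When the text to search for comes from a variable, Python's `re.escape` can add the backslashes for you. For instance, `re.escape(".")` returns the two-character string `\.`.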