What is a Reference

Today we are going to talk about the concept of REFERENCE in programming. It’s a very important concept (and very simple if we understand it correctly).

However, it is one of the topics that confuses programmers the most, even those who have been programming for a long time (this is not clickbait, I assure you it’s true. The concept of REFERENCE is terribly misunderstood).

The use of references has important implications regarding aspects such as constant / immutable, value types / reference types, stack, heap, efficiency… and a gazillion other things. So it’s a good idea to try to understand it well.

Furthermore, it’s a concept that applies to all programming languages, whether you know it or not. While more modern languages hide part of the difficulty of managing references, in others like C++ or Java they are more visible.

So let’s get to the point. What is a REFERENCE? Let’s see what the dictionary says:

A reference is a relationship established between an expression and what it alludes to

Okaaay… we’re just as confused as before, thanks a lot dictionary. So let’s look at a practical case instead.

Imagine you are writing a book. At the end, you want to add a bibliography that mentions another book. Obviously, you don’t copy the entire book you want to mention, you simply add a citation. Or in other words, you reference other books 😉.

That is, a reference is an element that serves as a link to direct us to other information. But it does NOT contain the information, the information is somewhere else. The reference is just the link.

Yay, I managed to finish the introduction without saying that a reference is something that refers to something! Good for me! 👏

References in Computing

In computing, we have a very common and simple example that you use every day, and that is very useful for explaining what a reference is: file and folder shortcuts.

You have your favorite movie on your hard drive. The Dark Knight, the Avengers, Frozen. Or whatever, I don’t care. But it’s on your hard drive, with director’s commentary and 4k resolution, taking up its 20GB.

[Figure: Your movie]

Now you have a shortcut on your desktop. When you double-click, the movie opens. But the shortcut is not the movie. It’s just a link that takes you to the real movie.

[Figure: A shortcut to the movie]

In fact, you can have different shortcuts, all pointing to the same movie. There is only one movie, but you have different accesses from which you can reach it.

[Figure: Several shortcuts to the same movie]

Well, that’s a referencing mechanism. In this case, for your Avatar movie. Or Interstellar, Toy Story 3. Or whatever, we still don’t care.

Now, a typical beginner’s mistake. “Luis, I’ve sent you 218 movies in this email.” And you think… “218 movies? How can that fit in an email? And it only took him 2 seconds to send it?”

Obviously NOT. Your novice friend sent you only the shortcuts 🤦‍♂️. Shortcuts take up very little space, because they don’t contain the real movie. So he can send you thousands. But they are useless to you, because the information is not on your computer.

References in Programming

In programming, REFERENCES are variables that do not contain the data itself, but a link to the data, just as the shortcut didn’t contain the movie but only a way to reach it.
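To make the shortcut analogy concrete, here is a minimal sketch in Python (chosen only for illustration, the article is not tied to any language). The names `movie`, `shortcut` and `copy_of_movie` are made up for this example. In Python, a variable that holds a list is a reference to that list, so assigning it to another variable creates a second "shortcut", not a second list.

```python
# The actual data: a list object living somewhere in memory.
movie = ["The Dark Knight", "4K", "20GB"]

# A second variable that refers to the SAME list (no data is copied).
shortcut = movie

# A real copy: a new, independent list with the same contents.
copy_of_movie = list(movie)

# Changing the data through one reference is visible through the other.
shortcut.append("subtitles")

print(movie)                   # ['The Dark Knight', '4K', '20GB', 'subtitles']
print(shortcut is movie)       # True: two names, one list
print(copy_of_movie is movie)  # False: the copy is a different list
print(copy_of_movie)           # ['The Dark Knight', '4K', '20GB']
```

There is only one "movie" here, with two ways of reaching it. The copy behaves like a second file on disk: it does not see later changes to the original.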

Why would anyone think of inventing such a mess? Well, as you can imagine, it wasn’t for fun. Without references, computers couldn’t work.

Mainly for two reasons:

Copying or moving REFERENCES is much faster than working with the real data. Remember your novice friend who only copied the shortcuts, and not the movies. It’s much, much faster because references are very small.

If two parts of a program need to work on the same data, they can exchange references. Otherwise, each part would have to make its own copy, modify it, and return it.

References allow programs to improve their speed a lot, a whole lot. In fact, literally without references we wouldn’t have computers.
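Both points can be sketched with two small functions. This is an illustrative example in Python, and the function names `add_subtitles` and `add_subtitles_to_copy` are hypothetical. The first receives a reference and works directly on the caller's data. The second does what a program without references would have to do: make its own copy, modify it, and return it.

```python
def add_subtitles(film):
    """Modify the caller's list in place: only a reference is passed in."""
    film.append("subtitles")


def add_subtitles_to_copy(film):
    """Work on a private copy instead; the caller's list stays untouched."""
    private = list(film)          # copies every entry of the list
    private.append("subtitles")
    return private


movie = ["frame"] * 1_000_000     # stand-in for a large piece of data

add_subtitles(movie)                      # no copy made; `movie` itself grows
modified = add_subtitles_to_copy(movie)   # copies a million entries first

print(len(movie))     # 1000001
print(len(modified))  # 1000002
```

Passing `movie` to either function hands over only a reference. The cost in the second version comes from the explicit copy, which grows with the size of the data.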

The other reason to use REFERENCES is the need to use dynamic memory. Your program doesn’t always know in advance what data it will need. For example, it has to store numbers entered by the user… but we don’t know how many numbers they will enter.

Except for the simplest programs, most will need to reserve memory while they run. To do so, they start a process we call allocation, which involves the operating system and the allocator.

We won’t go into detail, but at the end of the process we have our reserved space in memory. However, we have no idea where that space is, so someone has to tell us. That is why allocation has to return a reference to us.

[Figure: Your variable is in room 0x23]

It’s like when we go to reserve a room in a hotel. We go to the reception, ask for a room, and when they finish they have to tell us the room number.
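The "numbers entered by the user" case might look like the following sketch in Python. The function `read_numbers` and the sample input are assumptions made for this example. Python hides the allocation itself, but the idea is the same: the list is created at run time, grows as needed, and what we get back is a reference to it.

```python
def read_numbers(lines):
    """Collect numbers until an empty line; the count is unknown in advance."""
    numbers = []                  # created at run time; grows as items arrive
    for line in lines:
        if line.strip() == "":
            break
        numbers.append(float(line))
    return numbers                # hands back a reference to the list


user_input = ["3", "1.5", "42", ""]   # stand-in for what a user might type
result = read_numbers(user_input)

print(result)           # [3.0, 1.5, 42.0]

# The "room number": in CPython, id() happens to be the object's memory
# address. The exact value differs on every run.
print(hex(id(result)))
```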

What is a Pointer (Advanced)

Many languages have the concept of POINTER. Feared by some, loved by others, and overrated by almost everyone, the term certainly leaves almost no one indifferent.

So, what is a pointer? A pointer is one of the simplest referencing mechanisms. It’s not the easiest, nor the hardest, nor the most powerful. It’s simply one of the simplest.

Basically, a pointer is a variable that contains the memory address where a value is stored. Pointers are called that because they “point” to a memory address.

Think of a laser pointer, the kind you use in a presentation with a whiteboard. What is a laser pointer? Something used to point. Well, the same thing, only this is a pointer to memory, something that points to a memory location.
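Python has no pointers of its own, but its standard `ctypes` module exposes C-style ones, which is enough for a small sketch. The variable names here are made up for the example. It shows the three things a pointer does: hold an address, let us read the value at that address, and let us change it.

```python
import ctypes

value = ctypes.c_int(42)            # a C int stored somewhere in memory
ptr = ctypes.pointer(value)         # a pointer holding the address of `value`

address = ctypes.addressof(value)   # the raw memory address, as an integer
print(hex(address))                 # differs on every run

print(ptr.contents.value)           # follow the pointer and read: 42

ptr.contents.value = 99             # write through the pointer
print(value.value)                  # 99: the original value changed
```

In languages like C or C++ the same idea is written with `&` and `*`. The sketch above only borrows `ctypes` to keep the example in one language.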

As I said, POINTERS are a mechanism for making REFERENCES. All pointers are references, but not all references are pointers. There are other mechanisms, some better and some worse.
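As one illustration of a reference that is not a pointer (my own example, the article does not name specific mechanisms), think of an ID that is looked up in a table. The names `library`, `favorite`, `watch_next` and `resolve` are hypothetical. The variables below hold neither the movie nor a memory address, yet they still lead us to the data.

```python
# A tiny "library": the data lives here, stored under an ID.
library = {
    101: "The Dark Knight",
    102: "Interstellar",
}

# These variables are references too: they hold an ID, not the movie
# and not a memory address.
favorite = 102
watch_next = 102


def resolve(movie_id):
    """Follow the reference: look up the real data by its ID."""
    return library[movie_id]


print(resolve(favorite))    # Interstellar
print(resolve(watch_next))  # Interstellar

# Updating the data once is visible through every reference to it.
library[102] = "Interstellar (4K)"
print(resolve(favorite))    # Interstellar (4K)
```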

Internally, probably all referencing systems use pointers to a greater or lesser extent. But in the end, that says very little. Yes, they all use memory addresses and bytes and so on… so what?

So I want to demystify the concept of a pointer. What matters much more is that you understand the abstract concept of a reference. Pointers are just one particular way of implementing a referencing system.