Compiled, Interpreted, and Semi-Interpreted Languages

When we program, we write code in languages like C++, Python, or C#. It’s useful for people 🙋‍♀️, because we understand it (or at least, we try).

But your computer has no idea what you’re saying. As we’ve seen, the processor only understands zeros and ones. That is, machine code.

Therefore, we need “something” to translate our human-readable text (written in a high-level language) into instructions the machine can execute.

Depending on how and when this translation is done, we classify languages as Compiled, Interpreted, or Semi-interpreted.

Differences and Comparisons

The category a language falls into gives it a set of inherent characteristics that shape part of its behavior.

Let’s look at a summary table of some of them:

| Feature | Compiled | Semi-interpreted (JIT) | Interpreted |
|---|---|---|---|
| Execution speed | Very fast (native) | Fast (close to compiled) | Limited (line-by-line interpretation) |
| Portability | Low (requires recompilation) | High (bytecode) | Universal (portable source code) |
| Translation phase | Prior compilation (AOT) | JIT compilation at runtime | Real-time interpretation |
| Examples | C, C++, Rust | Java, C# | Python, JavaScript |

This classification is not rigid. Nowadays, the boundaries are blurry, and many modern languages use hybrid techniques to get the best of both worlds.

Now let’s try to understand why they have different characteristics by looking under the hood of each one 👇.

Compiled Languages

In compiled languages, such as C, C++, Rust, or Go, the translation is done in full before the program runs.

In these languages, we write the source code and then use a program called a Compiler. The compiler analyzes all our code, looks for errors, and if everything is okay, generates a binary file (the famous .exe on Windows, for example).

This resulting file contains instructions in pure machine code, ready to be executed by the processor.

  • Performance: Since it’s already translated, the program runs at full speed. The processor doesn’t waste time translating while executing.
  • Optimization: The compiler can analyze the entire code and perform complex optimizations before generating the binary.
  • Less flexible: If you change a line of code, you have to recompile (at least the affected files) and rebuild the executable before you can run it again.
  • Platform dependency: An executable compiled for Windows doesn’t work on Linux, and one compiled for an Intel processor doesn’t work on an ARM chip (like the one in a mobile phone). You need to compile a version for each operating system and architecture.
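
To make the last two ideas concrete, here is a toy sketch of ahead-of-time translation. Everything in it is made up for illustration: the mini-language (PUSH/ADD), the two “CPUs” and their opcode numbers, and the function names. It only tries to show that the whole program is checked and translated before it runs, and that the result is tied to one target.

```python
# Toy model of ahead-of-time compilation. The two "CPUs" and their opcode
# numbers are invented for illustration; real machine code is far richer.
TARGETS = {
    "cpu_a": {"PUSH": 0x01, "ADD": 0x02},
    "cpu_b": {"PUSH": 0x10, "ADD": 0x20},
}


class CompileError(Exception):
    """Raised when the source cannot be translated."""


def compile_program(source, target):
    """Check and translate the whole program for one target before it runs."""
    opcodes = TARGETS[target]
    binary = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        words = line.split()
        if not words:
            continue  # ignore blank lines
        name, args = words[0], words[1:]
        if name not in opcodes:
            raise CompileError(f"line {lineno}: unknown instruction {name!r}")
        if name == "PUSH" and (len(args) != 1 or not args[0].isdigit()):
            raise CompileError(f"line {lineno}: PUSH needs one number")
        binary.append((opcodes[name], int(args[0]) if args else 0))
    return binary


def run_on(cpu, binary):
    """Execute a binary on a toy CPU; opcodes from another target are rejected."""
    decode = {code: name for name, code in TARGETS[cpu].items()}
    stack = []
    for opcode, arg in binary:
        if opcode not in decode:
            raise RuntimeError(f"{cpu} cannot execute opcode {opcode:#x}")
        if decode[opcode] == "PUSH":
            stack.append(arg)
        else:  # ADD
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
    return stack[-1]


source = "PUSH 2\nPUSH 3\nADD"
binary = compile_program(source, "cpu_a")
print(run_on("cpu_a", binary))  # 5
```

Calling run_on("cpu_b", binary) raises an error, because that binary was produced for the other target. In the same spirit, a typo such as MUL is rejected by compile_program before a single instruction runs.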

Interpreted Languages

At the other extreme, we have interpreted languages, such as Python, PHP, or (originally) JavaScript. Here there is no separate translation step before running the program.

Instead, we have a program called an Interpreter. The interpreter reads our source code statement by statement and carries out each instruction directly at that very moment, without ever producing a standalone machine-code file.

  • Portability: You can take your source code to any machine. If it has the interpreter installed, it will work.
  • Fast development: You make a change, save, and run. There’s no compilation wait time.
  • Lower performance: The computer has to analyze the code and execute it at the same time, every time the program runs, so interpreted programs are usually slower than compiled ones.
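
As a rough sketch of that read-and-execute loop, here is a tiny interpreter for a made-up mini-language (SET and PRINT are hypothetical instructions, as is the interpret function). Note that nothing is checked in advance: a bad line is only discovered when execution reaches it.

```python
def interpret(source, output):
    """Read one line, execute it immediately, then move on to the next one."""
    variables = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        words = line.split()
        if not words:
            continue  # ignore blank lines
        if words[0] == "SET" and len(words) == 3:      # SET name value
            variables[words[1]] = int(words[2])
        elif words[0] == "PRINT" and len(words) == 2:  # PRINT name
            output.append(variables[words[1]])
        else:
            # The problem only shows up when execution reaches this line.
            raise RuntimeError(f"line {lineno}: cannot execute {line!r}")


program = "SET x 4\nPRINT x\nJUMP 9\nPRINT x"
printed = []
try:
    interpret(program, printed)
except RuntimeError as error:
    print("stopped:", error)
print(printed)  # [4]
```

The first two lines have already run (and produced output) by the time the invalid third line is found, which is the trade-off for skipping a separate compilation step.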

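Python (in its standard implementation, CPython) technically compiles source code to bytecode transparently, caching it in .pyc files for imported modules, and then interprets that bytecode. For practical purposes it behaves as interpreted.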
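
You can peek at this hidden step with the standard library. The snippet below is a small sketch that uses the built-in compile() and the dis module; the variable names and the "<demo>" filename are arbitrary, and the exact instruction listing that dis prints varies between Python versions.

```python
import dis

source = "total = 2 + 3"

# Source text -> code object holding bytecode (nothing has run yet).
code = compile(source, "<demo>", "exec")

print(type(code.co_code))  # the raw bytecode is a bytes object
dis.dis(code)              # human-readable listing of the instructions

# Now the bytecode is actually executed.
namespace = {}
exec(code, namespace)
print(namespace["total"])  # 5
```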

JavaScript uses very powerful JIT engines (like V8 in Chrome), so nowadays it is semi-interpreted.

Semi-Interpreted (Hybrid) Languages

This is where the boundary becomes blurry. Languages like Java or C# sought a middle ground to have portability and good performance.

The process has two steps:

Compilation to Bytecode: Our source code is compiled, not to machine code, but to an intermediate language (bytecode in Java, CIL in .NET).

The Virtual Machine: This intermediate code runs on a “Virtual Machine” (the JVM or the CLR) installed on the user’s computer. The Virtual Machine translates it to machine code while the program runs, typically with a Just-In-Time (JIT) compiler.

This intermediate code is universal. The same compiled Java file works on Windows, Linux, or Mac, as long as they have the Virtual Machine installed.
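
The two steps can be sketched with a toy example. This is not how the JVM or the CLR work internally: the instruction names (PUSH, ADD, SUB, MUL) and both functions are invented here, and the “virtual machine” is just a small loop. The point is only that the compiled output is a neutral list of instructions that any host with the VM can run.

```python
import ast

# Hypothetical instruction names for a tiny stack-based bytecode.
OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_to_bytecode(expression):
    """Step 1: translate an arithmetic expression into stack instructions."""
    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return [("PUSH", node.value)]
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return emit(node.left) + emit(node.right) + [(OPS[type(node.op)], None)]
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    return emit(ast.parse(expression, mode="eval").body)


def run_vm(bytecode):
    """Step 2: the 'virtual machine' executes the instructions one by one."""
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            else:  # MUL
                stack.append(left * right)
    return stack[-1]


bytecode = compile_to_bytecode("2 + 3 * 4")
print(bytecode)
print(run_vm(bytecode))  # 14
```

The bytecode list contains nothing specific to Windows, Linux, or any processor. Only run_vm would need to exist on each platform, which is the role the JVM and the CLR play.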

  • Power/Flexibility Balance: They combine very high execution speed (close to native thanks to JIT) with the huge advantage of Bytecode portability.
  • Dynamic optimizations: By compiling in real-time, the JIT can perform intelligent optimizations based on how the program is actually being used at that specific moment, information a static compiler (AOT) doesn’t have unless it is fed profiling data from earlier runs.
  • Startup Time (Warm-up): The program usually takes a bit longer to start than a natively compiled one, as it must load the Virtual Machine and start “warming up” (compiling) the first instructions.
  • Higher memory consumption: They have a larger memory footprint (overhead), as the computer must load the program, the Virtual Machine, and the JIT compiler itself simultaneously.
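
The warm-up idea can be imitated in a few lines. The class below is only an analogy, not a real JIT: ToyJit, its threshold, and the use of Python’s built-in compile() as the “compiler” are all assumptions made for this sketch. It re-translates the expression on every call while the code is “cold”, and once the call count passes the threshold it compiles once and reuses the result.

```python
class ToyJit:
    """Evaluates an expression slowly until it is 'hot', then uses a compiled form."""

    def __init__(self, expression, threshold=3):
        self.expression = expression
        self.threshold = threshold
        self.calls = 0
        self.compiled = None  # filled in once the expression becomes hot

    def run(self, x):
        self.calls += 1
        if self.compiled is None and self.calls > self.threshold:
            # Hot path detected: pay a one-time compilation cost.
            self.compiled = compile(self.expression, "<toy-jit>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {"x": x})
        # Cold path: the source text is parsed again on every call.
        return eval(self.expression, {"x": x})


jit = ToyJit("x * 2 + 1")
results = [jit.run(i) for i in range(5)]
print(results)                   # [1, 3, 5, 7, 9]
print(jit.compiled is not None)  # True
```

The results are the same before and after the switch; only the cost per call changes. The object also keeps both the source and the compiled form in memory, a small-scale version of the extra footprint mentioned above.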