Introduction to encryption for embedded Linux developers


This article is an introduction to encryption for embedded Linux developers.

It’s the first article in a series about how to use encryption on embedded Linux devices.

In this first part, I will cover the basic concepts, including an introduction to security, confidentiality and encryption, the main motivations for using encryption and how it works, the types of encryption (symmetric-key and asymmetric-key), the most commonly used ciphers and the trade-offs between them.

Although my focus here is embedded Linux, the concepts covered in this series of articles can be applied to all kinds of projects and might be useful for everyone dealing with encryption on embedded devices.

If you prefer a one-hour talk instead of reading this article, you can also watch the talk “Introduction to encryption for embedded Linux developers” presented at Embedded Online Conference 2021.

Now let’s start with some introduction about security and encryption…

Security and encryption

Security is all about risk mitigation. You have something you want to protect (let’s call it an asset) because it brings you value. If the value of this asset is higher than the cost of protecting it, you will probably spend some resources to mitigate the risk of this asset being compromised. Encryption is just one of the available risk mitigation techniques.

In the security field, encryption solves the confidentiality problem. According to Wikipedia, confidentiality is “the property that information is not made available or disclosed to unauthorized individuals, entities, or processes”.

But why should we care about confidentiality on embedded devices? Well, for a few reasons…

Confidentiality on embedded devices

We might want to protect the device from being counterfeited (e.g., IP protection). With the tools we have today, it’s very easy to dump the firmware from a storage device (disk, flash, etc). Nothing is impossible, but encrypting the firmware will make it much harder for an attacker to access it (as we will see later on, that actually depends on how securely the encryption key is stored on the device).

By the way, if you want to learn a few techniques on how to dump firmware from electronic devices, you might want to read the article “Extracting firmware from devices using JTAG”.

Another good motivation to use encryption is to protect sensitive information stored in the device (runtime data, user data, etc), due to privacy or other reasons. Indeed, depending on the product we are developing (e.g. a medical device), there are standards or regulations we need to follow that make data confidentiality a mandatory requirement.

An additional reason would be to prevent a threat actor from extracting the code for reverse engineering purposes. Of course, encrypting the code doesn’t fix any vulnerabilities in it, but it complicates the life of an attacker who wants to search for (and exploit) vulnerabilities in the firmware.

And these are just a few reasons we might want to care about code and data confidentiality on an embedded Linux device.

Code and data confidentiality

Code confidentiality is about protecting the code. The code can be a custom bare-metal firmware from a microcontroller-based project or a complete operating system from a microprocessor-based project.

On an embedded Linux system, you might want to protect the bootloader, the Linux kernel, the root filesystem, or any other specific partition where your applications are stored. Although it’s not as common to care about code confidentiality as it is with data, this might be a requirement for some types of products.

When using open-source software, be careful with software licenses when encrypting the code. After all, why encrypt GPL-licensed code if you will have to release its source anyway? Also, GPLv3 is particularly problematic here because of its anti-Tivoization provisions.

Ensuring code confidentiality usually boils down to encrypting only your own applications (aka your intellectual property).

What about data confidentiality?

Data confidentiality is about protecting the data. And there are three different kinds of data you may want to protect.

  • You may want to protect data stored physically in the device, in any digital form (we usually call it data at rest).
  • You may also want to protect data going in/out of the device, that flows over some kind of network (we usually call it data in transit, data in motion or data in flight).
  • You may also want to protect data stored in a non-persistent storage (RAM, CPU caches, CPU registers, etc) while it is being processed (we usually call it data in use).

Now that we know what code and data confidentiality are, how do we solve this problem?

How to solve the confidentiality problem?

Several techniques can be applied to the design of an embedded device to solve the confidentiality problem, including authorization, tamper protection and encryption.

Authorization is about controlling who has access to information via some mechanism of authentication. For example, to protect the access to data stored in the device, you might provide an authentication mechanism where users need to type their name and password to have access to the data.

Tamper protection (anti-tampering) is usually a physical protection against unauthorized access. For example, to protect the code, you could design a device in a way that, when it is opened, the firmware is automatically erased.

Another technique is, of course, encryption, a process that renders data unreadable to anyone except those who have access to some secret (password, key, etc). And that is our focus from now on!

Encryption in a nutshell

Encryption is the process of encoding information in a format that cannot be read or understood by an eavesdropper.

It uses an algorithm (cipher) to convert the original representation of the information (plaintext) to an alternative form (ciphertext). A secret (key) is used in the process, so only authorized parties that know or have access to this secret can convert the ciphertext back to plaintext and access the original information.

One of the simplest and oldest encryption techniques is the Caesar cipher, where you basically shift each letter of the plaintext by n positions in the alphabet to generate the ciphertext.

Let’s say you want to encrypt EMBEDDED:

Cleartext | Cipher        | Key | Ciphertext
EMBEDDED  | Caesar cipher | 3   | HPEHGGHG

In this example, we are using the Caesar cipher and our key (or secret) is 3. That means the ciphertext is generated by shifting each character of the cleartext 3 positions forward in the alphabet.

So the key here is the secret - pun intended! :-)
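The example above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name caesar_shift and the choice to handle uppercase letters only are assumptions made for illustration.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text: str, key: int) -> str:
    """Shift each uppercase letter in text by key positions, wrapping around after Z."""
    shifted = []
    for char in text:
        if char in ALPHABET:
            index = (ALPHABET.index(char) + key) % len(ALPHABET)
            shifted.append(ALPHABET[index])
        else:
            shifted.append(char)  # leave anything that is not A-Z untouched
    return "".join(shifted)


ciphertext = caesar_shift("EMBEDDED", 3)
print(ciphertext)                    # HPEHGGHG
print(caesar_shift(ciphertext, -3))  # EMBEDDED

# There are only 25 useful keys, so an attacker can simply try all of them.
candidates = [caesar_shift(ciphertext, -key) for key in range(1, 26)]
print("EMBEDDED" in candidates)      # True
```

Decryption is just a shift in the opposite direction with the same key. The last lines also show why the Caesar cipher offers no real protection: the key space is so small that trying every key is trivial.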

Keys and encryption

The process of encrypting and decrypting messages involves keys. And there are two main types of encryption keys in cryptographic systems: symmetric keys and asymmetric keys.

In symmetric-key schemes, the same key (usually called a secret key or shared key) is used for encryption and decryption.

Symmetric Key Encryption
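To make the "same key in both directions" idea concrete, here is a toy sketch in Python that uses only the standard library. It is not one of the real ciphers discussed in this article: the keystream and xor_cipher functions are hypothetical helpers that build a keystream by hashing the key with a counter and XOR it with the data. Real products should use a vetted cipher such as AES from a well-reviewed library.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive length pseudo-random bytes from key by hashing key + counter."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return stream[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data: XOR with the keystream is its own inverse."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


key = secrets.token_bytes(32)  # 256-bit secret key (toy example, not for production)
plaintext = b"sensor calibration data"

ciphertext = xor_cipher(key, plaintext)  # encrypt with the key
recovered = xor_cipher(key, ciphertext)  # decrypt with the very same key
print(recovered == plaintext)            # True

# Someone holding a different key gets garbage instead of the plaintext.
wrong_key = secrets.token_bytes(32)
print(xor_cipher(wrong_key, ciphertext) == plaintext)  # False
```

Whoever holds the key can both encrypt and decrypt, which is why protecting that single key is so important in symmetric-key schemes.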

In asymmetric-key (also called public-key) encryption schemes, a pair of keys (usually called public and private keys) is used. Data encrypted with the public key can only be decrypted with the private key, and vice-versa.

Asymmetric Key Encryption
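The relationship between the two keys can be illustrated with textbook RSA and very small numbers. This is a sketch for illustration only: the primes, the exponent and the apply_key helper are assumptions chosen to keep the arithmetic readable, and real RSA uses keys of 2048 bits or more together with a padding scheme.

```python
# Textbook RSA with tiny primes: for illustration only, never for real use.
p, q = 61, 53
n = p * q                # public modulus (3233)
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # public exponent, chosen coprime with phi
d = pow(e, -1, phi)      # private exponent, modular inverse (Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def apply_key(key, number):
    """Raise number to the key's exponent modulo the key's modulus."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65  # the message must be an integer smaller than n

# Encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(public_key, message)
print(ciphertext)                          # 2790
print(apply_key(private_key, ciphertext))  # 65

# And vice versa: transform with the private key, recover with the public key.
signed = apply_key(private_key, message)
print(apply_key(public_key, signed))       # 65
```

Only the public key (e, n) needs to be distributed. The private exponent d never leaves its owner, which is what makes key distribution easier than in the symmetric case.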

Symmetric-key based ciphers are usually simpler, faster and more efficient. The problem is how to securely store and share the encryption key. If the key is leaked, the security is broken!

This is less of a problem with asymmetric-key based ciphers since you don’t need to share the private key. On the other hand, asymmetric-key based ciphers are more complex and slower, which makes them a poor fit when you need to encrypt large chunks of data.

In the end, there are tradeoffs between these encryption schemes, as you can see from the table below:

What?          | Symmetric-key                                        | Asymmetric-key
Keys           | Only one key (shared secret)                         | Two keys (public/private)
Complexity     | Lower                                                | Higher
Speed          | Faster                                               | Slower
Resource usage | Lower                                                | Higher
Length of keys | Typically 128 or 256 bits                            | 2048 bits or higher
Usage          | Encrypt large chunks of data                         | Encrypt small chunks of data
Security       | Lower                                                | Higher
Ciphers        | Block ciphers (AES, DES, 3DES), stream ciphers (RC4) | RSA, Elliptic Curve Cryptography (ECC), DSA, Diffie-Hellman (for key exchange)

The truth is that one scheme is not necessarily better than the other. They actually complement each other and are sometimes used together to solve a specific problem.

For example, two peers that want to exchange data may use an asymmetric-key based scheme to agree on a shared secret key, and then use this secret key in a symmetric-key based cipher to encrypt and exchange data (SSL/TLS in a nutshell!).
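The hybrid approach described above can be sketched in Python with a toy Diffie-Hellman exchange followed by a toy symmetric cipher. Everything here is an assumption made for illustration: the tiny public parameters P and G, the derive_key helper and the hash-based XOR cipher. Real protocols such as TLS use much larger groups or elliptic curves, authenticate the peers, and use ciphers such as AES.

```python
import hashlib
import secrets

# Public parameters both peers agree on (toy-sized, for illustration only).
P = 23  # prime modulus
G = 5   # generator


def derive_key(shared_secret: int) -> bytes:
    """Turn the shared number into a 256-bit symmetric key."""
    return hashlib.sha256(shared_secret.to_bytes(4, "big")).digest()


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a keystream hashed from key + counter."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ k for d, k in zip(data, stream))


# Step 1 (asymmetric): each peer keeps a private value and publishes a public one.
alice_private = secrets.randbelow(P - 2) + 1  # value in the range 1..P-2
bob_private = secrets.randbelow(P - 2) + 1
alice_public = pow(G, alice_private, P)
bob_public = pow(G, bob_private, P)

# Each side combines its own private value with the other side's public value.
alice_shared = pow(bob_public, alice_private, P)
bob_shared = pow(alice_public, bob_private, P)
print(alice_shared == bob_shared)  # True

# Step 2 (symmetric): both sides now hold the same key for the bulk data.
ciphertext = xor_cipher(derive_key(alice_shared), b"temperature=21.5")
plaintext = xor_cipher(derive_key(bob_shared), ciphertext)
print(plaintext)                   # b'temperature=21.5'
```

The slow asymmetric step runs only once to establish the key, while the fast symmetric step handles the actual data, which is the trade-off summarized in the table above.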

In the next article, we will open a Linux terminal and apply all of these concepts! Using openssl, we will learn how to create keys and encrypt/decrypt data using symmetric-key based ciphers.

See you there!

About the author: Sergio Prado has been working with embedded systems for more than 25 years. If you want to know more about his work, please visit the About Me page or Embedded Labworks website.

Please email your comments or questions to hello at sergioprado.blog, or sign up for the newsletter to receive updates.

