Overview of block ciphers

https://quequero.org/2015/07/information-leakage-analysis-in-block-ciphers-part-1/

Block ciphers are cryptographic functions for blocks of data of fixed-size, as opposed to stream ciphers (take as an example the classic RC4) that can be used over a stream of data of any length. Block ciphers can work on different data blocks sizes and they can take as input keys of different sizes as well. What we want to understand in this first part is how block ciphers work and what is semantically secure for them, also we will quickly delve into some common attacks used against block ciphers and block cipher modes of operation.

How Block Ciphers Work

Block ciphers, like all symmetric cryptographic functions are based on two algorithms, one is E and another is D, these two algorithms are encryption algorithm (E) and decryption algorithm (D). In order to encrypt or decrypt a message, both of this algorithms must take as input an encryption key with a specific size (in case of AES the key must be 128, 192, 256 bits). The point of block ciphers is that as a message they can only take fixed-size inputs and the output will be of the same size: every bit is in fact permutated and substituted inside the same block. The encryption function is defined over a triple (K, M, C); those sets are K = key space, M = messages space, C = ciphertexts space. During the encryption or decryption process, more iterations are normally adopted, the figure below shows a sample iteration scheme for a block cipher:

Block_ciphers iteration

The first step is that the encryption key is expanded into a sequence of keys starting from K1 to Kn that are usually referred to as ROUND KEYS. The cipher uses these round keys iteratively to encrypt the message M: K1 has a M1 and the result is the next K2, with a second message M2, the encryptions goes over again until you get to the final 128-bit message, in fact we have a function called ROUND FUNCTION that takes as input K1 and M, where M is the current state of message. M in this case has a size of 128 bits, after the first round a block of the same size is returned. The process keeps going for the next round and we have M1, during the next round the key K2 will be used so that: R(K2, M1) will lead to the next round M3 → R(K3, M3), and again until the end. Different ciphers can have a different number of rounds and they can have different round functions. For example 3DES has as number of rounds that is 48, for AES the number of rounds is only 10. To have semantic security we must introduce the concept of pseudorandom numbers and pseudorandom function.

Randomness for Block Ciphers

The pool of randomness, seeded by the encryption key, takes advantage of a PRF (Pseudo Random Function) and a PRP (Pseudo Random Permutation). PRF is defined over a triple (K,X,Y) where K is a key space, X is an input space (with all inputs available) and Y is the output space. These sets are essentially functions that take a key as an input and as result there is some element in output space (Y) so that:

f := K \times Y \rightarrow Y

The PRP is defined over two sets (K,X), as said before K is the key space, while X is an output space. It takes two elements of the sets Kand X and the result is basically an element in the X space:

f := K \times X \rightarrow X

E (encryption) function must be easy to evaluate, also a requirement is that E must be a one-to-one or injective function, so that one element in the set X maps only one element in set Y.

Modes of Block Ciphers

When there is the need to encrypt a block of data that is bigger than the block size itself a mode of operation for the block cipher must be chosen, basically this is an encryption function built using the block cipher itself. The various modes of operation allow for different levels of security, though they don’t include authentication, which is out of the scope of this article.

ECB (Electronic CodeBook)

This mode is called ECB (Electronic CodeBook), it’s the simpler mode of operation, albeit the less secure for reasons that will be discussed later in the article. This is the definition:

C_{i} = E(K, P_{i})

for i = 1, …, k

The block cipher is used sequentially on each block of data. The procedure follows the figure:

ECB_scheme

What’s the problem with this mode of operation? ECB encrypts messages into blocks independently, so what happens if two blocks are identical? That their the outputs will be identical too, now imagine what would that mean if someone was to intercept your banking transactions:

if m1 == m2 then c1 == c2

To visually give you an idea of what it means to use ECB mode on data, take a look at the picture below and you will understand that a considerable amount of information is still available after the encryption process.

bXAUL

CBC (Cipher Block Chaining)

Cipher Block Chaining  is one of the most used methods for block ciphers. CBC is used to prevent chosen plaintext attacks, something similar to what happened for encryption of similar repetitive blocks in the above picture. In particular we take advantage of a random IV(Initialization Vector) that doesn’t have to be kept secret, because acutually it’s a part of the initial ciphertext block. The IV is normally one block in size and in the picture below you can see what role it plays in the encryption process:

CBC_Scheme

suppose we want to encrypt plaintext m using block ciphers, the sender and receiver agree upon a specific predetermined IV. Then the IV will be XORed with first plaintext block (m0) and the intermediate value passed to the encryption function using symmetric key K, then the result is C0 or first ciphertext in the sequences of ciphertexts. In other words, an IV is used instead of the nonexistent C0. At this point, instead of passing directly the next message to the encryption function (like we would do in ECB mode) we XOR the next part with C0 and then the block gets passed to the encryption function to generate the 2nd block and so on until the message is encrypted. At the sender side, XORing is done before encryption; at the receiver site, decryption is done before XORoring.

In fact the formula to encrypt is:

 C<sub>0 </sub>= E(k,IV\oplus m<sub>0</sub>)

In case of decrypting the first message m0, the formula changes like below:

m<sub>0</sub> = D(k,C<sub>0</sub>) \oplus IV

because of using a Feistel cipher, the algorithm for decryption is very similar to the encryption scheme as it can be seed from the picture:

decrypt_scheme

You can see that this one is almost the same encryption circuit, except for the XOR that is executed after instead of before. One thing that we must know in this circuit is that we have excluded the IV as part of the decryption process and then we have only the original message, the IV is dropped in the decryption process.

CTR (Randomized Counter Mode)

This mode of operation uses (unlike the CBC scheme) a Secure PRF or Pseudo Random Function. The CTR has this encryption scheme:

CTR_Scheme

So let F be a secure PRF and let it act on N bit blocks, so if we still use AES as encryption algorithm, these will be 128 bits. However how does the CTR work? It starts by choosing a random IV which must be (if again we choose AES) of 128 bits. So, now we can start the count: we will have F(k, IV) for the first encrypting scheme then F(k,IV+1), F(k,IV+2) and so on until we will obtain the final result that is F(k, IV+L), in this mode we generate the random pad, we then XOR the result with the message in order to obtain the ciphertext. The randomized part of this mode of operation guarantees that the the PRF generates a random value every time, though it is possible to operate the block cipher differently using an incremental counter.

Exhaustive search attacks

Exhaustive search attack is probably the most obvious… Let’s assume we are analyzing the DES algorithm, our goal is that given a few input/output pairs (mi and ci = E(k, mi)) we must find the key that maps these messages to the related ciphertexts. In other words our target is to find the key that maps m1, m2, m3, into c1, c2, c3. The first question that comes to mind is: are we sure that key K is unique? Suppose for a moment that DES is an ideal cipher, what is ideal cipher? An ideal cipher is a cipher were we assume that the block cipher is a random permutation for every key, built on random invertible functions. That is for every K, DES implements a random invertible function, until there will be 256 keys in DES… and we’re going to pretend like DES is really a collection of functions that are invertible from {0,1}56 to {0,1}56. This means that given one message M and ciphertext CT, given one pair: PT (plaintext) and CT (ciphertext) there’s already one key that maps this message to that ciphertext. This means also that the solution to previous question is that the key is very likely to be unique. In fact the exhaustive search attacks is essentially try to checks all possible keys of a cipher or encrypted message to discover the real key. This attack is also known as brute force attack.

Block Swapping Attacks

The block swapping is applicable only on ECB mode. As we have seen previously seen, ECB mode is potentially harmful and is usually not recommended to secure data. ECB divides the message in blocks and each block is ciphered independently from each other… So block swapping simply exchanges the blocks between them, this figure will explain how it works:

Vic1

Another aspect of this attack is that is possible to insert other blocks in the transaction. For example if we capture the message of a banking transaction with an account number and a transfer amount of x€ we can potentially decrypt the block with the money, and exchange that with my own amount. ECB can still be used, provided the block is protected by a HMAC. HMAC is used to authenticate the entire message, and if the message gets tampered, the receiving end can immediately figure that out and drop the message. The HMAC function looks like: m,a := h(Ka,m) where a is HMAC and h is MAC function and Ka is the authentication (secret) key.

Reply Attacks on ECB

During a replay attack an attacker that intercepts a valid message can send again that message as many times as he wants. If there’s nothing in the message that couldn’t be legitimately repeated, then the recipient will have no reason not to accept it as a valid message. Imagine our previous bank transaction for x€, it might be possible to intercept the block that describes the transaction and replay it over and over again. This attack can often be combined together with the Block swapping attack. How can we protect against it? We can insert in the last block a Message Authentication Code (MAC). This function allows the receiving end to verify the integrity of the message with this function: V(k,  m,  tag)    =  `yes`, the ‘tag‘ function is generated from this original function tag←S(k,m) where S is called signing algorithm, the function uses the shared key k and the message m to produce a short tag. The tag is usually up to 100 bits in size and of course it is independent from the message that could be as short as a few bytes or as long as several gigabytes. Once the receiver gets the message, he can open it up and check the tag against che whole message for signs of tampering. Since the tag is based both on the message itself and on the shared key, an attacker won’t be in the position of altering the message without invalidating the MAC. This approach protects the integrity but it doesn’t guarantee any resistance against a replay attack. So usually a timestamp or a random ID can be used, in this scenario the timestamp will guarantee the uniqueness of the message against replay attacks and the MAC will guarantee its integrity against tampering.

Side Channel Attacks

Side channel attacks can be perpetrated not against the block cipher itself, but against the device that is using the cipher and the key. These attacks revolves around the physical implementation of the algorithm and all the side effects generated on a real device. As an example device can be measured while it performs an encryption, a decryption or while it generates an encryption key. This data can reliably be used to retrieve the encryption key or to considerably reduce the keyspace, making it easier to bruteforce the whole key. Some attacks also measure the power consumed and the fluctuations over the power consumption to reconstruct the encryption or decryption process.

Fault Attacks

Faults attacks, even if hard to implement, are extremely effective. These kind of attacks stress the encryption implementation at the physical level, usually by subjecting the encrypting device to high temperatures, extreme overclocks, power fluctuations and every that can lead to the generation of errors during the encryption of decryption process. Obtained errors can be analyzed (imagine the case of a Smart card) to derive the internal state of the encryption engine. Most of the algorithms are susceptible to these type of attacks and in 2002 it’s been proven that only 200 flipped bits are required to correctly derive the encryption key of 3DES. The solution proposed is to adopt error-correcting algorithms to avoid providing useful clues to the cryptanalysts.

Stay tuned for the next part where we will analyze a real attack scenario.