Discovering ECB block sizes

July 03, 2021

Building up on our previous ECB decryption post we will be figuring out one of the missing pieces of the challenge.

Given a prefix controlled message padded with PKCS7 and encrypted with ECB , how do we figure out the block size?

We barely touched upon PKCS7 and this this seems like a good opportunity to go over it. PKCS7 is a padding algorithm that allow us to encrypt irregularly-sized messages, or in less fancy terms, it allows us to encrypt messages that don’t have a length that is a multiple of our block size.

The algorithm works like the following per the spec:

       01 -- if l mod k = k-1
    02 02 -- if l mod k = k-2
        .
        .
        .
k k ... k k -- if l mod k = 0

The padding can be removed unambiguously since all input is
padded and no padding string is a suffix of another.

Let’s use an example to make things clearer. Suppose the last block of our message is [97, 98] and the block size is 4.

Using l mod k we end up with 2 mod 4 = 2
- l is the size of our last block ([97, 98]), which is 2
- k is our block size, which is 4
Now we know that we have to pad the last block with two bytes, so the end result is:
- [97, 98, 2, 2]

There’s one detail that a lot of folks miss. When the size of the last block is already correct the algorithm above adds a new block to the end as padding. So when we have the last block as [97, 98, 99, 100] and our block size is 4 we will end up with:

Using l mod k we end up with 4 mod 4 = 0
When the result is zero we pad a full block size, so we end up with:
- [97, 98, 99, 100] and [4, 4, 4, 4]

Let’s implement it in Ruby:

# See https://tools.ietf.org/html/rfc2315#section-10.3
def pkcs7_pad(message:, block_size:)
  raise 'Invalid block size' if block_size >= 256 # as per the PKCS7 spec

  size = block_size - (message.size % block_size)
  padding = Array.new(size, size)
  message + padding
end

message = 'ab'.bytes
puts pkcs7_pad(message: message, block_size: 4).inspect
# => [97, 98, 2, 2]

message = 'abcd'.bytes
puts pkcs7_pad(message: message, block_size: 4).inspect
# => [97, 98, 99, 100, 4, 4, 4, 4]

Figuring out the block size

We will work with the encryption_oracle method from our previous post. Let’s revisit it:

RANDOM_KEY = 16.times.map { rand(0..255) }
UNKNOWN_BUFFER = File.read('unknown_buffer').strip.bytes

def encryption_oracle(prefix)
  # Now we know what this method does :)
  padded_buffer = pkcs7_pad(
    message: prefix + UNKNOWN_BUFFER,
    block_size: 16
  )

  aes_ecb_encrypt(padded_buffer, RANDOM_KEY)
end

The algorithm we will need to implement is:

Feed identical bytes to the encryption_oracle method, one at a time
- Example, start with 1 byte (“A”), then “AA”, then “AAA” and so on.
Every time you do it take note of the first block of the result
Once two prefixes produce the SAME first block we have discovered our block size which is equal to the number of iterations

It’s time for a contrived example!

Contrived example

Suppose our message is composed of [0, 1, 2, 3, 4] and our block size is FOUR, but we don’t know it yet.

Iteration one:

Let’s feed our original message an iteration number of As:
- [41, 0, 1, 2, 3, 4, 2, 2] (the last two bytes came from PKCS7)
Encrypt the value above and store it
Let’s feed our original message an iteration plus one number of As:
- [41, 41, 0, 1, 2, 3, 4, 1] (the last byte came from PKCS7)
Encrypt the value above and store it

Is the first block of steps one and three the same? They are not!

Step 1 first block: [41, 0, 1, 2]
Step 3 first block: [41, 41, 0, 1]

Iteration 2

Let’s feed our original message an iteration number of As:
- [41, 41, 0, 1, 2, 3, 4, 1]
Encrypt the value above and store it
Let’s feed our original message an iteration plus one number of As:
- [41, 41, 41, 0, 1, 2, 3, 4, 4, 4, 4, 4]
Encrypt the value above and store it

Is the first block of steps one and three the same? They are not!

Step 1 first block: [41, 41, 0, 1]
Step 3 first block: [41, 41, 41, 0]

Iteration 3

Let’s feed our original message our iteration number of As:
- [41, 41, 41, 0, 1, 2, 3, 4, 4, 4, 4, 4]
Encrypt the value above and store it
Let’s feed our original message our iteration plus one number of As:
- [41, 41, 41, 41, 0, 1, 2, 3, 4, 3, 3, 3]
Encrypt the value above and store it

Is the first block of steps one and three the same? They are not!

Step 1 first block: [41, 41, 41, 0]
Step 3 first block: [41, 41, 41, 41]

Iteration 4

Let’s feed our original message our iteration number of As:
- [41, 41, 41, 41, 0, 1, 2, 3, 4, 3, 3, 3]
Encrypt the value above and store it
Let’s feed our original message our iteration plus one number of As:
- [41, 41, 41, 41, 41, 0, 1, 2, 3, 4, 2, 2]
Encrypt the value above and store it

Is the first block of steps one and three the same? They are!

Step 1 first block: [41, 41, 41, 41]
Step 3 first block: [41, 41, 41, 41]

We had 4 iterations, so that’s our block size!

Why this work?

It only works because ECB produces the same ciphertext given the same plaintext, and the last two iterations end up producing the same first block since they are the same plaintext (both are [41, 41, 41, 41]).

If ECB didn’t produce the same ciphertext we would have no way to compare whether both blocks are the same or not.

Implementation

BYTE = 'A'.ord # 41

def infer_block_size
  (1..256).each do |count|
    iteration_0 = encryption_oracle(Array.new(count, BYTE))
    iteration_1 = encryption_oracle(Array.new(count + 1, BYTE))

    if iteration_0.slice(0, count) == iteration_1.slice(0, count)
      return count
    end
  end
end

And we have reached the end of our exercise! Congratulations, take a moment to be proud of what you have achieved and I hope you are looking forward to the next posts as much as I am. :)