Home
Writeups Misc About
MDFlag

MDFlag

This is a challenge about length extension attack on hashes like MD5 and SHA1 that uses the Merkle-Damgard construction. More on that can be found in this link, and a good Youtube video about this topic, if you can understand Vietnamese, is this series by CyberJutsu.

The following is written under the assumption that you have some experience with the attack. In MD5 hash extension attack, it is often the case that we are allowed to extend the secret with some arbitrary data that we decide. However, in this context, the appended data has to go through the bxor function, which will involve xor of the data we have no information on. Hence, the state after xor is unknown if we do not craft the payload we send to the server properly.

However, we do know the prefix crypto{ and } of the flag. And as the bxor function is called by bxor(data, cycle(FLAG)), the flag's content is cycled through. Hence, we know 8 consecutive bytes that is xor-ed, }crypto{. The idea is now to craft some payload such that we can have the 8 bytes that we know is in the padding of MD5 (where we know the content due to the fact that it has to follow the algorithm, and we know the length of the message passed to the MD5 hash function).

Therefore, we will craft a block where the length (in bytes) is congruent to 55 (mod 64), so that the ending is of the form \x80 + LENGTH OF THE MESSAGE, and also the 8 bytes }crypto{ is the next 8 bytes in the cycle(FLAG) function. In mathematical terms, we have to find the number of blocks x such that 46x1=55mod64. Solving this yields x=4, which means that we will use a payload of size 46×41.

Unfortunately, we always have to append \x80, followed by 8 bytes (the length of the message). In total there is at least 9 bytes for the padding portion. Hence, we have to guess out the last byte, which is the first character after {. To do this, we can extend the secret block with a block that only contains the padding. The target for us is to guess out the character after {, and we know it is the correct character when the hash matches with the locally computed hash from extending the previously requested hash.

With the payload 64-byte aligned, guessing out the other characters becomes easy. We already know the state of all the blocks except the last one, hence we can locally compute the hash generated from extending the known state with the guess.

In my implementation, I try to implement MD5 on my own, and it certainly helped in solving this challenge. Apparently other solutions involves the use of hashpumpy, to make the solution shorter.

Python Implementation: