Like a Hot Knife Through Butter

More or less recently, an interesting line of attacks against software has been revisited, namely Hash-DoS, or, in a nutshell, exploiting weak hash functions used in a hash table implementation to trigger a denial-of-service.

To the best of my knowledge, this problematic has been exposed as early as in 1998 in Phrack by Solar Designer, then variants have been discussed by Crosby and Wallach at USENIX 2003, formally defining algorithmic complexity attacks, by Klink and Wälde during 28c3 in 2011, applying the idea on PHP, Python, Java, Ruby, etc. and more recently by Aumasson, Bernstein and Bosslet (see their slides at Appsec Forum 2012, and their upcoming talk at 29c3), showing that the proposed solutions, essentially randomizing the hash function, were not always as effective as expected.

Technically, this kind of attacks consists in generating a large number of colliding inputs for the hash table. Hence, instead of a having a O(1) average access time to a stored element, one can force the hash table to have a O(n) one. If one is willing to explore all the elements in the hash table, the worst-case complexity becomes O(n^2) instead of O(n). Be able to generate multi-collisions (i.e., multiple inputs, not just two, mapping to the same output), in practice hence depends on the properties of the hash function transforming elements to a key.

In this short post, I’d like to show how hash-DoS can be applied to the btrfs file-system with some astonishing and unexpected success. Btrfs, while still in development stage, is widely considered as being a viable successor of ext4, and an implementation of it is already part of the Linux kernel. According to this page,

Directories are indexed in two different ways. For filename lookup, there is an index comprised of keys:

Directory Objectid BTRFS_DIR_ITEM_KEY 64 bit filename hash

The default directory hash used is crc32c, although other hashes may be added later on. A flags field in the super block will indicate which hash is used for a given FS.

The second directory index is used by readdir to return data in inode number order. This more closely resembles the order of blocks on disk and generally provides better performance for reading data in bulk (backups, copies, etc). Also, it allows fast checking that a given inode is linked into a directory when verifying inode link counts. This index uses an additional set of keys:

Directory Objectid BTRFS_DIR_INDEX_KEY Inode Sequence number

The inode sequence number comes from the directory. It is increased each time a new file or directory is added.

Knowing how trivial it is to compute multi-collisions for a CRC-based hash, I did not resist to play a bit. Roughly speaking, computing a CRC of an n-bit message M consists in interpreting the message as a polynomial M(x) of degree n-1 over \mathrm{GF}(2), dividing it by the CRC defining polynomial P(x), and taking the remainder, hence writing something like \mathrm{CRC} = M(x) - Q(x)\cdot P(x). Obviously, adding any multiple of P(x) to the message M(x) will generate a collision. For the gory details, see this page.

I basically found two different attacks:

  • I computed the time to create 4000 empty files in the same directory whose names were randomly chosen. This takes about 0.2 seconds. The box used is a Fedora distribution within a VM (and btrfs was a loopback-ed device).

    Then, I computed the time to create those 4000 empty files in the same directory, whose names were however chosen in order to hash to the same CRC32C value. This operation fails after 5 (!) seconds and creating only 61 files. In other words, this first attack allows an adversary, in a shared directory scenario, to avoid that a victim creates a file with a known-in-advance name. According to the btrfs maintainer, Chris Mason, 

Collisions are a known issue with any of the hash based directories. […] The impact of the DOS is that a malicious user is able to prevent the creation of specific file names. For this to impact other users on the system, it must be done in directories where both the malicious user and the victim user have permission to create files. The obvious example is /tmp, but there are other cases that may be more site-specific. Group writable directories have other security issues, and so we picked the hash knowing this kind of DOS was possible. It is good practice to avoid the shared directories completely if you’re worried about users doing these kinds of attacks.

  • A bit annoyed by this answer, I tried harder and found the following: I have created several files with random names in a directory (around 500). The time required to remove them is negligible. Then, I have created the same number of files, but giving them only 55 different crc32c values. The time required to remove them is so large that I was not able to figure it out and killed the process after 220 minutes (!). The python script I used is the following, and borrows some code from StalkR:

    More exactly, I mounted a 1GB btrfs file system on a loopback device:

    In the exploit script, just put the variable  hack = False  to generate random empty filenames or  hack = True  to generate colliding filenames and hence trigger the phenomenon. Here is a screenshot of what I obtained:

Given the result, it looks like that playing with collisions is much more likely to trigger an infinite loop than just a complexity increase; at least, the btrfs code surely does not expect to handle many collisions.

Essentially, to thwart both attacks, I would recommend to use a modern lightweight hash algorithm, such as SipHash, instead of CRC32C. Another alternative is to avoid using data structures that have a high worst-case complexity, like hash tables, for storing data that can potentially be manipulated by malicious users. Sacrificing a bit of average-case performance, data structures like red-black trees have a guaranteed search time in O(\log(n)) (I learned this while reading the source code of nginx).

For the record, this vulnerability has been announced to the btrfs maintainer Chris Mason on November 14th, 2012, who acknowledged the bug, but then did not answer any of my e-mails. and mentioned that

 My plan is to get this fixed for the 3.8 merge window. We’re juggling a lot of patches right now but I expect to fix things then.

[UPDATE OF 17/12/2012] As several readers of this post have noticed, and I would like to warmly thank them for their feedback, the second attack does NOT generate an infinite loop within the btrfs code, but merely within the bash expansion code which is responsible to expand the command line rm *. This can be seen in the above screenshot, as the CPU is burnt in userland, and not in the kernel. Hence, what I thought to be a complexity attack against the btrfs file system is actually a (less glamorous) complexity attack against bash.

This said, after having communicated this unfortunate glitch to the btrfs maintainer Chris Mason, he kindly answered me the following:

You’ve found a real problem in btrfs though. Changes since I tested the crc overflow handling have made us not deal with our EOVERFLOW error handling completely right, and there are cases where we force the FS readonly when we shouldn’t. So this is a valid bug, I’m just waiting on some review of my fix, which will get backported to a number of kernels.

To summarize, the message I wanted to pass through this post remains still valid: if one uses a weak hash function, like it is the case in the btrfs file system, one should assume that malicious users can generate zillions of collisions, and one should write accordingly robust code able to handle those collisions in an efficient way.

Another possibility consists in using a lightweight cryptographic hash function that translates the search for multi-collision in a hard task. The security/performance tradeoff to find here is definitely a delicate and hard decision to take.

Finally, the first described attack, i.e., make impossible the creation of a given file within a shared directory, keeps still valid.