Awakening Zombie Code in Apache httpd

Mar 25 2013

At the end of last year, while playing with hash-DoS (see this previous post and my Insomni’hack 2013 talk for understanding the whole context), I have found funny things in the code source of the Apache httpd web server.  It concerns the module mod_auth_digest, which is responsible to authenticate users according to challenge-response protocols standardized in RFC 2617, namely MD5 and MD5-sess. Essentially these authentication mechanisms allow to protect passwords between the client and the web server from an adversary spying the communication, as they do not appear in clear like it is the case for the the Basic authentication mechanism provided by the module mod_auth_basic, but only in hashed form.

Those two different variants work in a slightly different way. By default, Apache uses the variants MD5 and the parameter qop (standing for “Quality of Protection”) is set to auth, as stated by the official documentation. First, two values are computed: \mathrm{HA1} = \mathrm{MD5}(\text{username:realm:password} and \mathrm{HA2} = \mathrm{MD5}(\text{method:digestURI}); then, the client computes the response as \mathrm{MD5}(\text{HA1:nonce:nonceCount:clientNonce:qop:HA2}) and sends it to the web server, that can repeat the same computation as it knows all the different parameter values.

The MD5-sess variant works in a slightly different way. The value \mathrm{HA1} is computed only once, on the first request by the client following the receipt of a WWW-Authenticate challenge from the server. It uses the server nonce from that challenge, and the first client nonce value to construct \mathrm{HA1} as \mathrm{MD5}(\mathrm{MD5}(\text{username:realm:password})\text{:nonce:cnonce}). The rationales of this construction are explained in §3.2.2.2 of RFC 2617:

This creates a ‘session key’ for the authentication of subsequent requests and responses which is different for each “authentication session”, thus limiting the amount of material hashed with any one key. [...] Because the server need only use the hash of the user credentials in order to create the HA1 value, this construction could be used in conjunction with a third party authentication service so that the web server would not need the actual password value. The specification of such a protocol is beyond the scope of this specification.

More details can be found on the dedicated Wikipedia page. Interestingly, although it is possible to configure Apache httpd to use MD5-sess, through the AuthDigestAlgorithm directive, the documentation tells us that “MD5-sess is not correctly implemented yet”. Trying to use it in a .htaccess file results in an error 500 (“Internal Server Error”), and the httpd server gently explains why in the error logs:

Essentially, the use of MD5-sess is killed by the following routine:

Furthermore, other mechanisms, like one-time nonces (as a side note, cryptographically speaking, a nonce is a number that must be used only once…), nonce-count checking are not supported as well:

All those mechanisms require to store server-side information in a shared memory segment, as one needs some synchronization between the different threads. Still, there exists a lot of code in the source code of the module mod_auth_digest that are related to handling those mechanisms. Some configuration directives are also documented, like AuthDigestShmemSize, although shared memory seems to be used only by those disabled features. In summary, it appears that there seems to be a lot of zombie code in this mod_auth_digest module. Let’s try to awaken it ;-) !

The routine  note_digest_auth_failure()  is responsible to handle authentication errors, and it still contains code that access the shared memory segment, more exactly through the routine gen_client(). The following piece of code is pretty interesting:

The conditions conf->check_nc and !strcasecmp(conf->algorithm, "MD5-sess") are always false, but conf->nonce_lifetime == 0 can be made true through the AuthDigestNonceLifetime directive.

Here is a proof of concept: I have put the following .htaccess file in the /aaa directory

and the following tiny Python script sends HTTP requests with a missing opaque field at a rather slow pace:

This is sufficient to trigger floating-point exceptions (I also observed NULL pointer dereferences if the AuthDigestShmemSize directive is used) and to make repeatedly crash the different threads, hence rendering the httpd server in whole unavailable to legitimate requests.

In summary, if one is able to put a AuthDigestNonceLifetime somewhare in a .htaccess file, either directly or through injection, then one is able to completely sabotage an Apache httpd installation. This seemed pretty annoying to me, expecially if we have the shared web environments scenario in head. At the time of writing this post, this works with the versions 2.4.4 and 2.2.24, which are the latest ones.

For the record, I have contacted the Apache security team, first directly without success, then through the oCERT crew (thank you guys for your quick answer!), and I received the following answer:

No responses yet

La HEIG-VD Partenaire d’Insomni’Hack 2013

Mar 11 2013

Comme chaque année depuis 6 ans, le petit monde de la sécurité informatique et du «ethical hacking» de Suisse Romande et d’ailleurs va converger vers Genève les 21 et 22 mars 2013 pour assister à une nouvelle édition d’Insomni’hack, l’un des plus grands évènements de sécurité informatique francophone. Cette sympathique manifestation, organisée par la société de Préverenges SCRT Sàrl, aura lieu pour la première fois sur deux journées à Palexpo:

  • 21 mars 2013: Workshops Plusieurs formations, données par des experts mondialement reconnus, auront lieu durant cette journée. Seules 12 places par workshop sont disponibles, et selon le thème, la participation coûte de CHF 150 à CHF 750. Les thèmes abordés seront l’exploitation ARM (par Stephen Ridley et Stephen Lawler), le framework d’exploitation Metasploit (par Paul Rascagnères), les technologies HTML, HTML5, SVG et CSS sous une forme offensive (par Mario Heiderich), les méthodologies de management de risques CORAS et OCTAVE-ALLEGRO (par Jeremy Kenaghan), ainsi que l’exploitation Linux (par des ingénieurs de SCRT Sàrl). L’inscription se fait via le site web de la manifestation. Dépêchez-vous, il ne reste plus que quelques places disponibles !
  • 22 mars 2013 (09h00 à 17h00): Conférences Cette seconde journée verra 12 présentations de chercheurs en sécurité internationalement reconnus sur des sujets brûlants et variés: par exemple, Charlie Miller, une rock-star de la scène mondiale de la sécurité informatique, parlera de NFC et montrera comment prendre contrôle d’un smartphone via ce nouveau moyen de communication, Mario Heiderich parlera de sécurité web, Ian Pratt du rôle de la virtualisation dans la sécurité de l’information, ou encore Eloi Sanfelix Gonzalez, de la société Riscure, qui expliquera comment il analyse des systèmes embarqués. Pour ma part, j’aurai le plaisir de présenter mes récentes recherches dans le domaines des Hash-DoS. L’inscription aux conférences coûte entre CHF 90 et CHF 150 et se fait également via le site web de la manifestation.
  • 22 mars 2013 (dès 18h00): Capture the Flag Finalement, le traditionnel concours de «ethical hacking», qu’il n’est plus besoin de présenter, aura lieu de 18h00 à 01h00.

La HEIG-VD est pour la première fois un partenaire d’Insomni’hack. Cela permettra à ses étudiants, qu’ils suivent la seule formation bachelor  de Suisse complètement dédiée à la «sécurité de l’information», ou des cours de master dans ce domaine, de participer gratuitement aux conférences. De plus, plusieurs équipes, formées d’étudiants, d’anciens étudiants, de professeurs et autres chercheurs en sécurité informatique de la région avec qui nous avons des  contacts étroits, participeront au «Capture the Flag».

No responses yet

Like a Hot Knife Through Butter

Dec 13 2012

More or less recently, an interesting line of attacks against software has been revisited, namely Hash-DoS, or, in a nutshell, exploiting weak hash functions used in a hash table implementation to trigger a denial-of-service.

To the best of my knowledge, this problematic has been exposed as early as in 1998 in Phrack by Solar Designer, then variants have been discussed by Crosby and Wallach at USENIX 2003, formally defining algorithmic complexity attacks, by Klink and Wälde during 28c3 in 2011, applying the idea on PHP, Python, Java, Ruby, etc. and more recently by Aumasson, Bernstein and Bosslet (see their slides at Appsec Forum 2012, and their upcoming talk at 29c3), showing that the proposed solutions, essentially randomizing the hash function, were not always as effective as expected.

Technically, this kind of attacks consists in generating a large number of colliding inputs for the hash table. Hence, instead of a having a O(1) average access time to a stored element, one can force the hash table to have a O(n) one. If one is willing to explore all the elements in the hash table, the worst-case complexity becomes O(n^2) instead of O(n). Be able to generate multi-collisions (i.e., multiple inputs, not just two, mapping to the same output), in practice hence depends on the properties of the hash function transforming elements to a key.

In this short post, I’d like to show how hash-DoS can be applied to the btrfs file-system with some astonishing and unexpected success. Btrfs, while still in development stage, is widely considered as being a viable successor of ext4, and an implementation of it is already part of the Linux kernel. According to this page,

Directories are indexed in two different ways. For filename lookup, there is an index comprised of keys:

Directory Objectid BTRFS_DIR_ITEM_KEY 64 bit filename hash

The default directory hash used is crc32c, although other hashes may be added later on. A flags field in the super block will indicate which hash is used for a given FS.

The second directory index is used by readdir to return data in inode number order. This more closely resembles the order of blocks on disk and generally provides better performance for reading data in bulk (backups, copies, etc). Also, it allows fast checking that a given inode is linked into a directory when verifying inode link counts. This index uses an additional set of keys:

Directory Objectid BTRFS_DIR_INDEX_KEY Inode Sequence number

The inode sequence number comes from the directory. It is increased each time a new file or directory is added.

Knowing how trivial it is to compute multi-collisions for a CRC-based hash, I did not resist to play a bit. Roughly speaking, computing a CRC of an n-bit message M consists in interpreting the message as a polynomial M(x) of degree n-1 over \mathrm{GF}(2), dividing it by the CRC defining polynomial P(x), and taking the remainder, hence writing something like \mathrm{CRC} = M(x) - Q(x)\cdot P(x). Obviously, adding any multiple of P(x) to the message M(x) will generate a collision. For the gory details, see this page.

I basically found two different attacks:

  • I computed the time to create 4000 empty files in the same directory whose names were randomly chosen. This takes about 0.2 seconds. The box used is a Fedora distribution within a VM (and btrfs was a loopback-ed device).

    Then, I computed the time to create those 4000 empty files in the same directory, whose names were however chosen in order to hash to the same CRC32C value. This operation fails after 5 (!) seconds and creating only 61 files. In other words, this first attack allows an adversary, in a shared directory scenario, to avoid that a victim creates a file with a known-in-advance name. According to the btrfs maintainer, Chris Mason, 

Collisions are a known issue with any of the hash based directories. [...] The impact of the DOS is that a malicious user is able to prevent the creation of specific file names. For this to impact other users on the system, it must be done in directories where both the malicious user and the victim user have permission to create files. The obvious example is /tmp, but there are other cases that may be more site-specific. Group writable directories have other security issues, and so we picked the hash knowing this kind of DOS was possible. It is good practice to avoid the shared directories completely if you’re worried about users doing these kinds of attacks.

  • A bit annoyed by this answer, I tried harder and found the following: I have created several files with random names in a directory (around 500). The time required to remove them is negligible. Then, I have created the same number of files, but giving them only 55 different crc32c values. The time required to remove them is so large that I was not able to figure it out and killed the process after 220 minutes (!). The python script I used is the following, and borrows some code from StalkR:

    More exactly, I mounted a 1GB btrfs file system on a loopback device:


    In the exploit script, just put the variable  hack = False  to generate random empty filenames or  hack = True  to generate colliding filenames and hence trigger the phenomenon. Here is a screenshot of what I obtained:

Given the result, it looks like that playing with collisions is much more likely to trigger an infinite loop than just a complexity increase; at least, the btrfs code surely does not expect to handle many collisions.

Essentially, to thwart both attacks, I would recommend to use a modern lightweight hash algorithm, such as SipHash, instead of CRC32C. Another alternative is to avoid using data structures that have a high worst-case complexity, like hash tables, for storing data that can potentially be manipulated by malicious users. Sacrificing a bit of average-case performance, data structures like red-black trees have a guaranteed search time in O(\log(n)) (I learned this while reading the source code of nginx).

For the record, this vulnerability has been announced to the btrfs maintainer Chris Mason on November 14th, 2012, who acknowledged the bug, but then did not answer any of my e-mails. and mentioned that

 My plan is to get this fixed for the 3.8 merge window. We’re juggling a lot of patches right now but I expect to fix things then.

[UPDATE OF 17/12/2012] As several readers of this post have noticed, and I would like to warmly thank them for their feedback, the second attack does NOT generate an infinite loop within the btrfs code, but merely within the bash expansion code which is responsible to expand the command line rm *. This can be seen in the above screenshot, as the CPU is burnt in userland, and not in the kernel. Hence, what I thought to be a complexity attack against the btrfs file system is actually a (less glamorous) complexity attack against bash.

This said, after having communicated this unfortunate glitch to the btrfs maintainer Chris Mason, he kindly answered me the following:

You’ve found a real problem in btrfs though. Changes since I tested the crc overflow handling have made us not deal with our EOVERFLOW error handling completely right, and there are cases where we force the FS readonly when we shouldn’t. So this is a valid bug, I’m just waiting on some review of my fix, which will get backported to a number of kernels.

To summarize, the message I wanted to pass through this post remains still valid: if one uses a weak hash function, like it is the case in the btrfs file system, one should assume that malicious users can generate zillions of collisions, and one should write accordingly robust code able to handle those collisions in an efficient way.

Another possibility consists in using a lightweight cryptographic hash function that translates the search for multi-collision in a hard task. The security/performance tradeoff to find here is definitely a delicate and hard decision to take.

Finally, the first described attack, i.e., make impossible the creation of a given file within a shared directory, keeps still valid.

37 responses so far

Older »