About Obfuscator-LLVM, Dual-Use Tools and Academic Ethics

On November 25th, my team announced that Obfuscator-LLVM, an open-source obfuscation tool based on the LLVM compilation suite, would be released for Christmas. Why Christmas? Because version 3.4 of LLVM was scheduled for release on December 23rd, and we wished to port our code to the latest version of the compiler before publishing it.

On December 18th, I was privately contacted by pod2g, a French security researcher active in the Apple jailbreaking scene, who kindly asked whether his team, the evad3rs, could get early access to our tool. Without thinking too much about the possible consequences, and naively seeing it as an easy way to get some publicity for our research project, my collaborators and I agreed to send them the source code of Obfuscator-LLVM one week before the planned release. In exchange, we only asked to be credited on their website; I would like to state clearly that we never discussed financial compensation or any other kind of reward.

The evasi0n jailbreak was released on December 23rd, taking the jailbreak community by surprise, and instantly generating a controversy. Indeed, the jailbreaking software arrived bundled with a Chinese app store apparently delivering pirated apps and/or malware.

Providing a version of Obfuscator-LLVM to the evad3rs one week ahead of our planned release was a mistake, and we regret this turn of events, as our academic research project is now somehow linked to the murkier side of IT security.

But more importantly, this controversy raises deep questions about the release of “dual-use” academic tools. Obfuscation techniques, i.e. software techniques that aim to increase the cost of reverse-engineering, have so far been used in practice by malware writers as well as in the domain of Digital Rights Management. Although there has been academic research on the subject for more than a decade, only a handful of tools are freely available as open-source software, and few of them are able to obfuscate C/C++/Objective-C code effectively (Kryptonite being one example). Moreover, I am aware of only a handful of vendors selling commercial tools of this kind; those include Arxan, Whitecryption and Morpher, and their products are expensive.

With that said, is it ethical to release a tool like Obfuscator-LLVM?

Our feeling is that publishing an open-source C/C++ obfuscating tool makes sense for several reasons:

  • While it is a “dual-use” tool, I see no reason why obfuscation and software protection tools should be in a different position than, say, encryption tools, fuzzers, network scanners, exploitation frameworks or butcher knives. All of them can be used with ethical goals in mind, or with malicious intent. Even jailbreaking software can be used to install pirated software on a device just as much as it can be used by authorities for forensic purposes or by pen-testers for audits.
  • The fact that Obfuscator-LLVM will be open-source will make it easier to audit its code base and to check that it is free of backdoors.
  • With an open-source obfuscating tool available, academic security researchers will have access to a free tool when studying and designing new automated de-obfuscation engines and reverse-engineering processes that aim to help malware analysts.
  • Code obfuscation can also bring lesser-known security benefits for our digital ecosystem as a by-product. For instance, obfuscation brings software diversity, since an obfuscation process can typically be heavily randomized. This can be considered a first line of defense against mass software attacks.

Those considerations are only preliminary. My team still plans to release Obfuscator-LLVM in the coming days. I hope that this blog post will help clarify our intentions with regard to the goals of Obfuscator-LLVM.

In the meantime, Merry Christmas to everybody!

Credits: thunderstorm picture from http://darkwoman.d.a.pic.centerblog.net

Awakening Zombie Code in Apache

At the end of last year, while playing with hash-DoS (see this previous post and my Insomni’hack 2013 talk for the whole context), I found some funny things in the source code of the Apache httpd web server. They concern the module mod_auth_digest, which is responsible for authenticating users according to the challenge-response protocols standardized in RFC 2617, namely MD5 and MD5-sess. Essentially, these authentication mechanisms protect passwords in transit between the client and the web server from an eavesdropping adversary: the passwords do not appear in the clear, as is the case with the Basic authentication mechanism provided by the module mod_auth_basic, but only in hashed form.

These two variants work in slightly different ways. By default, Apache uses the MD5 variant, and the parameter qop (standing for “Quality of Protection”) is set to auth, as stated by the official documentation. First, two values are computed: \mathrm{HA1} = \mathrm{MD5}(\text{username:realm:password}) and \mathrm{HA2} = \mathrm{MD5}(\text{method:digestURI}); then, the client computes the response as \mathrm{MD5}(\text{HA1:nonce:nonceCount:clientNonce:qop:HA2}) and sends it to the web server, which can repeat the same computation since it knows all the parameter values.
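The computation above can be sketched in a few lines of Python. This is only an illustration of the formulas, not the code used by Apache: the helper names md5_hex and digest_response are mine, and the sample values are the well-known example credentials from RFC 2617.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of a string as 32 lowercase hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest response for the MD5 variant with qop=auth."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


if __name__ == "__main__":
    # Sample values taken from the example in RFC 2617.
    print(digest_response(
        username="Mufasa",
        realm="testrealm@host.com",
        password="Circle Of Life",
        method="GET",
        uri="/dir/index.html",
        nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
        nc="00000001",
        cnonce="0a4f113b",
    ))
```

The server performs exactly the same computation on its side and compares the result with the response field sent by the client.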

The MD5-sess variant works slightly differently. The value \mathrm{HA1} is computed only once, on the first request by the client following the receipt of a WWW-Authenticate challenge from the server. It uses the server nonce from that challenge and the first client nonce value to construct \mathrm{HA1} as \mathrm{MD5}(\mathrm{MD5}(\text{username:realm:password})\text{:nonce:cnonce}). The rationale for this construction is explained in RFC 2617:

This creates a ‘session key’ for the authentication of subsequent requests and responses which is different for each “authentication session”, thus limiting the amount of material hashed with any one key. […] Because the server need only use the hash of the user credentials in order to create the HA1 value, this construction could be used in conjunction with a third party authentication service so that the web server would not need the actual password value. The specification of such a protocol is beyond the scope of this specification.
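To make the point of the quote concrete, here is a minimal Python sketch of the MD5-sess session key. The function names are hypothetical, and I assume that the inner digest is fed to the outer hash in its hexadecimal form, which is one possible reading of the formula above. The second function shows why the server, or a third-party authentication service, only needs the stored hash of the credentials and never the password itself.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of a string as 32 lowercase hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def ha1_md5_sess(username, realm, password, nonce, cnonce):
    """Client side: derive the session HA1 from the clear-text password."""
    inner = md5_hex(f"{username}:{realm}:{password}")  # assumed hex encoding
    return md5_hex(f"{inner}:{nonce}:{cnonce}")


def ha1_md5_sess_from_hash(stored_hash, nonce, cnonce):
    """Server side: derive the same HA1 from the stored credential hash."""
    return md5_hex(f"{stored_hash}:{nonce}:{cnonce}")


if __name__ == "__main__":
    stored = md5_hex("alice:example realm:secret")  # what the server keeps
    client = ha1_md5_sess("alice", "example realm", "secret", "n1", "c1")
    server = ha1_md5_sess_from_hash(stored, "n1", "c1")
    print(client == server)
```

Since the nonce and the client nonce enter the hash, each authentication session gets a different HA1, which is the “session key” the RFC refers to.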

More details can be found on the dedicated Wikipedia page. Interestingly, although it is possible to configure Apache httpd to use MD5-sess through the AuthDigestAlgorithm directive, the documentation tells us that “MD5-sess is not correctly implemented yet”. Trying to use it in a .htaccess file results in a 500 error (“Internal Server Error”), and the httpd server kindly explains why in the error logs:

Essentially, the use of MD5-sess is killed by the following routine:

Furthermore, other mechanisms, such as one-time nonces (as a side note, cryptographically speaking, a nonce is a number that must be used only once…) and nonce-count checking, are not supported either:

All those mechanisms require storing server-side information in a shared memory segment, as some synchronization is needed between the different threads. Still, the source code of the module mod_auth_digest contains a lot of code related to handling those mechanisms. Some configuration directives are also documented, like AuthDigestShmemSize, although shared memory seems to be used only by those disabled features. In summary, there seems to be a lot of zombie code in this mod_auth_digest module. Let’s try to awaken it 😉 !

The routine note_digest_auth_failure() is responsible for handling authentication errors, and it still contains code that accesses the shared memory segment, more precisely through the routine gen_client(). The following piece of code is pretty interesting:
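As a side illustration of why nonce-count checking needs shared, synchronized state, here is a toy Python sketch. It is not derived from the Apache code: the class NonceCountTracker and its methods are hypothetical, a lock-protected dictionary stands in for the shared memory segment, and I assume the simple policy that the hexadecimal nc value must strictly increase for a given server nonce.

```python
import threading


class NonceCountTracker:
    """Toy, single-process stand-in for server-side nonce-count state."""

    def __init__(self):
        self._lock = threading.Lock()  # serializes access between threads
        self._last_nc = {}             # server nonce -> last accepted count

    def register_nonce(self, nonce: str) -> None:
        """Remember a nonce that the server has just issued."""
        with self._lock:
            self._last_nc[nonce] = 0

    def check(self, nonce: str, nc_hex: str) -> bool:
        """Accept a request only if its nc is larger than the last one seen."""
        nc = int(nc_hex, 16)
        with self._lock:
            last = self._last_nc.get(nonce)
            if last is None or nc <= last:
                return False           # unknown nonce or replayed count
            self._last_nc[nonce] = nc
            return True


if __name__ == "__main__":
    tracker = NonceCountTracker()
    tracker.register_nonce("abc")
    print(tracker.check("abc", "00000001"))  # first use of this count
    print(tracker.check("abc", "00000001"))  # same count replayed
```

Without a store shared by all workers, a replayed request could simply be handled by a worker that has never seen the corresponding count.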

The conditions conf->check_nc and !strcasecmp(conf->algorithm, "MD5-sess") are always false, but conf->nonce_lifetime == 0 can be made true through the AuthDigestNonceLifetime directive.

Here is a proof of concept: I have put the following .htaccess file in the /aaa directory

and the following tiny Python script sends HTTP requests with a missing opaque field at a rather slow pace:

This is sufficient to trigger floating-point exceptions (I also observed NULL pointer dereferences if the AuthDigestShmemSize directive is used) and to repeatedly crash the different threads, rendering the whole httpd server unavailable to legitimate requests.

In summary, anyone able to put an AuthDigestNonceLifetime directive somewhere in a .htaccess file, either directly or through injection, is able to completely sabotage an Apache httpd installation. This seemed pretty annoying to me, especially with shared web hosting environments in mind. At the time of writing this post, this works with versions 2.4.4 and 2.2.24, which are the latest ones.

For the record, I have contacted the Apache security team, first directly without success, then through the oCERT crew (thank you guys for your quick answer!), and I received the following answer:

HEIG-VD, a Partner of Insomni’Hack 2013

As every year for the past six years, the small world of IT security and “ethical hacking” from French-speaking Switzerland and beyond will converge on Geneva on March 21 and 22, 2013 to attend a new edition of Insomni’hack, one of the largest French-speaking IT security events. This friendly event, organized by the Préverenges-based company SCRT Sàrl, will for the first time span two days at Palexpo:

  • March 21, 2013: Workshops. Several training sessions, given by world-renowned experts, will take place during this day. Only 12 seats are available per workshop and, depending on the topic, participation costs between CHF 150 and CHF 750. The topics covered will be ARM exploitation (by Stephen Ridley and Stephen Lawler), the Metasploit exploitation framework (by Paul Rascagnères), HTML, HTML5, SVG and CSS technologies from an offensive angle (by Mario Heiderich), the CORAS and OCTAVE-ALLEGRO risk management methodologies (by Jeremy Kenaghan), as well as Linux exploitation (by engineers from SCRT Sàrl). Registration is done through the event’s website. Hurry up, only a few seats are left!
  • March 22, 2013 (9:00 to 17:00): Conferences. This second day will feature 12 presentations by internationally renowned security researchers on hot and varied topics: for example, Charlie Miller, a rock star of the worldwide IT security scene, will talk about NFC and show how to take control of a smartphone through this new means of communication; Mario Heiderich will talk about web security; Ian Pratt will discuss the role of virtualization in information security; and Eloi Sanfelix Gonzalez, from the company Riscure, will explain how he analyzes embedded systems. As for me, I will have the pleasure of presenting my recent research in the field of hash-DoS. Registration for the conferences costs between CHF 90 and CHF 150 and is also done through the event’s website.
  • March 22, 2013 (from 18:00): Capture the Flag. Finally, the traditional “ethical hacking” contest, which needs no introduction, will take place from 18:00 to 01:00.

HEIG-VD is a partner of Insomni’hack for the first time. This will allow its students, whether they are enrolled in the only bachelor’s program in Switzerland entirely dedicated to “information security” or taking master’s courses in this field, to attend the conferences free of charge. In addition, several teams made up of students, alumni, professors and other IT security researchers from the region with whom we have close ties will take part in the “Capture the Flag”.