Thursday, February 28, 2013

Infiltrate Preview - Exfiltrate: Efficient Blind SQLi

Most of my work at Immunity is focused on the large-scale detection of vulnerabilities in web applications. Naturally, a good portion of my effort has gone towards detecting SQL injections (SQLi). Not only are they among the most common vulnerabilities to be found, but they are often the most critical, as they can get an attacker's foot in the door.

Of course, in order for an SQLi to be useful, there has to be some measurable response from the database that is being injected into. Timing attacks have proven to be the most reliable method for detecting an SQLi, as they do not rely on any output from the database making it through to the interface layer of the target web page or service. Commonly, this type of vulnerability is known as a "blind" SQL injection.

The basic idea behind an SQLi timing attack is to use SLEEP commands to force the database on the back end to delay its response. You then ask the database a series of true/false questions. If a response comes back at least "sleep_time" seconds later, you know that the answer is true.
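As a minimal sketch of that true/false loop, the snippet below recovers a string one bit at a time from response times alone. Everything in it is an illustrative assumption: `simulated_request` stands in for a real injected request (it returns a fake elapsed time instead of touching a network), and `SECRET`, `SLEEP_TIME` and the function names are hypothetical.

```python
SLEEP_TIME = 2.0   # seconds the injected SLEEP would add when the answer is true
SECRET = "admin"   # stands in for data held by the simulated database


def simulated_request(position, bit):
    """Fake response time for: 'is bit `bit` of character `position` set?'"""
    base_latency = 0.05
    is_set = (ord(SECRET[position]) >> bit) & 1
    return base_latency + (SLEEP_TIME if is_set else 0.0)


def ask(position, bit):
    """A true answer shows up as a response delayed by at least SLEEP_TIME."""
    return simulated_request(position, bit) >= SLEEP_TIME


def extract_char(position):
    """Rebuild one character from eight true/false questions."""
    value = 0
    for bit in range(8):
        if ask(position, bit):
            value |= 1 << bit
    return chr(value)


def extract(length):
    return "".join(extract_char(i) for i in range(length))
```

Eight requests per character is the baseline cost that any speed-up has to beat.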

An SQLi detection (top) and a test run against a page with no SQLi (bottom)

This simple technique almost guarantees the validity of an SQLi vulnerability, but it comes with several challenges. Real-world attacks often involve VPNs, proxies, and other technologies that can add considerable uncertainty to response times. A good timing attack algorithm has to account for these variables while keeping sleep times as low as possible, which leads to the second major challenge of timing attacks: efficiency.
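One simple way to cope with noisy response times, sketched here under assumptions of my own rather than taken from the algorithm mentioned in this post, is to repeat each question and look at the fastest response: network jitter only ever adds delay, so a single fast answer rules out a sleep. The timing values and the `margin` heuristic are made up for illustration.

```python
def pick_sleep_time(baseline_samples, margin=2.0):
    """Heuristic: sleep for a multiple of the slowest baseline response seen."""
    return margin * max(baseline_samples)


def answer_is_true(samples, sleep_time):
    """Jitter only adds delay, so one fast response is enough to rule out a sleep."""
    return min(samples) >= sleep_time


baseline = [0.21, 0.35, 0.90, 0.27]      # made-up timings measured through a proxy
sleep_time = pick_sleep_time(baseline)   # twice the slowest baseline response

# One slow outlier does not turn a false answer into a true one.
noisy_false = [0.30, 2.60, 0.25]
noisy_true = [2.10, 2.90, 2.05]
```

The trade-off is plain: more samples per question means more confidence but a lower data rate.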

Even with an attack against a local web server, the algorithm I am working with is only capable of pulling data from a database at ~2kbit/s. That's not nearly fast enough when the goal is to download the target's entire database. So how can we speed up the process? That is the main question I am asking right now in my research, and there are a number of roads to go down in searching for the answer. To sample a few potential solutions:

  • Unblinding: The database exists, after all, to provide data to the interface layer. Can we move the data we want somewhere where we can actually see it?
  • Get curious: Why ask just one question?
  • Distribute: Nobody likes a DoS, so architectures are more commonly accounting for traffic spikes. How many questions can the database answer at the same time?
  • Prediction: If we have some of the data, can we predict the rest?
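To make the "more than one question" idea concrete, here is one possible reading of it, which is an assumption and not necessarily what the talk covers: let the length of the delay carry two bits instead of one, so each request answers two questions. `simulated_request`, `UNIT` and `SECRET` are hypothetical stand-ins, and no real database is involved.

```python
UNIT = 1.0        # seconds of sleep per step; the delay encodes a value 0..3
SECRET = b"hi"    # stands in for data held by the simulated database


def simulated_request(index, pair):
    """Fake response time: the 'database' sleeps chunk * UNIT for a 2-bit chunk."""
    chunk = (SECRET[index] >> (2 * pair)) & 0b11
    return 0.05 + chunk * UNIT


def read_two_bits(index, pair):
    """Decode the 2-bit value from how many whole UNITs the response took."""
    return int(simulated_request(index, pair) // UNIT)


def extract_byte(index):
    """Four requests per byte instead of eight."""
    value = 0
    for pair in range(4):
        value |= read_two_bits(index, pair) << (2 * pair)
    return value
```

This halves the request count, but the average sleep per request grows, so whether it is a net win depends on how small `UNIT` can be made on a given network.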

In my talk at Infiltrate, I'll be demonstrating some of these ideas. Know of another technique? Let me know and I may shamelessly implement it!

Wednesday, February 27, 2013

Infiltrate Preview - NAND-Xplore -> Bad Blocks = Well Hidden

Please, welcome Josh "m0nk" Thomas, our first Infiltrate Guest Blogger!

Post-Ex can be sexy

So, you’re frustrated: You’ve spent countless nights discovering the most epic remote mobile 0day imaginable and innumerable hours crafting an intricate payload… only to get the whole thing popped by some stupid Android based AV variant. It’s not like you write typical “churn and burn” malware or ransom-ware where everything is on an expendable cycle; you really are trying to pull off some covert, next-level long term injections and they all just caught fire. Hmmm… It might be time to calm down on the offensive front until you have an acceptable post-exploit landscape to build upon.

“But m0nk, post-exploit is boring and not sexy; right?”

Actually, I think you are dead wrong. Post exploitation can actually be far more deviant than exploitation and the findings typically have a longer shelf life. With that mentality in mind, the NAND-Xplore project was born. The NAND-Xplore project is an attempt to investigate just how deeply files can be hidden on an embedded system, starting with a deep understanding of the “bare metal hardware” well below the operating environment. The project attempts to expose weaknesses in the actual NAND data storage hardware / implementation architectures and showcase the vulnerable underpinnings across the spectrum of NAND based platforms. The project is focused on 2 POC tools: one to hide files on NAND devices and one to find them. The overall assumption of the project is that real world advanced malware already contains these tricks; we just don’t know about it yet.

Before the Infiltrate talk itself, I thought it might be useful to share some background info on the NAND Flash technologies themselves. The talk will pick up where this blog post leaves off, primarily with how the Linux kernel interacts with NAND flash and how those interactions can be manipulated and controlled.

A Deeper Understanding of How NAND Functions

Sample NAND Prototype chip with visible blocks and pages

Hardware functionality of actual NAND Flash

In the most basic sense, NAND devices store individual bits of data in a multidimensional array of floating-gate transistors. The floating-gate transistors allow each cell to trap individual electrons, thus keeping or removing a charge. It is this charge that corresponds to a single 0 or 1 for the device. The multiplexed transistor design, coupled with the concept of Fowler-Nordheim tunnel injection and release, allows this grid of floating gates to access cells at a single bit level. In layman’s terms, consider the NAND flash to behave as a highly dense, addressable LED array.

A Simple NAND Circuit

Each individual flash cell is contained in a collection designated as a page. Pages on NAND devices are typically collections of 512, 2048 or 4096 bytes. In turn, pages are grouped into a construct known as a block. NAND block sizes are typically powers of two and can range from 16 KB to 512 KB.
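The page and block hierarchy boils down to simple address arithmetic. A small sketch, assuming a 2048 byte page and a 128 KB block (both picked from the ranges above purely for illustration):

```python
PAGE_SIZE = 2048                 # bytes per page (assumed for this example)
BLOCK_SIZE = 128 * 1024          # bytes per block (assumed for this example)
PAGES_PER_BLOCK = BLOCK_SIZE // PAGE_SIZE


def locate(offset):
    """Split a flat byte offset into (block, page within block, byte within page)."""
    block, within_block = divmod(offset, BLOCK_SIZE)
    page, byte = divmod(within_block, PAGE_SIZE)
    return block, page, byte
```

For instance, `locate(BLOCK_SIZE + 3 * PAGE_SIZE + 7)` lands in block 1, page 3, byte 7.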

While the grid architecture of NAND flash allows for addressing at the single bit level, such accuracy comes with a hard set of limitations:

  • All bits on the device default to and are initially set to a 1. The shift from a 1 to a 0 is a simple electronic pulse to open the gate and dump the stored electron. Sadly, shifting the other direction (from a 0 to a 1) is non-trivial and cannot be performed at the bit level, only at the block level. As such, shifting a stored byte of 1111 1111 to 1010 1010 is trivial but the reverse would entail erasing an entire block, which can be as large as 512 KB.
  • The physical floating-gate transistors are fragile and slowly wear down over time. Typical industry expectations are that each gate can survive around 100,000 state changes before becoming unreliable and unstable. Once a block has become unstable, the NAND controller has the ability to mark it “bad”. This designation will ensure the block is removed from rotation and can no longer be read or accessed automatically.
  • As the gates wear over time, charge leakage can occur. This leakage will corrupt neighboring cells and their stored information. Charge leakage can also occur with exceptionally high levels of repeated reading even without writing to a cell. This is mostly due to the power utilized across the grid to query a specific cell.
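The first limitation is easy to model: programming can only clear bits, and only a whole-block erase brings them back to 1. The toy class below is a sketch of that behaviour; the class name, the tiny 16 byte block and the erase counter are illustrative assumptions, not a model of any particular chip.

```python
class Block:
    """Toy NAND block: bits can be programmed 1 -> 0, but only an erase restores 1s."""

    def __init__(self, size=16):
        self.cells = bytearray([0xFF] * size)   # erased state is all 1s
        self.erase_count = 0                    # each erase wears the block

    def program(self, offset, value):
        # Programming can only clear bits, so the stored byte is ANDed with the new one.
        self.cells[offset] &= value

    def erase(self):
        # Erasing is all-or-nothing for the block.
        for i in range(len(self.cells)):
            self.cells[i] = 0xFF
        self.erase_count += 1
```

Writing 0xAA (1010 1010) over an erased byte works, but trying to write 0xFF back afterwards leaves 0xAA in place until the block is erased.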

Given these limitations, NAND designers and manufacturers introduced automated wear leveling across the devices. This process attempts to distribute digital information across the hardware in an even manner, not allowing any single bit, page or block to be utilized more than another. The leveling software will also copy highly accessed information around the NAND to discourage charge leakage. With the correct tools, this phenomenon can be seen through low-level analysis of a NAND. Typically, a forensics analyst can view multiple histories of a file because the NAND flash controller will elect to copy the entire file to a new block of NAND instead of modifying the existing imprint. These older versions of the file stay resident until the block is reset and new data is written. This, as well as all other NAND interactions, is managed by the NAND controller hardware. The NAND controller is also the main reason why writing successive 0’s and 1’s repeatedly over an entire device is meaningless to the technology: the controller will typically disallow such wasteful access to the memory.
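A rough sketch of why old file versions linger: a leveling policy that always writes to the least-worn erased block never touches the previous copy. The policy, class and method names below are assumptions made for illustration; real controllers are far more involved and garbage collection is not modelled.

```python
class WearLeveler:
    """Toy leveler: every write goes to the least-erased free block."""

    def __init__(self, n_blocks):
        self.erase_counts = [0] * n_blocks
        self.contents = [None] * n_blocks    # None means the block is erased
        self.current = {}                    # file name -> block holding the live copy

    def write(self, name, data):
        free = [i for i, entry in enumerate(self.contents) if entry is None]
        if not free:
            raise RuntimeError("no erased blocks left (garbage collection not modelled)")
        target = min(free, key=lambda i: self.erase_counts[i])
        self.contents[target] = (name, data)
        self.current[name] = target          # the older copy is simply left behind
        return target

    def erase(self, index):
        self.contents[index] = None
        self.erase_counts[index] += 1

    def stale_copies(self, name):
        """Older versions still physically present on the device."""
        live = self.current.get(name)
        return [entry[1] for i, entry in enumerate(self.contents)
                if entry is not None and entry[0] == name and i != live]
```

Writing the same file twice leaves the first version readable through `stale_copies` until its block is erased.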

Toshiba NAND Reference Design with NAND Controller

The final applicable detail about NAND flash pertains to mass production yields, transistor size and quality control. Manufacturers are constantly pushing the size of this hardware to be well below a 100% reliable component threshold. As such, devices are known to contain and ship with bad and unusable sections. These sections, much like the blocks that have exhausted their maximum number of times data can be written, are marked as “bad” at the controller level using a collection of NAND flash based error codes. These blocks are simply considered unusable by the overall system and are removed from the addressable space of the memory by the NAND controller. The NAND controller supports this functionality by keeping an active map of the hardware detailing valid and error prone blocks.

Lastly, it should be noted that most but not all embedded NAND flash devices contain a hardware-based NAND controller. Those devices that do not contain controlling hardware, such as smart cards, USB storage devices and the like, expect the controlling operating system to mark, flag, control and manipulate the hardware directly. As such, most modern operating systems have a basic understanding of NAND error and correction codes. For the devices that do contain hardware-based controllers, the operating system and hardware drivers perform read and write operations in a similar manner to their older magnetic platter counterparts.
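The "active map" of good and bad blocks can be pictured as a lookup table that simply skips anything marked bad. This is a deliberately simplified sketch; the function name and the example bad block numbers are made up, and real bad block tables carry more state than this.

```python
def build_map(total_blocks, bad_blocks):
    """Map logical block numbers to physical ones, skipping blocks marked bad."""
    bad = set(bad_blocks)
    return [physical for physical in range(total_blocks) if physical not in bad]


# Eight physical blocks, two of them marked bad (at the factory or through wear).
logical_to_physical = build_map(8, bad_blocks=[2, 5])
```

The addressable space shrinks to six blocks, and physical blocks 2 and 5 never appear in the map, so nothing above the controller will read or write them.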

Overview of the NAND Flash Standards

The 2 main standards bodies relevant to NAND are JEDEC and ONFI.   

Development NAND Breakout with a standard TSOP connection

The JEDEC (Joint Electron Device Engineering Council) committee is primarily concerned with ensuring the various vendors and manufacturers of NAND Flash hardware conform to certain chip package hardware standards. JEDEC is also concerned with ensuring general interoperability between manufacturers and NAND designs. JEDEC provides these services for numerous types of hardware and is far from a NAND-specific committee.

The ONFI (Open NAND Flash Interface) group is a governing body for NAND Flash specific interface standards. The group intends to dictate how NAND will interface with other hardware and (to some extent) other software in the wild.

In general, most NAND devices connect to other hardware with either a TSOP (Thin Small Outline Package) or BGA (Ball Grid Array) connection. The referenced standards dictate the footprint and layout of the hardware. In typical situations, embedded NAND is delivered on a 169 ball BGA package.

Standard Types of NAND to Board connections

Raw NAND vs. FTL Technologies

NAND Flash can come in a variety of configurations when manufactured. In specific relation to this research we can categorize them as such:
  • Raw NAND
  • NAND + FTL (Managed NAND)

Raw NAND Flash is a slab of NAND storage in its most basic form, and all management of the hardware and storage interactions is performed in software outside of the NAND. The Linux kernel utilizes the MTD (Memory Technology Device) subsystem to interact with these devices. This grouping contains only bare NAND and other MTD based devices. To add to the confusion, some raw NAND devices do have embedded ECC (error correction) and simple block management. The main differentiation in this instance is that the Linux kernel is treated as the master controller of the hardware, with the embedded processing simply supporting it.

NAND + FTL devices contain an on-package NAND controller that manages the slab of NAND flash internal to the chip. This controller will manage bad blocks, wear leveling and data access internally and provide an FTL (Flash Translation Layer) interface to outside software such as the Linux kernel. The FTL presents the NAND hardware as a standard block device externally. Though there are significant differences in implementation, this broad grouping contains MMC, eMMC, SD and SSD devices.
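What "presents the NAND hardware as a standard block device" means in practice can be sketched with a tiny translation table: the host sees stable sector numbers while the controller decides which physical page holds the data. This is a toy under my own assumptions (`TinyFTL` and its methods are hypothetical), not how any specific eMMC or SSD works.

```python
class TinyFTL:
    """Toy flash translation layer: logical sectors map to physical pages."""

    def __init__(self, n_pages):
        self.pages = [None] * n_pages   # physical pages, None means erased
        self.mapping = {}               # logical sector -> physical page
        self.next_free = 0

    def write_sector(self, sector, data):
        if self.next_free >= len(self.pages):
            raise RuntimeError("out of erased pages (garbage collection not modelled)")
        # Writes always go to a fresh page; only the mapping is updated.
        self.pages[self.next_free] = data
        self.mapping[sector] = self.next_free
        self.next_free += 1

    def read_sector(self, sector):
        page = self.mapping.get(sector)
        return None if page is None else self.pages[page]
```

After two writes to sector 0 the host only ever reads the second value back, yet the first is still sitting in physical page 0, out of reach of the block device interface.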
Raw NAND vs. Managed NAND (FTL)

Revisiting Post-Ex: Moving forward

Now that we all have a good understanding of the basic NAND architecture, we can have a little fun with data and process hiding. See you @ Infiltrate.

Monday, February 25, 2013

VisualSploit 2.0

Immunity is well known for products designed to make the lives and duties of network professionals, security auditors and penetration testers much easier.  However, one of the lesser-known features of CANVAS is VisualSploit.

VisualSploit is a learning utility that we created specifically for our popular Unethical Hacking training course that we conduct at INFILTRATE.

I have been an instructor of this course for a few years now, so I have been in a position to see how a lot of people consume, assimilate and digest topics such as buffer overflows, memory corruption, debugging and assembly. My conclusion is that these topics are best illustrated with simple, visual tools.  This way the students can walk away with a solid understanding of what happens before, during and after a buffer overflow.

VisualSploit is that simple, visual tool.  I decided to create VisualSploit v2.0 for a few reasons, but topping the list is that I had been paying close attention to how the tool could be improved so that students get the most out of training and learning about these topics.

The new VisualSploit v2.0 web interface

During the Unethical Hacking course we teach you everything you need to know about assembly in order to write an exploit for buffer overflows.  With the help of VisualSploit you can literally go from analyzing the crash in Immunity Debugger to a working exploit/proof of concept in a matter of minutes because no programming is required.  VisualSploit behind the scenes just builds a CANVAS exploit for you which means that everything you build will be available to you as a regular exploit module the next time you start up CANVAS.  You're welcome.

This visual and hands-on method of teaching and learning about buffer overflows is very effective.  I have yet to encounter a student who didn't have that "ah-ha!" moment where it all clicked and they were able to finish writing the more challenging (and fun) exploits for real-world applications by the end of the course.

So come join me in April during the INFILTRATE edition of the Unethical Hacking class.  It will be fun and educational but more importantly you get to break stuff.

- @MarkWuergler

Friday, February 15, 2013

MOSDEF-C for you and me

One of the super neat things about CANVAS and MOSDEF is that they provide a vehicle to write code that executes in the memory of an exploited host, meaning it doesn't have to touch disk if you don't want it to. This is a boon for covertness, as it forces any defensive measures to do in-memory forensics. So today we'll take a look at a quick post exploitation command that I wrote up for CANVAS based on a Windows kernel quirk discovered by Walied Assar.

Walied discovered a signedness error in NtSetInformationThread. I'll let his blog cover the specifics, but what it means for us is that we can set a thread's I/O and memory priority to max. While this isn't terribly relevant in a security context (you could leverage a more efficient DoS on the box you have code exec on, but that's silly), writing up a quick module is demonstrative, so let's go ahead and dive in!

When you start out on this path I would strongly encourage you to have working C/C++ code to base the module on because debugging in this scenario can be a frustrating process. So you will need:

1) Working C/C++ code
2) A Windows 7 VM with Immunity Debugger installed
3) A standard callback trojan deployed on the Win7 VM

When writing a new CANVAS module you'll need to create a new directory under CANVAS_ROOT/exploits with the module's name (note '-' is not allowed in module names so use '_' instead), and then within that directory you'll need a dialog.glade2 file as well as a .py with the module name. So my directory structure looks like:
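For the threadio module discussed in this post, the rules above imply two files under exploits/threadio. A small sketch that spells this out; the helper name `module_paths` is hypothetical and not part of CANVAS:

```python
import os


def module_paths(canvas_root, name):
    """Return the two files a module called `name` is expected to provide."""
    if "-" in name:
        raise ValueError("'-' is not allowed in module names, use '_' instead")
    module_dir = os.path.join(canvas_root, "exploits", name)
    return [
        os.path.join(module_dir, "dialog.glade2"),   # GTK dialog description
        os.path.join(module_dir, name + ".py"),      # the module itself
    ]


paths = module_paths("CANVAS_ROOT", "threadio")
```

On a Unix-like host `paths` would hold CANVAS_ROOT/exploits/threadio/dialog.glade2 and CANVAS_ROOT/exploits/threadio/threadio.py.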


Normally we recommend that customers take an existing module and adapt it to their needs. I used windows_sniffer for this purpose, but it required a lot of work since windows_sniffer is fairly complex and threadio is very simple. If you plan on writing your own commands using MOSDEF-C, I'd recommend using threadio as your example: just copy it into your exploit module's directory, give it the proper name and start modifying. Let's take a look at some source:

#! /usr/bin/env python

# Proprietary CANVAS source code - use only under the license agreement
# specified in LICENSE.txt in your CANVAS distribution
# Copyright Immunity, Inc, 2002-2013
# for more information

import sys
if "." not in sys.path: sys.path.append(".")

from localNode          import localNode
from timeoutsocket      import Timeout
from MOSDEF.mosdefutils import intel_order
from ExploitTypes.localcommand import LocalCommand

NAME                        = "threadio"
DESCRIPTION                 = "Set thread I/O and memory priority to max"
VERSION                     = "0.1"
GTK2_DIALOG                 = "dialog.glade2"
DOCUMENTATION               = {}
DOCUMENTATION["References"] = ""
DOCUMENTATION["Notes"]      = """

Tested on Win7 x86

A module to demonstrate MOSDEF-C, if you're looking to pass simple values (int) 
back from the host this is a good demonstration

It will attempt to set a thread's I/O and memory priority to the maximum assignable values
"""


PROPERTY               = {}
PROPERTY['SITE']       = "Local"
PROPERTY['TYPE']       = "Commands"
PROPERTY['ARCH']       = [ ["Windows"] ]

I generally call this the preamble to the module, where we take care of imports, the documentation dictionary and the properties dictionary. All of this should be easy to understand if you've done any Python so I'll just touch on a few points. 1) All the variables are required (NAME, DOCUMENTATION, etc), 2) it is worth your while to fill these out completely when you begin writing the module especially the references section. Finding that blog post you want to remember two months from now is not very fun.

class theexploit(LocalCommand):
    def __init__(self):
        self.result         = ""
        self.name           = NAME

    def run(self):
        self.setInfo("%s (in progress)" % (NAME))

        node     = self.argsDict['passednodes'][0]
        type     = node.nodetype.lower()
        nodename = node.getname()

        if isinstance(node, localNode):
            self.log('Node of type %s not supported.' % type)
            return 0
        if type not in ['win32node']:
            self.log('Node of type %s not supported yet.' % type)
            return 0

The "theexploit" class is the standard class from which all modules are run; by inheriting from LocalCommand (in lieu of, say, tcpexploit) we tell CANVAS what type of module this is. Next we get into the run function, which is another required function and where, in this case, all of our heavy lifting occurs.

With all command type modules it's always a good plan to put in some helpful error checking, which you see with our first two if statements. node = self.argsDict['passednodes'][0] provides us a node object which we can compare against another object type, localNode. If you've ever used the CANVAS GUI, localNode is the red circle that represents your CANVAS host, so here we check to ensure that isn't selected, because commands are meant to be run on compromised hosts rather than your CANVAS host. Next we get our node type with type = node.nodetype.lower() and check it against a list of node types this will work against. Since this is a Windows kernel issue, it makes sense that we only allow the module to be run on Windows nodes.

code = """
        #import "remote", "ntdll.dll|ZwSetInformationThread" as "ZwSetInformationThread"
        #import "remote", "ntdll.dll|ZwQueryInformationThread" as "ZwQueryInformationThread"
        #import "remote", "kernel32.dll|GetCurrentThread" as "GetCurrentThread"
        #import "remote", "kernel32.dll|GetCurrentThreadId" as "GetCurrentThreadId"
        #import "local", "sendint" as "sendint"
        void main() {
            int success;
            int threadId;
            int setResult;
            int queryResult;
            unsigned long p1;
            unsigned long p2;
            success = 42;
            p1 = 0xFF3FFF3C;
            p2 = 0;
            threadId = GetCurrentThreadId();
            setResult = ZwSetInformationThread(GetCurrentThread(), 0x16, &p1, 4);
            queryResult = ZwQueryInformationThread(GetCurrentThread(),0x16, &p2,4,0);
        # Compile the code and ship it over
        vars = {}
        request =, vars)

This is the meat of our module. I'll leave the specifics of the C code to the blog post referenced in the first paragraph, but I do want to point out a few things. First, this example is tied to MOSDEF-C for win32. If you're interested in MOSDEF-C for win64 I'll refer you to the windows_sniffer module; the changes are important but not difficult. In MOSDEF-C you have to import all your functions, and you do this with a line like: #import "remote", "ntdll.dll|ZwSetInformationThread" as "ZwSetInformationThread". A few things to note here: MSDN is your friend for determining which DLLs should contain which functions, but save yourself the aggravation and check that this is the case by using Immunity Debugger. Open a program that has your DLL loaded, press Alt+E to get the list of executable modules, right click your DLL and choose View Names, then find your function name. Additionally, it is wise (though not required) to import the function into MOSDEF-C with the same name as it exists in the Windows API.

Having a line like: #import "remote", "ntdll.dll|ZwSetInformationThread" as "ZwSetInformationThreat" is very annoying to debug if you use ZwSetInformationThread() later. Which brings me to my next point: yacc will give you some help when compiling on the CANVAS side before shipping the code over to the target host, but once it passes compilation you will have to use your cunning, savvy and a debugger to find any host-side errors.
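A mismatch like that is easy to catch before anything is shipped to the target. The helper below is a hypothetical pre-flight check, not something CANVAS provides: it scans MOSDEF-C source text for remote imports in the form shown in this post and reports any alias that differs from the API name.

```python
import re

# Matches: #import "remote", "some.dll|FunctionName" as "Alias"
IMPORT_RE = re.compile(
    r'#import\s+"remote"\s*,\s*"([^"|]+)\|([^"]+)"\s+as\s+"([^"]+)"'
)


def mismatched_aliases(source):
    """Return (function, alias) pairs where the alias differs from the API name."""
    return [(func, alias)
            for _dll, func, alias in IMPORT_RE.findall(source)
            if func != alias]


sample = '''
#import "remote", "kernel32.dll|GetCurrentThread" as "GetCurrentThread"
#import "remote", "ntdll.dll|ZwSetInformationThread" as "ZwSetInformationThreat"
'''
problems = mismatched_aliases(sample)
```

Since aliasing under a different name is allowed, a hit here is a warning to double check rather than an error.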

Variables do take a bit to get used to. Declaring a variable via: int ret = 4; gave me headaches. So I declared all my variables at the top then assigned them values after they'd all been declared. It may not be your style but I stopped getting wonky yacc errors after I followed this method.

sendint is a MOSDEF built-in that allows you to, as you'd expect, send an integer back to your CANVAS host. This is incredibly useful for localizing where your MOSDEF-C might be failing. I make use of it in multiple locations. You'll note that the success variable isn't strictly required, as no additional instructions are executed after the ZwQueryInformationThread call. This is a remnant of development, but having a final send after all substantive instructions have been executed allows you to know that all of your code ran.

# Handle the responses
        threadId = 0
        success = 0
        threadId =     # recv threadId
        setResult =   # recv ZwSetInformationThread result
        queryResult = # recv new thread priority
        success =      # recv success, not strictly needed

As you may expect, CANVAS has a corresponding readint() for receiving these values. I found it helpful to have my CANVAS python variable names and my MOSDEF-C variable names be consistent and to label my readint()'s with enough information that I could easily figure out which one wasn't firing. When you get into more complex code, like a readint() within a conditional statement, keeping things labeled will help immensely with debugging.
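The pairing rule is simply that every send on the target must be matched by a receive on the CANVAS side, in the same order. The sketch below imitates that with an in-memory buffer; the `sendint` and `readint` functions here are local stand-ins written for this example (with an assumed 32-bit little-endian encoding), not the MOSDEF or CANVAS implementations.

```python
import io
import struct


def sendint(stream, value):
    """Stand-in for the target side: push one 32-bit integer onto the wire."""
    stream.write(struct.pack("<I", value & 0xFFFFFFFF))


def readint(stream, label):
    """Stand-in for the CANVAS side: the label says which read came up empty."""
    data = stream.read(4)
    if len(data) != 4:
        raise ValueError("no data received for %s" % label)
    return struct.unpack("<I", data)[0]


wire = io.BytesIO()
for value in (1234, 0, 0, 42):     # threadId, setResult, queryResult, success
    sendint(wire, value)
wire.seek(0)

labels = ["threadId", "setResult", "queryResult", "success"]
received = {label: readint(wire, label) for label in labels}
```

If the target stops sending early, the labelled error points straight at the first value that never arrived.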

# Lets have some verbose error handling
            if threadId == 0:
                self.log("Unable to get current thread ID, this will likely fail")
            setResult = hex(setResult)
            if setResult != "0x0":
                self.log("Received an error when attempting to call ZwSetInformationThread")
                self.log("Error no: %s"%(setResult))
                self.log("Check here for error details:")
                raise ValueError
            if queryResult != 0:
                self.log("Error when attempting to call ZwQueryInformationThread, the module may have worked but unable to confirm")
                raise ValueError
            if success != 42:
                self.log("Encountered an error before the module exited")
                raise ValueError
        except ValueError:
            self.setInfo("%s - Done (failed)"%(NAME))
            return 0
        except Exception as e:
            self.log("Encountered an unhandled exception")
            self.setInfo("%s - Done (failed)"%(NAME))
            print e.message
            return 0

This may not be the most elegant or Pythonic way to do error handling but I found it made sense to me. The more effort you put into good error handling now, the easier it will be to debug this module in six months when Microsoft has adjusted something. A few functions here are useful: self.log() will generate CANVAS log events, and I recommend using it over print for debugging; self.setInfo() will set the module's status in the Current Status GUI tab and is helpful to set if others will be using your code.

        self.log("Thread Id: %d"%threadId)
        self.log("ZwSetInformationThread: %s"%setResult)
        self.log("ZwQueryInformationThread: %s"%queryResult)
        self.log("Success: %d"%success)
        self.setInfo("%s - Done (success)"%(NAME))
        return 1

Finally we dump some information to the user and tell CANVAS the module has completed by returning. As you can probably guess return 0 will tell CANVAS the module failed, return 1 the module succeeded.

I think the benefit of MOSDEF-C is that it quickly allows you to interface with the Windows API without touching the remote file system. There's no DLL to load; if you have a Windows CANVAS Node, this code is inserted into the running process and run there. A defender may be able to determine that a machine was compromised, but determining what specifically was done to that machine can be substantially more difficult if you make use of techniques like this. After all, hooking the entire Windows API isn't practical.

MOSDEF-C is a bit of a labor of love. If you're interested in starting to use it I would seriously suggest starting to read through the ./MOSDEF/ directory in CANVAS and then proceeding to ./MOSDEF/MOSDEFlibc. It is a powerful tool but it's important to note some of the current implementation limitations and some of the language quirks before you start doing anything too complicated.

The link to the complete module source can be found at:

Wednesday, February 13, 2013

WPS Attack Detection and Reaction

Those of us in the security industry are well aware that it can take vendors a long time to develop and deploy a preventative measure in reaction to an attack.  This is especially true with embedded devices that have no auto-update feature and/or are completely forgotten about once they are up and running on the network.

As a refresher, in December of 2011 an attack was published that targeted a weakness in the Wi-Fi Protected Setup (WPS) protocol and demonstrated how to significantly decrease the number of attempts needed to derive a valid WPS PIN during a brute force attack.  This attack leaves most routers that have WPS enabled vulnerable, allowing an attacker to learn the WPA Pre-Shared Key (PSK) or WEP key as well as gain access to more configuration information.
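The post does not go into the mechanics, so the numbers below come from the public description of the attack rather than from this text: the eighth PIN digit is a checksum of the first seven, and the protocol confirms the first four digits separately from the remaining three. A short sketch of the resulting arithmetic (function and constant names are my own):

```python
def wps_checksum(pin7):
    """Checksum digit for the first seven digits of a WPS PIN."""
    accum = 0
    while pin7:
        accum += 3 * (pin7 % 10)
        pin7 //= 10
        accum += pin7 % 10
        pin7 //= 10
    return (10 - accum % 10) % 10


def full_pin(pin7):
    """Seven chosen digits plus the derived checksum digit."""
    return "%07d%d" % (pin7, wps_checksum(pin7))


NAIVE_ATTEMPTS = 10 ** 7             # seven free digits, the eighth is derived
SPLIT_ATTEMPTS = 10 ** 4 + 10 ** 3   # first half, then the three remaining digits
```

Checking the halves separately cuts the worst case from ten million guesses to 11,000, roughly a 900-fold reduction.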

Click here to see a video of an attack against WPS using SILICA.

Most routers are still vulnerable to this attack today because there is no easy way to disable WPS in the router's configuration interface (a regular user is not going to go through the trouble of modifying firmware) and not very many people/organizations are very good about checking for and updating new firmware.  I should probably also mention the probability that most people that use or administer a wireless router are completely oblivious to the fact that there is such a weakness in WPS and don't disable it even if they can (after all it is a protocol meant to provide a convenient method to the admin and network users).

Even though it usually takes a long time for vendors to respond to this kind of attack we have recently seen a change in Netgear's firmware that actually addresses the security weakness.  Take a look at the following section taken from a Netgear R6300 web interface:

This is the first vendor response that I have seen for the WPS PIN attack.  After 3 failed attempts the feature is disabled and you get the following message next to the feature configuration: 


This obviously does offer another avenue of attack, to the tune of a not-so-exciting denial of service (DoS), making it easy for an attacker to turn off WPS altogether.
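The lockout and its side effect fit in a few lines. This is a sketch of the behaviour as described above, not Netgear's firmware logic; in particular, whether the counter resets after a successful attempt is an assumption, and the class name is made up.

```python
class WpsPinLockout:
    """Toy model: the PIN feature disables itself after three failed attempts."""

    MAX_FAILURES = 3

    def __init__(self, pin):
        self.pin = pin
        self.failures = 0
        self.enabled = True

    def attempt(self, guess):
        if not self.enabled:
            return False                 # feature is off, even for the right PIN
        if guess == self.pin:
            self.failures = 0            # assumed: success resets the counter
            return True
        self.failures += 1
        if self.failures >= self.MAX_FAILURES:
            self.enabled = False         # brute force stops here, and so does WPS
        return False
```

Three wrong guesses are enough to stop the brute force, and also enough for anyone in range to lock legitimate users out of the PIN method.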

This slightly changes the game (and by slightly I mean not very much).  It used to be that identifying that WPS is enabled was all an attacker needed to determine if the AP was vulnerable to this attack (no major or minor versions to check before launching the attack - it's kind of the same feeling you get when you are pentesting a ColdFusion server/application; it doesn't matter the version you just know it's vulnerable). 

The WPS bug and attack are not going anywhere for a long time but it's interesting to see the proactive actions of vendors hoping for an eventual extinction.  I will keep you posted if I see any similar trends.

Friday, February 8, 2013

Infiltrate Preview - The Chameleon: A cellphone-based USB impersonator.

If you've ever been part of a Call For Papers committee you know that the information provided in the submission's abstract is generally vague and ambiguous. A talk could be a total Fail or THE talk that will define your conference and ensure it will return next year. That's why we ask for a draft version of the research and involve our technical team in asking questions and vetting submissions.

As an attendee, you of course have to decide which conferences to attend. Immediate financial costs, scheduling time off, logistical details: these are all headache-inducing aspects of going to a conference. You read a vague abstract on a conference website but you are not allowed to ask questions. You understand the Windows 8 heap, but does this presenter really know what they're talking about or will this be a waste of your time? In the past folks have had to base their attendance decision on how well the conference has been reviewed in the past (reviews which are sometimes paid for by the organizers) and on speaker reputation (another criterion used by CFP committees).

At Infiltrate, we like you, heck we understand you because we've gone through that ourselves. So in order to change this annoying process in the infosec community, we're offering something different :). We are encouraging our selected speakers to be guest bloggers here, to provide more in depth information on their topics and to field questions.

The ball is now in your court, the shoe on your foot, the ace is in your sleeve. As an attendee you should take the time to research the subject and try to get the most out of each talk. We generally like to brag that our audience poses the most difficult questions to presenters, let's try to keep that up! Without further ado:

Matias Soler on:

The Chameleon: A cellphone-based USB impersonator

Fuzzing or auditing USB stuff has been around for a while but doing it from the device's perspective is a relatively unexplored area. This is mostly because of the complications that are involved in pretending to be a USB device to attack a USB host. The simplest method involves some hackish modifications on QEMU, where you can write a driver for QEMU for your emulated device and it will appear as real to the virtualized OS.

Unfortunately you cannot run everything interesting inside a virtual machine (think smart TVs, music centers, etc.), and even if you could, operating systems may behave differently on real hardware. Even more importantly, your real target probably won't be inside QEMU.

Travis Goodspeed has done an amazing job building a small device with two USB ports that solves this problem. One end goes to your computer and the other one to the target machine. By using Python he is able to control the device such that it appears as a real USB gadget to the target. Although this is awesome, building this device requires building the PCB and, more importantly, SMD soldering skills (that I don't have).

So our goal was to build a device that would let us quickly prototype USB gadgets and that was easy to assemble. We decided to use a Teensy board, which is not much more than an AVR microcontroller with built-in USB support. It ships with all the basic components it needs to work, it has a USB bootloader so you can reprogram it through that same port, and of course it comes pre-assembled :)

Things we love about this board:
  • It is a small board with a relatively powerful microprocessor
  • The processor has a built-in USB controller
  • It has a handy bootloader that allows for programming the microprocessor via USB very easily
The Teensy's USB controller is managed entirely by setting and reading a couple of specific registers which makes life wonderfully simple. For example, sending data from the device to the host would be something like this:

There are lots of examples on the Internet of using this device to emulate different USB gadgets such as keyboards, mice, mass storage, etc. There are some drawbacks, though:

  • There is no easy way to debug: you can always send data through USB, but you will be interfering with the emulation
  • You have to code in C/C++
  • For each test you need to recompile and re-flash.

The last issue may not seem very important, but when you are doing hundreds of tests it adds a lot of time overhead. Also, if you are testing on real hardware, you will need to unplug the Teensy and connect it to your development machine for reprogramming.

So instead of emulating the whole gadget on the Teensy, we just use it as a proxy. This proxy is controlled from a Python script where all the USB logic is implemented. This allows us to move all the complexity of coding USB stuff from the hardware device to the host side, using Python code, which is much friendlier. We also designed our library in layers so that the upper layers do not depend on our actual hardware; this gives us the ability to port the library to different hardware if needed.

For this we needed another communication channel from the Teensy to our development computer, so we added a generic USB-Serial adapter connected to the host's serial port:

  • USB device that emulates a serial port on the host
  • RX and TX lines operate at TTL levels:
      • 0 V to 0.8 V = 0
      • 2.2 V to 5 V = 1
  • Native support on Linux

Now the developer's computer communicates via the USB-serial line to the Teensy, where it can interact with the USB controller. On the other side the Target machine is connected directly to the USB port of our device.

We built a very basic firmware for the Teensy that reads commands from the serial line and based on the value received it will either read or write a register from the USB controller. The first byte transmitted either from the host to the Teensy or from the Teensy to the host has the following format:

Where opcodes can be SET_REGISTER, READ_REGISTER or PING, and the register value is 5 bits that represent what register will be read or written to. In the case of a SET_REGISTER operation a second byte is sent indicating the value that should be written to the register. In the case of READ_REGISTER the value will be returned by the Teensy to the host.
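To make the framing concrete, here is a minimal host-side sketch in Python. The bit layout (opcode in the top 3 bits, register index in the low 5 bits) and the numeric opcode values are assumptions made for illustration; the text above only states that the register field is 5 bits wide and names the three opcodes. `encode_command` and `decode_header` are hypothetical helper names.

```python
from enum import IntEnum


class Opcode(IntEnum):
    # Numeric values are assumptions; the post only names the opcodes.
    SET_REGISTER = 0
    READ_REGISTER = 1
    PING = 2


REGISTER_BITS = 5
REGISTER_MASK = (1 << REGISTER_BITS) - 1  # 0b11111


def encode_command(opcode, register=0, value=None):
    """Build the bytes sent on the serial line for one command."""
    if not 0 <= register <= REGISTER_MASK:
        raise ValueError("register index must fit in 5 bits")
    # Assumed layout: opcode in the high bits, register in the low 5 bits.
    frame = bytearray([(opcode << REGISTER_BITS) | register])
    if opcode == Opcode.SET_REGISTER:
        if value is None or not 0 <= value <= 0xFF:
            raise ValueError("SET_REGISTER needs a one-byte value")
        frame.append(value)  # second byte: value to write
    return bytes(frame)


def decode_header(byte):
    """Split a header byte into (opcode, register index)."""
    return Opcode(byte >> REGISTER_BITS), byte & REGISTER_MASK
```

With this layout a SET_REGISTER always costs two bytes on the wire, while READ_REGISTER and PING cost one byte in each direction.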

PING is just for debugging: a PING_REPLY is sent back every time we send this request, which lets us know whether some part of the chain has died. Just to be more clear, here is part of the firmware code (it has been trimmed so it can fit nicely on the blog post):
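The dispatch behaviour described above can also be modelled in a few lines of Python. This is not the firmware itself (which is C running on the Teensy) but a sketch for reasoning about the protocol: the opcode numbers, the `PING_REPLY` byte value and the dictionary standing in for the USB controller registers are all assumptions, and replies are returned as raw bytes for simplicity.

```python
SET_REGISTER, READ_REGISTER, PING = 0, 1, 2  # assumed opcode numbers
PING_REPLY = 0xFF                            # assumed reply byte


def run_firmware(serial_in, registers):
    """Consume command bytes and return the bytes the device would send back.

    `registers` is a dict standing in for the USB controller registers,
    keyed by the 5-bit register index.
    """
    incoming = iter(serial_in)
    replies = bytearray()
    for header in incoming:
        opcode, reg = header >> 5, header & 0x1F
        if opcode == SET_REGISTER:
            registers[reg] = next(incoming)  # second byte carries the value
        elif opcode == READ_REGISTER:
            replies.append(registers.get(reg, 0))
        elif opcode == PING:
            replies.append(PING_REPLY)
    return bytes(replies)
```

Feeding it a write to register 5, a read of the same register and a PING returns the stored value followed by the ping reply, which is the round trip the host-side Python library relies on.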

And finally here's what it looks like on my desk:

As you can see, the device is very simple to build: you just need to attach three wires and it's done!

The firmware is also very simple, but the whole complexity is moved to Python on the host side, where you need to understand and handle almost raw USB requests.

Despite all the good features of this project, it has its limitations. When you are emulating devices that require a lot of data transfer, USB storage for example, the results are pretty slow. The bottleneck right now is the serial line. Some improvements can be made to the protocol, like handling bulk data transfers instead of going byte by byte, but that will at most double the speed, which is still not enough.
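The "at most double" estimate follows from the framing described earlier: writing one data byte costs a command byte plus a value byte, so even a perfect bulk mode can only remove the command byte. A back-of-the-envelope sketch in Python, where the 115200 baud line speed and 8N1 framing are assumptions (the post does not state the serial settings) and only the header overhead is considered:

```python
def payload_bytes_per_second(baud, wire_bytes_per_payload_byte, bits_per_wire_byte=10):
    """Payload throughput of a serial link.

    bits_per_wire_byte=10 assumes 8N1 framing (start bit, 8 data bits, stop bit).
    """
    wire_bytes_per_second = baud / bits_per_wire_byte
    return wire_bytes_per_second / wire_bytes_per_payload_byte


BAUD = 115200  # assumed line speed, not stated in the post

byte_by_byte = payload_bytes_per_second(BAUD, 2)  # command byte + value byte
bulk_limit = payload_bytes_per_second(BAUD, 1)    # command overhead amortised away

print(f"byte by byte: {byte_by_byte:.0f} B/s, bulk upper bound: {bulk_limit:.0f} B/s")
```

Whatever the real baud rate is, the ratio between the two figures stays at two, which is why a faster transport is needed rather than a smarter protocol.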

At this point we were forced to look for alternatives, and we decided to move to a cellphone as the base hardware. Surprisingly, a cellphone is a perfect platform for this task: phones have powerful processors, big screens, Wi-Fi and, more importantly, a guest-capable USB controller. In particular we chose the Samsung S2, as it is very popular and its kernel is relatively easy to replace.

If you want to see more on this new approach, you should come to Infiltrate and see my talk, "Chameleon, a cellphone-based USB impersonator" :)

~ Matias Soler


Wednesday, February 6, 2013

SWARM and Vulnerability Impact

Yesterday the latest D2 exploit pack for CANVAS was released. In it there was an exploit module for the local file include in TP-LINK's TL-WR841N Wireless Router (CVE-2012-6276). I'd been looking at some data in SWARM that had a high concentration of TP-LINK routers so in the back of my mind I thought that it'd be interesting to see if I could find vulnerable routers with a high degree of confidence.

For those of you playing at home SWARM is Immunity's answer to distributed and parallel CANVAS. If you need to scale CANVAS to hundreds of thousands of IPs per hour, then SWARM is the solution. One of the things that SWARM is really good at is recon on a massive scale which can give you perspective on how much you should care about a particular vulnerability. Sure enough today another interesting router vulnerability was disclosed by the folks at, and they were nice enough to include some tips on finding these routers. Having looked at well over 100,000,000 IPs with SWARM at this point I decided to see how many of these routers were really available.

We learned a few things:
1) We had much better results parsing the index page for the webserver than relying on server banners because of #2

2) There are a variety of server headers associated with this product. We saw: Mathopd, "Linux, HTTP/1.1, DIR-[3|6]00", obviously spoofed headers, and even blank server headers

3) Sometimes you get the firmware version in the HTTP server header (2.x series), other times you have to parse the server's index.html (1.x series), and occasionally you don't get it at all

4) Some configurations will leak what appear to be internal host names and MAC addresses to unauthenticated users (name=$HOSTNAME&key=$MAC_ADDR is the string if you're curious). They didn't look like SSIDs but without more context it's hard to tell.

5) There aren't that many easily found routers of this type with their web servers exposed to the internet; in our data set we found ~1500

6) We found only a handful of routers which met the version criteria to be considered vulnerable
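
Points 1 to 3 boil down to "try the server header, then fall back to the index page". A minimal Python sketch of that fallback follows. Both regular expressions are illustrative guesses at what the responses look like, not the patterns SWARM uses, and `firmware_version` is a hypothetical helper name.

```python
import re

# Both patterns are assumptions about the response formats; adjust to real data.
HEADER_VERSION = re.compile(r"DIR-[36]00\s+Ver\s+(\d+\.\d+)")
INDEX_VERSION = re.compile(r"Firmware Version\s*:?\s*(\d+\.\d+)", re.IGNORECASE)


def firmware_version(server_header, index_html):
    """Return the firmware version string, or None when it is not exposed."""
    # Header first, then the index page.
    for pattern, text in ((HEADER_VERSION, server_header), (INDEX_VERSION, index_html)):
        match = pattern.search(text or "")  # tolerate blank or missing input
        if match:
            return match.group(1)
    return None
```

Returning None instead of raising keeps the "occasionally you don't get it at all" case explicit, so those hosts can be counted separately rather than silently dropped.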

A subtle thing people forget with scans of any real magnitude is that they offer a rolling snapshot of the IP space. The results you get for a particular IP reflect only the results for that IP at the time the request was sent. This may seem obvious but if your scan takes two days to run you can't consider the output to be "Thursday's results". Half your requests went out on Thursday the other half Friday. For environments where hosts are statically mapped this isn't as big of an issue but when you look at big networks it becomes a headache. It's fun to see the exact same host pop up at multiple IPs within the same set of results. It's possible you'll miss hosts entirely if they jump from the end of the IP range to the beginning while the scan is running.
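
One way to spot the "same host at multiple IPs" effect is to group results by some stable identifier recovered from the response instead of by IP. The sketch below assumes each result carries a timestamp, the IP that was probed and such an identifier (for example the leaked MAC address mentioned above). The tuple layout and the function name are hypothetical.

```python
from collections import defaultdict


def hosts_seen_at_multiple_ips(results):
    """Map each host key to its sightings when it appeared at more than one IP.

    `results` is an iterable of (timestamp, ip, host_key) tuples, where host_key
    is any stable identifier recovered from the response.
    """
    sightings = defaultdict(list)
    for timestamp, ip, host_key in results:
        sightings[host_key].append((timestamp, ip))
    return {
        key: sorted(seen)  # chronological order of sightings
        for key, seen in sightings.items()
        if len({ip for _, ip in seen}) > 1
    }
```

Keeping the timestamp next to each sighting also makes it harder to forget that the scan is a rolling snapshot rather than a single point in time.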

Going through this exercise gave me another idea for SWARM. Many American ISPs now bundle anti-virus with their broadband offerings and security is playing a bigger part of the marketing narrative. I think a neat application of SWARM that fits with this is for your ISP to notify you if your router has a public vulnerability. At some set interval the ISP can kick off a scan that finds all the routers with their webservers exposed to the internet, then filter that through some Python to pick out models with public bugs (which is all of them). Easy, automated, helpful.

Tuesday, February 5, 2013


When the Java 7 update 11 came out, Esteban freaked out.

On every patch cycle, someone somewhere freaks out. It's life. After so many years of working in exploit development everyone learns to make peace with that fact (well, maybe not Max).
This time Esteban had valid reasons. As it turns out, Java decided to burn all its bridges and basically set the default security level to High, which means that every time a website serves an applet you get an awful warning message that requires the user's confirmation.

This pretty much kills all your Java bugs... Why? Because if your attack strategy relies on a user clicking through a prompt then it's just better to self-sign an applet and it will run out of the sandbox when they click ok. Simple right?

But then, we are talking about Java, right? So we decided to take a look at the implementation to see if there was any way we could bypass it.

Fifteen minutes later Esteban came back to my desk. The job was done. The high fives were his.

Bug? Feature? Miscarriage? 

It's hard to classify this vulnerability, especially since this was supposed to be the main security feature that Java released and something that people were blogging about.

The function responsible for initializing the Java applet, Plugin2Manager.initAppletAdapter, is the one that contains the new security protections.

void initAppletAdapter(AppletExecutionRunnable paramAppletExecutionRunnable)
  throws ClassNotFoundException, IllegalAccessException,
  ExitException, JRESelectException, IOException,
  InstantiationException {
 long l = DeployPerfUtil
   .put(0L, "Plugin2Manager.createApplet() - BEGIN");
 String str1 = getSerializedObject(); // "object" attribute
 String str2 = getCode();             // "code" attribute
 Plugin2ClassLoader localPlugin2ClassLoader = getAppletClassLoader();
 DeployPerfUtil
   .put("Plugin2Manager.createApplet() - post getAppletClassLoader()");
 Object localObject1;
 if (/* ... */) {
  localObject1 = new IOException(/* ... */);
  throw ((Throwable) localObject1);
 }
 if ((str2 != null) && (str1 != null)) {
  // ...
  throw new InstantiationException(
    "Either \"code\" or \"object\" should be specified, but not both.");
 }
 if ((str2 == null) && (str1 == null))
  /* ... */;
 if (str2 != null) {
  localObject1 = localPlugin2ClassLoader.loadCode(str2);
  DeployPerfUtil
    .put("Plugin2Manager.createApplet() -  post loader.loadCode()");
  String str3 = getAppletCode();
  Object localObject2 = null;
  if (!str3.equals(str2))
   localObject2 = this.loader.loadCode(str3);
  else
   localObject2 = localObject1;
  this._signedApplet = isAppletSigned((Class) localObject2);
  if (localObject1 != null)
   if (fireAppletSSVValidation()) { // the security check happens on this path only
    // ...
   } else {
    if (paramAppletExecutionRunnable != null)
     /* ... */;
    this.adapter.instantiateApplet((Class) localObject1);
    DeployPerfUtil
      .put("Plugin2Manager.createApplet() - created applet instance");
   }
 } else {
  // serialized-object path: no call to fireAppletSSVValidation()
  if (!this.isSecureVM)
   /* ... */;
  this.adapter.instantiateSerialApplet(localPlugin2ClassLoader, str1);
  this.doInit = false;
  DeployPerfUtil
    .put("Plugin2Manager.createApplet() - post: secureVM .. serialized .. ");
 }
 if (!this.adapter.isInstantiated()) {
  System.out.println("Failed to instantiate applet??");
  // ...
 }
 if (this.shouldStop) {
  if (DEBUG)
   Trace.println("Applet ID " + this.appletID
     + " killed during creation", TraceLevel.BASIC);
  synchronized (this.stopLock) {
   this.stopSuccessful = true;
   // ...
  }
  // ...
 }
 DeployPerfUtil.put(l, "Plugin2Manager.initAppletAdapter() - END");
}

I will give you 5 seconds to read the code and facepalm.
5... 4... 3... 2... 1... boom!

At the beginning of the function, the code obtains a serialized object (the str1 variable) and the code (the str2 variable); only one of them can be used, which is what the two checks on str1 and str2 enforce.

The normal way people use applets is through the “code” attribute (the str2 variable), and as a result the fireAppletSSVValidation method is called, displaying the warning to the end user. In order to bypass this protection, the exploit has to take the second route and load the applet through a serialized object (the instantiateSerialApplet call in the else branch).

 <embed object="object.ser" type="application/x-java-applet;version=1.6"> 

Oracle's fix was a one-liner: they added a call to fireAppletSSVValidation() before the instantiateSerialApplet call. Job done.
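
Stripped of everything else, the control flow and the fix can be modelled with a toy Python function. This is only a sketch of the branching described above, not Oracle's code: the function name, the `user_accepts` flag standing in for the warning dialog and the `patched` switch are all hypothetical.

```python
def init_applet(code, serialized_object, user_accepts, patched=False):
    """Return True if the applet ends up being instantiated."""
    if code is not None and serialized_object is not None:
        raise ValueError('Either "code" or "object" should be specified, but not both.')
    if code is None and serialized_object is None:
        return False
    # Before the fix, only the "code" path reached the validation step.
    if code is not None or patched:
        if not user_accepts:  # stand-in for the fireAppletSSVValidation() prompt
            return False
    return True


# Unpatched: the serialized-object path never asks the user anything.
print(init_applet(None, "object.ser", user_accepts=False))                # True
print(init_applet(None, "object.ser", user_accepts=False, patched=True))  # False
```
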
Is Java dead now? I will leave that as an exercise to you, and if you don't know how to approach this, maybe you are interested in learning how to audit and find your own bypass? In that case we can discuss it in April during Master Class Java Auditing Extravaganza.

 Since Immunity is very pleased with the revamped blog page, we decided to share this happiness by offering discounts for INFILTRATE 2013. Send an email to with the subject "KEEP CALM and Run this Applet" and you will receive a 10% discount off of the conference briefings pass, Unethical Hacking training and/or Web Hacking training. This is a limited time promotion. You have until Friday February 8, 2013 @ 4pm EST to take advantage of these savings!