Immunity Products: Infiltrate Preview - NAND-Xplore -> Bad Blocks = Well Hidden

Please, welcome Josh "m0nk" Thomas, our first Infiltrate Guest Blogger!

Post-Ex can be sexy

So, you’re frustrated: You’ve spent countless nights discovering the most epic remote mobile 0day imaginable and innumerable hours crafting an intricate payload… only to get the whole thing popped by some stupid Android-based AV variant. It’s not like you write typical “churn and burn” malware or ransomware where everything is on an expendable cycle; you really are trying to pull off some covert, next-level, long-term injections, and they all just caught fire. Hmmm… It might be time to calm down on the offensive front until you have an acceptable post-exploit landscape to build upon.

“But m0nk, post-exploit is boring and not sexy; right?”

Actually, I think you are dead wrong. Post-exploitation can be far more deviant than exploitation, and the findings typically have a longer shelf life. With that mentality in mind, the NAND-Xplore project was born. NAND-Xplore is an attempt to investigate just how deep files can be hidden on an embedded system, starting with a deep understanding of the “bare metal hardware” well below the operating environment. The project attempts to expose weaknesses in the actual NAND data storage hardware and implementation architectures and to showcase the vulnerable underpinnings across the spectrum of NAND-based platforms. The project is focused on two PoC tools: one to hide files on NAND devices and one to find them. The overall assumption of the project is that real-world advanced malware already contains these tricks; we just don’t know about it yet.

Before the Infiltrate talk itself, I thought it might be useful to share some background info on the NAND Flash technologies themselves. The talk will pick up where this blog post leaves off, primarily with how the Linux kernel interacts with NAND flash and how those interactions can be manipulated and controlled.

A Deeper Understanding of How NAND Functions

Sample NAND Prototype chip with visible blocks and pages

Hardware functionality of actual NAND Flash

In the most basic sense, NAND devices store individual bits of data in a multidimensional array of floating-gate transistors. The floating gate allows each cell to trap electrons, thus holding or releasing a charge. It is this charge that corresponds to a single 0 or 1 for the device. The multiplexed transistor design, coupled with Fowler-Nordheim tunnel injection and release, allows cells in this grid of floating gates to be addressed at the single-bit level. In layman’s terms, consider the NAND flash to behave as a highly dense, addressable LED array.

A Simple NAND Circuit

Each individual flash cell is contained in a collection designated as a page. Pages on NAND devices are typically collections of 512, 2048 or 4096 bytes. In turn, pages are grouped into a construct known as a block. NAND block sizes typically follow powers of two and can range from 16 KB to 512 KB.
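The page and block arithmetic above can be sketched in a few lines of Python. This is only an illustration: the `NandGeometry` class, its field names and the 2048-byte page / 128 KB block figures are assumptions picked from the ranges quoted above, not the layout of any particular chip, and the spare (out-of-band) bytes real pages carry are ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NandGeometry:
    """Hypothetical geometry; real values come from the chip's datasheet."""
    page_size: int   # bytes per page (data area only)
    block_size: int  # bytes per erase block

    @property
    def pages_per_block(self) -> int:
        return self.block_size // self.page_size

    def locate(self, byte_offset: int) -> Tuple[int, int, int]:
        """Split a flat byte offset into (block, page in block, byte in page)."""
        block, rest = divmod(byte_offset, self.block_size)
        page, column = divmod(rest, self.page_size)
        return block, page, column


geometry = NandGeometry(page_size=2048, block_size=128 * 1024)
print(geometry.pages_per_block)  # 64
print(geometry.locate(300_000))  # (2, 18, 992)
```

With these assumed sizes, byte 300,000 lands in block 2, page 18, 992 bytes into the page. The block number matters most for what follows, because the block is the smallest unit that can be erased.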

While the grid architecture of NAND flash allows for addressing at the single bit level, such accuracy comes with a hard set of limitations:

All bits on the device default to and are initially set to a 1. The shift from a 1 to a 0 is a simple electrical pulse that programs the cell by pushing charge onto the floating gate. Sadly, shifting the other direction (from a 0 to a 1) is nontrivial and cannot be performed at the bit level, only at the block level. As such, shifting a stored byte from 1111 1111 to 1010 1010 is trivial, but the reverse would entail erasing an entire block of up to 512 KB.
The physical floating-gate transistors are fragile and slowly wear down over time. Typical industry expectations are that each gate can survive around 100,000 state changes before becoming unreliable and unstable. Once a block has become unstable, the NAND controller has the ability to mark it “bad”. This designation ensures the block is removed from rotation and can no longer be read or accessed automatically.
As the gates wear over time, charge leakage can occur. This leakage will corrupt neighboring cells and their stored information. Charge leakage can also occur with exceptionally high levels of repeated reading, even without writing to a cell. This is mostly due to the power applied across the grid to query a specific cell.
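The first two limitations can be modelled with a toy block. Everything here is a simplification for illustration: `ToyBlock`, the 16-byte block size and the three-erase endurance limit are made-up stand-ins for real block sizes and the roughly 100,000-cycle figure, and no real controller is this naive.

```python
ERASED = 0xFF
BLOCK_SIZE = 16  # tiny illustrative block; real blocks are 16 KB to 512 KB
MAX_ERASES = 3   # stand-in for the ~100,000-cycle endurance figure


class ToyBlock:
    def __init__(self):
        self.cells = bytearray([ERASED] * BLOCK_SIZE)
        self.erase_count = 0
        self.bad = False

    def program(self, offset: int, value: int) -> int:
        """Programming can only clear bits, so the result is old AND new."""
        self.cells[offset] &= value
        return self.cells[offset]

    def erase(self) -> None:
        """Reset every byte in the block to 0xFF at the cost of one wear cycle."""
        self.cells[:] = bytes([ERASED] * BLOCK_SIZE)
        self.erase_count += 1
        if self.erase_count >= MAX_ERASES:
            self.bad = True  # worn out: a controller would retire this block


block = ToyBlock()
print(hex(block.program(0, 0b10101010)))  # 0xaa: 1 -> 0 works per bit
print(hex(block.program(0, 0b11111111)))  # 0xaa: 0 -> 1 is silently impossible
block.erase()
print(hex(block.cells[0]))                # 0xff: only a block erase restores 1s
```

Modelling a program operation as a bitwise AND captures the asymmetry: writing 1010 1010 over 1111 1111 succeeds, while the reverse leaves the byte unchanged until the whole block is erased.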

Given these limitations, NAND designers and manufacturers introduced automated wear leveling across the devices. This process attempts to distribute digital information across the hardware in an even manner, not allowing any single bit, page or block to be utilized more than another. The leveling software will also copy highly accessed information around the NAND to discourage charge leakage. With the correct tools, this phenomenon can be seen through low-level analysis of a NAND. Typically, a forensics analyst can view multiple histories of a file because the NAND flash controller will elect to copy the entire file to a new block of NAND instead of modifying the existing imprint. These older versions of the file stay resident until the block is reset and new data is written. This, as well as all other NAND interactions, is managed by the NAND controller hardware. The NAND controller is also a main culprit for why writing successive 0s and 1s repeatedly over an entire device is meaningless to the technology, typically because the controller will simply disallow such wasteful access to the memory.
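A minimal sketch of that out-of-place behaviour, assuming a controller that always writes to the least-worn free block. `ToyWearLeveler` and its methods are hypothetical names, one file is assumed to fit in one block, and real controllers work on pages with far more elaborate garbage collection.

```python
class ToyWearLeveler:
    def __init__(self, block_count: int):
        self.blocks = [None] * block_count  # (name, data) held by each physical block
        self.erase_counts = [0] * block_count
        self.current = {}                   # file name -> block holding the live copy

    def write(self, name: str, data: bytes) -> int:
        """Write out of place: use the least-worn free block, leave old copies stale."""
        free = [i for i, held in enumerate(self.blocks) if held is None]
        if not free:
            raise RuntimeError("no free block; a real controller would reclaim one")
        target = min(free, key=lambda i: self.erase_counts[i])
        self.blocks[target] = (name, data)
        self.current[name] = target
        return target

    def copies(self, name: str) -> list:
        """Every copy still physically present, stale versions included."""
        return [held[1] for held in self.blocks if held and held[0] == name]

    def reclaim(self, index: int) -> None:
        """Erase a stale block so it can be reused, costing one wear cycle."""
        self.blocks[index] = None
        self.erase_counts[index] += 1


leveler = ToyWearLeveler(block_count=4)
for version in (b"v1", b"v2", b"v3"):
    leveler.write("notes.txt", version)
print(leveler.copies("notes.txt"))      # [b'v1', b'v2', b'v3']
leveler.reclaim(0)                      # the oldest copy is finally erased
print(leveler.write("notes.txt", b"v4"))  # 3: the never-erased block wins
```

Three successive saves leave three physical copies on the toy device, which is the "multiple histories" effect an analyst sees. After block 0 is reclaimed, the next write still goes to block 3 because block 0 now carries one more erase cycle.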

Toshiba NAND Reference Design with NAND Controller

The final applicable detail about NAND flash pertains to mass production yields, transistor size and quality control. Manufacturers are constantly shrinking this hardware past the point at which every component can be guaranteed 100% reliable. As such, devices are known to contain and ship with bad and unusable sections. These sections, much like the blocks that have exhausted their maximum number of write cycles, are marked as “bad” at the controller level using a collection of NAND flash based error codes. These blocks are simply considered unusable by the overall system and are removed from the addressable space of the memory by the NAND controller. The NAND controller supports this functionality by keeping an active map of the hardware detailing valid and error-prone blocks.
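One simple way to picture that map is a skip-block table: logical block numbers are handed out only to blocks not flagged bad. The sketch below is an assumption-laden illustration (`build_block_map`, the eight-block device and the choice of bad blocks are all invented); actual controllers keep their tables in vendor-specific formats.

```python
def build_block_map(total_blocks: int, bad_blocks) -> list:
    """Return the physical block numbers exposed as logical blocks 0..n-1."""
    bad = set(bad_blocks)
    return [physical for physical in range(total_blocks) if physical not in bad]


def unreachable(total_blocks: int, block_map) -> list:
    """Physical blocks that no logical address resolves to."""
    mapped = set(block_map)
    return [physical for physical in range(total_blocks) if physical not in mapped]


block_map = build_block_map(8, bad_blocks=[2, 5])
print(block_map)                  # [0, 1, 3, 4, 6, 7]
print(block_map[2])               # 3: logical block 2 lives in physical block 3
print(unreachable(8, block_map))  # [2, 5]
```

Anything above the map sees a six-block device. Physical blocks 2 and 5 still exist on the silicon but fall outside the addressable space.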

Lastly, it should be noted that most but not all embedded NAND flash devices contain a hardware-based NAND controller. Those devices that do not contain controlling hardware, such as smart cards, USB storage devices and the like, expect the controlling operating system to mark, flag, control and manipulate the hardware directly. As such, most modern operating systems have a basic understanding of NAND error and correction codes. For the devices that do contain hardware-based controllers, the operating system and hardware drivers perform read and write operations in a similar manner to their older magnetic platter counterparts.

Overview of the NAND Flash Standards

The two main standards bodies relevant to NAND are JEDEC and ONFI.

Development NAND Breakout with a standard TSOP connection

The JEDEC (Joint Electron Device Engineering Council) committee is primarily concerned with ensuring the various vendors and manufacturers of NAND Flash hardware conform to certain chip package hardware standards. JEDEC is also concerned with ensuring general interoperability between manufacturers and NAND designs. JEDEC provides these services for numerous types of hardware and is far from a NAND-specific committee.

The ONFI (Open NAND Flash Interface) group is a governing body for NAND Flash specific interface standards. The group intends to dictate how NAND will interface with other hardware and (to some extent) other software in the wild.

In general, most NAND devices connect to other hardware with either a TSOP (Thin Small Outline Package) or BGA (Ball Grid Array) connection. The referenced standards dictate the footprint and layout of the hardware. In typical situations, embedded NAND is delivered on a 169-ball BGA package.

Standard Types of NAND to Board connections

Raw NAND vs. FTL Technologies

NAND Flash can come in a variety of configurations when manufactured. In specific relation to this research we can categorize them as such:

Raw NAND
NAND + FTL (Managed NAND)

Raw NAND Flash is a slab of NAND storage in its most basic form, and all management of the hardware and storage interactions is performed in software outside of the NAND. The Linux kernel utilizes the MTD (Memory Technology Device) subsystem to interact with these devices. This grouping contains only bare NAND and other MTD-based devices. To add to the confusion, some raw NAND devices do have embedded ECC (error correction) and simple block management. The main differentiation in this instance is that the Linux kernel is treated as the master controller of the hardware, with the embedded processing simply supporting it.
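On Linux systems that expose raw flash through MTD, the kernel lists each partition in `/proc/mtd` with its size and erase-block size in hexadecimal. The sketch below parses text in that format. The partition names and sizes in `SAMPLE_PROC_MTD` are invented for illustration, and `parse_proc_mtd` is a hypothetical helper rather than part of any tool discussed here.

```python
import re

# Invented sample in the /proc/mtd layout; on a real device, read the file instead.
SAMPLE_PROC_MTD = '''dev:    size   erasesize  name
mtd0: 00100000 00020000 "bootloader"
mtd1: 00400000 00020000 "kernel"
mtd2: 07b00000 00020000 "rootfs"
'''

LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def parse_proc_mtd(text: str) -> list:
    """Turn /proc/mtd style text into a list of partition dictionaries."""
    partitions = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skips the header row and anything unexpected
        dev, size, erase_size, name = match.groups()
        size, erase_size = int(size, 16), int(erase_size, 16)
        partitions.append({
            "dev": dev,
            "name": name,
            "size": size,
            "erase_size": erase_size,
            "erase_blocks": size // erase_size,
        })
    return partitions


for part in parse_proc_mtd(SAMPLE_PROC_MTD):
    print(part["dev"], part["name"], part["erase_blocks"])
# mtd0 bootloader 8
# mtd1 kernel 32
# mtd2 rootfs 984
```

The erase size reported here is the block size discussed earlier, so dividing the partition size by it gives the number of erase blocks the kernel manages for that partition.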

NAND + FTL devices contain an on-package NAND controller that manages the slab of NAND flash internal to the chip. This controller will manage bad blocks, wear leveling and data access internally and provide an FTL (Flash Translation Layer) interface to outside software such as the Linux kernel. The FTL presents the NAND hardware as a standard block device externally. Though there are significant differences in implementation, this broad grouping contains MMC, eMMC, SD and SSD devices.

Raw NAND vs. Managed NAND (FTL)

Revisiting Post-Ex: Moving forward

Now that we all have a good understanding of the basic NAND architecture, we can have a little fun with data and process hiding. See you @ Infiltrate.

-m0nk

Immunity Products

Wednesday, February 27, 2013
