Selective Imaging

From Digital Forensics Framework


This page contains information on the selective imaging capabilities of DFF, in particular the installation of libaff4 and a usage guide. It is strongly recommended that you read it before use!

Selective Imaging

Technological advances lead to ever-bigger hard disks. Five years ago, 320 GB hard disks were standard; today it's 1 TB or even 2 TB. This trend is naturally not going to slow down, and it creates a problem for forensic examiners. Imaging complete hard disks becomes more difficult as capacities increase: it takes a long time before analysis can start, and huge amounts of storage are required to archive evidence.

Most of the data on a hard disk is harmless; depending on the case, it can even be irrelevant or distracting. Only a small fraction of the data stored on hard disks is usually evidence or of any importance to the forensic analysis.

In most cases, there is no actual reason to image every single bit on the drive. It is done because it is regarded as common practice, or because examiners believe it to be required for evidence to be accepted in court.

I'm not going to explain the idea in much detail here; the purpose of this document is to document the process of getting the modules working and to enable people to use them. For more information you might want to read my thesis, which will be released sometime in March. You can also contact me (driest) on IRC in the #digital-forensic channel on Freenode.


To use the modules described here, you need the library "libaff4" by Michael Cohen. It can be obtained from and instructions on how to compile it are available on . The pyaff4 module must be in your Python path!


Don't confuse aff4 with aff ( ). They were developed by the same crowd, but are two different things. The DFF modules won't work with aff v3.X!

At the time of this writing, the selective imaging modules are not in the master branch of DFF and must therefore be downloaded from the Git repository. You can obtain a copy by checking out the branch mod_aff4 from git:// . You then need to compile DFF. This requires a number of Qt and other packages; installation instructions can be found here: .

I've written down an example configuration that allows compiling both libaff4 and DFF 0.9. The following packages were required on a fresh Ubuntu 10.04 install with nothing but the package build-essential installed:

  1. Prerequisites:
    For DFF: cmake python-sip swig python-qt4 qt4-qmake libqt4-dev qt4-dev-tools qscintilla pyqt4-dev-tools python-magic python-qt4-phonon python-qscintilla2
    For LibAFF4: libxml2-dev libewf-dev libtsk3-3 python-dev
  2. Compile libaff4
  3. Run ldconfig -v to rebuild the shared library cache
  4. Compile DFF
    cmake . && make && make install
  5. That's it, you should be able to run DFF now

Using the Selective Imager

Normally you'll want to use the selective imager in place of your usual acquisition tool, such as linen, ewfacquire, aimage, etc.

The acquisition process will look like this:

  1. Start by connecting a hard-disk to the analysis machine with a hardware-writeblocker.
  2. Fire up DFF in superuser mode (otherwise you can't directly access a block device).
  3. Use DFF to perform a preliminary analysis of the device. This should be done quickly, so don't go into too much detail. Your goal should be to identify which parts of the disk contain data relevant to the investigation and which parts are irrelevant.
  4. Use the bookmarking function of DFF to bookmark everything relevant to the investigation.
  5. Right-click the root node of your bookmarks and apply the module Node->acquire. You can choose from several options, for example whether to hash every node that you image, or where to store the image.

The selective imager will recursively traverse every node, extract and serialise all the metadata, and write it into an aff4 image. Metadata here means every attribute in DFF, so the results of your analysis with any other modules that attach their results to a node as an attribute will be preserved. More information on the aff4 format and its metadata serialization can be found on the forensicswiki:
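The traversal and serialization described above can be sketched roughly as follows. This is a minimal illustration only, not the actual DFF or libaff4 API; the Node class and the attribute names are hypothetical stand-ins:

```python
import json

class Node:
    """Hypothetical stand-in for a DFF node: a name, attributes, children."""
    def __init__(self, name, attributes=None, children=None):
        self.name = name
        self.attributes = attributes or {}
        self.children = children or []

def serialise_tree(node, path=""):
    """Recursively visit every node and collect its path and attributes,
    as the selective imager does before writing them into the image."""
    full_path = f"{path}/{node.name}"
    records = [{"path": full_path, "attributes": node.attributes}]
    for child in node.children:
        records.extend(serialise_tree(child, full_path))
    return records

# Example: a bookmarked folder containing one analysed file
root = Node("bookmarks", children=[
    Node("invoice.pdf", attributes={"mime-type": "application/pdf",
                                    "md5": "d41d8cd98f00b204e9800998ecf8427e"}),
])
print(json.dumps(serialise_tree(root), indent=2))
```

The point is simply that every attribute attached to a node survives the traversal, regardless of which module produced it.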

The imager will additionally store this information for every node:

  1. The byte-runs of the node. This is a list of the byte mappings of the data in the node in relation to the original disk. This information allows you to verify the integrity of the image.
  2. A cryptographic hash (optional). If already computed by another module, it will not recalculate the hash but simply append the attribute. This is also required for verification purposes.
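To illustrate how byte-runs and hashes together support verification, the following sketch reassembles a node's data from its (offset, length) runs on the original disk and checks it against a stored hash. The run layout is a simplified assumption for illustration, not the actual aff4 on-disk format:

```python
import hashlib
import io

def read_runs(disk, byte_runs):
    """Reassemble a node's data from its byte-runs: a list of
    (offset, length) mappings back to the original disk."""
    data = b""
    for offset, length in byte_runs:
        disk.seek(offset)
        data += disk.read(length)
    return data

def verify_node(disk, byte_runs, expected_sha256):
    """Recompute the hash over the mapped runs and compare it with
    the hash stored alongside the node in the image."""
    return hashlib.sha256(read_runs(disk, byte_runs)).hexdigest() == expected_sha256

# Toy "disk": the node's data lives in two non-contiguous runs
disk = io.BytesIO(b"xxxxHELLOyyyyWORLDzzzz")
runs = [(4, 5), (13, 5)]                  # maps to b"HELLOWORLD"
expected = hashlib.sha256(b"HELLOWORLD").hexdigest()
print(verify_node(disk, runs, expected))  # -> True
```

If any mapped region of the original disk changes, the recomputed hash no longer matches, which is exactly what makes the stored byte-runs useful for integrity verification.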

Using the Image

The image can be used with any of the existing aff4 tools: it can be mounted with FUSE, encrypted or signed with the standard aff4 tools, and much more. For further analysis, it can be parsed back into DFF with the aff4 connector. The module can be found in Connectors->aff4. It will parse the image and recreate the underlying directory structure exactly as it was at the time of imaging. Every node that was imaged will be restored with all of its attributes/metadata. You can then continue your work on the image, extract individual files/nodes, etc.
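Conceptually, the connector reverses the serialization: it reads the stored paths and attributes back and rebuilds the node tree. A minimal sketch of that idea, using a hypothetical flat record format rather than the real aff4 metadata layout:

```python
def rebuild_tree(records):
    """Recreate a nested directory structure from flat path records,
    roughly as the aff4 connector does when parsing an image back in."""
    tree = {}
    for rec in records:
        parts = rec["path"].strip("/").split("/")
        cursor = tree
        for part in parts[:-1]:
            # Walk (or create) intermediate directories
            cursor = cursor.setdefault(part, {"attributes": {}, "children": {}})["children"]
        cursor[parts[-1]] = {"attributes": rec.get("attributes", {}),
                             "children": {}}
    return tree

records = [
    {"path": "/bookmarks", "attributes": {}},
    {"path": "/bookmarks/invoice.pdf",
     "attributes": {"md5": "d41d8cd98f00b204e9800998ecf8427e"}},
]
tree = rebuild_tree(records)
print(tree["bookmarks"]["children"]["invoice.pdf"]["attributes"]["md5"])
```

Because the attributes travel with each record, the restored nodes carry the same analysis results they had at imaging time.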


The connector will try to detect the underlying file on the filesystem automatically, because it does not use the DFF I/O subsystem to access the file. If detection fails, nothing will happen; in that case, specify the file manually as a parameter to the aff4-connector module.