Modules how to
The purpose of this page is to explain how to develop a module for the framework. Writing a new plug-in and integrating it into the framework is not difficult; the difficulty comes from the modules themselves, depending on what they have to do. Developing a plug-in requires creating new files and modifying some existing ones.
Modules can be developed in Python or C++. It is up to the developer to choose whichever language is better suited. Generally, if you need high performance, for example to parse a huge file system, you will pick C++. If you need to develop a module quickly and performance is not critical, Python is the best choice. This is of course only a rule of thumb; other considerations have to be taken into account in real-life cases.
As an example, we will take a module developed in C++ and designed to parse a file system we will call DummyFS. Its specification is given below.
Get started
The first thing to do is to decide in which directory the sources of our plug-in will live. To develop a dummy file system plug-in, we need to create a new directory in the modules/fs directory of the DFF sources. On UNIX / Linux it looks like this :
$ cd dff_sources/modules/fs
$ mkdir dummy
Next, we need to configure our environment to compile the module. DFF uses cmake to generate Makefiles, and a few files have to be added or modified for this.
In the fs directory, we need to open the CMakeLists.txt file and add the line :
add_subdirectory (dummy)
where dummy is the name of the directory where our plug-in will be located. With this line, cmake generates Makefiles that descend into the dummy directory, which is necessary to compile the module.
CMakeLists.txt
Next, we need to enter the dummy directory and create another CMakeLists.txt, used to generate the Makefile. This file will contain, among other things, the path to the module sources. An example is given below :
FIND_PACKAGE(SWIG REQUIRED)
INCLUDE(${SWIG_USE_FILE})
include_directories(${PYTHON_INCLUDE_PATH})
include_directories(${CMAKE_CURRENT_SOURCE_DIR})
include_directories(${CMAKE_HOME_DIRECTORY}/api/include)
# sources of the module
set(dummy_srcs
dummy.cpp)
SET_SOURCE_FILES_PROPERTIES(dummy.i PROPERTIES CPLUSPLUS ON)
SWIG_ADD_MODULE(DUMMY python dummy.i ${dummy_srcs})
SWIG_LINK_LIBRARIES(DUMMY ${PYTHON_LIBRARIES} _libvariant _libexceptions _libtype _libvfs _libenv)
if ( CMAKE_GENERATOR MATCHES "Visual Studio")
set_target_properties (${SWIG_MODULE_DUMMY_REAL_NAME} PROPERTIES
PREFIX "../"
SUFFIX ".pyd"
)
endif ( CMAKE_GENERATOR MATCHES "Visual Studio")
set(dummy_files
DUMMY.py
)
install_lib(${SWIG_MODULE_DUMMY_REAL_NAME})
install_file(dummy ${dummy_files})
The example given above is a standard DFF CMakeLists.txt. Some additional configuration may be required (linking to external libraries, paths to other include files, etc). We recommend that you read the official cmake documentation if you need to do such things. You can also have a look at the other CMakeLists.txt files of DFF.
dummy.i
We are developing a C++ module, so we need to wrap it to generate Python code. This is done by using SWIG. To use the wrapper, we have to create a file dummy.i in the same directory. This file will contain the following code :
// the name of the module
%module DUMMY
// should be present in any module supposed to be portable on Linux and Windows
#ifndef WIN_32
%include "stdint.i"
#else
%include "wstdint.i"
#endif
// every wrapped header you need should be included here
%include "std_string.i"
%include "windows.i"
%import "../../../api/vfs/libvfs.i"
// add here all the API headers you need
%{
#include "dummy.hpp"
#include "variant.hpp"
#include "mfso.hpp"
#include "node.hpp"
%}
// importing wrapped libraries
%import "../../../api/vfs/libvfs.i"
%include "dummy.hpp"
/*
namespace std
{
}; */
// in this part, the configuration of the module is defined (more details are given below).
%pythoncode
%{
__dff_module_extfs_version__ = "1.0.0"
from api.module.module import *
from api.types.libtypes import Argument, typeId, Parameter
class DUMMY(Module):
    """Useless DUMMY module example""" # description of your module
    def __init__(self):
        Module.__init__(self, 'dummy', Dummy)
        # adding options to the module
        self.conf.addArgument({"name": "file",
                               "description": "file containing a Dummy file system",
                               "input": Argument.Required|typeId.Node})
        self.tags = "file system" # tag of the module
%}
The most important part of this file is the module configuration in the DUMMY class. As you can see, there is a call to the conf.addArgument() method, with a Python dict passed as parameter. It defines which options the module takes as input when it is launched. In our dummy example, we define only one parameter, called file, of type node. This argument is required, i.e. not optional (if it is not provided, the framework will issue an error and the module won't be launched).
The tag of the module defines the category in which the module will be listed.
Once again, if you wish to go further with SWIG, we recommend that you refer to the official documentation of the project.
API implementation
Now, our environment is set and we can start the development of the module itself.
mfso
The different classes of the API can be found in the directory dff_sources/api/include. The first class we will need is mfso (file mfso.hpp).
Mfso is the definition of modules, so we will need to implement it. To do this we have to create a new class, called Dummy in our example, inheriting from mfso.
- dummy.hpp :
#ifndef __DUMMY_H_
# define __DUMMY_H_

#include <map>
#include <string>

#include "type.hpp"
#include "vfs.hpp"
#include "argument.hpp"
#include "mfso.hpp"

class Dummy : public mfso
{
public:
  // constructor and destructor
  Dummy();
  ~Dummy();

  /*
    The parameter "args" contains the list of all parameters which were
    passed to the module. The std::string key is the name of the parameter,
    the Variant * is its value.
  */
  virtual void start(std::map<std::string, Variant*> args);

  VFile * vfile;     // vfile opened on the node passed as argument
  Node * root_node;  // root of the tree built by the module (created with new Node(...) in start())
  Node * node;       // node passed through the "file" argument
};

#endif /* __DUMMY_H_ */
The start() method is declared as pure virtual in mfso, so every module must implement it. This method is called by the framework when the module is launched.
- dummy.cpp :
#include <iostream>
#include <map>
#include <string>

#include "dummy.hpp"

/* The mfso constructor requires a std::string, the name of the module. */
Dummy::Dummy() : mfso("Dummy")
{
}

Dummy::~Dummy()
{
}

void Dummy::start(std::map<std::string, Variant*> args)
{
  try
    {
      std::map<std::string, Variant*>::iterator it; // map iterator

      // get argument. If it cannot be found, throw an exception
      if ((it = args.find("file")) != args.end())
        this->node = it->second->value<Node*>();
      else
        throw (std::string("Dummy exception: no parent provided"));

      /* open the node to get a vfile on which we will be able to seek / read */
      this->vfile = this->node->open();

      /*
        Creation of the root node of the tree view we are about to build in
        the Dummy module. The fact that it is our "root node" is indicated
        by the NULL parameter.
      */
      root_node = new Node("Dummy", 0, NULL, this);

      /* Here should be the code of the module. */

      /* Once the module has finished its execution, we need to register the tree we built */
      this->registerTree(this->node, this->root_node);
    }
  catch (envError & e) // catch blocks in case of exception
    {
      std::cerr << "Dummy::start() : envError exception caught :\n\t -> " << e.error << std::endl;
    }
  catch (vfsError & e)
    {
      std::cerr << "Dummy::start() : vfsError exception caught :\n\t -> " << e.error << std::endl;
    }
  catch (std::exception & e)
    {
      std::cerr << "Dummy::start() : std::exception caught :\n\t -> " << e.what() << std::endl;
    }
  catch (...)
    {
      std::cerr << "Dummy::start() : unknown exception caught." << std::endl;
    }
}
This code is a skeleton of what a module should look like. Nothing complicated, but it does not do much for now. The processing of the data we want to analyze is not implemented yet; we just created a root node for the dummy tree view.
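The argument lookup at the top of start() can be isolated in a small helper. The sketch below is only an illustration: FakeVariant, ArgMap and required_arg are hypothetical names that stand in for the API types, so the snippet compiles without the DFF headers. It throws std::runtime_error instead of a std::string; this is a design choice of the sketch, not something the framework requires.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the API's Variant: it only wraps a string here.
struct FakeVariant {
    std::string text;
};

// Same shape as the "args" parameter of start(): name -> value pointer.
typedef std::map<std::string, FakeVariant *> ArgMap;

// Return the value of a required argument, or throw if it was not provided.
const std::string &required_arg(const ArgMap &args, const std::string &name) {
    ArgMap::const_iterator it = args.find(name);
    if (it == args.end() || it->second == NULL)
        throw std::runtime_error("missing required argument: " + name);
    return it->second->text;
}
```

With an exception type derived from std::exception, the catch (std::exception & e) block of the skeleton would print the message. A thrown std::string, as in the skeleton, is only matched by the final catch (...) block, so its text is never displayed.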
At this point we can already try to compile our module. We need to go back to the root directory of the DFF sources and run the command
$ cmake . && make
Once the compilation is finished, we can launch DFF, add a dump and use the Dummy module (right click on the dump, Open with -> file system -> dummy). A node called Dummy should be created. Nothing else should happen (for now).
nodes
DummyNode class
The second step is to create nodes to display the results of our file system analysis. This requires an actual analysis, which begins in the start() method of the Dummy class. Remember that nodes represent files or directories on a file system, and that the dummy module should be able to read file content. For the example, we will create a dummy file system whose specification is given below :
- File names must be exactly 8 ASCII characters long (8 bytes), no more, no less.
- Files are described by name entries, which are located at the beginning of the file system.
- There are no directories.
- Blocks are 0x10 bytes (16 bytes) big.
- A name entry is 16 bytes big and its layout is :
- bytes 0-1 : file content offset on the file system
- bytes 2-9 : file name
- bytes 10-11 : file size in bytes
- bytes 12-15 : second offset if the file is fragmented; contains 0s if there is no fragmentation.
- A dummy file system can contain only three files (I told you it was dummy). Since a name entry is 16 bytes big and located at the beginning of the DummyFS, and since there are at most 3 files, the first 48 bytes of the file system are reserved for the name entries table.
An example of DummyFS can be found here : it is 512 bytes big and contains three files named '12345678' (5 bytes big), '87654321' (32 bytes big) and 'filename' (8 bytes big). They respectively contain the strings "file1", "fragmented file to test fragment" and "file3abc".
We can create the following structure anywhere in our code (preferably in a .hpp file, let's say dummy.hpp). It will be used to read file name entries, which can more or less be seen as metadata :
typedef struct entry_s
{
  uint16_t offset;    // bytes 0-1 : file content offset
  uint8_t  name[8];   // bytes 2-9 : file name
  uint16_t size;      // bytes 10-11 : file size in bytes
  uint32_t fragment;  // bytes 12-15 : offset of the second fragment, 0 if none
} entry_t;
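To make the on-disk layout concrete, here is a minimal, self-contained sketch that decodes one 16-byte name entry from a raw buffer. It makes two assumptions that the specification above does not state: multi-byte fields are stored in little-endian order, and the compiler adds no padding to the structure (checked by the static_assert). The helpers le16, le32, decode_entry and entry_name are hypothetical names, not part of the DFF API.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <string>

// Mirror of the entry_t structure described above.
struct entry_t {
    uint16_t offset;    // bytes 0-1
    uint8_t  name[8];   // bytes 2-9
    uint16_t size;      // bytes 10-11
    uint32_t fragment;  // bytes 12-15
};

// Reading raw bytes straight into entry_t only works if there is no padding.
static_assert(sizeof(entry_t) == 16, "entry_t must match the 16-byte on-disk entry");

// Assumption: multi-byte fields are little-endian on disk.
static uint16_t le16(const uint8_t *p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p) {
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Decode one 16-byte name entry field by field, independently of host endianness.
entry_t decode_entry(const uint8_t *raw) {
    entry_t e;
    e.offset = le16(raw);
    for (std::size_t i = 0; i < 8; ++i)
        e.name[i] = raw[2 + i];
    e.size = le16(raw + 10);
    e.fragment = le32(raw + 12);
    return e;
}

// File names are not NUL-terminated, so the length is given explicitly.
std::string entry_name(const entry_t &e) {
    return std::string(reinterpret_cast<const char *>(e.name), 8);
}
```

Decoding field by field is slightly longer than casting the buffer to entry_t, but it does not depend on the byte order of the machine running the module.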
This file system is of course completely useless and was "designed" for the need of the example.
First things first, we need to create a new class DummyNode and add the new source file's name in our CMakeLists.txt :
set(dummy_srcs dummy.cpp DummyNode.cpp)
We will need to create a node type, DummyNode for example, inheriting from the class Node. For each file on the file system, one DummyNode instance will be created. The file DummyNode.hpp will look like this :
#ifndef DUMMY_NODE_H_
#define DUMMY_NODE_H_

#include <string>

#include "dummy.hpp"
#include "node.hpp" // class from the API (file dff_sources/api/include/node.hpp)

class DummyNode : public Node
{
public:
  DummyNode(std::string name, uint64_t size = 0, Node * parent = NULL,
            Dummy * fsobj = NULL, uint32_t n_entry_addr = 0);
  ~DummyNode() {}

  /* Not used for now */
  virtual void fileMapping(FileMapping* fm) {}
  virtual Attributes _attributes(void) { return Attributes(); } // empty attribute list for now

private:
  /* these two attributes will be used later */
  uint32_t __n_entry_addr;
  Dummy * __dummy;
};

#endif /* DUMMY_NODE_H_ */
The implementation, done in the file DummyNode.cpp, will be quite simple :
#include "DummyNode.hpp"

DummyNode::DummyNode(std::string name, uint64_t size, Node* parent,
                     Dummy * fsobj, uint32_t n_entry_addr)
  : Node (name, size, parent, fsobj)
{
  /* we initialize the attributes */
  this->__n_entry_addr = n_entry_addr;
  this->__dummy = fsobj;
}
Nodes instantiation
We can now parse the name entries table and create the three nodes corresponding to the three files. Notice that the call to vfile->read() allows us to read a fixed amount of data from the vfile we opened earlier, starting at the current offset. To change the offset, vfile->seek(offset) must be used, where offset is the position we want to move to.
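The cursor behaviour described above (read() consumes data from the current offset, seek() moves that offset) can be illustrated with a small in-memory stand-in. MemFile is a hypothetical class written for this sketch only; it is not the DFF VFile class, and details such as clamping the offset and returning the number of bytes read are assumptions of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdint.h>
#include <vector>

// Hypothetical in-memory stand-in for a vfile; NOT the DFF VFile class.
class MemFile {
public:
    explicit MemFile(const std::vector<uint8_t> &data) : data_(data), pos_(0) {}

    // Move the cursor to an absolute offset (clamped to the end of the data).
    std::size_t seek(std::size_t offset) {
        pos_ = std::min(offset, data_.size());
        return pos_;
    }

    // Copy up to `size` bytes from the cursor into `buf`, then advance the cursor.
    std::size_t read(void *buf, std::size_t size) {
        const std::size_t n = std::min(size, data_.size() - pos_);
        if (n > 0)
            std::memcpy(buf, &data_[pos_], n);
        pos_ += n;
        return n;
    }

    // Current position of the cursor.
    std::size_t tell() const { return pos_; }

private:
    std::vector<uint8_t> data_;
    std::size_t pos_;
};
```

With such an object, two consecutive 16-byte reads return the first and the second name entry, and seek(0) is needed to read the first entry again.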
The following piece of code must go in the Dummy::start() method and replace the comment /* Here should be the code of the module. */. It is of course also possible to create a new method, called by start(), which executes this code. The important point is that the analysis of the file system starts in the start() method.
/* Allocation of enough memory space to read the three name entries. */
uint8_t * name_entries = (uint8_t *)operator new(3 * sizeof(entry_t));
entry_t * entry = (entry_t *)name_entries;

/*
  We seek to offset 0 on the file system, because it is the position of the
  name entries. Useless in that case, since we already are at offset 0.
  This is just to illustrate the example.
*/
vfile->seek(0);

/* We read 48 bytes on the file system, corresponding to the 3 file name entries (3 * 16 bytes). */
vfile->read(name_entries, 48);

/*
  For each file, we instantiate a DummyNode. root_node is the parent of the
  node we instantiate (in that case it always is root_node, but it could be
  any node).
*/
for (unsigned int i = 0; i < 3; i++)
  {
    /* Names are 8 characters long with no terminating \0, so the size must be given explicitly. */
    DummyNode * d_node = new DummyNode(std::string((char *)entry[i].name, 8),
                                       entry[i].size, root_node, this,
                                       (uint32_t)(i * sizeof(entry_t))); /* address of the name entry, used later by fileMapping() */
  }

/* The names and sizes were copied into the nodes, so the buffer can be released. */
operator delete(name_entries);
At this point you can run cmake and make again, restart DFF, load a DummyFS dump and use the dummy module on it. You will see that three nodes are created, with the right names and sizes.
File mapping
OK, we parsed our DummyFS and created nodes, but we still need to be able to read file content. This will be done through the method DummyNode::fileMapping(). To read a file, we need to know the position of its content on the disk, which is given by the name entry table.
When a file is opened, the DummyNode::fileMapping() method is called. It must fill in the FileMapping * fm parameter. The file mapping structure contains a list of unsigned integers, each of them representing the position on the dump of one part of the file. These parts are called "chuncks" (sic, the name contains a typo). If the file is fragmented, several chunks are required; if it is not, a single chunk is enough.
Basically, we have to create a list of the blocks composing the file content. In plain words, it would read something like :
- the first part of my file starts at offset XX and occupies x bytes (chunk 0)
- the second part of my file starts at offset YY and occupies y bytes (chunk 1)
and so on, until we reach the end of the file.
We can do this by calling the method FileMapping::push(). It takes 4 parameters :
- An offset in the file (the first chunk starts at offset 0 in the file, the second chunk starts at offset size of the first chunk, the third chunk starts at chunk1 size + chunk2 size, and so on).
- The size of the current chunk.
- A pointer to the node the chunk is read from (in our example, the node on which the module was applied).
- The real offset on the parent vfile.
The fileMapping method will look like this (the placeholder inline body in DummyNode.hpp must then be replaced by a plain declaration) :
void DummyNode::fileMapping(FileMapping* fm)
{
  /* We allocate memory space for the name entry we are going to read. */
  uint8_t * entry = (uint8_t *)operator new(sizeof(entry_t));
  entry_t * n_entry = (entry_t *)entry;

  /*
    We seek to the address of the name entry, which was stored in the node's
    __n_entry_addr attribute, and read it on the vfile.
  */
  this->__dummy->vfile->seek(this->__n_entry_addr);
  this->__dummy->vfile->read(entry, 16); // 16 is the size of an entry, the result is stored in the 'entry' buffer.

  /* We push the first chunk in the file mapping. */
  fm->push(0, (n_entry->size <= 16 ? n_entry->size : 16), this->__dummy->node, n_entry->offset);

  /*
    If the file size is bigger than one block (16 bytes), it means that the
    address of the remaining content is stored in the 'fragment' field of
    the entry_t structure and that a new chunk has to be created.
  */
  if (n_entry->size > 16)
    fm->push(16, n_entry->size - 16, this->__dummy->node, n_entry->fragment);

  /* The entry is no longer needed once the chunks have been pushed. */
  operator delete(entry);
}
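To see how a chunk list turns into file content, the following self-contained sketch replays the same two-chunk logic on an in-memory dump. Chunk, SimpleMapping, map_file and read_mapped are hypothetical stand-ins written for this illustration; they are not the DFF FileMapping API, and the push() used here takes three parameters because the sketch has no node objects.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT the DFF FileMapping classes.
struct Chunk {
    uint64_t file_offset;   // where the chunk starts inside the file
    uint64_t size;          // number of bytes in the chunk
    uint64_t origin_offset; // where those bytes live in the dump
};

class SimpleMapping {
public:
    void push(uint64_t file_offset, uint64_t size, uint64_t origin_offset) {
        Chunk c = {file_offset, size, origin_offset};
        chunks_.push_back(c);
    }
    const std::vector<Chunk> &chunks() const { return chunks_; }

private:
    std::vector<Chunk> chunks_;
};

// Same chunk layout as the fileMapping() method above: one chunk for the
// first block, plus a second one if the file is bigger than 16 bytes.
SimpleMapping map_file(uint16_t offset, uint16_t size, uint32_t fragment) {
    const uint16_t block = 16;
    SimpleMapping fm;
    fm.push(0, size <= block ? size : block, offset);
    if (size > block)
        fm.push(block, size - block, fragment);
    return fm;
}

// Rebuild the file content by copying each chunk from the dump, in push order.
std::string read_mapped(const std::vector<uint8_t> &dump, const SimpleMapping &fm) {
    std::string out;
    for (std::size_t i = 0; i < fm.chunks().size(); ++i) {
        const Chunk &c = fm.chunks()[i];
        if (c.origin_offset + c.size > dump.size())
            break; // chunk points outside the dump: stop instead of overreading
        out.append(reinterpret_cast<const char *>(&dump[c.origin_offset]),
                   static_cast<std::size_t>(c.size));
    }
    return out;
}
```

For a 32-byte file, map_file() produces two chunks whose file offsets are 0 and 16, and read_mapped() concatenates the two fragments wherever they are located in the dump.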
Extended Attributes
Some extended attributes can be added to each node. Those attributes can be used to store metadata, for example.
In the case of the DummyFS, there are no attributes, so we can decide that our only extended attribute will be the string "NO extended attributes". We will need to implement the DummyNode::_attributes() method (here again, the placeholder inline body in DummyNode.hpp must be replaced by a plain declaration) :
Attributes DummyNode::_attributes(void)
{
  Attributes vm;
  Variant * v = new Variant(std::string("NO extended attributes"));

  vm["query"] = v;
  return vm;
}
For each node, attributes are stored in a std::map composed of std::string, Variant * pairs. Variants are generic containers which can be used with the most common types (int, string, etc) and DFF types (Node).
Typically, extended attributes are used to display information about each file. This information is file system dependent. For example, the Extfs driver uses the extended attributes to display the different data stored in an inode.
Using Variants, you can put everything you want in extended attributes : integers, strings, lists, maps, etc (see the file dff_sources/api/include/variant.hpp for more details).
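As an illustration of file system dependent attributes, the sketch below builds an attribute map from the fields of a DummyFS name entry. It does not use the DFF Variant class: Value is a hypothetical, much simpler container that holds either a number or a string, and the attribute names are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <stdint.h>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a Variant: a number or a string.
class Value {
public:
    explicit Value(uint64_t n) : is_number_(true), number_(n) {}
    explicit Value(const std::string &s) : is_number_(false), number_(0), text_(s) {}

    bool is_number() const { return is_number_; }
    uint64_t number() const { return number_; }
    const std::string &text() const { return text_; }

private:
    bool is_number_;
    uint64_t number_;
    std::string text_;
};

typedef std::map<std::string, Value> AttrMap;

// Expose the fields of a name entry as attributes (names chosen for the sketch).
AttrMap entry_attributes(uint16_t offset, uint16_t size, uint32_t fragment) {
    AttrMap attrs;
    attrs.insert(std::make_pair(std::string("content offset"),
                                Value(static_cast<uint64_t>(offset))));
    attrs.insert(std::make_pair(std::string("file size"),
                                Value(static_cast<uint64_t>(size))));
    attrs.insert(std::make_pair(std::string("fragment offset"),
                                Value(static_cast<uint64_t>(fragment))));
    // A file bigger than one 16-byte block needs a second fragment.
    attrs.insert(std::make_pair(std::string("fragmented"),
                                Value(std::string(size > 16 ? "yes" : "no"))));
    return attrs;
}
```

In a real module, the same values would be wrapped in Variant objects and stored in the Attributes map returned by _attributes().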