Abstract:
File system analysis is an important process in data recovery and computer forensics. A file system consists of metadata describing how files and directories are organized on a drive. File system corruption, whether accidental or intentional, may compromise the ability to access and recover the contents of files. Existing techniques, such as file carving, allow partial recovery of file contents without considering the file system structure. However, the loss of metadata may make it impossible to attach meaning, such as file names or timestamps, to the extracted contents.
We present a signature recognition process that matches and parses known records belonging to files on a drive, followed by a bottom-up reconstruction algorithm that recovers the structure of the file system by rebuilding the entire tree, or multiple subtrees if the upper nodes are missing. By applying the Baeza-Yates–Perleberg approximate string matching algorithm, we determine the partition geometry even when partition boundaries are unknown. We focus on NTFS, a file system commonly found in personal computers and high-capacity external hard drives. We compare the results of our algorithm with those of existing open source and commercial data recovery tools.
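
To make the bottom-up idea concrete, the following is a minimal sketch, not the algorithm evaluated in this work. It assumes that the signature recognition step has already produced a list of `(record_id, parent_id, name)` tuples with unique record identifiers; this tuple layout and the function name `rebuild_trees` are hypothetical and chosen only for illustration. A record whose parent was not recovered becomes the root of its own subtree.

```python
from collections import defaultdict


def rebuild_trees(records):
    """Group recovered records into trees using only parent references.

    `records` is an iterable of (record_id, parent_id, name) tuples.
    This layout is an assumption made for the sketch.
    """
    records = list(records)
    names = {rid: name for rid, _, name in records}
    children = defaultdict(list)
    roots = []

    for rid, parent, _ in records:
        # A record is attached to its parent only if the parent was recovered.
        # A self-referencing record (as the NTFS root directory is) or a
        # record with a missing parent starts a new subtree.
        if parent in names and parent != rid:
            children[parent].append(rid)
        else:
            roots.append(rid)

    def build(rid):
        return {
            "name": names[rid],
            "children": [build(c) for c in sorted(children[rid])],
        }

    return [build(r) for r in sorted(roots)]


# Hypothetical input: record 99 was lost, so "orphan_dir" heads its own subtree.
example = [
    (5, 5, "."),
    (30, 5, "docs"),
    (31, 30, "a.txt"),
    (40, 99, "orphan_dir"),
    (41, 40, "b.txt"),
]
trees = rebuild_trees(example)
```

Because the sketch never needs the upper nodes to exist, it yields one tree when the root directory record survives and several subtrees otherwise.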