Data recovery
Data recovery is the process of retrieving data from
storage media when it cannot be accessed normally.
This can be due to physical damage to the storage device
or logical damage to the file system that prevents it
from being mounted by the host operating system.
Physical damage
A wide variety of failures can cause physical damage
to storage media. CD-ROMs can have their metallic substrate
or dye layer scratched off; hard disks can suffer any
of several mechanical failures, such as head crashes
and failed motors; and tapes can simply break. Physical
damage always causes at least some data loss, and in
many cases the logical structures of the file system
are damaged as well. This causes logical damage that
must be dealt with before any files can be recovered.
Most physical damage cannot be repaired by end users.
For example, opening a hard disk in a normal environment
can allow dust to settle on the surface, causing further
damage to the platters. Furthermore, end users generally
do not have the hardware or technical expertise required
to make these sorts of repairs; therefore, data recovery
companies are consulted. These firms use Class 100 cleanroom
facilities to protect the media while repairs are made,
and tools such as magnetometers to manually read the
bits off failed magnetic media. The extracted raw bits
can be used to reconstruct a disk image, which can then
be mounted to have its logical damage repaired. Once
that is complete, the files can be extracted from the
image.
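The reconstruction step described above can be illustrated with a minimal sketch. It assumes the raw bits have already been grouped into fixed-size sectors keyed by sector number; the 512-byte sector size, the `build_image` name, and the zero-filling of unreadable sectors are all assumptions made for illustration, not a description of any particular firm's tooling.

```python
SECTOR_SIZE = 512  # assumed sector size for this sketch


def build_image(recovered, total_sectors):
    """Assemble a disk image from the sectors that could be read.

    `recovered` maps sector numbers to raw bytes. Sectors that are
    absent or have the wrong length are left zero-filled and reported.
    """
    image = bytearray(total_sectors * SECTOR_SIZE)
    missing = []
    for n in range(total_sectors):
        data = recovered.get(n)
        if data is None or len(data) != SECTOR_SIZE:
            missing.append(n)
            continue
        image[n * SECTOR_SIZE:(n + 1) * SECTOR_SIZE] = data
    return bytes(image), missing


# Hypothetical example: sectors 1 and 3 could not be read.
recovered = {0: b"\xaa" * SECTOR_SIZE, 2: b"\xbb" * SECTOR_SIZE}
image, missing = build_image(recovered, total_sectors=4)
print(len(image), missing)
```

Keeping a list of the missing sectors matters because the zero-filled gaps are exactly where logical damage should be expected when the image is later examined.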
Logical damage
Far more common than physical damage is logical damage
to a file system. Logical damage is primarily caused
by power outages that prevent file system structures
from being completely written to the storage medium,
but problems with hardware (especially RAID controllers)
and drivers, as well as system crashes, can have the
same effect. The result is that the file system is left
in an inconsistent state. This can cause a variety of
problems, such as strange behavior (e.g., infinitely
recursing directories, drives reporting negative amounts
of free space), system crashes, or an actual loss of
data. Various programs exist to correct these inconsistencies,
and most operating systems come with at least a rudimentary
repair tool for their native file systems. Linux, for
instance, comes with the fsck utility, and Microsoft
Windows provides chkdsk. Third-party utilities are also
available, and some can produce superior results by
recovering data even when the disk cannot be recognized
by the operating system's repair utility.
Two main techniques are used by these repair programs.
The first, consistency checking, involves scanning the
logical structure of the disk and checking to make sure
that it is consistent with its specification. For instance,
in most file systems, a directory must have at least
two entries: a dot (.) entry that points to itself,
and a dot-dot (..) entry that points to its parent.
A file system repair program can read each directory
and make sure that these entries exist and point to
the correct directories. If they do not, an error message
can be printed and the problem corrected. Both chkdsk
and fsck work in this fashion. This strategy suffers
from a major problem, however; if the file system is
sufficiently damaged, the consistency check can fail
completely. In this case, the repair program may crash
trying to deal with the mangled input, or it may not
recognize the drive as having a valid file system at
all.
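The dot and dot-dot check can be sketched on a toy model. The structure below is hypothetical: each directory is a Python dictionary mapping entry names to directory numbers, every entry is assumed to be a directory, and the root is assumed to be its own parent. It is not how chkdsk or fsck is implemented; it only illustrates the idea of comparing what is stored against what the specification requires.

```python
def check_directories(dirs, root=0, repair=False):
    """Walk the tree from the root and verify '.' and '..' in each directory.

    Returns a list of (directory id, entry name) pairs that were wrong
    or missing. With repair=True the entries are also corrected.
    """
    problems = []
    stack = [(root, root)]  # (directory id, expected parent id)
    seen = set()            # guards against looping directory structures
    while stack:
        dir_id, parent_id = stack.pop()
        if dir_id in seen:
            continue
        seen.add(dir_id)
        entries = dirs[dir_id]
        for name, expected in ((".", dir_id), ("..", parent_id)):
            if entries.get(name) != expected:
                problems.append((dir_id, name))
                if repair:
                    entries[name] = expected
        for name, child in entries.items():
            if name not in (".", ".."):
                stack.append((child, dir_id))
    return problems


# Hypothetical damaged tree: directory 2 has a wrong '.' and no '..'.
dirs = {
    0: {".": 0, "..": 0, "home": 1},
    1: {".": 1, "..": 0, "docs": 2},
    2: {".": 1},
}
first_pass = check_directories(dirs, repair=True)
second_pass = check_directories(dirs)
print(first_pass, second_pass)
```

The sketch also shows the weakness noted above: it trusts that `dirs[dir_id]` exists and is well formed, so sufficiently mangled input would make it fail outright.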
The second technique for file system repair is to assume
very little about the state of the file system to be
analyzed and, using any hints that undamaged file system
structures might provide, to rebuild the file system
from scratch. This strategy involves scanning
the entire drive and making note of all file system
structures and possible file boundaries, then trying
to match what was located to the specifications of a
working file system. Some third-party programs use this
technique, which is notably slower than consistency
checking. It can, however, recover data even when the
logical structures are almost completely destroyed.
This technique generally does not repair the underlying
file system, but merely allows for data to be extracted
from it to another storage device.
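One simple way to locate possible file boundaries is to search the raw image for known byte signatures. The sketch below does this for JPEG data, using the standard start-of-image bytes (FF D8 FF) and end-of-image bytes (FF D9). It is an illustrative assumption about how such a scan might look, not the method of any specific product, and it ignores complications such as fragmented files or end markers that occur inside embedded thumbnails.

```python
JPEG_START = b"\xff\xd8\xff"  # JPEG start-of-image marker plus next marker byte
JPEG_END = b"\xff\xd9"        # JPEG end-of-image marker


def carve_jpegs(image):
    """Return (offset, data) pairs for candidate JPEG files in a raw image."""
    found = []
    pos = 0
    while True:
        start = image.find(JPEG_START, pos)
        if start == -1:
            break
        end = image.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break  # start marker with no matching end: give up on it
        end += len(JPEG_END)
        found.append((start, image[start:end]))
        pos = end
    return found


# Hypothetical raw image: two fake "files" separated by zero padding.
raw = (
    b"\x00" * 16
    + JPEG_START + b"first" + JPEG_END
    + b"\x00" * 8
    + JPEG_START + b"second" + JPEG_END
)
candidates = carve_jpegs(raw)
print([(offset, len(data)) for offset, data in candidates])
```

Note that the scan reads the whole image regardless of what the file system claims, which is why this approach is slow but still works when the logical structures are gone. The candidates are copied out rather than repaired in place.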
While most logical damage can be either repaired or
worked around using these two techniques, data recovery
software can never guarantee that no data loss will
occur. For instance, in the FAT file system, when two
files claim to share the same allocation unit ("cross-linked"),
data loss for one of the files is essentially guaranteed.
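Cross-linking can be detected by following each file's cluster chain and recording which files claim each allocation unit. The sketch below uses a toy allocation table held in a dictionary; the `END` marker, the file names, and the cluster numbers are hypothetical and do not follow the real FAT on-disk encoding.

```python
END = -1  # hypothetical end-of-chain marker for this toy table


def find_cross_links(fat, starts):
    """Return {cluster: [file names]} for clusters claimed by several files.

    `fat` maps each cluster to the next cluster in its chain, and
    `starts` maps each file name to its first cluster.
    """
    owners = {}
    for name, cluster in starts.items():
        visited = set()  # stops the walk if a damaged chain loops
        while cluster != END and cluster not in visited:
            visited.add(cluster)
            owners.setdefault(cluster, []).append(name)
            cluster = fat[cluster]
    return {c: names for c, names in owners.items() if len(names) > 1}


# Hypothetical damage: B.TXT's chain runs into the middle of A.TXT's chain.
fat = {2: 3, 3: 4, 4: END, 5: 3}
starts = {"A.TXT": 2, "B.TXT": 5}
cross_linked = find_cross_links(fat, starts)
print(cross_linked)
```

Detection is the easy part. Clusters 3 and 4 hold only one file's contents, so a repair tool can at best assign them to one file and truncate or duplicate for the other, which is why loss for one of the files is essentially guaranteed.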
The increased use of journaling file systems, such
as NTFS 5.0, ext3, and XFS, is likely to reduce the
incidence of logical damage. These file systems can
always be "rolled back" to a consistent state,
which means that the only data likely to be lost is
what was in the drive's cache at the time of the system
failure. However, regular system maintenance should
still include the use of a consistency checker in case
the file system journal itself fails.
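The idea behind returning to a consistent state can be sketched with a toy journal. In this simplified model, which is an assumption for illustration and not the actual format used by NTFS, ext3, or XFS, every change is first recorded in the journal as part of a transaction, and recovery applies only the transactions that reached their commit record.

```python
def recover(disk, journal):
    """Replay committed transactions and discard incomplete ones.

    `disk` maps block numbers to contents as they were before the
    journaled writes were applied. `journal` is a list of records:
    ("begin", txid), ("write", txid, block, data), ("commit", txid).
    """
    state = dict(disk)
    pending = {}
    for record in journal:
        kind = record[0]
        if kind == "begin":
            pending[record[1]] = []
        elif kind == "write":
            _, txid, block, data = record
            pending[txid].append((block, data))
        elif kind == "commit":
            for block, data in pending.pop(record[1]):
                state[block] = data
    # Transactions still in `pending` never committed and are dropped.
    return state


# Hypothetical crash: transaction 2 was interrupted before its commit.
disk = {0: "old-inode", 1: "old-bitmap"}
journal = [
    ("begin", 1),
    ("write", 1, 0, "new-inode"),
    ("write", 1, 1, "new-bitmap"),
    ("commit", 1),
    ("begin", 2),
    ("write", 2, 0, "half-written"),
]
recovered_state = recover(disk, journal)
print(recovered_state)
```

The interrupted update is lost, but the structures on disk never reflect half of a transaction. The sketch also makes the closing caveat concrete: if the journal records themselves are corrupted, replay cannot be trusted and a consistency check is still needed.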