Recently, I had to carry out a large-scale recovery operation on a 15-year-old hard drive with a total runtime of nearly 70,000 hours. The drive had long suffered from numerous defective sectors and read errors.
The drive in question was a SAMSUNG HD501LJ, a 500 GB model that was widely used in systems of its time. It was succeeded by 750 GB and 1 TB models in the following years, before the hard drive division of SAMSUNG was partially acquired by Seagate in 2011.
The task was not an easy one, as it involved:
- Safeguarding and reading as much data as possible.
- Making existing data available on the file systems.
- Locating and making available previously deleted data from the distant past.
- If feasible, writing the data onto a new storage medium of equal or greater size to restore the original system.
In summary, the objectives were as follows:
- Reading and safeguarding as many sectors as possible.
- Signature-based data recovery.
- Backing up existing data.
- Making a 1-to-1 copy onto a new storage medium.
The process went as follows:
Using “ddrescue,” the drive was read sector by sector into an image file. “ddrescue” attempts to recover as much data as possible by reading both forward and backward and by repeatedly retrying blocks that could not be read. The third argument is the mapfile in which it records the status of every block.
ddrescue /dev/sdd ./500g-backup.img ddrescue.log
The progress was monitored using the auxiliary program “ddrescueview” and the text output from “ddrescue.”
As a result, all but about 1.5 MB of the drive’s 500 GB could be recovered, which corresponds to more than 99.99% of its contents. The unreadable data is located within the first 2 GB of the drive, which makes it likely that it belongs to operating system files. Such files are less crucial for recovery and can easily be obtained from a similar operating system installation or through reinstallation.
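The share of unreadable data can also be derived from the mapfile instead of being read off the screen. The following is a minimal sketch, assuming the usual mapfile layout (comment lines starting with “#”, one status line, then one “position size status” line per block with hexadecimal values). The function names mapfile_summary and percent_rescued are made up for this example.

```shell
#!/bin/sh
# Sum the block sizes in a ddrescue mapfile, grouped by status.
# Assumed layout: '#' comments, one status line, then "pos size status" lines.
mapfile_summary() (
    rescued=0 bad=0 other=0 seen_status_line=0
    while read -r pos size status rest; do
        case $pos in
            ''|'#'*) continue ;;                # skip blank lines and comments
        esac
        if [ "$seen_status_line" -eq 0 ]; then
            seen_status_line=1                  # first data line is the status line
            continue
        fi
        case $status in
            +) rescued=$((rescued + $size)) ;;  # finished (rescued) blocks
            -) bad=$((bad + $size)) ;;          # bad sectors
            *) other=$((other + $size)) ;;      # non-tried, non-trimmed, non-scraped
        esac
    done < "$1"
    echo "rescued=$rescued bad=$bad other=$other"
)

# Percentage of bytes rescued, given the unreadable bytes and the total size.
percent_rescued() {
    awk -v bad="$1" -v total="$2" \
        'BEGIN { printf "%.4f\n", 100 * (total - bad) / total }'
}
```

Taking 1.5 MB as 1,500,000 bytes and 500 GB as 500,000,000,000 bytes, percent_rescued 1500000 500000000000 prints 99.9997.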
Once the 500 GB image was complete, the “photorec” tool was run against it to search for intact data and data fragments based on file signatures. “photorec” can locate numerous file types by their signatures, even after the files have been deleted. In this case, the most important data to be recovered were image files, although “photorec” recognizes and recovers many other file types as well.
photorec /log ./500g-backup.img
At the end of the process, several hundred thousand files had been recovered and could now be reviewed and categorized in a partially automated manner.
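One simple way to start the partially automated review is to group the recovered files by extension. The sketch below assumes that the files sit in PhotoRec’s recup_dir.N output directories, that file names contain no newlines, and that the destination lies outside the source directory. The function name sort_by_extension and the directory names are hypothetical.

```shell
#!/bin/sh
# Move every file below $1 into $2/<lowercase extension>/.
# Files without an extension end up in $2/noext/.
sort_by_extension() (
    src=$1
    dest=$2
    find "$src" -type f | while IFS= read -r file; do
        name=${file##*/}                        # strip the directory part
        case $name in
            *.*) ext=${name##*.} ;;             # text after the last dot
            *)   ext=noext ;;
        esac
        ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
        mkdir -p "$dest/$ext"
        mv -n "$file" "$dest/$ext/"             # -n: never overwrite an existing file
    done
)

# Hypothetical usage:
#   sort_by_extension ./recovered ./sorted
```

Sorting by extension is only a first pass. Duplicates and damaged files still have to be weeded out afterwards.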
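ddrescue’s “Fill-Mode” makes it possible to determine which files within the file system are affected. Fill mode overwrites the sectors recorded as bad in the mapfile with the contents of a given file, here the marker string “DEADBEEF”, so every file that later contains the marker overlaps at least one unreadable sector. In other words, while it may not be possible to fully recover these files, it is still possible to identify which files they are.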
printf "DEADBEEF" > tmpfile
ddrescue --fill-mode=l- tmpfile ./500g-backup.img ddrescue.log
rm tmpfile
mount -t iso9660 -o loop,ro ./500g-backup.img /mnt/500g-backup
find /mnt/500g-backup -type f -exec grep -l "DEADBEEF" '{}' ';'
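Depending on the partitioning and configuration, the above commands need to be adjusted: the file system type passed to “mount” must match the one actually on the drive, and an image of a whole disk with a partition table additionally needs the byte offset of the partition to be mounted. The result should be a listing of all damaged files containing irretrievable data, provided that an intact file system with its metadata is present.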
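For a whole-disk image, the offset needed to mount a single partition is the partition’s start sector multiplied by the sector size. The helper below is a small sketch of that calculation. The name partition_offset, the start sector 63 and the mount point are only examples, and 512-byte logical sectors are an assumption that should be checked against the partition table (for instance with “fdisk -l” on the image).

```shell
#!/bin/sh
# Byte offset of a partition inside a whole-disk image:
# start sector times sector size.
partition_offset() {
    start_sector=$1
    sector_size=${2:-512}       # assumption: 512-byte logical sectors
    echo $(($start_sector * $sector_size))
}

# Hypothetical usage (needs root; start sector and mount point are examples):
#   mount -o loop,ro,offset="$(partition_offset 63)" ./500g-backup.img /mnt/image
```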
Finally, the image can be written to a new storage medium with:
dd if=./500g-backup.img of=/dev/sdX
This copies the data 1-to-1 onto the new medium, which, depending on the situation, may be operational again with or without manual data repairs, just as the 15-year-old HDD was when it was new and error-free.
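After writing the image, it can be worth checking that the new medium really holds the same bytes. The sketch below compares only the first N bytes of the target, where N is the size of the image, because the new medium may be larger. It relies on the “-n” option of GNU cmp. The function name verify_copy is made up for this example.

```shell
#!/bin/sh
# Compare an image with the start of a target file or device.
# Returns 0 if the first <image size> bytes are identical.
verify_copy() (
    image=$1
    target=$2
    size=$(( $(wc -c < "$image") ))     # image size in bytes
    cmp -n "$size" "$image" "$target"   # GNU cmp: compare at most $size bytes
)

# Hypothetical usage (reading the device usually needs root):
#   verify_copy ./500g-backup.img /dev/sdX && echo "copy verified"
```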