Why it's important to preclear your HDDs

A few days ago my workplace provided me with a laptop (preinstalled with Windows 10) to help me carry out my duties as a System Administrator.

The first thing I did was pull out my bootable Arch Linux USB flash drive. I nuked the Windows partition and set up a base Arch installation.

Everything seemed fine and dandy. But once I started installing some big packages like texlive and vscode, I noticed that my laptop froze every 30-40 seconds. I assumed my cores were busy compiling the packages and paid it no heed.

After the first reboot following the package installs, I noticed my zsh terminal was using the default settings. I use the oh-my-zsh framework on top of my zsh setup, so I found it weird that my theme had not activated and went to check my .zshrc file.

I immediately got an I/O Error. Hmm. I checked my .oh-my-zsh directory and found out it was inaccessible.

I ran btrfs scrub to check whether my files were corrupt. My dmesg log filled up with checksum errors. I stopped the scrub since it was freezing the laptop.
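For reference, here is a rough sketch of that check in shell. It assumes the filesystem is mounted at / and that the kernel reports checksum failures with the phrase "csum failed", which may vary between kernel versions. count_csum_errors is a helper name made up for this post, not part of btrfs-progs.

```shell
# Count Btrfs checksum failures in kernel log text read from stdin.
# Assumption: the kernel reports them with the phrase "csum failed".
count_csum_errors() {
    # grep -c exits non-zero when the count is 0, so mask that.
    grep -c 'csum failed' || true
}

# Typical usage (run as root; the mount point / is an assumption):
#   btrfs scrub start /         # start a scrub in the background
#   btrfs scrub status /        # progress and error summary
#   dmesg | count_csum_errors   # how many checksum failures were logged
#   btrfs scrub cancel /        # stop the scrub if the machine becomes unusable
```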

I unscrewed the laptop case and tried reseating the hard disk in case the connectors were the issue. I booted it back up and the problems were still there.

I fired up a bootable Fedora Workstation Live USB. The laptop never had any important data to begin with, so I decided to zero-fill the entire HDD.

dd if=/dev/zero of=/dev/sda bs=16M status=progress && sync

At the same time, I checked my hard drive's S.M.A.R.T. logs and found around 6000 sectors pending reallocation and 2 reported uncorrectable errors. The dd was running in the background and estimated it needed around 3 hours to finish.
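If you want to pull just those two counters out of the S.M.A.R.T. attribute table, something like the sketch below should work. It assumes smartmontools is installed and that the drive reports pending sectors as attribute 197 and reported uncorrectable errors as attribute 187, which is common but vendor-dependent. smart_raw and smart_summary are helper names invented for this example.

```shell
# Print the first token of the RAW_VALUE column (10th field) for one SMART
# attribute, reading `smartctl -A` output from stdin.
# Usage: smart_raw <attribute-id>
smart_raw() {
    awk -v id="$1" '$1 == id { print $10 }'
}

# Summarise the two counters for one drive (needs root).
# Attribute IDs 197 and 187 are an assumption; check your drive's table.
smart_summary() {
    attrs=$(smartctl -A "$1")
    echo "pending sectors:        $(printf '%s\n' "$attrs" | smart_raw 197)"
    echo "reported uncorrectable: $(printf '%s\n' "$attrs" | smart_raw 187)"
}

# Example: smart_summary /dev/sda
```

Anything other than 0 for either counter on a brand new drive is a good reason to return it.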

The next day I checked back on the laptop. I tried pulling the S.M.A.R.T. logs and I was greeted with this error.

/images/blog/hdd-crash/error-1.png

I reran the commands with -T permissive appended to the arguments and lo and behold.

/images/blog/hdd-crash/error-2.png

It still doesn’t work!

Well, I guess the hard drive went bust. I tried rerunning dd to see if it might show some results. Nope. Read-only file system.

/images/blog/hdd-crash/error-3.png

I tried running badblocks to see if that would give any output. It was erroring out.

badblocks /dev/sda

/images/blog/hdd-crash/error-4.png

In the end, I could never get the hard disk to work again.

Conclusion

Whenever you buy a new HDD, whether it’s for your laptop, desktop or NAS, always preclear it. Use badblocks to check for bad sectors, or dd to zero-fill the entire disk. Either one forces the drive to touch every sector, so a disk that is going to fail early tends to do so before you have put any data on it.
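A minimal preclear sketch along those lines, assuming a Linux system with e2fsprogs (for badblocks) and smartmontools installed. The preclear function name and its safety checks are my own invention for this post, and it destroys everything on the target device.

```shell
# Destructive preclear sketch: overwrites EVERYTHING on the given device.
# Usage: preclear /dev/sdX   (run as root)
preclear() {
    dev="$1"
    # Refuse anything that is not a block device (typos, regular files).
    if [ ! -b "$dev" ]; then
        echo "preclear: $dev is not a block device" >&2
        return 1
    fi
    # Refuse a device that has mounted partitions (Linux-specific check).
    if grep -q "^$dev" /proc/mounts; then
        echo "preclear: $dev appears to be mounted" >&2
        return 1
    fi
    # -w: write-mode test (writes patterns and reads them back),
    # -s: show progress, -v: verbose.
    badblocks -wsv "$dev" || return 1
    # Afterwards, check the SMART table for pending or reallocated sectors.
    smartctl -A "$dev"
}
```

The write-mode test makes several full passes over the disk, so expect it to take considerably longer than a single dd zero-fill.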

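HDD failure rates usually follow a bathtub curve: relatively high early in life, low for most of the service life, and rising again as the drive wears out. Writing to the whole disk early on lets you weed out the drives that would fail prematurely.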

/images/blog/hdd-crash/bathtub.png

While it’s recommended for HDDs, I would not do this on SSDs. SSDs have a limited number of write/erase cycles, so you would just be wearing the drive down.

In the end, I wasted an entire day, from setting up the laptop to diagnosing the failure. I would’ve spent only half a day had I precleared my hard drive.

Always preclear your hard drives before using them.