
One point of clarification: one bit on a classic PMR drive contains hundreds of magnetic grains. It is the grains that flip, not the bit. It would take many grain flips to affect the bit. Errors of this sort do not manifest as flipped bits per se; they manifest as a degraded signal, which the drive may or may not be able to translate into the correct bit sequence. Also, the nature of ECC is (usually) that you get either the correct sequence or an error. It would be unusual to get an incorrect sequence unless that is happening somewhere off-drive.

If you have a stored drive that is reporting errors, my starting assumption would be that something else is causing problems besides the platter—maybe the heads have gotten a bit of corrosion from humidity.

Still disagree?



Because HDD manufacturers do not provide the information needed to estimate the data retention time of HDDs with any degree of certainty, we cannot know for sure what causes HDD errors during long-term storage; we can only speculate.

Nevertheless, the experimental facts, both from my own experience over many years with many HDDs and from the reports that I have read, are:

1. Immediately after the warranty of an HDD expires, the probability of mechanical failure increases considerably. I have seen several HDDs fail a few months after warranty expiration, while I have never seen a failure before that (on drives that had passed the initial acceptance tests after purchase; some drives failed the initial tests and were replaced by the vendor).

Therefore one should never plan to store data on HDDs beyond their warranty expiration.

2. When data is stored for several years on HDDs that are powered down, one should expect a few errors (I have seen, e.g., about one error per 2 to 8 TB of data), which show up either as non-correctable errors or as wrong corrections that corrupt the data.

The effect of such errors can be easily mitigated by storing 2 copies of each data file on 2 different HDDs.

An alternative is to introduce a controlled data redundancy, e.g. of 5% or 10%, with a program like "par2create".

That works fine against wrongly corrected sectors. When a non-correctable error is reported, however, many file copy programs fail to copy any of the good sectors that follow a bad sector, so one may need to write a custom script that seeks through the corrupt file and copies the good sectors, in order to recover enough data to reconstruct the original file.
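
A minimal sketch of such a script, in Python. It assumes a regular file on a mounted filesystem; the function name, the 4096-byte block size and the zero-fill policy are illustrative choices, not anything prescribed by par2 or by the drive. Depending on the OS and its read-ahead behavior, one bad sector may make a larger region unreadable than the block size suggests.

```python
import os

def salvage_copy(src_path, dst_path, block_size=4096):
    """Copy src to dst block by block, zero-filling unreadable blocks.

    Returns the byte offsets of the blocks that could not be read in full.
    """
    bad_offsets = []
    size = os.path.getsize(src_path)
    # buffering=0: every read goes straight to the OS at the offset we seek to
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(want) or b""
            except OSError:
                # Unreadable block: keep going instead of aborting the copy
                data = b""
            if len(data) < want:
                bad_offsets.append(offset)
                # Pad with zeros so later blocks stay at their original offsets
                data += b"\x00" * (want - len(data))
            dst.write(data)
            offset += want
    return bad_offsets
```

The output file has the same length as the original, with zeros where reads failed, so a repair tool that checks per-block checksums can see the damaged blocks and work from the rest.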

Storing everything on 2 HDDs, preferably of different models, is the safest method, as it also guards against the case when one HDD is completely lost due to a mechanical defect.
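
With two copies, a wrongly corrected sector shows up as a disagreement between them. A rough sketch of a block-wise comparison in Python follows; the function name and block size are made up for illustration. Note that it only tells you where the copies differ. Deciding which copy is the good one still needs a stored checksum or a third copy.

```python
def differing_blocks(path_a, path_b, block_size=4096):
    """Return the byte offsets of the blocks where two copies differ."""
    diffs = []
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                # Also catches the case where one copy is shorter
                diffs.append(offset)
            offset += block_size
    return diffs
```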



