Massive Google hard drive survey turns up very interesting things
When your server farm is in the hundreds of thousands and you're using cheap, off-the-shelf hard drives as your primary means of storage, you've probably got a pretty damned good data set for looking at the health and failure patterns of hard drives. Google studied a hundred thousand SATA and PATA drives with between 80 and 400GB storage and 5400 to 7200rpm, and while unfortunately they didn't call out specific brands or models that had high failure rates, they did find a few interesting patterns in failing hard drives. The one we found most intriguing was that drives often needed replacement for issues that SMART drive status polling didn't or couldn't determine, and 56% of failed drives did not raise any significant SMART flags (and that's interesting, of course, because SMART exists solely to survey hard drive health); other notable patterns showed that failure rates are indeed definitely correlated to drive manufacturer, model, and age; failure rates did not correspond to drive usage except in very young and old drives (i.e. heavy data "grinding" is not a significant factor in failure); and there is less correlation between drive temperature and failure rates than might have been expected, and drives that are cooled excessively actually fail more often than those running a little hot. Normally we'd recommend you go on ahead and read the document, but be ready for a seriously academic and scientific analysis. [Warning: PDF link][Via Slashdot, photo by Uwe Hermann]






Reader Comments (Page 1 of 1)
Meltz; @ Feb 18th 2007 10:25PM
suck it, consumer reports
Meltz; @ Feb 18th 2007 10:26PM
anyone else up for some heavy data grinding?
Brennan @ Feb 18th 2007 10:38PM
this bad for Western Digital brand HDD? Because i was planning on getting a 250GB HDD from them on newegg.com later this year n WD always say they take less power n more cooling.
sounds reasonable to get these branded HDD for lower temperatures but now after reading this article, im worried about WD HDDs.
PreGHz @ Feb 18th 2007 11:57PM
They run cooler than average, and use (somewhat) less power. But they get warm, not hot.
I believe that the articles are referring to cooled hard drives as hard drives that have a dedicated fan system/kept in cooler rooms.
Warm I think would be a good medium between really hot (like my last Maxtor), and really cold (my new Hitachi liquid cooled.) Speaking of which... time to move some cooling ducts around.
Jon @ Feb 18th 2007 10:38PM
>>other notable patterns showed that failure rates are indeed definitely correlated to drive manufacturer, model, and age
Okay, someone tip us off. Who knows where Google is spending the most money (brands or models)?
kyle @ Feb 18th 2007 10:55PM
I did a lot of research on hard drive make/model in relationship to lifespan, and my end conclusion was...*drumroll*...totally inconclusive. I agree, a tip would really be excellent.
Ihar `Philips` Filipau @ Feb 20th 2007 3:51AM
From all I heard, Google doesn't have "preferred" vendors - it buys from many sources as to diversify its stack.
Every HDD producer had a bad day - and only by diversifying one can assure that data will live on regardless of single vendor problem.
As if nobody here had such experience: five Dells shipped to us all had hard drives failed in 2-3 months. (Most ridiculous part: the hard drives were from different vendors, only common part was "Dell"). Or our RAID from Dell failing due to one hard drive failure and then consequent second hard drive failure during populating replacement for first hard drive. (Tape backups are still only way to ensure data survival.)
John Doe @ Feb 18th 2007 11:03PM
My practical experience with drive reliability has been
Best--
Seagate
Hitachi
IBM
Toshiba
Maxtor
WD
--Worst
I'm going to read the PDF and see if my observations are even remotely accurate.
Brennan @ Feb 18th 2007 11:52PM
u got to be kidding me........y say WD is worst? If u think about it, WD is a trusty brand to go with, only thing is is that after i read this article, im worried about WD HDDs in the future of failing.
Jim @ Apr 12th 2007 7:21PM
Seagate now owns Maxtor. Quantum was purchased a bit back by Maxtor.
n3ldan @ Feb 19th 2007 10:49PM
Personal experience says:
Seagate is the best, followed by WD.
Stay away from Maxtors and Hitachi's, they suck.
Blayne @ Apr 27th 2007 10:36AM
Six years' experience repairing PCs makes me totally agree with n3ldan.
Reliability, best to worst:
1. Seagate
2. Western Digital
3. Maxtor / Quantum / Hitachi / IBM "deathstar" / Samsung
Most of the dead drives I get to see are plain old desktop IDE drives, and I suppose SCSI drives may be different, since they're usually purchased by a completely different type of customer.
Seagate's purchase of Maxtor frightened me. I hope it doesn't mean Seagate is going to use some kind of Maxtor "build extra cheap drives" technology.
John Doe @ Feb 18th 2007 11:06PM
Note to self: RTFA. *sighs*
Othello @ Feb 18th 2007 11:46PM
Ok, so what's the point of all this? Why do this and not release any actual results? The temperature thing is interesting, I'll give them that, but the most important result - which manufacturer makes the most reliable drives - is left out.
PreGHz @ Feb 18th 2007 11:53PM
The point isn't to place blame on certain companies, but to shed light on a lot of hard drive myths.
Like a cooler (all around) hard drive is not better than ones that run a little hot. And that drive failures do result from companies more often than random chance.
This is beside the point, but I swear by Western Digital. I own an array of ten of those babies and none have crashed on me yet. My Maxtors, Samsung, and Seagates, on the other hand...
Tango Charlie @ Feb 18th 2007 11:56PM
Oh, come on. Please please please: we NEED to know the brands and models!
Personally, I've relied on Seagate drives and they've been fine, though the latest one I picked up does actually sound as though it is grinding my data. Literally. Mortar & Pestle type action.
KazO @ Feb 19th 2007 12:06AM
Bleh, they don't want to name names.
My sample size is a lot smaller, but I support 600+ PCs, and I definitely agree on two points: 1) failures correspond heavily to particular models, and not where/how they're used. 2) SMART is a poor predictor of failures**
Though I don't work in a corporate environment, we buy PCs similarly; i.e. we buy the same Tier1 (Compaq/HP in our case) model throughout its life. They'll generally have a specific set of drives they'll spec for that PC model. Failures will tend to come from the same model drives, and within a surprisingly narrow time period.
The worst ones in my memory have been:
WD AC36400/AC26400/64AA, 102AA, 205AA (last 3 are the same family, I think)
Seagate Medalist 6-10Gbish (These might be Conner designs since that's what they look like; they also have SeaShield)
20Gb DiamondMAX (I forget which model)
WD200BB/400BB (the silver and silver/black lids, not the later all-black ones)
DiamondMax Plus 8 (the slimline ones). These are going out left right and center now.
The worst SCSIs have been Seagate Barracuda 18LP and Cheetah X15.
On occasion, we'll catch a drive that's reporting SMART errors before they fail, but they usually let go with no warning. However, SMART is much more effective on SCSI drives. HP/Compaq tracks server drive stats really carefully, and failing drives more often than not WILL show SMART errors.
michael @ Feb 19th 2007 12:22AM
Well it's not a wonder. Google isn't my favorite company with their planning to take over the world and all.
akijikan @ Feb 19th 2007 12:42AM
To the people who are wanting to know the results brandwise:
What is Google more than anything else? No, not a search engine...Advertising Firm! Why would they give free "advertising" by saying company x is the best?
ksj @ Feb 19th 2007 12:47AM
Google is just so powerful. They're not necessarily good or bad, it's just scary when any entity has that much power. Can't decide: thumbs up? or thumbs down?
Sean @ Feb 19th 2007 12:56AM
Granted, I have a limited test base, but of the dozen or so HDs I've used in the last decade, both Maxtors (30Gb and 100Gb) have failed and one Hitachi/IBM 20Gb 2.5 incher. I have an 11 year old WD 1.6Gb that I recently dusted off to see if I had left a copy of Steel Panthers on and by god that sumbitch spun right up (and that game still rocks). My vote for least reliable has got to be Maxtor.
Also, who is still offering a 3 year warranty? That could be quite telling.
Vexorg @ Feb 19th 2007 1:16AM
In my experience, I had good luck with Maxtor drives right up until a couple of years ago, now I've had 3 fail on me in the last 2 years (one of the failures being a warranty replacement for the first one.) At this point, I'm hesitant to trust them anymore. Seagate and WD seem to be the recommendations around here these days. Then again, my brother's still got one of my old IBM Deathstars running in one of my previous systems, and still going after six years...
paul34 @ Feb 19th 2007 1:23AM
well, I'm a Seagate junky to the end. Can't beat how much longer the warranty is than the other guys, and darn QUIET it is compared to other drives. Maybe it lasts a little longer, but its more than worth it for the much greater quality.
That's IMO, anyway...
Can we get a digest of the Google report Engadget? :D
paul34 @ Feb 19th 2007 1:23AM
wow, I should really get to sleep
>>Maybe it lasts a little longer, but its more than worth it for the much greater quality
I meant to say
Maybe it costs a little more, but its more than worth it for the much greater quality
Unbangyourmom @ Feb 19th 2007 2:06AM
Anyone catch how many hard drives are in the tests? It seems that series/brand correlations are probably more due to bad batches. Like the IBM 75GXP fiascos. I would generally try to buy drives from different places when making RAID arrays, looking at manufacturing dates and countries etc. In the end the manufacturers would probably have some incredible statistics that we will never be able to see. A pity really.
Kayode Adeshina @ Feb 19th 2007 2:22AM
I have a WD 300 GB that was supposed to be my back up safety hard drive and now I'm having a hell of a time recovering all my music and family photos off of it. If and when I get all my data back, this WD drive and my M4 rifle are going to have a meeting!
shelterpaw @ Feb 19th 2007 3:04AM
It would be nice to know which manufacturers performed well and which ones did not.
Tech^Cellfish @ Feb 19th 2007 3:56AM
I'm still sticking to Seagate's and Western Digitals
Josh @ Feb 19th 2007 4:24AM
Fwiw (and that's not very much) the only hd I've had fail on me was the hitachi that came in my dell laptop, after ten months of use. The new one they've given me is...hitachi. Hmm. Backing up...
nikster @ Feb 19th 2007 4:51AM
just a thought... maybe there's no correlation between brand and failure rates? e.g. Seagate 80GB model is crap, 120GB model excellent, and 200GB model so-so... if the study was over multiple years I am sure there were some duds from all manufacturers.
At the same time, if it turns out - unlikely as it is- that the 2004 model Seagate such-and-such was the best ever - it doesn't matter as these are in production only for a year or two before being replaced by other models.
XGM @ Feb 19th 2007 8:45AM
As i always say, make frequent backups of important data.
Heres a funny one, ive owned a 8GB Quantum Fireball, and it still working right now in my Linux firewall box. That thing cost 300$ in its days, and from that day ive already had to replace 4 harddrives;
20GB of some brand
80GB Maxtor DiamondMax8
200GB Western Digital (forget what type)
160GB Seagate SATA (they replaced it with a 200GB for free)
Jeremy Moses @ Feb 19th 2007 9:59AM
"Seriously academic and scientific analysis"? Well, ok this is true, but this is about as light and friendly as a serious academic paper gets.
Many of you are wondering why they don't list specific makes and models as unreliable - this paper isn't for you. It's for managers of data centres that already have an installed base of drives, and who want to predict failures. The paper is _not_ trying to create some sort of warm/fuzzy brand recognition in consumers that lets them sleep easier after their latest purchase. Yes, science is objective.
Google could have published the names and models, but the paper states that the drives were up to five years old - where would data centres buy these drives anyway? What sort of predictive power would this give us? Obviously brand name doesn't equate to reliability, and the fact that the results that they have drawn span manufacturers and several model years make them significant, and much much more useful than simply saying, "stay away from brand X", which is how people seem to want to view the marketplace.
Raden Munim @ Feb 19th 2007 11:29AM
http://linuxdevices.com/news/NS2659179152.html
The IA's PetaBox installation comprises about 16 racks housing 600 systems with 2,500 spinning drives, for a total capacity of roughly 1.5 petabytes.
Saikley says Capricorn did extensive testing to qualify hard drives for capacity, reliability, and cost, finally choosing Hitachi. "Although Hitachi does not offer an 'enterprise' or '24x7' SATA drive, our testing found their drives to be as reliable as anything out there, enterprise distinction or not," Saikley said.
Rodney Sharples @ Feb 19th 2007 12:02PM
I've had a fair few hard drives over the years from different manus and I tell you this:
LaCie discs are total junk. (A lot of them contain Maxtor hard drives). I have had two of three die in the last year.
I have since moved all storage to G-Technology G-Drive/G-Raid drives; they are around 2x as expensive, but are stable, reliable and use decent chipsets. I recommend them to anyone that works with graphics/video.
regards
Rodney Sharples @ Feb 19th 2007 12:07PM
Hi I've mentioned on two different blogs
how horrifically bad L**** hard drives are.
The posts are deleted soon thereafter.
Is it not allowed to criticise anything on the
site? Is Engadget just advertorial?
Should we get our pompoms out and only
cheerlead?
A response would be appreciated.
Regards
Rodders
chris @ Feb 19th 2007 1:04PM
Check this out:
http://tinyurl.com/34o8lc
According to this guy, his company is finding that WD makes the most reliable Hard Drives.
I've always used Seagate myself, and never have had a problem.
Ihar `Philips` Filipau @ Feb 20th 2007 4:12AM
Of course, if you can survive seeking noise of the Seagates. I never seen single *silent* Seagate in my life.
I have now Samsung SpinPoint and it works silently. Not as Quantums of old days, but yet magnitude better than Seagates.
Mike @ Feb 19th 2007 4:35PM
Non-PDF Doxica link: http://doxi.ca/a88f
OpenlyFurtive2ALL @ Feb 19th 2007 8:20PM
I'm an Admin at a company with 150+/- computers. We use WD for our PATA and SATA (7,200-10,000 RPM). We use Seagate for our SCSI (not sure of the speed but I believe we use the 15,000RPM versions). (WD btw doesn't make a SCSI) In my experience, up to this point, WD has been extremely reliable with an extremely low "Observed" failure rate. The Seagate SCSI has been just as reliable.
My problem with any of these studies is that the terms are never defined (“Age”, “Temp”, Slightly hot? WTH is that?!) and full histories are not known or impossible to know.
The term age brings to mind questions like
‘How many times was the drive/computer moved?’
‘Was the drive ever disconnected (on the shelf) or was it in constant use?’
‘Was the computer the drive was in a workstation, server, in RAID config, what RAID config, controller-card type, data-only, boot-drive, high-usage environment, allowed to stop when idle, 1U server, power-cycled daily, shutdown-overnight, have a controller-card fail, power-supply fail, reformatted, format type…?’
Otherwise, how do I know where this data fits in my world? (And yes, I know some of these questions were answered in this report.)
FYI
The major cause of HDD failure in our SERVERS (SATA 250GB RAID5 / SCSI 36GB RAID5) thus far has been heat (AC power was turned off at night in our new location, unbeknownst to us, and we lost 1 drive every couple of days for 2 months until we got the cooling right). We ended up replacing every drive in some of our arrays. The servers that had better airflow in the case design had the fewest failures.
The major cause of HDD failure in our WORKSTATIONS (PATA 10GB-80GB single drive configuration) has been age. Old lower capacity drives (10GB-20GB) that have been used and abused (Moved from computer to computer to and/or office to office) make up most of the failures.
Phred @ Feb 19th 2007 8:31PM
About temperature:
They note in the paper how their drives are almost always on and therefore, they have a statistically useless amount of data on power cycles.
Perhaps what makes a difference is temperature changes. Think about it: fast temperature changes can obviously cause big problems (try moving glass from the oven to the freezer). Perhaps the thermal stress from warming up and cooling down every power cycle wears out the drive over time.
This would explain the lack of results from a server farm, but be consistent with common wisdom for drives in PCs, as hotter-running drives would undergo a larger temperature change when warming up and cooling down.
Any thermodynamicists or materials scientists know this stuff better than me?
Mike @ Feb 19th 2007 11:51PM
In addition to being an advertising firm, Google is also a large consumer of electronics. Their data on HDDs is valuable (as in has a dollar value) to them when they are negotiating their next big contract. A maker of more reliable drives would be in a position to charge more if they knew that they had that edge. Google may be able to trade data for dollars with a maker of less reliable, but improvable, drives.
Kads Baker @ Feb 20th 2007 7:30PM
In Russia...Hard drive grinds you!
toddshriber @ Feb 22nd 2007 2:12PM
data grinding rox! - todd shriber