Overclock.net - Overclocking.net
     
 
Old 09-02-09   #1 (permalink)
Mobo Master
 
Bonz™'s Avatar
 
intel nvidia

Join Date: Sep 2007
Location: Ohio, USA
Posts: 1,954
Blog Entries: 3

Rep: 168 (Bonz™ is acknowledged by many)
Unique Rep: 142
Folding Team Rank: 142
Hardware Reviews: 5
Trader Rating: 21
PERC 5/i and Possible HDD Failure

Hey,

First, I just want to state my configuration: 6x1TB WD Blacks in RAID5 on a PERC 5/i. Now on to the post.
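
For reference, here is a quick sketch of what that layout works out to. It assumes equal-size drives and the usual RAID5 rule that one drive's worth of space goes to parity, so the array survives one drive failure but not two. The function name is made up for this example.

```python
def raid5_usable_tb(drive_count, drive_size_tb):
    """Usable capacity of a RAID5 set, in the same decimal TB the drives are sold in."""
    if drive_count < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    # One drive's worth of capacity is consumed by distributed parity.
    return (drive_count - 1) * drive_size_tb


usable = raid5_usable_tb(6, 1)          # the 6x1TB array described above
print(f"usable: {usable} TB (decimal)")

# Operating systems often report binary units, so the same space shows up smaller.
usable_tib = usable * 10**12 / 2**40
print(f"usable: {usable_tib:.2f} TiB (binary)")
```

While one member is rebuilding, the array has no redundancy left, which is worth keeping in mind when deciding whether to wait out a rebuild.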

This morning when I got to work, I logged into my file server at home only to see about 8 error messages up on my screen. They all said something about ID:0 referring to my PERC controller.

So I logged into the Dell SAS Admin panel to have a look. I saw the disk on channel 0 was being rebuilt. How do I know if it's a failed drive? Could something just have gotten corrupted and it was forced to rebuild?

While I was looking, I got another error saying there was a problem during the rebuild. When I checked, it was rebuilding again. So far it's at 22% and still going.

Should I go ahead and take it out and RMA it? Should I just wait it out and see what happens? Here is the log from the second time it errored out and started to rebuild.

Quote:
Dell SAS RAID Storage Manager v2.67-00 Event Log - Generated on Wed Sep 02 10:03:18 EDT 2009
--------------------------------------------------------------------------------------------
ID = 2950
SEQUENCE NUMBER = 26176
TIME = 02-09-2009 10:00:29
LOCALIZED MESSAGE = Controller ID: 0 Time established since power on: Time 2009-09-02,10:00:29 149257 Seconds

ID = 2949
SEQUENCE NUMBER = 26165
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Offline Current = Rebuild

ID = 2948
SEQUENCE NUMBER = 26163
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 Rebuild automatically started: PD --:--:0

ID = 2947
SEQUENCE NUMBER = 26162
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Unconfigured Good Current = Offline

ID = 2946
SEQUENCE NUMBER = 26161
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Unconfigured Bad Current = Unconfigured Good

ID = 2945
SEQUENCE NUMBER = 26160
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 Drive is not certified: --:--:0

ID = 2944
SEQUENCE NUMBER = 26159
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: : :0

ID = 2943
SEQUENCE NUMBER = 26158
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: --:--:0

ID = 2942
SEQUENCE NUMBER = 26157
TIME = 02-09-2009 09:40:05
LOCALIZED MESSAGE = Controller ID: 0 Rebuild failed due to target drive error: PD --:--:0

ID = 2941
SEQUENCE NUMBER = 26156
TIME = 02-09-2009 09:40:05
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Failed Current = Unconfigured Bad

ID = 2940
SEQUENCE NUMBER = 26155
TIME = 02-09-2009 09:40:05
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Rebuild Current = Failed

ID = 2939
SEQUENCE NUMBER = 26154
TIME = 02-09-2009 09:40:05
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: : :0

ID = 2938
SEQUENCE NUMBER = 26153
TIME = 02-09-2009 09:40:05
LOCALIZED MESSAGE = Controller ID: 0 PD removed: --:--:0

ID = 2937
SEQUENCE NUMBER = 26152
TIME = 02-09-2009 09:40:04
LOCALIZED MESSAGE = Controller ID: 0 PD Reset: PD = --:--:0, Error = 3, Path = 12:21:00:00:00:00:00:00

ID = 2936
TIME = 02-09-2009 09:26:49
LOCALIZED MESSAGE = Successful log on to the server User: Administrator, Client: 10.0.0.10, Access Mode: Full, Client Time: 2009-09-02,09:26:49
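
A log like this is easier to follow oldest-first. Below is a minimal Python sketch that splits it into records and lists the state changes in order. It assumes the export always uses the `KEY = value` layout shown above, with a blank line between events. `parse_events` and `state_transitions` are names made up for this example and are not part of the Dell tool; the sample text is an excerpt of the log above.

```python
import re

# Excerpt of the log quoted above (newest first, as exported).
SAMPLE_LOG = """\
ID = 2949
SEQUENCE NUMBER = 26165
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Offline Current = Rebuild

ID = 2947
SEQUENCE NUMBER = 26162
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Unconfigured Good Current = Offline

ID = 2946
SEQUENCE NUMBER = 26161
TIME = 02-09-2009 09:40:08
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Unconfigured Bad Current = Unconfigured Good

ID = 2942
SEQUENCE NUMBER = 26157
TIME = 02-09-2009 09:40:05
LOCALIZED MESSAGE = Controller ID: 0 Rebuild failed due to target drive error: PD --:--:0

ID = 2941
SEQUENCE NUMBER = 26156
TIME = 02-09-2009 09:40:05
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Failed Current = Unconfigured Bad

ID = 2940
SEQUENCE NUMBER = 26155
TIME = 02-09-2009 09:40:05
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = --:--:0 Previous = Rebuild Current = Failed
"""

# Matches the "State change" messages and captures the previous and current state.
STATE_CHANGE = re.compile(r"State change: PD = (\S+) Previous = (.+?) Current = (.+)$")


def parse_events(text):
    """Split the export into one dict per event, sorted oldest first by ID."""
    events = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            # Split on the first " = " only; the message itself contains more of them.
            key, sep, value = line.partition(" = ")
            if sep:
                fields[key.strip()] = value.strip()
        if "ID" in fields:
            events.append(fields)
    events.sort(key=lambda e: int(e["ID"]))
    return events


def state_transitions(events):
    """Return (previous, current) pairs for every state-change event, in order."""
    pairs = []
    for event in events:
        match = STATE_CHANGE.search(event.get("LOCALIZED MESSAGE", ""))
        if match:
            pairs.append((match.group(2), match.group(3)))
    return pairs


for previous, current in state_transitions(parse_events(SAMPLE_LOG)):
    print(f"{previous} -> {current}")
```

Read oldest-first, the drive goes Rebuild, Failed, Unconfigured Bad, Unconfigured Good, Offline, and back to Rebuild, all within a few seconds.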
__________________
Bonz::Asmodian Templar of Zikel

Total Available Storage
Currently: 8,377 GB (useable)


System: BonzTM
CPU
Q6600
Motherboard
Gigabyte G41-ES2L
Memory
4GB DDR2-800
Graphics Card
BFG GTX260 MaxCore
Hard Drive
WD 3x500GB R0
Power Supply
Silverstone DA750
Case
Antec Mini P180
CPU cooling
Xiggy Dark Knight
GPU cooling
Stock
OS
Windows 7 x64
Monitor
2x Asus 21.5" 1080p
Old 09-02-09   #2 (permalink)
Danke schön
 
Tator Tot's Avatar
 
amd ati

Join Date: Jun 2008
Location: Ellisville, Missouri;U.S.
Posts: 11,445
Blog Entries: 3

Rep: 1187 (Tator Tot is a star)
Unique Rep: 771
Hardware Reviews: 1
Trader Rating: 8

It looks to me like a failed drive.
But it could be a "failing drive", i.e. not 100% dead but getting there.

I would just take the drive out and start an RMA unless you really want to wait on the rebuild process.

System: AM Goodnewss
CPU
Phenom II x4 965BE (300x14) 4.2ghz 1.475v
Motherboard
Gigabyte 790FXT-UD5 (NB @ 2.7Ghz)
Memory
2x2GB DDR3 1600mhz Cas 7 1.75v
Graphics Card
HD2600 Pro
Hard Drive
2 x 500GB F3 (PERC5/i RAID0)
Sound Card
Asus Xonar D2X
Power Supply
Seasonic X-Series 650watt
Case
Custom Wood/Acrylic Bench (5x140mm Yate loon Med)
CPU cooling
Mega Shadow w/ PP Delta AFB1212SH
GPU cooling
Stock
OS
Windows 7 Ultimate 64bit
Monitor
2 Dell U2410 (1920x1200) H-IPS Panel
Old 09-03-09   #3 (permalink)
66MHz
 
Manyak's Avatar
 
intel ati

Join Date: Mar 2008
Posts: 7,079

Rep: 654 (Manyak is becoming famous)
Unique Rep: 430
Folding Team Rank: 422
Trader Rating: 31

Take the drive out and stick it in another PC, out of a RAID array, and test it with HDTune.


Also, one other thing to consider - how much headroom do you have with your PSU? I only ask because capacitor aging can change a computer's power draw (and reduce a PSU's possible output) over time, so if there's not enough headroom then it might be hitting the limit and just not powering the HDD anymore.

Either way, you'll find out when you move the drive to another PC to try it.

System: Obsidian Phoenix
CPU
Ci7 920 D0
Motherboard
E760 Classified
Memory
12GB G.Skill Titan DDR3-2000
Graphics Card
Waiting for HD5870x2
Hard Drive
1x Intel G2, 4x Intel G2, 1x Scorpio Black 320GB
Sound Card
Xonar D2X
Power Supply
Corsair 1000HX
Case
Corsair Obsidian 800D
CPU cooling
Heatkiller 3.0 Copper
OS
Windows 7 Ultimate
Monitor
3x Sony GDM-FW900 24" CRT's
Old 09-03-09   #4 (permalink)
Mobo Master
 
Bonz™'s Avatar
 

It automatically rebuilt while I was at work. I never received any SMART errors or anything. I'm hoping it was just a random mess-up or a controller glitch. I ran a consistency check and the array is at OPTIMAL status now.
Old 09-03-09   #5 (permalink)
Linux Lobbyist
 
intel ati

Join Date: May 2009
Location: San Diego, CA
Posts: 233

Rep: 42 BLinux is acknowledged by some
Unique Rep: 34
Trader Rating: 0

It looks to me like PD 0 is having communication issues. It goes offline, then reappears, so the controller starts a rebuild; then it goes offline again, hence the "Rebuild failed due to target drive error" message. I would also check that the cables to PD 0 are secure, that there are no nicks in them, etc. A loose cable can cause intermittent communication problems like this.

If it were an electronic drive failure, you'd usually just see the drive go into the Failed state. If it were a media error (bad blocks on the platter), you'd get I/O errors with corresponding timeouts as it waits for the failed I/O operation to abort. So I don't think it's a failed drive, not yet at least.
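
That reasoning can be written down as a rough rule of thumb. The sketch below only looks for message fragments that appear in the log from the first post ("PD removed", "PD inserted", "Current = Failed"). The `classify` function and its return strings are made up for illustration, and it does not try to detect media errors, since the thread has no sample of those messages. Treat it as a hint for where to look first, not as a diagnosis.

```python
def classify(messages):
    """Guess a likely cause from LOCALIZED MESSAGE strings, given oldest first."""
    removed = False      # the controller saw the disk disappear
    reappeared = False   # ...and then saw it come back
    failed = False       # the disk was marked Failed at some point
    for msg in messages:
        if "PD removed" in msg:
            removed = True
        elif "PD inserted" in msg and removed:
            reappeared = True
        elif "Current = Failed" in msg:
            failed = True
    if reappeared:
        # Dropping off the bus and coming back points at the link, not the platters.
        return "intermittent link suspected: check cables and power"
    if failed and not removed:
        return "drive went straight to Failed: suspect the drive itself"
    return "inconclusive"


# Messages from the first post's log, reordered oldest first.
op_log = [
    "Controller ID: 0 PD Reset: PD = --:--:0, Error = 3, Path = 12:21:00:00:00:00:00:00",
    "Controller ID: 0 PD removed: --:--:0",
    "Controller ID: 0 State change: PD = --:--:0 Previous = Rebuild Current = Failed",
    "Controller ID: 0 Rebuild failed due to target drive error: PD --:--:0",
    "Controller ID: 0 PD inserted: --:--:0",
    "Controller ID: 0 Rebuild automatically started: PD --:--:0",
]
print(classify(op_log))
```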

System: AURORA
CPU
Dual Quad-core E5420 2.5Ghz 12mb cache
Motherboard
Intel 5000 chipset
Memory
48GB FBDIMM DDR2 PC2-5300 667Mhz
Graphics Card
ATI ES1000
Hard Drive
8x500GB WD5002ABYS/RAID5 PERC6/I 256MB cache
Power Supply
950W x2
OS
CentOS 5.3
Old 09-03-09   #6 (permalink)
Mobo Master
 
Bonz™'s Avatar
 

Quote:
Originally Posted by BLinux View Post
It looks to me like PD 0 is having communication issues. It goes offline, then reappears, so the controller starts a rebuild; then it goes offline again, hence the "Rebuild failed due to target drive error" message. I would also check that the cables to PD 0 are secure, that there are no nicks in them, etc. A loose cable can cause intermittent communication problems like this.

If it were an electronic drive failure, you'd usually just see the drive go into the Failed state. If it were a media error (bad blocks on the platter), you'd get I/O errors with corresponding timeouts as it waits for the failed I/O operation to abort. So I don't think it's a failed drive, not yet at least.

Thanks. That was my thought after diagnosing the issue. I haven't had a problem since, and that was the first time. So I'm hoping it was some sort of freak occurrence.
Copyright © 2009 Shogun Interactive Development. Most rights reserved.