
PERC 5/i RAID Card: Tips and Benchmarks - Page 634

post #6331 of 7194
Quote:
Originally Posted by kyle5281 View Post

It's an issue with your Linux distro then (or AMD/Linux), if you are still using the same mobo. If you wanna run Linux, I would suggest a Red Hat/CentOS-based distro, since those are the only ones supported by the official Dell/LSI drivers.

I just got done installing CentOS 6. Array shows up just fine. :(

Guess it's time to learn a new distro. Thank you for all the help.
post #6332 of 7194
Hey guys,
I am having issues with one of the drives in my RAID 5 array on the Perc6/i.
The array consists of four Samsung F4EG/HD204UI disks (all patched for the well-known Samsung firmware bug) and has been working error-free for ~1.5 years now.

Last night I saw the following warnings:
Code:
267  [Warning, 1]  2013-02-10, 01:11:31  Controller ID: 0  Command timeout on PD: PD = -:-:3 No addtional sense information, CDB = 0x2a 0x00 0x5d 0x93 0x77 0x00 0x00 0x01 0x00 0x00, Sense = , Path = 0x1221000003000000  2920
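
For anyone wondering what the CDB bytes in that warning mean, here is a minimal sketch that decodes them. Opcode 0x2a is the SCSI WRITE(10) command, which carries a 4-byte big-endian LBA and a 2-byte block count. The function name decode_write10 is made up for illustration, and the 512-byte sector size is an assumption about how the controller addresses this drive.

```python
import struct

def decode_write10(cdb_text):
    """Decode a 10-byte SCSI WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes(int(tok, 16) for tok in cdb_text.split())
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    # Layout: opcode, flags, 4-byte big-endian LBA, group number,
    # 2-byte big-endian transfer length, control byte.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    if opcode != 0x2A:
        raise ValueError("not a WRITE(10) command")
    return lba, blocks

# CDB copied from the command-timeout warning above.
lba, blocks = decode_write10("0x2a 0x00 0x5d 0x93 0x77 0x00 0x00 0x01 0x00 0x00")

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors
print(f"WRITE(10): LBA {lba}, {blocks} blocks ({blocks * SECTOR_BYTES // 1024} KiB)")
```

So the command that timed out was an ordinary 256-block write. That alone does not say whether the drive, the cabling, or the power is at fault.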

After getting a big bunch of those, I got the following critical error:
Code:
251  [Critical, 2]  2013-02-10, 01:11:32  Controller ID: 0  VD is now DEGRADED  VD 2  2925
81  [Information, 0]  2013-02-10, 01:11:32  Controller ID: 0  State change on VD: 2  Previous = Optimal  Current = Degraded  2924
114  [Information, 0]  2013-02-10, 01:11:32  Controller ID: 0  State change: PD = -:-:3  Previous = Online  Current = Failed  2923
248  [Information, 0]  2013-02-10, 01:11:32  Controller ID: 0  Device removed  Device Type: Disk  Device Id: -:-:3  2922
112  [Warning, 1]  2013-02-10, 01:11:32  Controller ID: 0  PD removed: -:-:3  2921

The array started to rebuild automatically, and after a while I got another error:
Code:
102  [Critical, 2]   2013-02-10, 09:34:10    Controller ID: 0 Rebuild failed due to target drive error: PD -:-:3     3013

For now it is still rebuilding, with about 56% done.
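
With a long controller log it can help to count events per physical drive, to see whether only one port is affected. The sketch below is a rough illustration, not an LSI tool. It assumes each event has been joined onto a single line (the exported log wraps them), and the regular expressions and the name tally_by_drive are my own guesses at a workable format.

```python
import re
from collections import Counter

# Assumption: one event per line, in the shape
# "<code>  [<severity>, <n>]  <date>, <time>  <text>".
EVENT_RE = re.compile(
    r"^\s*(?P<code>\d+)\s+\[(?P<severity>\w+),\s*\d+\]\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2}),\s+(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<text>.*)$"
)
# Physical drives appear as "-:-:<slot>" in these events.
PD_RE = re.compile(r"-:-:(\d+)")

def tally_by_drive(lines):
    """Count events per (drive slot, severity); events that name no PD are skipped."""
    counts = Counter()
    for line in lines:
        event = EVENT_RE.match(line)
        if not event:
            continue
        pd = PD_RE.search(event["text"])
        if pd:
            counts[(int(pd.group(1)), event["severity"])] += 1
    return counts

# Events taken from the log excerpts in this post, joined onto single lines.
sample = [
    "267  [Warning, 1]  2013-02-10, 01:11:31  Controller ID: 0  Command timeout on PD: PD = -:-:3No addtional sense information",
    "251  [Critical, 2]  2013-02-10, 01:11:32  Controller ID: 0  VD is now DEGRADED  VD 2  2925",
    "114  [Information, 0]  2013-02-10, 01:11:32  Controller ID: 0  State change: PD = -:-:3  Previous = Online  Current = Failed  2923",
    "112  [Warning, 1]  2013-02-10, 01:11:32  Controller ID: 0  PD removed: -:-:3  2921",
    "102  [Critical, 2]  2013-02-10, 09:34:10  Controller ID: 0 Rebuild failed due to target drive error: PD -:-:3  3013",
]

counts = tally_by_drive(sample)
for (slot, severity), n in sorted(counts.items()):
    print(f"PD {slot}: {n} x {severity}")
```

If every timeout points at the same slot, that drive, its cable, or its port is the first suspect. Timeouts spread across several slots would point more toward the controller or the power supply.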

Is one of the disks faulty? Or is it just the TLER/CCTL issue?
What is the recommended action in this situation?
How can I check whether the disk is healthy or bad? (I might want to use the warranty on it.)

This really got me freaked out, would really appreciate help here.

10x.
post #6333 of 7194
Plug whatever drive is on port 3 directly into your motherboard and run CrystalDiskInfo to check its SMART status.
post #6334 of 7194
penguinlogik,
I read the SMART data from all of the drives (one by one), but none of them showed any alert in the SMART status.
I matched the PD (#3) from the LSI management software to the actual drive by its serial number, and below is the SMART status for that disk from CrystalDiskInfo:


I don't see any problem, but maybe I am missing something.

Below is also a screenshot from HDD Guardian for the same disk:


post #6335 of 7194
Hmm, it looks like you have the same problem I did, where one drive would randomly drop even though its SMART status was okay. Sadly, the only solution I found was to buy another drive, unless someone else has suggestions. The drive still works perfectly fine by itself, though. The only thing that seems off to me is the write error count, which I have at 100 and you have at 1. But since the originating error is a command timeout, the drive was unresponsive for a while, and I think that has to do with the drive spinning down. Do you know the power-on count for your other drives? If one is much higher than the others, it would be a good idea to replace that drive so the array won't keep degrading.
post #6336 of 7194
The power-on count is very similar across the drives: the other disks have 1890, 1888 and 1889 (so they are within 1-2 counts of each other).
I think something about the drive's response time is not considered a SMART error, while the RAID controller treats the drive as unresponsive.

Is there any SMART spin-down/spin-up test that shows how long the drive takes?
The spin-up time reported in SMART is about the same for all drives, although another drive (not the problematic one) has a higher spin-up time and doesn't cause a problem.

I think it might be related to varying working temperature, which can lead to inconsistent behavior from the drive, or simply to general inconsistency that exceeds the controller's tolerance at peak points.
post #6337 of 7194
Quote:
Originally Posted by The-Fox View Post

-snip-

PM me the entire controller log please.
post #6338 of 7194
Quote:
Originally Posted by The-Fox View Post

Hey guys,
I am having issues with one of the drives in my RAID 5 array on the Perc6/i.
The array consists of four Samsung F4EG/HD204UI disks (all patched for the well-known Samsung firmware bug) and has been working error-free for ~1.5 years now.

Last night I saw the following warnings:

-snip-

For now it is still rebuilding, with about 56% done.

Is one of the disks faulty? Or is it just the TLER/CCTL issue?
What is the recommended action in this situation?
How can I check whether the disk is healthy or bad? (I might want to use the warranty on it.)

This really got me freaked out, would really appreciate help here.

10x.

I'm not 100% positive this is the cause of your problem, but I think it is the most likely one: you're using SATA drives without ERC (Western Digital calls it TLER) on an enterprise RAID controller (a SAS controller). In this configuration you should expect the controller to drop drives from the array at random without warning, even when the drives are still healthy. This is not a PERC problem, a cable problem, or a problem with the drive itself. It can happen with any SAS controller used with SATA disks. Try to rebuild it and hope it works, but in reality you're supposed to use enterprise SAS drives on these controllers; anything else is not guaranteed to work, and if it does, it may not work forever. Alternatively, the better option for this sort of configuration (SATA on SAS controllers) is RAID 6, if the controller supports it.
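
If you want to check whether a drive exposes ERC at all, smartmontools can query and set it through "smartctl -l scterc". Here is a small sketch that only builds the command line. The helper name scterc_command and the /dev/sdX placeholder are assumptions for illustration, device naming differs on Windows, and nothing in this thread establishes whether the HD204UI honors the setting.

```python
import shlex

def scterc_command(device, read_seconds=None, write_seconds=None):
    """Build a smartctl command that queries (no timeouts) or sets SCT ERC."""
    if (read_seconds is None) != (write_seconds is None):
        raise ValueError("give both timeouts or neither")
    option = "scterc"
    if read_seconds is not None:
        # smartctl takes the limits in units of 100 ms, so 7.0 s becomes 70.
        option += f",{round(read_seconds * 10)},{round(write_seconds * 10)}"
    return ["smartctl", "-l", option, device]

# "/dev/sdX" is a placeholder; substitute the real device name.
query = scterc_command("/dev/sdX")
set_7s = scterc_command("/dev/sdX", 7.0, 7.0)
print(shlex.join(query))
print(shlex.join(set_7s))
```

Run the printed commands with the drive attached to a motherboard port. On drives that do accept the setting, it may not survive a power cycle, so check again after a reboot.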

Edit: You can disconnect the hard drives from the array one by one, plug them directly into the motherboard, and download HD Tune, which has a free trial for a few days. You can use it to read the SMART info, and it also has a drive check that does a slow surface scan of the entire drive for defects. Once you find the drive is healthy, you can plug it back into the PERC and attempt to rebuild the array with it. That's about all I can really suggest.
Edited by kithylin - 2/11/13 at 7:12pm
post #6339 of 7194
Got an update:
I tried to rebuild the array again yesterday, since the rebuild had failed a few times before.
After a shutdown, I let the computer cool down a bit and powered it on again.
This time it was much quicker, and it completed successfully this morning.

I got home not long ago and couldn't see the controller active in Windows.

Restart, nada. Reboot and let it cool a bit, same thing.
I started thinking that the card was dead.
Meanwhile I got messages from my ASUS Probe utility that the 3.3V rail is too low at 2.944V, and the 5V rail is not looking good either, at ~4.7V.

So I figured it could be the PSU. I hooked up the Perc6/i in another small HTPC I have and boom, the card is working.

Maybe all those drive issues were due to the PSU giving low 3.3V and 5V, or the controller freaked out a bit as well.
I am going to replace the PSU in the next few days and I'll see how it goes.
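
To put numbers on those readings, here is a quick check against the ±5% tolerance that the ATX specification gives for the +3.3V and +5V rails. The function name rail_status is made up, and software sensor readings like these can be imprecise, so treat this as a sanity check and confirm with a multimeter if possible.

```python
TOLERANCE = 0.05  # assumption: the ±5% band from the ATX spec for +3.3V and +5V

def rail_status(nominal, measured, tolerance=TOLERANCE):
    """Return (relative deviation, True if the reading is inside the band)."""
    deviation = (measured - nominal) / nominal
    low, high = nominal * (1 - tolerance), nominal * (1 + tolerance)
    return deviation, low <= measured <= high

# Readings reported by ASUS Probe in this post.
readings = [(3.3, 2.944), (5.0, 4.7)]
for nominal, measured in readings:
    deviation, ok = rail_status(nominal, measured)
    verdict = "within" if ok else "outside"
    print(f"{nominal}V rail at {measured}V: {deviation:+.1%} ({verdict} ±5%)")
```

Both readings fall below the lower limit (3.135V and 4.75V respectively), which fits the suspicion that the PSU is the culprit.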

10x for your help.
post #6340 of 7194
Quote:
Originally Posted by The-Fox View Post

-snip-

I had some Seagate drives appear failed on a system using onboard RAID in the middle of last summer, and that turned out to be a bad power supply too. I replaced it with a better unit and haven't had a problem since, so power supplies can sometimes make drives appear faulty. Good luck fixing it.