1 - 20 of 23 Posts

· Registered
Joined
·
550 Posts
Discussion Starter · #1 ·
Hi guys, I need some help, like the title says. I'm a bit of an amateur with server-grade stuff, so be gentle lol. Anyway, I have 5 drives running in RAID 5 on an LSI RAID card (with an Intel SAS expander as well). A few days ago my RAID alarm went off, notifying me that a drive had failed on power-up. Now my question is: did I lose my array completely, or can I just pop in a new drive and rebuild my existing array? I tried to access my files, but obviously I can't with the array being damaged. Any help or info would be appreciated. Here are a few pics showing the status of the array from today.

Thanks




[Images: three screenshots of the array status]
 

· Registered
Joined
·
766 Posts
If just one drive fails in RAID5, the array is rebuildable. That's the whole point of RAID in the first place - protect data from a drive failure.

Unless the data is critical, the best course of action would be to shut the system down and order a replacement drive ASAP. Plug in the new drive and start rebuilding the array. If you keep running the system and another drive fails, you will lose data, guaranteed (unless you had RAID 6, in which case you can lose 2 drives, lol).

I'm not familiar with LSI, but I had the same thing happen to me with a PERC H700 (which is made by LSI, IIRC, just with different firmware), and the array continued to boot and function in the degraded state with reduced performance. I didn't have an alarm set up, hence did not catch the failed drive for who knows how long. New drive in, array rebuilt, as if nothing ever happened. Maybe LSI forbids operation of degraded arrays, or maybe there's a setting there preventing it, I don't know.
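
For anyone wondering why a single failure is survivable: RAID 5 keeps one parity block per stripe, computed as the XOR of the data blocks, so any one missing block can be recomputed from the others. Below is a minimal Python sketch of that idea. The block contents and the helper names (`xor_blocks`, `rebuild_missing`) are made up for illustration; a real controller also rotates parity across the drives and works on much larger stripes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recompute the one missing block of a stripe from all surviving blocks.

    Works because data0 ^ data1 ^ ... ^ parity == 0 at every byte position.
    """
    return xor_blocks(surviving_blocks)


# Toy stripe for a 5-drive array: 4 data blocks + 1 parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)
stripe = data + [parity]

# Simulate losing drive 2 and rebuilding its block from the other four.
lost_index = 2
survivors = [blk for i, blk in enumerate(stripe) if i != lost_index]
recovered = rebuild_missing(survivors)
print(recovered == stripe[lost_index])  # True
```

With two blocks missing from the same stripe there is one parity equation and two unknowns, which is why a second failure is fatal on RAID 5.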
 

· Banned
Joined
·
1,185 Posts
I took a random Intel RAID manual, attached below, where you can find some information about the RAID status.

This table describes the drive states.
Looking at your pictures, it seems that the drives have dropped out of the RAID and show as unconfigured.
[Image: drive-state table from the manual]

And your RAID status should be offline.
[Image: manual excerpt]

Check the manual; it describes each option. Long story short, you need to add the disk back to your virtual drive to reassemble your RAID.
As said above, RAID 5 can tolerate only one missing drive; the next drive that dies would be fatal for the whole RAID.

[Image: manual excerpt]

BEWARE: do not initialize a drive by mistake. Initialization wipes the drive, and you would lose all the data.
 

Attachments

· Registered
Joined
·
550 Posts
Discussion Starter · #4 ·
Thanks for the help so far, guys. I was able to add the other drive back into the array. Now all 5 drives are listed in the physical view, but only 2 are online for some reason. I tried to go through the steps for adding a new drive to the array for a rebuild, and it wouldn't even let me, stating that the array was the issue. How would I go about getting all the drives back online? I can show the options page if it might help. In all honesty, if I could just pull the data off the drives (hoping it's still intact), I'd wipe the entire array and start over, but I can't access anything.
 

· Registered
Joined
·
766 Posts
Either more than one drive failed, or you have already tried to mess with it and did something. Do you recall what steps were taken between the drive failure and now?

As stated before, RAID 5 data recovery is problematic due to the way data is split up. You would need at least 4 drives imported to bring the degraded array back online for you to be able to read anything off it yourself. Get the manual for your particular card and read it very carefully.

If you have somehow deleted drive data but have NOT initialized any of the existing drives (the process that takes hours per drive and completely wipes it with no chance of recovery), then in theory your data is still recoverable by a data recovery company. Unfortunately, RAID 5 recovery with your 5 drives will probably run you at least $4k.
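
The "at least 4 drives" figure follows from how many member failures each RAID level tolerates. A small Python sketch of that rule, assuming the standard definitions; the function name `array_readable` is just an illustrative helper, not anything from the LSI tools:

```python
# Number of member-drive failures each level can survive (standard definitions).
FAULT_TOLERANCE = {"raid0": 0, "raid5": 1, "raid6": 2}


def array_readable(level, total_drives, online_drives):
    """Return True if enough members are online for the array to be read."""
    missing = total_drives - online_drives
    return 0 <= missing <= FAULT_TOLERANCE[level]


print(array_readable("raid5", 5, 4))  # True: degraded but still readable
print(array_readable("raid5", 5, 2))  # False: the "only 2 of 5 online" case
```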
 

· Banned
Joined
·
1,185 Posts
Hmm, it looks like the failure corrupted and/or cleared the drive configuration.
You may need to import and merge the foreign configuration for all drives.
Check the manual attached below, page 144, and/or the example below.

If that does not work, please also check this guide.
Long story short: write down how the current RAID is configured, then erase the config and create a new one identical to the old one.
BEWARE: you MUST answer NO at the end of the procedure, when the tool asks whether to initialize the drives.

[Image: example screenshot]
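
Recreating a config "identical to the old one" is the risky part, so it can help to write the parameters down in a structured way and compare them before committing anything. A rough Python sketch follows. The field names and example values are assumptions for illustration only and were not read from the OP's controller; record whatever your card's configuration screen actually shows.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ArrayConfig:
    # All fields are hypothetical examples of what one might note down by hand.
    raid_level: int
    stripe_size_kb: int
    drive_slots: tuple  # member order matters when recreating a config


def config_differences(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    old_d, new_d = asdict(old), asdict(new)
    return {k: (old_d[k], new_d[k]) for k in old_d if old_d[k] != new_d[k]}


# Placeholder values, not taken from the thread's screenshots.
noted = ArrayConfig(raid_level=5, stripe_size_kb=64, drive_slots=(4, 5, 6, 7, 8))
planned = ArrayConfig(raid_level=5, stripe_size_kb=64, drive_slots=(4, 5, 6, 8, 7))

diffs = config_differences(noted, planned)
if diffs:
    print("Do NOT proceed, the planned config differs:", diffs)
else:
    print("Planned config matches the noted one.")
```

In this made-up example the last two slots are swapped, so the check reports a difference in `drive_slots`.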
 

Attachments

· Registered
Joined
·
550 Posts
Discussion Starter · #7 ·
Yeah, that's what I don't get. When I first got into the BIOS and looked at the physical and logical views to see whether the drives were still intact, it showed only 2 of the 5 drives online. Even I know that means something went wrong, I'm just not sure what. Maybe more than one drive did fail, and that's why I couldn't access the array at all.
 

· Registered
Joined
·
550 Posts
Discussion Starter · #8 ·
Jesus, that looks a little confusing lol. I wish I could have gotten MegaRAID Storage Manager to work on the server instead of having to do it all in the WebBIOS. Maybe tomorrow I'll give it a try and see what happens.
 

· Banned
Joined
·
1,185 Posts
RAID stuff is often not as easy as it seems.
That is why you should take some time to read the documentation calmly.

Any progress?
 

· Registered
Joined
·
550 Posts
Discussion Starter · #10 · (Edited)
So my RAID card is finally out of safe mode (it was preventing me from doing anything to my array). The first menu I came to was the Foreign Config Utility. Now 3 of the 5 drives are shown as online, but 2 are showing as rebuild. I've tried importing the drives through the Foreign Config Utility in the WebBIOS, but it won't let me, so I may need to do what you suggested above for importing it. My question is: if 2 drives are showing rebuild, is there still any chance of data recovery at this point? I have a total of five 3TB drives, if that info helps at all (pic attached below showing current status).
EDIT: I still haven't added the new drives, so I'm not sure why it says rebuild yet.
[Image: screenshot of the current drive status]
 

· Registered
Joined
·
600 Posts
If two are showing as rebuild, let the rebuild finish; it will probably take around 24 hours. Most importantly, don't panic. Second most important: if you care about the data, please tell me you have a backup. For now all you can do is wait. When the rebuild is complete and you can mount the volume(s), get the data off ASAP.
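
As a rough sanity check on the "around 24 hours" figure: rebuild time is approximately the drive capacity divided by the sustained rebuild rate. The rates in this Python sketch are assumed round numbers, not measurements from this controller; the real rate depends on the card's rebuild-priority setting and on how busy the array is.

```python
def rebuild_hours(capacity_tb, rate_mb_per_s):
    """Estimate hours to rewrite one drive of the given capacity (decimal TB)."""
    capacity_mb = capacity_tb * 1_000_000  # 1 TB = 1,000,000 MB in decimal units
    return capacity_mb / rate_mb_per_s / 3600


# The drives in this thread are 3 TB each; the rates are assumptions.
for rate in (35, 100):
    print(f"{rate} MB/s -> {rebuild_hours(3, rate):.1f} h")
```

At an assumed 35 MB/s a 3 TB drive takes a bit under 24 hours, so the estimate above is plausible for a card that throttles rebuilds.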
 

· Banned
Joined
·
1,185 Posts
Yep, glad it is rebuilding. It is not uncommon for a RAID to fail because a disk is nearly dead, but not dead enough to drop out for good.
The best thing to do is to let it rebuild; it can take some time.
When the RAID comes online again, start troubleshooting each disk by looking at its detailed SMART attribute table.
With the values stored there, we can check each drive carefully and identify the broken ones.

Then you have a couple of options:
  • Either try to copy the data off, hoping the RAID stays online, if the disks are still OK.
  • Or replace the drives with new ones and rebuild everything; keep in mind that a rebuild is stressful for drives that are not in good shape.
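
To make the SMART check concrete, here is a minimal Python sketch that scans an attribute table for the raw counters usually associated with a dying disk. It assumes the text layout that smartmontools' `smartctl -A` prints, which is my assumption since no tool was named above; the sample table is made up and is not data from the OP's drives. How you obtain that text for drives sitting behind the RAID card depends on the OS and tool, so that part is left out.

```python
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def suspicious_attributes(smart_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


# Made-up sample in the assumed `smartctl -A` layout; not real drive data.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       24
  9 Power_On_Hours          0x0032   071   071   000    Old_age   Always       -       25813
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
"""

print(suspicious_attributes(SAMPLE))
```

A non-zero raw value in any of these three attributes is a common reason to treat a drive as suspect, though it is a hint to investigate rather than proof that the drive caused the array failure.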
 

· Registered
Joined
·
550 Posts
Discussion Starter · #13 ·
So I left the system running for a while to try and rebuild, and now 3 drives are offline again. Should I just add the new drives and go from there, rather than keep trying to rebuild what is left? I'm not really sure whether all 3 drives are bad or whether it's a controller problem; it just seems weird the way the drive status keeps changing. Now this is what it's saying...
[Image: screenshot of the current drive status]
 

· Banned
Joined
·
1,185 Posts
Well, let's keep going.

Again, according to the manual, here are the states of the drives.
[Image: drive-state table from the manual]

So I suppose drives 6 and 8 failed because the rebuild didn't succeed.
Drive 8 is, again, out of the configuration, missing from the group.

As said above, a drive is considered Foreign either when it already carries a RAID configuration from another array,
or when it was disconnected and/or failed.
[Image: manual excerpt]

At this point, I would spend a bit of time troubleshooting the drives using SMART.
I don't know whether you have access to the detailed drive information from the BIOS menus.
Otherwise, you would need to install the Windows command-line tool or the MegaRAID Storage Manager application.
Alternatively, power down the machine and swap drive 8, which keeps failing.
As said above, you don't want to stress the drives too much by rebuilding the RAID over and over.
 

· Registered
Joined
·
600 Posts
He seemingly has not 1 but 2 failed drives in his RAID 5 array, though, which most likely means the data is toast. Continue to let it rebuild and hope for the best, but from what I see it's not looking great.
 

· Registered
Joined
·
766 Posts
Well, 3TB drives are not that expensive these days; I would replace the drive that completely failed and give the rebuild another try.
If you disconnect a drive, power up the RAID, and then reconnect the drive, the controller will need to "rebuild" the array, because data integrity is not guaranteed at that point (something could have changed on the connected drives that is not reflected on the disconnected one). So a rebuild is not necessarily that big of a deal. If the failing drive is somehow taking that whole SAS line offline for whatever reason, removing that drive could solve your issue and give you a successful rebuild.

Also, don't rule out the issue being with the SAS cable or something, given your symptoms. What exactly is your physical configuration, as in what cables go where?

I would say you have a decent chance of getting your data back, so stay calm and don't throw in the towel just yet.
 

· Registered
Joined
·
550 Posts
Discussion Starter · #17 · (Edited)
OK, thanks for the help! Like others have said, I'm starting to think 2 drives may have failed by now for some reason. Once I install the new drive, will the array start to rebuild on its own like it did before (although that rebuild failed), or would I need to add it to the array through the BIOS like you suggested? Unfortunately the storage manager software won't work for me; it keeps saying something is missing from my registry, and I could never figure it out 🤦‍♂️. I'm running Windows Server 2011, if that helps. I had a plan to switch to FreeNAS, and I regret not doing it sooner; at least I would have had a copy of my data...
 

· Registered
Joined
·
550 Posts
Discussion Starter · #18 ·
Perhaps I will try disconnecting the drive to see if the array will rebuild on its own, and maybe try a different set of cables. I'm still waiting for the drives to come; it would be nice if it was fixed before they got here.
 

· Banned
Joined
·
1,185 Posts
The only way to properly assess the state of the drives is to check the values in their SMART attribute tables.
If you post the error you are getting, we can work on solving that too.
If you can't use the LSI software, you can easily find a drive health checker that can read the raw SMART values.

Well, at this point, I would boot the machine with the drives removed
and slowly add each drive back, in the same order they were in on the backplane.
Though, as said above, I would not trigger the rebuild so often; it stresses the drives for nothing.
 

· Registered
Joined
·
550 Posts
Discussion Starter · #20 ·
I was able to track down which drive was bad and am trying to rebuild the array now (with the 4 drives I have left). The only part I'm stuck on is that when I try to make the unconfigured drive online, it adds itself to another drive group for some reason, and I can't seem to get the rest of the array online.
 