Linux Software Raid How-To with GRUB Bootloader
Some of this is not up-to-date (for example, the raidhotadd command), but it is in general very good advice.
I've tried to make only formatting corrections to the original text. You may also want to see the Disk Errors page.
Linux Software Raid How-To with GRUB Bootloader
From Steve Boley of Dell
see the original
1. Install and Prep
Do an install and create the individual software RAID partitions on the drives first. It is easiest to deselect the available drives you don't want a partition on while creating each individual partition on a specific drive (i.e. sda1, sdb1, with the type set to software RAID). Then click the RAID button in Disk Druid; it will only have these two partitions available, so set your RAID level, ext3, and your partition label, and click OK. Tag the next two partitions and follow the same process as above.
The reason for being so meticulous is that this way the RAID partitions match up across the drives and are not out of order. If you just create partitions and let Disk Druid do whatever it wants, you wind up with them all out of order, which makes a big headache when replacing drives and initiating rebuilds on partitions. (Trust me on this.)
After the install is complete we will then create a directory on root called /raidinfo and place some information in there that will make replacing drives and rebuilding them a lot easier if you have a failure.
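Creating that directory is a single command:
#mkdir /raidinfo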
2. Prepping the System in Case of Drive Failure
First, back up the drive partition tables, which is a rather simple command:
#sfdisk -d /dev/sda > /raidinfo/partitions.sda
#sfdisk -d /dev/sdb > /raidinfo/partitions.sdb
Do this for all the drives you have in the system, and you then have backup configuration files of your drives' partition tables. When you have a new drive to replace a previous one, it is very easy to load the partition table onto the new drive with the command:
#sfdisk /dev/sda < /raidinfo/partitions.sda
This partitions the new drive with the exact same partition table that was there before and puts all the rebuild and RAID partitions in exactly the same places, so you don't have to edit any RAID configuration files at all.
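If you have more than a couple of drives, a small shell loop saves some typing when making the backups; this is only a sketch, assuming the drives are sda and sdb (adjust the list to match your system):
#for d in sda sdb; do sfdisk -d /dev/$d > /raidinfo/partitions.$d; done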
3. Installing GRUB Onto Drives that have /boot Raid Partitions
PROBLEM: When you do the install, grub is only installed on id0 if you have SCSI drives, or on /dev/hda for IDE drives. So if either of these drives fails and you pull it, you have problems.
SOLUTION: You need to run grub and install it onto all the other drives in the RAID that the /boot partitions are on. At present, RAID1 is the only level Linux software RAID supports for the /boot partition.
Type grub from the prompt and it will put you in grub's shell:
#grub
grub> find /grub/stage1
This will tell you where all the grub setup files are located. On my RAID1 mirror system it listed:
(hd0,0)
(hd1,0)
Red Hat specifically mounts your /boot partition as the root partition for grub; this is Red Hat specific. What this is listing are the locations of root for grub, which is in fact grub syntax for where the /boot partition is located.
sda=hd0 and sdb=hd1 for SCSI, and hda=hd0 and hdb=hd1 for IDE. The second number specifies the partition number, with 0 as the starting number, so this is saying that it found grub setup files on sda1 and sdb1 (SCSI) or hda1 and hdb1 (IDE).
Now you want to make sure that grub gets installed into the master boot record of your additional RAID drives, so that if id0 is gone the next drive has an MBR loaded with grub ready to go. Systems automatically go in drive order for both IDE and SCSI and use the first MBR and active partition they find, so you can have multiple drives with MBRs as well as active partitions and it won't affect your system booting at all.
So, using what the find above showed, and knowing that hd0 already has grub in its MBR, we then run:
grub> device (hd0) /dev/sdb    (/dev/hdb for IDE)
grub> root (hd0,0)
and then:
grub> setup (hd0)
Grub will then spit out all the commands it runs in the background of setup; it runs an embed command and then an install command, ends with "succeeded" on both of them, prints "Done", and returns to the grub> prompt.
Notice that we made the second drive device 0. Why is that, you ask? Because device 0 is the one that gets the MBR written to it, so passing these commands to grub temporarily treats the second mirror drive as device 0 and puts a bootable MBR on it. When you quit grub you still have the original MBR on sda, and the system will still boot from it until that drive is missing from the system.
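If you would rather script these steps than type them interactively, the same three commands can be fed to grub's batch mode; this is just a sketch, assuming the second mirror drive is /dev/sdb as above:
#grub --batch <<EOF
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
quit
EOF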
You have now succeeded in installing grub to the MBR of your other mirrored drive and marked the boot partition on it active as well. This will ensure that if id0 fails, you can still boot to the OS with id0 pulled, without needing an emergency boot floppy.
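If you want a quick sanity check that the second drive really got grub's boot code, one rough way (a sketch only; it relies on GRUB legacy's stage1 embedding the text "GRUB" in the boot sector) is to dump the first sector and look for that string:
#dd if=/dev/sdb bs=512 count=1 2>/dev/null | strings | grep GRUB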
4. Replacing Drives Before or After Failure
Remember the step above, while prepping the system, of saving the partition tables? Well, they will come in handy here and make life much easier if you have to replace a drive.
There is a lot of mythology about Linux software RAID being difficult and not supporting hot swap of drives. That is exactly what it is: a myth. Since most SCSI controllers are Adaptec and the Adaptec driver supports hot-swap functionality in the Linux SCSI layer, you can echo the device in and out of the real-time /proc filesystem within Linux and it will spin down and spin up the corresponding drive you pass to it.
To get the values you need to pass to the proc filesystem, run:
#cat /proc/scsi/scsi
It will list all SCSI devices detected within Linux at the moment you run the command. Let's say you have a SMART alert being raised from the aic7xxx module, you want to replace the drive, and you are running software RAID1 on the system.
The output of /proc/scsi/scsi is:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: Seagate Model: ... and so on
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: Seagate ... and so on
Id 1 has been the source of the SMART alerts in the messages log, and you have an exact duplicate of the drive to replace it with.
You will need to pass this command to remove id1 from the system:
#echo "scsi remove-single-device 0 0 1 0" > /proc/scsi/scsi
You will then get a message stating that it is spinning down the drive; it will also say which dev it is and tell you when it has completed. Once it is done, you can remove the drive and replace it with the new one.
The reverse spins the drive back up and makes it available to Linux:
#echo "scsi add-single-device 0 0 1 0" > /proc/scsi/scsi
It will give the dev name, say the drive is spinning up, print the queue and tag information, and then report ready along with the size and device name.
You would then use the sfdisk command to reload the partition table onto the drive. (Word of caution: if the system was rebooted and came up with id0 missing, id1 is now sda and not sdb.)
#sfdisk /dev/sdb < /raidinfo/partitions.sdb
This will automatically partition the drive exactly as it was partitioned before, and now you are ready to recreate the RAID partitions that were previously on id1.
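Before writing the saved table back, it is worth double-checking which device name the replacement drive actually received (see the word of caution above). A quick sketch of that check, assuming the new drive came up as /dev/sdb:
#cat /proc/scsi/scsi
#sfdisk -l /dev/sdb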
5. Running Raid Commands to Recover Partitions
The commands to rebuild or remove RAID partitions are rather simple. You have to understand where and what to look at to see the health of the RAID partitions, as well as which /dev/sda and /dev/sdb partitions are assigned to which md RAID device. You see that within the /proc filesystem, just like you did with SCSI. The filename is mdstat and it lives in the root of /proc:
#less /proc/mdstat
md1 : active raid1 sda1[1]
      40064 blocks [2/1] [_U]
It will list all the md devices, which are the actual RAID devices, along with the /dev/sdXX partitions that belong to each one. Notice the [2/1], which indicates that one of the mirrored partitions is missing, namely the one that was on the other drive, sdb1. The number in brackets after sda1 is that partition's slot number within the array. Adding the drive partition back to the array is accomplished with the raidhotadd command:
#raidhotadd /dev/md1 /dev/sdb1
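As the note at the top of this page points out, raidhotadd comes from the older raidtools package and is no longer current; on a system that ships mdadm instead, the equivalent command would be roughly the following (assuming the same md device and partition names as above):
#mdadm /dev/md1 --add /dev/sdb1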
Now less /proc/mdstat shows:
md1 : active raid1 sdb1[0] sda1[1]
      40064 blocks [2/2] [UU]
Now both partitions are shown, with sdb1 listed first in slot 0, and the [UU] designates that both partitions are up; the earlier [_U] meant the first partition was missing and not up. You can queue up all the partition rebuilds and it will process them one by one in the order you queued them. While they wait, the partitions that are still queued will show something like sda5[2], with the 2 denoting a spare waiting to build.
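If you want to keep an eye on a rebuild while it runs, one simple way is to refresh mdstat every couple of seconds (just a convenience sketch, nothing specific to this setup):
#watch -n 2 cat /proc/mdstat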
If you've put a new drive in, don't forget to run the grub commands above to place grub in the MBR of the drive so that you can boot from it later if necessary.
Congratulations, you have now installed, worked with, and restored software RAID in Linux.
Steve Boley
Dell Enterprise Expert Center
Senior Network Technician/Analyst
Poweredge, PowerApp, PowerConnect, and PowerVault Trained Technician
from http://lists.us.dell.com/pipermail/linux-poweredge/2003-July/008898.html