This quick howto shows how to remove a fibre channel LUN under multipathd(8) control from a running Red Hat Enterprise Linux 6 machine. Be careful when performing online storage modifications. Make sure you have a valid backup. And of course I can't be held responsible for any problems if you follow these steps ;)
In our example, we have an LVM volume mounted on /export/oracle which is under multipathd(8) control. We will remove this volume from the server without taking the machine down.
So first, make sure the mount point is not used anymore. Check your applications and users and remove all references to this device.
If the volume is mounted, check whether it is still in use and, if not, unmount it. The fuser(1) and lsof(1) commands can tell you if the device is in use. Don't forget that if this file system is shared via NFS, you will need to stop the NFS daemons before you can umount(1) it.
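For example, lsof(1) lists any processes with files open on the file system, and exportfs(8) shows whether it is currently shared via NFS. If the export check returns a line, stop the NFS services before unmounting (the nfs service name below assumes a stock RHEL 6 installation):
sudo lsof /export/oracle
sudo exportfs -v | grep /export/oracle
sudo service nfs stop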
df -h /export/oracle
/dev/mapper/ora-bckp 2.0T 1.4T 509G 74% /export/oracle
sudo fuser /export/oracle
Now unmount the file system.
sudo umount /export/oracle
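If the file system also has an entry in /etc/fstab, remove or comment it out now, otherwise the machine will try to mount a device that no longer exists at the next boot. A quick check (assuming the mount point appears in only that one entry):
grep /export/oracle /etc/fstab
sudo vim /etc/fstab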
From the df(1) command above, we saw that the /export/oracle file system is in fact an LVM logical volume called "bckp" from the volume group "ora". Let's take a look at the LVM configuration for both of these objects, starting with the logical volume.
sudo lvs bckp
LV VG Attr LSize Pool Origin Data% Move Log Cpy%Sync Convert
bckp ora -wi-a---- 2.00t
Then the volume group.
sudo vgs ora
VG #PV #LV #SN Attr VSize VFree
ora 4 1 0 wz--n- 2.00t 0
And finally, the physical devices.
sudo pvs | egrep 'PV|ora'
PV VG Fmt Attr PSize PFree
/dev/mapper/backup01 ora lvm2 a-- 512.00g 0
/dev/mapper/backup02 ora lvm2 a-- 512.00g 0
/dev/mapper/backup03 ora lvm2 a-- 512.00g 0
/dev/mapper/backup04 ora lvm2 a-- 512.00g 0
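If you want to double-check that these are indeed the only physical volumes backing the logical volume before destroying anything, lvs(8) can list the underlying devices (an optional verification step):
sudo lvs -o +devices ora/bckp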
Take note of these four LVM physical devices; we will need them later. But for now, we must first remove the logical volume and then the volume group from LVM. We start by removing the logical volume.
sudo lvremove ora/bckp
Do you really want to remove active logical volume bckp? [y/n]: y
Logical volume "bckp" successfully removed
Then we remove the volume group.
sudo vgremove ora
Volume group "ora" successfully removed
We can now work on the LVM physical devices.
sudo pvremove /dev/mapper/backup01
Labels on physical volume "/dev/mapper/backup01" successfully wiped
sudo pvremove /dev/mapper/backup02 /dev/mapper/backup03 /dev/mapper/backup04
Labels on physical volume "/dev/mapper/backup02" successfully wiped
Labels on physical volume "/dev/mapper/backup03" successfully wiped
Labels on physical volume "/dev/mapper/backup04" successfully wiped
Good, now let's check the multipath status for these four LVM physical devices.
sudo multipath -ll
[...output truncated...]
backup04 (3600508b4000c1ec00001400000b30000) dm-2 HP,HSV300
size=512G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 2:0:0:4 sdd 8:48 active ready running
| `- 3:0:3:4 sdt 65:48 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
|- 2:0:3:4 sdj 8:144 active ready running
`- 3:0:0:4 sdn 8:208 active ready running
backup03 (3600508b4000c1ec00001400000a60000) dm-3 HP,HSV300
size=512G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 2:0:3:3 sdi 8:128 active ready running
| `- 3:0:0:3 sdm 8:192 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
|- 2:0:0:3 sdc 8:32 active ready running
`- 3:0:3:3 sds 65:32 active ready running
backup02 (3600508b4000c1ec00001400000980000) dm-1 HP,HSV300
size=512G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 2:0:0:2 sdb 8:16 active ready running
| `- 3:0:3:2 sdr 65:16 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
|- 2:0:3:2 sdh 8:112 active ready running
`- 3:0:0:2 sdl 8:176 active ready running
backup01 (3600508b4000c1ec00001400000840000) dm-0 HP,HSV300
size=512G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 2:0:0:1 sda 8:0 active ready running
| `- 3:0:3:1 sdq 65:0 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
|- 2:0:3:1 sdg 8:96 active ready running
`- 3:0:0:1 sdk 8:160 active ready running
Record the SCSI IDs of every path (the H:C:T:L numbers such as 2:0:0:1 in the output above) as we will need them later. A quick way to do so is like this:
for i in backup01 backup02 backup03 backup04; do
    sudo multipath -ll $i | grep ':' | sed -e 's/^[ |`-]*//' -e 's/  */ /g' | cut -d' ' -f1 | tee -a /tmp/ids
done
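The /tmp/ids file now contains one H:C:T:L entry per path, four per LUN. With the multipath output shown above, it should look something like this:
cat /tmp/ids
2:0:0:1
3:0:3:1
2:0:3:1
3:0:0:1
[...output truncated...]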
Now remove the LUNs from multipathd(8) control. Since multipath -f flushes a single device map per invocation, loop over the four aliases.
for i in backup01 backup02 backup03 backup04; do sudo multipath -f $i; done
Once that's done, make sure they're not listed in the following output.
sudo multipath -ll | grep backup
Update /etc/multipath.conf to remove the LUNs. In this example, I removed this block of code from the file's multipaths section. YMMV of course, because the LUNs' WWIDs will obviously not be the same.
sudo vim /etc/multipath.conf
<remove>
multipath {
        wwid "3600508b4000c1ec00001400000840000"
        alias backup01
}
multipath {
        wwid "3600508b4000c1ec00001400000980000"
        alias backup02
}
multipath {
        wwid "3600508b4000c1ec00001400000a60000"
        alias backup03
}
multipath {
        wwid "3600508b4000c1ec00001400000b30000"
        alias backup04
}
</remove>
Tell multipathd(8) that the configuration has changed.
sudo /etc/init.d/multipathd reload
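Alternatively, the same reload can be requested through the daemon's interactive interface, which should be equivalent on RHEL 6:
sudo multipathd -k"reconfigure"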
Clear the devices from the SCSI subsystem. This is where we need the recorded output from above: the Host:Channel:Target:LUN (H:C:T:L) numbers, which look like 2:0:1:3 in the `multipath -ll` output. Since we previously saved our SCSI IDs in the /tmp/ids file, we can simply do this:
sudo su - root
cat /tmp/ids | while read id; do
    echo "1" > /sys/class/scsi_device/${id}/device/delete
done
This will generate log messages similar to this one in /var/log/messages:
Aug 16 13:19:52 oxygen multipathd: sdw: remove path (uevent)
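Once the paths have been deleted, none of the recorded SCSI IDs should still exist under /sys/class/scsi_device. This quick check should print nothing:
cat /tmp/ids | while read id; do
    ls -d /sys/class/scsi_device/${id} 2>/dev/null
done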
Now that we have safely removed the LUNs from the server, we can remove them from the storage array. Once you do this, the server from which we just removed the LUNs will complain in its /var/log/messages:
Aug 16 13:48:59 oxygen kernel: sd 5:0:0:1: [sdc] Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments.
These are warning messages only and can be safely ignored. To be complete, we should also issue a LIP from each of the HBA ports on the server so the HBAs rescan the fabric. If you don't know how many HBA ports you have, just look into the /sys/class/fc_host directory; there will be one sub-directory per HBA port. In this example, the machine has two single-port HBAs, so we have two sub-directories.
ls /sys/class/fc_host/
host2 host3
To issue a LIP reset, simply do this.
sudo su - root
ls /sys/class/fc_host/ | while read dir
do echo $dir; echo 1 > /sys/class/fc_host/${dir}/issue_lip
done
And that's it!