* [linux-lvm] Mirror fail/recover test
@ 2010-02-24 16:14 jose nuno neto
2010-02-24 18:55 ` malahal
0 siblings, 1 reply; 16+ messages in thread
From: jose nuno neto @ 2010-02-24 16:14 UTC (permalink / raw)
To: linux-lvm
Hi
I'm trying to test the failure of a SAN-mirrored LV, then recover and check
for data loss.
I'm running Red Hat 5.4
2.6.18-164.2.1.el5
lvm2-2.02.46-8.el5_4.1
I can create a 2-mirror+log LV fine, lvconvert it down to one leg, and
delete it. But when I simulate a disk failure with either
dd if=/dev/zero of=pvmirror_device
or
echo offline > /sys/block/pvmirror_device/device/state
lvs -a -o +devices
still shows the LV as mirrored (it should switch to non-mirrored, right?),
and shows "unknown device" for the pvmirror_device I destroyed.
When I issue lvconvert -m 0, it complains about a PV UUID not being found;
if I try pvcreate --uuid on pvmirror_device, I get an error complaining
that the device is still mounted.
If I try lvconvert --repair, it hangs forever.
What should be the correct procedure to recover from this?
Thanks
Jose
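For reference, the usual recovery sequence on this vintage of lvm2 looks roughly like the sketch below. `vg00`, `lv_mirror`, and `/dev/mapper/mpathN` are placeholder names, and exact flag behaviour varies between lvm2 releases, so treat this as an outline under those assumptions rather than a verified procedure.

```shell
# Sketch of one recovery path after a mirror leg dies
# (vg00/lv_mirror and /dev/mapper/mpathN are placeholders).

# Either let LVM repair the mirror in one step...
lvconvert --repair vg00/lv_mirror

# ...or do it by hand: drop the dead leg so the LV becomes linear,
lvconvert -m 0 vg00/lv_mirror
# remove the missing PV from the VG metadata,
vgreduce --removemissing vg00
# then, once the disk is back or replaced, re-add the mirror leg.
pvcreate /dev/mapper/mpathN
vgextend vg00 /dev/mapper/mpathN
lvconvert -m 1 vg00/lv_mirror
```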
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
* Re: [linux-lvm] Mirror fail/recover test
2010-02-24 16:14 [linux-lvm] Mirror fail/recover test jose nuno neto
@ 2010-02-24 18:55 ` malahal
2010-02-25 10:36 ` jose nuno neto
0 siblings, 1 reply; 16+ messages in thread
From: malahal @ 2010-02-24 18:55 UTC (permalink / raw)
To: linux-lvm
jose nuno neto [jose.neto@liber4e.com] wrote:
> Hi
>
> I'm trying to test the failure of a SAN Mirrored Lv, and the recover and
> check for data lost.
>
> Im runing RedHat 5.4
> 2.6.18-164.2.1.el5
> lvm2-2.02.46-8.el5_4.1
>
> I create a 2mirror+log lv ok, can lvconvert to one leg only, can delete ok.
> But when I simulate a disk fail either with
> dd if=/dev/zero of=pvmirror_device
> echo offline > /sys/block/pvmirror_device/device/state
What is the output of "dmsetup status" at this point?
There must be some messages in the /var/log/messages file if you enable
them.
> lvs -a -o +devices
> stills shows the lv has mirrored ( should switch to non-mirrored right?) ,
Yes, provided you successfully started the dmeventd monitoring thread
and it handled the failure event.
Thanks, Malahal.
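For reference, whether dmeventd is actually monitoring the mirror can be checked roughly as follows. This is a sketch: `vg00` and `lv_mirror` are placeholder names, and option availability differs across lvm2 releases.

```shell
# Is the daemon running at all?
ps -C dmeventd -o pid,cmd

# (Re)register monitoring for every LV in the VG (placeholder name vg00):
vgchange --monitor y vg00

# The mirror target's status line shows per-leg health flags
# (A = alive, D = dead) for the device named <vg>-<lv>:
dmsetup status vg00-lv_mirror
```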
* Re: [linux-lvm] Mirror fail/recover test
2010-02-24 18:55 ` malahal
@ 2010-02-25 10:36 ` jose nuno neto
2010-02-25 16:11 ` malahal
0 siblings, 1 reply; 16+ messages in thread
From: jose nuno neto @ 2010-02-25 10:36 UTC (permalink / raw)
To: LVM general discussion and development
Much thanks for your interest;
I'm putting more info below.
> jose nuno neto [jose.neto@liber4e.com] wrote:
>> Hi
>>
>> I'm trying to test the failure of a SAN Mirrored Lv, and the recover and
>> check for data lost.
>>
>> Im runing RedHat 5.4
>> 2.6.18-164.2.1.el5
>> lvm2-2.02.46-8.el5_4.1
>>
>> I create a 2mirror+log lv ok, can lvconvert to one leg only, can delete
>> ok.
>> But when I simulate a disk fail either with
>> dd if=/dev/zero of=pvmirror_device
>> echo offline > /sys/block/pvmirror_device/device/state
>
> What is the output of "dmsetup status" at this point?
> There must be some messages in the /var/log/messages file if you enable
> them.
This is my setup for the device I'm unplugging:
multipath -l -v2 | grep -A 7 3600a0b800048f9b200000c2b4b5980b7
mpath12 (3600a0b800048f9b200000c2b4b5980b7) dm-8 SUN,CSM200_R
[size=52G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=0][enabled]
\_ 7:0:1:1 sdo 8:224 [active][undef]
\_ 9:0:1:1 sdq 65:0 [active][undef]
\_ round-robin 0 [prio=0][enabled]
\_ 7:0:0:1 sdd 8:48 [active][undef]
\_ 9:0:0:1 sdf 8:80 [active][undef]
Before unplugging:
dmsetup status mpath12
0 109051904 multipath 2 0 0 0 2 1 E 0 2 0 8:224 A 0 65:0 A 0 E 0 2 0 8:48
A 0 8:80 A 0
echo offline > /sys/block/sdd/device/state
echo offline > /sys/block/sdo/device/state
echo offline > /sys/block/sdq/device/state
echo offline > /sys/block/sdf/device/state
dmsetup status mpath12
0 109051904 multipath 2 0 0 0 2 1 E 0 2 0 8:224 F 1 65:0 F 1 E 0 2 0 8:48
F 1 8:80 F 1
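For anyone decoding these status lines: in the multipath target's status format, each path appears as a `major:minor` pair followed by `A` (active) or `F` (failed, with a fail count). A small awk sketch that decodes the post-failure line above:

```shell
# Decode per-path state from a `dmsetup status <mpath-map>` line:
# each path is listed as major:minor followed by A (active) or F (failed).
# The status string is the post-failure line captured above.
status='0 109051904 multipath 2 0 0 0 2 1 E 0 2 0 8:224 F 1 65:0 F 1 E 0 2 0 8:48 F 1 8:80 F 1'
echo "$status" | awk '{
    for (i = 1; i <= NF; i++)
        if ($i ~ /^[0-9]+:[0-9]+$/)                 # a path device number
            printf "%s %s\n", $i, ($(i+1) == "A" ? "active" : "failed")
}'
# prints: 8:224 failed / 65:0 failed / 8:48 failed / 8:80 failed
```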
Feb 25 11:10:32 malta9 multipathd: sdd: rdac checker reports path is down
Feb 25 11:10:32 malta9 multipathd: sdd: rdac checker reports path is down
Feb 25 11:10:32 malta9 multipathd: sdo: rdac checker reports path is down
Feb 25 11:10:32 malta9 multipathd: sdo: rdac checker reports path is down
Feb 25 11:10:32 malta9 multipathd: sdq: rdac checker reports path is down
Feb 25 11:10:32 malta9 multipathd: sdq: rdac checker reports path is down
Feb 25 11:10:32 malta9 multipathd: dm-8: devmap already registered
Feb 25 11:10:32 malta9 multipathd: dm-8: devmap already registered
Feb 25 11:10:52 malta9 multipathd: sdf: rdac checker reports path is down
Feb 25 11:10:52 malta9 multipathd: sdf: rdac checker reports path is down
Feb 25 11:10:52 malta9 multipathd: dm-8: devmap already registered
Feb 25 11:10:52 malta9 multipathd: dm-8: devmap already registered
dmeventd is running
root 4601 0.0 0.1 96272 69668 ? S<Lsl Feb24 0:00 [dmeventd]
Also I have the lvm.conf option
ignore_suspended_devices = 1
which should prevent this, right?
>
>> lvs -a -o +devices
>> stills shows the lv has mirrored ( should switch to non-mirrored right?)
>> ,
>
> Yes, provided you successfully started the dmeventd monitoring thread
> and it handled the failure event.
>
> Thanks, Malahal.
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
>
lvm.conf
devices {
dir = "/dev"
scan = [ "/dev" ]
preferred_names = [ ]
filter = [ "r/disk/", "r/sd.*/", "a/.*/" ]
cache_dir = "/etc/lvm/cache"
cache_file_prefix = ""
write_cache_state = 1
sysfs_scan = 1
md_component_detection = 1
md_chunk_alignment = 1
data_alignment = 0
ignore_suspended_devices = 1
}
log {
verbose = 1
syslog = 1
file = "/var/log/lvm2.log"
overwrite = 0
level = 0
indent = 1
command_names = 1
prefix = " "
}
backup {
backup = 1
backup_dir = "/etc/lvm/backup"
archive = 1
archive_dir = "/etc/lvm/archive"
retain_min = 10
retain_days = 30
}
shell {
history_size = 100
}
global {
library_dir = "/usr/lib64"
umask = 077
test = 0
units = "h"
activation = 1
proc = "/proc"
locking_type = 1
fallback_to_clustered_locking = 1
fallback_to_local_locking = 1
locking_dir = "/var/lock/lvm"
}
activation {
missing_stripe_filler = "error"
reserved_stack = 256
reserved_memory = 8192
process_priority = -18
volume_list = [ "rootvg" , "@cluster.test" ]
mirror_region_size = 512
readahead = "auto"
mirror_log_fault_policy = "allocate"
mirror_device_fault_policy = "remove"
}
dmeventd {
mirror_library = "libdevmapper-event-lvm2mirror.so"
snapshot_library = "libdevmapper-event-lvm2snapshot.so"
}
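Worth noting for this test: the two `*_fault_policy` lines in the activation section above are what decide how dmeventd reacts when a mirror component fails. A short annotated fragment (semantics paraphrased from the lvm.conf(5) of this era, so verify against your own man page):

```
activation {
        # Log-device failure: try to allocate a replacement log on another PV.
        mirror_log_fault_policy    = "allocate"
        # Mirror-leg failure: drop the faulty leg, reducing the mirror count.
        mirror_device_fault_policy = "remove"
}
```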
* Re: [linux-lvm] Mirror fail/recover test
2010-02-25 10:36 ` jose nuno neto
@ 2010-02-25 16:11 ` malahal
2010-03-02 10:31 ` [linux-lvm] Mirror fail/recover test SOLVED jose nuno neto
0 siblings, 1 reply; 16+ messages in thread
From: malahal @ 2010-02-25 16:11 UTC (permalink / raw)
To: linux-lvm
jose nuno neto [jose.neto@liber4e.com] wrote:
> Much thanks for your interest
> im putting more info below
>
> > jose nuno neto [jose.neto@liber4e.com] wrote:
> >> Hi
> >>
> >> I'm trying to test the failure of a SAN Mirrored Lv, and the recover and
> >> check for data lost.
> >>
> >> Im runing RedHat 5.4
> >> 2.6.18-164.2.1.el5
> >> lvm2-2.02.46-8.el5_4.1
>
> multipath -l -v2 | grep -A 7 3600a0b800048f9b200000c2b4b5980b7
> mpath12 (3600a0b800048f9b200000c2b4b5980b7) dm-8 SUN,CSM200_R
> [size=52G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
> \_ round-robin 0 [prio=0][enabled]
> \_ 7:0:1:1 sdo 8:224 [active][undef]
> \_ 9:0:1:1 sdq 65:0 [active][undef]
> \_ round-robin 0 [prio=0][enabled]
> \_ 7:0:0:1 sdd 8:48 [active][undef]
> \_ 9:0:0:1 sdf 8:80 [active][undef]
>
> Before UnPluging
> dmsetup status mpath12
> 0 109051904 multipath 2 0 0 0 2 1 E 0 2 0 8:224 A 0 65:0 A 0 E 0 2 0 8:48
> A 0 8:80 A 0
>
> echo offline > /sys/block/sdd/device/state
> echo offline > /sys/block/sdo/device/state
> echo offline > /sys/block/sdq/device/state
> echo offline > /sys/block/sdf/device/state
>
> dmsetup status mpath12
> 0 109051904 multipath 2 0 0 0 2 1 E 0 2 0 8:224 F 1 65:0 F 1 E 0 2 0 8:48
> F 1 8:80 F 1
I was actually asking for "dmsetup status <mirror-device>" rather than the
multipath device. I didn't know that you were using multipath devices!
Anyway, looks like you have mpath12 that probably queues I/O on path
failures rather than failing them back to upper layers. In other words,
if you were to run "dd" or any other app to mpath12, it would hang too.
mpath12 seems to keep the request and forever wait for the paths to
become available again in your case. If you really want it to fail,
configure your multipath accordingly.
Thanks, Malahal.
PS: "features=1 queue_if_no_path" in your 'multipath -ll' output is the
source of the error here...
* Re: [linux-lvm] Mirror fail/recover test SOLVED
2010-02-25 16:11 ` malahal
@ 2010-03-02 10:31 ` jose nuno neto
2010-04-14 15:03 ` [linux-lvm] Lvm hangs on San fail jose nuno neto
0 siblings, 1 reply; 16+ messages in thread
From: jose nuno neto @ 2010-03-02 10:31 UTC (permalink / raw)
To: LVM general discussion and development
Hi
Much thanks for your insights; you're right:
multipath was keeping the I/O queued, so LVM didn't fail the mirror.
I had to create a device section in multipath.conf so that features=0
would be used by multipath.
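A hedged example of what such a device section might look like for the SUN/CSM200 array in the earlier `multipath -l` output. The vendor/product strings must match your own `multipath -ll` output, and directive spellings vary a little between device-mapper-multipath releases, so treat this as a sketch:

```
devices {
        device {
                vendor           "SUN"
                product          "CSM200_R"
                hardware_handler "1 rdac"
                # Fail I/O upward instead of queueing forever; this is
                # what removes "1 queue_if_no_path" from the features line.
                no_path_retry    fail
        }
}
```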
Best Regards
Jose
> jose nuno neto [jose.neto@liber4e.com] wrote:
>> Much thanks for your interest
>> im putting more info below
>>
>> > jose nuno neto [jose.neto@liber4e.com] wrote:
>> >> Hi
>> >>
>> >> I'm trying to test the failure of a SAN Mirrored Lv, and the recover
>> and
>> >> check for data lost.
>> >>
>> >> Im runing RedHat 5.4
>> >> 2.6.18-164.2.1.el5
>> >> lvm2-2.02.46-8.el5_4.1
>>
>> multipath -l -v2 | grep -A 7 3600a0b800048f9b200000c2b4b5980b7
>> mpath12 (3600a0b800048f9b200000c2b4b5980b7) dm-8 SUN,CSM200_R
>> [size=52G][features=1 queue_if_no_path][hwhandler=1 rdac][rw]
>> \_ round-robin 0 [prio=0][enabled]
>> \_ 7:0:1:1 sdo 8:224 [active][undef]
>> \_ 9:0:1:1 sdq 65:0 [active][undef]
>> \_ round-robin 0 [prio=0][enabled]
>> \_ 7:0:0:1 sdd 8:48 [active][undef]
>> \_ 9:0:0:1 sdf 8:80 [active][undef]
>>
>> Before UnPluging
>> dmsetup status mpath12
>> 0 109051904 multipath 2 0 0 0 2 1 E 0 2 0 8:224 A 0 65:0 A 0 E 0 2 0
>> 8:48
>> A 0 8:80 A 0
>>
>> echo offline > /sys/block/sdd/device/state
>> echo offline > /sys/block/sdo/device/state
>> echo offline > /sys/block/sdq/device/state
>> echo offline > /sys/block/sdf/device/state
>>
>> dmsetup status mpath12
>> 0 109051904 multipath 2 0 0 0 2 1 E 0 2 0 8:224 F 1 65:0 F 1 E 0 2 0
>> 8:48
>> F 1 8:80 F 1
>
> I was actually asking for "dmsetup status <mirror-device>" rather than
> multipath device. I didn't know that you were using multipath devices!!!
>
> Anyway, looks like you have mpath12 that probably queues I/O on path
> failures rather than failing them back to upper layers. In other words,
> if you were to run "dd" or any other app to mpath12, it would hang too.
>
> mpath12 seems to keep the request and forever wait for the paths to
> become available again in your case. If you really want it to fail,
> configure your multipath accordingly.
>
> Thanks, Malahal.
> PS: "features=1 queue_if_no_path" in your 'multipath -ll' output is the
> source of error here...
>
* [linux-lvm] Lvm hangs on San fail
2010-03-02 10:31 ` [linux-lvm] Mirror fail/recover test SOLVED jose nuno neto
@ 2010-04-14 15:03 ` jose nuno neto
2010-04-14 17:38 ` Eugene Vilensky
2010-04-14 23:02 ` brem belguebli
0 siblings, 2 replies; 16+ messages in thread
From: jose nuno neto @ 2010-04-14 15:03 UTC (permalink / raw)
To: linux-lvm
Hi2all
I'm on RHEL 5.4 with
lvm2-2.02.46-8.el5_4.1
2.6.18-164.2.1.el5
I have a multipathed SAN connection on which I'm building LVs.
It's a cluster system, and I want LVs to switch on failure.
If I simulate a failure through the OS via /sys/bus/scsi/devices/$DEVICE/delete,
I get an LV failure and the service switches to the other node.
But if I do a "real" port-down on the SAN switch, multipath reports the
paths down, but LVM commands hang forever and nothing gets switched.
From the logs I see multipath failing paths, and LVM reporting "Failed to
remove faulty devices".
Any ideas how I should "fix" it?
Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
vg_ora_scapa-lv_ora_scapa_redo
Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
event. Waiting...
Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
paths: 0
Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
paths: 0
Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
paths: 0
Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
paths: 0
Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
vg_syb_roger-lv_syb_roger_admin
Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
vg_syb_roger-lv_syb_roger_admin
Much Thanks
Jose
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-14 15:03 ` [linux-lvm] Lvm hangs on San fail jose nuno neto
@ 2010-04-14 17:38 ` Eugene Vilensky
2010-04-14 23:02 ` brem belguebli
1 sibling, 0 replies; 16+ messages in thread
From: Eugene Vilensky @ 2010-04-14 17:38 UTC (permalink / raw)
To: LVM general discussion and development
What is your multipath.conf setting for "queue_if_no_path"?
On 4/14/10, jose nuno neto <jose.neto@liber4e.com> wrote:
> Hi2all
>
> I'm on RHEL 5.4 with
> lvm2-2.02.46-8.el5_4.1
> 2.6.18-164.2.1.el5
>
> I have a multipathed SAN connection with what Im builing LVs
> Its a Cluster system, and I want LVs to switch on failure
>
> If I simulate a fail through the OS via /sys/bus/scsi/devices/$DEVICE/delete
> I get a LV fail and the service switch to other node
>
> But if I do it "real" portdown on the SAN Switch, multipath reports path
> down, but LVM commands hang forever and nothing gets switched
>
> from the logs i see multipath failing paths, and lvm Failed to remove faulty
> "devices"
>
> Any ideas how I should "fix" it?
>
> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
> vg_ora_scapa-lv_ora_scapa_redo
> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
> event. Waiting...
>
> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> paths: 0
> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> paths: 0
> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> paths: 0
> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> paths: 0
>
> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> vg_syb_roger-lv_syb_roger_admin
> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
> vg_syb_roger-lv_syb_roger_admin
>
> Much Thanks
> Jose
>
--
Sent from my mobile device
Regards,
Eugene Vilensky
evilensky@gmail.com
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-14 15:03 ` [linux-lvm] Lvm hangs on San fail jose nuno neto
2010-04-14 17:38 ` Eugene Vilensky
@ 2010-04-14 23:02 ` brem belguebli
2010-04-15 8:29 ` jose nuno neto
1 sibling, 1 reply; 16+ messages in thread
From: brem belguebli @ 2010-04-14 23:02 UTC (permalink / raw)
To: LVM general discussion and development
Post your multipath.conf file; you may be queuing forever?
On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
> Hi2all
>
> I'm on RHEL 5.4 with
> lvm2-2.02.46-8.el5_4.1
> 2.6.18-164.2.1.el5
>
> I have a multipathed SAN connection with what Im builing LVs
> Its a Cluster system, and I want LVs to switch on failure
>
> If I simulate a fail through the OS via /sys/bus/scsi/devices/$DEVICE/delete
> I get a LV fail and the service switch to other node
>
> But if I do it "real" portdown on the SAN Switch, multipath reports path
> down, but LVM commands hang forever and nothing gets switched
>
> from the logs i see multipath failing paths, and lvm Failed to remove faulty
> "devices"
>
> Any ideas how I should "fix" it?
>
> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
> vg_ora_scapa-lv_ora_scapa_redo
> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
> event. Waiting...
>
> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> paths: 0
> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> paths: 0
> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> paths: 0
> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> paths: 0
>
> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> vg_syb_roger-lv_syb_roger_admin
> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
> vg_syb_roger-lv_syb_roger_admin
>
> Much Thanks
> Jose
>
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-14 23:02 ` brem belguebli
@ 2010-04-15 8:29 ` jose nuno neto
2010-04-15 9:32 ` Bryan Whitehead
2010-04-15 11:59 ` jose nuno neto
0 siblings, 2 replies; 16+ messages in thread
From: jose nuno neto @ 2010-04-15 8:29 UTC (permalink / raw)
To: LVM general discussion and development
GoodMornings
This is what I have on multipath.conf
blacklist {
wwid SSun_VOL0_266DCF4A
wwid SSun_VOL0_5875CF4A
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z]"
}
defaults {
user_friendly_names yes
}
devices {
device {
vendor "HITACHI"
product "OPEN-V"
path_grouping_policy group_by_node_name
failback immediate
no_path_retry fail
}
device {
vendor "IET"
product "VIRTUAL-DISK"
path_checker tur
path_grouping_policy failover
failback immediate
no_path_retry fail
}
}
As an example, this is one LUN. It shows [features=0], so I'd say it should
fail right away.
mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V -SU
[size=26G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=4][active]
\_ 5:0:1:0 sdu 65:64 [active][ready]
\_ 5:0:1:16384 sdac 65:192 [active][ready]
\_ 5:0:1:32768 sdas 66:192 [active][ready]
\_ 5:0:1:49152 sdba 67:64 [active][ready]
\_ round-robin 0 [prio=4][enabled]
\_ 3:0:1:0 sdaw 67:0 [active][ready]
\_ 3:0:1:16384 sdbe 67:128 [active][ready]
\_ 3:0:1:32768 sdbi 67:192 [active][ready]
\_ 3:0:1:49152 sdbm 68:0 [active][ready]
I think they fail, since I see these messages from LVM:
Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
vg_syb_roger-lv_syb_roger_admin
Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
vg_syb_roger-lv_syb_roger_admin
But for some reason LVM can't remove them; is there any option I should
have in lvm.conf?
BestRegards
Jose
> post your multipath.conf file, you may be queuing forever ?
>
>
>
> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>> Hi2all
>>
>> I'm on RHEL 5.4 with
>> lvm2-2.02.46-8.el5_4.1
>> 2.6.18-164.2.1.el5
>>
>> I have a multipathed SAN connection with what Im builing LVs
>> Its a Cluster system, and I want LVs to switch on failure
>>
>> If I simulate a fail through the OS via
>> /sys/bus/scsi/devices/$DEVICE/delete
>> I get a LV fail and the service switch to other node
>>
>> But if I do it "real" portdown on the SAN Switch, multipath reports path
>> down, but LVM commands hang forever and nothing gets switched
>>
>> from the logs i see multipath failing paths, and lvm Failed to remove
>> faulty
>> "devices"
>>
>> Any ideas how I should "fix" it?
>>
>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>> vg_ora_scapa-lv_ora_scapa_redo
>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>> event. Waiting...
>>
>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>> paths: 0
>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>> paths: 0
>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>> paths: 0
>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>> paths: 0
>>
>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>> vg_syb_roger-lv_syb_roger_admin
>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>> in
>> vg_syb_roger-lv_syb_roger_admin
>>
>> Much Thanks
>> Jose
>>
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-15 8:29 ` jose nuno neto
@ 2010-04-15 9:32 ` Bryan Whitehead
2010-04-15 11:59 ` jose nuno neto
1 sibling, 0 replies; 16+ messages in thread
From: Bryan Whitehead @ 2010-04-15 9:32 UTC (permalink / raw)
To: LVM general discussion and development
Can you post the output of pvdisplay?
Also the output of multipath when the port is down?
If your multipath output is still showing all paths [active][ready]
when you shut a port down, you might need to change the path_checker
option. I don't have a Hitachi array but readsector0 (the default) did
not work for me, directio does. This could be LVM seeing IO is timing
out, but the multipath stuff isn't downing a dead path.
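To make that concrete, a minimal sketch of the stanza Bryan describes. The vendor/product strings below are taken from the earlier `multipath` output and must match yours; `directio` support depends on the device-mapper-multipath version:

```
devices {
        device {
                vendor       "HITACHI"
                product      "OPEN-V"
                # Probe paths with direct-I/O reads instead of readsector0.
                path_checker directio
        }
}
```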
On Thu, Apr 15, 2010 at 1:29 AM, jose nuno neto <jose.neto@liber4e.com> wrote:
> GoodMornings
>
> This is what I have on multipath.conf
>
> blacklist {
>        wwid SSun_VOL0_266DCF4A
>        wwid SSun_VOL0_5875CF4A
>        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>        devnode "^hd[a-z]"
> }
> defaults {
>                user_friendly_names             yes
> }
> devices {
>       device {
>                vendor                          "HITACHI"
>                product                         "OPEN-V"
>                path_grouping_policy            group_by_node_name
>                failback                        immediate
>                no_path_retry                   fail
>       }
>       device {
>                vendor                          "IET"
>                product                         "VIRTUAL-DISK"
>                path_checker                    tur
>                path_grouping_policy            failover
>                failback                        immediate
>                no_path_retry                   fail
>       }
> }
>
> As an example this is one LUN. It shoes [features=0] so I'd say it should
> fail right way
>
> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V      -SU
> [size=26G][features=0][hwhandler=0][rw]
> \_ round-robin 0 [prio=4][active]
>  \_ 5:0:1:0     sdu  65:64  [active][ready]
>  \_ 5:0:1:16384 sdac 65:192 [active][ready]
>  \_ 5:0:1:32768 sdas 66:192 [active][ready]
>  \_ 5:0:1:49152 sdba 67:64  [active][ready]
> \_ round-robin 0 [prio=4][enabled]
>  \_ 3:0:1:0     sdaw 67:0   [active][ready]
>  \_ 3:0:1:16384 sdbe 67:128 [active][ready]
>  \_ 3:0:1:32768 sdbi 67:192 [active][ready]
>  \_ 3:0:1:49152 sdbm 68:0   [active][ready]
>
> It think they fail since I see this messages from LVM:
> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> vg_syb_roger-lv_syb_roger_admin
> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
> vg_syb_roger-lv_syb_roger_admin
>
> But from some reason LVM cant remove them, any option I should have on
> lvm.conf?
>
> BestRegards
> Jose
>> post your multipath.conf file, you may be queuing forever ?
>>
>>
>>
>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>>> Hi2all
>>>
>>> I'm on RHEL 5.4 with
>>> lvm2-2.02.46-8.el5_4.1
>>> 2.6.18-164.2.1.el5
>>>
>>> I have a multipathed SAN connection with what Im builing LVs
>>> Its a Cluster system, and I want LVs to switch on failure
>>>
>>> If I simulate a fail through the OS via
>>> /sys/bus/scsi/devices/$DEVICE/delete
>>> I get a LV fail and the service switch to other node
>>>
>>> But if I do it "real" portdown on the SAN Switch, multipath reports path
>>> down, but LVM commands hang forever and nothing gets switched
>>>
>>> from the logs i see multipath failing paths, and lvm Failed to remove
>>> faulty
>>> "devices"
>>>
>>> Any ideas how I should "fix" it?
>>>
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_ora_scapa-lv_ora_scapa_redo
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>>> event. Waiting...
>>>
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>> paths: 0
>>>
>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_syb_roger-lv_syb_roger_admin
>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>>> in
>>> vg_syb_roger-lv_syb_roger_admin
>>>
>>> Much Thanks
>>> Jose
>>>
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-15 8:29 ` jose nuno neto
2010-04-15 9:32 ` Bryan Whitehead
@ 2010-04-15 11:59 ` jose nuno neto
2010-04-15 12:41 ` Eugene Vilensky
1 sibling, 1 reply; 16+ messages in thread
From: jose nuno neto @ 2010-04-15 11:59 UTC (permalink / raw)
To: LVM general discussion and development
hellos
I spent more time on this, and it seems that since LVM can't write to any
PV in the volume group it has lost, it cannot record the failure of the
devices or update the metadata on the other PVs. So it hangs forever.
Is this right?
> GoodMornings
>
> This is what I have on multipath.conf
>
> blacklist {
> wwid SSun_VOL0_266DCF4A
> wwid SSun_VOL0_5875CF4A
> devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> devnode "^hd[a-z]"
> }
> defaults {
> user_friendly_names yes
> }
> devices {
> device {
> vendor "HITACHI"
> product "OPEN-V"
> path_grouping_policy group_by_node_name
> failback immediate
> no_path_retry fail
> }
> device {
> vendor "IET"
> product "VIRTUAL-DISK"
> path_checker tur
> path_grouping_policy failover
> failback immediate
> no_path_retry fail
> }
> }
>
> As an example this is one LUN. It shoes [features=0] so I'd say it should
> fail right way
>
> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V
> -SU
> [size=26G][features=0][hwhandler=0][rw]
> \_ round-robin 0 [prio=4][active]
> \_ 5:0:1:0 sdu 65:64 [active][ready]
> \_ 5:0:1:16384 sdac 65:192 [active][ready]
> \_ 5:0:1:32768 sdas 66:192 [active][ready]
> \_ 5:0:1:49152 sdba 67:64 [active][ready]
> \_ round-robin 0 [prio=4][enabled]
> \_ 3:0:1:0 sdaw 67:0 [active][ready]
> \_ 3:0:1:16384 sdbe 67:128 [active][ready]
> \_ 3:0:1:32768 sdbi 67:192 [active][ready]
> \_ 3:0:1:49152 sdbm 68:0 [active][ready]
>
> It think they fail since I see this messages from LVM:
> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> vg_syb_roger-lv_syb_roger_admin
> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
> vg_syb_roger-lv_syb_roger_admin
>
> But from some reason LVM cant remove them, any option I should have on
> lvm.conf?
>
> BestRegards
> Jose
>> post your multipath.conf file, you may be queuing forever ?
>>
>>
>>
>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>>> Hi2all
>>>
>>> I'm on RHEL 5.4 with
>>> lvm2-2.02.46-8.el5_4.1
>>> 2.6.18-164.2.1.el5
>>>
>>> I have a multipathed SAN connection with what Im builing LVs
>>> Its a Cluster system, and I want LVs to switch on failure
>>>
>>> If I simulate a fail through the OS via
>>> /sys/bus/scsi/devices/$DEVICE/delete
>>> I get a LV fail and the service switch to other node
>>>
>>> But if I do it "real" portdown on the SAN Switch, multipath reports
>>> path
>>> down, but LVM commands hang forever and nothing gets switched
>>>
>>> from the logs i see multipath failing paths, and lvm Failed to remove
>>> faulty
>>> "devices"
>>>
>>> Any ideas how I should "fix" it?
>>>
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_ora_scapa-lv_ora_scapa_redo
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>>> event. Waiting...
>>>
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>> paths: 0
>>>
>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_syb_roger-lv_syb_roger_admin
>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>>> in
>>> vg_syb_roger-lv_syb_roger_admin
>>>
>>> Much Thanks
>>> Jose
>>>
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-15 11:59 ` jose nuno neto
@ 2010-04-15 12:41 ` Eugene Vilensky
2010-04-16 8:55 ` jose nuno neto
0 siblings, 1 reply; 16+ messages in thread
From: Eugene Vilensky @ 2010-04-15 12:41 UTC (permalink / raw)
To: LVM general discussion and development
Can you show us a pvdisplay or verbose vgdisplay ?
On 4/15/10, jose nuno neto <jose.neto@liber4e.com> wrote:
> hellos
>
> I spent more time on this, and it seems that since LVM can't write to
> any PV on the volumes it has lost, it cannot record the device failures
> and update the metadata on the other PVs. So it hangs forever
>
> Is this right?
>
>> GoodMornings
>>
>> This is what I have on multipath.conf
>>
>> blacklist {
>> wwid SSun_VOL0_266DCF4A
>> wwid SSun_VOL0_5875CF4A
>> devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>> devnode "^hd[a-z]"
>> }
>> defaults {
>> user_friendly_names yes
>> }
>> devices {
>> device {
>> vendor "HITACHI"
>> product "OPEN-V"
>> path_grouping_policy group_by_node_name
>> failback immediate
>> no_path_retry fail
>> }
>> device {
>> vendor "IET"
>> product "VIRTUAL-DISK"
>> path_checker tur
>> path_grouping_policy failover
>> failback immediate
>> no_path_retry fail
>> }
>> }
>>
>> As an example this is one LUN. It shows [features=0] so I'd say it should
>> fail right away
>>
>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V
>> -SU
>> [size=26G][features=0][hwhandler=0][rw]
>> \_ round-robin 0 [prio=4][active]
>> \_ 5:0:1:0 sdu 65:64 [active][ready]
>> \_ 5:0:1:16384 sdac 65:192 [active][ready]
>> \_ 5:0:1:32768 sdas 66:192 [active][ready]
>> \_ 5:0:1:49152 sdba 67:64 [active][ready]
>> \_ round-robin 0 [prio=4][enabled]
>> \_ 3:0:1:0 sdaw 67:0 [active][ready]
>> \_ 3:0:1:16384 sdbe 67:128 [active][ready]
>> \_ 3:0:1:32768 sdbi 67:192 [active][ready]
>> \_ 3:0:1:49152 sdbm 68:0 [active][ready]
>>
>> I think they fail since I see these messages from LVM:
>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>> vg_syb_roger-lv_syb_roger_admin
>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
>> vg_syb_roger-lv_syb_roger_admin
>>
>> But for some reason LVM can't remove them; is there any option I should have in
>> lvm.conf?
>>
>> BestRegards
>> Jose
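On the lvm.conf question above, the mirror fault policies in the activation section are the relevant knobs. An illustrative fragment; the option names vary between lvm2 releases, so verify them against the lvm.conf shipped with your version:

```
activation {
    # What to do when a mirror log device fails
    mirror_log_fault_policy = "allocate"
    # What to do when a mirror leg fails; "remove" converts away from
    # the faulty leg (called mirror_device_fault_policy on some releases)
    mirror_image_fault_policy = "remove"
}
```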
>>> post your multipath.conf file, you may be queuing forever ?
>>>
>>>
>>>
>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>>>> Hi2all
>>>>
>>>> I'm on RHEL 5.4 with
>>>> lvm2-2.02.46-8.el5_4.1
>>>> 2.6.18-164.2.1.el5
>>>>
>>>> I have a multipathed SAN connection on which I'm building LVs
>>>> It's a cluster system, and I want LVs to switch on failure
>>>>
>>>> If I simulate a fail through the OS via
>>>> /sys/bus/scsi/devices/$DEVICE/delete
>>>> I get a LV fail and the service switch to other node
>>>>
>>>> But if I do it "real" portdown on the SAN Switch, multipath reports
>>>> path
>>>> down, but LVM commands hang forever and nothing gets switched
>>>>
>>>> from the logs i see multipath failing paths, and lvm Failed to remove
>>>> faulty
>>>> "devices"
>>>>
>>>> Any ideas how I should "fix" it?
>>>>
>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>>>> vg_ora_scapa-lv_ora_scapa_redo
>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>>>> event. Waiting...
>>>>
>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>>> paths: 0
>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>>> paths: 0
>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>>> paths: 0
>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>>> paths: 0
>>>>
>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>>> vg_syb_roger-lv_syb_roger_admin
>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>>>> in
>>>> vg_syb_roger-lv_syb_roger_admin
>>>>
>>>> Much Thanks
>>>> Jose
>>>>
>>>
>>>
>>>
>>
>>
>
>
--
Sent from my mobile device
Regards,
Eugene Vilensky
evilensky@gmail.com
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-15 12:41 ` Eugene Vilensky
@ 2010-04-16 8:55 ` jose nuno neto
2010-04-16 20:15 ` Bryan Whitehead
2010-04-17 9:00 ` brem belguebli
0 siblings, 2 replies; 16+ messages in thread
From: jose nuno neto @ 2010-04-16 8:55 UTC (permalink / raw)
To: LVM general discussion and development
Hi
> Can you show us a pvdisplay or verbose vgdisplay ?
>
Here goes the vgdisplay -v of one of the vgs with mirrors
###########################################################
--- Volume group ---
VG Name vg_ora_jura
System ID
Format lvm2
Metadata Areas 3
Metadata Sequence No 705
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 4
Open LV 4
Max PV 0
Cur PV 3
Act PV 3
VG Size 52.79 GB
PE Size 4.00 MB
Total PE 13515
Alloc PE / Size 12292 / 48.02 GB
Free PE / Size 1223 / 4.78 GB
VG UUID nttQ3x-4ecP-Q6ms-jt2u-UIs4-texj-Q9Nxdt
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_arch
VG Name vg_ora_jura
LV UUID 8oUfYn-2TrP-yS6K-pcS2-cgI4-tcv1-33dSdX
LV Write Access read/write
LV Status available
# open 1
LV Size 5.00 GB
Current LE 1280
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:28
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_export
VG Name vg_ora_jura
LV UUID NLfQT6-36TS-DRHq-PJRf-9UDv-L8mz-HjPea2
LV Write Access read/write
LV Status available
# open 1
LV Size 5.00 GB
Current LE 1280
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:32
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_data
VG Name vg_ora_jura
LV UUID VtSBIL-XvCw-23xK-NVAH-DvYn-P2sE-OkZJro
LV Write Access read/write
LV Status available
# open 1
LV Size 12.00 GB
Current LE 3072
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:40
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_redo
VG Name vg_ora_jura
LV UUID KRHKBG-71Qv-YBsA-oJDt-igzP-EYaI-gPwcBX
LV Write Access read/write
LV Status available
# open 1
LV Size 2.00 GB
Current LE 512
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:48
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mimage_0
VG Name vg_ora_jura
LV UUID lQCOAt-aoK3-HBp1-xrQW-eh7L-6t94-CyAg5c
LV Write Access read/write
LV Status available
# open 1
LV Size 5.00 GB
Current LE 1280
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:26
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mimage_1
VG Name vg_ora_jura
LV UUID snrnPc-8FxY-ekAk-ooNe-sBws-tuI0-cTFfj3
LV Write Access read/write
LV Status available
# open 1
LV Size 5.00 GB
Current LE 1280
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:27
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mlog
VG Name vg_ora_jura
LV UUID ouqaCQ-Deex-iArv-xLe9-jg8b-5cLf-3SChQ1
LV Write Access read/write
LV Status available
# open 1
LV Size 4.00 MB
Current LE 1
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:25
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_data_mlog
VG Name vg_ora_jura
LV UUID TmE2S0-r8ST-v624-RxUn-Qppw-2l8p-jM9EC9
LV Write Access read/write
LV Status available
# open 1
LV Size 4.00 MB
Current LE 1
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:37
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_data_mimage_0
VG Name vg_ora_jura
LV UUID 8hR0bP-g9mR-OSXS-KdUM-ouZ6-KVdS-sfz51c
LV Write Access read/write
LV Status available
# open 1
LV Size 12.00 GB
Current LE 3072
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:38
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_data_mimage_1
VG Name vg_ora_jura
LV UUID fzdzrD-7p6d-XFkA-UHyr-CPad-F2nV-6QIU9p
LV Write Access read/write
LV Status available
# open 1
LV Size 12.00 GB
Current LE 3072
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:39
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_export_mlog
VG Name vg_ora_jura
LV UUID 29yLY8-N3Lv-46pN-1jze-50A2-wlhu-quuoMa
LV Write Access read/write
LV Status available
# open 1
LV Size 4.00 MB
Current LE 1
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:29
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_export_mimage_0
VG Name vg_ora_jura
LV UUID 1uMTsf-wPaQ-ItTy-rpma-m2La-TGZl-C4KIU4
LV Write Access read/write
LV Status available
# open 1
LV Size 5.00 GB
Current LE 1280
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:30
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_export_mimage_1
VG Name vg_ora_jura
LV UUID cm8Kn7-knL3-mUPL-XFvU-geMm-Wxff-32x2va
LV Write Access read/write
LV Status available
# open 1
LV Size 5.00 GB
Current LE 1280
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:31
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mlog
VG Name vg_ora_jura
LV UUID 811tNy-eaC5-zfZQ-1QVf-cbYP-1MIM-v6waJF
LV Write Access read/write
LV Status available
# open 1
LV Size 4.00 MB
Current LE 1
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:45
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mimage_0
VG Name vg_ora_jura
LV UUID aUZAer-f5rl-1f2X-9jgY-f8CJ-jdwe-F5Pmao
LV Write Access read/write
LV Status available
# open 1
LV Size 2.00 GB
Current LE 512
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:46
--- Logical volume ---
LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mimage_1
VG Name vg_ora_jura
LV UUID gAEJym-sSbq-rC4P-AjpI-OibV-k3yI-lDx1I6
LV Write Access read/write
LV Status available
# open 1
LV Size 2.00 GB
Current LE 512
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:47
--- Physical volumes ---
PV Name /dev/mapper/mpath-dc1-b
PV UUID hgjXU1-2qjo-RsmS-1XJI-d0kZ-oc4A-ZKCza8
PV Status allocatable
Total PE / Free PE 6749 / 605
PV Name /dev/mapper/mpath-dc2-b
PV UUID hcANwN-aeJT-PIAq-bPsf-9d3e-ylkS-GDjAGR
PV Status allocatable
Total PE / Free PE 6749 / 605
PV Name /dev/mapper/mpath-dc2-mlog1p1
PV UUID 4l9Qvo-SaAV-Ojlk-D1YB-Tkud-Yjg0-e5RkgJ
PV Status allocatable
Total PE / Free PE 17 / 13
> On 4/15/10, jose nuno neto <jose.neto@liber4e.com> wrote:
>> hellos
>>
>> I spent more time on this and it seems since LVM cant write to any pv on
>> the volumes it has lost, it cannot write the failure of the devices and
>> update the metadata on other PVs. So it hangs forever
>>
>> Is this right?
>>
>>> GoodMornings
>>>
>>> This is what I have on multipath.conf
>>>
>>> blacklist {
>>> wwid SSun_VOL0_266DCF4A
>>> wwid SSun_VOL0_5875CF4A
>>> devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>>> devnode "^hd[a-z]"
>>> }
>>> defaults {
>>> user_friendly_names yes
>>> }
>>> devices {
>>> device {
>>> vendor "HITACHI"
>>> product "OPEN-V"
>>> path_grouping_policy group_by_node_name
>>> failback immediate
>>> no_path_retry fail
>>> }
>>> device {
>>> vendor "IET"
>>> product "VIRTUAL-DISK"
>>> path_checker tur
>>> path_grouping_policy failover
>>> failback immediate
>>> no_path_retry fail
>>> }
>>> }
>>>
>>> As an example this is one LUN. It shows [features=0] so I'd say it
>>> should
>>> fail right away
>>>
>>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V
>>> -SU
>>> [size=26G][features=0][hwhandler=0][rw]
>>> \_ round-robin 0 [prio=4][active]
>>> \_ 5:0:1:0 sdu 65:64 [active][ready]
>>> \_ 5:0:1:16384 sdac 65:192 [active][ready]
>>> \_ 5:0:1:32768 sdas 66:192 [active][ready]
>>> \_ 5:0:1:49152 sdba 67:64 [active][ready]
>>> \_ round-robin 0 [prio=4][enabled]
>>> \_ 3:0:1:0 sdaw 67:0 [active][ready]
>>> \_ 3:0:1:16384 sdbe 67:128 [active][ready]
>>> \_ 3:0:1:32768 sdbi 67:192 [active][ready]
>>> \_ 3:0:1:49152 sdbm 68:0 [active][ready]
>>>
>>> I think they fail since I see these messages from LVM:
>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_syb_roger-lv_syb_roger_admin
>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>>> in
>>> vg_syb_roger-lv_syb_roger_admin
>>>
>>> But for some reason LVM can't remove them; is there any option I should have in
>>> lvm.conf?
>>>
>>> BestRegards
>>> Jose
>>>> post your multipath.conf file, you may be queuing forever ?
>>>>
>>>>
>>>>
>>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>>>>> Hi2all
>>>>>
>>>>> I'm on RHEL 5.4 with
>>>>> lvm2-2.02.46-8.el5_4.1
>>>>> 2.6.18-164.2.1.el5
>>>>>
>>>>> I have a multipathed SAN connection on which I'm building LVs
>>>>> It's a cluster system, and I want LVs to switch on failure
>>>>>
>>>>> If I simulate a fail through the OS via
>>>>> /sys/bus/scsi/devices/$DEVICE/delete
>>>>> I get a LV fail and the service switch to other node
>>>>>
>>>>> But if I do it "real" portdown on the SAN Switch, multipath reports
>>>>> path
>>>>> down, but LVM commands hang forever and nothing gets switched
>>>>>
>>>>> from the logs i see multipath failing paths, and lvm Failed to remove
>>>>> faulty
>>>>> "devices"
>>>>>
>>>>> Any ideas how I should "fix" it?
>>>>>
>>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has
>>>>> failed.
>>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>>>>> vg_ora_scapa-lv_ora_scapa_redo
>>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>>>>> event. Waiting...
>>>>>
>>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>>>> paths: 0
>>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>>>> paths: 0
>>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>>>> paths: 0
>>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>>>> paths: 0
>>>>>
>>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>>>> vg_syb_roger-lv_syb_roger_admin
>>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
>>>>> devices
>>>>> in
>>>>> vg_syb_roger-lv_syb_roger_admin
>>>>>
>>>>> Much Thanks
>>>>> Jose
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-16 8:55 ` jose nuno neto
@ 2010-04-16 20:15 ` Bryan Whitehead
2010-04-17 9:00 ` brem belguebli
1 sibling, 0 replies; 16+ messages in thread
From: Bryan Whitehead @ 2010-04-16 20:15 UTC (permalink / raw)
To: LVM general discussion and development
Can you post the output of multipath when the port is down?
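In the meantime, a sketch of the commands usually used to capture that state while the port is down (requires root; the map name below is taken from earlier in the thread as an example, the exact output will differ):

```
multipath -ll mpath-dc2-a        # path states for one map; omit the name for all maps
dmsetup status | grep mirror     # device-mapper's view of the mirror targets
lvs -a -o +devices               # which PVs back each mirror image and log
```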
On Fri, Apr 16, 2010 at 1:55 AM, jose nuno neto <jose.neto@liber4e.com> wrote:
> Hi
>
>
>> Can you show us a pvdisplay or verbose vgdisplay ?
>>
>
> Here goes the vgdisplay -v of one of the vgs with mirrors
>
> ###########################################################
>
> [...]
>
>
>
>> On 4/15/10, jose nuno neto <jose.neto@liber4e.com> wrote:
>>> hellos
>>>
>>> I spent more time on this, and it seems that since LVM can't write to
>>> any PV on the volumes it has lost, it cannot record the device failures
>>> and update the metadata on the other PVs. So it hangs forever
>>>
>>> Is this right?
>>>
>>>> GoodMornings
>>>>
>>>> This is what I have on multipath.conf
>>>>
>>>> blacklist {
>>>>         wwid SSun_VOL0_266DCF4A
>>>>         wwid SSun_VOL0_5875CF4A
>>>>         devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>>>>         devnode "^hd[a-z]"
>>>> }
>>>> defaults {
>>>>                 user_friendly_names             yes
>>>> }
>>>> devices {
>>>>        device {
>>>>                 vendor                          "HITACHI"
>>>>                 product                         "OPEN-V"
>>>>                 path_grouping_policy            group_by_node_name
>>>>                 failback                        immediate
>>>>                 no_path_retry                   fail
>>>>        }
>>>>        device {
>>>>                 vendor                          "IET"
>>>>                 product                         "VIRTUAL-DISK"
>>>>                 path_checker                    tur
>>>>                 path_grouping_policy            failover
>>>>                 failback                        immediate
>>>>                 no_path_retry                   fail
>>>>        }
>>>> }
>>>>
>>>> As an example this is one LUN. It shows [features=0] so I'd say it
>>>> should
>>>> fail right away
>>>>
>>>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V
>>>> -SU
>>>> [size=26G][features=0][hwhandler=0][rw]
>>>> \_ round-robin 0 [prio=4][active]
>>>>  \_ 5:0:1:0     sdu  65:64  [active][ready]
>>>>  \_ 5:0:1:16384 sdac 65:192 [active][ready]
>>>>  \_ 5:0:1:32768 sdas 66:192 [active][ready]
>>>>  \_ 5:0:1:49152 sdba 67:64  [active][ready]
>>>> \_ round-robin 0 [prio=4][enabled]
>>>>  \_ 3:0:1:0     sdaw 67:0   [active][ready]
>>>>  \_ 3:0:1:16384 sdbe 67:128 [active][ready]
>>>>  \_ 3:0:1:32768 sdbi 67:192 [active][ready]
>>>>  \_ 3:0:1:49152 sdbm 68:0   [active][ready]
>>>>
>>>> I think they fail since I see these messages from LVM:
>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>>> vg_syb_roger-lv_syb_roger_admin
>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>>>> in
>>>> vg_syb_roger-lv_syb_roger_admin
>>>>
>>>> But for some reason LVM can't remove them; is there any option I should have in
>>>> lvm.conf?
>>>>
>>>> BestRegards
>>>> Jose
>>>>> post your multipath.conf file, you may be queuing forever ?
>>>>>
>>>>>
>>>>>
>>>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>>>>>> Hi2all
>>>>>>
>>>>>> I'm on RHEL 5.4 with
>>>>>> lvm2-2.02.46-8.el5_4.1
>>>>>> 2.6.18-164.2.1.el5
>>>>>>
>>>>>> I have a multipathed SAN connection on which I'm building LVs
>>>>>> It's a cluster system, and I want LVs to switch on failure
>>>>>>
>>>>>> If I simulate a fail through the OS via
>>>>>> /sys/bus/scsi/devices/$DEVICE/delete
>>>>>> I get a LV fail and the service switch to other node
>>>>>>
>>>>>> But if I do it "real" portdown on the SAN Switch, multipath reports
>>>>>> path
>>>>>> down, but LVM commands hang forever and nothing gets switched
>>>>>>
>>>>>> from the logs i see multipath failing paths, and lvm Failed to remove
>>>>>> faulty
>>>>>> "devices"
>>>>>>
>>>>>> Any ideas how I should "fix" it?
>>>>>>
>>>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has
>>>>>> failed.
>>>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>>>>>> vg_ora_scapa-lv_ora_scapa_redo
>>>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>>>>>> event. Waiting...
>>>>>>
>>>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>>>>> paths: 0
>>>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>>>>> paths: 0
>>>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>>>>> paths: 0
>>>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>>>>> paths: 0
>>>>>>
>>>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>>>>> vg_syb_roger-lv_syb_roger_admin
>>>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
>>>>>> devices
>>>>>> in
>>>>>> vg_syb_roger-lv_syb_roger_admin
>>>>>>
>>>>>> Much Thanks
>>>>>> Jose
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-16 8:55 ` jose nuno neto
2010-04-16 20:15 ` Bryan Whitehead
@ 2010-04-17 9:00 ` brem belguebli
2010-04-19 9:21 ` jose nuno neto
1 sibling, 1 reply; 16+ messages in thread
From: brem belguebli @ 2010-04-17 9:00 UTC (permalink / raw)
To: LVM general discussion and development
Hi Jose,
You have a total of 8 paths per LUN: 4 are marked active through HBA host5
and the remaining 4 are marked enabled on HBA 3 (you're on 2 different
fabrics, right?). This may be due to the fact that you use the
group_by_node_name policy; I don't know whether this mode actually
load-balances across the 2 HBAs.
When you pull the cable (that is the test that's failing?), you say it
times out forever.
Since you're using the group_by_node_name policy, which groups by the
fc_transport target node name, you should look at the state of the target
ports bound to the HBA you disconnected, under
/sys/class/fc_remote_ports/rport-H:B-R (where H is your HBA number). If
they stay in state Blocked for a long time, it may be because dev_loss_tmo
or fast_io_fail_tmo is set too high (both timers are located under
/sys/class/fc_remote_ports/rport-...).
I have almost the same setup with almost the same storage (OPEN-V) from a
pair of HP XPs (OEM'ed Hitachi arrays), and things are set up to use at
most 4 paths per LUN (2 per fabric); some storage experts say even that is
too much. As the multipath policy I use multibus to distribute load across
the 2 fabrics.
Hope all this helps.
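Checking those rport states and timers can be scripted; a minimal sketch,
assuming the standard sysfs fc_remote_ports layout (on a box with no FC
HBAs the loop body never runs and it prints nothing):

```shell
# Print port state and failure timers for each FC remote port.
# Assumes the standard /sys/class/fc_remote_ports sysfs layout; on a
# machine without FC HBAs the glob does not expand and nothing is printed.
list_fc_rports() {
  for rport in /sys/class/fc_remote_ports/rport-*; do
    [ -d "$rport" ] || continue   # glob did not match anything
    printf '%s: state=%s dev_loss_tmo=%s fast_io_fail_tmo=%s\n' \
      "$(basename "$rport")" \
      "$(cat "$rport/port_state" 2>/dev/null)" \
      "$(cat "$rport/dev_loss_tmo" 2>/dev/null)" \
      "$(cat "$rport/fast_io_fail_tmo" 2>/dev/null)"
  done
  return 0
}
list_fc_rports
```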
On Fri, 2010-04-16 at 08:55 +0000, jose nuno neto wrote:
> Hi
>
>
> > Can you show us a pvdisplay or verbose vgdisplay ?
> >
>
> Here goes the vgdisplay -v of one of the vgs with mirrors
>
> ###########################################################
>
> --- Volume group ---
> VG Name vg_ora_jura
> System ID
> Format lvm2
> Metadata Areas 3
> Metadata Sequence No 705
> VG Access read/write
> VG Status resizable
> MAX LV 0
> Cur LV 4
> Open LV 4
> Max PV 0
> Cur PV 3
> Act PV 3
> VG Size 52.79 GB
> PE Size 4.00 MB
> Total PE 13515
> Alloc PE / Size 12292 / 48.02 GB
> Free PE / Size 1223 / 4.78 GB
> VG UUID nttQ3x-4ecP-Q6ms-jt2u-UIs4-texj-Q9Nxdt
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch
> VG Name vg_ora_jura
> LV UUID 8oUfYn-2TrP-yS6K-pcS2-cgI4-tcv1-33dSdX
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:28
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export
> VG Name vg_ora_jura
> LV UUID NLfQT6-36TS-DRHq-PJRf-9UDv-L8mz-HjPea2
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:32
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data
> VG Name vg_ora_jura
> LV UUID VtSBIL-XvCw-23xK-NVAH-DvYn-P2sE-OkZJro
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:40
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo
> VG Name vg_ora_jura
> LV UUID KRHKBG-71Qv-YBsA-oJDt-igzP-EYaI-gPwcBX
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:48
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mimage_0
> VG Name vg_ora_jura
> LV UUID lQCOAt-aoK3-HBp1-xrQW-eh7L-6t94-CyAg5c
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:26
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mimage_1
> VG Name vg_ora_jura
> LV UUID snrnPc-8FxY-ekAk-ooNe-sBws-tuI0-cTFfj3
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:27
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mlog
> VG Name vg_ora_jura
> LV UUID ouqaCQ-Deex-iArv-xLe9-jg8b-5cLf-3SChQ1
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:25
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mlog
> VG Name vg_ora_jura
> LV UUID TmE2S0-r8ST-v624-RxUn-Qppw-2l8p-jM9EC9
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:37
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mimage_0
> VG Name vg_ora_jura
> LV UUID 8hR0bP-g9mR-OSXS-KdUM-ouZ6-KVdS-sfz51c
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:38
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mimage_1
> VG Name vg_ora_jura
> LV UUID fzdzrD-7p6d-XFkA-UHyr-CPad-F2nV-6QIU9p
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:39
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mlog
> VG Name vg_ora_jura
> LV UUID 29yLY8-N3Lv-46pN-1jze-50A2-wlhu-quuoMa
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:29
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mimage_0
> VG Name vg_ora_jura
> LV UUID 1uMTsf-wPaQ-ItTy-rpma-m2La-TGZl-C4KIU4
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:30
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mimage_1
> VG Name vg_ora_jura
> LV UUID cm8Kn7-knL3-mUPL-XFvU-geMm-Wxff-32x2va
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:31
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mlog
> VG Name vg_ora_jura
> LV UUID 811tNy-eaC5-zfZQ-1QVf-cbYP-1MIM-v6waJF
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:45
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mimage_0
> VG Name vg_ora_jura
> LV UUID aUZAer-f5rl-1f2X-9jgY-f8CJ-jdwe-F5Pmao
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:46
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mimage_1
> VG Name vg_ora_jura
> LV UUID gAEJym-sSbq-rC4P-AjpI-OibV-k3yI-lDx1I6
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:47
>
> --- Physical volumes ---
> PV Name /dev/mapper/mpath-dc1-b
> PV UUID hgjXU1-2qjo-RsmS-1XJI-d0kZ-oc4A-ZKCza8
> PV Status allocatable
> Total PE / Free PE 6749 / 605
>
> PV Name /dev/mapper/mpath-dc2-b
> PV UUID hcANwN-aeJT-PIAq-bPsf-9d3e-ylkS-GDjAGR
> PV Status allocatable
> Total PE / Free PE 6749 / 605
>
> PV Name /dev/mapper/mpath-dc2-mlog1p1
> PV UUID 4l9Qvo-SaAV-Ojlk-D1YB-Tkud-Yjg0-e5RkgJ
> PV Status allocatable
> Total PE / Free PE 17 / 13
>
>
>
> > On 4/15/10, jose nuno neto <jose.neto@liber4e.com> wrote:
> >> hellos
> >>
> >> I spent more time on this, and it seems that since LVM can't write to any
> >> PV in the volumes it has lost, it cannot record the failure of the devices
> >> and update the metadata on the other PVs. So it hangs forever.
> >>
> >> Is this right?
> >>
> >>> GoodMornings
> >>>
> >>> This is what I have on multipath.conf
> >>>
> >>> blacklist {
> >>> wwid SSun_VOL0_266DCF4A
> >>> wwid SSun_VOL0_5875CF4A
> >>> devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> >>> devnode "^hd[a-z]"
> >>> }
> >>> defaults {
> >>> user_friendly_names yes
> >>> }
> >>> devices {
> >>> device {
> >>> vendor "HITACHI"
> >>> product "OPEN-V"
> >>> path_grouping_policy group_by_node_name
> >>> failback immediate
> >>> no_path_retry fail
> >>> }
> >>> device {
> >>> vendor "IET"
> >>> product "VIRTUAL-DISK"
> >>> path_checker tur
> >>> path_grouping_policy failover
> >>> failback immediate
> >>> no_path_retry fail
> >>> }
> >>> }
> >>>
> >>> As an example, this is one LUN. It shows [features=0], so I'd say it
> >>> should fail right away:
> >>>
> >>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V
> >>> -SU
> >>> [size=26G][features=0][hwhandler=0][rw]
> >>> \_ round-robin 0 [prio=4][active]
> >>> \_ 5:0:1:0 sdu 65:64 [active][ready]
> >>> \_ 5:0:1:16384 sdac 65:192 [active][ready]
> >>> \_ 5:0:1:32768 sdas 66:192 [active][ready]
> >>> \_ 5:0:1:49152 sdba 67:64 [active][ready]
> >>> \_ round-robin 0 [prio=4][enabled]
> >>> \_ 3:0:1:0 sdaw 67:0 [active][ready]
> >>> \_ 3:0:1:16384 sdbe 67:128 [active][ready]
> >>> \_ 3:0:1:32768 sdbi 67:192 [active][ready]
> >>> \_ 3:0:1:49152 sdbm 68:0 [active][ready]
> >>>
> >>> I think they fail, since I see these messages from LVM:
> >>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> >>> vg_syb_roger-lv_syb_roger_admin
> >>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
> >>> in
> >>> vg_syb_roger-lv_syb_roger_admin
> >>>
> >>> But for some reason LVM can't remove them. Is there any option I should
> >>> set in lvm.conf?
> >>>
> >>> BestRegards
> >>> Jose
> >>>> post your multipath.conf file, you may be queuing forever ?
> >>>>
> >>>>
> >>>>
> >>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
> >>>>> Hi2all
> >>>>>
> >>>>> I'm on RHEL 5.4 with
> >>>>> lvm2-2.02.46-8.el5_4.1
> >>>>> 2.6.18-164.2.1.el5
> >>>>>
> >>>>> I have a multipathed SAN connection on which I'm building LVs.
> >>>>> It's a cluster system, and I want the LVs to fail over on failure.
> >>>>>
> >>>>> If I simulate a failure through the OS via
> >>>>> /sys/bus/scsi/devices/$DEVICE/delete
> >>>>> I get an LV failure and the service switches to the other node.
> >>>>>
> >>>>> But if I do a "real" port-down on the SAN switch, multipath reports the
> >>>>> paths down, but LVM commands hang forever and nothing gets switched.
> >>>>>
> >>>>> From the logs I see multipath failing paths, and lvm reporting "Failed
> >>>>> to remove faulty devices".
> >>>>>
> >>>>> Any ideas how I should fix it?
> >>>>>
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has
> >>>>> failed.
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
> >>>>> vg_ora_scapa-lv_ora_scapa_redo
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
> >>>>> event. Waiting...
> >>>>>
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> >>>>> paths: 0
> >>>>>
> >>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> >>>>> vg_syb_roger-lv_syb_roger_admin
> >>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
> >>>>> devices
> >>>>> in
> >>>>> vg_syb_roger-lv_syb_roger_admin
> >>>>>
> >>>>> Much Thanks
> >>>>> Jose
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [linux-lvm] Lvm hangs on San fail
2010-04-17 9:00 ` brem belguebli
@ 2010-04-19 9:21 ` jose nuno neto
0 siblings, 0 replies; 16+ messages in thread
From: jose nuno neto @ 2010-04-19 9:21 UTC (permalink / raw)
To: LVM general discussion and development
GoodMornings
In the meantime we upgraded RHEL to 5.5, and multipath now looks more
accurate, showing only 1 path per HBA. We have a 2-datacenter setup with
4 fabrics between them, 2 fabrics per datacenter.
mpath-dc2-a (360060e8004f240000000f24000000502) dm-12 HITACHI,OPEN-V -SU
[size=26G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
\_ 3:0:1:0 sdg 8:96 [active][ready]
\_ round-robin 0 [prio=1][enabled]
\_ 5:0:1:0 sdo 8:224 [active][ready]
I'll repeat the tests and look at the state you mention.
I'm using group_by_node_name because with 8 links it was a mess before; it
spreads some load between the paths, though not across all of them. Anyway,
that explains the "strange" paths; I'll see how it goes now.
Thanks
Jose
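Before repeating the pull-the-cable test, the timers brem pointed at can be
capped so a dead port fails I/O quickly instead of leaving LVM blocked. A
minimal sketch, assuming the standard sysfs fc_remote_ports layout; the
values (5s/30s) are illustrative assumptions, not recommendations:

```shell
# Cap fast_io_fail_tmo and dev_loss_tmo on every FC remote port.
# Assumes the standard fc_transport sysfs layout; writes are ignored on
# machines without FC HBAs (the glob does not expand) or without root.
cap_fc_timeouts() {
  for rport in /sys/class/fc_remote_ports/rport-*; do
    [ -d "$rport" ] || continue             # no FC remote ports: nothing to do
    echo 5  > "$rport/fast_io_fail_tmo" 2>/dev/null || true
    echo 30 > "$rport/dev_loss_tmo"     2>/dev/null || true
  done
  return 0
}
cap_fc_timeouts
```

Note these settings do not persist across reboots; they would need to be
reapplied from an init script or via the HBA driver's own configuration.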
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2010-04-19 9:22 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-24 16:14 [linux-lvm] Mirror fail/recover test jose nuno neto
2010-02-24 18:55 ` malahal
2010-02-25 10:36 ` jose nuno neto
2010-02-25 16:11 ` malahal
2010-03-02 10:31 ` [linux-lvm] Mirror fail/recover test SOLVED jose nuno neto
2010-04-14 15:03 ` [linux-lvm] Lvm hangs on San fail jose nuno neto
2010-04-14 17:38 ` Eugene Vilensky
2010-04-14 23:02 ` brem belguebli
2010-04-15 8:29 ` jose nuno neto
2010-04-15 9:32 ` Bryan Whitehead
2010-04-15 11:59 ` jose nuno neto
2010-04-15 12:41 ` Eugene Vilensky
2010-04-16 8:55 ` jose nuno neto
2010-04-16 20:15 ` Bryan Whitehead
2010-04-17 9:00 ` brem belguebli
2010-04-19 9:21 ` jose nuno neto