From: brem belguebli <brem.belguebli@gmail.com>
To: LVM general discussion and development <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] Lvm hangs on San fail
Date: Sat, 17 Apr 2010 11:00:59 +0200 [thread overview]
Message-ID: <1271494859.5449.41.camel@localhost> (raw)
In-Reply-To: <c3ddeadaff6c2abaf928cf7636469164.squirrel@fela.liber4e.com>
Hi Jose,
You have a total of 8 paths per LUN: 4 are marked active through HBA host5
and the remaining 4 are marked enabled on HBA host3 (you're on 2 different
fabrics, right?). This may be due to the fact that you use the
group_by_node_name policy; I don't know whether that mode actually load
balances across the 2 HBAs.
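A quick way to check is to look at which path group actually carries the
I/O: only the group marked [active] does, the [enabled] one is standby.
Something like this (a sketch, using the map name from your own output
below):

    multipath -ll mpath-dc2-a
    # The group listed as [active] services all I/O (round-robin inside
    # the group); the [enabled] group only takes over on failover. So
    # with group_by_node_name you get failover between the HBAs, not
    # load balancing across them.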
When you pull the fiber cable from the server (this is the test you're
doing and that's failing?), you say it times out forever.
As you're using the group_by_node_name policy, which groups paths by the
fc_transport target node name, you should look at the state of the target
ports bound to the HBA you disconnected, under
/sys/class/fc_remote_ports/rport-H:B-R (where H is your HBA host number).
If they stay in state Blocked forever, it may be because dev_loss_tmo or
fast_io_fail_tmo is set too high (both timers are located under the same
/sys/class/fc_remote_ports/rport-H:B-R directory).
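Something like this should show it (a sketch, assuming host3 is the HBA
you disconnected; adjust the rport-H:B-R names to what your system
actually lists):

    # Show the state and timers of every remote port seen through host3
    for r in /sys/class/fc_remote_ports/rport-3:*; do
        echo "$r: $(cat $r/port_state)" \
             "dev_loss_tmo=$(cat $r/dev_loss_tmo)" \
             "fast_io_fail_tmo=$(cat $r/fast_io_fail_tmo)"
    done

    # Lowering the timers makes dead paths get failed quickly instead of
    # blocking I/O forever (example values only, tune to your environment)
    echo 5  > /sys/class/fc_remote_ports/rport-3:0-1/fast_io_fail_tmo
    echo 10 > /sys/class/fc_remote_ports/rport-3:0-1/dev_loss_tmo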
I have almost the same setup with almost the same storage (OPEN-V) from a
pair of HP XPs (OEM'ed Hitachi arrays). Things are set up to use at most
4 paths per LUN (2 per fabric); some storage experts say even that is too
much. As the multipath policy I use multibus, to distribute I/O across
the 2 fabrics.
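If you want to try that, here is a minimal sketch of your HITACHI stanza
with only the policy swapped (untested against your particular array, so
verify the resulting grouping with multipath -ll afterwards):

    device {
        vendor "HITACHI"
        product "OPEN-V"
        path_grouping_policy multibus    # one group, round-robin over all paths
        failback immediate
        no_path_retry fail
    }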
Hope all this helps.
On Fri, 2010-04-16 at 08:55 +0000, jose nuno neto wrote:
> Hi
>
>
> > Can you show us a pvdisplay or verbose vgdisplay ?
> >
>
> Here goes the vgdisplay -v output of one of the VGs with mirrors
>
> ###########################################################
>
> --- Volume group ---
> VG Name vg_ora_jura
> System ID
> Format lvm2
> Metadata Areas 3
> Metadata Sequence No 705
> VG Access read/write
> VG Status resizable
> MAX LV 0
> Cur LV 4
> Open LV 4
> Max PV 0
> Cur PV 3
> Act PV 3
> VG Size 52.79 GB
> PE Size 4.00 MB
> Total PE 13515
> Alloc PE / Size 12292 / 48.02 GB
> Free PE / Size 1223 / 4.78 GB
> VG UUID nttQ3x-4ecP-Q6ms-jt2u-UIs4-texj-Q9Nxdt
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch
> VG Name vg_ora_jura
> LV UUID 8oUfYn-2TrP-yS6K-pcS2-cgI4-tcv1-33dSdX
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:28
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export
> VG Name vg_ora_jura
> LV UUID NLfQT6-36TS-DRHq-PJRf-9UDv-L8mz-HjPea2
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:32
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data
> VG Name vg_ora_jura
> LV UUID VtSBIL-XvCw-23xK-NVAH-DvYn-P2sE-OkZJro
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:40
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo
> VG Name vg_ora_jura
> LV UUID KRHKBG-71Qv-YBsA-oJDt-igzP-EYaI-gPwcBX
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:48
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mimage_0
> VG Name vg_ora_jura
> LV UUID lQCOAt-aoK3-HBp1-xrQW-eh7L-6t94-CyAg5c
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:26
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mimage_1
> VG Name vg_ora_jura
> LV UUID snrnPc-8FxY-ekAk-ooNe-sBws-tuI0-cTFfj3
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:27
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_arch_mlog
> VG Name vg_ora_jura
> LV UUID ouqaCQ-Deex-iArv-xLe9-jg8b-5cLf-3SChQ1
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:25
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mlog
> VG Name vg_ora_jura
> LV UUID TmE2S0-r8ST-v624-RxUn-Qppw-2l8p-jM9EC9
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:37
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mimage_0
> VG Name vg_ora_jura
> LV UUID 8hR0bP-g9mR-OSXS-KdUM-ouZ6-KVdS-sfz51c
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:38
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_data_mimage_1
> VG Name vg_ora_jura
> LV UUID fzdzrD-7p6d-XFkA-UHyr-CPad-F2nV-6QIU9p
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 12.00 GB
> Current LE 3072
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:39
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mlog
> VG Name vg_ora_jura
> LV UUID 29yLY8-N3Lv-46pN-1jze-50A2-wlhu-quuoMa
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:29
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mimage_0
> VG Name vg_ora_jura
> LV UUID 1uMTsf-wPaQ-ItTy-rpma-m2La-TGZl-C4KIU4
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:30
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_export_mimage_1
> VG Name vg_ora_jura
> LV UUID cm8Kn7-knL3-mUPL-XFvU-geMm-Wxff-32x2va
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 5.00 GB
> Current LE 1280
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:31
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mlog
> VG Name vg_ora_jura
> LV UUID 811tNy-eaC5-zfZQ-1QVf-cbYP-1MIM-v6waJF
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 4.00 MB
> Current LE 1
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:45
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mimage_0
> VG Name vg_ora_jura
> LV UUID aUZAer-f5rl-1f2X-9jgY-f8CJ-jdwe-F5Pmao
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:46
>
> --- Logical volume ---
> LV Name /dev/vg_ora_jura/lv_ora_jura_redo_mimage_1
> VG Name vg_ora_jura
> LV UUID gAEJym-sSbq-rC4P-AjpI-OibV-k3yI-lDx1I6
> LV Write Access read/write
> LV Status available
> # open 1
> LV Size 2.00 GB
> Current LE 512
> Segments 1
> Allocation inherit
> Read ahead sectors auto
> - currently set to 256
> Block device 253:47
>
> --- Physical volumes ---
> PV Name /dev/mapper/mpath-dc1-b
> PV UUID hgjXU1-2qjo-RsmS-1XJI-d0kZ-oc4A-ZKCza8
> PV Status allocatable
> Total PE / Free PE 6749 / 605
>
> PV Name /dev/mapper/mpath-dc2-b
> PV UUID hcANwN-aeJT-PIAq-bPsf-9d3e-ylkS-GDjAGR
> PV Status allocatable
> Total PE / Free PE 6749 / 605
>
> PV Name /dev/mapper/mpath-dc2-mlog1p1
> PV UUID 4l9Qvo-SaAV-Ojlk-D1YB-Tkud-Yjg0-e5RkgJ
> PV Status allocatable
> Total PE / Free PE 17 / 13
>
>
>
> > On 4/15/10, jose nuno neto <jose.neto@liber4e.com> wrote:
> >> hellos
> >>
> >> I spent more time on this, and it seems that since LVM can't write to
> >> any PV in the volumes it has lost, it cannot record the failure of the
> >> devices and update the metadata on the other PVs. So it hangs forever
> >>
> >> Is this right?
> >>
> >>> GoodMornings
> >>>
> >>> This is what I have on multipath.conf
> >>>
> >>> blacklist {
> >>> wwid SSun_VOL0_266DCF4A
> >>> wwid SSun_VOL0_5875CF4A
> >>> devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> >>> devnode "^hd[a-z]"
> >>> }
> >>> defaults {
> >>> user_friendly_names yes
> >>> }
> >>> devices {
> >>> device {
> >>> vendor "HITACHI"
> >>> product "OPEN-V"
> >>> path_grouping_policy group_by_node_name
> >>> failback immediate
> >>> no_path_retry fail
> >>> }
> >>> device {
> >>> vendor "IET"
> >>> product "VIRTUAL-DISK"
> >>> path_checker tur
> >>> path_grouping_policy failover
> >>> failback immediate
> >>> no_path_retry fail
> >>> }
> >>> }
> >>>
> >>> As an example, this is one LUN. It shows [features=0], so I'd say it
> >>> should fail right away
> >>>
> >>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V
> >>> -SU
> >>> [size=26G][features=0][hwhandler=0][rw]
> >>> \_ round-robin 0 [prio=4][active]
> >>> \_ 5:0:1:0 sdu 65:64 [active][ready]
> >>> \_ 5:0:1:16384 sdac 65:192 [active][ready]
> >>> \_ 5:0:1:32768 sdas 66:192 [active][ready]
> >>> \_ 5:0:1:49152 sdba 67:64 [active][ready]
> >>> \_ round-robin 0 [prio=4][enabled]
> >>> \_ 3:0:1:0 sdaw 67:0 [active][ready]
> >>> \_ 3:0:1:16384 sdbe 67:128 [active][ready]
> >>> \_ 3:0:1:32768 sdbi 67:192 [active][ready]
> >>> \_ 3:0:1:49152 sdbm 68:0 [active][ready]
> >>>
> >>> I think they fail, since I see these messages from LVM:
> >>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> >>> vg_syb_roger-lv_syb_roger_admin
> >>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
> >>> in
> >>> vg_syb_roger-lv_syb_roger_admin
> >>>
> >>> But for some reason LVM can't remove them. Is there any option I
> >>> should have in lvm.conf?
> >>>
> >>> BestRegards
> >>> Jose
> >>>> Post your multipath.conf file; you may be queuing forever?
> >>>>
> >>>>
> >>>>
> >>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
> >>>>> Hi2all
> >>>>>
> >>>>> I'm on RHEL 5.4 with
> >>>>> lvm2-2.02.46-8.el5_4.1
> >>>>> 2.6.18-164.2.1.el5
> >>>>>
> >>>>> I have a multipathed SAN connection with which I'm building LVs.
> >>>>> It's a cluster system, and I want LVs to switch on failure
> >>>>>
> >>>>> If I simulate a failure through the OS via
> >>>>> /sys/bus/scsi/devices/$DEVICE/delete
> >>>>> I get an LV failure and the service switches to the other node
> >>>>>
> >>>>> But if I do a "real" port-down on the SAN switch, multipath
> >>>>> reports the paths down, but LVM commands hang forever and nothing
> >>>>> gets switched
> >>>>>
> >>>>> From the logs I see multipath failing paths, and LVM reporting
> >>>>> "Failed to remove faulty devices"
> >>>>>
> >>>>> Any ideas how I should "fix" it?
> >>>>>
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has
> >>>>> failed.
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
> >>>>> vg_ora_scapa-lv_ora_scapa_redo
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
> >>>>> event. Waiting...
> >>>>>
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> >>>>> paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
> >>>>> paths: 0
> >>>>>
> >>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> >>>>> vg_syb_roger-lv_syb_roger_admin
> >>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty
> >>>>> devices
> >>>>> in
> >>>>> vg_syb_roger-lv_syb_roger_admin
> >>>>>
> >>>>> Much Thanks
> >>>>> Jose
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
> > --
> > Sent from my mobile device
> >
> > Regards,
> > Eugene Vilensky
> > evilensky@gmail.com
> >
> >
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
Thread overview: 16+ messages
2010-02-24 16:14 [linux-lvm] Mirror fail/recover test jose nuno neto
2010-02-24 18:55 ` malahal
2010-02-25 10:36 ` jose nuno neto
2010-02-25 16:11 ` malahal
2010-03-02 10:31 ` [linux-lvm] Mirror fail/recover test SOLVED jose nuno neto
2010-04-14 15:03 ` [linux-lvm] Lvm hangs on San fail jose nuno neto
2010-04-14 17:38 ` Eugene Vilensky
2010-04-14 23:02 ` brem belguebli
2010-04-15 8:29 ` jose nuno neto
2010-04-15 9:32 ` Bryan Whitehead
2010-04-15 11:59 ` jose nuno neto
2010-04-15 12:41 ` Eugene Vilensky
2010-04-16 8:55 ` jose nuno neto
2010-04-16 20:15 ` Bryan Whitehead
2010-04-17 9:00 ` brem belguebli [this message]
2010-04-19 9:21 ` jose nuno neto