From: brem belguebli <brem.belguebli@gmail.com>
To: LVM general discussion and development <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] Lvm hangs on San fail
Date: Sat, 17 Apr 2010 11:00:59 +0200
Message-ID: <1271494859.5449.41.camel@localhost>
In-Reply-To: <c3ddeadaff6c2abaf928cf7636469164.squirrel@fela.liber4e.com>

Hi Jose,

You have a total of 8 paths per LUN: 4 are marked active through HBA
host5 and the remaining 4 are marked enabled on HBA host3 (you're on 2
different fabrics, right?). This may be due to your use of the
group_by_node_name path grouping policy; I don't know whether this
mode actually load balances across the 2 HBAs.
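
One quick way to check (a sketch; I just picked one device from each
of your path groups in the multipath -ll output quoted below) is to
watch per-path I/O while a volume is busy; only the active group's
paths should show traffic:

   # compare I/O on a path from the active group (sdu, on host5)
   # with a path from the enabled group (sdaw, on host3)
   iostat -x 2 sdu sdaw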


When you pull the fiber cable from the server (this is the test you're
doing and that's failing?), you say it times out forever.
As you're using the group_by_node_name policy, which groups paths by
the fc_transport target node name, you should look at the state of the
target ports bound to the HBA you disconnected, under
/sys/class/fc_remote_ports/rport-H:B-R (where H is your HBA number).
If those ports stay in the Blocked state forever, it may be because
dev_loss_tmo or fast_io_fail_tmo is set too high (both timers live in
the same rport directory).
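
For example, something like this (a sketch; the exact rport instance
names depend on your topology) dumps the state and both timers for
every remote port behind host 5:

   # show port state and timeouts for each remote port on host5
   for r in /sys/class/fc_remote_ports/rport-5:*; do
       echo "$r: state=$(cat $r/port_state)" \
            "dev_loss_tmo=$(cat $r/dev_loss_tmo)" \
            "fast_io_fail_tmo=$(cat $r/fast_io_fail_tmo)"
   done

Lowering fast_io_fail_tmo (e.g. echo 5 into it) should make the
transport fail outstanding I/O sooner, so multipath and LVM see I/O
errors instead of hanging.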

I have almost the same setup with almost the same storage (OPEN-V)
from a pair of HP XP arrays (OEM'ed Hitachi), and things are set up to
use at most 4 paths per LUN (2 per fabric); some storage experts say
even that is too many. As the multipath policy I use multibus, to
distribute I/O across the 2 fabrics.
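
For reference, a sketch of the device stanza I mean (everything as in
your config, only the grouping policy changed):

   devices {
          device {
                   vendor                "HITACHI"
                   product               "OPEN-V"
                   path_grouping_policy  multibus
                   failback              immediate
                   no_path_retry         fail
          }
   }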

Hope all this helps.

On Fri, 2010-04-16 at 08:55 +0000, jose nuno neto wrote:
> Hi
> 
> 
> > Can you show us a pvdisplay or verbose vgdisplay ?
> >
> 
> Here goes the vgdisplay -v of one of the vgs with mirrors
> 
> ###########################################################
> 
> --- Volume group ---
>   VG Name               vg_ora_jura
>   System ID
>   Format                lvm2
>   Metadata Areas        3
>   Metadata Sequence No  705
>   VG Access             read/write
>   VG Status             resizable
>   MAX LV                0
>   Cur LV                4
>   Open LV               4
>   Max PV                0
>   Cur PV                3
>   Act PV                3
>   VG Size               52.79 GB
>   PE Size               4.00 MB
>   Total PE              13515
>   Alloc PE / Size       12292 / 48.02 GB
>   Free  PE / Size       1223 / 4.78 GB
>   VG UUID               nttQ3x-4ecP-Q6ms-jt2u-UIs4-texj-Q9Nxdt
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_arch
>   VG Name                vg_ora_jura
>   LV UUID                8oUfYn-2TrP-yS6K-pcS2-cgI4-tcv1-33dSdX
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:28
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_export
>   VG Name                vg_ora_jura
>   LV UUID                NLfQT6-36TS-DRHq-PJRf-9UDv-L8mz-HjPea2
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:32
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_data
>   VG Name                vg_ora_jura
>   LV UUID                VtSBIL-XvCw-23xK-NVAH-DvYn-P2sE-OkZJro
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                12.00 GB
>   Current LE             3072
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:40
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_redo
>   VG Name                vg_ora_jura
>   LV UUID                KRHKBG-71Qv-YBsA-oJDt-igzP-EYaI-gPwcBX
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                2.00 GB
>   Current LE             512
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:48
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_arch_mimage_0
>   VG Name                vg_ora_jura
>   LV UUID                lQCOAt-aoK3-HBp1-xrQW-eh7L-6t94-CyAg5c
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:26
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_arch_mimage_1
>   VG Name                vg_ora_jura
>   LV UUID                snrnPc-8FxY-ekAk-ooNe-sBws-tuI0-cTFfj3
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:27
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_arch_mlog
>   VG Name                vg_ora_jura
>   LV UUID                ouqaCQ-Deex-iArv-xLe9-jg8b-5cLf-3SChQ1
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                4.00 MB
>   Current LE             1
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:25
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_data_mlog
>   VG Name                vg_ora_jura
>   LV UUID                TmE2S0-r8ST-v624-RxUn-Qppw-2l8p-jM9EC9
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                4.00 MB
>   Current LE             1
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:37
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_data_mimage_0
>   VG Name                vg_ora_jura
>   LV UUID                8hR0bP-g9mR-OSXS-KdUM-ouZ6-KVdS-sfz51c
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                12.00 GB
>   Current LE             3072
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:38
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_data_mimage_1
>   VG Name                vg_ora_jura
>   LV UUID                fzdzrD-7p6d-XFkA-UHyr-CPad-F2nV-6QIU9p
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                12.00 GB
>   Current LE             3072
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:39
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_export_mlog
>   VG Name                vg_ora_jura
>   LV UUID                29yLY8-N3Lv-46pN-1jze-50A2-wlhu-quuoMa
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                4.00 MB
>   Current LE             1
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:29
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_export_mimage_0
>   VG Name                vg_ora_jura
>   LV UUID                1uMTsf-wPaQ-ItTy-rpma-m2La-TGZl-C4KIU4
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:30
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_export_mimage_1
>   VG Name                vg_ora_jura
>   LV UUID                cm8Kn7-knL3-mUPL-XFvU-geMm-Wxff-32x2va
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                5.00 GB
>   Current LE             1280
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:31
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_redo_mlog
>   VG Name                vg_ora_jura
>   LV UUID                811tNy-eaC5-zfZQ-1QVf-cbYP-1MIM-v6waJF
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                4.00 MB
>   Current LE             1
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:45
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_redo_mimage_0
>   VG Name                vg_ora_jura
>   LV UUID                aUZAer-f5rl-1f2X-9jgY-f8CJ-jdwe-F5Pmao
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                2.00 GB
>   Current LE             512
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:46
> 
>   --- Logical volume ---
>   LV Name                /dev/vg_ora_jura/lv_ora_jura_redo_mimage_1
>   VG Name                vg_ora_jura
>   LV UUID                gAEJym-sSbq-rC4P-AjpI-OibV-k3yI-lDx1I6
>   LV Write Access        read/write
>   LV Status              available
>   # open                 1
>   LV Size                2.00 GB
>   Current LE             512
>   Segments               1
>   Allocation             inherit
>   Read ahead sectors     auto
>   - currently set to     256
>   Block device           253:47
> 
>   --- Physical volumes ---
>   PV Name               /dev/mapper/mpath-dc1-b
>   PV UUID               hgjXU1-2qjo-RsmS-1XJI-d0kZ-oc4A-ZKCza8
>   PV Status             allocatable
>   Total PE / Free PE    6749 / 605
> 
>   PV Name               /dev/mapper/mpath-dc2-b
>   PV UUID               hcANwN-aeJT-PIAq-bPsf-9d3e-ylkS-GDjAGR
>   PV Status             allocatable
>   Total PE / Free PE    6749 / 605
> 
>   PV Name               /dev/mapper/mpath-dc2-mlog1p1
>   PV UUID               4l9Qvo-SaAV-Ojlk-D1YB-Tkud-Yjg0-e5RkgJ
>   PV Status             allocatable
>   Total PE / Free PE    17 / 13
> 
> 
> 
> > On 4/15/10, jose nuno neto <jose.neto@liber4e.com> wrote:
> >> hellos
> >>
> >> I spent more time on this, and it seems that since LVM can't write
> >> to any PV on the volumes it has lost, it cannot record the device
> >> failures and update the metadata on the other PVs. So it hangs
> >> forever.
> >>
> >> Is this right?
> >>
> >>> Good mornings
> >>>
> >>> This is what I have on multipath.conf
> >>>
> >>> blacklist {
> >>>         wwid SSun_VOL0_266DCF4A
> >>>         wwid SSun_VOL0_5875CF4A
> >>>         devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> >>>         devnode "^hd[a-z]"
> >>> }
> >>> defaults {
> >>>                 user_friendly_names             yes
> >>> }
> >>> devices {
> >>>        device {
> >>>                 vendor                          "HITACHI"
> >>>                 product                         "OPEN-V"
> >>>                 path_grouping_policy            group_by_node_name
> >>>                 failback                        immediate
> >>>                 no_path_retry                   fail
> >>>        }
> >>>        device {
> >>>                 vendor                          "IET"
> >>>                 product                         "VIRTUAL-DISK"
> >>>                 path_checker                    tur
> >>>                 path_grouping_policy            failover
> >>>                 failback                        immediate
> >>>                 no_path_retry                   fail
> >>>        }
> >>> }
> >>>
> >>> As an example, this is one LUN. It shows [features=0], so I'd say
> >>> it should fail right away.
> >>>
> >>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V -SU
> >>> [size=26G][features=0][hwhandler=0][rw]
> >>> \_ round-robin 0 [prio=4][active]
> >>>  \_ 5:0:1:0     sdu  65:64  [active][ready]
> >>>  \_ 5:0:1:16384 sdac 65:192 [active][ready]
> >>>  \_ 5:0:1:32768 sdas 66:192 [active][ready]
> >>>  \_ 5:0:1:49152 sdba 67:64  [active][ready]
> >>> \_ round-robin 0 [prio=4][enabled]
> >>>  \_ 3:0:1:0     sdaw 67:0   [active][ready]
> >>>  \_ 3:0:1:16384 sdbe 67:128 [active][ready]
> >>>  \_ 3:0:1:32768 sdbi 67:192 [active][ready]
> >>>  \_ 3:0:1:49152 sdbm 68:0   [active][ready]
> >>>
> >>> I think they fail, since I see these messages from LVM:
> >>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in vg_syb_roger-lv_syb_roger_admin
> >>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in vg_syb_roger-lv_syb_roger_admin
> >>>
> >>> But for some reason LVM can't remove them; is there any option I
> >>> should set in lvm.conf?
> >>>
> >>> Best regards
> >>> Jose
> >>>> Post your multipath.conf file; you may be queuing forever?
> >>>>
> >>>>
> >>>>
> >>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
> >>>>> Hi2all
> >>>>>
> >>>>> I'm on RHEL 5.4 with
> >>>>> lvm2-2.02.46-8.el5_4.1
> >>>>> 2.6.18-164.2.1.el5
> >>>>>
> >>>>> I have a multipathed SAN connection on which I'm building LVs.
> >>>>> It's a cluster system, and I want LVs to switch over on failure.
> >>>>>
> >>>>> If I simulate a failure through the OS via
> >>>>> /sys/bus/scsi/devices/$DEVICE/delete
> >>>>> I get an LV failure and the service switches to the other node.
> >>>>>
> >>>>> But if I do a "real" port-down on the SAN switch, multipath
> >>>>> reports the paths down, LVM commands hang forever, and nothing
> >>>>> gets switched.
> >>>>>
> >>>>> From the logs I see multipath failing paths, and LVM reporting
> >>>>> "Failed to remove faulty devices".
> >>>>>
> >>>>> Any ideas on how I should "fix" it?
> >>>>>
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in vg_ora_scapa-lv_ora_scapa_redo
> >>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an event.  Waiting...
> >>>>>
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active paths: 0
> >>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active paths: 0
> >>>>>
> >>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in vg_syb_roger-lv_syb_roger_admin
> >>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in vg_syb_roger-lv_syb_roger_admin
> >>>>>
> >>>>> Much Thanks
> >>>>> Jose
> >>>
> >>>
> >>
> >
> > --
> > Sent from my mobile device
> >
> > Regards,
> > Eugene Vilensky
> > evilensky@gmail.com
> >
> 

Thread overview: 16+ messages
2010-02-24 16:14 [linux-lvm] Mirror fail/recover test jose nuno neto
2010-02-24 18:55 ` malahal
2010-02-25 10:36   ` jose nuno neto
2010-02-25 16:11     ` malahal
2010-03-02 10:31       ` [linux-lvm] Mirror fail/recover test SOLVED jose nuno neto
2010-04-14 15:03         ` [linux-lvm] Lvm hangs on San fail jose nuno neto
2010-04-14 17:38           ` Eugene Vilensky
2010-04-14 23:02           ` brem belguebli
2010-04-15  8:29             ` jose nuno neto
2010-04-15  9:32               ` Bryan Whitehead
2010-04-15 11:59               ` jose nuno neto
2010-04-15 12:41                 ` Eugene Vilensky
2010-04-16  8:55                   ` jose nuno neto
2010-04-16 20:15                     ` Bryan Whitehead
2010-04-17  9:00                     ` brem belguebli [this message]
2010-04-19  9:21                       ` jose nuno neto
