From: Bryan Whitehead
Date: Thu, 15 Apr 2010 02:32:13 -0700
Subject: Re: [linux-lvm] Lvm hangs on San fail
To: LVM general discussion and development

Can you post the output of pvdisplay? Also the output of multipath when
the port is down?

If your multipath output is still showing all paths [active][ready] when
you shut a port down, you might need to change the path_checker option.
I don't have a Hitachi array, but readsector0 (the default) did not work
for me; directio does.

This could be LVM seeing that I/O is timing out while multipath isn't
failing the dead path.

On Thu, Apr 15, 2010 at 1:29 AM, jose nuno neto wrote:
> Good Morning
>
> This is what I have in multipath.conf:
>
> blacklist {
>         wwid SSun_VOL0_266DCF4A
>         wwid SSun_VOL0_5875CF4A
>         devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>         devnode "^hd[a-z]"
> }
> defaults {
>         user_friendly_names     yes
> }
> devices {
>         device {
>                 vendor                  "HITACHI"
>                 product                 "OPEN-V"
>                 path_grouping_policy    group_by_node_name
>                 failback                immediate
>                 no_path_retry           fail
>         }
>         device {
>                 vendor                  "IET"
>                 product                 "VIRTUAL-DISK"
>                 path_checker            tur
>                 path_grouping_policy    failover
>                 failback                immediate
>                 no_path_retry           fail
>         }
> }
>
> As an example, this is one LUN.
It shows [features=0], so I'd say it should fail right away:
>
> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V      -SU
> [size=26G][features=0][hwhandler=0][rw]
> \_ round-robin 0 [prio=4][active]
>  \_ 5:0:1:0     sdu  65:64  [active][ready]
>  \_ 5:0:1:16384 sdac 65:192 [active][ready]
>  \_ 5:0:1:32768 sdas 66:192 [active][ready]
>  \_ 5:0:1:49152 sdba 67:64  [active][ready]
> \_ round-robin 0 [prio=4][enabled]
>  \_ 3:0:1:0     sdaw 67:0   [active][ready]
>  \_ 3:0:1:16384 sdbe 67:128 [active][ready]
>  \_ 3:0:1:32768 sdbi 67:192 [active][ready]
>  \_ 3:0:1:49152 sdbm 68:0   [active][ready]
>
> I think they do fail, since I see these messages from LVM:
>
> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
> vg_syb_roger-lv_syb_roger_admin
> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
> vg_syb_roger-lv_syb_roger_admin
>
> But for some reason LVM can't remove them. Is there any option I should
> have in lvm.conf?
>
> Best Regards
> Jose
>
>> Post your multipath.conf file; you may be queuing forever?
>>
>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>>> Hi all,
>>>
>>> I'm on RHEL 5.4 with
>>> lvm2-2.02.46-8.el5_4.1
>>> 2.6.18-164.2.1.el5
>>>
>>> I have a multipathed SAN connection on which I'm building LVs.
>>> It's a cluster system, and I want the LVs to switch on failure.
>>>
>>> If I simulate a failure through the OS via
>>> /sys/bus/scsi/devices/$DEVICE/delete
>>> I get an LV failure and the service switches to the other node.
>>>
>>> But if I do a "real" port-down on the SAN switch, multipath reports the
>>> paths down, but LVM commands hang forever and nothing gets switched.
>>>
>>> From the logs I see multipath failing paths, and LVM "Failed to remove
>>> faulty devices".
>>>
>>> Any ideas how I should "fix" it?
>>>
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_ora_scapa-lv_ora_scapa_redo
>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>>> event. Waiting...
>>>
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>> paths: 0
>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>> paths: 0
>>>
>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>> vg_syb_roger-lv_syb_roger_admin
>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>>> in vg_syb_roger-lv_syb_roger_admin
>>>
>>> Much Thanks
>>> Jose
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
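
As a rough sketch of the path_checker change I mean, here is the quoted
HITACHI stanza with an explicit path_checker line added. I don't have this
array, so treat the values as illustrative, not a known-good configuration:

devices {
        device {
                vendor                  "HITACHI"
                product                 "OPEN-V"
                # probe paths with a direct-I/O read instead of readsector0;
                # on some arrays this notices a dead port more reliably
                path_checker            directio
                path_grouping_policy    group_by_node_name
                failback                immediate
                # fail I/O instead of queuing forever when all paths are gone
                no_path_retry           fail
        }
}

After editing multipath.conf you would restart multipathd (e.g.
"service multipathd restart" on RHEL 5) and re-check "multipath -ll" with a
switch port down; the dead paths should then no longer show [active][ready].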