* multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
@ 2009-09-30 10:33 John Hughes
  2009-09-30 15:42 ` malahal
  2009-09-30 16:22 ` Moger, Babu
  0 siblings, 2 replies; 9+ messages in thread
From: John Hughes @ 2009-09-30 10:33 UTC (permalink / raw)
  To: device-mapper development

I want to turn queue_if_no_path off and use

                polling_interval        5
                no_path_retry           5

because I've had problems with things hanging when a lun "vanishes" (I 
deleted it from my external raid box).
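
For reference, here is roughly what I have in /etc/multipath.conf (a
sketch of my config; I am assuming the defaults section is the right
place for these):

    defaults {
            polling_interval        5
            no_path_retry           5
    }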

But whatever I put in /etc/multipath.conf when I do a "multipath -l" or 
"multipath-ll" it shows:

360024e80005b3add000001b64ab05c87dm-28 DELL    ,MD3000        
[size=68G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=3][active]
 \_ 3:0:1:13 sdad 65:208 [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 4:0:0:13 sdas 66:192 [active][ghost]


How on earth do I turn this damn thing off?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
  2009-09-30 10:33 multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off? John Hughes
@ 2009-09-30 15:42 ` malahal
  2009-10-01  6:24   ` Hannes Reinecke
  2009-09-30 16:22 ` Moger, Babu
  1 sibling, 1 reply; 9+ messages in thread
From: malahal @ 2009-09-30 15:42 UTC (permalink / raw)
  To: dm-devel

John Hughes [john@Calva.COM] wrote:
> I want to turn queue_if_no_path off and use
>
>                polling_interval        5
>                no_path_retry           5
>
> because I've had problems with things hanging when a lun "vanishes" (I 
> deleted it from my external raid box).
>
> But whatever I put in /etc/multipath.conf when I do a "multipath -l" or 
> "multipath-ll" it shows:

Did you reload the mapper table?
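
For example, something like this should do it (a sketch; the exact
commands depend on your multipath-tools version):

    # have multipathd re-read multipath.conf and reload its maps
    multipathd -k"reconfigure"

    # or rebuild the maps directly
    multipath -r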

> 360024e80005b3add000001b64ab05c87dm-28 DELL    ,MD3000        
> [size=68G][features=1 queue_if_no_path][hwhandler=1 rdac]
> \_ round-robin 0 [prio=3][active]
> \_ 3:0:1:13 sdad 65:208 [active][ready]
> \_ round-robin 0 [prio=0][enabled]
> \_ 4:0:0:13 sdas 66:192 [active][ghost]
>
>
> How on earth do I turn this damn thing off?
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
  2009-09-30 10:33 multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off? John Hughes
  2009-09-30 15:42 ` malahal
@ 2009-09-30 16:22 ` Moger, Babu
  2009-09-30 17:11   ` John Hughes
  2009-10-01  9:05   ` John Hughes
  1 sibling, 2 replies; 9+ messages in thread
From: Moger, Babu @ 2009-09-30 16:22 UTC (permalink / raw)
  To: device-mapper development

You will see "features=1 queue_if_no_path" if you have one of the following entries in your multipath.conf:

1. features		"1 queue_if_no_path"

   or

2. no_path_retry           5


The first option will queue the I/O forever. The second option will queue the I/O only for 5 polling intervals. Either way, queue_if_no_path will show up as enabled if one of the above is set.

Try "no_path_retry    fail"; then you will not see this feature in the "multipath -ll" output.

Thanks
Babu Moger 

> -----Original Message-----
> From: dm-devel-bounces@redhat.com [mailto:dm-devel-bounces@redhat.com]
> On Behalf Of John Hughes
> Sent: Wednesday, September 30, 2009 5:33 AM
> To: device-mapper development
> Subject: [dm-devel] multipath - AAArgh! How do I turn "features=1
> queue_if_no_path" off?
> 
> I want to turn queue_if_no_path off and use
> 
>                 polling_interval        5
>                 no_path_retry           5
> 
> because I've had problems with things hanging when a lun "vanishes" (I
> deleted it from my external raid box).
> 
> But whatever I put in /etc/multipath.conf when I do a "multipath -l" or
> "multipath-ll" it shows:
> 
> 360024e80005b3add000001b64ab05c87dm-28 DELL    ,MD3000
> [size=68G][features=1 queue_if_no_path][hwhandler=1 rdac]
> \_ round-robin 0 [prio=3][active]
>  \_ 3:0:1:13 sdad 65:208 [active][ready]
> \_ round-robin 0 [prio=0][enabled]
>  \_ 4:0:0:13 sdas 66:192 [active][ghost]
> 
> 
> How on earth do I turn this damn thing off?
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
  2009-09-30 16:22 ` Moger, Babu
@ 2009-09-30 17:11   ` John Hughes
  2009-10-01  9:05   ` John Hughes
  1 sibling, 0 replies; 9+ messages in thread
From: John Hughes @ 2009-09-30 17:11 UTC (permalink / raw)
  To: device-mapper development

Moger, Babu wrote:
> You will see "features=1 queue_if_no_path" if you have one of the following entry in your multipath.conf.
>
> 1. Features		"1 queue_if_no_path"
>
>   Or
>
> 2. no_path_retry           5
>   
Ah, that was the problem. Now I have "features=0".

I did my silly test again (deleting a LUN that was in use from the
external cabinet) and the system kept working. The errors on the dm-xxx
device propagated back to mdadm, which marked it as failed, and all was
well.

Except that I'm still getting lots of I/O errors:

[ 9019.822396] Buffer I/O error on device dm-17, logical block 17790463
[ 9019.822432] Buffer I/O error on device dm-17, logical block 0
[ 9019.822473] Buffer I/O error on device dm-17, logical block 0
[ 9020.735190] sd 4:0:0:1: queueing MODE_SELECT command.
[ 9020.771075] sd 4:0:0:1: MODE_SELECT failed with sense 0x59100.
[ 9020.793828] sd 4:0:0:1: [sdah] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[ 9020.793894] sd 4:0:0:1: [sdah] Sense Key : Illegal Request [current] 
[ 9020.793932] sd 4:0:0:1: [sdah] Add. Sense: Logical unit not supported
[ 9020.793968] end_request: I/O error, dev sdah, sector 0
[ 9020.793995] device-mapper: multipath: Failing path 66:16.

... endlessly repeating

I was able to stop the errors by (commands sketched below):

   1. removing the failed multipath device from the mdadm array
   2. using the multipath -f command to flush the deleted disk.
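
For the record, roughly this (a sketch; the md device and dm device
below stand in for my real ones):

    # 1. drop the failed leg from the md array
    mdadm /dev/md0 --fail /dev/dm-17 --remove /dev/dm-17

    # 2. flush the now-unreferenced multipath map
    multipath -f 360024e80005b3add000001b64ab05c87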

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
  2009-09-30 15:42 ` malahal
@ 2009-10-01  6:24   ` Hannes Reinecke
  2009-10-01  8:55     ` John Hughes
  0 siblings, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2009-10-01  6:24 UTC (permalink / raw)
  To: dm-devel

malahal@us.ibm.com wrote:
> John Hughes [john@Calva.COM] wrote:
>> I want to turn queue_if_no_path off and use
>>
>>                polling_interval        5
>>                no_path_retry           5
>>
>> because I've had problems with things hanging when a lun "vanishes" (I 
>> deleted it from my external raid box).
>>
>> But whatever I put in /etc/multipath.conf when I do a "multipath -l" or 
>> "multipath-ll" it shows:
> 
> Did you reload the mapper table?
> 
>> 360024e80005b3add000001b64ab05c87dm-28 DELL    ,MD3000        
>> [size=68G][features=1 queue_if_no_path][hwhandler=1 rdac]
>> \_ round-robin 0 [prio=3][active]
>> \_ 3:0:1:13 sdad 65:208 [active][ready]
>> \_ round-robin 0 [prio=0][enabled]
>> \_ 4:0:0:13 sdas 66:192 [active][ghost]
>>
Which is entirely correct. The 'queue_if_no_path' flag _has_ to
be set here, as we do want to retry failed paths, if only for
a limited number of retries.

The in-kernel dm-multipath module should handle the situation correctly
and switch off the queue_if_no_path flag (= pass I/O errors upwards)
when the number of retries is exhausted.

You can only switch the flag off by setting 'no_path_retry fail'.
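
For a map that is already running you can also toggle the flag on the
fly with a dm message (sketch; substitute your own map name):

    # disable queueing on an existing map
    dmsetup message 360024e80005b3add000001b64ab05c87 0 "fail_if_no_path"

    # and to turn it back on
    dmsetup message 360024e80005b3add000001b64ab05c87 0 "queue_if_no_path"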

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
  2009-10-01  6:24   ` Hannes Reinecke
@ 2009-10-01  8:55     ` John Hughes
  2009-10-01  9:41       ` Diedrich Ehlerding
  2009-10-01  9:44       ` Hannes Reinecke
  0 siblings, 2 replies; 9+ messages in thread
From: John Hughes @ 2009-10-01  8:55 UTC (permalink / raw)
  To: device-mapper development

Hannes Reinecke wrote:
> malahal@us.ibm.com wrote:
>> John Hughes [john@Calva.COM] wrote:
>>> I want to turn queue_if_no_path off and use
>>>
>>>                polling_interval        5
>>>                no_path_retry           5
>>>
>>> because I've had problems with things hanging when a lun "vanishes" 
>>> (I deleted it from my external raid box).
>>>
>>> But whatever I put in /etc/multipath.conf when I do a "multipath -l" 
>>> or "multipath-ll" it shows:
>>
>> Did you reload the mapper table?
>>
>>> 360024e80005b3add000001b64ab05c87dm-28 DELL    ,MD3000        
>>> [size=68G][features=1 queue_if_no_path][hwhandler=1 rdac]
>>> \_ round-robin 0 [prio=3][active]
>>> \_ 3:0:1:13 sdad 65:208 [active][ready]
>>> \_ round-robin 0 [prio=0][enabled]
>>> \_ 4:0:0:13 sdas 66:192 [active][ghost]
>>>
> Which is entirely correct. The 'queue_if_no_path' flag _has_ to
> be set here as we do want to retry failed paths, if only for
> a limited number of retries.
>
> The in-kernel dm-multipath module should handle the situation correctly
> and switch off the queue_if_no_path flag (= pass I/O errors upwards)
> when the number of retries is exhausted.
As far as I can tell it retries forever (even with polling_interval 5
and no_path_retry 5). The mdadm raid10 built on top of the multipath
devices hangs; even /proc/mdstat hangs.

You're saying that without queue_if_no_path multipath basically won't
work - mdadm will see I/O errors on multipath devices if a path fails?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
  2009-09-30 16:22 ` Moger, Babu
  2009-09-30 17:11   ` John Hughes
@ 2009-10-01  9:05   ` John Hughes
  1 sibling, 0 replies; 9+ messages in thread
From: John Hughes @ 2009-10-01  9:05 UTC (permalink / raw)
  To: device-mapper development

Moger, Babu wrote:
> You will see "features=1 queue_if_no_path" if you have one of the following entry in your multipath.conf.
>
>
> 2. no_path_retry           5
>   
As it says in the multipath.conf manpage:

       features         Specify  any  device-mapper  features  to be used. The
                        most common of these features  is  1  queue_if_no_path
                        Note  that  this can also be set via the no_path_retry
                        keyword.

I must learn to read the fucking manual.
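
So, if I am reading the manpage right, these two lines (a sketch) are
just two ways of asking for the same indefinite queueing:

    features        "1 queue_if_no_path"
    # ... which should be equivalent to ...
    no_path_retry   queue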

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
  2009-10-01  8:55     ` John Hughes
@ 2009-10-01  9:41       ` Diedrich Ehlerding
  2009-10-01  9:44       ` Hannes Reinecke
  1 sibling, 0 replies; 9+ messages in thread
From: Diedrich Ehlerding @ 2009-10-01  9:41 UTC (permalink / raw)
  To: device-mapper development

John Hughes wrote:  

> and no_path_retry 5).   The mdadm raid10 built on top of the multipath
> devices hangs, even /proc/mdstat hangs.
> 
> You're saying that without queue_if_no_path multipath basicly won't
> work 
> - mdadm will see I/O errors on multipath devices if a path fails?

No. It will see IO errors immediately only if _all_ paths fail. With
no_path_retry nn, the intended behaviour is to wait nn cycles to see if
the array (at least one path) reappears, and to fail the IO after nn
cycles.

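For example (assuming the retries are counted in polling intervals, as
documented): with polling_interval 5 and no_path_retry 5, queued IO
should be failed roughly 5 * 5 = 25 seconds after the last path goes
away.
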
What you report here is exactly what I observed too (my distro was
SLES10). Apparently, some versions of multipath-tools are buggy with
respect to the no_path_retry count and react as if you had used
"no_path_retry queue". AFAIR, some weeks ago Hannes Reinecke stated
here that this is indeed a bug in some SuSE versions of
multipath-tools.

I succeeded in setting up mdadm mirrors (and also lvm mirrors, on a
SLES11 machine) on top of dm-multipath by explicitly using
"no_path_retry fail" (edit multipath.conf and restart multipathd
afterwards). With these settings, path failures are handled as usual,
and I could survive a (simulated) raid array failure (i.e., all paths
failed). "no_path_retry fail" may contradict commercial raid system
manufacturers' recommendations ... but it seems to work for me.

Another idea which you might take into account: I do not know the raid
array you are using. My attempts were done with EMC Clariion arrays. If
I simulate an array failure by just removing the host's access rights
to a lun within the array, I get a different behaviour depending on the
lun address - on a Clariion, removing, say, scsi addresses 2:0:0:0 and
3:0:0:0 is not exactly the same as removing 2:0:0:1 and 3:0:0:1. A
Clariion exposes some kind of dummy lun 0 ("LUNZ") to all hosts which
don't have access rights to any real lun visible at address 0. The
consequence is that removing a real lun 0 will not result in there
being no lun 0 at the scsi level; instead, it results in a not_ready
lun 0 (i.e. 2:0:0:0 and 3:0:0:0 are still visible at the scsi layer!).
Therefore I recommend simulating site failures with luns != 0.

best regards
Diedrich

-- 
Diedrich Ehlerding, Fujitsu Technology Solutions GmbH, R GE TIS N IC2 
Hildesheimer Str 25, D-30880 Laatzen
Fon +49 511 8489-1806, Fax -251806, Mobil +49 173 2464758
Firmenangaben: http://de.ts.fujitsu.com/imprint.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off?
  2009-10-01  8:55     ` John Hughes
  2009-10-01  9:41       ` Diedrich Ehlerding
@ 2009-10-01  9:44       ` Hannes Reinecke
  1 sibling, 0 replies; 9+ messages in thread
From: Hannes Reinecke @ 2009-10-01  9:44 UTC (permalink / raw)
  To: device-mapper development

On Thu, Oct 01, 2009 at 10:55:12AM +0200, John Hughes wrote:
> Hannes Reinecke wrote:
>> malahal@us.ibm.com wrote:
>>> John Hughes [john@Calva.COM] wrote:
>>>> I want to turn queue_if_no_path off and use
>>>>
>>>>                polling_interval        5
>>>>                no_path_retry           5
>>>>
>>>> because I've had problems with things hanging when a lun "vanishes" (I 
>>>> deleted it from my external raid box).
>>>>
>>>> But whatever I put in /etc/multipath.conf when I do a "multipath -l" or 
>>>> "multipath-ll" it shows:
>>>> 360024e80005b3add000001b64ab05c87dm-28 DELL    ,MD3000        
>>>> [size=68G][features=1 queue_if_no_path][hwhandler=1 rdac]
>>>> \_ round-robin 0 [prio=3][active]
>>>> \_ 3:0:1:13 sdad 65:208 [active][ready]
>>>> \_ round-robin 0 [prio=0][enabled]
>>>> \_ 4:0:0:13 sdas 66:192 [active][ghost]
>>>>
>> Which is entirely correct. The 'queue_if_no_path' flag _has_ to
>> be set here as we do want to retry failed paths, if only for
>> a limited number of retries.
>>
>> The in-kernel dm-multipath module should handle the situation correctly
>> and switch off the queue_if_no_path flag (= pass I/O errors upwards)
>> when the number of retries is exhausted.
> As far as I can tell it retries forever (even with polling_interval 5 and 
> no_path_retry 5).   The mdadm raid10 built on top of the multipath devices 
> hangs, even /proc/mdstat hangs.
>
> You're saying that without queue_if_no_path multipath basically won't work - 
> mdadm will see I/O errors on multipath devices if a path fails?
>
If _all_ paths fail. Note the 'no_path' bit :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2009-10-01  9:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-30 10:33 multipath - AAArgh! How do I turn "features=1 queue_if_no_path" off? John Hughes
2009-09-30 15:42 ` malahal
2009-10-01  6:24   ` Hannes Reinecke
2009-10-01  8:55     ` John Hughes
2009-10-01  9:41       ` Diedrich Ehlerding
2009-10-01  9:44       ` Hannes Reinecke
2009-09-30 16:22 ` Moger, Babu
2009-09-30 17:11   ` John Hughes
2009-10-01  9:05   ` John Hughes
