* Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
@ 2016-12-01  1:06 peng.liang5
  2016-12-01 16:44 ` Benjamin Marzinski
  0 siblings, 1 reply; 9+ messages in thread
From: peng.liang5 @ 2016-12-01  1:06 UTC (permalink / raw)
  To: bmarzins; +Cc: zhang.kai16, dm-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 2082 bytes --]

If fast_io_fail_tmo isn't set, select_fast_io_fail will use DEFAULT_FAST_IO_FAIL,
so multipath will not apply the 600-second limit to dev_loss_tmo.

I also think that using MP_FAST_IO_FAIL_UNSET as the condition is meaningless, because
multipath runs select_fast_io_fail even if fast_io_fail_tmo is not set.






Original Message

From: Benjamin Marzinski
To: 彭亮10137102
Cc: <dm-devel@redhat.com>, 张凯10072500
Date: 2016-11-29 08:30
Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue





On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@zte.com.cn wrote:
> From: PengLiang <peng.liang5@zte.com.cn>
> 
> If no_path_retry set to queue, we should make sure dev_loss_tmo update to MAX_DEV_LOSS_TMO.
> But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.

Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set? Multipath
was using this limit, since the underlying system uses it.

-Ben

> 
> Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
> ---
>  libmultipath/discovery.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index aaa915c..05b0842 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>                  goto out;
>              }
>          }
> -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
> +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
> +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
>          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
>              "fast_io_fail is not set",
>              rport_id, DEFAULT_DEV_LOSS_TMO);
> -- 
> 2.8.1.windows.1

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
  2016-12-01  1:06 [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue peng.liang5
@ 2016-12-01 16:44 ` Benjamin Marzinski
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Marzinski @ 2016-12-01 16:44 UTC (permalink / raw)
  To: peng.liang5; +Cc: zhang.kai16, dm-devel

On Thu, Dec 01, 2016 at 09:06:14AM +0800, peng.liang5@zte.com.cn wrote:
>    If fast_io_fail_tmo isn't set, it will be use the DEFAULT_FAST_IO_FAIL
>    in select_fast_io_fail.
> 
>    So, multipath will not run the limited of dev_loss_tmo to 600.

Yes, but the kernel will. With your patch installed, if I disable
fast_io_fail_tmo and set no_path_retry to queue, I get these messages

Dec 01 04:19:02 | rport-11:0-0: failed to set dev_loss_tmo to
2147483647, error 22

Because if fast_io_fail_tmo is not set, the kernel itself will bar
dev_loss_tmo from being above 600 seconds. Also, even if you could set
dev_loss_tmo to its maximum without fast_io_fail_tmo set, you would
never want to, because you would break multipath.
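
The rejection described above can be modelled with a minimal sketch in C. This is not the kernel's code: the function name, the FAST_IO_FAIL_DISABLED marker, and the constant name are hypothetical and chosen only for illustration. The sketch assumes the rule exactly as stated in this message (no fast_io_fail_tmo means dev_loss_tmo above 600 is refused) and that "error 22" in the log corresponds to EINVAL.

```c
#include <errno.h>

/* Hypothetical marker meaning "fast_io_fail_tmo is not set / disabled". */
#define FAST_IO_FAIL_DISABLED (-1)

/* The 600-second ceiling discussed in this thread. */
#define UNGUARDED_DEV_LOSS_MAX 600u

/*
 * Model of the acceptance rule: returns 0 if the requested dev_loss_tmo
 * would be accepted, or -EINVAL if it would be refused.
 */
static int model_set_dev_loss_tmo(int fast_io_fail_tmo,
				  unsigned int dev_loss_tmo)
{
	if (fast_io_fail_tmo == FAST_IO_FAIL_DISABLED &&
	    dev_loss_tmo > UNGUARDED_DEV_LOSS_MAX)
		return -EINVAL;
	return 0;
}
```

Under these assumptions, a call such as model_set_dev_loss_tmo(FAST_IO_FAIL_DISABLED, 2147483647) is refused, which matches the "failed to set dev_loss_tmo" message quoted above, while the same value is accepted once fast_io_fail_tmo has a numeric value.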

With fast_io_fail_tmo disabled, the scsi device will never pass the
failed IO back up until dev_loss_tmo triggers.  This means that if you
lose a path on your multipath device while doing IO, you won't be able
to resend that IO down another path for 68 years (2147483647 seconds).
Also, all the synchronous checker functions will not return for 68
years. And during all this time these processes will be in uninterruptible
sleep. At that point, there would be no point to even having multiple
paths, because you couldn't ever actually use them if one went down.
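
For reference, the conversion from 2147483647 seconds to years can be checked with a small C helper. The helper name is hypothetical, and the year length of 365.25 days is an assumption made only for this estimate.

```c
#include <stdint.h>

/* Seconds in a 365.25-day year; an assumption used only for this estimate. */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 3600.0)

/* Convert a timeout expressed in seconds to an approximate number of years. */
static double tmo_seconds_to_years(uint32_t seconds)
{
	return (double)seconds / SECONDS_PER_YEAR;
}
```

With this definition, 2147483647 seconds comes out at a little over 68 years, while the 600-second cap is ten minutes.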

> 
>    And I think using MP_FAST_IO_FAIL_UNSET as the condition is meaningless
>    after multipath
> 
>    run select_fast_io_fail even if it's not set.

This is true in the default case, but we can't rely on the default case.
Since we allow users to turn it off, we need to correctly configure
multipath when it is off.

-Ben

>    Original Message
>    From: Benjamin Marzinski
>    To: 彭亮10137102
>    Cc: <dm-devel@redhat.com>, 张凯10072500
>    Date: 2016-11-29 08:30
>    Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be
>    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
> 
>    On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@zte.com.cn wrote:
>    > From: PengLiang <peng.liang5@zte.com.cn>
>    > 
>    > If no_path_retry set to queue, we should make sure dev_loss_tmo update to MAX_DEV_LOSS_TMO.
>    > But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.
> 
>    Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set. Multipath
>    was using this limit, since the underlying system uses it.
> 
>    -Ben
> 
>    > 
>    > Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
>    > ---
>    >  libmultipath/discovery.c | 3 ++-
>    >  1 file changed, 2 insertions(+), 1 deletion(-)
>    > 
>    > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
>    > index aaa915c..05b0842 100644
>    > --- a/libmultipath/discovery.c
>    > +++ b/libmultipath/discovery.c
>    > @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>    >                  goto out;
>    >              }
>    >          }
>    > -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
>    > +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
>    > +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
>    >          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
>    >              "fast_io_fail is not set",
>    >              rport_id, DEFAULT_DEV_LOSS_TMO);
>    > -- 
>    > 2.8.1.windows.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
@ 2016-12-08  1:38 peng.liang5
  0 siblings, 0 replies; 9+ messages in thread
From: peng.liang5 @ 2016-12-08  1:38 UTC (permalink / raw)
  To: hare; +Cc: dm-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 5965 bytes --]

Hello, Hannes

The kernel does not limit dev_loss_tmo if fast_io_fail_tmo is 0, but multipath does.
Should I leave it alone and just revert this patch?

Thanks.






Original Message

From: Hannes Reinecke
To: <dm-devel@redhat.com>
Date: 2016-12-07 15:04
Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue





On 12/07/2016 07:42 AM, peng.liang5@zte.com.cn wrote:
> Hello, Ben
> 
> Sorry for late to reply.
> 
> Such is the case as you said below. If fast_io_fail_tmo is off we have
> to cap dev_loss_tmo at 600. So, this patch is a wrong guide and will be
> cause a kernel error.
> 
Indeed.

We've had _far_ too many fixes for the 'dev_loss_tmo defaults to 600'
issue, but we seem to have it fixed by now.
So any patches in this area should be treated with the utmost caution.

> And one more question. Should the system limit dev_loss_tmo to 600 if 
> fast_io_fail_tmo set to 0?
> 
The kernel surely does. And if there is no error in the current
algorithm I'm strongly in favour of just leaving it alone.

Cheers,

Hannes

> On Thu, Dec 01, 2016 at 09:06:14AM +0800, peng.liang5@zte.com.cn wrote:
> >    If fast_io_fail_tmo isn't set, it will be use the DEFAULT_FAST_IO_FAIL
> >    in select_fast_io_fail.
> > 
> >    So, multipath will not run the limited of dev_loss_tmo to 600.
> 
> Yes, but the kernel will. With your patch installed, if I disable
> fast_io_fail_tmo and set no_path_retry to queue, I get these messages
> 
> Dec 01 04:19:02 | rport-11:0-0: failed to set dev_loss_tmo to
> 2147483647, error 22
> 
> Because if fast_io_fail_tmo is not set, the kernel itself will bar
> dev_loss_tmo from being above 600 seconds. Also, even if you could set
> dev_loss_tmo to its maximum without fast_io_fail_tmo set, you would
> never want to, because you would break multipath.
> 
> With fast_io_fail_tmo disabled, the scsi device will never pass the
> failed IO back up until dev_loss_tmo triggers.  This means that if you
> lose a path on your multipath device while doing IO, you won't be able
> to resend that IO down another path for 68 years (2147483647 seconds).
> Also, all the synchronous checker functions will not return for 68
> years. And during all this time these processes will be in uninterruptible
> sleep. At that point, there would be no point to even having multiple
> paths, because you couldn't ever actually use them if one went down.
> 
> > 
> >    And I think using MP_FAST_IO_FAIL_UNSET as the condition is meaningless
> >    after multipath
> > 
> >    run select_fast_io_fail even if it's not set.
> 
> This is true in the default case, but we can't rely on the default case.
> Since we allow users to turn it off, we need to correctly configure
> multipath when it is off.
> 
> -Ben
> 
> >    Original Message
> >    From: Benjamin Marzinski
> >    To: 彭亮10137102
> >    Cc: <dm-devel@redhat.com>, 张凯10072500
> >    Date: 2016-11-29 08:30
> >    Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be
> >    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
> > 
> >    On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@zte.com.cn wrote:
> >    > From: PengLiang <peng.liang5@zte.com.cn>
> >    > 
> >    > If no_path_retry set to queue, we should make sure dev_loss_tmo update to MAX_DEV_LOSS_TMO.
> >    > But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.
> > 
> >    Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set. Multipath
> >    was using this limit, since the underlying system uses it.
> > 
> >    -Ben
> > 
> >    > 
> >    > Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
> >    > ---
> >    >  libmultipath/discovery.c | 3 ++-
> >    >  1 file changed, 2 insertions(+), 1 deletion(-)
> >    > 
> >    > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> >    > index aaa915c..05b0842 100644
> >    > --- a/libmultipath/discovery.c
> >    > +++ b/libmultipath/discovery.c
> >    > @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
> >    >                  goto out;
> >    >              }
> >    >          }
> >    > -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
> >    > +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
> >    > +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
> >    >          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
> >    >              "fast_io_fail is not set",
> >    >              rport_id, DEFAULT_DEV_LOSS_TMO);
> >    > -- 
> >    > 2.8.1.windows.1
> > 


-- 
Dr. Hannes Reinecke           Teamlead Storage & Networking
hare@suse.de                           +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
  2016-12-07  6:42 peng.liang5
  2016-12-07  6:57 ` Hannes Reinecke
@ 2016-12-07 17:08 ` Benjamin Marzinski
  1 sibling, 0 replies; 9+ messages in thread
From: Benjamin Marzinski @ 2016-12-07 17:08 UTC (permalink / raw)
  To: peng.liang5; +Cc: zhang.kai16, dm-devel

On Wed, Dec 07, 2016 at 02:42:16PM +0800, peng.liang5@zte.com.cn wrote:
>    Hello, Ben
> 
>    Sorry for late to reply.
> 
>    Such is the case as you said below. If fast_io_fail_tmo is off we have to
>    cap dev_loss_tmo at 600. So, this patch is a wrong guide and will be cause a
>    kernel error.
> 
>    And one more question. Should the system limit dev_loss_tmo to 600 if
>    fast_io_fail_tmo set to 0?

No. The kernel doesn't limit dev_loss_tmo in this case. From a quick
test, it looks like setting fast_io_fail_tmo to 0 means that the scsi
layer fails the IO back immediately, without any waiting for the path to
return. This means that any value for dev_loss_tmo should be fine.
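
One way to read the distinction drawn above is the following hedged sketch in C. It reflects only the quick test described in this message and the 600-second cap discussed earlier in the thread. The type and function names are hypothetical and are not libmultipath's; the sketch simply treats "off" and "0" as different settings.

```c
/* The 600-second cap applied when fast_io_fail_tmo is off. */
#define CAPPED_DEV_LOSS_TMO 600u

/* Hypothetical representation of the fast_io_fail_tmo setting. */
enum fiof_mode {
	FIOF_OFF,	/* explicitly disabled by the user */
	FIOF_SECONDS	/* a numeric timeout, which may be 0 */
};

struct fiof_setting {
	enum fiof_mode mode;
	unsigned int seconds;	/* meaningful only for FIOF_SECONDS */
};

/*
 * Return the dev_loss_tmo that would actually be requested:
 * uncapped when fast_io_fail_tmo is numeric (including 0),
 * capped at 600 when it is off.
 */
static unsigned int effective_dev_loss_tmo(struct fiof_setting fiof,
					   unsigned int requested)
{
	if (fiof.mode == FIOF_SECONDS)
		return requested;
	return requested > CAPPED_DEV_LOSS_TMO ? CAPPED_DEV_LOSS_TMO
					       : requested;
}
```

In this model a setting of 0 leaves a requested value of 2147483647 untouched, whereas "off" reduces it to 600.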

Thanks.
-Ben
 
>    Hope for your reply. Thanks.
> 
>    Original Message
>    From: Benjamin Marzinski
>    To: 彭亮10137102
>    Cc: 张凯10072500, <dm-devel@redhat.com>
>    Date: 2016-12-02 00:51
>    Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be
>    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
> 
>    On Thu, Dec 01, 2016 at 09:06:14AM +0800, peng.liang5@zte.com.cn wrote:
>    >    If fast_io_fail_tmo isn't set, it will be use the DEFAULT_FAST_IO_FAIL
>    >    in select_fast_io_fail.
>    > 
>    >    So, multipath will not run the limited of dev_loss_tmo to 600.
> 
>    Yes, but the kernel will. With your patch installed, if I disable
>    fast_io_fail_tmo and set no_path_retry to queue, I get these messages
> 
>    Dec 01 04:19:02 | rport-11:0-0: failed to set dev_loss_tmo to
>    2147483647, error 22
> 
>    Because if fast_io_fail_tmo is not set, the kernel itself will bar
>    dev_loss_tmo from being above 600 seconds. Also, even if you could set
>    dev_loss_tmo to its maximum without fast_io_fail_tmo set, you would
>    never want to, because you would break multipath.
> 
>    With fast_io_fail_tmo disabled, the scsi device will never pass the
>    failed IO back up until dev_loss_tmo triggers.  This means that if you
>    lose a path on your multipath device while doing IO, you won't be able
>    to resend that IO down another path for 68 years (2147483647 seconds).
>    Also, all the synchronous checker functions will not return for 68
>    years. And during all this time these processes will be in uninterruptible
>    sleep. At that point, there would be no point to even having multiple
>    paths, because you couldn't ever actually use them if one went down.
> 
>    > 
>    >    And I think using MP_FAST_IO_FAIL_UNSET as the condition is meaningless
>    >    after multipath
>    > 
>    >    run select_fast_io_fail even if it's not set.
> 
>    This is true in the default case, but we can't rely on the default case.
>    Since we allow users to turn it off, we need to correctly configure
>    multipath when it is off.
> 
>    -Ben
> 
>    >    Original Message
>    >    From: Benjamin Marzinski
>    >    To: 彭亮10137102
>    >    Cc: <dm-devel@redhat.com>, 张凯10072500
>    >    Date: 2016-11-29 08:30
>    >    Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be
>    >    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
>    > 
>    >    On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@zte.com.cn wrote:
>    >    > From: PengLiang <peng.liang5@zte.com.cn>
>    >    > 
>    >    > If no_path_retry set to queue, we should make sure dev_loss_tmo update to MAX_DEV_LOSS_TMO.
>    >    > But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.
>    > 
>    >    Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set. Multipath
>    >    was using this limit, since the underlying system uses it.
>    > 
>    >    -Ben
>    > 
>    >    > 
>    >    > Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
>    >    > ---
>    >    >  libmultipath/discovery.c | 3 ++-
>    >    >  1 file changed, 2 insertions(+), 1 deletion(-)
>    >    > 
>    >    > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
>    >    > index aaa915c..05b0842 100644
>    >    > --- a/libmultipath/discovery.c
>    >    > +++ b/libmultipath/discovery.c
>    >    > @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>    >    >                  goto out;
>    >    >              }
>    >    >          }
>    >    > -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
>    >    > +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
>    >    > +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
>    >    >          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
>    >    >              "fast_io_fail is not set",
>    >    >              rport_id, DEFAULT_DEV_LOSS_TMO);
>    >    > -- 
>    >    > 2.8.1.windows.1
>    > 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
  2016-12-07  6:42 peng.liang5
@ 2016-12-07  6:57 ` Hannes Reinecke
  2016-12-07 17:08 ` Benjamin Marzinski
  1 sibling, 0 replies; 9+ messages in thread
From: Hannes Reinecke @ 2016-12-07  6:57 UTC (permalink / raw)
  To: dm-devel

On 12/07/2016 07:42 AM, peng.liang5@zte.com.cn wrote:
> Hello, Ben
> 
> Sorry for late to reply.
> 
> Such is the case as you said below. If fast_io_fail_tmo is off we have
> to cap dev_loss_tmo at 600. So, this patch is a wrong guide and will be
> cause a kernel error.
> 
Indeed.

We've had _far_ too many fixes for the 'dev_loss_tmo defaults to 600'
issue, but we seem to have it fixed by now.
So any patches in this area should be treated with the utmost caution.

> And one more question. Should the system limit dev_loss_tmo to 600 if 
> fast_io_fail_tmo set to 0?
> 
The kernel surely does. And if there is no error in the current
algorithm I'm strongly in favour of just leaving it alone.

Cheers,

Hannes

> On Thu, Dec 01, 2016 at 09:06:14AM +0800, peng.liang5@zte.com.cn wrote:
> >    If fast_io_fail_tmo isn't set, it will be use the DEFAULT_FAST_IO_FAIL
> >    in select_fast_io_fail.
> > 
> >    So, multipath will not run the limited of dev_loss_tmo to 600.
> 
> Yes, but the kernel will. With your patch installed, if I disable
> fast_io_fail_tmo and set no_path_retry to queue, I get these messages
> 
> Dec 01 04:19:02 | rport-11:0-0: failed to set dev_loss_tmo to
> 2147483647, error 22
> 
> Because if fast_io_fail_tmo is not set, the kernel itself will bar
> dev_loss_tmo from being above 600 seconds. Also, even if you could set
> dev_loss_tmo to its maximum without fast_io_fail_tmo set, you would
> never want to, because you would break multipath.
> 
> With fast_io_fail_tmo disabled, the scsi device will never pass the
> failed IO back up until dev_loss_tmo triggers.  This means that if you
> lose a path on your multipath device while doing IO, you won't be able
> to resend that IO down another path for 68 years (2147483647 seconds).
> Also, all the synchronous checker functions will not return for 68
> years. And during all this time these processes will be in uninterruptible
> sleep. At that point, there would be no point to even having multiple
> paths, because you couldn't ever actually use them if one went down.
> 
> > 
> >    And I think using MP_FAST_IO_FAIL_UNSET as the condition is meaningless
> >    after multipath
> > 
> >    run select_fast_io_fail even if it's not set.
> 
> This is true in the default case, but we can't rely on the default case.
> Since we allow users to turn it off, we need to correctly configure
> multipath when it is off.
> 
> -Ben
> 
> >    Original Message
> >    From: Benjamin Marzinski
> >    To: 彭亮10137102
> >    Cc: <dm-devel@redhat.com>, 张凯10072500
> >    Date: 2016-11-29 08:30
> >    Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be
> >    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
> > 
> >    On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@zte.com.cn wrote:
> >    > From: PengLiang <peng.liang5@zte.com.cn>
> >    > 
> >    > If no_path_retry set to queue, we should make sure dev_loss_tmo update to MAX_DEV_LOSS_TMO.
> >    > But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.
> > 
> >    Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set. Multipath
> >    was using this limit, since the underlying system uses it.
> > 
> >    -Ben
> > 
> >    > 
> >    > Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
> >    > ---
> >    >  libmultipath/discovery.c | 3 ++-
> >    >  1 file changed, 2 insertions(+), 1 deletion(-)
> >    > 
> >    > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> >    > index aaa915c..05b0842 100644
> >    > --- a/libmultipath/discovery.c
> >    > +++ b/libmultipath/discovery.c
> >    > @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
> >    >                  goto out;
> >    >              }
> >    >          }
> >    > -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
> >    > +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
> >    > +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
> >    >          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
> >    >              "fast_io_fail is not set",
> >    >              rport_id, DEFAULT_DEV_LOSS_TMO);
> >    > -- 
> >    > 2.8.1.windows.1
> > 


-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
@ 2016-12-07  6:42 peng.liang5
  2016-12-07  6:57 ` Hannes Reinecke
  2016-12-07 17:08 ` Benjamin Marzinski
  0 siblings, 2 replies; 9+ messages in thread
From: peng.liang5 @ 2016-12-07  6:42 UTC (permalink / raw)
  To: bmarzins; +Cc: zhang.kai16, dm-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 4556 bytes --]

Hello, Ben

Sorry for the late reply.

It is as you said below. If fast_io_fail_tmo is off, we have to cap
dev_loss_tmo at 600. So this patch is misguided and will cause a
kernel error.

One more question: should the system limit dev_loss_tmo to 600 if
fast_io_fail_tmo is set to 0?

I look forward to your reply. Thanks.









Original Message

From: Benjamin Marzinski
To: 彭亮10137102
Cc: 张凯10072500, <dm-devel@redhat.com>
Date: 2016-12-02 00:51
Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue





On Thu, Dec 01, 2016 at 09:06:14AM +0800, peng.liang5@zte.com.cn wrote:
>    If fast_io_fail_tmo isn't set, it will be use the DEFAULT_FAST_IO_FAIL
>    in select_fast_io_fail.
> 
>    So, multipath will not run the limited of dev_loss_tmo to 600.

Yes, but the kernel will. With your patch installed, if I disable
fast_io_fail_tmo and set no_path_retry to queue, I get these messages

Dec 01 04:19:02 | rport-11:0-0: failed to set dev_loss_tmo to
2147483647, error 22

Because if fast_io_fail_tmo is not set, the kernel itself will bar
dev_loss_tmo from being above 600 seconds. Also, even if you could set
dev_loss_tmo to its maximum without fast_io_fail_tmo set, you would
never want to, because you would break multipath.

With fast_io_fail_tmo disabled, the scsi device will never pass the
failed IO back up until dev_loss_tmo triggers.  This means that if you
lose a path on your multipath device while doing IO, you won't be able
to resend that IO down another path for 68 years (2147483647 seconds).
Also, all the synchronous checker functions will not return for 68
years. And during all this time these processes will be in uninterruptible
sleep. At that point, there would be no point to even having multiple
paths, because you couldn't ever actually use them if one went down.

> 
>    And I think using MP_FAST_IO_FAIL_UNSET as the condition is meaningless
>    after multipath
> 
>    run select_fast_io_fail even if it's not set.

This is true in the default case, but we can't rely on the default case.
Since we allow users to turn it off, we need to correctly configure
multipath when it is off.

-Ben

>    Original Message
>    From: Benjamin Marzinski
>    To: 彭亮10137102
>    Cc: <dm-devel@redhat.com>, 张凯10072500
>    Date: 2016-11-29 08:30
>    Subject: Re: [dm-devel] [PATCH] libmultipath: ensure dev_loss_tmo will be
>    update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
> 
>    On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@zte.com.cn wrote:
>    > From: PengLiang <peng.liang5@zte.com.cn>
>    > 
>    > If no_path_retry set to queue, we should make sure dev_loss_tmo update to MAX_DEV_LOSS_TMO.
>    > But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.
> 
>    Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set. Multipath
>    was using this limit, since the underlying system uses it.
> 
>    -Ben
> 
>    > 
>    > Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
>    > ---
>    >  libmultipath/discovery.c | 3 ++-
>    >  1 file changed, 2 insertions(+), 1 deletion(-)
>    > 
>    > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
>    > index aaa915c..05b0842 100644
>    > --- a/libmultipath/discovery.c
>    > +++ b/libmultipath/discovery.c
>    > @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>    >                  goto out;
>    >              }
>    >          }
>    > -    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
>    > +    } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
>    > +        mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
>    >          condlog(3, "%s: limiting dev_loss_tmo to %d, since "
>    >              "fast_io_fail is not set",
>    >              rport_id, DEFAULT_DEV_LOSS_TMO);
>    > -- 
>    > 2.8.1.windows.1
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
  2016-11-25  6:36 peng.liang5
  2016-11-26  9:05 ` Christophe Varoqui
@ 2016-11-29  0:16 ` Benjamin Marzinski
  1 sibling, 0 replies; 9+ messages in thread
From: Benjamin Marzinski @ 2016-11-29  0:16 UTC (permalink / raw)
  To: peng.liang5; +Cc: dm-devel, zhang.kai16

On Fri, Nov 25, 2016 at 02:36:04PM +0800, peng.liang5@zte.com.cn wrote:
> From: PengLiang <peng.liang5@zte.com.cn>
> 
> If no_path_retry set to queue, we should make sure dev_loss_tmo update to MAX_DEV_LOSS_TMO.
> But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.

Doesn't the system still limit dev_loss_tmo to 600 if fast_io_fail_tmo isn't set? Multipath
was using this limit, since the underlying system uses it.

-Ben

> 
> Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
> ---
>  libmultipath/discovery.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index aaa915c..05b0842 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>  				goto out;
>  			}
>  		}
> -	} else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
> +	} else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
> +		mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
>  		condlog(3, "%s: limiting dev_loss_tmo to %d, since "
>  			"fast_io_fail is not set",
>  			rport_id, DEFAULT_DEV_LOSS_TMO);
> -- 
> 2.8.1.windows.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
  2016-11-25  6:36 peng.liang5
@ 2016-11-26  9:05 ` Christophe Varoqui
  2016-11-29  0:16 ` Benjamin Marzinski
  1 sibling, 0 replies; 9+ messages in thread
From: Christophe Varoqui @ 2016-11-26  9:05 UTC (permalink / raw)
  To: peng.liang5; +Cc: zhang.kai16, device-mapper development


[-- Attachment #1.1: Type: text/plain, Size: 1187 bytes --]

Applied, thanks.

On Fri, Nov 25, 2016 at 7:36 AM, <peng.liang5@zte.com.cn> wrote:

> From: PengLiang <peng.liang5@zte.com.cn>
>
> If no_path_retry set to queue, we should make sure dev_loss_tmo update to
> MAX_DEV_LOSS_TMO.
> But, it will be limit to 600 if fast_io_fail_tmo set to off or 0 meanwhile.
>
> Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
> ---
>  libmultipath/discovery.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index aaa915c..05b0842 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>                                 goto out;
>                         }
>                 }
> -       } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
> +       } else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
> +               mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
>                 condlog(3, "%s: limiting dev_loss_tmo to %d, since "
>                         "fast_io_fail is not set",
>                         rport_id, DEFAULT_DEV_LOSS_TMO);
> --
> 2.8.1.windows.1
>
>



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue
@ 2016-11-25  6:36 peng.liang5
  2016-11-26  9:05 ` Christophe Varoqui
  2016-11-29  0:16 ` Benjamin Marzinski
  0 siblings, 2 replies; 9+ messages in thread
From: peng.liang5 @ 2016-11-25  6:36 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: zhang.kai16, PengLiang, dm-devel

From: PengLiang <peng.liang5@zte.com.cn>

If no_path_retry is set to queue, we should make sure dev_loss_tmo is updated to MAX_DEV_LOSS_TMO.
However, it is currently limited to 600 if fast_io_fail_tmo is set to off or 0.

Signed-off-by: PengLiang <peng.liang5@zte.com.cn>
---
 libmultipath/discovery.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index aaa915c..05b0842 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -608,7 +608,8 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
 				goto out;
 			}
 		}
-	} else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO) {
+	} else if (mpp->dev_loss > DEFAULT_DEV_LOSS_TMO &&
+		mpp->no_path_retry != NO_PATH_RETRY_QUEUE) {
 		condlog(3, "%s: limiting dev_loss_tmo to %d, since "
 			"fast_io_fail is not set",
 			rport_id, DEFAULT_DEV_LOSS_TMO);
-- 
2.8.1.windows.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-12-08  1:38 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-01  1:06 [PATCH] libmultipath: ensure dev_loss_tmo will be update to MAX_DEV_LOSS_TMO if no_path_retry set to queue peng.liang5
2016-12-01 16:44 ` Benjamin Marzinski
  -- strict thread matches above, loose matches on Subject: below --
2016-12-08  1:38 peng.liang5
2016-12-07  6:42 peng.liang5
2016-12-07  6:57 ` Hannes Reinecke
2016-12-07 17:08 ` Benjamin Marzinski
2016-11-25  6:36 peng.liang5
2016-11-26  9:05 ` Christophe Varoqui
2016-11-29  0:16 ` Benjamin Marzinski
