dm-devel.redhat.com archive mirror
* [dm-devel] [PATCH] multipath-tools: update no_path_retry value for IBM/2145
@ 2021-08-25 22:24 Xose Vazquez Perez
  2021-08-26  6:47 ` Martin Wilck
  0 siblings, 1 reply; 5+ messages in thread
From: Xose Vazquez Perez @ 2021-08-25 22:24 UTC (permalink / raw)
  Cc: Xose Vazquez Perez, Martin Wilck, DM-DEVEL ML

Based on current configs: https://www.ibm.com/docs/en/flashsystem-9x00/8.4.x?topic=system-settings-linux-hosts

Cc: Martin Wilck <mwilck@suse.com>
Cc: Benjamin Marzinski <bmarzins@redhat.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
Cc: DM-DEVEL ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
---
 libmultipath/hwtable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 2a896440..58554cbb 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -662,7 +662,7 @@ static struct hwentry default_hw[] = {
 		/* Storwize family / SAN Volume Controller / Flex System V7000 / FlashSystem V840/V9000/9100 */
 		.vendor        = "IBM",
 		.product       = "^2145",
-		.no_path_retry = NO_PATH_RETRY_QUEUE,
+		.no_path_retry = 5,
 		.pgpolicy      = GROUP_BY_PRIO,
 		.pgfailback    = -FAILBACK_IMMEDIATE,
 		.prio_name     = PRIO_ALUA,
-- 
2.32.0

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



* Re: [dm-devel] [PATCH] multipath-tools: update no_path_retry value for IBM/2145
  2021-08-25 22:24 [dm-devel] [PATCH] multipath-tools: update no_path_retry value for IBM/2145 Xose Vazquez Perez
@ 2021-08-26  6:47 ` Martin Wilck
  2021-08-30 16:57   ` Steffen Maier
  2024-02-12 23:42   ` Xose Vazquez Perez
  0 siblings, 2 replies; 5+ messages in thread
From: Martin Wilck @ 2021-08-26  6:47 UTC (permalink / raw)
  To: Xose Vazquez Perez; +Cc: DM-DEVEL ML

On Thu, 2021-08-26 at 00:24 +0200, Xose Vazquez Perez wrote:
> Based on current configs:
> https://www.ibm.com/docs/en/flashsystem-9x00/8.4.x?topic=system-settings-linux-hosts
> 
> Cc: Martin Wilck <mwilck@suse.com>
> Cc: Benjamin Marzinski <bmarzins@redhat.com>
> Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
> Cc: DM-DEVEL ML <dm-devel@redhat.com>
> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
> ---
>  libmultipath/hwtable.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
> index 2a896440..58554cbb 100644
> --- a/libmultipath/hwtable.c
> +++ b/libmultipath/hwtable.c
> @@ -662,7 +662,7 @@ static struct hwentry default_hw[] = {
>                 /* Storwize family / SAN Volume Controller / Flex
> System V7000 / FlashSystem V840/V9000/9100 */
>                 .vendor        = "IBM",
>                 .product       = "^2145",
> -               .no_path_retry = NO_PATH_RETRY_QUEUE,
> +               .no_path_retry = 5,
>                 .pgpolicy      = GROUP_BY_PRIO,
>                 .pgfailback    = -FAILBACK_IMMEDIATE,
>                 .prio_name     = PRIO_ALUA,

Ref: https://github.com/opensvc/multipath-tools/issues/6

The question is on which basis IBM came up with this recommendation.
5 (aka 25s) is a rather low value. Some users may encounter unpleasant
surprises if we change the default this way, as it used to be infinite
before.
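
For readers wondering where "25s" comes from: no_path_retry is counted in path checker intervals, so with polling_interval at its usual default of 5 seconds, 5 retries give roughly 25 seconds of queueing once the last path has failed. A minimal C sketch of that arithmetic follows. The function name and the sentinel values are hypothetical and chosen for illustration only; they are not the libmultipath API.

```c
/* Illustrative sentinels. The names echo those in the patch, but the
 * values are assumptions made for this sketch. */
enum { SKETCH_RETRY_FAIL = -1, SKETCH_RETRY_QUEUE = -2 };

/*
 * Approximate time in seconds that I/O stays queued after the last path
 * fails: -1 means "queue" (unbounded), 0 means "fail" (no queueing),
 * otherwise retries * polling interval.
 */
long queue_window_seconds(int no_path_retry, unsigned int polling_interval)
{
	if (no_path_retry == SKETCH_RETRY_QUEUE)
		return -1;
	if (no_path_retry <= 0)
		return 0;
	return (long)no_path_retry * (long)polling_interval;
}
```

Under these assumptions, queue_window_seconds(5, 5) yields 25, while the PowerStore value of 3 yields 15.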

Using 5, the IBM 2145 would have the 2nd-lowest default in hwtable.c
after Dell PowerStore (3). Symmetrix has 6; all other arrays default to
10 or higher, many default to "queue".

Observing that the above is the documentation for the *FlashSystem*
9200, I consider it likely that the value ".no_path_retry = 5" would
apply to flash-based IBM storage products, but not to the older
products such as the V7000, which unfortunately use the same device ID.

It'd be helpful if someone from IBM could jump in here...

Pondering the pros and cons, I vote for keeping the current defaults
for now.

Martin




* Re: [dm-devel] [PATCH] multipath-tools: update no_path_retry value for IBM/2145
  2021-08-26  6:47 ` Martin Wilck
@ 2021-08-30 16:57   ` Steffen Maier
  2024-02-12 23:42   ` Xose Vazquez Perez
  1 sibling, 0 replies; 5+ messages in thread
From: Steffen Maier @ 2021-08-30 16:57 UTC (permalink / raw)
  To: dm-devel; +Cc: Benjamin Block

On 8/26/21 8:47 AM, Martin Wilck wrote:
> On Thu, 2021-08-26 at 00:24 +0200, Xose Vazquez Perez wrote:
>> Based on current configs:
>> https://www.ibm.com/docs/en/flashsystem-9x00/8.4.x?topic=system-settings-linux-hosts
>>
>> Cc: Martin Wilck <mwilck@suse.com>
>> Cc: Benjamin Marzinski <bmarzins@redhat.com>
>> Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
>> Cc: DM-DEVEL ML <dm-devel@redhat.com>
>> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
>> ---
>>   libmultipath/hwtable.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
>> index 2a896440..58554cbb 100644
>> --- a/libmultipath/hwtable.c
>> +++ b/libmultipath/hwtable.c
>> @@ -662,7 +662,7 @@ static struct hwentry default_hw[] = {
>>                  /* Storwize family / SAN Volume Controller / Flex
>> System V7000 / FlashSystem V840/V9000/9100 */
>>                  .vendor        = "IBM",
>>                  .product       = "^2145",
>> -               .no_path_retry = NO_PATH_RETRY_QUEUE,
>> +               .no_path_retry = 5,
>>                  .pgpolicy      = GROUP_BY_PRIO,
>>                  .pgfailback    = -FAILBACK_IMMEDIATE,
>>                  .prio_name     = PRIO_ALUA,
> 
> Ref: https://github.com/opensvc/multipath-tools/issues/6
> 
> The question is on which basis IBM came up with this recommendation.
> 5 (aka 25s) is a rather low value. Some users may encounter unpleasant
> surprises if we change the default this way, as it used to be infinite
> before.
> 
> Using 5, the IBM 2145 would have the 2nd-lowest default in hwtable.c
> after Dell PowerStore (3). Symmetrix has 6; all other arrays default to
> 10 or higher, many default to "queue".
> 
> Observing that the above is the documentation for the *Flashsystem*
> 9200,  I consider it likely that the value ".no_path_retry = 5" would
> apply to flash-based IBM storage products, but not to the older
> products such as the V7000, which unfortunately use the same device ID.
> 
> It'd be helpful if someone from IBM could jump in here...
> 
> Pondering the pros and cons, I vote for keeping the current defaults
> for now.

+1

I think this depends on host and workload requirements and maybe other things. 
There might not be one simple answer.

FWIW, from a zfcp point of view: 
https://public.dhe.ibm.com/software/dw/linux390/lvc/zFCP_Best_Practices-BB-Webcast_201805.pdf#page=19
Distributed/parallel file systems with shared volumes might have their own
requirements.
YMMV

We also have our opinion on dev_loss_tmo and fast_io_fail_tmo, but that's a 
different story 
[https://public.dhe.ibm.com/software/dw/linux390/lvc/zFCP_Best_Practices-BB-Webcast_201805.pdf#page=18].

-- 
Mit freundlichen Gruessen / Kind regards
Steffen Maier

Linux on IBM Z and LinuxONE

https://www.ibm.com/privacy/us/en/
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



* Re: [PATCH] multipath-tools: update no_path_retry value for IBM/2145
  2021-08-26  6:47 ` Martin Wilck
  2021-08-30 16:57   ` Steffen Maier
@ 2024-02-12 23:42   ` Xose Vazquez Perez
  2024-02-13 10:36     ` Martin Wilck
  1 sibling, 1 reply; 5+ messages in thread
From: Xose Vazquez Perez @ 2024-02-12 23:42 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Benjamin Marzinski, Christophe Varoqui, DM-DEVEL ML

On 8/26/21 8:47 AM, Martin Wilck wrote:
    ^^^^^^^
It is never too late!

> On Thu, 2021-08-26 at 00:24 +0200, Xose Vazquez Perez wrote:
>> Based on current configs:
>> https://www.ibm.com/docs/en/flashsystem-9x00/8.4.x?topic=system-settings-linux-hosts
>>
>> Cc: Martin Wilck <mwilck@suse.com>
>> Cc: Benjamin Marzinski <bmarzins@redhat.com>
>> Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
>> Cc: DM-DEVEL ML <dm-devel@redhat.com>
>> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
>> ---
>>   libmultipath/hwtable.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
>> index 2a896440..58554cbb 100644
>> --- a/libmultipath/hwtable.c
>> +++ b/libmultipath/hwtable.c
>> @@ -662,7 +662,7 @@ static struct hwentry default_hw[] = {
>>                  /* Storwize family / SAN Volume Controller / Flex
>> System V7000 / FlashSystem V840/V9000/9100 */
>>                  .vendor        = "IBM",
>>                  .product       = "^2145",
>> -               .no_path_retry = NO_PATH_RETRY_QUEUE,
>> +               .no_path_retry = 5,
>>                  .pgpolicy      = GROUP_BY_PRIO,
>>                  .pgfailback    = -FAILBACK_IMMEDIATE,
>>                  .prio_name     = PRIO_ALUA,
> 
> Ref: https://github.com/opensvc/multipath-tools/issues/6
> 
> The question is on which basis IBM came up with this recommendation.
> 5 (aka 25s) is a rather low value. Some users may encounter unpleasant
> surprises if we change the default this way, as it used to be infinite
> before.
> 
> Using 5, the IBM 2145 would have the 2nd-lowest default in hwtable.c
> after Dell PowerStore (3). Symmetrix has 6; all other arrays default to
> 10 or higher, many default to "queue".
> 
> Observing that the above is the documentation for the *Flashsystem*
> 9200,  I consider it likely that the value ".no_path_retry = 5" would
> apply to flash-based IBM storage products, but not to the older
> products such as the V7000, which unfortunately use the same device ID.
> 
> It'd be helpful if someone from IBM could jump in here...
> 
> Pondering the pros and cons, I vote for keeping the current defaults
> for now.
> 
> Martin

Some history:

first commit 3eb8c380a :
        {
                /* IBM SAN Volume Controller */
                .vendor        = "IBM",
                .product       = "2145",
                .getuid        = DEFAULT_GETUID,
                .getprio       = "mpath_prio_alua /dev/%n",
                .features      = "1 queue_if_no_path",
                .hwhandler     = DEFAULT_HWHANDLER,
                .selector      = DEFAULT_SELECTOR,
                .pgpolicy      = GROUP_BY_PRIO,
                .pgfailback    = -FAILBACK_IMMEDIATE,
                .rr_weight     = RR_WEIGHT_NONE,
                .no_path_retry = NO_PATH_RETRY_UNDEF,
                .minio         = DEFAULT_MINIO,
                .checker_name  = TUR,
        },

NO_PATH_RETRY_UNDEF was removed in b7c3cf014 because it was the default value,
and later "1 queue_if_no_path" was replaced by NO_PATH_RETRY_QUEUE in 87ea76f99

IBM docs recommend:
no_path_retry 5 # or no_path_retry "fail" for some current Linux distros

IBM Storage FlashSystem 5200, 5000, 5100, Storwize V5100 and V5000E:
https://www.ibm.com/docs/en/flashsystem-5x00/8.6.x?topic=system-settings-linux-hosts

IBM Storage FlashSystem 7300, 7200 and Storwize V7000:
https://www.ibm.com/docs/en/flashsystem-7x00/8.6.x?topic=system-settings-linux-hosts

IBM FlashSystem V9000:
https://www.ibm.com/docs/en/flashsystem-v9000/8.3.x?topic=system-settings-linux-hosts

IBM Storage FlashSystem 9500, 9200 and 9100:
https://www.ibm.com/docs/en/flashsystem-9x00/8.6.x?topic=system-settings-linux-hosts

Therefore, we should change this value.


* Re: [PATCH] multipath-tools: update no_path_retry value for IBM/2145
  2024-02-12 23:42   ` Xose Vazquez Perez
@ 2024-02-13 10:36     ` Martin Wilck
  0 siblings, 0 replies; 5+ messages in thread
From: Martin Wilck @ 2024-02-13 10:36 UTC (permalink / raw)
  To: Xose Vazquez Perez; +Cc: Benjamin Marzinski, Christophe Varoqui, DM-DEVEL ML

On Tue, 2024-02-13 at 00:42 +0100, Xose Vazquez Perez wrote:
> On 8/26/21 8:47 AM, Martin Wilck wrote:
>     ^^^^^^^
> It is never too late!

:-)

> Some history:
> 
> first commit 3eb8c380a :
>         {
>                 /* IBM SAN Volume Controller */
>                 .vendor        = "IBM",
>                 .product       = "2145",
>                 .getuid        = DEFAULT_GETUID,
>                 .getprio       = "mpath_prio_alua /dev/%n",
>                 .features      = "1 queue_if_no_path",
>                 .hwhandler     = DEFAULT_HWHANDLER,
>                 .selector      = DEFAULT_SELECTOR,
>                 .pgpolicy      = GROUP_BY_PRIO,
>                 .pgfailback    = -FAILBACK_IMMEDIATE,
>                 .rr_weight     = RR_WEIGHT_NONE,
>                 .no_path_retry = NO_PATH_RETRY_UNDEF,
>                 .minio         = DEFAULT_MINIO,
>                 .checker_name  = TUR,
>         },
> 
> NO_PATH_RETRY_UNDEF was removed in b7c3cf014 because it was the
> default value,
> and later "1 queue_if_no_path" was replaced by NO_PATH_RETRY_QUEUE in
> 87ea76f99

... which shows that the default has been "queue" for almost 18 years.

> IBM docs recommend:
> no_path_retry 5 # or no_path_retry "fail" for some current linux
> distros
> 
> IBM Storage FlashSystem 5200, 5000, 5100, Storwize V5100 and V5000E:
> https://www.ibm.com/docs/en/flashsystem-5x00/8.6.x?topic=system-settings-linux-hosts
> 
> IBM Storage FlashSystem 7300, 7200 and Storwize V7000:
> https://www.ibm.com/docs/en/flashsystem-7x00/8.6.x?topic=system-settings-linux-hosts
> 
> IBM FlashSystem V9000:
> https://www.ibm.com/docs/en/flashsystem-v9000/8.3.x?topic=system-settings-linux-hosts
> 
> IBM Storage FlashSystem 9500, 9200 and 9100:
> https://www.ibm.com/docs/en/flashsystem-9x00/8.6.x?topic=system-settings-linux-hosts
> 
> Therefore, we should change this value.

I tend to disagree. It's true that we usually follow vendor
recommendations. But in this case, I think the change would do more
harm than good, because we've defaulted to "queue" basically forever
for this product. Suddenly switching to a rather short no_path_retry
value might come as an unpleasant surprise for users. Users who follow
the IBM recommendations (using explicit multipath.conf settings) won't
notice the change anyway, but those who rely on our defaults might even
lose data.
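
For reference, an explicit override of the built-in entry would look roughly like the fragment below in /etc/multipath.conf. The vendor and product match are taken from the hwtable.c entry in the patch, and the value is the one from the IBM documentation cited above. Treat it as a sketch, not a tested configuration.

```
devices {
	device {
		vendor        "IBM"
		product       "^2145"
		no_path_retry 5
	}
}
```

With an entry like this in place, the built-in default for no_path_retry no longer matters for that host, whichever value the hwtable carries.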

In general, I believe vendors' recommendations about "no_path_retry"
don't mean much. This setting doesn't depend on the properties of the
hardware, it's rather the preference of the end customer [*]. IMHO
"fail" or low numeric values of no_path_retry mainly make sense in
cluster configurations. Unfortunately, IBM gives no rationale for this
recommendation in its manuals [+].

But I'm not religious on the matter; more opinions welcome.

Martin

[*] Vendors can recommend a lower limit for no_path_retry, in the sense
"with this product, it can happen that zero paths are available for N
seconds during a firmware update", but a fixed no_path_retry value acts
as an upper limit.
[+] I suspect that the recommendations in the current IBM manuals have
just been copy/pasted from earlier ones, without much consideration.
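
To make the "lower limit" reading in [*] concrete: if a vendor states that all paths may be gone for N seconds, the smallest no_path_retry that rides it out is N divided by the polling interval, rounded up. A small C sketch, assuming the queueing window is simply retries times polling_interval; the function name is hypothetical:

```c
/*
 * Smallest retry count whose queueing window (retries * polling_interval)
 * covers an outage of outage_seconds. Returns 0 for an invalid interval.
 */
unsigned int min_retries_for_outage(unsigned int outage_seconds,
				    unsigned int polling_interval)
{
	if (polling_interval == 0)
		return 0; /* invalid interval; a caller should reject it */
	/* Integer division rounded up, written to avoid overflow. */
	return outage_seconds / polling_interval +
	       (outage_seconds % polling_interval != 0);
}
```

With a 5 second interval, a 25 second outage needs at least 5 retries and a 26 second outage needs 6. Any fixed value caps the window from above, which is the point of the footnote.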


