From: Martin Wilck <mwilck@suse.com>
To: Brian Bunker <brian@purestorage.com>,
linux-scsi@vger.kernel.org, Hannes Reinecke <hare@suse.com>
Subject: Re: [PATCH 1/1]: scsi dm-mpath do not fail paths which are in ALUA state transitioning
Date: Mon, 12 Jul 2021 11:19:36 +0200
Message-ID: <32a96fc27c250fb5772a0b301576ad702b8ea934.camel@suse.com>
In-Reply-To: <CAHZQxyKJ1qFatzhR-k19PXjAPo7eC0ZgwgaGKwfndB=jEO8mRQ@mail.gmail.com>
Hello Brian,

On Thu, 2021-07-08 at 13:42 -0700, Brian Bunker wrote:
> In a controller failover do not fail paths that are transitioning or
> an unexpected I/O error will return when accessing a multipath device.
>
> Consider this case, a two controller array with paths coming from a
> primary and a secondary controller. During any upgrade there will be a
> transition from a secondary to a primary state.
>
> [...]
> 4. It is not expected that the remaining 4 paths will also fail. This
> was not the case until the change which introduced BLK_STS_AGAIN into
> the SCSI ALUA device handler. With that change new I/O which reaches
> that handler on paths that are in ALUA state transitioning will result
> in those paths failing. Previous Linux versions, before that change,
> will not return an I/O error back to the client application.
> Similarly, this problem does not happen in other operating systems,
> e.g. ESXi, Windows, AIX, etc.
Please confirm that your kernel includes ee8868c5c78f ("scsi:
scsi_dh_alua: Retry RTPG on a different path after failure").
That commit should cause the RTPG to be retried on other map members
which are not in the failed state, thus avoiding this phenomenon.
> [...]
>
> 6. The error gets back to the user of the multipath device
> unexpectedly:
> Thu Jul 8 13:33:59 2021: /opt/Purity/bin/bb/pureload I/O Error: io
> 43047 fd 36 op read offset 00000028ef7a7000 size 4096 errno 11
> rsize -1
>
> The earlier patch I made for this was not desirable, so I am proposing
> this much smaller patch which will similarly not allow the
> transitioning paths to result in immediate failure.
>
> Signed-off-by: Brian Bunker <brian@purestorage.com>
> Acked-by: Krishna Kant <krishna.kant@purestorage.com>
> Acked-by: Seamus Connor <sconnor@purestorage.com>
>
> ____
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index bced42f082b0..d5d6be96068d 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1657,7 +1657,7 @@ static int multipath_end_io(struct dm_target *ti, struct request *clone,
>  		else
>  			r = DM_ENDIO_REQUEUE;
>
> -		if (pgpath)
> +		if (pgpath && (error != BLK_STS_AGAIN))
>  			fail_path(pgpath);
>
>  		if (!atomic_read(&m->nr_valid_paths) &&
>
I doubt that this is correct. Consider the commit message of
268940b80fa4 ("scsi: scsi_dh_alua: Return BLK_STS_AGAIN for ALUA
transitioning state"):

"When the ALUA state indicates transitioning we should not retry the command
immediately, but rather complete the command with BLK_STS_AGAIN to signal
the completion handler that it might be retried. This allows multipathing
to redirect the command to another path if possible, and avoid stalls
during lengthy transitioning times."

The purpose of that patch was to set the state of the transitioning
path to failed in order to make sure I/O is retried on a different path.
Your patch would undermine this purpose.
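
The concern can be illustrated with a minimal user-space sketch. This is
not kernel code: the toy_* names, the four-path map, and the "first valid
path" selector are all assumptions made for illustration, and only the
fail_path() decision from the quoted hunk is modeled. With the unpatched
logic, a BLK_STS_AGAIN-style completion marks the path failed, so the
retry lands on a different path. With the proposed check, the
transitioning path stays valid and a selector like this one may pick it
again.

```c
#include <stdbool.h>

#define TOY_MAX_PATHS 4

/* Hypothetical stand-ins for a few blk_status_t values. */
enum toy_status { TOY_STS_OK, TOY_STS_IOERR, TOY_STS_AGAIN };

/* Toy multipath map: one flag per path, true while the path is usable. */
struct toy_map {
	bool valid[TOY_MAX_PATHS];
};

/* Toy selector (assumption): first path still marked valid, or -1. */
int toy_select_path(const struct toy_map *m)
{
	for (int i = 0; i < TOY_MAX_PATHS; i++)
		if (m->valid[i])
			return i;
	return -1;
}

/* Models only the fail_path() decision from the quoted hunk. */
void toy_end_io(struct toy_map *m, int path, enum toy_status error,
		bool patched)
{
	if (error == TOY_STS_OK)
		return;
	if (patched && error == TOY_STS_AGAIN)
		return;			/* patched: path is left valid */
	m->valid[path] = false;		/* otherwise: path is failed */
}

/*
 * Complete one I/O with `error` on the first selected path of a fresh
 * map, then return the path the selector would choose for the retry.
 */
int toy_retry_path(enum toy_status error, bool patched)
{
	struct toy_map m = { .valid = { true, true, true, true } };
	int first = toy_select_path(&m);

	toy_end_io(&m, first, error, patched);
	return toy_select_path(&m);
}
```

In this model, toy_retry_path(TOY_STS_AGAIN, false) moves the retry to
path 1, while toy_retry_path(TOY_STS_AGAIN, true) returns path 0 again.
Real path selectors and path group switching are more involved than
this sketch.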
Regards
Martin