From: Martin Wilck <mwilck@suse.com>
To: Benjamin Marzinski <bmarzins@redhat.com>,
	device-mapper development <dm-devel@redhat.com>
Subject: Re: [PATCH v3 18/19] libmultipath: Don't blank intialized paths
Date: Wed, 03 Oct 2018 00:37:17 +0200
Message-ID: <f87221ecedb072a7f3551ea0f62e7df586d075f1.camel@suse.com>
In-Reply-To: <3bb1c108b1e43a85d296eed9a63cefe7a6de2d01.camel@suse.com>

Hi Ben,

On Tue, 2018-10-02 at 00:00 +0200, Martin Wilck wrote:
> On Fri, 2018-09-21 at 18:05 -0500, Benjamin Marzinski wrote:
> > When pathinfo fails for some likely transient reason, it clears the
> > path
> > wwid, but otherwise returns successfully, to keep the path around
> > but
> > not usable until it gets fully initialized. However, if the path
> > has
> > already been initialized, and pathinfo hits a transient error, it
> > shouldn't clear the wwid.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/discovery.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> > index 3e0db7f..33815dc 100644
> > --- a/libmultipath/discovery.c
> > +++ b/libmultipath/discovery.c
> > @@ -1991,9 +1991,9 @@ blank:
> >  	/*
> >  	 * Recoverable error, for example faulty or offline path
> >  	 */
> > -	memset(pp->wwid, 0, WWID_SIZE);
> >  	pp->chkrstate = pp->state = PATH_DOWN;
> > -	pp->initialized = INIT_FAILED;
> > +	if (pp->initialized == INIT_FAILED)
> > +		memset(pp->wwid, 0, WWID_SIZE);
> >  
> >  	return PATHINFO_OK;
> >  }
> 
> I am uncertain about this one. The old code sets pp->initialized to
> INIT_FAILED. If the state had been INIT_MISSING_UDEV or
> INIT_REQUESTED_UDEV before, this patch might change how the code
> behaves later in check_path(), where these conditions are checked.
> 
> Likewise, tests for strlen(pp->wwid) are used in various places
> around
> the code. These tests would now yield different results for paths in
> "recoverable error" state.
> 
> Have you considered these possible side effects?

I've pondered this a lot, and the dust is settling a bit.

1. With your patch in place, INIT_FAILED is never set except in
alloc_path() (we might rename it to INIT_NEW or the like, but see
below).

2. I don't understand how you handle repeated failure to retrieve the
WWID. I see that get_uid() (actually, scsi_uid_fallback()) would
retrieve the WWID from sysfs after retriggers are exhausted. But I
don't see how pathinfo(DI_WWID) would ever be called in this situation:

In the last invocation, pathinfo() had failed to retrieve the WWID and
set pp->initialized = INIT_MISSING_UDEV. There it will remain because
check_path() won't set it to INIT_REQUESTED_UDEV any more after retries
are exhausted. And now, check_path() won't call pathinfo(DI_ALL) any
more from the "add missing path" code, because of the (pp->initialized
!= INIT_MISSING_UDEV) condition.

Am I overlooking something?

3. If "blank" state means that important device information couldn't be
retrieved because of presumably transient failure conditions, we should
retry by calling pathinfo() again later. But unless the WWID is (reset
to) the empty string, check_path() won't call pathinfo(DI_ALL) any more.
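
To make points 2 and 3 concrete, here is a toy model in C of the
conditions as I read them. It is not the multipath-tools code:
struct toy_path, toy_should_reprobe() and toy_retrigger() are made-up
names, the enum is reduced to the states discussed here, and retrigger
handling is simplified to a counter.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_WWID_SIZE 128 /* assumption: stand-in for WWID_SIZE */

/* Reduced to the states discussed in this thread. */
enum toy_init {
	TOY_INIT_FAILED,
	TOY_INIT_MISSING_UDEV,
	TOY_INIT_REQUESTED_UDEV,
	TOY_INIT_OK,
};

struct toy_path {
	char wwid[TOY_WWID_SIZE];
	enum toy_init initialized;
	int retriggers;
};

/*
 * Models the "add missing path" condition in check_path() as described
 * above: pathinfo(DI_ALL) is retried only for a path with an empty WWID
 * that is not in INIT_MISSING_UDEV state.
 */
bool toy_should_reprobe(const struct toy_path *pp)
{
	return strlen(pp->wwid) == 0 &&
	       pp->initialized != TOY_INIT_MISSING_UDEV;
}

/*
 * Models the retrigger step: while tries remain, request a udev
 * retrigger; once they are exhausted the state is left unchanged.
 * Returns true if a retrigger was requested.
 */
bool toy_retrigger(struct toy_path *pp, int max_tries)
{
	if (pp->initialized != TOY_INIT_MISSING_UDEV ||
	    pp->retriggers >= max_tries)
		return false;
	pp->retriggers++;
	pp->initialized = TOY_INIT_REQUESTED_UDEV;
	return true;
}
```

In this model, a path in TOY_INIT_MISSING_UDEV state with retriggers
exhausted is neither retriggered nor re-probed (point 2), and a path
that keeps a non-empty wwid after a "blank" exit is never re-probed
either (point 3).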

4. The "blank" logic in pathinfo() combines several very different
cases.
  a) PATH_REMOVED status from path_offline(). This means that
elementary sysfs attributes were missing. This is almost the same as
failure in sysfs_pathinfo(), which results in PATHINFO_FAILED return
status; but for PATH_REMOVED we return PATHINFO_OK and keep the path
around.
  b) Failure in checker_check(). If the path is offline in the first
place, the checker isn't called, and WWID determination is attempted.
But if the checker returns PATH_UNCHECKED or PATH_WILD, we goto "blank"
state.
  c) Failure in scsi_ioctl_pathinfo() or cciss_ioctl_pathinfo(). Neither
function can currently fail, so this can't happen. I have patches here
to fix that.
  d) Failure to open pp->fd.

d) is the only case in which the "blank" logic really makes sense to
me. It can happen only at the first pathinfo() invocation, meaning
pp->wwid is still empty, and pp->initialized is INIT_FAILED. Your patch
would change nothing for this case.

a) and b) can happen for paths that have been initialized already. I
think in case a) the WWID should be reset, probably initialized should
be set to INIT_FAILED, and PATHINFO_FAILED should be returned. In case
b) we should IMO proceed normally rather than goto "blank". Resetting
the WWID in case b) is nonsense, agreed.

Altogether, if my analysis is correct, your patch (not blanking the
WWID) should be applied to case b) only.
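
As a rough summary of what I am suggesting, the per-case handling could
be tabulated like this. This is only a sketch of the proposal, not a
patch: enum toy_blank_cause, struct toy_action and
toy_proposed_action() are hypothetical names that don't exist in
libmultipath, and case c) is left out because it cannot currently
occur.

```c
#include <stdbool.h>

/* Hypothetical labels for the cases a), b) and d) listed above. */
enum toy_blank_cause {
	TOY_CAUSE_PATH_REMOVED,   /* a) PATH_REMOVED from path_offline() */
	TOY_CAUSE_CHECKER_FAILED, /* b) PATH_UNCHECKED or PATH_WILD */
	TOY_CAUSE_OPEN_FAILED,    /* d) failure to open pp->fd */
};

struct toy_action {
	bool goto_blank;      /* false: proceed normally in pathinfo() */
	bool clear_wwid;      /* memset(pp->wwid, 0, WWID_SIZE) */
	bool set_init_failed; /* pp->initialized = INIT_FAILED */
	bool return_failed;   /* PATHINFO_FAILED instead of PATHINFO_OK */
};

/* Maps each cause to the handling suggested in this mail. */
struct toy_action toy_proposed_action(enum toy_blank_cause cause)
{
	struct toy_action act = { false, false, false, false };

	switch (cause) {
	case TOY_CAUSE_PATH_REMOVED:
		/* a) reset WWID, mark failed, report failure */
		act.goto_blank = true;
		act.clear_wwid = true;
		act.set_init_failed = true;
		act.return_failed = true;
		break;
	case TOY_CAUSE_CHECKER_FAILED:
		/* b) proceed normally, keep the WWID */
		break;
	case TOY_CAUSE_OPEN_FAILED:
		/* d) the existing "blank" behavior, path is kept */
		act.goto_blank = true;
		act.clear_wwid = true;
		act.set_init_failed = true;
		break;
	}
	return act;
}
```

Only case b) ends up with the WWID left alone, which is the one place
where I think your change belongs.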

Please comment - I still feel a bit confused and may have overlooked
something essential.

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)


