All of lore.kernel.org
 help / color / mirror / Atom feed
From: Trond Myklebust <trondmy@primarydata.com>
To: "neilb@suse.com" <neilb@suse.com>,
	"tigran.mkrtchyan@desy.de" <tigran.mkrtchyan@desy.de>
Cc: "anna.schumaker@netapp.com" <anna.schumaker@netapp.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH - v2] NFSv4: handle EINVAL from EXCHANGE_ID better.
Date: Tue, 20 Mar 2018 19:43:54 +0000	[thread overview]
Message-ID: <1521575027.89994.20.camel@primarydata.com> (raw)
In-Reply-To: <87woy68u39.fsf@notabene.neil.brown.name>

[-- Attachment #1: Type: text/plain, Size: 9251 bytes --]

On Wed, 2018-03-21 at 05:44 +1100, NeilBrown wrote:
> On Tue, Mar 20 2018, Trond Myklebust wrote:
> 
> > On Tue, 2018-03-20 at 11:32 +1100, NeilBrown wrote:
> > > On Fri, Mar 16 2018, Trond Myklebust wrote:
> > > 
> > > > On Fri, 2018-03-16 at 10:31 +0100, Mkrtchyan, Tigran wrote:
> > > > > Hi Neil,
> > > > > 
> > > > > according to rfc5661, NFS4ERR_INVAL is returned by the server
> > > > > if
> > > > > it
> > > > > thinks that client sends an invalid request (e.g. points to a
> > > > > client
> > > > > bug)
> > > > > or server misinterpret it (broken server).
> > > > > 
> > > > > With your change instead of failing the mount, client will
> > > > > silently
> > > > > go for
> > > > > v4.0, even v4.1 mount was requested and produce undesirable
> > > > > behavior,
> > > > > e.g.
> > > > > proxy-io instead of pnfs. I fill prefer fail-fast instead of
> > > > > long
> > > > > debug
> > > > > sessions.
> > > > > 
> > > > > On the other hand, I understand, that it's not always
> > > > > possible to
> > > > > fix
> > > > > server
> > > > > or clients in production environment and time-to-time
> > > > > workarounds
> > > > > are
> > > > > necessary.
> > > > > 
> > > > > 
> > > > 
> > > > I'd tend to agree with Tigran. Hiding server bugs, should not
> > > > be a
> > > > priority and particularly not in this case, where the
> > > > workaround is
> > > > simple: either turn off version negotiation altogether, or edit
> > > > /etc/nfsmount.conf to negotiate a different set of versions.
> > > 
> > > Yes, it could be worked-around in nfsmount.conf, but manual
> > > configuration should be seen as an optimization or a last
> > > resort.   If
> > > we can make things work without configuration, that provides the
> > > best
> > > experience.
> > > In this case, the kernel has strong evidence that the server
> > > isn't responding as expected, but it gives an unhelpful error
> > > message.
> > 
> > Server do not spontaneously break their ability to process NFSv4
> > operations and so this is not an issue that we need to worry about
> > in
> > ordinary operation. It should only ever be an issue when
> > 
> > 1) An insufficiently tested and broken upgrade is applied to an
> > existing server, in which case the main workaround should be to
> > revert
> > the upgrade until it can be fixed.
> > 2) A completely new broken server is introduced to the system.
> 
> 3) An upgrade to the client defaults to trying 4.2 first, then 4.1
> and
>    only then 4.0.  Previous client defaulted to 4.0.

I'm sorry, but this still does not sounds like a good case for "fix the
kernel client". The kernel has no opinion on which NFS version you
should try first in a mount attempt: that decision is made in
userspace.
If you upgraded the nfs-utils to something that now tries 4.2 first,
then that is a user space policy issue.

> > 
> > > At the very least, nfs4_discover_server_trunking() should not
> > > treat
> > > -NFS4ERR_INVAL as unexpect (because there is code in
> > > nfs4_check_cl_exchange_flags which explicitly generates it).
> > > If it just let this error through, instead of translating it to
> > > EIO,
> > > then the problem would go away.
> > 
> > The code in nfs4_check_cl_exchange_flags is there to check for
> > explicit NFSv4.1 protocol violations. Is it broken?
> 
> It is broken in that it reports -EINVAL when no arguments were
> invalid.
> This gets translated to -EIO when there was no IO error.

The purpose of that function is to ensure that the server is
advertising itself as a bona fide NFSv4.1 or newer server, and to
ensure that it is not replying to our EXCHANGE_ID request with some
flag or service that we did not expect and that might cause us to
break.

IOW: it is there to check protocol compliance. As far as I can tell, it
is doing that, and doing so correctly.


> Thanks,
> NeilBrown
> 
> 
> > 
> > > > 
> > > > What we might want to do, is make it easier to allow the user
> > > > to
> > > > detect
> > > > that this is indeed a server bug and is not a problem with the
> > > > arguments supplied to the "mount" utility. Perhaps we might
> > > > have
> > > > the
> > > > kernel log something in the syslogs?
> > > 
> > > Yes, logging a message might be useful.  Most of the messages
> > > logged
> > > about bad servers are currently going through dprintk(), so they
> > > won't
> > > often be seen.  Is that what we want??  Don't know...
> > > 
> > > Anyway, you point that it "is not a problem with the arguments"
> > > is
> > > stop-on.  If the client gets EINVAL from the server, then it
> > > shouldn't
> > > blindly report that back to the user as EINVAL means "Invalid
> > > argument" and the argements given to the server are probably not
> > > the
> > > argument given by the user.
> > > 
> > > Following that line of reasoning, I think
> > > nfs4_check_cl_exchange_flag()
> > > should *not* return -NFS4ERR_INVAL, and _nfs4_proc_exchange_id()
> > > shouldn't pass NFS4ERR_INVAL through unchanged.
> > > 
> > > So I propose the following version.
> > > 
> > > Thanks,
> > > NeilBrown
> > > 
> > > ------------------------8<---------------------------
> > > From: NeilBrown <neilb@suse.com>
> > > Date: Tue, 20 Mar 2018 11:31:33 +1100
> > > Subject: [PATCH] NFSv4: handle EINVAL from EXCHANGE_ID better.
> > > 
> > > nfs4_proc_exchange_id() can return -EINVAL if the server
> > > reported NFS4INVAL (which I have seen in a packet trace),
> > > or nfs4_check_cl_exchange_flags() exchange flags detects
> > > a problem.
> > > Each of these mean that NFSv4.1 and later cannot be used,
> > > but they should not prevent fallback to NFSv4.0.  Currently
> > > they do.
> > > 
> > > Currently this EINVAL error is returned by
> > > nfs4_proc_exchange_id() to nfs41_discover_server_trunking(),
> > > and thence to nfs4_discover_server_trunking().
> > > nfs4_discover_server_trunking() doesn't understand EINVAL,
> > > so converts it to EIO which causes mount.nfs to think
> > > something is horribly wrong and to give up.
> > > 
> > > EINVAL is never a sensible error code here.  It means "Invalid
> > > argument", but is being used when the problem is "Invalid
> > > response
> > > from the server".  If we change these two circumstances to report
> > > EPROTONOSUPPORT to the caller (which seems a reasonable
> > > assessment
> > > when the server gives confusing responses), and if we enhance
> > > nfs4_discover_server_trunking() to treat -EPROTONOSUPPORT as an
> > > expected error to pass through, then the error reported to user-
> > > space
> > > will be more representative of the actual fault.
> > > 
> > > A failure to negotiate a client ID clearly shows that NFSv4.1
> > > cannot
> > > be supported, but isn't as general a failure as EIO.
> > > 
> > > Signed-off-by: NeilBrown <neilb@suse.com>
> > > ---
> > >  fs/nfs/nfs4proc.c  | 18 +++++++++++++++---
> > >  fs/nfs/nfs4state.c |  1 +
> > >  2 files changed, 16 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index 47f3c273245e..97757f646f13 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -7364,7 +7364,8 @@ static int nfs4_check_cl_exchange_flags(u32
> > > flags)
> > >  		goto out_inval;
> > >  	return NFS_OK;
> > >  out_inval:
> > > -	return -NFS4ERR_INVAL;
> > > +	dprintk("NFS: server returns invalid flags for
> > > EXCHANGE_ID\n");
> > > +	return -EPROTONOSUPPORT;
> > >  }
> > >  
> > >  static bool
> > > @@ -7741,8 +7742,19 @@ static int _nfs4_proc_exchange_id(struct
> > > nfs_client *clp, struct rpc_cred *cred,
> > >  	int status;
> > >  
> > >  	task = nfs4_run_exchange_id(clp, cred, sp4_how, NULL);
> > > -	if (IS_ERR(task))
> > > -		return PTR_ERR(task);
> > > +	if (IS_ERR(task)) {
> > > +		status = PTR_ERR(task);
> > > +		if (status == -NFS4ERR_INVAL) {
> > > +			/* If the server think we did something
> > > invalid, it is certainly
> > > +			 * not the fault of our caller, so it
> > > would
> > > wrong to report
> > > +			 * this error back up.  So in that case
> > > simply acknowledge that
> > > +			 * we don't seem able to support this
> > > protocol.
> > > +			 */
> > > +			dprintk("NFS: server return
> > > NFS4ERR_INVAL to
> > > EXCHANGE_ID\n");
> > > +			status = -EPROTONOSUPPORT;
> > > +		}
> > > +		return status;
> > > +	}
> > >  
> > >  	argp = task->tk_msg.rpc_argp;
> > >  	resp = task->tk_msg.rpc_resp;
> > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > > index 91a4d4eeb235..273c032089c4 100644
> > > --- a/fs/nfs/nfs4state.c
> > > +++ b/fs/nfs/nfs4state.c
> > > @@ -2219,6 +2219,7 @@ int nfs4_discover_server_trunking(struct
> > > nfs_client *clp,
> > >  		clnt = clp->cl_rpcclient;
> > >  		goto again;
> > >  
> > > +	case -EPROTONOSUPPORT:
> > >  	case -NFS4ERR_MINOR_VERS_MISMATCH:
> > >  		status = -EPROTONOSUPPORT;
> > >  		break;
> > 
> > -- 
> > Trond Myklebust
> > Linux NFS client maintainer, PrimaryData
> > trond.myklebust@primarydata.com
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2018-03-20 19:44 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-15 23:29 [PATCH] " NeilBrown
2018-03-15 23:44 ` [PATCH - v2] " NeilBrown
2018-03-16  9:31   ` Mkrtchyan, Tigran
2018-03-16 12:10     ` Trond Myklebust
2018-03-20  0:32       ` NeilBrown
2018-03-20 14:14         ` Trond Myklebust
2018-03-20 18:44           ` NeilBrown
2018-03-20 19:43             ` Trond Myklebust [this message]
2018-03-20 19:58               ` NeilBrown
2018-03-20 20:13                 ` Trond Myklebust
2018-03-20 21:20                   ` NeilBrown
2018-03-20  0:09     ` NeilBrown
2018-03-20 21:48   ` J. Bruce Fields
2018-03-20 22:12     ` NeilBrown
2018-04-03  0:41     ` NeilBrown
2018-04-03 16:02       ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1521575027.89994.20.camel@primarydata.com \
    --to=trondmy@primarydata.com \
    --cc=anna.schumaker@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.com \
    --cc=tigran.mkrtchyan@desy.de \
    --subject='Re: [PATCH - v2] NFSv4: handle EINVAL from EXCHANGE_ID better.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.