From: "NeilBrown" <neilb@suse.de>
To: "Chuck Lever III" <chuck.lever@oracle.com>
Cc: "Benjamin Coddington" <bcodding@redhat.com>,
	"Steve Dickson" <SteveD@RedHat.com>,
	"Trond Myklebust" <trondmy@hammerspace.com>,
	"Linux NFS Mailing List" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH/RFC] NFSv4: ensure different netns do not share cl_owner_id
Date: Mon, 28 Mar 2022 15:32:31 +1100
Message-ID: <164844195133.6096.11388357137493699567@noble.neil.brown.name>
In-Reply-To: <CC04FF50-B936-456D-8BF0-4BF04647B4BC@oracle.com>

On Sat, 26 Mar 2022, Chuck Lever III wrote:
> Hi Neil-
> 
> > On Mar 24, 2022, at 8:24 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > 
> > [[ This implements an idea I had while discussing the issues around
> >   NFSv4 client identity.  It isn't a solution, but instead aims to make
> >   the problem more immediately visible so that sites can "fix" their
> >   configuration before they get strange errors.
> >   I'm not convinced it is a good idea, but it seems worthy of a little
> >   discussion at least.
> >   There is a follow-up patch to nfs-utils which expands this more
> >   towards a b-grade solution, but again it is very open for discussion.
> > ]]
> > 
> > The default cl_owner_id is based on host name which may be common
> > amongst different network namespaces.  If two clients in different
> > network namespaces with the same cl_owner_id attempt to mount from the
> > same server, problems will occur as each client can "steal" the lease
> > from the other.
> 
> The immediate issue, IIUC, is that this helps only when the potentially
> conflicting containers are located on the same physical host. I would
> expect there are similar, but less probable, collision risks across
> a population of clients.

I see that as a separate issue - certainly similar, but probably
requiring a separate solution.  I had hoped to propose a (partial)
solution to that at the same time, but it proved challenging.

I would like to automatically set nfs.nfs4_unique_id to something based
on /etc/machine-id if it isn't otherwise set.
- we could auto-generate /etc/modprobe.d/00-nfs-identity.conf,
  but I suspect that would override anything on the kernel command
  line.
- we could run a tool at boot, and again when udev notices that the
  module has been loaded, which sets the parameter if it isn't already
  set - but that might happen after the first mount
- we could get mount.nfs to check-and-set, but that might run in a mount
  namespace which sees a different /etc/machine-id
- we could change the kernel to support another module parameter,
  nfs.nfs4_unique_id_default, and set that via /etc/modprobe.d.
  The kernel would then use it only if nfs4_unique_id is not set.
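The boot/udev-triggered tool in the second option might look roughly
like the sketch below.  This is purely illustrative: the function name
is made up, the hashing choice is one possibility among many, and it
assumes the parameter is writable at the usual sysfs location for
module parameters, /sys/module/nfs/parameters/nfs4_unique_id.  Both
paths are arguments so the sketch can be exercised without root:

```shell
# Hypothetical helper: seed nfs.nfs4_unique_id from the machine ID,
# but only when nothing has set the parameter already.
# Usage: set_default_unique_id MACHINE_ID_FILE PARAM_FILE
set_default_unique_id() {
    machine_id_file="$1"    # normally /etc/machine-id
    param_file="$2"         # normally /sys/module/nfs/parameters/nfs4_unique_id

    # Respect an identity the admin (or a previous run) already set.
    current=$(cat "$param_file" 2>/dev/null)
    if [ -n "$current" ]; then
        return 0
    fi
    if [ ! -r "$machine_id_file" ]; then
        return 1
    fi
    read -r mid < "$machine_id_file"

    # Hash rather than copy: machine-id(5) advises against exposing
    # the raw machine ID off the host.
    printf '%s' "$mid" | sha256sum | cut -c1-32 > "$param_file"
}
```

Run from a udev rule or boot service this is check-and-set, so an
explicit nfs4_unique_id given on the kernel command line would win.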

I think this idea would be sufficiently safe if we could make it work.
I can't see how to make it work without a kernel change - and I don't
really like the kernel change I suggested. 

> 
> I guess I was also under the impression that NFS mount(2) can return
> EADDRINUSE already, but I might be wrong about that.

Maybe it could return EADDRINUSE if all privileged ports were in use ...
I'd need to check that.

> 
> In regard to the general issues around client ID, my approach has been
> to increase the visibility of CLID_INUSE errors. At least on the
> server, there are trace points that fire when this happens. Perhaps
> the server and client could be more verbose about this situation.

I found the RFC a bit unclear here, but my understanding is that
CLID_INUSE will only get generated if krb5 authentication is used for
EXCHANGE_ID, and the credentials for the clients are different.
This is certainly a case worth handling well, but I doubt it would
affect a high proportion of NFS installations.

Or have I misunderstood?

> 
> Our server might also track the last few boot verifiers for a client
> ID, and if it sees them repeating, issue a warning? That becomes
> onerous if there are more than a handful of clients with the same
> co_ownerid, however.

That's an interesting idea.  There would be no guarantees, but it would
probably report some errors, which would be enough to flag the problem
to the sysadmin.
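For what it's worth, the verifier-tracking idea can be modelled in a few
lines of shell - a toy, not server code.  All names here are invented:
state is a directory with one file per co_ownerid holding the last few
boot verifiers, and a repeat triggers the warning, since a boot verifier
should be fresh on every boot:

```shell
# Toy model of tracking recent boot verifiers per co_ownerid.
# Usage: track_boot_verifier CO_OWNERID VERIFIER STATE_DIR
track_boot_verifier() {
    owner="$1"; verifier="$2"; state="$3"
    f="$state/$owner"
    if [ -f "$f" ] && grep -qxF "$verifier" "$f"; then
        # A repeated verifier hints that two clients with the same
        # co_ownerid are trading the lease back and forth.
        echo "warning: repeated boot verifier for co_ownerid '$owner'" >&2
        return 1
    fi
    # Remember only the last few (here 4) verifiers per owner.
    {
        if [ -f "$f" ]; then tail -n 3 "$f"; fi
        echo "$verifier"
    } > "$f.tmp" && mv "$f.tmp" "$f"
}
```

As noted above, per-owner state like this stays cheap only while the
number of clients sharing a co_ownerid is small.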

Thanks,
NeilBrown

Thread overview: 5+ messages
2022-03-25  0:24 [PATCH/RFC] NFSv4: ensure different netns do not share cl_owner_id NeilBrown
2022-03-25 15:06 ` Chuck Lever III
2022-03-28  4:32   ` NeilBrown [this message]
2022-03-28 14:08     ` Chuck Lever III
2022-03-29 15:11       ` J. Bruce Fields
