All of lore.kernel.org
 help / color / mirror / Atom feed
From: "NeilBrown" <neilb@suse.de>
To: "Chuck Lever III" <chuck.lever@oracle.com>
Cc: "Benjamin Coddington" <bcodding@redhat.com>,
	"Steve Dickson" <SteveD@RedHat.com>,
	"Trond Myklebust" <trondmy@hammerspace.com>,
	"Linux NFS Mailing List" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH/RFC] NFSv4: ensure different netns do not share cl_owner_id
Date: Mon, 28 Mar 2022 15:32:31 +1100	[thread overview]
Message-ID: <164844195133.6096.11388357137493699567@noble.neil.brown.name> (raw)
In-Reply-To: <CC04FF50-B936-456D-8BF0-4BF04647B4BC@oracle.com>

On Sat, 26 Mar 2022, Chuck Lever III wrote:
> Hi Neil-
> 
> > On Mar 24, 2022, at 8:24 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > 
> > [[ This implements an idea I had while discussing the issues around
> >   NFSv4 client identity.  It isn't a solution, but instead aims to make
> >   the problem more immediately visible so that sites can "fix" their
> >   configuration before they get strange errors.
> >   I'm not convinced it is a good idea, but it seems worthy of a little
> >   discussion at least.
> >   There is a follow up patch to nfs-utils which expands this more
> >   towards a b-grade solution, but again if very open for discussion.
> > ]]
> > 
> > The default cl_owner_id is based on host name which may be common
> > amongst different network namespaces.  If two clients in different
> > network namespaces with the same cl_owner_id attempt to mount from the
> > same server, problem will occur as each client can "steal" the lease
> > from the other.
> 
> The immediate issue, IIUC, is that this helps only when the potentially
> conflicting containers are located on the same physical host. I would
> expect there are similar, but less probable, collision risks across
> a population of clients.

I see that as a separate issue - certainly similar but probably
requiring a separate solution.  I had hope to propose a (partial)
solution to that the same time, but it proved challenging.

I would like to automatically set nfs.nfs4_unique_id to something based
on /etc/machine_id if it isn't otherwise set.
- we could auto-generate /etc/modprobe.d/00-nfs-identity.conf
  but I suspect that would over-ride anything on the kernel command
  line.
- we could run a tool at boot and when udev notices that the module is
  loaded, and set the parameter if it isn't already set, but that might
  happen after the first mount
- we could get mount.nfs to check-and-set, but that might run in a mount
  namespace which sees a different /etc/machine-id
- we could change the kernel to support another module parameter.
  nfs.nfs4_unique_id_default, and set that via /etc/modprobe.d
  Then the kernel uses it only if nfs4_unique_id is not set.

I think this idea would be sufficiently safe if we could make it work.
I can't see how to make it work without a kernel change - and I don't
really like the kernel change I suggested. 

> 
> I guess I was also under the impression that NFS mount(2) can return
> EADDRINUSE already, but I might be wrong about that.

Maybe it could return EADDRINUSE if all privileged ports were in use ...
I'd need to check that.

> 
> In regard to the general issues around client ID, my approach has been
> to increase the visibility of CLID_INUSE errors. At least on the
> server, there are trace points that fire when this happens. Perhaps
> the server and client could be more verbose about this situation.

I found the RFC a bit unclear here, but my understanding is that
CLID_INUSE will only get generated if krb5 authentication is used for
EXCHANGE_ID, and the credentials for the clients are different.
This is certainly a case worth handling well, but I doubt it would
affect a high proportion of NFS installations.

Or have I misunderstood?

> 
> Our server might also track the last few boot verifiers for a client
> ID, and if it sees them repeating, issue a warning? That becomes
> onerous if there are more than a handful of clients with the same
> co_ownerid, however.

That's an interesting idea.  There would be no guarantees, but I would
probably report some errors, which would be enough to flag to the
problem to the sysadmin.

Thanks,
NeilBrown

  reply	other threads:[~2022-03-28  4:32 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-25  0:24 NeilBrown
2022-03-25 15:06 ` Chuck Lever III
2022-03-28  4:32   ` NeilBrown [this message]
2022-03-28 14:08     ` Chuck Lever III
2022-03-29 15:11       ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=164844195133.6096.11388357137493699567@noble.neil.brown.name \
    --to=neilb@suse.de \
    --cc=SteveD@RedHat.com \
    --cc=bcodding@redhat.com \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    --subject='Re: [PATCH/RFC] NFSv4: ensure different netns do not share cl_owner_id' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.