linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "NeilBrown" <neilb@suse.de>
To: "Chuck Lever" <chuck.lever@oracle.com>
Cc: "Al Viro" <viro@zeniv.linux.org.uk>,
	"Christian Brauner" <brauner@kernel.org>,
	"Jens Axboe" <axboe@kernel.dk>, "Oleg Nesterov" <oleg@redhat.com>,
	"Jeff Layton" <jlayton@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH 1/3] nfsd: use __fput_sync() to avoid delayed closing of files.
Date: Mon, 11 Dec 2023 09:47:35 +1100	[thread overview]
Message-ID: <170224845504.12910.16483736613606611138@noble.neil.brown.name> (raw)
In-Reply-To: <ZXMv4psmTWw4mlCd@tissot.1015granger.net>

On Sat, 09 Dec 2023, Chuck Lever wrote:
> On Fri, Dec 08, 2023 at 02:27:26PM +1100, NeilBrown wrote:
> > Calling fput() directly or though filp_close() from a kernel thread like
> > nfsd causes the final __fput() (if necessary) to be called from a
> > workqueue.  This means that nfsd is not forced to wait for any work to
> > complete.  If the ->release of ->destroy_inode function is slow for any
> > reason, this can result in nfsd closing files more quickly than the
> > workqueue can complete the close and the queue of pending closes can
> > grow without bounces (30 million has been seen at one customer site,
> > though this was in part due to a slowness in xfs which has since been
> > fixed).
> > 
> > nfsd does not need this.
> 
> That is technically true, but IIUC, there is only one case where a
> synchronous close matters for the backlog problem, and that's when
> nfsd_file_free() is called from nfsd_file_put(). AFAICT all other
> call sites (except rename) are error paths, so there aren't negative
> consequences for the lack of synchronous wait there...

What you say is technically true but it isn't the way I see it.

Firstly I should clarify that __fput_sync() is *not* a flushing close as
you describe it below.
All it does, apart for some trivial book-keeping, is to call ->release
and possibly ->destroy_inode immediately rather than shunting them off
to another thread.
Apparently ->release sometimes does something that can deadlock with
some kernel threads or if some awkward locks are held, so the whole
final __fput is delay by default.  But this does not apply to nfsd.
Standard fput() is really the wrong interface for nfsd to use.  
It should use __fput_sync() (which shouldn't have such a scary name).

The comment above flush_delayed_fput() seems to suggest that unmounting
is a core issue.  Maybe the fact that __fput() can call
dissolve_on_fput() is a reason why it is sometimes safer to leave the
work to later.  But I don't see that applying to nfsd.

Of course a ->release function *could* do synchronous writes just like
the XFS ->destroy_inode function used to do synchronous reads.
I don't think we should ever try to hide that by putting it in
a workqueue.  It's probably a bug and it is best if bugs are visible.

Note that the XFS ->release function does call filemap_flush() in some
cases, but that is an async flush, so __fput_sync doesn't wait for the
flush to complete.

The way I see this patch is that fput() is the wrong interface for nfsd
to use, __fput_sync is the right interface.  So we should change.  1
patch.
The details about exhausting memory explain a particular symptom that
motivated the examination which revealed that nfsd was using the wrong
interface.

If we have nfsd sometimes using fput() and sometimes __fput_sync, then
we need to have clear rules for when to use which.  It is much easier to
have a simple rule: always use __fput_sync().

I'm certainly happy to revise function documentation and provide
wrapper functions if needed.

I might be good to have

  void filp_close_sync(struct file *f)
  {
       get_file(f);
       filp_close(f);
       __fput_sync(f);
  }

but as that would only be called once, it was hard to motivate.
Having it in linux/fs.h would be nice.

Similarly would could wrap __fput_sync() is a more friendly name, but
that would be better if we actually renamed the function.

  void fput_now(struct file *f)
  {
      __fput_sync(f);
  }

??

Thanks,
NeilBrown

  reply	other threads:[~2023-12-10 22:59 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-08  3:27 [PATCH 0/3] nfsd: fully close all files in the nfsd threads NeilBrown
2023-12-08  3:27 ` [PATCH 1/3] nfsd: use __fput_sync() to avoid delayed closing of files NeilBrown
2023-12-08 15:01   ` Chuck Lever
2023-12-10 22:47     ` NeilBrown [this message]
2023-12-11 19:01       ` Chuck Lever
2023-12-11 22:04         ` NeilBrown
2023-12-12 16:17           ` Chuck Lever
2023-12-11 19:11       ` Al Viro
2023-12-11 22:23         ` NeilBrown
2023-12-11 23:13           ` Al Viro
2023-12-11 23:21             ` Al Viro
2023-12-13  0:28               ` NeilBrown
2023-12-15 18:27                 ` David Laight
2023-12-15 19:35                   ` Chuck Lever III
2023-12-15 22:36                   ` NeilBrown
2023-12-15 18:28           ` David Laight
2023-12-16  1:50       ` Dave Chinner
2023-12-08  3:27 ` [PATCH 2/3] nfsd: Don't leave work of closing files to a work queue NeilBrown
2023-12-08  3:27 ` [PATCH 3/3] VFS: don't export flush_delayed_fput() NeilBrown
2023-12-08 11:40 ` [PATCH 0/3] nfsd: fully close all files in the nfsd threads Jeff Layton
2023-12-08 14:33 ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=170224845504.12910.16483736613606611138@noble.neil.brown.name \
    --to=neilb@suse.de \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).