Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
From: "bfields@fieldses.org" <bfields@fieldses.org>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"bfields@redhat.com" <bfields@redhat.com>,
	Kinglong Mee <kinglongmee@gmail.com>
Subject: Re: [PATCH] SUNRPC/cache: Allow garbage collection of invalid cache entries
Date: Thu, 26 Mar 2020 16:40:01 -0400
Message-ID: <20200326204001.GA25053@fieldses.org> (raw)
In-Reply-To: <20200207181817.GC17036@fieldses.org>

Sorry, just getting back to this:

On Fri, Feb 07, 2020 at 01:18:17PM -0500, bfields@fieldses.org wrote:
> On Fri, Feb 07, 2020 at 02:25:27PM +0000, Trond Myklebust wrote:
> > On Thu, 2020-02-06 at 11:33 -0500, J. Bruce Fields wrote:
> > > On Tue, Jan 14, 2020 at 11:57:38AM -0500, Trond Myklebust wrote:
> > > > If the cache entry never gets initialised, we want the garbage
> > > > collector to be able to evict it. Otherwise if the upcall daemon
> > > > fails to initialise the entry, we end up never expiring it.
> > > 
> > > Could you tell us more about what motivated this?
> > > 
> > > It's causing failures on pynfs server-reboot tests.  I haven't pinned
> > > down the cause yet, but it looks like it could be a regression to the
> > > behavior Kinglong Mee describes in detail in his original patch.
> > > 
> > 
> > Can you point me to the tests that are failing?
> 
> I'm basically doing
> 
> 	./nfs4.1/testserver.py myserver:/path reboot
> 			--serverhelper=examples/server_helper.sh
> 			--serverhelperarg=myserver
> 
> For all I know at this point, the change could be exposing a pynfs-side
> bug.

From a trace, it's clear that the server is actually becoming
unresponsive, so it's not a pynfs bug.

> > The motivation here is to allow the garbage collector to do its job of
> > evicting cache entries after they are supposed to have timed out.
> 
> Understood.  I was curious whether this was found by code inspection or
> because you'd run across a case where the leak was causing a practical
> problem.

I'm still curious.

> > The fact that uninitialised cache entries are given an infinite
> > lifetime, and are never evicted is a de facto memory leak if, for
> > instance, the mountd daemon ignores the cache request, or the downcall
> > in expkey_parse() or svc_export_parse() fails without being able to
> > update the request.

If mountd ignores cache requests, or downcalls fail, then the server's
broken anyway.  The server can't do anything without mountd.

> > The threads that are waiting for the cache replies already have a
> > mechanism for dealing with timeouts (with cache_wait_req() and
> > deferred requests), so the question is what is so special about
> > uninitialised requests that we have to leak them in order to avoid a
> > problem with reboot?

I'm not sure I have this right yet.  I'm just staring at the code and at
Kinglong Mee's description on d6fc8821c2d2.  I think the way it works is
that a cash flush from mountd results in all cache entries (including
invalid entries that nfsd threads are waiting on) being considered
expired.  So cache_check() returns an immediate ETIMEDOUT without
waiting.

Maybe the cache_is_expired() logic should be something more like:

	if (h->expiry_time < seconds_since_boot())
		return true;
	if (!test_bit(CACHE_VALID, &h->flags))
		return false;
	return h->expiry_time < seconds_since_boot();

So invalid cache entries (which are waiting for a reply from mountd) can
expire, but they can't be flushed.  If that makes sense.

As a stopgap we may want to revert or drop the "Allow garbage
collection" patch, as the (preexisting) memory leak seems lower impact
than the server hang.

--b.

  parent reply index

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-14 16:57 Trond Myklebust
2020-02-06 16:33 ` J. Bruce Fields
2020-02-07 14:25   ` Trond Myklebust
2020-02-07 18:18     ` bfields
2020-02-10 18:47       ` Trond Myklebust
2020-03-26 20:40       ` bfields [this message]
2020-03-26 21:42         ` Trond Myklebust
2020-03-27  1:50           ` J. Bruce Fields
2020-03-27 12:33             ` Trond Myklebust
2020-03-27 15:53               ` [PATCH] SUNRPC/cache: don't allow invalid entries to be flushed J. Bruce Fields
2020-03-27 16:15                 ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200326204001.GA25053@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=bfields@redhat.com \
    --cc=kinglongmee@gmail.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org
	public-inbox-index linux-nfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git