linux-fsdevel.vger.kernel.org archive mirror
From: Al Viro <viro@zeniv.linux.org.uk>
To: "Tobin C. Harding" <me@tobin.cc>
Cc: "Tobin C. Harding" <tobin@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>,
	Alexander Viro <viro@ftp.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Christopher Lameter <cl@linux.com>,
	Matthew Wilcox <willy@infradead.org>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Andreas Dilger <adilger@dilger.ca>,
	Waiman Long <longman@redhat.com>, Tycho Andersen <tycho@tycho.ws>,
	Theodore Ts'o <tytso@mit.edu>, Andi Kleen <ak@linux.intel.com>,
	David Chinner <david@fromorbit.com>,
	Nick Piggin <npiggin@gmail.com>, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>, Jonathan Corbet <corbet@lwn.net>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC PATCH v3 14/15] dcache: Implement partial shrink via Slab Movable Objects
Date: Thu, 11 Apr 2019 22:02:00 +0100
Message-ID: <20190411210200.GH2217@ZenIV.linux.org.uk>
In-Reply-To: <20190411044746.GE2217@ZenIV.linux.org.uk>

On Thu, Apr 11, 2019 at 05:47:46AM +0100, Al Viro wrote:

> Note, BTW, that umount coming between isolate and drop is not a problem;
> it calls shrink_dcache_parent() on the root.  And if shrink_dcache_parent()
> finds something on (another) shrink list, it won't put it to the shrink
> list of its own, but it will make note of that and repeat the scan in
> such case.  So if we find something with zero refcount and not on
> shrink list, we can move it to our shrink list and be sure that its
> superblock won't go away under us...

Aaaarrgghhh...  No, we can't.  Look: we get one candidate dentry in isolate
phase.  We put it into shrink list.  umount(2) comes and calls
shrink_dcache_for_umount(), which calls shrink_dcache_parent(root).
Meanwhile, shrink_dentry_list() runs and does __dentry_kill() on
that one dentry.  Fine, it's gone - before shrink_dcache_parent() even
sees it.  Now shrink_dentry_list() holds a reference to its parent and
is about to drop it in
                dentry = parent;
                while (dentry && !lockref_put_or_lock(&dentry->d_lockref))
                        dentry = dentry_kill(dentry);
And dropped it will be, but... shrink_dcache_parent() has already finished
its scan without finding *anything* with zero refcount: the thing that used
to be on the shrink list was gone before shrink_dcache_parent() got there,
and the reference to the parent had not been dropped yet.  So
shrink_dcache_for_umount() plows past shrink_dcache_parent(), walks the
tree and complains loudly about "busy" dentries (that parent we hadn't
finished dropping), and then we proceed with filesystem shutdown.
Meanwhile, dentry_kill() finally gets to killing that parent and
triggers an unexpected late call of ->d_iput() on a filesystem that
is already far enough into shutdown to have destroyed the
data structures needed for that sucker.
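
To make the ordering concrete, here is a rough userspace sketch of the
interleaving described above.  It is a toy model, not kernel code: the
toy_sb and toy_dentry types and all toy_*() helpers are made up for
illustration, and the two threads are replayed as one fixed sequence of
steps rather than actually run concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's super_block or dentry. */
struct toy_sb {
	bool shut_down;        /* set once the toy "kill_sb" has run */
	int  late_iput_calls;  /* kills that happened after shutdown */
};

struct toy_dentry {
	struct toy_dentry *parent;
	struct toy_sb *sb;
	int  refcount;
	bool killed;
};

/* Models __dentry_kill(): the victim dies and the caller inherits
 * the victim's reference to its parent. */
struct toy_dentry *toy_kill(struct toy_dentry *d)
{
	d->killed = true;
	if (d->sb->shut_down)
		d->sb->late_iput_calls++;  /* the late ->d_iput() case */
	return d->parent;
}

/* Counts live dentries that are busy (refcount > 0) or unused. */
int toy_count(struct toy_dentry **all, size_t n, bool want_busy)
{
	int hits = 0;

	for (size_t i = 0; i < n; i++) {
		if (all[i]->killed)
			continue;
		if ((all[i]->refcount > 0) == want_busy)
			hits++;
	}
	return hits;
}

/* Replays the interleaving; returns the number of late kills. */
int toy_simulate_race(int *busy_seen)
{
	struct toy_sb sb = { false, 0 };
	struct toy_dentry parent = { NULL, &sb, 1, false }; /* pinned by child */
	struct toy_dentry child  = { &parent, &sb, 0, false };
	struct toy_dentry *all[] = { &parent, &child };

	/* Shrinker side: the isolated child is killed, parent ref is held. */
	struct toy_dentry *held = toy_kill(&child);

	/* umount side: the scan finds nothing with zero refcount... */
	assert(toy_count(all, 2, false) == 0);
	/* ...so it moves on, sees the parent as "busy", and shuts down. */
	*busy_seen = toy_count(all, 2, true);
	sb.shut_down = true;

	/* Shrinker side again: drop the held reference, too late. */
	if (--held->refcount == 0)
		toy_kill(held);
	return sb.late_iput_calls;
}
```

In this model toy_simulate_race() reports one "busy" dentry at umount time
and one kill after shutdown, which is the sequence the paragraph above walks
through.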

The reason we don't hit that problem with the regular memory shrinker is
this:
                unregister_shrinker(&s->s_shrink);
                fs->kill_sb(s);
in deactivate_locked_super().  IOW, the shrinker for this fs is gone
before we get around to shutdown.  And so are all normal sources
of dentry eviction for that fs.

Your earlier variants all suffer the same problem - picking a page
shared by dentries from several superblocks can run into trouble
if it overlaps with umount of one of those.

Fuck...  One possible solution would be to have a per-superblock
struct kmem_cache used for the dentries of that superblock.
However,
	* we'd need to prevent them from getting merged
	* it would add per-superblock memory costs (for struct
	  kmem_cache and associated structures)
	* it might mean more pages eaten by the dentries -
	  on average half a page per superblock (more if there are very
	  few dentries on that superblock)
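
As a back-of-the-envelope check of the last point, the sketch below compares
page counts for one shared cache against per-superblock caches.  The figures
are assumptions, not numbers from this thread: a 4096-byte page, one page per
slab, a 192-byte dentry, and perfectly dense packing in both cases.  All
toy_*() names are made up for illustration.

```c
#include <stddef.h>

/* Assumed figures for illustration only; neither comes from the thread. */
#define TOY_PAGE_SIZE   4096u  /* one order-0 page per slab (assumption) */
#define TOY_DENTRY_SIZE 192u   /* assumed dentry size in bytes */

/* Objects that fit in one slab page under the assumptions above. */
unsigned int toy_objs_per_page(void)
{
	return TOY_PAGE_SIZE / TOY_DENTRY_SIZE;
}

/* Pages needed to hold n densely packed dentries. */
unsigned int toy_pages_needed(unsigned int n)
{
	unsigned int per = toy_objs_per_page();

	return (n + per - 1) / per;
}

/* Object slots left empty in the pages holding n dentries. */
unsigned int toy_unused_slots(unsigned int n)
{
	return toy_pages_needed(n) * toy_objs_per_page() - n;
}

/* Extra pages used when each superblock gets its own cache,
 * compared with one shared cache holding the same dentries. */
unsigned int toy_extra_pages(const unsigned int *per_sb, size_t nsb)
{
	unsigned int total = 0, split_pages = 0;

	for (size_t i = 0; i < nsb; i++) {
		total += per_sb[i];
		split_pages += toy_pages_needed(per_sb[i]);
	}
	return split_pages - toy_pages_needed(total);
}
```

With three superblocks holding 1, 22 and 100 dentries, this model needs 8
pages split per superblock against 6 shared, i.e. 2 extra pages.  Real slabs
are never packed this densely, so treat it only as a rough lower-bound
illustration of the cost.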

OTOH, it might actually improve the memory footprint - all
dentries sharing a page would be from the same superblock,
so the use patterns might be more similar, which might
lower the fragmentation...

Hell knows...  I'd like to hear an opinion from VM folks on
that one.  Comments?
