From: Dave Chinner <david@fromorbit.com>
To: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Kirill Tkhai <tkhai@ya.ru>,
	akpm@linux-foundation.org, vbabka@suse.cz,
	viro@zeniv.linux.org.uk, brauner@kernel.org, djwong@kernel.org,
	hughd@google.com, paulmck@kernel.org, muchun.song@linux.dev,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	zhengqi.arch@bytedance.com
Subject: Re: [PATCH v2 3/3] fs: Use delayed shrinker unregistration
Date: Tue, 6 Jun 2023 11:24:32 +1000
Message-ID: <ZH6K0McWBeCjaf16@dread.disaster.area>
In-Reply-To: <ZH6AA72wOd4HKTKE@P9FQF9L96D>

On Mon, Jun 05, 2023 at 05:38:27PM -0700, Roman Gushchin wrote:
> On Mon, Jun 05, 2023 at 10:03:25PM +0300, Kirill Tkhai wrote:
> > Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
> > test case caused by commit: f95bdb700bc6 ("mm: vmscan: make global slab
> > shrink lockless"). Qi Zheng found that the cause is the long SRCU
> > synchronize_srcu() call occurring in unregister_shrinker().
> > 
> > This patch fixes the problem by using new unregistration interfaces,
> > which split unregister_shrinker() in two parts. First part actually only
> > notifies shrinker subsystem about the fact of unregistration and it prevents
> > future shrinker methods calls. The second part completes the unregistration
> > and it ensures that struct shrinker is not used during shrinker chain
> > iteration anymore, so shrinker memory may be freed. Since the long second
> > part is called from delayed work asynchronously, it hides synchronize_srcu()
> > delay from a user.
> > 
> > Signed-off-by: Kirill Tkhai <tkhai@ya.ru>
> > ---
> >  fs/super.c |    3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/super.c b/fs/super.c
> > index 8d8d68799b34..f3e4f205ec79 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -159,6 +159,7 @@ static void destroy_super_work(struct work_struct *work)
> >  							destroy_work);
> >  	int i;
> >  
> > +	unregister_shrinker_delayed_finalize(&s->s_shrink);
> >  	for (i = 0; i < SB_FREEZE_LEVELS; i++)
> >  		percpu_free_rwsem(&s->s_writers.rw_sem[i]);
> >  	kfree(s);
> > @@ -327,7 +328,7 @@ void deactivate_locked_super(struct super_block *s)
> >  {
> >  	struct file_system_type *fs = s->s_type;
> >  	if (atomic_dec_and_test(&s->s_active)) {
> > -		unregister_shrinker(&s->s_shrink);
> > +		unregister_shrinker_delayed_initiate(&s->s_shrink);
> 
> Hm, it makes the API more complex and easier to mess up. Like what will happen
> if the second part is never called? Or it's called without the first part being
> called first?

Bad things.
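
To make the misuse cases concrete, here is a minimal userspace sketch
of a state guard for a two-part unregistration. Everything in it
(struct guarded_shrinker, the guarded_*() helpers, the state names) is
hypothetical and is not part of the posted patch or of any kernel API;
it only illustrates how "finalize without initiate" or "free without
finalize" could be turned into a detectable error instead of silent
corruption.

```c
#include <errno.h>

/* Hypothetical states for a two-part unregistration (not kernel code). */
enum unreg_state {
	SHRINKER_REGISTERED,       /* callbacks may still be invoked */
	SHRINKER_UNREG_INITIATED,  /* first part done, no new callbacks */
	SHRINKER_UNREG_DONE,       /* second part done, memory may be freed */
};

struct guarded_shrinker {
	enum unreg_state state;
};

/* First part: only valid on a registered shrinker. */
int guarded_initiate(struct guarded_shrinker *s)
{
	if (s->state != SHRINKER_REGISTERED)
		return -EINVAL;
	s->state = SHRINKER_UNREG_INITIATED;
	return 0;
}

/* Second part: only valid after the first part has run. */
int guarded_finalize(struct guarded_shrinker *s)
{
	if (s->state != SHRINKER_UNREG_INITIATED)
		return -EINVAL;
	s->state = SHRINKER_UNREG_DONE;
	return 0;
}

/* The containing object may only be freed once both parts have run. */
int guarded_safe_to_free(const struct guarded_shrinker *s)
{
	return s->state == SHRINKER_UNREG_DONE;
}
```

With such a guard, skipping the second part leaves
guarded_safe_to_free() returning 0, and calling the second part first
returns -EINVAL. A guard like this only detects the misuse; it does
not remove the burden on every caller to invoke both parts in order.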

Also, it doesn't fix the three other unregister_shrinker() calls in
the XFS unmount path, nor the three in the ext4/mbcache/jbd2 unmount
path.

Those are just some of the unregister_shrinker() calls that have
dynamic contexts that would also need this same fix; I haven't
audited the 3 dozen other unregister_shrinker() calls around the
kernel to determine if any of them need similar treatment, too.

IOWs, this patchset is purely a band-aid to fix the reported
regression, not an actual fix for the underlying problems caused by
moving the shrinker infrastructure to SRCU protection.  This is why
I really want the SRCU changeover reverted.

Not only are significant API changes necessary, it has also
put the entire shrinker path under an SRCU critical section. AIUI,
this means while the shrinkers are running the RCU grace period
cannot expire and no RCU freed memory will actually get freed until
the srcu read lock is dropped by the shrinker.

Given the superblock shrinkers are freeing dentry and inode objects
by RCU freeing, this is also a fairly significant change of
behaviour, i.e. cond_resched() in the shrinker processing loops no
longer allows RCU grace periods to expire and have memory freed while
the shrinkers are running.

Are there problems this will cause? I don't know, but I'm pretty
sure they haven't even been considered until now....

> Isn't it possible to hide it from a user and call the second part from a work
> context automatically?

Nope, because it has to be done before the struct shrinker is freed.
Those are embedded into other structures rather than being
dynamically allocated objects. Hence the synchronize_srcu() has to
complete before the structure the shrinker is embedded in is freed.

Now, this can be dealt with by having register_shrinker() return an
allocated struct shrinker and the callers only keep a pointer, but
that's an even bigger API change. But, IMO, it is an API change that
should have been done before SRCU was introduced precisely because
it allows decoupling of shrinker execution and completion from
the owning structure.

Then we can stop shrinker execution, wait for it to complete and
prevent future execution in unregister_shrinker(), then punt the
expensive shrinker list removal to background work where processing
delays just don't matter for dead shrinker instances. It doesn't
need SRCU at all...
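
The scheme described above can be sketched in userspace C. This is a
single-threaded model under stated assumptions, not kernel code: the
names model_register(), model_unregister(), model_run_all() and
model_reap_dead() are hypothetical, the "background work" is just a
function the caller invokes later, and the step of waiting for
in-flight scans is omitted because nothing runs concurrently here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a dynamically allocated shrinker. */
struct model_shrinker {
	long (*scan)(void *owner);    /* callback into the owning object */
	void *owner;                  /* stands in for e.g. a superblock */
	bool dead;                    /* set once unregistration started */
	struct model_shrinker *next;  /* global list linkage */
};

static struct model_shrinker *shrinker_list;

/* Allocate a shrinker; the caller keeps only the returned pointer. */
struct model_shrinker *model_register(long (*scan)(void *), void *owner)
{
	struct model_shrinker *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->scan = scan;
	s->owner = owner;
	s->next = shrinker_list;
	shrinker_list = s;
	return s;
}

/* Fast part: stop future execution and detach from the owner. */
void model_unregister(struct model_shrinker *s)
{
	s->dead = true;
	s->owner = NULL;  /* the owner may be freed after this returns */
}

/* Run every live shrinker; dead ones are skipped, not dereferenced. */
long model_run_all(void)
{
	long total = 0;

	for (struct model_shrinker *s = shrinker_list; s; s = s->next)
		if (!s->dead)
			total += s->scan(s->owner);
	return total;
}

/* Slow part, meant for background work: unlink and free dead entries. */
size_t model_reap_dead(void)
{
	size_t freed = 0;
	struct model_shrinker **pp = &shrinker_list;

	while (*pp) {
		struct model_shrinker *s = *pp;

		if (s->dead) {
			*pp = s->next;
			free(s);
			freed++;
		} else {
			pp = &s->next;
		}
	}
	return freed;
}

/* Sample callback for illustration: bump a counter, report 1 object. */
long model_count_scan(void *owner)
{
	*(long *)owner += 1;
	return 1;
}
```

The point of the model is the lifetime split: after model_unregister()
returns, the owning structure can go away immediately because the
shrinker no longer points at it, while the list removal and free of
the shrinker itself happen whenever model_reap_dead() gets around to
it. A real implementation would additionally need to wait for scans
already in progress and to protect the list against concurrent
walkers.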

-Dave.
-- 
Dave Chinner
david@fromorbit.com
