From: "Paul E. McKenney" <paulmck@kernel.org>
To: Michal Hocko <mhocko@suse.com>
Cc: Uladzislau Rezki <urezki@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
Matthew Wilcox <willy@infradead.org>,
"Theodore Y . Ts'o" <tytso@mit.edu>,
Joel Fernandes <joel@joelfernandes.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag
Date: Fri, 14 Aug 2020 11:01:41 -0700
Message-ID: <20200814180141.GP4295@paulmck-ThinkPad-P72>
In-Reply-To: <20200814140604.GE9477@dhcp22.suse.cz>

On Fri, Aug 14, 2020 at 04:06:04PM +0200, Michal Hocko wrote:
> On Fri 14-08-20 06:34:50, Paul E. McKenney wrote:
> > On Fri, Aug 14, 2020 at 02:48:32PM +0200, Michal Hocko wrote:
> > > On Fri 14-08-20 14:15:44, Uladzislau Rezki wrote:
> > > > > On Thu 13-08-20 19:09:29, Thomas Gleixner wrote:
> > > > > > Michal Hocko <mhocko@suse.com> writes:
> > > > > [...]
> > > > > > > Why should we limit the functionality of the allocator for something
> > > > > > > that is not a real problem?
> > > > > >
> > > > > > We'd limit the allocator for exactly ONE new user which was aware of
> > > > > > this problem _before_ the code hit mainline. And that ONE user is
> > > > > > prepared to handle the fail.
> > > > >
> > > > > If we are to limit the functionality to this one particular user then
> > > > > I would consider a dedicated gfp flag a huge overkill. It would be much
> > > > > easier to have a preallocated pool of pages and use those and
> > > > > completely avoid the core allocator. That would certainly only shift the
> > > > > complexity to the caller but if it is expected there would be only that
> > > > > single user then it would be probably better than opening a can of worms
> > > > > like allocator usable from raw spin locks.
> > > > >
> > > > Vlastimil raised the same question earlier. I answered, but let me answer again:
> > > >
> > > > It is hard to achieve because the logic is not tied to any particular static
> > > > test case, i.e. it depends on how heavily kfree_rcu(single/double) is used.
> > > > That "how heavily" determines the number of pages that accumulate until the
> > > > drain/reclaimer thread frees them.
> > >
> > > How many pages are we talking about - ball park? 100s, 1000s?
> >
> > Under normal operation, a couple of pages per CPU, which would make
> > preallocation entirely reasonable. Except that if someone does something
> > that floods RCU callbacks (close(open()) in a tight userspace loop, for but
> > one example), then 2000 per CPU might not be enough, which on a 64-CPU
> > system comes to about 500MB. This is beyond excessive for preallocation
> > on the systems I am familiar with.
> >
> > And the flooding case is where you most want the reclamation to be
> > efficient, and thus where you want the pages.
>
> I am not sure the page allocator would help you with this scenario
> unless you are on very large machines. Pagesets scale with the available
> memory and percpu_pagelist_fraction sysctl (have a look at
> pageset_set_high_and_batch). It is roughly 1000th of the zone size for
> each zone. You can check that in /proc/vmstat (my 8G machine)

Small systems might have ~64G. The medium-sized systems might have
~250G. There are a few big ones that might have 1.5T. None of the
/proc/vmstat files from those machines contain anything resembling
the list below, though.

> Node 0, zone DMA
> Not interesting at all
> Node 0, zone DMA32
>   pagesets
>     cpu: 0
>       count: 242
>       high:  378
>       batch: 63
>     cpu: 1
>       count: 355
>       high:  378
>       batch: 63
>     cpu: 2
>       count: 359
>       high:  378
>       batch: 63
>     cpu: 3
>       count: 366
>       high:  378
>       batch: 63
> Node 0, zone Normal
>   pagesets
>     cpu: 0
>       count: 359
>       high:  378
>       batch: 63
>     cpu: 1
>       count: 241
>       high:  378
>       batch: 63
>     cpu: 2
>       count: 297
>       high:  378
>       batch: 63
>     cpu: 3
>       count: 227
>       high:  378
>       batch: 63
>
> Besides that, do you need this to be per-cpu? Having 1000 pages available and
> managed under your raw spinlock should be good enough already, no?

It needs to be almost entirely per-CPU for performance reasons. Plus
a user could do a tight close(open()) loop on each CPU.
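
For reference, the sizes being traded off in this exchange can be checked with a few lines of arithmetic. This is only a sketch: the 4 KiB page size is an assumption (it is not stated in the thread), "a couple of pages per CPU" is taken as 2, and the function names are made up for illustration.

```python
# Sketch of the preallocation arithmetic discussed in this thread.
# Assumption: 4 KiB pages (typical on x86-64, but not stated in the thread).
PAGE_SIZE = 4096


def prealloc_bytes(pages_per_cpu, nr_cpus, page_size=PAGE_SIZE):
    """Total bytes pinned by a per-CPU preallocated page pool."""
    return pages_per_cpu * nr_cpus * page_size


def to_mib(nbytes):
    """Convert bytes to MiB."""
    return nbytes / (1024 * 1024)


# Callback-flood sizing from the thread: 2000 pages per CPU on 64 CPUs.
flood_pool = prealloc_bytes(2000, 64)

# Normal operation: "a couple of pages per CPU", taken here as 2.
normal_pool = prealloc_bytes(2, 64)

# The single shared pool suggested in the quoted text: 1000 pages in total.
shared_pool = 1000 * PAGE_SIZE

if __name__ == "__main__":
    print(f"flood-sized per-CPU pool:  {to_mib(flood_pool):.1f} MiB")
    print(f"normal-sized per-CPU pool: {to_mib(normal_pool):.1f} MiB")
    print(f"shared 1000-page pool:     {to_mib(shared_pool):.1f} MiB")
```

Under these assumptions the flood-sized pool is about a thousand times larger than what normal operation needs, which is the gap that makes static preallocation unattractive here.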
> > This of course raises the question of how much memory the lockless caches
> > contain, but fortunately these RCU callback flooding scenarios also
> > involve process-context allocation of the memory that is being passed
> > to kfree_rcu(). That allocation should keep the lockless caches from
> > going empty in the common case, correct?
>
> Yes, those are refilled both on the allocation/free paths. But you
> cannot really rely on that to happen early enough.

So the really ugly scenarios with the tight loops normally allocate
something and immediately either call_rcu() or kfree_rcu() it.
But you are right, someone doing "rm -rf" on a large file tree
with lots of small files might not be doing that many allocations.
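
A minimal userspace sketch of the tight-loop case follows. The function name and the default iteration count are hypothetical, and the comments about what the kernel does on each call reflect the claim made in this thread (each close() ends up queueing an RCU callback), not something the script itself can observe.

```python
import os


def open_close_flood(path=os.devnull, iterations=10_000):
    """Open and immediately close `path` in a tight loop; return the count.

    Assumption (from the discussion, not verified by this script): each
    open() allocates a kernel file object in process context, and each
    close() defers freeing it through an RCU callback.
    """
    done = 0
    for _ in range(iterations):
        fd = os.open(path, os.O_RDONLY)  # process-context allocation
        os.close(fd)                     # freeing is deferred by the kernel
        done += 1
    return done


if __name__ == "__main__":
    print(open_close_flood())
```

Running one copy of such a loop per CPU would approximate the per-CPU flood described above. An "rm -rf" workload differs in that it frees far more than it allocates.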
> Do you happen to have any numbers that would show the typical usage
> and how often the slow path has to be taken because pcp lists are
> depleted? In other words, even if we provide functionality that gives a
> completely lockless way to allocate memory, how useful is that?

Not yet, but let's see what we can do.

							Thanx, Paul