From: Mel Gorman <mgorman@suse.de>
To: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, frederic@kernel.org, tglx@linutronix.de,
	mtosatti@redhat.com, linux-rt-users@vger.kernel.org,
	vbabka@suse.cz, cl@linux.com, paulmck@kernel.org,
	willy@infradead.org
Subject: Re: [PATCH 0/2] mm/page_alloc: Remote per-cpu lists drain support
Date: Thu, 31 Mar 2022 16:24:09 +0100	[thread overview]
Message-ID: <20220331152409.GK4363@suse.de> (raw)
In-Reply-To: <7d115ec39714b906e31398373855c28391229ff9.camel@redhat.com>

On Wed, Mar 30, 2022 at 01:29:04PM +0200, Nicolas Saenz Julienne wrote:
> Hi Mel,
> 
> On Thu, 2022-03-03 at 11:45 +0000, Mel Gorman wrote:
> > On Tue, Feb 08, 2022 at 11:07:48AM +0100, Nicolas Saenz Julienne wrote:
> > > This series replaces mm/page_alloc's per-cpu page lists drain mechanism with
> > > one that allows accessing the lists remotely. Currently, only the local CPU is
> > > permitted to change its per-cpu lists, so draining them on another CPU's
> > > behalf requires queueing a drain task to run on that CPU. This causes problems
> > > for NOHZ_FULL CPUs and real-time systems that can't take any sort of
> > > interruption, and to a lesser extent inconveniences idle and virtualised
> > > systems.
> > > 
> > 
> > I know this has been sitting here for a long while. The last few weeks
> > have not been fun.
> > 
> > > Note that this is not the first attempt at fixing these per-cpu page lists:
> > >  - The first attempt[1] tried to conditionally change the pagesets locking
> > >    scheme based on the NOHZ_FULL config. It was deemed hard to maintain as the
> > >    NOHZ_FULL code path would be rarely tested. Also, this only solves the issue
> > >    for NOHZ_FULL setups, which isn't ideal.
> > >  - The second[2] unconditionally switched the local_locks to per-cpu spinlocks.
> > >    The performance degradation was too big.
> > > 
> > 
> > For unrelated reasons I looked at using llist to avoid locks entirely. It
> > turns out that's not possible and a lock is still needed. We know the
> > "local_locks to per-cpu spinlocks" conversion took a large penalty, so I
> > considered alternatives for how a lock could be used. I found it's
> > possible to both remotely drain the lists and avoid disabling/enabling
> > IRQs entirely, as long as a preempting IRQ is willing to take the zone
> > lock instead (which should be very rare). The IRQ part is a bit hairy
> > though: softirqs are also a problem, preempt-rt needs different rules,
> > and the llist has to sort PCP refills, which might be a loss in total.
> > However, the remote draining may still be interesting. The full series is at
> > https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git/ mm-pcpllist-v1r2
> > 
> > It's still waiting on tests to complete, and not all of the changelogs
> > are finished, which is why it's not posted.
> > 
> > This is a comparison of vanilla versus "local_locks to per-cpu spinlocks"
> > versus the git series up to "mm/page_alloc: Remotely drain per-cpu lists",
> > for the page faulting microbench I originally complained about. The test
> > machine is a 2-socket Cascade Lake machine.
> > 
> > pft timings
> >                                  5.17.0-rc5             5.17.0-rc5             5.17.0-rc5
> >                                     vanilla    mm-remotedrain-v2r1       mm-pcpdrain-v1r1
> > Amean     elapsed-1        32.54 (   0.00%)       33.08 *  -1.66%*       32.82 *  -0.86%*
> > Amean     elapsed-4         8.66 (   0.00%)        9.24 *  -6.72%*        8.69 *  -0.38%*
> > Amean     elapsed-7         5.02 (   0.00%)        5.43 *  -8.16%*        5.05 *  -0.55%*
> > Amean     elapsed-12        3.07 (   0.00%)        3.38 * -10.00%*        3.09 *  -0.72%*
> > Amean     elapsed-21        2.36 (   0.00%)        2.38 *  -0.89%*        2.19 *   7.39%*
> > Amean     elapsed-30        1.75 (   0.00%)        1.87 *  -6.50%*        1.62 *   7.59%*
> > Amean     elapsed-48        1.71 (   0.00%)        2.00 * -17.32%*        1.71 (  -0.08%)
> > Amean     elapsed-79        1.56 (   0.00%)        1.62 *  -3.84%*        1.56 (  -0.02%)
> > Amean     elapsed-80        1.57 (   0.00%)        1.65 *  -5.31%*        1.57 (  -0.04%)
> > 
> > Note the local_lock conversion took a 1-17% penalty while the git tree
> > takes a negligible penalty while still allowing remote drains. It might
> > have some potential while being less complex than the RCU approach.
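
To make the shape of that trylock idea concrete, here is a completely
untested sketch of the allocation path. Note that pcp->lock is the
hypothetical per-cpu spinlock such a series would add, the function name
is invented and the signatures are simplified; none of this is lifted
from the git tree above:

static struct page *rmqueue_sketch(struct zone *zone, unsigned int order,
				   int migratetype, unsigned int alloc_flags)
{
	struct per_cpu_pages *pcp = this_cpu_ptr(zone->per_cpu_pageset);
	struct list_head *list;
	struct page *page;
	unsigned long flags;

	if (spin_trylock(&pcp->lock)) {
		/* Common case: lock uncontended and IRQs stay enabled. */
		list = &pcp->lists[order_to_pindex(migratetype, order)];
		page = __rmqueue_pcplist(zone, order, migratetype,
					 alloc_flags, pcp, list);
		spin_unlock(&pcp->lock);
		return page;
	}

	/*
	 * Rare case: this context (e.g. a preempting IRQ) interrupted
	 * the holder of this CPU's pcp lock. Bypass the PCP lists and
	 * allocate from the buddy lists under the zone lock instead of
	 * spinning on the pcp lock.
	 */
	spin_lock_irqsave(&zone->lock, flags);
	page = __rmqueue(zone, order, migratetype, alloc_flags);
	spin_unlock_irqrestore(&zone->lock, flags);
	return page;
}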
> 
> I've been made aware of a problem with the spin_trylock() approach. It doesn't
> work for UP, since in that context spin_lock() is a NOOP (well, it only
> disables preemption) and spin_trylock() therefore always succeeds. So nothing
> prevents a race with an IRQ.
> 

I didn't think of UP being a problem. I'm offline shortly until early next
week, but superficially the spin_[try]lock for the PCP would need pcp_lock
and pcp_trylock helpers. On SMP, they would map to the equivalent spinlock
operations. On UP, pcp_lock would map to spin_lock, but pcp_trylock would
likely need to map to spin_lock_irqsave so that IRQs are genuinely excluded.
It means that UP would always disable IRQs, but that would be no worse than
the current allocator.
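
Something along these lines, as a completely untested sketch; the helper
names are invented here and do not exist in any tree:

#ifdef CONFIG_SMP
/* SMP: a real spinlock; the fast path leaves IRQs enabled. */
#define pcp_lock(pcp, flags)	\
	do { (void)(flags); spin_lock(&(pcp)->lock); } while (0)
#define pcp_trylock(pcp, flags)	\
	({ (void)(flags); spin_trylock(&(pcp)->lock); })
#define pcp_unlock(pcp, flags)	\
	do { (void)(flags); spin_unlock(&(pcp)->lock); } while (0)
#else
/*
 * UP: spin_trylock() always "succeeds" because spin_lock() only
 * disables preemption and cannot exclude an IRQ, so map the trylock to
 * spin_lock_irqsave(). pcp_lock() is mapped the same way here purely so
 * that pcp_unlock() stays uniform; either way, UP always ends up
 * disabling IRQs, which is no worse than the current allocator.
 */
#define pcp_lock(pcp, flags)	spin_lock_irqsave(&(pcp)->lock, flags)
#define pcp_trylock(pcp, flags)	\
	({ spin_lock_irqsave(&(pcp)->lock, flags); 1; })
#define pcp_unlock(pcp, flags)	spin_unlock_irqrestore(&(pcp)->lock, flags)
#endif

The flags argument is unused on SMP; it exists only so the UP variants
can save and restore the IRQ state.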

-- 
Mel Gorman
SUSE Labs


Thread overview: 25+ messages
2022-02-08 10:07 [PATCH 0/2] mm/page_alloc: Remote per-cpu lists drain support Nicolas Saenz Julienne
2022-02-08 10:07 ` [PATCH 1/2] mm/page_alloc: Access lists in 'struct per_cpu_pages' indirectly Nicolas Saenz Julienne
2022-03-03 14:33   ` Marcelo Tosatti
2022-02-08 10:07 ` [PATCH 2/2] mm/page_alloc: Add remote draining support to per-cpu lists Nicolas Saenz Julienne
2022-02-08 15:47   ` Marcelo Tosatti
2022-02-15  8:47     ` Nicolas Saenz Julienne
2022-02-15 17:32       ` Paul E. McKenney
2022-02-09  8:55 ` [PATCH 0/2] mm/page_alloc: Remote per-cpu lists drain support Xiongfeng Wang
2022-02-09  9:45   ` Nicolas Saenz Julienne
2022-02-09 11:26     ` Xiongfeng Wang
2022-02-09 11:36       ` Nicolas Saenz Julienne
2022-02-10 10:59 ` Xiongfeng Wang
2022-02-10 11:04   ` Nicolas Saenz Julienne
2022-03-03 11:45 ` Mel Gorman
2022-03-07 13:57   ` Nicolas Saenz Julienne
2022-03-10 16:31     ` Mel Gorman
2022-03-07 20:47   ` Marcelo Tosatti
2022-03-24 18:59   ` Nicolas Saenz Julienne
2022-03-25 10:48     ` Mel Gorman
2022-03-28 13:51       ` Nicolas Saenz Julienne
2022-03-29  9:45         ` Mel Gorman
2022-03-30 11:29   ` Nicolas Saenz Julienne
2022-03-31 15:24     ` Mel Gorman [this message]
2022-03-03 13:27 ` Vlastimil Babka
2022-03-03 14:10   ` Nicolas Saenz Julienne
