linux-kernel.vger.kernel.org archive mirror
From: Nicolas Saenz Julienne <nsaenzju@redhat.com>
To: Mel Gorman <mgorman@suse.de>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, frederic@kernel.org, tglx@linutronix.de,
	mtosatti@redhat.com, linux-rt-users@vger.kernel.org,
	vbabka@suse.cz, cl@linux.com, paulmck@kernel.org,
	willy@infradead.org
Subject: Re: [PATCH 0/2] mm/page_alloc: Remote per-cpu lists drain support
Date: Mon, 28 Mar 2022 15:51:43 +0200
Message-ID: <d21d742154cbd6d2b7546533655810e0bf7dd82f.camel@redhat.com>
In-Reply-To: <20220325104800.GI4363@suse.de>

Hi Mel,

On Fri, 2022-03-25 at 10:48 +0000, Mel Gorman wrote:
> > [1] It follows this pattern:
> > 
> > 	struct per_cpu_pages *pcp;
> > 
> > 	pcp = raw_cpu_ptr(page_zone(page)->per_cpu_pageset);
> > 	// <- Migration here is OK: spin_lock protects vs eventual pcplist
> > 	// access from local CPU as long as all list access happens through the
> > 	// pcp pointer.
> > 	spin_lock(&pcp->lock);
> > 	do_stuff_with_pcp_lists(pcp);
> > 	spin_unlock(&pcp->lock);
> > 
> 
> And this was the part I am concerned with. We are accessing a PCP
> structure that is not necessarily the one belonging to the CPU we
> are currently running on. This type of pattern is warned about in
> Documentation/locking/locktypes.rst
> 
> ---8<---
> A typical scenario is protection of per-CPU variables in thread context::
> 
>   struct foo *p = get_cpu_ptr(&var1);
> 
>   spin_lock(&p->lock);
>   p->count += this_cpu_read(var2);
> 
> This is correct code on a non-PREEMPT_RT kernel, but on a PREEMPT_RT kernel
> this breaks. The PREEMPT_RT-specific change of spinlock_t semantics does
> not allow to acquire p->lock because get_cpu_ptr() implicitly disables
> preemption. The following substitution works on both kernels::
> ---8<---
> 
> Now we don't explicitly have this pattern because there isn't an
> obvious this_cpu_read() for example but it can accidentally happen for
> counting. __count_zid_vm_events -> __count_vm_events -> raw_cpu_add is
> an example although a harmless one.
> 
> Any of the mod_page_state ones are more problematic though because we
> lock one PCP but potentially update the per-cpu pcp stats of another CPU
> of a different PCP that we have not locked and those counters must be
> accurate.

But IIUC vmstats don't track pcplist usage (i.e. adding a page into the local
pcplist doesn't affect the count at all). It is only when interacting with the
buddy allocator that they get updated. It makes sense for the CPU that
adds/removes pages from the allocator to do the stat update, regardless of the
page's journey.
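
To make that concrete, here is a minimal userspace sketch of the accounting
I'm describing. It is not kernel code: every toy_* name is made up for
illustration, and nr_free_stat merely stands in for a counter such as
NR_FREE_PAGES. Parking a page on a pcplist leaves the stat untouched; only
moving pages to or from the buddy allocator changes it.

```c
#include <assert.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
struct toy_zone {
	long buddy_pages;	/* pages currently held by the "buddy allocator" */
	long nr_free_stat;	/* stands in for a vmstat counter like NR_FREE_PAGES */
};

struct toy_pcp {
	long count;		/* pages parked on this per-cpu list */
};

/* Free a page into the pcplist: no buddy interaction, so no stat update. */
static void toy_pcp_free_page(struct toy_pcp *pcp)
{
	pcp->count++;
}

/* Drain up to nr pages from the pcplist into the buddy; the stat moves here. */
static long toy_pcp_drain(struct toy_zone *zone, struct toy_pcp *pcp, long nr)
{
	if (nr > pcp->count)
		nr = pcp->count;
	assert(nr >= 0);

	pcp->count -= nr;
	zone->buddy_pages += nr;
	zone->nr_free_stat += nr;
	return nr;
}

/* Refill the pcplist with up to nr pages from the buddy; the stat moves here. */
static long toy_pcp_refill(struct toy_zone *zone, struct toy_pcp *pcp, long nr)
{
	if (nr > zone->buddy_pages)
		nr = zone->buddy_pages;
	assert(nr >= 0);

	zone->buddy_pages -= nr;
	zone->nr_free_stat -= nr;
	pcp->count += nr;
	return nr;
}
```

In this model, whoever calls toy_pcp_drain() or toy_pcp_refill() performs the
stat update, no matter which pcplist the pages sat on in between.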

> It *might* still be safe but it's subtle, it could be easily accidentally
> broken in the future and it would be hard to detect because it would be
> very slow corruption of VM counters like NR_FREE_PAGES that must be
> accurate.

What does accurate mean here? vmstat consumers don't get accurate data, only
snapshots. And as I commented above, you can't infer information about pcplist
usage from these stats. So I see no real need for CPU locality when updating
them (which we're still retaining nonetheless, as per my comment above). The
only thing that is really needed is atomicity, achieved by disabling IRQs (and
preemption on RT). And this, even with your solution, is achieved through the
struct zone's spin_lock (plus a preempt_disable() in RT).
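
As a rough illustration of the locality point, here is a single-threaded toy
model loosely patterned on the per-cpu delta scheme vmstat uses. It is only a
sketch: the toy_* names, TOY_NR_CPUS and TOY_THRESHOLD are all hypothetical,
and it deliberately does not model the IRQ/preemption protection that makes
each update atomic. It only shows that the summed value does not depend on
which CPU's slot absorbed an update, while the cheap global read can lag.

```c
#include <assert.h>

/* Toy userspace model, NOT kernel code. All names and values are hypothetical. */
#define TOY_NR_CPUS	4
#define TOY_THRESHOLD	8

struct toy_vmstat {
	long global;			/* folded, globally visible value */
	long diff[TOY_NR_CPUS];		/* per-cpu deltas not yet folded */
};

/* Apply delta to one CPU's slot; fold into global once it exceeds the threshold. */
static void toy_mod_state(struct toy_vmstat *s, int cpu, long delta)
{
	long x;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	x = s->diff[cpu] + delta;
	if (x > TOY_THRESHOLD || x < -TOY_THRESHOLD) {
		s->global += x;
		x = 0;
	}
	s->diff[cpu] = x;
}

/* Cheap read: global value only, may lag behind unfolded per-cpu deltas. */
static long toy_state(const struct toy_vmstat *s)
{
	return s->global;
}

/* More precise read: global value plus every CPU's pending delta. */
static long toy_state_snapshot(const struct toy_vmstat *s)
{
	long x = s->global;
	int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		x += s->diff[cpu];
	return x;
}
```

Applying the same three +3 updates to one slot, or spreading them over three
slots, yields the same toy_state_snapshot() value; only toy_state() differs,
depending on whether a fold has happened yet.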

All in all, my point is that none of the stats are affected by the change, nor
do they depend on the pcplist handling. And if we ever need to pin vmstat
updates to pcplist usage, they should share the same pcp structure. That said,
I'm happy with either solution as long as we get remote pcplist draining. So if
you're still unconvinced, let me know how I can help. I have access to all
sorts of machines to validate perf results, time to review, or even to move
the series forward.

Thanks!

-- 
Nicolás Sáenz

