From: Mel Gorman <mgorman@techsingularity.net>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
Linux-MM <linux-mm@kvack.org>,
Linux-RT-Users <linux-rt-users@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Chuck Lever <chuck.lever@oracle.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Matthew Wilcox <willy@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Michal Hocko <mhocko@kernel.org>,
Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock
Date: Tue, 13 Apr 2021 17:52:02 +0100 [thread overview]
Message-ID: <20210413165201.GE3697@techsingularity.net> (raw)
In-Reply-To: <87lf9nyy3v.ffs@nanos.tec.linutronix.de>
On Mon, Apr 12, 2021 at 11:47:00PM +0200, Thomas Gleixner wrote:
> On Mon, Apr 12 2021 at 12:56, Mel Gorman wrote:
> > On Fri, Apr 09, 2021 at 08:55:39PM +0200, Peter Zijlstra wrote:
> > I'll update the changelog and comment accordingly. I'll decide later
> > whether to leave it or move the location of the lock at the end of the
> > series. If the patch is added, it'll either incur the double lookup (not
> > that expensive, might be optimised by the compiler) or come up with a
> > helper that takes the lock and returns the per-cpu structure. The double
> > lookup probably makes more sense initially because there are multiple
> > potential users of a helper that says "pin to CPU, lookup, lock and return
> > a per-cpu structure" for both IRQ-safe and IRQ-unsafe variants with the
> > associated expansion of the local_lock API. It might be better to introduce
> > such a helper with multiple users converted at the same time and there are
> > other local_lock users in preempt-rt that could do with upstreaming first.
>
> We had such helpers in RT a while ago but it turned into an helper
> explosion pretty fast. But that was one of the early versions of local
> locks which could not be embedded into a per CPU data structure due to
> raisins (my stupidity).
>
> But with the more thought out approach of today we can have (+/- the
> obligatory naming bikeshedding):
>
> <SNIP>
I don't have strong opinions on the name -- it's long but it's clear.
The overhead of local_lock_get_cpu_ptr has similar weight to get_cpu_ptr
in terms of the cost of preempt_disable. The helper also means that new
users of a local_lock embedded within a per-cpu structure do not have to
figure out if it's safe from scratch.
If the page allocator embeds local_lock within struct per_cpu_pages then
the conversion to the helper is at the end of the mail. The messiest part
is free_unref_page_commit and that is a mess because free_unref_page_list
has to check if a new lock is required in case a list of pages is from
different zones.
> <SNIP>
>
> and RT will then change that to:
>
> --- a/include/linux/local_lock_internal.h
> +++ b/include/linux/local_lock_internal.h
> @@ -96,7 +96,7 @@ static inline void local_lock_release(lo
> ({ \
> type *__pcp; \
> \
> - preempt_disable(); \
> + ll_preempt_disable(); \
> __pcp = this_cpu_ptr(pcp); \
> local_lock_acquire(&__pcp->lock); \
> __pcp; \
> @@ -106,7 +106,7 @@ static inline void local_lock_release(lo
> ({ \
> type *__pcp; \
> \
> - local_irq_disable(); \
> + ll_local_irq_disable(); \
> __pcp = this_cpu_ptr(pcp); \
> local_lock_acquire(&__pcp->lock); \
> __pcp; \
> @@ -116,7 +116,7 @@ static inline void local_lock_release(lo
> ({ \
> type *__pcp; \
> \
> - local_irq_save(flags); \
> + ll_local_irq_save(flags); \
> __pcp = this_cpu_ptr(pcp); \
> local_lock_acquire(&__pcp->lock); \
> __pcp; \
>
>
> where ll_xxx is defined as xxx for non-RT and on RT all of them
> get mapped to migrate_disable().
>
> Thoughts?
>
I think that works. I created the obvious definitions of ll_* and rebased
on top of preempt-rt to see. I'll see if it boots :P
Page allocator conversion to helper looks like
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d9d7f6d68243..2948a5502589 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3008,9 +3008,7 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone)
unsigned long flags;
struct per_cpu_pages *pcp;
- local_lock_irqsave(&zone->per_cpu_pageset->lock, flags);
-
- pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
+ pcp = local_lock_irqsave_get_cpu_ptr(zone->per_cpu_pageset, lock, flags);
if (pcp->count)
free_pcppages_bulk(zone, pcp->count, pcp);
@@ -3235,12 +3233,10 @@ static bool free_unref_page_prepare(struct page *page, unsigned long pfn)
}
static void free_unref_page_commit(struct page *page, struct zone *zone,
- unsigned long pfn, int migratetype)
+ struct per_cpu_pages *pcp, unsigned long pfn,
+ int migratetype)
{
- struct per_cpu_pages *pcp;
-
__count_vm_event(PGFREE);
- pcp = this_cpu_ptr(zone->per_cpu_pageset);
list_add(&page->lru, &pcp->lists[migratetype]);
pcp->count++;
if (pcp->count >= READ_ONCE(pcp->high))
@@ -3252,6 +3248,7 @@ static void free_unref_page_commit(struct page *page, struct zone *zone,
*/
void free_unref_page(struct page *page)
{
+ struct per_cpu_pages *pcp;
struct zone *zone;
unsigned long flags;
unsigned long pfn = page_to_pfn(page);
@@ -3277,8 +3274,8 @@ void free_unref_page(struct page *page)
}
zone = page_zone(page);
- local_lock_irqsave(&zone->per_cpu_pageset->lock, flags);
- free_unref_page_commit(page, zone, pfn, migratetype);
+ pcp = local_lock_irqsave_get_cpu_ptr(zone->per_cpu_pageset, lock, flags);
+ free_unref_page_commit(page, zone, pcp, pfn, migratetype);
local_unlock_irqrestore(&zone->per_cpu_pageset->lock, flags);
}
@@ -3287,6 +3284,7 @@ void free_unref_page(struct page *page)
*/
void free_unref_page_list(struct list_head *list)
{
+ struct per_cpu_pages *pcp;
struct zone *locked_zone;
struct page *page, *next;
unsigned long flags, pfn;
@@ -3320,7 +3318,7 @@ void free_unref_page_list(struct list_head *list)
/* Acquire the lock required for the first page. */
page = list_first_entry(list, struct page, lru);
locked_zone = page_zone(page);
- local_lock_irqsave(&locked_zone->per_cpu_pageset->lock, flags);
+ pcp = local_lock_irqsave_get_cpu_ptr(locked_zone->per_cpu_pageset, lock, flags);
list_for_each_entry_safe(page, next, list, lru) {
struct zone *zone = page_zone(page);
@@ -3342,12 +3340,12 @@ void free_unref_page_list(struct list_head *list)
#if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_DEBUG_LOCK_ALLOC)
if (locked_zone != zone) {
local_unlock_irqrestore(&locked_zone->per_cpu_pageset->lock, flags);
- local_lock_irqsave(&zone->per_cpu_pageset->lock, flags);
+ pcp = local_lock_irqsave_get_cpu_ptr(zone->per_cpu_pageset, lock, flags);
locked_zone = zone;
}
#endif
- free_unref_page_commit(page, zone, pfn, migratetype);
+ free_unref_page_commit(page, zone, pcp, pfn, migratetype);
/*
* Guard against excessive IRQ disabled times when we get
@@ -3517,8 +3515,7 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
struct page *page;
unsigned long flags;
- local_lock_irqsave(&zone->per_cpu_pageset->lock, flags);
- pcp = this_cpu_ptr(zone->per_cpu_pageset);
+ pcp = local_lock_irqsave_get_cpu_ptr(zone->per_cpu_pageset, lock, flags);
list = &pcp->lists[migratetype];
page = __rmqueue_pcplist(zone, migratetype, alloc_flags, pcp, list);
local_unlock_irqrestore(&zone->per_cpu_pageset->lock, flags);
--
Mel Gorman
SUSE Labs
next prev parent reply other threads:[~2021-04-13 16:52 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-07 20:24 [PATCH 0/11 v2] Use local_lock for pcp protection and reduce stat overhead Mel Gorman
2021-04-07 20:24 ` [PATCH 01/11] mm/page_alloc: Split per cpu page lists and zone stats Mel Gorman
2021-04-12 17:43 ` Vlastimil Babka
2021-04-13 13:27 ` Mel Gorman
2021-04-07 20:24 ` [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock Mel Gorman
2021-04-08 10:52 ` Peter Zijlstra
2021-04-08 17:42 ` Mel Gorman
2021-04-09 6:39 ` Peter Zijlstra
2021-04-09 7:59 ` Mel Gorman
2021-04-09 8:24 ` Peter Zijlstra
2021-04-09 13:32 ` Mel Gorman
2021-04-09 18:55 ` Peter Zijlstra
2021-04-12 11:56 ` Mel Gorman
2021-04-12 21:47 ` Thomas Gleixner
2021-04-13 16:52 ` Mel Gorman [this message]
2021-04-07 20:24 ` [PATCH 03/11] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove Mel Gorman
2021-04-07 20:24 ` [PATCH 04/11] mm/vmstat: Convert NUMA statistics to basic NUMA counters Mel Gorman
2021-04-14 12:56 ` Vlastimil Babka
2021-04-14 15:18 ` Mel Gorman
2021-04-14 15:56 ` Vlastimil Babka
2021-04-15 10:06 ` Mel Gorman
2021-04-07 20:24 ` [PATCH 05/11] mm/vmstat: Inline NUMA event counter updates Mel Gorman
2021-04-07 20:24 ` [PATCH 06/11] mm/page_alloc: Batch the accounting updates in the bulk allocator Mel Gorman
2021-04-07 20:24 ` [PATCH 07/11] mm/page_alloc: Reduce duration that IRQs are disabled for VM counters Mel Gorman
2021-04-07 20:24 ` [PATCH 08/11] mm/page_alloc: Remove duplicate checks if migratetype should be isolated Mel Gorman
2021-04-07 20:24 ` [PATCH 09/11] mm/page_alloc: Explicitly acquire the zone lock in __free_pages_ok Mel Gorman
2021-04-07 20:24 ` [PATCH 10/11] mm/page_alloc: Avoid conflating IRQs disabled with zone->lock Mel Gorman
2021-04-07 20:24 ` [PATCH 11/11] mm/page_alloc: Update PGFREE outside the zone lock in __free_pages_ok Mel Gorman
2021-04-08 10:56 ` [PATCH 0/11 v2] Use local_lock for pcp protection and reduce stat overhead Peter Zijlstra
2021-04-08 17:48 ` Mel Gorman
2021-04-14 13:39 [PATCH 0/11 v3] " Mel Gorman
2021-04-14 13:39 ` [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210413165201.GE3697@techsingularity.net \
--to=mgorman@techsingularity.net \
--cc=brouer@redhat.com \
--cc=chuck.lever@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=mhocko@kernel.org \
--cc=mingo@kernel.org \
--cc=osalvador@suse.de \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).