From: Vlastimil Babka <vbabka@suse.cz>
To: Mel Gorman <mgorman@techsingularity.net>,
Linux-MM <linux-mm@kvack.org>,
Linux-RT-Users <linux-rt-users@vger.kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Chuck Lever <chuck.lever@oracle.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH 09/11] mm/page_alloc: Avoid conflating IRQs disabled with zone->lock
Date: Thu, 15 Apr 2021 14:25:36 +0200
Message-ID: <838c6734-1e5d-6a26-8c88-90e89d407482@suse.cz>
In-Reply-To: <20210414133931.4555-10-mgorman@techsingularity.net>
On 4/14/21 3:39 PM, Mel Gorman wrote:
> Historically when freeing pages, free_one_page() assumed that callers
> had IRQs disabled and the zone->lock could be acquired with spin_lock().
> This confuses the scope of what local_lock_irq is protecting and what
> zone->lock is protecting in free_unref_page_list in particular.
>
> This patch uses spin_lock_irqsave() for the zone->lock in
> free_one_page() instead of relying on callers to have disabled
> IRQs. free_unref_page_commit() is changed to only deal with PCP pages
> protected by the local lock. free_unref_page_list() then first frees
> isolated pages to the buddy lists with free_one_page() and frees the rest
> of the pages to the PCP via free_unref_page_commit(). The end result
> is that free_one_page() is no longer depending on side-effects of
> local_lock to be correct.
>
> Note that this may incur a performance penalty while memory hot-remove
> is running but that is not a common operation.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
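For readers following along, here is a rough userspace model of the locking change described in the changelog above. It is only a sketch: `irqs_enabled`, `model_lock`, `model_zone` and the `model_*` helpers are hypothetical stand-ins, not kernel API. It shows a free path that saves and restores the IRQ state itself, so it behaves the same whether or not the caller had IRQs disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of this is kernel API. */
static bool irqs_enabled = true;	/* models the local CPU IRQ state */

struct model_lock {
	bool held;
	bool taken_with_irqs_off;	/* recorded so callers can check it */
};

struct model_zone {
	struct model_lock lock;
	unsigned long nr_free;
};

/* Models spin_lock_irqsave(): save the IRQ state, disable, then lock. */
static unsigned long model_lock_irqsave(struct model_lock *lock)
{
	unsigned long flags = irqs_enabled;

	irqs_enabled = false;
	assert(!lock->held);
	lock->held = true;
	lock->taken_with_irqs_off = !irqs_enabled;
	return flags;
}

/* Models spin_unlock_irqrestore(): unlock, then restore the saved state. */
static void model_unlock_irqrestore(struct model_lock *lock,
				    unsigned long flags)
{
	lock->held = false;
	irqs_enabled = flags != 0;
}

/* After the patch: no assumption about the caller's IRQ state. */
static void model_free_one_page(struct model_zone *zone)
{
	unsigned long flags = model_lock_irqsave(&zone->lock);

	zone->nr_free++;
	model_unlock_irqrestore(&zone->lock, flags);
}
```

Calling `model_free_one_page()` with `irqs_enabled` either true or false leaves that state as it was on entry, which is the property the changelog relies on.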
Acked-by: Vlastimil Babka <vbabka@suse.cz>

A nit below:
> @@ -3294,6 +3295,7 @@ void free_unref_page_list(struct list_head *list)
>  	struct page *page, *next;
>  	unsigned long flags, pfn;
>  	int batch_count = 0;
> +	int migratetype;
>
>  	/* Prepare pages for freeing */
>  	list_for_each_entry_safe(page, next, list, lru) {
> @@ -3301,15 +3303,28 @@ void free_unref_page_list(struct list_head *list)
>  		if (!free_unref_page_prepare(page, pfn))
>  			list_del(&page->lru);
>  		set_page_private(page, pfn);
Should probably move this below so we don't set private for pages that then go
through free_one_page()? Doesn't seem to be a bug, just unnecessary.
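To make the suggestion concrete, here is a minimal userspace sketch of the reordered prepare loop. It is not a tested kernel change: `model_page`, `model_migratetype` and `model_prepare()` are hypothetical stand-ins. The point is only that isolated pages take the direct path with the pfn passed explicitly, so the private field is written just for pages that stay on the list for the PCP pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types and helpers. */
enum model_migratetype { MODEL_MOVABLE, MODEL_ISOLATE };

struct model_page {
	unsigned long pfn;
	unsigned long private;		/* models page_private() */
	enum model_migratetype mt;
	bool on_list;			/* still queued for the PCP pass */
	bool freed_to_buddy;		/* went through the direct path */
};

/* Prepare loop with the private store moved after the isolate check. */
static void model_prepare(struct model_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_page *page = &pages[i];

		if (page->mt == MODEL_ISOLATE) {
			/* Direct free: pfn is passed explicitly, private untouched. */
			page->freed_to_buddy = true;
			page->on_list = false;
			continue;
		}

		/* Only pages that reach the PCP pass need the pfn stashed. */
		page->private = page->pfn;
	}
}
```

In this model an isolated page leaves the loop with `private` still zero, while a movable page carries its pfn into the second pass.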
> +
> +		/*
> +		 * Free isolated pages directly to the allocator, see
> +		 * comment in free_unref_page.
> +		 */
> +		migratetype = get_pcppage_migratetype(page);
> +		if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
> +			if (unlikely(is_migrate_isolate(migratetype))) {
> +				free_one_page(page_zone(page), page, pfn, 0,
> +							migratetype, FPI_NONE);
> +				list_del(&page->lru);
> +			}
> +		}
>  	}
>
>  	local_lock_irqsave(&pagesets.lock, flags);
>  	list_for_each_entry_safe(page, next, list, lru) {
> -		unsigned long pfn = page_private(page);
> -
> +		pfn = page_private(page);
>  		set_page_private(page, 0);
> +		migratetype = get_pcppage_migratetype(page);
>  		trace_mm_page_free_batched(page);
> -		free_unref_page_commit(page, pfn);
> +		free_unref_page_commit(page, pfn, migratetype);
>
>  		/*
>  		 * Guard against excessive IRQ disabled times when we get
>
Thread overview: 26+ messages
2021-04-14 13:39 [PATCH 0/11 v3] Use local_lock for pcp protection and reduce stat overhead Mel Gorman
2021-04-14 13:39 ` [PATCH 01/11] mm/page_alloc: Split per cpu page lists and zone stats Mel Gorman
2021-04-14 13:39 ` [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock Mel Gorman
2021-04-14 13:39 ` [PATCH 03/11] mm/vmstat: Convert NUMA statistics to basic NUMA counters Mel Gorman
2021-04-14 13:39 ` [PATCH 04/11] mm/vmstat: Inline NUMA event counter updates Mel Gorman
2021-04-14 16:20 ` Vlastimil Babka
2021-04-14 16:26 ` Vlastimil Babka
2021-04-15 9:34 ` Mel Gorman
2021-04-14 13:39 ` [PATCH 05/11] mm/page_alloc: Batch the accounting updates in the bulk allocator Mel Gorman
2021-04-14 16:31 ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 06/11] mm/page_alloc: Reduce duration that IRQs are disabled for VM counters Mel Gorman
2021-04-14 17:10 ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 07/11] mm/page_alloc: Remove duplicate checks if migratetype should be isolated Mel Gorman
2021-04-14 17:21 ` Vlastimil Babka
2021-04-15 9:33 ` Mel Gorman
2021-04-15 11:24 ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 08/11] mm/page_alloc: Explicitly acquire the zone lock in __free_pages_ok Mel Gorman
2021-04-15 10:24 ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 09/11] mm/page_alloc: Avoid conflating IRQs disabled with zone->lock Mel Gorman
2021-04-15 12:25 ` Vlastimil Babka [this message]
2021-04-15 14:11 ` Mel Gorman
2021-04-14 13:39 ` [PATCH 10/11] mm/page_alloc: Update PGFREE outside the zone lock in __free_pages_ok Mel Gorman
2021-04-15 13:04 ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 11/11] mm/page_alloc: Embed per_cpu_pages locking within the per-cpu structure Mel Gorman
2021-04-15 14:53 ` Vlastimil Babka
2021-04-15 15:29 ` Mel Gorman