From: David Hildenbrand <david@redhat.com>
To: Vlastimil Babka <vbabka@suse.cz>, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Michal Hocko <mhocko@kernel.org>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
Oscar Salvador <osalvador@suse.de>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: Re: [PATCH 8/9] mm, page_alloc: drain all pcplists during memory offline
Date: Fri, 25 Sep 2020 12:46:27 +0200
Message-ID: <a247fc08-d2d8-4f09-88e0-2ebbb5f67890@redhat.com>
In-Reply-To: <20200922143712.12048-9-vbabka@suse.cz>
On 22.09.20 16:37, Vlastimil Babka wrote:
> drain_all_pages() is optimized to only execute on cpus where pcplists are not
> empty. The check can however race with a free to pcplist that has not yet
> increased the pcp->count from 0 to 1. Make the drain optionally skip the racy
> check and drain on all cpus, and use it in memory offline context, where we
> want to make sure no isolated pages are left behind on pcplists.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> include/linux/gfp.h | 1 +
> mm/memory_hotplug.c | 4 ++--
> mm/page_alloc.c | 29 ++++++++++++++++++++---------
> 3 files changed, 23 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 67a0774e080b..cc52c5cc9fab 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -592,6 +592,7 @@ extern void page_frag_free(void *addr);
>
> void page_alloc_init(void);
> void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> +void __drain_all_pages(struct zone *zone, bool page_isolation);
> void drain_all_pages(struct zone *zone);
> void drain_local_pages(struct zone *zone);
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 08f729922e18..bbde415b558b 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1524,7 +1524,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  		goto failed_removal;
>  	}
>
> -	drain_all_pages(zone);
> +	__drain_all_pages(zone, true);
>
>  	arg.start_pfn = start_pfn;
>  	arg.nr_pages = nr_pages;
> @@ -1588,7 +1588,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  		 */
>  		ret = test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE);
>  		if (ret)
> -			drain_all_pages(zone);
> +			__drain_all_pages(zone, true);
>  	} while (ret);
>
>  	/* Mark all sections offline and remove free pages from the buddy. */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4e37bc3f6077..33cc35d152b1 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2960,14 +2960,7 @@ static void drain_local_pages_wq(struct work_struct *work)
>  	preempt_enable();
>  }
>
> -/*
> - * Spill all the per-cpu pages from all CPUs back into the buddy allocator.
> - *
> - * When zone parameter is non-NULL, spill just the single zone's pages.
> - *
> - * Note that this can be extremely slow as the draining happens in a workqueue.
> - */
> -void drain_all_pages(struct zone *zone)
> +void __drain_all_pages(struct zone *zone, bool force_all_cpus)
>  {
>  	int cpu;
>
> @@ -3006,7 +2999,13 @@ void drain_all_pages(struct zone *zone)
>  		struct zone *z;
>  		bool has_pcps = false;
>
> -		if (zone) {
> +		if (force_all_cpus) {
> +			/*
> +			 * The pcp.count check is racy, some callers need a
> +			 * guarantee that no cpu is missed.
> +			 */
> +			has_pcps = true;
> +		} else if (zone) {
>  			pcp = per_cpu_ptr(zone->pageset, cpu);
>  			if (pcp->pcp.count)
>  				has_pcps = true;
> @@ -3039,6 +3038,18 @@ void drain_all_pages(struct zone *zone)
>  	mutex_unlock(&pcpu_drain_mutex);
>  }
>
> +/*
> + * Spill all the per-cpu pages from all CPUs back into the buddy allocator.
> + *
> + * When zone parameter is non-NULL, spill just the single zone's pages.
> + *
> + * Note that this can be extremely slow as the draining happens in a workqueue.
> + */
> +void drain_all_pages(struct zone *zone)
> +{
> +	__drain_all_pages(zone, false);
> +}
> +
> #ifdef CONFIG_HIBERNATION
>
> /*
>
Interesting race. Instead of this ugly __drain_all_pages() with a
boolean parameter, can we have two properly named functions to be used
in !page_alloc.c code, so callers don't have to scratch their heads over
what the difference is?

(Yeah, coming up with a proper name is difficult. One variant gives more
guarantees than the other, and that cannot really be deduced from
"force_all_cpus" - maybe we can encode the actual semantics in the name.)
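
To make the naming idea a bit more concrete, here is a rough userspace
sketch, not kernel code and completely untested against the tree. It
models the per-cpu lists of one zone as plain arrays: "count" is what a
remote reader would see as pcp->count, and "in_flight" stands for a page
that is being freed to the list while count still reads 0. All toy_*
names, including the two wrapper names, are made up for illustration and
are not proposed as the final API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4

/* Toy stand-in for one zone's per-cpu lists; not the kernel's struct zone. */
struct toy_zone {
	int count[TOY_NR_CPUS];     /* what a remote reader sees as pcp->count */
	int in_flight[TOY_NR_CPUS]; /* pages being freed, count not bumped yet */
};

/* Shared worker; the boolean stays an internal detail of this file. */
static int toy_drain_common(struct toy_zone *zone, bool force_all_cpus)
{
	int cpu, visited = 0;

	assert(zone != NULL);

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		/* Racy shortcut: count may still read 0 while a free is ongoing. */
		if (!force_all_cpus && zone->count[cpu] == 0)
			continue;

		/* "Drain": everything on this cpu's list goes back to the buddy. */
		zone->count[cpu] = 0;
		zone->in_flight[cpu] = 0;
		visited++;
	}

	/* Number of cpus that were actually drained. */
	return visited;
}

/* Best effort: may skip a cpu whose list looked empty. Hypothetical name. */
int toy_drain_pcps_best_effort(struct toy_zone *zone)
{
	return toy_drain_common(zone, false);
}

/* Guarantee: every cpu is visited, nothing is left behind. Hypothetical name. */
int toy_drain_pcps_all_cpus(struct toy_zone *zone)
{
	return toy_drain_common(zone, true);
}
```

With something along these lines, callers outside page_alloc.c would only
ever see the two wrappers; memory offlining would pick the variant that
visits every cpu, and everybody else would keep the cheap one.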
--
Thanks,
David / dhildenb