From: Mel Gorman <mgorman@techsingularity.net>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Chuck Lever III <chuck.lever@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
Christoph Hellwig <hch@infradead.org>,
Alexander Duyck <alexander.duyck@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux-Net <netdev@vger.kernel.org>, Linux-MM <linux-mm@kvack.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 0/3 v5] Introduce a bulk order-0 page allocator
Date: Tue, 23 Mar 2021 14:45:41 +0000 [thread overview]
Message-ID: <20210323144541.GL3697@techsingularity.net> (raw)
In-Reply-To: <20210323120851.18d430cf@carbon>
On Tue, Mar 23, 2021 at 12:08:51PM +0100, Jesper Dangaard Brouer wrote:
> > > <SNIP>
> > > My results show that, because svc_alloc_arg() ends up calling
> > > __alloc_pages_bulk() twice in this case, it ends up being
> > > twice as expensive as the list case, on average, for the same
> > > workload.
> > >
> >
> > Ok, so in this case the caller knows that holes are always at the
> > start. If the API returns an index that is a valid index and populated,
> > it can check the next index and if it is valid then the whole array
> > must be populated.
> >
> > <SNIP>
>
> I do know that I suggested moving prep_new_page() out of the
> IRQ-disabled loop, but maybe was a bad idea, for several reasons.
>
> All prep_new_page does is to write into struct page, unless some
> debugging stuff (like kasan) is enabled. This cache-line is hot as
> LRU-list update just wrote into this cache-line. As the bulk size goes
> up, as Matthew pointed out, this cache-line might be pushed into
> L2-cache, and then need to be accessed again when prep_new_page() is
> called.
>
> Another observation is that moving prep_new_page() into loop reduced
> function size with 253 bytes (which affect I-cache).
>
> ./scripts/bloat-o-meter mm/page_alloc.o-prep_new_page-outside mm/page_alloc.o-prep_new_page-inside
> add/remove: 18/18 grow/shrink: 0/1 up/down: 144/-397 (-253)
> Function old new delta
> __alloc_pages_bulk 1965 1712 -253
> Total: Before=60799, After=60546, chg -0.42%
>
> Maybe it is better to keep prep_new_page() inside the loop. This also
> allows list vs array variant to share the call. And it should simplify
> the array variant code.
>
I agree. I did not like the level of complexity it incurred for arrays
or the fact it required that a list to be empty when alloc_pages_bulk()
is called. I thought the concern for calling prep_new_page() with IRQs
disabled was a little overblown but did not feel strongly enough to push
back on it hard given that we've had problems with IRQs being disabled
for long periods before. At worst, at some point in the future we'll have
to cap the number of pages that can be requested or enable/disable IRQs
every X pages.
New candidate
git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-bulk-rebase-v6r4
Interface is still the same so a rebase should be trivial. Diff between
v6r2 and v6r4 is as follows. I like the diffstat if nothing else :P
mm/page_alloc.c | 54 +++++++++++++-----------------------------------------
1 file changed, 13 insertions(+), 41 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 547a84f11310..be1e33a4df39 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4999,25 +4999,20 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
struct alloc_context ac;
gfp_t alloc_gfp;
unsigned int alloc_flags;
- int nr_populated = 0, prep_index = 0;
- bool hole = false;
+ int nr_populated = 0;
if (WARN_ON_ONCE(nr_pages <= 0))
return 0;
- if (WARN_ON_ONCE(page_list && !list_empty(page_list)))
- return 0;
-
- /* Skip populated array elements. */
- if (page_array) {
- while (nr_populated < nr_pages && page_array[nr_populated])
- nr_populated++;
- if (nr_populated == nr_pages)
- return nr_populated;
- prep_index = nr_populated;
- }
+ /*
+ * Skip populated array elements to determine if any pages need
+ * to be allocated before disabling IRQs.
+ */
+ while (page_array && page_array[nr_populated] && nr_populated < nr_pages)
+ nr_populated++;
- if (nr_pages == 1)
+ /* Use the single page allocator for one page. */
+ if (nr_pages - nr_populated == 1)
goto failed;
/* May set ALLOC_NOFRAGMENT, fragmentation will return 1 page. */
@@ -5056,22 +5051,17 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
if (!zone)
goto failed;
-retry_hole:
/* Attempt the batch allocation */
local_irq_save(flags);
pcp = &this_cpu_ptr(zone->pageset)->pcp;
pcp_list = &pcp->lists[ac.migratetype];
while (nr_populated < nr_pages) {
- /*
- * Stop allocating if the next index has a populated
- * page or the page will be prepared a second time when
- * IRQs are enabled.
- */
+
+ /* Skip existing pages */
if (page_array && page_array[nr_populated]) {
- hole = true;
nr_populated++;
- break;
+ continue;
}
page = __rmqueue_pcplist(zone, ac.migratetype, alloc_flags,
@@ -5092,6 +5082,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
__count_zid_vm_events(PGALLOC, zone_idx(zone), 1);
zone_statistics(ac.preferred_zoneref->zone, zone);
+ prep_new_page(page, 0, gfp, 0);
if (page_list)
list_add(&page->lru, page_list);
else
@@ -5101,25 +5092,6 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
local_irq_restore(flags);
- /* Prep pages with IRQs enabled. */
- if (page_list) {
- list_for_each_entry(page, page_list, lru)
- prep_new_page(page, 0, gfp, 0);
- } else {
- while (prep_index < nr_populated)
- prep_new_page(page_array[prep_index++], 0, gfp, 0);
-
- /*
- * If the array is sparse, check whether the array is
- * now fully populated. Continue allocations if
- * necessary.
- */
- while (nr_populated < nr_pages && page_array[nr_populated])
- nr_populated++;
- if (hole && nr_populated < nr_pages)
- goto retry_hole;
- }
-
return nr_populated;
failed_irq:
--
Mel Gorman
SUSE Labs
next prev parent reply other threads:[~2021-03-23 14:45 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-22 9:18 [PATCH 0/3 v5] Introduce a bulk order-0 page allocator Mel Gorman
2021-03-22 9:18 ` [PATCH 1/3] mm/page_alloc: Rename alloced to allocated Mel Gorman
2021-03-22 9:18 ` [PATCH 2/3] mm/page_alloc: Add a bulk page allocator Mel Gorman
2021-03-23 16:00 ` Jesper Dangaard Brouer
2021-03-23 18:43 ` Mel Gorman
2021-03-22 9:18 ` [PATCH 3/3] mm/page_alloc: Add an array-based interface to the " Mel Gorman
2021-03-22 12:04 ` [PATCH 0/3 v5] Introduce a bulk order-0 " Jesper Dangaard Brouer
2021-03-22 16:44 ` Mel Gorman
2021-03-22 18:25 ` Chuck Lever III
2021-03-22 19:49 ` Mel Gorman
2021-03-22 20:32 ` Chuck Lever III
2021-03-22 20:58 ` Mel Gorman
2021-03-23 11:08 ` Jesper Dangaard Brouer
2021-03-23 14:45 ` Mel Gorman [this message]
2021-03-23 18:52 ` Chuck Lever III
2021-03-23 11:13 ` Matthew Wilcox
2021-03-23 10:44 ` Mel Gorman
2021-03-23 15:08 ` Jesper Dangaard Brouer
2021-03-23 16:29 ` Mel Gorman
2021-03-23 17:06 ` Jesper Dangaard Brouer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210323144541.GL3697@techsingularity.net \
--to=mgorman@techsingularity.net \
--cc=akpm@linux-foundation.org \
--cc=alexander.duyck@gmail.com \
--cc=brouer@redhat.com \
--cc=chuck.lever@oracle.com \
--cc=hch@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nfs@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).