All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox <willy@infradead.org>,
	Chuck Lever III <chuck.lever@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-Net <netdev@vger.kernel.org>, Linux-MM <linux-mm@kvack.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	brouer@redhat.com
Subject: Re: [PATCH 2/5] mm/page_alloc: Add a bulk page allocator
Date: Fri, 19 Mar 2021 18:10:31 +0100	[thread overview]
Message-ID: <20210319181031.44dd3113@carbon> (raw)
In-Reply-To: <20210314125231.GA3697@techsingularity.net>

On Sun, 14 Mar 2021 12:52:32 +0000
Mel Gorman <mgorman@techsingularity.net> wrote:

> mm/page_alloc: Add an array-based interface to the bulk page allocator
> 
> The existing callers for the bulk allocator are storing the pages in
> arrays. This patch adds an array-based interface to the API to avoid
> multiple list iterations. The page list interface is preserved to
> avoid requiring all users of the bulk API to allocate and manage
> enough storage to store the pages.

I'm testing this patch, see results below and in commit[1].  The array
variant is clearly faster in these micro-benchmarks.
(Some comment inlined about code)

The change to my page_bench04_bulk is here[1]:
 [1] https://github.com/netoptimizer/prototype-kernel/commit/4c41fe0d4107f514

Notice these "per elem" measurements are alloc+free cost for order-0 pages.

BASELINE
 single_page alloc+put: 207 cycles(tsc) 57.773 ns

LIST variant: time_bulk_page_alloc_free_list: step=bulk size

 Per elem: 294 cycles(tsc) 81.866 ns (step:1)
 Per elem: 214 cycles(tsc) 59.477 ns (step:2)
 Per elem: 199 cycles(tsc) 55.504 ns (step:3)
 Per elem: 192 cycles(tsc) 53.489 ns (step:4)
 Per elem: 188 cycles(tsc) 52.456 ns (step:8)
 Per elem: 184 cycles(tsc) 51.346 ns (step:16)
 Per elem: 183 cycles(tsc) 50.883 ns (step:32)
 Per elem: 184 cycles(tsc) 51.236 ns (step:64)
 Per elem: 189 cycles(tsc) 52.620 ns (step:128)

ARRAY variant: time_bulk_page_alloc_free_array: step=bulk size

 Per elem: 195 cycles(tsc) 54.174 ns (step:1)
 Per elem: 123 cycles(tsc) 34.224 ns (step:2)
 Per elem: 113 cycles(tsc) 31.430 ns (step:3)
 Per elem: 108 cycles(tsc) 30.003 ns (step:4)
 Per elem: 102 cycles(tsc) 28.417 ns (step:8)
 Per elem:  98 cycles(tsc) 27.475 ns (step:16)
 Per elem:  96 cycles(tsc) 26.901 ns (step:32)
 Per elem:  95 cycles(tsc) 26.487 ns (step:64)
 Per elem:  94 cycles(tsc) 26.170 ns (step:128)

The array variant is clearly faster.

It it worth mentioning that bulk=1 result in fallback to normal
single page allocation via __alloc_pages().


> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 4a304fd39916..fb6234e1fe59 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -520,13 +520,20 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
>  
>  int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  				nodemask_t *nodemask, int nr_pages,
> -				struct list_head *list);
> +				struct list_head *page_list,
> +				struct page **page_array);
>  
>  /* Bulk allocate order-0 pages */
>  static inline unsigned long
> -alloc_pages_bulk(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
> +alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
>  {
> -	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list);
> +	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list, NULL);
> +}
> +
> +static inline unsigned long
> +alloc_pages_bulk_array(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
> +{
> +	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, NULL, page_array);
>  }
>  
>  /*
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3e0c87c588d3..96590f0726c7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4965,13 +4965,20 @@ static inline bool prepare_alloc_pages(gfp_t gfp, unsigned int order,
>  
>  /*
>   * This is a batched version of the page allocator that attempts to
> - * allocate nr_pages quickly from the preferred zone and add them to list.
> + * allocate nr_pages quickly from the preferred zone. Pages are added
> + * to page_list if page_list is not NULL, otherwise it is assumed
> + * that the page_array is valid.
> + *
> + * If using page_array, only NULL elements are populated with pages.
> + * The caller must ensure that the array has enough NULL elements
> + * to store nr_pages or the buffer overruns.
>   *
>   * Returns the number of pages allocated.
>   */
>  int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  			nodemask_t *nodemask, int nr_pages,
> -			struct list_head *alloc_list)
> +			struct list_head *page_list,
> +			struct page **page_array)
>  {
>  	struct page *page;
>  	unsigned long flags;
> @@ -4987,6 +4994,9 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  	if (WARN_ON_ONCE(nr_pages <= 0))
>  		return 0;
>  
> +	if (WARN_ON_ONCE(!page_list && !page_array))
> +		return 0;
> +
>  	if (nr_pages == 1)
>  		goto failed;
>  
> @@ -5035,7 +5045,24 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  			break;
>  		}
>  
> -		list_add(&page->lru, alloc_list);
> +		if (page_list) {
> +			/* New page prep is deferred */
> +			list_add(&page->lru, page_list);
> +		} else {
> +			/* Skip populated elements */
> +			while (*page_array)
> +				page_array++;

I don't like this approach as it is a dangerous construct, that can run
wild through the memory.  I have coded up another approach where I have
an nr_avail counter instead, that will "include" and count existing
elements in the array.

> +
> +			/*
> +			 * Array pages must be prepped immediately to
> +			 * avoid tracking which pages are new and
> +			 * which ones were already on the array.
> +			 */
> +			prep_new_page(page, 0, gfp, 0);
> +			*page_array = page;
> +			page_array++;
> +		}
> +
>  		allocated++;
>  	}
>  
> @@ -5044,9 +5071,12 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  
>  	local_irq_restore(flags);
>  
> -	/* Prep page with IRQs enabled to reduce disabled times */
> -	list_for_each_entry(page, alloc_list, lru)
> -		prep_new_page(page, 0, gfp, 0);
> +	/* Prep pages with IRQs enabled if using a list */
> +	if (page_list) {
> +		list_for_each_entry(page, page_list, lru) {
> +			prep_new_page(page, 0, gfp, 0);
> +		}
> +	}
>  
>  	return allocated;
>  
> @@ -5056,7 +5086,10 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
>  failed:
>  	page = __alloc_pages(gfp, 0, preferred_nid, nodemask);
>  	if (page) {
> -		list_add(&page->lru, alloc_list);
> +		if (page_list)
> +			list_add(&page->lru, page_list);
> +		else
> +			*page_array = page;
>  		allocated = 1;
>  	}
>  

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


  parent reply	other threads:[~2021-03-19 17:11 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-10 10:46 [PATCH 0/5] Introduce a bulk order-0 page allocator with two in-tree users Mel Gorman
2021-03-10 10:46 ` [PATCH 1/5] SUNRPC: Set rq_page_end differently Mel Gorman
2021-03-10 10:46 ` [PATCH 2/5] mm/page_alloc: Add a bulk page allocator Mel Gorman
2021-03-10 23:46   ` Andrew Morton
2021-03-11  8:42     ` Mel Gorman
2021-03-12 11:46       ` Jesper Dangaard Brouer
2021-03-12 13:44         ` Mel Gorman
2021-03-12 14:58         ` Matthew Wilcox
2021-03-12 16:03           ` Mel Gorman
2021-03-12 21:08             ` Matthew Wilcox
2021-03-13 13:16               ` Mel Gorman
2021-03-13 16:39                 ` Matthew Wilcox
2021-03-13 16:56                   ` Chuck Lever III
2021-03-13 19:33                     ` Matthew Wilcox
2021-03-14 12:52                       ` Mel Gorman
2021-03-14 15:22                         ` Chuck Lever III
2021-03-15 10:42                           ` Mel Gorman
2021-03-15 16:42                             ` Jesper Dangaard Brouer
2021-03-19 17:10                         ` Jesper Dangaard Brouer [this message]
2021-03-12 12:43   ` Matthew Wilcox
2021-03-12 14:15     ` Mel Gorman
2021-03-10 10:46 ` [PATCH 3/5] SUNRPC: Refresh rq_pages using " Mel Gorman
2021-03-10 10:46 ` [PATCH 4/5] net: page_pool: refactor dma_map into own function page_pool_dma_map Mel Gorman
2021-03-10 10:46 ` [PATCH 5/5] net: page_pool: use alloc_pages_bulk in refill code path Mel Gorman
2021-03-10 23:47 ` [PATCH 0/5] Introduce a bulk order-0 page allocator with two in-tree users Andrew Morton
2021-03-11  8:48   ` Mel Gorman
2021-03-12 15:10     ` Matthew Wilcox
  -- strict thread matches above, loose matches on Subject: below --
2021-03-11 11:49 [PATCH 0/5 v3] " Mel Gorman
2021-03-11 11:49 ` [PATCH 2/5] mm/page_alloc: Add a bulk page allocator Mel Gorman
2021-03-11 16:42   ` Alexander Duyck
2021-03-11 16:42     ` Alexander Duyck
2021-03-12  7:32     ` Mel Gorman
2021-03-01 16:11 [PATCH 0/5] Introduce a bulk order-0 page allocator with two in-tree users Mel Gorman
2021-03-01 16:11 ` [PATCH 2/5] mm/page_alloc: Add a bulk page allocator Mel Gorman
2021-03-09 17:12   ` Christoph Hellwig
2021-03-09 18:10     ` Mel Gorman
2021-03-10 11:04   ` Shay Agroskin
2021-03-10 11:38     ` Mel Gorman
2021-03-12 12:01       ` Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210319181031.44dd3113@carbon \
    --to=brouer@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=chuck.lever@oracle.com \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=netdev@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.