From: Vlastimil Babka <vbabka@suse.cz>
To: Mel Gorman <mgorman@techsingularity.net>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>,
	Linux-FSDevel <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, Jan Kara <jack@suse.cz>,
	Andi Kleen <ak@linux.intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 8/8] mm: Remove __GFP_COLD
Date: Thu, 19 Oct 2017 15:42:12 +0200	[thread overview]
Message-ID: <f6505442-98a9-12e4-b2cd-0fa83874c159@suse.cz> (raw)
In-Reply-To: <20171018075952.10627-9-mgorman@techsingularity.net>

On 10/18/2017 09:59 AM, Mel Gorman wrote:
> As the page free path makes no distinction between cache hot and cold
> pages, there is no real useful ordering of pages in the free list that
> allocation requests can take advantage of. Judging from the users of
> __GFP_COLD, it is likely that a number of them are the result of copying
> other sites instead of actually measuring the impact. Remove the
> __GFP_COLD parameter which simplifies a number of paths in the page
> allocator.
> 
> This is potentially controversial but bear in mind that the size of the
> per-cpu pagelists versus modern cache sizes means that the whole per-cpu
> list can often fit in the L3 cache. Hence, there is only a potential benefit
> for microbenchmarks that alloc/free pages in a tight loop. It's even worse
> when THP is taken into account which has little or no chance of getting a
> cache-hot page as the per-cpu list is bypassed and the zeroing of multiple
> pages will thrash the cache anyway.
> 
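
(As a rough sanity check of the cache-size claim above: with the common
per-cpu high watermark of the era, 186 pages, a full pcplist is
186 * 4KB = 744KB, which fits comfortably in a typical multi-megabyte L3
cache. The 186 figure is the usual default, not a number from this thread.)
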
> The truncate microbenchmarks are not shown as this patch affects the
> allocation path and not the free path. A page fault microbenchmark was
> tested but it showed no significant difference, which is not surprising given
> that the __GFP_COLD branches are a minuscule percentage of the fault path.
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
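
For context, a typical call site of the flag being removed looks like the
hypothetical sketch below (alloc_pages_node() is a real allocator entry
point; the lines themselves are illustrative, not a hunk from this series):

	/* Before: the caller hints that a cache-cold page is fine. */
	page = alloc_pages_node(nid, GFP_ATOMIC | __GFP_COLD, order);

	/*
	 * After: the hint is gone; since the allocator makes no hot/cold
	 * distinction anyway, the flag is simply dropped.
	 */
	page = alloc_pages_node(nid, GFP_ATOMIC, order);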

I've updated the patch at https://marc.info/?l=linux-mm&m=150831216224521&w=2 on
top of this. It's a small non-functional change, so it could even be folded in.

----8<----
From b002266c1a826805a50087db851f93e7a87ceb2f Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Tue, 17 Oct 2017 16:03:02 +0200
Subject: [PATCH] mm, page_alloc: simplify list handling in rmqueue_bulk()

The rmqueue_bulk() function fills an empty pcplist with pages from the free
list. It tries to preserve increasing pfn order for the caller, because that
leads to better performance with some I/O controllers, as explained in commit
e084b2d95e48 ("page-allocator: preserve PFN ordering when __GFP_COLD is set").

To preserve that order, it is sufficient to add each page to the tail of the
list as it is retrieved. The current code instead adds each page to the head
of the list and then, at each step, updates the list head pointer to the page
just added. This produces the same order, but is needlessly confusing and
potentially wasteful, with no apparent benefit. This patch simplifies the code
and adjusts the comment accordingly.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/page_alloc.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3e60d8a37e3f..4e1b3c686cc2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2330,16 +2330,16 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 			continue;
 
 		/*
-		 * Split buddy pages returned by expand() are received here
-		 * in physical page order. The page is added to the callers and
-		 * list and the list head then moves forward. From the callers
-		 * perspective, the linked list is ordered by page number in
-		 * some conditions. This is useful for IO devices that can
-		 * merge IO requests if the physical pages are ordered
-		 * properly.
+		 * Split buddy pages returned by expand() are received here in
+		 * physical page order. The page is added to the tail of
+		 * caller's list. From the callers perspective, the linked list
+		 * is ordered by page number under some conditions. This is
+		 * useful for IO devices that can forward direction from the
+		 * head, thus also in the physical page order. This is useful
+		 * for IO devices that can merge IO requests if the physical
+		 * pages are ordered properly.
 		 */
-		list_add(&page->lru, list);
-		list = &page->lru;
+		list_add_tail(&page->lru, list);
 		alloced++;
 		if (is_migrate_cma(get_pcppage_migratetype(page)))
 			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
-- 
2.14.2
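
Since it may not be obvious at a glance that the old head-chasing insertion
and a plain list_add_tail() really produce the same traversal order, here is
a minimal userspace sketch demonstrating the equivalence. The list helpers
are a simplified stand-in for the kernel's <linux/list.h>, and fake_page and
print_pfns() are made-up names for illustration only:

	#include <stdio.h>
	#include <stddef.h>

	/* Simplified stand-in for the kernel's circular doubly linked list. */
	struct list_head {
		struct list_head *next, *prev;
	};

	#define LIST_HEAD_INIT(name) { &(name), &(name) }

	static void __list_add(struct list_head *new, struct list_head *prev,
			       struct list_head *next)
	{
		next->prev = new;
		new->next = next;
		new->prev = prev;
		prev->next = new;
	}

	/* Insert right after head. */
	static void list_add(struct list_head *new, struct list_head *head)
	{
		__list_add(new, head, head->next);
	}

	/* Insert right before head, i.e. at the tail of the list. */
	static void list_add_tail(struct list_head *new, struct list_head *head)
	{
		__list_add(new, head->prev, head);
	}

	/* Made-up page stand-in; only the pfn and the linkage matter here. */
	struct fake_page {
		unsigned long pfn;
		struct list_head lru;
	};

	static void print_pfns(const struct list_head *head)
	{
		const struct list_head *pos;

		for (pos = head->next; pos != head; pos = pos->next) {
			const struct fake_page *p = (const struct fake_page *)
				((const char *)pos - offsetof(struct fake_page, lru));
			printf("%lu ", p->pfn);
		}
		printf("\n");
	}

	int main(void)
	{
		struct fake_page pages[4] = {
			{ .pfn = 1 }, { .pfn = 2 }, { .pfn = 3 }, { .pfn = 4 },
		};
		struct list_head old_way = LIST_HEAD_INIT(old_way);
		struct list_head new_way = LIST_HEAD_INIT(new_way);
		struct list_head *list = &old_way;
		int i;

		/* Old idiom: add after the moving "head", then make the page
		 * just added the next insertion point. */
		for (i = 0; i < 4; i++) {
			list_add(&pages[i].lru, list);
			list = &pages[i].lru;
		}
		print_pfns(&old_way);		/* prints: 1 2 3 4 */

		/* New idiom: append to the tail of a fixed head. Relinking the
		 * same nodes invalidates old_way, which is not used again. */
		for (i = 0; i < 4; i++)
			list_add_tail(&pages[i].lru, &new_way);
		print_pfns(&new_way);		/* prints: 1 2 3 4 */

		return 0;
	}

Both traversals print "1 2 3 4": appending to the tail preserves the pfn
order without the extra head-pointer bookkeeping.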

Thread overview: 47+ messages
2017-10-18  7:59 [PATCH 0/8] Follow-up for speed up page cache truncation v2 Mel Gorman
2017-10-18  7:59 ` [PATCH 1/8] mm, page_alloc: Enable/disable IRQs once when freeing a list of pages Mel Gorman
2017-10-18  9:02   ` Vlastimil Babka
2017-10-18 10:15     ` Mel Gorman
2017-10-18  7:59 ` [PATCH 2/8] mm, truncate: Do not check mapping for every page being truncated Mel Gorman
2017-10-19  8:11   ` Jan Kara
2017-10-18  7:59 ` [PATCH 3/8] mm, truncate: Remove all exceptional entries from pagevec under one lock Mel Gorman
2017-10-18  7:59 ` [PATCH 4/8] mm: Only drain per-cpu pagevecs once per pagevec usage Mel Gorman
2017-10-19  9:12   ` Vlastimil Babka
2017-10-19  9:33     ` Mel Gorman
2017-10-19 13:21       ` Vlastimil Babka
2017-10-18  7:59 ` [PATCH 5/8] mm, pagevec: Remove cold parameter for pagevecs Mel Gorman
2017-10-19  9:24   ` Vlastimil Babka
2017-10-18  7:59 ` [PATCH 6/8] mm: Remove cold parameter for release_pages Mel Gorman
2017-10-19  9:26   ` Vlastimil Babka
2017-10-18  7:59 ` [PATCH 7/8] mm, Remove cold parameter from free_hot_cold_page* Mel Gorman
2017-10-19 13:12   ` Vlastimil Babka
2017-10-19 15:43     ` Mel Gorman
2017-10-18  7:59 ` [PATCH 8/8] mm: Remove __GFP_COLD Mel Gorman
2017-10-19 13:18   ` Vlastimil Babka
2017-10-19 13:42   ` Vlastimil Babka [this message]
2017-10-19 14:32     ` Mel Gorman
  -- strict thread matches above, loose matches on Subject: below --
2017-10-12  9:30 [PATCH 0/8] Follow-up for speed up page cache truncation Mel Gorman
2017-10-12  9:31 ` [PATCH 8/8] mm: Remove __GFP_COLD Mel Gorman
