* [PATCH] mm, page_alloc: simplify hot/cold page handling in rmqueue_bulk()
From: Vlastimil Babka @ 2017-10-18  7:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Mel Gorman, Johannes Weiner,
	Michal Hocko, Vlastimil Babka

The rmqueue_bulk() function fills an empty pcplist with pages from the free
list. It tries to hand the pages to the caller in increasing pfn order, because
that leads to better performance with some I/O controllers, as explained in
e084b2d95e48 ("page-allocator: preserve PFN ordering when __GFP_COLD is set").
For callers requesting cold pages, which are taken from the tail of the
pcplist, this means the pcplist has to be filled in reverse order from the free
lists (the hot/cold property only applies when pages are recycled on the
pcplists, not when they are refilled from the free lists).
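
To illustrate the ordering argument above, here is a minimal userspace sketch.
It is not kernel code: struct fake_page, fill_pcplist(), take_pfn() and
consumed_in_pfn_order() are hypothetical names, and the list helpers are small
reimplementations that only mirror the semantics of the kernel's list_add()
and list_add_tail(). It assumes hot requests take the first entry of the
pcplist and cold requests the last one.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical page stand-in: only the list linkage and a pfn. */
struct fake_page { struct list_head lru; unsigned long pfn; };

#define page_of(ptr) \
	((struct fake_page *)((char *)(ptr) - offsetof(struct fake_page, lru)))

static void init_list(struct list_head *h) { h->next = h->prev = h; }

static void insert_between(struct list_head *n, struct list_head *prev,
			   struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Mirrors the kernel helper: insert right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
	insert_between(n, head, head->next);
}

/* Mirrors the kernel helper: insert right before head, i.e. at the tail. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
	insert_between(n, head->prev, head);
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Refill: pages[] arrives in ascending pfn order, as from expand(). */
static void fill_pcplist(struct list_head *pcplist, struct fake_page *pages,
			 int n, int cold)
{
	for (int i = 0; i < n; i++) {
		if (!cold)
			list_add_tail(&pages[i].lru, pcplist);
		else
			list_add(&pages[i].lru, pcplist);
	}
}

/* Allocation: hot requests take the head entry, cold requests the tail. */
static unsigned long take_pfn(struct list_head *pcplist, int cold)
{
	struct list_head *e = cold ? pcplist->prev : pcplist->next;

	assert(e != pcplist);	/* the list must not be empty */
	list_del(e);
	return page_of(e)->pfn;
}

/* Returns 1 if the caller sees strictly increasing pfns, 0 otherwise. */
int consumed_in_pfn_order(int cold)
{
	enum { N = 4 };
	struct fake_page pages[N];
	struct list_head pcplist;
	unsigned long prev, cur;

	init_list(&pcplist);
	for (int i = 0; i < N; i++)
		pages[i].pfn = 100 + i;
	fill_pcplist(&pcplist, pages, N, cold);

	prev = take_pfn(&pcplist, cold);
	for (int i = 1; i < N; i++) {
		cur = take_pfn(&pcplist, cold);
		if (cur <= prev)
			return 0;
		prev = cur;
	}
	return 1;
}
```

In this sketch the cold case stores the pages in descending pfn order on the
list, and taking them from the tail restores the ascending order.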

The related comment in rmqueue_bulk() wasn't clear to me without reading the
log of the commit mentioned above, so try to clarify it.

The code for filling the pcplists in the order determined by the cold flag also
seems unnecessarily hard to follow. It's sufficient to use either list_add()
or list_add_tail(), but the current code also updates the list head pointer
in each step to the last added page, which counterintuitively requires
swapping the usage of list_add() and list_add_tail() to achieve the desired
order, with no apparent benefit. This patch simplifies the code.
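
The claim that both variants produce the same list can be checked with a small
userspace sketch. Again this is not kernel code: fill_old(), fill_new(),
fills_match() and struct fake_page are hypothetical names, and the list
helpers are local reimplementations of the kernel semantics. The comparison
assumes the list starts out empty, as stated above for rmqueue_bulk().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical page stand-in: only the list linkage and a pfn. */
struct fake_page { struct list_head lru; unsigned long pfn; };

#define page_of(ptr) \
	((struct fake_page *)((char *)(ptr) - offsetof(struct fake_page, lru)))

static void init_list(struct list_head *h) { h->next = h->prev = h; }

static void insert_between(struct list_head *n, struct list_head *prev,
			   struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Mirrors the kernel helper: insert right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
	insert_between(n, head, head->next);
}

/* Mirrors the kernel helper: insert right before head, i.e. at the tail. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
	insert_between(n, head->prev, head);
}

/* Pre-patch shape: helpers swapped, cursor moves to the page just added. */
static void fill_old(struct list_head *list, struct fake_page *pages,
		     int n, int cold)
{
	assert(list->next == list);	/* assumption: list starts empty */
	for (int i = 0; i < n; i++) {
		if (!cold)
			list_add(&pages[i].lru, list);
		else
			list_add_tail(&pages[i].lru, list);
		list = &pages[i].lru;
	}
}

/* Post-patch shape: the list head stays fixed. */
static void fill_new(struct list_head *list, struct fake_page *pages,
		     int n, int cold)
{
	assert(list->next == list);	/* assumption: list starts empty */
	for (int i = 0; i < n; i++) {
		if (!cold)
			list_add_tail(&pages[i].lru, list);
		else
			list_add(&pages[i].lru, list);
	}
}

/* Returns 1 if both fills leave the same head-to-tail pfn sequence. */
int fills_match(int cold)
{
	enum { N = 4 };
	struct fake_page a[N], b[N];
	struct list_head old_head, new_head;
	struct list_head *p, *q;

	init_list(&old_head);
	init_list(&new_head);
	for (int i = 0; i < N; i++)
		a[i].pfn = b[i].pfn = 100 + i;
	fill_old(&old_head, a, N, cold);
	fill_new(&new_head, b, N, cold);

	p = old_head.next;
	q = new_head.next;
	while (p != &old_head && q != &new_head) {
		if (page_of(p)->pfn != page_of(q)->pfn)
			return 0;
		p = p->next;
		q = q->next;
	}
	return p == &old_head && q == &new_head;
}
```

Note that the two shapes would differ on a list that already holds entries,
so the sketch only supports the equivalence for the empty-list case.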

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/page_alloc.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6191c9a04789..4b296fc8e599 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2329,19 +2329,18 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 			continue;
 
 		/*
-		 * Split buddy pages returned by expand() are received here
-		 * in physical page order. The page is added to the callers and
-		 * list and the list head then moves forward. From the callers
-		 * perspective, the linked list is ordered by page number in
-		 * some conditions. This is useful for IO devices that can
-		 * merge IO requests if the physical pages are ordered
+		 * Split buddy pages returned by expand() are received here in
+		 * physical page order. The page is added to the caller's list.
+		 * From the caller's perspective, make sure the pages will be
+		 * consumed in the order as returned by expand(), regardless of
+		 * cold being true or false. This is useful for IO devices that
+		 * can merge IO requests if the physical pages are ordered
 		 * properly.
 		 */
 		if (likely(!cold))
-			list_add(&page->lru, list);
-		else
 			list_add_tail(&page->lru, list);
-		list = &page->lru;
+		else
+			list_add(&page->lru, list);
 		alloced++;
 		if (is_migrate_cma(get_pcppage_migratetype(page)))
 			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
-- 
2.14.2

* Re: [PATCH] mm, page_alloc: simplify hot/cold page handling in rmqueue_bulk()
From: Mel Gorman @ 2017-10-18  8:06 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Johannes Weiner, Michal Hocko

On Wed, Oct 18, 2017 at 09:35:28AM +0200, Vlastimil Babka wrote:
> The rmqueue_bulk() function fills an empty pcplist with pages from the free
> list. It tries to preserve increasing order by pfn to the caller, because it
> leads to better performance with some I/O controllers, as explained in
> e084b2d95e48 ("page-allocator: preserve PFN ordering when __GFP_COLD is set").
> For callers requesting cold pages, which are obtained from the tail of
> pcplists, it means the pcplist has to be filled in reverse order from the free
> lists (the hot/cold property only applies when pages are recycled on the
> pcplists, not when refilled from free lists).
> 
> The related comment in rmqueue_bulk() wasn't clear to me without reading the
> log of the commit mentioned above, so try to clarify it.
> 
> The code for filling the pcplists in order determined by the cold flag also
> seems unnecessarily hard to follow. It's sufficient to either use list_add()
> or list_add_tail(), but the current code also updates the list head pointer
> in each step to the last added page, which then counterintuitively requires
> to switch the usage of list_add() and list_add_tail() to achieve the desired
> order, with no apparent benefit. This patch simplifies the code.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

The "cold" treatment is dubious because almost everything that frees
considers the page "hot", which limits the usefulness of hot/cold in the
allocator. While I do not see a problem with your patch as such, please
take a look at "mm: Remove __GFP_COLD" in particular. The last 4 patches
in that series make a number of observations on how "cold" is treated in
the allocator.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH] mm, page_alloc: simplify hot/cold page handling in rmqueue_bulk()
From: Vlastimil Babka @ 2017-10-18  8:38 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, linux-mm, linux-kernel, Johannes Weiner, Michal Hocko

On 10/18/2017 10:06 AM, Mel Gorman wrote:
> On Wed, Oct 18, 2017 at 09:35:28AM +0200, Vlastimil Babka wrote:
>> The code for filling the pcplists in order determined by the cold flag also
>> seems unnecessarily hard to follow. It's sufficient to either use list_add()
>> or list_add_tail(), but the current code also updates the list head pointer
>> in each step to the last added page, which then counterintuitively requires
>> to switch the usage of list_add() and list_add_tail() to achieve the desired
>> order, with no apparent benefit. This patch simplifies the code.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> 
> The "cold" treatment is dubious because almost everything that frees
> considers the page "hot" which limits the usefulness of hot/cold in the
> allocator. While I do not see a problem with your patch as such, please
> take a look at "mm: Remove __GFP_COLD" in particular. The last 4 patches
> in that series make a number of observations on how "cold" is treated in
> the allocator.

Ah, somehow I managed to miss that series, thanks for pointing it out.
