* [PATCH 0/4] Follow-up fixes to node-lru series v1
@ 2016-07-13 10:00 Mel Gorman
  2016-07-13 10:00 ` [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix Mel Gorman
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Mel Gorman @ 2016-07-13 10:00 UTC (permalink / raw)
  To: Andrew Morton, Linux-MM; +Cc: Johannes Weiner, Minchan Kim, LKML, Mel Gorman

These are some follow-up fixes to the node-lru series based on feedback
from Johannes and Minchan.

 include/linux/vm_event_item.h |  2 +-
 mm/page-writeback.c           | 16 ++++++++++------
 mm/vmscan.c                   | 15 ++++++++-------
 mm/vmstat.c                   |  2 +-
 4 files changed, 20 insertions(+), 15 deletions(-)

-- 
2.6.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix
  2016-07-13 10:00 [PATCH 0/4] Follow-up fixes to node-lru series v1 Mel Gorman
@ 2016-07-13 10:00 ` Mel Gorman
  2016-07-13 13:10   ` Johannes Weiner
  2016-07-14  1:22   ` Minchan Kim
  2016-07-13 10:00 ` [PATCH 2/4] mm: vmstat: account per-zone stalls and pages skipped during reclaim -fix Mel Gorman
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2016-07-13 10:00 UTC (permalink / raw)
  To: Andrew Morton, Linux-MM; +Cc: Johannes Weiner, Minchan Kim, LKML, Mel Gorman

Johannes reported that the comment about buffer_heads_over_limit in
balance_pgdat only made sense in the context of the patch. This patch
clarifies the reasoning and how it applies to 32 and 64 bit systems.

This is a fix to the mmotm patch
mm-vmscan-have-kswapd-reclaim-from-all-zones-if-reclaiming-and-buffer_heads_over_limit.patch

Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d079210d46ee..21eae17ee730 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3131,12 +3131,13 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
 
 		/*
 		 * If the number of buffer_heads exceeds the maximum allowed
-		 * then consider reclaiming from all zones. This is not
-		 * specific to highmem which may not exist but it is it is
-		 * expected that buffer_heads are stripped in writeback.
-		 * Reclaim may still not go ahead if all eligible zones
-		 * for the original allocation request are balanced to
-		 * avoid excessive reclaim from kswapd.
+		 * then consider reclaiming from all zones. This has a dual
+		 * purpose -- on 64-bit systems it is expected that
+		 * buffer_heads are stripped during active rotation. On 32-bit
+		 * systems, highmem pages can pin lowmem memory and shrinking
+		 * buffers can relieve lowmem pressure. Reclaim may still not
+		 * go ahead if all eligible zones for the original allocation
+		 * request are balanced to avoid excessive reclaim from kswapd.
 		 */
 		if (buffer_heads_over_limit) {
 			for (i = MAX_NR_ZONES - 1; i >= 0; i--) {
-- 
2.6.4

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 2/4] mm: vmstat: account per-zone stalls and pages skipped during reclaim -fix
  2016-07-13 10:00 [PATCH 0/4] Follow-up fixes to node-lru series v1 Mel Gorman
  2016-07-13 10:00 ` [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix Mel Gorman
@ 2016-07-13 10:00 ` Mel Gorman
  2016-07-13 13:10   ` Johannes Weiner
  2016-07-14 14:45   ` Vlastimil Babka
  2016-07-13 10:00 ` [PATCH 3/4] mm, page_alloc: fix dirtyable highmem calculation Mel Gorman
  2016-07-13 10:00 ` [PATCH 4/4] mm: move most file-based accounting to the node -fix Mel Gorman
  3 siblings, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2016-07-13 10:00 UTC (permalink / raw)
  To: Andrew Morton, Linux-MM; +Cc: Johannes Weiner, Minchan Kim, LKML, Mel Gorman

As pointed out by Johannes -- the PG prefix seems to stand for page, and
all stat names that contain it represent some per-page event. PGSTALL is
not a page event. This patch renames it.

This is a fix for the mmotm patch
mm-vmstat-account-per-zone-stalls-and-pages-skipped-during-reclaim.patch

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/vm_event_item.h | 2 +-
 mm/vmscan.c                   | 2 +-
 mm/vmstat.c                   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 6d47f66f0e9c..4d6ec58a8d45 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -23,7 +23,7 @@
 
 enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		FOR_ALL_ZONES(PGALLOC),
-		FOR_ALL_ZONES(PGSTALL),
+		FOR_ALL_ZONES(ALLOCSTALL),
 		FOR_ALL_ZONES(PGSCAN_SKIP),
 		PGFREE, PGACTIVATE, PGDEACTIVATE,
 		PGFAULT, PGMAJFAULT,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 21eae17ee730..429bf3a9c06c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2674,7 +2674,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 	delayacct_freepages_start();
 
 	if (global_reclaim(sc))
-		__count_zid_vm_events(PGSTALL, sc->reclaim_idx, 1);
+		__count_zid_vm_events(ALLOCSTALL, sc->reclaim_idx, 1);
 
 	do {
 		vmpressure_prio(sc->gfp_mask, sc->target_mem_cgroup,
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 7415775faf08..91ecca96dcae 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -980,7 +980,7 @@ const char * const vmstat_text[] = {
 	"pswpout",
 
 	TEXTS_FOR_ZONES("pgalloc")
-	TEXTS_FOR_ZONES("pgstall")
+	TEXTS_FOR_ZONES("allocstall")
 	TEXTS_FOR_ZONES("pgskip")
 
 	"pgfree",
-- 
2.6.4
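
To illustrate how the rename propagates from the enum to the exported
counter names, here is a minimal userspace sketch. It assumes a
simplified three-zone layout; the real FOR_ALL_ZONES and TEXTS_FOR_ZONES
macros depend on the kernel configuration, and the count_zid_vm_events
macro and event_lookup helper below are hypothetical stand-ins written
for this example, not the kernel's implementation.

```c
#include <assert.h>
#include <string.h>

/* Simplified zone list; the real set depends on CONFIG_* options. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_MOVABLE, MAX_NR_ZONES };

/* One enum entry per zone, in zone order (simplified FOR_ALL_ZONES). */
#define FOR_ALL_ZONES(xx) xx##_DMA, xx##_NORMAL, xx##_MOVABLE

/* Matching names in the same order (simplified TEXTS_FOR_ZONES). */
#define TEXTS_FOR_ZONES(xx) xx "_dma", xx "_normal", xx "_movable",

enum vm_event_item {
	FOR_ALL_ZONES(PGALLOC),
	FOR_ALL_ZONES(ALLOCSTALL),
	FOR_ALL_ZONES(PGSCAN_SKIP),
	NR_VM_EVENT_ITEMS
};

static const char * const vmstat_text[] = {
	TEXTS_FOR_ZONES("pgalloc")
	TEXTS_FOR_ZONES("allocstall")
	TEXTS_FOR_ZONES("pgskip")
};

static unsigned long vm_events[NR_VM_EVENT_ITEMS];

/* Hypothetical helper: pick the per-zone slot relative to the _NORMAL entry. */
#define count_zid_vm_events(item, zid, delta) \
	(vm_events[item##_NORMAL - ZONE_NORMAL + (zid)] += (delta))

/* Look up a counter by its exported name; returns -1 if the name is unknown. */
static long event_lookup(const char *name)
{
	/* The enum and the name table must stay in step. */
	assert(sizeof(vmstat_text) / sizeof(vmstat_text[0]) == NR_VM_EVENT_ITEMS);

	for (int i = 0; i < NR_VM_EVENT_ITEMS; i++)
		if (strcmp(vmstat_text[i], name) == 0)
			return (long)vm_events[i];
	return -1;
}
```

In this sketch, count_zid_vm_events(ALLOCSTALL, ZONE_MOVABLE, 1) bumps
the slot named "allocstall_movable", and a lookup of the old
"pgstall_movable" name no longer matches anything. The enum and the
string table have to be renamed together because they are matched purely
by position.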

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/4] mm, page_alloc: fix dirtyable highmem calculation
  2016-07-13 10:00 [PATCH 0/4] Follow-up fixes to node-lru series v1 Mel Gorman
  2016-07-13 10:00 ` [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix Mel Gorman
  2016-07-13 10:00 ` [PATCH 2/4] mm: vmstat: account per-zone stalls and pages skipped during reclaim -fix Mel Gorman
@ 2016-07-13 10:00 ` Mel Gorman
  2016-07-13 13:15   ` Johannes Weiner
  2016-07-14 15:22   ` Vlastimil Babka
  2016-07-13 10:00 ` [PATCH 4/4] mm: move most file-based accounting to the node -fix Mel Gorman
  3 siblings, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2016-07-13 10:00 UTC (permalink / raw)
  To: Andrew Morton, Linux-MM; +Cc: Johannes Weiner, Minchan Kim, LKML, Mel Gorman

From: Minchan Kim <minchan@kernel.org>

Note from Mel: This may optionally be considered a fix to the mmotm patch
	mm-page_alloc-consider-dirtyable-memory-in-terms-of-nodes.patch
	but if so, please preserve credit for Minchan.

When I tested vmscale from mmtests on 32-bit, I found the benchmark was
slowed down by about 0.5 times.

                base        node
                   1    global-1
User           12.98       16.04
System        147.61      166.42
Elapsed        26.48       38.08

With vmstat, I found the average IO wait was much higher than with base.

The reason was that highmem_dirtyable_memory accumulated free pages and
highmem_file_pages across the HIGHMEM to MOVABLE zones incorrectly. As a
result, dirty_thresh in throttle_vm_writeout was always 0, so it called
congestion_wait frequently once writeback started.

With this patch, most of the slowdown is recovered.

                base        node          fi
                   1    global-1         fix
User           12.98       16.04       13.78
System        147.61      166.42      143.92
Elapsed        26.48       38.08       29.64

Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/page-writeback.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0bca2376bd42..7b41d1290783 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -307,27 +307,31 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
 {
 #ifdef CONFIG_HIGHMEM
 	int node;
-	unsigned long x = 0;
+	unsigned long x;
 	int i;
-	unsigned long dirtyable = atomic_read(&highmem_file_pages);
+	unsigned long dirtyable = 0;
 
 	for_each_node_state(node, N_HIGH_MEMORY) {
 		for (i = ZONE_NORMAL + 1; i < MAX_NR_ZONES; i++) {
 			struct zone *z;
+			unsigned long nr_pages;
 
 			if (!is_highmem_idx(i))
 				continue;
 
 			z = &NODE_DATA(node)->node_zones[i];
-			dirtyable += zone_page_state(z, NR_FREE_PAGES);
+			if (!populated_zone(z))
+				continue;
 
+			nr_pages = zone_page_state(z, NR_FREE_PAGES);
 			/* watch for underflows */
-			dirtyable -= min(dirtyable, high_wmark_pages(z));
-
-			x += dirtyable;
+			nr_pages -= min(nr_pages, high_wmark_pages(z));
+			dirtyable += nr_pages;
 		}
 	}
 
+	x = dirtyable + atomic_read(&highmem_file_pages);
+
 	/*
 	 * Unreclaimable memory (kernel memory or anonymous memory
 	 * without swap) can bring down the dirtyable pages below
-- 
2.6.4
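
The difference between the old and new accumulation can be sketched in
userspace as follows. This is an illustration under stated assumptions,
not the kernel code: struct zone_sketch and min_ul are hypothetical
stand-ins, the final clamp against total is assumed because the hunk
above ends before the function returns, and lowmem_dirtyable models an
assumed caller that subtracts the highmem share from the total.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-zone values the kernel reads. */
struct zone_sketch {
	unsigned long free_pages;	/* zone_page_state(z, NR_FREE_PAGES) */
	unsigned long high_wmark;	/* high_wmark_pages(z) */
	int populated;			/* populated_zone(z) */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Pre-fix: one running total is clamped and re-added for every zone. */
static unsigned long highmem_dirtyable_old(const struct zone_sketch *z,
					   size_t n, unsigned long file_pages,
					   unsigned long total)
{
	unsigned long x = 0;
	unsigned long dirtyable = file_pages;
	size_t i;

	for (i = 0; i < n; i++) {
		dirtyable += z[i].free_pages;
		dirtyable -= min_ul(dirtyable, z[i].high_wmark);
		x += dirtyable;	/* file pages and earlier zones counted again */
	}
	return min_ul(x, total);	/* assumed clamp, not shown in the hunk */
}

/* Post-fix: each populated zone contributes once, file pages added once. */
static unsigned long highmem_dirtyable_new(const struct zone_sketch *z,
					   size_t n, unsigned long file_pages,
					   unsigned long total)
{
	unsigned long dirtyable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long nr_pages;

		if (!z[i].populated)
			continue;
		nr_pages = z[i].free_pages;
		/* watch for underflows */
		nr_pages -= min_ul(nr_pages, z[i].high_wmark);
		dirtyable += nr_pages;
	}
	return min_ul(dirtyable + file_pages, total);	/* assumed clamp */
}

/* Assumed caller: lowmem dirtyable memory is total minus the highmem share. */
static unsigned long lowmem_dirtyable(unsigned long total,
				      unsigned long highmem)
{
	assert(highmem <= total);	/* holds because of the clamp above */
	return total - highmem;
}
```

With 1000 highmem file pages, one populated zone (500 free pages, high
watermark 100), one unpopulated zone and total = 2000, the old logic
yields 2000 and leaves 0 dirtyable lowmem pages, while the new logic
yields 1400 and leaves 600. An inflated highmem figure that consumes the
whole total is consistent with the zero threshold described in the
changelog.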

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 4/4] mm: move most file-based accounting to the node -fix
  2016-07-13 10:00 [PATCH 0/4] Follow-up fixes to node-lru series v1 Mel Gorman
                   ` (2 preceding siblings ...)
  2016-07-13 10:00 ` [PATCH 3/4] mm, page_alloc: fix dirtyable highmem calculation Mel Gorman
@ 2016-07-13 10:00 ` Mel Gorman
  2016-07-13 13:16   ` Johannes Weiner
  3 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2016-07-13 10:00 UTC (permalink / raw)
  To: Andrew Morton, Linux-MM; +Cc: Johannes Weiner, Minchan Kim, LKML, Mel Gorman

As noted by Johannes Weiner, NR_ZONE_WRITE_PENDING gets decremented twice
during migration instead of a dec(old) -> inc(new) cycle as intended.

This is a fix to mmotm patch
mm-move-most-file-based-accounting-to-the-node.patch

Note that it'll cause a conflict with
mm-vmstat-remove-zone-and-node-double-accounting-by-approximating-retries.patch
but that the resolution is trivial.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index c77997dc6ed7..ed0268268e93 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -515,7 +515,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
 			__dec_node_state(oldzone->zone_pgdat, NR_FILE_DIRTY);
 			__dec_zone_state(oldzone, NR_ZONE_WRITE_PENDING);
 			__inc_node_state(newzone->zone_pgdat, NR_FILE_DIRTY);
-			__dec_zone_state(newzone, NR_ZONE_WRITE_PENDING);
+			__inc_zone_state(newzone, NR_ZONE_WRITE_PENDING);
 		}
 	}
 	local_irq_enable();
-- 
2.6.4
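
The intended dec(old) -> inc(new) cycle can be shown with a small
userspace sketch. The struct and function names are hypothetical and the
counters are plain longs rather than the kernel's per-cpu vmstat
counters; the assertion that the two zones differ is an assumption about
the surrounding code, which is not visible in the hunk.

```c
#include <assert.h>

/* Hypothetical, simplified counter; the kernel uses per-cpu vmstat. */
struct zone_acct {
	long nr_zone_write_pending;	/* NR_ZONE_WRITE_PENDING */
};

/* Pre-fix: both zones are decremented, so the page is subtracted twice. */
static void move_dirty_buggy(struct zone_acct *oldzone,
			     struct zone_acct *newzone)
{
	assert(oldzone != newzone);	/* assumed: only cross-zone moves */
	oldzone->nr_zone_write_pending--;
	newzone->nr_zone_write_pending--;
}

/* Post-fix: dec(old) -> inc(new), so the sum over zones is unchanged. */
static void move_dirty_fixed(struct zone_acct *oldzone,
			     struct zone_acct *newzone)
{
	assert(oldzone != newzone);	/* assumed: only cross-zone moves */
	oldzone->nr_zone_write_pending--;
	newzone->nr_zone_write_pending++;
}

/* Sum over both zones; migration should leave this value untouched. */
static long total_write_pending(const struct zone_acct *a,
				const struct zone_acct *b)
{
	return a->nr_zone_write_pending + b->nr_zone_write_pending;
}
```

Starting from one pending page in the old zone, the fixed variant keeps
the total at 1 while the buggy variant drives it to -1, which is the
kind of drift that would accumulate with every migrated dirty page.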

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix
  2016-07-13 10:00 ` [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix Mel Gorman
@ 2016-07-13 13:10   ` Johannes Weiner
  2016-07-14  1:22   ` Minchan Kim
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Weiner @ 2016-07-13 13:10 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, Linux-MM, Minchan Kim, LKML

On Wed, Jul 13, 2016 at 11:00:01AM +0100, Mel Gorman wrote:
> Johannes reported that the comment about buffer_heads_over_limit in
> balance_pgdat only made sense in the context of the patch. This patch
> clarifies the reasoning and how it applies to 32 and 64 bit systems.
> 
> This is a fix to the mmotm patch
> mm-vmscan-have-kswapd-reclaim-from-all-zones-if-reclaiming-and-buffer_heads_over_limit.patch
> 
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

This is a great comment now, thank you.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] mm: vmstat: account per-zone stalls and pages skipped during reclaim -fix
  2016-07-13 10:00 ` [PATCH 2/4] mm: vmstat: account per-zone stalls and pages skipped during reclaim -fix Mel Gorman
@ 2016-07-13 13:10   ` Johannes Weiner
  2016-07-14 14:45   ` Vlastimil Babka
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Weiner @ 2016-07-13 13:10 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, Linux-MM, Minchan Kim, LKML

On Wed, Jul 13, 2016 at 11:00:02AM +0100, Mel Gorman wrote:
> As pointed out by Johannes -- the PG prefix seems to stand for page, and
> all stat names that contain it represent some per-page event. PGSTALL is
> not a page event. This patch renames it.
> 
> This is a fix for the mmotm patch
> mm-vmstat-account-per-zone-stalls-and-pages-skipped-during-reclaim.patch
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Thanks

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] mm, page_alloc: fix dirtyable highmem calculation
  2016-07-13 10:00 ` [PATCH 3/4] mm, page_alloc: fix dirtyable highmem calculation Mel Gorman
@ 2016-07-13 13:15   ` Johannes Weiner
  2016-07-14 15:22   ` Vlastimil Babka
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Weiner @ 2016-07-13 13:15 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, Linux-MM, Minchan Kim, LKML

On Wed, Jul 13, 2016 at 11:00:03AM +0100, Mel Gorman wrote:
> From: Minchan Kim <minchan@kernel.org>
> 
> Note from Mel: This may optionally be considered a fix to the mmotm patch
> 	mm-page_alloc-consider-dirtyable-memory-in-terms-of-nodes.patch
> 	but if so, please preserve credit for Minchan.
> 
> When I tested vmscale from mmtests on 32-bit, I found the benchmark was
> slowed down by about 0.5 times.
> 
>                 base        node
>                    1    global-1
> User           12.98       16.04
> System        147.61      166.42
> Elapsed        26.48       38.08
> 
> With vmstat, I found the average IO wait was much higher than with base.
> 
> The reason was that highmem_dirtyable_memory accumulated free pages and
> highmem_file_pages across the HIGHMEM to MOVABLE zones incorrectly. As a
> result, dirty_thresh in throttle_vm_writeout was always 0, so it called
> congestion_wait frequently once writeback started.
> 
> With this patch, most of the slowdown is recovered.
> 
>                 base        node          fi
>                    1    global-1         fix
> User           12.98       16.04       13.78
> System        147.61      166.42      143.92
> Elapsed        26.48       38.08       29.64
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] mm: move most file-based accounting to the node -fix
  2016-07-13 10:00 ` [PATCH 4/4] mm: move most file-based accounting to the node -fix Mel Gorman
@ 2016-07-13 13:16   ` Johannes Weiner
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Weiner @ 2016-07-13 13:16 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, Linux-MM, Minchan Kim, LKML

On Wed, Jul 13, 2016 at 11:00:04AM +0100, Mel Gorman wrote:
> As noted by Johannes Weiner, NR_ZONE_WRITE_PENDING gets decremented twice
> during migration instead of a dec(old) -> inc(new) cycle as intended.
> 
> This is a fix to mmotm patch
> mm-move-most-file-based-accounting-to-the-node.patch
> 
> Note that it'll cause a conflict with
> mm-vmstat-remove-zone-and-node-double-accounting-by-approximating-retries.patch
> but that the resolution is trivial.
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix
  2016-07-13 10:00 ` [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix Mel Gorman
  2016-07-13 13:10   ` Johannes Weiner
@ 2016-07-14  1:22   ` Minchan Kim
  1 sibling, 0 replies; 24+ messages in thread
From: Minchan Kim @ 2016-07-14  1:22 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, Linux-MM, Johannes Weiner, LKML

On Wed, Jul 13, 2016 at 11:00:01AM +0100, Mel Gorman wrote:
> Johannes reported that the comment about buffer_heads_over_limit in
> balance_pgdat only made sense in the context of the patch. This patch
> clarifies the reasoning and how it applies to 32 and 64 bit systems.
> 
> This is a fix to the mmotm patch
> mm-vmscan-have-kswapd-reclaim-from-all-zones-if-reclaiming-and-buffer_heads_over_limit.patch
> 
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  mm/vmscan.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index d079210d46ee..21eae17ee730 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3131,12 +3131,13 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
>  
>  		/*
>  		 * If the number of buffer_heads exceeds the maximum allowed
> -		 * then consider reclaiming from all zones. This is not
> -		 * specific to highmem which may not exist but it is it is
> -		 * expected that buffer_heads are stripped in writeback.
> -		 * Reclaim may still not go ahead if all eligible zones
> -		 * for the original allocation request are balanced to
> -		 * avoid excessive reclaim from kswapd.
> +		 * then consider reclaiming from all zones. This has a dual
> +		 * purpose -- on 64-bit systems it is expected that
> +		 * buffer_heads are stripped during active rotation. On 32-bit
> +		 * systems, highmem pages can pin lowmem memory and shrinking
> +		 * buffers can relieve lowmem pressure. Reclaim may still not

It's good, but I hope we can make it clearer:

On 32-bit systems, highmem pages can pin lowmem pages storing buffer_heads
so shrinking highmem pages can relieve lowmem pressure.

If you don't think this is more readable than yours, feel free to drop it.

Thanks.

* Re: [PATCH 2/4] mm: vmstat: account per-zone stalls and pages skipped during reclaim -fix
  2016-07-13 10:00 ` [PATCH 2/4] mm: vmstat: account per-zone stalls and pages skipped during reclaim -fix Mel Gorman
  2016-07-13 13:10   ` Johannes Weiner
@ 2016-07-14 14:45   ` Vlastimil Babka
  1 sibling, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-14 14:45 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton, Linux-MM; +Cc: Johannes Weiner, Minchan Kim, LKML

On 07/13/2016 12:00 PM, Mel Gorman wrote:
> As pointed out by Johannes -- the PG prefix seems to stand for page, and
> all stat names that contain it represent some per-page event. PGSTALL is
> not a page event. This patch renames it.
>
> This is a fix for the mmotm patch
> mm-vmstat-account-per-zone-stalls-and-pages-skipped-during-reclaim.patch
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

* Re: [PATCH 3/4] mm, page_alloc: fix dirtyable highmem calculation
  2016-07-13 10:00 ` [PATCH 3/4] mm, page_alloc: fix dirtyable highmem calculation Mel Gorman
  2016-07-13 13:15   ` Johannes Weiner
@ 2016-07-14 15:22   ` Vlastimil Babka
  1 sibling, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-14 15:22 UTC (permalink / raw)
  To: Mel Gorman, Andrew Morton, Linux-MM; +Cc: Johannes Weiner, Minchan Kim, LKML

On 07/13/2016 12:00 PM, Mel Gorman wrote:
> From: Minchan Kim <minchan@kernel.org>
>
> Note from Mel: This may optionally be considered a fix to the mmotm patch
> 	mm-page_alloc-consider-dirtyable-memory-in-terms-of-nodes.patch
> 	but if so, please preserve credit for Minchan.
>
> When I tested vmscale in mmtests on 32-bit, I found the benchmark slowed
> down by roughly 0.5 times.
>
>                 base        node
>                    1    global-1
> User           12.98       16.04
> System        147.61      166.42
> Elapsed        26.48       38.08
>
> With vmstat, I found the average IO wait was much higher than base.
>
> The reason was that highmem_dirtyable_memory accumulated free pages and
> highmem_file_pages across the HIGHMEM to MOVABLE zones, which was wrong. With
> that, dirty_thresh in throttle_vm_writeout is always 0, so it calls
> congestion_wait frequently once writeback starts.
>
> With this patch, most of the regression is recovered.
>
>                 base        node          fi
>                    1    global-1         fix
> User           12.98       16.04       13.78
> System        147.61      166.42      143.92
> Elapsed        26.48       38.08       29.64
>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Just some nitpicks:

> ---
>  mm/page-writeback.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 0bca2376bd42..7b41d1290783 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -307,27 +307,31 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
>  {
>  #ifdef CONFIG_HIGHMEM
>  	int node;
> -	unsigned long x = 0;
> +	unsigned long x;
>  	int i;
> -	unsigned long dirtyable = atomic_read(&highmem_file_pages);
> +	unsigned long dirtyable = 0;

This wasn't necessary?

>
>  	for_each_node_state(node, N_HIGH_MEMORY) {
>  		for (i = ZONE_NORMAL + 1; i < MAX_NR_ZONES; i++) {
>  			struct zone *z;
> +			unsigned long nr_pages;
>
>  			if (!is_highmem_idx(i))
>  				continue;
>
>  			z = &NODE_DATA(node)->node_zones[i];
> -			dirtyable += zone_page_state(z, NR_FREE_PAGES);
> +			if (!populated_zone(z))
> +				continue;
>
> +			nr_pages = zone_page_state(z, NR_FREE_PAGES);
>  			/* watch for underflows */
> -			dirtyable -= min(dirtyable, high_wmark_pages(z));
> -
> -			x += dirtyable;
> +			nr_pages -= min(nr_pages, high_wmark_pages(z));
> +			dirtyable += nr_pages;
>  		}
>  	}
>
> +	x = dirtyable + atomic_read(&highmem_file_pages);

And then this addition wouldn't be necessary. BTW I think we could also 
ditch the "x" variable and just use the "dirtyable" for the rest of the 
function.

> +
>  	/*
>  	 * Unreclaimable memory (kernel memory or anonymous memory
>  	 * without swap) can bring down the dirtyable pages below
>
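
To make the change in the hunk above easier to follow, here is a userspace sketch of the two accumulation shapes. It models only the loop body from the quoted diff; struct zone_snapshot, min_ul and both function names are hypothetical, and the populated_zone() check and the rest of highmem_dirtyable_memory() are left out.

```c
#include <stddef.h>

/* Hypothetical per-zone snapshot; field names are illustrative only. */
struct zone_snapshot {
	unsigned long free_pages;	/* models zone_page_state(z, NR_FREE_PAGES) */
	unsigned long high_wmark;	/* models high_wmark_pages(z) */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Pre-patch shape: one running total is clamped and re-added on every pass. */
static unsigned long dirtyable_old(const struct zone_snapshot *zones, size_t n,
				   unsigned long file_pages)
{
	unsigned long x = 0;
	unsigned long dirtyable = file_pages;
	size_t i;

	for (i = 0; i < n; i++) {
		dirtyable += zones[i].free_pages;
		/* watch for underflows */
		dirtyable -= min_ul(dirtyable, zones[i].high_wmark);
		x += dirtyable;	/* earlier zones and file_pages are counted again */
	}
	return x;
}

/* Post-patch shape: clamp each zone on its own, then add file_pages once. */
static unsigned long dirtyable_new(const struct zone_snapshot *zones, size_t n,
				   unsigned long file_pages)
{
	unsigned long dirtyable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long nr_pages = zones[i].free_pages;

		/* watch for underflows */
		nr_pages -= min_ul(nr_pages, zones[i].high_wmark);
		dirtyable += nr_pages;
	}
	return dirtyable + file_pages;
}
```

With two zones of (free 100, watermark 10) and (free 50, watermark 20) plus 30 file pages, the old shape returns 270 and the new one 150 in this model. The nitpick above would give the same sum as dirtyable_new(): initialising dirtyable from the file page count and accumulating nr_pages into it makes the separate x variable unnecessary.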

end of thread, other threads:[~2016-07-14 15:22 UTC | newest]

Thread overview: 24+ messages
2016-07-13 10:00 [PATCH 0/4] Follow-up fixes to node-lru series v1 Mel Gorman
2016-07-13 10:00 ` [PATCH 1/4] mm, vmscan: Have kswapd reclaim from all zones if reclaiming and buffer_heads_over_limit -fix Mel Gorman
2016-07-13 13:10   ` Johannes Weiner
2016-07-14  1:22   ` Minchan Kim
2016-07-13 10:00 ` [PATCH 2/4] mm: vmstat: account per-zone stalls and pages skipped during reclaim -fix Mel Gorman
2016-07-13 13:10   ` Johannes Weiner
2016-07-14 14:45   ` Vlastimil Babka
2016-07-13 10:00 ` [PATCH 3/4] mm, page_alloc: fix dirtyable highmem calculation Mel Gorman
2016-07-13 13:15   ` Johannes Weiner
2016-07-14 15:22   ` Vlastimil Babka
2016-07-13 10:00 ` [PATCH 4/4] mm: move most file-based accounting to the node -fix Mel Gorman
2016-07-13 13:16   ` Johannes Weiner