[PATCH 1/2 v2] mm: vmpressure: fix sending wrong events on underflow

* [PATCH 1/2 v2] mm: vmpressure: fix sending wrong events on underflow
@ 2017-02-09 11:59 ` Vinayak Menon
  0 siblings, 0 replies; 14+ messages in thread
From: Vinayak Menon @ 2017-02-09 11:59 UTC (permalink / raw)
  To: akpm, hannes, mgorman, vbabka, mhocko, riel, vdavydov.dev,
	anton.vorontsov, minchan, shashim
  Cc: linux-mm, linux-kernel, Vinayak Menon

At the end of a window period, if the number of reclaimed
pages is greater than the number scanned, an unsigned
underflow can result in a huge pressure value and thus a
critical event. Reclaimed pages can exceed scanned pages
because shrink_node adds reclaimed slab pages to reclaimed
without a corresponding increment to scanned. Minchan Kim
mentioned that this can also happen with a THP, where
scanned is 1 and reclaimed could be 512.

Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
---
v2: Adding a comment and reordering the patches
    as per Michal's suggestion

 mm/vmpressure.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 149fdf6..6063581 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -112,9 +112,16 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
 						    unsigned long reclaimed)
 {
 	unsigned long scale = scanned + reclaimed;
-	unsigned long pressure;
+	unsigned long pressure = 0;
 
 	/*
+	 * reclaimed can be greater than scanned in cases
+	 * like THP, where the scanned is 1 and reclaimed
+	 * could be 512
+	 */
+	if (reclaimed >= scanned)
+		goto out;
+	/*
 	 * We calculate the ratio (in percents) of how many pages were
 	 * scanned vs. reclaimed in a given time frame (window). Note that
 	 * time is in VM reclaimer's "ticks", i.e. number of pages
@@ -124,6 +131,7 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
 	pressure = scale - (reclaimed * scale / scanned);
 	pressure = pressure * 100 / scale;
 
+out:
 	pr_debug("%s: %3lu  (s: %lu  r: %lu)\n", __func__, pressure,
 		 scanned, reclaimed);
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
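
The underflow described in this patch can be reproduced outside the kernel. The following is a minimal userspace sketch, not the kernel code itself: the two function names are hypothetical, and only the arithmetic is copied from the diff above. It contrasts the calculation without the guard against the calculation with the `reclaimed >= scanned` check.

```c
/*
 * Userspace sketch of the arithmetic in vmpressure_calc_level().
 * The function names below are illustrative and do not exist in
 * the kernel.
 */

/* Calculation without the guard, as before the patch. */
static unsigned long pressure_unguarded(unsigned long scanned,
					unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;
	unsigned long pressure;

	/*
	 * When reclaimed > scanned, reclaimed * scale / scanned is
	 * larger than scale, so the unsigned subtraction wraps around.
	 * The caller must not pass scanned == 0 (division by zero).
	 */
	pressure = scale - (reclaimed * scale / scanned);
	pressure = pressure * 100 / scale;

	return pressure;
}

/* Calculation with the guard added by the patch. */
static unsigned long pressure_guarded(unsigned long scanned,
				      unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;
	unsigned long pressure = 0;

	/* Reclaim kept up with or exceeded scanning: report no pressure. */
	if (reclaimed >= scanned)
		goto out;

	pressure = scale - (reclaimed * scale / scanned);
	pressure = pressure * 100 / scale;
out:
	return pressure;
}
```

With the THP case from the changelog (scanned = 1, reclaimed = 512), pressure_unguarded() returns a value far above 100, which would be treated as a critical event, while pressure_guarded() returns 0. For inputs where reclaimed is smaller than scanned, both functions agree.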

* [PATCH 2/2 v5] mm: vmscan: do not pass reclaimed slab to vmpressure
  2017-02-09 11:59 ` Vinayak Menon
@ 2017-02-09 11:59   ` Vinayak Menon
  -1 siblings, 0 replies; 14+ messages in thread
From: Vinayak Menon @ 2017-02-09 11:59 UTC (permalink / raw)
  To: akpm, hannes, mgorman, vbabka, mhocko, riel, vdavydov.dev,
	anton.vorontsov, minchan, shashim
  Cc: linux-mm, linux-kernel, Vinayak Menon

During global reclaim, the nr_reclaimed passed to vmpressure includes the
pages reclaimed from slab, but the corresponding scanned slab pages are
not passed. This skews the vmpressure values. While moving from kernel
version 3.18 to 4.4, a difference was seen in the vmpressure values for
the same workload, resulting in different behaviour of the vmpressure
consumer. One such case is a vmpressure-based lowmemorykiller: the
vmpressure events arrive later and in smaller numbers, so tasks are not
killed at the right time. The following numbers show the impact on reclaim
activity due to the change in behaviour of lowmemorykiller on a 4GB device.
The test launches a number of apps in sequence and repeats it multiple
times.

                      v4.4           v3.18
pgpgin                163016456      145617236
pgpgout               4366220        4188004
workingset_refault    29857868       26781854
workingset_activate   6293946        5634625
pswpin                1327601        1133912
pswpout               3593842        3229602
pgalloc_dma           99520618       94402970
pgalloc_normal        104046854      98124798
pgfree                203772640      192600737
pgmajfault            2126962        1851836
pgsteal_kswapd_dma    19732899       18039462
pgsteal_kswapd_normal 19945336       17977706
pgsteal_direct_dma    206757         131376
pgsteal_direct_normal 236783         138247
pageoutrun            116622         108370
allocstall            7220           4684
compact_stall         931            856

This is a regression introduced by commit 6b4f7799c6a5 ("mm: vmscan:
invoke slab shrinkers from shrink_zone()").

So do not consider reclaimed slab pages in the vmpressure calculation. The
pages reclaimed from slab can be excluded because freeing a page by slab
shrinking depends on each slab's object population, making the cost model
(i.e. scan:free) different from that of LRU. Also, not every shrinker
accounts for the pages it reclaims. Ideally, the pages reclaimed from slab
should be passed to vmpressure, since otherwise higher vmpressure levels
can be triggered even when reclaim is making progress. But accounting only
the reclaimed slab pages without the scanned ones, and adding something
which does not fit into the cost model, just adds noise to the vmpressure
values.

Fixes: 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()")
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Shiraz Hashim <shashim@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
---
v5: Modifying the changelog and reordering the patches
    as per Michal's suggestion

 mm/vmscan.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 947ab6f..8969f8e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2594,16 +2594,23 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 				    sc->nr_scanned - nr_scanned,
 				    node_lru_pages);

+		/*
+		 * Record the subtree's reclaim efficiency. The reclaimed
+		 * pages from slab is excluded here because the corresponding
+		 * scanned pages is not accounted. Moreover, freeing a page
+		 * by slab shrinking depends on each slab's object population,
+		 * making the cost model (i.e. scan:free) different from that
+		 * of LRU.
+		 */
+		vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
+			   sc->nr_scanned - nr_scanned,
+			   sc->nr_reclaimed - nr_reclaimed);
+
 		if (reclaim_state) {
 			sc->nr_reclaimed += reclaim_state->reclaimed_slab;
 			reclaim_state->reclaimed_slab = 0;
 		}

-		/* Record the subtree's reclaim efficiency */
-		vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
-			   sc->nr_scanned - nr_scanned,
-			   sc->nr_reclaimed - nr_reclaimed);
-
 		if (sc->nr_reclaimed - nr_reclaimed)
 			reclaimable = true;

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
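
The effect described in this changelog can be illustrated with a small userspace sketch. It assumes the pressure formula from mm/vmpressure.c as quoted in patch 1/2, and it assumes the level thresholds of 60 (medium) and 95 (critical), which are recalled from mm/vmpressure.c and should be checked against the tree in use. The enum, the function names, and the page counts in the note below are hypothetical and chosen only for illustration.

```c
/*
 * Userspace sketch: how adding slab-reclaimed pages to "reclaimed"
 * without adding anything to "scanned" lowers the computed pressure.
 * All names here are illustrative, not kernel symbols.
 */

enum sketch_level { SKETCH_LOW, SKETCH_MEDIUM, SKETCH_CRITICAL };

/* Pressure in percent, using the guarded formula from patch 1/2. */
static unsigned long sketch_pressure(unsigned long scanned,
				     unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;
	unsigned long pressure = 0;

	if (reclaimed >= scanned)
		return pressure;

	pressure = scale - (reclaimed * scale / scanned);
	pressure = pressure * 100 / scale;

	return pressure;
}

/* Thresholds are assumptions: 60 for medium, 95 for critical. */
static enum sketch_level sketch_level_of(unsigned long pressure)
{
	if (pressure >= 95)
		return SKETCH_CRITICAL;
	else if (pressure >= 60)
		return SKETCH_MEDIUM;
	return SKETCH_LOW;
}
```

As a made-up example, take a window with 1000 LRU pages scanned and 100 LRU pages reclaimed: sketch_pressure(1000, 100) is 90, which maps to the medium level. If 400 pages freed by slab shrinkers are added to reclaimed while scanned stays at 1000, sketch_pressure(1000, 500) is 50, which maps to the low level. A consumer waiting for medium events would then be notified later, or not at all, for the same LRU reclaim efficiency.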

* Re: [PATCH 1/2 v2] mm: vmpressure: fix sending wrong events on underflow
  2017-02-09 11:59 ` Vinayak Menon
@ 2017-02-09 12:10   ` Michal Hocko
  -1 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2017-02-09 12:10 UTC (permalink / raw)
  To: Vinayak Menon
  Cc: akpm, hannes, mgorman, vbabka, riel, vdavydov.dev,
	anton.vorontsov, minchan, shashim, linux-mm, linux-kernel

On Thu 09-02-17 17:29:36, Vinayak Menon wrote:
> At the end of a window period, if the number of reclaimed
> pages is greater than the number scanned, an unsigned
> underflow can result in a huge pressure value and thus a
> critical event. Reclaimed pages can exceed scanned pages
> because shrink_node adds reclaimed slab pages to reclaimed
> without a corresponding increment to scanned. Minchan Kim
> mentioned that this can also happen with a THP, where
> scanned is 1 and reclaimed could be 512.
> 
> Acked-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>

Acked-by: Michal Hocko <mhocko@suse.com>

I would prefer the fixup in vmpressure() as already mentioned but this
should work as well.

> ---
> v2: Adding a comment and reordering the patches
>     as per Michal's suggestion
> 
>  mm/vmpressure.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> index 149fdf6..6063581 100644
> --- a/mm/vmpressure.c
> +++ b/mm/vmpressure.c
> @@ -112,9 +112,16 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
>  						    unsigned long reclaimed)
>  {
>  	unsigned long scale = scanned + reclaimed;
> -	unsigned long pressure;
> +	unsigned long pressure = 0;
>  
>  	/*
> +	 * reclaimed can be greater than scanned in cases
> +	 * like THP, where the scanned is 1 and reclaimed
> +	 * could be 512
> +	 */
> +	if (reclaimed >= scanned)
> +		goto out;
> +	/*
>  	 * We calculate the ratio (in percents) of how many pages were
>  	 * scanned vs. reclaimed in a given time frame (window). Note that
>  	 * time is in VM reclaimer's "ticks", i.e. number of pages
> @@ -124,6 +131,7 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
>  	pressure = scale - (reclaimed * scale / scanned);
>  	pressure = pressure * 100 / scale;
>  
> +out:
>  	pr_debug("%s: %3lu  (s: %lu  r: %lu)\n", __func__, pressure,
>  		 scanned, reclaimed);
>  
> -- 
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
> member of the Code Aurora Forum, hosted by The Linux Foundation
> 

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 2/2 v5] mm: vmscan: do not pass reclaimed slab to vmpressure
  2017-02-09 11:59   ` Vinayak Menon
@ 2017-02-09 12:20     ` Michal Hocko
  -1 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2017-02-09 12:20 UTC (permalink / raw)
  To: Vinayak Menon
  Cc: akpm, hannes, mgorman, vbabka, riel, vdavydov.dev,
	anton.vorontsov, minchan, shashim, linux-mm, linux-kernel

On Thu 09-02-17 17:29:37, Vinayak Menon wrote:
> During global reclaim, the nr_reclaimed passed to vmpressure includes the
> pages reclaimed from slab, but the corresponding scanned slab pages are
> not passed. This skews the vmpressure values. While moving from kernel
> version 3.18 to 4.4, a difference was seen in the vmpressure values for
> the same workload, resulting in different behaviour of the vmpressure
> consumer. One such case is a vmpressure-based lowmemorykiller: the
> vmpressure events arrive later and in smaller numbers, so tasks are not
> killed at the right time. The following numbers show the impact on reclaim
> activity due to the change in behaviour of lowmemorykiller on a 4GB device.
> The test launches a number of apps in sequence and repeats it multiple
> times.

This is a really vague description of your workload and doesn't really
explain why getting critical events later is a bad thing.

> 
>                       v4.4           v3.18
> pgpgin                163016456      145617236
> pgpgout               4366220        4188004
> workingset_refault    29857868       26781854
> workingset_activate   6293946        5634625
> pswpin                1327601        1133912
> pswpout               3593842        3229602
> pgalloc_dma           99520618       94402970
> pgalloc_normal        104046854      98124798
> pgfree                203772640      192600737
> pgmajfault            2126962        1851836
> pgsteal_kswapd_dma    19732899       18039462
> pgsteal_kswapd_normal 19945336       17977706
> pgsteal_direct_dma    206757         131376
> pgsteal_direct_normal 236783         138247
> pageoutrun            116622         108370
> allocstall            7220           4684
> compact_stall         931            856

This is missing any vmpressure event data, so it is not very useful
on its own.

> This is a regression introduced by commit 6b4f7799c6a5 ("mm: vmscan:
> invoke slab shrinkers from shrink_zone()").
> 
> So do not consider reclaimed slab pages in the vmpressure calculation. The
> pages reclaimed from slab can be excluded because freeing a page by slab
> shrinking depends on each slab's object population, making the cost model
> (i.e. scan:free) different from that of LRU. Also, not every shrinker
> accounts for the pages it reclaims. Ideally, the pages reclaimed from slab
> should be passed to vmpressure, since otherwise higher vmpressure levels
> can be triggered even when reclaim is making progress. But accounting only
> the reclaimed slab pages without the scanned ones, and adding something
> which does not fit into the cost model, just adds noise to the vmpressure
> values.
> 
> Fixes: 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()")
> Acked-by: Minchan Kim <minchan@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
> Cc: Shiraz Hashim <shashim@codeaurora.org>
> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>

I have already said I will _not_ NAK the patch, but before this can be
merged we need a much better description and a justification of why the
older behavior was better, so that this can be considered a regression. It
is hard to expect that the underlying implementation of vmpressure will
stay carved in stone, and there might be changes in this area in the
future. I want to hear why we believe that the tested workload is
sufficiently universal and that we won't see another report in a few months
because somebody else sees higher vmpressure levels even though we make
reclaim progress. I have asked those questions already but it seems they
were ignored.
-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 1/2 v2] mm: vmpressure: fix sending wrong events on underflow
  2017-02-09 12:10   ` Michal Hocko
@ 2017-02-09 12:22     ` Michal Hocko
  -1 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2017-02-09 12:22 UTC (permalink / raw)
  To: Vinayak Menon
  Cc: akpm, hannes, mgorman, vbabka, riel, vdavydov.dev,
	anton.vorontsov, minchan, shashim, linux-mm, linux-kernel

On Thu 09-02-17 13:10:57, Michal Hocko wrote:
> On Thu 09-02-17 17:29:36, Vinayak Menon wrote:
> > At the end of a window period, if the number of reclaimed
> > pages is greater than the number scanned, an unsigned
> > underflow can result in a huge pressure value and thus a
> > critical event. Reclaimed pages can exceed scanned pages
> > because shrink_node adds reclaimed slab pages to reclaimed
> > without a corresponding increment to scanned. Minchan Kim
> > mentioned that this can also happen with a THP, where
> > scanned is 1 and reclaimed could be 512.
> > 
> > Acked-by: Minchan Kim <minchan@kernel.org>
> > Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> I would prefer the fixup in vmpressure() as already mentioned but this
> should work as well.

Btw. I guess this should be good to mark for stable. Reclaiming THP is
not all that rare (even though we try to avoid anon reclaim as much as
possible) and hitting critical events can lead to disruptive actions too
early.
-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 2/2 v5] mm: vmscan: do not pass reclaimed slab to vmpressure
  2017-02-09 12:20     ` Michal Hocko
@ 2017-02-10  8:45       ` vinayak menon
  -1 siblings, 0 replies; 14+ messages in thread
From: vinayak menon @ 2017-02-10  8:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vinayak Menon, Andrew Morton, Johannes Weiner, mgorman, vbabka,
	Rik van Riel, vdavydov.dev, anton.vorontsov, Minchan Kim,
	shashim, linux-mm, linux-kernel

On Thu, Feb 9, 2017 at 5:50 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 09-02-17 17:29:37, Vinayak Menon wrote:
>> During global reclaim, the nr_reclaimed passed to vmpressure includes the
>> pages reclaimed from slab.  But the corresponding scanned slab pages is
>> not passed. There is an impact to the vmpressure values because of this.
>> While moving from kernel version 3.18 to 4.4, a difference is seen
>> in the vmpressure values for the same workload resulting in a different
>> behaviour of the vmpressure consumer. One such case is of a vmpressure
>> based lowmemorykiller. It is observed that the vmpressure events are
>> received late and less in number resulting in tasks not being killed at
>> the right time. The following numbers show the impact on reclaim activity
>> due to the change in behaviour of lowmemorykiller on a 4GB device. The test
>> launches a number of apps in sequence and repeats it multiple times.
>
> this is really vague description of your workload and doesn't really
> explain why getting critical events later is a bad thing.
Ok. I will add that.

>
>>
>>                       v4.4           v3.18
>> pgpgin                163016456      145617236
>> pgpgout               4366220        4188004
>> workingset_refault    29857868       26781854
>> workingset_activate   6293946        5634625
>> pswpin                1327601        1133912
>> pswpout               3593842        3229602
>> pgalloc_dma           99520618       94402970
>> pgalloc_normal        104046854      98124798
>> pgfree                203772640      192600737
>> pgmajfault            2126962        1851836
>> pgsteal_kswapd_dma    19732899       18039462
>> pgsteal_kswapd_normal 19945336       17977706
>> pgsteal_direct_dma    206757         131376
>> pgsteal_direct_normal 236783         138247
>> pageoutrun            116622         108370
>> allocstall            7220           4684
>> compact_stall         931            856
>
> this is missing any vmpressure events data and so it is not very useful
> on its own
Done.

>
>> This is a regression introduced by commit 6b4f7799c6a5 ("mm: vmscan:
>> invoke slab shrinkers from shrink_zone()").
>>
>> So do not consider reclaimed slab pages for vmpressure calculation. The
>> reclaimed pages from slab can be excluded because the freeing of a page
>> by slab shrinking depends on each slab's object population, making the
>> cost model (i.e. scan:free) different from that of LRU.  Also, not every
>> shrinker accounts the pages it reclaims. But ideally the pages reclaimed
>> from slab should be passed to vmpressure, otherwise higher vmpressure
>> levels can be triggered even when there is a reclaim progress. But
>> accounting only the reclaimed slab pages without the scanned, and adding
>> something which does not fit into the cost model just adds noise to the
>> vmpressure values.
>>
>> Fixes: 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()")
>> Acked-by: Minchan Kim <minchan@kernel.org>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
>> Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
>> Cc: Shiraz Hashim <shashim@codeaurora.org>
>> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
>
> I have already said I will _not_ NAK the patch but we need a much better
> description and justification why the older behavior was better to
> consider this a regression before this can be merged. It is hard to
> expect that the underlying implementation of the vmpressure will stay
> carved in stone and there might be changes in this area in the future. I
> want to hear why we believe that the tested workload is sufficiently
> universal and we won't see another report in few months because somebody
> else will see higher vmpressure levels even though we make reclaim
> progress. I have asked those questions already but it seems those were
> ignored.

The tested workload is not universal. The lowmemorykiller example was
used just to show the effect of the vmpressure change on one workload.
I can drop the reclaim stats and keep only the stats on the change
observed in vmpressure critical events. I am not sure whether we would
see another issue reported with this patch. We may, because someone
could have written code that works with the new vmpressure values. I
am not sure whether that matters, because the core issue is whether
the kernel is reporting the right values. This could be termed a
regression because:

1) Accounting only reclaimed pages in a model which works on scanned
and reclaimed seems wrong. It just adds noise. There could be issues
with the vmpressure implementation, but it at least gives an estimate
of the pressure on the LRU. There are many other shrinkers, like
zsmalloc, which do not report reclaimed pages, and if we add those in
a similar fashion without considering the cost part, vmpressure values
would always remain low. So until we have a way to give correct
information to vmpressure about non-LRU reclaimers, I feel it's better
to keep it in its original form.

2) As Minchan mentioned, the cost model is different, and thus adding
slab reclaimed would not be the right thing to do at this point.
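
To make point 1) concrete, here is a rough user-space sketch of how
adding slab-reclaimed pages, with no matching scanned count, pulls the
computed pressure down. The formula is assumed to follow
mm/vmpressure.c; struct window_sample and the helper names are
hypothetical, and the numbers in the note after the code are made up
purely for illustration. Callers are assumed to pass lru_scanned > 0.

```c
/* Percent pressure for one window, using the formula assumed from
 * mm/vmpressure.c (including the reclaimed >= scanned guard). */
static unsigned long window_pressure(unsigned long scanned,
				     unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;

	if (reclaimed >= scanned)
		return 0;
	return (scale - (reclaimed * scale / scanned)) * 100 / scale;
}

/* Hypothetical per-window counters, for illustration only. */
struct window_sample {
	unsigned long lru_scanned;
	unsigned long lru_reclaimed;
	unsigned long slab_reclaimed;	/* has no scanned counterpart */
};

/* Pressure when only LRU reclaim is reported. */
static unsigned long pressure_lru_only(const struct window_sample *w)
{
	return window_pressure(w->lru_scanned, w->lru_reclaimed);
}

/* Pressure when slab-reclaimed pages are added to reclaimed while
 * scanned is left unchanged. */
static unsigned long pressure_with_slab(const struct window_sample *w)
{
	return window_pressure(w->lru_scanned,
			       w->lru_reclaimed + w->slab_reclaimed);
}
```

For a sample of { 1000, 200, 400 }, pressure_lru_only() gives 80
(medium under the assumed 60/95 thresholds) and pressure_with_slab()
gives 40 (low), although the LRU scan:reclaim ratio is the same in both
cases.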

But if you feel we don't have to fix this now and that it is better to
fix the core problems with vmpressure first, that's ok. Anyway, I will
send a patch with a new changelog with vmpressure event details.
Thanks, Michal, for your comments.

Vinayak

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2 v5] mm: vmscan: do not pass reclaimed slab to vmpressure
  2017-02-10  8:45       ` vinayak menon
@ 2017-02-10  9:05         ` Michal Hocko
  -1 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2017-02-10  9:05 UTC (permalink / raw)
  To: vinayak menon
  Cc: Vinayak Menon, Andrew Morton, Johannes Weiner, mgorman, vbabka,
	Rik van Riel, vdavydov.dev, anton.vorontsov, Minchan Kim,
	shashim, linux-mm, linux-kernel

On Fri 10-02-17 14:15:20, vinayak menon wrote:
> On Thu, Feb 9, 2017 at 5:50 PM, Michal Hocko <mhocko@kernel.org> wrote:
[...]
> > I have already said I will _not_ NAK the patch but we need a much better
> > description and justification why the older behavior was better to
> > consider this a regression before this can be merged. It is hard to
> > expect that the underlying implementation of the vmpressure will stay
> > carved in stone and there might be changes in this area in the future. I
> > want to hear why we believe that the tested workload is sufficiently
> > universal and we won't see another report in few months because somebody
> > else will see higher vmpressure levels even though we make reclaim
> > progress. I have asked those questions already but it seems those were
> > ignored.
> 
> The tested workload is not universal. The lowmemorykiller example was used just
> to mention the effect of vmpressure change on one of the workloads. 

My point is whether this workload even matters. AFAIU the test benefits
from killing as quickly as possible, right? So it directly benefits from
seeing critical events as soon as possible even when the reclaim makes
progress.

> I can drop the reclaim stats and just keep the stats of change
> observed in vmpressure critical events.  I am not sure whether we
> would see another issue reported with this patch. We may, because
> someone could have written code that works with the new vmpressure
> values. I am not sure whether that matters, because the core issue
> is whether the kernel is reporting the right values.

Right. The right values are a bit fuzzy, though.

> This could be
> termed as a regression because,
> 
> 1) Accounting only reclaimed pages in a model which works on scanned
> and reclaimed seems wrong. It just adds noise. There could be issues
> with the vmpressure implementation, but it at least gives an estimate
> of the pressure on the LRU. There are many other shrinkers, like
> zsmalloc, which do not report reclaimed pages, and if we add those in
> a similar fashion without considering the cost part, vmpressure values
> would always remain low. So until we have a way to give correct
> information to vmpressure about non-LRU reclaimers, I feel it's better
> to keep it in its original form.

Yeah, I understand that the current cost model is far from ideal and it
needs fixing. My main question would be whether the model would be much
better if we exclude pages freed from the slab shrinkers. I can only say
it would be more pessimistic that way. Is this a good thing? If yes, why?
 
> 2) As Minchan mentioned, the cost model is different and thus adding
> slab reclaimed would not be the right thing to do at this point.
> 
> But if you feel we don't have to fix this now and that it is better
> to fix the core problems with vmpressure first, that's ok.

Yes, I believe we should reconsider how we calculate the pressure
levels. This seems a larger project but definitely something we need. I
do not have any good ideas how to do this properly.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread

Thread overview:
2017-02-09 11:59 [PATCH 1/2 v2] mm: vmpressure: fix sending wrong events on underflow Vinayak Menon
2017-02-09 11:59 ` [PATCH 2/2 v5] mm: vmscan: do not pass reclaimed slab to vmpressure Vinayak Menon
2017-02-09 12:20   ` Michal Hocko
2017-02-10  8:45     ` vinayak menon
2017-02-10  9:05       ` Michal Hocko
2017-02-09 12:10 ` [PATCH 1/2 v2] mm: vmpressure: fix sending wrong events on underflow Michal Hocko
2017-02-09 12:22   ` Michal Hocko
