* [PATCH v2 1/2] vmscan: avoid subtraction of unsigned values
@ 2010-07-08  7:38 ` KOSAKI Motohiro
  0 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-08  7:38 UTC (permalink / raw)
  To: LKML, linux-mm, Andrew Morton, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner
  Cc: kosaki.motohiro

'slab_reclaimable' and 'nr_pages' are unsigned, so subtracting them is
unsafe: the result can underflow.
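
For illustration, here is a minimal userspace sketch (not part of the
patch; the numbers are made up) of the wraparound that the old loop
condition can hit:

#include <stdio.h>

int main(void)
{
	/* hypothetical: caller wants more pages than slab currently holds */
	unsigned long slab_reclaimable = 10;
	unsigned long nr_pages = 32;

	/*
	 * Unsigned subtraction wraps instead of going negative: on a
	 * 64-bit machine this prints 18446744073709551594 (ULONG_MAX - 21),
	 * so a test like
	 *	NR_SLAB_RECLAIMABLE > slab_reclaimable - nr_pages
	 * is almost never true and the shrink loop stops after a single
	 * shrink_slab() call, even though more slab should be reclaimed.
	 */
	printf("%lu\n", slab_reclaimable - nr_pages);
	return 0;
}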

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
---
 mm/vmscan.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9c7e57c..8715da1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2588,7 +2588,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		.swappiness = vm_swappiness,
 		.order = order,
 	};
-	unsigned long slab_reclaimable;
+	unsigned long n, m;
 
 	disable_swap_token();
 	cond_resched();
@@ -2615,8 +2615,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		} while (priority >= 0 && sc.nr_reclaimed < nr_pages);
 	}
 
-	slab_reclaimable = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
-	if (slab_reclaimable > zone->min_slab_pages) {
+	n = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
+	if (n > zone->min_slab_pages) {
 		/*
 		 * shrink_slab() does not currently allow us to determine how
 		 * many pages were freed in this zone. So we take the current
@@ -2628,16 +2628,16 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		 * take a long time.
 		 */
 		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
-			zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
-				slab_reclaimable - nr_pages)
+		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > n))
 			;
 
 		/*
 		 * Update nr_reclaimed by the number of slab pages we
 		 * reclaimed from this zone.
 		 */
-		sc.nr_reclaimed += slab_reclaimable -
-			zone_page_state(zone, NR_SLAB_RECLAIMABLE);
+		m = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
+		if (m < n)
+			sc.nr_reclaimed += n - m;
 	}
 
 	p->reclaim_state = NULL;
-- 
1.6.5.2




^ permalink raw reply related	[flat|nested] 62+ messages in thread


* [PATCH v2 2/2] vmscan: shrink_slab() requires the number of lru_pages, not the page order
  2010-07-08  7:38 ` KOSAKI Motohiro
@ 2010-07-08  7:40   ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-08  7:40 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Mel Gorman,
	Rik van Riel, Minchan Kim, Johannes Weiner

Fix a simple argument error: 'order' is usually a much smaller value
than lru_pages, so passing it causes unnecessary icache dropping.
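
To see the magnitude, here is a toy model (not the kernel's exact
formula; the stub and all numbers below are made up) of why the mix-up
compiles silently but misbehaves: both values are plain unsigned long,
and the third argument of shrink_slab() acts as a divisor on the slab
scan target:

#include <stdio.h>

/* stand-in with the same argument shape as shrink_slab()'s divisor */
static unsigned long shrink_slab_stub(unsigned long scanned,
				      unsigned long lru_pages)
{
	unsigned long nr_objects = 56100;	/* e.g. unused inodes */

	/* lru_pages divides: a tiny value yields a huge scan target */
	return scanned * nr_objects / (lru_pages + 1);
}

int main(void)
{
	unsigned long order = 0;		/* allocation order */
	unsigned long lru_pages = 1048576;	/* reclaimable LRU pages */

	printf("buggy (order):     %lu\n", shrink_slab_stub(32, order));
	printf("fixed (lru_pages): %lu\n", shrink_slab_stub(32, lru_pages));
	return 0;
}

Here the buggy call asks to scan 1795200 objects, the fixed one just 1.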

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/vmscan.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8715da1..60d813b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2617,6 +2617,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 
 	n = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
 	if (n > zone->min_slab_pages) {
+		unsigned long lru_pages = zone_reclaimable_pages(zone);
+
 		/*
 		 * shrink_slab() does not currently allow us to determine how
 		 * many pages were freed in this zone. So we take the current
@@ -2627,7 +2629,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		 * Note that shrink_slab will free memory on all zones and may
 		 * take a long time.
 		 */
-		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
+		while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) &&
 		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > n))
 			;
 
-- 
1.6.5.2




^ permalink raw reply related	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 1/2] vmscan: avoid subtraction of unsigned values
  2010-07-08  7:38 ` KOSAKI Motohiro
@ 2010-07-08  7:41   ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-08  7:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Mel Gorman,
	Rik van Riel, Minchan Kim, Johannes Weiner

Oops, sorry. I forgot to cc Christoph.
Resending.


> 'slab_reclaimable' and 'nr_pages' are unsigned, so subtracting them is
> unsafe: the result can underflow.
> 
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Acked-by: Christoph Lameter <cl@linux-foundation.org>
> ---
>  mm/vmscan.c |   14 +++++++-------
>  1 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 9c7e57c..8715da1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2588,7 +2588,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		.swappiness = vm_swappiness,
>  		.order = order,
>  	};
> -	unsigned long slab_reclaimable;
> +	unsigned long n, m;
>  
>  	disable_swap_token();
>  	cond_resched();
> @@ -2615,8 +2615,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		} while (priority >= 0 && sc.nr_reclaimed < nr_pages);
>  	}
>  
> -	slab_reclaimable = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> -	if (slab_reclaimable > zone->min_slab_pages) {
> +	n = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +	if (n > zone->min_slab_pages) {
>  		/*
>  		 * shrink_slab() does not currently allow us to determine how
>  		 * many pages were freed in this zone. So we take the current
> @@ -2628,16 +2628,16 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		 * take a long time.
>  		 */
>  		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
> -			zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
> -				slab_reclaimable - nr_pages)
> +		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > n))
>  			;
>  
>  		/*
>  		 * Update nr_reclaimed by the number of slab pages we
>  		 * reclaimed from this zone.
>  		 */
> -		sc.nr_reclaimed += slab_reclaimable -
> -			zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +		m = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +		if (m < n)
> +			sc.nr_reclaimed += n - m;
>  	}
>  
>  	p->reclaim_state = NULL;
> -- 
> 1.6.5.2
> 
> 
> 




^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() requires the number of lru_pages, not the page order
  2010-07-08  7:40   ` KOSAKI Motohiro
@ 2010-07-08 13:23     ` Rik van Riel
  -1 siblings, 0 replies; 62+ messages in thread
From: Rik van Riel @ 2010-07-08 13:23 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Christoph Lameter, LKML, linux-mm, Andrew Morton, Mel Gorman,
	Minchan Kim, Johannes Weiner

On 07/08/2010 03:40 AM, KOSAKI Motohiro wrote:
> Fix a simple argument error: 'order' is usually a much smaller value
> than lru_pages, so passing it causes unnecessary icache dropping.
>
> Signed-off-by: KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com>

Acked-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 1/2] vmscan: avoid subtraction of unsigned values
  2010-07-08  7:41   ` KOSAKI Motohiro
@ 2010-07-08 14:01     ` Christoph Lameter
  -1 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2010-07-08 14:01 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, linux-mm, Andrew Morton, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner


Reviewed-by: Christoph Lameter <cl@linux-foundation.org>



^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() requires the number of lru_pages, not the page order
  2010-07-08  7:40   ` KOSAKI Motohiro
@ 2010-07-08 14:04     ` Christoph Lameter
  -1 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2010-07-08 14:04 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, linux-mm, Andrew Morton, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner

On Thu, 8 Jul 2010, KOSAKI Motohiro wrote:

> Fix a simple argument error: 'order' is usually a much smaller value
> than lru_pages, so passing it causes unnecessary icache dropping.

AFAICT this is not an argument error; rather, someone changed the naming
of the parameter. The "lru_pages" parameter is really a division factor
affecting the number of pages scanned. This patch increases this division
factor significantly and therefore reduces the number of items scanned
during zone_reclaim.



^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 1/2] vmscan: avoid subtraction of unsigned values
  2010-07-08  7:38 ` KOSAKI Motohiro
@ 2010-07-08 20:00   ` Andrew Morton
  -1 siblings, 0 replies; 62+ messages in thread
From: Andrew Morton @ 2010-07-08 20:00 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, linux-mm, Mel Gorman, Rik van Riel, Minchan Kim, Johannes Weiner

On Thu,  8 Jul 2010 16:38:10 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> 'slab_reclaimable' and 'nr_pages' are unsigned, so subtracting them is
> unsafe: the result can underflow.
> 
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Acked-by: Christoph Lameter <cl@linux-foundation.org>
> ---
>  mm/vmscan.c |   14 +++++++-------
>  1 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 9c7e57c..8715da1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2588,7 +2588,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		.swappiness = vm_swappiness,
>  		.order = order,
>  	};
> -	unsigned long slab_reclaimable;
> +	unsigned long n, m;

Please use better identifiers.

>  	disable_swap_token();
>  	cond_resched();
> @@ -2615,8 +2615,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		} while (priority >= 0 && sc.nr_reclaimed < nr_pages);
>  	}
>  
> -	slab_reclaimable = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> -	if (slab_reclaimable > zone->min_slab_pages) {
> +	n = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +	if (n > zone->min_slab_pages) {
>  		/*
>  		 * shrink_slab() does not currently allow us to determine how
>  		 * many pages were freed in this zone. So we take the current
> @@ -2628,16 +2628,16 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		 * take a long time.
>  		 */
>  		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
> -			zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
> -				slab_reclaimable - nr_pages)
> +		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > n))
>  			;
>  
>  		/*
>  		 * Update nr_reclaimed by the number of slab pages we
>  		 * reclaimed from this zone.
>  		 */
> -		sc.nr_reclaimed += slab_reclaimable -
> -			zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +		m = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +		if (m < n)
> +			sc.nr_reclaimed += n - m;

And it's not a completely trivial objection.  Your patch made the above
code snippet quite a lot harder to read (and hence harder to maintain).


^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() requires the number of lru_pages, not the page order
  2010-07-08 14:04     ` Christoph Lameter
@ 2010-07-08 20:31       ` Andrew Morton
  -1 siblings, 0 replies; 62+ messages in thread
From: Andrew Morton @ 2010-07-08 20:31 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: KOSAKI Motohiro, LKML, linux-mm, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner

On Thu, 8 Jul 2010 09:04:18 -0500 (CDT)
Christoph Lameter <cl@linux-foundation.org> wrote:

> On Thu, 8 Jul 2010, KOSAKI Motohiro wrote:
> 
> > Fix a simple argument error: 'order' is usually a much smaller value
> > than lru_pages, so passing it causes unnecessary icache dropping.
> 
> AFAICT this is not an argument error; rather, someone changed the naming
> of the parameter.

It's been there since day zero:

: commit 2a16e3f4b0c408b9e50297d2ec27e295d490267a
: Author:     Christoph Lameter <clameter@engr.sgi.com>
: AuthorDate: Wed Feb 1 03:05:35 2006 -0800
: Commit:     Linus Torvalds <torvalds@g5.osdl.org>
: CommitDate: Wed Feb 1 08:53:16 2006 -0800
: 
:     [PATCH] Reclaim slab during zone reclaim
:     
:     If large amounts of zone memory are used by empty slabs then zone_reclaim
:     becomes uneffective.  This patch shakes the slab a bit.
:     
:     The problem with this patch is that the slab reclaim is not containable to a
:     zone.  Thus slab reclaim may affect the whole system and be extremely slow.
:     This also means that we cannot determine how many pages were freed in this
:     zone.  Thus we need to go off node for at least one allocation.
:     
:     The functionality is disabled by default.
:     
:     We could modify the shrinkers to take a zone parameter but that would be quite
:     invasive.  Better ideas are welcome.
:     
:     Signed-off-by: Christoph Lameter <clameter@sgi.com>
:     Signed-off-by: Andrew Morton <akpm@osdl.org>
:     Signed-off-by: Linus Torvalds <torvalds@osdl.org>
: 
: diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
: index 4bca2a3..a46c10f 100644
: --- a/Documentation/sysctl/vm.txt
: +++ b/Documentation/sysctl/vm.txt
: @@ -137,6 +137,7 @@ This is value ORed together of
:  1	= Zone reclaim on
:  2	= Zone reclaim writes dirty pages out
:  4	= Zone reclaim swaps pages
: +8	= Also do a global slab reclaim pass
:  
:  zone_reclaim_mode is set during bootup to 1 if it is determined that pages
:  from remote zones will cause a measurable performance reduction. The
: @@ -160,6 +161,11 @@ Allowing regular swap effectively restricts allocations to the local
:  node unless explicitly overridden by memory policies or cpuset
:  configurations.
:  
: +It may be advisable to allow slab reclaim if the system makes heavy
: +use of files and builds up large slab caches. However, the slab
: +shrink operation is global, may take a long time and free slabs
: +in all nodes of the system.
: +
:  ================================================================
:  
:  zone_reclaim_interval:
: diff --git a/mm/vmscan.c b/mm/vmscan.c
: index 9e2ef36..aa4b80d 100644
: --- a/mm/vmscan.c
: +++ b/mm/vmscan.c
: @@ -1596,6 +1596,7 @@ int zone_reclaim_mode __read_mostly;
:  #define RECLAIM_ZONE (1<<0)	/* Run shrink_cache on the zone */
:  #define RECLAIM_WRITE (1<<1)	/* Writeout pages during reclaim */
:  #define RECLAIM_SWAP (1<<2)	/* Swap pages out during reclaim */
: +#define RECLAIM_SLAB (1<<3)	/* Do a global slab shrink if the zone is out of memory */
:  
:  /*
:   * Mininum time between zone reclaim scans
: @@ -1666,6 +1667,19 @@ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
:  
:  	} while (sc.nr_reclaimed < nr_pages && sc.priority > 0);
:  
: +	if (sc.nr_reclaimed < nr_pages && (zone_reclaim_mode & RECLAIM_SLAB)) {
: +		/*
: +		 * shrink_slab does not currently allow us to determine
: +		 * how many pages were freed in the zone. So we just
: +		 * shake the slab and then go offnode for a single allocation.
: +		 *
: +		 * shrink_slab will free memory on all zones and may take
: +		 * a long time.
: +		 */
: +		shrink_slab(sc.nr_scanned, gfp_mask, order);
: +		sc.nr_reclaimed = 1;    /* Avoid getting the off node timeout */
: +	}
: +
:  	p->reclaim_state = NULL;
:  	current->flags &= ~PF_MEMALLOC;

> The "lru_pages" parameter is really a division factor affecting
> the number of pages scanned. This patch increases this division factor
> significantly and therefore reduces the number of items scanned during
> zone_reclaim.
> 

And for that reason I won't apply the patch.  I'd be crazy to do so. 
It tosses away four years of testing, replacing it with something which
could have a large effect on reclaim behaviour, with no indication
whether that effect is good or bad.


^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() requires the number of lru_pages, not the page order
  2010-07-08 20:31       ` Andrew Morton
@ 2010-07-08 21:01         ` Christoph Lameter
  -1 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2010-07-08 21:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KOSAKI Motohiro, LKML, linux-mm, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner

On Thu, 8 Jul 2010, Andrew Morton wrote:

> > AFAICT this is not an argument error; rather, someone changed the naming
> > of the parameter.
>
> It's been there since day zero:
>
> : commit 2a16e3f4b0c408b9e50297d2ec27e295d490267a
> : Author:     Christoph Lameter <clameter@engr.sgi.com>
> : AuthorDate: Wed Feb 1 03:05:35 2006 -0800
> : Commit:     Linus Torvalds <torvalds@g5.osdl.org>
> : CommitDate: Wed Feb 1 08:53:16 2006 -0800
> :
> :     [PATCH] Reclaim slab during zone reclaim

That only shows that the order parameter was passed to shrink_slab(),
which is what I remember being done intentionally.

I vaguely recall that this was necessary to avoid shrink_slab() throwing
out too many pages for higher-order allocs.

Initially zone_reclaim was full of heuristics that were later replaced by
counters as the new ZVCs became available and gradually better ways of
accounting for pages became possible.




^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() requires the number of lru_pages, not the page order
  2010-07-08 21:01         ` Christoph Lameter
@ 2010-07-09  0:46           ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-09  0:46 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: kosaki.motohiro, Andrew Morton, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Minchan Kim, Johannes Weiner

> On Thu, 8 Jul 2010, Andrew Morton wrote:
> 
> > > AFAICT this is not an argument error; rather, someone changed the naming
> > > of the parameter.
> >
> > It's been there since day zero:
> >
> > : commit 2a16e3f4b0c408b9e50297d2ec27e295d490267a
> > : Author:     Christoph Lameter <clameter@engr.sgi.com>
> > : AuthorDate: Wed Feb 1 03:05:35 2006 -0800
> > : Commit:     Linus Torvalds <torvalds@g5.osdl.org>
> > : CommitDate: Wed Feb 1 08:53:16 2006 -0800
> > :
> > :     [PATCH] Reclaim slab during zone reclaim
> 
> That only shows that the order parameter was passed to shrink_slab(),
> which is what I remember being done intentionally.
> 
> I vaguely recall that this was necessary to avoid shrink_slab() throwing
> out too many pages for higher-order allocs.

But it has the opposite effect. The number of objects scanned by shrink_slab() is:

                          lru_scanned        max_pass
basic_scan_objects = 4 x -------------  x -----------------------------
                          lru_pages        shrinker->seeks (default:2)

scan_objects = min(basic_scan_objects, max_pass * 2)


That is, a small lru_pages parameter causes far too much slab dropping.
In practice, zone reclaim always takes max_pass*2. For inodes,
shrink_icache_memory() uses the number of unused inodes as max_pass,
which means a single shrink_slab() call drops the whole icache. Is this
optimal behavior? Why?

Am I missing something?
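
For reference, my reading of that math in plain C (a sketch assuming the
2.6.35-era shrink_slab() formula; names and numbers are illustrative):

#include <stdio.h>

static unsigned long scan_objects(unsigned long lru_scanned,
				  unsigned long lru_pages,
				  unsigned long max_pass,
				  unsigned long seeks)
{
	unsigned long long basic;

	basic = 4ULL * lru_scanned / seeks;
	basic = basic * max_pass / (lru_pages + 1);

	/* shrink_slab() caps the scan target at max_pass * 2 */
	if (basic > 2ULL * max_pass)
		basic = 2ULL * max_pass;
	return (unsigned long)basic;
}

int main(void)
{
	/* zone_reclaim passes order (usually 0) where lru_pages belongs */
	printf("lru_pages = 0:       %lu\n", scan_objects(32, 0, 56100, 2));
	printf("lru_pages = 1000000: %lu\n",
	       scan_objects(32, 1000000, 56100, 2));
	return 0;
}

With order passed in, the max_pass*2 cap is always hit (112200 here,
i.e. every unused inode is eligible in one call); with a realistic
lru_pages the scan target collapses to 3 objects.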

> Initially zone_reclaim was full of heuristics that later were replaced by
> counter as the new ZVCs became available and gradually better ways of
> accounting for pages became possible.



^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 1/2] vmscan: avoid subtraction of unsigned values
  2010-07-08 20:00   ` Andrew Morton
@ 2010-07-09  1:16     ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-09  1:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kosaki.motohiro, LKML, linux-mm, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner


> > @@ -2628,16 +2628,16 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> >  		 * take a long time.
> >  		 */
> >  		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
> > -			zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
> > -				slab_reclaimable - nr_pages)
> > +		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > n))
> >  			;
> >  
> >  		/*
> >  		 * Update nr_reclaimed by the number of slab pages we
> >  		 * reclaimed from this zone.
> >  		 */
> > -		sc.nr_reclaimed += slab_reclaimable -
> > -			zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> > +		m = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> > +		if (m < n)
> > +			sc.nr_reclaimed += n - m;
> 
> And it's not a completely trivial objection.  Your patch made the above
> code snippet quite a lot harder to read (and hence harder to maintain).

Initially, I proposed the following patch to Christoph, but he preferred
n and m. To be honest, I don't think the naming matters much, so if you
prefer the following, I'll submit it.




=====================================================================
From 397199d69860061eaa5e1aaadac45c46c76b0522 Mon Sep 17 00:00:00 2001
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Wed, 30 Jun 2010 13:35:16 +0900
Subject: [PATCH] vmscan: avoid subtraction of unsigned values

'slab_reclaimable' and 'nr_pages' are unsigned, so subtracting them is
unsafe: the result can underflow.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/vmscan.c |   15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9c7e57c..79ff877 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2588,7 +2588,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		.swappiness = vm_swappiness,
 		.order = order,
 	};
-	unsigned long slab_reclaimable;
+	unsigned long nr_slab_pages0, nr_slab_pages1;
 
 	disable_swap_token();
 	cond_resched();
@@ -2615,8 +2615,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		} while (priority >= 0 && sc.nr_reclaimed < nr_pages);
 	}
 
-	slab_reclaimable = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
-	if (slab_reclaimable > zone->min_slab_pages) {
+	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
+	if (nr_slab_pages0 > zone->min_slab_pages) {
 		/*
 		 * shrink_slab() does not currently allow us to determine how
 		 * many pages were freed in this zone. So we take the current
@@ -2628,16 +2628,17 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		 * take a long time.
 		 */
 		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
-			zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
-				slab_reclaimable - nr_pages)
+		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
+				nr_slab_pages0))
 			;
 
 		/*
 		 * Update nr_reclaimed by the number of slab pages we
 		 * reclaimed from this zone.
 		 */
-		sc.nr_reclaimed += slab_reclaimable -
-			zone_page_state(zone, NR_SLAB_RECLAIMABLE);
+		nr_slab_pages1 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
+		if (nr_slab_pages1 < nr_slab_pages0)
+			sc.nr_reclaimed += nr_slab_pages0 - nr_slab_pages1;
 	}
 
 	p->reclaim_state = NULL;
-- 
1.6.5.2








^ permalink raw reply related	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 1/2] vmscan: avoid subtraction of unsigned values
  2010-07-09  1:16     ` KOSAKI Motohiro
@ 2010-07-09  1:46       ` Minchan Kim
  -1 siblings, 0 replies; 62+ messages in thread
From: Minchan Kim @ 2010-07-09  1:46 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, LKML, linux-mm, Mel Gorman, Rik van Riel, Johannes Weiner

On Fri, Jul 9, 2010 at 10:16 AM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>
>> > @@ -2628,16 +2628,16 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>> >              * take a long time.
>> >              */
>> >             while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
>> > -                   zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
>> > -                           slab_reclaimable - nr_pages)
>> > +                  (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > n))
>> >                     ;
>> >
>> >             /*
>> >              * Update nr_reclaimed by the number of slab pages we
>> >              * reclaimed from this zone.
>> >              */
>> > -           sc.nr_reclaimed += slab_reclaimable -
>> > -                   zone_page_state(zone, NR_SLAB_RECLAIMABLE);
>> > +           m = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
>> > +           if (m < n)
>> > +                   sc.nr_reclaimed += n - m;
>>
>> And it's not a completely trivial objection.  Your patch made the above
>> code snippet quite a lot harder to read (and hence harder to maintain).
>
> Initially, I proposed the following patch to Christoph, but he preferred
> n and m. To be honest, I don't think the naming matters much, so if you
> prefer the following, I'll submit it.
>
>
>
>
> =====================================================================
> From 397199d69860061eaa5e1aaadac45c46c76b0522 Mon Sep 17 00:00:00 2001
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Wed, 30 Jun 2010 13:35:16 +0900
> Subject: [PATCH] vmscan: avoid subtraction of unsigned values
>
> 'slab_reclaimable' and 'nr_pages' are unsigned, so subtracting them is
> unsafe: the result can underflow.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

I like this better than n, m.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() requires the number of lru_pages, not the page order
  2010-07-08 20:31       ` Andrew Morton
@ 2010-07-09  8:21         ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-09  8:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kosaki.motohiro, Christoph Lameter, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Minchan Kim, Johannes Weiner

> > The "lru_pages" parameter is really a division factor affecting
> > the number of pages scanned. This patch increases this division factor
> > significantly and therefore reduces the number of items scanned during
> > zone_reclaim.
> > 
> 
> And for that reason I won't apply the patch.  I'd be crazy to do so. 
> It tosses away four years of testing, replacing it with something which
> could have a large effect on reclaim behaviour, with no indication
> whether that effect is good or bad.

Unfortunately, the current code is even crazier; it is close to worst-case
behavior. I've applied the attached debugging patch and got the following
result.

That is, even really low memory pressure (scan = 32, order = 0) drops _all_
icache and dcache (about 100MB!). I don't believe anyone has tested slab
behavior with zone_reclaim=1 in these four years.

Please remember the shrink_slab() scanning equation: order=0 always drops
all slab!

BTW, the current shrink_slab() doesn't exit even if (*shrinker->shrink)(0)
returns 0, which is an inefficient and meaningless loop iteration. I'll
make a fix.


           <...>-2677  [001]   840.832711: shrink_slab: shrink_icache_memory before=56100 after=56000
           <...>-2677  [001]   840.832898: shrink_slab: shrink_icache_memory before=56000 after=55900
           <...>-2677  [001]   840.833081: shrink_slab: shrink_icache_memory before=55900 after=55800
           <...>-2677  [001]   840.833693: shrink_slab: shrink_icache_memory before=55800 after=55600
           <...>-2677  [001]   840.833860: shrink_slab: shrink_icache_memory before=55600 after=55500
           <...>-2677  [001]   840.834033: shrink_slab: shrink_icache_memory before=55500 after=55400
           <...>-2677  [001]   840.834201: shrink_slab: shrink_icache_memory before=55400 after=55200
           <...>-2677  [001]   840.834385: shrink_slab: shrink_icache_memory before=55200 after=55100
           <...>-2677  [001]   840.834553: shrink_slab: shrink_icache_memory before=55100 after=55000
           <...>-2677  [001]   840.834720: shrink_slab: shrink_icache_memory before=55000 after=54900
           <...>-2677  [001]   840.834889: shrink_slab: shrink_icache_memory before=54900 after=54700
           <...>-2677  [001]   840.835063: shrink_slab: shrink_icache_memory before=54700 after=54600
           <...>-2677  [001]   840.835229: shrink_slab: shrink_icache_memory before=54600 after=54500
           <...>-2677  [001]   840.835596: shrink_slab: shrink_icache_memory before=54500 after=54300
           <...>-2677  [001]   840.835761: shrink_slab: shrink_icache_memory before=54300 after=54200
           <...>-2677  [001]   840.835926: shrink_slab: shrink_icache_memory before=54200 after=54100
           <...>-2677  [001]   840.836097: shrink_slab: shrink_icache_memory before=54100 after=54000
           <...>-2677  [001]   840.836284: shrink_slab: shrink_icache_memory before=54000 after=53800
           <...>-2677  [001]   840.836453: shrink_slab: shrink_icache_memory before=53800 after=53700
           <...>-2677  [001]   840.836621: shrink_slab: shrink_icache_memory before=53700 after=53600
           <...>-2677  [001]   840.836793: shrink_slab: shrink_icache_memory before=53600 after=53500
           <...>-2677  [001]   840.836962: shrink_slab: shrink_icache_memory before=53500 after=53300
           <...>-2677  [001]   840.837137: shrink_slab: shrink_icache_memory before=53300 after=53200
           <...>-2677  [001]   840.837317: shrink_slab: shrink_icache_memory before=53200 after=53100
           <...>-2677  [001]   840.837485: shrink_slab: shrink_icache_memory before=53100 after=52900
           <...>-2677  [001]   840.837652: shrink_slab: shrink_icache_memory before=52900 after=52800
           <...>-2677  [001]   840.837821: shrink_slab: shrink_icache_memory before=52800 after=52700
           <...>-2677  [001]   840.837993: shrink_slab: shrink_icache_memory before=52700 after=52600
           <...>-2677  [001]   840.838168: shrink_slab: shrink_icache_memory before=52600 after=52400
           <...>-2677  [001]   840.838353: shrink_slab: shrink_icache_memory before=52400 after=52300
           <...>-2677  [001]   840.838524: shrink_slab: shrink_icache_memory before=52300 after=52200
           <...>-2677  [001]   840.838695: shrink_slab: shrink_icache_memory before=52200 after=52000
           <...>-2677  [001]   840.838865: shrink_slab: shrink_icache_memory before=52000 after=51900
           <...>-2677  [001]   840.839037: shrink_slab: shrink_icache_memory before=51900 after=51800
           <...>-2677  [001]   840.839207: shrink_slab: shrink_icache_memory before=51800 after=51700
           <...>-2677  [001]   840.839422: shrink_slab: shrink_icache_memory before=51700 after=51500
           <...>-2677  [001]   840.839589: shrink_slab: shrink_icache_memory before=51500 after=51400
           <...>-2677  [001]   840.839753: shrink_slab: shrink_icache_memory before=51400 after=51300
           <...>-2677  [001]   840.839920: shrink_slab: shrink_icache_memory before=51300 after=51100
           <...>-2677  [001]   840.840094: shrink_slab: shrink_icache_memory before=51100 after=51000
           <...>-2677  [001]   840.840278: shrink_slab: shrink_icache_memory before=51000 after=50900
           <...>-2677  [001]   840.840446: shrink_slab: shrink_icache_memory before=50900 after=50800
           <...>-2677  [001]   840.840618: shrink_slab: shrink_icache_memory before=50800 after=50600
           <...>-2677  [001]   840.840787: shrink_slab: shrink_icache_memory before=50600 after=50500
           <...>-2677  [001]   840.840953: shrink_slab: shrink_icache_memory before=50500 after=50400
           <...>-2677  [001]   840.841128: shrink_slab: shrink_icache_memory before=50400 after=50300
           <...>-2677  [001]   840.841310: shrink_slab: shrink_icache_memory before=50300 after=50100
           <...>-2677  [001]   840.841480: shrink_slab: shrink_icache_memory before=50100 after=50000
           <...>-2677  [001]   840.841649: shrink_slab: shrink_icache_memory before=50000 after=49900
           <...>-2677  [001]   840.841815: shrink_slab: shrink_icache_memory before=49900 after=49700
           <...>-2677  [001]   840.841984: shrink_slab: shrink_icache_memory before=49700 after=49600
           <...>-2677  [001]   840.842159: shrink_slab: shrink_icache_memory before=49600 after=49500
           <...>-2677  [001]   840.842346: shrink_slab: shrink_icache_memory before=49500 after=49400
           <...>-2677  [001]   840.842515: shrink_slab: shrink_icache_memory before=49400 after=49200
           <...>-2677  [001]   840.842684: shrink_slab: shrink_icache_memory before=49200 after=49100
           <...>-2677  [001]   840.842864: shrink_slab: shrink_icache_memory before=49100 after=49000
           <...>-2677  [001]   840.843039: shrink_slab: shrink_icache_memory before=49000 after=48800
           <...>-2677  [001]   840.843205: shrink_slab: shrink_icache_memory before=48800 after=48700
           <...>-2677  [001]   840.843391: shrink_slab: shrink_icache_memory before=48700 after=48600
           <...>-2677  [001]   840.843560: shrink_slab: shrink_icache_memory before=48600 after=48500
           <...>-2677  [001]   840.843735: shrink_slab: shrink_icache_memory before=48500 after=48300
           <...>-2677  [001]   840.844964: shrink_slab: shrink_icache_memory before=48300 after=48200
           <...>-2677  [001]   840.845242: shrink_slab: shrink_icache_memory before=48200 after=48100
           <...>-2677  [001]   840.845411: shrink_slab: shrink_icache_memory before=48100 after=47900
           <...>-2677  [001]   840.845581: shrink_slab: shrink_icache_memory before=47900 after=47800
           <...>-2677  [001]   840.845752: shrink_slab: shrink_icache_memory before=47800 after=47700
           <...>-2677  [001]   840.845920: shrink_slab: shrink_icache_memory before=47700 after=47600
           <...>-2677  [001]   840.860766: shrink_slab: shrink_icache_memory before=47600 after=47400
           <...>-2677  [001]   840.860949: shrink_slab: shrink_icache_memory before=47400 after=47300
           <...>-2677  [001]   840.861118: shrink_slab: shrink_icache_memory before=47300 after=47200
           <...>-2677  [001]   840.861306: shrink_slab: shrink_icache_memory before=47200 after=47100
           <...>-2677  [001]   840.861476: shrink_slab: shrink_icache_memory before=47100 after=46900
           <...>-2677  [001]   840.861646: shrink_slab: shrink_icache_memory before=46900 after=46800
           <...>-2677  [001]   840.861817: shrink_slab: shrink_icache_memory before=46800 after=46700
           <...>-2677  [001]   840.861986: shrink_slab: shrink_icache_memory before=46700 after=46500
           <...>-2677  [001]   840.862159: shrink_slab: shrink_icache_memory before=46500 after=46400
           <...>-2677  [001]   840.862438: shrink_slab: shrink_icache_memory before=46400 after=46300
           <...>-2677  [001]   840.862626: shrink_slab: shrink_icache_memory before=46300 after=46200
           <...>-2677  [001]   840.862796: shrink_slab: shrink_icache_memory before=46200 after=46000
           <...>-2677  [001]   840.862963: shrink_slab: shrink_icache_memory before=46000 after=45900
           <...>-2677  [001]   840.863148: shrink_slab: shrink_icache_memory before=45900 after=45800
           <...>-2677  [001]   840.863323: shrink_slab: shrink_icache_memory before=45800 after=45600
           <...>-2677  [001]   840.863492: shrink_slab: shrink_icache_memory before=45600 after=45500
           <...>-2677  [001]   840.863656: shrink_slab: shrink_icache_memory before=45500 after=45400
           <...>-2677  [001]   840.863821: shrink_slab: shrink_icache_memory before=45400 after=45300
           <...>-2677  [001]   840.863988: shrink_slab: shrink_icache_memory before=45300 after=45100
CPU:1 [LOST 702 EVENTS]
           <...>-2677  [001]   840.928410: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928411: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928412: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928414: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928415: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928416: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928418: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928419: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928420: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928422: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928423: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928425: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928426: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928427: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928429: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928430: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928432: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928433: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928434: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928436: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928437: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928438: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928440: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928441: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928443: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928444: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928445: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928447: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928448: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928449: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928451: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928452: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928454: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928455: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928456: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928458: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928459: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928460: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928462: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928463: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928464: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928466: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928467: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928468: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928470: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928471: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928473: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928474: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928475: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928477: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928478: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928479: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928481: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928482: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928483: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928485: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928486: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928487: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928489: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928490: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928492: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928493: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928494: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928496: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928497: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928498: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928500: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928501: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928502: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928504: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928505: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928506: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928508: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928509: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928510: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928512: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928513: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928515: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928516: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928518: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928519: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928520: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928522: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928523: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928524: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928526: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928527: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928529: shrink_slab: shrink_icache_memory before=0 after=0
           <...>-2677  [001]   840.928532: zone_reclaim: scan 32, order 0, old 39546 new 16605





diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5a377e6..5767a08 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -244,6 +244,12 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,

                        nr_before = (*shrinker->shrink)(0, gfp_mask);
                        shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);
+
+                       trace_printk("%pf before=%d after=%d\n",
+                                    shrinker->shrink,
+                                    nr_before,
+                                    shrink_ret);
+
                        if (shrink_ret == -1)
                                break;
                        if (shrink_ret < nr_before)
@@ -2619,10 +2625,23 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int orde
                 * Note that shrink_slab will free memory on all zones and may
                 * take a long time.
                 */
-               while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
-                       zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
-                               slab_reclaimable - nr_pages)
-                       ;
+               for (;;) {
+                       unsigned long slab;
+
+                       if (!shrink_slab(sc.nr_scanned, gfp_mask, order))
+                               break;
+
+                       slab = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
+
+                       trace_printk("scan %lu, order %d, old %lu new %lu\n",
+                                    sc.nr_scanned,
+                                    order,
+                                    slab_reclaimable,
+                                    slab);
+
+                       if (slab + nr_pages <= slab_reclaimable)
+                               break;
+               }


^ permalink raw reply related	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages, not page order
  2010-07-08  7:40   ` KOSAKI Motohiro
@ 2010-07-09  8:36     ` Minchan Kim
  -1 siblings, 0 replies; 62+ messages in thread
From: Minchan Kim @ 2010-07-09  8:36 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Christoph Lameter, LKML, linux-mm, Andrew Morton, Mel Gorman,
	Rik van Riel, Johannes Weiner

On Thu, Jul 8, 2010 at 4:40 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
> Fix a simple argument error. Usually 'order' is a much smaller value than
> lru_pages, so it can cause unnecessary icache dropping.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

With your test result, this patch makes sense to me.
Please include your test result in the description.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 62+ messages in thread


* [PATCH] vmscan: stop meaningless loop iteration when no reclaimable slab
  2010-07-09  8:21         ` KOSAKI Motohiro
@ 2010-07-09 10:13           ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-09 10:13 UTC (permalink / raw)
  To: Andrew Morton, Christoph Lameter, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Minchan Kim, Johannes Weiner
  Cc: kosaki.motohiro

If the number of reclaimable slab objects is zero, shrink_icache_memory() and
shrink_dcache_memory() return 0, but strangely shrink_slab() ignores that and
continues a meaningless loop iteration.

This patch fixes it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/vmscan.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0f9f624..8f61adb 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -243,6 +243,11 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 			int nr_before;
 
 			nr_before = (*shrinker->shrink)(0, gfp_mask);
+			/* no slab objects, no more reclaim. */
+			if (nr_before == 0) {
+				total_scan = 0;
+				break;
+			}
 			shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);
 			if (shrink_ret == -1)
 				break;
-- 
1.6.5.2




^ permalink raw reply related	[flat|nested] 62+ messages in thread


* Re: [PATCH] vmscan: stop meaningless loop iteration when no reclaimable slab
  2010-07-09 10:13           ` KOSAKI Motohiro
@ 2010-07-09 10:53             ` Minchan Kim
  -1 siblings, 0 replies; 62+ messages in thread
From: Minchan Kim @ 2010-07-09 10:53 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, Christoph Lameter, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Johannes Weiner

On Fri, Jul 9, 2010 at 7:13 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
> If the number of reclaimable slab objects is zero, shrink_icache_memory() and
> shrink_dcache_memory() return 0, but strangely shrink_slab() ignores that and
> continues a meaningless loop iteration.
>
> This patch fixes it.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
>  mm/vmscan.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 0f9f624..8f61adb 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -243,6 +243,11 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
>                        int nr_before;
>
>                        nr_before = (*shrinker->shrink)(0, gfp_mask);
> +                       /* no slab objects, no more reclaim. */
> +                       if (nr_before == 0) {
> +                               total_scan = 0;

Why do you reset total_scan to 0?
I don't know the exact meaning of shrinker->nr.
AFAIU, it can affect the next shrinker's total_scan.
Isn't that harmful?

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH] vmscan: stop meaningless loop iteration when no reclaimable slab
  2010-07-09 10:53             ` Minchan Kim
@ 2010-07-09 11:04               ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-09 11:04 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Andrew Morton, Christoph Lameter, LKML,
	linux-mm, Mel Gorman, Rik van Riel, Johannes Weiner

> On Fri, Jul 9, 2010 at 7:13 PM, KOSAKI Motohiro
> <kosaki.motohiro@jp.fujitsu.com> wrote:
> > If the number of reclaimable slab objects is zero, shrink_icache_memory() and
> > shrink_dcache_memory() return 0, but strangely shrink_slab() ignores that and
> > continues a meaningless loop iteration.
> >
> > This patch fixes it.
> >
> > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > ---
> >  mm/vmscan.c |    5 +++++
> >  1 files changed, 5 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 0f9f624..8f61adb 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -243,6 +243,11 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
> >                        int nr_before;
> >
> >                        nr_before = (*shrinker->shrink)(0, gfp_mask);
> > +                       /* no slab objects, no more reclaim. */
> > +                       if (nr_before == 0) {
> > +                               total_scan = 0;
> 
> Why do you reset total_scan to 0?

If slab objects are zero, we don't need any more reclaim.

> I don't know the exact meaning of shrinker->nr.

It has a similar meaning to reclaim_stat->nr_saved_scan.
If total_scan isn't a multiple of SHRINK_BATCH (128), the remainder is saved and used at the next shrink_slab() call.

> AFAIU, it can affect the next shrinker's total_scan.
> Isn't that harmful?

No.  The loop is:

                total_scan = shrinker->nr;		/* Reset and init total_scan */
                shrinker->nr = 0;

                while (total_scan >= SHRINK_BATCH) {
                        nr_before = (*shrinker->shrink)(0, gfp_mask);
                        /* no slab objects, no more reclaim. */
                        if (nr_before == 0) {
                                total_scan = 0;
                                break;
                        }
                        shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);
                        if (shrink_ret == -1)
                                break;
                        if (shrink_ret < nr_before)
                                ret += nr_before - shrink_ret;
                        total_scan -= this_scan;
                }

                shrinker->nr += total_scan;		/* save remainder #of-scan */




^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages, not page order
  2010-07-09  8:36     ` Minchan Kim
@ 2010-07-09 13:54       ` Christoph Lameter
  -1 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2010-07-09 13:54 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KOSAKI Motohiro, LKML, linux-mm, Andrew Morton, Mel Gorman,
	Rik van Riel, Johannes Weiner


Ok. I am convinced.

Acked-by: Christoph Lameter <cl@linux-foundation.org>


^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH] vmscan: stop meaningless loop iteration when no reclaimable slab
  2010-07-09 10:13           ` KOSAKI Motohiro
@ 2010-07-09 14:02             ` Christoph Lameter
  -1 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2010-07-09 14:02 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, LKML, linux-mm, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner

On Fri, 9 Jul 2010, KOSAKI Motohiro wrote:

> If the number of reclaimable slab objects is zero, shrink_icache_memory() and
> shrink_dcache_memory() return 0, but strangely shrink_slab() ignores that and
> continues a meaningless loop iteration.

There is also a per zone/node/global counter, SLAB_RECLAIM_ACCOUNT, that
could be used to determine whether it's worth looking at things at all. I saw
some effort going into making the shrinkers zone aware. If so, then we may
be able to avoid scanning slabs.
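
E.g. a zone-aware shrink_slab() could bail out early. A sketch only (the
zone argument is hypothetical; it does not exist today):

	/* nothing reclaimable is accounted to this zone: skip the shrinkers */
	if (zone && zone_page_state(zone, NR_SLAB_RECLAIMABLE) == 0)
		return 0;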

^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 1/2] vmscan: don't subtraction of unsined
  2010-07-09  1:16     ` KOSAKI Motohiro
@ 2010-07-09 22:28       ` Andrew Morton
  -1 siblings, 0 replies; 62+ messages in thread
From: Andrew Morton @ 2010-07-09 22:28 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: LKML, linux-mm, Mel Gorman, Rik van Riel, Minchan Kim, Johannes Weiner

On Fri,  9 Jul 2010 10:16:33 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2588,7 +2588,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		.swappiness = vm_swappiness,
>  		.order = order,
>  	};
> -	unsigned long slab_reclaimable;
> +	unsigned long nr_slab_pages0, nr_slab_pages1;
>  
>  	disable_swap_token();
>  	cond_resched();
> @@ -2615,8 +2615,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		} while (priority >= 0 && sc.nr_reclaimed < nr_pages);
>  	}
>  
> -	slab_reclaimable = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> -	if (slab_reclaimable > zone->min_slab_pages) {
> +	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +	if (nr_slab_pages0 > zone->min_slab_pages) {
>  		/*
>  		 * shrink_slab() does not currently allow us to determine how
>  		 * many pages were freed in this zone.

Well no, but it could do so, with some minor changes to struct
reclaim_state and its handling.  Put a zone* and a counter in
reclaim_state, handle them in sl?b.c.
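
Something along these lines, say (a sketch only; the new field names are
invented here):

	struct reclaim_state {
		unsigned long reclaimed_slab;
		/* hypothetical additions for per-zone accounting: */
		struct zone *zone;		/* zone the caller targets */
		unsigned long reclaimed_zone;	/* slab pages freed in ->zone */
	};

and in the sl?b.c page-freeing paths, roughly:

	if (current->reclaim_state) {
		current->reclaim_state->reclaimed_slab += nr_pages;
		if (current->reclaim_state->zone == page_zone(page))
			current->reclaim_state->reclaimed_zone += nr_pages;
	}

Then __zone_reclaim() could read ->reclaimed_zone instead of diffing
NR_SLAB_RECLAIMABLE.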

> So we take the current
> @@ -2628,16 +2628,17 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		 * take a long time.
>  		 */
>  		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
> -			zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
> -				slab_reclaimable - nr_pages)
> +		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
> +				nr_slab_pages0))
>  			;
>  
>  		/*
>  		 * Update nr_reclaimed by the number of slab pages we
>  		 * reclaimed from this zone.
>  		 */
> -		sc.nr_reclaimed += slab_reclaimable -
> -			zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +		nr_slab_pages1 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> +		if (nr_slab_pages1 < nr_slab_pages0)
> +			sc.nr_reclaimed += nr_slab_pages0 - nr_slab_pages1;

My, that's horrible.  The whole expression says "this number is
basically a pile of random junk.  Let's add it in anyway".


>  	}
>  
>  	p->reclaim_state = NULL;

^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH] vmscan: stop meaningless loop iteration when no  reclaimable slab
  2010-07-09 11:04               ` KOSAKI Motohiro
@ 2010-07-11 22:28                 ` Minchan Kim
  -1 siblings, 0 replies; 62+ messages in thread
From: Minchan Kim @ 2010-07-11 22:28 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, Christoph Lameter, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Johannes Weiner

On Fri, Jul 9, 2010 at 8:04 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>> On Fri, Jul 9, 2010 at 7:13 PM, KOSAKI Motohiro
>> <kosaki.motohiro@jp.fujitsu.com> wrote:
>> > If the number of reclaimable slab objects is zero, shrink_icache_memory()
>> > and shrink_dcache_memory() return 0, but strangely shrink_slab() ignores
>> > that and keeps iterating its loop pointlessly.
>> >
>> > This patch fixes it.
>> >
>> > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>> > ---
>> >  mm/vmscan.c |    5 +++++
>> >  1 files changed, 5 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index 0f9f624..8f61adb 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -243,6 +243,11 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
>> >                        int nr_before;
>> >
>> >                        nr_before = (*shrinker->shrink)(0, gfp_mask);
>> > +                       /* no slab objects, no more reclaim. */
>> > +                       if (nr_before == 0) {
>> > +                               total_scan = 0;
>>
>> Why do you reset total_scan to 0?
>
> If slab objects are zero, we don't need more reclaim.
>
>> I don't know the exact meaning of shrinker->nr.
>
> It has a similar meaning to reclaim_stat->nr_saved_scan.
> If total_scan isn't divisible by SHRINK_BATCH (128), the remainder is saved and used at the next shrink_slab() call.
>
>> AFAIU, it can affect next shrinker's total_scan.
>> Isn't it harmful?
>
> No.  This loop is
>
>                total_scan = shrinker->nr;              /* Reset and init total_scan */
>                shrinker->nr = 0;
>
>                while (total_scan >= SHRINK_BATCH) {
>                        nr_before = (*shrinker->shrink)(0, gfp_mask);
>                        /* no slab objects, no more reclaim. */
>                        if (nr_before == 0) {
>                                total_scan = 0;
>                                break;
>                        }
>                        shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);
>                        if (shrink_ret == -1)
>                                break;
>                        if (shrink_ret < nr_before)
>                                ret += nr_before - shrink_ret;
>                        total_scan -= this_scan;
>                }
>
>                shrinker->nr += total_scan;             /* save remainder #of-scan */
>
>
I can't understand your point.


old shrink_slab

shrinker->nr += delta; /* Add delta to the count left over from the previous pass */
total_scan = shrinker->nr;

while (total_scan >= SHRINK_BATCH) {
	nr_before = shrink(xxx);
	total_scan -= this_scan;
}

shrinker->nr += total_scan;

total_scan can always end up as a remainder < SHRINK_BATCH.
So the leftover can affect the loop count when this shrinker next runs.

new shrink_slab

shrinker->nr += delta; /* nr is always zero by your patch */
total_scan = shrinker->nr;

while (total_scan >= SHRINK_BATCH) {
	nr_before = shrink(xxx);
	if (nr_before == 0) {
		total_scan = 0;
		break;
	}
}

shrinker->nr += 0;

But after your patch, total_scan is always zero. It never affects
the next shrinker run's loop count.

Am I missing something?
-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] vmscan: stop meaningless loop iteration when no  reclaimable slab
  2010-07-11 22:28                 ` Minchan Kim
@ 2010-07-13  4:48                   ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-13  4:48 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Andrew Morton, Christoph Lameter, LKML,
	linux-mm, Mel Gorman, Rik van Riel, Johannes Weiner

Hi

> 
> old shrink_slab
> 
> shrinker->nr += delta; /* Add delta to the count left over from the previous pass */
> total_scan = shrinker->nr;
>
> while (total_scan >= SHRINK_BATCH) {
> 	nr_before = shrink(xxx);
> 	total_scan -= this_scan;
> }
>
> shrinker->nr += total_scan;
>
> total_scan can always end up as a remainder < SHRINK_BATCH.
> So the leftover can affect the loop count when this shrinker next runs.

Correct.


> 
> new shrink_slab
> 
> shrinker->nr += delta; /* nr is always zero by your patch */

No.
My patch doesn't change the delta calculation at all.


> total_scan = shrinker->nr;
> 
> while (total_scan >= SHRINK_BATCH) {
> 	nr_before = shrink(xxx);
> 	if (nr_before == 0) {
> 		total_scan = 0;
> 		break;
> 	}
> }
> 
> shrinker->nr += 0;
> 
> But after your patch, total_scan is always zero. It never affects
> the next shrinker run's loop count.

No. After my patch this loop has two exit paths:
 1) total_scan is less than SHRINK_BATCH.
      -> no behavior change; we still reach the shrinker->nr += total_scan code.
 2) (*shrinker->shrink)(0, gfp_mask) returns 0.
      We don't increase shrinker->nr, for two reasons:
      a) if total_scan is, say, 10000, we shouldn't carry over such a big number.
      b) we now have zero slab objects, so we are freed from the duty of keeping
          page and slab reclaim balanced; shrinker->nr += 0 has zero side effect.
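
(For concreteness, a minimal sketch of the patched loop with both exit paths
marked; names follow the pseudocode quoted above, not the exact kernel source:

	total_scan = shrinker->nr;
	shrinker->nr = 0;

	while (total_scan >= SHRINK_BATCH) {		/* exit 1: remainder < SHRINK_BATCH */
		nr_before = (*shrinker->shrink)(0, gfp_mask);
		if (nr_before == 0) {			/* exit 2: nothing left to reclaim */
			total_scan = 0;			/* discard any large carry-over */
			break;
		}
		shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);
		total_scan -= this_scan;
	}

	shrinker->nr += total_scan;	/* remainder on exit 1, zero on exit 2 */
)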

Thanks.

> 
> Am I missing something?
> -- 
> Kind regards,
> Minchan Kim




^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] vmscan: stop meaningless loop iteration when no reclaimable slab
  2010-07-09 14:02             ` Christoph Lameter
@ 2010-07-13  4:59               ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-13  4:59 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: kosaki.motohiro, Andrew Morton, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Minchan Kim, Johannes Weiner

> On Fri, 9 Jul 2010, KOSAKI Motohiro wrote:
> 
> > If the number of reclaimable slab objects is zero, shrink_icache_memory()
> > and shrink_dcache_memory() return 0, but strangely shrink_slab() ignores
> > that and keeps iterating its loop pointlessly.
> 
> There is also a per zone/node/global counter SLAB_RECLAIM_ACCOUNT that
> could be used to determine if its worth looking at things at all. I saw
> some effort going into making the shrinkers zone aware. If so then we may
> be able to avoid scanning slabs.

Yup.
After Nick's effort is merged, we can make more improvements, I bet :)





^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages,  not page order
  2010-07-09  8:36     ` Minchan Kim
@ 2010-07-13  5:41       ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-13  5:41 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Christoph Lameter, LKML, linux-mm,
	Andrew Morton, Mel Gorman, Rik van Riel, Johannes Weiner

> On Thu, Jul 8, 2010 at 4:40 PM, KOSAKI Motohiro
> <kosaki.motohiro@jp.fujitsu.com> wrote:
> > Fix a simple argument error. Usually 'order' is a much smaller value than
> > lru_pages, so it can cause unnecessary icache dropping.
> >
> > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> 
> With your test result, this patch makes sense to me.
> Please include your test result in the description.

How's this?

==============================================================
From 19872d74875e2331cbe7eca46c8ef65f5f00d7c4 Mon Sep 17 00:00:00 2001
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Tue, 13 Jul 2010 13:39:11 +0900
Subject: [PATCH] vmscan: shrink_slab() require number of lru_pages, not page order

Now, shrink_slab() uses the following scanning equation.

                            lru_scanned        max_pass
  basic_scan_objects = 4 x -------------  x -----------------------------
                            lru_pages        shrinker->seeks (default:2)

  scan_objects = min(basic_scan_objects, max_pass * 2)

If we pass a very small value as lru_pages instead of the real number
of LRU pages, shrink_slab() drops many more objects than necessary. And
currently __zone_reclaim() passes 'order' as lru_pages by mistake, which
produces bad results.

For example, if we receive very low memory pressure (scan = 32, order = 0),
shrink_slab() via zone_reclaim() always drops _all_ icache/dcache
objects (see the above equation: a very small lru_pages makes a very big
scan_objects result).
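
(A worked instance with illustrative numbers, assuming the divide-by-zero
guard "lru_pages + 1" that shrink_slab() applies before dividing: take a
cache holding max_pass = 10000 objects, shrinker->seeks = 2, scanned = 32.

  order passed as lru_pages:  delta = (4 * 32 / 2) * 10000 / (0 + 1) = 640000,
                              clamped to max_pass * 2 -> every object gets scanned
  real lru_pages = 100000:    delta = (4 * 32 / 2) * 10000 / 100001 ~= 6
                              -> far below SHRINK_BATCH, almost no slab pressure)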

This patch fixes it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Rik van Riel <riel@redhat.com>
---
 mm/vmscan.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6ff51c0..1bf9f72 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2612,6 +2612,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 
 	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
 	if (nr_slab_pages0 > zone->min_slab_pages) {
+		unsigned long lru_pages = zone_reclaimable_pages(zone);
+
 		/*
 		 * shrink_slab() does not currently allow us to determine how
 		 * many pages were freed in this zone. So we take the current
@@ -2622,7 +2624,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		 * Note that shrink_slab will free memory on all zones and may
 		 * take a long time.
 		 */
-		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
+		while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) &&
 		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
 			nr_slab_pages0))
 			;
-- 
1.6.5.2





^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] vmscan: stop meaningless loop iteration when no  reclaimable slab
  2010-07-13  4:48                   ` KOSAKI Motohiro
@ 2010-07-13  6:33                     ` Minchan Kim
  -1 siblings, 0 replies; 62+ messages in thread
From: Minchan Kim @ 2010-07-13  6:33 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, Christoph Lameter, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Johannes Weiner

On Tue, Jul 13, 2010 at 1:48 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
> Hi
>
>>
>> old shrink_slab
>>
>> shrinker->nr += delta; /* Add delta to the count left over from the previous pass */
>> total_scan = shrinker->nr;
>>
>> while (total_scan >= SHRINK_BATCH) {
>>       nr_before = shrink(xxx);
>>       total_scan -= this_scan;
>> }
>>
>> shrinker->nr += total_scan;
>>
>> total_scan can always end up as a remainder < SHRINK_BATCH.
>> So the leftover can affect the loop count when this shrinker next runs.
>
> Correct.
>
>
>>
>> new shrink_slab
>>
>> shrinker->nr += delta; /* nr is always zero by your patch */
>
> No.
> My patch doesn't change the delta calculation at all.
>
>
>> total_scan = shrinker->nr;
>>
>> while (total_scan >= SHRINK_BATCH) {
>>       nr_before = shrink(xxx);
>>       if (nr_before == 0) {
>>               total_scan = 0;
>>               break;
>>       }
>> }
>>
>> shrinker->nr += 0;
>>
>> But after your patch, total_scan is always zero. It never affects
>> the next shrinker run's loop count.
>
> No. After my patch this loop has two exit paths:
>  1) total_scan is less than SHRINK_BATCH.
>      -> no behavior change; we still reach the shrinker->nr += total_scan code.
>  2) (*shrinker->shrink)(0, gfp_mask) returns 0.
>      We don't increase shrinker->nr, for two reasons:
>      a) if total_scan is, say, 10000, we shouldn't carry over such a big number.
>      b) we now have zero slab objects, so we are freed from the duty of keeping
>          page and slab reclaim balanced; shrinker->nr += 0 has zero side effect.

I totally agree with you.
Thanks for the good explanation, Kosaki.

Reviewed-by: Minchan Kim <minchan.kim@gmail.com>


-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 1/2] vmscan: don't subtraction of unsined
  2010-07-09 22:28       ` Andrew Morton
@ 2010-07-13  9:32         ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-13  9:32 UTC (permalink / raw)
  To: Andrew Morton, Christoph Lameter
  Cc: kosaki.motohiro, LKML, linux-mm, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner

> On Fri,  9 Jul 2010 10:16:33 +0900 (JST)
> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> 
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2588,7 +2588,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> >  		.swappiness = vm_swappiness,
> >  		.order = order,
> >  	};
> > -	unsigned long slab_reclaimable;
> > +	unsigned long nr_slab_pages0, nr_slab_pages1;
> >  
> >  	disable_swap_token();
> >  	cond_resched();
> > @@ -2615,8 +2615,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> >  		} while (priority >= 0 && sc.nr_reclaimed < nr_pages);
> >  	}
> >  
> > -	slab_reclaimable = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> > -	if (slab_reclaimable > zone->min_slab_pages) {
> > +	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> > +	if (nr_slab_pages0 > zone->min_slab_pages) {
> >  		/*
> >  		 * shrink_slab() does not currently allow us to determine how
> >  		 * many pages were freed in this zone.
> 
> Well no, but it could do so, with some minor changes to struct
> reclaim_state and its handling.  Put a zone* and a counter in
> reclaim_state, handle them in sl?b.c.
> 
> > So we take the current
> > @@ -2628,16 +2628,17 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> >  		 * take a long time.
> >  		 */
> >  		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
> > -			zone_page_state(zone, NR_SLAB_RECLAIMABLE) >
> > -				slab_reclaimable - nr_pages)
> > +		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
> > +				nr_slab_pages0))
> >  			;
> >  
> >  		/*
> >  		 * Update nr_reclaimed by the number of slab pages we
> >  		 * reclaimed from this zone.
> >  		 */
> > -		sc.nr_reclaimed += slab_reclaimable -
> > -			zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> > +		nr_slab_pages1 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> > +		if (nr_slab_pages1 < nr_slab_pages0)
> > +			sc.nr_reclaimed += nr_slab_pages0 - nr_slab_pages1;
> 
> My, that's horrible.  The whole expression says "this number is
> basically a pile of random junk.  Let's add it in anyway".
> 
> 
> >  	}
> >  
> >  	p->reclaim_state = NULL;


How's this?

Christoph, can we hear your opinion about adding a new branch to the slab-free path?
I think this is OK, because reclaim causes a lot of cache misses, so a branch
mispredict is a relatively minor penalty. Thoughts?



From 9f7d7a9bd836b7373ade3056e6a3d2a3d82ac7ce Mon Sep 17 00:00:00 2001
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Tue, 13 Jul 2010 14:43:21 +0900
Subject: [PATCH] vmscan: count reclaimed slab pages properly

Andrew Morton pointed out that __zone_reclaim() shouldn't compare old and new
zone_page_state(NR_SLAB_RECLAIMABLE) results. Instead, it has to account
the number of freed slab pages by enhancing reclaim_state.

This patch does it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 include/linux/swap.h |    3 ++-
 mm/slab.c            |    4 +++-
 mm/slob.c            |    4 +++-
 mm/slub.c            |    7 +++++--
 mm/vmscan.c          |   44 ++++++++++++++++----------------------------
 5 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index ff4acea..b8d3f33 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -107,7 +107,8 @@ typedef struct {
  * memory reclaim
  */
 struct reclaim_state {
-	unsigned long reclaimed_slab;
+	unsigned long	reclaimed_slab;
+	struct zone	*zone;
 };
 
 #ifdef __KERNEL__
diff --git a/mm/slab.c b/mm/slab.c
index 4e9c46f..aac9306 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1741,7 +1741,9 @@ static void kmem_freepages(struct kmem_cache *cachep, void *addr)
 		page++;
 	}
 	if (current->reclaim_state)
-		current->reclaim_state->reclaimed_slab += nr_freed;
+		if (!current->reclaim_state->zone ||
+		    current->reclaim_state->zone == page_zone(virt_to_page(addr)))
+			current->reclaim_state->reclaimed_slab += nr_freed;
 	free_pages((unsigned long)addr, cachep->gfporder);
 }
 
diff --git a/mm/slob.c b/mm/slob.c
index 3f19a34..192d05c 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -260,7 +260,9 @@ static void *slob_new_pages(gfp_t gfp, int order, int node)
 static void slob_free_pages(void *b, int order)
 {
 	if (current->reclaim_state)
-		current->reclaim_state->reclaimed_slab += 1 << order;
+		if (!current->reclaim_state->zone ||
+		    current->reclaim_state->zone == page_zone(virt_to_page(b)))
+			current->reclaim_state->reclaimed_slab += 1 << order;
 	free_pages((unsigned long)b, order);
 }
 
diff --git a/mm/slub.c b/mm/slub.c
index 7bb7940..f510b14 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1204,8 +1204,11 @@ static void __free_slab(struct kmem_cache *s, struct page *page)
 
 	__ClearPageSlab(page);
 	reset_page_mapcount(page);
-	if (current->reclaim_state)
-		current->reclaim_state->reclaimed_slab += pages;
+	if (current->reclaim_state) {
+		if (!current->reclaim_state->zone ||
+		    current->reclaim_state->zone == page_zone(page))
+			current->reclaim_state->reclaimed_slab += pages;
+	}
 	__free_pages(page, order);
 }
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1bf9f72..8faef0c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2571,7 +2571,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 	/* Minimum pages needed in order to stay on node */
 	const unsigned long nr_pages = 1 << order;
 	struct task_struct *p = current;
-	struct reclaim_state reclaim_state;
 	int priority;
 	struct scan_control sc = {
 		.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
@@ -2583,8 +2582,10 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		.swappiness = vm_swappiness,
 		.order = order,
 	};
-	unsigned long nr_slab_pages0, nr_slab_pages1;
-
+	struct reclaim_state reclaim_state = {
+		.reclaimed_slab = 0,
+		.zone		= zone,
+	};
 
 	cond_resched();
 	/*
@@ -2594,7 +2595,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 	 */
 	p->flags |= PF_MEMALLOC | PF_SWAPWRITE;
 	lockdep_set_current_reclaim_state(gfp_mask);
-	reclaim_state.reclaimed_slab = 0;
 	p->reclaim_state = &reclaim_state;
 
 	if (zone_pagecache_reclaimable(zone) > zone->min_unmapped_pages) {
@@ -2610,34 +2610,22 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		} while (priority >= 0 && sc.nr_reclaimed < nr_pages);
 	}
 
-	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
-	if (nr_slab_pages0 > zone->min_slab_pages) {
+	if (zone_page_state(zone, NR_SLAB_RECLAIMABLE) > zone->min_slab_pages) {
 		unsigned long lru_pages = zone_reclaimable_pages(zone);
 
-		/*
-		 * shrink_slab() does not currently allow us to determine how
-		 * many pages were freed in this zone. So we take the current
-		 * number of slab pages and shake the slab until it is reduced
-		 * by the same nr_pages that we used for reclaiming unmapped
-		 * pages.
-		 *
-		 * Note that shrink_slab will free memory on all zones and may
-		 * take a long time.
-		 */
-		while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) &&
-		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
-			nr_slab_pages0))
-			;
-
-		/*
-		 * Update nr_reclaimed by the number of slab pages we
-		 * reclaimed from this zone.
-		 */
-		nr_slab_pages1 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
-		if (nr_slab_pages1 < nr_slab_pages0)
-			sc.nr_reclaimed += nr_slab_pages0 - nr_slab_pages1;
+		for (;;) {
+			/*
+			 * Note that shrink_slab will free memory on all zones
+			 * and may take a long time.
+			 */
+			if (!shrink_slab(sc.nr_scanned, gfp_mask, lru_pages))
+				break;
+			if (reclaim_state.reclaimed_slab >= nr_pages)
+				break;
+		}
 	}
 
+	sc.nr_reclaimed += reclaim_state.reclaimed_slab;
 	p->reclaim_state = NULL;
 	current->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE);
 	lockdep_clear_current_reclaim_state();
-- 
1.6.5.2







^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 1/2] vmscan: don't subtraction of unsined
  2010-07-13  9:32         ` KOSAKI Motohiro
@ 2010-07-14  1:50           ` Christoph Lameter
  -1 siblings, 0 replies; 62+ messages in thread
From: Christoph Lameter @ 2010-07-14  1:50 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, LKML, linux-mm, Mel Gorman, Rik van Riel,
	Minchan Kim, Johannes Weiner

On Tue, 13 Jul 2010, KOSAKI Motohiro wrote:

> Christoph, can we hear your opinion about adding a new branch to the slab-free path?
> I think this is OK, because reclaim causes a lot of cache misses, so a branch
> mispredict is a relatively minor penalty. Thoughts?

It's on the slow path, so I would think that should be okay. But is this
really necessary? Isn't working with the per-zone slab reclaim counters
enough? We are adding counter after counter with similar purposes, and
the handling gets more complex.

Maybe we can get rid of the code in the slabs instead by just relying on
the difference of the zone counters?
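
(Roughly, the counter-difference variant being suggested would sample the
existing per-zone counter around the loop instead of adding a reclaim_state
field -- a sketch only, and essentially what the earlier v2 patch already did:

	unsigned long before, after;

	before = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
	while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) &&
	       zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > before)
		;
	after = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
	if (after < before)		/* the counter can drift; only credit decreases */
		sc.nr_reclaimed += before - after;
)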


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 1/2] vmscan: don't subtraction of unsined
  2010-07-14  1:50           ` Christoph Lameter
@ 2010-07-14  2:15             ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-14  2:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: kosaki.motohiro, Andrew Morton, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Minchan Kim, Johannes Weiner

> On Tue, 13 Jul 2010, KOSAKI Motohiro wrote:
> 
> > Christoph, can we hear your opinion about adding a new branch to the slab-free path?
> > I think this is OK, because reclaim causes a lot of cache misses, so a branch
> > mispredict is a relatively minor penalty. Thoughts?
> 
> It's on the slow path, so I would think that should be okay. But is this
> really necessary? Isn't working with the per-zone slab reclaim counters
> enough? We are adding counter after counter with similar purposes, and
> the handling gets more complex.
> 
> Maybe we can get rid of the code in the slabs instead by just relying on
> the difference of the zone counters?

Okay, I agree. I'm putting this work on hold for now, and I'll (probably)
resume it after Nick's work is merged.

Thanks.





^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages,  not page order
  2010-07-13  5:41       ` KOSAKI Motohiro
@ 2010-07-15 19:15         ` Andrew Morton
  -1 siblings, 0 replies; 62+ messages in thread
From: Andrew Morton @ 2010-07-15 19:15 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Christoph Lameter, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Johannes Weiner

On Tue, 13 Jul 2010 14:41:28 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> Now, shrink_slab() uses the following scanning equation.
> 
>                             lru_scanned        max_pass
>   basic_scan_objects = 4 x -------------  x -----------------------------
>                             lru_pages        shrinker->seeks (default:2)
> 
>   scan_objects = min(basic_scan_objects, max_pass * 2)
> 
> If we pass a very small value as lru_pages instead of the real number
> of LRU pages, shrink_slab() drops many more objects than necessary. And
> currently __zone_reclaim() passes 'order' as lru_pages by mistake, which
> produces bad results.
> 
> For example, if we receive very low memory pressure (scan = 32, order = 0),
> shrink_slab() via zone_reclaim() always drops _all_ icache/dcache
> objects (see the above equation: a very small lru_pages makes a very big
> scan_objects result).
> 
> This patch fixes it.
> 
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> Acked-by: Christoph Lameter <cl@linux-foundation.org>
> Acked-by: Rik van Riel <riel@redhat.com>
> ---
>  mm/vmscan.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 6ff51c0..1bf9f72 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2612,6 +2612,8 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  
>  	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
>  	if (nr_slab_pages0 > zone->min_slab_pages) {
> +		unsigned long lru_pages = zone_reclaimable_pages(zone);
> +
>  		/*
>  		 * shrink_slab() does not currently allow us to determine how
>  		 * many pages were freed in this zone. So we take the current
> @@ -2622,7 +2624,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  		 * Note that shrink_slab will free memory on all zones and may
>  		 * take a long time.
>  		 */
> -		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
> +		while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) &&
>  		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
>  			nr_slab_pages0))
>  			;

Wouldn't it be better to recalculate zone_reclaimable_pages() each time
around the loop?  For example, shrink_icache_memory()->prune_icache()
will remove pagecache from an inode if it hits the tail of the list. 
This can change the number of reclaimable pages by squigabytes, but
this code thinks nothing changed?



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages,  not page order
  2010-07-15 19:15         ` Andrew Morton
@ 2010-07-16  1:39           ` KOSAKI Motohiro
  -1 siblings, 0 replies; 62+ messages in thread
From: KOSAKI Motohiro @ 2010-07-16  1:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kosaki.motohiro, Minchan Kim, Christoph Lameter, LKML, linux-mm,
	Mel Gorman, Rik van Riel, Johannes Weiner

> >  	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
> >  	if (nr_slab_pages0 > zone->min_slab_pages) {
> > +		unsigned long lru_pages = zone_reclaimable_pages(zone);
> > +
> >  		/*
> >  		 * shrink_slab() does not currently allow us to determine how
> >  		 * many pages were freed in this zone. So we take the current
> > @@ -2622,7 +2624,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
> >  		 * Note that shrink_slab will free memory on all zones and may
> >  		 * take a long time.
> >  		 */
> > -		while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
> > +		while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) &&
> >  		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
> >  			nr_slab_pages0))
> >  			;
> 
> Wouldn't it be better to recalculate zone_reclaimable_pages() each time
> around the loop?  For example, shrink_icache_memory()->prune_icache()
> will remove pagecache from an inode if it hits the tail of the list. 
> This can change the number of reclaimable pages by squigabytes, but
> this code thinks nothing changed?

Ah, I missed this. An incremental patch is here.

thank you!



From 8f7c70cfb4a25f8292a59564db6c3ff425a69b53 Mon Sep 17 00:00:00 2001
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Fri, 16 Jul 2010 08:40:01 +0900
Subject: [PATCH] vmscan: recalculate lru_pages on each shrink_slab()

Andrew Morton pointed out that shrink_slab() may change the number of
reclaimable pages (e.g. shrink_icache_memory()->prune_icache() will
remove unmapped pagecache).

So we need to recalculate lru_pages on each shrink_slab() call.
This patch fixes it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/vmscan.c |   18 ++++++++++++------
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1bf9f72..1da9b14 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2612,8 +2612,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 
 	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
 	if (nr_slab_pages0 > zone->min_slab_pages) {
-		unsigned long lru_pages = zone_reclaimable_pages(zone);
-
 		/*
 		 * shrink_slab() does not currently allow us to determine how
 		 * many pages were freed in this zone. So we take the current
@@ -2624,10 +2622,18 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		 * Note that shrink_slab will free memory on all zones and may
 		 * take a long time.
 		 */
-		while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) &&
-		       (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
-			nr_slab_pages0))
-			;
+		for (;;) {
+			unsigned long lru_pages = zone_reclaimable_pages(zone);
+
+			/* No reclaimable slab or very low memory pressure */
+			if (!shrink_slab(sc.nr_scanned, gfp_mask, lru_pages))
+				break;
+
+			/* Freed enough memory */
+			nr_slab_pages1 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
+			if (nr_slab_pages1 + nr_pages <= nr_slab_pages0)
+				break;
+		}
 
 		/*
 		 * Update nr_reclaimed by the number of slab pages we
-- 
1.6.5.2
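As a back-of-the-envelope check of why re-reading lru_pages matters,
here is a small userspace sketch reusing the same scan arithmetic as
above (all counts are assumed for illustration; the mechanism is the
prune_icache() pagecache removal Andrew describes):

#include <stdio.h>

int main(void)
{
	unsigned long scanned = 1024, seeks = 2, max_pass = 100000;
	unsigned long stale = 1000000;	/* read once, before the loop */
	unsigned long fresh = 100000;	/* real count after the first pass
					   dropped ~900000 pagecache pages */

	unsigned long long d_stale =
		(4ULL * scanned / seeks) * max_pass / (stale + 1);
	unsigned long long d_fresh =
		(4ULL * scanned / seeks) * max_pass / (fresh + 1);

	printf("scan target with stale lru_pages: %llu objects\n", d_stale);
	printf("scan target with fresh lru_pages: %llu objects\n", d_fresh);
	return 0;
}

With these numbers the stale value yields a scan target roughly 10x too
small on every later pass, so the loop keeps applying first-pass
pressure even though the zone's LRU has shrunk tenfold; re-reading
zone_reclaimable_pages() inside the loop keeps the slab pressure
proportional to the zone's current state.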




^ permalink raw reply related	[flat|nested] 62+ messages in thread


* Re: [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages,  not page order
  2010-07-16  1:39           ` KOSAKI Motohiro
@ 2010-07-16  1:44             ` Minchan Kim
  -1 siblings, 0 replies; 62+ messages in thread
From: Minchan Kim @ 2010-07-16  1:44 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, Christoph Lameter, LKML, linux-mm, Mel Gorman,
	Rik van Riel, Johannes Weiner

On Fri, Jul 16, 2010 at 10:39 AM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>> >     nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);
>> >     if (nr_slab_pages0 > zone->min_slab_pages) {
>> > +           unsigned long lru_pages = zone_reclaimable_pages(zone);
>> > +
>> >             /*
>> >              * shrink_slab() does not currently allow us to determine how
>> >              * many pages were freed in this zone. So we take the current
>> > @@ -2622,7 +2624,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>> >              * Note that shrink_slab will free memory on all zones and may
>> >              * take a long time.
>> >              */
>> > -           while (shrink_slab(sc.nr_scanned, gfp_mask, order) &&
>> > +           while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) &&
>> >                    (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages >
>> >                     nr_slab_pages0))
>> >                     ;
>>
>> Wouldn't it be better to recalculate zone_reclaimable_pages() each time
>> around the loop?  For example, shrink_icache_memory()->prune_icache()
>> will remove pagecache from an inode if it hits the tail of the list.
>> This can change the number of reclaimable pages by squigabytes, but
>> this code thinks nothing changed?
>
> Ah, I missed this. An incremental patch is here.
>
> thank you!
>
>
>
> From 8f7c70cfb4a25f8292a59564db6c3ff425a69b53 Mon Sep 17 00:00:00 2001
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Fri, 16 Jul 2010 08:40:01 +0900
> Subject: [PATCH] vmscan: recalculate lru_pages on each shrink_slab()
>
> Andrew Morton pointed out that shrink_slab() may change the number of
> reclaimable pages (e.g. shrink_icache_memory()->prune_icache() will
> remove unmapped pagecache).
>
> So we need to recalculate lru_pages on each shrink_slab() call.
> This patch fixes it.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

It does make sense.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 62+ messages in thread


end of thread, other threads:[~2010-07-16  1:44 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-08  7:38 [PATCH v2 1/2] vmscan: don't subtraction of unsined KOSAKI Motohiro
2010-07-08  7:38 ` KOSAKI Motohiro
2010-07-08  7:40 ` [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages, not page order KOSAKI Motohiro
2010-07-08  7:40   ` KOSAKI Motohiro
2010-07-08 13:23   ` Rik van Riel
2010-07-08 13:23     ` Rik van Riel
2010-07-08 14:04   ` Christoph Lameter
2010-07-08 14:04     ` Christoph Lameter
2010-07-08 20:31     ` Andrew Morton
2010-07-08 20:31       ` Andrew Morton
2010-07-08 21:01       ` Christoph Lameter
2010-07-08 21:01         ` Christoph Lameter
2010-07-09  0:46         ` KOSAKI Motohiro
2010-07-09  0:46           ` KOSAKI Motohiro
2010-07-09  8:21       ` KOSAKI Motohiro
2010-07-09  8:21         ` KOSAKI Motohiro
2010-07-09 10:13         ` [PATCH] vmscan: stop meaningless loop iteration when no reclaimable slab KOSAKI Motohiro
2010-07-09 10:13           ` KOSAKI Motohiro
2010-07-09 10:53           ` Minchan Kim
2010-07-09 10:53             ` Minchan Kim
2010-07-09 11:04             ` KOSAKI Motohiro
2010-07-09 11:04               ` KOSAKI Motohiro
2010-07-11 22:28               ` Minchan Kim
2010-07-11 22:28                 ` Minchan Kim
2010-07-13  4:48                 ` KOSAKI Motohiro
2010-07-13  4:48                   ` KOSAKI Motohiro
2010-07-13  6:33                   ` Minchan Kim
2010-07-13  6:33                     ` Minchan Kim
2010-07-09 14:02           ` Christoph Lameter
2010-07-09 14:02             ` Christoph Lameter
2010-07-13  4:59             ` KOSAKI Motohiro
2010-07-13  4:59               ` KOSAKI Motohiro
2010-07-09  8:36   ` [PATCH v2 2/2] vmscan: shrink_slab() require number of lru_pages, not page order Minchan Kim
2010-07-09  8:36     ` Minchan Kim
2010-07-09 13:54     ` Christoph Lameter
2010-07-09 13:54       ` Christoph Lameter
2010-07-13  5:41     ` KOSAKI Motohiro
2010-07-13  5:41       ` KOSAKI Motohiro
2010-07-15 19:15       ` Andrew Morton
2010-07-15 19:15         ` Andrew Morton
2010-07-16  1:39         ` KOSAKI Motohiro
2010-07-16  1:39           ` KOSAKI Motohiro
2010-07-16  1:44           ` Minchan Kim
2010-07-16  1:44             ` Minchan Kim
2010-07-08  7:41 ` [PATCH v2 1/2] vmscan: don't subtraction of unsined KOSAKI Motohiro
2010-07-08  7:41   ` KOSAKI Motohiro
2010-07-08 14:01   ` Christoph Lameter
2010-07-08 14:01     ` Christoph Lameter
2010-07-08 20:00 ` Andrew Morton
2010-07-08 20:00   ` Andrew Morton
2010-07-09  1:16   ` KOSAKI Motohiro
2010-07-09  1:16     ` KOSAKI Motohiro
2010-07-09  1:46     ` Minchan Kim
2010-07-09  1:46       ` Minchan Kim
2010-07-09 22:28     ` Andrew Morton
2010-07-09 22:28       ` Andrew Morton
2010-07-13  9:32       ` KOSAKI Motohiro
2010-07-13  9:32         ` KOSAKI Motohiro
2010-07-14  1:50         ` Christoph Lameter
2010-07-14  1:50           ` Christoph Lameter
2010-07-14  2:15           ` KOSAKI Motohiro
2010-07-14  2:15             ` KOSAKI Motohiro
