* [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
@ 2013-01-20  0:02 ` paul.szabo
  0 siblings, 0 replies; 20+ messages in thread
From: paul.szabo @ 2013-01-20  0:02 UTC (permalink / raw)
  To: linux-mm; +Cc: 695182, linux-kernel

In bdi_position_ratio(), get the difference (setpoint-dirty) right even
when it is negative. Both setpoint and dirty are unsigned long, so the
difference was zero-extended, rather than sign-extended, when converted
to s64. This issue affects all 32-bit architectures; it does not affect
64-bit architectures, where long and s64 have the same width.

In this function dirty lies between freerun and limit, and the
pseudo-float x lies in [-1,1] and is expected to be negative about half
the time. With zero-extension, instead of a small negative x we obtained
a large positive one, so bdi_position_ratio() returned garbage.
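
A minimal sketch of the effect in user-space C, not the kernel code
itself: uint32_t stands in for a 32-bit unsigned long, int64_t for s64,
plain division for div_s64(), and CALC_SHIFT is an assumed stand-in for
RATELIMIT_CALC_SHIFT. The helper names x_before() and x_after() are
hypothetical.

```c
#include <stdint.h>

/* Assumed stand-in for the kernel's RATELIMIT_CALC_SHIFT. */
#define CALC_SHIFT 10

/*
 * Pre-patch behaviour as seen on a 32-bit kernel: the subtraction and
 * the shift wrap modulo 2^32, and the unsigned result is zero-extended
 * when it is widened to 64 bits.
 */
static int64_t x_before(uint32_t setpoint, uint32_t dirty, uint32_t limit)
{
    uint32_t num = (setpoint - dirty) << CALC_SHIFT;

    return (int64_t)num / (int32_t)(limit - setpoint + 1);
}

/*
 * Post-patch behaviour: both operands are widened to a signed 64-bit
 * type first, so a negative difference stays negative.
 */
static int64_t x_after(uint32_t setpoint, uint32_t dirty, uint32_t limit)
{
    /* Multiply rather than shift: << on a negative value is undefined in ISO C. */
    int64_t num = ((int64_t)setpoint - (int64_t)dirty) * (1 << CALC_SHIFT);

    return num / (int32_t)(limit - setpoint + 1);
}
```

With setpoint=1500, dirty=1600 and limit=2000, x_before() yields 8572584
(about 8372 in fixed point) while x_after() yields -204 (about -0.2).
When dirty is below setpoint the two versions agree.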

Casting the difference to s64 also prevents overflow with left-shift;
though normally these numbers are small and I never observed a 32-bit
overflow there.
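
The shift overflow can be sketched the same way, again with uint32_t
modelling a 32-bit unsigned long and CALC_SHIFT assumed to be 10; the
helper names are hypothetical. Under these assumptions a difference of
2^22 pages or more no longer fits in 32 bits once shifted.

```c
#include <stdint.h>

/* Assumed stand-in for the kernel's RATELIMIT_CALC_SHIFT. */
#define CALC_SHIFT 10

/* Shift done in 32-bit arithmetic: the high bits are lost before widening. */
static int64_t shift_in_32bit(uint32_t diff)
{
    uint32_t shifted = diff << CALC_SHIFT;

    return (int64_t)shifted;
}

/* Widen to 64 bits first, then shift: no bits are lost. */
static int64_t shift_in_64bit(uint32_t diff)
{
    return (int64_t)diff << CALC_SHIFT;
}
```

For a difference of 5000000 pages, shift_in_32bit() returns 825032704
where shift_in_64bit() returns the intended 5120000000. Below 2^22 pages
(4194304) the two agree.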

(This patch does not solve the PAE OOM issue.)

Paul Szabo   psz@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia

Reported-by: Paul Szabo <psz@maths.usyd.edu.au>
Reference: http://bugs.debian.org/695182
Signed-off-by: Paul Szabo <psz@maths.usyd.edu.au>

--- mm/page-writeback.c.old	2012-12-06 22:20:40.000000000 +1100
+++ mm/page-writeback.c	2013-01-20 07:47:55.000000000 +1100
@@ -559,7 +559,7 @@ static unsigned long bdi_position_ratio(
 	 *     => fast response on large errors; small oscillation near setpoint
 	 */
 	setpoint = (freerun + limit) / 2;
-	x = div_s64((setpoint - dirty) << RATELIMIT_CALC_SHIFT,
+	x = div_s64(((s64)setpoint - (s64)dirty) << RATELIMIT_CALC_SHIFT,
 		    limit - setpoint + 1);
 	pos_ratio = x;
 	pos_ratio = pos_ratio * x >> RATELIMIT_CALC_SHIFT;

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-20  0:02 ` paul.szabo
@ 2013-01-22 23:54   ` Jan Kara
  -1 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2013-01-22 23:54 UTC (permalink / raw)
  To: paul.szabo; +Cc: linux-mm, 695182, linux-kernel, Wu Fengguang

On Sun 20-01-13 11:02:10, paul.szabo@sydney.edu.au wrote:
> In bdi_position_ratio(), get difference (setpoint-dirty) right even when
> negative. Both setpoint and dirty are unsigned long, the difference was
> zero-padded thus wrongly sign-extended to s64. This issue affects all
> 32-bit architectures, does not affect 64-bit architectures where long
> and s64 are equivalent.
> 
> In this function, dirty is between freerun and limit, the pseudo-float x
> is between [-1,1], expected to be negative about half the time. With
> zero-padding, instead of a small negative x we obtained a large positive
> one so bdi_position_ratio() returned garbage.
> 
> Casting the difference to s64 also prevents overflow with left-shift;
> though normally these numbers are small and I never observed a 32-bit
> overflow there.
> 
> (This patch does not solve the PAE OOM issue.)
> 
> Paul Szabo   psz@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
> School of Mathematics and Statistics   University of Sydney    Australia
> 
> Reported-by: Paul Szabo <psz@maths.usyd.edu.au>
> Reference: http://bugs.debian.org/695182
> Signed-off-by: Paul Szabo <psz@maths.usyd.edu.au>
  Ah, good catch. Thanks for the patch. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>

  I've also added CC to writeback maintainer.

								Honza

> 
> --- mm/page-writeback.c.old	2012-12-06 22:20:40.000000000 +1100
> +++ mm/page-writeback.c	2013-01-20 07:47:55.000000000 +1100
> @@ -559,7 +559,7 @@ static unsigned long bdi_position_ratio(
>  	 *     => fast response on large errors; small oscillation near setpoint
>  	 */
>  	setpoint = (freerun + limit) / 2;
> -	x = div_s64((setpoint - dirty) << RATELIMIT_CALC_SHIFT,
> +	x = div_s64(((s64)setpoint - (s64)dirty) << RATELIMIT_CALC_SHIFT,
>  		    limit - setpoint + 1);
>  	pos_ratio = x;
>  	pos_ratio = pos_ratio * x >> RATELIMIT_CALC_SHIFT;
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-22 23:54   ` Jan Kara
@ 2013-01-24 14:14     ` Fengguang Wu
  -1 siblings, 0 replies; 20+ messages in thread
From: Fengguang Wu @ 2013-01-24 14:14 UTC (permalink / raw)
  To: Jan Kara; +Cc: paul.szabo, linux-mm, 695182, linux-kernel

On Wed, Jan 23, 2013 at 12:54:38AM +0100, Jan Kara wrote:
> On Sun 20-01-13 11:02:10, paul.szabo@sydney.edu.au wrote:
> > In bdi_position_ratio(), get difference (setpoint-dirty) right even when
> > negative. Both setpoint and dirty are unsigned long, the difference was
> > zero-padded thus wrongly sign-extended to s64. This issue affects all
> > 32-bit architectures, does not affect 64-bit architectures where long
> > and s64 are equivalent.
> > 
> > In this function, dirty is between freerun and limit, the pseudo-float x
> > is between [-1,1], expected to be negative about half the time. With
> > zero-padding, instead of a small negative x we obtained a large positive
> > one so bdi_position_ratio() returned garbage.
> > 
> > Casting the difference to s64 also prevents overflow with left-shift;
> > though normally these numbers are small and I never observed a 32-bit
> > overflow there.
> > 
> > (This patch does not solve the PAE OOM issue.)
> > 
> > Paul Szabo   psz@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
> > School of Mathematics and Statistics   University of Sydney    Australia
> > 
> > Reported-by: Paul Szabo <psz@maths.usyd.edu.au>
> > Reference: http://bugs.debian.org/695182
> > Signed-off-by: Paul Szabo <psz@maths.usyd.edu.au>
>   Ah, good catch. Thanks for the patch. You can add:
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
>   I've also added CC to writeback maintainer.

Applied. Thanks! It's a good fix.

Thanks,
Fengguang

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-20  0:02 ` paul.szabo
@ 2013-01-24 14:57   ` Fengguang Wu
  -1 siblings, 0 replies; 20+ messages in thread
From: Fengguang Wu @ 2013-01-24 14:57 UTC (permalink / raw)
  To: paul.szabo; +Cc: linux-mm, 695182, linux-kernel, Andrew Morton, Jan Kara

Hi Paul,

> (This patch does not solve the PAE OOM issue.)

You may try the below debug patch. The only way the writeback patches
should trigger OOM, I think, is for the number of dirty/writeback
pages going out of control.

Or, more simply, you may show us the OOM dmesg, which will contain the
number of dirty pages. Or run this in a continuous loop during your
tests and see how the dirty numbers change before the OOM:

while :
do
        grep -E '(Dirty|Writeback)' /proc/meminfo
        sleep 1
done

Thanks,
Fengguang

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 50f0824..cf1165a 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1147,6 +1147,16 @@ pause:
 		if (task_ratelimit)
 			break;
 
+		if (nr_dirty > dirty_thresh + dirty_thresh / 2) {
+			if (printk_ratelimit())
+				printk(KERN_WARNING "nr_dirty=%lu dirty_thresh=%lu task_ratelimit=%lu dirty_ratelimit=%lu pos_ratio=%lu\n",
+				       nr_dirty,
+				       dirty_thresh,
+				       task_ratelimit,
+				       dirty_ratelimit,
+				       pos_ratio);
+		}
+
 		/*
 		 * In the case of an unresponding NFS server and the NFS dirty
 		 * pages exceeds dirty_thresh, give the other good bdi's a pipe

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-24 14:57   ` Fengguang Wu
@ 2013-01-24 15:16     ` Jan Kara
  -1 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2013-01-24 15:16 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: paul.szabo, linux-mm, 695182, linux-kernel, Andrew Morton, Jan Kara

On Thu 24-01-13 22:57:07, Wu Fengguang wrote:
> Hi Paul,
> 
> > (This patch does not solve the PAE OOM issue.)
> 
> You may try the below debug patch. The only way the writeback patches
> should trigger OOM, I think, is for the number of dirty/writeback
> pages going out of control.
> 
> Or more simple, you may show us the OOM dmesg which will contain the
> number of dirty pages. Or run this in a continuous loop during your
> tests, and see how the dirty numbers change before OOM:
  I think he found that the culprit was min_free_kbytes not being properly
reflected in the dirty throttling. But that patch has already been picked
up by Andrew, so I didn't forward it to you. Paul, please correct me if
I'm wrong.

								Honza

> 
> while :
> do
>         grep -E '(Dirty|Writeback)' /proc/meminfo
>         sleep 1
> done
> 
> Thanks,
> Fengguang
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 50f0824..cf1165a 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1147,6 +1147,16 @@ pause:
>  		if (task_ratelimit)
>  			break;
>  
> +		if (nr_dirty > dirty_thresh + dirty_thresh / 2) {
> +			if (printk_ratelimit())
> +				printk(KERN_WARNING "nr_dirty=%lu dirty_thresh=%lu task_ratelimit=%lu dirty_ratelimit=%lu pos_ratio=%lu\n",
> +				       nr_dirty,
> +				       dirty_thresh,
> +				       task_ratelimit,
> +				       dirty_ratelimit,
> +				       pos_ratio);
> +		}
> +
>  		/*
>  		 * In the case of an unresponding NFS server and the NFS dirty
>  		 * pages exceeds dirty_thresh, give the other good bdi's a pipe
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-24 14:57   ` Fengguang Wu
@ 2013-01-24 23:43     ` paul.szabo
  -1 siblings, 0 replies; 20+ messages in thread
From: paul.szabo @ 2013-01-24 23:43 UTC (permalink / raw)
  To: fengguang.wu; +Cc: 695182, akpm, jack, linux-kernel, linux-mm

Dear Fengguang,

> Or more simple, you may show us the OOM dmesg which will contain the
> number of dirty pages. ...

Do you mean kern.log lines like:

[  744.754199] bash invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0, oom_score_adj=0
[  744.754202] bash cpuset=/ mems_allowed=0
[  744.754204] Pid: 3836, comm: bash Not tainted 3.2.0-4-686-pae #1 Debian 3.2.32-1
...
[  744.754354] active_anon:13497 inactive_anon:129 isolated_anon:0
[  744.754354]  active_file:2664 inactive_file:4144756 isolated_file:0
[  744.754355]  unevictable:0 dirty:510 writeback:0 unstable:0
[  744.754356]  free:11867217 slab_reclaimable:68289 slab_unreclaimable:7204
[  744.754356]  mapped:8066 shmem:250 pagetables:519 bounce:0
[  744.754361] DMA free:4260kB min:784kB low:980kB high:1176kB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15784kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:11628kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:499 all_unreclaimable? yes
[  744.754364] lowmem_reserve[]: 0 867 62932 62932
[  744.754369] Normal free:43788kB min:44112kB low:55140kB high:66168kB active_anon:0kB inactive_anon:0kB active_file:912kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:887976kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:261528kB slab_unreclaimable:28812kB kernel_stack:3096kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:16060 all_unreclaimable? yes
[  744.754372] lowmem_reserve[]: 0 0 496525 496525
[  744.754377] HighMem free:47420820kB min:512kB low:789888kB high:1579264kB active_anon:53988kB inactive_anon:516kB active_file:9740kB inactive_file:16579320kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:63555300kB mlocked:0kB dirty:2040kB writeback:0kB mapped:32260kB shmem:1000kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:2076kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  744.754380] lowmem_reserve[]: 0 0 0 0
[  744.754381] DMA: 445*4kB 36*8kB 3*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 4260kB
[  744.754386] Normal: 1132*4kB 620*8kB 237*16kB 70*32kB 38*64kB 26*128kB 20*256kB 14*512kB 4*1024kB 3*2048kB 0*4096kB = 43808kB
[  744.754390] HighMem: 226*4kB 242*8kB 155*16kB 66*32kB 10*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 2*2048kB 11574*4096kB = 47420680kB
[  744.754395] 4148173 total pagecache pages
[  744.754396] 0 pages in swap cache
[  744.754397] Swap cache stats: add 0, delete 0, find 0/0
[  744.754397] Free swap  = 0kB
[  744.754398] Total swap = 0kB
[  744.900649] 16777200 pages RAM
[  744.900650] 16549378 pages HighMem
[  744.900651] 664304 pages reserved
[  744.900652] 4162276 pages shared
[  744.900653] 104263 pages non-shared

? (The above and similar were reported to http://bugs.debian.org/695182 .)
Do you want me to log and report something else?

I believe the above crash may be provoked simply by running:
  n=0; while [ $n -lt 99 ]; do dd bs=1M count=1024 if=/dev/zero of=x$n; (( n = $n + 1 )); done &
on any PAE machine with over 32GB RAM. Oddly the problem does not seem
to occur when using mem=32g or lower on the kernel boot line (or on
machines with less than 32GB RAM).

Cheers, Paul

Paul Szabo   psz@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-24 15:16     ` Jan Kara
@ 2013-01-25  0:15       ` paul.szabo
  -1 siblings, 0 replies; 20+ messages in thread
From: paul.szabo @ 2013-01-25  0:15 UTC (permalink / raw)
  To: fengguang.wu, jack; +Cc: 695182, akpm, linux-kernel, linux-mm

Dear Jan,

> I think he found the culprit of the problem being min_free_kbytes was not
> properly reflected in the dirty throttling. ... Paul please correct me
> if I'm wrong.

Sorry, but I have to correct you.

I noticed and patched/corrected two problems, one with (setpoint-dirty)
in bdi_position_ratio(), another with min_free_kbytes not subtracted
from dirtyable memory. Fixing those problems, singly or in combination,
did not help in avoiding OOM: running
  n=0; while [ $n -lt 99 ]; do dd bs=1M count=1024 if=/dev/zero of=x$n; ((n=$n+1)); done
still produces an OOM after a few files written (on a PAE machine with
over 32GB RAM).

Also, a quite similar OOM may be produced on any PAE machine with
  n=0; while [ $n -lt 33000 ]; do sleep 600 & ((n=n+1)); done
This was tested on machines with as low as just 3GB RAM ... and
curiously the same machine with "plain" (not PAE but HIGHMEM4G)
kernel handles the same "sleep test" without any problems.

(Thus I now think that the remaining bug is not with writeback.)

Cheers, Paul

Paul Szabo   psz@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-24 23:43     ` paul.szabo
@ 2013-01-25  0:55       ` Fengguang Wu
  -1 siblings, 0 replies; 20+ messages in thread
From: Fengguang Wu @ 2013-01-25  0:55 UTC (permalink / raw)
  To: paul.szabo; +Cc: 695182, akpm, jack, linux-kernel, linux-mm

On Fri, Jan 25, 2013 at 10:43:45AM +1100, paul.szabo@sydney.edu.au wrote:
> Dear Fengguang,
> 
> > Or more simple, you may show us the OOM dmesg which will contain the
> > number of dirty pages. ...
> 
> Do you mean kern.log lines like:

Yes.

> [  744.754199] bash invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0, oom_score_adj=0

It's a 2-page (order=1) allocation in the Normal zone.

> [  744.754202] bash cpuset=/ mems_allowed=0
> [  744.754204] Pid: 3836, comm: bash Not tainted 3.2.0-4-686-pae #1 Debian 3.2.32-1
> ...
> [  744.754354] active_anon:13497 inactive_anon:129 isolated_anon:0
> [  744.754354]  active_file:2664 inactive_file:4144756 isolated_file:0
> [  744.754355]  unevictable:0 dirty:510 writeback:0 unstable:0

Almost no dirty/writeback pages.

> [  744.754356]  free:11867217 slab_reclaimable:68289 slab_unreclaimable:7204
> [  744.754356]  mapped:8066 shmem:250 pagetables:519 bounce:0
> [  744.754361] DMA free:4260kB min:784kB low:980kB high:1176kB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15784kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:11628kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:499 all_unreclaimable? yes
> [  744.754364] lowmem_reserve[]: 0 867 62932 62932
> [  744.754369] Normal free:43788kB min:44112kB low:55140kB high:66168kB active_anon:0kB inactive_anon:0kB active_file:912kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:887976kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:261528kB slab_unreclaimable:28812kB kernel_stack:3096kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:16060 all_unreclaimable? yes

There are 260MB reclaimable slab pages in the normal zone, however we
somehow failed to reclaim them. What's your filesystem and the content
of /proc/slabinfo?

> [  744.754372] lowmem_reserve[]: 0 0 496525 496525
> [  744.754377] HighMem free:47420820kB min:512kB low:789888kB high:1579264kB active_anon:53988kB inactive_anon:516kB active_file:9740kB inactive_file:16579320kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:63555300kB mlocked:0kB dirty:2040kB writeback:0kB mapped:32260kB shmem:1000kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:2076kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no

There are plenty of free and inactive file pages in the HighMem zone.

Thanks,
Fengguang

> [  744.754380] lowmem_reserve[]: 0 0 0 0
> [  744.754381] DMA: 445*4kB 36*8kB 3*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 4260kB
> [  744.754386] Normal: 1132*4kB 620*8kB 237*16kB 70*32kB 38*64kB 26*128kB 20*256kB 14*512kB 4*1024kB 3*2048kB 0*4096kB = 43808kB
> [  744.754390] HighMem: 226*4kB 242*8kB 155*16kB 66*32kB 10*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 2*2048kB 11574*4096kB = 47420680kB
> [  744.754395] 4148173 total pagecache pages
> [  744.754396] 0 pages in swap cache
> [  744.754397] Swap cache stats: add 0, delete 0, find 0/0
> [  744.754397] Free swap  = 0kB
> [  744.754398] Total swap = 0kB
> [  744.900649] 16777200 pages RAM
> [  744.900650] 16549378 pages HighMem
> [  744.900651] 664304 pages reserved
> [  744.900652] 4162276 pages shared
> [  744.900653] 104263 pages non-shared
> 
> ? (The above and similar were reported to http://bugs.debian.org/695182 .)
> Do you want me to log and report something else?
> 
> I believe the above crash may be provoked simply by running:
>   n=0; while [ $n -lt 99 ]; do dd bs=1M count=1024 if=/dev/zero of=x$n; (( n = $n + 1 )); done &
> on any PAE machine with over 32GB RAM. Oddly the problem does not seem
> to occur when using mem=32g or lower on the kernel boot line (or on
> machines with less than 32GB RAM).
> 
> Cheers, Paul
> 
> Paul Szabo   psz@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
> School of Mathematics and Statistics   University of Sydney    Australia
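
The arithmetic behind the comments above can be checked with a small sketch. It is only an illustration under stated assumptions: the helper names (pages_for_order, buddy_free_kb, zone_below_min) are hypothetical, the numbers are copied from the quoted Normal-zone lines, and the comparison is a simplification of the kernel's real watermark check, which also takes lowmem_reserve and the allocation order into account.

```c
#include <stddef.h>

/* Per-order free block counts for the Normal zone, copied from the quoted
 * "Normal: 1132*4kB 620*8kB ..." line (orders 0..10). */
static const unsigned long normal_nr_free[11] = {
	1132, 620, 237, 70, 38, 26, 20, 14, 4, 3, 0
};

/* Pages requested by an allocation of the given order (order=1 -> 2 pages). */
static unsigned long pages_for_order(unsigned int order)
{
	return 1UL << order;
}

/* Total free memory in kB described by a buddy free-list line:
 * a block at order o is (page_kb << o) kB in size. */
static unsigned long buddy_free_kb(const unsigned long *nr_free,
				   size_t orders, unsigned long page_kb)
{
	unsigned long total = 0;
	size_t o;

	for (o = 0; o < orders; o++)
		total += nr_free[o] * (page_kb << o);
	return total;
}

/* Simplified assumption: a zone fails the check when its free memory is at
 * or below the "min" watermark. The real kernel logic is more involved. */
static int zone_below_min(unsigned long free_kb, unsigned long min_kb)
{
	return free_kb <= min_kb;
}
```

With the quoted values, buddy_free_kb(normal_nr_free, 11, 4) gives 43808, matching the total printed on the "Normal:" line, and zone_below_min(43788, 44112) is true for the Normal zone while zone_below_min(47420820, 512) is false for HighMem. That is consistent with the observation that the Normal zone is exhausted while HighMem has plenty of free pages.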

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-25  0:55       ` Fengguang Wu
@ 2013-01-25  1:47         ` paul.szabo
  -1 siblings, 0 replies; 20+ messages in thread
From: paul.szabo @ 2013-01-25  1:47 UTC (permalink / raw)
  To: fengguang.wu; +Cc: 695182, akpm, jack, linux-kernel, linux-mm

Dear Fengguang,

> There are 260MB reclaimable slab pages in the normal zone ...

Marked "all_unreclaimable? yes": is that wrong? Question asked also in:
http://marc.info/?l=linux-mm&m=135873981326767&w=2

> ... however we somehow failed to reclaim them. ...

I made a patch that would do a drop_caches at that point, please see:
http://bugs.debian.org/695182
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=101;filename=drop_caches.patch;att=1;bug=695182
http://marc.info/?l=linux-mm&m=135785511125549&w=2
and that successfully avoided OOM when writing files.
However, the drop_caches patch did not protect against the "sleep test".

> ... What's your filesystem and the content of /proc/slabinfo?

Filesystem is EXT3. See output of slabinfo in Debian bug above or in
http://marc.info/?l=linux-mm&m=135796154427544&w=2

Thanks, Paul

Paul Szabo   psz@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia
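
For readers who want to try the effect from userspace, a rough analogue is to write to the drop_caches sysctl file. The sketch below is not the patch referenced above (that patch acts inside the kernel at the point of failure); it is a minimal illustration, and the function name write_drop_caches is hypothetical. The path is a parameter so the helper can be exercised against an ordinary file; on a real system it would be /proc/sys/vm/drop_caches and would need root.

```c
#include <stdio.h>

/* Write a drop_caches mode to the given file.
 * Documented modes: 1 = page cache, 2 = reclaimable slab objects
 * (dentries and inodes), 3 = both.
 * Returns 0 on success, -1 on an invalid mode or I/O error. */
static int write_drop_caches(const char *path, int mode)
{
	FILE *f;
	int ok;

	if (mode < 1 || mode > 3)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", mode) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A call such as write_drop_caches("/proc/sys/vm/drop_caches", 2) would ask the kernel to free reclaimable slab objects. Dirty pages are not freed by this mechanism, so running sync beforehand is the usual advice.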

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Negative (setpoint-dirty) in bdi_position_ratio()
  2013-01-25  0:55       ` Fengguang Wu
@ 2013-01-26  3:57         ` paul.szabo
  -1 siblings, 0 replies; 20+ messages in thread
From: paul.szabo @ 2013-01-26  3:57 UTC (permalink / raw)
  To: fengguang.wu; +Cc: 695182, akpm, jack, linux-kernel, linux-mm

Dear Fengguang (et al),

> There are 260MB reclaimable slab pages in the normal zone, however we
> somehow failed to reclaim them. ...

Could the problem be that without CONFIG_NUMA, zone_reclaim_mode stays
at zero, and in any case zone_reclaim() is a stub that does nothing
(see include/linux/swap.h)?

Though... there is neither CONFIG_NUMA nor /proc/sys/vm/zone_reclaim_mode
in the Ubuntu non-PAE "plain" HIGHMEM4G kernel, and still it handles the
"sleep test" just fine.

Where does reclaiming happen (or where is it meant to happen)?

Thanks, Paul

Paul Szabo   psz@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of Sydney    Australia
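
The pattern asked about above, a function that compiles to a do-nothing stub when a config option is off, can be sketched as follows. This is a self-contained illustration, not kernel code: DEMO_NUMA is a hypothetical stand-in for CONFIG_NUMA, and every demo_ name is invented for the example. Note that zone reclaim is a NUMA-specific path; the general reclaim paths (kswapd and direct reclaim in mm/vmscan.c) do not go through this stub.

```c
/* DEMO_NUMA is a hypothetical stand-in for CONFIG_NUMA. */
#ifdef DEMO_NUMA
#define demo_zone_reclaim_mode 1
/* Placeholder for a real implementation: pretend some pages were reclaimed. */
static inline int demo_zone_reclaim(unsigned long zone_free_pages,
				    unsigned int order)
{
	(void)zone_free_pages;
	(void)order;
	return 1;
}
#else
/* Option off: the mode is a constant zero and the function is an empty stub. */
#define demo_zone_reclaim_mode 0
static inline int demo_zone_reclaim(unsigned long zone_free_pages,
				    unsigned int order)
{
	(void)zone_free_pages;
	(void)order;
	return 0;	/* stub: no reclaim is attempted */
}
#endif

/* A caller shaped like an allocator slow path: skip zone reclaim entirely
 * when the mode is zero, otherwise ask for it. */
static int demo_try_zone_reclaim(unsigned long zone_free_pages,
				 unsigned int order)
{
	if (demo_zone_reclaim_mode == 0)
		return 0;
	return demo_zone_reclaim(zone_free_pages, order);
}
```

Built without DEMO_NUMA, demo_try_zone_reclaim(10947, 1) returns 0 without doing anything, which is the behaviour described in the question. Since the non-PAE kernel that passes the "sleep test" has the same stub, this alone would not explain the difference.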

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2013-01-26  3:58 UTC | newest]

Thread overview:
2013-01-20  0:02 [PATCH] Negative (setpoint-dirty) in bdi_position_ratio() paul.szabo
2013-01-22 23:54 ` Jan Kara
2013-01-24 14:14   ` Fengguang Wu
2013-01-24 14:57 ` Fengguang Wu
2013-01-24 15:16   ` Jan Kara
2013-01-25  0:15     ` paul.szabo
2013-01-24 23:43   ` paul.szabo
2013-01-25  0:55     ` Fengguang Wu
2013-01-25  1:47       ` paul.szabo
2013-01-26  3:57       ` paul.szabo