All of lore.kernel.org
* [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
@ 2011-06-30 18:32 Jan Kara
  2011-07-04  1:06 ` Wu Fengguang
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2011-06-30 18:32 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Jan Kara, Andrew Morton, Christoph Hellwig,
	Dave Chinner, Peter Zijlstra

We set bdi->dirty_exceeded (and thus the ratelimiting code starts to
call balance_dirty_pages() every 8 pages) when the per-bdi limit or the
global limit is exceeded. But the per-bdi limit also depends on the
task, so different tasks reach the limit on that bdi at different
levels of dirty pages. The result is that with the current code
bdi->dirty_exceeded ping-pongs between 1 and 0 depending on which task
just got into balance_dirty_pages().

We fix the issue by clearing bdi->dirty_exceeded only when the per-bdi
amount of dirty pages drops below the threshold (7/8 * bdi_dirty_limit)
where task limits no longer have any influence.

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Christoph Hellwig <hch@infradead.org>
CC: Dave Chinner <david@fromorbit.com>
CC: Wu Fengguang <fengguang.wu@intel.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/page-writeback.c |   29 ++++++++++++++++++++++-------
 1 files changed, 22 insertions(+), 7 deletions(-)

 This is the patch fixing dirty_exceeded logic I promised you last week.
I based it on current Linus's tree as it is a relatively direct fix so I
expect it can be somewhere in the beginning of the patch series and merged
relatively quickly. Can you please add it to your tree? Thanks.

								Honza

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 31f6988..d8b395f 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -274,12 +274,13 @@ static inline void task_dirties_fraction(struct task_struct *tsk,
  * effectively curb the growth of dirty pages. Light dirtiers with high enough
  * dirty threshold may never get throttled.
  */
+#define TASK_LIMIT_FRACTION 8
 static unsigned long task_dirty_limit(struct task_struct *tsk,
 				       unsigned long bdi_dirty)
 {
 	long numerator, denominator;
 	unsigned long dirty = bdi_dirty;
-	u64 inv = dirty >> 3;
+	u64 inv = dirty / TASK_LIMIT_FRACTION;
 
 	task_dirties_fraction(tsk, &numerator, &denominator);
 	inv *= numerator;
@@ -290,6 +291,12 @@ static unsigned long task_dirty_limit(struct task_struct *tsk,
 	return max(dirty, bdi_dirty/2);
 }
 
+/* Minimum limit for any task */
+static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
+{
+	return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
+}
+
 /*
  *
  */
@@ -483,9 +490,12 @@ static void balance_dirty_pages(struct address_space *mapping,
 	unsigned long background_thresh;
 	unsigned long dirty_thresh;
 	unsigned long bdi_thresh;
+	unsigned long task_bdi_thresh;
+	unsigned long min_task_bdi_thresh;
 	unsigned long pages_written = 0;
 	unsigned long pause = 1;
 	bool dirty_exceeded = false;
+	bool clear_dirty_exceeded = true;
 	struct backing_dev_info *bdi = mapping->backing_dev_info;
 
 	for (;;) {
@@ -512,7 +522,8 @@ static void balance_dirty_pages(struct address_space *mapping,
 			break;
 
 		bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
-		bdi_thresh = task_dirty_limit(current, bdi_thresh);
+		min_task_bdi_thresh = task_min_dirty_limit(bdi_thresh);
+		task_bdi_thresh = task_dirty_limit(current, bdi_thresh);
 
 		/*
 		 * In order to avoid the stacked BDI deadlock we need
@@ -524,7 +535,7 @@ static void balance_dirty_pages(struct address_space *mapping,
 		 * actually dirty; with m+n sitting in the percpu
 		 * deltas.
 		 */
-		if (bdi_thresh < 2*bdi_stat_error(bdi)) {
+		if (task_bdi_thresh < 2 * bdi_stat_error(bdi)) {
 			bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
 			bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
 		} else {
@@ -539,8 +550,11 @@ static void balance_dirty_pages(struct address_space *mapping,
 		 * the last resort safeguard.
 		 */
 		dirty_exceeded =
-			(bdi_nr_reclaimable + bdi_nr_writeback > bdi_thresh)
-			|| (nr_reclaimable + nr_writeback > dirty_thresh);
+		  (bdi_nr_reclaimable + bdi_nr_writeback > task_bdi_thresh)
+		  || (nr_reclaimable + nr_writeback > dirty_thresh);
+		clear_dirty_exceeded =
+		  (bdi_nr_reclaimable + bdi_nr_writeback <= min_task_bdi_thresh)
+		  && (nr_reclaimable + nr_writeback <= dirty_thresh);
 
 		if (!dirty_exceeded)
 			break;
@@ -558,7 +572,7 @@ static void balance_dirty_pages(struct address_space *mapping,
 		 * up.
 		 */
 		trace_wbc_balance_dirty_start(&wbc, bdi);
-		if (bdi_nr_reclaimable > bdi_thresh) {
+		if (bdi_nr_reclaimable > task_bdi_thresh) {
 			writeback_inodes_wb(&bdi->wb, &wbc);
 			pages_written += write_chunk - wbc.nr_to_write;
 			trace_wbc_balance_dirty_written(&wbc, bdi);
@@ -578,7 +592,8 @@ static void balance_dirty_pages(struct address_space *mapping,
 			pause = HZ / 10;
 	}
 
-	if (!dirty_exceeded && bdi->dirty_exceeded)
+	/* Clear dirty_exceeded flag only when no task can exceed the limit */
+	if (clear_dirty_exceeded && bdi->dirty_exceeded)
 		bdi->dirty_exceeded = 0;
 
 	if (writeback_in_progress(bdi))
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .


* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-06-30 18:32 [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic Jan Kara
@ 2011-07-04  1:06 ` Wu Fengguang
  2011-07-11 17:06   ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Wu Fengguang @ 2011-07-04  1:06 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, Andrew Morton, Christoph Hellwig, Dave Chinner, Peter Zijlstra

Hi Jan,

On Fri, Jul 01, 2011 at 02:32:44AM +0800, Jan Kara wrote:
> We set bdi->dirty_exceeded (and thus the ratelimiting code starts to
> call balance_dirty_pages() every 8 pages) when the per-bdi limit or the
> global limit is exceeded. But the per-bdi limit also depends on the
> task, so different tasks reach the limit on that bdi at different
> levels of dirty pages. The result is that with the current code
> bdi->dirty_exceeded ping-pongs between 1 and 0 depending on which task
> just got into balance_dirty_pages().
> 
> We fix the issue by clearing bdi->dirty_exceeded only when the per-bdi
> amount of dirty pages drops below the threshold (7/8 * bdi_dirty_limit)
> where task limits no longer have any influence.

The end result, I think, is that dirty pages are kept more tightly
under control, with the average number slightly lower than before.
This reduces the risk of throttling light dirtiers and hence makes
them more responsive. However, it does introduce more overhead by
enforcing balance_dirty_pages() calls every 8 pages.

> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: Christoph Hellwig <hch@infradead.org>
> CC: Dave Chinner <david@fromorbit.com>
> CC: Wu Fengguang <fengguang.wu@intel.com>
> CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  mm/page-writeback.c |   29 ++++++++++++++++++++++-------
>  1 files changed, 22 insertions(+), 7 deletions(-)
> 
>  This is the patch fixing dirty_exceeded logic I promised you last week.
> I based it on current Linus's tree as it is a relatively direct fix so I
> expect it can be somewhere in the beginning of the patch series and merged
> relatively quickly. Can you please add it to your tree? Thanks.

OK. I noticed that bdi_thresh is no longer used. What if we just
rename it? But I admit that the patch in its current form looks a bit
more clear in concept.

Thanks,
Fengguang

> 								Honza
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 31f6988..d8b395f 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -274,12 +274,13 @@ static inline void task_dirties_fraction(struct task_struct *tsk,
>   * effectively curb the growth of dirty pages. Light dirtiers with high enough
>   * dirty threshold may never get throttled.
>   */
> +#define TASK_LIMIT_FRACTION 8
>  static unsigned long task_dirty_limit(struct task_struct *tsk,
>  				       unsigned long bdi_dirty)
>  {
>  	long numerator, denominator;
>  	unsigned long dirty = bdi_dirty;
> -	u64 inv = dirty >> 3;
> +	u64 inv = dirty / TASK_LIMIT_FRACTION;
>  
>  	task_dirties_fraction(tsk, &numerator, &denominator);
>  	inv *= numerator;
> @@ -290,6 +291,12 @@ static unsigned long task_dirty_limit(struct task_struct *tsk,
>  	return max(dirty, bdi_dirty/2);
>  }
>  
> +/* Minimum limit for any task */
> +static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
> +{
> +	return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
> +}
> +
>  /*
>   *
>   */
> @@ -483,9 +490,12 @@ static void balance_dirty_pages(struct address_space *mapping,
>  	unsigned long background_thresh;
>  	unsigned long dirty_thresh;
>  	unsigned long bdi_thresh;
> +	unsigned long task_bdi_thresh;
> +	unsigned long min_task_bdi_thresh;
>  	unsigned long pages_written = 0;
>  	unsigned long pause = 1;
>  	bool dirty_exceeded = false;
> +	bool clear_dirty_exceeded = true;
>  	struct backing_dev_info *bdi = mapping->backing_dev_info;
>  
>  	for (;;) {
> @@ -512,7 +522,8 @@ static void balance_dirty_pages(struct address_space *mapping,
>  			break;
>  
>  		bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> -		bdi_thresh = task_dirty_limit(current, bdi_thresh);
> +		min_task_bdi_thresh = task_min_dirty_limit(bdi_thresh);
> +		task_bdi_thresh = task_dirty_limit(current, bdi_thresh);
>  
>  		/*
>  		 * In order to avoid the stacked BDI deadlock we need
> @@ -524,7 +535,7 @@ static void balance_dirty_pages(struct address_space *mapping,
>  		 * actually dirty; with m+n sitting in the percpu
>  		 * deltas.
>  		 */
> -		if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> +		if (task_bdi_thresh < 2 * bdi_stat_error(bdi)) {
>  			bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
>  			bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
>  		} else {
> @@ -539,8 +550,11 @@ static void balance_dirty_pages(struct address_space *mapping,
>  		 * the last resort safeguard.
>  		 */
>  		dirty_exceeded =
> -			(bdi_nr_reclaimable + bdi_nr_writeback > bdi_thresh)
> -			|| (nr_reclaimable + nr_writeback > dirty_thresh);
> +		  (bdi_nr_reclaimable + bdi_nr_writeback > task_bdi_thresh)
> +		  || (nr_reclaimable + nr_writeback > dirty_thresh);
> +		clear_dirty_exceeded =
> +		  (bdi_nr_reclaimable + bdi_nr_writeback <= min_task_bdi_thresh)
> +		  && (nr_reclaimable + nr_writeback <= dirty_thresh);
>  
>  		if (!dirty_exceeded)
>  			break;
> @@ -558,7 +572,7 @@ static void balance_dirty_pages(struct address_space *mapping,
>  		 * up.
>  		 */
>  		trace_wbc_balance_dirty_start(&wbc, bdi);
> -		if (bdi_nr_reclaimable > bdi_thresh) {
> +		if (bdi_nr_reclaimable > task_bdi_thresh) {
>  			writeback_inodes_wb(&bdi->wb, &wbc);
>  			pages_written += write_chunk - wbc.nr_to_write;
>  			trace_wbc_balance_dirty_written(&wbc, bdi);
> @@ -578,7 +592,8 @@ static void balance_dirty_pages(struct address_space *mapping,
>  			pause = HZ / 10;
>  	}
>  
> -	if (!dirty_exceeded && bdi->dirty_exceeded)
> +	/* Clear dirty_exceeded flag only when no task can exceed the limit */
> +	if (clear_dirty_exceeded && bdi->dirty_exceeded)
>  		bdi->dirty_exceeded = 0;
>  
>  	if (writeback_in_progress(bdi))
> -- 
> 1.7.1



* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-04  1:06 ` Wu Fengguang
@ 2011-07-11 17:06   ` Jan Kara
  2011-07-13 23:02     ` Wu Fengguang
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2011-07-11 17:06 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Jan Kara, linux-mm, Andrew Morton, Christoph Hellwig,
	Dave Chinner, Peter Zijlstra

On Mon 04-07-11 09:06:19, Wu Fengguang wrote:
> Hi Jan,
> 
> On Fri, Jul 01, 2011 at 02:32:44AM +0800, Jan Kara wrote:
> > We set bdi->dirty_exceeded (and thus the ratelimiting code starts to
> > call balance_dirty_pages() every 8 pages) when the per-bdi limit or the
> > global limit is exceeded. But the per-bdi limit also depends on the
> > task, so different tasks reach the limit on that bdi at different
> > levels of dirty pages. The result is that with the current code
> > bdi->dirty_exceeded ping-pongs between 1 and 0 depending on which task
> > just got into balance_dirty_pages().
> > 
> > We fix the issue by clearing bdi->dirty_exceeded only when the per-bdi
> > amount of dirty pages drops below the threshold (7/8 * bdi_dirty_limit)
> > where task limits no longer have any influence.
> 
> The end result, I think, is that dirty pages are kept more tightly
> under control, with the average number slightly lower than before.
> This reduces the risk of throttling light dirtiers and hence makes
> them more responsive. However, it does introduce more overhead by
> enforcing balance_dirty_pages() calls every 8 pages.
  Yes. I think this was actually the original intention when the code was
written.

> > CC: Andrew Morton <akpm@linux-foundation.org>
> > CC: Christoph Hellwig <hch@infradead.org>
> > CC: Dave Chinner <david@fromorbit.com>
> > CC: Wu Fengguang <fengguang.wu@intel.com>
> > CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  mm/page-writeback.c |   29 ++++++++++++++++++++++-------
> >  1 files changed, 22 insertions(+), 7 deletions(-)
> > 
> >  This is the patch fixing dirty_exceeded logic I promised you last week.
> > I based it on current Linus's tree as it is a relatively direct fix so I
> > expect it can be somewhere in the beginning of the patch series and merged
> > relatively quickly. Can you please add it to your tree? Thanks.
> 
> OK. I noticed that bdi_thresh is no longer used. What if we just
> rename it? But I admit that the patch in its current form looks a bit
> more clear in concept.
  You are right that bdi_thresh is now only used for computing
task_bdi_thresh and min_task_bdi_thresh. We could possibly eliminate
that one variable, but I guess the compiler will optimize it away
anyway, and I find the code more legible when written this way...

								Honza

> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index 31f6988..d8b395f 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -274,12 +274,13 @@ static inline void task_dirties_fraction(struct task_struct *tsk,
> >   * effectively curb the growth of dirty pages. Light dirtiers with high enough
> >   * dirty threshold may never get throttled.
> >   */
> > +#define TASK_LIMIT_FRACTION 8
> >  static unsigned long task_dirty_limit(struct task_struct *tsk,
> >  				       unsigned long bdi_dirty)
> >  {
> >  	long numerator, denominator;
> >  	unsigned long dirty = bdi_dirty;
> > -	u64 inv = dirty >> 3;
> > +	u64 inv = dirty / TASK_LIMIT_FRACTION;
> >  
> >  	task_dirties_fraction(tsk, &numerator, &denominator);
> >  	inv *= numerator;
> > @@ -290,6 +291,12 @@ static unsigned long task_dirty_limit(struct task_struct *tsk,
> >  	return max(dirty, bdi_dirty/2);
> >  }
> >  
> > +/* Minimum limit for any task */
> > +static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
> > +{
> > +	return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
> > +}
> > +
> >  /*
> >   *
> >   */
> > @@ -483,9 +490,12 @@ static void balance_dirty_pages(struct address_space *mapping,
> >  	unsigned long background_thresh;
> >  	unsigned long dirty_thresh;
> >  	unsigned long bdi_thresh;
> > +	unsigned long task_bdi_thresh;
> > +	unsigned long min_task_bdi_thresh;
> >  	unsigned long pages_written = 0;
> >  	unsigned long pause = 1;
> >  	bool dirty_exceeded = false;
> > +	bool clear_dirty_exceeded = true;
> >  	struct backing_dev_info *bdi = mapping->backing_dev_info;
> >  
> >  	for (;;) {
> > @@ -512,7 +522,8 @@ static void balance_dirty_pages(struct address_space *mapping,
> >  			break;
> >  
> >  		bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> > -		bdi_thresh = task_dirty_limit(current, bdi_thresh);
> > +		min_task_bdi_thresh = task_min_dirty_limit(bdi_thresh);
> > +		task_bdi_thresh = task_dirty_limit(current, bdi_thresh);
> >  
> >  		/*
> >  		 * In order to avoid the stacked BDI deadlock we need
> > @@ -524,7 +535,7 @@ static void balance_dirty_pages(struct address_space *mapping,
> >  		 * actually dirty; with m+n sitting in the percpu
> >  		 * deltas.
> >  		 */
> > -		if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> > +		if (task_bdi_thresh < 2 * bdi_stat_error(bdi)) {
> >  			bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> >  			bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
> >  		} else {
> > @@ -539,8 +550,11 @@ static void balance_dirty_pages(struct address_space *mapping,
> >  		 * the last resort safeguard.
> >  		 */
> >  		dirty_exceeded =
> > -			(bdi_nr_reclaimable + bdi_nr_writeback > bdi_thresh)
> > -			|| (nr_reclaimable + nr_writeback > dirty_thresh);
> > +		  (bdi_nr_reclaimable + bdi_nr_writeback > task_bdi_thresh)
> > +		  || (nr_reclaimable + nr_writeback > dirty_thresh);
> > +		clear_dirty_exceeded =
> > +		  (bdi_nr_reclaimable + bdi_nr_writeback <= min_task_bdi_thresh)
> > +		  && (nr_reclaimable + nr_writeback <= dirty_thresh);
> >  
> >  		if (!dirty_exceeded)
> >  			break;
> > @@ -558,7 +572,7 @@ static void balance_dirty_pages(struct address_space *mapping,
> >  		 * up.
> >  		 */
> >  		trace_wbc_balance_dirty_start(&wbc, bdi);
> > -		if (bdi_nr_reclaimable > bdi_thresh) {
> > +		if (bdi_nr_reclaimable > task_bdi_thresh) {
> >  			writeback_inodes_wb(&bdi->wb, &wbc);
> >  			pages_written += write_chunk - wbc.nr_to_write;
> >  			trace_wbc_balance_dirty_written(&wbc, bdi);
> > @@ -578,7 +592,8 @@ static void balance_dirty_pages(struct address_space *mapping,
> >  			pause = HZ / 10;
> >  	}
> >  
> > -	if (!dirty_exceeded && bdi->dirty_exceeded)
> > +	/* Clear dirty_exceeded flag only when no task can exceed the limit */
> > +	if (clear_dirty_exceeded && bdi->dirty_exceeded)
> >  		bdi->dirty_exceeded = 0;
> >  
> >  	if (writeback_in_progress(bdi))
> > -- 
> > 1.7.1
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR



* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-11 17:06   ` Jan Kara
@ 2011-07-13 23:02     ` Wu Fengguang
  2011-07-14 21:34       ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Wu Fengguang @ 2011-07-13 23:02 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, Andrew Morton, Christoph Hellwig, Dave Chinner, Peter Zijlstra

On Tue, Jul 12, 2011 at 01:06:05AM +0800, Jan Kara wrote:
> On Mon 04-07-11 09:06:19, Wu Fengguang wrote:
> > Hi Jan,
> > 
> > On Fri, Jul 01, 2011 at 02:32:44AM +0800, Jan Kara wrote:
> > > We set bdi->dirty_exceeded (and thus the ratelimiting code starts to
> > > call balance_dirty_pages() every 8 pages) when the per-bdi limit or the
> > > global limit is exceeded. But the per-bdi limit also depends on the
> > > task, so different tasks reach the limit on that bdi at different
> > > levels of dirty pages. The result is that with the current code
> > > bdi->dirty_exceeded ping-pongs between 1 and 0 depending on which task
> > > just got into balance_dirty_pages().
> > > 
> > > We fix the issue by clearing bdi->dirty_exceeded only when the per-bdi
> > > amount of dirty pages drops below the threshold (7/8 * bdi_dirty_limit)
> > > where task limits no longer have any influence.
> > 
> > The end result, I think, is that dirty pages are kept more tightly
> > under control, with the average number slightly lower than before.
> > This reduces the risk of throttling light dirtiers and hence makes
> > them more responsive. However, it does introduce more overhead by
> > enforcing balance_dirty_pages() calls every 8 pages.
>   Yes. I think this was actually the original intention when the code was
> written.

I'm still a bit nervous about the added overhead for the common case
of a single heavy dirtier.  Before the patch, that one heavy dirtier
will normally clear bdi->dirty_exceeded on leaving
balance_dirty_pages(), so it doesn't have the burden of rechecking
the limit every 8 pages.

How about this minimal change?

 static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
 {
-       return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
+       return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION / 4;
 }

A more complete version will involve changing the function name and
comment. But the basic rationales are,

- there is no serious danger of exceeding the dirty limit as long as
  there are fewer than 4 heavy dirtiers

- tasks dirtying close to 25% of the pages probably cannot be called
  light dirtiers and there is no need to protect such tasks

> > > CC: Andrew Morton <akpm@linux-foundation.org>
> > > CC: Christoph Hellwig <hch@infradead.org>
> > > CC: Dave Chinner <david@fromorbit.com>
> > > CC: Wu Fengguang <fengguang.wu@intel.com>
> > > CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > ---
> > >  mm/page-writeback.c |   29 ++++++++++++++++++++++-------
> > >  1 files changed, 22 insertions(+), 7 deletions(-)
> > > 
> > >  This is the patch fixing dirty_exceeded logic I promised you last week.
> > > I based it on current Linus's tree as it is a relatively direct fix so I
> > > expect it can be somewhere in the beginning of the patch series and merged
> > > relatively quickly. Can you please add it to your tree? Thanks.
> > 
> > OK. I noticed that bdi_thresh is no longer used. What if we just
> > rename it? But I admit that the patch in its current form looks a bit
> > more clear in concept.
>   You are right that bdi_thresh is now only used for computing
> task_bdi_thresh and min_task_bdi_thresh. We could possibly eliminate
> that one variable, but I guess the compiler will optimize it away
> anyway, and I find the code more legible when written this way...

OK.

Thanks,
Fengguang

> > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > > index 31f6988..d8b395f 100644
> > > --- a/mm/page-writeback.c
> > > +++ b/mm/page-writeback.c
> > > @@ -274,12 +274,13 @@ static inline void task_dirties_fraction(struct task_struct *tsk,
> > >   * effectively curb the growth of dirty pages. Light dirtiers with high enough
> > >   * dirty threshold may never get throttled.
> > >   */
> > > +#define TASK_LIMIT_FRACTION 8
> > >  static unsigned long task_dirty_limit(struct task_struct *tsk,
> > >  				       unsigned long bdi_dirty)
> > >  {
> > >  	long numerator, denominator;
> > >  	unsigned long dirty = bdi_dirty;
> > > -	u64 inv = dirty >> 3;
> > > +	u64 inv = dirty / TASK_LIMIT_FRACTION;
> > >  
> > >  	task_dirties_fraction(tsk, &numerator, &denominator);
> > >  	inv *= numerator;
> > > @@ -290,6 +291,12 @@ static unsigned long task_dirty_limit(struct task_struct *tsk,
> > >  	return max(dirty, bdi_dirty/2);
> > >  }
> > >  
> > > +/* Minimum limit for any task */
> > > +static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
> > > +{
> > > +	return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
> > > +}
> > > +
> > >  /*
> > >   *
> > >   */
> > > @@ -483,9 +490,12 @@ static void balance_dirty_pages(struct address_space *mapping,
> > >  	unsigned long background_thresh;
> > >  	unsigned long dirty_thresh;
> > >  	unsigned long bdi_thresh;
> > > +	unsigned long task_bdi_thresh;
> > > +	unsigned long min_task_bdi_thresh;
> > >  	unsigned long pages_written = 0;
> > >  	unsigned long pause = 1;
> > >  	bool dirty_exceeded = false;
> > > +	bool clear_dirty_exceeded = true;
> > >  	struct backing_dev_info *bdi = mapping->backing_dev_info;
> > >  
> > >  	for (;;) {
> > > @@ -512,7 +522,8 @@ static void balance_dirty_pages(struct address_space *mapping,
> > >  			break;
> > >  
> > >  		bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> > > -		bdi_thresh = task_dirty_limit(current, bdi_thresh);
> > > +		min_task_bdi_thresh = task_min_dirty_limit(bdi_thresh);
> > > +		task_bdi_thresh = task_dirty_limit(current, bdi_thresh);
> > >  
> > >  		/*
> > >  		 * In order to avoid the stacked BDI deadlock we need
> > > @@ -524,7 +535,7 @@ static void balance_dirty_pages(struct address_space *mapping,
> > >  		 * actually dirty; with m+n sitting in the percpu
> > >  		 * deltas.
> > >  		 */
> > > -		if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> > > +		if (task_bdi_thresh < 2 * bdi_stat_error(bdi)) {
> > >  			bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> > >  			bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
> > >  		} else {
> > > @@ -539,8 +550,11 @@ static void balance_dirty_pages(struct address_space *mapping,
> > >  		 * the last resort safeguard.
> > >  		 */
> > >  		dirty_exceeded =
> > > -			(bdi_nr_reclaimable + bdi_nr_writeback > bdi_thresh)
> > > -			|| (nr_reclaimable + nr_writeback > dirty_thresh);
> > > +		  (bdi_nr_reclaimable + bdi_nr_writeback > task_bdi_thresh)
> > > +		  || (nr_reclaimable + nr_writeback > dirty_thresh);
> > > +		clear_dirty_exceeded =
> > > +		  (bdi_nr_reclaimable + bdi_nr_writeback <= min_task_bdi_thresh)
> > > +		  && (nr_reclaimable + nr_writeback <= dirty_thresh);
> > >  
> > >  		if (!dirty_exceeded)
> > >  			break;
> > > @@ -558,7 +572,7 @@ static void balance_dirty_pages(struct address_space *mapping,
> > >  		 * up.
> > >  		 */
> > >  		trace_wbc_balance_dirty_start(&wbc, bdi);
> > > -		if (bdi_nr_reclaimable > bdi_thresh) {
> > > +		if (bdi_nr_reclaimable > task_bdi_thresh) {
> > >  			writeback_inodes_wb(&bdi->wb, &wbc);
> > >  			pages_written += write_chunk - wbc.nr_to_write;
> > >  			trace_wbc_balance_dirty_written(&wbc, bdi);
> > > @@ -578,7 +592,8 @@ static void balance_dirty_pages(struct address_space *mapping,
> > >  			pause = HZ / 10;
> > >  	}
> > >  
> > > -	if (!dirty_exceeded && bdi->dirty_exceeded)
> > > +	/* Clear dirty_exceeded flag only when no task can exceed the limit */
> > > +	if (clear_dirty_exceeded && bdi->dirty_exceeded)
> > >  		bdi->dirty_exceeded = 0;
> > >  
> > >  	if (writeback_in_progress(bdi))
> > > -- 
> > > 1.7.1
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR



* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-13 23:02     ` Wu Fengguang
@ 2011-07-14 21:34       ` Jan Kara
  2011-07-23  7:43         ` Wu Fengguang
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2011-07-14 21:34 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Jan Kara, linux-mm, Andrew Morton, Christoph Hellwig,
	Dave Chinner, Peter Zijlstra

On Wed 13-07-11 16:02:58, Wu Fengguang wrote:
> On Tue, Jul 12, 2011 at 01:06:05AM +0800, Jan Kara wrote:
> > On Mon 04-07-11 09:06:19, Wu Fengguang wrote:
> > > On Fri, Jul 01, 2011 at 02:32:44AM +0800, Jan Kara wrote:
> > > > We set bdi->dirty_exceeded (and thus the ratelimiting code starts to
> > > > call balance_dirty_pages() every 8 pages) when the per-bdi limit or the
> > > > global limit is exceeded. But the per-bdi limit also depends on the
> > > > task, so different tasks reach the limit on that bdi at different
> > > > levels of dirty pages. The result is that with the current code
> > > > bdi->dirty_exceeded ping-pongs between 1 and 0 depending on which task
> > > > just got into balance_dirty_pages().
> > > > 
> > > > We fix the issue by clearing bdi->dirty_exceeded only when the per-bdi
> > > > amount of dirty pages drops below the threshold (7/8 * bdi_dirty_limit)
> > > > where task limits no longer have any influence.
> > > 
> > > The end result, I think, is that dirty pages are kept more tightly
> > > under control, with the average number slightly lower than before.
> > > This reduces the risk of throttling light dirtiers and hence makes
> > > them more responsive. However, it does introduce more overhead by
> > > enforcing balance_dirty_pages() calls every 8 pages.
> >   Yes. I think this was actually the original intention when the code was
> > written.
> 
> I'm still a bit nervous about the added overhead for the common case
> of a single heavy dirtier.  Before the patch, that one heavy dirtier
> will normally clear bdi->dirty_exceeded on leaving
> balance_dirty_pages(), so it doesn't have the burden of rechecking
> the limit every 8 pages.
Yes, but if there is one heavy dirtier, task_min_dirty_limit() will
actually be equal (or almost equal) to the dirty limit of that task. So
that case won't be really different.

Where we will differ is when there are two or more dirtying tasks. Then
the original code cleared dirty_exceeded at a higher level of dirty
pages, so we stopped checking after every 8 dirtied pages sooner.
 
> How about this minimal change?
> 
>  static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
>  {
> -       return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
> +       return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION / 4;
>  }
> 
> A more complete version will involve changing the function name and
> comment. But the basic rationales are,
> 
> - there is no serious danger of exceeding the dirty limit as long as
>   there are fewer than 4 heavy dirtiers
> 
> - tasks dirtying close to 25% of the pages probably cannot be called
>   light dirtiers and there is no need to protect such tasks
  The idea is interesting. The only problem is that we don't want to
set dirty_exceeded too late, or heavy dirtiers will too easily push
light dirtiers over their limits due to ratelimiting. I did some
computations:
We normally ratelimit after 4 MB. Take a low end desktop these days. Say
1 GB of ram, 4 CPUs. So dirty limit will be ~200 MB and the area for task
differentiation ~25 MB. We enter balance_dirty_pages() after dirtying
num_cpu * ratelimit / 2 pages on average which gives 8 MB. So we should
set dirty_exceeded at latest at bdi_dirty / TASK_LIMIT_FRACTION / 2 or
task differentiation would have no effect because of ratelimiting.

So we could change the limit to something like:
bdi_dirty - min(bdi_dirty / TASK_LIMIT_FRACTION, ratelimit_pages *
num_online_cpus / 2 + bdi_dirty / TASK_LIMIT_FRACTION / 16)

But I'm not sure setups where this would make a difference are common...

								Honza

> > > > CC: Andrew Morton <akpm@linux-foundation.org>
> > > > CC: Christoph Hellwig <hch@infradead.org>
> > > > CC: Dave Chinner <david@fromorbit.com>
> > > > CC: Wu Fengguang <fengguang.wu@intel.com>
> > > > CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > ---
> > > >  mm/page-writeback.c |   29 ++++++++++++++++++++++-------
> > > >  1 files changed, 22 insertions(+), 7 deletions(-)
> > > > 
> > > >  This is the patch fixing dirty_exceeded logic I promised you last week.
> > > > I based it on current Linus's tree as it is a relatively direct fix so I
> > > > expect it can be somewhere in the beginning of the patch series and merged
> > > > relatively quickly. Can you please add it to your tree? Thanks.
> > > 
> > > OK. I noticed that bdi_thresh is no longer used. What if we just
> > > rename it? But I admit that the patch in its current form looks a bit
> > > more clear in concept.
> >   You are right bdi_thresh is only used for computing task_bdi_thresh and
> > min_task_bdi_thresh now. We could possibly eliminate that one variable but
> > I guess compiler will optimize it away anyway and I find the code more
> > legible when written this way...
> 
> OK.
> 
> Thanks,
> Fengguang
> 
> > > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > > > index 31f6988..d8b395f 100644
> > > > --- a/mm/page-writeback.c
> > > > +++ b/mm/page-writeback.c
> > > > @@ -274,12 +274,13 @@ static inline void task_dirties_fraction(struct task_struct *tsk,
> > > >   * effectively curb the growth of dirty pages. Light dirtiers with high enough
> > > >   * dirty threshold may never get throttled.
> > > >   */
> > > > +#define TASK_LIMIT_FRACTION 8
> > > >  static unsigned long task_dirty_limit(struct task_struct *tsk,
> > > >  				       unsigned long bdi_dirty)
> > > >  {
> > > >  	long numerator, denominator;
> > > >  	unsigned long dirty = bdi_dirty;
> > > > -	u64 inv = dirty >> 3;
> > > > +	u64 inv = dirty / TASK_LIMIT_FRACTION;
> > > >  
> > > >  	task_dirties_fraction(tsk, &numerator, &denominator);
> > > >  	inv *= numerator;
> > > > @@ -290,6 +291,12 @@ static unsigned long task_dirty_limit(struct task_struct *tsk,
> > > >  	return max(dirty, bdi_dirty/2);
> > > >  }
> > > >  
> > > > +/* Minimum limit for any task */
> > > > +static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
> > > > +{
> > > > +	return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
> > > > +}
> > > > +
> > > >  /*
> > > >   *
> > > >   */
> > > > @@ -483,9 +490,12 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > >  	unsigned long background_thresh;
> > > >  	unsigned long dirty_thresh;
> > > >  	unsigned long bdi_thresh;
> > > > +	unsigned long task_bdi_thresh;
> > > > +	unsigned long min_task_bdi_thresh;
> > > >  	unsigned long pages_written = 0;
> > > >  	unsigned long pause = 1;
> > > >  	bool dirty_exceeded = false;
> > > > +	bool clear_dirty_exceeded = true;
> > > >  	struct backing_dev_info *bdi = mapping->backing_dev_info;
> > > >  
> > > >  	for (;;) {
> > > > @@ -512,7 +522,8 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > >  			break;
> > > >  
> > > >  		bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> > > > -		bdi_thresh = task_dirty_limit(current, bdi_thresh);
> > > > +		min_task_bdi_thresh = task_min_dirty_limit(bdi_thresh);
> > > > +		task_bdi_thresh = task_dirty_limit(current, bdi_thresh);
> > > >  
> > > >  		/*
> > > >  		 * In order to avoid the stacked BDI deadlock we need
> > > > @@ -524,7 +535,7 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > >  		 * actually dirty; with m+n sitting in the percpu
> > > >  		 * deltas.
> > > >  		 */
> > > > -		if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> > > > +		if (task_bdi_thresh < 2 * bdi_stat_error(bdi)) {
> > > >  			bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> > > >  			bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
> > > >  		} else {
> > > > @@ -539,8 +550,11 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > >  		 * the last resort safeguard.
> > > >  		 */
> > > >  		dirty_exceeded =
> > > > -			(bdi_nr_reclaimable + bdi_nr_writeback > bdi_thresh)
> > > > -			|| (nr_reclaimable + nr_writeback > dirty_thresh);
> > > > +		  (bdi_nr_reclaimable + bdi_nr_writeback > task_bdi_thresh)
> > > > +		  || (nr_reclaimable + nr_writeback > dirty_thresh);
> > > > +		clear_dirty_exceeded =
> > > > +		  (bdi_nr_reclaimable + bdi_nr_writeback <= min_task_bdi_thresh)
> > > > +		  && (nr_reclaimable + nr_writeback <= dirty_thresh);
> > > >  
> > > >  		if (!dirty_exceeded)
> > > >  			break;
> > > > @@ -558,7 +572,7 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > >  		 * up.
> > > >  		 */
> > > >  		trace_wbc_balance_dirty_start(&wbc, bdi);
> > > > -		if (bdi_nr_reclaimable > bdi_thresh) {
> > > > +		if (bdi_nr_reclaimable > task_bdi_thresh) {
> > > >  			writeback_inodes_wb(&bdi->wb, &wbc);
> > > >  			pages_written += write_chunk - wbc.nr_to_write;
> > > >  			trace_wbc_balance_dirty_written(&wbc, bdi);
> > > > @@ -578,7 +592,8 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > >  			pause = HZ / 10;
> > > >  	}
> > > >  
> > > > -	if (!dirty_exceeded && bdi->dirty_exceeded)
> > > > +	/* Clear dirty_exceeded flag only when no task can exceed the limit */
> > > > +	if (clear_dirty_exceeded && bdi->dirty_exceeded)
> > > >  		bdi->dirty_exceeded = 0;
> > > >  
> > > >  	if (writeback_in_progress(bdi))
> > > > -- 
> > > > 1.7.1
> > -- 
> > Jan Kara <jack@suse.cz>
> > SUSE Labs, CR
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href="mailto:dont@kvack.org">email@kvack.org</a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-14 21:34       ` Jan Kara
@ 2011-07-23  7:43         ` Wu Fengguang
  2011-07-25 16:04           ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Wu Fengguang @ 2011-07-23  7:43 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, Andrew Morton, Christoph Hellwig, Dave Chinner, Peter Zijlstra

On Fri, Jul 15, 2011 at 05:34:09AM +0800, Jan Kara wrote:
> On Wed 13-07-11 16:02:58, Wu Fengguang wrote:
> > On Tue, Jul 12, 2011 at 01:06:05AM +0800, Jan Kara wrote:
> > > On Mon 04-07-11 09:06:19, Wu Fengguang wrote:
> > > > On Fri, Jul 01, 2011 at 02:32:44AM +0800, Jan Kara wrote:
> > > > > We set bdi->dirty_exceeded (and thus ratelimiting code starts to
> > > > > call balance_dirty_pages() every 8 pages) when a per-bdi limit is
> > > > > exceeded or global limit is exceeded. But per-bdi limit also depends
> > > > > on the task. Thus different tasks reach the limit on that bdi at
> > > > > different levels of dirty pages. The result is that with current code
> > > > > bdi->dirty_exceeded ping-ponged between 1 and 0 depending on which task
> > > > > just got into balance_dirty_pages().
> > > > > 
> > > > > We fix the issue by clearing bdi->dirty_exceeded only when per-bdi amount
> > > > > of dirty pages drops below the threshold (7/8 * bdi_dirty_limit) where task
> > > > > limits already do not have any influence.
> > > > 
> > > > The end result, I think, is that the dirty pages are kept more tightly
> > > > under control, with the average number slightly lower than before.
> > > > This reduces the risk of throttling light dirtiers and hence makes the
> > > > system more responsive. However it does introduce more overhead by
> > > > enforcing balance_dirty_pages() calls on every 8 pages.
> > >   Yes. I think this was actually the original intention when the code was
> > > written.
> > 
> > I'm still a bit nervous about the added overhead for the common case of
> > one heavy dirtier.  Before the patch, that single heavy dirtier will normally
> > clear bdi->dirty_exceeded on leaving balance_dirty_pages(), so it doesn't
> > have the burden of rechecking the limit on every 8 pages.
> Yes, but if there is 1 heavy dirtier, task_min_dirty_limit() will
> actually be equal (or almost equal) to the dirty limit of that task. So that
> case won't really be different.

Good point! Sorry I didn't notice it.

> Where we will differ is when there are two or more dirtying tasks.
> Then the original code cleared dirty_exceeded already at a higher level of
> dirty pages, so we stopped checking after every 8 dirtied pages earlier.
>  
> > How about this minimal change?
> > 
> >  static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
> >  {
> > -       return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
> > +       return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION / 4;
> >  }
> > 
> > A more complete version will involve changing the function name and
> > comment. But the basic rationales are,
> > 
> > - there is no serious danger of dirty exceeding as long as there are
> >   fewer than 4 heavy dirtiers
> > 
> > - tasks dirtying close to 25% pages probably cannot be called a light
> >   dirtier and there is no need to protect such tasks
>   The idea is interesting. The only problem is that we don't want to set
> dirty_exceeded too late so that heavy dirtiers won't push light dirtiers
> over their limits so easily due to ratelimiting. I did some computations:
> We normally ratelimit after 4 MB. Take a low end desktop these days. Say
> 1 GB of RAM, 4 CPUs. So the dirty limit will be ~200 MB and the area for task
> differentiation ~25 MB. We enter balance_dirty_pages() after dirtying
> num_cpu * ratelimit / 2 pages on average which gives 8 MB. So we should
> set dirty_exceeded at the latest at bdi_dirty / TASK_LIMIT_FRACTION / 2 or
> task differentiation would have no effect because of ratelimiting.
> 
> So we could change the limit to something like:
> bdi_dirty - min(bdi_dirty / TASK_LIMIT_FRACTION, ratelimit_pages *
> num_online_cpus / 2 + bdi_dirty / TASK_LIMIT_FRACTION / 16)

Good analysis!

> But I'm not sure setups where this would make difference are common...

I think I'd prefer the original simple patch given that the common
1-dirtier is not impacted.

Thanks,
Fengguang

> > > > > CC: Andrew Morton <akpm@linux-foundation.org>
> > > > > CC: Christoph Hellwig <hch@infradead.org>
> > > > > CC: Dave Chinner <david@fromorbit.com>
> > > > > CC: Wu Fengguang <fengguang.wu@intel.com>
> > > > > CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > > ---
> > > > >  mm/page-writeback.c |   29 ++++++++++++++++++++++-------
> > > > >  1 files changed, 22 insertions(+), 7 deletions(-)
> > > > > 
> > > > >  This is the patch fixing dirty_exceeded logic I promised you last week.
> > > > > I based it on current Linus's tree as it is a relatively direct fix so I
> > > > > expect it can be somewhere in the beginning of the patch series and merged
> > > > > relatively quickly. Can you please add it to your tree? Thanks.
> > > > 
> > > > OK. I noticed that bdi_thresh is no longer used. What if we just
> > > > rename it? But I admit that the patch in its current form looks a bit
> > > > more clear in concept.
> > >   You are right bdi_thresh is only used for computing task_bdi_thresh and
> > > min_task_bdi_thresh now. We could possibly eliminate that one variable but
> > > I guess compiler will optimize it away anyway and I find the code more
> > > legible when written this way...
> > 
> > OK.
> > 
> > Thanks,
> > Fengguang
> > 
> > > > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > > > > index 31f6988..d8b395f 100644
> > > > > --- a/mm/page-writeback.c
> > > > > +++ b/mm/page-writeback.c
> > > > > @@ -274,12 +274,13 @@ static inline void task_dirties_fraction(struct task_struct *tsk,
> > > > >   * effectively curb the growth of dirty pages. Light dirtiers with high enough
> > > > >   * dirty threshold may never get throttled.
> > > > >   */
> > > > > +#define TASK_LIMIT_FRACTION 8
> > > > >  static unsigned long task_dirty_limit(struct task_struct *tsk,
> > > > >  				       unsigned long bdi_dirty)
> > > > >  {
> > > > >  	long numerator, denominator;
> > > > >  	unsigned long dirty = bdi_dirty;
> > > > > -	u64 inv = dirty >> 3;
> > > > > +	u64 inv = dirty / TASK_LIMIT_FRACTION;
> > > > >  
> > > > >  	task_dirties_fraction(tsk, &numerator, &denominator);
> > > > >  	inv *= numerator;
> > > > > @@ -290,6 +291,12 @@ static unsigned long task_dirty_limit(struct task_struct *tsk,
> > > > >  	return max(dirty, bdi_dirty/2);
> > > > >  }
> > > > >  
> > > > > +/* Minimum limit for any task */
> > > > > +static unsigned long task_min_dirty_limit(unsigned long bdi_dirty)
> > > > > +{
> > > > > +	return bdi_dirty - bdi_dirty / TASK_LIMIT_FRACTION;
> > > > > +}
> > > > > +
> > > > >  /*
> > > > >   *
> > > > >   */
> > > > > @@ -483,9 +490,12 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > > >  	unsigned long background_thresh;
> > > > >  	unsigned long dirty_thresh;
> > > > >  	unsigned long bdi_thresh;
> > > > > +	unsigned long task_bdi_thresh;
> > > > > +	unsigned long min_task_bdi_thresh;
> > > > >  	unsigned long pages_written = 0;
> > > > >  	unsigned long pause = 1;
> > > > >  	bool dirty_exceeded = false;
> > > > > +	bool clear_dirty_exceeded = true;
> > > > >  	struct backing_dev_info *bdi = mapping->backing_dev_info;
> > > > >  
> > > > >  	for (;;) {
> > > > > @@ -512,7 +522,8 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > > >  			break;
> > > > >  
> > > > >  		bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> > > > > -		bdi_thresh = task_dirty_limit(current, bdi_thresh);
> > > > > +		min_task_bdi_thresh = task_min_dirty_limit(bdi_thresh);
> > > > > +		task_bdi_thresh = task_dirty_limit(current, bdi_thresh);
> > > > >  
> > > > >  		/*
> > > > >  		 * In order to avoid the stacked BDI deadlock we need
> > > > > @@ -524,7 +535,7 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > > >  		 * actually dirty; with m+n sitting in the percpu
> > > > >  		 * deltas.
> > > > >  		 */
> > > > > -		if (bdi_thresh < 2*bdi_stat_error(bdi)) {
> > > > > +		if (task_bdi_thresh < 2 * bdi_stat_error(bdi)) {
> > > > >  			bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> > > > >  			bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
> > > > >  		} else {
> > > > > @@ -539,8 +550,11 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > > >  		 * the last resort safeguard.
> > > > >  		 */
> > > > >  		dirty_exceeded =
> > > > > -			(bdi_nr_reclaimable + bdi_nr_writeback > bdi_thresh)
> > > > > -			|| (nr_reclaimable + nr_writeback > dirty_thresh);
> > > > > +		  (bdi_nr_reclaimable + bdi_nr_writeback > task_bdi_thresh)
> > > > > +		  || (nr_reclaimable + nr_writeback > dirty_thresh);
> > > > > +		clear_dirty_exceeded =
> > > > > +		  (bdi_nr_reclaimable + bdi_nr_writeback <= min_task_bdi_thresh)
> > > > > +		  && (nr_reclaimable + nr_writeback <= dirty_thresh);
> > > > >  
> > > > >  		if (!dirty_exceeded)
> > > > >  			break;
> > > > > @@ -558,7 +572,7 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > > >  		 * up.
> > > > >  		 */
> > > > >  		trace_wbc_balance_dirty_start(&wbc, bdi);
> > > > > -		if (bdi_nr_reclaimable > bdi_thresh) {
> > > > > +		if (bdi_nr_reclaimable > task_bdi_thresh) {
> > > > >  			writeback_inodes_wb(&bdi->wb, &wbc);
> > > > >  			pages_written += write_chunk - wbc.nr_to_write;
> > > > >  			trace_wbc_balance_dirty_written(&wbc, bdi);
> > > > > @@ -578,7 +592,8 @@ static void balance_dirty_pages(struct address_space *mapping,
> > > > >  			pause = HZ / 10;
> > > > >  	}
> > > > >  
> > > > > -	if (!dirty_exceeded && bdi->dirty_exceeded)
> > > > > +	/* Clear dirty_exceeded flag only when no task can exceed the limit */
> > > > > +	if (clear_dirty_exceeded && bdi->dirty_exceeded)
> > > > >  		bdi->dirty_exceeded = 0;
> > > > >  
> > > > >  	if (writeback_in_progress(bdi))
> > > > > -- 
> > > > > 1.7.1
> > > -- 
> > > Jan Kara <jack@suse.cz>
> > > SUSE Labs, CR
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR


* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-23  7:43         ` Wu Fengguang
@ 2011-07-25 16:04           ` Jan Kara
  2011-07-26  4:13             ` Wu Fengguang
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2011-07-25 16:04 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Jan Kara, linux-mm, Andrew Morton, Christoph Hellwig,
	Dave Chinner, Peter Zijlstra

On Sat 23-07-11 15:43:45, Wu Fengguang wrote:
> On Fri, Jul 15, 2011 at 05:34:09AM +0800, Jan Kara wrote:
> > > - tasks dirtying close to 25% pages probably cannot be called light
> > >   dirtier and there is no need to protect such tasks
> >   The idea is interesting. The only problem is that we don't want to set
> > dirty_exceeded too late so that heavy dirtiers won't push light dirtiers
> > over their limits so easily due to ratelimiting. I did some computations:
> > We normally ratelimit after 4 MB. Take a low end desktop these days. Say
> > 1 GB of ram, 4 CPUs. So dirty limit will be ~200 MB and the area for task
> > differentiation ~25 MB. We enter balance_dirty_pages() after dirtying
> > num_cpu * ratelimit / 2 pages on average which gives 8 MB. So we should
> > set dirty_exceeded at latest at bdi_dirty / TASK_LIMIT_FRACTION / 2 or
> > task differentiation would have no effect because of ratelimiting.
> > 
> > So we could change the limit to something like:
> > bdi_dirty - min(bdi_dirty / TASK_LIMIT_FRACTION, ratelimit_pages *
> > num_online_cpus / 2 + bdi_dirty / TASK_LIMIT_FRACTION / 16)
> 
> Good analysis!
> 
> > But I'm not sure setups where this would make difference are common...
> 
> I think I'd prefer the original simple patch given that the common
> 1-dirtier is not impacted.
  OK, thanks. So will you merge the patch please?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-25 16:04           ` Jan Kara
@ 2011-07-26  4:13             ` Wu Fengguang
  2011-07-26 13:57               ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Wu Fengguang @ 2011-07-26  4:13 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, Andrew Morton, Christoph Hellwig, Dave Chinner, Peter Zijlstra

On Tue, Jul 26, 2011 at 12:04:29AM +0800, Jan Kara wrote:
> On Sat 23-07-11 15:43:45, Wu Fengguang wrote:
> > On Fri, Jul 15, 2011 at 05:34:09AM +0800, Jan Kara wrote:
> > > > - tasks dirtying close to 25% pages probably cannot be called light
> > > >   dirtier and there is no need to protect such tasks
> > >   The idea is interesting. The only problem is that we don't want to set
> > > dirty_exceeded too late so that heavy dirtiers won't push light dirtiers
> > > over their limits so easily due to ratelimiting. I did some computations:
> > > We normally ratelimit after 4 MB. Take a low end desktop these days. Say
> > > 1 GB of ram, 4 CPUs. So dirty limit will be ~200 MB and the area for task
> > > differentiation ~25 MB. We enter balance_dirty_pages() after dirtying
> > > num_cpu * ratelimit / 2 pages on average which gives 8 MB. So we should
> > > set dirty_exceeded at latest at bdi_dirty / TASK_LIMIT_FRACTION / 2 or
> > > task differentiation would have no effect because of ratelimiting.
> > > 
> > > So we could change the limit to something like:
> > > bdi_dirty - min(bdi_dirty / TASK_LIMIT_FRACTION, ratelimit_pages *
> > > num_online_cpus / 2 + bdi_dirty / TASK_LIMIT_FRACTION / 16)
> > 
> > Good analysis!
> > 
> > > But I'm not sure setups where this would make difference are common...
> > 
> > I think I'd prefer the original simple patch given that the common
> > 1-dirtier is not impacted.
>   OK, thanks. So will you merge the patch please?

The patch with a minor variable rename has been in writeback.git and
linux-next for two weeks, and two days ago I updated it to your
original patch:

http://git.kernel.org/?p=linux/kernel/git/wfg/writeback.git;a=commit;h=bcff25fc8aa47a13faff8b4b992589813f7b450a

If you have no more problems with the patchset, I'll ask Linus
to pull that branch.

Thanks,
Fengguang


* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-26  4:13             ` Wu Fengguang
@ 2011-07-26 13:57               ` Jan Kara
  2011-07-27 14:04                 ` Wu Fengguang
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2011-07-26 13:57 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Jan Kara, linux-mm, Andrew Morton, Christoph Hellwig,
	Dave Chinner, Peter Zijlstra

On Tue 26-07-11 12:13:22, Wu Fengguang wrote:
> On Tue, Jul 26, 2011 at 12:04:29AM +0800, Jan Kara wrote:
> > On Sat 23-07-11 15:43:45, Wu Fengguang wrote:
> > > On Fri, Jul 15, 2011 at 05:34:09AM +0800, Jan Kara wrote:
> > > > > - tasks dirtying close to 25% pages probably cannot be called light
> > > > >   dirtier and there is no need to protect such tasks
> > > >   The idea is interesting. The only problem is that we don't want to set
> > > > dirty_exceeded too late so that heavy dirtiers won't push light dirtiers
> > > > over their limits so easily due to ratelimiting. I did some computations:
> > > > We normally ratelimit after 4 MB. Take a low end desktop these days. Say
> > > > 1 GB of ram, 4 CPUs. So dirty limit will be ~200 MB and the area for task
> > > > differentiation ~25 MB. We enter balance_dirty_pages() after dirtying
> > > > num_cpu * ratelimit / 2 pages on average which gives 8 MB. So we should
> > > > set dirty_exceeded at latest at bdi_dirty / TASK_LIMIT_FRACTION / 2 or
> > > > task differentiation would have no effect because of ratelimiting.
> > > > 
> > > > So we could change the limit to something like:
> > > > bdi_dirty - min(bdi_dirty / TASK_LIMIT_FRACTION, ratelimit_pages *
> > > > num_online_cpus / 2 + bdi_dirty / TASK_LIMIT_FRACTION / 16)
> > > 
> > > Good analysis!
> > > 
> > > > But I'm not sure setups where this would make difference are common...
> > > 
> > > I think I'd prefer the original simple patch given that the common
> > > 1-dirtier is not impacted.
> >   OK, thanks. So will you merge the patch please?
> 
> The patch with a minor variable rename has been in writeback.git and
> linux-next for two weeks, and two days ago I updated it to your
> original patch:
> 
> http://git.kernel.org/?p=linux/kernel/git/wfg/writeback.git;a=commit;h=bcff25fc8aa47a13faff8b4b992589813f7b450a
  Ah, thanks.

> If you have no more problems with the patchset, I'll ask Linus
> to pull that branch.
  Umm, I thought we ultimately still push changes through Andrew? I don't
mind pushing them directly but I'm not sure e.g. Andrew is aware of this.

Regarding patches in your for_linus branch. I see there:
6e6938b writeback: introduce .tagged_writepages for the WB_SYNC_NONE sync stage
94c3dcb writeback: update dirtied_when for synced inode to prevent livelock
cb9bd11 writeback: introduce writeback_control.inodes_written
e6fb6da writeback: try more writeback as long as something was written
ba9aa83 writeback: the kupdate expire timestamp should be a moving target
424b351 writeback: refill b_io iff empty
f758eea writeback: split inode_wb_list_lock into bdi_writeback.list_lock
e8dfc30 writeback: elevate queue_io() into wb_writeback()
e185dda writeback: avoid extra sync work at enqueue time
6f71865 writeback: add bdi_dirty_limit() kernel-doc
3efaf0f writeback: skip balance_dirty_pages() for in-memory fs
b7a2441 writeback: remove writeback_control.more_io
846d5a0 writeback: remove .nonblocking and .encountered_congestion
251d6a4 writeback: trace event writeback_single_inode
e84d0a4 writeback: trace event writeback_queue_io
36715ce writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr()
d46db3d writeback: make writeback_control.nr_to_write straight

I believe the above patches were generally agreed upon so they can go in.

f7d2b1e writeback: account per-bdi accumulated written pages
e98be2d writeback: bdi write bandwidth estimation
00821b0 writeback: show bdi write bandwidth in debugfs
7762741 writeback: consolidate variable names in balance_dirty_pages()
c42843f writeback: introduce smoothed global dirty limit
ffd1f60 writeback: introduce max-pause and pass-good dirty limits
e1cbe23 writeback: trace global_dirty_state
1a12d8b writeback: scale IO chunk size up to half device bandwidth

But why do you think these patches should be merged? f7d2b1e, 7762741 are
probably OK to go but don't make much sense without the rest. The other
patches do not have any Acked-by or Reviewed-by from anyone and I don't
think they are really obvious enough to not deserve some.

fcc5c22 writeback: don't busy retry writeback on new/freeing inodes
bcff25f mm: properly reflect task dirty limits in dirty_exceeded logic

These two are fixes so they should go in as well.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-26 13:57               ` Jan Kara
@ 2011-07-27 14:04                 ` Wu Fengguang
  2011-07-27 15:10                   ` Christoph Hellwig
  2011-07-28 15:31                   ` Jan Kara
  0 siblings, 2 replies; 12+ messages in thread
From: Wu Fengguang @ 2011-07-27 14:04 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, Andrew Morton, Christoph Hellwig, Dave Chinner, Peter Zijlstra

On Tue, Jul 26, 2011 at 09:57:30PM +0800, Jan Kara wrote:
> On Tue 26-07-11 12:13:22, Wu Fengguang wrote:
> > On Tue, Jul 26, 2011 at 12:04:29AM +0800, Jan Kara wrote:
> > > On Sat 23-07-11 15:43:45, Wu Fengguang wrote:
> > > > On Fri, Jul 15, 2011 at 05:34:09AM +0800, Jan Kara wrote:
> > > > > > - tasks dirtying close to 25% pages probably cannot be called light
> > > > > >   dirtier and there is no need to protect such tasks
> > > > >   The idea is interesting. The only problem is that we don't want to set
> > > > > dirty_exceeded too late so that heavy dirtiers won't push light dirtiers
> > > > > over their limits so easily due to ratelimiting. I did some computations:
> > > > > We normally ratelimit after 4 MB. Take a low end desktop these days. Say
> > > > > 1 GB of ram, 4 CPUs. So dirty limit will be ~200 MB and the area for task
> > > > > differentiation ~25 MB. We enter balance_dirty_pages() after dirtying
> > > > > num_cpu * ratelimit / 2 pages on average which gives 8 MB. So we should
> > > > > set dirty_exceeded at latest at bdi_dirty / TASK_LIMIT_FRACTION / 2 or
> > > > > task differentiation would have no effect because of ratelimiting.
> > > > > 
> > > > > So we could change the limit to something like:
> > > > > bdi_dirty - min(bdi_dirty / TASK_LIMIT_FRACTION, ratelimit_pages *
> > > > > num_online_cpus / 2 + bdi_dirty / TASK_LIMIT_FRACTION / 16)
> > > > 
> > > > Good analysis!
> > > > 
> > > > > But I'm not sure setups where this would make difference are common...
> > > > 
> > > > I think I'd prefer the original simple patch given that the common
> > > > 1-dirtier is not impacted.
> > >   OK, thanks. So will you merge the patch please?
> > 
> > The patch with a minor variable rename has been in writeback.git and
> > linux-next for two weeks, and two days ago I updated it to your
> > original patch:
> > 
> > http://git.kernel.org/?p=linux/kernel/git/wfg/writeback.git;a=commit;h=bcff25fc8aa47a13faff8b4b992589813f7b450a
>   Ah, thanks.
> 
> > If you have no more problems with the patchset, I'll ask Linus
> > to pull that branch.
>   Umm, I thought we ultimately still push changes through Andrew? I don't
> mind pushing them directly but I'm not sure e.g. Andrew is aware of this.

I'll happily send patches to Andrew Morton if he would like to take
care of the mess :) In particular Andrew should still carry the
writeback changes that may interact or conflict with the -mm tree.

> Regarding patches in your for_linus branch. I see there:
> 6e6938b writeback: introduce .tagged_writepages for the WB_SYNC_NONE sync stage
> 94c3dcb writeback: update dirtied_when for synced inode to prevent livelock
> cb9bd11 writeback: introduce writeback_control.inodes_written
> e6fb6da writeback: try more writeback as long as something was written
> ba9aa83 writeback: the kupdate expire timestamp should be a moving target
> 424b351 writeback: refill b_io iff empty
> f758eea writeback: split inode_wb_list_lock into bdi_writeback.list_lock
> e8dfc30 writeback: elevate queue_io() into wb_writeback()
> e185dda writeback: avoid extra sync work at enqueue time
> 6f71865 writeback: add bdi_dirty_limit() kernel-doc
> 3efaf0f writeback: skip balance_dirty_pages() for in-memory fs
> b7a2441 writeback: remove writeback_control.more_io
> 846d5a0 writeback: remove .nonblocking and .encountered_congestion
> 251d6a4 writeback: trace event writeback_single_inode
> e84d0a4 writeback: trace event writeback_queue_io
> 36715ce writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr()
> d46db3d writeback: make writeback_control.nr_to_write straight
> 
> I believe the above patches were generally agreed upon so they can go in.

OK, thanks.

> f7d2b1e writeback: account per-bdi accumulated written pages
> e98be2d writeback: bdi write bandwidth estimation
> 00821b0 writeback: show bdi write bandwidth in debugfs
> 7762741 writeback: consolidate variable names in balance_dirty_pages()
> c42843f writeback: introduce smoothed global dirty limit
> ffd1f60 writeback: introduce max-pause and pass-good dirty limits
> e1cbe23 writeback: trace global_dirty_state
> 1a12d8b writeback: scale IO chunk size up to half device bandwidth
> 
> But why do you think these patches should be merged? f7d2b1e, 7762741 are
> probably OK to go but don't make much sense without the rest. The other
> patches do not have any Acked-by or Reviewed-by from anyone and I don't
> think they are really obvious enough to not deserve some.

Sorry I overlooked the Acked-by/Reviewed-by principle, which is
definitely good practice to follow. However given that Linus has
> merged the patches and they do look like pretty safe changes, we can
> watch and improve the algorithms based on them.

> fcc5c22 writeback: don't busy retry writeback on new/freeing inodes
> bcff25f mm: properly reflect task dirty limits in dirty_exceeded logic
> 
> These two are fixes so they should go in as well.

Yeah.

Thanks,
Fengguang


* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-27 14:04                 ` Wu Fengguang
@ 2011-07-27 15:10                   ` Christoph Hellwig
  2011-07-28 15:31                   ` Jan Kara
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2011-07-27 15:10 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Jan Kara, linux-mm, Andrew Morton, Christoph Hellwig,
	Dave Chinner, Peter Zijlstra

On Wed, Jul 27, 2011 at 10:04:41PM +0800, Wu Fengguang wrote:
> > > to pull that branch.
> >   Umm, I thought we ultimately still push changes through Andrew? I don't
> > mind pushing them directly but I'm not sure e.g. Andrew is aware of this.
> 
> I'll happily send patches to Andrew Morton if he would like to take
> care of the mess :) In particular Andrew should still carry the
> writeback changes that may interact or conflict with the -mm tree.

I have to say that I'd really like to keep the writeback tree as it is
right now.  We have a tree that has all the changes, goes into -next, and
gets merged exactly as it was in -next.  That's the canonical model used
for all other normal trees, and it works extremely well.

> Sorry, I overlooked the Acked-by/Reviewed-by principle, which is
> definitely good practice to follow. However, given that Linus has
> already merged the patches and they do look like pretty safe changes,
> we can watch how the algorithms behave and improve them from here.

Yes, absolutely.


* Re: [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic
  2011-07-27 14:04                 ` Wu Fengguang
  2011-07-27 15:10                   ` Christoph Hellwig
@ 2011-07-28 15:31                   ` Jan Kara
  1 sibling, 0 replies; 12+ messages in thread
From: Jan Kara @ 2011-07-28 15:31 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Jan Kara, linux-mm, Andrew Morton, Christoph Hellwig,
	Dave Chinner, Peter Zijlstra

On Wed 27-07-11 22:04:41, Wu Fengguang wrote:
> On Tue, Jul 26, 2011 at 09:57:30PM +0800, Jan Kara wrote:
> > On Tue 26-07-11 12:13:22, Wu Fengguang wrote:
> > f7d2b1e writeback: account per-bdi accumulated written pages
> > e98be2d writeback: bdi write bandwidth estimation
> > 00821b0 writeback: show bdi write bandwidth in debugfs
> > 7762741 writeback: consolidate variable names in balance_dirty_pages()
> > c42843f writeback: introduce smoothed global dirty limit
> > ffd1f60 writeback: introduce max-pause and pass-good dirty limits
> > e1cbe23 writeback: trace global_dirty_state
> > 1a12d8b writeback: scale IO chunk size up to half device bandwidth
> > 
> > But why do you think these patches should be merged? f7d2b1e and 7762741 are
> > probably OK to go but don't make much sense without the rest. The other
> > patches do not have any Acked-by or Reviewed-by from anyone, and I don't
> > think they are obvious enough to go in without one.
> 
> Sorry, I overlooked the Acked-by/Reviewed-by principle, which is
> definitely good practice to follow. However, given that Linus has
> already merged the patches and they do look like pretty safe changes,
> we can watch how the algorithms behave and improve them from here.
  :-| Well, at least c42843f and 1a12d8b do not look "pretty safe" to me.
But since it has already happened, let's work with what we have.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


end of thread, other threads:[~2011-07-28 15:31 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-30 18:32 [PATCH] mm: Properly reflect task dirty limits in dirty_exceeded logic Jan Kara
2011-07-04  1:06 ` Wu Fengguang
2011-07-11 17:06   ` Jan Kara
2011-07-13 23:02     ` Wu Fengguang
2011-07-14 21:34       ` Jan Kara
2011-07-23  7:43         ` Wu Fengguang
2011-07-25 16:04           ` Jan Kara
2011-07-26  4:13             ` Wu Fengguang
2011-07-26 13:57               ` Jan Kara
2011-07-27 14:04                 ` Wu Fengguang
2011-07-27 15:10                   ` Christoph Hellwig
2011-07-28 15:31                   ` Jan Kara
