linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] backing_dev: Fix hung task on sync
@ 2014-02-15  4:12 Derek Basehore
  2014-02-17  9:20 ` Jan Kara
  2014-02-18 22:55 ` Tejun Heo
  0 siblings, 2 replies; 15+ messages in thread
From: Derek Basehore @ 2014-02-15  4:12 UTC (permalink / raw)
  Cc: Alexander Viro, Jan Kara, Andrew Morton, Tejun Heo,
	Greg Kroah-Hartman, Darrick J. Wong, Derek Basehore, Kees Cook,
	linux-fsdevel, linux-kernel, linux-mm, bleung, sonnyrao,
	semenzato

bdi_wakeup_thread_delayed used the mod_delayed_work function to schedule work
to writeback dirty inodes. The problem with this is that it can delay work that
is scheduled for immediate execution, such as the work from sync_inodes_sb.
This can happen since mod_delayed_work can now steal work from a work_queue.
This fixes the problem by using queue_delayed_work instead. This is a
regression from the move to the bdi workqueue design.

The reason that this causes a problem is that laptop-mode will change the
delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default. In the case
that bdi_wakeup_thread_delayed races with sync_inodes_sb, sync will be stopped
for 10 minutes and trigger a hung task. Even if dirty_writeback_centisecs is
not long enough to cause a hung task, we still don't want to delay sync for
that long.

For the same reason, this also changes bdi_writeback_workfn to immediately
queue the work again in the case that the work_list is not empty. The same
problem can happen if the sync work is run on the rescue worker.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
---
 fs/fs-writeback.c | 5 +++--
 mm/backing-dev.c  | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index e0259a1..95b7b8c 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1047,8 +1047,9 @@ void bdi_writeback_workfn(struct work_struct *work)
 		trace_writeback_pages_written(pages_written);
 	}
 
-	if (!list_empty(&bdi->work_list) ||
-	    (wb_has_dirty_io(wb) && dirty_writeback_interval))
+	if (!list_empty(&bdi->work_list))
+		mod_delayed_work(bdi_wq, &wb->dwork, 0);
+	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
 		queue_delayed_work(bdi_wq, &wb->dwork,
 			msecs_to_jiffies(dirty_writeback_interval * 10));
 
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index ce682f7..3fde024 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -294,7 +294,7 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
 	unsigned long timeout;
 
 	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
-	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
 }
 
 /*
-- 
1.9.0.rc1.175.g0b1dcb5


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-02-15  4:12 [PATCH] backing_dev: Fix hung task on sync Derek Basehore
@ 2014-02-17  9:20 ` Jan Kara
  2014-02-18 22:55 ` Tejun Heo
  1 sibling, 0 replies; 15+ messages in thread
From: Jan Kara @ 2014-02-17  9:20 UTC (permalink / raw)
  To: Derek Basehore
  Cc: Alexander Viro, Jan Kara, Andrew Morton, Tejun Heo,
	Greg Kroah-Hartman, Darrick J. Wong, Kees Cook, linux-fsdevel,
	linux-kernel, linux-mm, bleung, sonnyrao, semenzato

On Fri 14-02-14 20:12:17, Derek Basehore wrote:
> bdi_wakeup_thread_delayed used the mod_delayed_work function to schedule work
> to writeback dirty inodes. The problem with this is that it can delay work that
> is scheduled for immediate execution, such as the work from sync_inodes_sb.
> This can happen since mod_delayed_work can now steal work from a work_queue.
> This fixes the problem by using queue_delayed_work instead. This is a
> regression from the move to the bdi workqueue design.
> 
> The reason that this causes a problem is that laptop-mode will change the
> delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default. In the case
> that bdi_wakeup_thread_delayed races with sync_inodes_sb, sync will be stopped
> for 10 minutes and trigger a hung task. Even if dirty_writeback_centisecs is
> not long enough to cause a hung task, we still don't want to delay sync for
> that long.
> 
> For the same reason, this also changes bdi_writeback_workfn to immediately
> queue the work again in the case that the work_list is not empty. The same
> problem can happen if the sync work is run on the rescue worker.
  The patch looks good. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>

  I'd also suggest to push this to stable kernels.

								Honza

> 
> Signed-off-by: Derek Basehore <dbasehore@chromium.org>
> ---
>  fs/fs-writeback.c | 5 +++--
>  mm/backing-dev.c  | 2 +-
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index e0259a1..95b7b8c 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1047,8 +1047,9 @@ void bdi_writeback_workfn(struct work_struct *work)
>  		trace_writeback_pages_written(pages_written);
>  	}
>  
> -	if (!list_empty(&bdi->work_list) ||
> -	    (wb_has_dirty_io(wb) && dirty_writeback_interval))
> +	if (!list_empty(&bdi->work_list))
> +		mod_delayed_work(bdi_wq, &wb->dwork, 0);
> +	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
>  		queue_delayed_work(bdi_wq, &wb->dwork,
>  			msecs_to_jiffies(dirty_writeback_interval * 10));
>  
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index ce682f7..3fde024 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -294,7 +294,7 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
>  	unsigned long timeout;
>  
>  	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
> -	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
> +	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
>  }
>  
>  /*
> -- 
> 1.9.0.rc1.175.g0b1dcb5
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-02-15  4:12 [PATCH] backing_dev: Fix hung task on sync Derek Basehore
  2014-02-17  9:20 ` Jan Kara
@ 2014-02-18 22:55 ` Tejun Heo
  2014-02-19  9:27   ` Jan Kara
  1 sibling, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2014-02-18 22:55 UTC (permalink / raw)
  To: Derek Basehore
  Cc: Alexander Viro, Jan Kara, Andrew Morton, Greg Kroah-Hartman,
	Darrick J. Wong, Kees Cook, linux-fsdevel, linux-kernel,
	linux-mm, bleung, sonnyrao, semenzato

Hello,

On Fri, Feb 14, 2014 at 08:12:17PM -0800, Derek Basehore wrote:
> bdi_wakeup_thread_delayed used the mod_delayed_work function to schedule work
> to writeback dirty inodes. The problem with this is that it can delay work that
> is scheduled for immediate execution, such as the work from sync_inodes_sb.
> This can happen since mod_delayed_work can now steal work from a work_queue.
> This fixes the problem by using queue_delayed_work instead. This is a
> regression from the move to the bdi workqueue design.
> 
> The reason that this causes a problem is that laptop-mode will change the
> delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default. In the case
> that bdi_wakeup_thread_delayed races with sync_inodes_sb, sync will be stopped
> for 10 minutes and trigger a hung task. Even if dirty_writeback_centisecs is
> not long enough to cause a hung task, we still don't want to delay sync for
> that long.

Oops.

> For the same reason, this also changes bdi_writeback_workfn to immediately
> queue the work again in the case that the work_list is not empty. The same
> problem can happen if the sync work is run on the rescue worker.
> 
> Signed-off-by: Derek Basehore <dbasehore@chromium.org>
> ---
>  fs/fs-writeback.c | 5 +++--
>  mm/backing-dev.c  | 2 +-
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index e0259a1..95b7b8c 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1047,8 +1047,9 @@ void bdi_writeback_workfn(struct work_struct *work)
>  		trace_writeback_pages_written(pages_written);
>  	}
>  
> -	if (!list_empty(&bdi->work_list) ||
> -	    (wb_has_dirty_io(wb) && dirty_writeback_interval))
> +	if (!list_empty(&bdi->work_list))
> +		mod_delayed_work(bdi_wq, &wb->dwork, 0);
> +	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
>  		queue_delayed_work(bdi_wq, &wb->dwork,
>  			msecs_to_jiffies(dirty_writeback_interval * 10));

Can you please add some comments explaining why the specific variants
are being used here?

> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index ce682f7..3fde024 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -294,7 +294,7 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
>  	unsigned long timeout;
>  
>  	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
> -	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
> +	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);

and here?

Hmmm.... but doesn't this create an opposite problem?  Now a flush
queued for an earlier time may be overridden by something scheduled
later, no?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-02-18 22:55 ` Tejun Heo
@ 2014-02-19  9:27   ` Jan Kara
  2014-02-19 19:01     ` Tejun Heo
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kara @ 2014-02-19  9:27 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Derek Basehore, Alexander Viro, Jan Kara, Andrew Morton,
	Greg Kroah-Hartman, Darrick J. Wong, Kees Cook, linux-fsdevel,
	linux-kernel, linux-mm, bleung, sonnyrao, semenzato

On Tue 18-02-14 17:55:48, Tejun Heo wrote:
> Hello,
> 
> On Fri, Feb 14, 2014 at 08:12:17PM -0800, Derek Basehore wrote:
> > bdi_wakeup_thread_delayed used the mod_delayed_work function to schedule work
> > to writeback dirty inodes. The problem with this is that it can delay work that
> > is scheduled for immediate execution, such as the work from sync_inodes_sb.
> > This can happen since mod_delayed_work can now steal work from a work_queue.
> > This fixes the problem by using queue_delayed_work instead. This is a
> > regression from the move to the bdi workqueue design.
> > 
> > The reason that this causes a problem is that laptop-mode will change the
> > delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default. In the case
> > that bdi_wakeup_thread_delayed races with sync_inodes_sb, sync will be stopped
> > for 10 minutes and trigger a hung task. Even if dirty_writeback_centisecs is
> > not long enough to cause a hung task, we still don't want to delay sync for
> > that long.
> 
> Oops.
> 
> > For the same reason, this also changes bdi_writeback_workfn to immediately
> > queue the work again in the case that the work_list is not empty. The same
> > problem can happen if the sync work is run on the rescue worker.
> > 
> > Signed-off-by: Derek Basehore <dbasehore@chromium.org>
> > ---
> >  fs/fs-writeback.c | 5 +++--
> >  mm/backing-dev.c  | 2 +-
> >  2 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index e0259a1..95b7b8c 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -1047,8 +1047,9 @@ void bdi_writeback_workfn(struct work_struct *work)
> >  		trace_writeback_pages_written(pages_written);
> >  	}
> >  
> > -	if (!list_empty(&bdi->work_list) ||
> > -	    (wb_has_dirty_io(wb) && dirty_writeback_interval))
> > +	if (!list_empty(&bdi->work_list))
> > +		mod_delayed_work(bdi_wq, &wb->dwork, 0);
> > +	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
> >  		queue_delayed_work(bdi_wq, &wb->dwork,
> >  			msecs_to_jiffies(dirty_writeback_interval * 10));
> 
> Can you please add some comments explaining why the specific variants
> are being used here?
> 
> > diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> > index ce682f7..3fde024 100644
> > --- a/mm/backing-dev.c
> > +++ b/mm/backing-dev.c
> > @@ -294,7 +294,7 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
> >  	unsigned long timeout;
> >  
> >  	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
> > -	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
> > +	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
> 
> and here?
> 
> Hmmm.... but doesn't this create an opposite problem?  Now a flush
> queued for an earlier time may be overridden by something scheduled
> later, no?
  You are the workqueue expert so you may know better ;) But the way I
understand it is that queue_delayed_work() does nothing if the timer is
already running. Since we queue flusher work to run either immediately or
after dirty_writeback_interval we are safe to run queue_delayed_work()
whenever we want it to run after dirty_writeback_interval and
mod_delayed_work() whenever we want to run it immediately.

But it's subtle and some interface where we could say queue delayed work
after no later than X would be easier to grasp.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-02-19  9:27   ` Jan Kara
@ 2014-02-19 19:01     ` Tejun Heo
  2014-03-11 18:23       ` Andrew Morton
  2014-03-15 20:22       ` dbasehore .
  0 siblings, 2 replies; 15+ messages in thread
From: Tejun Heo @ 2014-02-19 19:01 UTC (permalink / raw)
  To: Jan Kara
  Cc: Derek Basehore, Alexander Viro, Andrew Morton,
	Greg Kroah-Hartman, Darrick J. Wong, Kees Cook, linux-fsdevel,
	linux-kernel, linux-mm, bleung, sonnyrao, semenzato

Hello, Jan.

On Wed, Feb 19, 2014 at 10:27:31AM +0100, Jan Kara wrote:
>   You are the workqueue expert so you may know better ;) But the way I
> understand it is that queue_delayed_work() does nothing if the timer is
> already running. Since we queue flusher work to run either immediately or
> after dirty_writeback_interval we are safe to run queue_delayed_work()
> whenever we want it to run after dirty_writeback_interval and
> mod_delayed_work() whenever we want to run it immediately.

Ah, okay, so it's always mod on immediate and queue on delayed.  Yeah,
that should work.

> But it's subtle and some interface where we could say queue delayed work
> after no later than X would be easier to grasp.

Yeah, I think it'd be better if we had something like
mod_delayed_work_if_later().  Hmm...

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-02-19 19:01     ` Tejun Heo
@ 2014-03-11 18:23       ` Andrew Morton
  2014-03-11 20:19         ` Jan Kara
  2014-03-15 20:22       ` dbasehore .
  1 sibling, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2014-03-11 18:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Derek Basehore, Alexander Viro, Greg Kroah-Hartman,
	Darrick J. Wong, Kees Cook, linux-fsdevel, linux-kernel,
	linux-mm, bleung, sonnyrao, semenzato

On Wed, 19 Feb 2014 14:01:39 -0500 Tejun Heo <tj@kernel.org> wrote:

> Hello, Jan.
> 
> On Wed, Feb 19, 2014 at 10:27:31AM +0100, Jan Kara wrote:
> >   You are the workqueue expert so you may know better ;) But the way I
> > understand it is that queue_delayed_work() does nothing if the timer is
> > already running. Since we queue flusher work to run either immediately or
> > after dirty_writeback_interval we are safe to run queue_delayed_work()
> > whenever we want it to run after dirty_writeback_interval and
> > mod_delayed_work() whenever we want to run it immediately.
> 
> Ah, okay, so it's always mod on immediate and queue on delayed.  Yeah,
> that should work.
> 
> > But it's subtle and some interface where we could say queue delayed work
> > after no later than X would be easier to grasp.
> 
> Yeah, I think it'd be better if we had something like
> mod_delayed_work_if_later().  Hmm...

The code comments which you asked for were not forthcoming.

Are you otherwise OK with merging this into 3.14 and -stable?


From: Derek Basehore <dbasehore@chromium.org>
Subject: backing_dev: fix hung task on sync

bdi_wakeup_thread_delayed() used mod_delayed_work() to schedule work to
writeback dirty inodes.  The problem with this is that it can delay work
that is scheduled for immediate execution, such as the work from
sync_inodes_sb().  This can happen since mod_delayed_work can now steal
work from a work_queue.  This fixes the problem by using
queue_delayed_work instead.  This is a regression from the move to the bdi
workqueue design.

The reason that this causes a problem is that laptop-mode will change the
delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default.  In
the case that bdi_wakeup_thread_delayed races with sync_inodes_sb, sync
will be stopped for 10 minutes and trigger a hung task.  Even if
dirty_writeback_centisecs is not long enough to cause a hung task, we
still don't want to delay sync for that long.

For the same reason, this also changes bdi_writeback_workfn to immediately
queue the work again in the case that the work_list is not empty.  The
same problem can happen if the sync work is run on the rescue worker.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zento.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Derek Basehore <dbasehore@chromium.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Benson Leung <bleung@chromium.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Luigi Semenzato <semenzato@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/fs-writeback.c |    5 +++--
 mm/backing-dev.c  |    2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff -puN fs/fs-writeback.c~backing_dev-fix-hung-task-on-sync fs/fs-writeback.c
--- a/fs/fs-writeback.c~backing_dev-fix-hung-task-on-sync
+++ a/fs/fs-writeback.c
@@ -1039,8 +1039,9 @@ void bdi_writeback_workfn(struct work_st
 		trace_writeback_pages_written(pages_written);
 	}
 
-	if (!list_empty(&bdi->work_list) ||
-	    (wb_has_dirty_io(wb) && dirty_writeback_interval))
+	if (!list_empty(&bdi->work_list))
+		mod_delayed_work(bdi_wq, &wb->dwork, 0);
+	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
 		queue_delayed_work(bdi_wq, &wb->dwork,
 			msecs_to_jiffies(dirty_writeback_interval * 10));
 
diff -puN mm/backing-dev.c~backing_dev-fix-hung-task-on-sync mm/backing-dev.c
--- a/mm/backing-dev.c~backing_dev-fix-hung-task-on-sync
+++ a/mm/backing-dev.c
@@ -294,7 +294,7 @@ void bdi_wakeup_thread_delayed(struct ba
 	unsigned long timeout;
 
 	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
-	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
 }
 
 /*
_


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-03-11 18:23       ` Andrew Morton
@ 2014-03-11 20:19         ` Jan Kara
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kara @ 2014-03-11 20:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Jan Kara, Derek Basehore, Alexander Viro,
	Greg Kroah-Hartman, Darrick J. Wong, Kees Cook, linux-fsdevel,
	linux-kernel, linux-mm, bleung, sonnyrao, semenzato

On Tue 11-03-14 11:23:55, Andrew Morton wrote:
> On Wed, 19 Feb 2014 14:01:39 -0500 Tejun Heo <tj@kernel.org> wrote:
> 
> > Hello, Jan.
> > 
> > On Wed, Feb 19, 2014 at 10:27:31AM +0100, Jan Kara wrote:
> > >   You are the workqueue expert so you may know better ;) But the way I
> > > understand it is that queue_delayed_work() does nothing if the timer is
> > > already running. Since we queue flusher work to run either immediately or
> > > after dirty_writeback_interval we are safe to run queue_delayed_work()
> > > whenever we want it to run after dirty_writeback_interval and
> > > mod_delayed_work() whenever we want to run it immediately.
> > 
> > Ah, okay, so it's always mod on immediate and queue on delayed.  Yeah,
> > that should work.
> > 
> > > But it's subtle and some interface where we could say queue delayed work
> > > after no later than X would be easier to grasp.
> > 
> > Yeah, I think it'd be better if we had something like
> > mod_delayed_work_if_later().  Hmm...
> 
> The code comments which you asked for were not forthcoming.
> 
> Are you otherwise OK with merging this into 3.14 and -stable?
  I have actually added the comment when nothing happened and have a patch
with Tejun's Reviewed-by. Just noone picked it up. I'll send it your way
together with another fix in flush work handling.

								Honza


> From: Derek Basehore <dbasehore@chromium.org>
> Subject: backing_dev: fix hung task on sync
> 
> bdi_wakeup_thread_delayed() used mod_delayed_work() to schedule work to
> writeback dirty inodes.  The problem with this is that it can delay work
> that is scheduled for immediate execution, such as the work from
> sync_inodes_sb().  This can happen since mod_delayed_work can now steal
> work from a work_queue.  This fixes the problem by using
> queue_delayed_work instead.  This is a regression from the move to the bdi
> workqueue design.
> 
> The reason that this causes a problem is that laptop-mode will change the
> delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default.  In
> the case that bdi_wakeup_thread_delayed races with sync_inodes_sb, sync
> will be stopped for 10 minutes and trigger a hung task.  Even if
> dirty_writeback_centisecs is not long enough to cause a hung task, we
> still don't want to delay sync for that long.
> 
> For the same reason, this also changes bdi_writeback_workfn to immediately
> queue the work again in the case that the work_list is not empty.  The
> same problem can happen if the sync work is run on the rescue worker.
> 
> Signed-off-by: Derek Basehore <dbasehore@chromium.org>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Cc: Alexander Viro <viro@zento.linux.org.uk>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: Derek Basehore <dbasehore@chromium.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Benson Leung <bleung@chromium.org>
> Cc: Sonny Rao <sonnyrao@chromium.org>
> Cc: Luigi Semenzato <semenzato@chromium.org>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  fs/fs-writeback.c |    5 +++--
>  mm/backing-dev.c  |    2 +-
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff -puN fs/fs-writeback.c~backing_dev-fix-hung-task-on-sync fs/fs-writeback.c
> --- a/fs/fs-writeback.c~backing_dev-fix-hung-task-on-sync
> +++ a/fs/fs-writeback.c
> @@ -1039,8 +1039,9 @@ void bdi_writeback_workfn(struct work_st
>  		trace_writeback_pages_written(pages_written);
>  	}
>  
> -	if (!list_empty(&bdi->work_list) ||
> -	    (wb_has_dirty_io(wb) && dirty_writeback_interval))
> +	if (!list_empty(&bdi->work_list))
> +		mod_delayed_work(bdi_wq, &wb->dwork, 0);
> +	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
>  		queue_delayed_work(bdi_wq, &wb->dwork,
>  			msecs_to_jiffies(dirty_writeback_interval * 10));
>  
> diff -puN mm/backing-dev.c~backing_dev-fix-hung-task-on-sync mm/backing-dev.c
> --- a/mm/backing-dev.c~backing_dev-fix-hung-task-on-sync
> +++ a/mm/backing-dev.c
> @@ -294,7 +294,7 @@ void bdi_wakeup_thread_delayed(struct ba
>  	unsigned long timeout;
>  
>  	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
> -	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
> +	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
>  }
>  
>  /*
> _
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-02-19 19:01     ` Tejun Heo
  2014-03-11 18:23       ` Andrew Morton
@ 2014-03-15 20:22       ` dbasehore .
  2014-03-16 14:59         ` Tejun Heo
  2014-03-17  9:53         ` Jan Kara
  1 sibling, 2 replies; 15+ messages in thread
From: dbasehore . @ 2014-03-15 20:22 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Darrick J. Wong, Kees Cook, linux-fsdevel, linux-kernel,
	linux-mm, bleung, sonnyrao, Luigi Semenzato

Resurrecting this for further discussion about the root of the problem.

mod_delayed_work_if_later addresses the problem one way, but the
problem is still there for mod_delayed_work. I think we could take
another approach that doesn't modify the API, but still addresses
(most of) the problem.

mod_delayed_work currently removes a work item from a workqueue if it
is on it. Correct me if I'm wrong, but I don't think that this is
necessarily required for mod_delayed_work to have the current
behavior. We should be able to set the timer while a delayed_work is
currently on a workqueue. If the delayed_work is still on the
workqueue when the timer goes off, everything is fine. If it has left
the workqueue, we can queue it again.

On Wed, Feb 19, 2014 at 11:01 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Jan.
>
> On Wed, Feb 19, 2014 at 10:27:31AM +0100, Jan Kara wrote:
>>   You are the workqueue expert so you may know better ;) But the way I
>> understand it is that queue_delayed_work() does nothing if the timer is
>> already running. Since we queue flusher work to run either immediately or
>> after dirty_writeback_interval we are safe to run queue_delayed_work()
>> whenever we want it to run after dirty_writeback_interval and
>> mod_delayed_work() whenever we want to run it immediately.
>
> Ah, okay, so it's always mod on immediate and queue on delayed.  Yeah,
> that should work.
>
>> But it's subtle and some interface where we could say queue delayed work
>> after no later than X would be easier to grasp.
>
> Yeah, I think it'd be better if we had something like
> mod_delayed_work_if_later().  Hmm...
>
> Thanks.
>
> --
> tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-03-15 20:22       ` dbasehore .
@ 2014-03-16 14:59         ` Tejun Heo
  2014-03-16 19:13           ` dbasehore .
  2014-03-17  9:53         ` Jan Kara
  1 sibling, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2014-03-16 14:59 UTC (permalink / raw)
  To: dbasehore .
  Cc: Jan Kara, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Darrick J. Wong, Kees Cook, linux-fsdevel, linux-kernel,
	linux-mm, bleung, sonnyrao, Luigi Semenzato

On Sat, Mar 15, 2014 at 01:22:53PM -0700, dbasehore . wrote:
> mod_delayed_work currently removes a work item from a workqueue if it
> is on it. Correct me if I'm wrong, but I don't think that this is
> necessarily required for mod_delayed_work to have the current
> behavior. We should be able to set the timer while a delayed_work is
> currently on a workqueue. If the delayed_work is still on the
> workqueue when the timer goes off, everything is fine. If it has left
> the workqueue, we can queue it again.

What different would that make w.r.t. this issue?  Plus, please note
that a work item may wait non-insignificant amount of time pending if
the workqueue is saturated to max_active.  Doing the above would make
mod_delayed_work()'s behavior quite fuzzy - the work item is modified
or queued to the specified time but if the timer has already expired,
the work item may execute after unspecified amount of time which may
be shorter than the new timeout.  What kind of interface would that
be?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-03-16 14:59         ` Tejun Heo
@ 2014-03-16 19:13           ` dbasehore .
  2014-03-16 20:20             ` dbasehore .
  2014-03-17 14:40             ` Tejun Heo
  0 siblings, 2 replies; 15+ messages in thread
From: dbasehore . @ 2014-03-16 19:13 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Darrick J. Wong, Kees Cook, linux-fsdevel, linux-kernel,
	linux-mm, bleung, sonnyrao, Luigi Semenzato

There's already behavior that is somewhat like that with the current
implementation. If there's an item on a workqueue, it could run at any
time. From the perspective of the driver/etc. that is using the
workqueue, there should be no difference between work being on the
workqueue and the kernel triggering a schedule right after the work is
removed from the workqueue, but before the work function has done
anything.

So to reiterate, calling mod_delayed_work on something that is already
in the workqueue has two behaviors. One, the work is dispatched before
mod_delayed_work can remove it from the workqueue. Two,
mod_delayed_work removes it from the workqueue and sets the timer (or
not in the case of 0). The behavior of the proposed change should be
no different than the first behavior.

This should not introduce new behavior from the perspective of the
code using delayed_work. It is true that there is a larger window of
time between when you call mod_delayed_work and when an already queued
work item will run, but I don't believe that matters.

The API will still make sense since we will only ever mod delayed work
but not work that is no longer delayed (on the workqueue).

On Sun, Mar 16, 2014 at 7:59 AM, Tejun Heo <tj@kernel.org> wrote:
> On Sat, Mar 15, 2014 at 01:22:53PM -0700, dbasehore . wrote:
>> mod_delayed_work currently removes a work item from a workqueue if it
>> is on it. Correct me if I'm wrong, but I don't think that this is
>> necessarily required for mod_delayed_work to have the current
>> behavior. We should be able to set the timer while a delayed_work is
>> currently on a workqueue. If the delayed_work is still on the
>> workqueue when the timer goes off, everything is fine. If it has left
>> the workqueue, we can queue it again.
>
> What different would that make w.r.t. this issue?  Plus, please note
> that a work item may wait non-insignificant amount of time pending if
> the workqueue is saturated to max_active.  Doing the above would make
> mod_delayed_work()'s behavior quite fuzzy - the work item is modified
> or queued to the specified time but if the timer has already expired,
> the work item may execute after unspecified amount of time which may
> be shorter than the new timeout.  What kind of interface would that
> be?
>
> Thanks.
>
> --
> tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-03-16 19:13           ` dbasehore .
@ 2014-03-16 20:20             ` dbasehore .
  2014-03-17 14:40             ` Tejun Heo
  1 sibling, 0 replies; 15+ messages in thread
From: dbasehore . @ 2014-03-16 20:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Andrew Morton, Greg Kroah-Hartman, Darrick J. Wong,
	Kees Cook, linux-fsdevel, linux-kernel, linux-mm, bleung,
	sonnyrao, Luigi Semenzato

Also, the difference it would make is fix the issue for when a
delayed_work is used for both immediate work (mod_delayed_work(0)) and
delayed work.

On Sun, Mar 16, 2014 at 12:13 PM, dbasehore . <dbasehore@chromium.org> wrote:
> There's already behavior that is somewhat like that with the current
> implementation. If there's an item on a workqueue, it could run at any
> time. From the perspective of the driver/etc. that is using the
> workqueue, there should be no difference between work being on the
> workqueue and the kernel triggering a schedule right after the work is
> removed from the workqueue, but before the work function has done
> anything.
>
> So to reiterate, calling mod_delayed_work on something that is already
> in the workqueue has two behaviors. One, the work is dispatched before
> mod_delayed_work can remove it from the workqueue. Two,
> mod_delayed_work removes it from the workqueue and sets the timer (or
> not in the case of 0). The behavior of the proposed change should be
> no different than the first behavior.
>
> This should not introduce new behavior from the perspective of the
> code using delayed_work. It is true that there is a larger window of
> time between when you call mod_delayed_work and when an already queued
> work item will run, but I don't believe that matters.
>
> The API will still make sense since we will only ever mod delayed work
> but not work that is no longer delayed (on the workqueue).
>
> On Sun, Mar 16, 2014 at 7:59 AM, Tejun Heo <tj@kernel.org> wrote:
>> On Sat, Mar 15, 2014 at 01:22:53PM -0700, dbasehore . wrote:
>>> mod_delayed_work currently removes a work item from a workqueue if it
>>> is on it. Correct me if I'm wrong, but I don't think that this is
>>> necessarily required for mod_delayed_work to have the current
>>> behavior. We should be able to set the timer while a delayed_work is
>>> currently on a workqueue. If the delayed_work is still on the
>>> workqueue when the timer goes off, everything is fine. If it has left
>>> the workqueue, we can queue it again.
>>
>> What different would that make w.r.t. this issue?  Plus, please note
>> that a work item may wait non-insignificant amount of time pending if
>> the workqueue is saturated to max_active.  Doing the above would make
>> mod_delayed_work()'s behavior quite fuzzy - the work item is modified
>> or queued to the specified time but if the timer has already expired,
>> the work item may execute after unspecified amount of time which may
>> be shorter than the new timeout.  What kind of interface would that
>> be?
>>
>> Thanks.
>>
>> --
>> tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-03-15 20:22       ` dbasehore .
  2014-03-16 14:59         ` Tejun Heo
@ 2014-03-17  9:53         ` Jan Kara
  1 sibling, 0 replies; 15+ messages in thread
From: Jan Kara @ 2014-03-17  9:53 UTC (permalink / raw)
  To: dbasehore .
  Cc: Tejun Heo, Jan Kara, Alexander Viro, Andrew Morton,
	Greg Kroah-Hartman, Darrick J. Wong, Kees Cook, linux-fsdevel,
	linux-kernel, linux-mm, bleung, sonnyrao, Luigi Semenzato

On Sat 15-03-14 13:22:53, dbasehore . wrote:
> Resurrecting this for further discussion about the root of the problem.
> 
> mod_delayed_work_if_later addresses the problem one way, but the
> problem is still there for mod_delayed_work.
  But flusher works care about only that one way, don't they? We always
want the flushing work to execute at min(timer so far, new time target). So
for that mod_delayed_work_if_later() works just fine.

> I think we could take
> another approach that doesn't modify the API, but still addresses
> (most of) the problem.
> 
> mod_delayed_work currently removes a work item from a workqueue if it
> is on it. Correct me if I'm wrong, but I don't think that this is
> necessarily required for mod_delayed_work to have the current
> behavior. We should be able to set the timer while a delayed_work is
> currently on a workqueue. If the delayed_work is still on the
> workqueue when the timer goes off, everything is fine. If it has left
> the workqueue, we can queue it again.
  But here you are relying on the fact that flusher works always want the
work to be executed immediately (i.e., they will be queued), or after some
fixed time T. So I agree what you suggest will work but changing the API as
Tejun described seems cleaner to me.

								Honza

> On Wed, Feb 19, 2014 at 11:01 AM, Tejun Heo <tj@kernel.org> wrote:
> > Hello, Jan.
> >
> > On Wed, Feb 19, 2014 at 10:27:31AM +0100, Jan Kara wrote:
> >>   You are the workqueue expert so you may know better ;) But the way I
> >> understand it is that queue_delayed_work() does nothing if the timer is
> >> already running. Since we queue flusher work to run either immediately or
> >> after dirty_writeback_interval we are safe to run queue_delayed_work()
> >> whenever we want it to run after dirty_writeback_interval and
> >> mod_delayed_work() whenever we want to run it immediately.
> >
> > Ah, okay, so it's always mod on immediate and queue on delayed.  Yeah,
> > that should work.
> >
> >> But it's subtle and some interface where we could say queue delayed work
> >> after no later than X would be easier to grasp.
> >
> > Yeah, I think it'd be better if we had something like
> > mod_delayed_work_if_later().  Hmm...
> >
> > Thanks.
> >
> > --
> > tejun
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-03-16 19:13           ` dbasehore .
  2014-03-16 20:20             ` dbasehore .
@ 2014-03-17 14:40             ` Tejun Heo
  2014-03-17 20:53               ` dbasehore .
  1 sibling, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2014-03-17 14:40 UTC (permalink / raw)
  To: dbasehore .
  Cc: Jan Kara, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Darrick J. Wong, Kees Cook, linux-fsdevel, linux-kernel,
	linux-mm, bleung, sonnyrao, Luigi Semenzato

Hello,

On Sun, Mar 16, 2014 at 12:13:55PM -0700, dbasehore . wrote:
> There's already behavior that is somewhat like that with the current
> implementation. If there's an item on a workqueue, it could run at any
> time. From the perspective of the driver/etc. that is using the
> workqueue, there should be no difference between work being on the
> workqueue and the kernel triggering a schedule right after the work is
> removed from the workqueue, but before the work function has done
> anything.

It is different.  mod_delayed_work() *guarantees* that the target work
item will become pending for execution at least after the specified
time has passed.  What you're suggesting removes any semantically
consistent meaning of the API.

> So to reiterate, calling mod_delayed_work on something that is already
> in the workqueue has two behaviors. One, the work is dispatched before
> mod_delayed_work can remove it from the workqueue. Two,
> mod_delayed_work removes it from the workqueue and sets the timer (or
> not in the case of 0). The behavior of the proposed change should be
> no different than the first behavior.

No, mod_delayed_work() does *one* thing - the work item is queued for
the specified delay no matter the current state of the work item.  It
is *guaranteed* that the work item will go pending after the specified
time.  That is the sole meaning of the API.

> This should not introduce new behavior from the perspective of the
> code using delayed_work. It is true that there is a larger window of
> time between when you call mod_delayed_work and when an already queued
> work item will run, but I don't believe that matters.

You're completely misunderstanding the API.  Plesae re-read it and
understand what it does first.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-03-17 14:40             ` Tejun Heo
@ 2014-03-17 20:53               ` dbasehore .
  2014-03-17 20:59                 ` Tejun Heo
  0 siblings, 1 reply; 15+ messages in thread
From: dbasehore . @ 2014-03-17 20:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Darrick J. Wong, Kees Cook, linux-fsdevel, linux-kernel,
	linux-mm, bleung, sonnyrao, Luigi Semenzato

On Mon, Mar 17, 2014 at 7:40 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Sun, Mar 16, 2014 at 12:13:55PM -0700, dbasehore . wrote:
>> There's already behavior that is somewhat like that with the current
>> implementation. If there's an item on a workqueue, it could run at any
>> time. From the perspective of the driver/etc. that is using the
>> workqueue, there should be no difference between work being on the
>> workqueue and the kernel triggering a schedule right after the work is
>> removed from the workqueue, but before the work function has done
>> anything.
>
> It is different.  mod_delayed_work() *guarantees* that the target work
> item will become pending for execution at least after the specified
> time has passed.  What you're suggesting removes any semantically
> consistent meaning of the API.
>

It will still be at least be pending after the specified time has
passed. I'm proposing that we still set the timer. The difference is
that there is a possibility the work will already be pending when the
timer goes off. There will still at least be an execution after the
given time has past. We could still remove the work in the workqueue
from the timer function, but this would make the mod_delayed_work not
race with any work that was scheduled for immediate execution
previously.

If you make the timer function remove any pending work from the
workqueue when the timer goes off, this is still following the API.
The work will still become pending at least after the specified time
has passed.

>> So to reiterate, calling mod_delayed_work on something that is already
>> in the workqueue has two behaviors. One, the work is dispatched before
>> mod_delayed_work can remove it from the workqueue. Two,
>> mod_delayed_work removes it from the workqueue and sets the timer (or
>> not in the case of 0). The behavior of the proposed change should be
>> no different than the first behavior.
>
> No, mod_delayed_work() does *one* thing - the work item is queued for
> the specified delay no matter the current state of the work item.  It
> is *guaranteed* that the work item will go pending after the specified
> time.  That is the sole meaning of the API.
>
>> This should not introduce new behavior from the perspective of the
>> code using delayed_work. It is true that there is a larger window of
>> time between when you call mod_delayed_work and when an already queued
>> work item will run, but I don't believe that matters.
>
> You're completely misunderstanding the API.  Plesae re-read it and
> understand what it does first.
>
> Thanks.
>
> --
> tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] backing_dev: Fix hung task on sync
  2014-03-17 20:53               ` dbasehore .
@ 2014-03-17 20:59                 ` Tejun Heo
  0 siblings, 0 replies; 15+ messages in thread
From: Tejun Heo @ 2014-03-17 20:59 UTC (permalink / raw)
  To: dbasehore .
  Cc: Jan Kara, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Darrick J. Wong, Kees Cook, linux-fsdevel, linux-kernel,
	linux-mm, bleung, sonnyrao, Luigi Semenzato

On Mon, Mar 17, 2014 at 01:53:57PM -0700, dbasehore . wrote:
> It will still be at least be pending after the specified time has
> passed. I'm proposing that we still set the timer. The difference is
> that there is a possibility the work will already be pending when the
> timer goes off. There will still at least be an execution after the
> given time has past. We could still remove the work in the workqueue
> from the timer function, but this would make the mod_delayed_work not
> race with any work that was scheduled for immediate execution
> previously.

I really don't see what you're suggesting happening.  Managing work
item pending status is already extremely delicate and I'd like to keep
all the paths which can share pending state management to do so.
You're suggesting introducing a new pending state where a work item
may be pending in two different places which will also affect cancel
and flushing for rather dubious benefit.  If you can write up a patch
which isn't too complicated, let's talk about it, but I'm likely to
resist any significant amount of extra complexity coming from it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-03-17 20:59 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-15  4:12 [PATCH] backing_dev: Fix hung task on sync Derek Basehore
2014-02-17  9:20 ` Jan Kara
2014-02-18 22:55 ` Tejun Heo
2014-02-19  9:27   ` Jan Kara
2014-02-19 19:01     ` Tejun Heo
2014-03-11 18:23       ` Andrew Morton
2014-03-11 20:19         ` Jan Kara
2014-03-15 20:22       ` dbasehore .
2014-03-16 14:59         ` Tejun Heo
2014-03-16 19:13           ` dbasehore .
2014-03-16 20:20             ` dbasehore .
2014-03-17 14:40             ` Tejun Heo
2014-03-17 20:53               ` dbasehore .
2014-03-17 20:59                 ` Tejun Heo
2014-03-17  9:53         ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).