linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] bdi fixes
@ 2014-02-25 22:29 Jan Kara
  2014-02-25 22:29 ` [PATCH 1/2] bdi: Fix hung task on sync Jan Kara
  2014-02-25 22:29 ` [PATCH 2/2] bdi: Avoid oops on device removal Jan Kara
  0 siblings, 2 replies; 8+ messages in thread
From: Jan Kara @ 2014-02-25 22:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: LKML, Jens Axboe, Tejun Heo


  Hello,

these two patches fix regressions caused by conversion from flusher
threads to standard workqueue. The first patch is an updated version of
Derek's fix, the second patch fixes a reliable oops on removal of a
device that is written to. I'm open to suggestions for a better fix in
case of the second patch but at least for 3.13 and stable kernels what I
did seemed good enough.

								Honza

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] bdi: Fix hung task on sync
  2014-02-25 22:29 [PATCH 0/2] bdi fixes Jan Kara
@ 2014-02-25 22:29 ` Jan Kara
  2014-02-25 22:43   ` Tejun Heo
  2014-02-25 22:29 ` [PATCH 2/2] bdi: Avoid oops on device removal Jan Kara
  1 sibling, 1 reply; 8+ messages in thread
From: Jan Kara @ 2014-02-25 22:29 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: LKML, Jens Axboe, Tejun Heo, Derek Basehore, stable, Jan Kara

From: Derek Basehore <dbasehore@chromium.org>

bdi_wakeup_thread_delayed() used the mod_delayed_work() function to
schedule work to writeback dirty inodes. The problem with this is that
it can delay work that is scheduled for immediate execution, such as the
work from sync_inodes_sb().  This can happen since mod_delayed_work()
can now steal work from a work_queue.  This fixes the problem by using
queue_delayed_work() instead. This is a regression caused by
839a8e8660b6 "writeback: replace custom worker pool implementation with
unbound workqueue".

The reason that this causes a problem is that laptop-mode will change
the delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default.
In the case that bdi_wakeup_thread_delayed() races with
sync_inodes_sb(), sync will be stopped for 10 minutes and trigger a hung
task. Even if dirty_writeback_centisecs is not long enough to cause a
hung task, we still don't want to delay sync for that long.

We fix the problem by using queue_delayed_work() when we want to
schedule writeback sometime in future. This function doesn't change the
timer if it is already armed.

For the same reason, we also change bdi_writeback_workfn() to
immediately queue the work again in the case that the work_list is not
empty. The same problem can happen if the sync work is run on the rescue
worker.

Fixes: 839a8e8660b6777e7fe4e80af1a048aebe2b5977
CC: stable@vger.kernel.org
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/fs-writeback.c | 8 ++++----
 mm/backing-dev.c  | 5 ++++-
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index e0259a163f98..8277b76be983 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1047,10 +1047,10 @@ void bdi_writeback_workfn(struct work_struct *work)
 		trace_writeback_pages_written(pages_written);
 	}
 
-	if (!list_empty(&bdi->work_list) ||
-	    (wb_has_dirty_io(wb) && dirty_writeback_interval))
-		queue_delayed_work(bdi_wq, &wb->dwork,
-			msecs_to_jiffies(dirty_writeback_interval * 10));
+	if (!list_empty(&bdi->work_list))
+		mod_delayed_work(bdi_wq, &wb->dwork, 0);
+	else if (wb_has_dirty_io(wb) && dirty_writeback_interval)
+		bdi_wakeup_thread_delayed(bdi);
 
 	current->flags &= ~PF_SWAPWRITE;
 }
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index ce682f7a4f29..fab8401fc54e 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -288,13 +288,16 @@ int bdi_has_dirty_io(struct backing_dev_info *bdi)
  * Note, we wouldn't bother setting up the timer, but this function is on the
  * fast-path (used by '__mark_inode_dirty()'), so we save few context switches
  * by delaying the wake-up.
+ *
+ * We have to be careful not to postpone flush work if it is scheduled for
+ * earlier. Thus we use queue_delayed_work().
  */
 void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
 {
 	unsigned long timeout;
 
 	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
-	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
 }
 
 /*
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] bdi: Avoid oops on device removal
  2014-02-25 22:29 [PATCH 0/2] bdi fixes Jan Kara
  2014-02-25 22:29 ` [PATCH 1/2] bdi: Fix hung task on sync Jan Kara
@ 2014-02-25 22:29 ` Jan Kara
  2014-02-27 20:07   ` Tejun Heo
  1 sibling, 1 reply; 8+ messages in thread
From: Jan Kara @ 2014-02-25 22:29 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: LKML, Jens Axboe, Tejun Heo, Jan Kara, stable

After 839a8e8660b67 "writeback: replace custom worker pool
implementation with unbound workqueue" when device is removed while we
are writing to it we crash in bdi_writeback_workfn() ->
set_worker_desc() because bdi->dev is NULL.  This can happen because
even though bdi_unregister() cancels all pending flushing work, nothing
really prevents new ones from being queued from balance_dirty_pages() or
other places.

Fix the problem by clearing BDI_registered bit in bdi_unregister() and
checking it before scheduling of any flushing work.

Fixes: 839a8e8660b6777e7fe4e80af1a048aebe2b5977
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/fs-writeback.c           | 25 ++++++++++++++++++++-----
 include/linux/backing-dev.h |  2 +-
 mm/backing-dev.c            | 13 +++++++++----
 3 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8277b76be983..2127441fabf4 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -94,18 +94,33 @@ static inline struct inode *wb_inode(struct list_head *head)
 #define CREATE_TRACE_POINTS
 #include <trace/events/writeback.h>
 
+static void bdi_wakeup_thread(struct backing_dev_info *bdi)
+{
+	spin_lock_bh(&bdi->wb_lock);
+	if (test_bit(BDI_registered, &bdi->state))
+		mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+	spin_unlock_bh(&bdi->wb_lock);
+}
+
 static void bdi_queue_work(struct backing_dev_info *bdi,
 			   struct wb_writeback_work *work)
 {
 	trace_writeback_queue(bdi, work);
 
 	spin_lock_bh(&bdi->wb_lock);
+	if (!test_bit(BDI_registered, &bdi->state)) {
+		if (work->done)
+			complete(work->done);
+		goto out_unlock;
+	}
 	list_add_tail(&work->list, &bdi->work_list);
-	spin_unlock_bh(&bdi->wb_lock);
-
 	mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+out_unlock:
+	spin_unlock_bh(&bdi->wb_lock);
 }
 
+
+
 static void
 __bdi_start_writeback(struct backing_dev_info *bdi, long nr_pages,
 		      bool range_cyclic, enum wb_reason reason)
@@ -119,7 +134,7 @@ __bdi_start_writeback(struct backing_dev_info *bdi, long nr_pages,
 	work = kzalloc(sizeof(*work), GFP_ATOMIC);
 	if (!work) {
 		trace_writeback_nowork(bdi);
-		mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+		bdi_wakeup_thread(bdi);
 		return;
 	}
 
@@ -166,7 +181,7 @@ void bdi_start_background_writeback(struct backing_dev_info *bdi)
 	 * writeback as soon as there is no other work to do.
 	 */
 	trace_writeback_wake_background(bdi);
-	mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+	bdi_wakeup_thread(bdi);
 }
 
 /*
@@ -1025,7 +1040,7 @@ void bdi_writeback_workfn(struct work_struct *work)
 	current->flags |= PF_SWAPWRITE;
 
 	if (likely(!current_is_workqueue_rescuer() ||
-		   list_empty(&bdi->bdi_list))) {
+		   !test_bit(BDI_registered, &bdi->state))) {
 		/*
 		 * The normal path.  Keep writing back @bdi until its
 		 * work_list is empty.  Note that this path is also taken
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 24819001f5c8..e488e9459a93 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -95,7 +95,7 @@ struct backing_dev_info {
 	unsigned int max_ratio, max_prop_frac;
 
 	struct bdi_writeback wb;  /* default writeback info for this bdi */
-	spinlock_t wb_lock;	  /* protects work_list */
+	spinlock_t wb_lock;	  /* protects work_list & wb.dwork scheduling */
 
 	struct list_head work_list;
 
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index fab8401fc54e..09d9591b7708 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -297,7 +297,10 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
 	unsigned long timeout;
 
 	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
-	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+	spin_lock_bh(&bdi->wb_lock);
+	if (test_bit(BDI_registered, &bdi->state))
+		queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+	spin_unlock_bh(&bdi->wb_lock);
 }
 
 /*
@@ -310,9 +313,6 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi)
 	spin_unlock_bh(&bdi_lock);
 
 	synchronize_rcu_expedited();
-
-	/* bdi_list is now unused, clear it to mark @bdi dying */
-	INIT_LIST_HEAD(&bdi->bdi_list);
 }
 
 int bdi_register(struct backing_dev_info *bdi, struct device *parent,
@@ -363,6 +363,11 @@ static void bdi_wb_shutdown(struct backing_dev_info *bdi)
 	 */
 	bdi_remove_from_list(bdi);
 
+	/* Make sure nobody queues further work */
+	spin_lock_bh(&bdi->wb_lock);
+	clear_bit(BDI_registered, &bdi->state);
+	spin_unlock_bh(&bdi->wb_lock);
+
 	/*
 	 * Drain work list and shutdown the delayed_work.  At this point,
 	 * @bdi->bdi_list is empty telling bdi_Writeback_workfn() that @bdi
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] bdi: Fix hung task on sync
  2014-02-25 22:29 ` [PATCH 1/2] bdi: Fix hung task on sync Jan Kara
@ 2014-02-25 22:43   ` Tejun Heo
  2014-02-27 16:13     ` Jan Kara
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2014-02-25 22:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, LKML, Jens Axboe, Derek Basehore, stable

On Tue, Feb 25, 2014 at 11:29:13PM +0100, Jan Kara wrote:
> From: Derek Basehore <dbasehore@chromium.org>
> 
> bdi_wakeup_thread_delayed() used the mod_delayed_work() function to
> schedule work to writeback dirty inodes. The problem with this is that
> it can delay work that is scheduled for immediate execution, such as the
> work from sync_inodes_sb().  This can happen since mod_delayed_work()
> can now steal work from a work_queue.  This fixes the problem by using
> queue_delayed_work() instead. This is a regression caused by
> 839a8e8660b6 "writeback: replace custom worker pool implementation with
> unbound workqueue".
> 
> The reason that this causes a problem is that laptop-mode will change
> the delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default.
> In the case that bdi_wakeup_thread_delayed() races with
> sync_inodes_sb(), sync will be stopped for 10 minutes and trigger a hung
> task. Even if dirty_writeback_centisecs is not long enough to cause a
> hung task, we still don't want to delay sync for that long.
> 
> We fix the problem by using queue_delayed_work() when we want to
> schedule writeback sometime in future. This function doesn't change the
> timer if it is already armed.
> 
> For the same reason, we also change bdi_writeback_workfn() to
> immediately queue the work again in the case that the work_list is not
> empty. The same problem can happen if the sync work is run on the rescue
> worker.
> 
> Fixes: 839a8e8660b6777e7fe4e80af1a048aebe2b5977
> CC: stable@vger.kernel.org
> Signed-off-by: Derek Basehore <dbasehore@chromium.org>
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] bdi: Fix hung task on sync
  2014-02-25 22:43   ` Tejun Heo
@ 2014-02-27 16:13     ` Jan Kara
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2014-02-27 16:13 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, linux-fsdevel, LKML, Jens Axboe, Derek Basehore, stable

On Tue 25-02-14 17:43:30, Tejun Heo wrote:
> On Tue, Feb 25, 2014 at 11:29:13PM +0100, Jan Kara wrote:
> > From: Derek Basehore <dbasehore@chromium.org>
> > 
> > bdi_wakeup_thread_delayed() used the mod_delayed_work() function to
> > schedule work to writeback dirty inodes. The problem with this is that
> > it can delay work that is scheduled for immediate execution, such as the
> > work from sync_inodes_sb().  This can happen since mod_delayed_work()
> > can now steal work from a work_queue.  This fixes the problem by using
> > queue_delayed_work() instead. This is a regression caused by
> > 839a8e8660b6 "writeback: replace custom worker pool implementation with
> > unbound workqueue".
> > 
> > The reason that this causes a problem is that laptop-mode will change
> > the delay, dirty_writeback_centisecs, to 60000 (10 minutes) by default.
> > In the case that bdi_wakeup_thread_delayed() races with
> > sync_inodes_sb(), sync will be stopped for 10 minutes and trigger a hung
> > task. Even if dirty_writeback_centisecs is not long enough to cause a
> > hung task, we still don't want to delay sync for that long.
> > 
> > We fix the problem by using queue_delayed_work() when we want to
> > schedule writeback sometime in future. This function doesn't change the
> > timer if it is already armed.
> > 
> > For the same reason, we also change bdi_writeback_workfn() to
> > immediately queue the work again in the case that the work_list is not
> > empty. The same problem can happen if the sync work is run on the rescue
> > worker.
> > 
> > Fixes: 839a8e8660b6777e7fe4e80af1a048aebe2b5977
> > CC: stable@vger.kernel.org
> > Signed-off-by: Derek Basehore <dbasehore@chromium.org>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> Reviewed-by: Tejun Heo <tj@kernel.org>
  Thanks for review. Did you have time to look into the patch 2/2?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] bdi: Avoid oops on device removal
  2014-02-25 22:29 ` [PATCH 2/2] bdi: Avoid oops on device removal Jan Kara
@ 2014-02-27 20:07   ` Tejun Heo
  2014-02-27 21:32     ` Jan Kara
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2014-02-27 20:07 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, LKML, Jens Axboe, stable

Hello,

On Tue, Feb 25, 2014 at 11:29:14PM +0100, Jan Kara wrote:
> +static void bdi_wakeup_thread(struct backing_dev_info *bdi)
> +{
> +	spin_lock_bh(&bdi->wb_lock);
> +	if (test_bit(BDI_registered, &bdi->state))
> +		mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
> +	spin_unlock_bh(&bdi->wb_lock);
> +}

I wonder whether this can be smarter without requiring wb_lock each
timer but this probably is the simplest for -stable backports.

>  static void bdi_queue_work(struct backing_dev_info *bdi,
>  			   struct wb_writeback_work *work)
>  {
>  	trace_writeback_queue(bdi, work);
>  
>  	spin_lock_bh(&bdi->wb_lock);
> +	if (!test_bit(BDI_registered, &bdi->state)) {
> +		if (work->done)
> +			complete(work->done);
> +		goto out_unlock;
> +	}
>  	list_add_tail(&work->list, &bdi->work_list);
> -	spin_unlock_bh(&bdi->wb_lock);
> -
>  	mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
> +out_unlock:
> +	spin_unlock_bh(&bdi->wb_lock);
>  }
>  
> +
> +

Why three blank lines?

Other than that,

Reviewed-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] bdi: Avoid oops on device removal
  2014-02-27 20:07   ` Tejun Heo
@ 2014-02-27 21:32     ` Jan Kara
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2014-02-27 21:32 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jan Kara, linux-fsdevel, LKML, Jens Axboe, stable

On Thu 27-02-14 15:07:48, Tejun Heo wrote:
> Hello,
> 
> On Tue, Feb 25, 2014 at 11:29:14PM +0100, Jan Kara wrote:
> > +static void bdi_wakeup_thread(struct backing_dev_info *bdi)
> > +{
> > +	spin_lock_bh(&bdi->wb_lock);
> > +	if (test_bit(BDI_registered, &bdi->state))
> > +		mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
> > +	spin_unlock_bh(&bdi->wb_lock);
> > +}
> 
> I wonder whether this can be smarter without requiring wb_lock each
> timer but this probably is the simplest for -stable backports.
  We could be clever, check whether the work is already queued for
execution and bail out without taking wb_lock if yes (that would also
save us some unnecessary juggling in try_to_grab_pending() for the situation
were the work is already queued). But I'm not sure how to cleanly implement
this...

> >  static void bdi_queue_work(struct backing_dev_info *bdi,
> >  			   struct wb_writeback_work *work)
> >  {
> >  	trace_writeback_queue(bdi, work);
> >  
> >  	spin_lock_bh(&bdi->wb_lock);
> > +	if (!test_bit(BDI_registered, &bdi->state)) {
> > +		if (work->done)
> > +			complete(work->done);
> > +		goto out_unlock;
> > +	}
> >  	list_add_tail(&work->list, &bdi->work_list);
> > -	spin_unlock_bh(&bdi->wb_lock);
> > -
> >  	mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
> > +out_unlock:
> > +	spin_unlock_bh(&bdi->wb_lock);
> >  }
> >  
> > +
> > +
> 
> Why three blank lines?
  A mistake. Will fix.

> Other than that,
> 
> Reviewed-by: Tejun Heo <tj@kernel.org>
  Thanks!

							Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 2/2] bdi: Avoid oops on device removal
  2014-02-27 21:43 [PATCH 0/2 v2] bdi: Fix hung tasks and oops in writeback Jan Kara
@ 2014-02-27 21:43 ` Jan Kara
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2014-02-27 21:43 UTC (permalink / raw)
  To: Jens Axboe; +Cc: LKML, linux-fsdevel, Tejun Heo, Jan Kara, stable

After 839a8e8660b67 "writeback: replace custom worker pool
implementation with unbound workqueue" when device is removed while we
are writing to it we crash in bdi_writeback_workfn() ->
set_worker_desc() because bdi->dev is NULL.  This can happen because
even though bdi_unregister() cancels all pending flushing work, nothing
really prevents new ones from being queued from balance_dirty_pages() or
other places.

Fix the problem by clearing BDI_registered bit in bdi_unregister() and
checking it before scheduling of any flushing work.

Fixes: 839a8e8660b6777e7fe4e80af1a048aebe2b5977
CC: stable@vger.kernel.org
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/fs-writeback.c           | 23 ++++++++++++++++++-----
 include/linux/backing-dev.h |  2 +-
 mm/backing-dev.c            | 13 +++++++++----
 3 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8277b76be983..c24e7b8b521b 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -94,16 +94,29 @@ static inline struct inode *wb_inode(struct list_head *head)
 #define CREATE_TRACE_POINTS
 #include <trace/events/writeback.h>
 
+static void bdi_wakeup_thread(struct backing_dev_info *bdi)
+{
+	spin_lock_bh(&bdi->wb_lock);
+	if (test_bit(BDI_registered, &bdi->state))
+		mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+	spin_unlock_bh(&bdi->wb_lock);
+}
+
 static void bdi_queue_work(struct backing_dev_info *bdi,
 			   struct wb_writeback_work *work)
 {
 	trace_writeback_queue(bdi, work);
 
 	spin_lock_bh(&bdi->wb_lock);
+	if (!test_bit(BDI_registered, &bdi->state)) {
+		if (work->done)
+			complete(work->done);
+		goto out_unlock;
+	}
 	list_add_tail(&work->list, &bdi->work_list);
-	spin_unlock_bh(&bdi->wb_lock);
-
 	mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+out_unlock:
+	spin_unlock_bh(&bdi->wb_lock);
 }
 
 static void
@@ -119,7 +132,7 @@ __bdi_start_writeback(struct backing_dev_info *bdi, long nr_pages,
 	work = kzalloc(sizeof(*work), GFP_ATOMIC);
 	if (!work) {
 		trace_writeback_nowork(bdi);
-		mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+		bdi_wakeup_thread(bdi);
 		return;
 	}
 
@@ -166,7 +179,7 @@ void bdi_start_background_writeback(struct backing_dev_info *bdi)
 	 * writeback as soon as there is no other work to do.
 	 */
 	trace_writeback_wake_background(bdi);
-	mod_delayed_work(bdi_wq, &bdi->wb.dwork, 0);
+	bdi_wakeup_thread(bdi);
 }
 
 /*
@@ -1025,7 +1038,7 @@ void bdi_writeback_workfn(struct work_struct *work)
 	current->flags |= PF_SWAPWRITE;
 
 	if (likely(!current_is_workqueue_rescuer() ||
-		   list_empty(&bdi->bdi_list))) {
+		   !test_bit(BDI_registered, &bdi->state))) {
 		/*
 		 * The normal path.  Keep writing back @bdi until its
 		 * work_list is empty.  Note that this path is also taken
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 24819001f5c8..e488e9459a93 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -95,7 +95,7 @@ struct backing_dev_info {
 	unsigned int max_ratio, max_prop_frac;
 
 	struct bdi_writeback wb;  /* default writeback info for this bdi */
-	spinlock_t wb_lock;	  /* protects work_list */
+	spinlock_t wb_lock;	  /* protects work_list & wb.dwork scheduling */
 
 	struct list_head work_list;
 
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index fab8401fc54e..09d9591b7708 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -297,7 +297,10 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
 	unsigned long timeout;
 
 	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
-	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+	spin_lock_bh(&bdi->wb_lock);
+	if (test_bit(BDI_registered, &bdi->state))
+		queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+	spin_unlock_bh(&bdi->wb_lock);
 }
 
 /*
@@ -310,9 +313,6 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi)
 	spin_unlock_bh(&bdi_lock);
 
 	synchronize_rcu_expedited();
-
-	/* bdi_list is now unused, clear it to mark @bdi dying */
-	INIT_LIST_HEAD(&bdi->bdi_list);
 }
 
 int bdi_register(struct backing_dev_info *bdi, struct device *parent,
@@ -363,6 +363,11 @@ static void bdi_wb_shutdown(struct backing_dev_info *bdi)
 	 */
 	bdi_remove_from_list(bdi);
 
+	/* Make sure nobody queues further work */
+	spin_lock_bh(&bdi->wb_lock);
+	clear_bit(BDI_registered, &bdi->state);
+	spin_unlock_bh(&bdi->wb_lock);
+
 	/*
 	 * Drain work list and shutdown the delayed_work.  At this point,
 	 * @bdi->bdi_list is empty telling bdi_Writeback_workfn() that @bdi
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-02-27 21:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-25 22:29 [PATCH 0/2] bdi fixes Jan Kara
2014-02-25 22:29 ` [PATCH 1/2] bdi: Fix hung task on sync Jan Kara
2014-02-25 22:43   ` Tejun Heo
2014-02-27 16:13     ` Jan Kara
2014-02-25 22:29 ` [PATCH 2/2] bdi: Avoid oops on device removal Jan Kara
2014-02-27 20:07   ` Tejun Heo
2014-02-27 21:32     ` Jan Kara
2014-02-27 21:43 [PATCH 0/2 v2] bdi: Fix hung tasks and oops in writeback Jan Kara
2014-02-27 21:43 ` [PATCH 2/2] bdi: Avoid oops on device removal Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).