* [PATCH v3] fs/writeback: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
@ 2019-04-18  2:04 Jiufei Xue
From: Jiufei Xue @ 2019-04-18  2:04 UTC
  To: cgroups, linux-mm; +Cc: tj, akpm, joseph.qi, bo.liu

synchronize_rcu() does not wait for call_rcu() callbacks, so an inode wb
switch may not have reached the workqueue by the time synchronize_rcu()
returns. Previously scheduled switches can therefore still be unfinished
even after the workqueue is flushed, which causes the NULL pointer
dereference shown below.

VFS: Busy inodes after unmount of vdd. Self-destruct in 5 seconds.  Have a nice day...
BUG: unable to handle kernel NULL pointer dereference at 0000000000000278
[<ffffffff8126a303>] evict+0xb3/0x180
[<ffffffff8126a760>] iput+0x1b0/0x230
[<ffffffff8127c690>] inode_switch_wbs_work_fn+0x3c0/0x6a0
[<ffffffff810a5b2e>] worker_thread+0x4e/0x490
[<ffffffff810a5ae0>] ? process_one_work+0x410/0x410
[<ffffffff810ac056>] kthread+0xe6/0x100
[<ffffffff8173c199>] ret_from_fork+0x39/0x50

Replace the synchronize_rcu() call with rcu_barrier() to wait for all
pending callbacks to finish. Also increment isw_nr_in_flight after
call_rcu() in inode_switch_wbs(), which makes more sense.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Cc: stable@kernel.org
---
 fs/fs-writeback.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 36855c1f8daf..fede1f685539 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -523,8 +523,6 @@ static void inode_switch_wbs(struct inode *inode, int new_wb_id)
 
 	isw->inode = inode;
 
-	atomic_inc(&isw_nr_in_flight);
-
 	/*
 	 * In addition to synchronizing among switchers, I_WB_SWITCH tells
 	 * the RCU protected stat update paths to grab the i_page
@@ -532,6 +530,9 @@ static void inode_switch_wbs(struct inode *inode, int new_wb_id)
 	 * Let's continue after I_WB_SWITCH is guaranteed to be visible.
 	 */
 	call_rcu(&isw->rcu_head, inode_switch_wbs_rcu_fn);
+
+	atomic_inc(&isw_nr_in_flight);
+
 	goto out_unlock;
 
 out_free:
@@ -901,7 +902,7 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 void cgroup_writeback_umount(void)
 {
 	if (atomic_read(&isw_nr_in_flight)) {
-		synchronize_rcu();
+		rcu_barrier();
 		flush_workqueue(isw_wq);
 	}
 }
-- 
2.19.1.856.g8858448bb
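
The ordering problem described in the commit message can be modeled in
user space. The sketch below is a toy model only, not kernel code: every
toy_* name and simulate_umount() are hypothetical helpers invented for
illustration. It assumes the worst case in which no call_rcu() callback
has run yet when the umount path starts, and it models each callback as
simply queueing its switch on a workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; none of these names are kernel APIs. */
#define MAX_PENDING 8

struct toy_switch {
	bool finished;	/* set once the modeled work item has run */
};

/* Callbacks registered with toy_call_rcu() but not yet invoked. */
static struct toy_switch *rcu_cbs[MAX_PENDING];
static size_t nr_rcu_cbs;

/* Work items queued on the toy workqueue but not yet executed. */
static struct toy_switch *wq_items[MAX_PENDING];
static size_t nr_wq_items;

/* Models call_rcu(): only registers the callback, nothing runs yet. */
static void toy_call_rcu(struct toy_switch *sw)
{
	assert(nr_rcu_cbs < MAX_PENDING);
	rcu_cbs[nr_rcu_cbs++] = sw;
}

/* Models synchronize_rcu(): waits for readers, leaves callbacks pending. */
static void toy_synchronize_rcu(void)
{
}

/*
 * Models rcu_barrier(): returns only after every pending callback has run.
 * In this model a callback just queues its switch on the workqueue.
 */
static void toy_rcu_barrier(void)
{
	for (size_t i = 0; i < nr_rcu_cbs; i++) {
		assert(nr_wq_items < MAX_PENDING);
		wq_items[nr_wq_items++] = rcu_cbs[i];
	}
	nr_rcu_cbs = 0;
}

/* Models flush_workqueue(): runs everything queued so far. */
static void toy_flush_workqueue(void)
{
	for (size_t i = 0; i < nr_wq_items; i++)
		wq_items[i]->finished = true;
	nr_wq_items = 0;
}

/* Returns how many switches are still unfinished after the umount path. */
int simulate_umount(bool use_barrier)
{
	struct toy_switch sw[3] = { { false }, { false }, { false } };
	int unfinished = 0;

	nr_rcu_cbs = 0;
	nr_wq_items = 0;

	for (size_t i = 0; i < 3; i++)
		toy_call_rcu(&sw[i]);

	if (use_barrier)
		toy_rcu_barrier();
	else
		toy_synchronize_rcu();
	toy_flush_workqueue();

	for (size_t i = 0; i < 3; i++)
		if (!sw[i].finished)
			unfinished++;
	return unfinished;
}
```

In this model simulate_umount(false) leaves all three switches
unfinished, because the flush runs before anything was queued, while
simulate_umount(true) leaves none. On a real system some callbacks may
already have run, so the failure is a race rather than a certainty.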



* Re: [PATCH v3] fs/writeback: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
From: Andrew Morton @ 2019-04-18 22:32 UTC
  To: Jiufei Xue; +Cc: cgroups, linux-mm, tj, joseph.qi, bo.liu

On Thu, 18 Apr 2019 10:04:26 +0800 Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote:

> synchronize_rcu() does not wait for call_rcu() callbacks, so an inode wb
> switch may not have reached the workqueue by the time synchronize_rcu()
> returns. Previously scheduled switches can therefore still be unfinished
> even after the workqueue is flushed, which causes the NULL pointer
> dereference shown below.
> 
> VFS: Busy inodes after unmount of vdd. Self-destruct in 5 seconds.  Have a nice day...
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000278
> [<ffffffff8126a303>] evict+0xb3/0x180
> [<ffffffff8126a760>] iput+0x1b0/0x230
> [<ffffffff8127c690>] inode_switch_wbs_work_fn+0x3c0/0x6a0
> [<ffffffff810a5b2e>] worker_thread+0x4e/0x490
> [<ffffffff810a5ae0>] ? process_one_work+0x410/0x410
> [<ffffffff810ac056>] kthread+0xe6/0x100
> [<ffffffff8173c199>] ret_from_fork+0x39/0x50
> 
> Replace the synchronize_rcu() call with rcu_barrier() to wait for all
> pending callbacks to finish. Also increment isw_nr_in_flight after
> call_rcu() in inode_switch_wbs(), which makes more sense.
> 
> ...
>
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
>
> ...
>
> @@ -901,7 +902,7 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
>  void cgroup_writeback_umount(void)
>  {
>  	if (atomic_read(&isw_nr_in_flight)) {
> -		synchronize_rcu();
> +		rcu_barrier();
>  		flush_workqueue(isw_wq);
>  	}
>  }

It would be nice to have a comment here explaining why the barrier is
being performed.



* Re: [PATCH v3] fs/writeback: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
From: Tejun Heo @ 2019-04-19 18:32 UTC
  To: Jiufei Xue; +Cc: cgroups, linux-mm, akpm, joseph.qi, bo.liu

On Thu, Apr 18, 2019 at 10:04:26AM +0800, Jiufei Xue wrote:
> synchronize_rcu() does not wait for call_rcu() callbacks, so an inode wb
> switch may not have reached the workqueue by the time synchronize_rcu()
> returns. Previously scheduled switches can therefore still be unfinished
> even after the workqueue is flushed, which causes the NULL pointer
> dereference shown below.
> 
> VFS: Busy inodes after unmount of vdd. Self-destruct in 5 seconds.  Have a nice day...
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000278
> [<ffffffff8126a303>] evict+0xb3/0x180
> [<ffffffff8126a760>] iput+0x1b0/0x230
> [<ffffffff8127c690>] inode_switch_wbs_work_fn+0x3c0/0x6a0
> [<ffffffff810a5b2e>] worker_thread+0x4e/0x490
> [<ffffffff810a5ae0>] ? process_one_work+0x410/0x410
> [<ffffffff810ac056>] kthread+0xe6/0x100
> [<ffffffff8173c199>] ret_from_fork+0x39/0x50
> 
> Replace the synchronize_rcu() call with rcu_barrier() to wait for all
> pending callbacks to finish. Also increment isw_nr_in_flight after
> call_rcu() in inode_switch_wbs(), which makes more sense.
> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
> Cc: stable@kernel.org

Except for the documentation part that Andrew raised,

  Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

