* [PATCH block/for-5.14-fixes] blk-iocost: fix operation ordering in iocg_wake_fn()
@ 2021-07-28  0:38 Tejun Heo
From: Tejun Heo @ 2021-07-28  0:38 UTC
  To: Jens Axboe; +Cc: linux-block, cgroups, linux-kernel, Rik van Riel

From aae4e1b4e26c3c671fc19aed2fb2ee19f7438707 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 27 Jul 2021 14:21:30 -1000

iocg_wake_fn() open-codes wait_queue_entry removal and wakeup because it
wants the wq_entry to always be removed, whether or not it ended up waking
the task. finish_wait() tests whether wq_entry needs removal without
grabbing the wait_queue lock and expects the waker to use
list_del_init_careful() after all waking operations are complete.
iocg_wake_fn() did neither: it removed the entry before the wakeup, and it
used the regular list_del_init().
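
For illustration only, here is a userspace sketch of the ordering rule
described above. It is not kernel code: struct model_wait, model_wake() and
model_wait_once() are hypothetical names, a C11 release store stands in for
list_del_init_careful(), and an acquire load stands in for the lockless
check done by finish_wait(). The only point is that the "removed" store has
to be the waker's last access to the entry.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an on-stack wait entry. */
struct model_wait {
	atomic_bool queued;	/* models !list_empty(&wq_entry->entry) */
	bool committed;		/* models wait->committed */
	int wake_count;		/* models state touched by the wakeup call */
};

/* Waker side, in the order the fix establishes. */
static void model_wake(struct model_wait *w)
{
	w->committed = true;	/* publish the result first */
	w->wake_count++;	/* models the wakeup still using the entry */
	/*
	 * Release store, done last: once the waiter observes this, it may
	 * return and reuse the stack memory, so nothing may touch *w after.
	 */
	atomic_store_explicit(&w->queued, false, memory_order_release);
}

static void *waker_thread(void *arg)
{
	model_wake(arg);
	return NULL;
}

/* Waiter side: polls the flag without a lock. Returns 1 on success. */
static int model_wait_once(void)
{
	struct model_wait w = { .committed = false, .wake_count = 0 };
	pthread_t t;
	int rc, ok;

	atomic_init(&w.queued, true);
	rc = pthread_create(&t, NULL, waker_thread, &w);
	assert(rc == 0);
	(void)rc;

	/* A real waiter would sleep here; the model just yields and polls. */
	while (atomic_load_explicit(&w.queued, memory_order_acquire))
		sched_yield();

	/* The acquire load pairs with the release store in model_wake(). */
	ok = w.committed && w.wake_count == 1;
	pthread_join(t, NULL);
	return ok;
}
```

Moving the release store above the wake_count update in model_wake() would
recreate the reported bug in miniature: the waiter could leave
model_wait_once() while the waker is still writing to its stack frame.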

The result is that if a waiter wakes up racing the waker, it can pop the
wq_entry off the stack while the waker is still looking at it, which can
lead to a backtrace like the following.

  [7312084.588951] general protection fault, probably for non-canonical address 0x586bf4005b2b88: 0000 [#1] SMP
  ...
  [7312084.647079] RIP: 0010:queued_spin_lock_slowpath+0x171/0x1b0
  ...
  [7312084.858314] Call Trace:
  [7312084.863548]  _raw_spin_lock_irqsave+0x22/0x30
  [7312084.872605]  try_to_wake_up+0x4c/0x4f0
  [7312084.880444]  iocg_wake_fn+0x71/0x80
  [7312084.887763]  __wake_up_common+0x71/0x140
  [7312084.895951]  iocg_kick_waitq+0xe8/0x2b0
  [7312084.903964]  ioc_rqos_throttle+0x275/0x650
  [7312084.922423]  __rq_qos_throttle+0x20/0x30
  [7312084.930608]  blk_mq_make_request+0x120/0x650
  [7312084.939490]  generic_make_request+0xca/0x310
  [7312084.957600]  submit_bio+0x173/0x200
  [7312084.981806]  swap_readpage+0x15c/0x240
  [7312084.989646]  read_swap_cache_async+0x58/0x60
  [7312084.998527]  swap_cluster_readahead+0x201/0x320
  [7312085.023432]  swapin_readahead+0x2df/0x450
  [7312085.040672]  do_swap_page+0x52f/0x820
  [7312085.058259]  handle_mm_fault+0xa16/0x1420
  [7312085.066620]  do_page_fault+0x2c6/0x5c0
  [7312085.074459]  page_fault+0x2f/0x40

Fix it by switching to list_del_init_careful() and putting it at the end.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Rik van Riel <riel@surriel.com>
Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Cc: stable@vger.kernel.org # v5.4+
---
 block/blk-iocost.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index c2d6bc88d3f15..5fac3757e6e05 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -1440,16 +1440,17 @@ static int iocg_wake_fn(struct wait_queue_entry *wq_entry, unsigned mode,
 		return -1;
 
 	iocg_commit_bio(ctx->iocg, wait->bio, wait->abs_cost, cost);
+	wait->committed = true;
 
 	/*
 	 * autoremove_wake_function() removes the wait entry only when it
-	 * actually changed the task state.  We want the wait always
-	 * removed.  Remove explicitly and use default_wake_function().
+	 * actually changed the task state. We want the wait always removed.
+	 * Remove explicitly and use default_wake_function(). Note that the
+	 * order of operations is important as finish_wait() tests whether
+	 * @wq_entry is removed without grabbing the lock.
 	 */
-	list_del_init(&wq_entry->entry);
-	wait->committed = true;
-
 	default_wake_function(wq_entry, mode, flags, key);
+	list_del_init_careful(&wq_entry->entry);
 	return 0;
 }
 
-- 
2.32.0



* Re: [PATCH block/for-5.14-fixes] blk-iocost: fix operation ordering in iocg_wake_fn()
@ 2021-07-28  1:27 ` Jens Axboe
From: Jens Axboe @ 2021-07-28  1:27 UTC
  To: Tejun Heo; +Cc: linux-block, cgroups, linux-kernel, Rik van Riel

On 7/27/21 6:38 PM, Tejun Heo wrote:
> Fix it by switching to list_del_init_careful() and putting it at the end.

Fixed up the malformed commit message, applied, thanks!

-- 
Jens Axboe

