* [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: Jan Kara @ 2022-01-05 14:36 UTC
  To: linux-block; +Cc: Paolo Valente, Jens Axboe, yukuai (C), Jan Kara

Hello,

here is the second version of my patches to fix use-after-free issues in BFQ
when processes with merged queues get moved to different cgroups. The patches
have survived some beating in my test VM, but so far I have failed to
reproduce the original KASAN reports, so testing from people who can
reproduce them is most welcome. Thanks!

Changes since v1:
* Added fix for bfq_put_cooperator()
* Added fix to handle move between cgroups in bfq_merge_bio()

								Honza
Previous versions:
Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1


* [PATCH 1/5] bfq: Avoid false marking of bic as stably merged
From: Jan Kara @ 2022-01-05 14:36 UTC
  To: linux-block; +Cc: Paolo Valente, Jens Axboe, yukuai (C), Jan Kara, stable

bfq_setup_cooperator() can mark bic as stably merged even though it
decides not to merge its bfqqs (when bfq_setup_merge() returns NULL).
Make sure to mark the bic as stably merged only if we are really going
to merge the bfqqs.

CC: stable@vger.kernel.org
Fixes: 430a67f9d616 ("block, bfq: merge bursts of newly-created queues")
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bfq-iosched.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index fec18118dc30..056399185c2f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2762,9 +2762,12 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 				struct bfq_queue *new_bfqq =
 					bfq_setup_merge(bfqq, stable_merge_bfqq);
 
-				bic->stably_merged = true;
-				if (new_bfqq && new_bfqq->bic)
-					new_bfqq->bic->stably_merged = true;
+				if (new_bfqq) {
+					bic->stably_merged = true;
+					if (new_bfqq->bic)
+						new_bfqq->bic->stably_merged =
+									true;
+				}
 				return new_bfqq;
 			} else
 				return NULL;
-- 
2.31.1



* [PATCH 2/5] bfq: Avoid merging queues with different parents
From: Jan Kara @ 2022-01-05 14:36 UTC
  To: linux-block; +Cc: Paolo Valente, Jens Axboe, yukuai (C), Jan Kara, stable

It can happen that the parent of a bfqq changes between the moment we
decide two queues are worth merging (and set bic->stable_merge_bfqq)
and the moment bfq_setup_merge() is called. This can happen e.g. because
the process submitted IO for a different cgroup and thus the bfqq got
reparented. It can even happen that the bfqq we are merging with has a
parent cgroup that is already offline and going to be destroyed, in
which case the merge can lead to use-after-free issues such as:

BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x9cb/0xa50
Read of size 8 at addr ffff88800693c0c0 by task runc:[2:INIT]/10544

CPU: 0 PID: 10544 Comm: runc:[2:INIT] Tainted: G            E     5.15.2-0.g5fb85fd-default #1 openSUSE Tumbleweed (unreleased) f1f3b891c72369aebecd2e43e4641a6358867c70
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
Call Trace:
 <IRQ>
 dump_stack_lvl+0x46/0x5a
 print_address_description.constprop.0+0x1f/0x140
 ? __bfq_deactivate_entity+0x9cb/0xa50
 kasan_report.cold+0x7f/0x11b
 ? __bfq_deactivate_entity+0x9cb/0xa50
 __bfq_deactivate_entity+0x9cb/0xa50
 ? update_curr+0x32f/0x5d0
 bfq_deactivate_entity+0xa0/0x1d0
 bfq_del_bfqq_busy+0x28a/0x420
 ? resched_curr+0x116/0x1d0
 ? bfq_requeue_bfqq+0x70/0x70
 ? check_preempt_wakeup+0x52b/0xbc0
 __bfq_bfqq_expire+0x1a2/0x270
 bfq_bfqq_expire+0xd16/0x2160
 ? try_to_wake_up+0x4ee/0x1260
 ? bfq_end_wr_async_queues+0xe0/0xe0
 ? _raw_write_unlock_bh+0x60/0x60
 ? _raw_spin_lock_irq+0x81/0xe0
 bfq_idle_slice_timer+0x109/0x280
 ? bfq_dispatch_request+0x4870/0x4870
 __hrtimer_run_queues+0x37d/0x700
 ? enqueue_hrtimer+0x1b0/0x1b0
 ? kvm_clock_get_cycles+0xd/0x10
 ? ktime_get_update_offsets_now+0x6f/0x280
 hrtimer_interrupt+0x2c8/0x740

Fix the problem by checking in bfq_setup_merge() that the two bfqqs we
are merging have the same parent.

Link: https://lore.kernel.org/linux-block/20211125172809.GC19572@quack2.suse.cz/
CC: stable@vger.kernel.org
Fixes: 430a67f9d616 ("block, bfq: merge bursts of newly-created queues")
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bfq-iosched.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 056399185c2f..0da47f2ca781 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2638,6 +2638,14 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
 	if (process_refs == 0 || new_process_refs == 0)
 		return NULL;
 
+	/*
+	 * Make sure merged queues belong to the same parent. Parents could
+	 * have changed since the time we decided the two queues are suitable
+	 * for merging.
+	 */
+	if (new_bfqq->entity.parent != bfqq->entity.parent)
+		return NULL;
+
 	bfq_log_bfqq(bfqq->bfqd, bfqq, "scheduling merge with queue %d",
 		new_bfqq->pid);
 
-- 
2.31.1



* [PATCH 3/5] bfq: Simplify bfq_put_cooperator()
From: Jan Kara @ 2022-01-05 14:36 UTC
  To: linux-block; +Cc: Paolo Valente, Jens Axboe, yukuai (C), Jan Kara, stable

All calls to bfq_setup_merge() are followed by bfq_merge_bfqqs(), so
there should be no chance of chaining several queue merges. And if
chained queue merges were possible, bfq_put_cooperator() would drop
cooperator references without clearing the corresponding bfqq->new_bfqq
pointers, causing possible use-after-free issues. Fix these problems by
making bfq_put_cooperator() drop only the immediate bfqq->new_bfqq
reference.

CC: stable@vger.kernel.org
Fixes: 36eca8948323 ("block, bfq: add Early Queue Merge (EQM)")
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bfq-iosched.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0da47f2ca781..654191c6face 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5184,22 +5184,16 @@ static void bfq_put_stable_ref(struct bfq_queue *bfqq)
 	bfq_put_queue(bfqq);
 }
 
+
+/*
+ * If this queue was scheduled to merge with another queue, be
+ * sure to drop the reference taken on that queue.
+ */
 static void bfq_put_cooperator(struct bfq_queue *bfqq)
 {
-	struct bfq_queue *__bfqq, *next;
-
-	/*
-	 * If this queue was scheduled to merge with another queue, be
-	 * sure to drop the reference taken on that queue (and others in
-	 * the merge chain). See bfq_setup_merge and bfq_merge_bfqqs.
-	 */
-	__bfqq = bfqq->new_bfqq;
-	while (__bfqq) {
-		if (__bfqq == bfqq)
-			break;
-		next = __bfqq->new_bfqq;
-		bfq_put_queue(__bfqq);
-		__bfqq = next;
+	if (bfqq->new_bfqq) {
+		bfq_put_queue(bfqq->new_bfqq);
+		bfqq->new_bfqq = NULL;
 	}
 }
 
-- 
2.31.1



* [PATCH 4/5] bfq: Split shared queues on move between cgroups
From: Jan Kara @ 2022-01-05 14:36 UTC
  To: linux-block; +Cc: Paolo Valente, Jens Axboe, yukuai (C), Jan Kara, stable

When a bfqq is shared by multiple processes, it can happen that one of
the processes gets moved to a different cgroup (or just starts
submitting IO for a different cgroup). When that happens we need to
split the merged bfqq, as otherwise we will have IO for multiple cgroups
in one bfqq and will account IO time to the wrong entities.

Similarly, if the bfqq is scheduled to merge with another bfqq but the
merge hasn't happened yet, cancel the merge as it need not be valid
anymore.

CC: stable@vger.kernel.org
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bfq-cgroup.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 24a5c5329bcd..a78d86805bd5 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -730,8 +730,19 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
 
 	if (sync_bfqq) {
 		entity = &sync_bfqq->entity;
-		if (entity->sched_data != &bfqg->sched_data)
+		if (entity->sched_data != &bfqg->sched_data) {
+			/*
+			 * Moving bfqq that is shared with another process?
+			 * Split the queues at the nearest occasion as the
+			 * processes can be in different cgroups now.
+			 */
+			if (bfq_bfqq_coop(sync_bfqq)) {
+				bic->stably_merged = false;
+				bfq_mark_bfqq_split_coop(sync_bfqq);
+			}
+			WARN_ON_ONCE(sync_bfqq->new_bfqq);
 			bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
+		}
 	}
 
 	return bfqg;
-- 
2.31.1



* [PATCH 5/5] bfq: Update cgroup information before merging bio
From: Jan Kara @ 2022-01-05 14:36 UTC
  To: linux-block; +Cc: Paolo Valente, Jens Axboe, yukuai (C), Jan Kara, stable

When a process is migrated to a different cgroup (or, in the case of
writeback, just starts submitting bios associated with a different
cgroup), bfq_merge_bio() can operate with stale cgroup information in
the bic. Thus the bio can be merged into a request from a different
cgroup, or the merge can join bfqqs of different cgroups or of already
dead cgroups, causing possible use-after-free issues. Fix the problem by
updating the cgroup information in bfq_merge_bio().

CC: stable@vger.kernel.org
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bfq-iosched.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 654191c6face..f77f79d1d04c 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2337,10 +2337,17 @@ static bool bfq_bio_merge(struct request_queue *q, struct bio *bio,
 
 	spin_lock_irq(&bfqd->lock);
 
-	if (bic)
+	if (bic) {
+		/*
+		 * Make sure cgroup info is up to date for the current process
+		 * before considering the merge.
+		 */
+		bfq_bic_update_cgroup(bic, bio);
+
 		bfqd->bio_bfqq = bic_to_bfqq(bic, op_is_sync(bio->bi_opf));
-	else
+	} else {
 		bfqd->bio_bfqq = NULL;
+	}
 	bfqd->bio_bic = bic;
 
 	ret = blk_mq_sched_try_merge(q, bio, nr_segs, &free);
-- 
2.31.1



* Re: [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: yukuai (C) @ 2022-01-07  9:15 UTC
  To: Jan Kara, linux-block; +Cc: Paolo Valente, Jens Axboe

On 2022/01/05 22:36, Jan Kara wrote:
> [...]

Hi,

I reproduced the problem again with this patchset...

[   71.004788] BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x21/0x290
[   71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801
[   71.007723]
[   71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G        W 5.16.0-rc5-next-2021127
[   71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
[   71.012274] Call Trace:
[   71.012603]  <TASK>
[   71.012886]  dump_stack_lvl+0x34/0x44
[   71.013379]  print_address_description.constprop.0.cold+0xab/0x36b
[   71.014182]  ? __bfq_deactivate_entity+0x21/0x290
[   71.014795]  ? __bfq_deactivate_entity+0x21/0x290
[   71.015398]  kasan_report.cold+0x83/0xdf
[   71.015904]  ? _raw_read_lock_bh+0x20/0x40
[   71.016433]  ? __bfq_deactivate_entity+0x21/0x290
[   71.017033]  __bfq_deactivate_entity+0x21/0x290
[   71.017617]  bfq_pd_offline+0xc1/0x110
[   71.018105]  blkcg_deactivate_policy+0x14b/0x210
[   71.018699]  bfq_exit_queue+0xe5/0x100
[   71.019189]  blk_mq_exit_sched+0x113/0x140
[   71.019720]  elevator_exit+0x30/0x50
[   71.020186]  blk_release_queue+0xa8/0x160
[   71.020702]  kobject_put+0xd4/0x270
[   71.021158]  disk_release+0xc5/0xf0
[   71.021609]  device_release+0x56/0xe0
[   71.022086]  kobject_put+0xd4/0x270
[   71.022546]  null_del_dev.part.0+0xe6/0x220 [null_blk]
[   71.023228]  null_exit+0x62/0xa6 [null_blk]
[   71.023792]  __x64_sys_delete_module+0x20a/0x2f0
[   71.024387]  ? __ia32_sys_delete_module+0x2f0/0x2f0
[   71.025008]  ? fpregs_assert_state_consistent+0x55/0x60
[   71.025674]  ? exit_to_user_mode_prepare+0x39/0x1e0
[   71.026304]  do_syscall_64+0x35/0x80
[   71.026763]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   71.027410] RIP: 0033:0x7fddec6bdfc7
[   71.027869] Code: 73 01 c3 48 8b 0d c1 de 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 8
[   71.030206] RSP: 002b:00007fff9486bc08 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[   71.031160] RAX: ffffffffffffffda RBX: 00007fff9486bc68 RCX: 00007fddec6bdfc7
[   71.032058] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055fc863ac258
[   71.032962] RBP: 000055fc863ac1f0 R08: 00007fff9486ab81 R09: 0000000000000000
[   71.033861] R10: 00007fddec730260 R11: 0000000000000206 R12: 00007fff9486be30
[   71.034758] R13: 00007fff9486c45b R14: 000055fc863ab010 R15: 000055fc863ac1f0
[   71.035659]  </TASK>
[   71.035946]
[   71.036149] Allocated by task 620:
[   71.036585]  kasan_save_stack+0x1e/0x40
[   71.037076]  __kasan_kmalloc+0x81/0xa0
[   71.037558]  bfq_pd_alloc+0x72/0x110
[   71.038015]  blkg_alloc+0x252/0x2c0
[   71.038467]  blkg_create+0x38e/0x560
[   71.038927]  bio_associate_blkg_from_css+0x3bc/0x490
[   71.039563]  bio_associate_blkg+0x3b/0x90
[   71.040076]  mpage_alloc+0x7c/0xe0
[   71.040516]  do_mpage_readpage+0xb42/0xff0
[   71.041040]  mpage_readahead+0x1fd/0x3f0
[   71.041544]  read_pages+0x144/0x770
[   71.041994]  page_cache_ra_unbounded+0x2d2/0x380
[   71.042586]  filemap_get_pages+0x1bc/0xaf0
[   71.043115]  filemap_read+0x1bf/0x5a0
[   71.043585]  aio_read+0x1ca/0x2f0
[   71.044014]  io_submit_one+0x874/0x11c0
[   71.044511]  __x64_sys_io_submit+0xfa/0x250
[   71.045048]  do_syscall_64+0x35/0x80
[   71.045513]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   71.046161]
[   71.046361] Freed by task 0:
[   71.046735]  kasan_save_stack+0x1e/0x40
[   71.047230]  kasan_set_track+0x21/0x30
[   71.047709]  kasan_set_free_info+0x20/0x30
[   71.048235]  __kasan_slab_free+0x103/0x180
[   71.048757]  kfree+0x9a/0x4b0
[   71.049144]  blkg_free.part.0+0x4a/0x90
[   71.049635]  rcu_do_batch+0x2e1/0x760
[   71.050111]  rcu_core+0x367/0x530
[   71.050536]  __do_softirq+0x119/0x3ba
[   71.051007]
[   71.051210] The buggy address belongs to the object at ffff88817a3dc000
[   71.051210]  which belongs to the cache kmalloc-2k of size 2048
[   71.052766] The buggy address is located 176 bytes inside of
[   71.052766]  2048-byte region [ffff88817a3dc000, ffff88817a3dc800)
[   71.054240] The buggy address belongs to the page:
[   71.054843] page:00000000ce62c7c2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x08
[   71.056021] head:00000000ce62c7c2 order:3 compound_mapcount:0 compound_pincount:0
[   71.056961] flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
[   71.057894] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff888100042f00
[   71.058866] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000
[   71.059840] page dumped because: kasan: bad access detected
[   71.060543]
[   71.060742] Memory state around the buggy address:
[   71.061349]  ffff88817a3dbf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   71.062256]  ffff88817a3dc000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.063163] >ffff88817a3dc080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.064129]                                      ^
[   71.064734]  ffff88817a3dc100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.065639]  ffff88817a3dc180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.066544] ==================================================================

Here is the caller of __bfq_deactivate_entity:
(gdb) list *(bfq_pd_offline+0xc1)
0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942).
937                      * entities to the idle tree. It happens if, in some
938                      * of the calls to bfq_bfqq_move() performed by
939                      * bfq_reparent_active_queues(), the queue to move is
940                      * empty and gets expired.
941                      */
942                     bfq_flush_idle_tree(st);
943             }
944
945             __bfq_deactivate_entity(entity, false);


* Re: [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: Jan Kara @ 2022-01-07 14:58 UTC
  To: yukuai (C); +Cc: Jan Kara, linux-block, Paolo Valente, Jens Axboe

On Fri 07-01-22 17:15:43, yukuai (C) wrote:
> On 2022/01/05 22:36, Jan Kara wrote:
> > [...]
> 
> Hi,
> 
> I reproduced the problem again with this patchset...

Thanks for testing! 

> [   71.004788] BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x21/0x290
...

> Here is the caller of __bfq_deactivate_entity:
> (gdb) list *(bfq_pd_offline+0xc1)
> 0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942).
> 937                      * entities to the idle tree. It happens if, in some
> 938                      * of the calls to bfq_bfqq_move() performed by
> 939                      * bfq_reparent_active_queues(), the queue to move is
> 940                      * empty and gets expired.
> 941                      */
> 942                     bfq_flush_idle_tree(st);
> 943             }
> 944
> 945             __bfq_deactivate_entity(entity, false);

So this is indeed strange. The group has, in some of its idle service
trees, an entity whose entity->sched_data points to an already freed
cgroup. In particular it means that bfqq_entity_service_tree() leads
somewhere else than the 'st' we called bfq_flush_idle_tree() with. This
looks like a different kind of problem AFAICT but it still needs fixing :).
Can you please run your reproducer with my patches + the attached debug
patch on top? Hopefully it should tell us more about the problematic
entity and how it got there... Thanks!

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: Jan Kara @ 2022-01-07 17:27 UTC
  To: yukuai (C); +Cc: Jan Kara, linux-block, Paolo Valente, Jens Axboe


On Fri 07-01-22 15:58:53, Jan Kara wrote:
> [...]
> Can you please run your reproducer with my patches + the attached debug
> patch on top? Hopefully it should tell us more about the problematic
> entity and how it got there... Thanks!

Forgot to attach the patch. Here it is...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-bfq-Debug-entity-attached-to-dead-cgroup.patch --]
[-- Type: text/x-patch, Size: 2226 bytes --]

From 4c4a832f7bb49bd319cd16e613746368b322a4a0 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Fri, 7 Jan 2022 18:26:42 +0100
Subject: [PATCH] bfq: Debug entity attached to dead cgroup

Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bfq-cgroup.c | 20 ++++++++++++++++----
 block/bfq-wf2q.c   |  1 +
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index a78d86805bd5..63386f0cc375 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -825,12 +825,24 @@ void bfq_bic_update_cgroup(struct bfq_io_cq *bic, struct bio *bio)
  * bfq_flush_idle_tree - deactivate any entity on the idle tree of @st.
  * @st: the service tree being flushed.
  */
-static void bfq_flush_idle_tree(struct bfq_service_tree *st)
+static void bfq_flush_idle_tree(struct bfq_sched_data *sched_data,
+				struct bfq_service_tree *st)
 {
 	struct bfq_entity *entity = st->first_idle;
-
-	for (; entity ; entity = st->first_idle)
+	int count = 0;
+
+	for (; entity ; entity = st->first_idle) {
+		if (entity->sched_data != sched_data) {
+			printk(KERN_ERR "entity %d sched_data %p (parent %p) "
+				"my_sched_data %p on_st %d not matching "
+				"service tree!\n", count,
+				entity->sched_data, entity->parent,
+				entity->my_sched_data, entity->on_st_or_in_serv);
+			BUG_ON(1);
+		}
 		__bfq_deactivate_entity(entity, false);
+		count++;
+	}
 }
 
 /**
@@ -939,7 +951,7 @@ static void bfq_pd_offline(struct blkg_policy_data *pd)
 		 * bfq_reparent_active_queues(), the queue to move is
 		 * empty and gets expired.
 		 */
-		bfq_flush_idle_tree(st);
+		bfq_flush_idle_tree(&bfqg->sched_data, st);
 	}
 
 	__bfq_deactivate_entity(entity, false);
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index b74cc0da118e..b9f4ecb99043 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -610,6 +610,7 @@ static void bfq_idle_insert(struct bfq_service_tree *st,
 	struct bfq_entity *first_idle = st->first_idle;
 	struct bfq_entity *last_idle = st->last_idle;
 
+	BUG_ON(bfq_entity_service_tree(entity) != st);
 	if (!first_idle || bfq_gt(first_idle->finish, entity->finish))
 		st->first_idle = entity;
 	if (!last_idle || bfq_gt(entity->finish, last_idle->finish))
-- 
2.31.1



* Re: [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: yukuai (C) @ 2022-01-10  1:49 UTC
  To: Jan Kara; +Cc: linux-block, Paolo Valente, Jens Axboe

On 2022/01/07 22:58, Jan Kara wrote:
> [...]
> So this is indeed strange. The group has, in some of its idle service
> trees, an entity whose entity->sched_data points to an already freed
> cgroup. [...] Can you please run your reproducer with my patches + the
> attached debug patch on top? Hopefully it should tell us more about the
> problematic entity and how it got there... Thanks!

Hi,

I'm not sure I understand what you mean... I reproduced again with your
debug patch applied; however, no extra messages are printed.

I think this is exactly the same problem we discussed before:

1) bfqq->new_bfqq is set; they are under g1
2) bfqq is moved to another group, and the user thread of new_bfqq exits
3) g1 is offlined
4) io issued from bfqq will end up in new_bfqq, and the offlined g1 will
be inserted into the st of g1's parent (see the sketch below)

Currently such an entity should be in the active tree; however, I think
it's possible that it can end up in the idle tree through deactivation
of the entity?

5) io is done, new_bfqq is deactivated
6) new_bfqq is freed, and then g1 is freed
7) the problem is triggered when g1's parent is offlined.
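
For step 4, I believe this is the path taken (a simplified sketch of my
reading of the code, not a patch):

	/* bfq_setup_cooperator() short-circuits to the scheduled merge: */
	if (bfqq->new_bfqq)
		return bfqq->new_bfqq;	/* still parented under the offlined g1 */

	/*
	 * The IO is then queued to new_bfqq, and activating new_bfqq walks
	 * entity->parent upwards, re-inserting g1's entity into the service
	 * tree of g1's parent even though g1 is already offline.
	 */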

Thanks,
Kuai


* Re: [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: yukuai (C) @ 2022-01-11  8:28 UTC
  To: Jan Kara; +Cc: linux-block, Paolo Valente, Jens Axboe

On 2022/01/10 9:49, yukuai (C) wrote:
> On 2022/01/07 22:58, Jan Kara wrote:
> > [...]
> 
> I think this is exactly the same problem we discussed before:
> 
> 1) bfqq->new_bfqq is set; they are under g1
> 2) bfqq is moved to another group, and the user thread of new_bfqq exits
> 3) g1 is offlined
> 4) io issued from bfqq will end up in new_bfqq, and the offlined g1 will
> be inserted into the st of g1's parent
> [...]
> 5) io is done, new_bfqq is deactivated
> 6) new_bfqq is freed, and then g1 is freed
> 7) the problem is triggered when g1's parent is offlined.

Hi,

I applied the following modification and the problem can't be
reproduced anymore.

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 24a5c5329bcd..9b29af66635f 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -730,8 +730,19 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,

        if (sync_bfqq) {
                entity = &sync_bfqq->entity;
-               if (entity->sched_data != &bfqg->sched_data)
+               if (entity->sched_data != &bfqg->sched_data) {
+                       if (sync_bfqq->new_bfqq) {
+                               bfq_put_queue(sync_bfqq->new_bfqq);
+                               sync_bfqq->new_bfqq = NULL;
+                       }
+
+                       if (bfq_bfqq_coop(sync_bfqq)) {
+                               bic->stably_merged = false;
+                               bfq_mark_bfqq_split_coop(sync_bfqq);
+                       }
+
                        bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
+               }

I'm not sure if we can set sync_bfqq->new_bfqq = NULL here; however,
this can make sure the problem here really is what we discussed
before...


* Re: [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: Jan Kara @ 2022-01-11 12:36 UTC
  To: yukuai (C); +Cc: Jan Kara, linux-block, Paolo Valente, Jens Axboe

On Tue 11-01-22 16:28:35, yukuai (C) wrote:
> [...]
> I applied the following modification and the problem can't be
> reproduced anymore.
> [...]
>         if (sync_bfqq) {
>                 entity = &sync_bfqq->entity;
> -               if (entity->sched_data != &bfqg->sched_data)
> +               if (entity->sched_data != &bfqg->sched_data) {
> +                       if (sync_bfqq->new_bfqq) {
> +                               bfq_put_queue(sync_bfqq->new_bfqq);
> +                               sync_bfqq->new_bfqq = NULL;
> +                       }
> +
> +                       if (bfq_bfqq_coop(sync_bfqq)) {
> +                               bic->stably_merged = false;
> +                               bfq_mark_bfqq_split_coop(sync_bfqq);
> +                       }
> +
>                         bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
> +               }
> 
> I'm not sure if we can set sync_bfqq->new_bfqq = NULL here; however,
> this can make sure the problem here really is what we discussed
> before...

OK, if this fixed the problem for you then I'm wondering why the

  WARN_ON_ONCE(sync_bfqq->new_bfqq);

in my patch "bfq: Split shared queues on move between cgroups" didn't
trigger. Now I see how sync_bfqq->new_bfqq can indeed be != NULL here, so
I agree with the change in principle (it has to use bfq_put_cooperator()
though), but I'm wondering why the WARN_ON didn't trigger. Or did you
just not notice?
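
I.e., something like the following (an untested sketch; it assumes the
bfq_put_cooperator() from patch 3/5, which also clears bfqq->new_bfqq):

	if (entity->sched_data != &bfqg->sched_data) {
		/*
		 * Drop the reference on a scheduled merge partner and
		 * clear sync_bfqq->new_bfqq (both done by
		 * bfq_put_cooperator() as modified by patch 3/5).
		 */
		bfq_put_cooperator(sync_bfqq);
		/* Moving a shared bfqq? Split it at the nearest occasion. */
		if (bfq_bfqq_coop(sync_bfqq)) {
			bic->stably_merged = false;
			bfq_mark_bfqq_split_coop(sync_bfqq);
		}
		bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
	}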

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: yukuai (C) @ 2022-01-11 13:13 UTC
  To: Jan Kara; +Cc: linux-block, Paolo Valente, Jens Axboe

On 2022/01/11 20:36, Jan Kara wrote:
> [...]
> OK, if this fixed the problem for you then I'm wondering why the
> 
>    WARN_ON_ONCE(sync_bfqq->new_bfqq);
> 
> in my patch "bfq: Split shared queues on move between cgroups" didn't
> trigger. [...] I'm wondering why the WARN_ON didn't trigger. Or did you
> just not notice?
Hi,

When the reproducer starts running, the WARN_ON() is triggered
immediately; sorry that the missing log led to a misunderstanding.

Thanks,
Kuai


* Re: [PATCH 0/5 v2] bfq: Avoid use-after-free when moving processes between cgroups
From: Jan Kara @ 2022-01-11 14:12 UTC
  To: yukuai (C); +Cc: Jan Kara, linux-block, Paolo Valente, Jens Axboe

On Mon 10-01-22 09:49:34, yukuai (C) wrote:
> [...]
> I think this is exactly the same problem we discussed before:
> 
> 1) bfqq->new_bfqq is set; they are under g1
> 2) bfqq is moved to another group, and the user thread of new_bfqq exits
> 3) g1 is offlined
> 4) io issued from bfqq will end up in new_bfqq, and the offlined g1 will
> be inserted into the st of g1's parent

Hmm, you are right. I was confused by the fact that bfq_setup_merge() is
always immediately (under the big bfq lock) followed by bfq_merge_bfqqs(),
but that redirects the BIC of just one process to the new bfqq. So here is
a more detailed, graphical version of your scenario for future
reference :):

Initial state, bfqq2 is shared by Process 2 & Process 3:

Process 1 (blkcg1)	Process 2 (blkcg1)	Process 3 (blkcg1)
 (BIC)			 (BIC)			 (BIC)
   |			   |			   |
   |			   |			  /
   v			   v			 /
 bfqq1			bfqq2<-------------------
   \			  /
    \			 /
     \			/
      ----> BFQ group1<-

Now while processing request for Process 2 we decide to merge bfqq2 to
bfqq1. Situation after the merge:

Process 1 (blkcg1)	Process 2 (blkcg1)	Process 3 (blkcg1)
 (BIC)			 (BIC)			 (BIC)
   |			   /			   |
   |/---------------------/   			  /
   vv		new_bfqq   			 /
 bfqq1<-----------------bfqq2<-------------------
   \			  /
    \			 /
     \			/
      ----> BFQ group1<-

Processes 1 and 2 exit:
					Process 3 (blkcg1)
					 (BIC)
					   |
					  /
		new_bfqq		 /
 bfqq1<-----------------bfqq2<-----------
   \			  /
    \			 /
     \			/
      ----> BFQ group1<-

Process 3 is moved to blkcg2 and submits IO, blkcg1 is offlined.
bfq_bic_update_cgroup() will change the picture to:

					Process 3 (blkcg2)
					 (BIC)
					   |
					  /
		new_bfqq		 /
 bfqq1<-----------------bfqq2<-----------
   |			  |
   |			  |
   v 			  v
BFQ group1		BFQ group2

and the bfq_merge_bfqqs() call that follows when submitting the request
will further modify the picture to:
					Process 3 (blkcg2)
					 (BIC)
					   |
     /-------------------------------------/
     v		new_bfqq
 bfqq1<-----------------bfqq2
   |			  |
   |			  |
   v 			  v
BFQ group1		BFQ group2

and boom, we queue again offlined BFQ group1.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

