* bioset_exit poison from dm_destroy
@ 2022-05-29  0:17 ` Matthew Wilcox
  0 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox @ 2022-05-29  0:17 UTC (permalink / raw)
  To: dm-devel, linux-block

Not quite sure whose bug this is.  Current Linus head running xfstests
against ext4 (probably not ext4's fault?)

01818 generic/250	run fstests generic/250 at 2022-05-28 23:48:09
01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
01818 EXT4-fs (dm-0): unmounting filesystem.
01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
01818 EXT4-fs (dm-0): unmounting filesystem.
01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
01818 Buffer I/O error on dev dm-0, logical block 3670000, async page read
01818 EXT4-fs (dm-0): unmounting filesystem.
01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
01818 EXT4-fs (dm-0): unmounting filesystem.
01818 general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
01818 CPU: 0 PID: 1579117 Comm: dmsetup Kdump: loaded Not tainted 5.18.0-11049-g1dec3d7fd0c3-dirty #262
01818 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
01818 RIP: 0010:__cpuhp_state_remove_instance+0xf0/0x1b0
01818 Code: a0 f8 d7 81 42 3b 1c 28 7f d9 4c 89 e1 31 d2 89 de 89 7d dc e8 01 fd ff ff 8b 7d dc eb c5 49 8b 04 24 49 8b 54 24 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 c7
01818 RSP: 0018:ffff888101fcfc60 EFLAGS: 00010286
01818 RAX: dead000000000100 RBX: 0000000000000017 RCX: 0000000000000000
01818 RDX: dead000000000122 RSI: ffff8881233b0ae8 RDI: ffffffff81e3b080
01818 RBP: ffff888101fcfc88 R08: 0000000000000008 R09: 0000000000000003
01818 R10: 0000000000000000 R11: 00000000002dc6c0 R12: ffff8881233b0ae8
01818 R13: 0000000000000000 R14: ffffffff81e38f58 R15: ffff88817b5a3c00
01818 FS:  00007ff56daec280(0000) GS:ffff888275800000(0000) knlGS:0000000000000000
01818 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
01818 CR2: 00005591ad94f198 CR3: 000000017b5a0004 CR4: 0000000000770eb0
01818 PKRU: 55555554
01818 Call Trace:
01818  <TASK>
01818  ? kfree+0x66/0x250
01818  bioset_exit+0x32/0x210
01818  cleanup_mapped_device+0x34/0xf0
01818  __dm_destroy+0x149/0x1f0
01818  ? table_clear+0xc0/0xc0
01818  dm_destroy+0xe/0x10
01818  dev_remove+0xd9/0x120
01818  ctl_ioctl+0x1cb/0x420
01818  dm_ctl_ioctl+0x9/0x10
01818  __x64_sys_ioctl+0x89/0xb0
01818  do_syscall_64+0x35/0x80
01818  entry_SYSCALL_64_after_hwframe+0x46/0xb0
01818 RIP: 0033:0x7ff56de3b397
01818 Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
01818 RSP: 002b:00007ffe55367ef8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
01818 RAX: ffffffffffffffda RBX: 00007ff56df31a8e RCX: 00007ff56de3b397
01818 RDX: 000055daad7cab30 RSI: 00000000c138fd04 RDI: 0000000000000003
01818 RBP: 00007ffe55367fb0 R08: 00007ff56df81558 R09: 00007ffe55367d60
01818 R10: 00007ff56df808a2 R11: 0000000000000206 R12: 00007ff56df808a2
01818 R13: 00007ff56df808a2 R14: 00007ff56df808a2 R15: 00007ff56df808a2
01818  </TASK>
01818 Modules linked in: crct10dif_generic crct10dif_common [last unloaded: crc_t10dif]


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: bioset_exit poison from dm_destroy
  2022-05-29  0:17 ` [dm-devel] " Matthew Wilcox
@ 2022-05-29 12:46   ` Jens Axboe
  -1 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2022-05-29 12:46 UTC (permalink / raw)
  To: Matthew Wilcox, dm-devel, linux-block

On 5/28/22 6:17 PM, Matthew Wilcox wrote:
> Not quite sure whose bug this is.  Current Linus head running xfstests
> against ext4 (probably not ext4's fault?)
> 
> 01818 generic/250	run fstests generic/250 at 2022-05-28 23:48:09
> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> 01818 EXT4-fs (dm-0): unmounting filesystem.
> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> 01818 EXT4-fs (dm-0): unmounting filesystem.
> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> 01818 Buffer I/O error on dev dm-0, logical block 3670000, async page read
> 01818 EXT4-fs (dm-0): unmounting filesystem.
> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> 01818 EXT4-fs (dm-0): unmounting filesystem.
> 01818 general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
> 01818 CPU: 0 PID: 1579117 Comm: dmsetup Kdump: loaded Not tainted 5.18.0-11049-g1dec3d7fd0c3-dirty #262
> 01818 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> 01818 RIP: 0010:__cpuhp_state_remove_instance+0xf0/0x1b0
> 01818 Code: a0 f8 d7 81 42 3b 1c 28 7f d9 4c 89 e1 31 d2 89 de 89 7d dc e8 01 fd ff ff 8b 7d dc eb c5 49 8b 04 24 49 8b 54 24 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 c7
> 01818 RSP: 0018:ffff888101fcfc60 EFLAGS: 00010286
> 01818 RAX: dead000000000100 RBX: 0000000000000017 RCX: 0000000000000000
> 01818 RDX: dead000000000122 RSI: ffff8881233b0ae8 RDI: ffffffff81e3b080
> 01818 RBP: ffff888101fcfc88 R08: 0000000000000008 R09: 0000000000000003
> 01818 R10: 0000000000000000 R11: 00000000002dc6c0 R12: ffff8881233b0ae8
> 01818 R13: 0000000000000000 R14: ffffffff81e38f58 R15: ffff88817b5a3c00
> 01818 FS:  00007ff56daec280(0000) GS:ffff888275800000(0000) knlGS:0000000000000000
> 01818 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 01818 CR2: 00005591ad94f198 CR3: 000000017b5a0004 CR4: 0000000000770eb0
> 01818 PKRU: 55555554
> 01818 Call Trace:
> 01818  <TASK>
> 01818  ? kfree+0x66/0x250
> 01818  bioset_exit+0x32/0x210
> 01818  cleanup_mapped_device+0x34/0xf0
> 01818  __dm_destroy+0x149/0x1f0
> 01818  ? table_clear+0xc0/0xc0
> 01818  dm_destroy+0xe/0x10
> 01818  dev_remove+0xd9/0x120
> 01818  ctl_ioctl+0x1cb/0x420
> 01818  dm_ctl_ioctl+0x9/0x10
> 01818  __x64_sys_ioctl+0x89/0xb0
> 01818  do_syscall_64+0x35/0x80
> 01818  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> 01818 RIP: 0033:0x7ff56de3b397
> 01818 Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
> 01818 RSP: 002b:00007ffe55367ef8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> 01818 RAX: ffffffffffffffda RBX: 00007ff56df31a8e RCX: 00007ff56de3b397
> 01818 RDX: 000055daad7cab30 RSI: 00000000c138fd04 RDI: 0000000000000003
> 01818 RBP: 00007ffe55367fb0 R08: 00007ff56df81558 R09: 00007ffe55367d60
> 01818 R10: 00007ff56df808a2 R11: 0000000000000206 R12: 00007ff56df808a2
> 01818 R13: 00007ff56df808a2 R14: 00007ff56df808a2 R15: 00007ff56df808a2
> 01818  </TASK>
> 01818 Modules linked in: crct10dif_generic crct10dif_common [last unloaded: crc_t10dif]

I suspect dm is calling bioset_exit() multiple times? Which it probably
should not.

The rest of bioset_exit() is resilient against this, so might be best
to include bio_alloc_cache_destroy() in that.


diff --git a/block/bio.c b/block/bio.c
index a3893d80dccc..be3937b84e68 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -722,6 +722,7 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
 		bio_alloc_cache_prune(cache, -1U);
 	}
 	free_percpu(bs->cache);
+	bs->cache = NULL;
 }
 
 /**

-- 
Jens Axboe




* Re: bioset_exit poison from dm_destroy
  2022-05-29 12:46   ` [dm-devel] " Jens Axboe
@ 2022-05-31 18:58     ` Mike Snitzer
  -1 siblings, 0 replies; 28+ messages in thread
From: Mike Snitzer @ 2022-05-31 18:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Matthew Wilcox, dm-devel, linux-block, david

On Sun, May 29 2022 at  8:46P -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 5/28/22 6:17 PM, Matthew Wilcox wrote:
> > Not quite sure whose bug this is.  Current Linus head running xfstests
> > against ext4 (probably not ext4's fault?)
> > 
> > 01818 generic/250	run fstests generic/250 at 2022-05-28 23:48:09
> > 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> > 01818 EXT4-fs (dm-0): unmounting filesystem.
> > 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> > 01818 EXT4-fs (dm-0): unmounting filesystem.
> > 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> > 01818 Buffer I/O error on dev dm-0, logical block 3670000, async page read
> > 01818 EXT4-fs (dm-0): unmounting filesystem.
> > 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> > 01818 EXT4-fs (dm-0): unmounting filesystem.
> > 01818 general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
> > 01818 CPU: 0 PID: 1579117 Comm: dmsetup Kdump: loaded Not tainted 5.18.0-11049-g1dec3d7fd0c3-dirty #262
> > 01818 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > 01818 RIP: 0010:__cpuhp_state_remove_instance+0xf0/0x1b0
> > 01818 Code: a0 f8 d7 81 42 3b 1c 28 7f d9 4c 89 e1 31 d2 89 de 89 7d dc e8 01 fd ff ff 8b 7d dc eb c5 49 8b 04 24 49 8b 54 24 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 c7
> > 01818 RSP: 0018:ffff888101fcfc60 EFLAGS: 00010286
> > 01818 RAX: dead000000000100 RBX: 0000000000000017 RCX: 0000000000000000
> > 01818 RDX: dead000000000122 RSI: ffff8881233b0ae8 RDI: ffffffff81e3b080
> > 01818 RBP: ffff888101fcfc88 R08: 0000000000000008 R09: 0000000000000003
> > 01818 R10: 0000000000000000 R11: 00000000002dc6c0 R12: ffff8881233b0ae8
> > 01818 R13: 0000000000000000 R14: ffffffff81e38f58 R15: ffff88817b5a3c00
> > 01818 FS:  00007ff56daec280(0000) GS:ffff888275800000(0000) knlGS:0000000000000000
> > 01818 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > 01818 CR2: 00005591ad94f198 CR3: 000000017b5a0004 CR4: 0000000000770eb0
> > 01818 PKRU: 55555554
> > 01818 Call Trace:
> > 01818  <TASK>
> > 01818  ? kfree+0x66/0x250
> > 01818  bioset_exit+0x32/0x210
> > 01818  cleanup_mapped_device+0x34/0xf0
> > 01818  __dm_destroy+0x149/0x1f0
> > 01818  ? table_clear+0xc0/0xc0
> > 01818  dm_destroy+0xe/0x10
> > 01818  dev_remove+0xd9/0x120
> > 01818  ctl_ioctl+0x1cb/0x420
> > 01818  dm_ctl_ioctl+0x9/0x10
> > 01818  __x64_sys_ioctl+0x89/0xb0
> > 01818  do_syscall_64+0x35/0x80
> > 01818  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> > 01818 RIP: 0033:0x7ff56de3b397
> > 01818 Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
> > 01818 RSP: 002b:00007ffe55367ef8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> > 01818 RAX: ffffffffffffffda RBX: 00007ff56df31a8e RCX: 00007ff56de3b397
> > 01818 RDX: 000055daad7cab30 RSI: 00000000c138fd04 RDI: 0000000000000003
> > 01818 RBP: 00007ffe55367fb0 R08: 00007ff56df81558 R09: 00007ffe55367d60
> > 01818 R10: 00007ff56df808a2 R11: 0000000000000206 R12: 00007ff56df808a2
> > 01818 R13: 00007ff56df808a2 R14: 00007ff56df808a2 R15: 00007ff56df808a2
> > 01818  </TASK>
> > 01818 Modules linked in: crct10dif_generic crct10dif_common [last unloaded: crc_t10dif]
> 
> I suspect dm is calling bioset_exit() multiple times? Which it probably
> should not.
> 
> The rest of bioset_exit() is resilient against this, so might be best
> to include bio_alloc_cache_destroy() in that.
> 
> 
> diff --git a/block/bio.c b/block/bio.c
> index a3893d80dccc..be3937b84e68 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -722,6 +722,7 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
>  		bio_alloc_cache_prune(cache, -1U);
>  	}
>  	free_percpu(bs->cache);
> +	bs->cache = NULL;
>  }
>  
>  /**

Yes, we need the above to fix the crash.  Does it also make sense to
add this?

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 49eff01fb829..f410c78e9c0c 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -681,7 +681,7 @@ struct bio_set {

 static inline bool bioset_initialized(struct bio_set *bs)
 {
-	return bs->bio_slab != NULL;
+	return (bs->bio_slab != NULL || bs->cache != NULL);
 }

 #if defined(CONFIG_BLK_DEV_INTEGRITY)

FYI, DM's unique use of bioset_init_from_src() is the primary reason
why this stale pointer is biting us.

I dug into this issue further and have queued this patch:

From: Mike Snitzer <snitzer@kernel.org>
Date: Tue, 31 May 2022 12:16:49 -0400
Subject: [PATCH] dm table: fix dm_table_supports_poll to return false if no data devices

It was reported that the "generic/250" test in xfstests (which uses
the dm-error target) demonstrates a regression where the kernel
crashes in bioset_exit().

Since commit cfc97abcbe0b ("dm: conditionally enable
BIOSET_PERCPU_CACHE for dm_io bioset") the bioset_init() for the dm_io
bioset will set up the bioset's per-cpu alloc cache if all devices have
QUEUE_FLAG_POLL set.

But there was a bug where a target that doesn't have any data devices
(and that doesn't even set the .iterate_devices dm target callback)
will incorrectly return true from dm_table_supports_poll().

Fix this by updating dm_table_supports_poll() to follow dm-table.c's
well-worn pattern for testing that _all_ targets in a DM table do in
fact have underlying devices that set QUEUE_FLAG_POLL.

NOTE: An additional block fix is still needed so that
bio_alloc_cache_destroy() clears the bioset's ->cache member.
Otherwise, a DM device's table reload that transitions the DM device's
bioset from using a per-cpu alloc cache to _not_ using one will result
in bioset_exit() crashing in bio_alloc_cache_destroy() because dm's
dm_io bioset ("io_bs") was left with a stale ->cache member.

Fixes: cfc97abcbe0b ("dm: conditionally enable BIOSET_PERCPU_CACHE for dm_io bioset")
Reported-by: Matthew Wilcox <willy@infradead.org>
Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm-table.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index a37c7b763643..0e833a154b31 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1005,7 +1005,7 @@ bool dm_table_request_based(struct dm_table *t)
 	return __table_type_request_based(dm_table_get_type(t));
 }
 
-static int dm_table_supports_poll(struct dm_table *t);
+static bool dm_table_supports_poll(struct dm_table *t);
 
 static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *md)
 {
@@ -1027,7 +1027,7 @@ static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *
 			per_io_data_size = max(per_io_data_size, ti->per_io_data_size);
 			min_pool_size = max(min_pool_size, ti->num_flush_bios);
 		}
-		poll_supported = !!dm_table_supports_poll(t);
+		poll_supported = dm_table_supports_poll(t);
 	}
 
 	t->mempools = dm_alloc_md_mempools(md, type, per_io_data_size, min_pool_size,
@@ -1547,9 +1547,20 @@ static int count_device(struct dm_target *ti, struct dm_dev *dev,
 	return 0;
 }
 
-static int dm_table_supports_poll(struct dm_table *t)
+static bool dm_table_supports_poll(struct dm_table *t)
 {
-	return !dm_table_any_dev_attr(t, device_not_poll_capable, NULL);
+	struct dm_target *ti;
+	unsigned i = 0;
+
+	while (i < dm_table_get_num_targets(t)) {
+		ti = dm_table_get_target(t, i++);
+
+		if (!ti->type->iterate_devices ||
+		    ti->type->iterate_devices(ti, device_not_poll_capable, NULL))
+			return false;
+	}
+
+	return true;
 }
 
 /*
-- 
2.15.0




* Re: bioset_exit poison from dm_destroy
  2022-05-31 18:58     ` [dm-devel] " Mike Snitzer
@ 2022-05-31 19:00       ` Jens Axboe
  -1 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2022-05-31 19:00 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Matthew Wilcox, dm-devel, linux-block, david

On 5/31/22 12:58 PM, Mike Snitzer wrote:
> On Sun, May 29 2022 at  8:46P -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On 5/28/22 6:17 PM, Matthew Wilcox wrote:
>>> Not quite sure whose bug this is.  Current Linus head running xfstests
>>> against ext4 (probably not ext4's fault?)
>>>
>>> 01818 generic/250	run fstests generic/250 at 2022-05-28 23:48:09
>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>> 01818 Buffer I/O error on dev dm-0, logical block 3670000, async page read
>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>> 01818 general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
>>> 01818 CPU: 0 PID: 1579117 Comm: dmsetup Kdump: loaded Not tainted 5.18.0-11049-g1dec3d7fd0c3-dirty #262
>>> 01818 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>>> 01818 RIP: 0010:__cpuhp_state_remove_instance+0xf0/0x1b0
>>> 01818 Code: a0 f8 d7 81 42 3b 1c 28 7f d9 4c 89 e1 31 d2 89 de 89 7d dc e8 01 fd ff ff 8b 7d dc eb c5 49 8b 04 24 49 8b 54 24 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 c7
>>> 01818 RSP: 0018:ffff888101fcfc60 EFLAGS: 00010286
>>> 01818 RAX: dead000000000100 RBX: 0000000000000017 RCX: 0000000000000000
>>> 01818 RDX: dead000000000122 RSI: ffff8881233b0ae8 RDI: ffffffff81e3b080
>>> 01818 RBP: ffff888101fcfc88 R08: 0000000000000008 R09: 0000000000000003
>>> 01818 R10: 0000000000000000 R11: 00000000002dc6c0 R12: ffff8881233b0ae8
>>> 01818 R13: 0000000000000000 R14: ffffffff81e38f58 R15: ffff88817b5a3c00
>>> 01818 FS:  00007ff56daec280(0000) GS:ffff888275800000(0000) knlGS:0000000000000000
>>> 01818 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> 01818 CR2: 00005591ad94f198 CR3: 000000017b5a0004 CR4: 0000000000770eb0
>>> 01818 PKRU: 55555554
>>> 01818 Call Trace:
>>> 01818  <TASK>
>>> 01818  ? kfree+0x66/0x250
>>> 01818  bioset_exit+0x32/0x210
>>> 01818  cleanup_mapped_device+0x34/0xf0
>>> 01818  __dm_destroy+0x149/0x1f0
>>> 01818  ? table_clear+0xc0/0xc0
>>> 01818  dm_destroy+0xe/0x10
>>> 01818  dev_remove+0xd9/0x120
>>> 01818  ctl_ioctl+0x1cb/0x420
>>> 01818  dm_ctl_ioctl+0x9/0x10
>>> 01818  __x64_sys_ioctl+0x89/0xb0
>>> 01818  do_syscall_64+0x35/0x80
>>> 01818  entry_SYSCALL_64_after_hwframe+0x46/0xb0
>>> 01818 RIP: 0033:0x7ff56de3b397
>>> 01818 Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
>>> 01818 RSP: 002b:00007ffe55367ef8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
>>> 01818 RAX: ffffffffffffffda RBX: 00007ff56df31a8e RCX: 00007ff56de3b397
>>> 01818 RDX: 000055daad7cab30 RSI: 00000000c138fd04 RDI: 0000000000000003
>>> 01818 RBP: 00007ffe55367fb0 R08: 00007ff56df81558 R09: 00007ffe55367d60
>>> 01818 R10: 00007ff56df808a2 R11: 0000000000000206 R12: 00007ff56df808a2
>>> 01818 R13: 00007ff56df808a2 R14: 00007ff56df808a2 R15: 00007ff56df808a2
>>> 01818  </TASK>
>>> 01818 Modules linked in: crct10dif_generic crct10dif_common [last unloaded: crc_t10dif]
>>
>> I suspect dm is calling bioset_exit() multiple times? Which it probably
>> should not.
>>
>> The rest of bioset_exit() is resilient against this, so it might be best
>> to include bio_alloc_cache_destroy() in that.
>>
>>
>> diff --git a/block/bio.c b/block/bio.c
>> index a3893d80dccc..be3937b84e68 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -722,6 +722,7 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
>>  		bio_alloc_cache_prune(cache, -1U);
>>  	}
>>  	free_percpu(bs->cache);
>> +	bs->cache = NULL;
>>  }
>>  
>>  /**
> 
> Yes, we need the above to fix the crash.  Does it also make sense to
> add this?
> 
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 49eff01fb829..f410c78e9c0c 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -681,7 +681,7 @@ struct bio_set {
> 
>  static inline bool bioset_initialized(struct bio_set *bs)
>  {
> -	return bs->bio_slab != NULL;
> +	return (bs->bio_slab != NULL || bs->cache != NULL);
>  }

Should not be possible to have valid bs->cache without bs->bio_slab?


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread
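Jens's one-liner is the classic free-and-NULL idiom: after the pointer is cleared, a second bioset_exit() sees NULL and free_percpu() (like free()) does nothing on a NULL pointer, so the double call stops walking freed memory. A userspace sketch of the same idiom, with malloc/free standing in for the per-cpu allocator (struct and function names are illustrative):

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative model of a bio_set's per-cpu front-end cache; the real
 * code uses alloc_percpu()/free_percpu(), modeled here with malloc/free. */
struct bioset_like {
	void *cache;
};

static void cache_init(struct bioset_like *bs)
{
	bs->cache = malloc(64);
}

/* The fix: clear the pointer after freeing, so calling destroy twice
 * is harmless -- free(NULL), like free_percpu(NULL), is a no-op. */
static void cache_destroy(struct bioset_like *bs)
{
	free(bs->cache);
	bs->cache = NULL;
}
```

Without the `bs->cache = NULL` line, a second destroy would be a double free; with it, teardown becomes idempotent.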

* Re: bioset_exit poison from dm_destroy
  2022-05-31 19:00       ` [dm-devel] " Jens Axboe
@ 2022-05-31 19:49         ` Mike Snitzer
  -1 siblings, 0 replies; 28+ messages in thread
From: Mike Snitzer @ 2022-05-31 19:49 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Matthew Wilcox, dm-devel, linux-block, david

On Tue, May 31 2022 at  3:00P -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 5/31/22 12:58 PM, Mike Snitzer wrote:
> > On Sun, May 29 2022 at  8:46P -0400,
> > Jens Axboe <axboe@kernel.dk> wrote:
> > 
> >> On 5/28/22 6:17 PM, Matthew Wilcox wrote:
> >>> Not quite sure whose bug this is.  Current Linus head running xfstests
> >>> against ext4 (probably not ext4's fault?)
> >>>
> >>> 01818 generic/250	run fstests generic/250 at 2022-05-28 23:48:09
> >>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> >>> 01818 EXT4-fs (dm-0): unmounting filesystem.
> >>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> >>> 01818 EXT4-fs (dm-0): unmounting filesystem.
> >>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> >>> 01818 Buffer I/O error on dev dm-0, logical block 3670000, async page read
> >>> 01818 EXT4-fs (dm-0): unmounting filesystem.
> >>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
> >>> 01818 EXT4-fs (dm-0): unmounting filesystem.
> >>> 01818 general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
> >>> 01818 CPU: 0 PID: 1579117 Comm: dmsetup Kdump: loaded Not tainted 5.18.0-11049-g1dec3d7fd0c3-dirty #262
> >>> 01818 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> >>> 01818 RIP: 0010:__cpuhp_state_remove_instance+0xf0/0x1b0
> >>> 01818 Code: a0 f8 d7 81 42 3b 1c 28 7f d9 4c 89 e1 31 d2 89 de 89 7d dc e8 01 fd ff ff 8b 7d dc eb c5 49 8b 04 24 49 8b 54 24 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 c7
> >>> 01818 RSP: 0018:ffff888101fcfc60 EFLAGS: 00010286
> >>> 01818 RAX: dead000000000100 RBX: 0000000000000017 RCX: 0000000000000000
> >>> 01818 RDX: dead000000000122 RSI: ffff8881233b0ae8 RDI: ffffffff81e3b080
> >>> 01818 RBP: ffff888101fcfc88 R08: 0000000000000008 R09: 0000000000000003
> >>> 01818 R10: 0000000000000000 R11: 00000000002dc6c0 R12: ffff8881233b0ae8
> >>> 01818 R13: 0000000000000000 R14: ffffffff81e38f58 R15: ffff88817b5a3c00
> >>> 01818 FS:  00007ff56daec280(0000) GS:ffff888275800000(0000) knlGS:0000000000000000
> >>> 01818 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> 01818 CR2: 00005591ad94f198 CR3: 000000017b5a0004 CR4: 0000000000770eb0
> >>> 01818 PKRU: 55555554
> >>> 01818 Call Trace:
> >>> 01818  <TASK>
> >>> 01818  ? kfree+0x66/0x250
> >>> 01818  bioset_exit+0x32/0x210
> >>> 01818  cleanup_mapped_device+0x34/0xf0
> >>> 01818  __dm_destroy+0x149/0x1f0
> >>> 01818  ? table_clear+0xc0/0xc0
> >>> 01818  dm_destroy+0xe/0x10
> >>> 01818  dev_remove+0xd9/0x120
> >>> 01818  ctl_ioctl+0x1cb/0x420
> >>> 01818  dm_ctl_ioctl+0x9/0x10
> >>> 01818  __x64_sys_ioctl+0x89/0xb0
> >>> 01818  do_syscall_64+0x35/0x80
> >>> 01818  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> >>> 01818 RIP: 0033:0x7ff56de3b397
> >>> 01818 Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
> >>> 01818 RSP: 002b:00007ffe55367ef8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> >>> 01818 RAX: ffffffffffffffda RBX: 00007ff56df31a8e RCX: 00007ff56de3b397
> >>> 01818 RDX: 000055daad7cab30 RSI: 00000000c138fd04 RDI: 0000000000000003
> >>> 01818 RBP: 00007ffe55367fb0 R08: 00007ff56df81558 R09: 00007ffe55367d60
> >>> 01818 R10: 00007ff56df808a2 R11: 0000000000000206 R12: 00007ff56df808a2
> >>> 01818 R13: 00007ff56df808a2 R14: 00007ff56df808a2 R15: 00007ff56df808a2
> >>> 01818  </TASK>
> >>> 01818 Modules linked in: crct10dif_generic crct10dif_common [last unloaded: crc_t10dif]
> >>
> >> I suspect dm is calling bioset_exit() multiple times? Which it probably
> >> should not.
> >>
> >> The rest of bioset_exit() is resilient against this, so it might be best
> >> to include bio_alloc_cache_destroy() in that.
> >>
> >>
> >> diff --git a/block/bio.c b/block/bio.c
> >> index a3893d80dccc..be3937b84e68 100644
> >> --- a/block/bio.c
> >> +++ b/block/bio.c
> >> @@ -722,6 +722,7 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
> >>  		bio_alloc_cache_prune(cache, -1U);
> >>  	}
> >>  	free_percpu(bs->cache);
> >> +	bs->cache = NULL;
> >>  }
> >>  
> >>  /**
> > 
> > Yes, we need the above to fix the crash.  Does it also make sense to
> > add this?
> > 
> > diff --git a/include/linux/bio.h b/include/linux/bio.h
> > index 49eff01fb829..f410c78e9c0c 100644
> > --- a/include/linux/bio.h
> > +++ b/include/linux/bio.h
> > @@ -681,7 +681,7 @@ struct bio_set {
> > 
> >  static inline bool bioset_initialized(struct bio_set *bs)
> >  {
> > -	return bs->bio_slab != NULL;
> > +	return (bs->bio_slab != NULL || bs->cache != NULL);
> >  }
> 
> Should not be possible to have valid bs->cache without bs->bio_slab?

Not quite following your question, but I sprinkled this extra check in
purely because DM core uses bioset_initialized() to verify that the
bioset is _not_ yet initialized.

But it really doesn't make sense to consider the bioset _fully_
initialized if bs->bio_slab is NULL but bs->cache is NOT...

Anyway, really don't need this change but we do need your earlier
patch to reset bs->cache to NULL.

Thanks.


^ permalink raw reply	[flat|nested] 28+ messages in thread
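Mike's concern is that the existing check keys only off bs->bio_slab, so a bioset whose slab was freed but whose cache pointer was left dangling would report "not initialized" while still holding state. A sketch contrasting the two predicate variants (the field names mirror struct bio_set, but the code is illustrative only):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Field names mirror struct bio_set, but this is an illustration only. */
struct bioset_like {
	void *bio_slab;
	void *cache;
};

/* Current check: a bioset counts as initialized only via its slab. */
static bool bioset_initialized_old(const struct bioset_like *bs)
{
	return bs->bio_slab != NULL;
}

/* Proposed check: a leftover cache also counts, which would flag a
 * bioset that was only partially torn down. */
static bool bioset_initialized_proposed(const struct bioset_like *bs)
{
	return bs->bio_slab != NULL || bs->cache != NULL;
}
```

As the thread concludes, the stricter predicate becomes unnecessary once bs->cache is reliably reset to NULL on teardown.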

* Re: bioset_exit poison from dm_destroy
  2022-05-31 19:49         ` [dm-devel] " Mike Snitzer
@ 2022-06-01  3:27           ` Jens Axboe
  -1 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2022-06-01  3:27 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Matthew Wilcox, dm-devel, linux-block, david

On 5/31/22 1:49 PM, Mike Snitzer wrote:
> On Tue, May 31 2022 at  3:00P -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On 5/31/22 12:58 PM, Mike Snitzer wrote:
>>> On Sun, May 29 2022 at  8:46P -0400,
>>> Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>> On 5/28/22 6:17 PM, Matthew Wilcox wrote:
>>>>> Not quite sure whose bug this is.  Current Linus head running xfstests
>>>>> against ext4 (probably not ext4's fault?)
>>>>>
>>>>> 01818 generic/250	run fstests generic/250 at 2022-05-28 23:48:09
>>>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> 01818 Buffer I/O error on dev dm-0, logical block 3670000, async page read
>>>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>>>> 01818 general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
>>>>> 01818 CPU: 0 PID: 1579117 Comm: dmsetup Kdump: loaded Not tainted 5.18.0-11049-g1dec3d7fd0c3-dirty #262
>>>>> 01818 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>>>>> 01818 RIP: 0010:__cpuhp_state_remove_instance+0xf0/0x1b0
>>>>> 01818 Code: a0 f8 d7 81 42 3b 1c 28 7f d9 4c 89 e1 31 d2 89 de 89 7d dc e8 01 fd ff ff 8b 7d dc eb c5 49 8b 04 24 49 8b 54 24 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 c7
>>>>> 01818 RSP: 0018:ffff888101fcfc60 EFLAGS: 00010286
>>>>> 01818 RAX: dead000000000100 RBX: 0000000000000017 RCX: 0000000000000000
>>>>> 01818 RDX: dead000000000122 RSI: ffff8881233b0ae8 RDI: ffffffff81e3b080
>>>>> 01818 RBP: ffff888101fcfc88 R08: 0000000000000008 R09: 0000000000000003
>>>>> 01818 R10: 0000000000000000 R11: 00000000002dc6c0 R12: ffff8881233b0ae8
>>>>> 01818 R13: 0000000000000000 R14: ffffffff81e38f58 R15: ffff88817b5a3c00
>>>>> 01818 FS:  00007ff56daec280(0000) GS:ffff888275800000(0000) knlGS:0000000000000000
>>>>> 01818 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> 01818 CR2: 00005591ad94f198 CR3: 000000017b5a0004 CR4: 0000000000770eb0
>>>>> 01818 PKRU: 55555554
>>>>> 01818 Call Trace:
>>>>> 01818  <TASK>
>>>>> 01818  ? kfree+0x66/0x250
>>>>> 01818  bioset_exit+0x32/0x210
>>>>> 01818  cleanup_mapped_device+0x34/0xf0
>>>>> 01818  __dm_destroy+0x149/0x1f0
>>>>> 01818  ? table_clear+0xc0/0xc0
>>>>> 01818  dm_destroy+0xe/0x10
>>>>> 01818  dev_remove+0xd9/0x120
>>>>> 01818  ctl_ioctl+0x1cb/0x420
>>>>> 01818  dm_ctl_ioctl+0x9/0x10
>>>>> 01818  __x64_sys_ioctl+0x89/0xb0
>>>>> 01818  do_syscall_64+0x35/0x80
>>>>> 01818  entry_SYSCALL_64_after_hwframe+0x46/0xb0
>>>>> 01818 RIP: 0033:0x7ff56de3b397
>>>>> 01818 Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
>>>>> 01818 RSP: 002b:00007ffe55367ef8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
>>>>> 01818 RAX: ffffffffffffffda RBX: 00007ff56df31a8e RCX: 00007ff56de3b397
>>>>> 01818 RDX: 000055daad7cab30 RSI: 00000000c138fd04 RDI: 0000000000000003
>>>>> 01818 RBP: 00007ffe55367fb0 R08: 00007ff56df81558 R09: 00007ffe55367d60
>>>>> 01818 R10: 00007ff56df808a2 R11: 0000000000000206 R12: 00007ff56df808a2
>>>>> 01818 R13: 00007ff56df808a2 R14: 00007ff56df808a2 R15: 00007ff56df808a2
>>>>> 01818  </TASK>
>>>>> 01818 Modules linked in: crct10dif_generic crct10dif_common [last unloaded: crc_t10dif]
>>>>
>>>> I suspect dm is calling bioset_exit() multiple times? Which it probably
>>>> should not.
>>>>
>>>> The rest of bioset_exit() is resilient against this, so it might be best
>>>> to include bio_alloc_cache_destroy() in that.
>>>>
>>>>
>>>> diff --git a/block/bio.c b/block/bio.c
>>>> index a3893d80dccc..be3937b84e68 100644
>>>> --- a/block/bio.c
>>>> +++ b/block/bio.c
>>>> @@ -722,6 +722,7 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
>>>>  		bio_alloc_cache_prune(cache, -1U);
>>>>  	}
>>>>  	free_percpu(bs->cache);
>>>> +	bs->cache = NULL;
>>>>  }
>>>>  
>>>>  /**
>>>
>>> Yes, we need the above to fix the crash.  Does it also make sense to
>>> add this?
>>>
>>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>>> index 49eff01fb829..f410c78e9c0c 100644
>>> --- a/include/linux/bio.h
>>> +++ b/include/linux/bio.h
>>> @@ -681,7 +681,7 @@ struct bio_set {
>>>
>>>  static inline bool bioset_initialized(struct bio_set *bs)
>>>  {
>>> -	return bs->bio_slab != NULL;
>>> +	return (bs->bio_slab != NULL || bs->cache != NULL);
>>>  }
>>
>> Should not be possible to have valid bs->cache without bs->bio_slab?
> 
> Not quite following your question, but I sprinkled this extra check in
> purely because DM core uses bioset_initialized() to verify that the
> bioset is _not_ yet initialized.
> 
> But it really doesn't make sense to consider the bioset _fully_
> initialized if bs->bio_slab is NULL but bs->cache is NOT...

Honestly, I'd be way happier getting rid of bioset_initialized()!

> Anyway, really don't need this change but we do need your earlier
> patch to reset bs->cache to NULL.

Certainly, that patch will be going upstream in a day or two.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [dm-devel] bioset_exit poison from dm_destroy
@ 2022-06-01  3:27           ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2022-06-01  3:27 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: linux-block, dm-devel, david, Matthew Wilcox

On 5/31/22 1:49 PM, Mike Snitzer wrote:
> On Tue, May 31 2022 at  3:00P -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On 5/31/22 12:58 PM, Mike Snitzer wrote:
>>> On Sun, May 29 2022 at  8:46P -0400,
>>> Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>> On 5/28/22 6:17 PM, Matthew Wilcox wrote:
>>>>> Not quite sure whose bug this is.  Current Linus head running xfstests
>>>>> against ext4 (probably not ext4's fault?)
>>>>>
>>>>> 01818 generic/250	run fstests generic/250 at 2022-05-28 23:48:09
>>>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> 01818 Buffer I/O error on dev dm-0, logical block 3670000, async page read
>>>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>>>> 01818 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> 01818 EXT4-fs (dm-0): unmounting filesystem.
>>>>> 01818 general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
>>>>> 01818 CPU: 0 PID: 1579117 Comm: dmsetup Kdump: loaded Not tainted 5.18.0-11049-g1dec3d7fd0c3-dirty #262
>>>>> 01818 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>>>>> 01818 RIP: 0010:__cpuhp_state_remove_instance+0xf0/0x1b0
>>>>> 01818 Code: a0 f8 d7 81 42 3b 1c 28 7f d9 4c 89 e1 31 d2 89 de 89 7d dc e8 01 fd ff ff 8b 7d dc eb c5 49 8b 04 24 49 8b 54 24 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 b8 00 01 00 00 00 00 ad de 48 c7 c7
>>>>> 01818 RSP: 0018:ffff888101fcfc60 EFLAGS: 00010286
>>>>> 01818 RAX: dead000000000100 RBX: 0000000000000017 RCX: 0000000000000000
>>>>> 01818 RDX: dead000000000122 RSI: ffff8881233b0ae8 RDI: ffffffff81e3b080
>>>>> 01818 RBP: ffff888101fcfc88 R08: 0000000000000008 R09: 0000000000000003
>>>>> 01818 R10: 0000000000000000 R11: 00000000002dc6c0 R12: ffff8881233b0ae8
>>>>> 01818 R13: 0000000000000000 R14: ffffffff81e38f58 R15: ffff88817b5a3c00
>>>>> 01818 FS:  00007ff56daec280(0000) GS:ffff888275800000(0000) knlGS:0000000000000000
>>>>> 01818 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> 01818 CR2: 00005591ad94f198 CR3: 000000017b5a0004 CR4: 0000000000770eb0
>>>>> 01818 PKRU: 55555554
>>>>> 01818 Call Trace:
>>>>> 01818  <TASK>
>>>>> 01818  ? kfree+0x66/0x250
>>>>> 01818  bioset_exit+0x32/0x210
>>>>> 01818  cleanup_mapped_device+0x34/0xf0
>>>>> 01818  __dm_destroy+0x149/0x1f0
>>>>> 01818  ? table_clear+0xc0/0xc0
>>>>> 01818  dm_destroy+0xe/0x10
>>>>> 01818  dev_remove+0xd9/0x120
>>>>> 01818  ctl_ioctl+0x1cb/0x420
>>>>> 01818  dm_ctl_ioctl+0x9/0x10
>>>>> 01818  __x64_sys_ioctl+0x89/0xb0
>>>>> 01818  do_syscall_64+0x35/0x80
>>>>> 01818  entry_SYSCALL_64_after_hwframe+0x46/0xb0
>>>>> 01818 RIP: 0033:0x7ff56de3b397
>>>>> 01818 Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
>>>>> 01818 RSP: 002b:00007ffe55367ef8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
>>>>> 01818 RAX: ffffffffffffffda RBX: 00007ff56df31a8e RCX: 00007ff56de3b397
>>>>> 01818 RDX: 000055daad7cab30 RSI: 00000000c138fd04 RDI: 0000000000000003
>>>>> 01818 RBP: 00007ffe55367fb0 R08: 00007ff56df81558 R09: 00007ffe55367d60
>>>>> 01818 R10: 00007ff56df808a2 R11: 0000000000000206 R12: 00007ff56df808a2
>>>>> 01818 R13: 00007ff56df808a2 R14: 00007ff56df808a2 R15: 00007ff56df808a2
>>>>> 01818  </TASK>
>>>>> 01818 Modules linked in: crct10dif_generic crct10dif_common [last unloaded: crc_t10dif]
>>>>
>>>> I suspect dm is calling bioset_exit() multiple times? Which it probably
>>>> should not.
>>>>
>>>> The reset of bioset_exit() is resilient against this, so might be best
>>>> to include bio_alloc_cache_destroy() in that.
>>>>
>>>>
>>>> diff --git a/block/bio.c b/block/bio.c
>>>> index a3893d80dccc..be3937b84e68 100644
>>>> --- a/block/bio.c
>>>> +++ b/block/bio.c
>>>> @@ -722,6 +722,7 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
>>>>  		bio_alloc_cache_prune(cache, -1U);
>>>>  	}
>>>>  	free_percpu(bs->cache);
>>>> +	bs->cache = NULL;
>>>>  }
>>>>  
>>>>  /**
>>>
>>> Yes, we need the above to fix the crash.  Does it also make sense to
>>> add this?
>>>
>>> diff --git a/include/linux/bio.h b/include/linux/bio.h
>>> index 49eff01fb829..f410c78e9c0c 100644
>>> --- a/include/linux/bio.h
>>> +++ b/include/linux/bio.h
>>> @@ -681,7 +681,7 @@ struct bio_set {
>>>
>>>  static inline bool bioset_initialized(struct bio_set *bs)
>>>  {
>>> -	return bs->bio_slab != NULL;
>>> +	return (bs->bio_slab != NULL || bs->cache != NULL);
>>>  }
>>
>> It should not be possible to have a valid bs->cache without bs->bio_slab, no?
> 
> Not quite following your question, but I sprinkled this extra check in
> purely because DM core uses bioset_initialized() to verify that the
> bioset is _not_ yet initialized.
> 
> But it really doesn't make sense to consider the bioset _fully_
> initialized if bs->bio_slab is NULL but bs->cache is NOT...

Honestly, I'd be way happier getting rid of bioset_initialized()!

> Anyway, we really don't need this change, but we do need your earlier
> patch to reset bs->cache to NULL.

Certainly, that patch will be going upstream in a day or two.

-- 
Jens Axboe

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: bioset_exit poison from dm_destroy
  2022-05-31 18:58     ` [dm-devel] " Mike Snitzer
@ 2022-06-01  6:04       ` Christoph Hellwig
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2022-06-01  6:04 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Jens Axboe, Matthew Wilcox, dm-devel, linux-block, david

On Tue, May 31, 2022 at 02:58:00PM -0400, Mike Snitzer wrote:
> Yes, we need the above to fix the crash.  Does it also make sense to
> add this?

Can we just stop treating bio_sets so sloppily and make the callers
handle their lifetime properly?  No one should have to use
bioset_initialized (or double free bio_sets).


^ permalink raw reply	[flat|nested] 28+ messages in thread


* Re: bioset_exit poison from dm_destroy
  2022-06-01  6:04       ` [dm-devel] " Christoph Hellwig
@ 2022-06-01  6:08         ` Jens Axboe
  -1 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2022-06-01  6:08 UTC (permalink / raw)
  To: Christoph Hellwig, Mike Snitzer
  Cc: Matthew Wilcox, dm-devel, linux-block, david

On 6/1/22 12:04 AM, Christoph Hellwig wrote:
> On Tue, May 31, 2022 at 02:58:00PM -0400, Mike Snitzer wrote:
>> Yes, we need the above to fix the crash.  Does it also make sense to
>> add this?
> 
> Can we just stop treating bio_sets so sloppily and make the callers
> handle their lifetime properly?  No one should have to use
> bioset_initialized (or double free bio_sets).

Yes, that was my point in the previous email - get rid of
bioset_initialized() and just fixup dm, which is the only user of it.
There really should be no need to have this kind of conditional checking
and initialization.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread


* Re: bioset_exit poison from dm_destroy
  2022-06-01  6:08         ` [dm-devel] " Jens Axboe
@ 2022-06-01  6:18           ` Christoph Hellwig
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2022-06-01  6:18 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Christoph Hellwig, Mike Snitzer, Matthew Wilcox, dm-devel,
	linux-block, david

On Wed, Jun 01, 2022 at 12:08:57AM -0600, Jens Axboe wrote:
> Yes, that was my point in the previous email - get rid of
> bioset_initialized() and just fixup dm, which is the only user of it.
> There really should be no need to have this kind of conditional checking
> and initialization.

dm and md, but yes, we should not really use this check anywhere.

^ permalink raw reply	[flat|nested] 28+ messages in thread


* Re: [dm-devel] bioset_exit poison from dm_destroy
  2022-06-01  6:04       ` [dm-devel] " Christoph Hellwig
@ 2022-06-01 14:13         ` Mike Snitzer
  -1 siblings, 0 replies; 28+ messages in thread
From: Mike Snitzer @ 2022-06-01 14:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, dm-devel, david, Matthew Wilcox

On Wed, Jun 01 2022 at  2:04P -0400,
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, May 31, 2022 at 02:58:00PM -0400, Mike Snitzer wrote:
> > Yes, we need the above to fix the crash.  Does it also make sense to
> > add this?
> 
> Can we just stop treating bio_sets so sloppily and make the callers
> handle their lifetime properly?  No one should have to use
> bioset_initialized (or double free bio_sets).
> 

Please take the time to look at the code and save your judgement until
you do.  That said, I'm not in love with the complexity of how DM
handles bioset initialization.  But both you and Jens keep taking
shots at DM for doing things wrong without actually looking.

DM uses bioset_init_from_src().  Yet you've both assumed double frees
and such (while not entirely wrong, you're glossing over the detail that
there is intervening reinitialization of DM's biosets between the
bioset_exit()s).

And it really can just be that the block code had a bug where it
didn't clear bs->cache.  Doesn't need to be cause for attacks.



^ permalink raw reply	[flat|nested] 28+ messages in thread


* Re: bioset_exit poison from dm_destroy
  2022-06-01 14:13         ` Mike Snitzer
@ 2022-06-02  8:12           ` Christoph Hellwig
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2022-06-02  8:12 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Christoph Hellwig, Jens Axboe, Matthew Wilcox, dm-devel,
	linux-block, david

On Wed, Jun 01, 2022 at 10:13:34AM -0400, Mike Snitzer wrote:
> Please take the time to look at the code and save your judgement until
> you do.  That said, I'm not in love with the complexity of how DM
> handles bioset initialization.  But both you and Jens keep taking
> shots at DM for doing things wrong without actually looking.

I'm not taking shots.  I'm just saying we should kill this API.  In
the worst case the caller can keep track of whether a bioset is
initialized, but in most cases we should be able to deduce it in a
nicer way.

> DM uses bioset_init_from_src().  Yet you've both assumed double frees
> and such (while not entirely wrong your glossing over the detail that
> there is intervening reinitialization of DM's biosets between the
> bioset_exit()s)

And looking at the code, that use of bioset_init_from_src is completely
broken.  It does not actually preallocate anything as intended by
dm (maybe that isn't actually needed) but just uses the biosets in
dm_md_mempools as an awkward way to carry parameters.  And completely
loses bringing over the integrity allocations.  And no, this is not
intended as a "cheap shot" against Jens, who did that, either.

This is what I think should fix this, and will allow us to remove
bioset_init_from_src which was a bad idea from the start:


diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index d21648a923ea9..54c0473a51dde 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -33,6 +33,14 @@ struct dm_kobject_holder {
  * access their members!
  */
 
+/*
+ * For mempools pre-allocation at the table loading time.
+ */
+struct dm_md_mempools {
+	struct bio_set bs;
+	struct bio_set io_bs;
+};
+
 struct mapped_device {
 	struct mutex suspend_lock;
 
@@ -110,8 +118,7 @@ struct mapped_device {
 	/*
 	 * io objects are allocated from here.
 	 */
-	struct bio_set io_bs;
-	struct bio_set bs;
+	struct dm_md_mempools *mempools;
 
 	/* kobject and completion */
 	struct dm_kobject_holder kobj_holder;
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 6087cdcaad46d..a83b98a8d2a99 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -319,7 +319,7 @@ static int setup_clone(struct request *clone, struct request *rq,
 {
 	int r;
 
-	r = blk_rq_prep_clone(clone, rq, &tio->md->bs, gfp_mask,
+	r = blk_rq_prep_clone(clone, rq, &tio->md->mempools->bs, gfp_mask,
 			      dm_rq_bio_constructor, tio);
 	if (r)
 		return r;
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 0e833a154b31d..a8b016d6bf16e 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1044,11 +1044,6 @@ void dm_table_free_md_mempools(struct dm_table *t)
 	t->mempools = NULL;
 }
 
-struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t)
-{
-	return t->mempools;
-}
-
 static int setup_indexes(struct dm_table *t)
 {
 	int i;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index dfb0a551bd880..8b21155d3c4f5 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -136,14 +136,6 @@ static int get_swap_bios(void)
 	return latch;
 }
 
-/*
- * For mempools pre-allocation at the table loading time.
- */
-struct dm_md_mempools {
-	struct bio_set bs;
-	struct bio_set io_bs;
-};
-
 struct table_device {
 	struct list_head list;
 	refcount_t count;
@@ -581,7 +573,7 @@ static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio)
 	struct dm_target_io *tio;
 	struct bio *clone;
 
-	clone = bio_alloc_clone(NULL, bio, GFP_NOIO, &md->io_bs);
+	clone = bio_alloc_clone(NULL, bio, GFP_NOIO, &md->mempools->io_bs);
 	/* Set default bdev, but target must bio_set_dev() before issuing IO */
 	clone->bi_bdev = md->disk->part0;
 
@@ -628,7 +620,8 @@ static struct bio *alloc_tio(struct clone_info *ci, struct dm_target *ti,
 	} else {
 		struct mapped_device *md = ci->io->md;
 
-		clone = bio_alloc_clone(NULL, ci->bio, gfp_mask, &md->bs);
+		clone = bio_alloc_clone(NULL, ci->bio, gfp_mask,
+					&md->mempools->bs);
 		if (!clone)
 			return NULL;
 		/* Set default bdev, but target must bio_set_dev() before issuing IO */
@@ -1876,8 +1869,7 @@ static void cleanup_mapped_device(struct mapped_device *md)
 {
 	if (md->wq)
 		destroy_workqueue(md->wq);
-	bioset_exit(&md->bs);
-	bioset_exit(&md->io_bs);
+	dm_free_md_mempools(md->mempools);
 
 	if (md->dax_dev) {
 		dax_remove_host(md->disk);
@@ -2049,48 +2041,6 @@ static void free_dev(struct mapped_device *md)
 	kvfree(md);
 }
 
-static int __bind_mempools(struct mapped_device *md, struct dm_table *t)
-{
-	struct dm_md_mempools *p = dm_table_get_md_mempools(t);
-	int ret = 0;
-
-	if (dm_table_bio_based(t)) {
-		/*
-		 * The md may already have mempools that need changing.
-		 * If so, reload bioset because front_pad may have changed
-		 * because a different table was loaded.
-		 */
-		bioset_exit(&md->bs);
-		bioset_exit(&md->io_bs);
-
-	} else if (bioset_initialized(&md->bs)) {
-		/*
-		 * There's no need to reload with request-based dm
-		 * because the size of front_pad doesn't change.
-		 * Note for future: If you are to reload bioset,
-		 * prep-ed requests in the queue may refer
-		 * to bio from the old bioset, so you must walk
-		 * through the queue to unprep.
-		 */
-		goto out;
-	}
-
-	BUG_ON(!p ||
-	       bioset_initialized(&md->bs) ||
-	       bioset_initialized(&md->io_bs));
-
-	ret = bioset_init_from_src(&md->bs, &p->bs);
-	if (ret)
-		goto out;
-	ret = bioset_init_from_src(&md->io_bs, &p->io_bs);
-	if (ret)
-		bioset_exit(&md->bs);
-out:
-	/* mempool bind completed, no longer need any mempools in the table */
-	dm_table_free_md_mempools(t);
-	return ret;
-}
-
 /*
  * Bind a table to the device.
  */
@@ -2144,12 +2094,28 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		 * immutable singletons - used to optimize dm_mq_queue_rq.
 		 */
 		md->immutable_target = dm_table_get_immutable_target(t);
-	}
 
-	ret = __bind_mempools(md, t);
-	if (ret) {
-		old_map = ERR_PTR(ret);
-		goto out;
+		/*
+		 * There is no need to reload with request-based dm because the
+		 * size of front_pad doesn't change.
+		 *
+		 * Note for future: If you are to reload bioset, prep-ed
+		 * requests in the queue may refer to bio from the old bioset,
+		 * so you must walk through the queue to unprep.
+		 */
+		if (!md->mempools) {
+			md->mempools = t->mempools;
+			t->mempools = NULL;
+		}
+	} else {
+		/*
+		 * The md may already have mempools that need changing.
+		 * If so, reload bioset because front_pad may have changed
+		 * because a different table was loaded.
+		 */
+		dm_free_md_mempools(md->mempools);
+		md->mempools = t->mempools;
+		t->mempools = NULL;
 	}
 
 	ret = dm_table_set_restrictions(t, md->queue, limits);
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index 3f89664fea010..86642234d4adb 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -72,7 +72,6 @@ struct dm_target *dm_table_get_wildcard_target(struct dm_table *t);
 bool dm_table_bio_based(struct dm_table *t);
 bool dm_table_request_based(struct dm_table *t);
 void dm_table_free_md_mempools(struct dm_table *t);
-struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t);
 
 void dm_lock_md_type(struct mapped_device *md);
 void dm_unlock_md_type(struct mapped_device *md);

^ permalink raw reply related	[flat|nested] 28+ messages in thread


* Re: bioset_exit poison from dm_destroy
  2022-06-02  8:12           ` [dm-devel] " Christoph Hellwig
@ 2022-06-02  8:18             ` Jens Axboe
  -1 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2022-06-02  8:18 UTC (permalink / raw)
  To: Christoph Hellwig, Mike Snitzer
  Cc: Matthew Wilcox, dm-devel, linux-block, david

On 6/2/22 2:12 AM, Christoph Hellwig wrote:
> On Wed, Jun 01, 2022 at 10:13:34AM -0400, Mike Snitzer wrote:
>> Please take the time to look at the code and save your judgement until
>> you do.  That said, I'm not in love with the complexity of how DM
>> handles bioset initialization.  But both you and Jens keep taking
>> shots at DM for doing things wrong without actually looking.
> 
> I'm not taking shots.  I'm just saying we should kill this API.  In
> the worst case the caller can keep track of whether a bioset is
> initialized, but in most cases we should be able to deduce it in a
> nicer way.

Yeah, ditto - it's more an observation that needing something like
this_foo_was_initialized() is a bad way to program to begin with. The
caller should already know this, rather than us needing functions and
state in the struct to track it.

>> DM uses bioset_init_from_src().  Yet you've both assumed double frees
>> and such (while not entirely wrong your glossing over the detail that
>> there is intervening reinitialization of DM's biosets between the
>> bioset_exit()s)
> 
> And looking at the code, that use of bioset_init_from_src is completely
> broken.  It does not actually preallocate anything as intended by
> dm (maybe that isn't actually needed) but just uses the biosets in
> dm_md_mempools as an awkward way to carry parameters.  And completely
> loses bringing over the integrity allocations.  And no, this is not
> intended as a "cheap shot" against Jens who did that either..
> 
> This is what I think should fix this, and will allow us to remove
> bioset_init_from_src which was a bad idea from the start:

Based on a quick look, seems good to me.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread


* Re: bioset_exit poison from dm_destroy
  2022-06-02  8:12           ` [dm-devel] " Christoph Hellwig
@ 2022-06-03 13:09             ` Mike Snitzer
  -1 siblings, 0 replies; 28+ messages in thread
From: Mike Snitzer @ 2022-06-03 13:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Matthew Wilcox, dm-devel, linux-block, david

On Thu, Jun 02 2022 at  4:12P -0400,
Christoph Hellwig <hch@infradead.org> wrote:

> On Wed, Jun 01, 2022 at 10:13:34AM -0400, Mike Snitzer wrote:
> > Please take the time to look at the code and save your judgement until
> > you do.  That said, I'm not in love with the complexity of how DM
> > handles bioset initialization.  But both you and Jens keep taking
> > shots at DM for doing things wrong without actually looking.
> 
> I'm not taking shots.  I'm just saying we should kill this API.  In
> the worst case the caller can keep track of whether a bioset is
> initialized, but in most cases we should be able to deduce it in a
> nicer way.
> 
> > DM uses bioset_init_from_src().  Yet you've both assumed double frees
> > and such (while not entirely wrong your glossing over the detail that
> > there is intervening reinitialization of DM's biosets between the
> > bioset_exit()s)
> 
> And looking at the code, that use of bioset_init_from_src is completely
> broken.  It does not actually preallocated anything as intended by
> dm (maybe that isn't actually needed) but just uses the biosets in
> dm_md_mempools as an awkward way to carry parameters.

I think the point was to keep the biosets embedded in struct
mapped_device to avoid any possible cacheline bouncing due to the
bioset memory being disjoint.

But preserving that micro-optimization isn't something I've ever
quantified (at least not with focus).

> And completely loses bringing over the integrity allocations.

Good catch.
 
> This is what I think should fix this, and will allow us to remove
> bioset_init_from_src which was a bad idea from the start:

Looks fine, did you want to send an official patch with a proper
header and Signed-off-by?

Thanks,
Mike

> diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
> index d21648a923ea9..54c0473a51dde 100644
> --- a/drivers/md/dm-core.h
> +++ b/drivers/md/dm-core.h
> @@ -33,6 +33,14 @@ struct dm_kobject_holder {
>   * access their members!
>   */
>  
> +/*
> + * For mempools pre-allocation at the table loading time.
> + */
> +struct dm_md_mempools {
> +	struct bio_set bs;
> +	struct bio_set io_bs;
> +};
> +
>  struct mapped_device {
>  	struct mutex suspend_lock;
>  
> @@ -110,8 +118,7 @@ struct mapped_device {
>  	/*
>  	 * io objects are allocated from here.
>  	 */
> -	struct bio_set io_bs;
> -	struct bio_set bs;
> +	struct dm_md_mempools *mempools;
>  
>  	/* kobject and completion */
>  	struct dm_kobject_holder kobj_holder;
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index 6087cdcaad46d..a83b98a8d2a99 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -319,7 +319,7 @@ static int setup_clone(struct request *clone, struct request *rq,
>  {
>  	int r;
>  
> -	r = blk_rq_prep_clone(clone, rq, &tio->md->bs, gfp_mask,
> +	r = blk_rq_prep_clone(clone, rq, &tio->md->mempools->bs, gfp_mask,
>  			      dm_rq_bio_constructor, tio);
>  	if (r)
>  		return r;
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 0e833a154b31d..a8b016d6bf16e 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -1044,11 +1044,6 @@ void dm_table_free_md_mempools(struct dm_table *t)
>  	t->mempools = NULL;
>  }
>  
> -struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t)
> -{
> -	return t->mempools;
> -}
> -
>  static int setup_indexes(struct dm_table *t)
>  {
>  	int i;
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index dfb0a551bd880..8b21155d3c4f5 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -136,14 +136,6 @@ static int get_swap_bios(void)
>  	return latch;
>  }
>  
> -/*
> - * For mempools pre-allocation at the table loading time.
> - */
> -struct dm_md_mempools {
> -	struct bio_set bs;
> -	struct bio_set io_bs;
> -};
> -
>  struct table_device {
>  	struct list_head list;
>  	refcount_t count;
> @@ -581,7 +573,7 @@ static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio)
>  	struct dm_target_io *tio;
>  	struct bio *clone;
>  
> -	clone = bio_alloc_clone(NULL, bio, GFP_NOIO, &md->io_bs);
> +	clone = bio_alloc_clone(NULL, bio, GFP_NOIO, &md->mempools->io_bs);
>  	/* Set default bdev, but target must bio_set_dev() before issuing IO */
>  	clone->bi_bdev = md->disk->part0;
>  
> @@ -628,7 +620,8 @@ static struct bio *alloc_tio(struct clone_info *ci, struct dm_target *ti,
>  	} else {
>  		struct mapped_device *md = ci->io->md;
>  
> -		clone = bio_alloc_clone(NULL, ci->bio, gfp_mask, &md->bs);
> +		clone = bio_alloc_clone(NULL, ci->bio, gfp_mask,
> +					&md->mempools->bs);
>  		if (!clone)
>  			return NULL;
>  		/* Set default bdev, but target must bio_set_dev() before issuing IO */
> @@ -1876,8 +1869,7 @@ static void cleanup_mapped_device(struct mapped_device *md)
>  {
>  	if (md->wq)
>  		destroy_workqueue(md->wq);
> -	bioset_exit(&md->bs);
> -	bioset_exit(&md->io_bs);
> +	dm_free_md_mempools(md->mempools);
>  
>  	if (md->dax_dev) {
>  		dax_remove_host(md->disk);
> @@ -2049,48 +2041,6 @@ static void free_dev(struct mapped_device *md)
>  	kvfree(md);
>  }
>  
> -static int __bind_mempools(struct mapped_device *md, struct dm_table *t)
> -{
> -	struct dm_md_mempools *p = dm_table_get_md_mempools(t);
> -	int ret = 0;
> -
> -	if (dm_table_bio_based(t)) {
> -		/*
> -		 * The md may already have mempools that need changing.
> -		 * If so, reload bioset because front_pad may have changed
> -		 * because a different table was loaded.
> -		 */
> -		bioset_exit(&md->bs);
> -		bioset_exit(&md->io_bs);
> -
> -	} else if (bioset_initialized(&md->bs)) {
> -		/*
> -		 * There's no need to reload with request-based dm
> -		 * because the size of front_pad doesn't change.
> -		 * Note for future: If you are to reload bioset,
> -		 * prep-ed requests in the queue may refer
> -		 * to bio from the old bioset, so you must walk
> -		 * through the queue to unprep.
> -		 */
> -		goto out;
> -	}
> -
> -	BUG_ON(!p ||
> -	       bioset_initialized(&md->bs) ||
> -	       bioset_initialized(&md->io_bs));
> -
> -	ret = bioset_init_from_src(&md->bs, &p->bs);
> -	if (ret)
> -		goto out;
> -	ret = bioset_init_from_src(&md->io_bs, &p->io_bs);
> -	if (ret)
> -		bioset_exit(&md->bs);
> -out:
> -	/* mempool bind completed, no longer need any mempools in the table */
> -	dm_table_free_md_mempools(t);
> -	return ret;
> -}
> -
>  /*
>   * Bind a table to the device.
>   */
> @@ -2144,12 +2094,28 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
>  		 * immutable singletons - used to optimize dm_mq_queue_rq.
>  		 */
>  		md->immutable_target = dm_table_get_immutable_target(t);
> -	}
>  
> -	ret = __bind_mempools(md, t);
> -	if (ret) {
> -		old_map = ERR_PTR(ret);
> -		goto out;
> +		/*
> +		 * There is no need to reload with request-based dm because the
> +		 * size of front_pad doesn't change.
> +		 *
> +		 * Note for future: If you are to reload bioset, prep-ed
> +		 * requests in the queue may refer to bio from the old bioset,
> +		 * so you must walk through the queue to unprep.
> +		 */
> +		if (!md->mempools) {
> +			md->mempools = t->mempools;
> +			t->mempools = NULL;
> +		}
> +	} else {
> +		/*
> +		 * The md may already have mempools that need changing.
> +		 * If so, reload bioset because front_pad may have changed
> +		 * because a different table was loaded.
> +		 */
> +		dm_free_md_mempools(md->mempools);
> +		md->mempools = t->mempools;
> +		t->mempools = NULL;
>  	}
>  
>  	ret = dm_table_set_restrictions(t, md->queue, limits);
> diff --git a/drivers/md/dm.h b/drivers/md/dm.h
> index 3f89664fea010..86642234d4adb 100644
> --- a/drivers/md/dm.h
> +++ b/drivers/md/dm.h
> @@ -72,7 +72,6 @@ struct dm_target *dm_table_get_wildcard_target(struct dm_table *t);
>  bool dm_table_bio_based(struct dm_table *t);
>  bool dm_table_request_based(struct dm_table *t);
>  void dm_table_free_md_mempools(struct dm_table *t);
> -struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t);
>  
>  void dm_lock_md_type(struct mapped_device *md);
>  void dm_unlock_md_type(struct mapped_device *md);
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread


* Re: bioset_exit poison from dm_destroy
  2022-06-03 13:09             ` [dm-devel] " Mike Snitzer
@ 2022-06-03 13:19               ` Christoph Hellwig
  -1 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2022-06-03 13:19 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Christoph Hellwig, Jens Axboe, Matthew Wilcox, dm-devel,
	linux-block, david

On Fri, Jun 03, 2022 at 09:09:06AM -0400, Mike Snitzer wrote:
> I think the point was to keep the biosets embedded in struct
> mapped_device to avoid any possible cacheline bouncing due to the
> bioset memory being disjoint.

Probably.

> But preserving that micro-optimization isn't something I've ever
> quantified (at least not with focus).

But we can either avoid the pointer or do the preallocation.  Doing both
doesn't really work.  OTOH given that we've survived quite a while without
the preallocation actually working I wonder if it actually is needed.

> Looks fine, did you want to send an official patch with a proper
> header and Signed-off-by?

I plan to send a series with a few follow-ons after testing it a bit
more.


^ permalink raw reply	[flat|nested] 28+ messages in thread


end of thread, other threads:[~2022-06-03 13:47 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-29  0:17 bioset_exit poison from dm_destroy Matthew Wilcox
2022-05-29  0:17 ` [dm-devel] " Matthew Wilcox
2022-05-29 12:46 ` Jens Axboe
2022-05-29 12:46   ` [dm-devel] " Jens Axboe
2022-05-31 18:58   ` Mike Snitzer
2022-05-31 18:58     ` [dm-devel] " Mike Snitzer
2022-05-31 19:00     ` Jens Axboe
2022-05-31 19:00       ` [dm-devel] " Jens Axboe
2022-05-31 19:49       ` Mike Snitzer
2022-05-31 19:49         ` [dm-devel] " Mike Snitzer
2022-06-01  3:27         ` Jens Axboe
2022-06-01  3:27           ` [dm-devel] " Jens Axboe
2022-06-01  6:04     ` Christoph Hellwig
2022-06-01  6:04       ` [dm-devel] " Christoph Hellwig
2022-06-01  6:08       ` Jens Axboe
2022-06-01  6:08         ` [dm-devel] " Jens Axboe
2022-06-01  6:18         ` Christoph Hellwig
2022-06-01  6:18           ` [dm-devel] " Christoph Hellwig
2022-06-01 14:13       ` Mike Snitzer
2022-06-01 14:13         ` Mike Snitzer
2022-06-02  8:12         ` Christoph Hellwig
2022-06-02  8:12           ` [dm-devel] " Christoph Hellwig
2022-06-02  8:18           ` Jens Axboe
2022-06-02  8:18             ` [dm-devel] " Jens Axboe
2022-06-03 13:09           ` Mike Snitzer
2022-06-03 13:09             ` [dm-devel] " Mike Snitzer
2022-06-03 13:19             ` Christoph Hellwig
2022-06-03 13:19               ` [dm-devel] " Christoph Hellwig
