linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kernfs: fix a crash when two processes delete the same directory
@ 2022-09-26 11:04 Mikulas Patocka
  2022-09-26 12:16 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 7+ messages in thread
From: Mikulas Patocka @ 2022-09-26 11:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Christoph Lameter,
	Andrew Morton, Milan Broz
  Cc: dm-devel, linux-kernel

There is a crash when running the cryptsetup testsuite on Fedora Rawhide.
It can be reproduced by installing Rawhide with the 6.0-rc6 kernel,
downloading and compiling the cryptsetup repository and running this test
in a loop for about 15 minuts:
	while ./integrity-compat-test; do :; done

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 50087 at fs/kernfs/dir.c:504 __kernfs_remove.part.0+0x26f/0x2b0
 Modules linked in: crc32_generic loop dm_integrity async_xor async_tx tls isofs uinput snd_seq_dummy snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc snd_hda_codec_generic ledtrig_audio snd_hda_intel iTCO_wdt snd_intel_dspcfg intel_pmc_bxt snd_intel_sdw_acpi iTCO_vendor_support snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device joydev snd_pcm i2c_i801 snd_timer pcspkr i2c_smbus virtio_balloon snd lpc_ich soundcore zram virtio_net net_failover virtio_blk serio_raw failover qxl virtio_console drm_ttm_helper ttm ip6_tables ip_tables fuse qemu_fw_cfg
 Unloaded tainted modules: crc32_pclmul():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
 CPU: 0 PID: 50087 Comm: integritysetup Not tainted 6.0.0-0.rc6.41.fc38.x86_64 #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
 RIP: 0010:__kernfs_remove.part.0+0x26f/0x2b0
 Code: 0f 85 24 ff ff ff 4d 85 e4 0f 84 31 ff ff ff 41 0f b7 44 24 70 4c 89 e3 83 e0 0f 66 83 f8 01 0f 84 2c fe ff ff e9 50 fe ff ff <0f> 0b e9 53 fe ff ff 0f 0b e9 6b fe ff ff 48 8b 57 10 48 c7 c6 e8
 RSP: 0018:ffffa2e001da7a78 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff8e509d111380 RCX: 0000000080200015
 RDX: 0000000000000000 RSI: fffff015c5744440 RDI: ffff8e509d1113c8
 RBP: ffffa2e001da7ac0 R08: 0000000000000000 R09: 0000000080200015
 R10: 0000000000000009 R11: ffff8e514b185488 R12: ffff8e509d111380
 R13: ffff8e5114324f98 R14: ffff8e50401e92a0 R15: 0000000000000000
 FS:  00007fbbbdc13880(0000) GS:ffff8e53afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055d1cdc94248 CR3: 0000000219990000 CR4: 00000000000006f0
 Call Trace:
  <TASK>
  ? _raw_spin_unlock_irqrestore+0x23/0x40
  ? kernfs_name_hash+0x12/0x80
  kernfs_remove_by_name_ns+0x60/0xa0
  sysfs_slab_add+0x166/0x200
  __kmem_cache_create+0x3c9/0x4b0
  kmem_cache_create_usercopy+0x202/0x340
  kmem_cache_create+0x12/0x20
  bioset_init+0x1fe/0x280
  dm_table_complete+0x3cd/0x6f0
  table_load+0x140/0x2c0
  ? dev_suspend+0x2e0/0x2e0
  ctl_ioctl+0x1f2/0x450
  dm_ctl_ioctl+0xa/0x20
  __x64_sys_ioctl+0x90/0xd0
  do_syscall_64+0x5b/0x80
  ? fpregs_restore_userregs+0x12/0xe0
  ? exit_to_user_mode_prepare+0x18f/0x1f0
  ? syscall_exit_to_user_mode+0x17/0x40
  ? do_syscall_64+0x67/0x80
  ? do_syscall_64+0x67/0x80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0033:0x7fbbbdf27daf
 Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
 RSP: 002b:00007ffc3c305540 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 0000000001f897e0 RCX: 00007fbbbdf27daf
 RDX: 0000000001f8e000 RSI: 00000000c138fd09 RDI: 0000000000000006
 RBP: 0000000000000003 R08: 0000000001f8b150 R09: 0000000000000073
 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fbbbe57c6b6
 R13: 00007fbbbe57c38c R14: 0000000001f8e030 R15: 00007fbbbe57c78c
  </TASK>
 ---[ end trace 0000000000000000 ]---
 ------------[ cut here ]------------
 kernel BUG at mm/slub.c:381!
 invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 0 PID: 50087 Comm: integritysetup Tainted: G        W         -------  ---  6.0.0-0.rc6.41.fc38.x86_64 #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
 RIP: 0010:__slab_free+0x11c/0x2f0
 Code: 39 49 8b 04 24 48 89 4c 24 18 48 c1 e8 36 4c 8b ac c3 d8 00 00 00 4c 89 ef e8 90 a2 a3 00 48 8b 4c 24 18 48 89 44 24 20 eb 8e <0f> 0b f7 43 08 00 0d 21 00 75 cc 4d 85 ff 75 c7 80 4c 24 53 80 e9
 RSP: 0018:ffffa2e001da7998 EFLAGS: 00010246
 RAX: ffff8e5059948440 RBX: ffff8e5040042200 RCX: 0000000082000127
 RDX: fffffffcab948440 RSI: fffff015c4665200 RDI: ffff8e5040042200
 RBP: ffff8e5059948440 R08: 0000000000000001 R09: ffffffffae457f18
 R10: 0000000000000009 R11: ffff8e514b185488 R12: fffff015c4665200
 R13: ffff8e509d111398 R14: ffff8e5059948440 R15: ffff8e5059948440
 FS:  00007fbbbdc13880(0000) GS:ffff8e53afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055d1cdc94248 CR3: 0000000219990000 CR4: 00000000000006f0
 Call Trace:
  <TASK>
  ? osq_unlock+0xf/0x90
  ? rwsem_down_write_slowpath+0x1fc/0x710
  kernfs_put.part.0+0x58/0x1a0
  __kernfs_remove.part.0+0x17c/0x2b0
  ? _raw_spin_unlock_irqrestore+0x23/0x40
  ? kernfs_name_hash+0x12/0x80
  kernfs_remove_by_name_ns+0x60/0xa0
  sysfs_slab_add+0x166/0x200
  __kmem_cache_create+0x3c9/0x4b0
  kmem_cache_create_usercopy+0x202/0x340
  kmem_cache_create+0x12/0x20
  bioset_init+0x1fe/0x280
  dm_table_complete+0x3cd/0x6f0
  table_load+0x140/0x2c0
  ? dev_suspend+0x2e0/0x2e0
  ctl_ioctl+0x1f2/0x450
  dm_ctl_ioctl+0xa/0x20
  __x64_sys_ioctl+0x90/0xd0
  do_syscall_64+0x5b/0x80
  ? fpregs_restore_userregs+0x12/0xe0
  ? exit_to_user_mode_prepare+0x18f/0x1f0
  ? syscall_exit_to_user_mode+0x17/0x40
  ? do_syscall_64+0x67/0x80
  ? do_syscall_64+0x67/0x80
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0033:0x7fbbbdf27daf
 Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
 RSP: 002b:00007ffc3c305540 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 0000000001f897e0 RCX: 00007fbbbdf27daf
 RDX: 0000000001f8e000 RSI: 00000000c138fd09 RDI: 0000000000000006
 RBP: 0000000000000003 R08: 0000000001f8b150 R09: 0000000000000073
 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fbbbe57c6b6
 R13: 00007fbbbe57c38c R14: 0000000001f8e030 R15: 00007fbbbe57c78c
  </TASK>
 Modules linked in: crc32_generic loop dm_integrity async_xor async_tx tls isofs uinput snd_seq_dummy snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc snd_hda_codec_generic ledtrig_audio snd_hda_intel iTCO_wdt snd_intel_dspcfg intel_pmc_bxt snd_intel_sdw_acpi iTCO_vendor_support snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device joydev snd_pcm i2c_i801 snd_timer pcspkr i2c_smbus virtio_balloon snd lpc_ich soundcore zram virtio_net net_failover virtio_blk serio_raw failover qxl virtio_console drm_ttm_helper ttm ip6_tables ip_tables fuse qemu_fw_cfg
 Unloaded tainted modules: crc32_pclmul():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
 ---[ end trace 0000000000000000 ]---
 RIP: 0010:__slab_free+0x11c/0x2f0
 Code: 39 49 8b 04 24 48 89 4c 24 18 48 c1 e8 36 4c 8b ac c3 d8 00 00 00 4c 89 ef e8 90 a2 a3 00 48 8b 4c 24 18 48 89 44 24 20 eb 8e <0f> 0b f7 43 08 00 0d 21 00 75 cc 4d 85 ff 75 c7 80 4c 24 53 80 e9
 RSP: 0018:ffffa2e001da7998 EFLAGS: 00010246
 RAX: ffff8e5059948440 RBX: ffff8e5040042200 RCX: 0000000082000127
 RDX: fffffffcab948440 RSI: fffff015c4665200 RDI: ffff8e5040042200
 RBP: ffff8e5059948440 R08: 0000000000000001 R09: ffffffffae457f18
 R10: 0000000000000009 R11: ffff8e514b185488 R12: fffff015c4665200
 R13: ffff8e509d111398 R14: ffff8e5059948440 R15: ffff8e5059948440
 FS:  00007fbbbdc13880(0000) GS:ffff8e53afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055d1cdc94248 CR3: 0000000219990000 CR4: 00000000000006f0

The reason for the crash is this:

* create_bio_slab creates the bio slab with SLAB_TYPESAFE_BY_RCU - that
  means that the slab destruction will be delayed by a rcu grace period

* when destroying a slab with SLAB_TYPESAFE_BY_RCU, the function
  shutdown_cache adds the slab to a list and calls
  slab_caches_to_rcu_destroy_workfn that will execute rcu_barrier() and
  free slab caches that are on the list

* while slab_caches_to_rcu_destroy_workfn attempts to free the slab, the
  function bioset_init creates another slab with the same name

* bioset_init goes down to sysfs_slab_add -> sysfs_remove_link ->
  kernfs_remove_by_name -> kernfs_remove_by_name_ns

* simultaneously, slab_caches_to_rcu_destroy_workfn goes down to
  kmem_cache_release which tries to unlink and release the slab kobject

* now, we have two processes that are simultaneously trying to delete the
  same kobject

* kernfs_remove_by_name_ns wins the race, grabs the lock
  &root->kernfs_rwsem and executes __kernfs_remove

* __kernfs_remove goes down to pos = kernfs_leftmost_descendant(kn), then
  it calls kernfs_get(pos) and kernfs_drain(pos)

* kernfs_drain(pos) drops the &root->kernfs_rwsem lock temporarily

* while the lock is dropped, kmem_cache_release calls sysfs_slab_unlink
  -> kobject_del -> __kobject_del -> sysfs_remove_dir -> kernfs_remove ->
  __kernfs_remove

* kmem_cache_release calls sysfs_slab_release -> kobject_put -> kref_put
  -> kobject_release -> kobject_cleanup - this frees the kobject

* then we go back to the process that dropped the lock &root->kernfs_rwsem
  in kernfs_drain(pos) - the process re-acquires the lock and returns to
  __kernfs_remove. The process still keeps reference to the "pos" kobject,
  however it no longer keeps reference to the "kn" kobject (which was
  freed in the step above). It executes kernfs_put(pos) and then continues
  the loop with "kn" pointing to free memory

* kernfs_leftmost_descendant(kn) return "kn" because there are no files
  under "kn". kernfs_get(pos) triggers a warning
  WARN_ON(!atomic_read(&kn->count)). kernfs_put(pos) triggers a BUG in
  set_freepointer because it attempts to free an object that is already
  free

We fix this bug by grabbing a reference to "kn" in __kernfs_remove and
dropping it when we are done. This prevents "kn" from being released when
we drop the lock in kernfs_drain.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Milan Broz <gmazyland@gmail.com>
Cc: stable@vger.kernel.org

---
 fs/kernfs/dir.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Index: linux-2.6/fs/kernfs/dir.c
===================================================================
--- linux-2.6.orig/fs/kernfs/dir.c
+++ linux-2.6/fs/kernfs/dir.c
@@ -1364,6 +1364,12 @@ static void __kernfs_remove(struct kernf
 		if (kernfs_active(pos))
 			atomic_add(KN_DEACTIVATED_BIAS, &pos->active);
 
+	/*
+	 * Make sure that kn won't go away while we drop the lock in
+	 * kernfs_drain().
+	 */
+	kernfs_get(kn);
+
 	/* deactivate and unlink the subtree node-by-node */
 	do {
 		pos = kernfs_leftmost_descendant(kn);
@@ -1406,6 +1412,8 @@ static void __kernfs_remove(struct kernf
 
 		kernfs_put(pos);
 	} while (pos != kn);
+
+	kernfs_put(kn);
 }
 
 /**


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kernfs: fix a crash when two processes delete the same directory
  2022-09-26 11:04 [PATCH] kernfs: fix a crash when two processes delete the same directory Mikulas Patocka
@ 2022-09-26 12:16 ` Greg Kroah-Hartman
  2022-09-27 15:22   ` [dm-devel] " Milan Broz
  2022-09-27 16:09   ` Mikulas Patocka
  0 siblings, 2 replies; 7+ messages in thread
From: Greg Kroah-Hartman @ 2022-09-26 12:16 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Rafael J. Wysocki, Christoph Lameter, Andrew Morton, Milan Broz,
	dm-devel, linux-kernel

On Mon, Sep 26, 2022 at 07:04:52AM -0400, Mikulas Patocka wrote:
> There is a crash when running the cryptsetup testsuite on Fedora Rawhide.
> It can be reproduced by installing Rawhide with the 6.0-rc6 kernel,
> downloading and compiling the cryptsetup repository and running this test
> in a loop for about 15 minuts:
> 	while ./integrity-compat-test; do :; done
> 
>  ------------[ cut here ]------------
>  WARNING: CPU: 0 PID: 50087 at fs/kernfs/dir.c:504 __kernfs_remove.part.0+0x26f/0x2b0
>  Modules linked in: crc32_generic loop dm_integrity async_xor async_tx tls isofs uinput snd_seq_dummy snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc snd_hda_codec_generic ledtrig_audio snd_hda_intel iTCO_wdt snd_intel_dspcfg intel_pmc_bxt snd_intel_sdw_acpi iTCO_vendor_support snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device joydev snd_pcm i2c_i801 snd_timer pcspkr i2c_smbus virtio_balloon snd lpc_ich soundcore zram virtio_net net_failover virtio_blk serio_raw failover qxl virtio_console drm_ttm_helper ttm ip6_tables ip_tables fuse qemu_fw_cfg
>  Unloaded tainted modules: crc32_pclmul():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
>  CPU: 0 PID: 50087 Comm: integritysetup Not tainted 6.0.0-0.rc6.41.fc38.x86_64 #1
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
>  RIP: 0010:__kernfs_remove.part.0+0x26f/0x2b0
>  Code: 0f 85 24 ff ff ff 4d 85 e4 0f 84 31 ff ff ff 41 0f b7 44 24 70 4c 89 e3 83 e0 0f 66 83 f8 01 0f 84 2c fe ff ff e9 50 fe ff ff <0f> 0b e9 53 fe ff ff 0f 0b e9 6b fe ff ff 48 8b 57 10 48 c7 c6 e8
>  RSP: 0018:ffffa2e001da7a78 EFLAGS: 00010246
>  RAX: 0000000000000000 RBX: ffff8e509d111380 RCX: 0000000080200015
>  RDX: 0000000000000000 RSI: fffff015c5744440 RDI: ffff8e509d1113c8
>  RBP: ffffa2e001da7ac0 R08: 0000000000000000 R09: 0000000080200015
>  R10: 0000000000000009 R11: ffff8e514b185488 R12: ffff8e509d111380
>  R13: ffff8e5114324f98 R14: ffff8e50401e92a0 R15: 0000000000000000
>  FS:  00007fbbbdc13880(0000) GS:ffff8e53afa00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 000055d1cdc94248 CR3: 0000000219990000 CR4: 00000000000006f0
>  Call Trace:
>   <TASK>
>   ? _raw_spin_unlock_irqrestore+0x23/0x40
>   ? kernfs_name_hash+0x12/0x80
>   kernfs_remove_by_name_ns+0x60/0xa0
>   sysfs_slab_add+0x166/0x200
>   __kmem_cache_create+0x3c9/0x4b0
>   kmem_cache_create_usercopy+0x202/0x340
>   kmem_cache_create+0x12/0x20
>   bioset_init+0x1fe/0x280
>   dm_table_complete+0x3cd/0x6f0
>   table_load+0x140/0x2c0
>   ? dev_suspend+0x2e0/0x2e0
>   ctl_ioctl+0x1f2/0x450
>   dm_ctl_ioctl+0xa/0x20
>   __x64_sys_ioctl+0x90/0xd0
>   do_syscall_64+0x5b/0x80
>   ? fpregs_restore_userregs+0x12/0xe0
>   ? exit_to_user_mode_prepare+0x18f/0x1f0
>   ? syscall_exit_to_user_mode+0x17/0x40
>   ? do_syscall_64+0x67/0x80
>   ? do_syscall_64+0x67/0x80
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
>  RIP: 0033:0x7fbbbdf27daf
>  Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
>  RSP: 002b:00007ffc3c305540 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>  RAX: ffffffffffffffda RBX: 0000000001f897e0 RCX: 00007fbbbdf27daf
>  RDX: 0000000001f8e000 RSI: 00000000c138fd09 RDI: 0000000000000006
>  RBP: 0000000000000003 R08: 0000000001f8b150 R09: 0000000000000073
>  R10: 0000000000000000 R11: 0000000000000246 R12: 00007fbbbe57c6b6
>  R13: 00007fbbbe57c38c R14: 0000000001f8e030 R15: 00007fbbbe57c78c
>   </TASK>
>  ---[ end trace 0000000000000000 ]---
>  ------------[ cut here ]------------
>  kernel BUG at mm/slub.c:381!
>  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>  CPU: 0 PID: 50087 Comm: integritysetup Tainted: G        W         -------  ---  6.0.0-0.rc6.41.fc38.x86_64 #1
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
>  RIP: 0010:__slab_free+0x11c/0x2f0
>  Code: 39 49 8b 04 24 48 89 4c 24 18 48 c1 e8 36 4c 8b ac c3 d8 00 00 00 4c 89 ef e8 90 a2 a3 00 48 8b 4c 24 18 48 89 44 24 20 eb 8e <0f> 0b f7 43 08 00 0d 21 00 75 cc 4d 85 ff 75 c7 80 4c 24 53 80 e9
>  RSP: 0018:ffffa2e001da7998 EFLAGS: 00010246
>  RAX: ffff8e5059948440 RBX: ffff8e5040042200 RCX: 0000000082000127
>  RDX: fffffffcab948440 RSI: fffff015c4665200 RDI: ffff8e5040042200
>  RBP: ffff8e5059948440 R08: 0000000000000001 R09: ffffffffae457f18
>  R10: 0000000000000009 R11: ffff8e514b185488 R12: fffff015c4665200
>  R13: ffff8e509d111398 R14: ffff8e5059948440 R15: ffff8e5059948440
>  FS:  00007fbbbdc13880(0000) GS:ffff8e53afa00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 000055d1cdc94248 CR3: 0000000219990000 CR4: 00000000000006f0
>  Call Trace:
>   <TASK>
>   ? osq_unlock+0xf/0x90
>   ? rwsem_down_write_slowpath+0x1fc/0x710
>   kernfs_put.part.0+0x58/0x1a0
>   __kernfs_remove.part.0+0x17c/0x2b0
>   ? _raw_spin_unlock_irqrestore+0x23/0x40
>   ? kernfs_name_hash+0x12/0x80
>   kernfs_remove_by_name_ns+0x60/0xa0
>   sysfs_slab_add+0x166/0x200
>   __kmem_cache_create+0x3c9/0x4b0
>   kmem_cache_create_usercopy+0x202/0x340
>   kmem_cache_create+0x12/0x20
>   bioset_init+0x1fe/0x280
>   dm_table_complete+0x3cd/0x6f0
>   table_load+0x140/0x2c0
>   ? dev_suspend+0x2e0/0x2e0
>   ctl_ioctl+0x1f2/0x450
>   dm_ctl_ioctl+0xa/0x20
>   __x64_sys_ioctl+0x90/0xd0
>   do_syscall_64+0x5b/0x80
>   ? fpregs_restore_userregs+0x12/0xe0
>   ? exit_to_user_mode_prepare+0x18f/0x1f0
>   ? syscall_exit_to_user_mode+0x17/0x40
>   ? do_syscall_64+0x67/0x80
>   ? do_syscall_64+0x67/0x80
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
>  RIP: 0033:0x7fbbbdf27daf
>  Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
>  RSP: 002b:00007ffc3c305540 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>  RAX: ffffffffffffffda RBX: 0000000001f897e0 RCX: 00007fbbbdf27daf
>  RDX: 0000000001f8e000 RSI: 00000000c138fd09 RDI: 0000000000000006
>  RBP: 0000000000000003 R08: 0000000001f8b150 R09: 0000000000000073
>  R10: 0000000000000000 R11: 0000000000000246 R12: 00007fbbbe57c6b6
>  R13: 00007fbbbe57c38c R14: 0000000001f8e030 R15: 00007fbbbe57c78c
>   </TASK>
>  Modules linked in: crc32_generic loop dm_integrity async_xor async_tx tls isofs uinput snd_seq_dummy snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc snd_hda_codec_generic ledtrig_audio snd_hda_intel iTCO_wdt snd_intel_dspcfg intel_pmc_bxt snd_intel_sdw_acpi iTCO_vendor_support snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device joydev snd_pcm i2c_i801 snd_timer pcspkr i2c_smbus virtio_balloon snd lpc_ich soundcore zram virtio_net net_failover virtio_blk serio_raw failover qxl virtio_console drm_ttm_helper ttm ip6_tables ip_tables fuse qemu_fw_cfg
>  Unloaded tainted modules: crc32_pclmul():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
>  ---[ end trace 0000000000000000 ]---
>  RIP: 0010:__slab_free+0x11c/0x2f0
>  Code: 39 49 8b 04 24 48 89 4c 24 18 48 c1 e8 36 4c 8b ac c3 d8 00 00 00 4c 89 ef e8 90 a2 a3 00 48 8b 4c 24 18 48 89 44 24 20 eb 8e <0f> 0b f7 43 08 00 0d 21 00 75 cc 4d 85 ff 75 c7 80 4c 24 53 80 e9
>  RSP: 0018:ffffa2e001da7998 EFLAGS: 00010246
>  RAX: ffff8e5059948440 RBX: ffff8e5040042200 RCX: 0000000082000127
>  RDX: fffffffcab948440 RSI: fffff015c4665200 RDI: ffff8e5040042200
>  RBP: ffff8e5059948440 R08: 0000000000000001 R09: ffffffffae457f18
>  R10: 0000000000000009 R11: ffff8e514b185488 R12: fffff015c4665200
>  R13: ffff8e509d111398 R14: ffff8e5059948440 R15: ffff8e5059948440
>  FS:  00007fbbbdc13880(0000) GS:ffff8e53afa00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 000055d1cdc94248 CR3: 0000000219990000 CR4: 00000000000006f0
> 
> The reason for the crash is this:
> 
> * create_bio_slab creates the bio slab with SLAB_TYPESAFE_BY_RCU - that
>   means that the slab destruction will be delayed by a rcu grace period
> 
> * when destroying a slab with SLAB_TYPESAFE_BY_RCU, the function
>   shutdown_cache adds the slab to a list and calls
>   slab_caches_to_rcu_destroy_workfn that will execute rcu_barrier() and
>   free slab caches that are on the list
> 
> * while slab_caches_to_rcu_destroy_workfn attempts to free the slab, the
>   function bioset_init creates another slab with the same name
> 
> * bioset_init goes down to sysfs_slab_add -> sysfs_remove_link ->
>   kernfs_remove_by_name -> kernfs_remove_by_name_ns
> 
> * simultaneously, slab_caches_to_rcu_destroy_workfn goes down to
>   kmem_cache_release which tries to unlink and release the slab kobject
> 
> * now, we have two processes that are simultaneously trying to delete the
>   same kobject
> 
> * kernfs_remove_by_name_ns wins the race, grabs the lock
>   &root->kernfs_rwsem and executes __kernfs_remove
> 
> * __kernfs_remove goes down to pos = kernfs_leftmost_descendant(kn), then
>   it calls kernfs_get(pos) and kernfs_drain(pos)
> 
> * kernfs_drain(pos) drops the &root->kernfs_rwsem lock temporarily
> 
> * while the lock is dropped, kmem_cache_release calls sysfs_slab_unlink
>   -> kobject_del -> __kobject_del -> sysfs_remove_dir -> kernfs_remove ->
>   __kernfs_remove
> 
> * kmem_cache_release calls sysfs_slab_release -> kobject_put -> kref_put
>   -> kobject_release -> kobject_cleanup - this frees the kobject
> 
> * then we go back to the process that dropped the lock &root->kernfs_rwsem
>   in kernfs_drain(pos) - the process re-acquires the lock and returns to
>   __kernfs_remove. The process still keeps reference to the "pos" kobject,
>   however it no longer keeps reference to the "kn" kobject (which was
>   freed in the step above). It executes kernfs_put(pos) and then continues
>   the loop with "kn" pointing to free memory
> 
> * kernfs_leftmost_descendant(kn) return "kn" because there are no files
>   under "kn". kernfs_get(pos) triggers a warning
>   WARN_ON(!atomic_read(&kn->count)). kernfs_put(pos) triggers a BUG in
>   set_freepointer because it attempts to free an object that is already
>   free
> 
> We fix this bug by grabbing a reference to "kn" in __kernfs_remove and
> dropping it when we are done. This prevents "kn" from being released when
> we drop the lock in kernfs_drain.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Reported-by: Milan Broz <gmazyland@gmail.com>
> Cc: stable@vger.kernel.org
> 
> ---
>  fs/kernfs/dir.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> Index: linux-2.6/fs/kernfs/dir.c
> ===================================================================
> --- linux-2.6.orig/fs/kernfs/dir.c
> +++ linux-2.6/fs/kernfs/dir.c
> @@ -1364,6 +1364,12 @@ static void __kernfs_remove(struct kernf
>  		if (kernfs_active(pos))
>  			atomic_add(KN_DEACTIVATED_BIAS, &pos->active);
>  
> +	/*
> +	 * Make sure that kn won't go away while we drop the lock in
> +	 * kernfs_drain().
> +	 */
> +	kernfs_get(kn);
> +
>  	/* deactivate and unlink the subtree node-by-node */
>  	do {
>  		pos = kernfs_leftmost_descendant(kn);
> @@ -1406,6 +1412,8 @@ static void __kernfs_remove(struct kernf
>  
>  		kernfs_put(pos);
>  	} while (pos != kn);
> +
> +	kernfs_put(kn);
>  }
>  
>  /**
> 

Can you see if 4abc99652812 ("kernfs: fix use-after-free in
__kernfs_remove") in linux-next fixes this for you or not?  It seems to
be the same issue, as was also reported at:
	https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dm-devel] [PATCH] kernfs: fix a crash when two processes delete the same directory
  2022-09-26 12:16 ` Greg Kroah-Hartman
@ 2022-09-27 15:22   ` Milan Broz
  2022-09-27 15:37     ` Greg Kroah-Hartman
  2022-09-27 16:09   ` Mikulas Patocka
  1 sibling, 1 reply; 7+ messages in thread
From: Milan Broz @ 2022-09-27 15:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Mikulas Patocka
  Cc: Andrew Morton, Rafael J. Wysocki, linux-kernel, dm-devel,
	Christoph Lameter

On 9/26/22 14:16, Greg Kroah-Hartman wrote:
> On Mon, Sep 26, 2022 at 07:04:52AM -0400, Mikulas Patocka wrote:
>> There is a crash when running the cryptsetup testsuite on Fedora Rawhide.
>> It can be reproduced by installing Rawhide with the 6.0-rc6 kernel,
>> downloading and compiling the cryptsetup repository and running this test
>> in a loop for about 15 minuts:
>> 	while ./integrity-compat-test; do :; done
>>
>>   ------------[ cut here ]------------
>>   WARNING: CPU: 0 PID: 50087 at fs/kernfs/dir.c:504 __kernfs_remove.part.0+0x26f/0x2b0
>>   Modules linked in: crc32_generic loop dm_integrity async_xor async_tx tls isofs uinput snd_seq_dummy snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc snd_hda_codec_generic ledtrig_audio snd_hda_intel iTCO_wdt snd_intel_dspcfg intel_pmc_bxt snd_intel_sdw_acpi iTCO_vendor_support snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device joydev snd_pcm i2c_i801 snd_timer pcspkr i2c_smbus virtio_balloon snd lpc_ich soundcore zram virtio_net net_failover virtio_blk serio_raw failover qxl virtio_console drm_ttm_helper ttm ip6_tables ip_tables fuse qemu_fw_cfg
>>   Unloaded tainted modules: crc32_pclmul():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
>>   CPU: 0 PID: 50087 Comm: integritysetup Not tainted 6.0.0-0.rc6.41.fc38.x86_64 #1
>>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
>>   RIP: 0010:__kernfs_remove.part.0+0x26f/0x2b0

...

> Can you see if 4abc99652812 ("kernfs: fix use-after-free in
> __kernfs_remove") in linux-next fixes this for you or not?  It seems to
> be the same issue, as was also reported at:
> 	https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp


I tried it on system where cryptsetup testsuite almost immediately crashed in the integrity test.

With the patch in https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp
it now iterates for some time without any problems, so I think it is fixed.

Tested-by: Milan Broz <gmazyland@gmail.com>

Thanks,
Milan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dm-devel] [PATCH] kernfs: fix a crash when two processes delete the same directory
  2022-09-27 15:22   ` [dm-devel] " Milan Broz
@ 2022-09-27 15:37     ` Greg Kroah-Hartman
  2022-09-27 17:24       ` Milan Broz
  0 siblings, 1 reply; 7+ messages in thread
From: Greg Kroah-Hartman @ 2022-09-27 15:37 UTC (permalink / raw)
  To: Milan Broz
  Cc: Mikulas Patocka, Andrew Morton, Rafael J. Wysocki, linux-kernel,
	dm-devel, Christoph Lameter

On Tue, Sep 27, 2022 at 05:22:46PM +0200, Milan Broz wrote:
> On 9/26/22 14:16, Greg Kroah-Hartman wrote:
> > On Mon, Sep 26, 2022 at 07:04:52AM -0400, Mikulas Patocka wrote:
> > > There is a crash when running the cryptsetup testsuite on Fedora Rawhide.
> > > It can be reproduced by installing Rawhide with the 6.0-rc6 kernel,
> > > downloading and compiling the cryptsetup repository and running this test
> > > in a loop for about 15 minuts:
> > > 	while ./integrity-compat-test; do :; done
> > > 
> > >   ------------[ cut here ]------------
> > >   WARNING: CPU: 0 PID: 50087 at fs/kernfs/dir.c:504 __kernfs_remove.part.0+0x26f/0x2b0
> > >   Modules linked in: crc32_generic loop dm_integrity async_xor async_tx tls isofs uinput snd_seq_dummy snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc snd_hda_codec_generic ledtrig_audio snd_hda_intel iTCO_wdt snd_intel_dspcfg intel_pmc_bxt snd_intel_sdw_acpi iTCO_vendor_support snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device joydev snd_pcm i2c_i801 snd_timer pcspkr i2c_smbus virtio_balloon snd lpc_ich soundcore zram virtio_net net_failover virtio_blk serio_raw failover qxl virtio_console drm_ttm_helper ttm ip6_tables ip_tables fuse qemu_fw_cfg
> > >   Unloaded tainted modules: crc32_pclmul():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
> > >   CPU: 0 PID: 50087 Comm: integritysetup Not tainted 6.0.0-0.rc6.41.fc38.x86_64 #1
> > >   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> > >   RIP: 0010:__kernfs_remove.part.0+0x26f/0x2b0
> 
> ...
> 
> > Can you see if 4abc99652812 ("kernfs: fix use-after-free in
> > __kernfs_remove") in linux-next fixes this for you or not?  It seems to
> > be the same issue, as was also reported at:
> > 	https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp
> 
> 
> I tried it on system where cryptsetup testsuite almost immediately crashed in the integrity test.
> 
> With the patch in https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp
> it now iterates for some time without any problems, so I think it is fixed.
> 
> Tested-by: Milan Broz <gmazyland@gmail.com>

Wait, what about the patch that is in linux-next that I pointed to, not
the one in the email?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kernfs: fix a crash when two processes delete the same directory
  2022-09-26 12:16 ` Greg Kroah-Hartman
  2022-09-27 15:22   ` [dm-devel] " Milan Broz
@ 2022-09-27 16:09   ` Mikulas Patocka
  2022-09-27 16:39     ` Greg Kroah-Hartman
  1 sibling, 1 reply; 7+ messages in thread
From: Mikulas Patocka @ 2022-09-27 16:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Rafael J. Wysocki, Christoph Lameter, Andrew Morton, Milan Broz,
	dm-devel, linux-kernel



On Mon, 26 Sep 2022, Greg Kroah-Hartman wrote:

> Can you see if 4abc99652812 ("kernfs: fix use-after-free in
> __kernfs_remove") in linux-next fixes this for you or not?  It seems to
> be the same issue, as was also reported at:
> 	https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp
> 
> thanks,
> 
> greg k-h

I'm running the test with the patch 4abc99652812 for 3 hours, no crash, so 
it's probably ok.

Tested-by: Mikulas Patocka <mpatocka@redhat.com>

Mikulas


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kernfs: fix a crash when two processes delete the same directory
  2022-09-27 16:09   ` Mikulas Patocka
@ 2022-09-27 16:39     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 7+ messages in thread
From: Greg Kroah-Hartman @ 2022-09-27 16:39 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Rafael J. Wysocki, Christoph Lameter, Andrew Morton, Milan Broz,
	dm-devel, linux-kernel

On Tue, Sep 27, 2022 at 12:09:19PM -0400, Mikulas Patocka wrote:
> 
> 
> On Mon, 26 Sep 2022, Greg Kroah-Hartman wrote:
> 
> > Can you see if 4abc99652812 ("kernfs: fix use-after-free in
> > __kernfs_remove") in linux-next fixes this for you or not?  It seems to
> > be the same issue, as was also reported at:
> > 	https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp
> > 
> > thanks,
> > 
> > greg k-h
> 
> I'm running the test with the patch 4abc99652812 for 3 hours, no crash, so 
> it's probably ok.
> 
> Tested-by: Mikulas Patocka <mpatocka@redhat.com>

Wonderful, thanks!

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dm-devel] [PATCH] kernfs: fix a crash when two processes delete the same directory
  2022-09-27 15:37     ` Greg Kroah-Hartman
@ 2022-09-27 17:24       ` Milan Broz
  0 siblings, 0 replies; 7+ messages in thread
From: Milan Broz @ 2022-09-27 17:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mikulas Patocka, Andrew Morton, Rafael J. Wysocki, linux-kernel,
	dm-devel, Christoph Lameter

On 9/27/22 17:37, Greg Kroah-Hartman wrote:
> On Tue, Sep 27, 2022 at 05:22:46PM +0200, Milan Broz wrote:
>> On 9/26/22 14:16, Greg Kroah-Hartman wrote:
>>> On Mon, Sep 26, 2022 at 07:04:52AM -0400, Mikulas Patocka wrote:
>>>> There is a crash when running the cryptsetup testsuite on Fedora Rawhide.
>>>> It can be reproduced by installing Rawhide with the 6.0-rc6 kernel,
>>>> downloading and compiling the cryptsetup repository and running this test
>>>> in a loop for about 15 minuts:
>>>> 	while ./integrity-compat-test; do :; done
>>>>
>>>>    ------------[ cut here ]------------
>>>>    WARNING: CPU: 0 PID: 50087 at fs/kernfs/dir.c:504 __kernfs_remove.part.0+0x26f/0x2b0
>>>>    Modules linked in: crc32_generic loop dm_integrity async_xor async_tx tls isofs uinput snd_seq_dummy snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc snd_hda_codec_generic ledtrig_audio snd_hda_intel iTCO_wdt snd_intel_dspcfg intel_pmc_bxt snd_intel_sdw_acpi iTCO_vendor_support snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device joydev snd_pcm i2c_i801 snd_timer pcspkr i2c_smbus virtio_balloon snd lpc_ich soundcore zram virtio_net net_failover virtio_blk serio_raw failover qxl virtio_console drm_ttm_helper ttm ip6_tables ip_tables fuse qemu_fw_cfg
>>>>    Unloaded tainted modules: crc32_pclmul():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 edac_mce_amd():1 acpi_cpufreq():1 pcc_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 edac_mce_amd():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
>>>>    CPU: 0 PID: 50087 Comm: integritysetup Not tainted 6.0.0-0.rc6.41.fc38.x86_64 #1
>>>>    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
>>>>    RIP: 0010:__kernfs_remove.part.0+0x26f/0x2b0
>>
>> ...
>>
>>> Can you see if 4abc99652812 ("kernfs: fix use-after-free in
>>> __kernfs_remove") in linux-next fixes this for you or not?  It seems to
>>> be the same issue, as was also reported at:
>>> 	https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp
>>
>>
>> I tried it on system where cryptsetup testsuite almost immediately crashed in the integrity test.
>>
>> With the patch in https://lore.kernel.org/r/7f489b14-2fdc-3d91-c87e-6a802bd8592d@I-love.SAKURA.ne.jp
>> it now iterates for some time without any problems, so I think it is fixed.
>>
>> Tested-by: Milan Broz <gmazyland@gmail.com>
> 
> Wait, what about the patch that is in linux-next that I pointed to, not
> the one in the email?

Ehm, yes, my mistake.

Now tested 4abc99652812 (it is not visible in any branch anymore, perhaps rebased).

Already running several iterations and no problems, so for the correct patch now:

Tested-by: Milan Broz <gmazyland@gmail.com>

Milan

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-09-27 17:24 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-26 11:04 [PATCH] kernfs: fix a crash when two processes delete the same directory Mikulas Patocka
2022-09-26 12:16 ` Greg Kroah-Hartman
2022-09-27 15:22   ` [dm-devel] " Milan Broz
2022-09-27 15:37     ` Greg Kroah-Hartman
2022-09-27 17:24       ` Milan Broz
2022-09-27 16:09   ` Mikulas Patocka
2022-09-27 16:39     ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).