* [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing
@ 2022-09-22 10:10 Janusz Krzysztofik
  2022-09-22 12:09 ` Robin Murphy
  ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Janusz Krzysztofik @ 2022-09-22 10:10 UTC (permalink / raw)
To: Lucas De Marchi
Cc: intel-gfx, Chris Wilson, Robin Murphy, Joerg Roedel, Will Deacon,
	iommu, linux-kernel

From: Chris Wilson <chris@chris-wilson.co.uk>

Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to
flush rcaches").  The reverted code ends up trying to instantiate a cpuhp
notifier from inside a cpuhp callback.  It replaced the intel_iommu
implementation of flushing per-IOVA-domain CPU rcaches, which used a
single cpuhp instance held for the module lifetime.

<4>[ 6.928112] ======================================================
<4>[ 6.928621] WARNING: possible circular locking dependency detected
<4>[ 6.929225] 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 Not tainted
<4>[ 6.929818] ------------------------------------------------------
<4>[ 6.930415] cpuhp/0/15 is trying to acquire lock:
<4>[ 6.931011] ffff888100e02a78 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x20/0x50
<4>[ 6.931533]
               but task is already holding lock:
<4>[ 6.931534] ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0
<4>[ 6.933069]
               which lock already depends on the new lock.

<4>[ 6.933070]
               the existing dependency chain (in reverse order) is:
<4>[ 6.933071]
               -> #2 (cpuhp_state-up){+.+.}-{0:0}:
<4>[ 6.933076]        lock_acquire+0xd3/0x310
<4>[ 6.933079]        cpuhp_thread_fun+0xa6/0x1f0
<4>[ 6.933082]        smpboot_thread_fn+0x1b5/0x260
<4>[ 6.933084]        kthread+0xed/0x120
<4>[ 6.933086]        ret_from_fork+0x1f/0x30
<4>[ 6.933089]
               -> #1 (cpu_hotplug_lock){++++}-{0:0}:
<4>[ 6.933092]        lock_acquire+0xd3/0x310
<4>[ 6.933094]        __cpuhp_state_add_instance+0x43/0x1c0
<4>[ 6.933096]        iova_domain_init_rcaches+0x199/0x1c0
<4>[ 6.933099]        iommu_setup_dma_ops+0x104/0x3d0
<4>[ 6.933101]        iommu_probe_device+0xa4/0x180
<4>[ 6.933103]        iommu_bus_notifier+0x2d/0x40
<4>[ 6.933105]        notifier_call_chain+0x31/0x90
<4>[ 6.933108]        blocking_notifier_call_chain+0x3a/0x50
<4>[ 6.933110]        device_add+0x3c1/0x900
<4>[ 6.933112]        pci_device_add+0x255/0x580
<4>[ 6.933115]        pci_scan_single_device+0xa6/0xd0
<4>[ 6.933117]        p2sb_bar+0x7f/0x220
<4>[ 6.933120]        i801_add_tco_spt.isra.18+0x2b/0xca [i2c_i801]
<4>[ 6.933124]        i801_add_tco+0xb1/0xfe [i2c_i801]
<4>[ 6.933126]        i801_probe.cold.25+0xa9/0x3a7 [i2c_i801]
<4>[ 6.933129]        pci_device_probe+0x95/0x110
<4>[ 6.933132]        really_probe+0xd6/0x350
<4>[ 6.933134]        __driver_probe_device+0x73/0x170
<4>[ 6.933137]        driver_probe_device+0x1a/0x90
<4>[ 6.933140]        __driver_attach+0xbc/0x190
<4>[ 6.933141]        bus_for_each_dev+0x72/0xc0
<4>[ 6.933143]        bus_add_driver+0x1bb/0x210
<4>[ 6.933146]        driver_register+0x66/0xc0
<4>[ 6.933147]        wmi_bmof_probe+0x3b/0xac [wmi_bmof]
<4>[ 6.933150]        do_one_initcall+0x53/0x2f0
<4>[ 6.933152]        do_init_module+0x45/0x1c0
<4>[ 6.933154]        load_module+0x1cd5/0x1ec0
<4>[ 6.933156]        __do_sys_finit_module+0xaf/0x120
<4>[ 6.933158]        do_syscall_64+0x37/0x90
<4>[ 6.933160]        entry_SYSCALL_64_after_hwframe+0x63/0xcd
<4>[ 6.953757]
               -> #0 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}:
<4>[ 6.953779]        validate_chain+0xb3f/0x2000
<4>[ 6.953785]        __lock_acquire+0x5a4/0xb70
<4>[ 6.953786]        lock_acquire+0xd3/0x310
<4>[ 6.953787]        down_read+0x39/0x140
<4>[ 6.953790]        blocking_notifier_call_chain+0x20/0x50
<4>[ 6.953794]        device_add+0x3c1/0x900
<4>[ 6.953797]        platform_device_add+0x108/0x240
<4>[ 6.953799]        coretemp_cpu_online+0xe1/0x15e [coretemp]
<4>[ 6.953805]        cpuhp_invoke_callback+0x181/0x8a0
<4>[ 6.958244]        cpuhp_thread_fun+0x188/0x1f0
<4>[ 6.958267]        smpboot_thread_fn+0x1b5/0x260
<4>[ 6.958270]        kthread+0xed/0x120
<4>[ 6.958272]        ret_from_fork+0x1f/0x30
<4>[ 6.958274]
               other info that might help us debug this:

<4>[ 6.958275] Chain exists of:
                 &(&priv->bus_notifier)->rwsem --> cpu_hotplug_lock --> cpuhp_state-up

<4>[ 6.961037]  Possible unsafe locking scenario:

<4>[ 6.961038]        CPU0                    CPU1
<4>[ 6.961038]        ----                    ----
<4>[ 6.961039]   lock(cpuhp_state-up);
<4>[ 6.961040]                                lock(cpu_hotplug_lock);
<4>[ 6.961041]                                lock(cpuhp_state-up);
<4>[ 6.961042]   lock(&(&priv->bus_notifier)->rwsem);
<4>[ 6.961044]
                *** DEADLOCK ***

<4>[ 6.961044] 2 locks held by cpuhp/0/15:
<4>[ 6.961046]  #0: ffffffff82648f10 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0
<4>[ 6.961053]  #1: ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0
<4>[ 6.961058]
               stack backtrace:
<4>[ 6.961059] CPU: 0 PID: 15 Comm: cpuhp/0 Not tainted 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1
<4>[ 6.961062] Hardware name: Intel Corporation NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0047.2018.0718.1706 07/18/2018
<4>[ 6.961063] Call Trace:
<4>[ 6.961064]  <TASK>
<4>[ 6.961065]  dump_stack_lvl+0x56/0x7f
<4>[ 6.961069]  check_noncircular+0x132/0x150
<4>[ 6.961078]  validate_chain+0xb3f/0x2000
<4>[ 6.961083]  __lock_acquire+0x5a4/0xb70
<4>[ 6.961087]  lock_acquire+0xd3/0x310
<4>[ 6.961088]  ? blocking_notifier_call_chain+0x20/0x50
<4>[ 6.961093]  down_read+0x39/0x140
<4>[ 6.961097]  ? blocking_notifier_call_chain+0x20/0x50
<4>[ 6.961099]  blocking_notifier_call_chain+0x20/0x50
<4>[ 6.961102]  device_add+0x3c1/0x900
<4>[ 6.961106]  ? dev_set_name+0x4e/0x70
<4>[ 6.961109]  platform_device_add+0x108/0x240
<4>[ 6.961112]  coretemp_cpu_online+0xe1/0x15e [coretemp]
<4>[ 6.961117]  ? create_core_data+0x550/0x550 [coretemp]
<4>[ 6.961120]  cpuhp_invoke_callback+0x181/0x8a0
<4>[ 6.961124]  cpuhp_thread_fun+0x188/0x1f0
<4>[ 6.961129]  ? smpboot_thread_fn+0x1e/0x260
<4>[ 6.961131]  smpboot_thread_fn+0x1b5/0x260
<4>[ 6.961134]  ? sort_range+0x20/0x20
<4>[ 6.961135]  kthread+0xed/0x120
<4>[ 6.961137]  ? kthread_complete_and_exit+0x20/0x20
<4>[ 6.961139]  ret_from_fork+0x1f/0x30
<4>[ 6.961145]  </TASK>

Closes: https://gitlab.freedesktop.org/drm/intel/issues/6641
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
---
 drivers/iommu/iova.c       | 28 ----------------------------
 include/linux/cpuhotplug.h |  1 -
 include/linux/iova.h       |  1 -
 3 files changed, 30 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 47d1983dfa2a4..f0136d0231f06 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -31,16 +31,6 @@ unsigned long iova_rcache_range(void)
 	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
 }
 
-static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
-{
-	struct iova_domain *iovad;
-
-	iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead);
-
-	free_cpu_cached_iovas(cpu, iovad);
-	return 0;
-}
-
 static void free_global_cached_iovas(struct iova_domain *iovad);
 
 static struct iova *to_iova(struct rb_node *node)
@@ -255,21 +245,10 @@ int iova_cache_get(void)
 {
 	mutex_lock(&iova_cache_mutex);
 	if (!iova_cache_users) {
-		int ret;
-
-		ret = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, "iommu/iova:dead", NULL,
-					      iova_cpuhp_dead);
-		if (ret) {
-			mutex_unlock(&iova_cache_mutex);
-			pr_err("Couldn't register cpuhp handler\n");
-			return ret;
-		}
-
 		iova_cache = kmem_cache_create(
 			"iommu_iova", sizeof(struct iova), 0,
 			SLAB_HWCACHE_ALIGN, NULL);
 		if (!iova_cache) {
-			cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD);
 			mutex_unlock(&iova_cache_mutex);
 			pr_err("Couldn't create iova cache\n");
 			return -ENOMEM;
@@ -292,7 +271,6 @@ void iova_cache_put(void)
 	}
 	iova_cache_users--;
 	if (!iova_cache_users) {
-		cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD);
 		kmem_cache_destroy(iova_cache);
 	}
 	mutex_unlock(&iova_cache_mutex);
@@ -495,8 +473,6 @@ EXPORT_SYMBOL_GPL(free_iova_fast);
 
 static void iova_domain_free_rcaches(struct iova_domain *iovad)
 {
-	cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD,
-					    &iovad->cpuhp_dead);
 	free_iova_rcaches(iovad);
 }
 
@@ -755,10 +731,6 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
 		}
 	}
 
-	ret = cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD,
-					       &iovad->cpuhp_dead);
-	if (ret)
-		goto out_err;
 	return 0;
 
 out_err:
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index f61447913db97..8f541a6b63e41 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -95,7 +95,6 @@ enum cpuhp_state {
 	CPUHP_PAGE_ALLOC,
 	CPUHP_NET_DEV_DEAD,
 	CPUHP_PCI_XGENE_DEAD,
-	CPUHP_IOMMU_IOVA_DEAD,
 	CPUHP_LUSTRE_CFS_DEAD,
 	CPUHP_AP_ARM_CACHE_B15_RAC_DEAD,
 	CPUHP_PADATA_DEAD,
diff --git a/include/linux/iova.h b/include/linux/iova.h
index c6ba6d95d79c2..fd77cd5bfa333 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -37,7 +37,6 @@ struct iova_domain {
 	struct iova	anchor;		/* rbtree lookup anchor */
 
 	struct iova_rcache	*rcaches;
-	struct hlist_node	cpuhp_dead;
 };
 
 static inline unsigned long iova_size(struct iova *iova)
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread
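[Editor's note: the circular dependency in the splat above reduces to three pairwise lock orderings. The following is an illustrative, standalone sketch (plain Python, not kernel code; lock names are shorthand for the locks in the report) of how a lockdep-style checker derives the reported cycle from those orderings.]

```python
# Each edge (a, b) records "lock a was held while lock b was acquired",
# taken from the three dependency-chain entries in the splat above.
edges = [
    # -> #1: device_add() holds the bus notifier rwsem, then
    # iova_domain_init_rcaches() takes cpu_hotplug_lock
    ("bus_notifier_rwsem", "cpu_hotplug_lock"),
    # -> #2: the cpuhp thread holds cpu_hotplug_lock, then cpuhp_state-up
    ("cpu_hotplug_lock", "cpuhp_state_up"),
    # -> #0: coretemp_cpu_online() runs under cpuhp_state-up and calls
    # device_add(), which takes the bus notifier rwsem
    ("cpuhp_state_up", "bus_notifier_rwsem"),
]

def find_cycle(edges):
    """Return one lock-order cycle as a list of lock names, or None."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)

    def dfs(node, path):
        if node in path:                      # revisiting a held lock: cycle
            return path[path.index(node):] + [node]
        for nxt in graph.get(node, []):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in graph:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

print(" --> ".join(find_cycle(edges)))
# Mirrors the report: bus_notifier_rwsem --> cpu_hotplug_lock -->
#                     cpuhp_state_up --> bus_notifier_rwsem
```

Removing the per-domain cpuhp instance, as the patch does, deletes the first edge and breaks the cycle.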
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing
  2022-09-22 10:10 [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing Janusz Krzysztofik
@ 2022-09-22 12:09 ` Robin Murphy
  2022-09-22 13:37   ` Janusz Krzysztofik
  2022-09-30 16:57   ` Janusz Krzysztofik
  2022-10-05 14:41 ` [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing #forregzbot Thorsten Leemhuis
  2022-11-02 11:17 ` [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing Baolu Lu
  2 siblings, 2 replies; 11+ messages in thread
From: Robin Murphy @ 2022-09-22 12:09 UTC (permalink / raw)
To: Janusz Krzysztofik, Lucas De Marchi
Cc: intel-gfx, Chris Wilson, Joerg Roedel, Will Deacon, iommu, linux-kernel

On 22/09/2022 11:10 am, Janusz Krzysztofik wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to
> flush rcaches"). It is trying to instantiate a cpuhp notifier from inside
> a cpuhp callback. That code replaced intel_iommu implementation of
> flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp
> held for the module lifetime.

OK, *now* I see what's going on. It doesn't seem unreasonable to me for
bus notifiers to touch CPU hotplug - what seems more unexpected is the
coretemp driver creating and adding a platform device from inside a
hotplug callback.

Once we start trying to revert multiple unrelated bits of important
functionality from other subsystems because one driver is doing a weird
thing, maybe it's time to instead question whether that driver should be
doing a weird thing?

Thanks,
Robin.
> [lockdep splat and patch diff quoted in full; snipped]

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing
  2022-09-22 12:09 ` Robin Murphy
@ 2022-09-22 13:37   ` Janusz Krzysztofik
  2022-09-30 16:57   ` Janusz Krzysztofik
  1 sibling, 0 replies; 11+ messages in thread
From: Janusz Krzysztofik @ 2022-09-22 13:37 UTC (permalink / raw)
To: Lucas De Marchi, Robin Murphy
Cc: intel-gfx, Chris Wilson, Joerg Roedel, Will Deacon, iommu, linux-kernel

On Thursday, 22 September 2022 14:09:35 CEST Robin Murphy wrote:
> On 22/09/2022 11:10 am, Janusz Krzysztofik wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to
> > flush rcaches"). It is trying to instantiate a cpuhp notifier from inside
> > a cpuhp callback. That code replaced intel_iommu implementation of
> > flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp
> > held for the module lifetime.
>
> OK, *now* I see what's going on. It doesn't seem unreasonable to me for
> bus notifiers to touch CPU hotplug - what seems more unexpected is the
> coretemp driver creating and adding a platform device from inside a
> hotplug callback.
>
> Once we start trying to revert multiple unrelated bits of important
> functionality from other subsystems because one driver is doing a weird
> thing, maybe it's time to instead question whether that driver should be
> doing a weird thing?

To be clear, the intention behind this revert was to unblock intel-gfx-ci,
not to propose a solution.  I've CC-ed IOMMU and mainstream just for your
awareness.

Thanks,
Janusz

>
> Thanks,
> Robin.
> > > <4>[ 6.928112] ====================================================== > > <4>[ 6.928621] WARNING: possible circular locking dependency detected > > <4>[ 6.929225] 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 Not tainted > > <4>[ 6.929818] ------------------------------------------------------ > > <4>[ 6.930415] cpuhp/0/15 is trying to acquire lock: > > <4>[ 6.931011] ffff888100e02a78 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.931533] > > but task is already holding lock: > > <4>[ 6.931534] ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > > <4>[ 6.933069] > > which lock already depends on the new lock. > > > > <4>[ 6.933070] > > the existing dependency chain (in reverse order) is: > > <4>[ 6.933071] > > -> #2 (cpuhp_state-up){+.+.}-{0:0}: > > <4>[ 6.933076] lock_acquire+0xd3/0x310 > > <4>[ 6.933079] cpuhp_thread_fun+0xa6/0x1f0 > > <4>[ 6.933082] smpboot_thread_fn+0x1b5/0x260 > > <4>[ 6.933084] kthread+0xed/0x120 > > <4>[ 6.933086] ret_from_fork+0x1f/0x30 > > <4>[ 6.933089] > > -> #1 (cpu_hotplug_lock){++++}-{0:0}: > > <4>[ 6.933092] lock_acquire+0xd3/0x310 > > <4>[ 6.933094] __cpuhp_state_add_instance+0x43/0x1c0 > > <4>[ 6.933096] iova_domain_init_rcaches+0x199/0x1c0 > > <4>[ 6.933099] iommu_setup_dma_ops+0x104/0x3d0 > > <4>[ 6.933101] iommu_probe_device+0xa4/0x180 > > <4>[ 6.933103] iommu_bus_notifier+0x2d/0x40 > > <4>[ 6.933105] notifier_call_chain+0x31/0x90 > > <4>[ 6.933108] blocking_notifier_call_chain+0x3a/0x50 > > <4>[ 6.933110] device_add+0x3c1/0x900 > > <4>[ 6.933112] pci_device_add+0x255/0x580 > > <4>[ 6.933115] pci_scan_single_device+0xa6/0xd0 > > <4>[ 6.933117] p2sb_bar+0x7f/0x220 > > <4>[ 6.933120] i801_add_tco_spt.isra.18+0x2b/0xca [i2c_i801] > > <4>[ 6.933124] i801_add_tco+0xb1/0xfe [i2c_i801] > > <4>[ 6.933126] i801_probe.cold.25+0xa9/0x3a7 [i2c_i801] > > <4>[ 6.933129] pci_device_probe+0x95/0x110 > > <4>[ 6.933132] really_probe+0xd6/0x350 > > <4>[ 6.933134] 
__driver_probe_device+0x73/0x170 > > <4>[ 6.933137] driver_probe_device+0x1a/0x90 > > <4>[ 6.933140] __driver_attach+0xbc/0x190 > > <4>[ 6.933141] bus_for_each_dev+0x72/0xc0 > > <4>[ 6.933143] bus_add_driver+0x1bb/0x210 > > <4>[ 6.933146] driver_register+0x66/0xc0 > > <4>[ 6.933147] wmi_bmof_probe+0x3b/0xac [wmi_bmof] > > <4>[ 6.933150] do_one_initcall+0x53/0x2f0 > > <4>[ 6.933152] do_init_module+0x45/0x1c0 > > <4>[ 6.933154] load_module+0x1cd5/0x1ec0 > > <4>[ 6.933156] __do_sys_finit_module+0xaf/0x120 > > <4>[ 6.933158] do_syscall_64+0x37/0x90 > > <4>[ 6.933160] entry_SYSCALL_64_after_hwframe+0x63/0xcd > > <4>[ 6.953757] > > -> #0 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}: > > <4>[ 6.953779] validate_chain+0xb3f/0x2000 > > <4>[ 6.953785] __lock_acquire+0x5a4/0xb70 > > <4>[ 6.953786] lock_acquire+0xd3/0x310 > > <4>[ 6.953787] down_read+0x39/0x140 > > <4>[ 6.953790] blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.953794] device_add+0x3c1/0x900 > > <4>[ 6.953797] platform_device_add+0x108/0x240 > > <4>[ 6.953799] coretemp_cpu_online+0xe1/0x15e [coretemp] > > <4>[ 6.953805] cpuhp_invoke_callback+0x181/0x8a0 > > <4>[ 6.958244] cpuhp_thread_fun+0x188/0x1f0 > > <4>[ 6.958267] smpboot_thread_fn+0x1b5/0x260 > > <4>[ 6.958270] kthread+0xed/0x120 > > <4>[ 6.958272] ret_from_fork+0x1f/0x30 > > <4>[ 6.958274] > > other info that might help us debug this: > > > > <4>[ 6.958275] Chain exists of: > > &(&priv->bus_notifier)->rwsem --> cpu_hotplug_lock --> cpuhp_state-up > > > > <4>[ 6.961037] Possible unsafe locking scenario: > > > > <4>[ 6.961038] CPU0 CPU1 > > <4>[ 6.961038] ---- ---- > > <4>[ 6.961039] lock(cpuhp_state-up); > > <4>[ 6.961040] lock(cpu_hotplug_lock); > > <4>[ 6.961041] lock(cpuhp_state-up); > > <4>[ 6.961042] lock(&(&priv->bus_notifier)->rwsem); > > <4>[ 6.961044] > > *** DEADLOCK *** > > > > <4>[ 6.961044] 2 locks held by cpuhp/0/15: > > <4>[ 6.961046] #0: ffffffff82648f10 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > > <4>[ 6.961053] 
#1: ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > > <4>[ 6.961058] > > stack backtrace: > > <4>[ 6.961059] CPU: 0 PID: 15 Comm: cpuhp/0 Not tainted 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 > > <4>[ 6.961062] Hardware name: Intel Corporation NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0047.2018.0718.1706 07/18/2018 > > <4>[ 6.961063] Call Trace: > > <4>[ 6.961064] <TASK> > > <4>[ 6.961065] dump_stack_lvl+0x56/0x7f > > <4>[ 6.961069] check_noncircular+0x132/0x150 > > <4>[ 6.961078] validate_chain+0xb3f/0x2000 > > <4>[ 6.961083] __lock_acquire+0x5a4/0xb70 > > <4>[ 6.961087] lock_acquire+0xd3/0x310 > > <4>[ 6.961088] ? blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.961093] down_read+0x39/0x140 > > <4>[ 6.961097] ? blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.961099] blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.961102] device_add+0x3c1/0x900 > > <4>[ 6.961106] ? dev_set_name+0x4e/0x70 > > <4>[ 6.961109] platform_device_add+0x108/0x240 > > <4>[ 6.961112] coretemp_cpu_online+0xe1/0x15e [coretemp] > > <4>[ 6.961117] ? create_core_data+0x550/0x550 [coretemp] > > <4>[ 6.961120] cpuhp_invoke_callback+0x181/0x8a0 > > <4>[ 6.961124] cpuhp_thread_fun+0x188/0x1f0 > > <4>[ 6.961129] ? smpboot_thread_fn+0x1e/0x260 > > <4>[ 6.961131] smpboot_thread_fn+0x1b5/0x260 > > <4>[ 6.961134] ? sort_range+0x20/0x20 > > <4>[ 6.961135] kthread+0xed/0x120 > > <4>[ 6.961137] ? 
kthread_complete_and_exit+0x20/0x20 > > <4>[ 6.961139] ret_from_fork+0x1f/0x30 > > <4>[ 6.961145] </TASK> > > > > Closes: https://gitlab.freedesktop.org/drm/intel/issues/6641 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> > > --- > > drivers/iommu/iova.c | 28 ---------------------------- > > include/linux/cpuhotplug.h | 1 - > > include/linux/iova.h | 1 - > > 3 files changed, 30 deletions(-) > > > > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > > index 47d1983dfa2a4..f0136d0231f06 100644 > > --- a/drivers/iommu/iova.c > > +++ b/drivers/iommu/iova.c > > @@ -31,16 +31,6 @@ unsigned long iova_rcache_range(void) > > return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1); > > } > > > > -static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node) > > -{ > > - struct iova_domain *iovad; > > - > > - iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead); > > - > > - free_cpu_cached_iovas(cpu, iovad); > > - return 0; > > -} > > - > > static void free_global_cached_iovas(struct iova_domain *iovad); > > > > static struct iova *to_iova(struct rb_node *node) > > @@ -255,21 +245,10 @@ int iova_cache_get(void) > > { > > mutex_lock(&iova_cache_mutex); > > if (!iova_cache_users) { > > - int ret; > > - > > - ret = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, "iommu/iova:dead", NULL, > > - iova_cpuhp_dead); > > - if (ret) { > > - mutex_unlock(&iova_cache_mutex); > > - pr_err("Couldn't register cpuhp handler\n"); > > - return ret; > > - } > > - > > iova_cache = kmem_cache_create( > > "iommu_iova", sizeof(struct iova), 0, > > SLAB_HWCACHE_ALIGN, NULL); > > if (!iova_cache) { > > - cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); > > mutex_unlock(&iova_cache_mutex); > > pr_err("Couldn't create iova cache\n"); > > return -ENOMEM; > > @@ -292,7 +271,6 @@ void iova_cache_put(void) > > } > > iova_cache_users--; > > if (!iova_cache_users) { > > - 
cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); > > kmem_cache_destroy(iova_cache); > > } > > mutex_unlock(&iova_cache_mutex); > > @@ -495,8 +473,6 @@ EXPORT_SYMBOL_GPL(free_iova_fast); > > > > static void iova_domain_free_rcaches(struct iova_domain *iovad) > > { > > - cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, > > - &iovad->cpuhp_dead); > > free_iova_rcaches(iovad); > > } > > > > @@ -755,10 +731,6 @@ int iova_domain_init_rcaches(struct iova_domain *iovad) > > } > > } > > > > - ret = cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, > > - &iovad->cpuhp_dead); > > - if (ret) > > - goto out_err; > > return 0; > > > > out_err: > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > > index f61447913db97..8f541a6b63e41 100644 > > --- a/include/linux/cpuhotplug.h > > +++ b/include/linux/cpuhotplug.h > > @@ -95,7 +95,6 @@ enum cpuhp_state { > > CPUHP_PAGE_ALLOC, > > CPUHP_NET_DEV_DEAD, > > CPUHP_PCI_XGENE_DEAD, > > - CPUHP_IOMMU_IOVA_DEAD, > > CPUHP_LUSTRE_CFS_DEAD, > > CPUHP_AP_ARM_CACHE_B15_RAC_DEAD, > > CPUHP_PADATA_DEAD, > > diff --git a/include/linux/iova.h b/include/linux/iova.h > > index c6ba6d95d79c2..fd77cd5bfa333 100644 > > --- a/include/linux/iova.h > > +++ b/include/linux/iova.h > > @@ -37,7 +37,6 @@ struct iova_domain { > > struct iova anchor; /* rbtree lookup anchor */ > > > > struct iova_rcache *rcaches; > > - struct hlist_node cpuhp_dead; > > }; > > > > static inline unsigned long iova_size(struct iova *iova) > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing
  2022-09-22 12:09 ` Robin Murphy
  2022-09-22 13:37   ` Janusz Krzysztofik
@ 2022-09-30 16:57   ` Janusz Krzysztofik
  2022-10-05 14:26     ` Thorsten Leemhuis
  1 sibling, 1 reply; 11+ messages in thread
From: Janusz Krzysztofik @ 2022-09-30 16:57 UTC (permalink / raw)
To: Lucas De Marchi, Robin Murphy
Cc: intel-gfx, Chris Wilson, Joerg Roedel, Will Deacon, iommu,
	linux-kernel, regressions

I think this issue can hit any user with a platform that loads iommu and
coretemp drivers.  Adding regressions@lists.linux.dev to the loop.

Thanks,
Janusz

On Thursday, 22 September 2022 14:09:35 CEST Robin Murphy wrote:
> On 22/09/2022 11:10 am, Janusz Krzysztofik wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to
> > flush rcaches"). It is trying to instantiate a cpuhp notifier from inside
> > a cpuhp callback. That code replaced intel_iommu implementation of
> > flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp
> > held for the module lifetime.
>
> OK, *now* I see what's going on. It doesn't seem unreasonable to me for
> bus notifiers to touch CPU hotplug - what seems more unexpected is the
> coretemp driver creating and adding a platform device from inside a
> hotplug callback.
>
> Once we start trying to revert multiple unrelated bits of important
> functionality from other subsystems because one driver is doing a weird
> thing, maybe it's time to instead question whether that driver should be
> doing a weird thing?
>
> Thanks,
> Robin.
> > > <4>[ 6.928112] ====================================================== > > <4>[ 6.928621] WARNING: possible circular locking dependency detected > > <4>[ 6.929225] 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 Not tainted > > <4>[ 6.929818] ------------------------------------------------------ > > <4>[ 6.930415] cpuhp/0/15 is trying to acquire lock: > > <4>[ 6.931011] ffff888100e02a78 (&(&priv->bus_notifier)->rwsem){++++}- {3:3}, at: blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.931533] > > but task is already holding lock: > > <4>[ 6.931534] ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > > <4>[ 6.933069] > > which lock already depends on the new lock. > > > > <4>[ 6.933070] > > the existing dependency chain (in reverse order) is: > > <4>[ 6.933071] > > -> #2 (cpuhp_state-up){+.+.}-{0:0}: > > <4>[ 6.933076] lock_acquire+0xd3/0x310 > > <4>[ 6.933079] cpuhp_thread_fun+0xa6/0x1f0 > > <4>[ 6.933082] smpboot_thread_fn+0x1b5/0x260 > > <4>[ 6.933084] kthread+0xed/0x120 > > <4>[ 6.933086] ret_from_fork+0x1f/0x30 > > <4>[ 6.933089] > > -> #1 (cpu_hotplug_lock){++++}-{0:0}: > > <4>[ 6.933092] lock_acquire+0xd3/0x310 > > <4>[ 6.933094] __cpuhp_state_add_instance+0x43/0x1c0 > > <4>[ 6.933096] iova_domain_init_rcaches+0x199/0x1c0 > > <4>[ 6.933099] iommu_setup_dma_ops+0x104/0x3d0 > > <4>[ 6.933101] iommu_probe_device+0xa4/0x180 > > <4>[ 6.933103] iommu_bus_notifier+0x2d/0x40 > > <4>[ 6.933105] notifier_call_chain+0x31/0x90 > > <4>[ 6.933108] blocking_notifier_call_chain+0x3a/0x50 > > <4>[ 6.933110] device_add+0x3c1/0x900 > > <4>[ 6.933112] pci_device_add+0x255/0x580 > > <4>[ 6.933115] pci_scan_single_device+0xa6/0xd0 > > <4>[ 6.933117] p2sb_bar+0x7f/0x220 > > <4>[ 6.933120] i801_add_tco_spt.isra.18+0x2b/0xca [i2c_i801] > > <4>[ 6.933124] i801_add_tco+0xb1/0xfe [i2c_i801] > > <4>[ 6.933126] i801_probe.cold.25+0xa9/0x3a7 [i2c_i801] > > <4>[ 6.933129] pci_device_probe+0x95/0x110 > > <4>[ 6.933132] really_probe+0xd6/0x350 > > <4>[ 6.933134] 
__driver_probe_device+0x73/0x170 > > <4>[ 6.933137] driver_probe_device+0x1a/0x90 > > <4>[ 6.933140] __driver_attach+0xbc/0x190 > > <4>[ 6.933141] bus_for_each_dev+0x72/0xc0 > > <4>[ 6.933143] bus_add_driver+0x1bb/0x210 > > <4>[ 6.933146] driver_register+0x66/0xc0 > > <4>[ 6.933147] wmi_bmof_probe+0x3b/0xac [wmi_bmof] > > <4>[ 6.933150] do_one_initcall+0x53/0x2f0 > > <4>[ 6.933152] do_init_module+0x45/0x1c0 > > <4>[ 6.933154] load_module+0x1cd5/0x1ec0 > > <4>[ 6.933156] __do_sys_finit_module+0xaf/0x120 > > <4>[ 6.933158] do_syscall_64+0x37/0x90 > > <4>[ 6.933160] entry_SYSCALL_64_after_hwframe+0x63/0xcd > > <4>[ 6.953757] > > -> #0 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}: > > <4>[ 6.953779] validate_chain+0xb3f/0x2000 > > <4>[ 6.953785] __lock_acquire+0x5a4/0xb70 > > <4>[ 6.953786] lock_acquire+0xd3/0x310 > > <4>[ 6.953787] down_read+0x39/0x140 > > <4>[ 6.953790] blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.953794] device_add+0x3c1/0x900 > > <4>[ 6.953797] platform_device_add+0x108/0x240 > > <4>[ 6.953799] coretemp_cpu_online+0xe1/0x15e [coretemp] > > <4>[ 6.953805] cpuhp_invoke_callback+0x181/0x8a0 > > <4>[ 6.958244] cpuhp_thread_fun+0x188/0x1f0 > > <4>[ 6.958267] smpboot_thread_fn+0x1b5/0x260 > > <4>[ 6.958270] kthread+0xed/0x120 > > <4>[ 6.958272] ret_from_fork+0x1f/0x30 > > <4>[ 6.958274] > > other info that might help us debug this: > > > > <4>[ 6.958275] Chain exists of: > > &(&priv->bus_notifier)->rwsem --> cpu_hotplug_lock -- > cpuhp_state-up > > > > <4>[ 6.961037] Possible unsafe locking scenario: > > > > <4>[ 6.961038] CPU0 CPU1 > > <4>[ 6.961038] ---- ---- > > <4>[ 6.961039] lock(cpuhp_state-up); > > <4>[ 6.961040] lock(cpu_hotplug_lock); > > <4>[ 6.961041] lock(cpuhp_state-up); > > <4>[ 6.961042] lock(&(&priv->bus_notifier)->rwsem); > > <4>[ 6.961044] > > *** DEADLOCK *** > > > > <4>[ 6.961044] 2 locks held by cpuhp/0/15: > > <4>[ 6.961046] #0: ffffffff82648f10 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > > <4>[ 6.961053] 
#1: ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > > <4>[ 6.961058] > > stack backtrace: > > <4>[ 6.961059] CPU: 0 PID: 15 Comm: cpuhp/0 Not tainted 6.0.0-rc6- CI_DRM_12164-ga1f63e144e54+ #1 > > <4>[ 6.961062] Hardware name: Intel Corporation NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0047.2018.0718.1706 07/18/2018 > > <4>[ 6.961063] Call Trace: > > <4>[ 6.961064] <TASK> > > <4>[ 6.961065] dump_stack_lvl+0x56/0x7f > > <4>[ 6.961069] check_noncircular+0x132/0x150 > > <4>[ 6.961078] validate_chain+0xb3f/0x2000 > > <4>[ 6.961083] __lock_acquire+0x5a4/0xb70 > > <4>[ 6.961087] lock_acquire+0xd3/0x310 > > <4>[ 6.961088] ? blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.961093] down_read+0x39/0x140 > > <4>[ 6.961097] ? blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.961099] blocking_notifier_call_chain+0x20/0x50 > > <4>[ 6.961102] device_add+0x3c1/0x900 > > <4>[ 6.961106] ? dev_set_name+0x4e/0x70 > > <4>[ 6.961109] platform_device_add+0x108/0x240 > > <4>[ 6.961112] coretemp_cpu_online+0xe1/0x15e [coretemp] > > <4>[ 6.961117] ? create_core_data+0x550/0x550 [coretemp] > > <4>[ 6.961120] cpuhp_invoke_callback+0x181/0x8a0 > > <4>[ 6.961124] cpuhp_thread_fun+0x188/0x1f0 > > <4>[ 6.961129] ? smpboot_thread_fn+0x1e/0x260 > > <4>[ 6.961131] smpboot_thread_fn+0x1b5/0x260 > > <4>[ 6.961134] ? sort_range+0x20/0x20 > > <4>[ 6.961135] kthread+0xed/0x120 > > <4>[ 6.961137] ? 
kthread_complete_and_exit+0x20/0x20 > > <4>[ 6.961139] ret_from_fork+0x1f/0x30 > > <4>[ 6.961145] </TASK> > > > > Closes: https://gitlab.freedesktop.org/drm/intel/issues/6641 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> > > --- > > drivers/iommu/iova.c | 28 ---------------------------- > > include/linux/cpuhotplug.h | 1 - > > include/linux/iova.h | 1 - > > 3 files changed, 30 deletions(-) > > > > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > > index 47d1983dfa2a4..f0136d0231f06 100644 > > --- a/drivers/iommu/iova.c > > +++ b/drivers/iommu/iova.c > > @@ -31,16 +31,6 @@ unsigned long iova_rcache_range(void) > > return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1); > > } > > > > -static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node) > > -{ > > - struct iova_domain *iovad; > > - > > - iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead); > > - > > - free_cpu_cached_iovas(cpu, iovad); > > - return 0; > > -} > > - > > static void free_global_cached_iovas(struct iova_domain *iovad); > > > > static struct iova *to_iova(struct rb_node *node) > > @@ -255,21 +245,10 @@ int iova_cache_get(void) > > { > > mutex_lock(&iova_cache_mutex); > > if (!iova_cache_users) { > > - int ret; > > - > > - ret = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, "iommu/iova:dead", NULL, > > - iova_cpuhp_dead); > > - if (ret) { > > - mutex_unlock(&iova_cache_mutex); > > - pr_err("Couldn't register cpuhp handler\n"); > > - return ret; > > - } > > - > > iova_cache = kmem_cache_create( > > "iommu_iova", sizeof(struct iova), 0, > > SLAB_HWCACHE_ALIGN, NULL); > > if (!iova_cache) { > > - cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); > > mutex_unlock(&iova_cache_mutex); > > pr_err("Couldn't create iova cache\n"); > > return -ENOMEM; > > @@ -292,7 +271,6 @@ void iova_cache_put(void) > > } > > iova_cache_users--; > > if (!iova_cache_users) { > > - 
cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); > > kmem_cache_destroy(iova_cache); > > } > > mutex_unlock(&iova_cache_mutex); > > @@ -495,8 +473,6 @@ EXPORT_SYMBOL_GPL(free_iova_fast); > > > > static void iova_domain_free_rcaches(struct iova_domain *iovad) > > { > > - cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, > > - &iovad->cpuhp_dead); > > free_iova_rcaches(iovad); > > } > > > > @@ -755,10 +731,6 @@ int iova_domain_init_rcaches(struct iova_domain *iovad) > > } > > } > > > > - ret = cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, > > - &iovad->cpuhp_dead); > > - if (ret) > > - goto out_err; > > return 0; > > > > out_err: > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > > index f61447913db97..8f541a6b63e41 100644 > > --- a/include/linux/cpuhotplug.h > > +++ b/include/linux/cpuhotplug.h > > @@ -95,7 +95,6 @@ enum cpuhp_state { > > CPUHP_PAGE_ALLOC, > > CPUHP_NET_DEV_DEAD, > > CPUHP_PCI_XGENE_DEAD, > > - CPUHP_IOMMU_IOVA_DEAD, > > CPUHP_LUSTRE_CFS_DEAD, > > CPUHP_AP_ARM_CACHE_B15_RAC_DEAD, > > CPUHP_PADATA_DEAD, > > diff --git a/include/linux/iova.h b/include/linux/iova.h > > index c6ba6d95d79c2..fd77cd5bfa333 100644 > > --- a/include/linux/iova.h > > +++ b/include/linux/iova.h > > @@ -37,7 +37,6 @@ struct iova_domain { > > struct iova anchor; /* rbtree lookup anchor */ > > > > struct iova_rcache *rcaches; > > - struct hlist_node cpuhp_dead; > > }; > > > > static inline unsigned long iova_size(struct iova *iova) >
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing 2022-09-30 16:57 ` Janusz Krzysztofik @ 2022-10-05 14:26 ` Thorsten Leemhuis 2022-10-05 15:25 ` Guenter Roeck 0 siblings, 1 reply; 11+ messages in thread From: Thorsten Leemhuis @ 2022-10-05 14:26 UTC (permalink / raw) To: Fenghua Yu Cc: intel-gfx, Chris Wilson, Joerg Roedel, Will Deacon, iommu, linux-kernel, regressions, Janusz Krzysztofik, Lucas De Marchi, Robin Murphy, linux-hwmon [adding the coretemp maintainer (Fenghua Yu) and the appropriate mailing list to the list of recipients, as there apparently is a coretemp bug that results in a iommu change causing a regression] On 30.09.22 18:57, Janusz Krzysztofik wrote: > I think this issue can hit any user with a platform that loads iommu and > coretemp drivers. Adding regressions@lists.linux.dev to the loop. f598a497bc7d was merged for 5.13-rc1, which is quite a while ago, so at least a quick revert is out of question as it might do more harm than good. The authors of the commit are kinda responsible for fixing situations like this; but well, did anybody ask the developers of the coretemp driver kindly if they are aware of the problem and maybe even willing to fix it? Doesn't look like it from here from search lore (hope I didn't miss anything), so let's give it a try. Ciao, Thorsten > On Thursday, 22 September 2022 14:09:35 CEST Robin Murphy wrote: >> On 22/09/2022 11:10 am, Janusz Krzysztofik wrote: >>> From: Chris Wilson <chris@chris-wilson.co.uk> >>> >>> Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to >>> flush rcaches"). It is trying to instantiate a cpuhp notifier from inside >>> a cpuhp callback. That code replaced intel_iommu implementation of >>> flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp >>> held for the module lifetime. >> >> OK, *now* I see what's going on. 
It doesn't seem unreasonable to me for >> bus notifiers to touch CPU hotplug - what seems more unexpected is the >> coretemp driver creating and adding a platform device from inside a >> hotplug callback. >> >> Once we start trying to revert multiple unrelated bits of important >> functionality from other subsystems because one driver is doing a weird >> thing, maybe it's time to instead question whether that driver should be >> doing a weird thing? >> >> Thanks, >> Robin. >> >>> <4>[ 6.928112] ====================================================== >>> <4>[ 6.928621] WARNING: possible circular locking dependency detected >>> <4>[ 6.929225] 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 Not tainted >>> <4>[ 6.929818] ------------------------------------------------------ >>> <4>[ 6.930415] cpuhp/0/15 is trying to acquire lock: >>> <4>[ 6.931011] ffff888100e02a78 (&(&priv->bus_notifier)->rwsem){++++}- > {3:3}, at: blocking_notifier_call_chain+0x20/0x50 >>> <4>[ 6.931533] >>> but task is already holding lock: >>> <4>[ 6.931534] ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: > cpuhp_thread_fun+0x48/0x1f0 >>> <4>[ 6.933069] >>> which lock already depends on the new lock. 
>>> >>> <4>[ 6.933070] >>> the existing dependency chain (in reverse order) is: >>> <4>[ 6.933071] >>> -> #2 (cpuhp_state-up){+.+.}-{0:0}: >>> <4>[ 6.933076] lock_acquire+0xd3/0x310 >>> <4>[ 6.933079] cpuhp_thread_fun+0xa6/0x1f0 >>> <4>[ 6.933082] smpboot_thread_fn+0x1b5/0x260 >>> <4>[ 6.933084] kthread+0xed/0x120 >>> <4>[ 6.933086] ret_from_fork+0x1f/0x30 >>> <4>[ 6.933089] >>> -> #1 (cpu_hotplug_lock){++++}-{0:0}: >>> <4>[ 6.933092] lock_acquire+0xd3/0x310 >>> <4>[ 6.933094] __cpuhp_state_add_instance+0x43/0x1c0 >>> <4>[ 6.933096] iova_domain_init_rcaches+0x199/0x1c0 >>> <4>[ 6.933099] iommu_setup_dma_ops+0x104/0x3d0 >>> <4>[ 6.933101] iommu_probe_device+0xa4/0x180 >>> <4>[ 6.933103] iommu_bus_notifier+0x2d/0x40 >>> <4>[ 6.933105] notifier_call_chain+0x31/0x90 >>> <4>[ 6.933108] blocking_notifier_call_chain+0x3a/0x50 >>> <4>[ 6.933110] device_add+0x3c1/0x900 >>> <4>[ 6.933112] pci_device_add+0x255/0x580 >>> <4>[ 6.933115] pci_scan_single_device+0xa6/0xd0 >>> <4>[ 6.933117] p2sb_bar+0x7f/0x220 >>> <4>[ 6.933120] i801_add_tco_spt.isra.18+0x2b/0xca [i2c_i801] >>> <4>[ 6.933124] i801_add_tco+0xb1/0xfe [i2c_i801] >>> <4>[ 6.933126] i801_probe.cold.25+0xa9/0x3a7 [i2c_i801] >>> <4>[ 6.933129] pci_device_probe+0x95/0x110 >>> <4>[ 6.933132] really_probe+0xd6/0x350 >>> <4>[ 6.933134] __driver_probe_device+0x73/0x170 >>> <4>[ 6.933137] driver_probe_device+0x1a/0x90 >>> <4>[ 6.933140] __driver_attach+0xbc/0x190 >>> <4>[ 6.933141] bus_for_each_dev+0x72/0xc0 >>> <4>[ 6.933143] bus_add_driver+0x1bb/0x210 >>> <4>[ 6.933146] driver_register+0x66/0xc0 >>> <4>[ 6.933147] wmi_bmof_probe+0x3b/0xac [wmi_bmof] >>> <4>[ 6.933150] do_one_initcall+0x53/0x2f0 >>> <4>[ 6.933152] do_init_module+0x45/0x1c0 >>> <4>[ 6.933154] load_module+0x1cd5/0x1ec0 >>> <4>[ 6.933156] __do_sys_finit_module+0xaf/0x120 >>> <4>[ 6.933158] do_syscall_64+0x37/0x90 >>> <4>[ 6.933160] entry_SYSCALL_64_after_hwframe+0x63/0xcd >>> <4>[ 6.953757] >>> -> #0 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}: >>> <4>[ 
6.953779] validate_chain+0xb3f/0x2000 >>> <4>[ 6.953785] __lock_acquire+0x5a4/0xb70 >>> <4>[ 6.953786] lock_acquire+0xd3/0x310 >>> <4>[ 6.953787] down_read+0x39/0x140 >>> <4>[ 6.953790] blocking_notifier_call_chain+0x20/0x50 >>> <4>[ 6.953794] device_add+0x3c1/0x900 >>> <4>[ 6.953797] platform_device_add+0x108/0x240 >>> <4>[ 6.953799] coretemp_cpu_online+0xe1/0x15e [coretemp] >>> <4>[ 6.953805] cpuhp_invoke_callback+0x181/0x8a0 >>> <4>[ 6.958244] cpuhp_thread_fun+0x188/0x1f0 >>> <4>[ 6.958267] smpboot_thread_fn+0x1b5/0x260 >>> <4>[ 6.958270] kthread+0xed/0x120 >>> <4>[ 6.958272] ret_from_fork+0x1f/0x30 >>> <4>[ 6.958274] >>> other info that might help us debug this: >>> >>> <4>[ 6.958275] Chain exists of: >>> &(&priv->bus_notifier)->rwsem --> cpu_hotplug_lock -- >> cpuhp_state-up >>> >>> <4>[ 6.961037] Possible unsafe locking scenario: >>> >>> <4>[ 6.961038] CPU0 CPU1 >>> <4>[ 6.961038] ---- ---- >>> <4>[ 6.961039] lock(cpuhp_state-up); >>> <4>[ 6.961040] lock(cpu_hotplug_lock); >>> <4>[ 6.961041] lock(cpuhp_state-up); >>> <4>[ 6.961042] lock(&(&priv->bus_notifier)->rwsem); >>> <4>[ 6.961044] >>> *** DEADLOCK *** >>> >>> <4>[ 6.961044] 2 locks held by cpuhp/0/15: >>> <4>[ 6.961046] #0: ffffffff82648f10 (cpu_hotplug_lock){++++}-{0:0}, > at: cpuhp_thread_fun+0x48/0x1f0 >>> <4>[ 6.961053] #1: ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: > cpuhp_thread_fun+0x48/0x1f0 >>> <4>[ 6.961058] >>> stack backtrace: >>> <4>[ 6.961059] CPU: 0 PID: 15 Comm: cpuhp/0 Not tainted 6.0.0-rc6- > CI_DRM_12164-ga1f63e144e54+ #1 >>> <4>[ 6.961062] Hardware name: Intel Corporation NUC8i7HVK/NUC8i7HVB, > BIOS HNKBLi70.86A.0047.2018.0718.1706 07/18/2018 >>> <4>[ 6.961063] Call Trace: >>> <4>[ 6.961064] <TASK> >>> <4>[ 6.961065] dump_stack_lvl+0x56/0x7f >>> <4>[ 6.961069] check_noncircular+0x132/0x150 >>> <4>[ 6.961078] validate_chain+0xb3f/0x2000 >>> <4>[ 6.961083] __lock_acquire+0x5a4/0xb70 >>> <4>[ 6.961087] lock_acquire+0xd3/0x310 >>> <4>[ 6.961088] ? 
blocking_notifier_call_chain+0x20/0x50 >>> <4>[ 6.961093] down_read+0x39/0x140 >>> <4>[ 6.961097] ? blocking_notifier_call_chain+0x20/0x50 >>> <4>[ 6.961099] blocking_notifier_call_chain+0x20/0x50 >>> <4>[ 6.961102] device_add+0x3c1/0x900 >>> <4>[ 6.961106] ? dev_set_name+0x4e/0x70 >>> <4>[ 6.961109] platform_device_add+0x108/0x240 >>> <4>[ 6.961112] coretemp_cpu_online+0xe1/0x15e [coretemp] >>> <4>[ 6.961117] ? create_core_data+0x550/0x550 [coretemp] >>> <4>[ 6.961120] cpuhp_invoke_callback+0x181/0x8a0 >>> <4>[ 6.961124] cpuhp_thread_fun+0x188/0x1f0 >>> <4>[ 6.961129] ? smpboot_thread_fn+0x1e/0x260 >>> <4>[ 6.961131] smpboot_thread_fn+0x1b5/0x260 >>> <4>[ 6.961134] ? sort_range+0x20/0x20 >>> <4>[ 6.961135] kthread+0xed/0x120 >>> <4>[ 6.961137] ? kthread_complete_and_exit+0x20/0x20 >>> <4>[ 6.961139] ret_from_fork+0x1f/0x30 >>> <4>[ 6.961145] </TASK> >>> >>> Closes: https://gitlab.freedesktop.org/drm/intel/issues/6641 >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >>> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> >>> --- >>> drivers/iommu/iova.c | 28 ---------------------------- >>> include/linux/cpuhotplug.h | 1 - >>> include/linux/iova.h | 1 - >>> 3 files changed, 30 deletions(-) >>> >>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c >>> index 47d1983dfa2a4..f0136d0231f06 100644 >>> --- a/drivers/iommu/iova.c >>> +++ b/drivers/iommu/iova.c >>> @@ -31,16 +31,6 @@ unsigned long iova_rcache_range(void) >>> return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1); >>> } >>> >>> -static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node) >>> -{ >>> - struct iova_domain *iovad; >>> - >>> - iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead); >>> - >>> - free_cpu_cached_iovas(cpu, iovad); >>> - return 0; >>> -} >>> - >>> static void free_global_cached_iovas(struct iova_domain *iovad); >>> >>> static struct iova *to_iova(struct rb_node *node) >>> @@ -255,21 +245,10 @@ int iova_cache_get(void) >>> { >>> 
mutex_lock(&iova_cache_mutex); >>> if (!iova_cache_users) { >>> - int ret; >>> - >>> - ret = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, > "iommu/iova:dead", NULL, >>> - iova_cpuhp_dead); >>> - if (ret) { >>> - mutex_unlock(&iova_cache_mutex); >>> - pr_err("Couldn't register cpuhp handler\n"); >>> - return ret; >>> - } >>> - >>> iova_cache = kmem_cache_create( >>> "iommu_iova", sizeof(struct iova), 0, >>> SLAB_HWCACHE_ALIGN, NULL); >>> if (!iova_cache) { >>> - > cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); >>> mutex_unlock(&iova_cache_mutex); >>> pr_err("Couldn't create iova cache\n"); >>> return -ENOMEM; >>> @@ -292,7 +271,6 @@ void iova_cache_put(void) >>> } >>> iova_cache_users--; >>> if (!iova_cache_users) { >>> - cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); >>> kmem_cache_destroy(iova_cache); >>> } >>> mutex_unlock(&iova_cache_mutex); >>> @@ -495,8 +473,6 @@ EXPORT_SYMBOL_GPL(free_iova_fast); >>> >>> static void iova_domain_free_rcaches(struct iova_domain *iovad) >>> { >>> - cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, >>> - &iovad- >> cpuhp_dead); >>> free_iova_rcaches(iovad); >>> } >>> >>> @@ -755,10 +731,6 @@ int iova_domain_init_rcaches(struct iova_domain > *iovad) >>> } >>> } >>> >>> - ret = cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, >>> - &iovad- >> cpuhp_dead); >>> - if (ret) >>> - goto out_err; >>> return 0; >>> >>> out_err: >>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h >>> index f61447913db97..8f541a6b63e41 100644 >>> --- a/include/linux/cpuhotplug.h >>> +++ b/include/linux/cpuhotplug.h >>> @@ -95,7 +95,6 @@ enum cpuhp_state { >>> CPUHP_PAGE_ALLOC, >>> CPUHP_NET_DEV_DEAD, >>> CPUHP_PCI_XGENE_DEAD, >>> - CPUHP_IOMMU_IOVA_DEAD, >>> CPUHP_LUSTRE_CFS_DEAD, >>> CPUHP_AP_ARM_CACHE_B15_RAC_DEAD, >>> CPUHP_PADATA_DEAD, >>> diff --git a/include/linux/iova.h b/include/linux/iova.h >>> index c6ba6d95d79c2..fd77cd5bfa333 100644 >>> --- a/include/linux/iova.h >>> +++ b/include/linux/iova.h >>> @@ 
-37,7 +37,6 @@ struct iova_domain { >>> struct iova anchor; /* rbtree lookup anchor > */ >>> >>> struct iova_rcache *rcaches; >>> - struct hlist_node cpuhp_dead; >>> }; >>> >>> static inline unsigned long iova_size(struct iova *iova) >> > > > > > >
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing 2022-10-05 14:26 ` Thorsten Leemhuis @ 2022-10-05 15:25 ` Guenter Roeck 2022-10-05 16:15 ` Robin Murphy 0 siblings, 1 reply; 11+ messages in thread From: Guenter Roeck @ 2022-10-05 15:25 UTC (permalink / raw) To: Thorsten Leemhuis Cc: Fenghua Yu, intel-gfx, Chris Wilson, Joerg Roedel, Will Deacon, iommu, linux-kernel, regressions, Janusz Krzysztofik, Lucas De Marchi, Robin Murphy, linux-hwmon On Wed, Oct 05, 2022 at 04:26:28PM +0200, Thorsten Leemhuis wrote: > [adding the coretemp maintainer (Fenghua Yu) and the appropriate mailing > list to the list of recipients, as there apparently is a coretemp bug > that results in a iommu change causing a regression] > > On 30.09.22 18:57, Janusz Krzysztofik wrote: > > I think this issue can hit any user with a platform that loads iommu and > > coretemp drivers. Adding regressions@lists.linux.dev to the loop. > > f598a497bc7d was merged for 5.13-rc1, which is quite a while ago, so at > least a quick revert is out of question as it might do more harm than > good. The authors of the commit are kinda responsible for fixing > situations like this; but well, did anybody ask the developers of the > coretemp driver kindly if they are aware of the problem and maybe even > willing to fix it? Doesn't look like it from here from search lore (hope > I didn't miss anything), so let's give it a try. > > Ciao, Thorsten > > > On Thursday, 22 September 2022 14:09:35 CEST Robin Murphy wrote: > >> On 22/09/2022 11:10 am, Janusz Krzysztofik wrote: > >>> From: Chris Wilson <chris@chris-wilson.co.uk> > >>> > >>> Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to > >>> flush rcaches"). It is trying to instantiate a cpuhp notifier from inside > >>> a cpuhp callback. That code replaced intel_iommu implementation of > >>> flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp > >>> held for the module lifetime. 
> >> > >> OK, *now* I see what's going on. It doesn't seem unreasonable to me for > >> bus notifiers to touch CPU hotplug - what seems more unexpected is the > >> coretemp driver creating and adding a platform device from inside a > >> hotplug callback. It is only unexpected if it is documented that creating a platform driver from a hotplug callback is off limits. > >> > >> Once we start trying to revert multiple unrelated bits of important > >> functionality from other subsystems because one driver is doing a weird > >> thing, maybe it's time to instead question whether that driver should be > >> doing a weird thing? That isn't the point. This _used_ to work, after all. Maybe the functionality introduced with f598a497bc7d is important, but there is still a regression introduced by f598a497bc7d. Sure, maybe the coretemp driver is doing "a weird thing", but if some generic code is changed causing something to fail that previously worked, it is still a regression and the reponsibility of the person or team making the generic code change to fix the problems caused by that change. Guenter > >> > >> Thanks, > >> Robin. > >> > >>> <4>[ 6.928112] ====================================================== > >>> <4>[ 6.928621] WARNING: possible circular locking dependency detected > >>> <4>[ 6.929225] 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 Not tainted > >>> <4>[ 6.929818] ------------------------------------------------------ > >>> <4>[ 6.930415] cpuhp/0/15 is trying to acquire lock: > >>> <4>[ 6.931011] ffff888100e02a78 (&(&priv->bus_notifier)->rwsem){++++}- > > {3:3}, at: blocking_notifier_call_chain+0x20/0x50 > >>> <4>[ 6.931533] > >>> but task is already holding lock: > >>> <4>[ 6.931534] ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: > > cpuhp_thread_fun+0x48/0x1f0 > >>> <4>[ 6.933069] > >>> which lock already depends on the new lock. 
> >>> > >>> <4>[ 6.933070] > >>> the existing dependency chain (in reverse order) is: > >>> <4>[ 6.933071] > >>> -> #2 (cpuhp_state-up){+.+.}-{0:0}: > >>> <4>[ 6.933076] lock_acquire+0xd3/0x310 > >>> <4>[ 6.933079] cpuhp_thread_fun+0xa6/0x1f0 > >>> <4>[ 6.933082] smpboot_thread_fn+0x1b5/0x260 > >>> <4>[ 6.933084] kthread+0xed/0x120 > >>> <4>[ 6.933086] ret_from_fork+0x1f/0x30 > >>> <4>[ 6.933089] > >>> -> #1 (cpu_hotplug_lock){++++}-{0:0}: > >>> <4>[ 6.933092] lock_acquire+0xd3/0x310 > >>> <4>[ 6.933094] __cpuhp_state_add_instance+0x43/0x1c0 > >>> <4>[ 6.933096] iova_domain_init_rcaches+0x199/0x1c0 > >>> <4>[ 6.933099] iommu_setup_dma_ops+0x104/0x3d0 > >>> <4>[ 6.933101] iommu_probe_device+0xa4/0x180 > >>> <4>[ 6.933103] iommu_bus_notifier+0x2d/0x40 > >>> <4>[ 6.933105] notifier_call_chain+0x31/0x90 > >>> <4>[ 6.933108] blocking_notifier_call_chain+0x3a/0x50 > >>> <4>[ 6.933110] device_add+0x3c1/0x900 > >>> <4>[ 6.933112] pci_device_add+0x255/0x580 > >>> <4>[ 6.933115] pci_scan_single_device+0xa6/0xd0 > >>> <4>[ 6.933117] p2sb_bar+0x7f/0x220 > >>> <4>[ 6.933120] i801_add_tco_spt.isra.18+0x2b/0xca [i2c_i801] > >>> <4>[ 6.933124] i801_add_tco+0xb1/0xfe [i2c_i801] > >>> <4>[ 6.933126] i801_probe.cold.25+0xa9/0x3a7 [i2c_i801] > >>> <4>[ 6.933129] pci_device_probe+0x95/0x110 > >>> <4>[ 6.933132] really_probe+0xd6/0x350 > >>> <4>[ 6.933134] __driver_probe_device+0x73/0x170 > >>> <4>[ 6.933137] driver_probe_device+0x1a/0x90 > >>> <4>[ 6.933140] __driver_attach+0xbc/0x190 > >>> <4>[ 6.933141] bus_for_each_dev+0x72/0xc0 > >>> <4>[ 6.933143] bus_add_driver+0x1bb/0x210 > >>> <4>[ 6.933146] driver_register+0x66/0xc0 > >>> <4>[ 6.933147] wmi_bmof_probe+0x3b/0xac [wmi_bmof] > >>> <4>[ 6.933150] do_one_initcall+0x53/0x2f0 > >>> <4>[ 6.933152] do_init_module+0x45/0x1c0 > >>> <4>[ 6.933154] load_module+0x1cd5/0x1ec0 > >>> <4>[ 6.933156] __do_sys_finit_module+0xaf/0x120 > >>> <4>[ 6.933158] do_syscall_64+0x37/0x90 > >>> <4>[ 6.933160] entry_SYSCALL_64_after_hwframe+0x63/0xcd > >>> 
<4>[ 6.953757] > >>> -> #0 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}: > >>> <4>[ 6.953779] validate_chain+0xb3f/0x2000 > >>> <4>[ 6.953785] __lock_acquire+0x5a4/0xb70 > >>> <4>[ 6.953786] lock_acquire+0xd3/0x310 > >>> <4>[ 6.953787] down_read+0x39/0x140 > >>> <4>[ 6.953790] blocking_notifier_call_chain+0x20/0x50 > >>> <4>[ 6.953794] device_add+0x3c1/0x900 > >>> <4>[ 6.953797] platform_device_add+0x108/0x240 > >>> <4>[ 6.953799] coretemp_cpu_online+0xe1/0x15e [coretemp] > >>> <4>[ 6.953805] cpuhp_invoke_callback+0x181/0x8a0 > >>> <4>[ 6.958244] cpuhp_thread_fun+0x188/0x1f0 > >>> <4>[ 6.958267] smpboot_thread_fn+0x1b5/0x260 > >>> <4>[ 6.958270] kthread+0xed/0x120 > >>> <4>[ 6.958272] ret_from_fork+0x1f/0x30 > >>> <4>[ 6.958274] > >>> other info that might help us debug this: > >>> > >>> <4>[ 6.958275] Chain exists of: > >>> &(&priv->bus_notifier)->rwsem --> cpu_hotplug_lock -- > >> cpuhp_state-up > >>> > >>> <4>[ 6.961037] Possible unsafe locking scenario: > >>> > >>> <4>[ 6.961038] CPU0 CPU1 > >>> <4>[ 6.961038] ---- ---- > >>> <4>[ 6.961039] lock(cpuhp_state-up); > >>> <4>[ 6.961040] lock(cpu_hotplug_lock); > >>> <4>[ 6.961041] lock(cpuhp_state-up); > >>> <4>[ 6.961042] lock(&(&priv->bus_notifier)->rwsem); > >>> <4>[ 6.961044] > >>> *** DEADLOCK *** > >>> > >>> <4>[ 6.961044] 2 locks held by cpuhp/0/15: > >>> <4>[ 6.961046] #0: ffffffff82648f10 (cpu_hotplug_lock){++++}-{0:0}, > > at: cpuhp_thread_fun+0x48/0x1f0 > >>> <4>[ 6.961053] #1: ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: > > cpuhp_thread_fun+0x48/0x1f0 > >>> <4>[ 6.961058] > >>> stack backtrace: > >>> <4>[ 6.961059] CPU: 0 PID: 15 Comm: cpuhp/0 Not tainted 6.0.0-rc6- > > CI_DRM_12164-ga1f63e144e54+ #1 > >>> <4>[ 6.961062] Hardware name: Intel Corporation NUC8i7HVK/NUC8i7HVB, > > BIOS HNKBLi70.86A.0047.2018.0718.1706 07/18/2018 > >>> <4>[ 6.961063] Call Trace: > >>> <4>[ 6.961064] <TASK> > >>> <4>[ 6.961065] dump_stack_lvl+0x56/0x7f > >>> <4>[ 6.961069] check_noncircular+0x132/0x150 > >>> <4>[ 
6.961078] validate_chain+0xb3f/0x2000 > >>> <4>[ 6.961083] __lock_acquire+0x5a4/0xb70 > >>> <4>[ 6.961087] lock_acquire+0xd3/0x310 > >>> <4>[ 6.961088] ? blocking_notifier_call_chain+0x20/0x50 > >>> <4>[ 6.961093] down_read+0x39/0x140 > >>> <4>[ 6.961097] ? blocking_notifier_call_chain+0x20/0x50 > >>> <4>[ 6.961099] blocking_notifier_call_chain+0x20/0x50 > >>> <4>[ 6.961102] device_add+0x3c1/0x900 > >>> <4>[ 6.961106] ? dev_set_name+0x4e/0x70 > >>> <4>[ 6.961109] platform_device_add+0x108/0x240 > >>> <4>[ 6.961112] coretemp_cpu_online+0xe1/0x15e [coretemp] > >>> <4>[ 6.961117] ? create_core_data+0x550/0x550 [coretemp] > >>> <4>[ 6.961120] cpuhp_invoke_callback+0x181/0x8a0 > >>> <4>[ 6.961124] cpuhp_thread_fun+0x188/0x1f0 > >>> <4>[ 6.961129] ? smpboot_thread_fn+0x1e/0x260 > >>> <4>[ 6.961131] smpboot_thread_fn+0x1b5/0x260 > >>> <4>[ 6.961134] ? sort_range+0x20/0x20 > >>> <4>[ 6.961135] kthread+0xed/0x120 > >>> <4>[ 6.961137] ? kthread_complete_and_exit+0x20/0x20 > >>> <4>[ 6.961139] ret_from_fork+0x1f/0x30 > >>> <4>[ 6.961145] </TASK> > >>> > >>> Closes: https://gitlab.freedesktop.org/drm/intel/issues/6641 > >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > >>> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> > >>> --- > >>> drivers/iommu/iova.c | 28 ---------------------------- > >>> include/linux/cpuhotplug.h | 1 - > >>> include/linux/iova.h | 1 - > >>> 3 files changed, 30 deletions(-) > >>> > >>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > >>> index 47d1983dfa2a4..f0136d0231f06 100644 > >>> --- a/drivers/iommu/iova.c > >>> +++ b/drivers/iommu/iova.c > >>> @@ -31,16 +31,6 @@ unsigned long iova_rcache_range(void) > >>> return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1); > >>> } > >>> > >>> -static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node) > >>> -{ > >>> - struct iova_domain *iovad; > >>> - > >>> - iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead); > >>> - > >>> - 
free_cpu_cached_iovas(cpu, iovad); > >>> - return 0; > >>> -} > >>> - > >>> static void free_global_cached_iovas(struct iova_domain *iovad); > >>> > >>> static struct iova *to_iova(struct rb_node *node) > >>> @@ -255,21 +245,10 @@ int iova_cache_get(void) > >>> { > >>> mutex_lock(&iova_cache_mutex); > >>> if (!iova_cache_users) { > >>> - int ret; > >>> - > >>> - ret = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, > > "iommu/iova:dead", NULL, > >>> - iova_cpuhp_dead); > >>> - if (ret) { > >>> - mutex_unlock(&iova_cache_mutex); > >>> - pr_err("Couldn't register cpuhp handler\n"); > >>> - return ret; > >>> - } > >>> - > >>> iova_cache = kmem_cache_create( > >>> "iommu_iova", sizeof(struct iova), 0, > >>> SLAB_HWCACHE_ALIGN, NULL); > >>> if (!iova_cache) { > >>> - > > cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); > >>> mutex_unlock(&iova_cache_mutex); > >>> pr_err("Couldn't create iova cache\n"); > >>> return -ENOMEM; > >>> @@ -292,7 +271,6 @@ void iova_cache_put(void) > >>> } > >>> iova_cache_users--; > >>> if (!iova_cache_users) { > >>> - cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); > >>> kmem_cache_destroy(iova_cache); > >>> } > >>> mutex_unlock(&iova_cache_mutex); > >>> @@ -495,8 +473,6 @@ EXPORT_SYMBOL_GPL(free_iova_fast); > >>> > >>> static void iova_domain_free_rcaches(struct iova_domain *iovad) > >>> { > >>> - cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, > >>> - &iovad- > >> cpuhp_dead); > >>> free_iova_rcaches(iovad); > >>> } > >>> > >>> @@ -755,10 +731,6 @@ int iova_domain_init_rcaches(struct iova_domain > > *iovad) > >>> } > >>> } > >>> > >>> - ret = cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, > >>> - &iovad- > >> cpuhp_dead); > >>> - if (ret) > >>> - goto out_err; > >>> return 0; > >>> > >>> out_err: > >>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > >>> index f61447913db97..8f541a6b63e41 100644 > >>> --- a/include/linux/cpuhotplug.h > >>> +++ b/include/linux/cpuhotplug.h > >>> @@ -95,7 +95,6 @@ 
enum cpuhp_state { > >>> CPUHP_PAGE_ALLOC, > >>> CPUHP_NET_DEV_DEAD, > >>> CPUHP_PCI_XGENE_DEAD, > >>> - CPUHP_IOMMU_IOVA_DEAD, > >>> CPUHP_LUSTRE_CFS_DEAD, > >>> CPUHP_AP_ARM_CACHE_B15_RAC_DEAD, > >>> CPUHP_PADATA_DEAD, > >>> diff --git a/include/linux/iova.h b/include/linux/iova.h > >>> index c6ba6d95d79c2..fd77cd5bfa333 100644 > >>> --- a/include/linux/iova.h > >>> +++ b/include/linux/iova.h > >>> @@ -37,7 +37,6 @@ struct iova_domain { > >>> struct iova anchor; /* rbtree lookup anchor > > */ > >>> > >>> struct iova_rcache *rcaches; > >>> - struct hlist_node cpuhp_dead; > >>> }; > >>> > >>> static inline unsigned long iova_size(struct iova *iova) > >> > > > > > > > > > > > >
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing 2022-10-05 15:25 ` Guenter Roeck @ 2022-10-05 16:15 ` Robin Murphy 2022-10-05 17:11 ` Guenter Roeck 0 siblings, 1 reply; 11+ messages in thread From: Robin Murphy @ 2022-10-05 16:15 UTC (permalink / raw) To: Guenter Roeck, Thorsten Leemhuis Cc: Fenghua Yu, intel-gfx, Chris Wilson, Joerg Roedel, Will Deacon, iommu, linux-kernel, regressions, Janusz Krzysztofik, Lucas De Marchi, linux-hwmon On 2022-10-05 16:25, Guenter Roeck wrote: > On Wed, Oct 05, 2022 at 04:26:28PM +0200, Thorsten Leemhuis wrote: >> [adding the coretemp maintainer (Fenghua Yu) and the appropriate mailing >> list to the list of recipients, as there apparently is a coretemp bug >> that results in a iommu change causing a regression] >> >> On 30.09.22 18:57, Janusz Krzysztofik wrote: >>> I think this issue can hit any user with a platform that loads iommu and >>> coretemp drivers. Adding regressions@lists.linux.dev to the loop. >> >> f598a497bc7d was merged for 5.13-rc1, which is quite a while ago, so at >> least a quick revert is out of question as it might do more harm than >> good. The authors of the commit are kinda responsible for fixing >> situations like this; but well, did anybody ask the developers of the >> coretemp driver kindly if they are aware of the problem and maybe even >> willing to fix it? Doesn't look like it from here from search lore (hope >> I didn't miss anything), so let's give it a try. >> >> Ciao, Thorsten >> >>> On Thursday, 22 September 2022 14:09:35 CEST Robin Murphy wrote: >>>> On 22/09/2022 11:10 am, Janusz Krzysztofik wrote: >>>>> From: Chris Wilson <chris@chris-wilson.co.uk> >>>>> >>>>> Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to >>>>> flush rcaches"). It is trying to instantiate a cpuhp notifier from inside >>>>> a cpuhp callback. 
That code replaced intel_iommu implementation of >>>>> flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp >>>>> held for the module lifetime. >>>> >>>> OK, *now* I see what's going on. It doesn't seem unreasonable to me for >>>> bus notifiers to touch CPU hotplug - what seems more unexpected is the >>>> coretemp driver creating and adding a platform device from inside a >>>> hotplug callback. > > It is only unexpected if it is documented that creating a platform device > from a hotplug callback is off limits. > >>>> >>>> Once we start trying to revert multiple unrelated bits of important >>>> functionality from other subsystems because one driver is doing a weird >>>> thing, maybe it's time to instead question whether that driver should be >>>> doing a weird thing? > > That isn't the point. This _used_ to work, after all. Maybe the functionality > introduced with f598a497bc7d is important, but there is still a regression > introduced by f598a497bc7d. Sure, maybe the coretemp driver is doing > "a weird thing", but if some generic code is changed causing something to fail > that previously worked, it is still a regression and the responsibility of the > person or team making the generic code change to fix the problems caused by > that change. Note that AFAICS I don't think anything's actually broken, and this is merely a lockdep false-positive. The coretemp device itself will not be associated with the IOMMU, so the IOMMU notifier will never get as far as taking any further locks in that particular instance. Of course I *can* try writing the patch to fix things properly if I have to, but fair warning; I'm not familiar with this driver or the relevant hardware or the subsystem, and from a brief look it will involve some significant redesign that I have every chance of getting wrong. Plus I'm not sure I can test the hotplug stuff at all since the x86 box I have to hand only seems to have a single coretemp device. 
The fact is, the wacky thing it's doing with platform_device_add() doesn't actually work *all* that well anyway: $ sudo rmmod coretemp $ echo 0 | sudo tee /sys/bus/platform/drivers_autoprobe 0 $ sudo modprobe coretemp [7169271.187103] BUG: kernel NULL pointer dereference, address: 0000000000000418 [7169271.187127] #PF: supervisor write access in kernel mode [7169271.187131] #PF: error_code(0x0002) - not-present page [7169271.187134] PGD 0 P4D 0 [7169271.187139] Oops: 0002 [#1] SMP PTI [7169271.187144] CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 5.13.0-52-generic #59~20.04.1-Ubuntu [7169271.187150] Hardware name: LENOVO 30B6S08J03/1030, BIOS S01KT29A 06/20/2016 [7169271.187152] RIP: 0010:create_core_data+0x3cb/0x510 [coretemp] [7169271.187163] Code: 44 89 e7 e8 67 99 7f c8 85 c0 75 17 0f b6 45 b9 41 83 46 24 01 69 c0 18 fc ff ff 41 03 46 08 41 89 46 04 48 8b 45 b0 4c 63 fb <4e> 89 b4 f8 10 04 00 00 48 8b 00 41 8b 56 24 48 89 45 a0 85 d2 7e [7169271.187167] RSP: 0018:ffffa5ddc015fd98 EFLAGS: 00010203 [7169271.187172] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002 [7169271.187175] RDX: 0000000000000000 RSI: ffffffff89207b30 RDI: ffffa5ddc015fd40 [7169271.187178] RBP: ffffa5ddc015fe00 R08: 0000000000000000 R09: ffff8e049c04c800 [7169271.187181] R10: 0000000000019460 R11: 0000000000000000 R12: 0000000000000000 [7169271.187184] R13: 000000000000005f R14: ffff8e049c04c800 R15: 0000000000000001 [7169271.187187] FS: 0000000000000000(0000) GS:ffff8e0b5f600000(0000) knlGS:0000000000000000 [7169271.187191] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [7169271.187194] CR2: 0000000000000418 CR3: 0000000190672002 CR4: 00000000003706f0 [7169271.187198] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [7169271.187200] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [7169271.187203] Call Trace: [7169271.187206] <TASK> [7169271.187212] coretemp_cpu_online+0x14f/0x180 [coretemp] [7169271.187220] ? 
create_core_data+0x510/0x510 [coretemp] [7169271.187226] cpuhp_invoke_callback+0x10b/0x430 [7169271.187237] cpuhp_thread_fun+0x92/0x150 [7169271.187244] smpboot_thread_fn+0xd0/0x170 [7169271.187253] ? sort_range+0x30/0x30 [7169271.187260] kthread+0x12b/0x150 [7169271.187264] ? set_kthread_struct+0x40/0x40 [7169271.187269] ret_from_fork+0x22/0x30 [7169271.187280] </TASK> Consider that a bug report, unless of course it's documented somewhere that users aren't allowed to turn off autoprobe ;) Thanks, Robin. ^ permalink raw reply [flat|nested] 11+ messages in thread
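For readers trying to follow the lockdep report discussed in this thread: lockdep records every observed "lock A held while taking lock B" pair as an edge in a global dependency graph, and warns whenever a new acquisition would close a cycle in that graph, whether or not the deadlock can actually fire at runtime — which is why a false positive like the one Robin describes is possible. A minimal userspace model of that check, using the three locks from the trace (illustrative only; the names are stand-ins and this is not kernel code):

```c
#include <stdbool.h>

/* Ordering edges lockdep had already recorded before the warning:
 *   bus_notifier     -> cpu_hotplug_lock  (iommu_bus_notifier under device_add)
 *   cpu_hotplug_lock -> cpuhp_state-up    (cpuhp thread bringup)
 * The coretemp callback then tries to take bus_notifier while holding
 * cpuhp_state-up, which closes the cycle.
 */
enum lock { BUS_NOTIFIER, CPU_HOTPLUG_LOCK, CPUHP_STATE_UP, NLOCKS };

static bool edge[NLOCKS][NLOCKS] = {
    [BUS_NOTIFIER]     = { [CPU_HOTPLUG_LOCK] = true },
    [CPU_HOTPLUG_LOCK] = { [CPUHP_STATE_UP]   = true },
};

/* Depth-first search: can `to` be reached from `from` via recorded edges? */
static bool reachable(enum lock from, enum lock to, bool seen[NLOCKS])
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NLOCKS; next++)
        if (edge[from][next] && !seen[next] &&
            reachable((enum lock)next, to, seen))
            return true;
    return false;
}

/* Taking `taking` while holding `held` is flagged iff `held` is already
 * reachable from `taking`: the new held->taking edge would form a cycle. */
static bool would_warn(enum lock held, enum lock taking)
{
    bool seen[NLOCKS] = { false };
    return reachable(taking, held, seen);
}
```

Here `would_warn(CPUHP_STATE_UP, BUS_NOTIFIER)` reports a cycle, matching the "Chain exists of" summary in the log, even though (as noted above) the coretemp device never actually reaches the IOMMU notifier's inner locks.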
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing 2022-10-05 16:15 ` Robin Murphy @ 2022-10-05 17:11 ` Guenter Roeck 0 siblings, 0 replies; 11+ messages in thread From: Guenter Roeck @ 2022-10-05 17:11 UTC (permalink / raw) To: Robin Murphy Cc: Thorsten Leemhuis, Fenghua Yu, intel-gfx, Chris Wilson, Joerg Roedel, Will Deacon, iommu, linux-kernel, regressions, Janusz Krzysztofik, Lucas De Marchi, linux-hwmon On Wed, Oct 05, 2022 at 05:15:49PM +0100, Robin Murphy wrote: > On 2022-10-05 16:25, Guenter Roeck wrote: > > On Wed, Oct 05, 2022 at 04:26:28PM +0200, Thorsten Leemhuis wrote: > > > [adding the coretemp maintainer (Fenghua Yu) and the appropriate mailing > > > list to the list of recipients, as there apparently is a coretemp bug > > > that results in a iommu change causing a regression] > > > > > > On 30.09.22 18:57, Janusz Krzysztofik wrote: > > > > I think this issue can hit any user with a platform that loads iommu and > > > > coretemp drivers. Adding regressions@lists.linux.dev to the loop. > > > > > > f598a497bc7d was merged for 5.13-rc1, which is quite a while ago, so at > > > least a quick revert is out of question as it might do more harm than > > > good. The authors of the commit are kinda responsible for fixing > > > situations like this; but well, did anybody ask the developers of the > > > coretemp driver kindly if they are aware of the problem and maybe even > > > willing to fix it? Doesn't look like it from here from search lore (hope > > > I didn't miss anything), so let's give it a try. > > > > > > Ciao, Thorsten > > > > > > > On Thursday, 22 September 2022 14:09:35 CEST Robin Murphy wrote: > > > > > On 22/09/2022 11:10 am, Janusz Krzysztofik wrote: > > > > > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > > > > > > > > > Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to > > > > > > flush rcaches"). It is trying to instantiate a cpuhp notifier from inside > > > > > > a cpuhp callback. 
That code replaced intel_iommu implementation of > > > > > > flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp > > > > > > held for the module lifetime. > > > > > > > > > > OK, *now* I see what's going on. It doesn't seem unreasonable to me for > > > > > bus notifiers to touch CPU hotplug - what seems more unexpected is the > > > > > coretemp driver creating and adding a platform device from inside a > > > > > hotplug callback. > > > > It is only unexpected if it is documented that creating a platform driver > > from a hotplug callback is off limits. > > > > > > > > > > > > Once we start trying to revert multiple unrelated bits of important > > > > > functionality from other subsystems because one driver is doing a weird > > > > > thing, maybe it's time to instead question whether that driver should be > > > > > doing a weird thing? > > > > That isn't the point. This _used_ to work, after all. Maybe the functionality > > introduced with f598a497bc7d is important, but there is still a regression > > introduced by f598a497bc7d. Sure, maybe the coretemp driver is doing > > "a weird thing", but if some generic code is changed causing something to fail > > that previously worked, it is still a regression and the reponsibility of the > > person or team making the generic code change to fix the problems caused by > > that change. > > Note that AFAICS I don't think anything's actually broken, and this is > merely a lockdep false-positive. The coretemp device itself will not be > associated with the IOMMU, so the IOMMU notifier will never get as far as > taking any further locks in that particular instance. > > Of course I *can* try writing the patch to fix things properly if I have to, > but fair warning; I'm not familiar with this driver or the relevant hardware > or the subsystem, and from a brief look it will involve some significant > redesign that I have every chance of getting wrong. 
Plus I'm not sure I can > test the hotplug stuff at all since the x86 box I have to hand only seems to > have a single coretemp device. > > The fact is, the wacky thing it's doing with platform_device_add() doesn't > actually work *all* that well anyway: > Hah, yes, that is obviously a bug. Unfortunately I don't have any systems with Intel CPU left, so I can not test myself. FWIW, on v5.18.x (which is what Google laptops use for whatever reason), I don't see the crash, but "modprobe -r coretemp" followed by "modprobe coretemp" doesn't work - the driver loads, but does not register with the hwmon subsystem. There has been no relevant change to the driver since v5.13, so all I can conclude at this point is that the driver is very likely still broken in the mainline kernel. Guenter > $ sudo rmmod coretemp > $ echo 0 | sudo tee /sys/bus/platform/drivers_autoprobe > 0 > $ sudo modprobe coretemp > > [7169271.187103] BUG: kernel NULL pointer dereference, address: > 0000000000000418 > [7169271.187127] #PF: supervisor write access in kernel mode > [7169271.187131] #PF: error_code(0x0002) - not-present page > [7169271.187134] PGD 0 P4D 0 > [7169271.187139] Oops: 0002 [#1] SMP PTI > [7169271.187144] CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 5.13.0-52-generic > #59~20.04.1-Ubuntu > [7169271.187150] Hardware name: LENOVO 30B6S08J03/1030, BIOS S01KT29A > 06/20/2016 > [7169271.187152] RIP: 0010:create_core_data+0x3cb/0x510 [coretemp] > [7169271.187163] Code: 44 89 e7 e8 67 99 7f c8 85 c0 75 17 0f b6 45 b9 41 83 > 46 24 01 69 c0 18 fc ff ff 41 03 46 08 41 89 46 04 48 8b 45 b0 4c 63 fb <4e> > 89 b4 f8 10 04 00 00 48 8b 00 41 8b 56 24 48 89 45 a0 85 d2 7e > [7169271.187167] RSP: 0018:ffffa5ddc015fd98 EFLAGS: 00010203 > [7169271.187172] RAX: 0000000000000000 RBX: 0000000000000001 RCX: > 0000000000000002 > [7169271.187175] RDX: 0000000000000000 RSI: ffffffff89207b30 RDI: > ffffa5ddc015fd40 > [7169271.187178] RBP: ffffa5ddc015fe00 R08: 0000000000000000 R09: > ffff8e049c04c800 > 
[7169271.187181] R10: 0000000000019460 R11: 0000000000000000 R12: > 0000000000000000 > [7169271.187184] R13: 000000000000005f R14: ffff8e049c04c800 R15: > 0000000000000001 > [7169271.187187] FS: 0000000000000000(0000) GS:ffff8e0b5f600000(0000) > knlGS:0000000000000000 > [7169271.187191] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [7169271.187194] CR2: 0000000000000418 CR3: 0000000190672002 CR4: > 00000000003706f0 > [7169271.187198] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [7169271.187200] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: > 0000000000000400 > [7169271.187203] Call Trace: > [7169271.187206] <TASK> > [7169271.187212] coretemp_cpu_online+0x14f/0x180 [coretemp] > [7169271.187220] ? create_core_data+0x510/0x510 [coretemp] > [7169271.187226] cpuhp_invoke_callback+0x10b/0x430 > [7169271.187237] cpuhp_thread_fun+0x92/0x150 > [7169271.187244] smpboot_thread_fn+0xd0/0x170 > [7169271.187253] ? sort_range+0x30/0x30 > [7169271.187260] kthread+0x12b/0x150 > [7169271.187264] ? set_kthread_struct+0x40/0x40 > [7169271.187269] ret_from_fork+0x22/0x30 > [7169271.187280] </TASK> > > Consider that a bug report, unless of course it's documented somewhere that > users aren't allowed to turn off autoprobe ;) > > Thanks, > Robin. ^ permalink raw reply [flat|nested] 11+ messages in thread
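The failure mode in Robin's reproducer is worth spelling out: platform_device_add() only registers the device, and with drivers_autoprobe disabled the probe routine never runs, so per-device driver data is likely still NULL when the hotplug callback later dereferences it (consistent with the faulting address 0x418, a small offset from NULL, inside create_core_data()). A hypothetical userspace model of that sequence — the names `probe`, `device_add`, and `cpu_online_cb` are simplified stand-ins, not the real driver-core API:

```c
#include <stdbool.h>
#include <stddef.h>
#include <errno.h>

/* Simplified model: device registration and driver probing are separate
 * steps, and only probing populates the driver-private data. */
struct pdev {
    void *drvdata;              /* set by probe(); NULL until then */
};

static bool autoprobe = true;   /* mirrors /sys/bus/platform/drivers_autoprobe */
static int probed_data;

static void probe(struct pdev *p)
{
    p->drvdata = &probed_data;
}

static void device_add(struct pdev *p)
{
    if (autoprobe)              /* with autoprobe off, probe() never runs */
        probe(p);
}

/* A hotplug callback that dereferenced drvdata unconditionally (as the
 * create_core_data() oops suggests) would crash on the unprobed device;
 * checking first turns the crash into a clean error. */
static int cpu_online_cb(struct pdev *p)
{
    if (!p->drvdata)
        return -ENODEV;         /* device was added but never bound */
    return 0;
}
```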
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing #forregzbot 2022-09-22 10:10 [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing Janusz Krzysztofik 2022-09-22 12:09 ` Robin Murphy @ 2022-10-05 14:41 ` Thorsten Leemhuis 2022-11-02 11:17 ` [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing Baolu Lu 2 siblings, 0 replies; 11+ messages in thread From: Thorsten Leemhuis @ 2022-10-05 14:41 UTC (permalink / raw) To: regressions; +Cc: intel-gfx, iommu, linux-kernel [Note: this mail is primarily sent for documentation purposes and/or for regzbot, my Linux kernel regression tracking bot. That's why I removed most or all folks from the list of recipients, but left any that looked like mailing lists. These mails usually contain '#forregzbot' in the subject, to make them easy to spot and filter out.] [TLDR: I'm adding this regression report to the list of tracked regressions; all text from me you find below is based on a few template paragraphs you might have encountered already in similar form.] Hi, this is your Linux kernel regression tracker. On 22.09.22 12:10, Janusz Krzysztofik wrote: > From: Chris Wilson <chris@chris-wilson.co.uk> > > Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to > flush rcaches"). It is trying to instantiate a cpuhp notifier from inside > a cpuhp callback. That code replaced intel_iommu implementation of > flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp > held for the module lifetime. Thanks for the report. To be sure the issue below doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot: #regzbot ^introduced f598a497bc7d #regzbot title iommu/coretemp: possible circular locking dependency detected #regzbot ignore-activity This isn't a regression? This issue or a fix for it is already being discussed somewhere else? It was fixed already? You want to clarify when the regression started to happen? 
Or point out I got the title or something else totally wrong? Then just reply -- ideally also telling regzbot about it, as explained here: https://linux-regtracking.leemhuis.info/tracked-regression/ Reminder for developers: When fixing the issue, add 'Link:' tags pointing to the report (the mail this one replies to), as explained in the Linux kernel's documentation; the webpage above explains why this is important for tracked regressions. Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight. > <4>[ 6.928112] ====================================================== > <4>[ 6.928621] WARNING: possible circular locking dependency detected > <4>[ 6.929225] 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 Not tainted > <4>[ 6.929818] ------------------------------------------------------ > <4>[ 6.930415] cpuhp/0/15 is trying to acquire lock: > <4>[ 6.931011] ffff888100e02a78 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x20/0x50 > <4>[ 6.931533] > but task is already holding lock: > <4>[ 6.931534] ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > <4>[ 6.933069] > which lock already depends on the new lock. 
> > <4>[ 6.933070] > the existing dependency chain (in reverse order) is: > <4>[ 6.933071] > -> #2 (cpuhp_state-up){+.+.}-{0:0}: > <4>[ 6.933076] lock_acquire+0xd3/0x310 > <4>[ 6.933079] cpuhp_thread_fun+0xa6/0x1f0 > <4>[ 6.933082] smpboot_thread_fn+0x1b5/0x260 > <4>[ 6.933084] kthread+0xed/0x120 > <4>[ 6.933086] ret_from_fork+0x1f/0x30 > <4>[ 6.933089] > -> #1 (cpu_hotplug_lock){++++}-{0:0}: > <4>[ 6.933092] lock_acquire+0xd3/0x310 > <4>[ 6.933094] __cpuhp_state_add_instance+0x43/0x1c0 > <4>[ 6.933096] iova_domain_init_rcaches+0x199/0x1c0 > <4>[ 6.933099] iommu_setup_dma_ops+0x104/0x3d0 > <4>[ 6.933101] iommu_probe_device+0xa4/0x180 > <4>[ 6.933103] iommu_bus_notifier+0x2d/0x40 > <4>[ 6.933105] notifier_call_chain+0x31/0x90 > <4>[ 6.933108] blocking_notifier_call_chain+0x3a/0x50 > <4>[ 6.933110] device_add+0x3c1/0x900 > <4>[ 6.933112] pci_device_add+0x255/0x580 > <4>[ 6.933115] pci_scan_single_device+0xa6/0xd0 > <4>[ 6.933117] p2sb_bar+0x7f/0x220 > <4>[ 6.933120] i801_add_tco_spt.isra.18+0x2b/0xca [i2c_i801] > <4>[ 6.933124] i801_add_tco+0xb1/0xfe [i2c_i801] > <4>[ 6.933126] i801_probe.cold.25+0xa9/0x3a7 [i2c_i801] > <4>[ 6.933129] pci_device_probe+0x95/0x110 > <4>[ 6.933132] really_probe+0xd6/0x350 > <4>[ 6.933134] __driver_probe_device+0x73/0x170 > <4>[ 6.933137] driver_probe_device+0x1a/0x90 > <4>[ 6.933140] __driver_attach+0xbc/0x190 > <4>[ 6.933141] bus_for_each_dev+0x72/0xc0 > <4>[ 6.933143] bus_add_driver+0x1bb/0x210 > <4>[ 6.933146] driver_register+0x66/0xc0 > <4>[ 6.933147] wmi_bmof_probe+0x3b/0xac [wmi_bmof] > <4>[ 6.933150] do_one_initcall+0x53/0x2f0 > <4>[ 6.933152] do_init_module+0x45/0x1c0 > <4>[ 6.933154] load_module+0x1cd5/0x1ec0 > <4>[ 6.933156] __do_sys_finit_module+0xaf/0x120 > <4>[ 6.933158] do_syscall_64+0x37/0x90 > <4>[ 6.933160] entry_SYSCALL_64_after_hwframe+0x63/0xcd > <4>[ 6.953757] > -> #0 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}: > <4>[ 6.953779] validate_chain+0xb3f/0x2000 > <4>[ 6.953785] __lock_acquire+0x5a4/0xb70 > <4>[ 6.953786] 
lock_acquire+0xd3/0x310 > <4>[ 6.953787] down_read+0x39/0x140 > <4>[ 6.953790] blocking_notifier_call_chain+0x20/0x50 > <4>[ 6.953794] device_add+0x3c1/0x900 > <4>[ 6.953797] platform_device_add+0x108/0x240 > <4>[ 6.953799] coretemp_cpu_online+0xe1/0x15e [coretemp] > <4>[ 6.953805] cpuhp_invoke_callback+0x181/0x8a0 > <4>[ 6.958244] cpuhp_thread_fun+0x188/0x1f0 > <4>[ 6.958267] smpboot_thread_fn+0x1b5/0x260 > <4>[ 6.958270] kthread+0xed/0x120 > <4>[ 6.958272] ret_from_fork+0x1f/0x30 > <4>[ 6.958274] > other info that might help us debug this: > > <4>[ 6.958275] Chain exists of: > &(&priv->bus_notifier)->rwsem --> cpu_hotplug_lock --> cpuhp_state-up > > <4>[ 6.961037] Possible unsafe locking scenario: > > <4>[ 6.961038] CPU0 CPU1 > <4>[ 6.961038] ---- ---- > <4>[ 6.961039] lock(cpuhp_state-up); > <4>[ 6.961040] lock(cpu_hotplug_lock); > <4>[ 6.961041] lock(cpuhp_state-up); > <4>[ 6.961042] lock(&(&priv->bus_notifier)->rwsem); > <4>[ 6.961044] > *** DEADLOCK *** > > <4>[ 6.961044] 2 locks held by cpuhp/0/15: > <4>[ 6.961046] #0: ffffffff82648f10 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > <4>[ 6.961053] #1: ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > <4>[ 6.961058] > stack backtrace: > <4>[ 6.961059] CPU: 0 PID: 15 Comm: cpuhp/0 Not tainted 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 > <4>[ 6.961062] Hardware name: Intel Corporation NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0047.2018.0718.1706 07/18/2018 > <4>[ 6.961063] Call Trace: > <4>[ 6.961064] <TASK> > <4>[ 6.961065] dump_stack_lvl+0x56/0x7f > <4>[ 6.961069] check_noncircular+0x132/0x150 > <4>[ 6.961078] validate_chain+0xb3f/0x2000 > <4>[ 6.961083] __lock_acquire+0x5a4/0xb70 > <4>[ 6.961087] lock_acquire+0xd3/0x310 > <4>[ 6.961088] ? blocking_notifier_call_chain+0x20/0x50 > <4>[ 6.961093] down_read+0x39/0x140 > <4>[ 6.961097] ? 
blocking_notifier_call_chain+0x20/0x50 > <4>[ 6.961099] blocking_notifier_call_chain+0x20/0x50 > <4>[ 6.961102] device_add+0x3c1/0x900 > <4>[ 6.961106] ? dev_set_name+0x4e/0x70 > <4>[ 6.961109] platform_device_add+0x108/0x240 > <4>[ 6.961112] coretemp_cpu_online+0xe1/0x15e [coretemp] > <4>[ 6.961117] ? create_core_data+0x550/0x550 [coretemp] > <4>[ 6.961120] cpuhp_invoke_callback+0x181/0x8a0 > <4>[ 6.961124] cpuhp_thread_fun+0x188/0x1f0 > <4>[ 6.961129] ? smpboot_thread_fn+0x1e/0x260 > <4>[ 6.961131] smpboot_thread_fn+0x1b5/0x260 > <4>[ 6.961134] ? sort_range+0x20/0x20 > <4>[ 6.961135] kthread+0xed/0x120 > <4>[ 6.961137] ? kthread_complete_and_exit+0x20/0x20 > <4>[ 6.961139] ret_from_fork+0x1f/0x30 > <4>[ 6.961145] </TASK> > > Closes: https://gitlab.freedesktop.org/drm/intel/issues/6641 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> > --- > drivers/iommu/iova.c | 28 ---------------------------- > include/linux/cpuhotplug.h | 1 - > include/linux/iova.h | 1 - > 3 files changed, 30 deletions(-) > > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > index 47d1983dfa2a4..f0136d0231f06 100644 > --- a/drivers/iommu/iova.c > +++ b/drivers/iommu/iova.c > @@ -31,16 +31,6 @@ unsigned long iova_rcache_range(void) > return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1); > } > > -static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node) > -{ > - struct iova_domain *iovad; > - > - iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead); > - > - free_cpu_cached_iovas(cpu, iovad); > - return 0; > -} > - > static void free_global_cached_iovas(struct iova_domain *iovad); > > static struct iova *to_iova(struct rb_node *node) > @@ -255,21 +245,10 @@ int iova_cache_get(void) > { > mutex_lock(&iova_cache_mutex); > if (!iova_cache_users) { > - int ret; > - > - ret = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, "iommu/iova:dead", NULL, > - iova_cpuhp_dead); > - if (ret) { > - 
mutex_unlock(&iova_cache_mutex); > - pr_err("Couldn't register cpuhp handler\n"); > - return ret; > - } > - > iova_cache = kmem_cache_create( > "iommu_iova", sizeof(struct iova), 0, > SLAB_HWCACHE_ALIGN, NULL); > if (!iova_cache) { > - cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); > mutex_unlock(&iova_cache_mutex); > pr_err("Couldn't create iova cache\n"); > return -ENOMEM; > @@ -292,7 +271,6 @@ void iova_cache_put(void) > } > iova_cache_users--; > if (!iova_cache_users) { > - cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); > kmem_cache_destroy(iova_cache); > } > mutex_unlock(&iova_cache_mutex); > @@ -495,8 +473,6 @@ EXPORT_SYMBOL_GPL(free_iova_fast); > > static void iova_domain_free_rcaches(struct iova_domain *iovad) > { > - cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, > - &iovad->cpuhp_dead); > free_iova_rcaches(iovad); > } > > @@ -755,10 +731,6 @@ int iova_domain_init_rcaches(struct iova_domain *iovad) > } > } > > - ret = cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, > - &iovad->cpuhp_dead); > - if (ret) > - goto out_err; > return 0; > > out_err: > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > index f61447913db97..8f541a6b63e41 100644 > --- a/include/linux/cpuhotplug.h > +++ b/include/linux/cpuhotplug.h > @@ -95,7 +95,6 @@ enum cpuhp_state { > CPUHP_PAGE_ALLOC, > CPUHP_NET_DEV_DEAD, > CPUHP_PCI_XGENE_DEAD, > - CPUHP_IOMMU_IOVA_DEAD, > CPUHP_LUSTRE_CFS_DEAD, > CPUHP_AP_ARM_CACHE_B15_RAC_DEAD, > CPUHP_PADATA_DEAD, > diff --git a/include/linux/iova.h b/include/linux/iova.h > index c6ba6d95d79c2..fd77cd5bfa333 100644 > --- a/include/linux/iova.h > +++ b/include/linux/iova.h > @@ -37,7 +37,6 @@ struct iova_domain { > struct iova anchor; /* rbtree lookup anchor */ > > struct iova_rcache *rcaches; > - struct hlist_node cpuhp_dead; > }; > > static inline unsigned long iova_size(struct iova *iova) ^ permalink raw reply [flat|nested] 11+ messages in thread
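The iova_cache_get()/iova_cache_put() hunks in the patch above show the shape the commit message refers to: a single global registration held for the module lifetime, set up by the first user and torn down by the last under a mutex. A stripped-down userspace model of that refcounting pattern — the cpuhp_setup_state_multi()/cpuhp_remove_multi_state() calls are reduced to a flag, and the iova_cache_mutex serialization is elided since this model is single-threaded:

```c
#include <stdbool.h>

/* Model of the "register once for the module lifetime, refcounted per
 * user" pattern that iova_cache_get()/iova_cache_put() implement. */
static int cache_users;
static bool hotplug_handler_registered;

static int cache_get(void)
{
    if (!cache_users)
        hotplug_handler_registered = true;  /* first user registers */
    cache_users++;
    return 0;
}

static void cache_put(void)
{
    if (--cache_users == 0)
        hotplug_handler_registered = false; /* last user unregisters */
}
```

The reverted per-domain variant instead called cpuhp_state_add_instance_nocalls() from iova_domain_init_rcaches(), which is what ends up nesting CPU-hotplug state inside a bus-notifier callback in the traces quoted in this thread.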
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing 2022-09-22 10:10 [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing Janusz Krzysztofik 2022-09-22 12:09 ` Robin Murphy 2022-10-05 14:41 ` [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing #forregzbot Thorsten Leemhuis @ 2022-11-02 11:17 ` Baolu Lu 2022-11-02 12:20 ` Baolu Lu 2 siblings, 1 reply; 11+ messages in thread From: Baolu Lu @ 2022-11-02 11:17 UTC (permalink / raw) To: Janusz Krzysztofik, Lucas De Marchi Cc: baolu.lu, intel-gfx, Chris Wilson, Robin Murphy, Joerg Roedel, Will Deacon, iommu, linux-kernel On 2022/9/22 18:10, Janusz Krzysztofik wrote: > From: Chris Wilson<chris@chris-wilson.co.uk> > > Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to > flush rcaches"). It is trying to instantiate a cpuhp notifier from inside > a cpuhp callback. That code replaced intel_iommu implementation of > flushing per-IOVA domain CPU rcaches which used a single instance of cpuhp > held for the module lifetime. > > <4>[ 6.928112] ====================================================== > <4>[ 6.928621] WARNING: possible circular locking dependency detected > <4>[ 6.929225] 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 Not tainted > <4>[ 6.929818] ------------------------------------------------------ > <4>[ 6.930415] cpuhp/0/15 is trying to acquire lock: > <4>[ 6.931011] ffff888100e02a78 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x20/0x50 > <4>[ 6.931533] > but task is already holding lock: > <4>[ 6.931534] ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0 > <4>[ 6.933069] > which lock already depends on the new lock. Just FYI. Hot plugging a PCI device will trigger a similar lockdep warning. #echo 1 > /sys/bus/pci/devices/.../remove With this patch applied, the warning disappeared. 
[ 2598.661070] [ 2598.663338] ====================================================== [ 2598.671360] WARNING: possible circular locking dependency detected [ 2598.679361] 6.1.0-rc2+ #367 Tainted: G I [ 2598.686254] ------------------------------------------------------ [ 2598.694260] bash/828 is trying to acquire lock: [ 2598.700249] ffffffff95a1e7f0 (dmar_global_lock){++++}-{3:3}, at: dmar_pci_bus_notifier+0x55/0x110 [ 2598.711680] [ 2598.711680] but task is already holding lock: [ 2598.719258] ff24417981dfee78 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x29/0x60 [ 2598.732692] [ 2598.732692] which lock already depends on the new lock. [ 2598.732692] [ 2598.743174] [ 2598.743174] the existing dependency chain (in reverse order) is: [ 2598.752770] [ 2598.752770] -> #4 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}: [ 2598.762268] lock_acquire+0xc2/0x2e0 [ 2598.767645] down_read+0x42/0x150 [ 2598.772743] blocking_notifier_call_chain+0x29/0x60 [ 2598.779755] device_add+0x403/0x980 [ 2598.785045] platform_device_add+0x11c/0x240 [ 2598.791350] coretemp_cpu_online+0xe5/0x180 [coretemp] [ 2598.798757] cpuhp_invoke_callback+0x179/0x8d0 [ 2598.805252] cpuhp_thread_fun+0x19d/0x210 [ 2598.811249] smpboot_thread_fn+0x11d/0x240 [ 2598.817349] kthread+0xeb/0x120 [ 2598.822242] ret_from_fork+0x1f/0x30 [ 2598.827645] [ 2598.827645] -> #3 (cpuhp_state-up){+.+.}-{0:0}: [ 2598.835559] lock_acquire+0xc2/0x2e0 [ 2598.840945] cpuhp_thread_fun+0xb3/0x210 [ 2598.846848] smpboot_thread_fn+0x11d/0x240 [ 2598.852949] kthread+0xeb/0x120 [ 2598.857842] ret_from_fork+0x1f/0x30 [ 2598.864477] [ 2598.864477] -> #2 (cpu_hotplug_lock){++++}-{0:0}: [ 2598.875039] lock_acquire+0xc2/0x2e0 [ 2598.881692] __cpuhp_state_add_instance+0x4c/0x1b0 [ 2598.889881] iova_domain_init_rcaches+0x179/0x1a0 [ 2598.897840] iommu_setup_dma_ops+0x135/0x440 [ 2598.905329] bus_iommu_probe+0x276/0x2e0 [ 2598.912412] iommu_device_register+0xd4/0x130 [ 2598.920029] intel_iommu_init+0x3e1/0x6ea 
[ 2598.927216] pci_iommu_init+0x12/0x3a [ 2598.933898] do_one_initcall+0x65/0x320 [ 2598.940877] kernel_init_freeable+0x293/0x2fc [ 2598.948363] kernel_init+0x1a/0x130 [ 2598.954749] ret_from_fork+0x1f/0x30 [ 2598.961237] [ 2598.961237] -> #1 (&domain->iova_cookie->mutex){+.+.}-{3:3}: [ 2598.972616] lock_acquire+0xc2/0x2e0 [ 2598.979115] __mutex_lock+0x99/0xf40 [ 2598.985598] iommu_setup_dma_ops+0xde/0x440 [ 2598.992878] bus_iommu_probe+0x276/0x2e0 [ 2598.999750] iommu_device_register+0xd4/0x130 [ 2599.007154] intel_iommu_init+0x3e1/0x6ea [ 2599.014120] pci_iommu_init+0x12/0x3a [ 2599.020583] do_one_initcall+0x65/0x320 [ 2599.027278] kernel_init_freeable+0x293/0x2fc [ 2599.034576] kernel_init+0x1a/0x130 [ 2599.040780] ret_from_fork+0x1f/0x30 [ 2599.047070] [ 2599.047070] -> #0 (dmar_global_lock){++++}-{3:3}: [ 2599.056971] check_prevs_add+0x160/0xee0 [ 2599.063745] __lock_acquire+0x116a/0x15f0 [ 2599.070647] lock_acquire+0xc2/0x2e0 [ 2599.076939] down_write+0x3f/0xd0 [ 2599.082931] dmar_pci_bus_notifier+0x55/0x110 [ 2599.090239] notifier_call_chain+0x3a/0xa0 [ 2599.097234] blocking_notifier_call_chain+0x43/0x60 [ 2599.105138] device_del+0x2b4/0x420 [ 2599.111333] pci_remove_bus_device+0x70/0x110 [ 2599.118640] pci_stop_and_remove_bus_device_locked+0x22/0x30 [ 2599.127557] remove_store+0x7d/0x90 [ 2599.133752] kernfs_fop_write_iter+0x12a/0x1d0 [ 2599.141165] vfs_write+0x313/0x4b0 [ 2599.147259] ksys_write+0x60/0xe0 [ 2599.153250] do_syscall_64+0x43/0x90 [ 2599.159550] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 2599.167665] [ 2599.167665] other info that might help us debug this: [ 2599.167665] [ 2599.180634] Chain exists of: [ 2599.180634] dmar_global_lock --> cpuhp_state-up --> &(&priv->bus_notifier)->rwsem [ 2599.180634] [ 2599.198702] Possible unsafe locking scenario: [ 2599.198702] [ 2599.208346] CPU0 CPU1 [ 2599.215281] ---- ---- [ 2599.222170] lock(&(&priv->bus_notifier)->rwsem); [ 2599.229474] lock(cpuhp_state-up); [ 2599.238293] 
lock(&(&priv->bus_notifier)->rwsem); [ 2599.248705] lock(dmar_global_lock); [ 2599.254593] [ 2599.254593] *** DEADLOCK *** [ 2599.254593] [ 2599.265197] 4 locks held by bash/828: [ 2599.271076] #0: ff2441798fc8f448 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x60/0xe0 [ 2599.282218] #1: ff24417992068288 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xf7/0x1d0 [ 2599.294449] #2: ffffffff959f6b28 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_stop_and_remove_bus_device_locked+0x12/0x30 [ 2599.309600] #3: ff24417981dfee78 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x29/0x60 [ 2599.324534] [ 2599.324534] stack backtrace: [ 2599.332486] CPU: 0 PID: 828 Comm: bash Tainted: G I 6.1.0-rc2+ #367 [ 2599.343448] Hardware name: Intel Corporation BIRCHSTREAM/BIRCHSTREAM, BIOS BHSDREL1.86B.0014.D60.2205121929 05/12/2022 [ 2599.358107] Call Trace: [ 2599.362685] <TASK> [ 2599.366778] dump_stack_lvl+0x48/0x5f [ 2599.372790] check_noncircular+0x102/0x120 [ 2599.379400] ? check_prevs_add+0x176/0xee0 [ 2599.385994] check_prevs_add+0x160/0xee0 [ 2599.392392] __lock_acquire+0x116a/0x15f0 [ 2599.398892] lock_acquire+0xc2/0x2e0 [ 2599.404882] ? dmar_pci_bus_notifier+0x55/0x110 [ 2599.411996] ? lock_is_held_type+0x9d/0x110 [ 2599.418697] down_write+0x3f/0xd0 [ 2599.424385] ? 
dmar_pci_bus_notifier+0x55/0x110
[ 2599.431593]  dmar_pci_bus_notifier+0x55/0x110
[ 2599.438590]  notifier_call_chain+0x3a/0xa0
[ 2599.445189]  blocking_notifier_call_chain+0x43/0x60
[ 2599.452791]  device_del+0x2b4/0x420
[ 2599.458680]  pci_remove_bus_device+0x70/0x110
[ 2599.465583]  pci_stop_and_remove_bus_device_locked+0x22/0x30
[ 2599.474096]  remove_store+0x7d/0x90
[ 2599.479995]  kernfs_fop_write_iter+0x12a/0x1d0
[ 2599.487107]  vfs_write+0x313/0x4b0
[ 2599.492902]  ksys_write+0x60/0xe0
[ 2599.498587]  do_syscall_64+0x43/0x90
[ 2599.504581]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 2599.512292] RIP: 0033:0x7f04a9d18a98
[ 2599.518292] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 05 26 0f 00 8b 00 85 c0 75 1f b8 01 00 00 00 0f 05 <0f> 1f 84 00 00 00 00 00 48 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00
[ 2599.544235] RSP: 002b:00007ffc28cce9e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 2599.555199] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f04a9d18a98
[ 2599.565713] RDX: 0000000000000002 RSI: 000055d272355f60 RDI: 0000000000000001
[ 2599.576232] RBP: 000055d272355f60 R08: 000000000000000a R09: 0000000000000001
[ 2599.586744] R10: 000000000000000a R11: 0000000000000246 R12: 00007f04a9e06520
[ 2599.597247] R13: 0000000000000002 R14: 00007f04a9e07260 R15: 00007f04a9e06720
[ 2599.607746]  </TASK>
[ 2599.612285] BUG: kernel NULL pointer dereference, address: 0000000000000064
[ 2599.622454] #PF: supervisor read access in kernel mode
[ 2599.630460] #PF: error_code(0x0000) - not-present page
[ 2599.638473] PGD 109f7d067 P4D 0
[ 2599.644274] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2599.651385] CPU: 0 PID: 828 Comm: bash Tainted: G          I        6.1.0-rc2+ #367
[ 2599.662501] Hardware name: Intel Corporation BIRCHSTREAM/BIRCHSTREAM, BIOS BHSDREL1.86B.0014.D60.2205121929 05/12/2022
[ 2599.677439] RIP: 0010:do_raw_spin_lock+0xa/0xc0
[ 2599.684796] Code: 8d 88 80 0b 00 00 48 c7 c7 d8 39 cd 91 e8 c7 c2 06 01 e9 e8 32 0e 01 66 0f 1f 84 00 00 00 00 00 66 0f 1f 00 0f 1f 44 00 00 53 <8b> 47 04 48 89 fb 3d ad 4e ad de 75 4a 48 8b 53 10 65 48 8b 04 25
[ 2599.711073] RSP: 0018:ff5efa9101b27ca8 EFLAGS: 00010092
[ 2599.719254] RAX: 0000000000000000 RBX: 0000000000000206 RCX: 0000000000000000
[ 2599.729869] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000060
[ 2599.740427] RBP: 0000000000000060 R08: 0000000000000001 R09: 0000000000000000
[ 2599.750985] R10: 0000000000000000 R11: ffffffff95b0be60 R12: ff24417986fce000
[ 2599.761500] R13: ffffffff959f9bf0 R14: ff244179872d5058 R15: 0000000000000003
[ 2599.772114] FS:  00007f04aa2b1740(0000) GS:ff244180dfc00000(0000) knlGS:0000000000000000
[ 2599.783841] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2599.792747] CR2: 0000000000000064 CR3: 000000010a6a2004 CR4: 0000000000771ef0
[ 2599.803372] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2599.813976] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 2599.824572] PKRU: 55555554
[ 2599.829655] Call Trace:
[ 2599.834444]  <TASK>
[ 2599.838833]  _raw_spin_lock_irqsave+0x41/0x60
[ 2599.846048]  ? domain_update_iotlb+0x16/0x60
[ 2599.853147]  domain_update_iotlb+0x16/0x60
[ 2599.859946]  intel_iommu_release_device+0xc5/0xd0
[ 2599.867549]  iommu_release_device+0x49/0x80
[ 2599.874444]  iommu_bus_notifier+0x24/0x50
[ 2599.881139]  notifier_call_chain+0x3a/0xa0
[ 2599.887936]  blocking_notifier_call_chain+0x43/0x60
[ 2599.895736]  device_del+0x2b4/0x420
[ 2599.901829]  pci_remove_bus_device+0x70/0x110
[ 2599.909034]  pci_stop_and_remove_bus_device_locked+0x22/0x30
[ 2599.917852]  remove_store+0x7d/0x90
[ 2599.923949]  kernfs_fop_write_iter+0x12a/0x1d0
[ 2599.931253]  vfs_write+0x313/0x4b0
[ 2599.937247]  ksys_write+0x60/0xe0
[ 2599.943095]  do_syscall_64+0x43/0x90
[ 2599.949245]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 2599.957135] RIP: 0033:0x7f04a9d18a98
[ 2599.963213] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 05 26 0f 00 8b 00 85 c0 75 1f b8 01 00 00 00 0f 05 <0f> 1f 84 00 00 00 00 00 48 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00
[ 2599.989405] RSP: 002b:00007ffc28cce9e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 2600.000537] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f04a9d18a98
[ 2600.011155] RDX: 0000000000000002 RSI: 000055d272355f60 RDI: 0000000000000001
[ 2600.021748] RBP: 000055d272355f60 R08: 000000000000000a R09: 0000000000000001
[ 2600.032334] R10: 000000000000000a R11: 0000000000000246 R12: 00007f04a9e06520
[ 2600.042923] R13: 0000000000000002 R14: 00007f04a9e07260 R15: 00007f04a9e06720
[ 2600.053527]  </TASK>
[ 2600.057993] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass isst_if_mmio isst_if_common joydev intel_vsec idxd cxl_acpi cxl_core sunrpc crc32c_intel ixgbe mdio dca bochs drm_vram_helper drm_ttm_helper
[ 2600.088403] CR2: 0000000000000064
[ 2600.094376] ---[ end trace 0000000000000000 ]---
[ 2600.238419] RIP: 0010:do_raw_spin_lock+0xa/0xc0
[ 2600.245811] Code: 8d 88 80 0b 00 00 48 c7 c7 d8 39 cd 91 e8 c7 c2 06 01 e9 e8 32 0e 01 66 0f 1f 84 00 00 00 00 00 66 0f 1f 00 0f 1f 44 00 00 53 <8b> 47 04 48 89 fb 3d ad 4e ad de 75 4a 48 8b 53 10 65 48 8b 04 25
[ 2600.272091] RSP: 0018:ff5efa9101b27ca8 EFLAGS: 00010092
[ 2600.280345] RAX: 0000000000000000 RBX: 0000000000000206 RCX: 0000000000000000
[ 2600.290980] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000060
[ 2600.301580] RBP: 0000000000000060 R08: 0000000000000001 R09: 0000000000000000
[ 2600.312167] R10: 0000000000000000 R11: ffffffff95b0be60 R12: ff24417986fce000
[ 2600.322762] R13: ffffffff959f9bf0 R14: ff244179872d5058 R15: 0000000000000003
[ 2600.333354] FS:  00007f04aa2b1740(0000) GS:ff244180dfc00000(0000) knlGS:0000000000000000
[ 2600.345062] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2600.353949] CR2: 0000000000000064 CR3: 000000010a6a2004 CR4: 0000000000771ef0
[ 2600.364571] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2600.375178] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 2600.385756] PKRU: 55555554
[ 2600.390848] note: bash[828] exited with preempt_count 1

Best regards,
baolu

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing
  2022-11-02 11:17 ` [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing Baolu Lu
@ 2022-11-02 12:20   ` Baolu Lu
  0 siblings, 0 replies; 11+ messages in thread
From: Baolu Lu @ 2022-11-02 12:20 UTC (permalink / raw)
  To: Janusz Krzysztofik, Lucas De Marchi
  Cc: baolu.lu, intel-gfx, Chris Wilson, Robin Murphy, Joerg Roedel,
	Will Deacon, iommu, linux-kernel

On 2022/11/2 19:17, Baolu Lu wrote:
> On 2022/9/22 18:10, Janusz Krzysztofik wrote:
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> Manual revert of commit f598a497bc7d ("iova: Add CPU hotplug handler to
>> flush rcaches"). It is trying to instantiate a cpuhp notifier from inside
>> a cpuhp callback. That code replaced the intel_iommu implementation of
>> flushing per-IOVA domain CPU rcaches, which used a single instance of
>> cpuhp held for the module lifetime.
>>
>> <4>[    6.928112] ======================================================
>> <4>[    6.928621] WARNING: possible circular locking dependency detected
>> <4>[    6.929225] 6.0.0-rc6-CI_DRM_12164-ga1f63e144e54+ #1 Not tainted
>> <4>[    6.929818] ------------------------------------------------------
>> <4>[    6.930415] cpuhp/0/15 is trying to acquire lock:
>> <4>[    6.931011] ffff888100e02a78 (&(&priv->bus_notifier)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x20/0x50
>> <4>[    6.931533]
>> but task is already holding lock:
>> <4>[    6.931534] ffffffff826490c0 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x48/0x1f0
>> <4>[    6.933069]
>> which lock already depends on the new lock.
>
> Just FYI.
>
> Hot plugging a PCI device will trigger a similar lockdep warning.
>
> #echo 1 > /sys/bus/pci/devices/.../remove
>
> With this patch applied, the warning disappeared.

The following kernel trace is generated by my experimental code. Please
ignore it. Sorry for the inconvenience.
> [ 2599.612285] BUG: kernel NULL pointer dereference, address:
> 0000000000000064
> [ 2599.622454] #PF: supervisor read access in kernel mode
> [ 2599.630460] #PF: error_code(0x0000) - not-present page
> [ 2599.638473] PGD 109f7d067 P4D 0
> [ 2599.644274] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 2599.651385] CPU: 0 PID: 828 Comm: bash Tainted: G          I
> 6.1.0-rc2+ #367
> [ 2599.677439] RIP: 0010:do_raw_spin_lock+0xa/0xc0
> [ 2599.684796] Code: 8d 88 80 0b 00 00 48 c7 c7 d8 39 cd 91 e8 c7 c2 06
> 01 e9 e8 32 0e 01 66 0f 1f 84 00 00 00 00 00 66 0f 1f 00 0f 1f 44 00 00
> 53 <8b> 47 04 48 89 fb 3d ad 4e ad de 75 4a 48 8b 53 10 65 48 8b 04 25
> [ 2599.711073] RSP: 0018:ff5efa9101b27ca8 EFLAGS: 00010092
> [ 2599.719254] RAX: 0000000000000000 RBX: 0000000000000206 RCX:
> 0000000000000000
> [ 2599.729869] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000000000060
> [ 2599.740427] RBP: 0000000000000060 R08: 0000000000000001 R09:
> 0000000000000000
> [ 2599.750985] R10: 0000000000000000 R11: ffffffff95b0be60 R12:
> ff24417986fce000
> [ 2599.761500] R13: ffffffff959f9bf0 R14: ff244179872d5058 R15:
> 0000000000000003
> [ 2599.772114] FS:  00007f04aa2b1740(0000) GS:ff244180dfc00000(0000)
> knlGS:0000000000000000
> [ 2599.783841] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2599.792747] CR2: 0000000000000064 CR3: 000000010a6a2004 CR4:
> 0000000000771ef0
> [ 2599.803372] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 2599.813976] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
> 0000000000000400
> [ 2599.824572] PKRU: 55555554
> [ 2599.829655] Call Trace:
> [ 2599.834444]  <TASK>
> [ 2599.838833]  _raw_spin_lock_irqsave+0x41/0x60
> [ 2599.846048]  ? domain_update_iotlb+0x16/0x60
> [ 2599.853147]  domain_update_iotlb+0x16/0x60
> [ 2599.859946]  intel_iommu_release_device+0xc5/0xd0
> [ 2599.867549]  iommu_release_device+0x49/0x80
> [ 2599.874444]  iommu_bus_notifier+0x24/0x50
> [ 2599.881139]  notifier_call_chain+0x3a/0xa0
> [ 2599.887936]  blocking_notifier_call_chain+0x43/0x60
> [ 2599.895736]  device_del+0x2b4/0x420
> [ 2599.901829]  pci_remove_bus_device+0x70/0x110
> [ 2599.909034]  pci_stop_and_remove_bus_device_locked+0x22/0x30
> [ 2599.917852]  remove_store+0x7d/0x90
> [ 2599.923949]  kernfs_fop_write_iter+0x12a/0x1d0
> [ 2599.931253]  vfs_write+0x313/0x4b0
> [ 2599.937247]  ksys_write+0x60/0xe0
> [ 2599.943095]  do_syscall_64+0x43/0x90
> [ 2599.949245]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [ 2599.957135] RIP: 0033:0x7f04a9d18a98
> [ 2599.963213] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00
> 00 f3 0f 1e fa 48 8d 05 05 26 0f 00 8b 00 85 c0 75 1f b8 01 00 00 00 0f
> 05 <0f> 1f 84 00 00 00 00 00 48 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00
> [ 2599.989405] RSP: 002b:00007ffc28cce9e8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000001
> [ 2600.000537] RAX: ffffffffffffffda RBX: 0000000000000002 RCX:
> 00007f04a9d18a98
> [ 2600.011155] RDX: 0000000000000002 RSI: 000055d272355f60 RDI:
> 0000000000000001
> [ 2600.021748] RBP: 000055d272355f60 R08: 000000000000000a R09:
> 0000000000000001
> [ 2600.032334] R10: 000000000000000a R11: 0000000000000246 R12:
> 00007f04a9e06520
> [ 2600.042923] R13: 0000000000000002 R14: 00007f04a9e07260 R15:
> 00007f04a9e06720
> [ 2600.053527]  </TASK>
> [ 2600.057993] Modules linked in: fuse x86_pkg_temp_thermal
> intel_powerclamp coretemp kvm_intel kvm irqbypass isst_if_mmio
> isst_if_common joydev intel_vsec idxd cxl_acpi cxl_core sunrpc
> crc32c_intel ixgbe mdio dca bochs drm_vram_helper drm_ttm_helper
> [ 2600.088403] CR2: 0000000000000064
> [ 2600.094376] ---[ end trace 0000000000000000 ]---
> [ 2600.238419] RIP: 0010:do_raw_spin_lock+0xa/0xc0
> [ 2600.245811] Code: 8d 88 80 0b 00 00 48 c7 c7 d8 39 cd 91 e8 c7 c2 06
> 01 e9 e8 32 0e 01 66 0f 1f 84 00 00 00 00 00 66 0f 1f 00 0f 1f 44 00 00
> 53 <8b> 47 04 48 89 fb 3d ad 4e ad de 75 4a 48 8b 53 10 65 48 8b 04 25
> [ 2600.272091] RSP: 0018:ff5efa9101b27ca8 EFLAGS: 00010092
> [ 2600.280345] RAX: 0000000000000000 RBX: 0000000000000206 RCX:
> 0000000000000000
> [ 2600.290980] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000000000060
> [ 2600.301580] RBP: 0000000000000060 R08: 0000000000000001 R09:
> 0000000000000000
> [ 2600.312167] R10: 0000000000000000 R11: ffffffff95b0be60 R12:
> ff24417986fce000
> [ 2600.322762] R13: ffffffff959f9bf0 R14: ff244179872d5058 R15:
> 0000000000000003
> [ 2600.333354] FS:  00007f04aa2b1740(0000) GS:ff244180dfc00000(0000)
> knlGS:0000000000000000
> [ 2600.345062] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2600.353949] CR2: 0000000000000064 CR3: 000000010a6a2004 CR4:
> 0000000000771ef0
> [ 2600.364571] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 2600.375178] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
> 0000000000000400
> [ 2600.385756] PKRU: 55555554
> [ 2600.390848] note: bash[828] exited with preempt_count 1

Best regards,
baolu

^ permalink raw reply	[flat|nested] 11+ messages in thread
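[Editorial sketch, not part of the thread: the pattern the revert restores — a single CPU-hotplug callback registered once for the module lifetime, instead of a per-domain cpuhp instance added from iova_domain_init_rcaches() (which can run from inside a cpuhp or bus-notifier callback and creates the lock inversion reported above). Function names, the domain list, and the exact cpuhp state are illustrative, not the actual intel-iommu code.]

/*
 * Sketch only: one module-lifetime cpuhp "dead" callback that walks all
 * IOVA domains, rather than cpuhp_state_add_instance() per domain.
 */
static LIST_HEAD(iova_domains);		/* hypothetical global domain list */

static int iommu_iova_cpu_dead(unsigned int cpu)
{
	struct iova_domain *iovad;

	/* flush the offlined CPU's rcache in every registered domain */
	list_for_each_entry(iovad, &iova_domains, list)
		free_cpu_cached_iovas(cpu, iovad);
	return 0;
}

static int __init iommu_iova_init(void)
{
	/*
	 * Registered exactly once, at module init -- never from within
	 * another cpuhp callback, so no cpu_hotplug_lock inversion.
	 */
	return cpuhp_setup_state_nocalls(CPUHP_IOMMU_IOVA_DEAD,
					 "iommu/iova:dead", NULL,
					 iommu_iova_cpu_dead);
}
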
end of thread, other threads:[~2022-11-02 12:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-22 10:10 [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing Janusz Krzysztofik
2022-09-22 12:09 ` Robin Murphy
2022-09-22 13:37 ` Janusz Krzysztofik
2022-09-30 16:57 ` Janusz Krzysztofik
2022-10-05 14:26 ` Thorsten Leemhuis
2022-10-05 15:25 ` Guenter Roeck
2022-10-05 16:15 ` Robin Murphy
2022-10-05 17:11 ` Guenter Roeck
2022-10-05 14:41 ` [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing #forregzbot Thorsten Leemhuis
2022-11-02 11:17 ` [core-for-CI][PATCH] iommu: Remove iova cpu hotplugging flushing Baolu Lu
2022-11-02 12:20 ` Baolu Lu